5438 Commits

Author SHA1 Message Date
Christophe JAILLET
f47f758cff drivers/base/memory: fix an unlikely reference counting issue in __add_memory_block()
__add_memory_block() calls both put_device() and device_unregister() when
storing the memory block into the xarray.  This is incorrect because
xarray doesn't take an additional reference and device_unregister()
already calls put_device().

Triggering the issue looks really unlikely and its only effect should be
to log a spurious warning about a ref counted issue.

Link: https://lkml.kernel.org/r/d44c63d78affe844f020dc02ad6af29abc448fc4.1650611702.git.christophe.jaillet@wanadoo.fr
Fixes: 4fb6eabf1037 ("drivers/base/memory.c: cache memory blocks in xarray to accelerate lookup")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Scott Cheloha <cheloha@linux.vnet.ibm.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:19 -07:00
Miaohe Lin
da63dc84be drivers/base/node.c: fix compaction sysfs file leak
Compaction sysfs file is created via compaction_register_node in
register_node.  But we forgot to remove it in unregister_node.  Thus
compaction sysfs file is leaked.  Using compaction_unregister_node to fix
this issue.

Link: https://lkml.kernel.org/r/20220401070905.43679-1-linmiaohe@huawei.com
Fixes: ed4a6d7f0676 ("mm: compaction: add /sys trigger for per-node memory compaction")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:06 -07:00
Dan Williams
fd3abd2caf device-core: Kill the lockdep_mutex
Per Peter [1], the lockdep API has native support for all the use cases
lockdep_mutex was attempting to enable. Now that all lockdep_mutex users
have been converted to those APIs, drop this lock.

Link: https://lore.kernel.org/r/Ylf0dewci8myLvoW@hirez.programming.kicks-ass.net [1]
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/165055522548.3745911.14298368286915484086.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-28 14:01:55 -07:00
Lu Baolu
512881eacf bus: platform,amba,fsl-mc,PCI: Add device DMA ownership management
The devices on platform/amba/fsl-mc/PCI buses could be bound to drivers
with the device DMA managed by kernel drivers or user-space applications.
Unfortunately, multiple devices may be placed in the same IOMMU group
because they cannot be isolated from each other. The DMA on these devices
must either be entirely under kernel control or userspace control, never
a mixture. Otherwise the driver integrity is not guaranteed because they
could access each other through the peer-to-peer accesses which by-pass
the IOMMU protection.

This checks and sets the default DMA mode during driver binding, and
cleanups during driver unbinding. In the default mode, the device DMA is
managed by the device driver which handles DMA operations through the
kernel DMA APIs (see Documentation/core-api/dma-api.rst).

For cases where the devices are assigned for userspace control through the
userspace driver framework(i.e. VFIO), the drivers(for example, vfio_pci/
vfio_platfrom etc.) may set a new flag (driver_managed_dma) to skip this
default setting in the assumption that the drivers know what they are
doing with the device DMA.

Calling iommu_device_use_default_domain() before {of,acpi}_dma_configure
is currently a problem. As things stand, the IOMMU driver ignored the
initial iommu_probe_device() call when the device was added, since at
that point it had no fwspec yet. In this situation,
{of,acpi}_iommu_configure() are retriggering iommu_probe_device() after
the IOMMU driver has seen the firmware data via .of_xlate to learn that
it actually responsible for the given device. As the result, before
that gets fixed, iommu_use_default_domain() goes at the end, and calls
arch_teardown_dma_ops() if it fails.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Stuart Yoder <stuyoder@gmail.com>
Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20220418005000.897664-5-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-04-28 15:32:20 +02:00
Lu Baolu
4a6d9dd564 amba: Stop sharing platform_dma_configure()
Stop sharing platform_dma_configure() helper as they are about to have
their own bus dma_configure callbacks.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220418005000.897664-4-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-04-28 15:32:20 +02:00
Lu Baolu
25f3bcfc54 driver core: Add dma_cleanup callback in bus_type
The bus_type structure defines dma_configure() callback for bus drivers
to configure DMA on the devices. This adds the paired dma_cleanup()
callback and calls it during driver unbinding so that bus drivers can do
some cleanup work.

One use case for this paired DMA callbacks is for the bus driver to check
for DMA ownership conflicts during driver binding, where multiple devices
belonging to a same IOMMU group (the minimum granularity of isolation and
protection) may be assigned to kernel drivers or user space respectively.

Without this change, for example, the vfio driver has to listen to a bus
BOUND_DRIVER event and then BUG_ON() in case of dma ownership conflict.
This leads to bad user experience since careless driver binding operation
may crash the system if the admin overlooks the group restriction. Aside
from bad design, this leads to a security problem as a root user, even with
lockdown=integrity, can force the kernel to BUG.

With this change, the bus driver could check and set the DMA ownership in
driver binding process and fail on ownership conflicts. The DMA ownership
should be released during driver unbinding.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220418005000.897664-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-04-28 15:32:20 +02:00
Greg Kroah-Hartman
4388f887b8 Revert "firmware_loader: use kernel credentials when reading firmware"
This reverts commit 3677563eb8731e1ad5970e3e57f74e5f9d63502a as it leaks
memory :(

Reported-by: Qian Cai <quic_qiancai@quicinc.com>
Link: https://lore.kernel.org/r/20220427135823.GD71@qian
Cc: Thiébaud Weksteen <tweek@google.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: John Stultz <jstultz@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-27 16:19:39 +02:00
Won Chung
6423d29510 driver core: Add sysfs support for physical location of a device
When ACPI table includes _PLD fields for a device, create a new
directory (physical_location) in sysfs to share _PLD fields.

Currently without PLD information, when there are multiple of same
devices, it is hard to distinguish which device corresponds to which
physical device at which location. For example, when there are two Type
C connectors, it is hard to find out which connector corresponds to the
Type C port on the left panel versus the Type C port on the right panel.
With PLD information provided, we can determine which specific device at
which location is doing what.

_PLD output includes much more fields, but only generic fields are added
and exposed to sysfs, so that non-ACPI devices can also support it in
the future. The minimal generic fields needed for locating a device are
the following.
- panel
- vertical_position
- horizontal_position
- dock
- lid

Signed-off-by: Won Chung <wonchung@google.com>
Link: https://lore.kernel.org/r/20220314195458.271430-1-wonchung@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-27 09:51:57 +02:00
Sergey Shtylyov
ce753ad154 platform: finally disallow IRQ0 in platform_get_irq() and its ilk
The commit a85a6c86c25b ("driver core: platform: Clarify that IRQ 0 is
invalid") only calls WARN() when IRQ0 is about to be returned, however
using IRQ0 is considered invalid (according to Linus) outside the arch/
code where it's used by the i8253 drivers. Many driver subsystems treat
0 specially (e.g. as an indication of the polling mode by libata), so
the users of platform_get_irq[_byname]() in them would have to filter
out IRQ0 explicitly and this (quite obviously) doesn't scale...
Let's finally get this straight and return -EINVAL instead of IRQ0!

Fixes: a85a6c86c25b ("driver core: platform: Clarify that IRQ 0 is invalid")
Acked-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/025679e1-1f0a-ae4b-4369-01164f691511@omp.ru
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-27 09:51:57 +02:00
Miaohe Lin
a72b6dff40 drivers/base/node.c: fix compaction sysfs file leak
Compaction sysfs file is created via compaction_register_node in
register_node. But we forgot to remove it in unregister_node. Thus
compaction sysfs file is leaked. Using compaction_unregister_node
to fix this issue.

Fixes: ed4a6d7f0676 ("mm: compaction: add /sys trigger for per-node memory compaction")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Link: https://lore.kernel.org/r/20220401070905.43679-1-linmiaohe@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-27 09:51:57 +02:00
Mark-PK Tsai
84e7c6786a driver core: Prevent overriding async driver of a device before it probe
When there are 2 matched drivers for a device using
async probe mechanism, the dev->p->async_driver might
be overridden by the last attached driver.
So just skip the later one if the previous matched driver
was not handled by async thread yet.

Below is my use case which having this problem.

Make both driver mmcblk and mmc_test allow async probe,
the dev->p->async_driver will be overridden by the later driver
mmc_test and bind to the device then claim it for testing.
When it happen, mmcblk will never do probe again.

Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Link: https://lore.kernel.org/r/20220316074328.1801-1-mark-pk.tsai@mediatek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-27 09:51:56 +02:00
Bjorn Andersson
bcd6a517aa device property: Use multi-connection matchers for single case
The newly introduced helpers for searching for matches in the case of
multiple connections can be resused by the single-connection case, so do
this to save some duplication.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220422222351.1297276-3-bjorn.andersson@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-26 13:47:12 +02:00
Bjorn Andersson
7a20917d30 device property: Add helper to match multiple connections
In some cases multiple connections with the same connection id
needs to be resolved from a fwnode graph.

One such example is when separate hardware is used for performing muxing
and/or orientation switching of the SuperSpeed and SBU lines in a USB
Type-C connector. In this case the connector needs to belong to a graph
with multiple matching remote endpoints, and the Type-C controller needs
to be able to resolve them both.

Add a new API that allows this kind of lookup.

Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220422222351.1297276-2-bjorn.andersson@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-26 13:47:12 +02:00
Bagas Sanjaya
4c32174a24 Documentation: dd: Use ReST lists for return values of driver_deferred_probe_check_state()
Sphinx reported build warnings mentioning drivers/base/dd.c:

</path/to/linux>/Documentation/driver-api/infrastructure:35:
./drivers/base/dd.c:280: WARNING: Unexpected indentation.
</path/to/linux>/Documentation/driver-api/infrastructure:35:
./drivers/base/dd.c:281: WARNING: Block quote ends without a blank line;
unexpected unindent.

The warnings above is due to syntax error in the "Return" section of driver_deferred_probe_check_state() which messed up with desired line breaks.

Fix the issue by using ReST lists syntax.

Fixes: c8c43cee29f6ca ("driver core: Fix driver_deferred_probe_check_state() logic")
Cc: linux-pm@vger.kernel.org
Cc: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Thierry Reding <treding@nvidia.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Liam Girdwood <lgirdwood@gmail.com>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Saravana Kannan <saravanak@google.com>
Cc: Todd Kjos <tkjos@google.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Kevin Hilman <khilman@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Rob Herring <robh@kernel.org>
Cc: John Stultz <john.stultz@linaro.org>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://lore.kernel.org/r/20220416071137.19512-1-bagasdotme@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-26 12:38:17 +02:00
Russ Weight
536fd8184b firmware_loader: Add sysfs nodes to monitor fw_upload
Add additional sysfs nodes to monitor the transfer of firmware upload data
to the target device:

cancel: Write 1 to cancel the data transfer
error: Display error status for a failed firmware upload
remaining_size: Display the remaining amount of data to be transferred
status: Display the progress of the firmware upload

Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Tianfei zhang <tianfei.zhang@intel.com>
Tested-by: Matthew Gerlach <matthew.gerlach@linux.intel.com>
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20220421212204.36052-6-russell.h.weight@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-26 12:35:55 +02:00
Russ Weight
97730bbb24 firmware_loader: Add firmware-upload support
Extend the firmware subsystem to support a persistent sysfs interface that
userspace may use to initiate a firmware update. For example, FPGA based
PCIe cards load firmware and FPGA images from local FLASH when the card
boots. The images in FLASH may be updated with new images provided by the
user at his/her convenience.

A device driver may call firmware_upload_register() to expose persistent
"loading" and "data" sysfs files. These files are used in the same way as
the fallback sysfs "loading" and "data" files. When 0 is written to
"loading" to complete the write of firmware data, the data is transferred
to the lower-level driver using pre-registered call-back functions. The
data transfer is done in the context of a kernel worker thread.

Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Tianfei zhang <tianfei.zhang@intel.com>
Tested-by: Matthew Gerlach <matthew.gerlach@linux.intel.com>
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20220421212204.36052-5-russell.h.weight@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-26 12:34:28 +02:00
Russ Weight
e0c11a8b98 firmware_loader: Split sysfs support from fallback
In preparation for sharing the "loading" and "data" sysfs nodes with the
new firmware upload support, split out sysfs functionality from fallback.c
and fallback.h into sysfs.c and sysfs.h. This includes the firmware
class driver code that is associated with the sysfs files and the
fw_fallback_config support for the timeout sysfs node.

CONFIG_FW_LOADER_SYSFS is created and is selected by
CONFIG_FW_LOADER_USER_HELPER in order to include sysfs.o in
firmware_class-objs.

This is mostly just a code reorganization. There are a few symbols that
change in scope, and these can be identified by looking at the header
file changes. A few white-space warnings from checkpatch are also
addressed in this patch.

Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Tianfei zhang <tianfei.zhang@intel.com>
Tested-by: Matthew Gerlach <matthew.gerlach@linux.intel.com>
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20220421212204.36052-4-russell.h.weight@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-26 12:34:09 +02:00
Jeongtae Park
d640947562
regmap: cache: set max_register with reg_stride
Current logic does not consider multi-stride cases,
the max_register have to calculate with reg_stride
because it is a kind of address range.

Signed-off-by: Jeongtae Park <jtp.park@samsung.com>
Link: https://lore.kernel.org/r/20220425114613.15934-1-jtp.park@samsung.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-04-25 14:00:44 +01:00
Greg Kroah-Hartman
c95ce3a23d topology: Fix up build warning in topology_is_visible()
Commit aa63a74d4535 ("topology/sysfs: Hide PPIN on systems that do not
support it.") caused a build warning on some configurations:

drivers/base/topology.c: In function 'topology_is_visible':
drivers/base/topology.c:158:24: warning: unused variable 'dev' [-Wunused-variable]
  158 |         struct device *dev = kobj_to_dev(kobj);

Fix this up by getting rid of the variable entirely.

Fixes: aa63a74d4535 ("topology/sysfs: Hide PPIN on systems that do not support it.")
Cc: Tony Luck <tony.luck@intel.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://lore.kernel.org/r/20220422062653.3899972-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-23 12:53:11 +02:00
Christophe JAILLET
4e224719f5 drivers/base/memory: Fix an unlikely reference counting issue in __add_memory_block()
__add_memory_block() calls both put_device() and device_unregister() when
storing the memory block into the xarray. This is incorrect because xarray
doesn't take an additional reference and device_unregister() already calls
put_device().

Triggering the issue looks really unlikely and its only effect should be to
log a spurious warning about a ref counted issue.

Fixes: 4fb6eabf1037 ("drivers/base/memory.c: cache memory blocks in xarray to accelerate lookup")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/d44c63d78affe844f020dc02ad6af29abc448fc4.1650611702.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-22 17:19:39 +02:00
Thiébaud Weksteen
3677563eb8 firmware_loader: use kernel credentials when reading firmware
Device drivers may decide to not load firmware when probed to avoid
slowing down the boot process should the firmware filesystem not be
available yet. In this case, the firmware loading request may be done
when a device file associated with the driver is first accessed. The
credentials of the userspace process accessing the device file may be
used to validate access to the firmware files requested by the driver.
Ensure that the kernel assumes the responsibility of reading the
firmware.

This was observed on Android for a graphic driver loading their firmware
when the device file (e.g. /dev/mali0) was first opened by userspace
(i.e. surfaceflinger). The security context of surfaceflinger was used
to validate the access to the firmware file (e.g.
/vendor/firmware/mali.bin).

Because previous configurations were relying on the userspace fallback
mechanism, the security context of the userspace daemon (i.e. ueventd)
was consistently used to read firmware files. More devices are found to
use the command line argument firmware_class.path which gives the kernel
the opportunity to read the firmware directly, hence surfacing this
misattribution.

Signed-off-by: Thiébaud Weksteen <tweek@google.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Tested-by: John Stultz <jstultz@google.com>
Link: https://lore.kernel.org/r/20220422013215.2301793-1-tweek@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-22 17:13:56 +02:00
Russ Weight
736da0b657 firmware_loader: Check fw_state_is_done in loading_store
Rename fw_sysfs_done() and fw_sysfs_loading() to fw_state_is_done() and
fw_state_is_loading() respectively, and place them along side companion
functions in drivers/base/firmware_loader/firmware.h.

Use the fw_state_is_done() function to exit early from
firmware_loading_store() if the state is already "done". This is being done
in preparation for supporting persistent sysfs nodes to allow userspace to
upload firmware to a device, potentially reusing the sysfs loading and data
files multiple times.

Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Tianfei zhang <tianfei.zhang@intel.com>
Tested-by: Matthew Gerlach <matthew.gerlach@linux.intel.com>
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20220421212204.36052-3-russell.h.weight@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-22 17:13:54 +02:00
Russ Weight
4ac4a90d77 firmware_loader: Clear data and size in fw_free_paged_buf
The fw_free_paged_buf() function resets the paged buffer information in
the fw_priv data structure. Additionally, clear the data and size members
of fw_priv in order to facilitate the reuse of fw_priv. This is being
done in preparation for enabling userspace to initiate multiple firmware
uploads using this sysfs interface.

Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Tianfei zhang <tianfei.zhang@intel.com>
Tested-by: Matthew Gerlach <matthew.gerlach@linux.intel.com>
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20220421212204.36052-2-russell.h.weight@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-22 17:13:54 +02:00
Krzysztof Kozlowski
6c2f421174 driver: platform: Add helper for safer setting of driver_override
Several core drivers and buses expect that driver_override is a
dynamically allocated memory thus later they can kfree() it.

However such assumption is not documented, there were in the past and
there are already users setting it to a string literal. This leads to
kfree() of static memory during device release (e.g. in error paths or
during unbind):

    kernel BUG at ../mm/slub.c:3960!
    Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
    ...
    (kfree) from [<c058da50>] (platform_device_release+0x88/0xb4)
    (platform_device_release) from [<c0585be0>] (device_release+0x2c/0x90)
    (device_release) from [<c0a69050>] (kobject_put+0xec/0x20c)
    (kobject_put) from [<c0f2f120>] (exynos5_clk_probe+0x154/0x18c)
    (exynos5_clk_probe) from [<c058de70>] (platform_drv_probe+0x6c/0xa4)
    (platform_drv_probe) from [<c058b7ac>] (really_probe+0x280/0x414)
    (really_probe) from [<c058baf4>] (driver_probe_device+0x78/0x1c4)
    (driver_probe_device) from [<c0589854>] (bus_for_each_drv+0x74/0xb8)
    (bus_for_each_drv) from [<c058b48c>] (__device_attach+0xd4/0x16c)
    (__device_attach) from [<c058a638>] (bus_probe_device+0x88/0x90)
    (bus_probe_device) from [<c05871fc>] (device_add+0x3dc/0x62c)
    (device_add) from [<c075ff10>] (of_platform_device_create_pdata+0x94/0xbc)
    (of_platform_device_create_pdata) from [<c07600ec>] (of_platform_bus_create+0x1a8/0x4fc)
    (of_platform_bus_create) from [<c0760150>] (of_platform_bus_create+0x20c/0x4fc)
    (of_platform_bus_create) from [<c07605f0>] (of_platform_populate+0x84/0x118)
    (of_platform_populate) from [<c0f3c964>] (of_platform_default_populate_init+0xa0/0xb8)
    (of_platform_default_populate_init) from [<c01031f8>] (do_one_initcall+0x8c/0x404)

Provide a helper which clearly documents the usage of driver_override.
This will allow later to reuse the helper and reduce the amount of
duplicated code.

Convert the platform driver to use a new helper and make the
driver_override field const char (it is not modified by the core).

Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220419113435.246203-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-22 17:13:53 +02:00
Ulf Hansson
bd40cbb0e3 PM: domains: Move genpd's time-accounting to ktime_get_mono_fast_ns()
To move towards a more consistent behaviour between genpd and the runtime
PM core, let's start by converting genpd's time-accounting from ktime_get()
into ktime_get_mono_fast_ns().

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-04-22 16:11:25 +02:00
Takashi Iwai
23cfbc6ec4 firmware: Add the support for ZSTD-compressed firmware files
As the growing demand on ZSTD compressions, there have been requests
for the support of ZSTD-compressed firmware files, so here it is:
this patch extends the firmware loader code to allow loading ZSTD
files.  The implementation is fairly straightforward, it just adds a
ZSTD decompression routine for the file expander.  (And the code is
even simpler than XZ thanks to the ZSTD API that gives the original
decompressed size from the header.)

Link: https://lore.kernel.org/all/20210127154939.13288-1-tiwai@suse.de/
Tested-by: Piotr Gorski <lucjan.lucjanov@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20220421152908.4718-2-tiwai@suse.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-22 08:51:16 +02:00
Wang Qing
1dc9f1a66e arch_topology: Do not set llc_sibling if llc_id is invalid
When ACPI is not enabled, cpuid_topo->llc_id = cpu_topo->llc_id = -1, which
will set llc_sibling 0xff(...), this is misleading.

Don't set llc_sibling(default 0) if we don't know the cache topology.

Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Wang Qing <wangqing@vivo.com>
Fixes: 37c3ec2d810f ("arm64: topology: divorce MC scheduling domain from core_siblings")
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/1649644580-54626-1-git-send-email-wangqing@vivo.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20 18:57:59 +02:00
Darren Hart
db1e59483d topology: make core_mask include at least cluster_siblings
Ampere Altra defines CPU clusters in the ACPI PPTT. They share a Snoop
Control Unit, but have no shared CPU-side last level cache.

cpu_coregroup_mask() will return a cpumask with weight 1, while
cpu_clustergroup_mask() will return a cpumask with weight 2.

As a result, build_sched_domain() will BUG() once per CPU with:

BUG: arch topology borken
the CLS domain not a subset of the MC domain

The MC level cpumask is then extended to that of the CLS child, and is
later removed entirely as redundant. This sched domain topology is an
improvement over previous topologies, or those built without
SCHED_CLUSTER, particularly for certain latency sensitive workloads.
With the current scheduler model and heuristics, this is a desirable
default topology for Ampere Altra and Altra Max system.

Rather than create a custom sched domains topology structure and
introduce new logic in arch/arm64 to detect these systems, update the
core_mask so coregroup is never a subset of clustergroup, extending it
to cluster_siblings if necessary. Only do this if CONFIG_SCHED_CLUSTER
is enabled to avoid also changing the topology (MC) when
CONFIG_SCHED_CLUSTER is disabled.

This has the added benefit over a custom topology of working for both
symmetric and asymmetric topologies. It does not address systems where
the CLUSTER topology is above a populated MC topology, but these are not
considered today and can be addressed separately if and when they
appear.

The final sched domain topology for a 2 socket Ampere Altra system is
unchanged with or without CONFIG_SCHED_CLUSTER, and the BUG is avoided:

For CPU0:

CONFIG_SCHED_CLUSTER=y
CLS  [0-1]
DIE  [0-79]
NUMA [0-159]

CONFIG_SCHED_CLUSTER is not set
DIE  [0-79]
NUMA [0-159]

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: D. Scott Phillips <scott@os.amperecomputing.com>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: <stable@vger.kernel.org> # 5.16.x
Suggested-by: Barry Song <song.bao.hua@hisilicon.com>
Reviewed-by: Barry Song <song.bao.hua@hisilicon.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Darren Hart <darren@os.amperecomputing.com>
Link: https://lore.kernel.org/r/c8fe9fce7c86ed56b4c455b8c902982dc2303868.1649696956.git.darren@os.amperecomputing.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20 18:57:20 +02:00
Tony Luck
aa63a74d45 topology/sysfs: Hide PPIN on systems that do not support it.
Systems that do not support a Protected Processor Identification Number
currently report:

	# cat /sys/devices/system/cpu/cpu0/topology/ppin
	0x0

which is confusing/wrong.

Add a ".is_visible" function to suppress inclusion of the ppin file.

Fixes: ab28e944197f ("topology/sysfs: Add PPIN in sysfs under cpu topology")
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20220406220150.63855-1-tony.luck@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20 18:56:39 +02:00
Ulf Hansson
13966517d4 PM: runtime: Allow to call __pm_runtime_set_status() from atomic context
The only two users of __pm_runtime_set_status() are pm_runtime_set_active()
and pm_runtime_set_suspended(). These are widely used and should be called
from non-atomic context to work as expected. However, it would be
convenient to allow them be called from atomic context too, as shown from a
subsequent change, so let's add support for this.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Maulik Shah <quic_mkshah@quicinc.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-04-19 17:24:02 +02:00
Andy Shevchenko
022fe6bc8f device property: Drop 'test' prefix in parameters of fwnode_is_ancestor_of()
The part 'is' in the function name implies the test against something.
Drop unnecessary 'test' prefix in the fwnode_is_ancestor_of() parameters.

No functional change intended.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-04-13 20:04:02 +02:00
Andy Shevchenko
87ffea0947 device property: Introduce fwnode_for_each_parent_node()
In a few cases the functionality of fwnode_for_each_parent_node()
is already in use. Introduce a common helper macro for it.

It may be used by others as well in the future.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-04-13 20:04:02 +02:00
Andy Shevchenko
002752af7b device property: Allow error pointer to be passed to fwnode APIs
Some of the fwnode APIs might return an error pointer instead of NULL
or valid fwnode handle. The result of such API call may be considered
optional and hence the test for it is usually done in a form of

	fwnode = fwnode_find_reference(...);
	if (IS_ERR(fwnode))
		...error handling...

Nevertheless the resulting fwnode may have bumped the reference count
and hence caller of the above API is obliged to call fwnode_handle_put().
Since fwnode may be not valid either as NULL or error pointer the check
has to be performed there. This approach uglifies the code and adds
a point of making a mistake, i.e. forgetting about error point case.

To prevent this, allow an error pointer to be passed to the fwnode APIs.

Fixes: 83b34afb6b79 ("device property: Introduce fwnode_find_reference()")
Reported-by: Nuno Sá <nuno.sa@analog.com>
Tested-by: Nuno Sá <nuno.sa@analog.com>
Acked-by: Nuno Sá <nuno.sa@analog.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-04-13 20:04:02 +02:00
Rafael J. Wysocki
82586a7215 PM: runtime: Avoid device usage count underflows
A PM-runtime device usage count underflow is potentially critical,
because it may cause a device to be suspended when it is expected to
be operational.  It is also a programming problem that would be good
to catch and warn about.

For this reason, (1) make rpm_check_suspend_allowed() return an error
when the device usage count is negative to prevent devices from being
suspended in that case, (2) introduce rpm_drop_usage_count() that will
detect device usage count underflows, warn about them and fix them up,
and (3) use it to drop the usage count in a few places instead of
atomic_dec_and_test().

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-04-13 17:30:32 +02:00
Krzysztof Kozlowski
eefa861810 PM: domains: Extend dev_pm_domain_detach() doc
Mention all domain attach menthods which dev_pm_domain_detach()
reverses.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-04-13 17:07:58 +02:00
Vladimir Oltean
74befa447e net: mdio: don't defer probe forever if PHY IRQ provider is missing
When a driver for an interrupt controller is missing, of_irq_get()
returns -EPROBE_DEFER ad infinitum, causing
fwnode_mdiobus_phy_device_register(), and ultimately, the entire
of_mdiobus_register() call, to fail. In turn, any phy_connect() call
towards a PHY on this MDIO bus will also fail.

This is not what is expected to happen, because the PHY library falls
back to poll mode when of_irq_get() returns a hard error code, and the
MDIO bus, PHY and attached Ethernet controller work fine, albeit
suboptimally, when the PHY library polls for link status. However,
-EPROBE_DEFER has special handling given the assumption that at some
point probe deferral will stop, and the driver for the supplier will
kick in and create the IRQ domain.

Reasons for which the interrupt controller may be missing:

- It is not yet written. This may happen if a more recent DT blob (with
  an interrupt-parent for the PHY) is used to boot an old kernel where
  the driver didn't exist, and that kernel worked with the
  vintage-correct DT blob using poll mode.

- It is compiled out. Behavior is the same as above.

- It is compiled as a module. The kernel will wait for a number of
  seconds specified in the "deferred_probe_timeout" boot parameter for
  user space to load the required module. The current default is 0,
  which times out at the end of initcalls. It is possible that this
  might cause regressions unless users adjust this boot parameter.

The proposed solution is to use the driver_deferred_probe_check_state()
helper function provided by the driver core, which gives up after some
-EPROBE_DEFER attempts, taking "deferred_probe_timeout" into consideration.
The return code is changed from -EPROBE_DEFER into -ENODEV or
-ETIMEDOUT, depending on whether the kernel is compiled with support for
modules or not.

Fixes: 66bdede495c7 ("of_mdio: Fix broken PHY IRQ in case of probe deferral")
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220407165538.4084809-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-08 14:17:55 -07:00
Sakari Ailus
99c63707ba device property: Add irq_get to fwnode operation
Add irq_get() fwnode operation to implement fwnode_irq_get() through
fwnode operations, moving the code in fwnode_irq_get() to OF and ACPI
frameworks.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-04-05 15:30:47 +02:00
Sakari Ailus
68b979d068 device property: Add iomap to fwnode operations
Add iomap() fwnode operation to implement fwnode_iomap() through fwnode
operations, moving the code in fwnode_iomap() to OF framework.

Note that the IS_ENABLED(CONFIG_OF_ADDRESS) && is_of_node(fwnode) check is
needed for Sparc that has its own implementation of of_iomap anyway. Let
the pre-compiler to handle that check.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-04-05 15:30:47 +02:00
Sakari Ailus
8c756a0a2d device property: Convert device_{dma_supported,get_dma_attr} to fwnode
Make the device_dma_supported and device_get_dma_attr functions to use the
fwnode ops, and move the implementation to ACPI and OF frameworks.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-04-05 15:30:47 +02:00
Rikard Falkeborn
bd941dfaae
regmap: Constify static regmap_bus structs
The only usage of these is to pass their address to __regmap_init() or
__devm_regmap_init(), both which takes pointers to const struct
regmap_bus. Make them const to allow the compiler to put them in
read-only memory.

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Link: https://lore.kernel.org/r/20220330214110.36337-1-rikard.falkeborn@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-04-04 08:41:50 +01:00
Linus Torvalds
13776ebb99 Device properties code update for 5.18-rc1
This is based on new i2c material for 5.18-rc1 and simply reorganizes
 the code on top of it so as to group similar functions together (Andy
 Shevchenko).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmI4pkcSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxF0wQAIbgtC27ABvKjl+FptW2ooo8t/1o5Mtq
 LvQ274WtBY0pUo7D9S1ZpAPHoGCM++mZYK+RzL5tpOl2J5t3zv4qaWk2vKILtI0I
 5aEQm78uJxnHunG9T8LcmQctk5G6sjjA775sEIGf2NC5ovGa8LmtH/24+roOwM30
 y0bP0+2/iTUjnfqUbIYFs8xst2i+2+8euHxahRIqmzkkdJXzUa3q8/qxWDz5nl37
 2laGI+ril5p8SIpYTReFB8Vzpd07Lfr+HcOjeKdk3dJw4bwmxv5VFUFsTFRP7/0v
 xFEnckVNG8gSt3LATj9MAL2ettAief1+IgeJubtNIWMlmai07J5ujZXRoRc4BbD8
 sJauw/BU79C6KDt2CBTbABhHXDfc5dw4ZjKqLSYLTKa2Fw9njhTGs41zn+tAj0vY
 ubycJxyOlzHeYoqHmZQEF28AHqiMsFS28xIWoVTJfThQOBJCOYLaohWhxcS31C/Z
 4BJQIrQCX97oroLIvNOuW/Wco0RdHGXAN1f0XeWfrEcRyVJQ/XJOsR2gVDNVflpD
 ya3je+R6mH+J95lRrtqgxl0AUyzVREQhr6EMw3nYfacdj3rzvy1PDjzSZsAdOsrd
 6asNc/y84sDqOh+4kmH1wWUqK/Ooq0j1YFqPnyAfdkUD8D0RZvkdz5ior3knU9lE
 O6CDMLTCygi6
 =voeP
 -----END PGP SIGNATURE-----

Merge tag 'devprop-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull device properties code update from Rafael Wysocki:
 "This is based on new i2c material for 5.18-rc1 and simply reorganizes
  the code on top of it so as to group similar functions together (Andy
  Shevchenko)"

* tag 'devprop-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  device property: Don't split fwnode_get_irq*() APIs in the code
2022-03-29 11:30:12 -07:00
Linus Torvalds
266d17a8c0 Driver core changes for 5.18-rc1
Here is the set of driver core changes for 5.18-rc1.
 
 Not much here, primarily it was a bunch of cleanups and small updates:
 	- kobj_type cleanups for default_groups
 	- documentation updates
 	- firmware loader minor changes
 	- component common helper added and take advantage of it in many
 	  drivers (the largest part of this pull request).
 
 There will be a merge conflict in drivers/power/supply/ab8500_chargalg.c
 with your tree, the merge conflict should be easy (take all the
 changes).
 
 All of these have been in linux-next for a while with no reported
 problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYkG6PA8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylMFwCfSIyAU4oLEgj+/Rfmx4o45cAVIWMAnit3zbdU
 wUUCGqKcOnTJEcW6dMPh
 =1VVi
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the set of driver core changes for 5.18-rc1.

  Not much here, primarily it was a bunch of cleanups and small updates:

   - kobj_type cleanups for default_groups

   - documentation updates

   - firmware loader minor changes

   - component common helper added and take advantage of it in many
     drivers (the largest part of this pull request).

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'driver-core-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (54 commits)
  Documentation: update stable review cycle documentation
  drivers/base/dd.c : Remove the initial value of the global variable
  Documentation: update stable tree link
  Documentation: add link to stable release candidate tree
  devres: fix typos in comments
  Documentation: add note block surrounding security patch note
  samples/kobject: Use sysfs_emit instead of sprintf
  base: soc: Make soc_device_match() simpler and easier to read
  driver core: dd: fix return value of __setup handler
  driver core: Refactor sysfs and drv/bus remove hooks
  driver core: Refactor multiple copies of device cleanup
  scripts: get_abi.pl: Fix typo in help message
  kernfs: fix typos in comments
  kernfs: remove unneeded #if 0 guard
  ALSA: hda/realtek: Make use of the helper component_compare_dev_name
  video: omapfb: dss: Make use of the helper component_compare_dev
  power: supply: ab8500: Make use of the helper component_compare_dev
  ASoC: codecs: wcd938x: Make use of the helper component_compare/release_of
  iommu/mediatek: Make use of the helper component_compare/release_of
  drm: of: Make use of the helper component_release_of
  ...
2022-03-28 12:41:28 -07:00
Linus Torvalds
5627ecb837 Merge branch 'i2c/for-mergewindow' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:

 - tracepoints when Linux acts as an I2C client

 - added support for AMD PSP

 - whole subsystem now uses generic_handle_irq_safe()

 - piix4 driver gained MMIO access enabling so far missed controllers
   with AMD chipsets

 - a bulk of device driver updates, refactorization, and fixes.

* 'i2c/for-mergewindow' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (61 commits)
  i2c: mux: demux-pinctrl: do not deactivate a master that is not active
  i2c: meson: Fix wrong speed use from probe
  i2c: add tracepoints for I2C slave events
  i2c: designware: Remove code duplication
  i2c: cros-ec-tunnel: Fix syntax errors in comments
  MAINTAINERS: adjust XLP9XX I2C DRIVER after removing the devicetree binding
  i2c: designware: Mark dw_i2c_plat_{suspend,resume}() as __maybe_unused
  i2c: mediatek: Add i2c compatible for Mediatek MT8168
  dt-bindings: i2c: update bindings for MT8168 SoC
  i2c: mt65xx: Simplify with clk-bulk
  i2c: i801: Drop two outdated comments
  i2c: xiic: Make bus names unique
  i2c: i801: Add support for the Process Call command
  i2c: i801: Drop useless masking in i801_access
  i2c: tegra: Add SMBus block read function
  i2c: designware: Use the i2c_mark_adapter_suspended/resumed() helpers
  i2c: designware: Lock the adapter while setting the suspended flag
  i2c: mediatek: remove redundant null check
  i2c: mediatek: modify bus speed calculation formula
  i2c: designware: Fix improper usage of readl
  ...
2022-03-26 12:46:08 -07:00
Linus Torvalds
3bf03b9a08 Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton:

 - A few misc subsystems: kthread, scripts, ntfs, ocfs2, block, and vfs

 - Most the MM patches which precede the patches in Willy's tree: kasan,
   pagecache, gup, swap, shmem, memcg, selftests, pagemap, mremap,
   sparsemem, vmalloc, pagealloc, memory-failure, mlock, hugetlb,
   userfaultfd, vmscan, compaction, mempolicy, oom-kill, migration, thp,
   cma, autonuma, psi, ksm, page-poison, madvise, memory-hotplug, rmap,
   zswap, uaccess, ioremap, highmem, cleanups, kfence, hmm, and damon.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (227 commits)
  mm/damon/sysfs: remove repeat container_of() in damon_sysfs_kdamond_release()
  Docs/ABI/testing: add DAMON sysfs interface ABI document
  Docs/admin-guide/mm/damon/usage: document DAMON sysfs interface
  selftests/damon: add a test for DAMON sysfs interface
  mm/damon/sysfs: support DAMOS stats
  mm/damon/sysfs: support DAMOS watermarks
  mm/damon/sysfs: support schemes prioritization
  mm/damon/sysfs: support DAMOS quotas
  mm/damon/sysfs: support DAMON-based Operation Schemes
  mm/damon/sysfs: support the physical address space monitoring
  mm/damon/sysfs: link DAMON for virtual address spaces monitoring
  mm/damon: implement a minimal stub for sysfs-based DAMON interface
  mm/damon/core: add number of each enum type values
  mm/damon/core: allow non-exclusive DAMON start/stop
  Docs/damon: update outdated term 'regions update interval'
  Docs/vm/damon/design: update DAMON-Idle Page Tracking interference handling
  Docs/vm/damon: call low level monitoring primitives the operations
  mm/damon: remove unnecessary CONFIG_DAMON option
  mm/damon/paddr,vaddr: remove damon_{p,v}a_{target_valid,set_operations}()
  mm/damon/dbgfs-test: fix is_target_id() change
  ...
2022-03-22 16:11:53 -07:00
David Hildenbrand
2aa065f7af drivers/base/memory: clarify adding and removing of memory blocks
Let's make it clearer at which places we actually add and remove memory
blocks -- streamlining the terminology -- and highlight which memory block
start out online and which start out as offline.

 * rename add_memory_block -> add_boot_memory_block
 * rename init_memory_block -> add_memory_block
 * rename unregister_memory -> remove_memory_block
 * rename register_memory -> __add_memory_block
 * add add_hotplug_memory_block
 * mark add_boot_memory_block with __init (suggested by Oscar)

__add_memory_block() is  a pure helper for add_memory_block(), remove
the somewhat obvious comment.

Link: https://lkml.kernel.org/r/20220221154531.11382-1-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-22 15:57:10 -07:00
David Hildenbrand
395f6081ba drivers/base/memory: determine and store zone for single-zone memory blocks
test_pages_in_a_zone() is just another nasty PFN walker that can easily
stumble over ZONE_DEVICE memory ranges falling into the same memory block
as ordinary system RAM: the memmap of parts of these ranges might possibly
be uninitialized.  In fact, we observed (on an older kernel) with UBSAN:

  UBSAN: Undefined behaviour in ./include/linux/mm.h:1133:50
  index 7 is out of range for type 'zone [5]'
  CPU: 121 PID: 35603 Comm: read_all Kdump: loaded Tainted: [...]
  Hardware name: Dell Inc. PowerEdge R7425/08V001, BIOS 1.12.2 11/15/2019
  Call Trace:
   dump_stack+0x9a/0xf0
   ubsan_epilogue+0x9/0x7a
   __ubsan_handle_out_of_bounds+0x13a/0x181
   test_pages_in_a_zone+0x3c4/0x500
   show_valid_zones+0x1fa/0x380
   dev_attr_show+0x43/0xb0
   sysfs_kf_seq_show+0x1c5/0x440
   seq_read+0x49d/0x1190
   vfs_read+0xff/0x300
   ksys_read+0xb8/0x170
   do_syscall_64+0xa5/0x4b0
   entry_SYSCALL_64_after_hwframe+0x6a/0xdf
  RIP: 0033:0x7f01f4439b52

We seem to stumble over a memmap that contains a garbage zone id.  While
we could try inserting pfn_to_online_page() calls, it will just make
memory offlining slower, because we use test_pages_in_a_zone() to make
sure we're offlining pages that all belong to the same zone.

Let's just get rid of this PFN walker and determine the single zone of a
memory block -- if any -- for early memory blocks during boot.  For memory
onlining, we know the single zone already.  Let's avoid any additional
memmap scanning and just rely on the zone information available during
boot.

For memory hot(un)plug, we only really care about memory blocks that:
* span a single zone (and, thereby, a single node)
* are completely System RAM (IOW, no holes, no ZONE_DEVICE)
If one of these conditions is not met, we reject memory offlining.
Hotplugged memory blocks (starting out offline), always meet both
conditions.

There are three scenarios to handle:

(1) Memory hot(un)plug

A memory block with zone == NULL cannot be offlined, corresponding to
our previous test_pages_in_a_zone() check.

After successful memory onlining/offlining, we simply set the zone
accordingly.
* Memory onlining: set the zone we just used for onlining
* Memory offlining: set zone = NULL

So a hotplugged memory block starts with zone = NULL. Once memory
onlining is done, we set the proper zone.

(2) Boot memory with !CONFIG_NUMA

We know that there is just a single pgdat, so we simply scan all zones
of that pgdat for an intersection with our memory block PFN range when
adding the memory block. If more than one zone intersects (e.g., DMA and
DMA32 on x86 for the first memory block) we set zone = NULL and
consequently mimic what test_pages_in_a_zone() used to do.

(3) Boot memory with CONFIG_NUMA

At the point in time we create the memory block devices during boot, we
don't know yet which nodes *actually* span a memory block. While we could
scan all zones of all nodes for intersections, overlapping nodes complicate
the situation and scanning all nodes is possibly expensive. But that
problem has already been solved by the code that sets the node of a memory
block and creates the link in the sysfs --
do_register_memory_block_under_node().

So, we hook into the code that sets the node id for a memory block. If
we already have a different node id set for the memory block, we know
that multiple nodes *actually* have PFNs falling into our memory block:
we set zone = NULL and consequently mimic what test_pages_in_a_zone() used
to do. If there is no node id set, we do the same as (2) for the given
node.

Note that the call order in driver_init() is:
-> memory_dev_init(): create memory block devices
-> node_dev_init(): link memory block devices to the node and set the
		    node id

So in summary, we detect if there is a single zone responsible for this
memory block and we consequently store the zone in that case in the
memory block, updating it during memory onlining/offlining.

Link: https://lkml.kernel.org/r/20220210184359.235565-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: Rafael Parra <rparrazo@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Rafael Parra <rparrazo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-22 15:57:10 -07:00
David Hildenbrand
cc6515591b drivers/base/node: rename link_mem_sections() to register_memory_block_under_node()
Patch series "drivers/base/memory: determine and store zone for single-zone memory blocks", v2.

I remember talking to Michal in the past about removing
test_pages_in_a_zone(), which we use for:
* verifying that a memory block we intend to offline is really only managed
  by a single zone. We don't support offlining of memory blocks that are
  managed by multiple zones (e.g., multiple nodes, DMA and DMA32)
* exposing that zone to user space via
  /sys/devices/system/memory/memory*/valid_zones

Now that I identified some more cases where test_pages_in_a_zone() might
go wrong, and we received an UBSAN report (see patch #3), let's get rid of
this PFN walker.

So instead of detecting the zone at runtime with test_pages_in_a_zone() by
scanning the memmap, let's determine and remember for each memory block if
it's managed by a single zone.  The stored zone can then be used for the
above two cases, avoiding a manual lookup using test_pages_in_a_zone().

This avoids eventually stumbling over uninitialized memmaps in corner
cases, especially when ZONE_DEVICE ranges partly fall into memory block
(that are responsible for managing System RAM).

Handling memory onlining is easy, because we online to exactly one zone.
Handling boot memory is more tricky, because we want to avoid scanning all
zones of all nodes to detect possible zones that overlap with the physical
memory region of interest.  Fortunately, we already have code that
determines the applicable nodes for a memory block, to create sysfs links
-- we'll hook into that.

Patch #1 is a simple cleanup I had laying around for a longer time.
Patch #2 contains the main logic to remove test_pages_in_a_zone() and
further details.

[1] https://lkml.kernel.org/r/20220128144540.153902-1-david@redhat.com
[2] https://lkml.kernel.org/r/20220203105212.30385-1-david@redhat.com

This patch (of 2):

Let's adjust the stale terminology, making it match
unregister_memory_block_under_nodes() and
do_register_memory_block_under_node().  We're dealing with memory block
devices, which span 1..X memory sections.

Link: https://lkml.kernel.org/r/20220210184359.235565-1-david@redhat.com
Link: https://lkml.kernel.org/r/20220210184359.235565-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Oscar Salvador <osalvador@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Rafael Parra <rparrazo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-22 15:57:10 -07:00
David Hildenbrand
2848a28b0a drivers/base/node: consolidate node device subsystem initialization in node_dev_init()
...  and call node_dev_init() after memory_dev_init() from driver_init(),
so before any of the existing arch/subsys calls.  All online nodes should
be known at that point: early during boot, arch code determines node and
zone ranges and sets the relevant nodes online; usually this happens in
setup_arch().

This is in line with memory_dev_init(), which initializes the memory
device subsystem and creates all memory block devices.

Similar to memory_dev_init(), panic() if anything goes wrong, we don't
want to continue with such basic initialization errors.

The important part is that node_dev_init() gets called after
memory_dev_init() and after cpu_dev_init(), but before any of the relevant
archs call register_cpu() to register the new cpu device under the node
device.  The latter should be the case for the current users of
topology_init().

Link: https://lkml.kernel.org/r/20220203105212.30385-1-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Tested-by: Anatoly Pugachev <matorola@gmail.com> (sparc64)
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-22 15:57:10 -07:00
David Hildenbrand
7ea0d2d79d drivers/base/memory: add memory block to memory group after registration succeeded
If register_memory() fails, we freed the memory block but already added
the memory block to the group list, not good.  Let's defer adding the
block to the memory group to after registering the memory block device.

We do handle it properly during unregister_memory(), but that's not
called when the registration fails.

Link: https://lkml.kernel.org/r/20220128144540.153902-1-david@redhat.com
Fixes: 028fc57a1c36 ("drivers/base/memory: introduce "memory groups" to logically group memory blocks")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-22 15:57:10 -07:00
luofei
d1fe111fb6 mm/hwpoison: avoid the impact of hwpoison_filter() return value on mce handler
When the hwpoison page meets the filter conditions, it should not be
regarded as successful memory_failure() processing for mce handler, but
should return a distinct value, otherwise mce handler regards the error
page has been identified and isolated, which may lead to calling
set_mce_nospec() to change page attribute, etc.

Here memory_failure() return -EOPNOTSUPP to indicate that the error
event is filtered, mce handler should not take any action for this
situation and hwpoison injector should treat as correct.

Link: https://lkml.kernel.org/r/20220223082135.2769649-1-luofei@unicloud.com
Signed-off-by: luofei <luofei@unicloud.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-22 15:57:07 -07:00