1073944 Commits

Author SHA1 Message Date
Rafael J. Wysocki
ac9f31096b Merge branch 'powercap'
Merge Dynamic Thermal Power Management (DTPM) changes for 5.18-rc1:

 - Add DTPM hierarchy description (Daniel Lezcano).

 - Change the locking scheme in DTPM (Daniel Lezcano).

 - Fix dtpm_cpu cleanup at exit time and missing virtual DTPM pointer
   release (Daniel Lezcano).

 - Make dtpm_node_callback[] static (kernel test robot).

 - Fix spelling mistake "initialze" -> "initialize" in
   dtpm_create_hierarchy() (Colin Ian King).

* powercap:
  powercap: DTPM: Fix spelling mistake "initialze" -> "initialize"
  powercap: DTPM: dtpm_node_callback[] can be static
  dtpm/soc/rk3399: Add the ability to unload the module
  powercap/dtpm_cpu: Add exit function
  powercap/dtpm: Move the 'root' reset place
  powercap/dtpm: Destroy hierarchy function
  powercap/dtpm: Fixup kfree for virtual node
  powercap/dtpm_cpu: Reset per_cpu variable in the release function
  powercap/dtpm: Change locking scheme
  rockchip/soc/drivers: Add DTPM description for rk3399
  powercap/drivers/dtpm: Add dtpm devfreq with energy model support
  powercap/drivers/dtpm: Add CPU DT initialization support
  powercap/drivers/dtpm: Add hierarchy creation
  powercap/drivers/dtpm: Convert the init table section to a simple array
2022-03-18 18:40:38 +01:00
Rafael J. Wysocki
dfad78e07e Merge branches 'pm-sleep', 'pm-domains' and 'pm-docs'
Merge changes related to system sleep, PM domains changes and power
management documentation changes for 5.18-rc1:

 - Fix load_image_and_restore() error path (Ye Bin).

 - Fix typos in comments in the system wakeup handling code (Tom Rix).

 - Clean up non-kernel-doc comments in hibernation code (Jiapeng
   Chong).

 - Fix __setup handler error handling in system-wide suspend and
   hibernation core code (Randy Dunlap).

 - Add device name to suspend_report_result() (Youngjin Jang).

 - Make virtual guests honour ACPI S4 hardware signature by
   default (David Woodhouse).

 - Block power off of a parent PM domain unless child is in deepest
   state (Ulf Hansson).

 - Use dev_err_probe() to simplify error handling for generic PM
   domains (Ahmad Fatoum).

 - Fix sleep-in-atomic bug caused by genpd_debug_remove() (Shawn Guo).

 - Document Intel uncore frequency scaling (Srinivas Pandruvada).

* pm-sleep:
  PM: hibernate: Honour ACPI hardware signature by default for virtual guests
  PM: sleep: Add device name to suspend_report_result()
  PM: suspend: fix return value of __setup handler
  PM: hibernate: fix __setup handler error handling
  PM: hibernate: Clean up non-kernel-doc comments
  PM: sleep: wakeup: Fix typos in comments
  PM: hibernate: fix load_image_and_restore() error path

* pm-domains:
  PM: domains: Fix sleep-in-atomic bug caused by genpd_debug_remove()
  PM: domains: use dev_err_probe() to simplify error handling
  PM: domains: Prevent power off for parent unless child is in deepest state

* pm-docs:
  Documentation: admin-guide: pm: Document uncore frequency scaling
2022-03-18 18:29:21 +01:00
Rafael J. Wysocki
86c17c40d2 Merge branches 'pm-cpufreq' and 'pm-cpuidle'
Merge cpufreq and cpuidle changes for 5.18-rc1:

 - Make the schedutil cpufreq governor use to_gov_attr_set() instead
   of open coding it (Kevin Hao).

 - Replace acpi_bus_get_device() with acpi_fetch_acpi_dev() in the
   cpufreq longhaul driver (Rafael Wysocki).

 - Unify show() and store() naming in cpufreq and make it use
   __ATTR_XX (Lianjie Zhang).

 - Make the intel_pstate driver use the EPP value set by the firmware
   by default (Srinivas Pandruvada).

 - Re-order the init checks in the powernow-k8 cpufreq driver (Mario
   Limonciello).

 - Make the ACPI processor idle driver check for architectural
   support for LPI to avoid using it on x86 by mistake (Mario
   Limonciello).

 - Add Sapphire Rapids Xeon support to the intel_idle driver (Artem
   Bityutskiy).

 - Add 'preferred_cstates' module argument to the intel_idle driver
   to work around C1 and C1E handling issue on Sapphire Rapids (Artem
   Bityutskiy).

 - Add core C6 optimization on Sapphire Rapids to the intel_idle
   driver (Artem Bityutskiy).

 - Optimize the haltpoll cpuidle driver a bit (Li RongQing).

 - Remove leftover text from intel_idle() kerneldoc comment and fix
   up white space in intel_idle (Rafael Wysocki).

* pm-cpufreq:
  cpufreq: powernow-k8: Re-order the init checks
  cpufreq: intel_pstate: Use firmware default EPP
  cpufreq: unify show() and store() naming and use __ATTR_XX
  cpufreq: longhaul: Replace acpi_bus_get_device()
  cpufreq: schedutil: Use to_gov_attr_set() to get the gov_attr_set
  cpufreq: Move to_gov_attr_set() to cpufreq.h

* pm-cpuidle:
  cpuidle: intel_idle: Drop redundant backslash at line end
  cpuidle: intel_idle: Update intel_idle() kerneldoc comment
  cpuidle: haltpoll: Call cpuidle_poll_state_init() later
  intel_idle: add core C6 optimization for SPR
  intel_idle: add 'preferred_cstates' module argument
  intel_idle: add SPR support
  ACPI: processor idle: Check for architectural support for LPI
  cpuidle: PSCI: Move the `has_lpi` check to the beginning of the function
2022-03-18 18:14:55 +01:00
Mario Limonciello
3870a44d50 cpufreq: powernow-k8: Re-order the init checks
The powernow-k8 driver will do checks at startup that the current
active driver is acpi-cpufreq and show a warning when they're not
expected.

Because of this, the following warning comes up on systems that
support amd-pstate and have both drivers compiled in:
`WTF driver: amd-pstate`

The systems that support powernow-k8 will not support amd-pstate,
so re-order the checks to validate the CPU model number first to
avoid this warning being displayed on modern SOCs.
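
The effect of the re-ordering can be modelled in userspace. The following is a minimal Python sketch, not the driver code: the function and parameter names are hypothetical, and only the warning text is taken from the message above.

```python
def powernow_k8_init_warnings(cpu_supported, active_driver, model_check_first):
    """Return the warnings a simplified init path would emit (hypothetical model)."""
    warnings = []

    def check_active_driver():
        # Warn when a driver other than acpi-cpufreq is already active.
        if active_driver not in (None, "acpi-cpufreq"):
            warnings.append(f"WTF driver: {active_driver}")

    if model_check_first:
        # New order: leave early on CPUs powernow-k8 cannot handle.
        if not cpu_supported:
            return warnings
        check_active_driver()
    else:
        # Old order: the driver check runs before the CPU model is validated.
        check_active_driver()
        if not cpu_supported:
            return warnings
    return warnings
```

With the old order an unsupported CPU running amd-pstate still produces the warning; with the new order the list stays empty.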

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-17 14:54:11 +01:00
Rafael J. Wysocki
03eb65224e cpuidle: intel_idle: Drop redundant backslash at line end
Drop a redundant backslash character at the end of a line in the
spr_cstates[] definition.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2022-03-17 14:32:59 +01:00
Rafael J. Wysocki
a335b1e6bb cpuidle: intel_idle: Update intel_idle() kerneldoc comment
Commit bf9282dc26e7 ("cpuidle: Make CPUIDLE_FLAG_TLB_FLUSHED generic")
moved the leave_mm() call away from intel_idle(), but it didn't update
its kerneldoc comment accordingly, so do that now.

Fixes: bf9282dc26e7 ("cpuidle: Make CPUIDLE_FLAG_TLB_FLUSHED generic")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-17 14:32:59 +01:00
David Woodhouse
f6c46b1d62 PM: hibernate: Honour ACPI hardware signature by default for virtual guests
The ACPI specification says that OSPM should refuse to restore from
hibernate if the hardware signature changes, and should boot from
scratch. However, real BIOSes often vary the hardware signature in cases
where we *do* want to resume from hibernate, so Linux doesn't follow the
spec by default.

However, in a virtual environment there's no reason for the VMM to vary
the hardware signature *unless* it wants to trigger a clean reboot as
defined by the ACPI spec. So enable the check by default if a hypervisor
is detected.
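
The resulting default can be summarised as a small decision function. This is a Python sketch under stated assumptions: `override` stands for an explicit administrator choice and is a hypothetical parameter, not a name from the patch.

```python
def honour_hw_signature(hypervisor_detected, override=None):
    """Decide whether the ACPI hardware signature check is applied."""
    if override is not None:
        # Hypothetical explicit setting wins over the default.
        return override
    # Default described above: only check when running under a hypervisor.
    return hypervisor_detected


def may_resume_from_hibernate(saved_sig, current_sig, hypervisor_detected, override=None):
    """Return False when the kernel should boot from scratch instead of resuming."""
    if honour_hw_signature(hypervisor_detected, override) and saved_sig != current_sig:
        return False
    return True
```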

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-16 19:29:32 +01:00
Srinivas Pandruvada
3d13058ed2 cpufreq: intel_pstate: Use firmware default EPP
For some specific platforms (e.g. Alder Lake) the balance performance
EPP is updated from the hard coded value in the driver. This acts as
the default and balance_performance EPP. The purpose of this EPP
update is to reach maximum 1 core turbo frequency (when possible) out
of the box.

Although we can achieve the objective by using a hard coded value in the
driver, there can be other EPP values which are better in terms of power.
But that is very subjective and depends on the platform and use cases.
It is not practical to have a platform specific default hard coded
in the driver.

If a platform wants to specify default EPP, it can be set in the firmware.
If this EPP is not the chipset default of 0x80 (balance_perf_epp unless
driver changed it) and more performance oriented but not 0, the driver
can use this as the default and balanced_perf EPP. In this case no driver
update is required every time there is some new platform and default EPP.

If the firmware didn't update the EPP from the chipset default then
the hard coded value is used as per existing implementation.
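
The selection rule described above can be sketched as follows. This is an illustrative Python model, not the driver code; it assumes that a numerically lower EPP is more performance oriented, and the hard coded value used in the tests is a made-up example.

```python
CHIPSET_DEFAULT_EPP = 0x80  # chipset default named in the message above


def default_balance_perf_epp(firmware_epp, hardcoded_epp):
    """Pick the EPP used as the default and balance_performance value."""
    # Use the firmware value only if it differs from the chipset default,
    # is more performance oriented (lower), and is not 0.
    if 0 < firmware_epp < CHIPSET_DEFAULT_EPP:
        return firmware_epp
    # Otherwise keep the existing behaviour: the value hard coded in the driver.
    return hardcoded_epp
```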

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-16 19:14:55 +01:00
Lianjie Zhang
85750bcd48 cpufreq: unify show() and store() naming and use __ATTR_XX
Usually, sysfs attributes have .show and .store and their naming
convention is filename_show() and filename_store().

But in cpufreq the naming convention of these functions is
show_filename() and store_filename() which prevents __ATTR_RW() and
__ATTR_RO() from being used in there to simplify code.

Accordingly, change the naming convention of the sysfs .show and
.store methods in cpufreq to follow the one expected by __ATTR_RW()
and __ATTR_RO() and use these macros in that code.
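
Why the naming matters: the macros build the callback names from the attribute name by token pasting. The following Python sketch only models that name derivation; the attribute name used in the tests is an example.

```python
def attr_rw(name):
    """Model of __ATTR_RW(name): expects name_show() and name_store() to exist."""
    return {"name": name, "mode": 0o644,
            "show": f"{name}_show", "store": f"{name}_store"}


def attr_ro(name):
    """Model of __ATTR_RO(name): expects name_show() to exist."""
    return {"name": name, "mode": 0o444,
            "show": f"{name}_show", "store": None}
```

A function called `show_scaling_min_freq()` would therefore never be found by the macro, while `scaling_min_freq_show()` would.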

Signed-off-by: Lianjie Zhang <zhanglianjie@uniontech.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-10 19:55:05 +01:00
Dmitry Baryshkov
524bb1da78 PM: core: keep irq flags in device_pm_check_callbacks()
The function device_pm_check_callbacks() can be called under a spin
lock (in the reported case it happens from genpd_add_device() ->
dev_pm_domain_set(), when the genpd uses spinlocks rather than mutexes).

However, this function unconditionally uses spin_lock_irq() /
spin_unlock_irq(), thus not preserving the CPU flags. Use the
irqsave/irqrestore variants instead.
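
The difference between the two locking variants can be illustrated with a toy model of the interrupt-enable flag. This is a Python sketch for illustration only; the functions mimic the semantics of the kernel primitives of the same name but are not kernel code.

```python
class Cpu:
    def __init__(self):
        self.irqs_enabled = True


def spin_lock_irq(cpu):
    cpu.irqs_enabled = False


def spin_unlock_irq(cpu):
    # Unconditionally re-enables interrupts, whatever the state was before.
    cpu.irqs_enabled = True


def spin_lock_irqsave(cpu):
    flags = cpu.irqs_enabled
    cpu.irqs_enabled = False
    return flags


def spin_unlock_irqrestore(cpu, flags):
    # Restores exactly the state that was saved.
    cpu.irqs_enabled = flags


def check_callbacks_old(cpu):
    spin_lock_irq(cpu)
    spin_unlock_irq(cpu)


def check_callbacks_new(cpu):
    flags = spin_lock_irqsave(cpu)
    spin_unlock_irqrestore(cpu, flags)


def irqs_stay_disabled_for_caller(check):
    """Call `check` while the caller already holds an irqsave spinlock."""
    cpu = Cpu()
    caller_flags = spin_lock_irqsave(cpu)
    check(cpu)
    still_disabled = not cpu.irqs_enabled
    spin_unlock_irqrestore(cpu, caller_flags)
    return still_disabled
```

With the old variant the caller finds interrupts enabled inside its own critical section, which is what the warning reports.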

The backtrace for the reference:
[    2.752010] ------------[ cut here ]------------
[    2.756769] raw_local_irq_restore() called with IRQs enabled
[    2.762596] WARNING: CPU: 4 PID: 1 at kernel/locking/irqflag-debug.c:10 warn_bogus_irq_restore+0x34/0x50
[    2.772338] Modules linked in:
[    2.775487] CPU: 4 PID: 1 Comm: swapper/0 Tainted: G S                5.17.0-rc6-00384-ge330d0d82eff-dirty #684
[    2.781384] Freeing initrd memory: 46024K
[    2.785839] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    2.785841] pc : warn_bogus_irq_restore+0x34/0x50
[    2.785844] lr : warn_bogus_irq_restore+0x34/0x50
[    2.785846] sp : ffff80000805b7d0
[    2.785847] x29: ffff80000805b7d0 x28: 0000000000000000 x27: 0000000000000002
[    2.785850] x26: ffffd40e80930b18 x25: ffff7ee2329192b8 x24: ffff7edfc9f60800
[    2.785853] x23: ffffd40e80930b18 x22: ffffd40e80930d30 x21: ffff7edfc0dffa00
[    2.785856] x20: ffff7edfc09e3768 x19: 0000000000000000 x18: ffffffffffffffff
[    2.845775] x17: 6572206f74206465 x16: 6c696166203a3030 x15: ffff80008805b4f7
[    2.853108] x14: 0000000000000000 x13: ffffd40e809550b0 x12: 00000000000003d8
[    2.860441] x11: 0000000000000148 x10: ffffd40e809550b0 x9 : ffffd40e809550b0
[    2.867774] x8 : 00000000ffffefff x7 : ffffd40e809ad0b0 x6 : ffffd40e809ad0b0
[    2.875107] x5 : 000000000000bff4 x4 : 0000000000000000 x3 : 0000000000000000
[    2.882440] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff7edfc03a8000
[    2.889774] Call trace:
[    2.892290]  warn_bogus_irq_restore+0x34/0x50
[    2.896770]  _raw_spin_unlock_irqrestore+0x94/0xa0
[    2.901690]  genpd_unlock_spin+0x20/0x30
[    2.905724]  genpd_add_device+0x100/0x2d0
[    2.909850]  __genpd_dev_pm_attach+0xa8/0x23c
[    2.914329]  genpd_dev_pm_attach_by_id+0xc4/0x190
[    2.919167]  genpd_dev_pm_attach_by_name+0x3c/0xd0
[    2.924086]  dev_pm_domain_attach_by_name+0x24/0x30
[    2.929102]  psci_dt_attach_cpu+0x24/0x90
[    2.933230]  psci_cpuidle_probe+0x2d4/0x46c
[    2.937534]  platform_probe+0x68/0xe0
[    2.941304]  really_probe.part.0+0x9c/0x2fc
[    2.945605]  __driver_probe_device+0x98/0x144
[    2.950085]  driver_probe_device+0x44/0x15c
[    2.954385]  __device_attach_driver+0xb8/0x120
[    2.958950]  bus_for_each_drv+0x78/0xd0
[    2.962896]  __device_attach+0xd8/0x180
[    2.966843]  device_initial_probe+0x14/0x20
[    2.971144]  bus_probe_device+0x9c/0xa4
[    2.975092]  device_add+0x380/0x88c
[    2.978679]  platform_device_add+0x114/0x234
[    2.983067]  platform_device_register_full+0x100/0x190
[    2.988344]  psci_idle_init+0x6c/0xb0
[    2.992113]  do_one_initcall+0x74/0x3a0
[    2.996060]  kernel_init_freeable+0x2fc/0x384
[    3.000543]  kernel_init+0x28/0x130
[    3.004132]  ret_from_fork+0x10/0x20
[    3.007817] irq event stamp: 319826
[    3.011404] hardirqs last  enabled at (319825): [<ffffd40e7eda0268>] __up_console_sem+0x78/0x84
[    3.020332] hardirqs last disabled at (319826): [<ffffd40e7fd6d9d8>] el1_dbg+0x24/0x8c
[    3.028458] softirqs last  enabled at (318312): [<ffffd40e7ec90410>] _stext+0x410/0x588
[    3.036678] softirqs last disabled at (318299): [<ffffd40e7ed1bf68>] __irq_exit_rcu+0x158/0x174
[    3.045607] ---[ end trace 0000000000000000 ]---

Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-10 19:40:15 +01:00
Li RongQing
659b66e98b cpuidle: haltpoll: Call cpuidle_poll_state_init() later
Call cpuidle_poll_state_init() only if it is needed to avoid doing
useless work.

Signed-off-by: Li RongQing <lirongqing@baidu.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-09 19:59:45 +01:00
Youngjin Jang
a759de6991 PM: sleep: Add device name to suspend_report_result()
Currently, suspend_report_result() prints only function information.

If any driver uses a common PM function, nobody knows who exactly
called the failing function.

A device pointer is needed to recognize the failing device.

For example:

 PM: dpm_run_callback(): pnp_bus_suspend+0x0/0x10 returns 0
 PM: dpm_run_callback(): pci_pm_suspend+0x0/0x150 returns 0

becomes after the change:

 serial 00:05: PM: dpm_run_callback(): pnp_bus_suspend+0x0/0x10 returns 0
 pci 0000:00:01.3: PM: dpm_run_callback(): pci_pm_suspend+0x0/0x150 returns 0
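
The change in message layout can be reproduced with a small formatter. This Python sketch only models the string format shown above; the function names are hypothetical.

```python
def report_without_device(caller, callback, ret):
    """Old layout: only function information."""
    return f"PM: {caller}(): {callback} returns {ret}"


def report_with_device(dev_prefix, caller, callback, ret):
    """New layout: the device identification is prepended."""
    return f"{dev_prefix}: " + report_without_device(caller, callback, ret)
```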

Signed-off-by: Youngjin Jang <yj84.jang@samsung.com>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-08 19:57:01 +01:00
Artem Bityutskiy
3a9cf77b60 intel_idle: add core C6 optimization for SPR
Add a Sapphire Rapids Xeon C6 optimization, similar to what we have for Sky Lake
Xeon: if package C6 is disabled, adjust C6 exit latency and target residency to
match core C6 values, instead of using the default package C6 values.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-04 19:54:32 +01:00
Artem Bityutskiy
da0e58c038 intel_idle: add 'preferred_cstates' module argument
On Sapphire Rapids Xeon (SPR) the C1 and C1E states are basically mutually
exclusive - only one of them can be enabled. By default, 'intel_idle' driver
enables C1 and disables C1E. However, some users prefer to use C1E instead of
C1, because it saves more energy.

This patch adds a new module parameter ('preferred_cstates') for enabling C1E
and disabling C1. Here is the idea behind it.

1. This option has effect only for "mutually exclusive" C-states like C1 and
   C1E on SPR.
2. It does not have any effect on independent C-states, which do not require
   other C-states to be disabled (most states on most platforms as of today).
3. For mutually exclusive C-states, the 'intel_idle' driver always has a
   reasonable default, such as enabling C1 on SPR by default. On other
   platforms, the default may be different.
4. Users can override the default using the 'preferred_cstates' parameter.
5. The parameter accepts the preferred C-states bit-mask, similarly to the
   existing 'states_off' parameter.
6. This parameter is not limited to C1/C1E, and leaves room for supporting
   other mutually exclusive C-states, if they come in the future.

Today 'intel_idle' can only be compiled-in, which means that on SPR, in order
to disable C1 and enable C1E, users should boot with the following kernel
argument: intel_idle.preferred_cstates=4
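
How such a bit-mask might be interpreted is sketched below in Python. The bit positions are assumptions: bit 2 for C1E is inferred from the value 4 in the example above, bit 1 for C1 is a guess, and the rule that C1E wins only when C1 is not also requested is an assumption as well.

```python
C1_BIT = 1   # assumption, not stated in the message
C1E_BIT = 2  # consistent with preferred_cstates=4 meaning "prefer C1E"


def spr_registered_states(preferred_cstates=0):
    """Return the C-states a simplified SPR setup would register."""
    prefers_c1 = bool(preferred_cstates & (1 << C1_BIT))
    prefers_c1e = bool(preferred_cstates & (1 << C1E_BIT))
    if prefers_c1e and not prefers_c1:
        # User override: disable C1, enable C1E.
        return ["C1E", "C6"]
    # Default on SPR: C1 enabled, C1E disabled.
    return ["C1", "C6"]
```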

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-04 19:54:32 +01:00
Artem Bityutskiy
9edf3c0ffe intel_idle: add SPR support
Add Sapphire Rapids Xeon support.

Up until very recently, the C1 and C1E C-states were independent, but this
has changed in some new chips, including Sapphire Rapids Xeon (SPR). In these
chips the C1 and C1E states cannot be enabled at the same time. The "C1E
promotion" bit in 'MSR_IA32_POWER_CTL' also has its semantics changed a bit.

Here are the C1, C1E, and "C1E promotion" bit rules on Xeons before SPR.

1. If C1E promotion bit is disabled.
   a. C1  requests end up with C1  C-state.
   b. C1E requests end up with C1E C-state.
2. If C1E promotion bit is enabled.
   a. C1  requests end up with C1E C-state.
   b. C1E requests end up with C1E C-state.

Here are the C1, C1E, and "C1E promotion" bit rules on Sapphire Rapids Xeon.
1. If C1E promotion bit is disabled.
   a. C1  requests end up with C1 C-state.
   b. C1E requests end up with C1 C-state.
2. If C1E promotion bit is enabled.
   a. C1  requests end up with C1E C-state.
   b. C1E requests end up with C1E C-state.
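
The two rule sets above can be written as one lookup function. This is a Python sketch of the rules as listed, with hypothetical names.

```python
def resolved_cstate(request, c1e_promotion, is_spr):
    """Return the C-state a C1 or C1E request ends up in."""
    if request not in ("C1", "C1E"):
        raise ValueError("only C1 and C1E requests are modelled")
    if c1e_promotion:
        # Promotion enabled: both requests end up in C1E on all Xeons.
        return "C1E"
    if is_spr:
        # SPR with promotion disabled: both requests end up in C1.
        return "C1"
    # Pre-SPR with promotion disabled: the request is honoured as-is.
    return request
```

The only difference between the platforms is the C1E request with promotion disabled, which is why C1 and C1E cannot both be offered on SPR.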

Before SPR Xeon, the 'intel_idle' driver was disabling C1E promotion and was
exposing C1 and C1E as independent C-states. But on SPR, C1 and C1E cannot be
enabled at the same time.

This patch adds both C1 and C1E states. However, C1E is marked with the
"CPUIDLE_FLAG_UNUSABLE" flag, which means that it won't be registered by
default. The C1E promotion bit will be cleared, which means that by default
only C1 and C6 will be registered on SPR.

The next patch will add an option for enabling C1E and disabling C1 on SPR.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-04 19:54:31 +01:00
Douglas Anderson
b4060db925 PM: runtime: Have devm_pm_runtime_enable() handle pm_runtime_dont_use_autosuspend()
The PM Runtime docs say:

  Drivers in ->remove() callback should undo the runtime PM changes done
  in ->probe(). Usually this means calling pm_runtime_disable(),
  pm_runtime_dont_use_autosuspend() etc.

From grepping code, it's clear that many people aren't aware of the
need to call pm_runtime_dont_use_autosuspend().

When brainstorming solutions, one idea that came up was to leverage
the new-ish devm_pm_runtime_enable() function. The idea here is that:

 * When the devm action is called we know that the driver is being
   removed. It's the perfect time to undo the use_autosuspend.

 * The code of pm_runtime_dont_use_autosuspend() already handles the
   case of being called when autosuspend wasn't enabled.
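
The idea can be modelled with a list of managed cleanup actions. The following Python sketch is a toy model: the functions borrow the kernel names for readability, but the `Device` class and `driver_detach()` are hypothetical.

```python
class Device:
    def __init__(self):
        self.runtime_enabled = False
        self.use_autosuspend = False
        self.devm_actions = []  # cleanup callbacks, run in reverse order


def pm_runtime_use_autosuspend(dev):
    dev.use_autosuspend = True


def pm_runtime_dont_use_autosuspend(dev):
    # Harmless when autosuspend was never enabled.
    dev.use_autosuspend = False


def devm_pm_runtime_enable(dev):
    dev.runtime_enabled = True

    def undo():
        # The driver is going away: undo autosuspend, then disable runtime PM.
        pm_runtime_dont_use_autosuspend(dev)
        dev.runtime_enabled = False

    dev.devm_actions.append(undo)


def driver_detach(dev):
    while dev.devm_actions:
        dev.devm_actions.pop()()
```

A driver that calls only `devm_pm_runtime_enable()` and `pm_runtime_use_autosuspend()` in probe then leaves the device in a clean state on removal without extra code.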

Suggested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-04 18:26:54 +01:00
Mario Limonciello
eb087f3059 ACPI: processor idle: Check for architectural support for LPI
When `osc_pc_lpi_support_confirmed` is set through `_OSC` and `_LPI` is
populated then the cpuidle driver assumes that LPI is fully functional.

However currently the kernel only provides architectural support for LPI
on ARM.  This leads to high power consumption on X86 platforms that
otherwise try to enable LPI.

So probe whether or not LPI support is implemented before enabling LPI in
the kernel.  This is done by overloading `acpi_processor_ffh_lpi_probe` to
check whether it returns `-EOPNOTSUPP`. It also means that all future
implementations of `acpi_processor_ffh_lpi_probe` will need to follow
these semantics as well.
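
The probing convention can be sketched as follows. This is a Python model with hypothetical helper names; only the `-EOPNOTSUPP` convention comes from the message above.

```python
import errno


def ffh_lpi_probe_unimplemented(cpu):
    """Stand-in for an architecture without LPI support."""
    return -errno.EOPNOTSUPP


def ffh_lpi_probe_implemented(cpu):
    """Stand-in for an architecture that implements LPI."""
    return 0


def lpi_usable(osc_confirmed, lpi_populated, ffh_probe):
    """Enable LPI only if firmware offers it and the architecture supports it."""
    if not (osc_confirmed and lpi_populated):
        return False
    # A probe returning -EOPNOTSUPP means there is no architectural support.
    return ffh_probe(0) != -errno.EOPNOTSUPP
```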

Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-03 20:20:06 +01:00
Mario Limonciello
01f6c7338c cpuidle: PSCI: Move the has_lpi check to the beginning of the function
Currently the first thing checked is whether the PSCI cpu_suspend function
has been initialized.

Another change will be overloading `acpi_processor_ffh_lpi_probe` and
calling it sooner.  So make the `has_lpi` check the first thing checked
to prepare for that change.

Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-03 20:20:06 +01:00
Colin Ian King
55ddcd9f32 powercap: DTPM: Fix spelling mistake "initialze" -> "initialize"
There is a spelling mistake in a pr_info() message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-01 18:59:35 +01:00
kernel test robot
5bf19d0aa3 powercap: DTPM: dtpm_node_callback[] can be static
drivers/powercap/dtpm.c:525:22: warning: symbol 'dtpm_node_callback' was not declared. Should it be static?

Fixes: 3759ec678e89 ("powercap/drivers/dtpm: Add hierarchy creation")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-01 18:57:37 +01:00
Randy Dunlap
7a64ca17e4 PM: suspend: fix return value of __setup handler
If an invalid option is given for "test_suspend=<option>", the entire
string is added to init's environment, so return 1 instead of 0 from
the __setup handler.

  Unknown kernel command line parameters "BOOT_IMAGE=/boot/bzImage-517rc5
    test_suspend=invalid"

and

 Run /sbin/init as init process
   with arguments:
     /sbin/init
   with environment:
     HOME=/
     TERM=linux
     BOOT_IMAGE=/boot/bzImage-517rc5
     test_suspend=invalid

Fixes: 2ce986892faf ("PM / sleep: Enhance test_suspend option with repeat capability")
Fixes: 27ddcc6596e5 ("PM / sleep: Add state field to pm_states[] entries")
Fixes: a9d7052363a6 ("PM: Separate suspend to RAM functionality from core")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Igor Zhbanov <i.zhbanov@omprussia.ru>
Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-01 18:55:07 +01:00
Randy Dunlap
ba7ffcd4c4 PM: hibernate: fix __setup handler error handling
If an invalid value is used in "resumedelay=<seconds>", it is
silently ignored. Add a warning message and then let the __setup
handler return 1 to indicate that the kernel command line option
has been handled.

Fixes: 317cf7e5e85e3 ("PM / hibernate: convert simple_strtoul to kstrtoul")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Igor Zhbanov <i.zhbanov@omprussia.ru>
Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-01 18:51:11 +01:00
Srinivas Pandruvada
a644161ba1 Documentation: admin-guide: pm: Document uncore frequency scaling
Added documentation to configure uncore frequency limits in Intel
Xeon processors.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Clean up the document wording ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-01 16:46:53 +01:00
Jiapeng Chong
444e1154b2 PM: hibernate: Clean up non-kernel-doc comments
Address the following W=1 kernel build warning:

kernel/power/swap.c:120: warning: This comment starts with '/**', but
isn't a kernel-doc comment. Refer
Documentation/doc-guide/kernel-doc.rst.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-01 16:20:46 +01:00
Tom Rix
7dfe105dfc PM: sleep: wakeup: Fix typos in comments
Remove the second 'the'.
Replace the second 'of' with 'the'.
Replace 'couter' with 'counter'.

Signed-off-by: Tom Rix <trix@redhat.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-01 16:17:32 +01:00
Shawn Guo
f6bfe8b5b2 PM: domains: Fix sleep-in-atomic bug caused by genpd_debug_remove()
When a genpd with GENPD_FLAG_IRQ_SAFE gets removed, the following
sleep-in-atomic bug will be seen, as genpd_debug_remove() will be called
with a spinlock being held.

[    0.029183] BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1460
[    0.029204] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1, name: swapper/0
[    0.029219] preempt_count: 1, expected: 0
[    0.029230] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.17.0-rc4+ #489
[    0.029245] Hardware name: Thundercomm TurboX CM2290 (DT)
[    0.029256] Call trace:
[    0.029265]  dump_backtrace.part.0+0xbc/0xd0
[    0.029285]  show_stack+0x3c/0xa0
[    0.029298]  dump_stack_lvl+0x7c/0xa0
[    0.029311]  dump_stack+0x18/0x34
[    0.029323]  __might_resched+0x10c/0x13c
[    0.029338]  __might_sleep+0x4c/0x80
[    0.029351]  down_read+0x24/0xd0
[    0.029363]  lookup_one_len_unlocked+0x9c/0xcc
[    0.029379]  lookup_positive_unlocked+0x10/0x50
[    0.029392]  debugfs_lookup+0x68/0xac
[    0.029406]  genpd_remove.part.0+0x12c/0x1b4
[    0.029419]  of_genpd_remove_last+0xa8/0xd4
[    0.029434]  psci_cpuidle_domain_probe+0x174/0x53c
[    0.029449]  platform_probe+0x68/0xe0
[    0.029462]  really_probe+0x190/0x430
[    0.029473]  __driver_probe_device+0x90/0x18c
[    0.029485]  driver_probe_device+0x40/0xe0
[    0.029497]  __driver_attach+0xf4/0x1d0
[    0.029508]  bus_for_each_dev+0x70/0xd0
[    0.029523]  driver_attach+0x24/0x30
[    0.029534]  bus_add_driver+0x164/0x22c
[    0.029545]  driver_register+0x78/0x130
[    0.029556]  __platform_driver_register+0x28/0x34
[    0.029569]  psci_idle_init_domains+0x1c/0x28
[    0.029583]  do_one_initcall+0x50/0x1b0
[    0.029595]  kernel_init_freeable+0x214/0x280
[    0.029609]  kernel_init+0x2c/0x13c
[    0.029622]  ret_from_fork+0x10/0x20

It doesn't seem necessary to call genpd_debug_remove() with the lock, so
move it out from locking to fix the problem.
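
The fix can be illustrated with a toy model of atomic context. This is a Python sketch, not the genpd code; the `Context` class and the `*_old`/`*_new` functions are hypothetical.

```python
class Context:
    def __init__(self):
        self.in_atomic = False


def might_sleep(ctx):
    if ctx.in_atomic:
        raise RuntimeError("sleeping function called from invalid context")


def genpd_debug_remove(ctx):
    # The debugfs lookup may sleep.
    might_sleep(ctx)


def genpd_remove_old(ctx):
    ctx.in_atomic = True          # spinlock taken
    try:
        genpd_debug_remove(ctx)   # called with the lock held
    finally:
        ctx.in_atomic = False     # spinlock released


def genpd_remove_new(ctx):
    ctx.in_atomic = True          # spinlock taken
    ctx.in_atomic = False         # spinlock released
    genpd_debug_remove(ctx)       # called after the lock is dropped
```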

Fixes: 718072ceb211 ("PM: domains: create debugfs nodes when adding power domains")
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Cc: 5.11+ <stable@vger.kernel.org> # 5.11+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-01 16:00:46 +01:00
Ahmad Fatoum
9a6582b839 PM: domains: use dev_err_probe() to simplify error handling
dev_err_probe() reduces code size, makes the code easier to read,
and has the added benefit of recording the defer reason for later
readout. Use it where appropriate.

This also fixes an issue, where an error message in __genpd_dev_pm_attach
was not terminated by a line break.

Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-01 15:58:17 +01:00
Ulf Hansson
e7d90cfac5 PM: domains: Prevent power off for parent unless child is in deepest state
A PM domain managed by genpd may support multiple idlestates (power-off
states). During genpd_power_off() a genpd governor may be asked to select
one of the idlestates based upon the dev PM QoS constraints, for example.

However, there is a problem with the behaviour around this in genpd. More
precisely, a parent-domain is allowed to be powered off, no matter what
idlestate has been selected for the child-domain.

For the stm32mp1 platform from STMicro, this behaviour doesn't play well.
Instead, the parent-domain must not be powered off, unless the deepest
idlestate has been selected for the child-domain. As the current behaviour
in genpd is quite questionable anyway, let's simply change it into what is
needed by the stm32mp1 platform.

If it surprisingly turns out that other platforms may need a different
behaviour from genpd, then we will have to revisit this to find a way to
make it configurable.
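
The new rule can be expressed as a simple predicate. This Python sketch assumes that idlestates are ordered from shallowest to deepest, so that the deepest state has the highest index; the function name is hypothetical.

```python
def parent_may_power_off(children):
    """children: (selected_state_idx, state_count) for each child domain.

    The parent may power off only if every child selected its deepest state.
    """
    return all(idx == count - 1 for idx, count in children)
```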

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-01 15:55:41 +01:00
Rafael J. Wysocki
075c3c483c Merge back cpufreq changes for v5.18.
2022-02-28 20:47:57 +01:00
Linus Torvalds
7e57714cd0 Linux 5.17-rc6 v5.17-rc6
2022-02-27 14:36:33 -08:00
Linus Torvalds
52a0255467 A single fix for a regression caused by the recent PCI/MSI rework which
resulted in a recursive locking problem in the VMD driver. The cure is to
 cache the relevant information upfront instead of retrieving it at runtime.

Merge tag 'irq-urgent-2022-02-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fix from Thomas Gleixner:
 "A single fix for a regression caused by the recent PCI/MSI rework
  which resulted in a recursive locking problem in the VMD driver.

  The cure is to cache the relevant information upfront instead of
  retrieving it at runtime"

* tag 'irq-urgent-2022-02-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  PCI: vmd: Prevent recursive locking on interrupt allocation
2022-02-27 13:07:40 -08:00
Linus Torvalds
98f3e84f8d dma-mapping fix for Linux 5.17
- fix a swiotlb info leak (Halil Pasic)

Merge tag 'dma-mapping-5.17-1' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping fix from Christoph Hellwig:

 - fix a swiotlb info leak (Halil Pasic)

* tag 'dma-mapping-5.17-1' of git://git.infradead.org/users/hch/dma-mapping:
  swiotlb: fix info leak with DMA_FROM_DEVICE
2022-02-27 12:42:37 -08:00
Linus Torvalds
6676ba2a6d Pin control fixes for the v5.17 series:
- Fix some drive strength and pull-up code in the K210 driver.
 
 - Add the Alder Lake-M ACPI ID so it starts to work properly.
 
 - Use a static name for the StarFive GPIO irq_chip, forestalling
   an upcoming fixes series from Marc Zyngier.
 
 - Fix an ages old bug in the Tegra 186 driver where we were
   indexing at random into struct and being lucky getting the
   right member.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAmIaztAACgkQQRCzN7AZ
 XXOp+xAAvF525eqPH/wAvoj0p7KbpQQJ9gYpqCL4FNp+t/Wj3q+7ihgy/Y4iQ6wC
 vU72M8OQ+i7t/5R8l1cJUj/f/OA2+icNeD5L1+DD+4RB52wQvdbjz7XDqVqEHSFG
 YnV9YJFGQ3Tr62HU2MrWUCxsY13J6YHlRWHTcMoM0/fcRSHaDYU5mlgGPQV6fb94
 WViv+PZncebY9PeyNm/wIpqL/VHqLI5fcaHz+0u6ppNTF7rGRBv7La/Du0mTbmlw
 rofbc2ynv+gIERyZBZ26UepBid2ZY4qaBzNy5S4srNeY8odlE+C9qCi/UcC3j3aM
 1UgsiuZKvn7arR7uR6cKPQSeIEHS25zxbL+FXPa5wtg9KrNhZUG7LG1IB3M7jcK4
 CiNj7zm9Zuy/qGGbMNWmmqpFk8ueL2fq7oE6K8oQa9HxzMFd48sB0Ckhyt5PCOEV
 zcLEo/WeIp3BqOJ4vZQquWO0lcEZr+2SeiGUaUJYwfZI7K2Myrc66hxKVUzBB+EK
 QWOQj+2W15qBkZd49ygQMJK5D8CQPBkT66AjqtZHr/7jk5H4S0oyyhJHyEWMPcSW
 oEk7UxKGfULG+zPfg6tCKQSN/QEyF9V2DZ5Ve13klWYZwR6uTZIGhQ2ZVWhJH7DN
 KXmTGOLGKUp1xR17t8hrAg60WPRoltofY21U/XiABDMGHgLmyYk=
 =es6M
 -----END PGP SIGNATURE-----

Merge tag 'pinctrl-v5-17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:

 - Fix some drive strength and pull-up code in the K210 driver.

 - Add the Alder Lake-M ACPI ID so it starts to work properly.

 - Use a static name for the StarFive GPIO irq_chip, forestalling an
   upcoming fixes series from Marc Zyngier.

 - Fix an ages old bug in the Tegra 186 driver where we were indexing at
   random into struct and being lucky getting the right member.

* tag 'pinctrl-v5-17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  gpio: tegra186: Fix chip_data type confusion
  pinctrl: starfive: Use a static name for the GPIO irq_chip
  pinctrl: tigerlake: Revert "Add Alder Lake-M ACPI ID"
  pinctrl: k210: Fix bias-pull-up
  pinctrl: fix loop in k210_pinconf_get_drive()
2022-02-27 12:30:54 -08:00
Linus Torvalds
2293be58d6 Tracing fixes for 5.17:
- rtla (Real-Time Linux Analysis tool): fix typo in man page
 
  - rtla: Update API -e to -E before it is released
 
  - rtla: Error message fix and memory leak fix
 
  - Partially uninline trace event soft disable to shrink text
 
  - Fix function graph start up test
 
  - Have triggers affect the trace instance they are in and not top level
 
  - Have osnoise sleep in the units it says it uses
 
  - Remove unused ftrace stub function
 
  - Remove event probe redundant info from event in the buffer
 
  - Fix group ownership setting in tracefs
 
  - Ensure trace buffer is minimum size to prevent crashes
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYho7XBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qiOhAQDbCbEjIYwkGCpckuGgSQiMU4bAWUzk
 jCz9PoaTxoIWJwEAsLWrAPb0pDzNwdEKjiC3fJoUJhz3NwlEjJ7hQ3BxzAI=
 =iXOQ
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:

 - rtla (Real-Time Linux Analysis tool):
    - fix typo in man page
    - Update API -e to -E before it is released
    - Error message fix and memory leak fix

 - Partially uninline trace event soft disable to shrink text

 - Fix function graph start up test

 - Have triggers affect the trace instance they are in and not top level

 - Have osnoise sleep in the units it says it uses

 - Remove unused ftrace stub function

 - Remove event probe redundant info from event in the buffer

 - Fix group ownership setting in tracefs

 - Ensure trace buffer is minimum size to prevent crashes

* tag 'trace-v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  rtla/osnoise: Fix error message when failing to enable trace instance
  rtla/osnoise: Free params at the exit
  rtla/hist: Make -E the short version of --entries
  tracing: Fix selftest config check for function graph start up test
  tracefs: Set the group ownership in apply_options() not parse_options()
  tracing/osnoise: Make osnoise_main to sleep for microseconds
  ftrace: Remove unused ftrace_startup_enable() stub
  tracing: Ensure trace buffer is at least 4096 bytes large
  tracing: Uninline trace_trigger_soft_disabled() partly
  eprobes: Remove redundant event type information
  tracing: Have traceon and traceoff trigger honor the instance
  tracing: Dump stacktrace trigger to the corresponding instance
  rtla: Fix systme -> system typo on man page
2022-02-26 12:10:17 -08:00
Linus Torvalds
e41898d2ba memblock: use kfree() to release kmalloced memblock regions
memblock.{reserved,memory}.regions may be allocated using kmalloc() in
 memblock_double_array(). Use kfree() to release these kmalloced regions.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEeOVYVaWZL5900a/pOQOGJssO/ZEFAmIZ1qgTHHJwcHRAbGlu
 dXguaWJtLmNvbQAKCRA5A4Ymyw79kXsaB/0TnrLt98t/jPvVGinsnf7r3hXnNq7F
 8FXWqdUIBWRfiHVd74pX6VE4Be56BbMUUyRQWDfbjrluVFnBibA3qJhNmpIuwdSb
 9GESikUdEnuq0t059yPLupKvYY0ysq4OjNLWage+8tnA/TzlN/+t27c75iZWwGn2
 JbutM/j5YKnvAcqUVv/plLVIVrGz1RCaG0diYoY1vxrbpRCicmAI8LHTkK1Xtow2
 7YVkRuQWY+yJLOJ/SCst5pxy6cm3R96KvnaC9fg1Pp+8wVFrZp/hDsH8nObFccXq
 6zQTbXqS88VKOoNEcuqk2ITbFyghepPIBrliEmcI2h96OSdp6BtrNau7
 =UpBU
 -----END PGP SIGNATURE-----

Merge tag 'fixes-2022-02-26' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock

Pull memblock fix from Mike Rapoport:
 "Use kfree() to release kmalloced memblock regions

  memblock.{reserved,memory}.regions may be allocated using kmalloc()
  in memblock_double_array(). Use kfree() to release these kmalloced
  regions"

* tag 'fixes-2022-02-26' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
  memblock: use kfree() to release kmalloced memblock regions
2022-02-26 12:00:44 -08:00
Linus Torvalds
086ee11b03 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "12 patches.

  Subsystems affected by this patch series: MAINTAINERS, mailmap, memfd,
  and mm (hugetlb, kasan, hugetlbfs, pagemap, selftests, memcg, and
  slab)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  selftests/memfd: clean up mapping in mfd_fail_write
  mailmap: update Roman Gushchin's email
  MAINTAINERS, SLAB: add Roman as reviewer, git tree
  MAINTAINERS: add Shakeel as a memcg co-maintainer
  MAINTAINERS: remove Vladimir from memcg maintainers
  MAINTAINERS: add Roman as a memcg co-maintainer
  selftest/vm: fix map_fixed_noreplace test failure
  mm: fix use-after-free bug when mm->mmap is reused after being freed
  hugetlbfs: fix a truncation issue in hugepages parameter
  kasan: test: prevent cache merging in kmem_cache_double_destroy
  mm/hugetlb: fix kernel crash with hugetlb mremap
  MAINTAINERS: add sysctl-next git tree
2022-02-26 11:52:14 -08:00
Linus Torvalds
2c8c230eda RISC-V Fixes for 5.17-rc6
* A fix for the K210 sdcard defconfig, to avoid using a fixed delay for
   the root FS.
 * A fix to make sure there's a proper call frame for
   trace_hardirqs_{on,off}().
 
 ---
 
 There are a handful of additional fixes in flight, but not for this
 week.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAmIZHmQTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRDvTKFQLMurQT4ND/42sEQLhcQcDpdvFDX/0zBr1Y8RNy25
 7I9JBYmuTK5AfwmE52I/OcdCLE9bNELH1g+LMK/3amEqhkUtDelBb5UdC4TYfvRm
 SRlj75XKPxESMEW9EjU5BeAz+uDI4oMkOmDPyp+Xv/OayGrFQIPUTo75/SiOdlH7
 a2khiH4/OxqkVlOff3Ko96M4RNSUeUIEVSfrH4pgJC8n+031u02TvR1IIx5TT7ti
 W5YIMw6VZ32Gl5ByZaBMbs9pz+iOKDrn3UfnPrVpbs6P3389EmR4btJpqfzN9JeC
 UQzcx4rqoDzTtJvqkOxiR+Ig4nNJGyeYVvxaGH67MkD/nz6rS26adIs+xPGKjDCC
 TtFyLt4h2+JX+1kNiutTLrQLAaQO4N+LSkysIsoSr9wNGCdnSrAQRxoOLwIuTMBS
 61kRsBvuiuRJZQlbgkP2tiTug/8dYs4vQzPNeC5VO/c3MZB5/j2ykYdKBSsElrpi
 +br602CMdeqvT+M+pT9lWdxa8X9lbYVm1z3hx2FyRdbnYw3nbcQq/mGp8Ju9O5zt
 JXajwPFtUPpWXzm2CcRjeh+2GKoLetgVpwHAIOmnDd6meTXp/BEu12+o7c8vf3H7
 k12BPHVPH1gPklG2Oh8Z/UvevICd4AHlSJT7J19xh7fYVMaaALVQ71Nd2jFbv9N/
 eu8KKxR2pKXiPw==
 =1a16
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - A fix for the K210 sdcard defconfig, to avoid using a
   fixed delay for the root FS

 - A fix to make sure there's a proper call frame for
   trace_hardirqs_{on,off}().

* tag 'riscv-for-linus-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: fix oops caused by irqsoff latency tracer
  riscv: fix nommu_k210_sdcard_defconfig
2022-02-26 10:26:24 -08:00
Linus Torvalds
3bd9dd8138 Bug fixes for 5.17-rc4:
- Only call sync_filesystem when we're remounting the filesystem
    readonly, and actually check its return value.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAmIEnhUACgkQ+H93GTRK
 tOsn7g//e3R/lqpkx6jJtg6SqiC1KiI9euwD0wBdIvrCWSJZ6IdjOSvfRRG13vN0
 S1spU0joiLLlVhzLIQdysgZkRub57P0mRmq3zVpHYxMOWKBvPH1OmZtdu83HOiAv
 /zjy3tNFc/1ZaqHudAZv3+4780qMZtQTmL7DbgLnvFspCBf4PdBlT0d7Wbf982w8
 dMWF7Pla8DhLVFbMsGdyXlnGROz+pw3jofVwY9P6f+PaY37mo+lZu65GrTlNecnc
 QfTyX45VAWFO/XZtXm7pXCr8211eK2SnrOFZXZH9u3qxSD5vo1NWf9KPKVkYxc8/
 7icz+Yp5t61HQg3o0z7cNAQZp7CQl0BWz6gp2YXMWHS3ZJMnd6H4zTDBdV2MSA5/
 alT4kcwncRVcmHtFET7JAsnQkWNeREBqhqCRoAf8hW8uxpjkXw6sPop7+hbZtoJw
 VAp1TxbEMbPGTZb76Kw4nZt1eZ3SyJOl6ByzsJMxekEFiMYVh4yxO+a3Q6KNOkMM
 O62JpzdE1EeFgV7qmoZ8QzCZuD7z7KC99iv5QtyacFITCqv5y0h/RLGCsOwJ0EMc
 fJGKN7uQOZrBIJYInx53S7fCYGGMm0+HUUXMUatBe4RK3dADyqapLzQb0tCGamAf
 NQra6NotwfNq8SN+Sn17PJ1KifSRKfw6l7Q+6pt9LA2eVbr2jV4=
 =6ODm
 -----END PGP SIGNATURE-----

Merge tag 'xfs-5.17-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Darrick Wong:
 "Nothing exciting, just more fixes for not returning sync_filesystem
  error values (and eliding it when it's not necessary).

  Summary:

   - Only call sync_filesystem when we're remounting the filesystem
     readonly, and actually check its return value"

* tag 'xfs-5.17-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: only bother with sync_filesystem during readonly remount
2022-02-26 09:53:19 -08:00
Mike Kravetz
fda153c89a selftests/memfd: clean up mapping in mfd_fail_write
Running the memfd script ./run_hugetlbfs_test.sh will often end in error
as follows:

    memfd-hugetlb: CREATE
    memfd-hugetlb: BASIC
    memfd-hugetlb: SEAL-WRITE
    memfd-hugetlb: SEAL-FUTURE-WRITE
    memfd-hugetlb: SEAL-SHRINK
    fallocate(ALLOC) failed: No space left on device
    ./run_hugetlbfs_test.sh: line 60: 166855 Aborted                 (core dumped) ./memfd_test hugetlbfs
    opening: ./mnt/memfd
    fuse: DONE

If no hugetlb pages have been preallocated, run_hugetlbfs_test.sh will
allocate 'just enough' pages to run the test.  In the SEAL-FUTURE-WRITE
test the mfd_fail_write routine maps the file, but does not unmap.  As a
result, two hugetlb pages remain reserved for the mapping.  When the
fallocate call in the SEAL-SHRINK test attempts to allocate all hugetlb
pages, it is short by the two reserved pages.

Fix by making sure to unmap in mfd_fail_write.
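
A minimal sketch of that pattern, for illustration only and not the
selftest's actual code: the helper name can_upgrade_to_writable() and its
return convention are assumptions. It maps the file, performs one check,
and always unmaps before returning, so no pages stay reserved on behalf
of a leftover mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not taken from memfd_test.c): check whether a
 * read-only shared mapping of fd can be upgraded to writable.
 *
 * Returns 1 if mprotect() allowed the upgrade, 0 if it refused,
 * and -1 if the file could not be mapped at all.
 */
static int can_upgrade_to_writable(int fd, size_t len)
{
	void *p;
	int allowed;

	assert(len > 0);

	p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		return -1;

	allowed = (mprotect(p, len, PROT_READ | PROT_WRITE) == 0);

	/* The step the fix is about: drop the mapping once the check is done. */
	munmap(p, len);
	return allowed;
}
```

The point of the sketch is only that munmap() runs on every path where
mmap() succeeded, whatever the outcome of the check in between.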

Link: https://lkml.kernel.org/r/20220219004340.56478-1-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00
Roman Gushchin
9502bdbf34 mailmap: update Roman Gushchin's email
I'm moving to a @linux.dev account. Map my old addresses.

Link: https://lkml.kernel.org/r/20220221200006.416377-1-roman.gushchin@linux.dev
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00
Vlastimil Babka
7b0112f343 MAINTAINERS, SLAB: add Roman as reviewer, git tree
The slab code has an overlap with kmem accounting, where Roman has done
a lot of work recently and it would be useful to make sure he's CC'd on
patches that potentially affect it.  Thus add him as a reviewer for the
SLAB subsystem.

Also while at it, add the link to the slab git tree.

Link: https://lkml.kernel.org/r/20220222103104.13241-1-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00
Shakeel Butt
bb9d545499 MAINTAINERS: add Shakeel as a memcg co-maintainer
I have been contributing to and reviewing the memcg codebase for the last
couple of years.  So, making it official.

Link: https://lkml.kernel.org/r/20220224060148.4092228-1-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00
Vladimir Davydov
0a972e72e2 MAINTAINERS: remove Vladimir from memcg maintainers
Link: https://lkml.kernel.org/r/4ad1f8da49d7b71c84a0c15bd5347f5ce704e730.1645608825.git.vdavydov.dev@gmail.com
Signed-off-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00
Roman Gushchin
7d547dcf97 MAINTAINERS: add Roman as a memcg co-maintainer
Add myself as a memcg co-maintainer.  My primary focus over the last few
years was the kernel memory accounting stack, but I do work on some
other parts of the memory controller as well.

Link: https://lkml.kernel.org/r/20220221233951.659048-1-roman.gushchin@linux.dev
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00
Aneesh Kumar K.V
f39c58008d selftest/vm: fix map_fixed_noreplace test failure
On the latest RHEL the test fails because the executable is mapped at the
256MB address:

     # ./map_fixed_noreplace
    mmap() @ 0x10000000-0x10050000 p=0xffffffffffffffff result=File exists
    10000000-10010000 r-xp 00000000 fd:04 34905657                           /root/rpmbuild/BUILD/kernel-5.14.0-56.el9/linux-5.14.0-56.el9.ppc64le/tools/testing/selftests/vm/map_fixed_noreplace
    10010000-10020000 r--p 00000000 fd:04 34905657                           /root/rpmbuild/BUILD/kernel-5.14.0-56.el9/linux-5.14.0-56.el9.ppc64le/tools/testing/selftests/vm/map_fixed_noreplace
    10020000-10030000 rw-p 00010000 fd:04 34905657                           /root/rpmbuild/BUILD/kernel-5.14.0-56.el9/linux-5.14.0-56.el9.ppc64le/tools/testing/selftests/vm/map_fixed_noreplace
    10029b90000-10029bc0000 rw-p 00000000 00:00 0                            [heap]
    7fffbb510000-7fffbb750000 r-xp 00000000 fd:04 24534                      /usr/lib64/libc.so.6
    7fffbb750000-7fffbb760000 r--p 00230000 fd:04 24534                      /usr/lib64/libc.so.6
    7fffbb760000-7fffbb770000 rw-p 00240000 fd:04 24534                      /usr/lib64/libc.so.6
    7fffbb780000-7fffbb7a0000 r--p 00000000 00:00 0                          [vvar]
    7fffbb7a0000-7fffbb7b0000 r-xp 00000000 00:00 0                          [vdso]
    7fffbb7b0000-7fffbb800000 r-xp 00000000 fd:04 24514                      /usr/lib64/ld64.so.2
    7fffbb800000-7fffbb810000 r--p 00040000 fd:04 24514                      /usr/lib64/ld64.so.2
    7fffbb810000-7fffbb820000 rw-p 00050000 fd:04 24514                      /usr/lib64/ld64.so.2
    7fffd93f0000-7fffd9420000 rw-p 00000000 00:00 0                          [stack]
    Error: couldn't map the space we need for the test

Fix this by finding a free address using mmap instead of hardcoding
BASE_ADDRESS.
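
The approach can be sketched as follows. This is an illustration under
stated assumptions, not the patch itself: find_free_base() is a
hypothetical name. It asks the kernel for an anonymous PROT_NONE mapping
of the needed size, records where it landed, and unmaps it again so the
caller has an address range that was free a moment ago.

```c
#define _DEFAULT_SOURCE	/* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: return the start of a currently unused range of
 * `size` bytes, or 0 on failure.
 */
static unsigned long find_free_base(size_t size)
{
	void *addr;

	assert(size > 0);

	/* Let the kernel choose an unused range instead of hardcoding one. */
	addr = mmap(NULL, size, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (addr == MAP_FAILED)
		return 0;

	/* Release it again; the caller only needs the address. */
	if (munmap(addr, size) != 0)
		return 0;

	return (unsigned long)addr;
}
```

In a single-threaded test the returned range should stay free until the
test maps it; in a multi-threaded program another thread could claim it
in between, so the result is only a hint there.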

Link: https://lkml.kernel.org/r/20220217083417.373823-1-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Jann Horn <jannh@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00
Suren Baghdasaryan
f798a1d4f9 mm: fix use-after-free bug when mm->mmap is reused after being freed
oom reaping (__oom_reap_task_mm) relies on a 2 way synchronization with
exit_mmap.  First it relies on the mmap_lock to exclude from unlock
path[1], page tables tear down (free_pgtables) and vma destruction.
This alone is not sufficient because mm->mmap is never reset.

For historical reasons[2], MMF_OOM_SKIP is also set for oom victims before
the lock is taken.

The oom reaper only ever looks at oom victims so the whole scheme works
properly but process_mrelease can operate on any task (with fatal
signals pending) which doesn't really imply oom victims.  That means
that the MMF_OOM_SKIP part of the synchronization doesn't work and it
can see a task after the whole address space has been demolished and
traverse an already released mm->mmap list.  This leads to a use-after-free,
as correctly caught by the KASAN report.

Fix the issue by resetting mm->mmap so that MMF_OOM_SKIP synchronization
is not needed anymore.  The MMF_OOM_SKIP is not removed from exit_mmap
yet but it acts mostly as an optimization now.

[1] 27ae357fa82b ("mm, oom: fix concurrent munlock and oom reaper unmap, v3")
[2] 212925802454 ("mm: oom: let oom_reap_task and exit_mmap run concurrently")

[mhocko@suse.com: changelog rewrite]

Link: https://lore.kernel.org/all/00000000000072ef2c05d7f81950@google.com/
Link: https://lkml.kernel.org/r/20220215201922.1908156-1-surenb@google.com
Fixes: 64591e8605d6 ("mm: protect free_pgtables with mmap_lock write lock in exit_mmap")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reported-by: syzbot+2ccf63a4bd07cf39cab0@syzkaller.appspotmail.com
Suggested-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Rik van Riel <riel@surriel.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Rik van Riel <riel@surriel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jan Engelhardt <jengelh@inai.de>
Cc: Tim Murray <timmurray@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00
Liu Yuntao
e79ce98323 hugetlbfs: fix a truncation issue in hugepages parameter
When we specify a large number for node in hugepages parameter, it may
be parsed to another number due to truncation in this statement:

	node = tmp;

For example, add the following parameter to the command line:

	hugepagesz=1G hugepages=4294967297:5

and the kernel will allocate 5 hugepages for node 1 instead of ignoring it.

Move the validation check earlier to fix this issue, and slightly
simplify the condition.
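
The truncation can be illustrated with a standalone sketch. Everything
named here is an assumption made for the example (NR_NODES, both function
names, and the use of unsigned long long for the parsed value); it is not
the hugetlb code. The first function narrows to int before validating,
the second validates the full-width value first.

```c
#include <assert.h>

/* Hypothetical stand-in for the number of online NUMA nodes. */
#define NR_NODES 4

static_assert(sizeof(unsigned long long) > sizeof(int),
	      "the example needs a parsed type wider than int");

/* Problematic order: narrow to int first, validate afterwards. */
static int node_narrow_then_check(unsigned long long tmp)
{
	/*
	 * Converting an out-of-range value to int is implementation-defined;
	 * GCC and Clang keep the low 32 bits.
	 */
	int node = (int)tmp;

	if (node < 0 || node >= NR_NODES)
		return -1;
	return node;
}

/* Safer order: validate the full-width value, then narrow. */
static int node_check_then_narrow(unsigned long long tmp)
{
	if (tmp >= NR_NODES)
		return -1;
	return (int)tmp;
}
```

With these definitions, 4294967297 (2^32 + 1) would typically be accepted
as node 1 by the first function and rejected by the second.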

Link: https://lkml.kernel.org/r/20220209134018.8242-1-liuyuntao10@huawei.com
Fixes: b5389086ad7be0 ("hugetlbfs: extend the definition of hugepages parameter to support node allocation")
Signed-off-by: Liu Yuntao <liuyuntao10@huawei.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00
Andrey Konovalov
70effdc375 kasan: test: prevent cache merging in kmem_cache_double_destroy
With HW_TAGS KASAN and kasan.stacktrace=off, the cache created in the
kmem_cache_double_destroy() test might get merged with an existing one.
Thus, the first kmem_cache_destroy() call won't actually destroy it but
will only decrease the refcount.  This causes the test to fail.

Provide an empty constructor for the created cache to prevent the cache
from getting merged.

Link: https://lkml.kernel.org/r/b597bd434c49591d8af00ee3993a42c609dc9a59.1644346040.git.andreyknvl@google.com
Fixes: f98f966cd750 ("kasan: test: add test case for double-kmem_cache_destroy()")
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00
Aneesh Kumar K.V
db110a99d3 mm/hugetlb: fix kernel crash with hugetlb mremap
This fixes the below crash:

  kernel BUG at include/linux/mm.h:2373!
  cpu 0x5d: Vector: 700 (Program Check) at [c00000003c6e76e0]
      pc: c000000000581a54: pmd_to_page+0x54/0x80
      lr: c00000000058d184: move_hugetlb_page_tables+0x4e4/0x5b0
      sp: c00000003c6e7980
     msr: 9000000000029033
    current = 0xc00000003bd8d980
    paca    = 0xc000200fff610100   irqmask: 0x03   irq_happened: 0x01
      pid   = 9349, comm = hugepage-mremap
  kernel BUG at include/linux/mm.h:2373!
    move_hugetlb_page_tables+0x4e4/0x5b0 (link register)
    move_hugetlb_page_tables+0x22c/0x5b0 (unreliable)
    move_page_tables+0xdbc/0x1010
    move_vma+0x254/0x5f0
    sys_mremap+0x7c0/0x900
    system_call_exception+0x160/0x2c0

The kernel can't use huge_pte_offset before it sets the pte entry, because
a page table lookup checks for the huge PTE bit in the page table to
differentiate between a huge pte entry and a pointer to a pte page.  A
huge_pte_alloc won't mark the page table entry huge, and hence the kernel
should not use huge_pte_offset after a huge_pte_alloc.

Link: https://lkml.kernel.org/r/20220211063221.99293-1-aneesh.kumar@linux.ibm.com
Fixes: 550a7d60bd5e ("mm, hugepages: add mremap() support for hugepage backed vma")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00
Luis Chamberlain
bbcf7b0e2e MAINTAINERS: add sysctl-next git tree
Add a git tree for sysctls as there's been quite a bit of work lately to
remove all the sysctls out of kernel/sysctl.c and move them to their respective
places, so coordination has been needed to avoid conflicts.  This tree
will also help soak these changes on linux-next prior to getting to Linus.

Link: https://lkml.kernel.org/r/20220218182736.3694508-1-mcgrof@kernel.org
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00