1137 Commits

Author SHA1 Message Date
Linus Torvalds
4da9f33026 Support for FSGSBASE. Almost 5 years after the first RFC to support it,
this has been brought into a shape which is maintainable and actually
 works.
 
 This final version was done by Sasha Levin who took it up after Intel
 dropped the ball. Sasha discovered that the SGX (sic!) offerings out there
 ship rogue kernel modules enabling FSGSBASE behind the kernels back which
 opens an instantanious unpriviledged root hole.
 
 The FSGSBASE instructions provide a considerable speedup of the context
 switch path and enable user space to write GSBASE without kernel
 interaction. This enablement requires careful handling of the exception
 entries which go through the paranoid entry path as they cannot longer rely
 on the assumption that user GSBASE is positive (as enforced via prctl() on
 non FSGSBASE enabled systemn). All other entries (syscalls, interrupts and
 exceptions) can still just utilize SWAPGS unconditionally when the entry
 comes from user space. Converting these entries to use FSGSBASE has no
 benefit as SWAPGS is only marginally slower than WRGSBASE and locating and
 retrieving the kernel GSBASE value is not a free operation either. The real
 benefit of RD/WRGSBASE is the avoidance of the MSR reads and writes.
 
 The changes come with appropriate selftests and have held up in field
 testing against the (sanitized) Graphene-SGX driver.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl8pGnoTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoTYJD/9873GkwvGcc/Vq/dJH1szGTgFftPyZ
 c/Y9gzx7EGBPLo25BS820L+ZlynzXHDxExKfCEaD10TZfe5XIc1vYNR0J74M2NmK
 IBgEDstJeW93ai+rHCFRXIevhpzU4GgGYJ1MeeOgbVMN3aGU1g6HfzMvtF0fPn8Y
 n6fsLZa43wgnoTdjwjjikpDTrzoZbaL1mbODBzBVPAaTbim7IKKTge6r/iCKrOjz
 Uixvm3g9lVzx52zidJ9kWa8esmbOM1j0EPe7/hy3qH9DFo87KxEzjHNH3T6gY5t6
 NJhRAIfY+YyTHpPCUCshj6IkRudE6w/qjEAmKP9kWZxoJrvPCTWOhCzelwsFS9b9
 gxEYfsnaKhsfNhB6fi0PtWlMzPINmEA7SuPza33u5WtQUK7s1iNlgHfvMbjstbwg
 MSETn4SG2/ZyzUrSC06lVwV8kh0RgM3cENc/jpFfIHD0vKGI3qfka/1RY94kcOCG
 AeJd0YRSU2RqL7lmxhHyG8tdb8eexns41IzbPCLXX2sF00eKNkVvMRYT2mKfKLFF
 q8v1x7yuwmODdXfFR6NdCkGm9IU7wtL6wuQ8Nhu9UraFmcXo6X6FLJC18FqcvSb9
 jvcRP4XY/8pNjjf44JB8yWfah0xGQsaMIKQGP4yLv4j6Xk1xAQKH1MqcC7l1D2HN
 5Z24GibFqSK/vA==
 =QaAN
 -----END PGP SIGNATURE-----

Merge tag 'x86-fsgsbase-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fsgsbase from Thomas Gleixner:
 "Support for FSGSBASE. Almost 5 years after the first RFC to support
  it, this has been brought into a shape which is maintainable and
  actually works.

  This final version was done by Sasha Levin who took it up after Intel
  dropped the ball. Sasha discovered that the SGX (sic!) offerings out
  there ship rogue kernel modules enabling FSGSBASE behind the kernels
  back which opens an instantanious unpriviledged root hole.

  The FSGSBASE instructions provide a considerable speedup of the
  context switch path and enable user space to write GSBASE without
  kernel interaction. This enablement requires careful handling of the
  exception entries which go through the paranoid entry path as they
  can no longer rely on the assumption that user GSBASE is positive (as
  enforced via prctl() on non FSGSBASE enabled systemn).

  All other entries (syscalls, interrupts and exceptions) can still just
  utilize SWAPGS unconditionally when the entry comes from user space.
  Converting these entries to use FSGSBASE has no benefit as SWAPGS is
  only marginally slower than WRGSBASE and locating and retrieving the
  kernel GSBASE value is not a free operation either. The real benefit
  of RD/WRGSBASE is the avoidance of the MSR reads and writes.

  The changes come with appropriate selftests and have held up in field
  testing against the (sanitized) Graphene-SGX driver"

* tag 'x86-fsgsbase-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
  x86/fsgsbase: Fix Xen PV support
  x86/ptrace: Fix 32-bit PTRACE_SETREGS vs fsbase and gsbase
  selftests/x86/fsgsbase: Add a missing memory constraint
  selftests/x86/fsgsbase: Fix a comment in the ptrace_write_gsbase test
  selftests/x86: Add a syscall_arg_fault_64 test for negative GSBASE
  selftests/x86/fsgsbase: Test ptracer-induced GS base write with FSGSBASE
  selftests/x86/fsgsbase: Test GS selector on ptracer-induced GS base write
  Documentation/x86/64: Add documentation for GS/FS addressing mode
  x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2
  x86/cpu: Enable FSGSBASE on 64bit by default and add a chicken bit
  x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit
  x86/entry/64: Introduce the FIND_PERCPU_BASE macro
  x86/entry/64: Switch CR3 before SWAPGS in paranoid entry
  x86/speculation/swapgs: Check FSGSBASE in enabling SWAPGS mitigation
  x86/process/64: Use FSGSBASE instructions on thread copy and ptrace
  x86/process/64: Use FSBSBASE in switch_to() if available
  x86/process/64: Make save_fsgs_for_kvm() ready for FSGSBASE
  x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions
  x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions
  x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE
  ...
2020-08-04 21:16:22 -07:00
Linus Torvalds
0408497800 Power management updates for 5.9-rc1
- Make the Energy Model cover non-CPU devices (Lukasz Luba).
 
  - Add Ice Lake server idle states table to the intel_idle driver
    and eliminate a redundant static variable from it (Chen Yu,
    Rafael Wysocki).
 
  - Eliminate all W=1 build warnings from cpufreq (Lee Jones).
 
  - Add support for Sapphire Rapids and for Power Limit 4 to the
    Intel RAPL power capping driver (Sumeet Pawnikar, Zhang Rui).
 
  - Fix function name in kerneldoc comments in the idle_inject power
    capping driver (Yangtao Li).
 
  - Fix locking issues with cpufreq governors and drop a redundant
    "weak" function definition from cpufreq (Viresh Kumar).
 
  - Rearrange cpufreq to register non-modular governors at the
    core_initcall level and allow the default cpufreq governor to
    be specified in the kernel command line (Quentin Perret).
 
  - Extend, fix and clean up the intel_pstate driver (Srinivas
    Pandruvada, Rafael Wysocki):
 
    * Add a new sysfs attribute for disabling/enabling CPU
      energy-efficiency optimizations in the processor.
 
    * Make the driver avoid enabling HWP if EPP is not supported.
 
    * Allow the driver to handle numeric EPP values in the sysfs
      interface and fix the setting of EPP via sysfs in the active
      mode.
 
    * Eliminate a static checker warning and clean up a kerneldoc
      comment.
 
  - Clean up some variable declarations in the powernv cpufreq
    driver (Wei Yongjun).
 
  - Fix up the ->enter_s2idle callback definition to cover the case
    when it points to the same function as ->idle correctly (Neal
    Liu).
 
  - Rearrange and clean up the PSCI cpuidle driver (Ulf Hansson).
 
  - Make the PM core emit "changed" uevent when adding/removing the
    "wakeup" sysfs attribute of devices (Abhishek Pandit-Subedi).
 
  - Add a helper macro for declaring PM callbacks and use it in the
    MMC jz4740 driver (Paul Cercueil).
 
  - Fix white space in some places in the hibernate code and make the
    system-wide PM code use "const char *" where appropriate (Xiang
    Chen, Alexey Dobriyan).
 
  - Add one more "unsafe" helper macro to the freezer to cover the NFS
    use case (He Zhe).
 
  - Change the language in the generic PM domains framework to use
    parent/child terminology and clean up a typo and some comment
    fromatting in that code (Kees Cook, Geert Uytterhoeven).
 
  - Update the operating performance points OPP framework (Lukasz
    Luba, Andrew-sh.Cheng, Valdis Kletnieks):
 
    * Refactor dev_pm_opp_of_register_em() and update related drivers.
 
    * Add a missing function export.
 
    * Allow disabled OPPs in dev_pm_opp_get_freq().
 
  - Update devfreq core and drivers (Chanwoo Choi, Lukasz Luba, Enric
    Balletbo i Serra, Dmitry Osipenko, Kieran Bingham, Marc Zyngier):
 
    * Add support for delayed timers to the devfreq core and make the
      Samsung exynos5422-dmc driver use it.
 
    * Unify sysfs interface to use "df-" as a prefix in instance names
      consistently.
 
    * Fix devfreq_summary debugfs node indentation.
 
    * Add the rockchip,pmu phandle to the rk3399_dmc driver DT
      bindings.
 
    * List Dmitry Osipenko as the Tegra devfreq driver maintainer.
 
    * Fix typos in the core devfreq code.
 
  - Update the pm-graph utility to version 5.7 including a number of
    fixes related to suspend-to-idle (Todd Brandt).
 
  - Fix coccicheck errors and warnings in the cpupower utility (Shuah
    Khan).
 
  - Replace HTTP links with HTTPs ones in multiple places (Alexander
    A. Klimov).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl8oO24SHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRx7ZQP/0lQ0yABnASnwomdOH6+K/m7rvc+e9FE
 zx5pTDQswhU5tM7SQAIKqe0uSI+okF2UrBrT5onA16F+JUbnrbexJLazBPfVTTGF
 AKpKEQ7Wh69Wz+Y6cQZjm1dTuRL+dlBJuBrzR2tLSnONPMMHuFcO3xd7lgE9UAxC
 oGEf393taA6OqcUNRQIa2gqbq+k1qhKjeDucGkbOaoJ6CL0ZyWI+Tfw1WWaBBGv0
 /2wBd6V513OH8WtQCW6H3YpHmhYW6OwL8w19KyGcjPRGJaeaIP4W/Ng7mkvgL5ZB
 vZqg3XiufFV9uTe8W1NQaVv/NjlN256OteuK809aosTVjD0dhFkhBYg5TLu6HbQq
 C/NciZ+78oLedWLT73EUfw3NyS+V0jk6X2EIlBUwNi0Qw1B1pCifGOCKzWFFe5cr
 ci4xr4FG7dBkxScOxwFAU2s5TdPHLOkGkQtg4jZr0OYDrzkyLEdsnZEUjLPORo+0
 6EBXGfTOSy2CBHcYswRtzJr/1pUTzj7oejhTAMCCuYW2r3VyQtnYcVjlehtp20if
 6BfmGisk8nmtxlSm+/Y2FqKa4bNnSTMmr0UJQ+Rjp0tHs47QeucI0ORfZ5nPaBac
 +ptvIjWmn3xejT/+oAehpH9066Iuy66vzHdnj7x5+WAsmYS8n8OFtlBFkYELmLJB
 3xI5hIl7WtGo
 =8cUO
 -----END PGP SIGNATURE-----

Merge tag 'pm-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "The most significant change here is the extension of the Energy Model
  to cover non-CPU devices (as well as CPUs) from Lukasz Luba.

  There is also some new hardware support (Ice Lake server idle states
  table for intel_idle, Sapphire Rapids and Power Limit 4 support in the
  RAPL driver), some new functionality in the existing drivers (eg. a
  new switch to disable/enable CPU energy-efficiency optimizations in
  intel_pstate, delayed timers in devfreq), some assorted fixes (cpufreq
  core, intel_pstate, intel_idle) and cleanups (eg. cpuidle-psci,
  devfreq), including the elimination of W=1 build warnings from cpufreq
  done by Lee Jones.

  Specifics:

   - Make the Energy Model cover non-CPU devices (Lukasz Luba).

   - Add Ice Lake server idle states table to the intel_idle driver and
     eliminate a redundant static variable from it (Chen Yu, Rafael
     Wysocki).

   - Eliminate all W=1 build warnings from cpufreq (Lee Jones).

   - Add support for Sapphire Rapids and for Power Limit 4 to the Intel
     RAPL power capping driver (Sumeet Pawnikar, Zhang Rui).

   - Fix function name in kerneldoc comments in the idle_inject power
     capping driver (Yangtao Li).

   - Fix locking issues with cpufreq governors and drop a redundant
     "weak" function definition from cpufreq (Viresh Kumar).

   - Rearrange cpufreq to register non-modular governors at the
     core_initcall level and allow the default cpufreq governor to be
     specified in the kernel command line (Quentin Perret).

   - Extend, fix and clean up the intel_pstate driver (Srinivas
     Pandruvada, Rafael Wysocki):

       * Add a new sysfs attribute for disabling/enabling CPU
         energy-efficiency optimizations in the processor.

       * Make the driver avoid enabling HWP if EPP is not supported.

       * Allow the driver to handle numeric EPP values in the sysfs
         interface and fix the setting of EPP via sysfs in the active
         mode.

       * Eliminate a static checker warning and clean up a kerneldoc
         comment.

   - Clean up some variable declarations in the powernv cpufreq driver
     (Wei Yongjun).

   - Fix up the ->enter_s2idle callback definition to cover the case
     when it points to the same function as ->idle correctly (Neal Liu).

   - Rearrange and clean up the PSCI cpuidle driver (Ulf Hansson).

   - Make the PM core emit "changed" uevent when adding/removing the
     "wakeup" sysfs attribute of devices (Abhishek Pandit-Subedi).

   - Add a helper macro for declaring PM callbacks and use it in the MMC
     jz4740 driver (Paul Cercueil).

   - Fix white space in some places in the hibernate code and make the
     system-wide PM code use "const char *" where appropriate (Xiang
     Chen, Alexey Dobriyan).

   - Add one more "unsafe" helper macro to the freezer to cover the NFS
     use case (He Zhe).

   - Change the language in the generic PM domains framework to use
     parent/child terminology and clean up a typo and some comment
     fromatting in that code (Kees Cook, Geert Uytterhoeven).

   - Update the operating performance points OPP framework (Lukasz Luba,
     Andrew-sh.Cheng, Valdis Kletnieks):

       * Refactor dev_pm_opp_of_register_em() and update related drivers.

       * Add a missing function export.

       * Allow disabled OPPs in dev_pm_opp_get_freq().

   - Update devfreq core and drivers (Chanwoo Choi, Lukasz Luba, Enric
     Balletbo i Serra, Dmitry Osipenko, Kieran Bingham, Marc Zyngier):

       * Add support for delayed timers to the devfreq core and make the
         Samsung exynos5422-dmc driver use it.

       * Unify sysfs interface to use "df-" as a prefix in instance
         names consistently.

       * Fix devfreq_summary debugfs node indentation.

       * Add the rockchip,pmu phandle to the rk3399_dmc driver DT
         bindings.

       * List Dmitry Osipenko as the Tegra devfreq driver maintainer.

       * Fix typos in the core devfreq code.

   - Update the pm-graph utility to version 5.7 including a number of
     fixes related to suspend-to-idle (Todd Brandt).

   - Fix coccicheck errors and warnings in the cpupower utility (Shuah
     Khan).

   - Replace HTTP links with HTTPs ones in multiple places (Alexander A.
     Klimov)"

* tag 'pm-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (71 commits)
  cpuidle: ACPI: fix 'return' with no value build warning
  cpufreq: intel_pstate: Fix EPP setting via sysfs in active mode
  cpufreq: intel_pstate: Rearrange the storing of new EPP values
  intel_idle: Customize IceLake server support
  PM / devfreq: Fix the wrong end with semicolon
  PM / devfreq: Fix indentaion of devfreq_summary debugfs node
  PM / devfreq: Clean up the devfreq instance name in sysfs attr
  memory: samsung: exynos5422-dmc: Add module param to control IRQ mode
  memory: samsung: exynos5422-dmc: Adjust polling interval and uptreshold
  memory: samsung: exynos5422-dmc: Use delayed timer as default
  PM / devfreq: Add support delayed timer for polling mode
  dt-bindings: devfreq: rk3399_dmc: Add rockchip,pmu phandle
  PM / devfreq: tegra: Add Dmitry as a maintainer
  PM / devfreq: event: Fix trivial spelling
  PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent
  cpuidle: change enter_s2idle() prototype
  cpuidle: psci: Prevent domain idlestates until consumers are ready
  cpuidle: psci: Convert PM domain to platform driver
  cpuidle: psci: Fix error path via converting to a platform driver
  cpuidle: psci: Fail cpuidle registration if set OSI mode failed
  ...
2020-08-03 20:28:08 -07:00
Linus Torvalds
09a0bd0776 platform-drivers-x86 for v5.9-1
* ASUS WMI driver honors BAT1 name of the battery
   (quite a few new laptops are using it)
 * Dell WMI driver supports new key codes and backlight events
 * ThinkPad ACPI driver now may use standard charge threshold interface,
   it also has been updated to provide Laptop or Desktop mode to the user
 * Intel Speed Select Technology gained support on Sapphire Rapids platform
 * Regular update of Speed Select Technology tools
 * Mellanox has been updated to support complex attributes
 * PMC core driver has been fixed to show correct names for LPM0 register
 * HTTP links were replaced by HTTPS ones where it applies
 * Miscellaneous fixes and cleanups here and there
 
 The following is an automated git shortlog grouped by driver:
 
 acerhdf:
  -  Replace HTTP links with HTTPS ones
 
 Add new intel_atomisp2_led driver:
  - Add new intel_atomisp2_led driver
 
 apple-gmux:
  -  Replace HTTP links with HTTPS ones
 
 asus-nb-wmi:
  -  Drop duplicate DMI quirk structures
  -  add support for ASUS ROG Zephyrus G14 and G15
 
 asus-wmi:
  -  allow BAT1 battery name
 
 dell-wmi:
  -  add new dmi mapping for keycode 0xffff
  -  add new keymap type 0x0012
  -  add new backlight events
 
 intel_cht_int33fe:
  -  Drop double check for ACPI companion device
 
 intel-hid:
  -  Fix return value check in check_acpi_dev()
 
 intel_pmc_core:
  -  fix bound check in pmc_core_mphy_pg_show()
  -  update TGL's LPM0 reg bit map name
 
 intel-vbtn:
  -  Fix return value check in check_acpi_dev()
 
 ISST:
  -  drop a duplicated word in isst_if.h
  -  Add new PCI device ids
 
 pcengines-apuv2:
  -  revert wiring up simswitch GPIO as LED
 
 platform/mellanox:
  -  Introduce string_upper() and string_lower() helpers
  -  Add string_upper() and string_lower() tests
  -  Extend FAN platform data description
  -  Add more definitions for system attributes
  -  Add new attribute for mlxreg-io sysfs interfaces
  -  Add presence register field for FAN devices
  -  Add support for complex attributes
  -  mlxreg-io: Add support for complex attributes
  -  mlxreg-hotplug: Add environmental data to uevent
  -  mlxreg-hotplug: Use capability register for attribute creation
  -  mlxreg-hotplug: Modify module license
 
 system76-acpi:
  -  Fix brightness_set schedule while atomic
 
 thinkpad_acpi:
  -  Make some symbols static
  -  add documentation for battery charge control
  -  use standard charge control attribute names
  -  remove unused defines
  -  Replace HTTP links with HTTPS ones
  -  not loading brightness_init when _BCL invalid
  -  lap or desk mode interface
  -  Revert "Use strndup_user() in dispatch_proc_write()"
 
 tools/power/x86/intel-speed-select:
  -  Update version for v5.9
  -  Add retries for mail box commands
  -  Add option to delay mbox commands
  -  Ignore -o option processing on error
  -  Change path for caching topology info
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEqaflIX74DDDzMJJtb7wzTHR8rCgFAl8n9UIACgkQb7wzTHR8
 rCg0cRAAnHC699qMns53T+jSMAc9kY5vrwbf2Bacwjp5vYS7uw3S12kLNgQh6JWM
 B8imS1RAMlHfCvGT9vUs+mbpemCfYpFZ0HQ4z+VMzziKayJu8pCD9O2XDS9lRUHj
 uKqA19FX6lXfJnrhRoi3vM0JC35FXZQZkueLyJ728t1xqcDkNS2/O/by3c76XnK1
 r81jl9m+qMJwB5zt+6qeSWjZAgFCTDvOHA/gZTEKF1VBOcafN1RHWTLH49JtpBH7
 +pPoy9mw3zfDJReqKxFhaXgDlueBlRwTBG5kEQk9aIxgA9T8q2jqEyAw5XtAKOWp
 RCebgCpC5NHXyz/2CVzaJWYikS/Iuvr4ynx6F2nKXVClGMtF7Obb2/3rvSCOTMri
 ge1bJ2NomUL9F5Cd/iR/F4/jNtxk9LSuzW+otA+/m+g8UkK574cefQQmBdrhrfDU
 GqK6C1kB7OQW2v3tPKVsWvduXQTaaCVzINEkdUxK50PXZayY7+w0djM+QFR/sadw
 2/okr1zuNjFnEt7HtZRP5bzEm1ofXVo5JIaCnpwgsz6dCQH4NiuVxsc/GyanRpBJ
 AadUAYcD+PRRnOybsfhx5MlUt/JIr6cV0OvT8LlWPINJjTpWPlqMeBYdDUOLCndX
 3LXmoZ0k8IQ2LLmONc5UBX1+iayCO+PkHcBDoJdNuWT1eoO06M8=
 =yvnW
 -----END PGP SIGNATURE-----

Merge tag 'platform-drivers-x86-v5.9-1' of git://git.infradead.org/linux-platform-drivers-x86

Pull x86 platform driver updates from Andy Shevchenko:

 - ASUS WMI driver honors BAT1 name of the battery (quite a few new
   laptops are using it)

 - Dell WMI driver supports new key codes and backlight events

 - ThinkPad ACPI driver now may use standard charge threshold interface,
   it also has been updated to provide Laptop or Desktop mode to the
   user

 - Intel Speed Select Technology gained support on Sapphire Rapids
   platform

 - Regular update of Speed Select Technology tools

 - Mellanox has been updated to support complex attributes

 - PMC core driver has been fixed to show correct names for LPM0
   register

 - HTTP links were replaced by HTTPS ones where it applies

 - Miscellaneous fixes and cleanups here and there

* tag 'platform-drivers-x86-v5.9-1' of git://git.infradead.org/linux-platform-drivers-x86: (42 commits)
  platform/x86: asus-nb-wmi: Drop duplicate DMI quirk structures
  platform/x86: thinkpad_acpi: Make some symbols static
  platform/x86: thinkpad_acpi: add documentation for battery charge control
  platform/x86: thinkpad_acpi: use standard charge control attribute names
  platform/x86: thinkpad_acpi: remove unused defines
  platform/x86: ISST: drop a duplicated word in isst_if.h
  tools/power/x86/intel-speed-select: Update version for v5.9
  tools/power/x86/intel-speed-select: Add retries for mail box commands
  tools/power/x86/intel-speed-select: Add option to delay mbox commands
  tools/power/x86/intel-speed-select: Ignore -o option processing on error
  tools/power/x86/intel-speed-select: Change path for caching topology info
  platform/x86: acerhdf: Replace HTTP links with HTTPS ones
  platform/x86: apple-gmux: Replace HTTP links with HTTPS ones
  platform/x86: pcengines-apuv2: revert wiring up simswitch GPIO as LED
  platform/x86: mlx-platform: Extend FAN platform data description
  platform_data/mlxreg: Add presence register field for FAN devices
  Documentation/ABI: Add new attribute for mlxreg-io sysfs interfaces
  platform/mellanox: mlxreg-io: Add support for complex attributes
  platform/x86: mlx-platform: Add more definitions for system attributes
  platform_data/mlxreg: Add support for complex attributes
  ...
2020-08-03 18:02:53 -07:00
Linus Torvalds
e4cbce4d13 The main changes in this cycle were:
- Improve uclamp performance by using a static key for the fast path
 
  - Add the "sched_util_clamp_min_rt_default" sysctl, to optimize for
    better power efficiency of RT tasks on battery powered devices.
    (The default is to maximize performance & reduce RT latencies.)
 
  - Improve utime and stime tracking accuracy, which had a fixed boundary
    of error, which created larger and larger relative errors as the values
    become larger. This is now replaced with more precise arithmetics,
    using the new mul_u64_u64_div_u64() helper in math64.h.
 
  - Improve the deadline scheduler, such as making it capacity aware
 
  - Improve frequency-invariant scheduling
 
  - Misc cleanups in energy/power aware scheduling
 
  - Add sched_update_nr_running tracepoint to track changes to nr_running
 
  - Documentation additions and updates
 
  - Misc cleanups and smaller fixes
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oJDURHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1ixLg//bqWzFlfWirvngTgDxDnplwUTyKXmMCcq
 R1IYhlyK2O5FxvhbRmdmW11W3yzyTPvgCs6Q/70negGaPNe2w1OxfxiK9NMKz5eu
 M1LoXas7pL5g7Pr/ZxxHk/8VqJLV4t9MkodiiInmV6lTaznT3sU6a/kpYQjJyFnG
 Tuu9jd6JhdRKmePDJnNmUBoGQ7JiOQDcX4HtkcQ3OA+An3624tmJzbW1yts+uj7J
 ZWo2EY60RfbA9MxQXGPOaR/nAjngWs4Q6tddAh10mftsPq1gR2iFUKju1d31MQt/
 RHLdiqJf+AyUC4popKG7a+7ilCKMBwPociSreTJNPyEUQ1X4AM3vUVk4yjUoiDph
 k2WdsCF8/JRdhXg0NnrpPUqOaAbQj53EeXnitEb92E7WyTZgLOvAtpV//xZo6utp
 2QHerfrQ9SoGQjz/ho78za5vQtV1x25yDhd+X4XV4QEhIy85G9/2JCpC/Kc/TXLf
 OO7A4X69XztKTEJhP60g8ldCPUe4N2vbh1vKY6oAD8AFQVVNZ6n7375/Qa//b0/k
 ++hcYkPc2EK97/aBFdvzDgqb7aUo7Mtn2ibke16sQU4szulaoRuAHQG4jdGKMwbD
 dk2VBoxyxeYFXWHsNneSe87+ha3sd0dSN0ul1EB/SlFrVELMvy634YXnMYGW8ima
 PzyPB0ezpuA=
 =PbO7
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Ingo Molnar:

 - Improve uclamp performance by using a static key for the fast path

 - Add the "sched_util_clamp_min_rt_default" sysctl, to optimize for
   better power efficiency of RT tasks on battery powered devices.
   (The default is to maximize performance & reduce RT latencies.)

 - Improve utime and stime tracking accuracy, which had a fixed boundary
   of error, which created larger and larger relative errors as the
   values become larger. This is now replaced with more precise
   arithmetics, using the new mul_u64_u64_div_u64() helper in math64.h.

 - Improve the deadline scheduler, such as making it capacity aware

 - Improve frequency-invariant scheduling

 - Misc cleanups in energy/power aware scheduling

 - Add sched_update_nr_running tracepoint to track changes to nr_running

 - Documentation additions and updates

 - Misc cleanups and smaller fixes

* tag 'sched-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
  sched/doc: Factorize bits between sched-energy.rst & sched-capacity.rst
  sched/doc: Document capacity aware scheduling
  sched: Document arch_scale_*_capacity()
  arm, arm64: Fix selection of CONFIG_SCHED_THERMAL_PRESSURE
  Documentation/sysctl: Document uclamp sysctl knobs
  sched/uclamp: Add a new sysctl to control RT default boost value
  sched/uclamp: Fix a deadlock when enabling uclamp static key
  sched: Remove duplicated tick_nohz_full_enabled() check
  sched: Fix a typo in a comment
  sched/uclamp: Remove unnecessary mutex_init()
  arm, arm64: Select CONFIG_SCHED_THERMAL_PRESSURE
  sched: Cleanup SCHED_THERMAL_PRESSURE kconfig entry
  arch_topology, sched/core: Cleanup thermal pressure definition
  trace/events/sched.h: fix duplicated word
  linux/sched/mm.h: drop duplicated words in comments
  smp: Fix a potential usage of stale nr_cpus
  sched/fair: update_pick_idlest() Select group with lowest group_util when idle_cpus are equal
  sched: nohz: stop passing around unused "ticks" parameter.
  sched: Better document ttwu()
  sched: Add a tracepoint to track rq->nr_running
  ...
2020-08-03 14:58:38 -07:00
Linus Torvalds
8f0cb6660a These are the latest RCU bits for v5.9:
- kfree_rcu updates
   - RCU tasks updates
   - Read-side scalability tests
   - SRCU updates
   - Torture-test updates
   - Documentation updates
   - Miscellaneous fixes
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8n80ERHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gauA/+NtuExW9V9cPDZ8AAp6x6QfoEIgqN4VEk
 pYuyP0+ZbmwH+h8z7qPqMrwxUHQnhef7gqtlWa7wj9MawbEbmqnA/3uivjX/3Aao
 bGMMXkqXppc6hgwktgLNk8vfq3LRVEH2P0i0I+Tymgxu3DCHSGRep4LWfdAS/q3z
 4pe5JXqdMx+Qnfy/bsVxJTaJAncMq1LQNAtWY1TIwK8L8RmpXrj5dvuLKUr7q+zl
 P+BfXyrdX+x05TpmHHnI/bR3w9yASL32E0S3IaQYRRqH8TsUIGHWe13Ib6hKXXG5
 j7W5KrsOgr0fQBxi+JW2fgGQkrua4o7yk4H2Ygj+Fi5RvP2uqNZdvXFAlP2cUMu/
 7Pg8+7kC6jKIrwpD03s9ZZzm0QN3jsCxFs2PEkkHMzjXbe1CI4tIkTH6ex1uvjR2
 v3OhCIp6ypxpEIJbFQucia0iQ4NF+evKjqCvRkbepqQ096jg+CNFh0VG0Tp8XR+y
 Gk9B9oXvLLPMd6ah5CI9nLJKiMWVRV8mvvqspoblGo//+39ksh4mzxm865tFXYg4
 C+DPJvKlY15Ib5eJ/xr8EZ/oS0K2sUF9sMYnK4P8QMhyTBMbpAZiljHYK+Wujt8I
 g/JCWxrEMv3LHPY9/guB5Nod/Qb4Jqqm9iE9qEX3MQxtt2O2nmmWd91pzFcUXlFU
 RDBWYJ63Okg=
 =rNhf
 -----END PGP SIGNATURE-----

Merge tag 'core-rcu-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RCU updates from Ingo Molnar:

 - kfree_rcu updates

 - RCU tasks updates

 - Read-side scalability tests

 - SRCU updates

 - Torture-test updates

 - Documentation updates

 - Miscellaneous fixes

* tag 'core-rcu-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (109 commits)
  torture: Remove obsolete "cd $KVM"
  torture: Avoid duplicate specification of qemu command
  torture: Dump ftrace at shutdown only if requested
  torture: Add kvm-tranform.sh script for qemu-cmd files
  torture: Add more tracing crib notes to kvm.sh
  torture: Improve diagnostic for KCSAN-incapable compilers
  torture: Correctly summarize build-only runs
  torture: Pass --kmake-arg to all make invocations
  rcutorture: Check for unwatched readers
  torture: Abstract out console-log error detection
  torture: Add a stop-run capability
  torture: Create qemu-cmd in --buildonly runs
  rcu/rcutorture: Replace 0 with false
  torture: Add --allcpus argument to the kvm.sh script
  torture: Remove whitespace from identify_qemu_vcpus output
  rcutorture: NULL rcu_torture_current earlier in cleanup code
  rcutorture: Handle non-statistic bang-string error messages
  torture: Set configfile variable to current scenario
  rcutorture: Add races with task-exit processing
  locktorture: Use true and false to assign to bool variables
  ...
2020-08-03 14:31:33 -07:00
Linus Torvalds
145ff1ec09 arm64 and cross-arch updates for 5.9:
- Removal of the tremendously unpopular read_barrier_depends() barrier,
   which is a NOP on all architectures apart from Alpha, in favour of
   allowing architectures to override READ_ONCE() and do whatever dance
   they need to do to ensure address dependencies provide LOAD ->
   LOAD/STORE ordering. This work also offers a potential solution if
   compilers are shown to convert LOAD -> LOAD address dependencies into
   control dependencies (e.g. under LTO), as weakly ordered architectures
   will effectively be able to upgrade READ_ONCE() to smp_load_acquire().
   The latter case is not used yet, but will be discussed further at LPC.
 
 - Make the MSI/IOMMU input/output ID translation PCI agnostic, augment
   the MSI/IOMMU ACPI/OF ID mapping APIs to accept an input ID
   bus-specific parameter and apply the resulting changes to the device
   ID space provided by the Freescale FSL bus.
 
 - arm64 support for TLBI range operations and translation table level
   hints (part of the ARMv8.4 architecture version).
 
 - Time namespace support for arm64.
 
 - Export the virtual and physical address sizes in vmcoreinfo for
   makedumpfile and crash utilities.
 
 - CPU feature handling cleanups and checks for programmer errors
   (overlapping bit-fields).
 
 - ACPI updates for arm64: disallow AML accesses to EFI code regions and
   kernel memory.
 
 - perf updates for arm64.
 
 - Miscellaneous fixes and cleanups, most notably PLT counting
   optimisation for module loading, recordmcount fix to ignore
   relocations other than R_AARCH64_CALL26, CMA areas reserved for
   gigantic pages on 16K and 64K configurations.
 
 - Trivial typos, duplicate words.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl8oTcsACgkQa9axLQDI
 XvEj6hAAkn39mO5xrR/Vhpg3DyFPk63ZlMSX9SsOeVyaLbovT6stTs1XAZXPpnkt
 rV3gwACyGSrqH6+uey9pHgHJuPF2TdrGEVK08yVKo9KGW/6yXSIncdKFE4jUJ/WJ
 wF5j7eMET2aGzcpm5AlzMmq6HOrKB8nZac9H8/x6H+Ox2WdgJkEjOkDvyqACUyum
 N3FsTZkWj2pIkTXHNgDZ8KjxVLO8HlFaB2hkxFDl9NPlX2UTCQJ8Tg1KiPLafKaK
 gUvH4usQDFdb5RU/UWogre37J4emO0ZTApZOyju+U+PMMWlWVHjZ4isUIS9zz/AE
 JNZ23dnKZX2HrYa5p8HZx175zwj/vXUqUHCZPLvQXaAudCEhF8BVljPiG0e80FV5
 GHFUgUbylKspp01I/9L+2JvsG96Mr0e+P3Sx7L2HTI42cmtoSa14+MpoSRj7zlft
 Qcl8hfrVOjCjUnFRHa/1y1cGvnD9GbgnKJR7zgVxl9bD/Jd48r1HUtwRORZCzWFr
 mRPVbPS72fWxMzMV9DZYJm02jJY9kLX2BMl49njbB8MhAhzOvrMVzoVVtMMeRFLR
 XHeJpmg36W09FiRGe7LRXlkXIhCQzQG2bJfiphuupCfhjRAitPoq8I925G6Pig60
 c8RWaXGU7PrEsdMNrL83vekvGKgqrkoFkRVtsCoQ2X6Hvu/XdYI=
 =mh79
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 and cross-arch updates from Catalin Marinas:
 "Here's a slightly wider-spread set of updates for 5.9.

  Going outside the usual arch/arm64/ area is the removal of
  read_barrier_depends() series from Will and the MSI/IOMMU ID
  translation series from Lorenzo.

  The notable arm64 updates include ARMv8.4 TLBI range operations and
  translation level hint, time namespace support, and perf.

  Summary:

   - Removal of the tremendously unpopular read_barrier_depends()
     barrier, which is a NOP on all architectures apart from Alpha, in
     favour of allowing architectures to override READ_ONCE() and do
     whatever dance they need to do to ensure address dependencies
     provide LOAD -> LOAD/STORE ordering.

     This work also offers a potential solution if compilers are shown
     to convert LOAD -> LOAD address dependencies into control
     dependencies (e.g. under LTO), as weakly ordered architectures will
     effectively be able to upgrade READ_ONCE() to smp_load_acquire().
     The latter case is not used yet, but will be discussed further at
     LPC.

   - Make the MSI/IOMMU input/output ID translation PCI agnostic,
     augment the MSI/IOMMU ACPI/OF ID mapping APIs to accept an input ID
     bus-specific parameter and apply the resulting changes to the
     device ID space provided by the Freescale FSL bus.

   - arm64 support for TLBI range operations and translation table level
     hints (part of the ARMv8.4 architecture version).

   - Time namespace support for arm64.

   - Export the virtual and physical address sizes in vmcoreinfo for
     makedumpfile and crash utilities.

   - CPU feature handling cleanups and checks for programmer errors
     (overlapping bit-fields).

   - ACPI updates for arm64: disallow AML accesses to EFI code regions
     and kernel memory.

   - perf updates for arm64.

   - Miscellaneous fixes and cleanups, most notably PLT counting
     optimisation for module loading, recordmcount fix to ignore
     relocations other than R_AARCH64_CALL26, CMA areas reserved for
     gigantic pages on 16K and 64K configurations.

   - Trivial typos, duplicate words"

Link: http://lkml.kernel.org/r/20200710165203.31284-1-will@kernel.org
Link: http://lkml.kernel.org/r/20200619082013.13661-1-lorenzo.pieralisi@arm.com

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (82 commits)
  arm64: use IRQ_STACK_SIZE instead of THREAD_SIZE for irq stack
  arm64/mm: save memory access in check_and_switch_context() fast switch path
  arm64: sigcontext.h: delete duplicated word
  arm64: ptrace.h: delete duplicated word
  arm64: pgtable-hwdef.h: delete duplicated words
  bus: fsl-mc: Add ACPI support for fsl-mc
  bus/fsl-mc: Refactor the MSI domain creation in the DPRC driver
  of/irq: Make of_msi_map_rid() PCI bus agnostic
  of/irq: make of_msi_map_get_device_domain() bus agnostic
  dt-bindings: arm: fsl: Add msi-map device-tree binding for fsl-mc bus
  of/device: Add input id to of_dma_configure()
  of/iommu: Make of_map_rid() PCI agnostic
  ACPI/IORT: Add an input ID to acpi_dma_configure()
  ACPI/IORT: Remove useless PCI bus walk
  ACPI/IORT: Make iort_msi_map_rid() PCI agnostic
  ACPI/IORT: Make iort_get_device_domain IRQ domain agnostic
  ACPI/IORT: Make iort_match_node_callback walk the ACPI namespace for NC
  arm64: enable time namespace support
  arm64/vdso: Restrict splitting VVAR VMA
  arm64/vdso: Handle faults on timens page
  ...
2020-08-03 14:11:08 -07:00
Linus Torvalds
382625d0d4 for-5.9/block-20200802
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl8m7YwQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpt+dEAC7a0HYuX2OrkyawBnsgd1QQR/soC7surec
 yDDa7SMM8cOq3935bfzcYHV9FWJszEGIknchiGb9R3/T+vmSohbvDsM5zgwya9u/
 FHUIuTq324I6JWXKl30k4rwjiX9wQeMt+WZ5gC8KJYCWA296i2IpJwd0A45aaKuS
 x4bTjxqknE+fD4gQiMUSt+bmuOUAp81fEku3EPapCRYDPAj8f5uoY7R2arT/POwB
 b+s+AtXqzBymIqx1z0sZ/XcdZKmDuhdurGCWu7BfJFIzw5kQ2Qe3W8rUmrQ3pGut
 8a21YfilhUFiBv+B4wptfrzJuzU6Ps0BXHCnBsQjzvXwq5uFcZH495mM/4E4OJvh
 SbjL2K4iFj+O1ngFkukG/F8tdEM1zKBYy2ZEkGoWKUpyQanbAaGI6QKKJA+DCdBi
 yPEb7yRAa5KfLqMiocm1qCEO1I56HRiNHaJVMqCPOZxLmpXj19Fs71yIRplP1Trv
 GGXdWZsccjuY6OljoXWdEfnxAr5zBsO3Yf2yFT95AD+egtGsU1oOzlqAaU1mtflw
 ABo452pvh6FFpxGXqz6oK4VqY4Et7WgXOiljA4yIGoPpG/08L1Yle4eVc2EE01Jb
 +BL49xNJVeUhGFrvUjPGl9kVMeLmubPFbmgrtipW+VRg9W8+Yirw7DPP6K+gbPAR
 RzAUdZFbWw==
 =abJG
 -----END PGP SIGNATURE-----

Merge tag 'for-5.9/block-20200802' of git://git.kernel.dk/linux-block

Pull core block updates from Jens Axboe:
 "Good amount of cleanups and tech debt removals in here, and as a
  result, the diffstat shows a nice net reduction in code.

   - Softirq completion cleanups (Christoph)

   - Stop using ->queuedata (Christoph)

   - Cleanup bd claiming (Christoph)

   - Use check_events, moving away from the legacy media change
     (Christoph)

   - Use inode i_blkbits consistently (Christoph)

   - Remove old unused writeback congestion bits (Christoph)

   - Cleanup/unify submission path (Christoph)

   - Use bio_uninit consistently, instead of bio_disassociate_blkg
     (Christoph)

   - sbitmap cleared bits handling (John)

   - Request merging blktrace event addition (Jan)

   - sysfs add/remove race fixes (Luis)

   - blk-mq tag fixes/optimizations (Ming)

   - Duplicate words in comments (Randy)

   - Flush deferral cleanup (Yufen)

   - IO context locking/retry fixes (John)

   - struct_size() usage (Gustavo)

   - blk-iocost fixes (Chengming)

   - blk-cgroup IO stats fixes (Boris)

   - Various little fixes"

* tag 'for-5.9/block-20200802' of git://git.kernel.dk/linux-block: (135 commits)
  block: blk-timeout: delete duplicated word
  block: blk-mq-sched: delete duplicated word
  block: blk-mq: delete duplicated word
  block: genhd: delete duplicated words
  block: elevator: delete duplicated word and fix typos
  block: bio: delete duplicated words
  block: bfq-iosched: fix duplicated word
  iocost_monitor: start from the oldest usage index
  iocost: Fix check condition of iocg abs_vdebt
  block: Remove callback typedefs for blk_mq_ops
  block: Use non _rcu version of list functions for tag_set_list
  blk-cgroup: show global disk stats in root cgroup io.stat
  blk-cgroup: make iostat functions visible to stat printing
  block: improve discard bio alignment in __blkdev_issue_discard()
  block: change REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL to be odd numbers
  block: defer flush request no matter whether we have elevator
  block: make blk_timeout_init() static
  block: remove retry loop in ioc_release_fn()
  block: remove unnecessary ioc nested locking
  block: integrate bd_start_claiming into __blkdev_get
  ...
2020-08-03 11:57:03 -07:00
Linus Torvalds
690b25675f fscrypt updates for 5.9
This release, we add support for inline encryption via the blk-crypto
 framework which was added in 5.8.  Now when an ext4 or f2fs filesystem
 is mounted with '-o inlinecrypt', the contents of encrypted files will
 be encrypted/decrypted via blk-crypto, instead of directly using the
 crypto API.  This model allows taking advantage of the inline encryption
 hardware that is integrated into the UFS or eMMC host controllers on
 most mobile SoCs.  Note that this is just an alternate implementation;
 the ciphertext written to disk stays the same.
 
 (This pull request does *not* include support for direct I/O on
 encrypted files, which blk-crypto makes possible, since that part is
 still being discussed.)
 
 Besides the above feature update, there are also a few fixes and
 cleanups, e.g. strengthening some memory barriers that may be too weak.
 
 All these patches have been in linux-next with no reported issues.  I've
 also tested them with the fscrypt xfstests, as usual.  It's also been
 tested that the inline encryption support works with the support for
 Qualcomm and Mediatek inline encryption hardware that will be in the
 scsi pull request for 5.9.  Also, several SoC vendors are already using
 a previous, functionally equivalent version of these patches.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCXye2EBQcZWJpZ2dlcnNA
 Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK0veAQCKEnwvy+M6s2/QWhC9vo01rABMtt7h
 VRAAKPiFzLNH3AD/dCnZNsFUzk3x0ZyiU1YRW3FvlxFOaEO7Ea0Pt/pyyQ0=
 =g9FK
 -----END PGP SIGNATURE-----

Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt

Pull fscrypt updates from Eric Biggers:
 "This release, we add support for inline encryption via the blk-crypto
  framework which was added in 5.8.

  Now when an ext4 or f2fs filesystem is mounted with '-o inlinecrypt',
  the contents of encrypted files will be encrypted/decrypted via
  blk-crypto, instead of directly using the crypto API. This model
  allows taking advantage of the inline encryption hardware that is
  integrated into the UFS or eMMC host controllers on most mobile SoCs.

  Note that this is just an alternate implementation; the ciphertext
  written to disk stays the same.

  (This pull request does *not* include support for direct I/O on
  encrypted files, which blk-crypto makes possible, since that part is
  still being discussed.)

  Besides the above feature update, there are also a few fixes and
  cleanups, e.g. strengthening some memory barriers that may be too
  weak.

  All these patches have been in linux-next with no reported issues.
  I've also tested them with the fscrypt xfstests, as usual. It's also
  been tested that the inline encryption support works with the support
  for Qualcomm and Mediatek inline encryption hardware that will be in
  the scsi pull request for 5.9. Also, several SoC vendors are already
  using a previous, functionally equivalent version of these patches"

* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
  fscrypt: don't load ->i_crypt_info before it's known to be valid
  fscrypt: document inline encryption support
  fscrypt: use smp_load_acquire() for ->i_crypt_info
  fscrypt: use smp_load_acquire() for ->s_master_keys
  fscrypt: use smp_load_acquire() for fscrypt_prepared_key
  fscrypt: switch fscrypt_do_sha256() to use the SHA-256 library
  fscrypt: restrict IV_INO_LBLK_* to AES-256-XTS
  fscrypt: rename FS_KEY_DERIVATION_NONCE_SIZE
  fscrypt: add comments that describe the HKDF info strings
  ext4: add inline encryption support
  f2fs: add inline encryption support
  fscrypt: add inline encryption support
  fs: introduce SB_INLINECRYPT
2020-08-03 10:09:59 -07:00
Ingo Molnar
c1cc4784ce Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull the v5.9 RCU bits from Paul E. McKenney:

 - Documentation updates
 - Miscellaneous fixes
 - kfree_rcu updates
 - RCU tasks updates
 - Read-side scalability tests
 - SRCU updates
 - Torture-test updates

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-07-31 00:15:53 +02:00
Qais Yousef
1f73d1abe5 Documentation/sysctl: Document uclamp sysctl knobs
Uclamp exposes 3 sysctl knobs:

	* sched_util_clamp_min
	* sched_util_clamp_max
	* sched_util_clamp_min_rt_default

Document them in sysctl/kernel.rst.

Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200716110347.19553-3-qais.yousef@arm.com
2020-07-29 13:51:48 +02:00
Barnabás Pőcze
6178129852 platform/x86: thinkpad_acpi: add documentation for battery charge control
Add a section to the Thinkpad ACPI extras driver documentation detailing
the provided features that may be used to modify battery charge related state.
As of yet, only charge_control_{start,end}_threshold attributes are supported
and documented.

Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-07-28 12:31:14 +03:00
Rafael J. Wysocki
80e3036866 Merge back cpufreq material for v5.9. 2020-07-27 12:34:55 +02:00
Boris Burkov
ef45fe470e blk-cgroup: show global disk stats in root cgroup io.stat
In order to improve consistency and usability in cgroup stat accounting,
we would like to support the root cgroup's io.stat.

Since the root cgroup has processes doing io even if the system has no
explicitly created cgroups, we need to be careful to avoid overhead in
that case.  For that reason, the rstat algorithms don't handle the root
cgroup, so just turning the file on wouldn't give correct statistics.

To get around this, we simulate flushing the iostat struct by filling it
out directly from global disk stats. The result is a root cgroup io.stat
file consistent with both /proc/diskstats and io.stat.

Note that in order to collect the disk stats, we needed to iterate over
devices. To facilitate that, we had to change the linkage of a disk_type
to external so that it can be used from blk-cgroup.c to iterate over
disks.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Boris Burkov <boris@bur.io>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-17 20:18:00 -06:00
Mark Pearson
acf7f4a591 platform/x86: thinkpad_acpi: lap or desk mode interface
Newer Lenovo Thinkpad platforms have support to identify whether the
system is on-lap or not using an ACPI DYTC event from the firmware.

This patch provides the ability to retrieve the current mode via sysfs
entrypoints and will be used by userspace for thermal mode and WWAN
functionality

Co-developed-by: Nitin Joshi <njoshi1@lenovo.com>
Signed-off-by: Nitin Joshi <njoshi1@lenovo.com>
Reviewed-by: Sugumaran <slacshiminar@lenovo.com>
Reviewed-by: Bastien Nocera <bnocera@redhat.com>
Signed-off-by: Mark Pearson <markpearson@lenovo.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2020-07-15 12:45:06 +03:00
Randy Dunlap
0bddd227f3 Documentation: update for gcc 4.9 requirement
Update Documentation for the gcc v4.9 upgrade requirement.

Fixes: 5429ef62bcf3 ("compiler/gcc: Raise minimum GCC version for kernel builds to 4.8")
Fixes: 6ec4476ac825 ("Raise gcc version requirement to 4.9")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-08 12:28:44 -07:00
Eric Biggers
4f74d15fe4 ext4: add inline encryption support
Wire up ext4 to support inline encryption via the helper functions which
fs/crypto/ now provides.  This includes:

- Adding a mount option 'inlinecrypt' which enables inline encryption
  on encrypted files where it can be used.

- Setting the bio_crypt_ctx on bios that will be submitted to an
  inline-encrypted file.

  Note: submit_bh_wbc() in fs/buffer.c also needed to be patched for
  this part, since ext4 sometimes uses ll_rw_block() on file data.

- Not adding logically discontiguous data to bios that will be submitted
  to an inline-encrypted file.

- Not doing filesystem-layer crypto on inline-encrypted files.

Co-developed-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Satya Tangirala <satyat@google.com>
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20200702015607.1215430-5-satyat@google.com
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-07-08 10:29:43 -07:00
Bhupesh Sharma
bbdbc11804 arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo
TCR_EL1.TxSZ, which controls the VA space size, is configured by a
single kernel image to support either 48-bit or 52-bit VA space.

If the ARMv8.2-LVA optional feature is present and we are running
with a 64KB page size, then it is possible to use 52-bits of address
space for both userspace and kernel addresses. However, any kernel
binary that supports 52-bit must also be able to fall back to 48-bit
at early boot time if the hardware feature is not present.

Since TCR_EL1.T1SZ indicates the size of the memory region addressed by
TTBR1_EL1, export the same in vmcoreinfo. User-space utilities like
makedumpfile and crash-utility need to read this value from vmcoreinfo
for determining if a virtual address lies in the linear map range.

While at it also add documentation for TCR_EL1.T1SZ variable being
added to vmcoreinfo.

It indicates the size offset of the memory region addressed by
TTBR1_EL1.

Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Tested-by: John Donnelly <john.p.donnelly@oracle.com>
Tested-by: Kamlakant Patel <kamlakantp@marvell.com>
Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Reviewed-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Dave Anderson <anderson@redhat.com>
Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: kexec@lists.infradead.org
Link: https://lore.kernel.org/r/1589395957-24628-3-git-send-email-bhsharma@redhat.com
[catalin.marinas@arm.com: removed vabits_actual from the commit log]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-02 17:56:49 +01:00
Bhupesh Sharma
1d50e5d0c5 crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
Right now user-space tools like 'makedumpfile' and 'crash' need to rely
on a best-guess method of determining value of 'MAX_PHYSMEM_BITS'
supported by underlying kernel.

This value is used in user-space code to calculate the bit-space
required to store a section for SPARESMEM (similar to the existing
calculation method used in the kernel implementation):

  #define SECTIONS_SHIFT    (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)

Now, regressions have been reported in user-space utilities
like 'makedumpfile' and 'crash' on arm64, with the recently added
kernel support for 52-bit physical address space, as there is
no clear method of determining this value in user-space
(other than reading kernel CONFIG flags).

As per suggestion from makedumpfile maintainer (Kazu), it makes more
sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself
rather than in arch-specific code, so that the user-space code for other
archs can also benefit from this addition to the vmcoreinfo and use it
as a standard way of determining 'SECTIONS_SHIFT' value in user-land.

A reference 'makedumpfile' implementation which reads the
'MAX_PHYSMEM_BITS' value from vmcoreinfo in a arch-independent fashion
is available here:

While at it also update vmcoreinfo documentation for 'MAX_PHYSMEM_BITS'
variable being added to vmcoreinfo.

'MAX_PHYSMEM_BITS' defines the maximum supported physical address
space memory.

Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Tested-by: John Donnelly <john.p.donnelly@oracle.com>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: Boris Petkov <bp@alien8.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: James Morse <james.morse@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Anderson <anderson@redhat.com>
Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Cc: x86@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: kexec@lists.infradead.org
Link: https://lore.kernel.org/r/1589395957-24628-2-git-send-email-bhsharma@redhat.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-02 17:56:11 +01:00
Quentin Perret
8412b4563e cpufreq: Specify default governor on command line
Currently, the only way to specify the default CPUfreq governor is
via Kconfig options, which suits users who can build the kernel
themselves perfectly.

However, for those who use a distro-like kernel (such as Android,
with the Generic Kernel Image project), the only way to use a
non-default governor is to boot to userspace, and to then switch
using the sysfs interface. Being able to specify the default governor
on the command line, like is the case for cpuidle, would allow those
users to specify their governor of choice earlier on, and to simplify
the userspace boot procedure slighlty.

To support this use-case, add a kernel command line parameter
allowing the default governor for CPUfreq to be specified, which
takes precedence over the built-in default.

This implementation has one notable limitation: the default governor
must be registered before the driver. This is solved for builtin
governors and drivers using appropriate *_initcall() functions. And
in the modular case, this must be reflected as a constraint on the
module loading order.

Signed-off-by: Quentin Perret <qperret@google.com>
[ Viresh: Converted 'default_governor' to a string and parsing it only
	  at initcall level, and several updates to
	  cpufreq_init_policy(). ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-07-02 13:03:30 +02:00
Srinivas Pandruvada
f473bf398b cpufreq: intel_pstate: Allow raw energy performance preference value
Currently using attribute "energy_performance_preference", user space can
write one of the four per-defined preference string. These preference
strings gets mapped to a hard-coded Energy-Performance Preference (EPP) or
Energy-Performance Bias (EPB) knob.

These four values are supposed to cover broad spectrum of use cases, but
are not uniformly distributed in the range. There are number of cases,
where this is not enough. For example:

Suppose user wants more performance when connected to AC. Instead of using
default "balance performance", the "performance" setting can be used. This
changes EPP value from 0x80 to 0x00. But setting EPP to 0, results in
electrical and thermal issues on some platforms. This results in
aggressive throttling, which causes a drop in performance. But some value
between 0x80 and 0x00 results in better performance. But that value can't
be fixed as the power curve is not linear. In some cases just changing EPP
from 0x80 to 0x75 is enough to get significant performance gain.

Similarly on battery the default "balance_performance" mode can be
aggressive in power consumption. But picking up the next choice
"balance power" results in too much loss of performance, which results in
bad user experience in use cases like "Google Hangout". It was observed
that some value between these two EPP is optimal.

This change allows fine grain EPP tuning for platform like Chromebook or
for users who wants to fine tune power and performance.
Here based on the product and use cases, different EPP values can be set.
This change is similar to the change done for:
/sys/devices/system/cpu/cpu*/power/energy_perf_bias
where user has choice to write a predefined string or raw value.

The change itself is trivial. When user preference doesn't match
predefined string preferences and value is an unsigned integer and in
range, use that value for EPP. When the EPP feature is not present
writing raw value is not supported.

Suggested-by: Len Brown <lenb@kernel.org>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-07-02 13:02:46 +02:00
Srinivas Pandruvada
ed7bde7a6d cpufreq: intel_pstate: Allow enable/disable energy efficiency
By default intel_pstate the driver disables energy efficiency by setting
MSR_IA32_POWER_CTL bit 19 for Kaby Lake desktop CPU model in HWP mode.
This CPU model is also shared by Coffee Lake desktop CPUs. This allows
these systems to reach maximum possible frequency. But this adds power
penalty, which some customers don't want. They want some way to enable/
disable dynamically.

So, add an additional attribute "energy_efficiency" under
/sys/devices/system/cpu/intel_pstate/ for these CPU models. This allows
to read and write bit 19 ("Disable Energy Efficiency Optimization") in
the MSR IA32_POWER_CTL.

This attribute is present in both HWP and non-HWP mode as this has an
effect in both modes. Refer to Intel Software Developer's manual for
details.

The scope of this bit is package wide. Also these systems are single
package systems. So read/write MSR on the current CPU is enough.

The energy efficiency (EE) bit setting needs to be preserved during
suspend/resume and CPU offline/online operation. To do this:
- Restoring the EE setting from the cpufreq resume() callback, if there
is change from the system default.
- By default, don't disable EE from cpufreq init() callback for matching
CPU models. Since the scope is package wide and is a single package
system, move the disable EE calls from init() callback to
intel_pstate_init() function, which is called only once.

Suggested-by: Len Brown <lenb@kernel.org>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-07-02 13:02:46 +02:00
Paul E. McKenney
13625c0a40 Merge branches 'doc.2020.06.29a', 'fixes.2020.06.29a', 'kfree_rcu.2020.06.29a', 'rcu-tasks.2020.06.29a', 'scale.2020.06.29a', 'srcu.2020.06.29a' and 'torture.2020.06.29a' into HEAD
doc.2020.06.29a:  Documentation updates.
fixes.2020.06.29a:  Miscellaneous fixes.
kfree_rcu.2020.06.29a:  kfree_rcu() updates.
rcu-tasks.2020.06.29a:  RCU Tasks updates.
scale.2020.06.29a:  Read-side scalability tests.
srcu.2020.06.29a:  SRCU updates.
torture.2020.06.29a:  Torture-test updates.
2020-06-29 12:03:15 -07:00
Paul E. McKenney
2102ad290a torture: Dump ftrace at shutdown only if requested
If there is a large number of torture tests running concurrently,
all of which are dumping large ftrace buffers at shutdown time, the
resulting dumping can take a very long time, particularly on systems
with rotating-rust storage.  This commit therefore adds a default-off
torture.ftrace_dump_at_shutdown module parameter that enables
shutdown-time ftrace-buffer dumping.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 12:01:45 -07:00
Paul E. McKenney
4a5f133c15 rcutorture: Add races with task-exit processing
Several variants of Linux-kernel RCU interact with task-exit processing,
including preemptible RCU, Tasks RCU, and Tasks Trace RCU.  This commit
therefore adds testing of this interaction to rcutorture by adding
rcutorture.read_exit_burst and rcutorture.read_exit_delay kernel-boot
parameters.  These kernel parameters control the frequency and spacing
of special read-then-exit kthreads that are spawned.

[ paulmck: Apply feedback from Dan Carpenter's static checker. ]
[ paulmck: Reduce latency to avoid false-positive shutdown hangs. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 12:01:44 -07:00
Paul E. McKenney
1fbeb3a8c4 refperf: Rename refperf.c to refscale.c and change internal names
This commit further avoids conflation of refperf with the kernel's perf
feature by renaming kernel/rcu/refperf.c to kernel/rcu/refscale.c,
and also by similarly renaming the functions and variables inside
this file.  This has the side effect of changing the names of the
kernel boot parameters, so kernel-parameters.txt and ver_functions.sh
are also updated.

The rcutorture --torture type remains refperf, and this will be
addressed in a separate commit.

Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 12:00:46 -07:00
Paul E. McKenney
847dd70aa9 doc: Document rcuperf's module parameters
This commit adds documentation for the rcuperf module parameters.

Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 12:00:45 -07:00
Uladzislau Rezki (Sony)
53c72b590b rcu/tree: cache specified number of objects
In order to reduce the dynamic need for pages in kfree_rcu(),
pre-allocate a configurable number of pages per CPU and link
them in a list. When kfree_rcu() reclaims objects, the object's
container page is cached into a list instead of being released
to the low-level page allocator.

Such an approach provides O(1) access to free pages while also
reducing the number of requests to the page allocator. It also
makes the kfree_rcu() code to have free pages available during
a low memory condition.

A read-only sysfs parameter (rcu_min_cached_objs) reflects the
minimum number of allowed cached pages per CPU.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 11:59:25 -07:00
Linus Torvalds
5e8eed279f - Quite a few DM zoned target fixes and a Zone append fix in DM core.
Considering the amount of dm-zoned changes that went in during the
   5.8 merge window these fixes are not that surprising.
 
 - A few DM writecache target fixes.
 
 - A fix to Documentation index to include DM ebs target docs.
 
 - Small cleanup to use struct_size() in DM core's retrieve_deps().
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEJfWUX4UqZ4x1O2wixSPxCi2dA1oFAl72SdYTHHNuaXR6ZXJA
 cmVkaGF0LmNvbQAKCRDFI/EKLZ0DWtzXB/4y42NJLrH8EXNW/bGIpPlexpbWRW29
 5z6V1VL/q/s9AWrDunLsvhMAtx38FjHJO6i8GQld/gKRtEbOMHfEItsyrQZqKoSr
 M1NzlthuwFnUS30Uq6i06/ihdH5DDEhTElLs7+Q7pvNUd+KiI0mTgM/y856x3s4p
 4OMG1qEi6AnwvYFyCgu/w7/LLBYRipis+j+cm9Y0cTZujh4QBTeeQdBsu++fit2G
 7U1jkddiQSlb8DrbziRt9JK5WrvR4WpQmP9bnK1SxZtKY/55ZNpuNmFnUmUCgUAY
 Y7CEafoElJ6XYTBzKgYSDqY/eutXSzaxn1euDWizDS4H7kdgDS+3uVT9
 =zq2g
 -----END PGP SIGNATURE-----

Merge tag 'for-5.8/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:

 - Quite a few DM zoned target fixes and a Zone append fix in DM core.

   Considering the amount of dm-zoned changes that went in during the
   5.8 merge window these fixes are not that surprising.

 - A few DM writecache target fixes.

 - A fix to Documentation index to include DM ebs target docs.

 - Small cleanup to use struct_size() in DM core's retrieve_deps().

* tag 'for-5.8/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm writecache: add cond_resched to loop in persistent_memory_claim()
  dm zoned: Fix reclaim zone selection
  dm zoned: Fix random zone reclaim selection
  dm: update original bio sector on Zone Append
  dm zoned: Fix metadata zone size check
  docs: device-mapper: add dm-ebs.rst to an index file
  dm ioctl: use struct_size() helper in retrieve_deps()
  dm writecache: skip writecache_wait when using pmem mode
  dm writecache: correct uncommitted_block when discarding uncommitted entry
  dm zoned: assign max_io_len correctly
  dm zoned: fix uninitialized pointer dereference
2020-06-27 08:57:16 -07:00
Yang Shi
2a8bef3217 doc: THP CoW fault no longer allocate THP
Since commit 3917c80280c9 ("thp: change CoW semantics for anon-THP"),
THP CoW page fault is rewritten.  Now it just splits pmd then fallback
to base page fault, it doesn't try to allocate THP anymore.  So it is no
longer counted in THP_FAULT_ALLOC.

Remove the obsolete statement in documentation about THP CoW allocation
to avoid confusion.

Link: http://lkml.kernel.org/r/1592424895-5421-1-git-send-email-yang.shi@linux.alibaba.com
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-26 00:27:37 -07:00
Mauro Carvalho Chehab
e0034433a7 docs: device-mapper: add dm-ebs.rst to an index file
Solves this Sphinx warning:
	Documentation/admin-guide/device-mapper/dm-ebs.rst: WARNING: document isn't included in any toctree

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-06-19 12:21:56 -04:00
Andy Lutomirski
b745cfba44 x86/cpu: Enable FSGSBASE on 64bit by default and add a chicken bit
Now that FSGSBASE is fully supported, remove unsafe_fsgsbase, enable
FSGSBASE by default, and add nofsgsbase to disable it.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/1557309753-24073-17-git-send-email-chang.seok.bae@intel.com
Link: https://lkml.kernel.org/r/20200528201402.1708239-13-sashal@kernel.org
2020-06-18 15:47:05 +02:00
Andy Lutomirski
dd649bd0b3 x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE
This is temporary.  It will allow the next few patches to be tested
incrementally.

Setting unsafe_fsgsbase is a root hole.  Don't do it.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/1557309753-24073-4-git-send-email-chang.seok.bae@intel.com
Link: https://lkml.kernel.org/r/20200528201402.1708239-3-sashal@kernel.org
2020-06-18 15:46:59 +02:00
Linus Torvalds
6d62c5b211 A handful of late-arriving docs fixes, along with a patch changing a lot of
HTTP links to HTTPS that had to be yanked and redone before the first
 pull.
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAl7hRs0PHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5YEE0H/jlTBzV3kkf09nzngka07kUlXTsd5kkQjXNr
 sr0zV4q7o70Hc9mv8r5m5suBFuQ+2rx9Oy2BB4ywOAFIXMkhycCzj/v+tkBLav2s
 +oJ6ytSAnFIoyfChq+ynX2ub0VEE86zxafGaL1xt0SwRt/hSAQWXmfN3m8DorqF3
 bXn2WzKhBxPhjxrRUhzgQVyXnEUbONshFzng7E0OtKMbOw7ftEdh18JdZ2M/KQwH
 DPD+wtmzns/uUUarkC/fmaj8JLD1Bq5X9VTpkz0YU151GX1P4mbMZcWSA8/QHnoS
 B21LMa58itBsQxchN7LvdnkbDj3GIUzDiXsLt9VXMOd+TK4TyhE=
 =mp4N
 -----END PGP SIGNATURE-----

Merge tag 'docs-5.8-2' of git://git.lwn.net/linux

Pull more documentation updates from Jonathan Corbet:
 "A handful of late-arriving docs fixes, along with a patch changing a
  lot of HTTP links to HTTPS that had to be yanked and redone before the
  first pull"

* tag 'docs-5.8-2' of git://git.lwn.net/linux:
  docs/memory-barriers.txt/kokr: smp_mb__{before,after}_atomic(): update Documentation
  Documentation: devres: add missing entry for devm_platform_get_and_ioremap_resource()
  Replace HTTP links with HTTPS ones: documentation
  docs: it_IT: address invalid reference warnings
  doc: zh_CN: use doc reference to resolve undefined label warning
  docs: Update the location of the LF NDA program
  docs: dev-tools: coccinelle: underlines
2020-06-10 14:12:15 -07:00
Linus Torvalds
a5ad5742f6 Merge branch 'akpm' (patches from Andrew)
Merge even more updates from Andrew Morton:

 - a kernel-wide sweep of show_stack()

 - pagetable cleanups

 - abstract out accesses to mmap_sem - prep for mmap_sem scalability work

 - hch's user acess work

Subsystems affected by this patch series: debug, mm/pagemap, mm/maccess,
mm/documentation.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (93 commits)
  include/linux/cache.h: expand documentation over __read_mostly
  maccess: return -ERANGE when probe_kernel_read() fails
  x86: use non-set_fs based maccess routines
  maccess: allow architectures to provide kernel probing directly
  maccess: move user access routines together
  maccess: always use strict semantics for probe_kernel_read
  maccess: remove strncpy_from_unsafe
  tracing/kprobes: handle mixed kernel/userspace probes better
  bpf: rework the compat kernel probe handling
  bpf:bpf_seq_printf(): handle potentially unsafe format string better
  bpf: handle the compat string in bpf_trace_copy_string better
  bpf: factor out a bpf_trace_copy_string helper
  maccess: unify the probe kernel arch hooks
  maccess: remove probe_read_common and probe_write_common
  maccess: rename strnlen_unsafe_user to strnlen_user_nofault
  maccess: rename strncpy_from_unsafe_strict to strncpy_from_kernel_nofault
  maccess: rename strncpy_from_unsafe_user to strncpy_from_user_nofault
  maccess: update the top of file comment
  maccess: clarify kerneldoc comments
  maccess: remove duplicate kerneldoc comments
  ...
2020-06-09 09:54:46 -07:00
Michel Lespinasse
c1e8d7c6a7 mmap locking API: convert mmap_sem comments
Convert comments that reference mmap_sem to reference mmap_lock instead.

[akpm@linux-foundation.org: fix up linux-next leftovers]
[akpm@linux-foundation.org: s/lockaphore/lock/, per Vlastimil]
[akpm@linux-foundation.org: more linux-next fixups, per Michel]

Signed-off-by: Michel Lespinasse <walken@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Laurent Dufour <ldufour@linux.ibm.com>
Cc: Liam Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ying Han <yinghan@google.com>
Link: http://lkml.kernel.org/r/20200520052908.204642-13-walken@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09 09:39:14 -07:00
Linus Torvalds
8b4d37db9a Merge branch 'x86/srbds' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 srbds fixes from Thomas Gleixner:
 "The 9th episode of the dime novel "The performance killer" with the
  subtitle "Slow Randomizing Boosts Denial of Service".

  SRBDS is an MDS-like speculative side channel that can leak bits from
  the random number generator (RNG) across cores and threads. New
  microcode serializes the processor access during the execution of
  RDRAND and RDSEED. This ensures that the shared buffer is overwritten
  before it is released for reuse. This is equivalent to a full bus
  lock, which means that many threads running the RNG instructions in
  parallel have the same effect as the same amount of threads issuing a
  locked instruction targeting an address which requires locking of two
  cachelines at once.

  The mitigation support comes with the usual pile of unpleasant
  ingredients:

   - command line options

   - sysfs file

   - microcode checks

   - a list of vulnerable CPUs identified by model and stepping this
     time which requires stepping match support for the cpu match logic.

   - the inevitable slowdown of affected CPUs"

* branch 'x86/srbds' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/speculation: Add Ivy Bridge to affected list
  x86/speculation: Add SRBDS vulnerability and mitigation documentation
  x86/speculation: Add Special Register Buffer Data Sampling (SRBDS) mitigation
  x86/cpu: Add 'table' argument to cpu_matches()
2020-06-09 09:30:21 -07:00
Linus Torvalds
23fc02e36e s390 updates for the 5.8 merge window
- Add support for multi-function devices in pci code.
 
 - Enable PF-VF linking for architectures using the
   pdev->no_vf_scan flag (currently just s390).
 
 - Add reipl from NVMe support.
 
 - Get rid of critical section cleanup in entry.S.
 
 - Refactor PNSO CHSC (perform network subchannel operation) in cio
   and qeth.
 
 - QDIO interrupts and error handling fixes and improvements, more
   refactoring changes.
 
 - Align ioremap() with generic code.
 
 - Accept requests without the prefetch bit set in vfio-ccw.
 
 - Enable path handling via two new regions in vfio-ccw.
 
 - Other small fixes and improvements all over the code.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAl7eVGcACgkQjYWKoQLX
 FBhweQgAkicvx31x230rdfG+jQkQkl0UqF99vvWrJHEll77SqadfjzKAGIjUB+K0
 EoeHVD5Wcj7BogDGcyHeQ0bZpu4WzE+y1nmnrsvu7TEEvcBmkJH0rF2jF+y0sb/O
 3qvwFkX/CB5OqaMzKC/AEeRpcCKR+ZUXkWu1irbYth7CBXaycD9EAPc4cj8CfYGZ
 r5njUdYOVk77TaO4aV+t5pCYc5TCRJaWXSsWaAv/nuLcIqsFBYOy2q+L47zITGXp
 utZVanIDjzx+ikpaKicOIfC3hJsRuNX9MnlZKsQFwpVEZAUZmIUm29XdhGJTWSxU
 RV7m1ORINbFP1nGAqWqkOvGo/LC0ZA==
 =VhXR
 -----END PGP SIGNATURE-----

Merge tag 's390-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Vasily Gorbik:

 - Add support for multi-function devices in pci code.

 - Enable PF-VF linking for architectures using the pdev->no_vf_scan
   flag (currently just s390).

 - Add reipl from NVMe support.

 - Get rid of critical section cleanup in entry.S.

 - Refactor PNSO CHSC (perform network subchannel operation) in cio and
   qeth.

 - QDIO interrupts and error handling fixes and improvements, more
   refactoring changes.

 - Align ioremap() with generic code.

 - Accept requests without the prefetch bit set in vfio-ccw.

 - Enable path handling via two new regions in vfio-ccw.

 - Other small fixes and improvements all over the code.

* tag 's390-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (52 commits)
  vfio-ccw: make vfio_ccw_regops variables declarations static
  vfio-ccw: Add trace for CRW event
  vfio-ccw: Wire up the CRW irq and CRW region
  vfio-ccw: Introduce a new CRW region
  vfio-ccw: Refactor IRQ handlers
  vfio-ccw: Introduce a new schib region
  vfio-ccw: Refactor the unregister of the async regions
  vfio-ccw: Register a chp_event callback for vfio-ccw
  vfio-ccw: Introduce new helper functions to free/destroy regions
  vfio-ccw: document possible errors
  vfio-ccw: Enable transparent CCW IPL from DASD
  s390/pci: Log new handle in clp_disable_fh()
  s390/cio, s390/qeth: cleanup PNSO CHSC
  s390/qdio: remove q->first_to_kick
  s390/qdio: fix up qdio_start_irq() kerneldoc
  s390: remove critical section cleanup from entry.S
  s390: add machine check SIGP
  s390/pci: ioremap() align with generic code
  s390/ap: introduce new ap function ap_get_qdev()
  Documentation/s390: Update / remove developerWorks web links
  ...
2020-06-08 12:05:31 -07:00
Konstantin Khlebnikov
db33ec371b doc: cgroup: update note about conditions when oom killer is invoked
Starting from v4.19 commit 29ef680ae7c2 ("memcg, oom: move out_of_memory
back to the charge path") cgroup oom killer is no longer invoked only
from page faults.  Now it implements the same semantics as global OOM
killer: allocation context invokes OOM killer and keeps retrying until
success.

[akpm@linux-foundation.org: fixes per Randy]

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Link: http://lkml.kernel.org/r/158894738928.208854.5244393925922074518.stgit@buzz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08 11:05:58 -07:00
Guilherme G. Piccoli
60c958d8df panic: add sysctl to dump all CPUs backtraces on oops event
Usually when the kernel reaches an oops condition, it's a point of no
return; in case not enough debug information is available in the kernel
splat, one of the last resorts would be to collect a kernel crash dump
and analyze it.  The problem with this approach is that in order to
collect the dump, a panic is required (to kexec-load the crash kernel).
When in an environment of multiple virtual machines, users may prefer to
try living with the oops, at least until being able to properly shutdown
their VMs / finish their important tasks.

This patch implements a way to collect a bit more debug details when an
oops event is reached, by printing all the CPUs backtraces through the
usage of NMIs (on architectures that support that).  The sysctl added
(and documented) here was called "oops_all_cpu_backtrace", and when set
will (as the name suggests) dump all CPUs backtraces.

Far from ideal, this may be the last option though for users that for
some reason cannot panic on oops.  Most of times oopses are clear enough
to indicate the kernel portion that must be investigated, but in virtual
environments it's possible to observe hypervisor/KVM issues that could
lead to oopses shown in other guests CPUs (like virtual APIC crashes).
This patch hence aims to help debug such complex issues without
resorting to kdump.

Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Matthew Wilcox <willy@infradead.org>
Link: http://lkml.kernel.org/r/20200327224116.21030-1-gpiccoli@canonical.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08 11:05:56 -07:00
Guilherme G. Piccoli
0ec9dc9bcb kernel/hung_task.c: introduce sysctl to print all traces when a hung task is detected
Commit 401c636a0eeb ("kernel/hung_task.c: show all hung tasks before
panic") introduced a change in that we started to show all CPUs
backtraces when a hung task is detected _and_ the sysctl/kernel
parameter "hung_task_panic" is set.  The idea is good, because usually
when observing deadlocks (that may lead to hung tasks), the culprit is
another task holding a lock and not necessarily the task detected as
hung.

The problem with this approach is that dumping backtraces is a slightly
expensive task, specially printing that on console (and specially in
many CPU machines, as servers commonly found nowadays).  So, users that
plan to collect a kdump to investigate the hung tasks and narrow down
the deadlock definitely don't need the CPUs backtrace on dmesg/console,
which will delay the panic and pollute the log (crash tool would easily
grab all CPUs traces with 'bt -a' command).

Also, there's the reciprocal scenario: some users may be interested in
seeing the CPUs backtraces but not have the system panic when a hung
task is detected.  The current approach hence is almost as embedding a
policy in the kernel, by forcing the CPUs backtraces' dump (only) on
hung_task_panic.

This patch decouples the panic event on hung task from the CPUs
backtraces dump, by creating (and documenting) a new sysctl called
"hung_task_all_cpu_backtrace", analog to the approach taken on soft/hard
lockups, that have both a panic and an "all_cpu_backtrace" sysctl to
allow individual control.  The new mechanism for dumping the CPUs
backtraces on hung task detection respects "hung_task_warnings" by not
dumping the traces in case there's no warnings left.

Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Link: http://lkml.kernel.org/r/20200327223646.20779-1-gpiccoli@canonical.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08 11:05:56 -07:00
Guilherme G. Piccoli
f117955a22 kernel/watchdog.c: convert {soft/hard}lockup boot parameters to sysctl aliases
After a recent change introduced by Vlastimil's series [0], kernel is
able now to handle sysctl parameters on kernel command line; also, the
series introduced a simple infrastructure to convert legacy boot
parameters (that duplicate sysctls) into sysctl aliases.

This patch converts the watchdog parameters softlockup_panic and
{hard,soft}lockup_all_cpu_backtrace to use the new alias infrastructure.
It fixes the documentation too, since the alias only accepts values 0 or
1, not the full range of integers.

We also took the opportunity here to improve the documentation of the
previously converted hung_task_panic (see the patch series [0]) and put
the alias table in alphabetical order.

[0] http://lkml.kernel.org/r/20200427180433.7029-1-vbabka@suse.cz

Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Kees Cook <keescook@chromium.org>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Link: http://lkml.kernel.org/r/20200507214624.21911-1-gpiccoli@canonical.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08 11:05:56 -07:00
Vlastimil Babka
b467f3ef3c kernel/hung_task convert hung_task_panic boot parameter to sysctl
We can now handle sysctl parameters on kernel command line and have
infrastructure to convert legacy command line options that duplicate
sysctl to become a sysctl alias.

This patch converts the hung_task_panic parameter.  Note that the sysctl
handler is more strict and allows only 0 and 1, while the legacy
parameter allowed any non-zero value.  But there is little reason anyone
would not be using 1.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Guilherme G . Piccoli" <gpiccoli@canonical.com>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Ivan Teterevkov <ivan.teterevkov@nutanix.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20200427180433.7029-4-vbabka@suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08 11:05:56 -07:00
Vlastimil Babka
3db978d480 kernel/sysctl: support setting sysctl parameters from kernel command line
Patch series "support setting sysctl parameters from kernel command line", v3.

This series adds support for something that seems like many people
always wanted but nobody added it yet, so here's the ability to set
sysctl parameters via kernel command line options in the form of
sysctl.vm.something=1

The important part is Patch 1.  The second, not so important part is an
attempt to clean up legacy one-off parameters that do the same thing as
a sysctl.  I don't want to remove them completely for compatibility
reasons, but with generic sysctl support the idea is to remove the
one-off param handlers and treat the parameters as aliases for the
sysctl variants.

I have identified several parameters that mention sysctl counterparts in
Documentation/admin-guide/kernel-parameters.txt but there might be more.
The conversion also has varying level of success:

 - numa_zonelist_order is converted in Patch 2 together with adding the
   necessary infrastructure. It's easy as it doesn't really do anything
   but warn on deprecated value these days.

 - hung_task_panic is converted in Patch 3, but there's a downside that
   now it only accepts 0 and 1, while previously it was any integer
   value

 - nmi_watchdog maps to two sysctls nmi_watchdog and hardlockup_panic,
   so there's no straighforward conversion possible

 - traceoff_on_warning is a flag without value and it would be required
   to handle that somehow in the conversion infractructure, which seems
   pointless for a single flag

This patch (of 5):

A recently proposed patch to add vm_swappiness command line parameter in
addition to existing sysctl [1] made me wonder why we don't have a
general support for passing sysctl parameters via command line.

Googling found only somebody else wondering the same [2], but I haven't
found any prior discussion with reasons why not to do this.

Settings the vm_swappiness issue aside (the underlying issue might be
solved in a different way), quick search of kernel-parameters.txt shows
there are already some that exist as both sysctl and kernel parameter -
hung_task_panic, nmi_watchdog, numa_zonelist_order, traceoff_on_warning.

A general mechanism would remove the need to add more of those one-offs
and might be handy in situations where configuration by e.g.
/etc/sysctl.d/ is impractical.

Hence, this patch adds a new parse_args() pass that looks for parameters
prefixed by 'sysctl.' and tries to interpret them as writes to the
corresponding sys/ files using an temporary in-kernel procfs mount.
This mechanism was suggested by Eric W.  Biederman [3], as it handles
all dynamically registered sysctl tables, even though we don't handle
modular sysctls.  Errors due to e.g.  invalid parameter name or value
are reported in the kernel log.

The processing is hooked right before the init process is loaded, as
some handlers might be more complicated than simple setters and might
need some subsystems to be initialized.  At the moment the init process
can be started and eventually execute a process writing to /proc/sys/
then it should be also fine to do that from the kernel.

Sysctls registered later on module load time are not set by this
mechanism - it's expected that in such scenarios, setting sysctl values
from userspace is practical enough.

[1] https://lore.kernel.org/r/BL0PR02MB560167492CA4094C91589930E9FC0@BL0PR02MB5601.namprd02.prod.outlook.com/
[2] https://unix.stackexchange.com/questions/558802/how-to-set-sysctl-using-kernel-command-line-parameter
[3] https://lore.kernel.org/r/87bloj2skm.fsf@x220.int.ebiederm.org/

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Ivan Teterevkov <ivan.teterevkov@nutanix.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Guilherme G . Piccoli" <gpiccoli@canonical.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Link: http://lkml.kernel.org/r/20200427180433.7029-1-vbabka@suse.cz
Link: http://lkml.kernel.org/r/20200427180433.7029-2-vbabka@suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08 11:05:56 -07:00
Rafael Aquini
db38d5c106 kernel: add panic_on_taint
Analogously to the introduction of panic_on_warn, this patch introduces
a kernel option named panic_on_taint in order to provide a simple and
generic way to stop execution and catch a coredump when the kernel gets
tainted by any given flag.

This is useful for debugging sessions as it avoids having to rebuild the
kernel to explicitly add calls to panic() into the code sites that
introduce the taint flags of interest.

For instance, if one is interested in proceeding with a post-mortem
analysis at the point a given code path is hitting a bad page (i.e.
unaccount_page_cache_page(), or slab_bug()), a coredump can be collected
by rebooting the kernel with 'panic_on_taint=0x20' amended to the
command line.

Another, perhaps less frequent, use for this option would be as a means
for assuring a security policy case where only a subset of taints, or no
single taint (in paranoid mode), is allowed for the running system.  The
optional switch 'nousertaint' is handy in this particular scenario, as
it will avoid userspace induced crashes by writes to sysctl interface
/proc/sys/kernel/tainted causing false positive hits for such policies.

[akpm@linux-foundation.org: tweak kernel-parameters.txt wording]

Suggested-by: Qian Cai <cai@lca.pw>
Signed-off-by: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Adrian Bunk <bunk@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Takashi Iwai <tiwai@suse.de>
Link: http://lkml.kernel.org/r/20200515175502.146720-1-aquini@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08 11:05:56 -07:00
Orson Zhai
ceabef7dd7 dynamic_debug: add an option to enable dynamic debug for modules only
Instead of enabling dynamic debug globally with CONFIG_DYNAMIC_DEBUG,
CONFIG_DYNAMIC_DEBUG_CORE will only enable core function of dynamic
debug.  With the DYNAMIC_DEBUG_MODULE defined for any modules, dynamic
debug will be tied to them.

This is useful for people who only want to enable dynamic debug for
kernel modules without worrying about kernel image size and memory
consumption is increasing too much.

[orson.zhai@unisoc.com: v2]
  Link: http://lkml.kernel.org/r/1587408228-10861-1-git-send-email-orson.unisoc@gmail.com

Signed-off-by: Orson Zhai <orson.zhai@unisoc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Petr Mladek <pmladek@suse.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Link: http://lkml.kernel.org/r/1586521984-5890-1-git-send-email-orson.unisoc@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08 11:05:56 -07:00
Alexander A. Klimov
93431e0607 Replace HTTP links with HTTPS ones: documentation
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.

Deterministic algorithm:
For each file:
  For each line:
    If doesn't contain `\bxmlns\b`:
      For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
        If both the HTTP and HTTPS versions
        return 200 OK and serve the same content:
          Replace HTTP with HTTPS.

Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Link: https://lore.kernel.org/r/20200526060544.25127-1-grandmaster@al2klimov.de
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-06-08 09:30:19 -06:00
Linus Torvalds
081096d98b TTY/Serial driver updates for 5.8-rc1
Here is the tty and serial driver updates for 5.8-rc1
 
 Nothing huge at all, just a lot of little serial driver fixes, updates
 for new devices and features, and other small things.  Full details are
 in the shortlog.
 
 Note, you will get a conflict merging with your tree in the
 Documentation/devicetree/bindings/serial/rs485.yaml file, but it should
 be pretty obvious what to do.  If not, I'm sure Rob will clean it all up
 afterwards :)
 
 All of these have been in linux-next with no issues for a while.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXtzpCg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylRxACgjGtOKPjahONL4lWd0F8ZYEcyw7sAn34woBCO
 BDUV3kolrRQ4OYNJWsHP
 =TvqG
 -----END PGP SIGNATURE-----

Merge tag 'tty-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial driver updates from Greg KH:
 "Here is the tty and serial driver updates for 5.8-rc1

  Nothing huge at all, just a lot of little serial driver fixes, updates
  for new devices and features, and other small things. Full details are
  in the shortlog.

  All of these have been in linux-next with no issues for a while"

* tag 'tty-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (67 commits)
  tty: serial: qcom_geni_serial: Add 51.2MHz frequency support
  tty: serial: imx: clear Ageing Timer Interrupt in handler
  serial: 8250_fintek: Add F81966 Support
  sc16is7xx: Add flag to activate IrDA mode
  dt-bindings: sc16is7xx: Add flag to activate IrDA mode
  serial: 8250: Support rs485 bus termination GPIO
  serial: 8520_port: Fix function param documentation
  dt-bindings: serial: Add binding for rs485 bus termination GPIO
  vt: keyboard: avoid signed integer overflow in k_ascii
  serial: 8250: Enable 16550A variants by default on non-x86
  tty: hvc_console, fix crashes on parallel open/close
  serial: imx: Initialize lock for non-registered console
  sc16is7xx: Read the LSR register for basic device presence check
  sc16is7xx: Allow sharing the IRQ line
  sc16is7xx: Use threaded IRQ
  sc16is7xx: Always use falling edge IRQ
  tty: n_gsm: Fix bogus i++ in gsm_data_kick
  tty: n_gsm: Remove unnecessary test in gsm_print_packet()
  serial: stm32: add no_console_suspend support
  tty: serial: fsl_lpuart: Use __maybe_unused instead of #if CONFIG_PM_SLEEP
  ...
2020-06-07 09:52:36 -07:00
Linus Torvalds
b170290c28 Kconfig updates for v5.8
- allow only 'config', 'comment', 'if' statements inside 'choice' since
    the other statements are not sensible inside 'choice' and should be
    grammatical error
 
  - support LMC_KEEP env variable for 'make local{yes,mod}config' to
    preserve some CONFIG options
 
  - deprecate 'make kvmconfig' and 'make xenconfig' in favor of
    'make kvm_guest.config' and 'make xen.config'
 
  - code cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAl7bsBkVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGjgwP/17odsg7DK15nFXiT0A7A1VzMZpf
 4oOjRo7W+J3zplEzLAFCf759Cov+MeqEdApzTWc+10bLXh5UlWBN8A1wof8dt5SV
 uQZMDO90uqB/sIwTZc8PJrn+9PiGl8D0L/WHMBOvz1AYaK8SBnB6xO0Xf/mn/AFX
 xpjxOhkUjTO59sNn8X3xjgDROkLp/iD3aC0sxo88U30n1aoV2surkPbG2/X879M4
 Q+sW1phPqXoC9WsOJHmsWOx5IdZoj2hy0z4uWGGjXuZwimNQd2PnhuhRg0xfp+WN
 8OYK4B74gnTAAFh2AyQu89DcNCiBYMGwRlrPrsvszODsUZjjgAPWouDpvrwVHf+R
 fZDTPtccQK4BELxIg3lbPCQPfzTDXfsxcK7R4X2AizdLKsh6yI5mdqGR0XcCxCyH
 gEtBNDHsMt4X4g4Z916cvewCmr5dmz+gAKem83j5fPhmVddMCcBgd6W8Dj9NUzGs
 cloHhs9gOGt0w+xBWGftcWARW4/wTf+Dn0yJ6QAaobq/QwMLFDWv8rDy0xzLttjA
 ISNetBUofolmvKiLqgblwY66yCgDzLJhNhR0L8jEJTMGq02Bi76BwnADlr3vnraw
 Bx4tmAoVGx+JXw/TrEY7Be5u/Ji/ofv7YTN9RZZOD9m5Om5HD3zFbtCXN+/8t83D
 2iD1imSJ0TQId36y
 =FlOD
 -----END PGP SIGNATURE-----

Merge tag 'kconfig-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kconfig updates from Masahiro Yamada:

 - allow only 'config', 'comment', 'if' statements inside 'choice' since
   the other statements are not sensible inside 'choice' and should be
   grammatical error

 - support LMC_KEEP env variable for 'make local{yes,mod}config' to
   preserve some CONFIG options

 - deprecate 'make kvmconfig' and 'make xenconfig' in favor of
   'make kvm_guest.config' and 'make xen.config'

 - code cleanups

* tag 'kconfig-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kconfig: announce removal of 'kvmconfig' and 'xenconfig' shorthands
  streamline_config.pl: add LMC_KEEP to preserve some kconfigs
  kconfig: allow only 'config', 'comment', and 'if' inside 'choice'
  kconfig: tests: remove randconfig test for choice in choice
  kconfig: do not assign a variable in the return statement
  kconfig: do not use OR-assignment for zero-cleared structure
2020-06-06 12:07:28 -07:00
Linus Torvalds
4a7e89c5ec Merge branch 'for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
 "Just two patches: one to add system-level cpu.stat to the root cgroup
  for convenience and a trivial comment update"

* 'for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: add cpu.stat file to root cgroup
  cgroup: Remove stale comments
2020-06-06 09:59:34 -07:00
Linus Torvalds
b25c6644bf - Largest change for this cycle is the DM zoned target's metadata
version 2 feature that adds support for pairing regular block
   devices with a zoned device to ease performance impact associated
   with finite random zones of zoned device.  Changes came in 3
   batches: first prepared for and then added the ability to pair a
   single regular block device, second was a batch of fixes to improve
   zoned's reclaim heuristic, third removed the limitation of only
   adding a single additional regular block device to allow many
   devices.  Testing has shown linear scaling as more devices are
   added.
 
 - Add new emulated block size (ebs) target that emulates a smaller
   logical_block_size than a block device supports.  Primary use-case
   is to emulate "512e" devices that have 512 byte logical_block_size
   and 4KB physical_block_size.  This is useful to some legacy
   applications otherwise wouldn't be ablee to be used on 4K devices
   because they depend on issuing IO in 512 byte granularity.
 
 - Add discard interfaces to DM bufio.  First consumer of the interface
   is the dm-ebs target that makes heavy use of dm-bufio.
 
 - Fix DM crypt's block queue_limits stacking to not truncate
   logic_block_size.
 
 - Add Documentation for DM integrity's status line.
 
 - Switch DMDEBUG from a compile time config option to instead use
   dynamic debug via pr_debug.
 
 - Fix DM multipath target's hueristic for how it manages
   "queue_if_no_path" state internally.  DM multipath now avoids
   disabling "queue_if_no_path" unless it is actually needed (e.g. in
   response to configure timeout or explicit "fail_if_no_path"
   message).  This fixes reports of spurious -EIO being reported back
   to userspace application during fault tolerance testing with an NVMe
   backend.  Added various dynamic DMDEBUG messages to assist with
   debugging queue_if_no_path in the future.
 
 - Add a new DM multipath "Historical Service Time" Path Selector.
 
 - Fix DM multipath's dm_blk_ioctl() to switch paths on IO error.
 
 - Improve DM writecache target performance by using explicit
   cache flushing for target's single-threaded usecase and a small
   cleanup to remove unnecessary test in persistent_memory_claim.
 
 - Other small cleanups in DM core, dm-persistent-data, and DM integrity.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEJfWUX4UqZ4x1O2wixSPxCi2dA1oFAl7alrgTHHNuaXR6ZXJA
 cmVkaGF0LmNvbQAKCRDFI/EKLZ0DWl42B/9sBd+j60emy4Bliu/f3pd7SEkFrSXv
 K2jXicRFx4E5kO0aLK+fX65cOiq2vvLsDh8c++0TLXcD9q7oK0qxK9c8TCPq29Cx
 W2J2dwdjyyqqbr3/FZHYYM9KLOl5rsxJLygXwhJhQ2Gny44L7nVACrAXNzXIHJ3r
 f8xr+GLdF/jz7WLj8bwEDo3Bf8wvxDvl2ijqj7EceOhTutNE8xHQ6UcTPqTtozJy
 sNM8UQNk1L43DBAvXfrKZ+yQ5DYAKdXKJpV9C8qv5DEGbaEikuMrHddgO4KlDdp4
 VjPk9GSfPwGcJ4ecN8vgecZVGvh52ZU7OZ8qey/q0zqps74jeHTZQQyM
 =jRun
 -----END PGP SIGNATURE-----

Merge tag 'for-5.8/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper updates from Mike Snitzer:

 - The largest change for this cycle is the DM zoned target's metadata
   version 2 feature that adds support for pairing regular block devices
   with a zoned device to ease the performance impact associated with
   finite random zones of zoned device.

   The changes came in three batches: the first prepared for and then
   added the ability to pair a single regular block device, the second
   was a batch of fixes to improve zoned's reclaim heuristic, and the
   third removed the limitation of only adding a single additional
   regular block device to allow many devices.

   Testing has shown linear scaling as more devices are added.

 - Add new emulated block size (ebs) target that emulates a smaller
   logical_block_size than a block device supports

   The primary use-case is to emulate "512e" devices that have 512 byte
   logical_block_size and 4KB physical_block_size. This is useful to
   some legacy applications that otherwise wouldn't be able to be used
   on 4K devices because they depend on issuing IO in 512 byte
   granularity.

 - Add discard interfaces to DM bufio. First consumer of the interface
   is the dm-ebs target that makes heavy use of dm-bufio.

 - Fix DM crypt's block queue_limits stacking to not truncate
   logic_block_size.

 - Add Documentation for DM integrity's status line.

 - Switch DMDEBUG from a compile time config option to instead use
   dynamic debug via pr_debug.

 - Fix DM multipath target's hueristic for how it manages
   "queue_if_no_path" state internally.

   DM multipath now avoids disabling "queue_if_no_path" unless it is
   actually needed (e.g. in response to configure timeout or explicit
   "fail_if_no_path" message).

   This fixes reports of spurious -EIO being reported back to userspace
   application during fault tolerance testing with an NVMe backend.
   Added various dynamic DMDEBUG messages to assist with debugging
   queue_if_no_path in the future.

 - Add a new DM multipath "Historical Service Time" Path Selector.

 - Fix DM multipath's dm_blk_ioctl() to switch paths on IO error.

 - Improve DM writecache target performance by using explicit cache
   flushing for target's single-threaded usecase and a small cleanup to
   remove unnecessary test in persistent_memory_claim.

 - Other small cleanups in DM core, dm-persistent-data, and DM
   integrity.

* tag 'for-5.8/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (62 commits)
  dm crypt: avoid truncating the logical block size
  dm mpath: add DM device name to Failing/Reinstating path log messages
  dm mpath: enhance queue_if_no_path debugging
  dm mpath: restrict queue_if_no_path state machine
  dm mpath: simplify __must_push_back
  dm zoned: check superblock location
  dm zoned: prefer full zones for reclaim
  dm zoned: select reclaim zone based on device index
  dm zoned: allocate zone by device index
  dm zoned: support arbitrary number of devices
  dm zoned: move random and sequential zones into struct dmz_dev
  dm zoned: per-device reclaim
  dm zoned: add metadata pointer to struct dmz_dev
  dm zoned: add device pointer to struct dm_zone
  dm zoned: allocate temporary superblock for tertiary devices
  dm zoned: convert to xarray
  dm zoned: add a 'reserved' zone flag
  dm zoned: improve logging messages for reclaim
  dm zoned: avoid unnecessary device recalulation for secondary superblock
  dm zoned: add debugging message for reading superblocks
  ...
2020-06-05 15:45:03 -07:00