IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
* Enable thermal policy for ASUS TUF FX705DY/FX505DY
* Support left round button on ASUS N56VB
* Support new Mellanox platforms of basic class VMOD0009 and VMOD0010
* Intel Comet Lake, Tiger Lake and Elkhart Lake support in the PMC driver
* Big clean up to Intel PMC core, PMC IPC and SCU IPC drivers
* Touchscreen support for the PiPO W11 tablet
The following is an automated git shortlog grouped by driver:
asus-nb-wmi:
- Support left round button on N56VB
asus-wmi:
- Fix keyboard brightness cannot be set to 0
- Set throttle thermal policy to default
- Support throttle thermal policy
Documentation/ABI:
- Add new attribute for mlxreg-io sysfs interfaces
- Style changes
- Add missed attribute for mlxreg-io sysfs interfaces
- Fix documentation inconsistency for mlxreg-io sysfs interfaces
GPD pocket fan:
- Allow somewhat lower/higher temperature limits
- Use default values when wrong modparams are given
intel_atomisp2_pm:
- Spelling fixes
- Refactor timeout loop
intel_mid_powerbtn:
- Take a copy of ddata
intel_pmc_core:
- update Comet Lake platform driver
- Fix spelling of MHz unit
- Fix indentation in function definitions
- Put more stuff under #ifdef DEBUG_FS
- Respect error code of kstrtou32_from_user()
- Add Intel Elkhart Lake support
- Add Intel Tiger Lake support
- Make debugfs entry for pch_ip_power_gating_status conditional
- Create platform dependent bitmap structs
- Remove unnecessary assignments
- Clean up: Remove comma after the termination line
intel_pmc_ipc:
- Switch to use driver->dev_groups
- Propagate error from kstrtoul()
- Use octal permissions in sysfs attributes
- Get rid of unnecessary includes
- Drop ipc_data_readb()
- Drop intel_pmc_gcr_read() and intel_pmc_gcr_write()
- Make intel_pmc_ipc_raw_cmd() static
- Make intel_pmc_ipc_simple_command() static
- Make intel_pmc_gcr_update() static
intel_scu_ipc:
- Reformat kernel-doc comments of exported functions
- Drop intel_scu_ipc_raw_command()
- Drop intel_scu_ipc_io[read|write][8|16]()
- Drop unused macros
- Drop unused prototype intel_scu_ipc_fw_update()
- Sleeping is fine when polling
- Drop intel_scu_ipc_i2c_cntrl()
- Remove Lincroft support
- Add constants for register offsets
- Fix interrupt support
intel_scu_ipcutil:
- Remove default y from Kconfig
intel_telemetry_debugfs:
- Respect error code of kstrtou32_from_user()
intel_telemetry_pltdrv:
- use devm_platform_ioremap_resource()
mlx-platform:
- Add support for next generation systems
- Add support for new capability register
- Add support for new system type
- Set system mux configuration based on system type
- Add more definitions for system attributes
- Cosmetic changes
platform/mellanox:
- mlxreg-hotplug: Add support for new capability register
- fix potential deadlock in the tmfifo driver
tools/power/x86/intel-speed-select:
- Update version
- Change the order for clos disable
- Fix result display for turbo-freq auto mode
- Add support for core-power discovery
- Allow additional core-power mailbox commands
- Update MAINTAINERS for the intel uncore frequency control
- Add support for Uncore frequency control
touchscreen_dmi:
- Fix indentation in several places
- Add info for the PiPO W11 tablet
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEqaflIX74DDDzMJJtb7wzTHR8rCgFAl4uxRcACgkQb7wzTHR8
rCjV3xAArz8e4FmFRe77LF83BHduBYkMjnobDFqWa+BhUicapezU4121K/k1cSBi
MRhYOK3+lZvMqFTbjFatgkCapjma34+6xMJztrT/drCgmxHzsLQgJsmJ3pXWqnYM
t4CcPOjvjmZbNJT0NdKjeNyCwFGG6Lupf82fVfIXIshtaKE3xw51242+PJEI978i
2VL3Ck0NL1stgUkKf/ppHkqsjLb0s6IBtgWv052s/0SlHT7RlFAonsCBxIktVa5L
divCtnWJKRhbalCPnKOJ1Q6/kpR/BSNq6OddcgaACKH9i0d5KwtdNDdQ66FkEidN
4rKK3gYMLRpgpLjApH91n7jWMQUaJawVN/3WuRfdnwQOcJ/PVY5vosv2yHKMy3yB
mFLG0G2v3cXEmvRNEyeKBpSkOrAnsoylFt8zLN6/OP3cQ/jdrnq+UpurxRnBEGkm
ZDbs5FYqjAPMyjL+/ldhndIdIqvusVgrl2V93BuAzN080UvkQ2FKjLnk0Dkc/xmZ
XbXUAn+BCUpmABstLa7NJGruV3O9IdiSLHW5Cr0NZC74h4nHqDS57At7LkQbphkC
Pt/+7tirpaEvSbgQZsVI0WfOwDjJf79FT4dYtCY1GkmSvmvF+OUgKicfEZsfuDan
hLez0resoJkHLeJtcrF1lW3l3kyMaokBmkUhB1skOYlLUvcsOwg=
=rMop
-----END PGP SIGNATURE-----
Merge tag 'platform-drivers-x86-v5.6-1' of git://git.infradead.org/linux-platform-drivers-x86
Pull x86 platform driver updates from Andy Shevchenko:
- Enable thermal policy for ASUS TUF FX705DY/FX505DY
- Support left round button on ASUS N56VB
- Support new Mellanox platforms of basic class VMOD0009 and VMOD0010
- Intel Comet Lake, Tiger Lake and Elkhart Lake support in the PMC
driver
- Big clean-up to Intel PMC core, PMC IPC and SCU IPC drivers
- Touchscreen support for the PiPO W11 tablet
* tag 'platform-drivers-x86-v5.6-1' of git://git.infradead.org/linux-platform-drivers-x86: (64 commits)
platform/x86: intel_pmc_ipc: Switch to use driver->dev_groups
platform/x86: intel_pmc_ipc: Propagate error from kstrtoul()
platform/x86: intel_pmc_ipc: Use octal permissions in sysfs attributes
platform/x86: intel_pmc_ipc: Get rid of unnecessary includes
platform/x86: intel_pmc_ipc: Drop ipc_data_readb()
platform/x86: intel_pmc_ipc: Drop intel_pmc_gcr_read() and intel_pmc_gcr_write()
platform/x86: intel_pmc_ipc: Make intel_pmc_ipc_raw_cmd() static
platform/x86: intel_pmc_ipc: Make intel_pmc_ipc_simple_command() static
platform/x86: intel_pmc_ipc: Make intel_pmc_gcr_update() static
platform/x86: intel_scu_ipc: Reformat kernel-doc comments of exported functions
platform/x86: intel_scu_ipc: Drop intel_scu_ipc_raw_command()
platform/x86: intel_scu_ipc: Drop intel_scu_ipc_io[read|write][8|16]()
platform/x86: intel_scu_ipc: Drop unused macros
platform/x86: intel_scu_ipc: Drop unused prototype intel_scu_ipc_fw_update()
platform/x86: intel_scu_ipc: Sleeping is fine when polling
platform/x86: intel_scu_ipc: Drop intel_scu_ipc_i2c_cntrl()
platform/x86: intel_scu_ipc: Remove Lincroft support
platform/x86: intel_scu_ipc: Add constants for register offsets
platform/x86: intel_scu_ipc: Fix interrupt support
platform/x86: intel_scu_ipcutil: Remove default y from Kconfig
...
Pull x86 microcode update from Borislav Petkov:
"Another boring branch this time around: mark a stub function inline,
by Valdis Kletnieks"
* 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode/AMD: Make stub function static inline
Pull RAS updates from Borislav Petkov:
- Misc fixes to the MCE code all over the place, by Jan H. Schönherr.
- Initial support for AMD F19h and other cleanups to amd64_edac, by
Yazen Ghannam.
- Other small cleanups.
* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
EDAC/mce_amd: Make fam_ops static global
EDAC/amd64: Drop some family checks for newer systems
EDAC/amd64: Add family ops for Family 19h Models 00h-0Fh
x86/amd_nb: Add Family 19h PCI IDs
EDAC/mce_amd: Always load on SMCA systems
x86/MCE/AMD, EDAC/mce_amd: Add new Load Store unit McaType
x86/mce: Fix use of uninitialized MCE message string
x86/mce: Fix mce=nobootlog
x86/mce: Take action on UCNA/Deferred errors again
x86/mce: Remove mce_inject_log() in favor of mce_log()
x86/mce: Pass MCE message to mce_panic() on failed kernel recovery
x86/mce/therm_throt: Mark throttle_active_work() as __maybe_unused
These functions are not used anywhere so drop them completely.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
This function is not called outside of intel_pmc_ipc.c so we can make it
static instead.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
This function is not called outside of intel_pmc_ipc.c so we can make it
static instead.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
This function is not called outside of intel_pmc_ipc.c so we can make it
static instead.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
There is no user for this function so we can drop it from the driver.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
There are no users for these so we can remove them.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
There is no implementation for that anymore so drop the prototype.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
There are no existing users for this functionality so drop it from the
driver completely. This also means we don't need to keep the struct
intel_scu_ipc_pdata_t around anymore so remove that as well.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Pull x86 fixes from Ingo Molnar:
"Misc fixes:
- a resctrl fix for uninitialized objects found by debugobjects
- a resctrl memory leak fix
- fix the unintended re-enabling of the of SME and SEV CPU flags if
memory encryption was disabled at bootup via the MSR space"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/CPU/AMD: Ensure clearing of SME/SEV features is maintained
x86/resctrl: Fix potential memory leak
x86/resctrl: Fix an imbalance in domain_remove_cpu()
Pull x86 RAS fix from Ingo Molnar:
"Fix a thermal throttling race that can result in easy to trigger boot
crashes on certain Ice Lake platforms"
* 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce/therm_throt: Do not access uninitialized therm_work
Pull perf fixes from Ingo Molnar:
"Tooling fixes, three Intel uncore driver fixes, plus an AUX events fix
uncovered by the perf fuzzer"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/uncore: Remove PCIe3 unit for SNR
perf/x86/intel/uncore: Fix missing marker for snr_uncore_imc_freerunning_events
perf/x86/intel/uncore: Add PCI ID of IMC for Xeon E3 V5 Family
perf: Correctly handle failed perf_get_aux_event()
perf hists: Fix variable name's inconsistency in hists__for_each() macro
perf map: Set kmap->kmaps backpointer for main kernel map chunks
perf report: Fix incorrectly added dimensions as switch perf data file
tools lib traceevent: Fix memory leakage in filter_event
Pull EFI fixes from Ingo Molnar:
"Three EFI fixes:
- Fix a slow-boot-scrolling regression but making sure we use WC for
EFI earlycon framebuffer mappings on x86
- Fix a mixed EFI mode boot crash
- Disable paging explicitly before entering startup_32() in mixed
mode bootup"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/efistub: Disable paging at mixed mode entry
efi/libstub/random: Initialize pointer variables to zero for mixed mode
efi/earlycon: Fix write-combine mapping on x86
The PCIe Root Port driver for CPU Complex PCIe Root Ports are not
loaded on SNR.
The device ID for SNR PCIe3 unit is used by both uncore driver and the
PCIe Root Port driver. If uncore driver is loaded, the PCIe Root Port
driver never be probed.
Remove the PCIe3 unit for SNR for now. The support for PCIe3 unit will
be added later separately.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200116200210.18937-2-kan.liang@linux.intel.com
An Oops during the boot is found on some SNR machines. It turns out
this is because the snr_uncore_imc_freerunning_events[] array was
missing an end-marker.
Fixes: ee49532b38dd ("perf/x86/intel/uncore: Add IMC uncore support for Snow Ridge")
Reported-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Like Xu <like.xu@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20200116200210.18937-1-kan.liang@linux.intel.com
The IMC uncore support is missed for E3-1585 v5 CPU.
Intel Xeon E3 V5 Family has Sky Lake CPU.
Add the PCI ID of IMC for Intel Xeon E3 V5 Family.
Reported-by: Rosales-fernandez, Carlos <carlos.rosales-fernandez@intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Rosales-fernandez, Carlos <carlos.rosales-fernandez@intel.com>
Link: https://lkml.kernel.org/r/1578687311-158748-1-git-send-email-kan.liang@linux.intel.com
If the SME and SEV features are present via CPUID, but memory encryption
support is not enabled (MSR 0xC001_0010[23]), the feature flags are cleared
using clear_cpu_cap(). However, if get_cpu_cap() is later called, these
feature flags will be reset back to present, which is not desired.
Change from using clear_cpu_cap() to setup_clear_cpu_cap() so that the
clearing of the flags is maintained.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # 4.16.x-
Link: https://lkml.kernel.org/r/226de90a703c3c0be5a49565047905ac4e94e8f3.1579125915.git.thomas.lendacky@amd.com
Add the new PCI Device 18h IDs for AMD Family 19h systems. Note that
Family 19h systems will not have a new PCI root device ID.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200110015651.14887-4-Yazen.Ghannam@amd.com
Add support for a new version of the Load Store unit bank type as
indicated by its McaType value, which will be present in future SMCA
systems.
Add the new (HWID, MCATYPE) tuple. Reuse the same name, since this is
logically the same to the user.
Also, add the new error descriptions to edac_mce_amd.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200110015651.14887-2-Yazen.Ghannam@amd.com
It is relatively easy to trigger the following boot splat on an Ice Lake
client platform. The call stack is like:
kernel BUG at kernel/timer/timer.c:1152!
Call Trace:
__queue_delayed_work
queue_delayed_work_on
therm_throt_process
intel_thermal_interrupt
...
The reason is that a CPU's thermal interrupt is enabled prior to
executing its hotplug onlining callback which will initialize the
throttling workqueues.
Such a race can lead to therm_throt_process() accessing an uninitialized
therm_work, leading to the above BUG at a very early bootup stage.
Therefore, unmask the thermal interrupt vector only after having setup
the workqueues completely.
[ bp: Heavily massage commit message and correct comment formatting. ]
Fixes: f6656208f04e ("x86/mce/therm_throt: Optimize notifications of thermal throttle")
Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200107004116.59353-1-chuansheng.liu@intel.com
The function mce_severity() is not required to update its msg argument.
In fact, mce_severity_amd() does not, which makes mce_no_way_out()
return uninitialized data, which may be used later for printing.
Assuming that implementations of mce_severity() either always or never
update the msg argument (which is currently the case), it is sufficient
to initialize the temporary variable in mce_no_way_out().
While at it, avoid printing a useless "Unknown".
Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200103150722.20313-4-jschoenh@amazon.de
Since commit
8b38937b7ab5 ("x86/mce: Do not enter deferred errors into the generic
pool twice")
the mce=nobootlog option has become mostly ineffective (after being only
slightly ineffective before), as the code is taking actions on MCEs left
over from boot when they have a usable address.
Move the check for MCP_DONTLOG a bit outward to make it effective again.
Also, since commit
011d82611172 ("RAS: Add a Corrected Errors Collector")
the two branches of the remaining "if" at the bottom of machine_check_poll()
do same. Unify them.
Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200103150722.20313-3-jschoenh@amazon.de
Commit
fa92c5869426 ("x86, mce: Support memory error recovery for both UCNA
and Deferred error in machine_check_poll")
added handling of UCNA and Deferred errors by adding them to the ring
for SRAO errors.
Later, commit
fd4cf79fcc4b ("x86/mce: Remove the MCE ring for Action Optional errors")
switched storage from the SRAO ring to the unified pool that is still
in use today. In order to only act on the intended errors, a filter
for MCE_AO_SEVERITY is used -- effectively removing handling of
UCNA/Deferred errors again.
Extend the severity filter to include UCNA/Deferred errors again.
Also, generalize the naming of the notifier from SRAO to UC to capture
the extended scope.
Note, that this change may cause a message like the following to appear,
as the same address may be reported as SRAO and as UCNA:
Memory failure: 0x5fe3284: already hardware poisoned
Technically, this is a return to previous behavior.
Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200103150722.20313-2-jschoenh@amazon.de
Use devm_platform_ioremap_resource() to simplify the code a bit.
While here, drop initialized but unused ssram_base_addr and ssram_size members.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
We currently try to shrink a single zone when removing memory. We use
the zone of the first page of the memory we are removing. If that
memmap was never initialized (e.g., memory was never onlined), we will
read garbage and can trigger kernel BUGs (due to a stale pointer):
BUG: unable to handle page fault for address: 000000000000353d
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
Oops: 0002 [#1] SMP PTI
CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4
Workqueue: kacpi_hotplug acpi_hotplug_work_fn
RIP: 0010:clear_zone_contiguous+0x5/0x10
Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840
RSP: 0018:ffffad2400043c98 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000
RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40
RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000
R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680
FS: 0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
__remove_pages+0x4b/0x640
arch_remove_memory+0x63/0x8d
try_remove_memory+0xdb/0x130
__remove_memory+0xa/0x11
acpi_memory_device_remove+0x70/0x100
acpi_bus_trim+0x55/0x90
acpi_device_hotplug+0x227/0x3a0
acpi_hotplug_work_fn+0x1a/0x30
process_one_work+0x221/0x550
worker_thread+0x50/0x3b0
kthread+0x105/0x140
ret_from_fork+0x3a/0x50
Modules linked in:
CR2: 000000000000353d
Instead, shrink the zones when offlining memory or when onlining failed.
Introduce and use remove_pfn_range_from_zone(() for that. We now
properly shrink the zones, even if we have DIMMs whereby
- Some memory blocks fall into no zone (never onlined)
- Some memory blocks fall into multiple zones (offlined+re-onlined)
- Multiple memory blocks that fall into different zones
Drop the zone parameter (with a potential dubious value) from
__remove_pages() and __remove_section().
Link: http://lkml.kernel.org/r/20191006085646.5768-6-david@redhat.com
Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [visible after d0dc12e86b319]
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: <stable@vger.kernel.org> [5.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
set_cache_qos_cfg() is leaking memory when the given level is not
RDT_RESOURCE_L3 or RDT_RESOURCE_L2. At the moment, this function is
called with only valid levels but move the allocation after the valid
level checks in order to make it more robust and future proof.
[ bp: Massage commit message. ]
Fixes: 99adde9b370de ("x86/intel_rdt: Enable L2 CDP in MSR IA32_L2_QOS_CFG")
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20200102165844.133133-1-shakeelb@google.com
A system that supports resource monitoring may have multiple resources
while not all of these resources are capable of monitoring. Monitoring
related state is initialized only for resources that are capable of
monitoring and correspondingly this state should subsequently only be
removed from these resources that are capable of monitoring.
domain_add_cpu() calls domain_setup_mon_state() only when r->mon_capable
is true where it will initialize d->mbm_over. However,
domain_remove_cpu() calls cancel_delayed_work(&d->mbm_over) without
checking r->mon_capable resulting in an attempt to cancel d->mbm_over on
all resources, even those that never initialized d->mbm_over because
they are not capable of monitoring. Hence, it triggers a debugobjects
warning when offlining CPUs because those timer debugobjects are never
initialized:
ODEBUG: assert_init not available (active state 0) object type:
timer_list hint: 0x0
WARNING: CPU: 143 PID: 789 at lib/debugobjects.c:484
debug_print_object
Hardware name: HP Synergy 680 Gen9/Synergy 680 Gen9 Compute Module, BIOS I40 05/23/2018
RIP: 0010:debug_print_object
Call Trace:
debug_object_assert_init
del_timer
try_to_grab_pending
cancel_delayed_work
resctrl_offline_cpu
cpuhp_invoke_callback
cpuhp_thread_fun
smpboot_thread_fn
kthread
ret_from_fork
Fixes: e33026831bdb ("x86/intel_rdt/mbm: Handle counter overflow")
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: john.stultz@linaro.org
Cc: sboyd@kernel.org
Cc: <stable@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: tj@kernel.org
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20191211033042.2188-1-cai@lca.pw
The EFI mixed mode entry code goes through the ordinary startup_32()
routine before jumping into the kernel's EFI boot code in 64-bit
mode. The 32-bit startup code must be entered with paging disabled,
but this is not documented as a requirement for the EFI handover
protocol, and so we should disable paging explicitly when entering
the kernel from 32-bit EFI firmware.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Cc: <stable@vger.kernel.org>
Cc: Arvind Sankar <nivedita@alum.mit.edu>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: https://lkml.kernel.org/r/20191224132909.102540-4-ardb@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
* Fix a bug where we try to do an ultracall on a system without an ultravisor.
KVM:
- Fix uninitialised sysreg accessor
- Fix handling of demand-paged device mappings
- Stop spamming the console on IMPDEF sysregs
- Relax mappings of writable memslots
- Assorted cleanups
MIPS:
- Now orphan, James Hogan is stepping down
x86:
- MAINTAINERS change, so long Radim and thanks for all the fish
- supported CPUID fixes for AMD machines without SPEC_CTRL
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJd/1+WAAoJEL/70l94x66DFuYH/A8x/P6BuCpppdGoEw+VGy7X
E8141dHTd7b1Wgi0kDNLRREr4QIfArvavGe0z0W8p4fGtcVjXdyhhfPd0UK6dfKG
9P66phY4AGPjde/8q/qSdFup9yshpcFwSVYdRC0L1w86dBRlXwuqk6K5zsRyCU4b
38v5Q3rPdMnWWB0K88/GMvAyQmPkgMOXJvhoecKeDQ+9IZ3ub6DBBNGM/xTJ9Y3z
vUe2BoYkZ3KKn6sfP66PdprBVI1EOrrAoj/l4BSuo/yUPcQsxTihXMkh5iGl18TF
h7TN9eq2Bn2ryh0TsaSK8opuePcotVvx7oll3ERtSV4e+89z5FDt4vVcY1VyRuc=
=adm7
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"PPC:
- Fix a bug where we try to do an ultracall on a system without an
ultravisor
KVM:
- Fix uninitialised sysreg accessor
- Fix handling of demand-paged device mappings
- Stop spamming the console on IMPDEF sysregs
- Relax mappings of writable memslots
- Assorted cleanups
MIPS:
- Now orphan, James Hogan is stepping down
x86:
- MAINTAINERS change, so long Radim and thanks for all the fish
- supported CPUID fixes for AMD machines without SPEC_CTRL"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
MAINTAINERS: remove Radim from KVM maintainers
MAINTAINERS: Orphan KVM for MIPS
kvm: x86: Host feature SSBD doesn't imply guest feature AMD_SSBD
kvm: x86: Host feature SSBD doesn't imply guest feature SPEC_CTRL_SSBD
KVM: PPC: Book3S HV: Don't do ultravisor calls on systems without ultravisor
KVM: arm/arm64: Properly handle faulting of device mappings
KVM: arm64: Ensure 'params' is initialised when looking up sys register
KVM: arm/arm64: Remove excessive permission check in kvm_arch_prepare_memory_region
KVM: arm64: Don't log IMP DEF sysreg traps
KVM: arm64: Sanely ratelimit sysreg messages
KVM: arm/arm64: vgic: Use wrapper function to lock/unlock all vcpus in kvm_vgic_create()
KVM: arm/arm64: vgic: Fix potential double free dist->spis in __kvm_vgic_destroy()
KVM: arm/arm64: Get rid of unused arg in cpu_init_hyp_mode()
- Fix a bug where we try to do an ultracall on a system without an
ultravisor.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEv0VLfXa2m9eKuaRpnZrqdyxjcZ8FAl35s5kACgkQnZrqdyxj
cZ8cwwf/UPCvZIYPSeYvzrCrlA+wlhBAh3bh47+ZXaNybOpss1xZ7QOFGkgoVBkn
ES2Sdx3qgLvhmbR+nEKon8YCDVSwUj2ehwJu1nzAUzuVYw+m8OHGjdW07+go5KKi
xZOndwBQGYaaWxch2O8Qw27TZU4lcVY/FNQiti5Ahg9dKK98CLyMsWnTms23ZjGD
JMN/jCoMxa6godxWk3mSLaIwXj8P1P4pH3oiMFF8ngRTqyMgi1l02wim+DV10rD4
5JoAF2kzSYngDlrhhQAsSOWrsWst1X2txcHA2QsoL7ZGYUQzzKyHH6zC6dS9eWk4
ni70RLEnJj8YpsjwT52tFYokxwTPfQ==
=kPkE
-----END PGP SIGNATURE-----
Merge tag 'kvm-ppc-fixes-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-master
PPC KVM fix for 5.5
- Fix a bug where we try to do an ultracall on a system without an
ultravisor.
Pull perf fixes from Ingo Molnar:
"Misc fixes: a BTS fix, a PT NMI handling fix, a PMU sysfs fix and an
SRCU annotation"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Add SRCU annotation for pmus list walk
perf/x86/intel: Fix PT PMI handling
perf/x86/intel/bts: Fix the use of page_private()
perf/x86: Fix potential out-of-bounds access
- fix warning in out-of-tree 'make clean'
- add READELF variable to the top Makefile
- fix broken builds when LINUX_COMPILE_BY contains a backslash
- fix build warning in kallsyms
- fix NULL pointer access in expr_eq() in Kconfig
- fix missing dependency on rsync in deb-pkg build
- remove ---help--- from documentation
- fix misleading documentation about directory descending
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAl3+OssVHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsGCccP/ROhTMQviPaj1W8qgE5r18Yh5U7H
8f3h2OuqX2mVn6Jnqhs83JvuAKKENczh884MCgvmZCq4eVGkxB5yigZjctpQYzpR
hpLZRSCfv415wSTZOttsVmff4ovrJ0KpYfx6cMOTK7Hm+4UBRtDMK2xjE+x4ZNmH
dX04q3oA5eoc1jx2cqP5nltiwFbfEk1hx8s9QwXc9JysYTlUbJZ4SQE7NGGEdk5+
rZOA9koJOpcbqcGMjMusApD2lzl2RgP2/fbLdA68zlgpUV9xT3GiieQiSAkIi8gh
+PcWjWVoZyuQUMeGG+3HmnUVfBaIGXxWLfERbB5LZEftUS7HRS5+i2DHvla67gPp
alqxtz1sE5VeEmkq+msdrnZxqakNP5EjLnCa57BkVvRRphtKy1gijN9FuT8mjZUz
UJA7CYhB2HIcwSzJIosRPhIhNudwpZidWU0nnjxmEatVgL5y2UG1UNe9Wfmh45Gf
fNYQSMwL+LL/xiO2E1WACBqAhCfvGbCG3NEinO1IAe0SuMtF2Ha77Fo3CKqFtLyu
ROB3ffuADIWID/tKe02BOnACcHXjHV0XEOc9CjRmn9pMw109Wy03EdfzgQyULTU7
qKOWeE+dcnsoLwnUGrWzaqaqH3yADVy9pIKLYKtD/Veu6lLL3kJ1dOIdr6spSICO
Ji3f8/3ADS8jJh/a
=OKOu
-----END PGP SIGNATURE-----
Merge tag 'kbuild-fixes-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- fix warning in out-of-tree 'make clean'
- add READELF variable to the top Makefile
- fix broken builds when LINUX_COMPILE_BY contains a backslash
- fix build warning in kallsyms
- fix NULL pointer access in expr_eq() in Kconfig
- fix missing dependency on rsync in deb-pkg build
- remove ---help--- from documentation
- fix misleading documentation about directory descending
* tag 'kbuild-fixes-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: clarify the difference between obj-y and obj-m w.r.t. descending
kconfig: remove ---help--- from documentation
scripts: package: mkdebian: add missing rsync dependency
kconfig: don't crash on NULL expressions in expr_eq()
scripts/kallsyms: fix offset overflow of kallsyms_relative_base
mkcompile_h: use printf for LINUX_COMPILE_BY
mkcompile_h: git rid of UTS_TRUNCATE from LINUX_COMPILE_{BY,HOST}
x86/boot: kbuild: allow readelf executable to be specified
kbuild: fix 'No such file or directory' warning when cleaning
Pull x86 RAS fixes from Borislav Petkov:
"Three urgent RAS fixes for the AMD side of things:
- initialize struct mce.bank so that calculated error severity on AMD
SMCA machines is correct
- do not send IPIs early during bank initialization, when interrupts
are disabled
- a fix for when only a subset of MCA banks are enabled, which led to
boot hangs on some new AMD CPUs"
* 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce: Fix possibly incorrect severity calculation on AMD
x86/MCE/AMD: Allow Reserved types to be overwritten in smca_banks[]
x86/MCE/AMD: Do not use rdmsr_safe_on_cpu() in smca_configure()
The host reports support for the synthetic feature X86_FEATURE_SSBD
when any of the three following hardware features are set:
CPUID.(EAX=7,ECX=0):EDX.SSBD[bit 31]
CPUID.80000008H:EBX.AMD_SSBD[bit 24]
CPUID.80000008H:EBX.VIRT_SSBD[bit 25]
Either of the first two hardware features implies the existence of the
IA32_SPEC_CTRL MSR, but CPUID.80000008H:EBX.VIRT_SSBD[bit 25] does
not. Therefore, CPUID.80000008H:EBX.AMD_SSBD[bit 24] should only be
set in the guest if CPUID.(EAX=7,ECX=0):EDX.SSBD[bit 31] or
CPUID.80000008H:EBX.AMD_SSBD[bit 24] is set on the host.
Fixes: 4c6903a0f9d76 ("KVM: x86: fix reporting of AMD speculation bug CPUID leaf")
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Jacob Xu <jacobhxu@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Reported-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The host reports support for the synthetic feature X86_FEATURE_SSBD
when any of the three following hardware features are set:
CPUID.(EAX=7,ECX=0):EDX.SSBD[bit 31]
CPUID.80000008H:EBX.AMD_SSBD[bit 24]
CPUID.80000008H:EBX.VIRT_SSBD[bit 25]
Either of the first two hardware features implies the existence of the
IA32_SPEC_CTRL MSR, but CPUID.80000008H:EBX.VIRT_SSBD[bit 25] does
not. Therefore, CPUID.(EAX=7,ECX=0):EDX.SSBD[bit 31] should only be
set in the guest if CPUID.(EAX=7,ECX=0):EDX.SSBD[bit 31] or
CPUID.80000008H:EBX.AMD_SSBD[bit 24] is set on the host.
Fixes: 0c54914d0c52a ("KVM: x86: use Intel speculation bugs and features as derived in generic x86 code")
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Jacob Xu <jacobhxu@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Reported-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull x86 fix from Ingo Molnar:
"Fix kexec booting with certain EFI memory map layouts"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/efi: Update e820 with reserved EFI boot services data to fix kexec breakage
Pull timer fixes from Ingo Molnar:
"Add HPET quirks for the Intel 'Coffee Lake H' and 'Ice Lake' platforms"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/intel: Disable HPET on Intel Ice Lake platforms
x86/intel: Disable HPET on Intel Coffee Lake H platforms
Commit:
ccbebba4c6bf ("perf/x86/intel/pt: Bypass PT vs. LBR exclusivity if the core supports it")
skips the PT/LBR exclusivity check on CPUs where PT and LBRs coexist, but
also inadvertently skips the active_events bump for PT in that case, which
is a bug. If there aren't any hardware events at the same time as PT, the
PMI handler will ignore PT PMIs, as active_events reads zero in that case,
resulting in the "Uhhuh" spurious NMI warning and PT data loss.
Fix this by always increasing active_events for PT events.
Fixes: ccbebba4c6bf ("perf/x86/intel/pt: Bypass PT vs. LBR exclusivity if the core supports it")
Reported-by: Vitaly Slobodskoy <vitaly.slobodskoy@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lkml.kernel.org/r/20191210105101.77210-1-alexander.shishkin@linux.intel.com
Commit
8062382c8dbe2 ("perf/x86/intel/bts: Add BTS PMU driver")
brought in a warning with the BTS buffer initialization
that is easily tripped with (assuming KPTI is disabled):
instantly throwing:
> ------------[ cut here ]------------
> WARNING: CPU: 2 PID: 326 at arch/x86/events/intel/bts.c:86 bts_buffer_setup_aux+0x117/0x3d0
> Modules linked in:
> CPU: 2 PID: 326 Comm: perf Not tainted 5.4.0-rc8-00291-gceb9e77324fa #904
> RIP: 0010:bts_buffer_setup_aux+0x117/0x3d0
> Call Trace:
> rb_alloc_aux+0x339/0x550
> perf_mmap+0x607/0xc70
> mmap_region+0x76b/0xbd0
...
It appears to assume (for lost raisins) that PagePrivate() is set,
while later it actually tests for PagePrivate() before using
page_private().
Make it consistent and always check PagePrivate() before using
page_private().
Fixes: 8062382c8dbe2 ("perf/x86/intel/bts: Add BTS PMU driver")
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lkml.kernel.org/r/20191205142853.28894-2-alexander.shishkin@linux.intel.com
UBSAN reported out-of-bound accesses for x86_pmu.event_map(), it's
arguments should be < x86_pmu.max_events. Make sure all users observe
this constraint.
Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Meelis Roos <mroos@linux.ee>
The mutex in mce_inject_log() became unnecessary with commit
5de97c9f6d85 ("x86/mce: Factor out and deprecate the /dev/mcelog driver"),
though the original reason for its presence only vanished with commit
7298f08ea887 ("x86/mcelog: Get rid of RCU remnants").
Drop the mutex. And as that makes mce_inject_log() identical to mce_log(),
get rid of the former in favor of the latter.
Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20191210000733.17979-7-jschoenh@amazon.de
In commit
b2f9d678e28c ("x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries")
another call to mce_panic() was introduced. Pass the message of the
handled MCE to that instance of mce_panic() as well, as there doesn't
seem to be a reason not to.
Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20191210000733.17979-6-jschoenh@amazon.de
throttle_active_work() is only called if CONFIG_SYSFS is set, otherwise
we get a harmless warning:
arch/x86/kernel/cpu/mce/therm_throt.c:238:13: error: 'throttle_active_work' \
defined but not used [-Werror=unused-function]
Mark the function as __maybe_unused to avoid the warning.
Fixes: f6656208f04e ("x86/mce/therm_throt: Optimize notifications of thermal throttle")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: bberg@redhat.com
Cc: ckellner@redhat.com
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: hdegoede@redhat.com
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20191210203925.3119091-1-arnd@arndb.de
The function mce_severity_amd_smca() requires m->bank to be initialized
for correct operation. Fix the one case, where mce_severity() is called
without doing so.
Fixes: 6bda529ec42e ("x86/mce: Grade uncorrected errors for SMCA-enabled systems")
Fixes: d28af26faa0b ("x86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out()")
Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: <stable@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Link: https://lkml.kernel.org/r/20191210000733.17979-4-jschoenh@amazon.de
Each logical CPU in Scalable MCA systems controls a unique set of MCA
banks in the system. These banks are not shared between CPUs. The bank
types and ordering will be the same across CPUs on currently available
systems.
However, some CPUs may see a bank as Reserved/Read-as-Zero (RAZ) while
other CPUs do not. In this case, the bank seen as Reserved on one CPU is
assumed to be the same type as the bank seen as a known type on another
CPU.
In general, this occurs when the hardware represented by the MCA bank
is disabled, e.g. disabled memory controllers on certain models, etc.
The MCA bank is disabled in the hardware, so there is no possibility of
getting an MCA/MCE from it even if it is assumed to have a known type.
For example:
Full system:
Bank | Type seen on CPU0 | Type seen on CPU1
------------------------------------------------
0 | LS | LS
1 | UMC | UMC
2 | CS | CS
System with hardware disabled:
Bank | Type seen on CPU0 | Type seen on CPU1
------------------------------------------------
0 | LS | LS
1 | UMC | RAZ
2 | CS | CS
For this reason, there is a single, global struct smca_banks[] that is
initialized at boot time. This array is initialized on each CPU as it
comes online. However, the array will not be updated if an entry already
exists.
This works as expected when the first CPU (usually CPU0) has all
possible MCA banks enabled. But if the first CPU has a subset, then it
will save a "Reserved" type in smca_banks[]. Successive CPUs will then
not be able to update smca_banks[] even if they encounter a known bank
type.
This may result in unexpected behavior. Depending on the system
configuration, a user may observe issues enumerating the MCA
thresholding sysfs interface. The issues may be as trivial as sysfs
entries not being available, or as severe as system hangs.
For example:
Bank | Type seen on CPU0 | Type seen on CPU1
------------------------------------------------
0 | LS | LS
1 | RAZ | UMC
2 | CS | CS
Extend the smca_banks[] entry check to return if the entry is a
non-reserved type. Otherwise, continue so that CPUs that encounter a
known bank type can update smca_banks[].
Fixes: 68627a697c19 ("x86/mce/AMD, EDAC/mce_amd: Enumerate Reserved SMCA bank type")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: <stable@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20191121141508.141273-1-Yazen.Ghannam@amd.com
... because interrupts are disabled that early and sending IPIs can
deadlock:
BUG: sleeping function called from invalid context at kernel/sched/completion.c:99
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/1
no locks held by swapper/1/0.
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffffffff8106dda9>] copy_process+0x8b9/0x1ca0
softirqs last enabled at (0): [<ffffffff8106dda9>] copy_process+0x8b9/0x1ca0
softirqs last disabled at (0): [<0000000000000000>] 0x0
Preemption disabled at:
[<ffffffff8104703b>] start_secondary+0x3b/0x190
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.5.0-rc2+ #1
Hardware name: GIGABYTE MZ01-CE1-00/MZ01-CE1-00, BIOS F02 08/29/2018
Call Trace:
dump_stack
___might_sleep.cold.92
wait_for_completion
? generic_exec_single
rdmsr_safe_on_cpu
? wrmsr_on_cpus
mce_amd_feature_init
mcheck_cpu_init
identify_cpu
identify_secondary_cpu
smp_store_cpu_info
start_secondary
secondary_startup_64
The function smca_configure() is called only on the current CPU anyway,
therefore replace rdmsr_safe_on_cpu() with atomic rdmsr_safe() and avoid
the IPI.
[ bp: Update commit message. ]
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: <stable@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/157252708836.3876.4604398213417262402.stgit@buzz
Introduce a new READELF variable to top-level Makefile, so the name of
readelf binary can be specified.
Before this change the name of the binary was hardcoded to
"$(CROSS_COMPILE)readelf" which might not be present for every
toolchain.
This allows to build with LLVM Object Reader by using make parameter
READELF=llvm-readelf.
Link: https://github.com/ClangBuiltLinux/linux/issues/771
Signed-off-by: Dmitry Golovin <dima@golovin.in>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>