2019-05-27 08:55:01 +02:00
// SPDX-License-Identifier: GPL-2.0-or-later
2008-04-18 13:33:50 -07:00
/*
 * pseries Memory Hotplug infrastructure.
 *
 * Copyright (C) 2008 Badari Pulavarty, IBM Corporation
 */
powerpc/pseries: Create new device hotplug entry point
The current hotplug (or dlpar) of devices (the process is generally the
same for memory, cpu, and pci) on PowerVM systems is initiated
from the HMC, which communicates the request to the partitions through
the RSCT framework. The RSCT framework then invokes the drmgr command.
The drmgr command performs part of the hotplug operation in userspace,
including most of the rtas calls and the device tree parsing, and makes
requests to the kernel to online/offline the device, update the device
tree and add/remove the device.
For PowerKVM the approach for device hotplug is to follow what is currently
being done for pci hotplug. A hotplug request is initiated from the host.
QEMU then generates an EPOW interrupt to the guest which causes the guest
to make the rtas,check-exception call. In QEMU, the rtas,check-exception call
returns a rtas hotplug event to the guest.
Please note that the current pci hotplug path for PowerKVM involves the
kernel receiving the rtas hotplug event, passing it to rtas_errd in
userspace, and having rtas_errd invoke drmgr. The drmgr command then
handles the request as described above for PowerVM systems.
There is no need for this circuitous route; we should just handle the entire
hotplug of devices in the kernel. What I am planning is to enable this
by moving the code to handle hotplug from drmgr into the kernel to
provide a single path for handling device hotplug for both PowerVM and
PowerKVM systems. This patch provides the common framework and entry point.
For PowerKVM a future update to the kernel rtas code will recognize rtas
hotplug events returned from rtas,check-exception calls and use the common
entry point to handle hotplug of the device.
For PowerVM systems, this patch creates /sys/kernel/dlpar that can be
used by the drmgr command to initiate hotplug requests. In order to do
this a string of the format "<resource> <action> <id_type> <id>" is
written to this file. The string consists of a resource (cpu, memory, pci,
phb), an action (add or remove), an id_type (count, drc index, drc name),
and the corresponding id. The kernel will parse the string and create a
rtas hotplug section that can be passed to the common entry point for
handling hotplug requests.
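The request format described above can be illustrated with a small userspace parser. This is only a sketch, not the kernel's implementation: `struct dlpar_request`, `lookup_token()` and `dlpar_parse_request()` are hypothetical names, and the id_type spellings `count`, `index` and `name` are assumptions. The kernel builds an rtas hotplug section from the parsed fields rather than a struct like this one.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace mirror of a parsed request; not a kernel type. */
struct dlpar_request {
	int resource;	/* index into resource_names */
	int action;	/* index into action_names */
	int id_type;	/* index into id_type_names */
	char id[64];	/* raw id token, e.g. "1" or a drc name */
};

static const char *const resource_names[] = { "cpu", "memory", "pci", "phb" };
static const char *const action_names[] = { "add", "remove" };
/* Assumed spellings for the id_type token. */
static const char *const id_type_names[] = { "count", "index", "name" };

/* Return the position of tok in names[], or -1 if it is not listed. */
static int lookup_token(const char *tok, const char *const *names, int n)
{
	for (int i = 0; i < n; i++)
		if (strcmp(tok, names[i]) == 0)
			return i;
	return -1;
}

/* Parse "<resource> <action> <id_type> <id>"; 0 on success, -1 on error. */
static int dlpar_parse_request(const char *buf, struct dlpar_request *req)
{
	char res[16], act[16], type[16];

	/* All four whitespace-separated tokens must be present. */
	if (sscanf(buf, "%15s %15s %15s %63s", res, act, type, req->id) != 4)
		return -1;

	req->resource = lookup_token(res, resource_names, 4);
	req->action = lookup_token(act, action_names, 2);
	req->id_type = lookup_token(type, id_type_names, 3);

	if (req->resource < 0 || req->action < 0 || req->id_type < 0)
		return -1;
	return 0;
}
```

Keeping the accepted tokens in tables means an unknown resource or action is rejected before any hotplug work starts.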
It should be noted that there is no chance of updating how we receive
hotplug (dlpar) requests from the HMC on PowerVM systems.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-02-10 13:47:02 -06:00
# define pr_fmt(fmt) "pseries-hotplug-mem: " fmt
2008-04-18 13:33:50 -07:00
# include <linux/of.h>
2013-09-26 07:40:04 -05:00
# include <linux/of_address.h>
2010-07-12 14:36:09 +10:00
# include <linux/memblock.h>
2011-06-14 10:57:51 +10:00
# include <linux/memory.h>
2014-01-27 10:54:06 -06:00
# include <linux/memory_hotplug.h>
2015-02-10 13:48:25 -06:00
# include <linux/slab.h>
2011-06-14 10:57:51 +10:00
2008-04-18 13:33:50 -07:00
# include <asm/firmware.h>
# include <asm/machdep.h>
2009-02-08 14:49:39 +00:00
# include <asm/sparsemem.h>
2017-06-01 22:51:26 +05:30
# include <asm/fadump.h>
2017-12-01 10:47:31 -06:00
# include <asm/drmem.h>
2014-08-20 08:55:19 +10:00
# include "pseries.h"
2008-04-18 13:33:50 -07:00
2016-06-20 09:00:39 -05:00
static void dlpar_free_property(struct property *prop)
2015-02-10 13:48:25 -06:00
{
	kfree(prop->name);
	kfree(prop->value);
	kfree(prop);
}
2016-06-20 09:00:39 -05:00
static struct property *dlpar_clone_property(struct property *prop,
					     u32 prop_size)
{
	struct property *new_prop;

	new_prop = kzalloc(sizeof(*new_prop), GFP_KERNEL);
	if (!new_prop)
		return NULL;

	new_prop->name = kstrdup(prop->name, GFP_KERNEL);
	new_prop->value = kzalloc(prop_size, GFP_KERNEL);
	if (!new_prop->name || !new_prop->value) {
		dlpar_free_property(new_prop);
		return NULL;
	}

	memcpy(new_prop->value, prop->value, prop->length);
	new_prop->length = prop_size;

	of_property_set_flag(new_prop, OF_DYNAMIC);
	return new_prop;
}
2018-10-09 21:59:13 +08:00
static bool find_aa_index(struct device_node *dr_node,
			  struct property *ala_prop,
			  const u32 *lmb_assoc, u32 *aa_index)
2016-06-20 09:01:38 -05:00
{
2023-10-11 16:37:05 +11:00
__be32 * assoc_arrays ;
u32 new_prop_size ;
2018-10-09 21:59:13 +08:00
struct property * new_prop ;
2016-06-20 09:01:38 -05:00
	int aa_arrays, aa_array_entries, aa_array_sz;
	int i, index;

	/*
	 * The ibm,associativity-lookup-arrays property is defined to be
	 * a 32-bit value specifying the number of associativity arrays
	 * followed by a 32-bit value specifying the number of entries per
	 * array, followed by the associativity arrays.
	 */
	assoc_arrays = ala_prop->value;

	aa_arrays = be32_to_cpu(assoc_arrays[0]);
	aa_array_entries = be32_to_cpu(assoc_arrays[1]);
	aa_array_sz = aa_array_entries * sizeof(u32);

	for (i = 0; i < aa_arrays; i++) {
		index = (i * aa_array_entries) + 2;

		if (memcmp(&assoc_arrays[index], &lmb_assoc[1], aa_array_sz))
			continue;
2018-10-09 21:59:13 +08:00
* aa_index = i ;
return true ;
2016-06-20 09:01:38 -05:00
}
2018-10-09 21:59:13 +08:00
	new_prop_size = ala_prop->length + aa_array_sz;
	new_prop = dlpar_clone_property(ala_prop, new_prop_size);
	if (!new_prop)
		return false;
2016-06-20 09:01:38 -05:00
2018-10-09 21:59:13 +08:00
assoc_arrays = new_prop - > value ;
2016-06-20 09:01:38 -05:00
2018-10-09 21:59:13 +08:00
/* increment the number of entries in the lookup array */
assoc_arrays [ 0 ] = cpu_to_be32 ( aa_arrays + 1 ) ;
2016-06-20 09:01:38 -05:00
2018-10-09 21:59:13 +08:00
/* copy the new associativity into the lookup array */
	index = aa_arrays * aa_array_entries + 2;
	memcpy(&assoc_arrays[index], &lmb_assoc[1], aa_array_sz);
2016-06-20 09:01:38 -05:00
2018-10-09 21:59:13 +08:00
of_update_property ( dr_node , new_prop ) ;
2016-06-20 09:01:38 -05:00
2018-10-09 21:59:13 +08:00
	/*
	 * The associativity lookup array index for this lmb is
	 * number of entries - 1 since we added its associativity
	 * to the end of the lookup array.
	 */
	*aa_index = be32_to_cpu(assoc_arrays[0]) - 1;
	return true;
2016-06-20 09:01:38 -05:00
}
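The search-then-append behaviour of find_aa_index() can be tried out in userspace. The following is a sketch under simplifying assumptions: `aa_find()` and `aa_append()` are hypothetical helpers, values are kept in host byte order instead of big-endian, the associativity passed in has already been stripped of its leading length cell (the kernel skips it with `&lmb_assoc[1]`), and a plain heap buffer stands in for the device tree property.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * The blob is laid out as:
 *   [number of arrays][entries per array][array 0][array 1]...
 * Return the index of the array equal to assoc, or -1 if none matches.
 */
static int aa_find(const uint32_t *blob, const uint32_t *assoc)
{
	uint32_t n_arrays = blob[0];
	uint32_t entries = blob[1];

	for (uint32_t i = 0; i < n_arrays; i++) {
		const uint32_t *candidate = &blob[2 + i * entries];

		if (memcmp(candidate, assoc, entries * sizeof(uint32_t)) == 0)
			return (int)i;
	}
	return -1;
}

/*
 * Return a newly allocated copy of blob with assoc appended as the last
 * array and the array count incremented. The caller frees the result.
 * Returns NULL if the allocation fails.
 */
static uint32_t *aa_append(const uint32_t *blob, const uint32_t *assoc,
			   int *new_index)
{
	uint32_t n_arrays = blob[0];
	uint32_t entries = blob[1];
	size_t old_words = 2 + (size_t)n_arrays * entries;
	uint32_t *grown = calloc(old_words + entries, sizeof(uint32_t));

	if (!grown)
		return NULL;

	memcpy(grown, blob, old_words * sizeof(uint32_t));
	memcpy(&grown[old_words], assoc, entries * sizeof(uint32_t));
	grown[0] = n_arrays + 1;

	/* The appended array sits at index (new count - 1). */
	*new_index = (int)n_arrays;
	return grown;
}
```

Building a grown copy rather than editing in place mirrors the clone-and-update approach above, where the old property stays valid until the new one is installed.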
2018-04-20 15:29:48 -05:00
static int update_lmb_associativity_index ( struct drmem_lmb * lmb )
2016-02-10 11:12:13 -06:00
{
struct device_node * parent , * lmb_node , * dr_node ;
2016-06-20 09:01:38 -05:00
struct property * ala_prop ;
2016-02-10 11:12:13 -06:00
const u32 * lmb_assoc ;
u32 aa_index ;
2018-10-09 21:59:13 +08:00
bool found ;
2016-02-10 11:12:13 -06:00
	parent = of_find_node_by_path("/");
	if (!parent)
		return -ENODEV;

	lmb_node = dlpar_configure_connector(cpu_to_be32(lmb->drc_index),
					     parent);
	of_node_put(parent);
	if (!lmb_node)
		return -EINVAL;

	lmb_assoc = of_get_property(lmb_node, "ibm,associativity", NULL);
	if (!lmb_assoc) {
		dlpar_free_cc_nodes(lmb_node);
		return -ENODEV;
	}
2021-08-12 18:52:21 +05:30
update_numa_distance ( lmb_node ) ;
2016-02-10 11:12:13 -06:00
	dr_node = of_find_node_by_path("/ibm,dynamic-reconfiguration-memory");
	if (!dr_node) {
		dlpar_free_cc_nodes(lmb_node);
		return -ENODEV;
	}
2016-06-20 09:01:38 -05:00
	ala_prop = of_find_property(dr_node, "ibm,associativity-lookup-arrays",
				    NULL);
	if (!ala_prop) {
		of_node_put(dr_node);
2016-02-10 11:12:13 -06:00
dlpar_free_cc_nodes ( lmb_node ) ;
return - ENODEV ;
}
2018-10-09 21:59:13 +08:00
found = find_aa_index ( dr_node , ala_prop , lmb_assoc , & aa_index ) ;
2016-02-10 11:12:13 -06:00
2018-11-27 19:16:44 +11:00
of_node_put ( dr_node ) ;
2016-02-10 11:12:13 -06:00
dlpar_free_cc_nodes ( lmb_node ) ;
2018-10-09 21:59:13 +08:00
if ( ! found ) {
2018-04-20 15:29:48 -05:00
		pr_err("Could not find LMB associativity\n");
		return -1;
2016-02-10 11:12:13 -06:00
}
lmb - > aa_index = aa_index ;
2018-04-20 15:29:48 -05:00
return 0 ;
2016-02-10 11:12:13 -06:00
}
2017-12-01 10:47:31 -06:00
static struct memory_block * lmb_to_memblock ( struct drmem_lmb * lmb )
2017-02-15 13:45:30 -05:00
{
	unsigned long section_nr;
	struct memory_block *mem_block;

	section_nr = pfn_to_section_nr(PFN_DOWN(lmb->base_addr));
2021-09-02 14:57:01 -07:00
mem_block = find_memory_block ( section_nr ) ;
2017-02-15 13:45:30 -05:00
return mem_block ;
}
2017-12-01 10:47:31 -06:00
static int get_lmb_range(u32 drc_index, int n_lmbs,
			 struct drmem_lmb **start_lmb,
			 struct drmem_lmb **end_lmb)
{
	struct drmem_lmb *lmb, *start, *end;
powerpc/pseries: Avoid NULL pointer dereference when drmem is unavailable
In guests without hotpluggable memory, the drmem structure is only zero
initialized. Trying to manipulate DLPAR parameters results in a crash.
$ echo "memory add count 1" > /sys/kernel/dlpar
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
...
NIP: c0000000000ff294 LR: c0000000000ff248 CTR: 0000000000000000
REGS: c0000000fb9d3880 TRAP: 0300 Tainted: G E (5.5.0-rc6-2-default)
MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28242428 XER: 20000000
CFAR: c0000000009a6c10 DAR: 0000000000000010 DSISR: 40000000 IRQMASK: 0
...
NIP dlpar_memory+0x6e4/0xd00
LR dlpar_memory+0x698/0xd00
Call Trace:
dlpar_memory+0x698/0xd00 (unreliable)
handle_dlpar_errorlog+0xc0/0x190
dlpar_store+0x198/0x4a0
kobj_attr_store+0x30/0x50
sysfs_kf_write+0x64/0x90
kernfs_fop_write+0x1b0/0x290
__vfs_write+0x3c/0x70
vfs_write+0xd0/0x260
ksys_write+0xdc/0x130
system_call+0x5c/0x68
Taking a closer look at the code, I can see that for_each_drmem_lmb is a
macro expanding into `for (lmb = &drmem_info->lmbs[0]; lmb <=
&drmem_info->lmbs[drmem_info->n_lmbs - 1]; lmb++)`. When drmem_info->lmbs
is NULL, the loop would iterate through the whole address range if it
weren't stopped by the NULL pointer dereference on the next line.
This patch aligns for_each_drmem_lmb and for_each_drmem_lmb_in_range
macro behavior with the common C semantics, where the end marker does
not belong to the scanned range, and alters get_lmb_range() semantics.
As a side effect, the wraparound observed in the crash is prevented.
Fixes: 6c6ea53725b3 ("powerpc/mm: Separate ibm, dynamic-memory data from DT format")
Cc: stable@vger.kernel.org # v4.16+
Signed-off-by: Libor Pechacek <lpechacek@suse.cz>
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200131132829.10281-1-msuchanek@suse.de
2020-01-31 14:28:29 +01:00
struct drmem_lmb * limit ;
2017-12-01 10:47:31 -06:00
	start = NULL;
	for_each_drmem_lmb(lmb) {
		if (lmb->drc_index == drc_index) {
			start = lmb;
			break;
		}
	}

	if (!start)
		return -EINVAL;
2020-01-31 14:28:29 +01:00
	end = &start[n_lmbs];
2017-12-01 10:47:31 -06:00
2020-01-31 14:28:29 +01:00
	limit = &drmem_info->lmbs[drmem_info->n_lmbs];
	if (end > limit)
2017-12-01 10:47:31 -06:00
return - EINVAL ;
* start_lmb = start ;
* end_lmb = end ;
return 0 ;
}
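The half-open range convention that get_lmb_range() follows, where the end pointer is one past the last element, can be sketched in userspace. The names `struct lmb`, `struct lmb_table` and `lmb_range()` are hypothetical stand-ins for the drmem structures, and the sketch iterates by index so that an empty table with a NULL array is never dereferenced.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct drmem_lmb and drmem_info. */
struct lmb {
	unsigned int drc_index;
};

struct lmb_table {
	struct lmb *lmbs;	/* may be NULL when n_lmbs is 0 */
	size_t n_lmbs;
};

/*
 * Find the LMB with the given drc_index and return the half-open range
 * [*start_out, *end_out) covering n entries. Returns 0 on success and -1
 * if the index is unknown or the range would run past the table.
 */
static int lmb_range(const struct lmb_table *t, unsigned int drc_index,
		     size_t n, struct lmb **start_out, struct lmb **end_out)
{
	struct lmb *start = NULL;

	/* With n_lmbs == 0 the body never runs, so a NULL array is safe. */
	for (size_t i = 0; i < t->n_lmbs; i++) {
		if (t->lmbs[i].drc_index == drc_index) {
			start = &t->lmbs[i];
			break;
		}
	}
	if (!start)
		return -1;

	/* Compare counts so no pointer beyond one-past-the-end is formed. */
	if (n > t->n_lmbs - (size_t)(start - t->lmbs))
		return -1;

	*start_out = start;
	*end_out = start + n;	/* exclusive end marker */
	return 0;
}
```

A caller would then walk the range with `for (p = start; p < end; p++)`, which handles the empty case without special treatment.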
static int dlpar_change_lmb_state ( struct drmem_lmb * lmb , bool online )
2017-08-02 14:03:22 -04:00
{
struct memory_block * mem_block ;
int rc ;
mem_block = lmb_to_memblock ( lmb ) ;
2023-11-14 11:01:55 -06:00
	if (!mem_block) {
		pr_err("Failed memory block lookup for LMB 0x%x\n", lmb->drc_index);
2017-08-02 14:03:22 -04:00
return - EINVAL ;
2023-11-14 11:01:55 -06:00
}
2017-08-02 14:03:22 -04:00
	if (online && mem_block->dev.offline)
		rc = device_online(&mem_block->dev);
	else if (!online && !mem_block->dev.offline)
		rc = device_offline(&mem_block->dev);
	else
		rc = 0;

	put_device(&mem_block->dev);

	return rc;
}
2017-12-01 10:47:31 -06:00
static int dlpar_online_lmb ( struct drmem_lmb * lmb )
2017-08-02 14:03:22 -04:00
{
return dlpar_change_lmb_state ( lmb , true ) ;
}
2013-04-29 15:08:22 -07:00
# ifdef CONFIG_MEMORY_HOTREMOVE
2017-12-01 10:47:31 -06:00
static int dlpar_offline_lmb ( struct drmem_lmb * lmb )
2017-08-02 14:03:22 -04:00
{
return dlpar_change_lmb_state ( lmb , false ) ;
}
2020-10-07 17:18:34 +05:30
static int pseries_remove_memblock ( unsigned long base , unsigned long memblock_size )
2014-01-27 10:54:06 -06:00
{
2023-08-01 10:14:46 +05:30
unsigned long start_pfn ;
2014-01-27 10:54:06 -06:00
int sections_per_block ;
2021-09-07 19:55:09 -07:00
int i ;
2008-07-03 13:20:58 +10:00
2008-10-01 09:44:02 +00:00
	start_pfn = base >> PAGE_SHIFT;
2008-10-13 08:42:00 +00:00
powerpc/pseries: Protect remove_memory() with device hotplug lock
While testing memory hot-remove, I found the following deadlock:
Process #1141 is drmgr, trying to remove some memory, i.e. memory499.
It holds the memory_hotplug_mutex, and blocks when trying to remove file
"online" under dir memory499, in kernfs_drain(), at
wait_event(root->deactivate_waitq,
atomic_read(&kn->active) == KN_DEACTIVATED_BIAS);
Process #1120 is trying to online memory499 by
echo 1 > memory499/online
In .kernfs_fop_write, it uses kernfs_get_active() to increase
&kn->active, thus blocking process #1141. Process #1120 is itself blocked
later when trying to acquire memory_hotplug_mutex, which is held by process
#1141.
The backtraces of both processes are shown below:
[<c000000001b18600>] 0xc000000001b18600
[<c000000000015044>] .__switch_to+0x144/0x200
[<c000000000263ca4>] .online_pages+0x74/0x7b0
[<c00000000055b40c>] .memory_subsys_online+0x9c/0x150
[<c00000000053cbe8>] .device_online+0xb8/0x120
[<c00000000053cd04>] .online_store+0xb4/0xc0
[<c000000000538ce4>] .dev_attr_store+0x64/0xa0
[<c00000000030f4ec>] .sysfs_kf_write+0x7c/0xb0
[<c00000000030e574>] .kernfs_fop_write+0x154/0x1e0
[<c000000000268450>] .vfs_write+0xe0/0x260
[<c000000000269144>] .SyS_write+0x64/0x110
[<c000000000009ffc>] syscall_exit+0x0/0x7c
[<c000000001b18600>] 0xc000000001b18600
[<c000000000015044>] .__switch_to+0x144/0x200
[<c00000000030be14>] .__kernfs_remove+0x204/0x300
[<c00000000030d428>] .kernfs_remove_by_name_ns+0x68/0xf0
[<c00000000030fb38>] .sysfs_remove_file_ns+0x38/0x60
[<c000000000539354>] .device_remove_attrs+0x54/0xc0
[<c000000000539fd8>] .device_del+0x158/0x250
[<c00000000053a104>] .device_unregister+0x34/0xa0
[<c00000000055bc14>] .unregister_memory_section+0x164/0x170
[<c00000000024ee18>] .__remove_pages+0x108/0x4c0
[<c00000000004b590>] .arch_remove_memory+0x60/0xc0
[<c00000000026446c>] .remove_memory+0x8c/0xe0
[<c00000000007f9f4>] .pseries_remove_memblock+0xd4/0x160
[<c00000000007fcfc>] .pseries_memory_notifier+0x27c/0x290
[<c0000000008ae6cc>] .notifier_call_chain+0x8c/0x100
[<c0000000000d858c>] .__blocking_notifier_call_chain+0x6c/0xe0
[<c00000000071ddec>] .of_property_notify+0x7c/0xc0
[<c00000000071ed3c>] .of_update_property+0x3c/0x1b0
[<c0000000000756cc>] .ofdt_write+0x3dc/0x740
[<c0000000002f60fc>] .proc_reg_write+0xac/0x110
[<c000000000268450>] .vfs_write+0xe0/0x260
[<c000000000269144>] .SyS_write+0x64/0x110
[<c000000000009ffc>] syscall_exit+0x0/0x7c
This patch uses lock_device_hotplug() to protect remove_memory() called
in pseries_remove_memblock(), which is also stated before function
remove_memory():
* NOTE: The caller must call lock_device_hotplug() to serialize hotplug
* and online/offline operations before this call, as required by
* try_offline_node().
*/
void __ref remove_memory(int nid, u64 start, u64 size)
With this lock held, the other process (#1120 above) trying to online the
memory block will retry the system call when calling
lock_device_hotplug_sysfs(), and finally get a "No such device" error.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-10 16:25:31 +08:00
	lock_device_hotplug();

	if (!pfn_valid(start_pfn))
		goto out;
2008-10-13 08:42:00 +00:00
2023-08-01 10:14:46 +05:30
sections_per_block = memory_block_size / MIN_MEMORY_BLOCK_SIZE ;
2008-04-18 13:33:50 -07:00
2014-01-27 10:54:06 -06:00
	for (i = 0; i < sections_per_block; i++) {
2021-09-07 19:55:09 -07:00
		__remove_memory(base, MIN_MEMORY_BLOCK_SIZE);
2014-01-27 10:54:06 -06:00
		base += MIN_MEMORY_BLOCK_SIZE;
2012-10-08 16:34:14 -07:00
}
2008-04-18 13:33:50 -07:00
2014-04-10 16:25:31 +08:00
out :
2014-01-27 10:54:06 -06:00
/* Update memory regions for memory remove */
2010-07-12 14:36:09 +10:00
memblock_remove ( base , memblock_size ) ;
2014-04-10 16:25:31 +08:00
unlock_device_hotplug ( ) ;
2014-01-27 10:54:06 -06:00
return 0 ;
2008-04-18 13:33:50 -07:00
}
2014-01-27 10:54:06 -06:00
static int pseries_remove_mem_node ( struct device_node * np )
2008-07-03 13:22:39 +10:00
{
2023-03-29 17:03:36 -05:00
int ret ;
struct resource res ;
2008-07-03 13:22:39 +10:00
/*
* Check to see if we are actually removing memory
*/
2018-11-16 16:11:00 -06:00
	if (!of_node_is_type(np, "memory"))
2008-07-03 13:22:39 +10:00
return 0 ;
/*
2014-08-07 13:11:58 +08:00
* Find the base address and size of the memblock
2008-07-03 13:22:39 +10:00
*/
2023-03-29 17:03:36 -05:00
	ret = of_address_to_resource(np, 0, &res);
	if (ret)
2008-07-03 13:22:39 +10:00
return ret ;
2023-03-29 17:03:36 -05:00
	pseries_remove_memblock(res.start, resource_size(&res));
2014-01-27 10:54:06 -06:00
return 0 ;
2008-07-03 13:22:39 +10:00
}
2015-02-10 13:49:22 -06:00
2017-12-01 10:47:31 -06:00
static bool lmb_is_removable ( struct drmem_lmb * lmb )
2015-02-10 13:49:22 -06:00
{
2021-05-12 17:28:07 -03:00
	if ((lmb->flags & DRCONF_MEM_RESERVED) ||
	    !(lmb->flags & DRCONF_MEM_ASSIGNED))
2015-02-10 13:49:22 -06:00
return false ;
2017-06-01 22:51:26 +05:30
# ifdef CONFIG_FA_DUMP
2018-08-20 13:47:32 +05:30
	/*
	 * Don't hot-remove memory that falls in fadump boot memory area
	 * and memory that is reserved for capturing old kernel memory.
	 */
powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()
In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory blocks
as removable"), the user space interface to compute whether a memory block
can be offlined (exposed via /sys/devices/system/memory/memoryX/removable)
has effectively been deprecated. We want to remove the leftovers of the
kernel implementation.
When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
we'll start by:
1. Testing if it contains any holes, and reject if so
2. Testing if pages belong to different zones, and reject if so
3. Isolating the page range, checking if it contains any unmovable pages
Using is_mem_section_removable() before trying to offline is not only
racy, it can easily result in false positives/negatives. Let's stop
manually checking is_mem_section_removable(), and let device_offline()
handle it completely instead. We can remove the racy
is_mem_section_removable() implementation next.
We now take more locks (e.g., memory hotplug lock when offlining and the
zone lock when isolating), but maybe we should optimize that
implementation instead if this ever becomes a real problem (after all,
memory unplug is already an expensive operation). We started using
is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
Implement memory hotplug remove in the kernel"), with the initial
hotremove support of lmbs.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Baoquan He <bhe@redhat.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Link: http://lkml.kernel.org/r/20200407135416.24093-2-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-04 16:48:28 -07:00
	if (is_fadump_memory_area(lmb->base_addr, memory_block_size_bytes()))
2017-06-01 22:51:26 +05:30
return false ;
# endif
2020-06-04 16:48:28 -07:00
	/* device_offline() will determine if we can actually remove this lmb */
	return true;
2015-02-10 13:49:22 -06:00
}
2017-12-01 10:47:31 -06:00
static int dlpar_add_lmb(struct drmem_lmb *);
2015-02-10 13:49:22 -06:00
2017-12-01 10:47:31 -06:00
static int dlpar_remove_lmb(struct drmem_lmb *lmb)
2015-02-10 13:49:22 -06:00
{
pseries/drmem: don't cache node id in drmem_lmb struct
At memory hot-remove time we can retrieve an LMB's nid from its
corresponding memory_block. There is no need to store the nid
in multiple locations.
Note that lmb_to_memblock() uses find_memory_block() to get the
corresponding memory_block. As find_memory_block() runs in sub-linear
time this approach is negligibly slower than what we do at present.
In exchange for this lookup at hot-remove time we no longer need to
call memory_add_physaddr_to_nid() during drmem_init() for each LMB.
On powerpc, memory_add_physaddr_to_nid() is a linear search, so this
spares us an O(n^2) initialization during boot.
On systems with many LMBs that initialization overhead is palpable and
disruptive. For example, on a box with 249854 LMBs we're seeing
drmem_init() take upwards of 30 seconds to complete:
[ 53.721639] drmem: initializing drmem v2
[ 80.604346] watchdog: BUG: soft lockup - CPU#65 stuck for 23s! [swapper/0:1]
[ 80.604377] Modules linked in:
[ 80.604389] CPU: 65 PID: 1 Comm: swapper/0 Not tainted 5.6.0-rc2+ #4
[ 80.604397] NIP: c0000000000a4980 LR: c0000000000a4940 CTR: 0000000000000000
[ 80.604407] REGS: c0002dbff8493830 TRAP: 0901 Not tainted (5.6.0-rc2+)
[ 80.604412] MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 44000248 XER: 0000000d
[ 80.604431] CFAR: c0000000000a4a38 IRQMASK: 0
[ 80.604431] GPR00: c0000000000a4940 c0002dbff8493ac0 c000000001904400 c0003cfffffede30
[ 80.604431] GPR04: 0000000000000000 c000000000f4095a 000000000000002f 0000000010000000
[ 80.604431] GPR08: c0000bf7ecdb7fb8 c0000bf7ecc2d3c8 0000000000000008 c00c0002fdfb2001
[ 80.604431] GPR12: 0000000000000000 c00000001e8ec200
[ 80.604477] NIP [c0000000000a4980] hot_add_scn_to_nid+0xa0/0x3e0
[ 80.604486] LR [c0000000000a4940] hot_add_scn_to_nid+0x60/0x3e0
[ 80.604492] Call Trace:
[ 80.604498] [c0002dbff8493ac0] [c0000000000a4940] hot_add_scn_to_nid+0x60/0x3e0 (unreliable)
[ 80.604509] [c0002dbff8493b20] [c000000000087c10] memory_add_physaddr_to_nid+0x20/0x60
[ 80.604521] [c0002dbff8493b40] [c0000000010d4880] drmem_init+0x25c/0x2f0
[ 80.604530] [c0002dbff8493c10] [c000000000010154] do_one_initcall+0x64/0x2c0
[ 80.604540] [c0002dbff8493ce0] [c0000000010c4aa0] kernel_init_freeable+0x2d8/0x3a0
[ 80.604550] [c0002dbff8493db0] [c000000000010824] kernel_init+0x2c/0x148
[ 80.604560] [c0002dbff8493e20] [c00000000000b648] ret_from_kernel_thread+0x5c/0x74
[ 80.604567] Instruction dump:
[ 80.604574] 392918e8 e9490000 e90a000a e92a0000 80ea000c 1d080018 3908ffe8 7d094214
[ 80.604586] 7fa94040 419d00dc e9490010 714a0088 <2faa0008> 409e00ac e9490000 7fbe5040
[ 89.047390] drmem: 249854 LMB(s)
With a patched kernel on the same machine we're no longer seeing the
soft lockup. drmem_init() now completes in negligible time, even when
the LMB count is large.
Fixes: b2d3b5ee66f2 ("powerpc/pseries: Track LMB nid instead of using device tree")
Signed-off-by: Scott Cheloha <cheloha@linux.ibm.com>
Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200811015115.63677-1-cheloha@linux.ibm.com
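The quadratic cost described above can be illustrated with a small user-space model. This is only a sketch: linear_lookup_steps() and init_lookup_steps() are hypothetical helpers, not kernel functions, and they merely count how many table entries a linear search inspects when every entry is looked up once, as drmem_init() used to do through memory_add_physaddr_to_nid().

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for one linear search: returns how many
 * entries were inspected before `target` was found in `table`.
 */
static size_t linear_lookup_steps(const unsigned long *table, size_t n,
				  unsigned long target)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (table[i] == target)
			return i + 1;
	return n;	/* not found: the whole table was scanned */
}

/*
 * Models an init path that performs one linear lookup per entry.
 * For n distinct entries the total is 1 + 2 + ... + n = n(n+1)/2,
 * which is the O(n^2) behaviour the commit message refers to.
 */
static size_t init_lookup_steps(const unsigned long *table, size_t n)
{
	size_t total = 0, i;

	assert(table != NULL || n == 0);
	for (i = 0; i < n; i++)
		total += linear_lookup_steps(table, n, table[i]);
	return total;
}
```

Under this model, the 249854 LMBs from the log would need roughly 3.1 * 10^10 comparisons at boot, whereas deferring the lookup to hot-remove time costs one lookup per removed LMB.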
2020-08-10 20:51:15 -05:00
	struct memory_block *mem_block;
2018-10-02 10:35:59 -05:00
	int rc;
2015-02-10 13:49:22 -06:00
	if (!lmb_is_removable(lmb))
		return -EINVAL;
2020-08-10 20:51:15 -05:00
	mem_block = lmb_to_memblock(lmb);
	if (mem_block == NULL)
		return -EINVAL;
2017-08-02 14:03:22 -04:00
	rc = dlpar_offline_lmb(lmb);
2020-08-10 20:51:15 -05:00
	if (rc) {
		put_device(&mem_block->dev);
2015-02-10 13:49:22 -06:00
		return rc;
2020-08-10 20:51:15 -05:00
	}
2015-02-10 13:49:22 -06:00
2023-08-01 10:14:46 +05:30
	__remove_memory(lmb->base_addr, memory_block_size);
2020-08-10 20:51:15 -05:00
	put_device(&mem_block->dev);
2015-02-10 13:49:22 -06:00
	/* Update memory regions for memory remove */
2023-08-01 10:14:46 +05:30
	memblock_remove(lmb->base_addr, memory_block_size);
2015-02-10 13:49:22 -06:00
2018-04-20 15:29:48 -05:00
	invalidate_lmb_associativity_index(lmb);
	lmb->flags &= ~DRCONF_MEM_ASSIGNED;
2015-02-10 13:49:22 -06:00
	return 0;
}
2017-12-01 10:47:31 -06:00
static int dlpar_memory_remove_by_count(u32 lmbs_to_remove)
2015-02-10 13:49:22 -06:00
{
2017-12-01 10:47:31 -06:00
	struct drmem_lmb *lmb;
2021-05-12 17:28:08 -03:00
	int lmbs_reserved = 0;
2015-02-10 13:49:22 -06:00
	int lmbs_available = 0;
2017-12-01 10:47:31 -06:00
	int rc;
2015-02-10 13:49:22 -06:00
	pr_info("Attempting to hot-remove %d LMB(s)\n", lmbs_to_remove);
	if (lmbs_to_remove == 0)
		return -EINVAL;
	/* Validate that there are enough LMBs to satisfy the request */
2017-12-01 10:47:31 -06:00
	for_each_drmem_lmb(lmb) {
		if (lmb_is_removable(lmb))
2015-02-10 13:49:22 -06:00
			lmbs_available++;
2017-12-01 10:47:31 -06:00
		if (lmbs_available == lmbs_to_remove)
			break;
2015-02-10 13:49:22 -06:00
	}
2016-11-28 11:50:45 -05:00
	if (lmbs_available < lmbs_to_remove) {
		pr_info("Not enough LMBs available (%d of %d) to satisfy request\n",
			lmbs_available, lmbs_to_remove);
2015-02-10 13:49:22 -06:00
		return -EINVAL;
2016-11-28 11:50:45 -05:00
	}
2015-02-10 13:49:22 -06:00
2017-12-01 10:47:31 -06:00
	for_each_drmem_lmb(lmb) {
		rc = dlpar_remove_lmb(lmb);
2015-02-10 13:49:22 -06:00
		if (rc)
			continue;
		/* Mark this lmb so we can add it later if all of the
		 * requested LMBs cannot be removed.
		 */
2017-12-01 10:47:31 -06:00
		drmem_mark_lmb_reserved(lmb);
2021-05-12 17:28:08 -03:00
		lmbs_reserved++;
		if (lmbs_reserved == lmbs_to_remove)
2017-12-01 10:47:31 -06:00
			break;
2015-02-10 13:49:22 -06:00
	}
2021-05-12 17:28:08 -03:00
	if (lmbs_reserved != lmbs_to_remove) {
2015-02-10 13:49:22 -06:00
		pr_err("Memory hot-remove failed, adding LMB's back\n");
2017-12-01 10:47:31 -06:00
		for_each_drmem_lmb(lmb) {
			if (!drmem_lmb_reserved(lmb))
2015-02-10 13:49:22 -06:00
				continue;
2017-12-01 10:47:31 -06:00
			rc = dlpar_add_lmb(lmb);
2015-02-10 13:49:22 -06:00
			if (rc)
				pr_err("Failed to add LMB back, drc index %x\n",
2017-12-01 10:47:31 -06:00
				       lmb->drc_index);
2015-02-10 13:49:22 -06:00
2017-12-01 10:47:31 -06:00
			drmem_remove_lmb_reservation(lmb);
2021-05-12 17:28:08 -03:00
			lmbs_reserved--;
			if (lmbs_reserved == 0)
				break;
2015-02-10 13:49:22 -06:00
		}
		rc = -EINVAL;
	} else {
2017-12-01 10:47:31 -06:00
		for_each_drmem_lmb(lmb) {
			if (!drmem_lmb_reserved(lmb))
2015-02-10 13:49:22 -06:00
				continue;
2017-12-01 10:47:31 -06:00
			dlpar_release_drc(lmb->drc_index);
2015-02-10 13:49:22 -06:00
			pr_info("Memory at %llx was hot-removed\n",
2017-12-01 10:47:31 -06:00
				lmb->base_addr);
2015-02-10 13:49:22 -06:00
2017-12-01 10:47:31 -06:00
			drmem_remove_lmb_reservation(lmb);
2021-05-12 17:28:08 -03:00
			lmbs_reserved--;
			if (lmbs_reserved == 0)
				break;
2015-02-10 13:49:22 -06:00
		}
		rc = 0;
	}
	return rc;
}
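The mark-then-commit-or-roll-back pattern used by the remove-by-count path can be modelled in user space. The sketch below is an illustration only: struct toy_lmb and toy_remove_by_count() are hypothetical names, the `removable` flag stands in for whether offlining would succeed, and the up-front availability check of the real function is left out for brevity.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of an LMB; not the kernel's struct drmem_lmb. */
struct toy_lmb {
	bool assigned;	/* models DRCONF_MEM_ASSIGNED */
	bool removable;	/* whether the simulated removal would succeed */
	bool reserved;	/* models the temporary reservation mark */
};

/* Remove exactly `want` blocks, or restore every block touched and fail. */
static int toy_remove_by_count(struct toy_lmb *lmbs, size_t n, size_t want)
{
	size_t i, reserved = 0;

	assert(lmbs != NULL || n == 0);
	if (want == 0)
		return -EINVAL;

	/* Pass 1: attempt removals, marking each block that was removed. */
	for (i = 0; i < n && reserved < want; i++) {
		if (!lmbs[i].assigned || !lmbs[i].removable)
			continue;
		lmbs[i].assigned = false;
		lmbs[i].reserved = true;
		reserved++;
	}

	/* Pass 2: visit only marked blocks; commit or add them back. */
	for (i = 0; i < n; i++) {
		if (!lmbs[i].reserved)
			continue;
		if (reserved != want)
			lmbs[i].assigned = true;	/* roll back */
		lmbs[i].reserved = false;		/* drop the mark */
	}

	return reserved == want ? 0 : -EINVAL;
}
```

The reservation mark is what lets the second pass tell blocks removed by this request apart from blocks that were already unassigned before the request started.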
2017-12-01 10:47:31 -06:00
static int dlpar_memory_remove_by_index(u32 drc_index)
2015-02-10 13:49:22 -06:00
{
2017-12-01 10:47:31 -06:00
	struct drmem_lmb *lmb;
2015-02-10 13:49:22 -06:00
	int lmb_found;
2017-12-01 10:47:31 -06:00
	int rc;
2015-02-10 13:49:22 -06:00
powerpc/pseries/memhotplug: Quieten some DLPAR operations
When attempting to remove by index a set of LMBs a lot of messages are
displayed on the console, even when everything goes fine:
pseries-hotplug-mem: Attempting to hot-remove LMB, drc index 8000002d
Offlined Pages 4096
pseries-hotplug-mem: Memory at 2d0000000 was hot-removed
The 2 messages prefixed by "pseries-hotplug-mem" are not really
helpful for the end user, they should be debug outputs.
In case of error, because some of the LMB's pages couldn't be
offlined, the following is displayed on the console:
pseries-hotplug-mem: Attempting to hot-remove LMB, drc index 8000003e
pseries-hotplug-mem: Failed to hot-remove memory at 3e0000000
dlpar: Could not handle DLPAR request "memory remove index 0x8000003e"
Again, the 2 messages prefixed by "pseries-hotplug-mem" are useless,
and the generic DLPAR prefixed message should be enough.
These 2 first changes are mainly triggered by the changes introduced
in drmgr:
https://groups.google.com/g/powerpc-utils-devel/c/Y6ef4NB3EzM/m/9cu5JHRxAQAJ
Also, when adding a bunch of LMBs, a message is displayed in the console per LMB
like these ones:
pseries-hotplug-mem: Memory at 7e0000000 (drc index 8000007e) was hot-added
pseries-hotplug-mem: Memory at 7f0000000 (drc index 8000007f) was hot-added
pseries-hotplug-mem: Memory at 800000000 (drc index 80000080) was hot-added
pseries-hotplug-mem: Memory at 810000000 (drc index 80000081) was hot-added
When adding 1TB of memory and LMB size is 256MB, this leads to 4096
messages to be displayed on the console. These messages are not really
helpful for the end user, so moving them to the DEBUG level.
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
[mpe: Tweak change log wording]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201211145954.90143-1-ldufour@linux.ibm.com
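The message count quoted above follows directly from the sizes involved. As a quick check, here is a minimal sketch; lmb_count() is a hypothetical helper written for this illustration, not something the kernel provides.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of LMBs, and therefore per-LMB console messages, needed to
 * cover `total_bytes` of memory when each LMB spans `lmb_bytes`.
 */
static uint64_t lmb_count(uint64_t total_bytes, uint64_t lmb_bytes)
{
	assert(lmb_bytes != 0);
	return total_bytes / lmb_bytes;
}
```

With 1TB taken as 2^40 bytes and 256MB as 2^28 bytes, lmb_count(1ULL << 40, 256ULL << 20) yields 2^12 = 4096, matching the figure in the change log.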
2020-12-11 15:59:54 +01:00
	pr_debug("Attempting to hot-remove LMB, drc index %x\n", drc_index);
2015-02-10 13:49:22 -06:00
	lmb_found = 0;
2017-12-01 10:47:31 -06:00
	for_each_drmem_lmb(lmb) {
		if (lmb->drc_index == drc_index) {
2015-02-10 13:49:22 -06:00
			lmb_found = 1;
2017-12-01 10:47:31 -06:00
			rc = dlpar_remove_lmb(lmb);
2017-01-06 13:25:53 -06:00
			if (!rc)
2017-12-01 10:47:31 -06:00
				dlpar_release_drc(lmb->drc_index);
2017-01-06 13:25:53 -06:00
2015-02-10 13:49:22 -06:00
			break;
		}
	}
2023-11-14 11:01:53 -06:00
	if (!lmb_found) {
		pr_debug("Failed to look up LMB for drc index %x\n", drc_index);
2015-02-10 13:49:22 -06:00
		rc = -EINVAL;
2023-11-14 11:01:53 -06:00
	} else if (rc) {
2020-12-11 15:59:54 +01:00
		pr_debug("Failed to hot-remove memory at %llx\n",
			 lmb->base_addr);
2023-11-14 11:01:53 -06:00
	} else {
2020-12-11 15:59:54 +01:00
		pr_debug("Memory at %llx was hot-removed\n", lmb->base_addr);
2023-11-14 11:01:53 -06:00
	}
2015-02-10 13:49:22 -06:00
	return rc;
}
2017-12-01 10:47:31 -06:00
static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index)
2017-02-15 13:46:18 -05:00
{
2017-12-01 10:47:31 -06:00
	struct drmem_lmb *lmb, *start_lmb, *end_lmb;
	int rc;
2017-02-15 13:46:18 -05:00
	pr_info("Attempting to hot-remove %u LMB(s) at %x\n",
		lmbs_to_remove, drc_index);
	if (lmbs_to_remove == 0)
		return -EINVAL;
2017-12-01 10:47:31 -06:00
	rc = get_lmb_range(drc_index, lmbs_to_remove, &start_lmb, &end_lmb);
	if (rc)
2017-02-15 13:46:18 -05:00
		return -EINVAL;
2021-05-12 17:28:09 -03:00
	/*
	 * Validate that all LMBs in range are not reserved. Note that it
	 * is ok if they are !ASSIGNED since our goal here is to remove the
	 * LMB range, regardless of whether some LMBs were already removed
	 * by any other reason.
	 *
	 * This is a contrast to what is done in remove_by_count() where we
	 * check for both RESERVED and !ASSIGNED (via lmb_is_removable()),
	 * because we want to remove a fixed amount of LMBs in that function.
	 */
2017-12-01 10:47:31 -06:00
	for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) {
2021-05-12 17:28:09 -03:00
		if (lmb->flags & DRCONF_MEM_RESERVED) {
			pr_err("Memory at %llx (drc index %x) is reserved\n",
			       lmb->base_addr, lmb->drc_index);
			return -EINVAL;
		}
2017-02-15 13:46:18 -05:00
	}
2017-12-01 10:47:31 -06:00
	for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) {
2021-05-12 17:28:09 -03:00
		/*
		 * dlpar_remove_lmb() will error out if the LMB is already
		 * !ASSIGNED, but this case is a no-op for us.
		 */
2017-12-01 10:47:31 -06:00
		if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
2017-02-15 13:46:18 -05:00
			continue;
2017-12-01 10:47:31 -06:00
		rc = dlpar_remove_lmb(lmb);
2017-02-15 13:46:18 -05:00
		if (rc)
			break;
2017-12-01 10:47:31 -06:00
		drmem_mark_lmb_reserved(lmb);
2017-02-15 13:46:18 -05:00
	}
	if (rc) {
		pr_err("Memory indexed-count-remove failed, adding any removed LMBs\n");
2017-12-01 10:47:31 -06:00
		for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) {
			if (!drmem_lmb_reserved(lmb))
2017-02-15 13:46:18 -05:00
				continue;
2021-05-12 17:28:06 -03:00
			/*
			 * Setting the isolation state of an UNISOLATED/CONFIGURED
			 * device to UNISOLATE is a no-op, but the hypervisor can
			 * use it as a hint that the LMB removal failed.
			 */
			dlpar_unisolate_drc(lmb->drc_index);
2017-12-01 10:47:31 -06:00
			rc = dlpar_add_lmb(lmb);
2017-02-15 13:46:18 -05:00
			if (rc)
				pr_err("Failed to add LMB, drc index %x\n",
2017-12-01 10:47:31 -06:00
				       lmb->drc_index);
2017-02-15 13:46:18 -05:00
2017-12-01 10:47:31 -06:00
			drmem_remove_lmb_reservation(lmb);
2017-02-15 13:46:18 -05:00
		}
		rc = -EINVAL;
	} else {
2017-12-01 10:47:31 -06:00
		for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) {
			if (!drmem_lmb_reserved(lmb))
2017-02-15 13:46:18 -05:00
				continue;
2017-12-01 10:47:31 -06:00
			dlpar_release_drc(lmb->drc_index);
2017-02-15 13:46:18 -05:00
			pr_info("Memory at %llx (drc index %x) was hot-removed\n",
2017-12-01 10:47:31 -06:00
				lmb->base_addr, lmb->drc_index);
2017-02-15 13:46:18 -05:00
2017-12-01 10:47:31 -06:00
			drmem_remove_lmb_reservation(lmb);
2017-02-15 13:46:18 -05:00
		}
	}
	return rc;
}
2013-04-29 15:08:22 -07:00
#else
static inline int pseries_remove_memblock(unsigned long base,
2020-10-07 17:18:34 +05:30
					  unsigned long memblock_size)
2013-04-29 15:08:22 -07:00
{
	return -EOPNOTSUPP;
}
2014-01-27 10:54:06 -06:00
static inline int pseries_remove_mem_node(struct device_node *np)
2013-04-29 15:08:22 -07:00
{
2014-08-11 19:16:19 +10:00
	return 0;
2013-04-29 15:08:22 -07:00
}
2017-12-01 10:47:31 -06:00
static int dlpar_remove_lmb(struct drmem_lmb *lmb)
2015-04-14 17:01:56 +10:00
{
	return -EOPNOTSUPP;
}
2017-12-01 10:47:31 -06:00
static int dlpar_memory_remove_by_count(u32 lmbs_to_remove)
2015-04-14 17:01:56 +10:00
{
	return -EOPNOTSUPP;
}
2017-12-01 10:47:31 -06:00
static int dlpar_memory_remove_by_index(u32 drc_index)
2015-04-14 17:01:56 +10:00
{
	return -EOPNOTSUPP;
}
2017-02-15 13:46:18 -05:00
2017-12-01 10:47:31 -06:00
static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index)
2017-02-15 13:46:18 -05:00
{
	return -EOPNOTSUPP;
}
2013-04-29 15:08:22 -07:00
#endif /* CONFIG_MEMORY_HOTREMOVE */
2008-07-03 13:22:39 +10:00
2017-12-01 10:47:31 -06:00
static int dlpar_add_lmb(struct drmem_lmb *lmb)
2015-02-10 13:48:25 -06:00
{
	unsigned long block_sz;
pseries/drmem: don't cache node id in drmem_lmb struct
At memory hot-remove time we can retrieve an LMB's nid from its
corresponding memory_block. There is no need to store the nid
in multiple locations.
Note that lmb_to_memblock() uses find_memory_block() to get the
corresponding memory_block. As find_memory_block() runs in sub-linear
time this approach is negligibly slower than what we do at present.
In exchange for this lookup at hot-remove time we no longer need to
call memory_add_physaddr_to_nid() during drmem_init() for each LMB.
On powerpc, memory_add_physaddr_to_nid() is a linear search, so this
spares us an O(n^2) initialization during boot.
On systems with many LMBs that initialization overhead is palpable and
disruptive. For example, on a box with 249854 LMBs we're seeing
drmem_init() take upwards of 30 seconds to complete:
[ 53.721639] drmem: initializing drmem v2
[ 80.604346] watchdog: BUG: soft lockup - CPU#65 stuck for 23s! [swapper/0:1]
[ 80.604377] Modules linked in:
[ 80.604389] CPU: 65 PID: 1 Comm: swapper/0 Not tainted 5.6.0-rc2+ #4
[ 80.604397] NIP: c0000000000a4980 LR: c0000000000a4940 CTR: 0000000000000000
[ 80.604407] REGS: c0002dbff8493830 TRAP: 0901 Not tainted (5.6.0-rc2+)
[ 80.604412] MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 44000248 XER: 0000000d
[ 80.604431] CFAR: c0000000000a4a38 IRQMASK: 0
[ 80.604431] GPR00: c0000000000a4940 c0002dbff8493ac0 c000000001904400 c0003cfffffede30
[ 80.604431] GPR04: 0000000000000000 c000000000f4095a 000000000000002f 0000000010000000
[ 80.604431] GPR08: c0000bf7ecdb7fb8 c0000bf7ecc2d3c8 0000000000000008 c00c0002fdfb2001
[ 80.604431] GPR12: 0000000000000000 c00000001e8ec200
[ 80.604477] NIP [c0000000000a4980] hot_add_scn_to_nid+0xa0/0x3e0
[ 80.604486] LR [c0000000000a4940] hot_add_scn_to_nid+0x60/0x3e0
[ 80.604492] Call Trace:
[ 80.604498] [c0002dbff8493ac0] [c0000000000a4940] hot_add_scn_to_nid+0x60/0x3e0 (unreliable)
[ 80.604509] [c0002dbff8493b20] [c000000000087c10] memory_add_physaddr_to_nid+0x20/0x60
[ 80.604521] [c0002dbff8493b40] [c0000000010d4880] drmem_init+0x25c/0x2f0
[ 80.604530] [c0002dbff8493c10] [c000000000010154] do_one_initcall+0x64/0x2c0
[ 80.604540] [c0002dbff8493ce0] [c0000000010c4aa0] kernel_init_freeable+0x2d8/0x3a0
[ 80.604550] [c0002dbff8493db0] [c000000000010824] kernel_init+0x2c/0x148
[ 80.604560] [c0002dbff8493e20] [c00000000000b648] ret_from_kernel_thread+0x5c/0x74
[ 80.604567] Instruction dump:
[ 80.604574] 392918e8 e9490000 e90a000a e92a0000 80ea000c 1d080018 3908ffe8 7d094214
[ 80.604586] 7fa94040 419d00dc e9490010 714a0088 <2faa0008> 409e00ac e9490000 7fbe5040
[ 89.047390] drmem: 249854 LMB(s)
With a patched kernel on the same machine we're no longer seeing the
soft lockup. drmem_init() now completes in negligible time, even when
the LMB count is large.
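The quadratic cost described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: model_lookup_steps() and model_init_steps() are hypothetical names, not kernel functions, the 256MB LMB size is assumed, and the linear walk merely stands in for the behaviour this commit message attributes to memory_add_physaddr_to_nid() on powerpc.

```c
#include <stddef.h>

#define MODEL_LMB_SIZE 0x10000000ULL /* 256MB, assumed for illustration */

/*
 * Hypothetical linear search: walk the LMBs in order until we reach the
 * one that covers addr, counting how many entries were visited.
 */
static unsigned long long model_lookup_steps(size_t nr_lmbs,
					     unsigned long long addr)
{
	unsigned long long steps = 0;

	for (size_t i = 0; i < nr_lmbs; i++) {
		steps++;
		if (addr / MODEL_LMB_SIZE == i)
			break;
	}
	return steps;
}

/* Model of the old init path: one address-based lookup per LMB. */
static unsigned long long model_init_steps(size_t nr_lmbs)
{
	unsigned long long total = 0;

	for (size_t i = 0; i < nr_lmbs; i++)
		total += model_lookup_steps(nr_lmbs, i * MODEL_LMB_SIZE);
	return total;
}
```

In this model, n LMBs cost n(n+1)/2 visits in total, which is why the per-LMB lookup scales poorly as the LMB count grows.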
Fixes: b2d3b5ee66f2 ("powerpc/pseries: Track LMB nid instead of using device tree")
Signed-off-by: Scott Cheloha <cheloha@linux.ibm.com>
Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200811015115.63677-1-cheloha@linux.ibm.com
2020-08-10 20:51:15 -05:00
int nid , rc ;
2015-02-10 13:48:25 -06:00
2016-02-10 11:10:44 -06:00
if ( lmb - > flags & DRCONF_MEM_ASSIGNED )
return - EINVAL ;
2018-04-20 15:29:48 -05:00
rc = update_lmb_associativity_index ( lmb ) ;
2016-02-10 11:12:13 -06:00
if ( rc ) {
dlpar_release_drc ( lmb - > drc_index ) ;
2023-11-14 11:01:55 -06:00
pr_err ( " Failed to configure LMB 0x%x \n " , lmb - > drc_index ) ;
2016-02-10 11:12:13 -06:00
return rc ;
}
2016-06-29 12:20:30 -05:00
block_sz = memory_block_size_bytes ( ) ;
pseries/hotplug-memory: hot-add: skip redundant LMB lookup
During memory hot-add, dlpar_add_lmb() calls memory_add_physaddr_to_nid()
to determine which node id (nid) to use when later calling __add_memory().
This is wasteful. On pseries, memory_add_physaddr_to_nid() finds an
appropriate nid for a given address by looking up the LMB containing the
address and then passing that LMB to of_drconf_to_nid_single() to get the
nid. In dlpar_add_lmb() we get this address from the LMB itself.
In short, we have a pointer to an LMB and then we are searching for
that LMB *again* in order to find its nid.
If we call of_drconf_to_nid_single() directly from dlpar_add_lmb() we
can skip the redundant lookup. The only error handling we need to
duplicate from memory_add_physaddr_to_nid() is the fallback to the
default nid when drconf_to_nid_single() returns -1 (NUMA_NO_NODE) or
an invalid nid.
Skipping the extra lookup makes hot-add operations faster, especially
on machines with many LMBs.
Consider an LPAR with 126976 LMBs. In one test, hot-adding 126000
LMBs on an unpatched kernel took ~3.5 hours while a patched kernel
completed the same operation in ~2 hours:
Unpatched (12450 seconds):
Sep 9 04:06:31 ltc-brazos1 drmgr[810169]: drmgr: -c mem -a -q 126000
Sep 9 04:06:31 ltc-brazos1 kernel: pseries-hotplug-mem: Attempting to hot-add 126000 LMB(s)
[...]
Sep 9 07:34:01 ltc-brazos1 kernel: pseries-hotplug-mem: Memory at 20000000 (drc index 80000002) was hot-added
Patched (7065 seconds):
Sep 8 21:49:57 ltc-brazos1 drmgr[877703]: drmgr: -c mem -a -q 126000
Sep 8 21:49:57 ltc-brazos1 kernel: pseries-hotplug-mem: Attempting to hot-add 126000 LMB(s)
[...]
Sep 8 23:27:42 ltc-brazos1 kernel: pseries-hotplug-mem: Memory at 20000000 (drc index 80000002) was hot-added
It should be noted that the speedup grows more substantial when
hot-adding LMBs at the end of the drconf range. This is because we
are skipping a linear LMB search.
To see the distinction, consider a smaller hot-add test on the same
LPAR. A perf-stat run with 10 iterations showed that hot-adding 4096
LMBs completed less than 1 second faster on a patched kernel:
Unpatched:
Performance counter stats for 'drmgr -c mem -a -q 4096' (10 runs):
104,753.42 msec task-clock # 0.992 CPUs utilized ( +- 0.55% )
4,708 context-switches # 0.045 K/sec ( +- 0.69% )
2,444 cpu-migrations # 0.023 K/sec ( +- 1.25% )
394 page-faults # 0.004 K/sec ( +- 0.22% )
445,902,503,057 cycles # 4.257 GHz ( +- 0.55% ) (66.67%)
8,558,376,740 stalled-cycles-frontend # 1.92% frontend cycles idle ( +- 0.88% ) (49.99%)
300,346,181,651 stalled-cycles-backend # 67.36% backend cycles idle ( +- 0.76% ) (50.01%)
258,091,488,691 instructions # 0.58 insn per cycle
# 1.16 stalled cycles per insn ( +- 0.22% ) (66.67%)
70,568,169,256 branches # 673.660 M/sec ( +- 0.17% ) (50.01%)
3,100,725,426 branch-misses # 4.39% of all branches ( +- 0.20% ) (49.99%)
105.583 +- 0.589 seconds time elapsed ( +- 0.56% )
Patched:
Performance counter stats for 'drmgr -c mem -a -q 4096' (10 runs):
104,055.69 msec task-clock # 0.993 CPUs utilized ( +- 0.32% )
4,606 context-switches # 0.044 K/sec ( +- 0.20% )
2,463 cpu-migrations # 0.024 K/sec ( +- 0.93% )
394 page-faults # 0.004 K/sec ( +- 0.25% )
442,951,129,921 cycles # 4.257 GHz ( +- 0.32% ) (66.66%)
8,710,413,329 stalled-cycles-frontend # 1.97% frontend cycles idle ( +- 0.47% ) (50.06%)
299,656,905,836 stalled-cycles-backend # 67.65% backend cycles idle ( +- 0.39% ) (50.02%)
252,731,168,193 instructions # 0.57 insn per cycle
# 1.19 stalled cycles per insn ( +- 0.20% ) (66.66%)
68,902,851,121 branches # 662.173 M/sec ( +- 0.13% ) (49.94%)
3,100,242,882 branch-misses # 4.50% of all branches ( +- 0.15% ) (49.98%)
104.829 +- 0.325 seconds time elapsed ( +- 0.31% )
This is consistent with skipping a linear search. An add-by-count hot-add operation adds LMBs
greedily, so LMBs near the start of the drconf range are considered
first. On an otherwise idle LPAR with so many LMBs we would expect to
find the LMBs we need near the start of the drconf range, hence the
smaller speedup.
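The fallback rule that this commit message says must be carried over from memory_add_physaddr_to_nid() can be sketched in isolation. The names below (model_node_possible(), model_pick_nid(), MODEL_MAX_NODES) are hypothetical stand-ins chosen for illustration; the real code uses node_possible() and first_online_node, whose definitions are not reproduced here.

```c
#include <stdbool.h>

#define MODEL_MAX_NODES 4 /* hypothetical number of possible nodes */

/* Stand-in for node_possible(): only ids 0..MODEL_MAX_NODES-1 are valid. */
static bool model_node_possible(int nid)
{
	return nid >= 0 && nid < MODEL_MAX_NODES;
}

/*
 * Use the looked-up nid when it is valid; otherwise fall back to the
 * default. A negative value models the -1 (NUMA_NO_NODE) result.
 */
static int model_pick_nid(int looked_up_nid, int default_nid)
{
	if (looked_up_nid < 0 || !model_node_possible(looked_up_nid))
		return default_nid;
	return looked_up_nid;
}
```

The explicit negative check mirrors the shape of the condition in dlpar_add_lmb(); in this model it is redundant with model_node_possible(), but it keeps the two cases named in the commit message visible.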
Signed-off-by: Scott Cheloha <cheloha@linux.ibm.com>
Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200916145122.3408129-1-cheloha@linux.ibm.com
2020-09-16 09:51:22 -05:00
/* Find the node id for this LMB. Fake one if necessary. */
nid = of_drconf_to_nid_single ( lmb ) ;
if ( nid < 0 | | ! node_possible ( nid ) )
nid = first_online_node ;
2020-08-10 20:51:15 -05:00
2016-06-29 12:20:30 -05:00
/* Add the memory */
2023-08-08 14:45:00 +05:30
rc = __add_memory ( nid , lmb - > base_addr , block_sz , MHP_MEMMAP_ON_MEMORY ) ;
2017-02-15 13:45:30 -05:00
if ( rc ) {
2023-11-14 11:01:55 -06:00
pr_err ( " Failed to add LMB 0x%x to node %u \n " , lmb - > drc_index , nid ) ;
2018-04-20 15:29:48 -05:00
invalidate_lmb_associativity_index ( lmb ) ;
2017-02-15 13:45:30 -05:00
return rc ;
}
rc = dlpar_online_lmb ( lmb ) ;
if ( rc ) {
2023-11-14 11:01:55 -06:00
pr_err ( " Failed to online LMB 0x%x on node %u \n " , lmb - > drc_index , nid ) ;
2021-09-07 19:55:09 -07:00
__remove_memory ( lmb - > base_addr , block_sz ) ;
2018-04-20 15:29:48 -05:00
invalidate_lmb_associativity_index ( lmb ) ;
2017-02-15 13:45:30 -05:00
} else {
2016-06-29 12:20:30 -05:00
lmb - > flags | = DRCONF_MEM_ASSIGNED ;
2017-02-15 13:45:30 -05:00
}
2016-02-10 11:10:44 -06:00
return rc ;
}
2017-12-01 10:47:31 -06:00
static int dlpar_memory_add_by_count ( u32 lmbs_to_add )
2015-02-10 13:48:25 -06:00
{
2017-12-01 10:47:31 -06:00
struct drmem_lmb * lmb ;
2015-02-10 13:48:25 -06:00
int lmbs_available = 0 ;
2021-06-22 10:39:22 -03:00
int lmbs_reserved = 0 ;
2017-12-01 10:47:31 -06:00
int rc ;
2015-02-10 13:48:25 -06:00
pr_info ( " Attempting to hot-add %d LMB(s) \n " , lmbs_to_add ) ;
if ( lmbs_to_add = = 0 )
return - EINVAL ;
/* Validate that there are enough LMBs to satisfy the request */
2017-12-01 10:47:31 -06:00
for_each_drmem_lmb ( lmb ) {
2021-06-22 10:39:21 -03:00
if ( lmb - > flags & DRCONF_MEM_RESERVED )
continue ;
2017-12-01 10:47:31 -06:00
if ( ! ( lmb - > flags & DRCONF_MEM_ASSIGNED ) )
2015-02-10 13:48:25 -06:00
lmbs_available + + ;
2017-12-01 10:47:31 -06:00
if ( lmbs_available = = lmbs_to_add )
break ;
2015-02-10 13:48:25 -06:00
}
if ( lmbs_available < lmbs_to_add )
return - EINVAL ;
2017-12-01 10:47:31 -06:00
for_each_drmem_lmb ( lmb ) {
if ( lmb - > flags & DRCONF_MEM_ASSIGNED )
2017-08-23 12:18:43 -05:00
continue ;
2017-12-01 10:47:31 -06:00
rc = dlpar_acquire_drc ( lmb - > drc_index ) ;
2015-02-10 13:48:25 -06:00
if ( rc )
continue ;
2017-12-01 10:47:31 -06:00
rc = dlpar_add_lmb ( lmb ) ;
2017-01-06 13:25:53 -06:00
if ( rc ) {
2017-12-01 10:47:31 -06:00
dlpar_release_drc ( lmb - > drc_index ) ;
2017-01-06 13:25:53 -06:00
continue ;
}
2015-02-10 13:48:25 -06:00
/* Mark this lmb so we can remove it later if all of the
* requested LMBs cannot be added .
*/
2017-12-01 10:47:31 -06:00
drmem_mark_lmb_reserved ( lmb ) ;
2021-06-22 10:39:22 -03:00
lmbs_reserved + + ;
if ( lmbs_reserved = = lmbs_to_add )
2017-12-01 10:47:31 -06:00
break ;
2015-02-10 13:48:25 -06:00
}
2021-06-22 10:39:22 -03:00
if ( lmbs_reserved ! = lmbs_to_add ) {
2015-02-10 13:49:22 -06:00
pr_err ( " Memory hot-add failed, removing any added LMBs \n " ) ;
2017-12-01 10:47:31 -06:00
for_each_drmem_lmb ( lmb ) {
if ( ! drmem_lmb_reserved ( lmb ) )
2015-02-10 13:49:22 -06:00
continue ;
2017-12-01 10:47:31 -06:00
rc = dlpar_remove_lmb ( lmb ) ;
2015-02-10 13:49:22 -06:00
if ( rc )
pr_err ( " Failed to remove LMB, drc index %x \n " ,
2017-12-01 10:47:31 -06:00
lmb - > drc_index ) ;
2017-01-06 13:25:53 -06:00
else
2017-12-01 10:47:31 -06:00
dlpar_release_drc ( lmb - > drc_index ) ;
drmem_remove_lmb_reservation ( lmb ) ;
2021-06-22 10:39:22 -03:00
lmbs_reserved - - ;
if ( lmbs_reserved = = 0 )
break ;
2015-02-10 13:49:22 -06:00
}
2015-02-10 13:48:25 -06:00
rc = - EINVAL ;
} else {
2017-12-01 10:47:31 -06:00
for_each_drmem_lmb ( lmb ) {
if ( ! drmem_lmb_reserved ( lmb ) )
2015-02-10 13:48:25 -06:00
continue ;
powerpc/pseries/memhotplug: Quieten some DLPAR operations
When attempting to remove by index a set of LMBs a lot of messages are
displayed on the console, even when everything goes fine:
pseries-hotplug-mem: Attempting to hot-remove LMB, drc index 8000002d
Offlined Pages 4096
pseries-hotplug-mem: Memory at 2d0000000 was hot-removed
The 2 messages prefixed by "pseries-hotplug-mem" are not really
helpful for the end user, they should be debug outputs.
In case of error, because some of the LMB's pages couldn't be
offlined, the following is displayed on the console:
pseries-hotplug-mem: Attempting to hot-remove LMB, drc index 8000003e
pseries-hotplug-mem: Failed to hot-remove memory at 3e0000000
dlpar: Could not handle DLPAR request "memory remove index 0x8000003e"
Again, the 2 messages prefixed by "pseries-hotplug-mem" are useless,
and the generic DLPAR prefixed message should be enough.
These first 2 changes are mainly triggered by the changes introduced
in drmgr:
https://groups.google.com/g/powerpc-utils-devel/c/Y6ef4NB3EzM/m/9cu5JHRxAQAJ
Also, when adding a bunch of LMBs, a message is displayed in the console per LMB
like these ones:
pseries-hotplug-mem: Memory at 7e0000000 (drc index 8000007e) was hot-added
pseries-hotplug-mem: Memory at 7f0000000 (drc index 8000007f) was hot-added
pseries-hotplug-mem: Memory at 800000000 (drc index 80000080) was hot-added
pseries-hotplug-mem: Memory at 810000000 (drc index 80000081) was hot-added
When adding 1TB of memory and LMB size is 256MB, this leads to 4096
messages to be displayed on the console. These messages are not really
helpful for the end user, so move them to the DEBUG level.
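The message count quoted above follows directly from the sizes involved. As a minimal worked example (model_lmb_count() is a hypothetical helper, not part of the kernel):

```c
/*
 * Number of LMBs, and hence per-LMB log lines, needed to cover
 * `bytes` of memory when each LMB is `lmb_size` bytes.
 */
static unsigned long long model_lmb_count(unsigned long long bytes,
					  unsigned long long lmb_size)
{
	return bytes / lmb_size;
}
```

With 1TB (2^40 bytes) and 256MB LMBs (2^28 bytes) this gives 2^12 = 4096, matching the figure in the commit message.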
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
[mpe: Tweak change log wording]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201211145954.90143-1-ldufour@linux.ibm.com
2020-12-11 15:59:54 +01:00
pr_debug ( " Memory at %llx (drc index %x) was hot-added \n " ,
lmb - > base_addr , lmb - > drc_index ) ;
2017-12-01 10:47:31 -06:00
drmem_remove_lmb_reservation ( lmb ) ;
2021-06-22 10:39:22 -03:00
lmbs_reserved - - ;
if ( lmbs_reserved = = 0 )
break ;
2015-02-10 13:48:25 -06:00
}
2017-08-23 12:18:43 -05:00
rc = 0 ;
2015-02-10 13:48:25 -06:00
}
return rc ;
}
2017-12-01 10:47:31 -06:00
static int dlpar_memory_add_by_index ( u32 drc_index )
2015-02-10 13:48:25 -06:00
{
2017-12-01 10:47:31 -06:00
struct drmem_lmb * lmb ;
int rc , lmb_found ;
2015-02-10 13:48:25 -06:00
pr_info ( " Attempting to hot-add LMB, drc index %x \n " , drc_index ) ;
lmb_found = 0 ;
2017-12-01 10:47:31 -06:00
for_each_drmem_lmb ( lmb ) {
if ( lmb - > drc_index = = drc_index ) {
2015-02-10 13:48:25 -06:00
lmb_found = 1 ;
2017-12-01 10:47:31 -06:00
rc = dlpar_acquire_drc ( lmb - > drc_index ) ;
2017-01-06 13:25:53 -06:00
if ( ! rc ) {
2017-12-01 10:47:31 -06:00
rc = dlpar_add_lmb ( lmb ) ;
2017-01-06 13:25:53 -06:00
if ( rc )
2017-12-01 10:47:31 -06:00
dlpar_release_drc ( lmb - > drc_index ) ;
2017-01-06 13:25:53 -06:00
}
2015-02-10 13:48:25 -06:00
break ;
}
}
if ( ! lmb_found )
rc = - EINVAL ;
if ( rc )
pr_info ( " Failed to hot-add memory, drc index %x \n " , drc_index ) ;
else
pr_info ( " Memory at %llx (drc index %x) was hot-added \n " ,
2017-12-01 10:47:31 -06:00
lmb - > base_addr , drc_index ) ;
2015-02-10 13:48:25 -06:00
return rc ;
}
2017-12-01 10:47:31 -06:00
static int dlpar_memory_add_by_ic ( u32 lmbs_to_add , u32 drc_index )
2017-02-15 13:45:56 -05:00
{
2017-12-01 10:47:31 -06:00
struct drmem_lmb * lmb , * start_lmb , * end_lmb ;
int rc ;
2017-02-15 13:45:56 -05:00
pr_info ( " Attempting to hot-add %u LMB(s) at index %x \n " ,
lmbs_to_add , drc_index ) ;
if ( lmbs_to_add = = 0 )
return - EINVAL ;
2017-12-01 10:47:31 -06:00
rc = get_lmb_range ( drc_index , lmbs_to_add , & start_lmb , & end_lmb ) ;
if ( rc )
2017-02-15 13:45:56 -05:00
return - EINVAL ;
/* Validate that the LMBs in this range are not reserved */
2017-12-01 10:47:31 -06:00
for_each_drmem_lmb_in_range ( lmb , start_lmb , end_lmb ) {
2021-06-22 10:39:23 -03:00
/* Fail immediately if the whole range can't be hot-added */
if ( lmb - > flags & DRCONF_MEM_RESERVED ) {
pr_err ( " Memory at %llx (drc index %x) is reserved \n " ,
lmb - > base_addr , lmb - > drc_index ) ;
return - EINVAL ;
}
2017-02-15 13:45:56 -05:00
}
2017-12-01 10:47:31 -06:00
for_each_drmem_lmb_in_range ( lmb , start_lmb , end_lmb ) {
if ( lmb - > flags & DRCONF_MEM_ASSIGNED )
2017-02-15 13:45:56 -05:00
continue ;
2017-12-01 10:47:31 -06:00
rc = dlpar_acquire_drc ( lmb - > drc_index ) ;
2017-02-15 13:45:56 -05:00
if ( rc )
break ;
2017-12-01 10:47:31 -06:00
rc = dlpar_add_lmb ( lmb ) ;
2017-02-15 13:45:56 -05:00
if ( rc ) {
2017-12-01 10:47:31 -06:00
dlpar_release_drc ( lmb - > drc_index ) ;
2017-02-15 13:45:56 -05:00
break ;
}
2017-12-01 10:47:31 -06:00
drmem_mark_lmb_reserved ( lmb ) ;
2017-02-15 13:45:56 -05:00
}
if ( rc ) {
pr_err ( " Memory indexed-count-add failed, removing any added LMBs \n " ) ;
2017-12-01 10:47:31 -06:00
for_each_drmem_lmb_in_range ( lmb , start_lmb , end_lmb ) {
if ( ! drmem_lmb_reserved ( lmb ) )
2017-02-15 13:45:56 -05:00
continue ;
2017-12-01 10:47:31 -06:00
rc = dlpar_remove_lmb ( lmb ) ;
2017-02-15 13:45:56 -05:00
if ( rc )
pr_err ( " Failed to remove LMB, drc index %x \n " ,
2017-12-01 10:47:31 -06:00
lmb - > drc_index ) ;
2017-02-15 13:45:56 -05:00
else
2017-12-01 10:47:31 -06:00
dlpar_release_drc ( lmb - > drc_index ) ;
drmem_remove_lmb_reservation ( lmb ) ;
2017-02-15 13:45:56 -05:00
}
rc = - EINVAL ;
} else {
2017-12-01 10:47:31 -06:00
for_each_drmem_lmb_in_range ( lmb , start_lmb , end_lmb ) {
if ( ! drmem_lmb_reserved ( lmb ) )
2017-02-15 13:45:56 -05:00
continue ;
pr_info ( " Memory at %llx (drc index %x) was hot-added \n " ,
2017-12-01 10:47:31 -06:00
lmb - > base_addr , lmb - > drc_index ) ;
drmem_remove_lmb_reservation ( lmb ) ;
2017-02-15 13:45:56 -05:00
}
}
return rc ;
}
powerpc/pseries: Create new device hotplug entry point
The current hotplug (or dlpar) of devices (the process is generally the
same for memory, cpu, and pci) on PowerVM systems is initiated
from the HMC, which communicates the request to the partitions through
the RSCT framework. The RSCT framework then invokes the drmgr command.
The drmgr command performs the hotplug operation by doing some pieces,
such as most of the rtas calls and device tree parsing, in userspace
and makes requests to the kernel to online/offline the device, update the
device tree and add/remove the device.
For PowerKVM the approach for device hotplug is to follow what is currently
being done for pci hotplug. A hotplug request is initiated from the host.
QEMU then generates an EPOW interrupt to the guest which causes the guest
to make the rtas,check-exception call. In QEMU, the rtas,check-exception call
returns a rtas hotplug event to the guest.
Please note that the current pci hotplug path for PowerKVM involves the
kernel receiving the rtas hotplug event, passing it to rtas_errd in
userspace, and having rtas_errd invoke drmgr. The drmgr command then
handles the request as described above for PowerVM systems.
There is no need for this circuitous route; we should just handle the entire
hotplug of devices in the kernel. What I am planning is to enable this
by moving the code to handle hotplug from drmgr into the kernel to
provide a single path for handling device hotplug for both PowerVM and
PowerKVM systems. This patch provides the common framework and entry point.
For PowerKVM a future update to the kernel rtas code will recognize rtas
hotplug events returned from rtas,check-exception calls and use the common
entry point to handle hotplug of the device.
For PowerVM systems, this patch creates /sys/kernel/dlpar that can be
used by the drmgr command to initiate hotplug requests. In order to do
this a string of the format "<resource> <action> <id_type> <id>" is
written to this file. The string consists of a resource (cpu, memory, pci,
phb), an action (add or remove), an id_type (count, drc index, drc name),
and the corresponding id. The kernel will parse the string and create a
rtas hotplug section that can be passed to the common entry point for
handling hotplug requests.
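The "<resource> <action> <id_type> <id>" format described above can be sketched as a small userspace parser. This is an illustration only, not the kernel's parser: struct model_dlpar_request and model_parse_dlpar_request() are hypothetical names, the field widths are arbitrary, and the sketch assumes a numeric id, so it does not cover the drc name case.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the four fields of a request string. */
struct model_dlpar_request {
	char resource[16];
	char action[16];
	char id_type[16];
	unsigned long id;
};

/* Returns 0 on success, -1 if the string lacks four valid fields. */
static int model_parse_dlpar_request(const char *buf,
				     struct model_dlpar_request *req)
{
	char id[32];
	char *end;

	if (sscanf(buf, "%15s %15s %15s %31s",
		   req->resource, req->action, req->id_type, id) != 4)
		return -1;

	/* The commit message lists add and remove as the only actions. */
	if (strcmp(req->action, "add") != 0 &&
	    strcmp(req->action, "remove") != 0)
		return -1;

	/* Assumption: numeric ids only; base 0 accepts decimal and 0x hex. */
	req->id = strtoul(id, &end, 0);
	if (end == id || *end != '\0')
		return -1;

	return 0;
}
```

A request such as "memory remove index 0x8000003e", the form quoted in a log line elsewhere in this file, splits into resource "memory", action "remove", id_type "index" and id 0x8000003e.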
It should be noted that there is no chance of updating how we receive
hotplug (dlpar) requests from the HMC on PowerVM systems.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-02-10 13:47:02 -06:00
int dlpar_memory ( struct pseries_hp_errorlog * hp_elog )
{
2015-02-10 13:48:25 -06:00
u32 count , drc_index ;
int rc ;
2015-02-10 13:47:02 -06:00
lock_device_hotplug ( ) ;
switch ( hp_elog - > action ) {
2015-02-10 13:48:25 -06:00
case PSERIES_HP_ELOG_ACTION_ADD :
2019-08-01 19:52:51 -03:00
switch ( hp_elog - > id_type ) {
case PSERIES_HP_ELOG_ID_DRC_COUNT :
2017-02-15 13:45:56 -05:00
count = hp_elog - > _drc_u . drc_count ;
2017-12-01 10:47:31 -06:00
rc = dlpar_memory_add_by_count ( count ) ;
2019-08-01 19:52:51 -03:00
break ;
case PSERIES_HP_ELOG_ID_DRC_INDEX :
2017-02-15 13:45:56 -05:00
drc_index = hp_elog - > _drc_u . drc_index ;
2017-12-01 10:47:31 -06:00
rc = dlpar_memory_add_by_index ( drc_index ) ;
2019-08-01 19:52:51 -03:00
break ;
case PSERIES_HP_ELOG_ID_DRC_IC :
2017-02-15 13:45:56 -05:00
count = hp_elog - > _drc_u . ic . count ;
drc_index = hp_elog - > _drc_u . ic . index ;
2017-12-01 10:47:31 -06:00
rc = dlpar_memory_add_by_ic ( count , drc_index ) ;
2019-08-01 19:52:51 -03:00
break ;
default :
2015-02-10 13:48:25 -06:00
rc = - EINVAL ;
2019-08-01 19:52:51 -03:00
break ;
2017-02-15 13:45:56 -05:00
}
2015-02-10 13:48:25 -06:00
break ;
2015-02-10 13:49:22 -06:00
case PSERIES_HP_ELOG_ACTION_REMOVE :
2019-08-01 19:52:51 -03:00
switch ( hp_elog - > id_type ) {
case PSERIES_HP_ELOG_ID_DRC_COUNT :
2017-02-15 13:45:56 -05:00
count = hp_elog - > _drc_u . drc_count ;
2017-12-01 10:47:31 -06:00
rc = dlpar_memory_remove_by_count ( count ) ;
2019-08-01 19:52:51 -03:00
break ;
case PSERIES_HP_ELOG_ID_DRC_INDEX :
2017-02-15 13:45:56 -05:00
drc_index = hp_elog - > _drc_u . drc_index ;
2017-12-01 10:47:31 -06:00
rc = dlpar_memory_remove_by_index ( drc_index ) ;
2019-08-01 19:52:51 -03:00
break ;
case PSERIES_HP_ELOG_ID_DRC_IC :
2017-02-15 13:46:18 -05:00
count = hp_elog - > _drc_u . ic . count ;
drc_index = hp_elog - > _drc_u . ic . index ;
2017-12-01 10:47:31 -06:00
rc = dlpar_memory_remove_by_ic ( count , drc_index ) ;
2019-08-01 19:52:51 -03:00
break ;
default :
2015-02-10 13:49:22 -06:00
rc = - EINVAL ;
2019-08-01 19:52:51 -03:00
break ;
2017-02-15 13:45:56 -05:00
}
2017-01-06 13:27:26 -06:00
break ;
2015-02-10 13:47:02 -06:00
default :
pr_err ( " Invalid action (%d) specified \n " , hp_elog - > action ) ;
rc = - EINVAL ;
break ;
}
2020-06-12 00:12:38 -05:00
if ( ! rc )
2018-04-20 15:29:48 -05:00
rc = drmem_update_dt ( ) ;
powerpc/pseries: Create new device hotplug entry point
The current hotplug (or dlpar) of devices (the process is generally the
same for memory, cpu, and pci) on PowerVM systems is initiated
from the HMC, which communicates the request to the partitions through
the RSCT framework. The RSCT framework then invokes the drmgr command.
The drmgr command performs the hotplug operation by doing some pieces,
such as most of the rtas calls and device tree parsing, in userspace
and make requests to the kernel to online/offline the device, update the
device tree and add/remove the device.
For PowerKVM the approach for device hotplug is to follow what is currently
being done for pci hotplug. A hotplug request is initiated from the host.
QEMU then generates an EPOW interrupt to the guest which causes the guest
to make the rtas,check-exception call. In QEMU, the rtas,check-exception call
returns a rtas hotplug event to the guest.
Please note that the current pci hotplug path for PowerKVM involves the
kernel receiving the rtas hotplug event, passing it to rtas_errd in
userspace, and having rtas_errd invoke drmgr. The drmgr command then
handles the request as described above for PowerVM systems.
There is no need for this circuitous route, we should just handle the entire
hotplug of devices in the kernel. What I am planning is to enable this
by moving the code to handle hotplug from drmgr into the kernel to
provide a single path for handling device hotplug for both PowerVM and
PowerKVM systems. This patch provides the common iframework and entry point.
For PowerKVM a future update to the kernel rtas code will recognize rtas
hotplug events returned from rtas,check-exception calls and use the common
entry point to handle hotplug of the device.
For PowerVM systems, This patch creates /sys/kernel/dlpar that can be
used by the drmgr command to initiate hotplug requests. In order to do
this a string of the format "<resource> <action> <id_type> <id>" is
written to this file. The string consists of a resource (cpu, memory, pci,
phb), an action (add or remove), an id_type (count, drc index, drc name),
and the corresponding id. The kernel will parse the string and create a
rtas hotplug section that can be passed to the common entry point for
handling hotplug requests.
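The request format described above can be illustrated with a small
user-space sketch. This is not the kernel's parser: the function name
parse_dlpar_request(), the struct and enum names, and the single-token
id_type spellings ("count", "index", "name") are all assumptions made for
illustration only.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical types for illustration; these are not the kernel's definitions. */
enum dlpar_resource { DLPAR_CPU, DLPAR_MEMORY, DLPAR_PCI, DLPAR_PHB };
enum dlpar_action { DLPAR_ADD, DLPAR_REMOVE };

struct dlpar_request {
	enum dlpar_resource resource;
	enum dlpar_action action;
	char id_type[16];	/* assumed spellings: "count", "index", "name" */
	char id[64];		/* kept as text; its meaning depends on id_type */
};

/* Parse "<resource> <action> <id_type> <id>"; return 0 on success, -1 on error. */
static int parse_dlpar_request(const char *buf, struct dlpar_request *req)
{
	/* Order must match enum dlpar_resource. */
	static const char *const resources[] = { "cpu", "memory", "pci", "phb" };
	char resource[16], action[16];
	size_t i;

	/* All four whitespace-separated fields must be present. */
	if (sscanf(buf, "%15s %15s %15s %63s",
		   resource, action, req->id_type, req->id) != 4)
		return -1;

	for (i = 0; i < sizeof(resources) / sizeof(resources[0]); i++)
		if (strcmp(resource, resources[i]) == 0)
			break;
	if (i == sizeof(resources) / sizeof(resources[0]))
		return -1;
	req->resource = (enum dlpar_resource)i;

	if (strcmp(action, "add") == 0)
		req->action = DLPAR_ADD;
	else if (strcmp(action, "remove") == 0)
		req->action = DLPAR_REMOVE;
	else
		return -1;

	/* Accept only the three id types; the spellings are an assumption. */
	if (strcmp(req->id_type, "count") != 0 &&
	    strcmp(req->id_type, "index") != 0 &&
	    strcmp(req->id_type, "name") != 0)
		return -1;

	return 0;
}
```

With this sketch, a string such as "memory add count 4" parses into a
memory/add request whose id_type is "count" and whose id is "4", while a
string with a missing field or an unknown resource or action is rejected.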
It should be noted that there is no chance of updating how we receive
hotplug (dlpar) requests from the HMC on PowerVM systems.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2015-02-10 13:47:02 -06:00
unlock_device_hotplug ( ) ;
return rc ;
}
2014-01-27 10:54:06 -06:00
static int pseries_add_mem_node ( struct device_node * np )
2008-04-18 13:33:52 -07:00
{
2023-03-29 17:03:36 -05:00
int ret ;
struct resource res ;
2008-04-18 13:33:52 -07:00
/*
* Check to see if we are actually adding memory
*/
2018-11-16 16:11:00 -06:00
if ( ! of_node_is_type ( np , "memory" ) )
2008-04-18 13:33:52 -07:00
return 0 ;
/*
2010-07-12 14:36:09 +10:00
* Find the base and size of the memblock
2008-04-18 13:33:52 -07:00
*/
2023-03-29 17:03:36 -05:00
ret = of_address_to_resource ( np , 0 , & res ) ;
if ( ret )
2008-04-18 13:33:52 -07:00
return ret ;
/*
* Update memory region to represent the memory add
*/
2023-03-29 17:03:36 -05:00
ret = memblock_add ( res . start , resource_size ( & res ) ) ;
2008-07-03 13:22:39 +10:00
return ( ret < 0 ) ? - EINVAL : 0 ;
}
2008-04-18 13:33:50 -07:00
static int pseries_memory_notifier ( struct notifier_block * nb ,
2014-11-24 17:58:01 +00:00
unsigned long action , void * data )
2008-04-18 13:33:50 -07:00
{
2014-11-24 17:58:01 +00:00
struct of_reconfig_data * rd = data ;
2011-06-21 03:35:56 +00:00
int err = 0 ;
2008-04-18 13:33:50 -07:00
switch ( action ) {
2012-10-02 16:57:57 +00:00
case OF_RECONFIG_ATTACH_NODE :
2014-11-24 17:58:01 +00:00
err = pseries_add_mem_node ( rd->dn ) ;
2008-04-18 13:33:50 -07:00
break ;
2012-10-02 16:57:57 +00:00
case OF_RECONFIG_DETACH_NODE :
2014-11-24 17:58:01 +00:00
err = pseries_remove_mem_node ( rd->dn ) ;
2008-04-18 13:33:50 -07:00
break ;
pseries/drmem: update LMBs after LPM
After an LPM, the device tree node ibm,dynamic-reconfiguration-memory may be
updated by the hypervisor in the case the NUMA topology of the LPAR's
memory is updated.
This is handled by the kernel, but the memory's node is not updated because
there is no way to move a memory block between nodes from the Linux kernel
point of view.
If later a memory block is added or removed, drmem_update_dt() is called
and overwrites the DT node ibm,dynamic-reconfiguration-memory to
match the added or removed LMB. But the LMB's associativity node has not
been updated after the DT node update and thus the node is overwritten by
Linux's topology instead of the hypervisor's.
Introduce a hook called when the ibm,dynamic-reconfiguration-memory node is
updated to force an update of the LMB's associativity. However, ignore the
call to that hook when the update has been triggered by drmem_update_dt(),
because in that case the LMB tree has been used to set the DT property
and thus it doesn't need to be updated back. Since drmem_update_dt() is
called under the protection of the device_hotplug_lock and the hook is
called in the same context, use a simple boolean variable to detect that
call.
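The boolean guard described above can be sketched in plain C as follows.
This is a simplified user-space model, not the kernel code: the names
in_dt_update, lmb_refresh_count, on_property_update() and
update_dt_from_lmbs() are hypothetical stand-ins, and the synchronous call
stands in for the notifier firing during the DT write.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names come from the kernel. */
static bool in_dt_update;	/* true while our own DT write is in flight */
static int lmb_refresh_count;	/* how many times LMB state was refreshed */

/* Hook invoked whenever the DT property is updated. */
static void on_property_update(void)
{
	/* The update came from our own write: the LMB data was its source. */
	if (in_dt_update)
		return;

	/* External update (e.g. from the hypervisor): refresh LMB state. */
	lmb_refresh_count++;
}

/* Our own DT write, which triggers the hook in the same context. */
static void update_dt_from_lmbs(void)
{
	in_dt_update = true;
	on_property_update();	/* models the notifier firing synchronously */
	in_dt_update = false;
}
```

A plain boolean is enough in this model only because, as the text above
states, the writer and the hook run in the same context under the same
lock; without that guarantee the flag would need stronger synchronization.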
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210517090606.56930-1-ldufour@linux.ibm.com
2021-05-17 11:06:06 +02:00
case OF_RECONFIG_UPDATE_PROPERTY :
if ( ! strcmp ( rd->dn->name ,
"ibm,dynamic-reconfiguration-memory" ) )
drmem_update_lmbs ( rd->prop ) ;
2008-04-18 13:33:50 -07:00
}
2011-06-21 03:35:56 +00:00
return notifier_from_errno ( err ) ;
2008-04-18 13:33:50 -07:00
}
static struct notifier_block pseries_mem_nb = {
. notifier_call = pseries_memory_notifier ,
} ;
static int __init pseries_memory_hotplug_init ( void )
{
if ( firmware_has_feature ( FW_FEATURE_LPAR ) )
2012-10-02 16:57:57 +00:00
of_reconfig_notifier_register ( & pseries_mem_nb ) ;
2008-04-18 13:33:50 -07:00
return 0 ;
}
machine_device_initcall ( pseries , pseries_memory_hotplug_init ) ;