2019-05-19 15:07:45 +03:00
# SPDX-License-Identifier: GPL-2.0-only
2015-06-09 21:13:37 +03:00
menuconfig LIBNVDIMM
2015-05-20 05:54:31 +03:00
tristate "NVDIMM (Non-Volatile Memory Device) Support"
depends on PHYS_ADDR_T_64BIT
2016-06-07 03:42:38 +03:00
depends on HAS_IOMEM
2015-05-20 05:54:31 +03:00
depends on BLK_DEV
2019-11-07 04:43:31 +03:00
select MEMREGION
2015-05-20 05:54:31 +03:00
help
Generic support for non-volatile memory devices including
ACPI-6-NFIT defined resources. On platforms that define an
NFIT, or otherwise can discover NVDIMM resources, a libnvdimm
bus is registered to advertise PMEM (persistent memory)
2022-03-10 06:49:26 +03:00
namespaces (/dev/pmemX). A PMEM namespace refers to a
nd_btt: atomic sector updates
BTT stands for Block Translation Table, and is a way to provide power
fail sector atomicity semantics for block devices that have the ability
to perform byte granularity IO. It relies on the capability of libnvdimm
namespace devices to do byte aligned IO.
The BTT works as a stacked blocked device, and reserves a chunk of space
from the backing device for its accounting metadata. It is a bio-based
driver because all IO is done synchronously, and there is no queuing or
asynchronous completions at either the device or the driver level.
The BTT uses 'lanes' to index into various 'on-disk' data structures,
and lanes also act as a synchronization mechanism in case there are more
CPUs than available lanes. We did a comparison between two lane lock
strategies - first where we kept an atomic counter around that tracked
which was the last lane that was used, and 'our' lane was determined by
atomically incrementing that. That way, for the nr_cpus > nr_lanes case,
theoretically, no CPU would be blocked waiting for a lane. The other
strategy was to use the cpu number we're scheduled on to and hash it to
a lane number. Theoretically, this could block an IO that could've
otherwise run using a different, free lane. But some fio workloads
showed that the direct cpu -> lane hash performed faster than tracking
'last lane' - my reasoning is the cache thrash caused by moving the
atomic variable made that approach slower than simply waiting out the
in-progress IO. This supports the conclusion that the driver can be a
very simple bio-based one that does synchronous IOs instead of queuing.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Neil Brown <neilb@suse.de>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
[jmoyer: fix nmi watchdog timeout in btt_map_init]
[jmoyer: move btt initialization to module load path]
[jmoyer: fix memory leak in the btt initialization path]
[jmoyer: Don't overwrite corrupted arenas]
Signed-off-by: Vishal Verma <vishal.l.verma@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-06-25 11:20:32 +03:00
memory resource that may span multiple DIMMs and support DAX
2022-03-10 06:49:26 +03:00
(see CONFIG_DAX).
2015-06-09 21:13:37 +03:00
if LIBNVDIMM
config BLK_DEV_PMEM
tristate "PMEM: Persistent memory block device support"
default LIBNVDIMM
2021-11-29 13:21:37 +03:00
select DAX
nd_btt: atomic sector updates
BTT stands for Block Translation Table, and is a way to provide power
fail sector atomicity semantics for block devices that have the ability
to perform byte granularity IO. It relies on the capability of libnvdimm
namespace devices to do byte aligned IO.
The BTT works as a stacked blocked device, and reserves a chunk of space
from the backing device for its accounting metadata. It is a bio-based
driver because all IO is done synchronously, and there is no queuing or
asynchronous completions at either the device or the driver level.
The BTT uses 'lanes' to index into various 'on-disk' data structures,
and lanes also act as a synchronization mechanism in case there are more
CPUs than available lanes. We did a comparison between two lane lock
strategies - first where we kept an atomic counter around that tracked
which was the last lane that was used, and 'our' lane was determined by
atomically incrementing that. That way, for the nr_cpus > nr_lanes case,
theoretically, no CPU would be blocked waiting for a lane. The other
strategy was to use the cpu number we're scheduled on to and hash it to
a lane number. Theoretically, this could block an IO that could've
otherwise run using a different, free lane. But some fio workloads
showed that the direct cpu -> lane hash performed faster than tracking
'last lane' - my reasoning is the cache thrash caused by moving the
atomic variable made that approach slower than simply waiting out the
in-progress IO. This supports the conclusion that the driver can be a
very simple bio-based one that does synchronous IOs instead of queuing.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Neil Brown <neilb@suse.de>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
[jmoyer: fix nmi watchdog timeout in btt_map_init]
[jmoyer: move btt initialization to module load path]
[jmoyer: fix memory leak in the btt initialization path]
[jmoyer: Don't overwrite corrupted arenas]
Signed-off-by: Vishal Verma <vishal.l.verma@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-06-25 11:20:32 +03:00
select ND_BTT if BTT
2015-07-31 00:57:47 +03:00
select ND_PFN if NVDIMM_PFN
2015-06-09 21:13:37 +03:00
help
Memory ranges for PMEM are described by either an NFIT
(NVDIMM Firmware Interface Table, see CONFIG_NFIT_ACPI), a
non-standard OEM-specific E820 memory type (type-12, see
CONFIG_X86_PMEM_LEGACY), or it is manually specified by the
'memmap=nn[KMG]!ss[KMG]' kernel command line (see
2016-10-18 15:12:27 +03:00
Documentation/admin-guide/kernel-parameters.rst). This driver converts
2015-06-09 21:13:37 +03:00
these persistent memory ranges into block devices that are
capable of DAX (direct-access) file system mappings. See
2019-06-18 22:32:31 +03:00
Documentation/driver-api/nvdimm/nvdimm.rst for more details.
2015-06-09 21:13:37 +03:00
Say Y if you want to use an NVDIMM
2015-07-31 00:57:47 +03:00
config ND_CLAIM
bool
nd_btt: atomic sector updates
BTT stands for Block Translation Table, and is a way to provide power
fail sector atomicity semantics for block devices that have the ability
to perform byte granularity IO. It relies on the capability of libnvdimm
namespace devices to do byte aligned IO.
The BTT works as a stacked blocked device, and reserves a chunk of space
from the backing device for its accounting metadata. It is a bio-based
driver because all IO is done synchronously, and there is no queuing or
asynchronous completions at either the device or the driver level.
The BTT uses 'lanes' to index into various 'on-disk' data structures,
and lanes also act as a synchronization mechanism in case there are more
CPUs than available lanes. We did a comparison between two lane lock
strategies - first where we kept an atomic counter around that tracked
which was the last lane that was used, and 'our' lane was determined by
atomically incrementing that. That way, for the nr_cpus > nr_lanes case,
theoretically, no CPU would be blocked waiting for a lane. The other
strategy was to use the cpu number we're scheduled on to and hash it to
a lane number. Theoretically, this could block an IO that could've
otherwise run using a different, free lane. But some fio workloads
showed that the direct cpu -> lane hash performed faster than tracking
'last lane' - my reasoning is the cache thrash caused by moving the
atomic variable made that approach slower than simply waiting out the
in-progress IO. This supports the conclusion that the driver can be a
very simple bio-based one that does synchronous IOs instead of queuing.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Neil Brown <neilb@suse.de>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
[jmoyer: fix nmi watchdog timeout in btt_map_init]
[jmoyer: move btt initialization to module load path]
[jmoyer: fix memory leak in the btt initialization path]
[jmoyer: Don't overwrite corrupted arenas]
Signed-off-by: Vishal Verma <vishal.l.verma@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-06-25 11:20:32 +03:00
config ND_BTT
tristate
2015-06-25 11:20:04 +03:00
config BTT
nd_btt: atomic sector updates
BTT stands for Block Translation Table, and is a way to provide power
fail sector atomicity semantics for block devices that have the ability
to perform byte granularity IO. It relies on the capability of libnvdimm
namespace devices to do byte aligned IO.
The BTT works as a stacked blocked device, and reserves a chunk of space
from the backing device for its accounting metadata. It is a bio-based
driver because all IO is done synchronously, and there is no queuing or
asynchronous completions at either the device or the driver level.
The BTT uses 'lanes' to index into various 'on-disk' data structures,
and lanes also act as a synchronization mechanism in case there are more
CPUs than available lanes. We did a comparison between two lane lock
strategies - first where we kept an atomic counter around that tracked
which was the last lane that was used, and 'our' lane was determined by
atomically incrementing that. That way, for the nr_cpus > nr_lanes case,
theoretically, no CPU would be blocked waiting for a lane. The other
strategy was to use the cpu number we're scheduled on to and hash it to
a lane number. Theoretically, this could block an IO that could've
otherwise run using a different, free lane. But some fio workloads
showed that the direct cpu -> lane hash performed faster than tracking
'last lane' - my reasoning is the cache thrash caused by moving the
atomic variable made that approach slower than simply waiting out the
in-progress IO. This supports the conclusion that the driver can be a
very simple bio-based one that does synchronous IOs instead of queuing.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Neil Brown <neilb@suse.de>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
[jmoyer: fix nmi watchdog timeout in btt_map_init]
[jmoyer: move btt initialization to module load path]
[jmoyer: fix memory leak in the btt initialization path]
[jmoyer: Don't overwrite corrupted arenas]
Signed-off-by: Vishal Verma <vishal.l.verma@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-06-25 11:20:32 +03:00
bool "BTT: Block Translation Table (atomic sector updates)"
default y if LIBNVDIMM
2015-07-31 00:57:47 +03:00
select ND_CLAIM
nd_btt: atomic sector updates
BTT stands for Block Translation Table, and is a way to provide power
fail sector atomicity semantics for block devices that have the ability
to perform byte granularity IO. It relies on the capability of libnvdimm
namespace devices to do byte aligned IO.
The BTT works as a stacked blocked device, and reserves a chunk of space
from the backing device for its accounting metadata. It is a bio-based
driver because all IO is done synchronously, and there is no queuing or
asynchronous completions at either the device or the driver level.
The BTT uses 'lanes' to index into various 'on-disk' data structures,
and lanes also act as a synchronization mechanism in case there are more
CPUs than available lanes. We did a comparison between two lane lock
strategies - first where we kept an atomic counter around that tracked
which was the last lane that was used, and 'our' lane was determined by
atomically incrementing that. That way, for the nr_cpus > nr_lanes case,
theoretically, no CPU would be blocked waiting for a lane. The other
strategy was to use the cpu number we're scheduled on to and hash it to
a lane number. Theoretically, this could block an IO that could've
otherwise run using a different, free lane. But some fio workloads
showed that the direct cpu -> lane hash performed faster than tracking
'last lane' - my reasoning is the cache thrash caused by moving the
atomic variable made that approach slower than simply waiting out the
in-progress IO. This supports the conclusion that the driver can be a
very simple bio-based one that does synchronous IOs instead of queuing.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Neil Brown <neilb@suse.de>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
[jmoyer: fix nmi watchdog timeout in btt_map_init]
[jmoyer: move btt initialization to module load path]
[jmoyer: fix memory leak in the btt initialization path]
[jmoyer: Don't overwrite corrupted arenas]
Signed-off-by: Vishal Verma <vishal.l.verma@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-06-25 11:20:32 +03:00
help
The Block Translation Table (BTT) provides atomic sector
update semantics for persistent memory devices, so that
applications that rely on sector writes not being torn (a
guarantee that typical disks provide) can continue to do so.
The BTT manifests itself as an alternate personality for an
2022-03-10 06:49:26 +03:00
NVDIMM namespace, i.e. a namespace can be in raw mode pmemX,
or 'sectored' mode.
nd_btt: atomic sector updates
BTT stands for Block Translation Table, and is a way to provide power
fail sector atomicity semantics for block devices that have the ability
to perform byte granularity IO. It relies on the capability of libnvdimm
namespace devices to do byte aligned IO.
The BTT works as a stacked blocked device, and reserves a chunk of space
from the backing device for its accounting metadata. It is a bio-based
driver because all IO is done synchronously, and there is no queuing or
asynchronous completions at either the device or the driver level.
The BTT uses 'lanes' to index into various 'on-disk' data structures,
and lanes also act as a synchronization mechanism in case there are more
CPUs than available lanes. We did a comparison between two lane lock
strategies - first where we kept an atomic counter around that tracked
which was the last lane that was used, and 'our' lane was determined by
atomically incrementing that. That way, for the nr_cpus > nr_lanes case,
theoretically, no CPU would be blocked waiting for a lane. The other
strategy was to use the cpu number we're scheduled on to and hash it to
a lane number. Theoretically, this could block an IO that could've
otherwise run using a different, free lane. But some fio workloads
showed that the direct cpu -> lane hash performed faster than tracking
'last lane' - my reasoning is the cache thrash caused by moving the
atomic variable made that approach slower than simply waiting out the
in-progress IO. This supports the conclusion that the driver can be a
very simple bio-based one that does synchronous IOs instead of queuing.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Neil Brown <neilb@suse.de>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
[jmoyer: fix nmi watchdog timeout in btt_map_init]
[jmoyer: move btt initialization to module load path]
[jmoyer: fix memory leak in the btt initialization path]
[jmoyer: Don't overwrite corrupted arenas]
Signed-off-by: Vishal Verma <vishal.l.verma@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-06-25 11:20:32 +03:00
Select Y if unsure
2015-06-25 11:20:04 +03:00
2015-07-31 00:57:47 +03:00
config ND_PFN
tristate
config NVDIMM_PFN
bool "PFN: Map persistent (device) memory"
default LIBNVDIMM
2015-08-01 09:16:37 +03:00
depends on ZONE_DEVICE
2015-07-31 00:57:47 +03:00
select ND_CLAIM
help
Map persistent memory, i.e. advertise it to the memory
management sub-system. By default persistent memory does
not support direct I/O, RDMA, or any other usage that
requires a 'struct page' to mediate an I/O request. This
driver allocates and initializes the infrastructure needed
to support those use cases.
Select Y if unsure
2016-03-11 21:15:36 +03:00
config NVDIMM_DAX
2016-10-25 18:52:04 +03:00
bool "NVDIMM DAX: Raw access to persistent memory"
2016-03-11 21:15:36 +03:00
default LIBNVDIMM
depends on NVDIMM_PFN
help
Support raw device dax access to a persistent memory
namespace. For environments that want to hard partition
2017-09-26 06:47:59 +03:00
persistent memory, this capability provides a mechanism to
2016-03-11 21:15:36 +03:00
sub-divide a namespace into character devices that can only be
accessed via DAX (mmap(2)).
Select Y if unsure
2018-04-06 08:21:14 +03:00
config OF_PMEM
2018-04-20 01:07:42 +03:00
tristate "Device-tree support for persistent memory regions"
2018-04-06 08:21:14 +03:00
depends on OF
default LIBNVDIMM
help
Allows regions of persistent memory to be described in the
device-tree.
Select Y if unsure.
2018-12-06 23:40:01 +03:00
config NVDIMM_KEYS
def_bool y
depends on ENCRYPTED_KEYS
depends on (LIBNVDIMM=ENCRYPTED_KEYS) || LIBNVDIMM=m
2023-01-25 23:23:46 +03:00
config NVDIMM_KMSAN
bool
depends on KMSAN
help
KMSAN, and other memory debug facilities, increase the size of
'struct page' to contain extra metadata. This collides with
the NVDIMM capability to store a potentially
larger-than-"System RAM" size 'struct page' array in a
reservation of persistent memory rather than limited /
precious DRAM. However, that reservation needs to persist for
the life of the given NVDIMM namespace. If you are using KMSAN
to debug an issue unrelated to NVDIMMs or DAX then say N to this
option. Otherwise, say Y but understand that any namespaces
(with the page array stored pmem) created with this build of
the kernel will permanently reserve and strand excess
capacity compared to the CONFIG_KMSAN=n case.
Select N if unsure.
2019-09-05 01:43:31 +03:00
config NVDIMM_TEST_BUILD
tristate "Build the unit test core"
depends on m
depends on COMPILE_TEST && X86_64
default m if COMPILE_TEST
help
Build the core of the unit test infrastructure. The result of
this build is non-functional for unit test execution, but it
otherwise helps catch build errors induced by changes to the
core devm_memremap_pages() implementation and other
infrastructure.
2022-11-30 22:23:07 +03:00
config NVDIMM_SECURITY_TEST
bool "Enable NVDIMM security unit tests"
depends on NVDIMM_KEYS
help
The NVDIMM and CXL subsystems support unit testing of their device
security state machines. The NVDIMM_SECURITY_TEST option disables CPU
cache maintenance operations around events like secure erase and
overwrite. Also, when enabled, the NVDIMM subsystem core helps the unit
test implement a mock state machine.
Select N if unsure.
2015-06-09 21:13:37 +03:00
endif