improve reading and repairing vg metadata

The fact that vg repair is implemented as a part of vg read
has led to a very poor implementation of vg_read. This splits
read and repair apart.

Summary
-------

- take all kinds of various repairs out of vg_read
- vg_read no longer writes anything
- vg_read now simply reads and returns vg metadata
- vg_read can proceed with a single good copy of metadata
- vg_read should ignore bad or old copies of metadata
- improve error checks and handling when reading
- keep track of bad (corrupt) copies of metadata in lvmcache
- keep track of old (seqno) copies of metadata in lvmcache
- keep track of outdated PVs in lvmcache
- wipe outdated PVs in vg_write instead of vg_read
- fix PV headers in vg_write instead of vg_read
- update old metadata in vg_write instead of vg_read
- do not conflate bad/old metadata with missing devs
- separate commands for other vg repairs will follow

Reading bad/old metadata
------------------------

- "bad metadata" is a copy of the metadata that has been corrupted,
  or can't be read, or has some invalid data that can't be parsed or
  understood by lvm. It's often reported as a checksum error, but not
  always. Bad metadata should be replaced with a copy of good
  metadata from another PV (or from a good copy on the same PV.)

- "old metadata" is a copy of the metadata that has a smaller seqno
  than other copies of the metadata. It could happen if the device
  failed, or io failed, or lvm failed while committing new metadata
  to all the metadata areas. Old metadata on a PV that is still in
  the VG should be replaced with a copy of good metadata from another
  PV (or from a good copy on the same PV.) Old metadata on a PV that
  has been removed from the VG should be erased.

When a VG has some PVs with bad/old metadata, lvm can simply ignore
the bad/old copies, and use a good copy. This is why there are
multiple copies of the metadata -- so it's available even when some
of the copies cannot be used. The bad/old copies do not have to be
repaired before the VG can be used (the repair can happen later.)
A PV with no good copies of the metadata simply falls back to being
treated like a PV with no mdas, a common and harmless configuration.

When bad/old metadata exists, lvm warns the user about it, and
suggests repairing it using a new metadata repair command.
Bad/old metadata is something that users will often want to
investigate and repair themselves, since it should not generally
happen and may indicate some other problem that needs to be fixed.

PVs with bad/old metadata are not the same as missing devices.
Missing devices will block various kinds of VG modification or
activation, but bad/old metadata will not.

Previously, lvm would attempt to repair bad/old metadata whenever
it was read. This was unnecessary since lvm does not require every
copy of the metadata to be used. It would also hide potential
problems that should be investigated by the user. It was also
dangerous in cases where the VG was on shared storage. The user
is now allowed to investigate potential problems and decide how
and when to repair them.

Repairing bad/old metadata
--------------------------

When label scan sees bad metadata in an mda, that mda is removed
from the lvmcache info->mdas list. This means that vg_read will
skip it, and not attempt to read/process it again. If it was
the only in-use mda on a PV, that PV is treated like a PV with
no mdas. It also means that vg_write will skip the bad mda,
and not attempt to write new metadata to it. The only way to
repair bad metadata is with metadata repair (see next commit).
(We may also want to allow pvchange --metadataignore on a PV
with bad metadata.)

When label scan sees old metadata in an mda, that mda is kept
in the lvmcache info->mdas list. This means that vg_read will
read/process it again, and likely see the same mismatch with
the other copies of the metadata. Like the label_scan, the
vg_read will simply ignore the old copy of the metadata and
use the latest copy. If the command is modifying the vg
(e.g. lvcreate), then vg_write, which writes new metadata to
every mda on info->mdas, will write the new metadata to the
mda that had the problematic old version. If successful,
this will resolve the old metadata problem (without needing
to run a metadata repair command.)

Outdated PVs
------------

An outdated PV is a PV that has an old copy of VG metadata
that shows it is a member of the VG, but the latest copy of
the VG metadata does not include this PV. This happens if
the PV is disconnected, vgreduce --removemissing is run to
remove the PV from the VG, then the PV is reconnected.
In this case, the outdated PV needs to have its outdated
metadata removed and the PV used flag needs to be cleared.
This repair will be done by the subsequent repair command.
It is also done if vgremove is run on the VG.

MISSING PVs
-----------

When a device is missing, most commands will refuse to modify
the VG. This is the simple case. More complicated is when
a command is allowed to modify the VG while it is missing a
device.

When a VG is written while a device is missing for one of its PVs,
the VG metadata includes the MISSING_PV flag on the PV with the
missing device. When the VG is next used, it needs to be treated
as if this PV with the MISSING flag is still missing, even if the
device has reappeared.

vgreduce --removemissing will remove PVs with missing devices,
or PVs with the MISSING flag where the device has reappeared.

vgextend --restoremissing will clear the MISSING flag on PVs
where the device has reappeared, allowing the VG to be used
normally. This must be done with caution since the reappeared
device may have old data that is inconsistent with data on other PVs.
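
The reading rules in this commit message (ignore bad copies, ignore
copies with a smaller seqno, proceed with any single good copy) can be
illustrated with a small C sketch. This is only an illustration under
stated assumptions: struct md_copy, enum copy_state and
classify_copies() are hypothetical names made up for this example and
are not lvm's actual data structures or the real vg_read code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification of one metadata copy (not an lvm type). */
enum copy_state { COPY_GOOD, COPY_BAD, COPY_OLD };

/* Hypothetical summary of one metadata copy found while scanning. */
struct md_copy {
	int readable;          /* 0 if the copy is corrupt or cannot be parsed */
	uint32_t seqno;        /* only meaningful when readable is non-zero */
	enum copy_state state; /* filled in by classify_copies() */
};

/*
 * Mark every copy as good, bad or old, and return the index of a copy
 * holding the latest metadata, or -1 if no copy is usable.
 * Nothing is written or repaired here; the caller only reads.
 */
static int classify_copies(struct md_copy *copies, size_t count)
{
	uint32_t latest = 0;
	int chosen = -1;
	size_t i;

	assert(copies != NULL || count == 0);

	/* Pass 1: find the largest seqno among the readable copies. */
	for (i = 0; i < count; i++) {
		if (!copies[i].readable)
			continue;
		if (chosen < 0 || copies[i].seqno > latest) {
			latest = copies[i].seqno;
			chosen = (int) i;
		}
	}

	/* Pass 2: unreadable copies are bad, smaller seqnos are old. */
	for (i = 0; i < count; i++) {
		if (!copies[i].readable)
			copies[i].state = COPY_BAD;
		else if (copies[i].seqno < latest)
			copies[i].state = COPY_OLD;
		else
			copies[i].state = COPY_GOOD;
	}

	return chosen;
}
```

In this sketch a caller would warn about any COPY_BAD or COPY_OLD
entries and carry on with the chosen copy. A return value of -1 means
that no usable copy was found among the copies passed in.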
cleanup: indent
2025-09-28 09:44:18 +03:00 · 2019-01-30 15:59:00 -06:00 · 2019-01-28 22:39:10 +01:00 · 2019-01-28 22:39:10 +01:00 · 2019-01-28 22:39:10 +01:00 · 2019-01-28 22:39:10 +01:00
338 changed files with 11989 additions and 8609 deletions
--- a/Makefile.in
+++ b/Makefile.in
@@ -43,16 +43,22 @@ DISTCLEAN_TARGETS += config.cache config.log config.status make.tmpl

 include make.tmpl

+include $(top_srcdir)/base/Makefile
+include $(top_srcdir)/device_mapper/Makefile
+include $(top_srcdir)/test/unit/Makefile
+
 libdm: include
 libdaemon: include
-lib: libdm libdaemon
+lib: libdaemon $(BASE_TARGET) $(DEVICE_MAPPER_TARGET)
 daemons: lib libdaemon tools
-tools: lib libdaemon device-mapper
+scripts: lib
+tools: lib libdaemon
 po: tools daemons
 man: tools
 all_man: tools
 scripts: libdm
 test: tools daemons
+unit-test  run-unit-test: test

 lib.device-mapper: include.device-mapper
 libdm.device-mapper: include.device-mapper
@@ -148,18 +154,8 @@ install_all_man:
 install_tmpfiles_configuration:
 	$(MAKE) -C scripts install_tmpfiles_configuration

-LCOV_TRACES = libdm.info lib.info tools.info \
-	libdaemon/client.info libdaemon/server.info \
-	test/unit.info \
-	daemons/clvmd.info \
-	daemons/dmeventd.info \
-	daemons/lvmlockd.info \
-	daemons/lvmpolld.info
-
-CLEAN_TARGETS += $(LCOV_TRACES)
-
 ifneq ("$(LCOV)", "")
-.PHONY: lcov-reset lcov lcov-dated $(LCOV_TRACES)
+.PHONY: lcov-reset lcov lcov-dated

 ifeq ($(MAKECMDGOALS),lcov-dated)
 LCOV_REPORTS_DIR := lcov_reports-$(shell date +%Y%m%d%k%M%S)
@@ -169,35 +165,22 @@ LCOV_REPORTS_DIR := lcov_reports
 endif

 lcov-reset:
-	$(LCOV) --zerocounters $(addprefix -d , $(basename $(LCOV_TRACES)))
-
-# maybe use subdirs processing to create tracefiles...
-$(LCOV_TRACES):
-	$(LCOV) -b $(basename $@) -d $(basename $@) \
-		--ignore-errors source -c -o - | $(SED) \
-		-e "s/\(dmeventd_lvm.[ch]\)/plugins\/lvm2\/\1/" \
-		-e "s/dmeventd_\(mirror\|snapshot\|thin\|raid\)\.c/plugins\/\1\/dmeventd_\1\.c/" \
-		>$@
+	$(LCOV) --zerocounters --directory $(top_builddir)

 ifneq ("$(GENHTML)", "")
-lcov: $(LCOV_TRACES)
-	$(RM) -r $(LCOV_REPORTS_DIR)
+lcov:
+	$(RM) -rf $(LCOV_REPORTS_DIR)
 	$(MKDIR_P) $(LCOV_REPORTS_DIR)
-	for i in $(LCOV_TRACES); do \
-		test -s $$i -a $$(wc -w <$$i) -ge 100 && lc="$$lc $$i"; \
-	done; \
-	test -z "$$lc" || $(GENHTML) -p @abs_top_builddir@ \
-		-o $(LCOV_REPORTS_DIR) $$lc
+	$(LCOV) --capture --directory $(top_builddir) --ignore-errors source \
+		--output-file $(LCOV_REPORTS_DIR)/out.info
+	-test ! -s $(LCOV_REPORTS_DIR)/out.info || \
+		$(GENHTML) -o $(LCOV_REPORTS_DIR) --ignore-errors source \
+		$(LCOV_REPORTS_DIR)/out.info
 endif

 endif

-# FIXME: Drop once top-level make is resolved
-include test/unit/Makefile
-include $(top_srcdir)/device_mapper/Makefile
-include $(top_srcdir)/base/Makefile
-
-ifneq ($(shell which ctags),)
+ifneq ($(shell which ctags 2>/dev/null),)
 .PHONY: tags
 tags:
 	test -z "$(shell find $(top_srcdir) -type f -name '*.[ch]' -newer tags 2>/dev/null | head -1)" || $(RM) tags
--- a/2
+++ b/2
@@ -1 +1 @@
-2.03.00(2) (2018-10-10)
+2.03.02(2)-git (2018-10-31)
--- a/2
+++ b/2
@@ -1 +1 @@
-1.02.151 (2018-10-10)
+1.02.155-git (2018-10-31)
--- a/24
+++ b/24
@@ -1,3 +1,27 @@
+Version 2.03.02 - 
+===================================
+  Thin-pool selects power-of-2 chunk size by default.
+  Cache selects power-of-2 chunk size by default.
+  Support reszing for VDOPoolLV and VDOLV.
+  Improve -lXXX%VG modifier which improves cache segment estimation.
+  Ensure migration_threshold for cache is at least 8 chunks.
+  Restore missing man info lvcreate --zero for thin-pools.
+  Drop misleadning comment for metadata minimum_io_size for VDO segment.
+  Add device hints to reduce scanning.
+  Introduce LVM_SUPPRESS_SYSLOG to suppress syslog usage by generator.
+  Fix generator quering lvmconfig unpresent config option.
+  Fix memleak on bcache error path code.
+  Fix missing unlock on lvm2 dmeventd plugin error path initialization.
+  Improve Makefile dependency tracking.
+  Move VDO support towards V2 target (6.2) support.
+  Fix missing proper initialization of pv_list struct when adding pv.
+  Fix (de)activation of RaidLVs with visible SubLVs
+  Prohibit mirrored 'mirror' log via lvcreate and lvconvert
+  Use sync io if async io_setup fails, or use_aio=0 is set in config.
+
+Version 2.03.01 - 31st October 2018
+===================================
+
 Version 2.03.00 - 10th October 2018
 ===================================
  Add hot fix to avoiding locking collision when monitoring thin-pools.
--- a/11
+++ b/11
@@ -1,3 +1,14 @@
+Version 1.02.155 - 
+====================================
+  Ensure migration_threshold for cache is at least 8 chunks.
+  Include correct internal header inside libdm list.c.
+  Enhance ioctl flattening and add parameters only when needed.
+  Add DM_DEVICE_ARM_POLL for API completness matching kernel.
+  Do not add parameters for RESUME with DM_DEVICE_CREATE dm task.
+
+Version 1.02.153 - 31st October 2018
+====================================
+
 Version 1.02.151 - 10th October 2018
 ====================================
  Add hot fix to avoiding locking collision when monitoring thin-pools.
--- a/base/Makefile
+++ b/base/Makefile
@@ -14,22 +14,27 @@
 # Comment to build the advanced radix tree.
 #base/data-struct/radix-tree.o: CFLAGS += -DSIMPLE_RADIX_TREE

+# NOTE: this Makefile only works as 'include' for toplevel Makefile
+#       which defined all top_* variables
+
 BASE_SOURCE=\
-	base/data-struct/radix-tree.c \
 	base/data-struct/hash.c \
-	base/data-struct/list.c
+	base/data-struct/list.c \
+	base/data-struct/radix-tree.c

-BASE_DEPENDS=$(addprefix $(top_builddir)/,$(subst .c,.d,$(BASE_SOURCE)))
-BASE_OBJECTS=$(addprefix $(top_builddir)/,$(subst .c,.o,$(BASE_SOURCE)))
-CLEAN_TARGETS+=$(BASE_DEPENDS) $(BASE_OBJECTS)
+BASE_TARGET = base/libbase.a
+BASE_DEPENDS = $(BASE_SOURCE:%.c=%.d)
+BASE_OBJECTS = $(BASE_SOURCE:%.c=%.o)
+CLEAN_TARGETS += $(BASE_DEPENDS) $(BASE_OBJECTS) \
+	$(BASE_SOURCE:%.c=%.gcda) \
+	$(BASE_SOURCE:%.c=%.gcno) \
+	$(BASE_TARGET)

-include $(BASE_DEPENDS)
-
-$(BASE_OBJECTS): INCLUDES+=-I$(top_srcdir)/base/
-
-$(top_builddir)/base/libbase.a: $(BASE_OBJECTS)
+$(BASE_TARGET): $(BASE_OBJECTS)
 	@echo "    [AR] $@"
 	$(Q) $(RM) $@
 	$(Q) $(AR) rsv $@ $(BASE_OBJECTS) > /dev/null

-CLEAN_TARGETS+=$(top_builddir)/base/libbase.a
+ifeq ("$(DEPENDS)","yes")
+-include $(BASE_DEPENDS)
+endif
--- a/base/data-struct/radix-tree-adaptive.c
+++ b/base/data-struct/radix-tree-adaptive.c
@@ -18,6 +18,7 @@
 #include <assert.h>
 #include <stdlib.h>
 #include <stdio.h>
+#include <string.h>

 //----------------------------------------------------------------

@@ -265,7 +266,8 @@ static bool _insert_prefix_chain(struct radix_tree *rt, struct value *v, uint8_t
 			if (kb[i] != pc->prefix[i])
 				break;

-		pc2 = zalloc(sizeof(*pc2) + pc->len - i);
+		if (!(pc2 = zalloc(sizeof(*pc2) + pc->len - i)))
+			return false;
 		pc2->len = pc->len - i;
 		memmove(pc2->prefix, pc->prefix + i, pc2->len);
 		pc2->child = pc->child;
@@ -353,6 +355,7 @@ static bool _insert_node16(struct radix_tree *rt, struct value *v, uint8_t *kb,
 			return false;

 		n48->nr_entries = 17;
+		/* coverity[bad_memset] intentional use of '0' */
 		memset(n48->keys, 48, sizeof(n48->keys));

 		for (i = 0; i < 16; i++) {
@@ -563,6 +566,8 @@ static void _degrade_to_n4(struct node16 *n16, struct value *result)
 {
        struct node4 *n4 = zalloc(sizeof(*n4));

+	assert(n4 != NULL);
+
        n4->nr_entries = n16->nr_entries;
        memcpy(n4->keys, n16->keys, n16->nr_entries * sizeof(*n4->keys));
        memcpy(n4->values, n16->values, n16->nr_entries * sizeof(*n4->values));
@@ -577,6 +582,8 @@ static void _degrade_to_n16(struct node48 *n48, struct value *result)
 	unsigned i, count = 0;
        struct node16 *n16 = zalloc(sizeof(*n16));

+	assert(n16 != NULL);
+
        n16->nr_entries = n48->nr_entries;
        for (i = 0; i < 256; i++) {
 	        if (n48->keys[i] < 48) {
@@ -597,6 +604,8 @@ static void _degrade_to_n48(struct node256 *n256, struct value *result)
        unsigned i, count = 0;
        struct node48 *n48 = zalloc(sizeof(*n48));

+	assert(n48 != NULL);
+
        n48->nr_entries = n256->nr_entries;
        for (i = 0; i < 256; i++) {
 		if (n256->values[i].type == UNSET)
@@ -616,15 +625,15 @@ static void _degrade_to_n48(struct node256 *n256, struct value *result)
 }

 // Removes an entry in an array by sliding the values above it down.
-static void _erase_elt(void *array, unsigned obj_size, unsigned count, unsigned index)
+static void _erase_elt(void *array, size_t obj_size, unsigned count, unsigned idx)
 {
-	if (index == (count - 1))
+	if (idx == (count - 1))
 		// The simple case
 		return;

-	memmove(((uint8_t *) array) + (obj_size * index),
-                ((uint8_t *) array) + (obj_size * (index + 1)),
-                obj_size * (count - index - 1));
+	memmove(((uint8_t *) array) + (obj_size * idx),
+                ((uint8_t *) array) + (obj_size * (idx + 1)),
+                obj_size * (count - idx - 1));

 	// Zero the now unused last elt (set's v.type to UNSET)
 	memset(((uint8_t *) array) + (count - 1) * obj_size, 0, obj_size);
--- a/base/memory/zalloc.h
+++ b/base/memory/zalloc.h
@@ -14,16 +14,12 @@
 #define BASE_MEMORY_ZALLOC_H

 #include <stdlib.h>
-#include <string.h>

 //----------------------------------------------------------------

 static inline void *zalloc(size_t len)
 {
-	void *ptr = malloc(len);
-	if (ptr)
-		memset(ptr, 0, len);
-	return ptr;
+	return calloc(1, len);
 }

 //----------------------------------------------------------------
--- a/conf/Makefile.in
+++ b/conf/Makefile.in
@@ -49,8 +49,9 @@ install_localconf: $(CONFLOCAL)
 	fi

 install_profiles: $(PROFILES)
-	$(INSTALL_DIR) $(profiledir)
-	$(INSTALL_DATA) $(PROFILES) $(profiledir)/
+	@echo "    [INSTALL] $<"
+	$(Q) $(INSTALL_DIR) $(profiledir)
+	$(Q) $(INSTALL_DATA) $(PROFILES) $(profiledir)/

 install_lvm2: install_conf install_localconf install_profiles

--- a/conf/example.conf.in
+++ b/conf/example.conf.in
@@ -123,7 +123,6 @@ devices {
 	# then the device is accepted. Be careful mixing 'a' and 'r' patterns,
 	# as the combination might produce unexpected results (test changes.)
 	# Run vgscan after changing the filter to regenerate the cache.
-	# See the use_lvmetad comment for a special case regarding filters.
 	# 
 	# Example
 	# Accept every block device:
@@ -143,25 +142,13 @@ devices {
 	# Configuration option devices/global_filter.
 	# Limit the block devices that are used by LVM system components.
 	# Because devices/filter may be overridden from the command line, it is
-	# not suitable for system-wide device filtering, e.g. udev and lvmetad.
+	# not suitable for system-wide device filtering, e.g. udev.
 	# Use global_filter to hide devices from these LVM system components.
 	# The syntax is the same as devices/filter. Devices rejected by
 	# global_filter are not opened by LVM.
 	# This configuration option has an automatic default value.
 	# global_filter = [ "a|.*/|" ]

-	# Configuration option devices/cache_dir.
-	# This setting is no longer used.
-	cache_dir = "@DEFAULT_SYS_DIR@/@DEFAULT_CACHE_SUBDIR@"
-
-	# Configuration option devices/cache_file_prefix.
-	# This setting is no longer used.
-	cache_file_prefix = ""
-
-	# Configuration option devices/write_cache_state.
-	# This setting is no longer used.
-	write_cache_state = 1
-
 	# Configuration option devices/types.
 	# List of additional acceptable block device types.
 	# These are of device type names from /proc/devices, followed by the
@@ -179,6 +166,10 @@ devices {
 	# present on the system. sysfs must be part of the kernel and mounted.)
 	sysfs_scan = 1

+	# Configuration option devices/scan_lvs.
+	# Scan LVM LVs for layered PVs.
+	scan_lvs = 1
+
 	# Configuration option devices/multipath_component_detection.
 	# Ignore devices that are components of DM multipath devices.
 	multipath_component_detection = 1
@@ -194,19 +185,24 @@ devices {
 	fw_raid_component_detection = 0

 	# Configuration option devices/md_chunk_alignment.
-	# Align PV data blocks with md device's stripe-width.
+	# Align the start of a PV data area with md device's stripe-width.
 	# This applies if a PV is placed directly on an md device.
+	# default_data_alignment will be overriden if it is not aligned
+	# with the value detected for this setting.
+	# This setting is overriden by data_alignment_detection,
+	# data_alignment, and the --dataalignment option.
 	md_chunk_alignment = 1

 	# Configuration option devices/default_data_alignment.
-	# Default alignment of the start of a PV data area in MB.
-	# If set to 0, a value of 64KiB will be used.
-	# Set to 1 for 1MiB, 2 for 2MiB, etc.
+	# Align the start of a PV data area with this number of MiB.
+	# Set to 1 for 1MiB, 2 for 2MiB, etc. Set to 0 to disable.
+	# This setting is overriden by data_alignment and the --dataalignment
+	# option.
 	# This configuration option has an automatic default value.
 	# default_data_alignment = 1

 	# Configuration option devices/data_alignment_detection.
-	# Detect PV data alignment based on sysfs device information.
+	# Align the start of a PV data area with sysfs io properties.
 	# The start of a PV data area will be a multiple of minimum_io_size or
 	# optimal_io_size exposed in sysfs. minimum_io_size is the smallest
 	# request the device can perform without incurring a read-modify-write
@@ -214,27 +210,29 @@ devices {
 	# preferred unit of receiving I/O, e.g. MD stripe width.
 	# minimum_io_size is used if optimal_io_size is undefined (0).
 	# If md_chunk_alignment is enabled, that detects the optimal_io_size.
-	# This setting takes precedence over md_chunk_alignment.
+	# default_data_alignment and md_chunk_alignment will be overriden
+	# if they are not aligned with the value detected for this setting.
+	# This setting is overriden by data_alignment and the --dataalignment
+	# option.
 	data_alignment_detection = 1

 	# Configuration option devices/data_alignment.
-	# Alignment of the start of a PV data area in KiB.
-	# If a PV is placed directly on an md device and md_chunk_alignment or
-	# data_alignment_detection are enabled, then this setting is ignored.
-	# Otherwise, md_chunk_alignment and data_alignment_detection are
-	# disabled if this is set. Set to 0 to use the default alignment or the
-	# page size, if larger.
+	# Align the start of a PV data area with this number of KiB.
+	# When non-zero, this setting overrides default_data_alignment.
+	# Set to 0 to disable, in which case default_data_alignment
+	# is used to align the first PE in units of MiB.
+	# This setting is overriden by the --dataalignment option.
 	data_alignment = 0

 	# Configuration option devices/data_alignment_offset_detection.
-	# Detect PV data alignment offset based on sysfs device information.
-	# The start of a PV aligned data area will be shifted by the
+	# Shift the start of an aligned PV data area based on sysfs information.
+	# After a PV data area is aligned, it will be shifted by the
 	# alignment_offset exposed in sysfs. This offset is often 0, but may
 	# be non-zero. Certain 4KiB sector drives that compensate for windows
 	# partitioning will have an alignment_offset of 3584 bytes (sector 7
 	# is the lowest aligned logical block, the 4KiB sectors start at
 	# LBA -1, and consequently sector 63 is aligned on a 4KiB boundary).
-	# pvcreate --dataalignmentoffset will skip this detection.
+	# This setting is overriden by the --dataalignmentoffset option.
 	data_alignment_offset_detection = 1

 	# Configuration option devices/ignore_suspended_devices.
@@ -262,10 +260,6 @@ devices {
 	# different way, making them a better choice for VG stacking.
 	ignore_lvm_mirrors = 1

-	# Configuration option devices/disable_after_error_count.
-	# This setting is no longer used.
-	disable_after_error_count = 0
-
 	# Configuration option devices/require_restorefile_with_uuid.
 	# Allow use of pvcreate --uuid without requiring --restorefile.
 	require_restorefile_with_uuid = 1
@@ -336,7 +330,7 @@ allocation {
 	maximise_cling = 1

 	# Configuration option allocation/use_blkid_wiping.
-	# Use blkid to detect existing signatures on new PVs and LVs.
+	# Use blkid to detect and erase existing signatures on new PVs and LVs.
 	# The blkid library can detect more signatures than the native LVM
 	# detection code, but may take longer. LVM needs to be compiled with
 	# blkid wiping support for this setting to apply. LVM native detection
@@ -503,10 +497,19 @@ allocation {
 	# This configuration option has an automatic default value.
 	# vdo_use_deduplication = 1

-	# Configuration option allocation/vdo_emulate_512_sectors.
-	# Specifies that the VDO volume is to emulate a 512 byte block device.
+	# Configuration option allocation/vdo_use_metadata_hints.
+	# Enables or disables whether VDO volume should tag its latency-critical
+	# writes with the REQ_SYNC flag. Some device mapper targets such as dm-raid5
+	# process writes with this flag at a higher priority.
+	# Default is enabled.
 	# This configuration option has an automatic default value.
-	# vdo_emulate_512_sectors = 0
+	# vdo_use_metadata_hints = 1
+
+	# Configuration option allocation/vdo_minimum_io_size.
+	# The minimum IO size for VDO volume to accept, in bytes.
+	# Valid values are 512 or 4096. The recommended and default value is 4096.
+	# This configuration option has an automatic default value.
+	# vdo_minimum_io_size = 4096

 	# Configuration option allocation/vdo_block_map_cache_size_mb.
 	# Specifies the amount of memory in MiB allocated for caching block map
@@ -517,10 +520,10 @@ allocation {
 	# vdo_block_map_cache_size_mb = 128

 	# Configuration option allocation/vdo_block_map_period.
-	# Tunes the quantity of block map updates that can accumulate
-	# before cache pages are flushed to disk. The value must be
-	# at least 1 and less then 16380.
-	# A lower value means shorter recovery time but lower performance.
+	# The speed with which the block map cache writes out modified block map pages.
+	# A smaller era length is likely to reduce the amount time spent rebuilding,
+	# at the cost of increased block map writes during normal operation.
+	# The maximum and recommended value is 16380; the minimum value is 1.
 	# This configuration option has an automatic default value.
 	# vdo_block_map_period = 16380

@@ -540,22 +543,6 @@ allocation {
 	# This configuration option has an automatic default value.
 	# vdo_index_memory_size_mb = 256

-	# Configuration option allocation/vdo_use_read_cache.
-	# Enables or disables the read cache within the VDO volume.
-	# The cache should be enabled if write workloads are expected
-	# to have high levels of deduplication, or for read intensive
-	# workloads of highly compressible data.
-	# This configuration option has an automatic default value.
-	# vdo_use_read_cache = 0
-
-	# Configuration option allocation/vdo_read_cache_size_mb.
-	# Specifies the extra VDO volume read cache size in MiB.
-	# This space is in addition to a system-defined minimum.
-	# The value must be less then 16TiB and 1.12 MiB of memory
-	# will be used per MiB of read cache specified, per bio thread.
-	# This configuration option has an automatic default value.
-	# vdo_read_cache_size_mb = 0
-
 	# Configuration option allocation/vdo_slab_size_mb.
 	# Specifies the size in MiB of the increment by which a VDO is grown.
 	# Using a smaller size constrains the total maximum physical size
@@ -631,6 +618,18 @@ allocation {
 	#         Data which has not been flushed is not guaranteed to persist in this mode.
 	# This configuration option has an automatic default value.
 	# vdo_write_policy = "auto"
+
+	# Configuration option allocation/vdo_max_discard.
+	# Specified te maximum size of discard bio accepted, in 4096 byte blocks.
+	# I/O requests to a VDO volume are normally split into 4096-byte blocks,
+	# and processed up to 2048 at a time. However, discard requests to a VDO volume
+	# can be automatically split to a larger size, up to <max discard> 4096-byte blocks
+	# in a single bio, and are limited to 1500 at a time.
+	# Increasing this value may provide better overall performance, at the cost of
+	# increased latency for the individual discard requests.
+	# The default and minimum is 1. The maximum is UINT_MAX / 4096.
+	# This configuration option has an automatic default value.
+	# vdo_max_discard = 1
 }

 # Configuration section log.
@@ -744,9 +743,9 @@ log {
 	# Select log messages by class.
 	# Some debugging messages are assigned to a class and only appear in
 	# debug output if the class is listed here. Classes currently
-	# available: memory, devices, io, activation, allocation, lvmetad,
+	# available: memory, devices, io, activation, allocation,
 	# metadata, cache, locking, lvmpolld. Use "all" to see everything.
-	debug_classes = [ "memory", "devices", "io", "activation", "allocation", "lvmetad", "metadata", "cache", "locking", "lvmpolld", "dbus" ]
+	debug_classes = [ "memory", "devices", "io", "activation", "allocation", "metadata", "cache", "locking", "lvmpolld", "dbus" ]
 }

 # Configuration section backup.
@@ -834,20 +833,6 @@ global {
 	# the error messages.
 	activation = 1

-	# Configuration option global/fallback_to_lvm1.
-	# This setting is no longer used.
-	# This configuration option has an automatic default value.
-	# fallback_to_lvm1 = 0
-
-	# Configuration option global/format.
-	# This setting is no longer used.
-	# This configuration option has an automatic default value.
-	# format = "lvm2"
-
-	# Configuration option global/format_libraries.
-	# This setting is no longer used.
-	# This configuration option does not have a default value defined.
-
 	# Configuration option global/segment_libraries.
 	# This configuration option does not have a default value defined.

@@ -860,22 +845,10 @@ global {
 	# Location of /etc system configuration directory.
 	etc = "@CONFDIR@"

-	# Configuration option global/locking_type.
-	# This setting is no longer used.
-	locking_type = 1
-
 	# Configuration option global/wait_for_locks.
 	# When disabled, fail if a lock request would block.
 	wait_for_locks = 1

-	# Configuration option global/fallback_to_clustered_locking.
-	# This setting is no longer used.
-	fallback_to_clustered_locking = 1
-
-	# Configuration option global/fallback_to_local_locking.
-	# This setting is no longer used.
-	fallback_to_local_locking = 1
-
 	# Configuration option global/locking_dir.
 	# Directory to use for LVM command file locks.
 	# Local non-LV directory that holds file-based locks while commands are
@@ -896,11 +869,6 @@ global {
 	# Search this directory first for shared libraries.
 	# This configuration option does not have a default value defined.

-	# Configuration option global/locking_library.
-	# This setting is no longer used.
-	# This configuration option has an automatic default value.
-	# locking_library = "liblvm2clusterlock.so"
-
 	# Configuration option global/abort_on_internal_errors.
 	# Abort a command that encounters an internal error.
 	# Treat any internal errors as fatal errors, aborting the process that
@@ -941,6 +909,16 @@ global {
 	# 
 	mirror_segtype_default = "@DEFAULT_MIRROR_SEGTYPE@"

+	# Configuration option global/support_mirrored_mirror_log.
+	# Enable mirrored 'mirror' log type for testing.
+	# 
+	# This type is deprecated to create or convert to but can
+	# be enabled to test that activation of existing mirrored
+	# logs and conversion to disk/core works.
+	# 
+	# Not supported for regular operation!
+	support_mirrored_mirror_log = 0
+
 	# Configuration option global/raid10_segtype_default.
 	# The segment type used by the -i -m combination.
 	# The --type raid10|mirror option overrides this setting.
@@ -989,41 +967,20 @@ global {
 	# This configuration option has an automatic default value.
 	# lvdisplay_shows_full_device_path = 0

-	# Configuration option global/use_lvmetad.
-	# Use lvmetad to cache metadata and reduce disk scanning.
-	# When enabled (and running), lvmetad provides LVM commands with VG
-	# metadata and PV state. LVM commands then avoid reading this
-	# information from disks which can be slow. When disabled (or not
-	# running), LVM commands fall back to scanning disks to obtain VG
-	# metadata. lvmetad is kept updated via udev rules which must be set
-	# up for LVM to work correctly. (The udev rules should be installed
-	# by default.) Without a proper udev setup, changes in the system's
-	# block device configuration will be unknown to LVM, and ignored
-	# until a manual 'pvscan --cache' is run. If lvmetad was running
-	# while use_lvmetad was disabled, it must be stopped, use_lvmetad
-	# enabled, and then started. When using lvmetad, LV activation is
-	# switched to an automatic, event-based mode. In this mode, LVs are
-	# activated based on incoming udev events that inform lvmetad when
-	# PVs appear on the system. When a VG is complete (all PVs present),
-	# it is auto-activated. The auto_activation_volume_list setting
-	# controls which LVs are auto-activated (all by default.)
-	# When lvmetad is updated (automatically by udev events, or directly
-	# by pvscan --cache), devices/filter is ignored and all devices are
-	# scanned by default. lvmetad always keeps unfiltered information
-	# which is provided to LVM commands. Each LVM command then filters
-	# based on devices/filter. This does not apply to other, non-regexp,
-	# filtering settings: component filters such as multipath and MD
-	# are checked during pvscan --cache. To filter a device and prevent
-	# scanning from the LVM system entirely, including lvmetad, use
-	# devices/global_filter.
-	use_lvmetad = @DEFAULT_USE_LVMETAD@
+	# Configuration option global/event_activation.
+	# Activate LVs based on system-generated device events.
+	# When a device appears on the system, a system-generated event runs
+	# the pvscan command to activate LVs if the new PV completes the VG.
+	# Use auto_activation_volume_list to select which LVs should be
+	# activated from these events (the default is all.)
+	# When event_activation is disabled, the system will generally run
+	# a direct activation command to activate LVs in complete VGs.
+	event_activation = 1

-	# Configuration option global/lvmetad_update_wait_time.
-	# Number of seconds a command will wait for lvmetad update to finish.
-	# After waiting for this period, a command will not use lvmetad, and
-	# will revert to disk scanning.
+	# Configuration option global/use_aio.
+	# Use async I/O when reading and writing devices.
 	# This configuration option has an automatic default value.
-	# lvmetad_update_wait_time = 10
+	# use_aio = 1

 	# Configuration option global/use_lvmlockd.
 	# Use lvmlockd for locking among hosts using LVM on shared storage.
@@ -1705,13 +1662,20 @@ activation {
 	# vgmetadatacopies = 0

 	# Configuration option metadata/pvmetadatasize.
-	# Approximate number of sectors to use for each metadata copy.
-	# VGs with large numbers of PVs or LVs, or VGs containing complex LV
-	# structures, may need additional space for VG metadata. The metadata
-	# areas are treated as circular buffers, so unused space becomes filled
-	# with an archive of the most recent previous versions of the metadata.
+	# The default size of the metadata area in units of 512 byte sectors.
+	# The metadata area begins at an offset of the page size from the start
+	# of the device. The first PE is by default at 1 MiB from the start of
+	# the device. The space between these is the default metadata area size.
+	# The actual size of the metadata area may be larger than what is set
+	# here due to default_data_alignment making the first PE a MiB multiple.
+	# The metadata area begins with a 512 byte header and is followed by a
+	# circular buffer used for VG metadata text. The maximum size of the VG
+	# metadata is about half the size of the metadata buffer. VGs with large
+	# numbers of PVs or LVs, or VGs containing complex LV structures, may need
+	# additional space for VG metadata. The --metadatasize option overrides
+	# this setting.
+	# This configuration option does not have a default value defined.
 	# This configuration option has an automatic default value.
-	# pvmetadatasize = 255

 	# Configuration option metadata/pvmetadataignore.
 	# Ignore metadata areas on a new PV.
@@ -1726,11 +1690,6 @@ activation {
 	# This configuration option is advanced.
 	# This configuration option has an automatic default value.
 	# stripesize = 64
-
-	# Configuration option metadata/dirs.
-	# This setting is no longer used.
-	# This configuration option is advanced.
-	# This configuration option does not have a default value defined.
 # }

 # Configuration section report.
--- a/conf/vdo-small.profile
+++ b/conf/vdo-small.profile
@@ -1,25 +1,24 @@
 # Demo configuration for 'VDO' using less memory.
-#
+# ~lvmconfig --type full | grep vdo

 allocation {
-	vdo_use_compression = 1
-	vdo_use_deduplication = 1
-	vdo_emulate_512_sectors = 0
-	vdo_block_map_cache_size_mb = 128
-	vdo_block_map_period = 16380
-	vdo_check_point_frequency = 0
-	vdo_use_sparse_index = 0
-	vdo_index_memory_size_mb = 256
-	vdo_use_read_cache = 0
-	vdo_read_cache_size_mb = 0
-	vdo_slab_size_mb = 2048
-
-	vdo_ack_threads = 1
-	vdo_bio_threads = 1
-	vdo_bio_rotation = 64
-	vdo_cpu_threads = 2
-	vdo_hash_zone_threads = 1
-	vdo_logical_threads = 1
-	vdo_physical_threads = 1
-	vdo_write_policy = "auto"
+	vdo_use_compression=1
+	vdo_use_deduplication=1
+	vdo_use_metadata_hints=1
+	vdo_minimum_io_size=4096
+	vdo_block_map_cache_size_mb=128
+	vdo_block_map_period=16380
+	vdo_check_point_frequency=0
+	vdo_use_sparse_index=0
+	vdo_index_memory_size_mb=256
+	vdo_slab_size_mb=2048
+	vdo_ack_threads=1
+	vdo_bio_threads=1
+	vdo_bio_rotation=64
+	vdo_cpu_threads=2
+	vdo_hash_zone_threads=1
+	vdo_logical_threads=1
+	vdo_physical_threads=1
+	vdo_write_policy="auto"
+	vdo_max_discard=1
 }
--- a/configure
+++ b/configure
@@ -728,7 +728,6 @@ DEFAULT_PID_DIR
 DEFAULT_MIRROR_SEGTYPE
 DEFAULT_LOCK_DIR
 DEFAULT_DM_RUN_DIR
-DEFAULT_DATA_ALIGNMENT
 DEFAULT_CACHE_SUBDIR
 DEFAULT_BACKUP_SUBDIR
 DEFAULT_ARCHIVE_SUBDIR
@@ -915,6 +914,7 @@ with_cache_restore
 enable_cache_check_needs_check
 with_vdo
 with_vdo_format
+with_writecache
 enable_readline
 enable_realtime
 enable_ocf
@@ -973,7 +973,6 @@ with_default_archive_subdir
 with_default_backup_subdir
 with_default_cache_subdir
 with_default_locking_dir
-with_default_data_alignment
 with_interface
 '
      ac_precious_vars='build_alias
@@ -1708,6 +1707,7 @@ Optional Packages:
                          cache_restore tool: [autodetect]
  --with-vdo=TYPE         vdo support: internal/none [internal]
  --with-vdo-format=PATH  vdoformat tool: [autodetect]
+  --with-writecache=TYPE  writecache support: internal/none [none]
  --with-ocfdir=DIR       install OCF files in
                          [PREFIX/lib/ocf/resource.d/lvm2]
  --with-default-pid-dir=PID_DIR
@@ -1752,8 +1752,6 @@ Optional Packages:
                          default metadata cache subdir [cache]
  --with-default-locking-dir=DIR
                          default locking directory [autodetect_lock_dir/lvm]
-  --with-default-data-alignment=NUM
-                          set the default data alignment in MiB [1]
  --with-interface=IFACE  choose kernel interface (ioctl) [ioctl]

 Some influential environment variables:
@@ -3067,7 +3065,7 @@ if test -z "$CFLAGS"; then :
 fi
 case "$host_os" in
 	linux*)
-		CLDFLAGS="$CLDFLAGS -Wl,--version-script,.export.sym"
+		CLDFLAGS="${CLDFLAGS-"$LDFLAGS"} -Wl,--version-script,.export.sym"
 		# equivalent to -rdynamic
 		ELDFLAGS="-Wl,--export-dynamic"
 		# FIXME Generate list and use --dynamic-list=.dlopen.sym
@@ -3087,7 +3085,7 @@ case "$host_os" in
 		;;
 	darwin*)
 		CFLAGS="$CFLAGS -no-cpp-precomp -fno-common"
-		CLDFLAGS="$CLDFLAGS"
+		CLDFLAGS="${CLDFLAGS-"$LDFLAGS"}"
 		ELDFLAGS=
 		CLDWHOLEARCHIVE="-all_load"
 		CLDNOWHOLEARCHIVE=
@@ -3099,6 +3097,9 @@ case "$host_os" in
 		FSADM=no
 		BLKDEACTIVATE=no
 		;;
+	*)
+		CLDFLAGS="${CLDFLAGS-"$LDFLAGS"}"
+		;;
 esac

 ################################################################################
@@ -6622,6 +6623,15 @@ fi



+
+
+$as_echo "#define _GNU_SOURCE 1" >>confdefs.h
+
+
+$as_echo "#define _REENTRANT 1" >>confdefs.h
+
+
+
 ################################################################################
 for ac_func in ftruncate gethostname getpagesize gettimeofday localtime_r \
  memchr memset mkdir mkfifo munmap nl_langinfo realpath rmdir setenv \
@@ -9702,6 +9712,31 @@ _ACEOF
 #                           VDO_LIB=$withval, VDO_LIB="/usr/lib")
 #AC_MSG_RESULT($VDO_LIB)

+################################################################################
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to include writecache" >&5
+$as_echo_n "checking whether to include writecache... " >&6; }
+
+# Check whether --with-writecache was given.
+if test "${with_writecache+set}" = set; then :
+  withval=$with_writecache; WRITECACHE=$withval
+else
+  WRITECACHE="none"
+fi
+
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $WRITECACHE" >&5
+$as_echo "$WRITECACHE" >&6; }
+
+case "$WRITECACHE" in
+ none) ;;
+ internal)
+
+$as_echo "#define WRITECACHE_INTERNAL 1" >>confdefs.h
+
+	;;
+ *) as_fn_error $? "--with-writecache parameter invalid" "$LINENO" 5 ;;
+esac
+
 ################################################################################
 # Check whether --enable-readline was given.
 if test "${enable_readline+set}" = set; then :
@@ -11638,7 +11673,6 @@ fi
 ################################################################################

 if test "$BUILD_LVMDBUSD" = yes; then
-	unset PYTHON PYTHON_CONFIG
 	unset am_cv_pathless_PYTHON ac_cv_path_PYTHON am_cv_python_platform
 	unset am_cv_python_pythondir am_cv_python_version am_cv_python_pyexecdir
 	unset ac_cv_path_PYTHON_CONFIG ac_cv_path_ac_pt_PYTHON_CONFIG
@@ -13560,21 +13594,6 @@ cat >>confdefs.h <<_ACEOF
 _ACEOF


-################################################################################
-
-# Check whether --with-default-data-alignment was given.
-if test "${with_default_data_alignment+set}" = set; then :
-  withval=$with_default_data_alignment; DEFAULT_DATA_ALIGNMENT=$withval
-else
-  DEFAULT_DATA_ALIGNMENT=1
-fi
-
-
-cat >>confdefs.h <<_ACEOF
-#define DEFAULT_DATA_ALIGNMENT $DEFAULT_DATA_ALIGNMENT
-_ACEOF
-
-
 ################################################################################
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for kernel interface choice" >&5
 $as_echo_n "checking for kernel interface choice... " >&6; }
@@ -13765,11 +13784,10 @@ _ACEOF



-


 ################################################################################
-ac_config_files="$ac_config_files Makefile make.tmpl libdm/make.tmpl daemons/Makefile daemons/cmirrord/Makefile daemons/dmeventd/Makefile daemons/dmeventd/libdevmapper-event.pc daemons/dmeventd/plugins/Makefile daemons/dmeventd/plugins/lvm2/Makefile daemons/dmeventd/plugins/raid/Makefile daemons/dmeventd/plugins/mirror/Makefile daemons/dmeventd/plugins/snapshot/Makefile daemons/dmeventd/plugins/thin/Makefile daemons/dmeventd/plugins/vdo/Makefile daemons/lvmdbusd/Makefile daemons/lvmdbusd/lvmdbusd daemons/lvmdbusd/lvmdb.py daemons/lvmdbusd/lvm_shell_proxy.py daemons/lvmdbusd/path.py daemons/lvmpolld/Makefile daemons/lvmlockd/Makefile conf/Makefile conf/example.conf conf/lvmlocal.conf conf/command_profile_template.profile conf/metadata_profile_template.profile include/Makefile lib/Makefile include/lvm-version.h libdaemon/Makefile libdaemon/client/Makefile libdaemon/server/Makefile libdm/Makefile libdm/dm-tools/Makefile libdm/libdevmapper.pc man/Makefile po/Makefile scripts/blkdeactivate.sh scripts/blk_availability_init_red_hat scripts/blk_availability_systemd_red_hat.service scripts/cmirrord_init_red_hat scripts/com.redhat.lvmdbus1.service scripts/dm_event_systemd_red_hat.service scripts/dm_event_systemd_red_hat.socket scripts/lvm2_cmirrord_systemd_red_hat.service scripts/lvm2_lvmdbusd_systemd_red_hat.service scripts/lvm2_lvmpolld_init_red_hat scripts/lvm2_lvmpolld_systemd_red_hat.service scripts/lvm2_lvmpolld_systemd_red_hat.socket scripts/lvmlockd.service scripts/lvmlocks.service scripts/lvm2_monitoring_init_red_hat scripts/lvm2_monitoring_systemd_red_hat.service scripts/lvm2_tmpfiles_red_hat.conf scripts/lvmdump.sh scripts/Makefile test/Makefile tools/Makefile udev/Makefile"
+ac_config_files="$ac_config_files Makefile make.tmpl libdm/make.tmpl daemons/Makefile daemons/cmirrord/Makefile daemons/dmeventd/Makefile daemons/dmeventd/libdevmapper-event.pc daemons/dmeventd/plugins/Makefile daemons/dmeventd/plugins/lvm2/Makefile daemons/dmeventd/plugins/raid/Makefile daemons/dmeventd/plugins/mirror/Makefile daemons/dmeventd/plugins/snapshot/Makefile daemons/dmeventd/plugins/thin/Makefile daemons/dmeventd/plugins/vdo/Makefile daemons/lvmdbusd/Makefile daemons/lvmdbusd/lvmdbusd daemons/lvmdbusd/lvmdb.py daemons/lvmdbusd/lvm_shell_proxy.py daemons/lvmdbusd/path.py daemons/lvmpolld/Makefile daemons/lvmlockd/Makefile conf/Makefile conf/example.conf conf/lvmlocal.conf conf/command_profile_template.profile conf/metadata_profile_template.profile include/Makefile lib/Makefile include/lvm-version.h libdaemon/Makefile libdaemon/client/Makefile libdaemon/server/Makefile libdm/Makefile libdm/dm-tools/Makefile libdm/libdevmapper.pc man/Makefile po/Makefile scripts/lvm2-pvscan.service scripts/blkdeactivate.sh scripts/blk_availability_init_red_hat scripts/blk_availability_systemd_red_hat.service scripts/cmirrord_init_red_hat scripts/com.redhat.lvmdbus1.service scripts/dm_event_systemd_red_hat.service scripts/dm_event_systemd_red_hat.socket scripts/lvm2_cmirrord_systemd_red_hat.service scripts/lvm2_lvmdbusd_systemd_red_hat.service scripts/lvm2_lvmpolld_init_red_hat scripts/lvm2_lvmpolld_systemd_red_hat.service scripts/lvm2_lvmpolld_systemd_red_hat.socket scripts/lvmlockd.service scripts/lvmlocks.service scripts/lvm2_monitoring_init_red_hat scripts/lvm2_monitoring_systemd_red_hat.service scripts/lvm2_tmpfiles_red_hat.conf scripts/lvmdump.sh scripts/Makefile test/Makefile tools/Makefile udev/Makefile"

 cat >confcache <<\_ACEOF
 # This file is a shell script that caches the results of configure
@@ -14501,6 +14519,7 @@ do
    "libdm/libdevmapper.pc") CONFIG_FILES="$CONFIG_FILES libdm/libdevmapper.pc" ;;
    "man/Makefile") CONFIG_FILES="$CONFIG_FILES man/Makefile" ;;
    "po/Makefile") CONFIG_FILES="$CONFIG_FILES po/Makefile" ;;
+    "scripts/lvm2-pvscan.service") CONFIG_FILES="$CONFIG_FILES scripts/lvm2-pvscan.service" ;;
    "scripts/blkdeactivate.sh") CONFIG_FILES="$CONFIG_FILES scripts/blkdeactivate.sh" ;;
    "scripts/blk_availability_init_red_hat") CONFIG_FILES="$CONFIG_FILES scripts/blk_availability_init_red_hat" ;;
    "scripts/blk_availability_systemd_red_hat.service") CONFIG_FILES="$CONFIG_FILES scripts/blk_availability_systemd_red_hat.service" ;;
--- a/configure.ac
+++ b/configure.ac
@@ -30,7 +30,7 @@ AC_CANONICAL_TARGET([])
 AS_IF([test -z "$CFLAGS"], [COPTIMISE_FLAG="-O2"])
 case "$host_os" in
 	linux*)
-		CLDFLAGS="$CLDFLAGS -Wl,--version-script,.export.sym"
+		CLDFLAGS="${CLDFLAGS-"$LDFLAGS"} -Wl,--version-script,.export.sym"
 		# equivalent to -rdynamic
 		ELDFLAGS="-Wl,--export-dynamic"
 		# FIXME Generate list and use --dynamic-list=.dlopen.sym
@@ -50,7 +50,7 @@ case "$host_os" in
 		;;
 	darwin*)
 		CFLAGS="$CFLAGS -no-cpp-precomp -fno-common"
-		CLDFLAGS="$CLDFLAGS"
+		CLDFLAGS="${CLDFLAGS-"$LDFLAGS"}"
 		ELDFLAGS=
 		CLDWHOLEARCHIVE="-all_load"
 		CLDNOWHOLEARCHIVE=
@@ -62,6 +62,9 @@ case "$host_os" in
 		FSADM=no
 		BLKDEACTIVATE=no
 		;;
+	*)
+		CLDFLAGS="${CLDFLAGS-"$LDFLAGS"}"
+		;;
 esac

 ################################################################################
@@ -141,6 +144,11 @@ AC_TYPE_UINT64_T
 AX_GCC_BUILTIN([__builtin_clz])
 AX_GCC_BUILTIN([__builtin_clzll])

+
+AC_DEFINE([_GNU_SOURCE], 1, [Define to get access to GNU/Linux extension])
+AC_DEFINE([_REENTRANT], 1, [Define to use re-entrant thread safe versions])
+
+
 ################################################################################
 dnl -- Check for functions
 AC_CHECK_FUNCS([ftruncate gethostname getpagesize gettimeofday localtime_r \
@@ -639,6 +647,24 @@ AC_DEFINE_UNQUOTED([VDO_FORMAT_CMD], ["$VDO_FORMAT_CMD"],
 #                           VDO_LIB=$withval, VDO_LIB="/usr/lib") 
 #AC_MSG_RESULT($VDO_LIB)

+################################################################################
+dnl -- writecache inclusion type
+AC_MSG_CHECKING(whether to include writecache)
+AC_ARG_WITH(writecache,
+	    AC_HELP_STRING([--with-writecache=TYPE],
+			   [writecache support: internal/none [none]]),
+			   WRITECACHE=$withval, WRITECACHE="none")
+
+AC_MSG_RESULT($WRITECACHE)
+
+case "$WRITECACHE" in
+ none) ;;
+ internal) 
+	AC_DEFINE([WRITECACHE_INTERNAL], 1, [Define to 1 to include built-in support for writecache.])
+	;;
+ *) AC_MSG_ERROR([--with-writecache parameter invalid]) ;;
+esac
+
 ################################################################################
 dnl -- Disable readline
 AC_ARG_ENABLE([readline],
@@ -1151,7 +1177,6 @@ AS_IF([test "$NOTIFYDBUS_SUPPORT" = yes && test "BUILD_LVMDBUSD" = yes],
 dnl -- Enable Python dbus library

 if test "$BUILD_LVMDBUSD" = yes; then
-	unset PYTHON PYTHON_CONFIG
 	unset am_cv_pathless_PYTHON ac_cv_path_PYTHON am_cv_python_platform
 	unset am_cv_python_pythondir am_cv_python_version am_cv_python_pyexecdir
 	unset ac_cv_path_PYTHON_CONFIG ac_cv_path_ac_pt_PYTHON_CONFIG
@@ -1577,15 +1602,6 @@ AC_ARG_WITH(default-locking-dir,
 AC_DEFINE_UNQUOTED(DEFAULT_LOCK_DIR, ["$DEFAULT_LOCK_DIR"],
 		   [Name of default locking directory.])

-################################################################################
-dnl -- Setup default data alignment
-AC_ARG_WITH(default-data-alignment,
-	    AC_HELP_STRING([--with-default-data-alignment=NUM],
-			   [set the default data alignment in MiB [1]]),
-	    DEFAULT_DATA_ALIGNMENT=$withval, DEFAULT_DATA_ALIGNMENT=1)
-AC_DEFINE_UNQUOTED(DEFAULT_DATA_ALIGNMENT, [$DEFAULT_DATA_ALIGNMENT],
-		   [Default data alignment.])
-
 ################################################################################
 dnl -- which kernel interface to use (ioctl only)
 AC_MSG_CHECKING(for kernel interface choice)
@@ -1646,7 +1662,6 @@ AC_SUBST(DEBUG)
 AC_SUBST(DEFAULT_ARCHIVE_SUBDIR)
 AC_SUBST(DEFAULT_BACKUP_SUBDIR)
 AC_SUBST(DEFAULT_CACHE_SUBDIR)
-AC_SUBST(DEFAULT_DATA_ALIGNMENT)
 AC_SUBST(DEFAULT_DM_RUN_DIR)
 AC_SUBST(DEFAULT_LOCK_DIR)
 AC_SUBST(DEFAULT_MIRROR_SEGTYPE)
@@ -1807,6 +1822,7 @@ libdm/dm-tools/Makefile
 libdm/libdevmapper.pc
 man/Makefile
 po/Makefile
+scripts/lvm2-pvscan.service
 scripts/blkdeactivate.sh
 scripts/blk_availability_init_red_hat
 scripts/blk_availability_systemd_red_hat.service
--- a/daemons/cmirrord/Makefile.in
+++ b/daemons/cmirrord/Makefile.in
@@ -28,9 +28,11 @@ LMLIBS += $(CPG_LIBS)
 CFLAGS += $(CPG_CFLAGS) $(EXTRA_EXEC_CFLAGS)
 LDFLAGS += $(EXTRA_EXEC_LDFLAGS) $(ELDFLAGS)

-cmirrord: $(OBJECTS) $(top_builddir)/lib/liblvm-internal.a
-	$(CC) $(CFLAGS) $(LDFLAGS) -o $@ $(OBJECTS) \
-		$(LVMLIBS) $(LMLIBS) $(INTERNAL_LIBS) $(LIBS)
+cmirrord: $(OBJECTS)
+	@echo "    [CC] $@"
+	$(Q) $(CC) $(CFLAGS) $(LDFLAGS) -o $@ $(OBJECTS) \
+		$(LMLIBS) -L$(top_builddir)/libdm -ldevmapper $(LIBS)

 install: $(TARGETS)
-	$(INSTALL_PROGRAM) -D cmirrord $(usrsbindir)/cmirrord
+	@echo "    [INSTALL] $<"
+	$(Q) $(INSTALL_PROGRAM) -D cmirrord $(usrsbindir)/cmirrord
--- a/daemons/cmirrord/cluster.c
+++ b/daemons/cmirrord/cluster.c
@@ -17,6 +17,7 @@
 #include "link_mon.h"
 #include "local.h"
 #include "lib/mm/xlate.h"
+#include "base/memory/zalloc.h"

 /* FIXME: remove this and the code */
 #define CMIRROR_HAS_CHECKPOINT 0
@@ -402,13 +403,12 @@ static struct checkpoint_data *prepare_checkpoint(struct clog_cpg *entry,
 		return NULL;
 	}

-	new = malloc(sizeof(*new));
+	new = zalloc(sizeof(*new));
 	if (!new) {
 		LOG_ERROR("Unable to create checkpoint data for %u",
 			  cp_requester);
 		return NULL;
 	}
-	memset(new, 0, sizeof(*new));
 	new->requester = cp_requester;
 	strncpy(new->uuid, entry->name.value, entry->name.length);

@@ -643,13 +643,12 @@ static int export_checkpoint(struct checkpoint_data *cp)
 	rq_size += RECOVERING_REGION_SECTION_SIZE;
 	rq_size += cp->bitmap_size * 2; /* clean|sync_bits */

-	rq = malloc(rq_size);
+	rq = zalloc(rq_size);
 	if (!rq) {
 		LOG_ERROR("export_checkpoint: "
 			  "Unable to allocate transfer structs");
 		return -ENOMEM;
 	}
-	memset(rq, 0, rq_size);

 	dm_list_init(&rq->u.list);
 	rq->u_rq.request_type = DM_ULOG_CHECKPOINT_READY;
@@ -1621,12 +1620,11 @@ int create_cluster_cpg(char *uuid, uint64_t luid)
 			return -EEXIST;
 		}

-	new = malloc(sizeof(*new));
+	new = zalloc(sizeof(*new));
 	if (!new) {
 		LOG_ERROR("Unable to allocate memory for clog_cpg");
 		return -ENOMEM;
 	}
-	memset(new, 0, sizeof(*new));
 	dm_list_init(&new->list);
 	new->lowest_id = 0xDEAD;
 	dm_list_init(&new->startup_list);
--- a/daemons/cmirrord/cluster.h
+++ b/daemons/cmirrord/cluster.h
@@ -12,8 +12,8 @@
 #ifndef _LVM_CLOG_CLUSTER_H
 #define _LVM_CLOG_CLUSTER_H

-#include "device_mapper/misc/dm-log-userspace.h"
-#include "device_mapper/all.h"
+#include "libdm/misc/dm-log-userspace.h"
+#include "libdm/libdevmapper.h"

 #define DM_ULOG_RESPONSE 0x1000U /* in last byte of 32-bit value */
 #define DM_ULOG_CHECKPOINT_READY 21
--- a/daemons/cmirrord/logging.h
+++ b/daemons/cmirrord/logging.h
@@ -13,9 +13,6 @@
 #ifndef _LVM_CLOG_LOGGING_H
 #define _LVM_CLOG_LOGGING_H

-#define _GNU_SOURCE
-
-#include "configure.h"
 #include <stdio.h>
 #include <stdint.h>
 #include <syslog.h>
--- a/daemons/dmeventd/Makefile.in
+++ b/daemons/dmeventd/Makefile.in
@@ -57,14 +57,16 @@ all: device-mapper
 device-mapper: $(TARGETS)

 CFLAGS_dmeventd.o += $(EXTRA_EXEC_CFLAGS)
-LIBS += $(PTHREAD_LIBS)
+LIBS += $(PTHREAD_LIBS) -L$(top_builddir)/libdm -ldevmapper

 dmeventd: $(LIB_SHARED) dmeventd.o
-	$(CC) $(CFLAGS) -L. $(LDFLAGS) $(EXTRA_EXEC_LDFLAGS) $(ELDFLAGS) dmeventd.o \
-		-o $@ $(DL_LIBS) $(DMEVENT_LIBS) $(INTERNAL_LIBS) $(LIBS) -lm
+	@echo "    [CC] $@"
+	$(Q) $(CC) $(CFLAGS) -L. $(LDFLAGS) $(EXTRA_EXEC_LDFLAGS) $(ELDFLAGS) dmeventd.o \
+		-o $@ $(DL_LIBS) $(DMEVENT_LIBS) $(LIBS) -lm

 dmeventd.static: $(LIB_STATIC) dmeventd.o
-	$(CC) $(CFLAGS) $(LDFLAGS) -static -L. -L$(interfacebuilddir) dmeventd.o \
+	@echo "    [CC] $@"
+	$(Q) $(CC) $(CFLAGS) $(LDFLAGS) -static -L. -L$(interfacebuilddir) dmeventd.o \
 		-o $@ $(DL_LIBS) $(DMEVENT_LIBS) $(LIBS) $(STATIC_LIBS)

 ifeq ("@PKGCONFIG@", "yes")
@@ -80,23 +82,28 @@ CFLOW_SOURCES = $(addprefix $(srcdir)/, $(SOURCES))
 endif

 install_include: $(srcdir)/libdevmapper-event.h
-	$(INSTALL_DATA) -D $< $(includedir)/$(<F)
+	@echo "    [INSTALL] $<"
+	$(Q) $(INSTALL_DATA) -D $< $(includedir)/$(<F)

 install_pkgconfig: libdevmapper-event.pc
-	$(INSTALL_DATA) -D $< $(pkgconfigdir)/devmapper-event.pc
+	@echo "    [INSTALL] $<"
+	$(Q) $(INSTALL_DATA) -D $< $(pkgconfigdir)/devmapper-event.pc

 install_lib_dynamic: install_lib_shared

 install_lib_static: $(LIB_STATIC)
-	$(INSTALL_DATA) -D $< $(usrlibdir)/$(<F)
+	@echo "    [INSTALL] $<"
+	$(Q) $(INSTALL_DATA) -D $< $(usrlibdir)/$(<F)

 install_lib: $(INSTALL_LIB_TARGETS)

 install_dmeventd_dynamic: dmeventd
-	$(INSTALL_PROGRAM) -D $< $(sbindir)/$(<F)
+	@echo "    [INSTALL] $<"
+	$(Q) $(INSTALL_PROGRAM) -D $< $(sbindir)/$(<F)

 install_dmeventd_static: dmeventd.static
-	$(INSTALL_PROGRAM) -D $< $(staticdir)/$(<F)
+	@echo "    [INSTALL] $<"
+	$(Q) $(INSTALL_PROGRAM) -D $< $(staticdir)/$(<F)

 install_dmeventd: $(INSTALL_DMEVENTD_TARGETS)

--- a/daemons/dmeventd/dmeventd.c
+++ b/daemons/dmeventd/dmeventd.c
@@ -16,14 +16,12 @@
 * dmeventd - dm event daemon to monitor active mapped devices
 */

-#include "device_mapper/misc/dmlib.h"
-#include "base/memory/zalloc.h"
-#include "device_mapper/misc/dm-logging.h"

-#include "daemons/dmeventd/libdevmapper-event.h"
+#include "libdevmapper-event.h"
 #include "dmeventd.h"

-#include "tools/tool.h"
+#include "libdm/misc/dm-logging.h"
+#include "base/memory/zalloc.h"

 #include <dlfcn.h>
 #include <pthread.h>
@@ -35,6 +33,8 @@
 #include <signal.h>
 #include <arpa/inet.h>		/* for htonl, ntohl */
 #include <fcntl.h>		/* for musl libc */
+#include <unistd.h>
+#include <syslog.h>

 #ifdef __linux__
 /*
@@ -62,8 +62,6 @@

 #endif

-#include <syslog.h>
-
 #define DM_SIGNALED_EXIT  1
 #define DM_SCHEDULED_EXIT 2
 static volatile sig_atomic_t _exit_now = 0;	/* set to '1' when signal is given to exit */
--- a/daemons/dmeventd/libdevmapper-event.c
+++ b/daemons/dmeventd/libdevmapper-event.c
@@ -12,11 +12,11 @@
 * Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 */

-#include "device_mapper/misc/dmlib.h"
-#include "base/memory/zalloc.h"
-#include "device_mapper/misc/dm-logging.h"
-#include "daemons/dmeventd/libdevmapper-event.h"
+#include "libdevmapper-event.h"
 #include "dmeventd.h"
+#include "libdm/misc/dm-logging.h"
+#include "base/memory/zalloc.h"
+
 #include "lib/misc/intl.h"

 #include <fcntl.h>
--- a/daemons/dmeventd/libdevmapper-event.h
+++ b/daemons/dmeventd/libdevmapper-event.h
@@ -21,6 +21,7 @@
 #ifndef LIB_DMEVENT_H
 #define LIB_DMEVENT_H

+#include <stdarg.h>
 #include <stdint.h>

 /*
--- a/daemons/dmeventd/plugins/lvm2/Makefile.in
+++ b/daemons/dmeventd/plugins/lvm2/Makefile.in
@@ -16,6 +16,7 @@ top_srcdir = @top_srcdir@
 top_builddir = @top_builddir@

 CLDFLAGS += -L$(top_builddir)/tools
+LIBS += $(DMEVENT_LIBS) $(PTHREAD_LIBS) @LVM2CMD_LIB@

 SOURCES = dmeventd_lvm.c

@@ -24,8 +25,6 @@ LIB_VERSION = $(LIB_VERSION_LVM)

 include $(top_builddir)/make.tmpl

-LIBS += @LVM2CMD_LIB@ $(INTERNAL_LIBS) $(PTHREAD_LIBS)
-
 install_lvm2: install_lib_shared

 install: install_lvm2
--- a/daemons/dmeventd/plugins/lvm2/dmeventd_lvm.c
+++ b/daemons/dmeventd/plugins/lvm2/dmeventd_lvm.c
@@ -159,6 +159,7 @@ int dmeventd_lvm2_command(struct dm_pool *mem, char *buffer, size_t size,
 			dmeventd_lvm2_lock();
 			if (!dmeventd_lvm2_run(cmd) ||
 			    !(env = getenv(cmd))) {
+				dmeventd_lvm2_unlock();
 				log_error("Unable to find configured command.");
 				return 0;
 			}
--- a/daemons/dmeventd/plugins/mirror/Makefile.in
+++ b/daemons/dmeventd/plugins/mirror/Makefile.in
@@ -16,8 +16,8 @@ srcdir = @srcdir@
 top_srcdir = @top_srcdir@
 top_builddir = @top_builddir@

-INCLUDES += -I$(top_srcdir)/daemons/dmeventd/plugins/lvm2
 CLDFLAGS += -L$(top_builddir)/daemons/dmeventd/plugins/lvm2
+LIBS += -ldevmapper-event-lvm2

 SOURCES = dmeventd_mirror.c

@@ -30,8 +30,6 @@ CFLOW_LIST_TARGET = $(LIB_NAME).cflow

 include $(top_builddir)/make.tmpl

-LIBS += -ldevmapper-event-lvm2 $(INTERNAL_LIBS)
-
 install_lvm2: install_dm_plugin

 install: install_lvm2
--- a/daemons/dmeventd/plugins/mirror/dmeventd_mirror.c
+++ b/daemons/dmeventd/plugins/mirror/dmeventd_mirror.c
@@ -13,8 +13,8 @@
 */

 #include "lib/misc/lib.h"
+#include "daemons/dmeventd/plugins/lvm2/dmeventd_lvm.h"
 #include "daemons/dmeventd/libdevmapper-event.h"
-#include "dmeventd_lvm.h"
 #include "lib/activate/activate.h"

 /* FIXME Reformat to 80 char lines. */
--- a/daemons/dmeventd/plugins/raid/Makefile.in
+++ b/daemons/dmeventd/plugins/raid/Makefile.in
@@ -15,8 +15,8 @@ srcdir = @srcdir@
 top_srcdir = @top_srcdir@
 top_builddir = @top_builddir@

-INCLUDES += -I$(top_srcdir)/daemons/dmeventd/plugins/lvm2
 CLDFLAGS += -L$(top_builddir)/daemons/dmeventd/plugins/lvm2
+LIBS += -ldevmapper-event-lvm2

 SOURCES = dmeventd_raid.c

@@ -29,8 +29,6 @@ CFLOW_LIST_TARGET = $(LIB_NAME).cflow

 include $(top_builddir)/make.tmpl

-LIBS += -ldevmapper-event-lvm2 $(INTERNAL_LIBS)
-
 install_lvm2: install_dm_plugin

 install: install_lvm2
--- a/daemons/dmeventd/plugins/raid/dmeventd_raid.c
+++ b/daemons/dmeventd/plugins/raid/dmeventd_raid.c
@@ -13,9 +13,9 @@
 */

 #include "lib/misc/lib.h"
-#include "lib/config/defaults.h"
-#include "dmeventd_lvm.h"
+#include "daemons/dmeventd/plugins/lvm2/dmeventd_lvm.h"
 #include "daemons/dmeventd/libdevmapper-event.h"
+#include "lib/config/defaults.h"

 /* Hold enough elements for the mximum number of RAID images */
 #define	RAID_DEVS_ELEMS	((DEFAULT_RAID_MAX_IMAGES + 63) / 64)
--- a/daemons/dmeventd/plugins/snapshot/Makefile.in
+++ b/daemons/dmeventd/plugins/snapshot/Makefile.in
@@ -16,8 +16,8 @@ srcdir = @srcdir@
 top_srcdir = @top_srcdir@
 top_builddir = @top_builddir@

-INCLUDES += -I$(top_srcdir)/daemons/dmeventd/plugins/lvm2
 CLDFLAGS += -L$(top_builddir)/daemons/dmeventd/plugins/lvm2
+LIBS += -ldevmapper-event-lvm2

 SOURCES = dmeventd_snapshot.c

@@ -26,8 +26,6 @@ LIB_VERSION = $(LIB_VERSION_LVM)

 include $(top_builddir)/make.tmpl

-LIBS += -ldevmapper-event-lvm2 $(INTERNAL_LIBS)
-
 install_lvm2: install_dm_plugin

 install: install_lvm2
--- a/daemons/dmeventd/plugins/snapshot/dmeventd_snapshot.c
+++ b/daemons/dmeventd/plugins/snapshot/dmeventd_snapshot.c
@@ -13,7 +13,7 @@
 */

 #include "lib/misc/lib.h"
-#include "dmeventd_lvm.h"
+#include "daemons/dmeventd/plugins/lvm2/dmeventd_lvm.h"
 #include "daemons/dmeventd/libdevmapper-event.h"

 #include <sys/sysmacros.h>
@@ -175,6 +175,7 @@ void process_event(struct dm_task *dmt,
 	const char *device = dm_task_get_name(dmt);
 	int percent;
 	struct dm_info info;
+	int ret;

 	/* No longer monitoring, waiting for remove */
 	if (!state->percent_check)
@@ -205,7 +206,8 @@ void process_event(struct dm_task *dmt,
 		/* Maybe configurable ? */
 		_remove(dm_task_get_uuid(dmt));
 #endif
-		pthread_kill(pthread_self(), SIGALRM);
+		if ((ret = pthread_kill(pthread_self(), SIGALRM)) && (ret != ESRCH))
+			log_sys_error("pthread_kill", "self");
 		goto out;
 	}

@@ -213,7 +215,8 @@ void process_event(struct dm_task *dmt,
 		/* TODO eventually recognize earlier when room is enough */
 		log_info("Dropping monitoring of fully provisioned snapshot %s.",
 			 device);
-		pthread_kill(pthread_self(), SIGALRM);
+		if ((ret = pthread_kill(pthread_self(), SIGALRM)) && (ret != ESRCH))
+			log_sys_error("pthread_kill", "self");
 		goto out;
 	}

--- a/daemons/dmeventd/plugins/thin/Makefile.in
+++ b/daemons/dmeventd/plugins/thin/Makefile.in
@@ -15,8 +15,8 @@ srcdir = @srcdir@
 top_srcdir = @top_srcdir@
 top_builddir = @top_builddir@

-INCLUDES += -I$(top_srcdir)/daemons/dmeventd/plugins/lvm2
 CLDFLAGS += -L$(top_builddir)/daemons/dmeventd/plugins/lvm2
+LIBS += -ldevmapper-event-lvm2

 SOURCES = dmeventd_thin.c

@@ -29,8 +29,6 @@ CFLOW_LIST_TARGET = $(LIB_NAME).cflow

 include $(top_builddir)/make.tmpl

-LIBS += -ldevmapper-event-lvm2 $(INTERNAL_LIBS)
-
 install_lvm2: install_dm_plugin

 install: install_lvm2
--- a/daemons/dmeventd/plugins/thin/dmeventd_thin.c
+++ b/daemons/dmeventd/plugins/thin/dmeventd_thin.c
@@ -13,7 +13,7 @@
 */

 #include "lib/misc/lib.h"
-#include "dmeventd_lvm.h"
+#include "daemons/dmeventd/plugins/lvm2/dmeventd_lvm.h"
 #include "daemons/dmeventd/libdevmapper-event.h"

 #include <sys/wait.h>
@@ -286,7 +286,7 @@ void process_event(struct dm_task *dmt,
 		if (state->fails++ <= state->max_fails) {
 			log_debug("Postponing frequently failing policy (%u <= %u).",
 				  state->fails - 1, state->max_fails);
-			return;
+			goto out;
 		}
 		if (state->max_fails < MAX_FAILS)
 			state->max_fails <<= 1;
--- a/daemons/dmeventd/plugins/vdo/Makefile.in
+++ b/daemons/dmeventd/plugins/vdo/Makefile.in
@@ -15,8 +15,8 @@ srcdir = @srcdir@
 top_srcdir = @top_srcdir@
 top_builddir = @top_builddir@

-INCLUDES += -I$(top_srcdir)/daemons/dmeventd/plugins/lvm2
 CLDFLAGS += -L$(top_builddir)/daemons/dmeventd/plugins/lvm2
+LIBS += -ldevmapper-event-lvm2

 SOURCES = dmeventd_vdo.c

@@ -29,8 +29,6 @@ CFLOW_LIST_TARGET = $(LIB_NAME).cflow

 include $(top_builddir)/make.tmpl

-LIBS += -ldevmapper-event-lvm2 $(INTERNAL_LIBS)
-
 install_lvm2: install_dm_plugin

 install: install_lvm2
--- a/daemons/dmeventd/plugins/vdo/dmeventd_vdo.c
+++ b/daemons/dmeventd/plugins/vdo/dmeventd_vdo.c
@@ -13,9 +13,11 @@
 */

 #include "lib/misc/lib.h"
-#include "dmeventd_lvm.h"
+#include "daemons/dmeventd/plugins/lvm2/dmeventd_lvm.h"
 #include "daemons/dmeventd/libdevmapper-event.h"
-#include "device_mapper/vdo/target.h"
+
+/* Use parser from new device_mapper library */
+#include "device_mapper/vdo/status.c"

 #include <sys/wait.h>
 #include <stdarg.h>
@@ -245,7 +247,7 @@ void process_event(struct dm_task *dmt,
 		if (state->fails++ <= state->max_fails) {
 			log_debug("Postponing frequently failing policy (%u <= %u).",
 				  state->fails - 1, state->max_fails);
-			return;
+			goto out;
 		}
 		if (state->max_fails < MAX_FAILS)
 			state->max_fails <<= 1;
@@ -253,8 +255,7 @@ void process_event(struct dm_task *dmt,
 	} else
 		state->max_fails = 1; /* Reset on success */

-	/* FIXME: ATM nothing can be done, drop 0, once it becomes useful */
-	if (0 && needs_policy)
+	if (needs_policy)
 		_use_policy(dmt, state);
 out:
 	if (vdop.status)
--- a/daemons/lvmdbusd/Makefile.in
+++ b/daemons/lvmdbusd/Makefile.in
@@ -23,11 +23,10 @@ LVMDBUS_SRCDIR_FILES = \
 	cfg.py \
 	cmdhandler.py \
 	fetch.py \
-	__init__.py \
 	job.py \
 	loader.py \
-	main.py \
 	lv.py \
+	main.py \
 	manager.py \
 	objectmanager.py \
 	pv.py \
@@ -35,7 +34,8 @@ LVMDBUS_SRCDIR_FILES = \
 	state.py \
 	udevwatch.py \
 	utils.py \
-	vg.py
+	vg.py \
+	__init__.py

 LVMDBUS_BUILDDIR_FILES = \
 	lvmdb.py \
@@ -51,17 +51,18 @@ include $(top_builddir)/make.tmpl
 .PHONY: install_lvmdbusd

 all:
-	test -x $(LVMDBUSD) || chmod 755 $(LVMDBUSD)
+	$(Q) test -x $(LVMDBUSD) || chmod 755 $(LVMDBUSD)

 install_lvmdbusd:
-	$(INSTALL_DIR) $(sbindir)
-	$(INSTALL_SCRIPT) $(LVMDBUSD) $(sbindir)
-	$(INSTALL_DIR) $(DESTDIR)$(lvmdbusdir)
-	(cd $(srcdir); $(INSTALL_DATA) $(LVMDBUS_SRCDIR_FILES) $(DESTDIR)$(lvmdbusdir))
-	$(INSTALL_DATA) $(LVMDBUS_BUILDDIR_FILES) $(DESTDIR)$(lvmdbusdir)
-	PYTHON=$(PYTHON3) $(PYCOMPILE) --destdir "$(DESTDIR)" --basedir "$(lvmdbusdir)" $(LVMDBUS_SRCDIR_FILES) $(LVMDBUS_BUILDDIR_FILES)
-	$(CHMOD) 755 $(DESTDIR)$(lvmdbusdir)/__pycache__
-	$(CHMOD) 444 $(DESTDIR)$(lvmdbusdir)/__pycache__/*.py[co]
+	@echo "    [INSTALL] $(LVMDBUSD)"
+	$(Q) $(INSTALL_DIR) $(sbindir)
+	$(Q) $(INSTALL_SCRIPT) $(LVMDBUSD) $(sbindir)
+	$(Q) $(INSTALL_DIR) $(DESTDIR)$(lvmdbusdir)
+	$(Q) (cd $(srcdir); $(INSTALL_DATA) $(LVMDBUS_SRCDIR_FILES) $(DESTDIR)$(lvmdbusdir))
+	$(Q) $(INSTALL_DATA) $(LVMDBUS_BUILDDIR_FILES) $(DESTDIR)$(lvmdbusdir)
+	$(Q) PYTHON=$(PYTHON3) $(PYCOMPILE) --destdir "$(DESTDIR)" --basedir "$(lvmdbusdir)" $(LVMDBUS_SRCDIR_FILES) $(LVMDBUS_BUILDDIR_FILES)
+	$(Q) $(CHMOD) 755 $(DESTDIR)$(lvmdbusdir)/__pycache__
+	$(Q) $(CHMOD) 444 $(DESTDIR)$(lvmdbusdir)/__pycache__/*.py[co]

 install_lvm2: install_lvmdbusd

--- a/daemons/lvmdbusd/automatedproperties.py
+++ b/daemons/lvmdbusd/automatedproperties.py
@@ -155,7 +155,7 @@ class AutomatedProperties(dbus.service.Object):
 		# through all dbus objects as some don't have a search method, like
 		# 'Manager' object.
 		if not self._ap_search_method:
-			return
+			return 0

 		search = self.lvm_id
 		if search_key:
--- a/daemons/lvmdbusd/cfg.py
+++ b/daemons/lvmdbusd/cfg.py
@@ -87,3 +87,13 @@ blackbox = None

 # RequestEntry ctor
 create_request_entry = None
+
+
+def exit_daemon():
+    """
+    Exit the daemon cleanly
+    :return:
+    """
+    if run and loop:
+        run.value = 0
+        loop.quit()
--- a/daemons/lvmdbusd/cmdhandler.py
+++ b/daemons/lvmdbusd/cmdhandler.py
@@ -67,7 +67,7 @@ class LvmFlightRecorder(object):
 		with cmd_lock:
 			if len(self.queue):
 				log_error("LVM dbus flight recorder START")
-				for c in self.queue:
+				for c in reversed(self.queue):
 					log_error(str(c))
 				log_error("LVM dbus flight recorder END")

@@ -263,10 +263,10 @@ def lv_tag(lv_name, add, rm, tag_options):
 	return _tag('lvchange', lv_name, add, rm, tag_options)


-def vg_rename(vg, new_name, rename_options):
+def vg_rename(vg_uuid, new_name, rename_options):
 	cmd = ['vgrename']
 	cmd.extend(options_to_cli_args(rename_options))
-	cmd.extend([vg, new_name])
+	cmd.extend([vg_uuid, new_name])
 	return call(cmd)


@@ -497,7 +497,8 @@ def lvm_full_report_json():
 	])

 	rc, out, err = call(cmd)
-	if rc == 0:
+	# When we have an exported vg the exit code of lvs or fullreport will be 5
+	if rc == 0 or rc == 5:
 		# With the current implementation, if we are using the shell then we
 		# are using JSON and JSON is returned back to us as it was parsed to
 		# figure out if we completed OK or not
--- a/daemons/lvmdbusd/fetch.py
+++ b/daemons/lvmdbusd/fetch.py
@@ -14,6 +14,7 @@ from . import cfg
 from .utils import MThreadRunner, log_debug, log_error
 import threading
 import queue
+import time
 import traceback


@@ -82,6 +83,8 @@ class StateUpdate(object):

 	@staticmethod
 	def update_thread(obj):
+		exception_count = 0
+
 		queued_requests = []
 		while cfg.run.value != 0:
 			# noinspection PyBroadException
@@ -136,12 +139,26 @@ class StateUpdate(object):
 				# wake up if we get an exception
 				queued_requests = []

+				# We retrieved OK, clear exception count
+				exception_count = 0
+
 			except queue.Empty:
 				pass
-			except Exception:
+			except Exception as e:
 				st = traceback.format_exc()
 				log_error("update_thread exception: \n%s" % st)
 				cfg.blackbox.dump()
+				exception_count += 1
+				if exception_count >= 5:
+					for i in queued_requests:
+						i.set_result(e)
+
+					log_error("Too many errors in update_thread, exiting daemon")
+					cfg.exit_daemon()
+
+				else:
+					# Slow things down when encountering errors
+					time.sleep(1)

 	def __init__(self):
 		self.lock = threading.RLock()
--- a/daemons/lvmdbusd/lv.py
+++ b/daemons/lvmdbusd/lv.py
@@ -10,7 +10,7 @@
 from .automatedproperties import AutomatedProperties

 from . import utils
-from .utils import vg_obj_path_generate
+from .utils import vg_obj_path_generate, log_error
 import dbus
 from . import cmdhandler
 from . import cfg
@@ -24,6 +24,8 @@ from . import background
 from .utils import round_size, mt_remove_dbus_objects
 from .job import JobState

+import traceback
+

 # Try and build a key for a LV, so that we sort the LVs with least dependencies
 # first.  This may be error prone because of the flexibility LVM
@@ -291,6 +293,22 @@ class LvCommon(AutomatedProperties):
 				(lv_uuid, lv_name))
 		return dbo

+	def attr_struct(self, index, type_map, default='undisclosed'):
+		try:
+			if self.state.Attr[index] not in type_map:
+				log_error("LV %s %s with lv_attr %s, lv_attr[%d] = "
+					"'%s' is not known" %
+					(self.Uuid, self.Name, self.Attr, index,
+					self.state.Attr[index]))
+
+			return dbus.Struct((self.state.Attr[index],
+				type_map.get(self.state.Attr[index], default)),
+								signature="(ss)")
+		except BaseException:
+			st = traceback.format_exc()
+			log_error("attr_struct: \n%s" % st)
+			return dbus.Struct(('?', 'Unavailable'), signature="(ss)")
+
 	@property
 	def VolumeType(self):
 		type_map = {'C': 'Cache', 'm': 'mirrored',
@@ -304,16 +322,14 @@ class LvCommon(AutomatedProperties):
 					'V': 'thin Volume', 't': 'thin pool', 'T': 'Thin pool data',
 					'e': 'raid or pool metadata or pool metadata spare',
 					'-': 'Unspecified'}
-		return dbus.Struct((self.state.Attr[0], type_map[self.state.Attr[0]]),
-						signature="as")
+		return self.attr_struct(0, type_map)

 	@property
 	def Permissions(self):
 		type_map = {'w': 'writable', 'r': 'read-only',
 					'R': 'Read-only activation of non-read-only volume',
 					'-': 'Unspecified'}
-		return dbus.Struct((self.state.Attr[1], type_map[self.state.Attr[1]]),
-						signature="(ss)")
+		return self.attr_struct(1, type_map)

 	@property
 	def AllocationPolicy(self):
@@ -322,8 +338,7 @@ class LvCommon(AutomatedProperties):
 					'i': 'inherited', 'I': 'inherited locked',
 					'l': 'cling', 'L': 'cling locked',
 					'n': 'normal', 'N': 'normal locked', '-': 'Unspecified'}
-		return dbus.Struct((self.state.Attr[2], type_map[self.state.Attr[2]]),
-						signature="(ss)")
+		return self.attr_struct(2, type_map)

 	@property
 	def FixedMinor(self):
@@ -331,15 +346,20 @@ class LvCommon(AutomatedProperties):

 	@property
 	def State(self):
-		type_map = {'a': 'active', 's': 'suspended', 'I': 'Invalid snapshot',
+		type_map = {'a': 'active',
+					's': 'suspended',
+					'I': 'Invalid snapshot',
 					'S': 'invalid Suspended snapshot',
 					'm': 'snapshot merge failed',
 					'M': 'suspended snapshot (M)erge failed',
 					'd': 'mapped device present without  tables',
 					'i': 'mapped device present with inactive table',
-					'X': 'unknown', '-': 'Unspecified'}
-		return dbus.Struct((self.state.Attr[4], type_map[self.state.Attr[4]]),
-						signature="(ss)")
+					'h': 'historical',
+					'c': 'check needed suspended thin-pool',
+					'C': 'check needed',
+					'X': 'unknown',
+					'-': 'Unspecified'}
+		return self.attr_struct(4, type_map)

 	@property
 	def TargetType(self):
@@ -355,11 +375,18 @@ class LvCommon(AutomatedProperties):

 	@property
 	def Health(self):
-		type_map = {'p': 'partial', 'r': 'refresh',
-					'm': 'mismatches', 'w': 'writemostly',
-					'X': 'X unknown', '-': 'Unspecified'}
-		return dbus.Struct((self.state.Attr[8], type_map[self.state.Attr[8]]),
-					signature="(ss)")
+		type_map = {'p': 'partial',
+					'r': 'refresh needed',
+					'm': 'mismatches',
+					'w': 'writemostly',
+					'X': 'unknown',
+					'-': 'unspecified',
+					's': 'reshaping',
+					'F': 'failed',
+					'D': 'Data space',
+					'R': 'Remove',
+					'M': 'Metadata'}
+		return self.attr_struct(8, type_map)

 	@property
 	def SkipActivation(self):
--- a/daemons/lvmdbusd/lvm_shell_proxy.py.in
+++ b/daemons/lvmdbusd/lvm_shell_proxy.py.in
@@ -220,7 +220,10 @@ class LVMShellProxy(object):

 		# Parse the report to see what happened
 		if 'log' in report_json:
-			if report_json['log'][-1:][0]['log_ret_code'] == '1':
+			ret_code = int(report_json['log'][-1:][0]['log_ret_code'])
+			# If we have an exported vg we get a log_ret_code == 5 when
+			# we do a 'fullreport'
+			if (ret_code == 1) or (ret_code == 5 and argv[0] == 'fullreport'):
 				rc = 0
 			else:
 				error_msg = self.get_error_msg()
--- a/daemons/lvmdbusd/lvmdb.py.in
+++ b/daemons/lvmdbusd/lvmdb.py.in
@@ -141,13 +141,22 @@ class DataStore(object):

 	@staticmethod
 	def _parse_vgs(_vgs):
-		vgs = sorted(_vgs, key=lambda vk: vk['vg_name'])
+		vgs = sorted(_vgs, key=lambda vk: vk['vg_uuid'])

 		c_vgs = OrderedDict()
 		c_lookup = {}

 		for i in vgs:
-			c_lookup[i['vg_name']] = i['vg_uuid']
+			vg_name = i['vg_name']
+
+			# Lvm allows duplicate vg names.  When this occurs, each subsequent
+			# matching VG name will be called vg_name:vg_uuid.  Note: ':' is an
+			# invalid character for lvm VG names
+			if vg_name in c_lookup:
+				vg_name = "%s:%s" % (vg_name, i['vg_uuid'])
+				i['vg_name'] = vg_name
+
+			c_lookup[vg_name] = i['vg_uuid']
 			DataStore._insert_record(c_vgs, i['vg_uuid'], i, [])

 		return c_vgs, c_lookup
@@ -162,13 +171,22 @@ class DataStore(object):
 				tmp_vg.extend(r['vg'])

 		# Sort for consistent output, however this is optional
-		vgs = sorted(tmp_vg, key=lambda vk: vk['vg_name'])
+		vgs = sorted(tmp_vg, key=lambda vk: vk['vg_uuid'])

 		c_vgs = OrderedDict()
 		c_lookup = {}

 		for i in vgs:
-			c_lookup[i['vg_name']] = i['vg_uuid']
+			vg_name = i['vg_name']
+
+			# Lvm allows duplicate vg names.  When this occurs, each subsequent
+			# matching VG name will be called vg_name:vg_uuid.  Note: ':' is an
+			# invalid character for lvm VG names
+			if vg_name in c_lookup:
+				vg_name = "%s:%s" % (vg_name, i['vg_uuid'])
+				i['vg_name'] = vg_name
+
+			c_lookup[vg_name] = i['vg_uuid']
 			c_vgs[i['vg_uuid']] = i

 		return c_vgs, c_lookup
@@ -521,6 +539,10 @@ if __name__ == "__main__":
 	for v in ds.vgs.values():
 		pp.pprint(v)

+	print("VG name to UUID")
+	for k, v in ds.vg_name_to_uuid.items():
+		print("%s: %s" % (k, v))
+
 	print("LVS")
 	for v in ds.lvs.values():
 		pp.pprint(v)
--- a/daemons/lvmdbusd/manager.py
+++ b/daemons/lvmdbusd/manager.py
@@ -164,6 +164,8 @@ class Manager(AutomatedProperties):
 		return the object path in O(1) time.

 		:param key: The lookup value
+		:param cb:	dbus python callback parameter, not client visible
+		:param cbe:	dbus python error callback parameter, not client visible
 		:return: Return the object path.  If object not found you will get '/'
 		"""
 		r = RequestEntry(-1, Manager._lookup_by_lvm_id, (key,), cb, cbe, False)
--- a/daemons/lvmdbusd/objectmanager.py
+++ b/daemons/lvmdbusd/objectmanager.py
@@ -189,8 +189,8 @@ class ObjectManager(AutomatedProperties):
 			path = dbus_object.dbus_object_path()
 			interfaces = dbus_object.interface()

-			# print 'UN-Registering object path %s for %s' % \
-			#      (path, dbus_object.lvm_id)
+			# print('UN-Registering object path %s for %s' %
+			#		(path, dbus_object.lvm_id))

 			self._lookup_remove(path)

@@ -240,39 +240,19 @@ class ObjectManager(AutomatedProperties):
 				return lookup_rc
 			return '/'

-	def _uuid_verify(self, path, uuid, lvm_id):
+	def _id_verify(self, path, uuid, lvm_id):
 		"""
-		Ensure uuid is present for a successful lvm_id lookup
+		Ensure our lookups are correct
 		NOTE: Internal call, assumes under object manager lock
 		:param path: 		Path to object we looked up
-		:param uuid: 		lvm uuid to verify
-		:param lvm_id:		lvm_id used to find object
+		:param uuid: 		uuid lookup
+		:param lvm_id:		lvm_id lookup
 		:return: None
 		"""
-		# This gets called when we found an object based on lvm_id, ensure
-		# uuid is correct too, as they can change. There is no durable
-		# non-changeable name in lvm
+		# There is no durable non-changeable name in lvm
 		if lvm_id != uuid:
-			if uuid and uuid not in self._id_to_object_path:
-				obj = self.get_object_by_path(path)
-				self._lookup_add(obj, path, lvm_id, uuid)
-
-	def _lvm_id_verify(self, path, uuid, lvm_id):
-		"""
-		Ensure lvm_id is present for a successful uuid lookup
-		NOTE: Internal call, assumes under object manager lock
-		:param path: 		Path to object we looked up
-		:param uuid: 		uuid used to find object
-		:param lvm_id:		lvm_id to verify
-		:return: None
-		"""
-		# This gets called when we found an object based on uuid, ensure
-		# lvm_id is correct too, as they can change.  There is no durable
-		# non-changeable name in lvm
-		if lvm_id != uuid:
-			if lvm_id and lvm_id not in self._id_to_object_path:
-				obj = self.get_object_by_path(path)
-				self._lookup_add(obj, path, lvm_id, uuid)
+			obj = self.get_object_by_path(path)
+			self._lookup_add(obj, path, lvm_id, uuid)

 	def _id_lookup(self, the_id):
 		path = None
@@ -339,22 +319,22 @@ class ObjectManager(AutomatedProperties):
 				# Lets check for the uuid first
 				path = self._id_lookup(uuid)
 				if path:
-					# Verify the lvm_id is sane
-					self._lvm_id_verify(path, uuid, lvm_id)
+					# Ensure table lookups are correct
+					self._id_verify(path, uuid, lvm_id)
 				else:
 					# Unable to find by UUID, lets lookup by lvm_id
 					path = self._id_lookup(lvm_id)
 					if path:
-						# Verify the uuid is sane
-						self._uuid_verify(path, uuid, lvm_id)
+						# Ensure table lookups are correct
+						self._id_verify(path, uuid, lvm_id)
 					else:
 						# We have exhausted all lookups, let's create if we can
 						if path_create:
 							path = path_create()
 							self._lookup_add(None, path, lvm_id, uuid)

-			# print('get_object_path_by_lvm_id(%s, %s, %s, %s: return %s' %
-			# 	   (uuid, lvm_id, str(path_create), str(gen_new), path))
+			# print('get_object_path_by_lvm_id(%s, %s, %s): return %s' %
+			#	(uuid, lvm_id, str(path_create), path))

 			return path

--- a/daemons/lvmdbusd/vg.py
+++ b/daemons/lvmdbusd/vg.py
@@ -52,18 +52,23 @@ def load_vgs(vg_specific=None, object_path=None, refresh=False,

 # noinspection PyPep8Naming,PyUnresolvedReferences,PyUnusedLocal
 class VgState(State):
+
 	@property
-	def lvm_id(self):
+	def internal_name(self):
 		return self.Name

+	@property
+	def lvm_id(self):
+		return self.internal_name
+
 	def identifiers(self):
-		return (self.Uuid, self.Name)
+		return (self.Uuid, self.internal_name)

 	def _lv_paths_build(self):
 		rc = []
 		for lv in cfg.db.lvs_in_vg(self.Uuid):
 			(lv_name, meta, lv_uuid) = lv
-			full_name = "%s/%s" % (self.Name, lv_name)
+			full_name = "%s/%s" % (self.internal_name, lv_name)

 			gen = utils.lv_object_path_method(lv_name, meta)

@@ -92,7 +97,7 @@ class VgState(State):
 	def create_dbus_object(self, path):
 		if not path:
 			path = cfg.om.get_object_path_by_uuid_lvm_id(
-				self.Uuid, self.Name, vg_obj_path_generate)
+				self.Uuid, self.internal_name, vg_obj_path_generate)
 		return Vg(path, self)

 	# noinspection PyMethodMayBeStatic
@@ -102,7 +107,6 @@ class VgState(State):

 # noinspection PyPep8Naming
@utils.dbus_property(VG_INTERFACE, 'Uuid', 's')
-@utils.dbus_property(VG_INTERFACE, 'Name', 's')
@utils.dbus_property(VG_INTERFACE, 'Fmt', 's')
@utils.dbus_property(VG_INTERFACE, 'SizeBytes', 't', 0)
@utils.dbus_property(VG_INTERFACE, 'FreeBytes', 't', 0)
@@ -135,6 +139,7 @@ class Vg(AutomatedProperties):
 	_AllocNormal_meta = ('b', VG_INTERFACE)
 	_AllocAnywhere_meta = ('b', VG_INTERFACE)
 	_Clustered_meta = ('b', VG_INTERFACE)
+	_Name_meta = ('s', VG_INTERFACE)

 	# noinspection PyUnusedLocal,PyPep8Naming
 	def __init__(self, object_path, object_state):
@@ -172,7 +177,7 @@ class Vg(AutomatedProperties):
 		# Make sure we have a dbus object representing it
 		Vg.validate_dbus_object(uuid, vg_name)
 		rc, out, err = cmdhandler.vg_rename(
-			vg_name, new_name, rename_options)
+			uuid, new_name, rename_options)
 		Vg.handle_execute(rc, out, err)
 		return '/'

@@ -216,7 +221,7 @@ class Vg(AutomatedProperties):
 	# TODO: This should be broken into a number of different methods
 	# instead of having one method that takes a hash for parameters.  Some of
 	# the changes that vgchange does works on entire system, not just a
-	# specfic vg, thus that should be in the Manager interface.
+	# specific vg, thus that should be in the Manager interface.
 	@dbus.service.method(
 		dbus_interface=VG_INTERFACE,
 		in_signature='ia{sv}',
@@ -729,6 +734,12 @@ class Vg(AutomatedProperties):
 				cb, cbe, return_tuple=False)
 		cfg.worker_q.put(r)

+	@property
+	def Name(self):
+		if ':' in self.state.Name:
+			return self.state.Name.split(':')[0]
+		return self.state.Name
+
 	@property
 	def Tags(self):
 		return utils.parse_tags(self.state.tags)
--- a/daemons/lvmlockd/Makefile.in
+++ b/daemons/lvmlockd/Makefile.in
@@ -37,29 +37,26 @@ TARGETS = lvmlockd lvmlockctl

 include $(top_builddir)/make.tmpl

-CFLAGS += $(EXTRA_EXEC_CFLAGS)
+CFLAGS += $(EXTRA_EXEC_CFLAGS) $(SYSTEMD_CFLAGS)
 INCLUDES += -I$(top_srcdir)/libdaemon/server
-LDFLAGS += -L$(top_builddir)/libdaemon/server $(EXTRA_EXEC_LDFLAGS) $(ELDFLAGS)
-LIBS += $(RT_LIBS) $(DAEMON_LIBS) $(PTHREAD_LIBS)
+LDFLAGS += $(EXTRA_EXEC_LDFLAGS) $(ELDFLAGS)
+LIBS += $(PTHREAD_LIBS) $(SYSTEMD_LIBS)

+lvmlockd: $(OBJECTS) $(top_builddir)/libdaemon/server/libdaemonserver.a $(INTERNAL_LIBS)
+	@echo "    [CC] $@"
+	$(Q) $(CC) $(CFLAGS) $(LDFLAGS) -o $@ $+ $(LOCK_LIBS) $(LIBS)

-ifeq ($(USE_SD_NOTIFY),yes)
-	CFLAGS += $(shell pkg-config --cflags libsystemd) -DUSE_SD_NOTIFY
-	LIBS += $(shell pkg-config --libs libsystemd)
-endif
-
-lvmlockd: $(OBJECTS) $(top_builddir)/libdaemon/client/libdaemonclient.a \
-		    $(top_builddir)/libdaemon/server/libdaemonserver.a
-	$(CC) $(CFLAGS) $(LDFLAGS) -o $@ $(OBJECTS) $(LOCK_LIBS) -ldaemonserver $(INTERNAL_LIBS) $(LIBS)
-
-lvmlockctl: lvmlockctl.o $(top_builddir)/libdaemon/client/libdaemonclient.a
-	$(CC) $(CFLAGS) $(LDFLAGS) -o $@ lvmlockctl.o $(INTERNAL_LIBS) $(LIBS)
+lvmlockctl: lvmlockctl.o $(INTERNAL_LIBS)
+	@echo "    [CC] $@"
+	$(Q) $(CC) $(CFLAGS) $(LDFLAGS) -o $@ $+ $(LIBS)

 install_lvmlockd: lvmlockd
-	$(INSTALL_PROGRAM) -D $< $(sbindir)/$(<F)
+	@echo "    [INSTALL] $<"
+	$(Q) $(INSTALL_PROGRAM) -D $< $(sbindir)/$(<F)

 install_lvmlockctl: lvmlockctl
-	$(INSTALL_PROGRAM) -D $< $(sbindir)/$(<F)
+	@echo "    [INSTALL] $<"
+	$(Q) $(INSTALL_PROGRAM) -D $< $(sbindir)/$(<F)

 install_lvm2: install_lvmlockd install_lvmlockctl

--- a/daemons/lvmlockd/lvmlockctl.c
+++ b/daemons/lvmlockd/lvmlockctl.c
@@ -24,7 +24,7 @@
 static int quit = 0;
 static int info = 0;
 static int dump = 0;
-static int wait_opt = 0;
+static int wait_opt = 1;
 static int force_opt = 0;
 static int kill_vg = 0;
 static int drop_vg = 0;
--- a/daemons/lvmlockd/lvmlockd-core.c
+++ b/daemons/lvmlockd/lvmlockd-core.c
@@ -8,10 +8,6 @@
 * of the GNU Lesser General Public License v.2.1.
 */

-#define _XOPEN_SOURCE 500  /* pthread */
-#define _ISOC99_SOURCE
-#define _REENTRANT
-
 #include "tools/tool.h"

 #include "libdaemon/client/daemon-io.h"
@@ -35,7 +31,7 @@
 #include <sys/utsname.h>
 #include <sys/un.h>

-#ifdef USE_SD_NOTIFY
+#ifdef NOTIFYDBUS_SUPPORT
 #include <systemd/sd-daemon.h>
 #endif

@@ -409,12 +405,11 @@ struct lockspace *alloc_lockspace(void)
 {
 	struct lockspace *ls;

-	if (!(ls = malloc(sizeof(struct lockspace)))) {
+	if (!(ls = zalloc(sizeof(struct lockspace)))) {
 		log_error("out of memory for lockspace");
 		return NULL;
 	}

-	memset(ls, 0, sizeof(struct lockspace));
 	INIT_LIST_HEAD(&ls->actions);
 	INIT_LIST_HEAD(&ls->resources);
 	pthread_mutex_init(&ls->mutex, NULL);
@@ -929,12 +924,12 @@ static void lm_rem_resource(struct lockspace *ls, struct resource *r)
 		lm_rem_resource_sanlock(ls, r);
 }

-static int lm_find_free_lock(struct lockspace *ls, uint64_t *free_offset)
+static int lm_find_free_lock(struct lockspace *ls, uint64_t *free_offset, int *sector_size, int *align_size)
 {
 	if (ls->lm_type == LD_LM_DLM)
 		return 0;
 	else if (ls->lm_type == LD_LM_SANLOCK)
-		return lm_find_free_lock_sanlock(ls, free_offset);
+		return lm_find_free_lock_sanlock(ls, free_offset, sector_size, align_size);
 	return -1;
 }

@@ -2427,11 +2422,16 @@ static void *lockspace_thread_main(void *arg_in)

 			if (act->op == LD_OP_FIND_FREE_LOCK && act->rt == LD_RT_VG) {
 				uint64_t free_offset = 0;
+				int sector_size = 0;
+				int align_size = 0;
+
 				log_debug("S %s find free lock", ls->name);
-				rv = lm_find_free_lock(ls, &free_offset);
-				log_debug("S %s find free lock %d offset %llu",
-					  ls->name, rv, (unsigned long long)free_offset);
+				rv = lm_find_free_lock(ls, &free_offset, &sector_size, &align_size);
+				log_debug("S %s find free lock %d offset %llu sector_size %d align_size %d",
+					  ls->name, rv, (unsigned long long)free_offset, sector_size, align_size);
 				ls->free_lock_offset = free_offset;
+				ls->free_lock_sector_size = sector_size;
+				ls->free_lock_align_size = align_size;
 				list_del(&act->list);
 				act->result = rv;
 				add_client_result(act);
@@ -2743,6 +2743,9 @@ static int add_lockspace_thread(const char *ls_name,
 		if (ls2->thread_stop) {
 			log_debug("add_lockspace_thread %s exists and stopping", ls->name);
 			rv = -EAGAIN;
+		} else if (!ls2->create_fail && !ls2->create_done) {
+			log_debug("add_lockspace_thread %s exists and starting", ls->name);
+			rv = -ESTARTING;
 		} else {
 			log_debug("add_lockspace_thread %s exists", ls->name);
 			rv = -EEXIST;
@@ -2984,7 +2987,7 @@ static int count_lockspace_starting(uint32_t client_id)

 	pthread_mutex_lock(&lockspaces_mutex);
 	list_for_each_entry(ls, &lockspaces, list) {
-		if (ls->start_client_id != client_id)
+		if (client_id && (ls->start_client_id != client_id))
 			continue;

 		if (!ls->create_done && !ls->create_fail) {
@@ -3237,6 +3240,8 @@ static int work_init_lv(struct action *act)
 	char vg_args[MAX_ARGS+1];
 	char lv_args[MAX_ARGS+1];
 	uint64_t free_offset = 0;
+	int sector_size = 0;
+	int align_size = 0;
 	int lm_type = 0;
 	int rv = 0;

@@ -3252,6 +3257,8 @@ static int work_init_lv(struct action *act)
 		lm_type = ls->lm_type;
 		memcpy(vg_args, ls->vg_args, MAX_ARGS);
 		free_offset = ls->free_lock_offset;
+		sector_size = ls->free_lock_sector_size;
+		align_size = ls->free_lock_align_size;
 	}
 	pthread_mutex_unlock(&lockspaces_mutex);

@@ -3268,7 +3275,7 @@ static int work_init_lv(struct action *act)

 	if (lm_type == LD_LM_SANLOCK) {
 		rv = lm_init_lv_sanlock(ls_name, act->vg_name, act->lv_uuid,
-					vg_args, lv_args, free_offset);
+					vg_args, lv_args, sector_size, align_size, free_offset);

 		memcpy(act->lv_args, lv_args, MAX_ARGS);
 		return rv;
@@ -3385,7 +3392,7 @@ static void *worker_thread_main(void *arg_in)
 			add_client_result(act);

 		} else if (act->op == LD_OP_START_WAIT) {
-			act->result = count_lockspace_starting(act->client_id);
+			act->result = count_lockspace_starting(0);
 			if (!act->result)
 				add_client_result(act);
 			else
@@ -3419,7 +3426,7 @@ static void *worker_thread_main(void *arg_in)
 		list_for_each_entry_safe(act, safe, &delayed_list, list) {
 			if (act->op == LD_OP_START_WAIT) {
 				log_debug("work delayed start_wait for client %u", act->client_id);
-				act->result = count_lockspace_starting(act->client_id);
+				act->result = count_lockspace_starting(0);
 				if (!act->result) {
 					list_del(&act->list);
 					add_client_result(act);
--- a/daemons/lvmlockd/lvmlockd-dlm.c
+++ b/daemons/lvmlockd/lvmlockd-dlm.c
@@ -272,10 +272,9 @@ static int lm_add_resource_dlm(struct lockspace *ls, struct resource *r, int wit
 	int rv;

 	if (r->type == LD_RT_GL || r->type == LD_RT_VG) {
-		buf = malloc(sizeof(struct val_blk) + DLM_LVB_LEN);
+		buf = zalloc(sizeof(struct val_blk) + DLM_LVB_LEN);
 		if (!buf)
 			return -ENOMEM;
-		memset(buf, 0, sizeof(struct val_blk) + DLM_LVB_LEN);

 		rdd->vb = (struct val_blk *)buf;
 		rdd->lksb.sb_lvbptr = buf + sizeof(struct val_blk);
--- a/daemons/lvmlockd/lvmlockd-internal.h
+++ b/daemons/lvmlockd/lvmlockd-internal.h
@@ -174,7 +174,9 @@ struct lockspace {
 	int8_t lm_type;			/* lock manager: LM_DLM, LM_SANLOCK */
 	void *lm_data;
 	uint64_t host_id;
-	uint64_t free_lock_offset;	/* start search for free lock here */
+	uint64_t free_lock_offset;	/* for sanlock, start search for free lock here */
+	int free_lock_sector_size;	/* for sanlock */
+	int free_lock_align_size;	/* for sanlock */

 	uint32_t start_client_id;	/* client_id that started the lockspace */
 	pthread_t thread;		/* makes synchronous lock requests */
@@ -468,7 +470,7 @@ static inline int lm_hosts_dlm(struct lockspace *ls, int notify)
 #ifdef LOCKDSANLOCK_SUPPORT

 int lm_init_vg_sanlock(char *ls_name, char *vg_name, uint32_t flags, char *vg_args);
-int lm_init_lv_sanlock(char *ls_name, char *vg_name, char *lv_name, char *vg_args, char *lv_args, uint64_t free_offset);
+int lm_init_lv_sanlock(char *ls_name, char *vg_name, char *lv_name, char *vg_args, char *lv_args, int sector_size, int align_size, uint64_t free_offset);
 int lm_free_lv_sanlock(struct lockspace *ls, struct resource *r);
 int lm_rename_vg_sanlock(char *ls_name, char *vg_name, uint32_t flags, char *vg_args);
 int lm_prepare_lockspace_sanlock(struct lockspace *ls);
@@ -488,7 +490,7 @@ int lm_gl_is_enabled(struct lockspace *ls);
 int lm_get_lockspaces_sanlock(struct list_head *ls_rejoin);
 int lm_data_size_sanlock(void);
 int lm_is_running_sanlock(void);
-int lm_find_free_lock_sanlock(struct lockspace *ls, uint64_t *free_offset);
+int lm_find_free_lock_sanlock(struct lockspace *ls, uint64_t *free_offset, int *sector_size, int *align_size);

 static inline int lm_support_sanlock(void)
 {
@@ -502,7 +504,7 @@ static inline int lm_init_vg_sanlock(char *ls_name, char *vg_name, uint32_t flag
 	return -1;
 }

-static inline int lm_init_lv_sanlock(char *ls_name, char *vg_name, char *lv_name, char *vg_args, char *lv_args, uint64_t free_offset)
+static inline int lm_init_lv_sanlock(char *ls_name, char *vg_name, char *lv_name, char *vg_args, char *lv_args, int sector_size, int align_size, uint64_t free_offset)
 {
 	return -1;
 }
@@ -590,7 +592,7 @@ static inline int lm_is_running_sanlock(void)
 	return 0;
 }

-static inline int lm_find_free_lock_sanlock(struct lockspace *ls, uint64_t *free_offset)
+static inline int lm_find_free_lock_sanlock(struct lockspace *ls, uint64_t *free_offset, int *sector_size, int *align_size)
 {
 	return -1;
 }
--- a/daemons/lvmlockd/lvmlockd-sanlock.c
+++ b/daemons/lvmlockd/lvmlockd-sanlock.c
@@ -24,10 +24,29 @@
 #include "sanlock_admin.h"
 #include "sanlock_resource.h"

+/* FIXME: these are copied from sanlock.h only until
+   an updated version of sanlock is available with them. */
+#define SANLK_RES_ALIGN1M       0x00000010
+#define SANLK_RES_ALIGN2M       0x00000020
+#define SANLK_RES_ALIGN4M       0x00000040
+#define SANLK_RES_ALIGN8M       0x00000080
+#define SANLK_RES_SECTOR512     0x00000100
+#define SANLK_RES_SECTOR4K      0x00000200
+#define SANLK_LSF_ALIGN1M       0x00000010
+#define SANLK_LSF_ALIGN2M       0x00000020
+#define SANLK_LSF_ALIGN4M       0x00000040
+#define SANLK_LSF_ALIGN8M       0x00000080
+#define SANLK_LSF_SECTOR512     0x00000100
+#define SANLK_LSF_SECTOR4K      0x00000200
+
 #include <stddef.h>
 #include <poll.h>
 #include <errno.h>
 #include <syslog.h>
+#include <blkid/blkid.h>
+#include <sys/sysmacros.h>
+
+#define ONE_MB 1048576

 /*
 -------------------------------------------------------------------------------
@@ -139,6 +158,7 @@ release all the leases for the VG.

 struct lm_sanlock {
 	struct sanlk_lockspace ss;
+	int sector_size;
 	int align_size;
 	int sock; /* sanlock daemon connection */
 };
@@ -201,7 +221,6 @@ int lm_data_size_sanlock(void)
 * ...
 */

-#define LS_BEGIN 0
 #define GL_LOCK_BEGIN UINT64_C(65)
 #define VG_LOCK_BEGIN UINT64_C(66)
 #define LV_LOCK_BEGIN UINT64_C(67)
@@ -288,7 +307,8 @@ static int read_host_id_file(void)
 		}
 	}
 	if (fclose(file))
-		log_error("failed to close host id file %s", daemon_host_id_file);
+		log_debug("Failed to fclose host id file %s (%s).",
+			  daemon_host_id_file, strerror(errno));
 out:
 	log_debug("host_id %d from %s", host_id, daemon_host_id_file);
 	return host_id;
@@ -324,6 +344,154 @@ fail:
 	return rv;
 }

+static void _read_sysfs_size(dev_t devno, const char *name, unsigned int *val)
+{
+	char path[PATH_MAX];
+	char buf[32];
+	FILE *fp;
+	size_t len;
+
+	snprintf(path, sizeof(path), "/sys/dev/block/%d:%d/queue/%s",
+		 (int)major(devno), (int)minor(devno), name);
+
+	if (!(fp = fopen(path, "r")))
+		return;
+
+	if (!fgets(buf, sizeof(buf), fp))
+		goto out;
+
+	if ((len = strlen(buf)) && buf[len - 1] == '\n')
+		buf[--len] = '\0';
+
+	if (strlen(buf))
+		*val = atoi(buf);
+out:
+	if (fclose(fp))
+		log_debug("Failed to fclose sysfs file %s (%s).", path, strerror(errno));
+
+}
+
+/* Select sector/align size for a new VG based on what the device reports for
+   sector size of the lvmlock LV. */
+
+static int get_sizes_device(char *path, int *sector_size, int *align_size)
+{
+	unsigned int physical_block_size = 0;
+	unsigned int logical_block_size = 0;
+	struct stat st;
+	int rv;
+
+	rv = stat(path, &st);
+	if (rv < 0) {
+		log_error("Failed to stat device to get block size %s %d", path, errno);
+		return -1;
+	}
+
+	_read_sysfs_size(st.st_rdev, "physical_block_size", &physical_block_size);
+	_read_sysfs_size(st.st_rdev, "logical_block_size", &logical_block_size);
+
+	if ((physical_block_size == 512) && (logical_block_size == 512)) {
+		*sector_size = 512;
+		*align_size = ONE_MB;
+		return 0;
+	}
+
+	if ((physical_block_size == 4096) && (logical_block_size == 4096)) {
+		*sector_size = 4096;
+		*align_size = 8 * ONE_MB;
+		return 0;
+	}
+
+	if (physical_block_size && (physical_block_size != 512) && (physical_block_size != 4096)) {
+		log_warn("WARNING: invalid block sizes physical %u logical %u for %s",
+			 physical_block_size, logical_block_size, path);
+		physical_block_size = 0;
+	}
+
+	if (logical_block_size && (logical_block_size != 512) && (logical_block_size != 4096)) {
+		log_warn("WARNING: invalid block sizes physical %u logical %u for %s",
+			 physical_block_size, logical_block_size, path);
+		logical_block_size = 0;
+	}
+
+	if (!physical_block_size && !logical_block_size) {
+		log_error("Failed to get a block size for %s", path);
+		return -1;
+	}
+
+	if (!physical_block_size || !logical_block_size) {
+		log_warn("WARNING: incomplete block size information physical %u logical %u for %s",
+			 physical_block_size, logical_block_size, path);
+		if (!physical_block_size)
+			physical_block_size = logical_block_size;
+		if (!logical_block_size)
+			logical_block_size = physical_block_size;
+	}
+
+	if ((logical_block_size == 4096) && (physical_block_size == 512)) {
+		log_warn("WARNING: mixed block sizes physical %u logical %u (using 4096) for %s",
+			 physical_block_size, logical_block_size, path);
+		*sector_size = 4096;
+		*align_size = 8 * ONE_MB;
+		return 0;
+	}
+
+	if ((physical_block_size == 4096) && (logical_block_size == 512)) {
+		log_warn("WARNING: mixed block sizes physical %u logical %u (using 4096) for %s",
+			 physical_block_size, logical_block_size, path);
+		*sector_size = 4096;
+		*align_size = 8 * ONE_MB;
+		return 0;
+	}
+
+	if (physical_block_size == 512) {
+		*sector_size = 512;
+		*align_size = ONE_MB;
+		return 0;
+	}
+
+	if (physical_block_size == 4096) {
+		*sector_size = 4096;
+		*align_size = 8 * ONE_MB;
+		return 0;
+	}
+
+	log_error("Failed to get a block size for %s", path);
+	return -1;
+}
+
+
+/* Get the sector/align sizes that were used to create an existing VG.
+   sanlock encoded this in the lockspace/resource structs on disk. */
+
+static int get_sizes_lockspace(char *path, int *sector_size, int *align_size)
+{
+	struct sanlk_lockspace ss;
+	uint32_t io_timeout = 0;
+	int rv;
+
+	memset(&ss, 0, sizeof(ss));
+	memcpy(ss.host_id_disk.path, path, SANLK_PATH_LEN);
+	ss.host_id_disk.offset = 0;
+
+	rv = sanlock_read_lockspace(&ss, 0, &io_timeout);
+	if (rv < 0) {
+		log_error("get_sizes_lockspace %s error %d", path, rv);
+		return rv;
+	}
+
+	if ((ss.flags & SANLK_LSF_SECTOR4K) && (ss.flags & SANLK_LSF_ALIGN8M)) {
+		*sector_size = 4096;
+		*align_size = 8 * ONE_MB;
+	} else if ((ss.flags & SANLK_LSF_SECTOR512) && (ss.flags & SANLK_LSF_ALIGN1M)) {
+		*sector_size = 512;
+		*align_size = ONE_MB;
+	}
+
+	log_debug("get_sizes_lockspace found %d %d", *sector_size, *align_size);
+	return 0;
+}
+
 /*
 * vgcreate
 *
@@ -343,7 +511,8 @@ int lm_init_vg_sanlock(char *ls_name, char *vg_name, uint32_t flags, char *vg_ar
 	uint32_t daemon_version;
 	uint32_t daemon_proto;
 	uint64_t offset;
-	int align_size;
+	int sector_size = 0;
+	int align_size = 0;
 	int i, rv;

 	memset(&ss, 0, sizeof(ss));
@@ -387,23 +556,25 @@ int lm_init_vg_sanlock(char *ls_name, char *vg_name, uint32_t flags, char *vg_ar
 	log_debug("sanlock daemon version %08x proto %08x",
 		  daemon_version, daemon_proto);

-	rv = sanlock_align(&disk);
-	if (rv <= 0) {
+	/* Nothing formatted on disk yet, use what the device reports. */
+	rv = get_sizes_device(disk.path, &sector_size, &align_size);
+	if (rv < 0) {
 		if (rv == -EACCES) {
 			log_error("S %s init_vg_san sanlock error -EACCES: no permission to access %s",
 				  ls_name, disk.path);
 			return -EDEVOPEN;
 		} else {
-			log_error("S %s init_vg_san sanlock error %d trying to get align size of %s",
+			log_error("S %s init_vg_san sanlock error %d trying to get sector/align size of %s",
 				  ls_name, rv, disk.path);
 			return -EARGS;
 		}
-	} else
-		align_size = rv;
+	}

 	strncpy(ss.name, ls_name, SANLK_NAME_LEN);
 	memcpy(ss.host_id_disk.path, disk.path, SANLK_PATH_LEN);
-	ss.host_id_disk.offset = LS_BEGIN * align_size;
+	ss.host_id_disk.offset = 0;
+	ss.flags = (sector_size == 4096) ? (SANLK_LSF_SECTOR4K | SANLK_LSF_ALIGN8M) :
+					   (SANLK_LSF_SECTOR512 | SANLK_LSF_ALIGN1M);

 	rv = sanlock_write_lockspace(&ss, 0, 0, sanlock_io_timeout);
 	if (rv < 0) {
@@ -436,6 +607,8 @@ int lm_init_vg_sanlock(char *ls_name, char *vg_name, uint32_t flags, char *vg_ar
 	memcpy(rd.rs.disks[0].path, disk.path, SANLK_PATH_LEN);
 	rd.rs.disks[0].offset = align_size * GL_LOCK_BEGIN;
 	rd.rs.num_disks = 1;
+	rd.rs.flags = (sector_size == 4096) ? (SANLK_RES_SECTOR4K | SANLK_RES_ALIGN8M) :
+					      (SANLK_RES_SECTOR512 | SANLK_RES_ALIGN1M);

 	rv = sanlock_write_resource(&rd.rs, 0, 0, 0);
 	if (rv < 0) {
@@ -449,6 +622,8 @@ int lm_init_vg_sanlock(char *ls_name, char *vg_name, uint32_t flags, char *vg_ar
 	memcpy(rd.rs.disks[0].path, disk.path, SANLK_PATH_LEN);
 	rd.rs.disks[0].offset = align_size * VG_LOCK_BEGIN;
 	rd.rs.num_disks = 1;
+	rd.rs.flags = (sector_size == 4096) ? (SANLK_RES_SECTOR4K | SANLK_RES_ALIGN8M) :
+					      (SANLK_RES_SECTOR512 | SANLK_RES_ALIGN1M);

 	rv = sanlock_write_resource(&rd.rs, 0, 0, 0);
 	if (rv < 0) {
@@ -472,6 +647,8 @@ int lm_init_vg_sanlock(char *ls_name, char *vg_name, uint32_t flags, char *vg_ar

 	memset(&rd, 0, sizeof(rd));
 	rd.rs.num_disks = 1;
+	rd.rs.flags = (sector_size == 4096) ? (SANLK_RES_SECTOR4K | SANLK_RES_ALIGN8M) :
+					      (SANLK_RES_SECTOR512 | SANLK_RES_ALIGN1M);
 	memcpy(rd.rs.disks[0].path, disk.path, SANLK_PATH_LEN);
 	strncpy(rd.rs.lockspace_name, ls_name, SANLK_NAME_LEN);
 	strcpy(rd.rs.name, "#unused");
@@ -510,13 +687,13 @@ int lm_init_vg_sanlock(char *ls_name, char *vg_name, uint32_t flags, char *vg_ar
 */

 int lm_init_lv_sanlock(char *ls_name, char *vg_name, char *lv_name,
-		       char *vg_args, char *lv_args, uint64_t free_offset)
+		       char *vg_args, char *lv_args,
+		       int sector_size, int align_size, uint64_t free_offset)
 {
 	struct sanlk_resourced rd;
 	char lock_lv_name[MAX_ARGS+1];
 	char lock_args_version[MAX_ARGS+1];
 	uint64_t offset;
-	int align_size;
 	int rv;

 	memset(&rd, 0, sizeof(rd));
@@ -534,7 +711,7 @@ int lm_init_lv_sanlock(char *ls_name, char *vg_name, char *lv_name,
 		 LV_LOCK_ARGS_MAJOR, LV_LOCK_ARGS_MINOR, LV_LOCK_ARGS_PATCH);

 	if (daemon_test) {
-		align_size = 1048576;
+		align_size = ONE_MB;
 		snprintf(lv_args, MAX_ARGS, "%s:%llu",
 			 lock_args_version,
 			 (unsigned long long)((align_size * LV_LOCK_BEGIN) + (align_size * daemon_test_lv_count)));
@@ -547,12 +724,35 @@ int lm_init_lv_sanlock(char *ls_name, char *vg_name, char *lv_name,
 	if ((rv = build_dm_path(rd.rs.disks[0].path, SANLK_PATH_LEN, vg_name, lock_lv_name)))
 		return rv;

-	align_size = sanlock_align(&rd.rs.disks[0]);
-	if (align_size <= 0) {
-		log_error("S %s init_lv_san align error %d", ls_name, align_size);
-		return -EINVAL;
+	/*
+	 * These should not usually be zero, maybe only the first time this function is called?
+	 * We need to use the same sector/align sizes that are already being used.
+	 */
+	if (!sector_size || !align_size) {
+		rv = get_sizes_lockspace(rd.rs.disks[0].path, &sector_size, &align_size);
+		if (rv < 0) {
+			log_error("S %s init_lv_san read_lockspace error %d %s",
+				  ls_name, rv, rd.rs.disks[0].path);
+			return rv;
+		}
+
+		if (sector_size)
+			log_debug("S %s init_lv_san found ls sector_size %d align_size %d", ls_name, sector_size, align_size);
+		else {
+			/* use the old method */
+			align_size = sanlock_align(&rd.rs.disks[0]);
+			if (align_size <= 0) {
+				log_error("S %s init_lv_san align error %d", ls_name, align_size);
+				return -EINVAL;
+			}
+			sector_size = (align_size == ONE_MB) ? 512 : 4096;
+			log_debug("S %s init_lv_san found old sector_size %d align_size %d", ls_name, sector_size, align_size);
+		}
 	}

+	rd.rs.flags = (sector_size == 4096) ? (SANLK_RES_SECTOR4K | SANLK_RES_ALIGN8M) :
+					      (SANLK_RES_SECTOR512 | SANLK_RES_ALIGN1M);
+
 	if (free_offset)
 		offset = free_offset;
 	else
@@ -595,6 +795,8 @@ int lm_init_lv_sanlock(char *ls_name, char *vg_name, char *lv_name,
 				  ls_name, lv_name, (unsigned long long)offset);

 			strncpy(rd.rs.name, lv_name, SANLK_NAME_LEN);
+			rd.rs.flags = (sector_size == 4096) ? (SANLK_RES_SECTOR4K | SANLK_RES_ALIGN8M) :
+							      (SANLK_RES_SECTOR512 | SANLK_RES_ALIGN1M);

 			rv = sanlock_write_resource(&rd.rs, 0, 0, 0);
 			if (!rv) {
@@ -626,7 +828,8 @@ int lm_rename_vg_sanlock(char *ls_name, char *vg_name, uint32_t flags, char *vg_
 	char lock_lv_name[MAX_ARGS+1];
 	uint64_t offset;
 	uint32_t io_timeout;
-	int align_size;
+	int sector_size = 0;
+	int align_size = 0;
 	int i, rv;

 	memset(&disk, 0, sizeof(disk));
@@ -655,20 +858,13 @@ int lm_rename_vg_sanlock(char *ls_name, char *vg_name, uint32_t flags, char *vg_
 	/* FIXME: device is not always ready for us here */
 	sleep(1);

-	align_size = sanlock_align(&disk);
-	if (align_size <= 0) {
-		log_error("S %s rename_vg_san bad align size %d %s",
-			  ls_name, align_size, disk.path);
-		return -EINVAL;
-	}
-
 	/*
 	 * Lockspace
 	 */

 	memset(&ss, 0, sizeof(ss));
 	memcpy(ss.host_id_disk.path, disk.path, SANLK_PATH_LEN);
-	ss.host_id_disk.offset = LS_BEGIN * align_size;
+	ss.host_id_disk.offset = 0;

 	rv = sanlock_read_lockspace(&ss, 0, &io_timeout);
 	if (rv < 0) {
@@ -677,6 +873,26 @@ int lm_rename_vg_sanlock(char *ls_name, char *vg_name, uint32_t flags, char *vg_
 		return rv;
 	}

+	if ((ss.flags & SANLK_LSF_SECTOR4K) && (ss.flags & SANLK_LSF_ALIGN8M)) {
+		sector_size = 4096;
+		align_size = 8 * ONE_MB;
+	} else if ((ss.flags & SANLK_LSF_SECTOR512) && (ss.flags & SANLK_LSF_ALIGN1M)) {
+		sector_size = 512;
+		align_size = ONE_MB;
+	} else {
+		/* use the old method */
+		align_size = sanlock_align(&ss.host_id_disk);
+		if (align_size <= 0) {
+			log_error("S %s rename_vg_san unknown sector/align size for %s",
+				 ls_name, ss.host_id_disk.path);
+			return -1;
+		}
+		sector_size = (align_size == ONE_MB) ? 512 : 4096;
+	}
+
+	if (!sector_size || !align_size)
+		return -1;
+
 	strncpy(ss.name, ls_name, SANLK_NAME_LEN);

 	rv = sanlock_write_lockspace(&ss, 0, 0, sanlock_io_timeout);
@@ -830,6 +1046,11 @@ int lm_ex_disable_gl_sanlock(struct lockspace *ls)
 	rd1.rs.num_disks = 1;
 	strncpy(rd1.rs.disks[0].path, lms->ss.host_id_disk.path, SANLK_PATH_LEN-1);
 	rd1.rs.disks[0].offset = lms->align_size * GL_LOCK_BEGIN;
+
+	rd1.rs.flags = (lms->sector_size == 4096) ? (SANLK_RES_SECTOR4K | SANLK_RES_ALIGN8M) :
+						    (SANLK_RES_SECTOR512 | SANLK_RES_ALIGN1M);
+	rd2.rs.flags = (lms->sector_size == 4096) ? (SANLK_RES_SECTOR4K | SANLK_RES_ALIGN8M) :
+						    (SANLK_RES_SECTOR512 | SANLK_RES_ALIGN1M);

 	rv = sanlock_acquire(lms->sock, -1, 0, 1, &rs1, NULL);
 	if (rv < 0) {
@@ -891,6 +1112,8 @@ int lm_able_gl_sanlock(struct lockspace *ls, int enable)
 	rd.rs.num_disks = 1;
 	strncpy(rd.rs.disks[0].path, lms->ss.host_id_disk.path, SANLK_PATH_LEN-1);
 	rd.rs.disks[0].offset = lms->align_size * GL_LOCK_BEGIN;
+	rd.rs.flags = (lms->sector_size == 4096) ? (SANLK_RES_SECTOR4K | SANLK_RES_ALIGN8M) :
+						   (SANLK_RES_SECTOR512 | SANLK_RES_ALIGN1M);

 	rv = sanlock_write_resource(&rd.rs, 0, 0, 0);
 	if (rv < 0) {
@@ -936,7 +1159,8 @@ static int gl_is_enabled(struct lockspace *ls, struct lm_sanlock *lms)

 	rv = sanlock_read_resource(&rd.rs, 0);
 	if (rv < 0) {
-		log_error("gl_is_enabled read_resource error %d", rv);
+		log_error("gl_is_enabled read_resource align_size %d offset %llu error %d",
+			  lms->align_size, (unsigned long long)offset, rv);
 		return rv;
 	}

@@ -973,7 +1197,7 @@ int lm_gl_is_enabled(struct lockspace *ls)
 * been disabled.)
 */

-int lm_find_free_lock_sanlock(struct lockspace *ls, uint64_t *free_offset)
+int lm_find_free_lock_sanlock(struct lockspace *ls, uint64_t *free_offset, int *sector_size, int *align_size)
 {
 	struct lm_sanlock *lms = (struct lm_sanlock *)ls->lm_data;
 	struct sanlk_resourced rd;
@@ -983,15 +1207,22 @@ int lm_find_free_lock_sanlock(struct lockspace *ls, uint64_t *free_offset)
 	int round = 0;

 	if (daemon_test) {
-		*free_offset = (1048576 * LV_LOCK_BEGIN) + (1048576 * (daemon_test_lv_count + 1));
+		*free_offset = (ONE_MB * LV_LOCK_BEGIN) + (ONE_MB * (daemon_test_lv_count + 1));
+		*sector_size = 512;
+		*align_size = ONE_MB;
 		return 0;
 	}

+	*sector_size = lms->sector_size;
+	*align_size = lms->align_size;
+
 	memset(&rd, 0, sizeof(rd));

 	strncpy(rd.rs.lockspace_name, ls->name, SANLK_NAME_LEN);
 	rd.rs.num_disks = 1;
 	strncpy(rd.rs.disks[0].path, lms->ss.host_id_disk.path, SANLK_PATH_LEN-1);
+	rd.rs.flags = (lms->sector_size == 4096) ? (SANLK_RES_SECTOR4K | SANLK_RES_ALIGN8M) :
+						   (SANLK_RES_SECTOR512 | SANLK_RES_ALIGN1M);

 	if (ls->free_lock_offset)
 		offset = ls->free_lock_offset;
@@ -1091,6 +1322,8 @@ int lm_prepare_lockspace_sanlock(struct lockspace *ls)
 	char disk_path[SANLK_PATH_LEN];
 	char killpath[SANLK_PATH_LEN];
 	char killargs[SANLK_PATH_LEN];
+	int sector_size = 0;
+	int align_size = 0;
 	int gl_found;
 	int ret, rv;

@@ -1160,7 +1393,7 @@ int lm_prepare_lockspace_sanlock(struct lockspace *ls)
 		goto fail;
 	}

-	lms = malloc(sizeof(struct lm_sanlock));
+	lms = zalloc(sizeof(struct lm_sanlock));
 	if (!lms) {
 		ret = -ENOMEM;
 		goto fail;
@@ -1169,7 +1402,6 @@ int lm_prepare_lockspace_sanlock(struct lockspace *ls)
 	memset(lsname, 0, sizeof(lsname));
 	strncpy(lsname, ls->name, SANLK_NAME_LEN);

-	memset(lms, 0, sizeof(struct lm_sanlock));
 	memcpy(lms->ss.name, lsname, SANLK_NAME_LEN);
 	lms->ss.host_id_disk.offset = 0;
 	lms->ss.host_id = ls->host_id;
@@ -1207,13 +1439,34 @@ int lm_prepare_lockspace_sanlock(struct lockspace *ls)
 		goto fail;
 	}

-	lms->align_size = sanlock_align(&lms->ss.host_id_disk);
-	if (lms->align_size <= 0) {
-		log_error("S %s prepare_lockspace_san align error %d", lsname, lms->align_size);
+	rv = get_sizes_lockspace(disk_path, &sector_size, &align_size);
+	if (rv < 0) {
+		log_error("S %s prepare_lockspace_san cannot get sector/align sizes %d", lsname, rv);
 		ret = -EMANAGER;
 		goto fail;
 	}

+	if (!sector_size) {
+		log_debug("S %s prepare_lockspace_san using old size method", lsname);
+		/* use the old method */
+		align_size = sanlock_align(&lms->ss.host_id_disk);
+		if (align_size <= 0) {
+			log_error("S %s prepare_lockspace_san align error %d", lsname, align_size);
+			ret = -EINVAL;
+			goto fail;
+		}
+		sector_size = (align_size == ONE_MB) ? 512 : 4096;
+		log_debug("S %s prepare_lockspace_san found old sector_size %d align_size %d", lsname, sector_size, align_size);
+	}
+
+	log_debug("S %s prepare_lockspace_san sizes %d %d", lsname, sector_size, align_size);
+
+	lms->align_size = align_size;
+	lms->sector_size = sector_size;
+
+	lms->ss.flags = (sector_size == 4096) ? (SANLK_LSF_SECTOR4K | SANLK_LSF_ALIGN8M) :
+						(SANLK_LSF_SECTOR512 | SANLK_LSF_ALIGN1M);
+
 	gl_found = gl_is_enabled(ls, lms);
 	if (gl_found < 0) {
 		log_error("S %s prepare_lockspace_san gl_enabled error %d", lsname, gl_found);
@@ -1351,6 +1604,7 @@ static int lm_add_resource_sanlock(struct lockspace *ls, struct resource *r)
 	strncpy(rds->rs.name, r->name, SANLK_NAME_LEN);
 	rds->rs.num_disks = 1;
 	memcpy(rds->rs.disks[0].path, lms->ss.host_id_disk.path, SANLK_PATH_LEN);
+	rds->rs.flags = (lms->sector_size == 4096) ? (SANLK_RES_SECTOR4K | SANLK_RES_ALIGN8M) : (SANLK_RES_SECTOR512 | SANLK_RES_ALIGN1M);

 	if (r->type == LD_RT_GL)
 		rds->rs.disks[0].offset = GL_LOCK_BEGIN * lms->align_size;
@@ -1360,10 +1614,9 @@ static int lm_add_resource_sanlock(struct lockspace *ls, struct resource *r)
 	/* LD_RT_LV offset is set in each lm_lock call from lv_args. */

 	if (r->type == LD_RT_GL || r->type == LD_RT_VG) {
-		rds->vb = malloc(sizeof(struct val_blk));
+		rds->vb = zalloc(sizeof(struct val_blk));
 		if (!rds->vb)
 			return -ENOMEM;
-		memset(rds->vb, 0, sizeof(struct val_blk));
 	}

 	return 0;
@@ -1860,12 +2113,20 @@ int lm_unlock_sanlock(struct lockspace *ls, struct resource *r,
 	if (rv < 0)
 		log_error("S %s R %s unlock_san release error %d", ls->name, r->name, rv);

-	if (rv == -EIO)
-		rv = -ELOCKIO;
-	else if (rv < 0)
-		rv = -ELMERR;
+	/*
+	 * sanlock may return an error here if it fails to release the lease on
+	 * disk because of an io timeout.  But, sanlock will continue trying to
+	 * release the lease after this call returns.  We shouldn't return an
+	 * error here which would result in lvmlockd-core keeping the lock
+	 * around.  By releasing the lock in lvmlockd-core at this point,
+	 * lvmlockd may send another acquire request to lvmlockd.  If sanlock
+	 * has not been able to release the previous instance of the lock yet,
+	 * then it will return an error for the new request.  But, acquiring a
+	 * new lock is able to fail gracefully, until sanlock is finally able to
+	 * release the old lock.
+	 */

-	return rv;
+	return 0;
 }

 int lm_hosts_sanlock(struct lockspace *ls, int notify)
--- a/daemons/lvmpolld/Makefile.in
+++ b/daemons/lvmpolld/Makefile.in
@@ -29,15 +29,16 @@ include $(top_builddir)/make.tmpl

 CFLAGS += $(EXTRA_EXEC_CFLAGS)
 INCLUDES += -I$(top_srcdir)/libdaemon/server
-LDFLAGS += -L$(top_builddir)/libdaemon/server $(EXTRA_EXEC_LDFLAGS) $(ELDFLAGS)
-LIBS += $(DAEMON_LIBS) -ldaemonserver $(PTHREAD_LIBS)
+LDFLAGS += $(EXTRA_EXEC_LDFLAGS) $(ELDFLAGS)
+LIBS += $(DAEMON_LIBS) $(PTHREAD_LIBS)

-lvmpolld: $(OBJECTS) $(top_builddir)/libdaemon/client/libdaemonclient.a \
-		    $(top_builddir)/libdaemon/server/libdaemonserver.a
-	$(CC) $(CFLAGS) $(LDFLAGS) -o $@ $(OBJECTS) $(INTERNAL_LIBS) $(LIBS)
+lvmpolld: $(OBJECTS) $(top_builddir)/libdaemon/server/libdaemonserver.a $(INTERNAL_LIBS)
+	@echo "    [CC] $@"
+	$(Q) $(CC) $(CFLAGS) $(LDFLAGS) -o $@ $+ $(LIBS)

 install_lvmpolld: lvmpolld
-	$(INSTALL_PROGRAM) -D $< $(sbindir)/$(<F)
+	@echo "    [INSTALL] $<"
+	$(Q) $(INSTALL_PROGRAM) -D $< $(sbindir)/$(<F)

 install_lvm2: install_lvmpolld

--- a/daemons/lvmpolld/lvmpolld-common.h
+++ b/daemons/lvmpolld/lvmpolld-common.h
@@ -18,8 +18,6 @@
 #ifndef _LVM_LVMPOLLD_COMMON_H
 #define _LVM_LVMPOLLD_COMMON_H

-#define _REENTRANT
-
 #include "tools/tool.h"

 #include "lvmpolld-cmd-utils.h"
--- a/device_mapper/Makefile
+++ b/device_mapper/Makefile
@@ -10,8 +10,12 @@
 # along with this program; if not, write to the Free Software Foundation,
 # Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA

+# NOTE: this Makefile only works as 'include' for toplevel Makefile
+#       which defined all top_* variables
+
 DEVICE_MAPPER_SOURCE=\
 	device_mapper/datastruct/bitset.c \
+	device_mapper/ioctl/libdm-iface.c \
 	device_mapper/libdm-common.c \
 	device_mapper/libdm-config.c \
 	device_mapper/libdm-deptree.c \
@@ -24,29 +28,25 @@ DEVICE_MAPPER_SOURCE=\
 	device_mapper/regex/matcher.c \
 	device_mapper/regex/parse_rx.c \
 	device_mapper/regex/ttree.c \
-	device_mapper/ioctl/libdm-iface.c \
-	device_mapper/vdo/vdo_target.c \
-	device_mapper/vdo/status.c
+	device_mapper/vdo/status.c \
+	device_mapper/vdo/vdo_target.c

-DEVICE_MAPPER_DEPENDS=$(addprefix $(top_builddir)/,$(subst .c,.d,$(DEVICE_MAPPER_SOURCE)))
-DEVICE_MAPPER_OBJECTS=$(addprefix $(top_builddir)/,$(subst .c,.o,$(DEVICE_MAPPER_SOURCE)))
-CLEAN_TARGETS+=$(DEVICE_MAPPER_DEPENDS) $(DEVICE_MAPPER_OBJECTS)
+DEVICE_MAPPER_TARGET = device_mapper/libdevice-mapper.a
+DEVICE_MAPPER_DEPENDS = $(DEVICE_MAPPER_SOURCE:%.c=%.d)
+DEVICE_MAPPER_OBJECTS = $(DEVICE_MAPPER_SOURCE:%.c=%.o)
+CLEAN_TARGETS += $(DEVICE_MAPPER_DEPENDS) $(DEVICE_MAPPER_OBJECTS) \
+	$(DEVICE_MAPPER_SOURCE:%.c=%.gcda) \
+	$(DEVICE_MAPPER_SOURCE:%.c=%.gcno) \
+	$(DEVICE_MAPPER_TARGET)

 #$(DEVICE_MAPPER_DEPENDS): INCLUDES+=$(VDO_INCLUDES)
 #$(DEVICE_MAPPER_OBJECTS): INCLUDES+=$(VDO_INCLUDES)

-ifeq ("$(USE_TRACKING)","yes")
-ifeq (,$(findstring $(MAKECMDGOALS),cscope.out cflow clean distclean lcov \
- help check check_local check_cluster check_lvmetad check_lvmpolld))
-	-include $(DEVICE_MAPPER_DEPENDS)
-endif
-endif
-
-$(DEVICE_MAPPER_OBJECTS): INCLUDES+=-I$(top_srcdir)/device_mapper/
-
-$(top_builddir)/device_mapper/libdevice-mapper.a: $(DEVICE_MAPPER_OBJECTS)
+$(DEVICE_MAPPER_TARGET): $(DEVICE_MAPPER_OBJECTS)
 	@echo "    [AR] $@"
 	$(Q) $(RM) $@
 	$(Q) $(AR) rsv $@ $(DEVICE_MAPPER_OBJECTS) > /dev/null

-CLEAN_TARGETS+=$(top_builddir)/device_mapper/libdevice-mapper.a
+ifeq ("$(DEPENDS)","yes")
+-include $(DEVICE_MAPPER_DEPENDS)
+endif
--- a/device_mapper/all.h
+++ b/device_mapper/all.h
@@ -116,10 +116,12 @@ enum {
 	DM_DEVICE_MKNODES,

 	DM_DEVICE_LIST_VERSIONS,
-	
+
 	DM_DEVICE_TARGET_MSG,

-	DM_DEVICE_SET_GEOMETRY
+	DM_DEVICE_SET_GEOMETRY,
+
+	DM_DEVICE_ARM_POLL
 };

 /*
@@ -378,6 +380,16 @@ struct dm_status_cache {
 int dm_get_status_cache(struct dm_pool *mem, const char *params,
 			struct dm_status_cache **status);

+struct dm_status_writecache {
+	uint32_t error;
+	uint64_t total_blocks;
+	uint64_t free_blocks;
+	uint64_t writeback_blocks;
+};
+
+int dm_get_status_writecache(struct dm_pool *mem, const char *params,
+                             struct dm_status_writecache **status);
+
 /*
 * Parse params from STATUS call for snapshot target
 *
@@ -912,14 +924,57 @@ int dm_tree_node_add_cache_target(struct dm_tree_node *node,
 				  const char *origin_uuid,
 				  const char *policy_name,
 				  const struct dm_config_node *policy_settings,
+				  uint64_t metadata_start,
+				  uint64_t metadata_len,
+				  uint64_t data_start,
+				  uint64_t data_len,
 				  uint32_t data_block_size);

+struct writecache_settings {
+	uint64_t high_watermark;
+	uint64_t low_watermark;
+	uint64_t writeback_jobs;
+	uint64_t autocommit_blocks;
+	uint64_t autocommit_time; /* in milliseconds */
+	uint32_t fua;
+	uint32_t nofua;
+
+	/*
+	 * Allow an unrecognized key and its val to be passed to the kernel for
+	 * cases where a new kernel setting is added but lvm doesn't know about
+	 * it yet.
+	 */
+	char *new_key;
+	char *new_val;
+
+	/*
+	 * Flag is 1 if a value has been set.
+	 */
+	unsigned high_watermark_set:1;
+	unsigned low_watermark_set:1;
+	unsigned writeback_jobs_set:1;
+	unsigned autocommit_blocks_set:1;
+	unsigned autocommit_time_set:1;
+	unsigned fua_set:1;
+	unsigned nofua_set:1;
+};
+
+int dm_tree_node_add_writecache_target(struct dm_tree_node *node,
+				uint64_t size,
+				const char *origin_uuid,
+				const char *cache_uuid,
+				int pmem,
+				uint32_t writecache_block_size,
+				struct writecache_settings *settings);
+
+
 /*
 * VDO target
 */
 int dm_tree_node_add_vdo_target(struct dm_tree_node *node,
 				uint64_t size,
 				const char *data_uuid,
+				uint64_t data_size,
 				const struct dm_vdo_target_params *param);

 /*
--- a/device_mapper/ioctl/libdm-iface.c
+++ b/device_mapper/ioctl/libdm-iface.c
@@ -15,6 +15,7 @@

 #include "base/memory/zalloc.h"
 #include "device_mapper/misc/dmlib.h"
+#include "device_mapper/misc/dm-ioctl.h"
 #include "device_mapper/ioctl/libdm-targets.h"
 #include "device_mapper/libdm-common.h"

@@ -32,11 +33,9 @@
 #else
 #  define MAJOR(x) major((x))
 #  define MINOR(x) minor((x))
-#  define MKDEV(x,y) makedev((x),(y))
+#  define MKDEV(x,y) makedev(((dev_t)(x)),((dev_t)(y)))
 #endif

-#include "device_mapper/misc/dm-ioctl.h"
-
 /*
 * Ensure build compatibility.  
 * The hard-coded versions here are the highest present 
@@ -117,6 +116,9 @@ static struct cmd_data _cmd_data_v4[] = {
 #ifdef DM_DEV_SET_GEOMETRY
 	{"setgeometry",	DM_DEV_SET_GEOMETRY,	{4, 6, 0}},
 #endif
+#ifdef DM_DEV_ARM_POLL
+	{"armpoll",	DM_DEV_ARM_POLL,	{4, 36, 0}},
+#endif
 };
 /* *INDENT-ON* */

@@ -261,7 +263,7 @@ static int _control_exists(const char *control, uint32_t major, uint32_t minor)
 		return -1;
 	}

-	if (major && buf.st_rdev != MKDEV((dev_t)major, (dev_t)minor)) {
+	if (major && buf.st_rdev != MKDEV(major, minor)) {
 		log_verbose("%s: Wrong device number: (%u, %u) instead of "
 			    "(%u, %u)", control,
 			    MAJOR(buf.st_mode), MINOR(buf.st_mode),
@@ -304,7 +306,7 @@ static int _create_control(const char *control, uint32_t major, uint32_t minor)
 	(void) dm_prepare_selinux_context(control, S_IFCHR);
 	old_umask = umask(DM_CONTROL_NODE_UMASK);
 	if (mknod(control, S_IFCHR | S_IRUSR | S_IWUSR,
-		  MKDEV((dev_t)major, (dev_t)minor)) < 0)  {
+		  MKDEV(major, minor)) < 0)  {
 		log_sys_error("mknod", control);
 		ret = 0;
 	}
@@ -468,6 +470,7 @@ static void _dm_zfree_string(char *string)
 {
 	if (string) {
 		memset(string, 0, strlen(string));
+		asm volatile ("" ::: "memory"); /* Compiler barrier. */
 		free(string);
 	}
 }
@@ -476,6 +479,7 @@ static void _dm_zfree_dmi(struct dm_ioctl *dmi)
 {
 	if (dmi) {
 		memset(dmi, 0, dmi->data_size);
+		asm volatile ("" ::: "memory"); /* Compiler barrier. */
 		free(dmi);
 	}
 }
@@ -1082,6 +1086,22 @@ static int _lookup_dev_name(uint64_t dev, char *buf, size_t len)
 	return r;
 }

+static int _add_params(int type)
+{
+	switch (type) {
+	case DM_DEVICE_REMOVE_ALL:
+	case DM_DEVICE_CREATE:
+	case DM_DEVICE_REMOVE:
+	case DM_DEVICE_SUSPEND:
+	case DM_DEVICE_STATUS:
+	case DM_DEVICE_CLEAR:
+	case DM_DEVICE_ARM_POLL:
+		return 0; /* IOCTL_FLAGS_NO_PARAMS in drivers/md/dm-ioctl.c */
+	default:
+		return 1;
+	}
+}
+
 static struct dm_ioctl *_flatten(struct dm_task *dmt, unsigned repeat_count)
 {
 	const size_t min_size = 16 * 1024;
@@ -1094,11 +1114,15 @@ static struct dm_ioctl *_flatten(struct dm_task *dmt, unsigned repeat_count)
 	char *b, *e;
 	int count = 0;

-	for (t = dmt->head; t; t = t->next) {
-		len += sizeof(struct dm_target_spec);
-		len += strlen(t->params) + 1 + ALIGNMENT;
-		count++;
-	}
+	if (_add_params(dmt->type))
+		for (t = dmt->head; t; t = t->next) {
+			len += sizeof(struct dm_target_spec);
+			len += strlen(t->params) + 1 + ALIGNMENT;
+			count++;
+		}
+	else if (dmt->head)
+		log_debug_activation(INTERNAL_ERROR "dm '%s' ioctl should not define parameters.",
+				     _cmd_data_v4[dmt->type].name);

 	if (count && (dmt->sector || dmt->message)) {
 		log_error("targets and message are incompatible");
@@ -1182,7 +1206,7 @@ static struct dm_ioctl *_flatten(struct dm_task *dmt, unsigned repeat_count)
 		}

 		dmi->flags |= DM_PERSISTENT_DEV_FLAG;
-		dmi->dev = MKDEV((dev_t)dmt->major, (dev_t)dmt->minor);
+		dmi->dev = MKDEV(dmt->major, dmt->minor);
 	}

 	/* Does driver support device number referencing? */
@@ -1248,9 +1272,10 @@ static struct dm_ioctl *_flatten(struct dm_task *dmt, unsigned repeat_count)
 	b = (char *) (dmi + 1);
 	e = (char *) dmi + len;

-	for (t = dmt->head; t; t = t->next)
-		if (!(b = _add_target(t, b, e)))
-			goto_bad;
+	if (_add_params(dmt->type))
+		for (t = dmt->head; t; t = t->next)
+			if (!(b = _add_target(t, b, e)))
+				goto_bad;

 	if (dmt->newname)
 		strcpy(b, dmt->newname);
@@ -1454,6 +1479,7 @@ static int _create_and_load_v4(struct dm_task *dmt)
 	dmt->uuid = NULL;
 	free(dmt->mangled_uuid);
 	dmt->mangled_uuid = NULL;
+	_dm_task_free_targets(dmt);

 	if (dm_task_run(dmt))
 		return 1;
@@ -1464,6 +1490,7 @@ static int _create_and_load_v4(struct dm_task *dmt)
 	dmt->uuid = NULL;
 	free(dmt->mangled_uuid);
 	dmt->mangled_uuid = NULL;
+	_dm_task_free_targets(dmt);

 	/*
 	 * Also udev-synchronize "remove" dm task that is a part of this revert!
--- a/device_mapper/libdm-common.c
+++ b/device_mapper/libdm-common.c
@@ -1042,7 +1042,7 @@ static int _add_dev_node(const char *dev_name, uint32_t major, uint32_t minor,
 {
 	char path[PATH_MAX];
 	struct stat info;
-	dev_t dev = MKDEV((dev_t)major, (dev_t)minor);
+	dev_t dev = MKDEV(major, minor);
 	mode_t old_mask;

 	if (!_build_dev_path(path, sizeof(path), dev_name))
@@ -1765,7 +1765,7 @@ static int _mountinfo_parse_line(const char *line, unsigned *maj, unsigned *min,
 			return 0;
 		}
 		devmapper += 12; /* skip fixed prefix */
-		for (i = 0; devmapper[i] && devmapper[i] != ' ' && i < sizeof(root); ++i)
+		for (i = 0; devmapper[i] && devmapper[i] != ' ' && i < sizeof(root)-1; ++i)
 			root[i] = devmapper[i];
 		root[i] = 0;
 		_unmangle_mountinfo_string(root, buf);
--- a/device_mapper/libdm-deptree.c
+++ b/device_mapper/libdm-deptree.c
@@ -21,7 +21,6 @@

 #include <stdarg.h>
 #include <string.h>
-#include <sys/param.h>
 #include <sys/utsname.h>

 #define MAX_TARGET_PARAMSIZE 500000
@@ -38,6 +37,7 @@ enum {
 	SEG_SNAPSHOT_MERGE,
 	SEG_STRIPED,
 	SEG_ZERO,
+	SEG_WRITECACHE,
 	SEG_THIN_POOL,
 	SEG_THIN,
 	SEG_VDO,
@@ -77,6 +77,7 @@ static const struct {
 	{ SEG_SNAPSHOT_MERGE, "snapshot-merge" },
 	{ SEG_STRIPED, "striped" },
 	{ SEG_ZERO, "zero"},
+	{ SEG_WRITECACHE, "writecache"},
 	{ SEG_THIN_POOL, "thin-pool"},
 	{ SEG_THIN, "thin"},
 	{ SEG_VDO, "vdo" },
@@ -190,6 +191,11 @@ struct load_segment {
 	uint32_t min_recovery_rate;	/* raid kB/sec/disk */
 	uint32_t data_copies;		/* raid10 data_copies */

+	uint64_t metadata_start;	/* Cache */
+	uint64_t metadata_len;		/* Cache */
+	uint64_t data_start;		/* Cache */
+	uint64_t data_len;		/* Cache */
+
 	struct dm_tree_node *metadata;	/* Thin_pool + Cache */
 	struct dm_tree_node *pool;	/* Thin_pool, Thin */
 	struct dm_tree_node *external;	/* Thin */
@@ -197,6 +203,7 @@ struct load_segment {
 	uint64_t transaction_id;	/* Thin_pool */
 	uint64_t low_water_mark;	/* Thin_pool */
 	uint32_t data_block_size;       /* Thin_pool + cache */
+	uint32_t migration_threshold;   /* Cache */
 	unsigned skip_block_zeroing;	/* Thin_pool */
 	unsigned ignore_discard;	/* Thin_pool target vsn 1.1 */
 	unsigned no_discard_passdown;	/* Thin_pool target vsn 1.1 */
@@ -208,6 +215,12 @@ struct load_segment {
 	struct dm_tree_node *vdo_data;  /* VDO */
 	struct dm_vdo_target_params vdo_params; /* VDO */
 	const char *vdo_name;           /* VDO - device name is ALSO passed as table arg */
+	uint64_t vdo_data_size;		/* VDO - size of data storage device */
+
+	struct dm_tree_node *writecache_node;		/* writecache */
+	int writecache_pmem;				/* writecache, 1 if pmem, 0 if ssd */
+	uint32_t writecache_block_size;			/* writecache, in bytes */
+	struct writecache_settings writecache_settings;	/* writecache */
 };

 /* Per-device properties */
@@ -532,7 +545,7 @@ static struct dm_tree_node *_create_dm_tree_node(struct dm_tree *dtree,
 	dm_list_init(&node->activated);
 	dm_list_init(&node->props.segs);

-	dev = MKDEV((dev_t)info->major, (dev_t)info->minor);
+	dev = MKDEV(info->major, info->minor);

 	if (!dm_hash_insert_binary(dtree->devs, (const char *) &dev,
 				   sizeof(dev), node)) {
@@ -555,7 +568,7 @@ static struct dm_tree_node *_create_dm_tree_node(struct dm_tree *dtree,
 static struct dm_tree_node *_find_dm_tree_node(struct dm_tree *dtree,
 					       uint32_t major, uint32_t minor)
 {
-	dev_t dev = MKDEV((dev_t)major, (dev_t)minor);
+	dev_t dev = MKDEV(major, minor);

 	return dm_hash_lookup_binary(dtree->devs, (const char *) &dev,
 				     sizeof(dev));
@@ -1487,7 +1500,7 @@ static int _node_message(uint32_t major, uint32_t minor,
 			 int expected_errno, const char *message)
 {
 	struct dm_task *dmt;
-	int r;
+	int r = 0;

 	if (!(dmt = dm_task_create(DM_DEVICE_TARGET_MSG)))
 		return_0;
@@ -1753,7 +1766,12 @@ static int _dm_tree_deactivate_children(struct dm_tree_node *dnode,

 		if (info.open_count) {
 			/* Skip internal non-toplevel opened nodes */
-			if (level)
+			/* On some old udev systems without correct udev rules
+			 * this hack avoids 'leaking' active _mimageX legs after
+			 * deactivation of mirror LV. Other suffixes are not added
+			 * since it's expected newer systems with wider range of
+			 * supported targets also use better udev */
+			if (level && !strstr(name, "_mimage"))
 				continue;

 			/* When retry is not allowed, error */
@@ -1793,7 +1811,7 @@ static int _dm_tree_deactivate_children(struct dm_tree_node *dnode,

 		if (!_deactivate_node(name, info.major, info.minor,
 				      &child->dtree->cookie, child->udev_flags,
-				      (level == 0) ? child->dtree->retry_remove : 0)) {
+				      child->dtree->retry_remove)) {
 			log_error("Unable to deactivate %s (" FMTu32 ":"
 				  FMTu32 ").", name, info.major, info.minor);
 			r = 0;
@@ -2545,7 +2563,7 @@ static int _cache_emit_segment_line(struct dm_task *dmt,
 				    char *params, size_t paramsize)
 {
 	int pos = 0;
-	/* unsigned feature_count; */
+	unsigned feature_count;
 	char data[DM_FORMAT_DEV_BUFSIZE];
 	char metadata[DM_FORMAT_DEV_BUFSIZE];
 	char origin[DM_FORMAT_DEV_BUFSIZE];
@@ -2570,29 +2588,119 @@ static int _cache_emit_segment_line(struct dm_task *dmt,
 	EMIT_PARAMS(pos, " %u", seg->data_block_size);

 	/* Features */
-	/* feature_count = hweight32(seg->flags); */
-	/* EMIT_PARAMS(pos, " %u", feature_count); */
+
+	feature_count = 1; /* One of passthrough|writeback|writethrough is always set. */
+
 	if (seg->flags & DM_CACHE_FEATURE_METADATA2)
-		EMIT_PARAMS(pos, " 2 metadata2 ");
-	else
-		EMIT_PARAMS(pos, " 1 ");
+		feature_count++;
+
+	EMIT_PARAMS(pos, " %u", feature_count);
+
+	if (seg->flags & DM_CACHE_FEATURE_METADATA2)
+		EMIT_PARAMS(pos, " metadata2");

 	if (seg->flags & DM_CACHE_FEATURE_PASSTHROUGH)
-		EMIT_PARAMS(pos, "passthrough");
+		EMIT_PARAMS(pos, " passthrough");
        else if (seg->flags & DM_CACHE_FEATURE_WRITEBACK)
-		EMIT_PARAMS(pos, "writeback");
+		EMIT_PARAMS(pos, " writeback");
 	else
-		EMIT_PARAMS(pos, "writethrough");
+		EMIT_PARAMS(pos, " writethrough");

 	/* Cache Policy */
 	name = seg->policy_name ? : "default";

 	EMIT_PARAMS(pos, " %s", name);

-	EMIT_PARAMS(pos, " %u", seg->policy_argc * 2);
+	/* Do not pass migration_threshold 2048 which is default */
+	EMIT_PARAMS(pos, " %u", (seg->policy_argc + ((seg->migration_threshold != 2048) ? 1 : 0)) * 2);
+	if (seg->migration_threshold != 2048)
+		EMIT_PARAMS(pos, " migration_threshold %u", seg->migration_threshold);
 	if (seg->policy_settings)
 		for (cn = seg->policy_settings->child; cn; cn = cn->sib)
-			EMIT_PARAMS(pos, " %s %" PRIu64, cn->key, cn->v->v.i);
+			if (cn->v) /* Skip deleted entry */
+				EMIT_PARAMS(pos, " %s %" PRIu64, cn->key, cn->v->v.i);
+
+	return 1;
+}
+
+static int _writecache_emit_segment_line(struct dm_task *dmt,
+				    struct load_segment *seg,
+				    char *params, size_t paramsize)
+{
+	int pos = 0;
+	int count = 0;
+	uint32_t block_size;
+	char origin_dev[DM_FORMAT_DEV_BUFSIZE];
+	char cache_dev[DM_FORMAT_DEV_BUFSIZE];
+
+	if (!_build_dev_string(origin_dev, sizeof(origin_dev), seg->origin))
+		return_0;
+
+	if (!_build_dev_string(cache_dev, sizeof(cache_dev), seg->writecache_node))
+		return_0;
+
+	if (seg->writecache_settings.high_watermark_set)
+		count += 2;
+	if (seg->writecache_settings.low_watermark_set)
+		count += 2;
+	if (seg->writecache_settings.writeback_jobs_set)
+		count += 2;
+	if (seg->writecache_settings.autocommit_blocks_set)
+		count += 2;
+	if (seg->writecache_settings.autocommit_time_set)
+		count += 2;
+	if (seg->writecache_settings.fua_set)
+		count += 1;
+	if (seg->writecache_settings.nofua_set)
+		count += 1;
+	if (seg->writecache_settings.new_key)
+		count += 2;
+
+	if (!(block_size = seg->writecache_block_size))
+		block_size = 4096;
+
+	EMIT_PARAMS(pos, "%s %s %s %u %d",
+		    seg->writecache_pmem ? "p" : "s",
+		    origin_dev, cache_dev, block_size, count);
+
+	if (seg->writecache_settings.high_watermark_set) {
+		EMIT_PARAMS(pos, " high_watermark %llu",
+			(unsigned long long)seg->writecache_settings.high_watermark);
+	}
+
+	if (seg->writecache_settings.low_watermark_set) {
+		EMIT_PARAMS(pos, " low_watermark %llu",
+			(unsigned long long)seg->writecache_settings.low_watermark);
+	}
+
+	if (seg->writecache_settings.writeback_jobs_set) {
+		EMIT_PARAMS(pos, " writeback_jobs %llu",
+			(unsigned long long)seg->writecache_settings.writeback_jobs);
+	}
+
+	if (seg->writecache_settings.autocommit_blocks_set) {
+		EMIT_PARAMS(pos, " autocommit_blocks %llu",
+			(unsigned long long)seg->writecache_settings.autocommit_blocks);
+	}
+
+	if (seg->writecache_settings.autocommit_time_set) {
+		EMIT_PARAMS(pos, " autocommit_time %llu",
+			(unsigned long long)seg->writecache_settings.autocommit_time);
+	}
+
+	if (seg->writecache_settings.fua_set) {
+		EMIT_PARAMS(pos, " fua");
+	}
+
+	if (seg->writecache_settings.nofua_set) {
+		EMIT_PARAMS(pos, " nofua");
+	}
+
+	if (seg->writecache_settings.new_key) {
+		EMIT_PARAMS(pos, " %s %s",
+			seg->writecache_settings.new_key,
+			seg->writecache_settings.new_val);
+	}

 	return 1;
 }
@@ -2640,20 +2748,21 @@ static int _vdo_emit_segment_line(struct dm_task *dmt,
 	/* Unlike normal targets, current VDO requires device path */
 	if (dm_snprintf(data_dev, sizeof(data_dev), "/dev/dm-%u", seg->vdo_data->info.minor) < 0) {
 		log_error("Can create VDO data volume path for %s.", data);
-		return_0;
+		return 0;
 	}

-	EMIT_PARAMS(pos, "%s %u %s " FMTu64 " " FMTu64 " %u on %s %s "
-		    "ack=%u,bio=%u,bioRotationInterval=%u,cpu=%u,hash=%u,logical=%u,physical=%u",
+	EMIT_PARAMS(pos, "V2 %s " FMTu64 " %u " FMTu64 " %u %s %s %s "
+		    "maxDiscard %u ack %u bio %u bioRotationInterval %u cpu %u hash %u logical %u physical %u",
 		    data_dev,
-		    (seg->vdo_params.emulate_512_sectors == 0) ? 4096 : 512,
-		    seg->vdo_params.use_read_cache ? "enabled" : "disabled",
-		    seg->vdo_params.read_cache_size_mb * UINT64_C(256),		// 1MiB -> 4KiB units
+		    seg->vdo_data_size / 8, // this parameter is in 4K units
+		    seg->vdo_params.minimum_io_size,
 		    seg->vdo_params.block_map_cache_size_mb * UINT64_C(256),	// 1MiB -> 4KiB units
-		    seg->vdo_params.block_map_period,
+		    seg->vdo_params.block_map_era_length,
+		    seg->vdo_params.use_metadata_hints ? "on" : "off",
 		    (seg->vdo_params.write_policy == DM_VDO_WRITE_POLICY_SYNC) ? "sync" :
 			(seg->vdo_params.write_policy == DM_VDO_WRITE_POLICY_ASYNC) ? "async" : "auto", // policy
 		    seg->vdo_name,
+		    seg->vdo_params.max_discard,
 		    seg->vdo_params.ack_threads,
 		    seg->vdo_params.bio_threads,
 		    seg->vdo_params.bio_rotation,
@@ -2776,6 +2885,10 @@ static int _emit_segment_line(struct dm_task *dmt, uint32_t major,
 		if (!_cache_emit_segment_line(dmt, seg, params, paramsize))
 			return_0;
 		break;
+	case SEG_WRITECACHE:
+		if (!_writecache_emit_segment_line(dmt, seg, params, paramsize))
+			return_0;
+		break;
 	}

 	switch(seg->type) {
@@ -2787,6 +2900,7 @@ static int _emit_segment_line(struct dm_task *dmt, uint32_t major,
 	case SEG_THIN_POOL:
 	case SEG_THIN:
 	case SEG_CACHE:
+	case SEG_WRITECACHE:
 		break;
 	case SEG_CRYPT:
 	case SEG_LINEAR:
@@ -3470,6 +3584,10 @@ int dm_tree_node_add_cache_target(struct dm_tree_node *node,
 				  const char *origin_uuid,
 				  const char *policy_name,
 				  const struct dm_config_node *policy_settings,
+				  uint64_t metadata_start,
+				  uint64_t metadata_len,
+				  uint64_t data_start,
+				  uint64_t data_len,
 				  uint32_t data_block_size)
 {
 	struct dm_config_node *cn;
@@ -3545,9 +3663,14 @@ int dm_tree_node_add_cache_target(struct dm_tree_node *node,
 	if (!_link_tree_nodes(node, seg->origin))
 		return_0;

+	seg->metadata_start = metadata_start;
+	seg->metadata_len = metadata_len;
+	seg->data_start = data_start;
+	seg->data_len = data_len;
 	seg->data_block_size = data_block_size;
 	seg->flags = feature_flags;
 	seg->policy_name = policy_name;
+	seg->migration_threshold = 2048; /* Default migration threshold 1MiB */

 	/* FIXME: better validation missing */
 	if (policy_settings) {
@@ -3560,10 +3683,58 @@ int dm_tree_node_add_cache_target(struct dm_tree_node *node,
 				log_error("Cache policy parameter %s is without integer value.", cn->key);
 				return 0;
 			}
-			seg->policy_argc++;
+			if (strcmp(cn->key, "migration_threshold") == 0) {
+				seg->migration_threshold = cn->v->v.i;
+				cn->v = NULL; /* skip this entry */
+			} else
+				seg->policy_argc++;
 		}
 	}

+	/* Always leave some throughput available for the cache to proceed */
+	if (seg->migration_threshold < data_block_size * 8)
+		seg->migration_threshold = data_block_size * 8;
+
+	return 1;
+}
+
+int dm_tree_node_add_writecache_target(struct dm_tree_node *node,
+				  uint64_t size,
+				  const char *origin_uuid,
+				  const char *cache_uuid,
+				  int pmem,
+				  uint32_t writecache_block_size,
+				  struct writecache_settings *settings)
+{
+	struct load_segment *seg;
+
+	if (!(seg = _add_segment(node, SEG_WRITECACHE, size)))
+		return_0;
+
+	seg->writecache_pmem = pmem;
+	seg->writecache_block_size = writecache_block_size;
+
+	if (!(seg->writecache_node = dm_tree_find_node_by_uuid(node->dtree, cache_uuid))) {
+		log_error("Missing writecache's cache uuid %s.", cache_uuid);
+		return 0;
+	}
+	if (!_link_tree_nodes(node, seg->writecache_node))
+		return_0;
+
+	if (!(seg->origin = dm_tree_find_node_by_uuid(node->dtree, origin_uuid))) {
+		log_error("Missing writecache's origin uuid %s.", origin_uuid);
+		return 0;
+	}
+	if (!_link_tree_nodes(node, seg->origin))
+		return_0;
+
+	memcpy(&seg->writecache_settings, settings, sizeof(struct writecache_settings));
+
+	if (settings->new_key && settings->new_val) {
+		seg->writecache_settings.new_key = dm_pool_strdup(node->dtree->mem, settings->new_key);
+		seg->writecache_settings.new_val = dm_pool_strdup(node->dtree->mem, settings->new_val);
+	}
+
 	return 1;
 }

@@ -4023,13 +4194,14 @@ int dm_tree_node_add_cache_target_base(struct dm_tree_node *node,

 	return dm_tree_node_add_cache_target(node, size, feature_flags & _mask,
 					     metadata_uuid, data_uuid, origin_uuid,
-					     policy_name, policy_settings, data_block_size);
+					     policy_name, policy_settings, 0, 0, 0, 0, data_block_size);
 }
 #endif

 int dm_tree_node_add_vdo_target(struct dm_tree_node *node,
 				uint64_t size,
 				const char *data_uuid,
+				uint64_t data_size,
 				const struct dm_vdo_target_params *vtp)
 {
 	struct load_segment *seg;
@@ -4050,6 +4222,7 @@ int dm_tree_node_add_vdo_target(struct dm_tree_node *node,

 	seg->vdo_params = *vtp;
 	seg->vdo_name = node->name;
+	seg->vdo_data_size = data_size;

 	node->props.send_messages = 2;

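The cache target changes above count two table words per policy key/value pair, add one more pair when migration_threshold differs from the default of 2048 sectors, and raise the threshold to at least eight data blocks. A minimal sketch of that arithmetic follows; the helper names `policy_word_count` and `clamp_migration_threshold` are hypothetical and do not exist in libdm.

```c
#include <stdint.h>

#define DEFAULT_MIGRATION_THRESHOLD 2048 /* sectors, i.e. 1MiB */

/* Number of policy words announced in the cache table line. */
static unsigned policy_word_count(unsigned policy_argc,
				  uint64_t migration_threshold)
{
	/* One extra key/value pair only when the threshold is non-default. */
	unsigned extra =
		(migration_threshold != DEFAULT_MIGRATION_THRESHOLD) ? 1 : 0;

	return (policy_argc + extra) * 2;
}

/* Keep the threshold at or above eight data blocks. */
static uint64_t clamp_migration_threshold(uint64_t threshold,
					  uint32_t data_block_size)
{
	uint64_t floor = (uint64_t)data_block_size * 8;

	return (threshold < floor) ? floor : threshold;
}
```

Computing the extra term in its own variable sidesteps the precedence trap between `+` and `?:`, since `a + b ? 1 : 0` parses as `(a + b) ? 1 : 0` in C.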
--- a/device_mapper/libdm-file.c
+++ b/device_mapper/libdm-file.c
@@ -222,6 +222,8 @@ retry_fcntl:
 		goto fail_close_unlink;
 	}

+	/* coverity[leaked_handle] intentional leak of fd handle here */
+
 	return 1;

 fail_close_unlink:
--- a/device_mapper/libdm-report.c
+++ b/device_mapper/libdm-report.c
@@ -2381,7 +2381,7 @@ static const char *_get_reserved(struct dm_report *rh, unsigned type,
 {
 	const struct dm_report_reserved_value *iter = implicit ? NULL : rh->reserved_values;
 	const struct dm_report_field_reserved_value *frv;
-	const char *tmp_begin, *tmp_end, *tmp_s = s;
+	const char *tmp_begin = NULL, *tmp_end = NULL, *tmp_s = s;
 	const char *name = NULL;
 	char c;

--- a/device_mapper/libdm-targets.c
+++ b/device_mapper/libdm-targets.c
@@ -346,6 +346,38 @@ bad:
 	return 0;
 }

+/*
+ * From linux/Documentation/device-mapper/writecache.txt
+ *
+ * Status:
+ * 1. error indicator - 0 if there was no error, otherwise error number
+ * 2. the number of blocks
+ * 3. the number of free blocks
+ * 4. the number of blocks under writeback
+ */
+
+int dm_get_status_writecache(struct dm_pool *mem, const char *params,
+			     struct dm_status_writecache **status)
+{
+	struct dm_status_writecache *s;
+
+	if (!(s = dm_pool_zalloc(mem, sizeof(struct dm_status_writecache))))
+		return_0;
+
+	if (sscanf(params, "%u %llu %llu %llu",
+		   &s->error,
+		   (unsigned long long *)&s->total_blocks,
+		   (unsigned long long *)&s->free_blocks,
+		   (unsigned long long *)&s->writeback_blocks) != 4) {
+		log_error("Failed to parse writecache params: %s.", params);
+		dm_pool_free(mem, s);
+		return 0;
+	}
+
+	*status = s;
+	return 1;
+}
+
 int parse_thin_pool_status(const char *params, struct dm_status_thin_pool *s)
 {
 	int pos;
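The writecache status parser above reads four leading fields: an error indicator, then the total, free and writeback block counts. The sketch below shows the same parse in standalone form, assuming a hypothetical `struct wc_status` in place of `struct dm_status_writecache`, and `wc_parse_status` as an illustrative name.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for struct dm_status_writecache. */
struct wc_status {
	uint32_t error;
	uint64_t total_blocks;
	uint64_t free_blocks;
	uint64_t writeback_blocks;
};

/* Returns 1 on success, 0 if the four leading fields cannot be parsed. */
static int wc_parse_status(const char *params, struct wc_status *s)
{
	unsigned error;
	unsigned long long total, free_blocks, writeback;

	if (sscanf(params, "%u %llu %llu %llu",
		   &error, &total, &free_blocks, &writeback) != 4)
		return 0;

	s->error = error;
	s->total_blocks = total;
	s->free_blocks = free_blocks;
	s->writeback_blocks = writeback;

	return 1;
}
```

Parsing into locals of the exact type each conversion expects is one way to avoid pointer casts between `uint64_t *` and `unsigned long long *`; it is a design choice of this sketch, not something the patch does.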
--- a/device_mapper/misc/dmlib.h
+++ b/device_mapper/misc/dmlib.h
@@ -21,11 +21,6 @@

 // FIXME: get rid of this whole file
 
-#include "configure.h"
-
-#define _REENTRANT
-#define _GNU_SOURCE
-
 #include "device_mapper/all.h"
 #include "lib/misc/util.h"
 #include "dm-logging.h"
--- a/device_mapper/misc/kdev_t.h
+++ b/device_mapper/misc/kdev_t.h
@@ -17,6 +17,6 @@

 #define MAJOR(dev)      ((dev & 0xfff00) >> 8)
 #define MINOR(dev)      ((dev & 0xff) | ((dev >> 12) & 0xfff00))
-#define MKDEV(ma,mi)    ((mi & 0xff) | (ma << 8) | ((mi & ~0xff) << 12))
+#define MKDEV(ma,mi)    (((dev_t)mi & 0xff) | ((dev_t)ma << 8) | (((dev_t)mi & ~0xff) << 12))

 #endif
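The MKDEV change casts both arguments to dev_t before shifting, so the shift is carried out in the width of dev_t instead of the width of the argument type. A small round-trip sketch, assuming a platform where dev_t is an unsigned type wider than 32 bits (as on Linux with glibc); the macros are copied here with extra parentheses around the arguments, and `make_dev` is a hypothetical wrapper.

```c
#include <sys/types.h>

#define MAJOR(dev)    (((dev) & 0xfff00) >> 8)
#define MINOR(dev)    (((dev) & 0xff) | (((dev) >> 12) & 0xfff00))
#define MKDEV(ma, mi) (((dev_t)(mi) & 0xff) | ((dev_t)(ma) << 8) | \
		       (((dev_t)(mi) & ~0xff) << 12))

/* Pack a major/minor pair into the kernel's dev_t layout. */
static dev_t make_dev(unsigned major, unsigned minor)
{
	return MKDEV(major, minor);
}
```

With a 20-bit minor such as 0xfffff, the upper minor bits land above bit 31 of a signed int once shifted by 12, which is the case the cast is meant to cover.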
--- a/device_mapper/mm/pool.c
+++ b/device_mapper/mm/pool.c
@@ -59,11 +59,13 @@ char *dm_pool_strdup(struct dm_pool *p, const char *str)

 char *dm_pool_strndup(struct dm_pool *p, const char *str, size_t n)
 {
+	size_t slen = strlen(str);
+	size_t len = (slen < n) ? slen : n;
 	char *ret = dm_pool_alloc(p, n + 1);

 	if (ret) {
-		strncpy(ret, str, n);
-		ret[n] = '\0';
+		ret[len] = '\0';
+		memcpy(ret, str, len);
 	}

 	return ret;
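The dm_pool_strndup change replaces strncpy with a length computed up front and a memcpy of exactly that many bytes. The sketch below shows the same idea with malloc instead of a dm_pool; `bounded_dup` is a hypothetical name, and unlike the patch it allocates `len + 1` bytes instead of `n + 1`.

```c
#include <stdlib.h>
#include <string.h>

/* Duplicate at most n bytes of str; the result is always NUL-terminated. */
static char *bounded_dup(const char *str, size_t n)
{
	size_t slen = strlen(str);
	size_t len = (slen < n) ? slen : n;
	char *ret = malloc(len + 1);

	if (ret) {
		memcpy(ret, str, len);
		ret[len] = '\0';
	}

	return ret;
}
```

Like the patched function, this assumes `str` is NUL-terminated, because `strlen` is called before the bound is applied.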
--- a/device_mapper/vdo/status.c
+++ b/device_mapper/vdo/status.c
@@ -1,8 +1,24 @@
-#include "configure.h"
-#include "target.h"
+/*
+ * Copyright (C) 2018 Red Hat, Inc. All rights reserved.
+ *
+ * This file is part of the device-mapper userspace tools.
+ *
+ * This copyrighted material is made available to anyone wishing to use,
+ * modify, copy, or redistribute it subject to the terms and conditions
+ * of the GNU Lesser General Public License v.2.1.
+ *
+ * You should have received a copy of the GNU Lesser General Public License
+ * along with this program; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */

-// For DM_ARRAY_SIZE!
+/* Note: this object is also used by the VDO dmeventd plugin for parsing status. */
+/* The plugin may include this file directly and use the original libdm library. */
+#ifndef LIB_DMEVENT_H
 #include "device_mapper/all.h"
+#endif
+
+#include "device_mapper/vdo/target.h"
 #include "base/memory/zalloc.h"

 #include <ctype.h>
@@ -154,7 +170,7 @@ static void _set_error(struct dm_vdo_status_parse_result *result, const char *fm
 	va_list ap;

 	va_start(ap, fmt);
-	vsnprintf(result->error, sizeof(result->error), fmt, ap);
+	(void) vsnprintf(result->error, sizeof(result->error), fmt, ap);
 	va_end(ap);
 }

@@ -184,7 +200,7 @@ static bool _parse_field(const char **b, const char *e,
 bool dm_vdo_status_parse(struct dm_pool *mem, const char *input,
 			 struct dm_vdo_status_parse_result *result)
 {
-	const char *b = b = input;
+	const char *b = input;
 	const char *e = input + strlen(input);
 	const char *te;
 	struct dm_vdo_status *s;
@@ -203,11 +219,10 @@ bool dm_vdo_status_parse(struct dm_pool *mem, const char *input,
 		goto bad;
 	}

-	if (!(s->device = (!mem) ? malloc((e - b) + 1) : dm_pool_alloc(mem, (e - b) + 1))) {
+	if (!(s->device = (!mem) ? strndup(b, (te - b)) : dm_pool_strndup(mem, b, (te - b)))) {
 		_set_error(result, "out of memory");
 		goto bad;
 	}
-	dm_strncpy(s->device, b, te - b + 1);

 	b = _eat_space(te, e);

--- a/device_mapper/vdo/target.h
+++ b/device_mapper/vdo/target.h
@@ -74,16 +74,16 @@ enum dm_vdo_write_policy {

 // FIXME: review whether we should use the createParams from the userlib
 struct dm_vdo_target_params {
+	uint32_t minimum_io_size;
 	uint32_t block_map_cache_size_mb;
-	uint32_t block_map_period;
+	uint32_t block_map_era_length;	// format period

 	uint32_t check_point_frequency;
-	uint32_t index_memory_size_mb;
+	uint32_t index_memory_size_mb;  // format

-	uint32_t read_cache_size_mb;
-
-	uint32_t slab_size_mb;
+	uint32_t slab_size_mb;          // format

+	uint32_t max_discard;
 	// threads
 	uint32_t ack_threads;
 	uint32_t bio_threads;
@@ -95,9 +95,8 @@ struct dm_vdo_target_params {

 	bool use_compression;
 	bool use_deduplication;
-	bool emulate_512_sectors;
-	bool use_sparse_index;
-	bool use_read_cache;
+	bool use_metadata_hints;
+	bool use_sparse_index;          // format

 	// write policy
 	enum dm_vdo_write_policy write_policy;
--- a/device_mapper/vdo/vdo_limits.h
+++ b/device_mapper/vdo/vdo_limits.h
@@ -21,8 +21,8 @@
 #define DM_VDO_BLOCK_MAP_CACHE_SIZE_MAXIMUM_MB	(16 * 1024 * 1024 - 1)	// 16TiB - 1
 #define DM_VDO_BLOCK_MAP_CACHE_SIZE_MINIMUM_PER_LOGICAL_THREAD  (4096 * DM_VDO_BLOCK_SIZE_KB)

-#define DM_VDO_BLOCK_MAP_PERIOD_MINIMUM		1
-#define DM_VDO_BLOCK_MAP_PERIOD_MAXIMUM		(16380)
+#define DM_VDO_BLOCK_MAP_ERA_LENGTH_MINIMUM	(1)
+#define DM_VDO_BLOCK_MAP_ERA_LENGTH_MAXIMUM	(16380)

 #define DM_VDO_INDEX_MEMORY_SIZE_MINIMUM_MB	(256)			// 0.25 GiB
 #define DM_VDO_INDEX_MEMORY_SIZE_MAXIMUM_MB	(1024 * 1024 * 1024)	// 1TiB
@@ -57,4 +57,7 @@
 //#define DM_VDO_PHYSICAL_THREADS_MINIMUM	(0)
 #define DM_VDO_PHYSICAL_THREADS_MAXIMUM		(16)

+#define DM_VDO_MAX_DISCARD_MINIMUM		(1)
+#define DM_VDO_MAX_DISCARD_MAXIMUM		(UINT32_MAX / 4096)
+
 #endif // DEVICE_MAPPER_VDO_LIMITS_H
--- a/device_mapper/vdo/vdo_target.c
+++ b/device_mapper/vdo/vdo_target.c
@@ -23,6 +23,13 @@ bool dm_vdo_validate_target_params(const struct dm_vdo_target_params *vtp,
 {
 	bool valid = true;

+	if ((vtp->minimum_io_size != 512) &&
+	    (vtp->minimum_io_size != 4096)) {
+		log_error("VDO minimum io size %u is unsupported.",
+			  vtp->minimum_io_size);
+		valid = false;
+	}
+
 	if ((vtp->block_map_cache_size_mb < DM_VDO_BLOCK_MAP_CACHE_SIZE_MINIMUM_MB) ||
 	    (vtp->block_map_cache_size_mb > DM_VDO_BLOCK_MAP_CACHE_SIZE_MAXIMUM_MB)) {
 		log_error("VDO block map cache size %u out of range.",
@@ -37,12 +44,6 @@ bool dm_vdo_validate_target_params(const struct dm_vdo_target_params *vtp,
 		valid = false;
 	}

-	if (vtp->read_cache_size_mb > DM_VDO_READ_CACHE_SIZE_MAXIMUM_MB) {
-		log_error("VDO read cache size %u out of range.",
-			  vtp->read_cache_size_mb);
-		valid = false;
-	}
-
 	if ((vtp->slab_size_mb < DM_VDO_SLAB_SIZE_MINIMUM_MB) ||
 	    (vtp->slab_size_mb > DM_VDO_SLAB_SIZE_MAXIMUM_MB)) {
 		log_error("VDO slab size %u out of range.",
@@ -50,6 +51,13 @@ bool dm_vdo_validate_target_params(const struct dm_vdo_target_params *vtp,
 		valid = false;
 	}

+	if ((vtp->max_discard < DM_VDO_MAX_DISCARD_MINIMUM) ||
+	    (vtp->max_discard > DM_VDO_MAX_DISCARD_MAXIMUM)) {
+		log_error("VDO max discard %u out of range.",
+			  vtp->max_discard);
+		valid = false;
+	}
+
 	if (vtp->ack_threads > DM_VDO_ACK_THREADS_MAXIMUM) {
 		log_error("VDO ack threads %u out of range.", vtp->ack_threads);
 		valid = false;
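The two checks added to dm_vdo_validate_target_params follow the function's existing pattern: every failed check clears `valid`, and the function keeps going so that all problems are reported in one pass. A reduced sketch of just the new checks, with a hypothetical `vdo_params_ok` helper and local copies of the limit macros:

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_DISCARD_MINIMUM (1)
#define MAX_DISCARD_MAXIMUM (UINT32_MAX / 4096)

/* Returns true only if both values fall inside the accepted ranges. */
static bool vdo_params_ok(uint32_t minimum_io_size, uint32_t max_discard)
{
	bool valid = true;

	/* Only 512-byte and 4KiB minimum I/O sizes are accepted. */
	if ((minimum_io_size != 512) && (minimum_io_size != 4096))
		valid = false;

	if ((max_discard < MAX_DISCARD_MINIMUM) ||
	    (max_discard > MAX_DISCARD_MAXIMUM))
		valid = false;

	return valid;
}
```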
--- a/include/Makefile.in
+++ b/include/Makefile.in
@@ -18,7 +18,7 @@ top_builddir = @top_builddir@

 include $(top_builddir)/make.tmpl

-DISTCLEAN_TARGETS += .configure.h lvm-version.h
+DISTCLEAN_TARGETS += configure.h lvm-version.h
 CLEAN_TARGETS += \
 .symlinks \
 .symlinks_created \
--- a/include/configure.h.in
+++ b/include/configure.h.in
@@ -45,9 +45,6 @@
 /* Name of default metadata cache subdirectory. */
 #undef DEFAULT_CACHE_SUBDIR

-/* Default data alignment. */
-#undef DEFAULT_DATA_ALIGNMENT
-
 /* Define default node creation behavior with dmsetup create */
 #undef DEFAULT_DM_ADD_NODE

@@ -669,6 +666,15 @@
 /* Define to 1 to include built-in support for vdo. */
 #undef VDO_INTERNAL

+/* Define to 1 to include built-in support for writecache. */
+#undef WRITECACHE_INTERNAL
+
+/* Define to get access to GNU/Linux extension */
+#undef _GNU_SOURCE
+
+/* Define to use re-entrant thread safe versions */
+#undef _REENTRANT
+
 /* Define for Solaris 2.5.1 so the uint32_t typedef from <sys/synch.h>,
   <pthread.h>, or <semaphore.h> is not used. If the typedef were allowed, the
   #define below would cause a syntax error. */
--- a/lib/Makefile.in
+++ b/lib/Makefile.in
@@ -19,6 +19,7 @@ top_builddir = @top_builddir@
 SOURCES =\
 	activate/activate.c \
 	cache/lvmcache.c \
+	writecache/writecache.c \
 	cache_segtype/cache.c \
 	commands/toolcontext.c \
 	config/config.c \
@@ -60,6 +61,7 @@ SOURCES =\
 	format_text/text_label.c \
 	freeseg/freeseg.c \
 	label/label.c \
+	label/hints.c \
 	locking/file_locking.c \
 	locking/locking.c \
 	log/log.c \
@@ -90,6 +92,7 @@ SOURCES =\
 	misc/lvm-string.c \
 	misc/lvm-wrappers.c \
 	misc/lvm-percent.c \
+	misc/sharedlib.c \
 	mm/memlock.c \
 	notify/lvmnotify.c \
 	properties/prop_common.c \
@@ -108,10 +111,6 @@ ifeq ("@DEVMAPPER@", "yes")
 	activate/fs.c
 endif

-ifeq ("@HAVE_LIBDL@", "yes")
-  SOURCES += misc/sharedlib.c
-endif
-
 ifeq ("@BUILD_LVMPOLLD@", "yes")
  SOURCES +=\
 	lvmpolld/lvmpolld-client.c
@@ -129,12 +128,6 @@ endif
 LIB_NAME = liblvm-internal
 LIB_STATIC = $(LIB_NAME).a

-ifeq ($(MAKECMDGOALS),distclean)
-  SUBDIRS =\
-	notify \
-	locking
-endif
-
 CFLOW_LIST = $(SOURCES)
 CFLOW_LIST_TARGET = $(LIB_NAME).cflow

--- a/lib/activate/activate.c
+++ b/lib/activate/activate.c
@@ -28,7 +28,6 @@
 #include "lib/config/config.h"
 #include "lib/metadata/segtype.h"
 #include "lib/misc/sharedlib.h"
-#include "lib/cache/lvmcache.h"
 #include "lib/metadata/metadata.h"

 #include <limits.h>
@@ -1174,6 +1173,26 @@ out:
 	return r;
 }

+int lv_writecache_message(const struct logical_volume *lv, const char *msg)
+{
+	int r = 0;
+	struct dev_manager *dm;
+
+	if (!lv_info(lv->vg->cmd, lv, 0, NULL, 0, 0)) {
+		log_error("Unable to send message to an inactive logical volume.");
+		return 0;
+	}
+
+	if (!(dm = dev_manager_create(lv->vg->cmd, lv->vg->name, 1)))
+		return_0;
+
+	r = dev_manager_writecache_message(dm, lv, msg);
+
+	dev_manager_destroy(dm);
+
+	return r;
+}
+
 /*
 * Return dm_status_cache for cache volume, accept also cache pool
 *
@@ -1340,8 +1359,6 @@ int lv_vdo_pool_status(const struct logical_volume *lv, int flush,
 {
 	int r = 0;
 	struct dev_manager *dm;
-	struct lv_status_vdo *status;
-	char *params;

 	if (!lv_info(lv->vg->cmd, lv, 0, NULL, 0, 0))
 		return 0;
@@ -1352,14 +1369,10 @@ int lv_vdo_pool_status(const struct logical_volume *lv, int flush,
 	if (!(dm = dev_manager_create(lv->vg->cmd, lv->vg->name, !lv_is_pvmove(lv))))
 		return_0;

-	if (!dev_manager_vdo_pool_status(dm, lv, flush, &params, &status))
+	if (!dev_manager_vdo_pool_status(dm, lv, vdo_status, flush))
 		goto_out;

-	if (!parse_vdo_pool_status(status->mem, lv, params, status))
-		goto_out;
-
-	/* User is responsible to dm_pool_destroy memory pool! */
-	*vdo_status = status;
+	/* User has to call dm_pool_destroy(vdo_status->mem) */
 	r = 1;
 out:
 	if (!r)
@@ -1368,6 +1381,19 @@ out:
 	return r;
 }

+int lv_vdo_pool_percent(const struct logical_volume *lv, dm_percent_t *percent)
+{
+	struct lv_status_vdo *vdo_status;
+
+	if (!lv_vdo_pool_status(lv, 0, &vdo_status))
+		return_0;
+
+	*percent = vdo_status->usage;
+	dm_pool_destroy(vdo_status->mem);
+
+	return 1;
+}
+
 static int _lv_active(struct cmd_context *cmd, const struct logical_volume *lv)
 {
 	struct lvinfo info;
@@ -2032,8 +2058,6 @@ static int _lv_suspend(struct cmd_context *cmd, const char *lvid_s,
 	               const struct logical_volume *lv, const struct logical_volume *lv_pre)
 {
 	const struct logical_volume *pvmove_lv = NULL;
-	const struct logical_volume *lv_to_free = NULL;
-	const struct logical_volume *lv_pre_to_free = NULL;
 	struct logical_volume *lv_pre_tmp, *lv_tmp;
 	struct seg_list *sl;
 	struct lv_segment *snap_seg;
@@ -2224,10 +2248,6 @@ static int _lv_suspend(struct cmd_context *cmd, const char *lvid_s,
 out:
 	if (mem)
 		dm_pool_destroy(mem);
-	if (lv_pre_to_free)
-		release_vg(lv_pre_to_free->vg);
-	if (lv_to_free)
-		release_vg(lv_to_free->vg);

 	return r;
 }
@@ -2388,7 +2408,6 @@ static int _lv_has_open_snapshots(const struct logical_volume *lv)

 int lv_deactivate(struct cmd_context *cmd, const char *lvid_s, const struct logical_volume *lv)
 {
-	const struct logical_volume *lv_to_free = NULL;
 	struct lvinfo info;
 	static const struct lv_activate_opts laopts = { .skip_in_use = 1 };
 	struct dm_list *snh;
@@ -2458,27 +2477,25 @@ int lv_deactivate(struct cmd_context *cmd, const char *lvid_s, const struct logi
 		r = 0;
 	}
 out:
-	if (lv_to_free)
-		release_vg(lv_to_free->vg);

 	return r;
 }

 /* Test if LV passes filter */
 int lv_activation_filter(struct cmd_context *cmd, const char *lvid_s,
-			 int *activate_lv, const struct logical_volume *lv)
+			 int *activate, const struct logical_volume *lv)
 {
 	if (!activation()) {
-		*activate_lv = 1;
+		*activate = 1;
 		return 1;
 	}

 	if (!_passes_activation_filter(cmd, lv)) {
 		log_verbose("Not activating %s since it does not pass "
 			    "activation filter.", display_lvname(lv));
-		*activate_lv = 0;
+		*activate = 0;
 	} else
-		*activate_lv = 1;
+		*activate = 1;

 	return 1;
 }
@@ -2522,6 +2539,12 @@ static int _lv_activate(struct cmd_context *cmd, const char *lvid_s,
 		goto out;
 	}

+	if (lv_raid_has_visible_sublvs(lv)) {
+		log_error("Refusing activation of RAID LV %s with "
+			  "visible SubLVs.", display_lvname(lv));
+		goto out;
+	}
+
 	if (test_mode()) {
 		_skip("Activating %s.", display_lvname(lv));
 		r = 1;
--- a/lib/activate/activate.h
+++ b/lib/activate/activate.h
@@ -38,6 +38,7 @@ typedef enum {
 	SEG_STATUS_THIN,
 	SEG_STATUS_THIN_POOL,
 	SEG_STATUS_VDO_POOL,
+	SEG_STATUS_WRITECACHE,
 	SEG_STATUS_UNKNOWN
 } lv_seg_status_type_t;

@@ -51,6 +52,7 @@ struct lv_seg_status {
 		struct dm_status_snapshot *snapshot;
 		struct dm_status_thin *thin;
 		struct dm_status_thin_pool *thin_pool;
+		struct dm_status_writecache *writecache;
 		struct lv_status_vdo vdo_pool;
 	};
 };
@@ -160,10 +162,10 @@ int lv_info_with_seg_status(struct cmd_context *cmd,
 int lv_check_not_in_use(const struct logical_volume *lv, int error_if_used);

 /*
- * Returns 1 if activate_lv has been set: 1 = activate; 0 = don't.
+ * Returns 1 if activate has been set: 1 = activate; 0 = don't.
 */
 int lv_activation_filter(struct cmd_context *cmd, const char *lvid_s,
-			 int *activate_lv, const struct logical_volume *lv);
+			 int *activate, const struct logical_volume *lv);
 /*
 * Checks against the auto_activation_volume_list and
 * returns 1 if the LV should be activated, 0 otherwise.
@@ -184,6 +186,7 @@ int lv_raid_dev_health(const struct logical_volume *lv, char **dev_health);
 int lv_raid_mismatch_count(const struct logical_volume *lv, uint64_t *cnt);
 int lv_raid_sync_action(const struct logical_volume *lv, char **sync_action);
 int lv_raid_message(const struct logical_volume *lv, const char *msg);
+int lv_writecache_message(const struct logical_volume *lv, const char *msg);
 int lv_cache_status(const struct logical_volume *cache_lv,
 		    struct lv_status_cache **status);
 int lv_thin_pool_percent(const struct logical_volume *lv, int metadata,
@@ -195,6 +198,7 @@ int lv_thin_pool_transaction_id(const struct logical_volume *lv,
 int lv_thin_device_id(const struct logical_volume *lv, uint32_t *device_id);
 int lv_vdo_pool_status(const struct logical_volume *lv, int flush,
 		       struct lv_status_vdo **status);
+int lv_vdo_pool_percent(const struct logical_volume *lv, dm_percent_t *percent);

 /*
 * Return number of LVs in the VG that are active.
@@ -255,6 +259,7 @@ int device_is_usable(struct device *dev, struct dev_usable_check_params check);
 void fs_unlock(void);

 #define TARGET_NAME_CACHE "cache"
+#define TARGET_NAME_WRITECACHE "writecache"
 #define TARGET_NAME_ERROR "error"
 #define TARGET_NAME_ERROR_OLD "erro"	/* Truncated in older kernels */
 #define TARGET_NAME_LINEAR "linear"
@@ -271,6 +276,7 @@ void fs_unlock(void);

 #define MODULE_NAME_CLUSTERED_MIRROR "clog"
 #define MODULE_NAME_CACHE TARGET_NAME_CACHE
+#define MODULE_NAME_WRITECACHE TARGET_NAME_WRITECACHE
 #define MODULE_NAME_ERROR TARGET_NAME_ERROR
 #define MODULE_NAME_LOG_CLUSTERED "log-clustered"
 #define MODULE_NAME_LOG_USERSPACE "log-userspace"
--- a/lib/activate/dev_manager.c
+++ b/lib/activate/dev_manager.c
@@ -213,6 +213,10 @@ static int _get_segment_status_from_target_params(const char *target_name,
 		if (!parse_vdo_pool_status(seg_status->mem, seg->lv, params, &seg_status->vdo_pool))
 			return_0;
 		seg_status->type = SEG_STATUS_VDO_POOL;
+	} else if (segtype_is_writecache(segtype)) {
+		if (!dm_get_status_writecache(seg_status->mem, params, &(seg_status->writecache)))
+			return_0;
+		seg_status->type = SEG_STATUS_WRITECACHE;
 	} else
 		/*
 		 * TODO: Add support for other segment types too!
@@ -374,7 +378,7 @@ static int _ignore_blocked_mirror_devices(struct device *dev,
 			if (!(tmp_dev = dev_create_file(buf, NULL, NULL, 0)))
 				goto_out;

-			tmp_dev->dev = MKDEV((dev_t)sm->logs[0].major, (dev_t)sm->logs[0].minor);
+			tmp_dev->dev = MKDEV(sm->logs[0].major, sm->logs[0].minor);
 			if (device_is_usable(tmp_dev, (struct dev_usable_check_params)
 					     { .check_empty = 1,
 					       .check_blocked = 1,
@@ -827,6 +831,113 @@ static int _info(struct cmd_context *cmd,
 	return 1;
 }

+/* FIXME: could we just use dev_manager_info instead of this? */
+
+int get_cache_single_meta_data(struct cmd_context *cmd,
+				    struct logical_volume *lv,
+				    struct logical_volume *pool_lv,
+				    struct dm_info *info_meta, struct dm_info *info_data)
+{
+	struct lv_segment *lvseg = first_seg(lv);
+	union lvid lvid_meta;
+	union lvid lvid_data;
+	char *name_meta;
+	char *name_data;
+	char *dlid_meta;
+	char *dlid_data;
+
+	memset(&lvid_meta, 0, sizeof(lvid_meta));
+	memset(&lvid_data, 0, sizeof(lvid_data));
+	memcpy(&lvid_meta.id[0], &lv->vg->id, sizeof(struct id));
+	memcpy(&lvid_meta.id[1], &lvseg->metadata_id, sizeof(struct id));
+	memcpy(&lvid_data.id[0], &lv->vg->id, sizeof(struct id));
+	memcpy(&lvid_data.id[1], &lvseg->data_id, sizeof(struct id));
+
+	if (!(dlid_meta = dm_build_dm_uuid(cmd->mem, UUID_PREFIX, (const char *)&lvid_meta.s, NULL)))
+		return_0;
+	if (!(dlid_data = dm_build_dm_uuid(cmd->mem, UUID_PREFIX, (const char *)&lvid_data.s, NULL)))
+		return_0;
+	if (!(name_meta = dm_build_dm_name(cmd->mem, lv->vg->name, pool_lv->name, "_cmeta")))
+		return_0;
+	if (!(name_data = dm_build_dm_name(cmd->mem, lv->vg->name, pool_lv->name, "_cdata")))
+		return_0;
+
+	if (!_info(cmd, name_meta, dlid_meta, 1, 0, info_meta, NULL, NULL))
+		return_0;
+
+	if (!_info(cmd, name_data, dlid_data, 1, 0, info_data, NULL, NULL))
+		return_0;
+
+	return 1;
+}
+
+/*
+ * FIXME: isn't there a simpler, more direct way to just remove these two dm
+ * devs?
+ */
+
+int remove_cache_single_meta_data(struct cmd_context *cmd,
+				       struct dm_info *info_meta, struct dm_info *info_data)
+{
+	struct dm_tree *dtree;
+	struct dm_tree_node *root;
+	struct dm_tree_node *child;
+	const char *uuid;
+	void *handle = NULL;
+
+	if (!(dtree = dm_tree_create()))
+		goto_out;
+
+	if (!dm_tree_add_dev(dtree, info_meta->major, info_meta->minor))
+		goto_out;
+
+	if (!dm_tree_add_dev(dtree, info_data->major, info_data->minor))
+		goto_out;
+
+	if (!(root = dm_tree_find_node(dtree, 0, 0)))
+		goto_out;
+
+	while ((child = dm_tree_next_child(&handle, root, 0))) {
+		if (!(uuid = dm_tree_node_get_uuid(child))) {
+			stack;
+			continue;
+		}
+
+		if (!dm_tree_deactivate_children(root, uuid, strlen(uuid))) {
+			stack;
+			continue;
+		}
+	}
+
+	dm_tree_free(dtree);
+	return 1;
+ out:
+	dm_tree_free(dtree);
+	return 0;
+}
+
+int dev_manager_remove_dm_major_minor(uint32_t major, uint32_t minor)
+{
+	struct dm_task *dmt;
+	int r = 0;
+
+	log_verbose("Removing dm dev %u:%u", major, minor);
+
+	if (!(dmt = dm_task_create(DM_DEVICE_REMOVE)))
+		return_0;
+
+	if (!dm_task_set_major(dmt, major) || !dm_task_set_minor(dmt, minor)) {
+		log_error("Failed to set device number for remove %u:%u", major, minor);
+		goto out;
+	}
+
+	r = dm_task_run(dmt);
+out:
+	dm_task_destroy(dmt);
+
+	return r;
+}
+
 static int _info_by_dev(uint32_t major, uint32_t minor, struct dm_info *info)
 {
 	return _info_run(NULL, info, NULL, 0, 0, 0, major, minor);
@@ -1450,6 +1561,40 @@ out:
 	return r;
 }

+int dev_manager_writecache_message(struct dev_manager *dm,
+				   const struct logical_volume *lv,
+				   const char *msg)
+{
+	int r = 0;
+	const char *dlid;
+	struct dm_task *dmt;
+	const char *layer = lv_layer(lv);
+
+	if (!lv_is_writecache(lv)) {
+		log_error(INTERNAL_ERROR "%s is not a writecache logical volume.",
+			  display_lvname(lv));
+		return 0;
+	}
+
+	if (!(dlid = build_dm_uuid(dm->mem, lv, layer)))
+		return_0;
+
+	if (!(dmt = _setup_task_run(DM_DEVICE_TARGET_MSG, NULL, NULL, dlid, 0, 0, 0, 0, 1, 0)))
+		return_0;
+
+	if (!dm_task_set_message(dmt, msg))
+		goto_out;
+
+	if (!dm_task_run(dmt))
+		goto_out;
+
+	r = 1;
+out:
+	dm_task_destroy(dmt);
+
+	return r;
+}
+
 int dev_manager_cache_status(struct dev_manager *dm,
 			     const struct logical_volume *lv,
 			     struct lv_status_cache **status)
@@ -1647,9 +1792,8 @@ out:

 int dev_manager_vdo_pool_status(struct dev_manager *dm,
 				const struct logical_volume *lv,
-				int flush,
-				char **vdo_params,
-				struct lv_status_vdo **vdo_status)
+				struct lv_status_vdo **vdo_status,
+				int flush)
 {
 	struct lv_status_vdo *status;
 	const char *dlid;
@@ -1660,7 +1804,6 @@ int dev_manager_vdo_pool_status(struct dev_manager *dm,
 	char *params = NULL;
 	int r = 0;

-	*vdo_params = NULL;
 	*vdo_status = NULL;

 	if (!(status = dm_pool_zalloc(dm->mem, sizeof(struct lv_status_vdo)))) {
@@ -1689,13 +1832,11 @@ int dev_manager_vdo_pool_status(struct dev_manager *dm,
 		goto out;
 	}

-	if (!(*vdo_params = dm_pool_strdup(dm->mem, params))) {
-		log_error("Cannot duplicate VDO status params.");
-		goto out;
-	}
+	if (!parse_vdo_pool_status(dm->mem, lv, params, status))
+		goto_out;

 	status->mem = dm->mem;
-	*vdo_status =  status;
+	*vdo_status = status;

 	r = 1;
 out:
@@ -1916,7 +2057,8 @@ static int _check_holder(struct dev_manager *dm, struct dm_tree *dtree,

 		if (!strncmp(uuid, (char*)&lv->vg->id, sizeof(lv->vg->id)) &&
 		    !dm_tree_find_node_by_uuid(dtree, uuid)) {
-			dm_strncpy((char*)&id, uuid, 2 * sizeof(struct id) + 1);
+			/* trims any UUID suffix (e.g. -cow) */
+			(void) dm_strncpy((char*)&id, uuid, 2 * sizeof(struct id) + 1);

 			/* If UUID is not yet in dtree, look for matching LV */
 			if (!(lv_det = find_lv_in_vg_by_lvid(lv->vg, &id))) {
@@ -2232,6 +2374,10 @@ static int _pool_register_callback(struct dev_manager *dm,
 		return 1;
 #endif

+	/* Skip for single-device cache pool */
+	if (lv_is_cache(lv) && lv_is_cache_single(first_seg(lv)->pool_lv))
+		return 1;
+
 	if (!(data = dm_pool_zalloc(dm->mem, sizeof(*data)))) {
 		log_error("Failed to allocated path for callback.");
 		return 0;
@@ -2299,6 +2445,53 @@ static int _add_lv_to_dtree(struct dev_manager *dm, struct dm_tree *dtree,
 		/* Unused cache pool is activated as metadata */
 	}

+	if (lv_is_cache(lv) && lv_is_cache_single(first_seg(lv)->pool_lv) && dm->activation) {
+		struct logical_volume *pool_lv = first_seg(lv)->pool_lv;
+		struct lv_segment *lvseg = first_seg(lv);
+		struct dm_info info_meta;
+		struct dm_info info_data;
+		union lvid lvid_meta;
+		union lvid lvid_data;
+		char *name_meta;
+		char *name_data;
+		char *dlid_meta;
+		char *dlid_data;
+
+		memset(&lvid_meta, 0, sizeof(lvid_meta));
+		memset(&lvid_data, 0, sizeof(lvid_data));
+		memcpy(&lvid_meta.id[0], &lv->vg->id, sizeof(struct id));
+		memcpy(&lvid_meta.id[1], &lvseg->metadata_id, sizeof(struct id));
+		memcpy(&lvid_data.id[0], &lv->vg->id, sizeof(struct id));
+		memcpy(&lvid_data.id[1], &lvseg->data_id, sizeof(struct id));
+
+		if (!(dlid_meta = dm_build_dm_uuid(dm->mem, UUID_PREFIX, (const char *)&lvid_meta.s, NULL)))
+			return_0;
+		if (!(dlid_data = dm_build_dm_uuid(dm->mem, UUID_PREFIX, (const char *)&lvid_data.s, NULL)))
+			return_0;
+		if (!(name_meta = dm_build_dm_name(dm->mem, lv->vg->name, pool_lv->name, "_cmeta")))
+			return_0;
+		if (!(name_data = dm_build_dm_name(dm->mem, lv->vg->name, pool_lv->name, "_cdata")))
+			return_0;
+
+		if (!_info(dm->cmd, name_meta, dlid_meta, 1, 0, &info_meta, NULL, NULL))
+			return_0;
+
+		if (!_info(dm->cmd, name_data, dlid_data, 1, 0, &info_data, NULL, NULL))
+			return_0;
+
+		if (info_meta.exists &&
+		    !dm_tree_add_dev_with_udev_flags(dtree, info_meta.major, info_meta.minor,
+						     _get_udev_flags(dm, lv, NULL, 0, 0, 0))) {
+			log_error("Failed to add device (%" PRIu32 ":%" PRIu32") to dtree.", info_meta.major, info_meta.minor);
+		}
+
+		if (info_data.exists &&
+		    !dm_tree_add_dev_with_udev_flags(dtree, info_data.major, info_data.minor,
+						     _get_udev_flags(dm, lv, NULL, 0, 0, 0))) {
+			log_error("Failed to add device (%" PRIu32 ":%" PRIu32") to dtree.", info_data.major, info_data.minor);
+		}
+	}
+
 	if (!origin_only && !_add_dev_to_dtree(dm, dtree, lv, NULL))
 		return_0;

@@ -2439,8 +2632,12 @@ static int _add_lv_to_dtree(struct dev_manager *dm, struct dm_tree *dtree,
 		if (seg->metadata_lv &&
 		    !_add_lv_to_dtree(dm, dtree, seg->metadata_lv, 0))
 			return_0;
+		if (seg->writecache && seg_is_writecache(seg)) {
+			if (!_add_lv_to_dtree(dm, dtree, seg->writecache, dm->activation ? origin_only : 1))
+				return_0;
+		}
 		if (seg->pool_lv &&
-		    (lv_is_cache_pool(seg->pool_lv) || dm->track_external_lv_deps) &&
+		    (lv_is_cache_pool(seg->pool_lv) || lv_is_cache_single(seg->pool_lv) || dm->track_external_lv_deps) &&
 		    /* When activating and not origin_only detect linear 'overlay' over pool */
 		    !_add_lv_to_dtree(dm, dtree, seg->pool_lv, dm->activation ? origin_only : 1))
 			return_0;
@@ -2891,6 +3088,11 @@ static int _add_segment_to_dtree(struct dev_manager *dm,
 				  lv_layer(seg->pool_lv)))
 		return_0;

+	if (seg->writecache && !laopts->origin_only &&
+	    !_add_new_lv_to_dtree(dm, dtree, seg->writecache, laopts,
+				  lv_layer(seg->writecache)))
+		return_0;
+
 	/* Add any LVs used by this segment */
 	for (s = 0; s < seg->area_count; ++s) {
 		if ((seg_type(seg, s) == AREA_LV) &&
@@ -2937,6 +3139,14 @@ static int _add_new_lv_to_dtree(struct dev_manager *dm, struct dm_tree *dtree,
 	int save_pending_delete = dm->track_pending_delete;
 	int merge_in_progress = 0;

+	if (!(lvlayer = dm_pool_alloc(dm->mem, sizeof(*lvlayer)))) {
+		log_error("_add_new_lv_to_dtree: pool alloc failed for %s %s.",
+			  display_lvname(lv), layer);
+		return 0;
+	}
+	lvlayer->lv = lv;
+	lvlayer->visible_component = (laopts->component_lv == lv) ? 1 : 0;
+
 	log_debug_activation("Adding new LV %s%s%s to dtree", display_lvname(lv),
 			     layer ? "-" : "", layer ? : "");
 	/* LV with pending delete is never put new into a table */
@@ -2953,6 +3163,99 @@ static int _add_new_lv_to_dtree(struct dev_manager *dm, struct dm_tree *dtree,
 		return 1;
 	}

+	if (lv_is_cache(lv) && lv_is_cache_single(first_seg(lv)->pool_lv)) {
+		struct logical_volume *pool_lv = first_seg(lv)->pool_lv;
+		struct lv_segment *lvseg = first_seg(lv);
+		struct volume_group *vg = lv->vg;
+		struct dm_tree_node *dnode_meta;
+		struct dm_tree_node *dnode_data;
+		union lvid lvid_meta;
+		union lvid lvid_data;
+		char *name_meta;
+		char *name_data;
+		char *dlid_meta;
+		char *dlid_data;
+		char *dlid_pool;
+		uint64_t meta_len = first_seg(lv)->metadata_len;
+		uint64_t data_len = first_seg(lv)->data_len;
+		uint16_t udev_flags = _get_udev_flags(dm, lv, layer,
+					     laopts->noscan, laopts->temporary,
+					     0);
+
+		log_debug("Add cache pool %s to dtree before cache %s", pool_lv->name, lv->name);
+
+		if (!_add_new_lv_to_dtree(dm, dtree, pool_lv, laopts, NULL)) {
+			log_error("Failed to add cachepool to dtree before cache");
+			return_0;
+		}
+
+		memset(&lvid_meta, 0, sizeof(lvid_meta));
+		memset(&lvid_data, 0, sizeof(lvid_data));
+		memcpy(&lvid_meta.id[0], &vg->id, sizeof(struct id));
+		memcpy(&lvid_meta.id[1], &lvseg->metadata_id, sizeof(struct id));
+		memcpy(&lvid_data.id[0], &vg->id, sizeof(struct id));
+		memcpy(&lvid_data.id[1], &lvseg->data_id, sizeof(struct id));
+
+		if (!(dlid_meta = dm_build_dm_uuid(dm->mem, UUID_PREFIX, (const char *)&lvid_meta.s, NULL)))
+			return_0;
+		if (!(dlid_data = dm_build_dm_uuid(dm->mem, UUID_PREFIX, (const char *)&lvid_data.s, NULL)))
+			return_0;
+
+		if (!(name_meta = dm_build_dm_name(dm->mem, vg->name, pool_lv->name, "_cmeta")))
+			return_0;
+		if (!(name_data = dm_build_dm_name(dm->mem, vg->name, pool_lv->name, "_cdata")))
+			return_0;
+
+		if (!(dlid_pool = build_dm_uuid(dm->mem, pool_lv, NULL)))
+			return_0;
+
+		/* add meta dnode */
+		if (!(dnode_meta = dm_tree_add_new_dev_with_udev_flags(dtree,
+								  name_meta,
+								  dlid_meta,
+								  -1, -1,
+								  read_only_lv(lv, laopts, layer),
+								  ((lv->vg->status & PRECOMMITTED) | laopts->revert) ? 1 : 0,
+								  lvlayer,
+								  udev_flags)))
+			return_0;
+
+		/* add load_segment to meta dnode: linear, size of meta area */
+		if (!add_linear_area_to_dtree(dnode_meta,
+					      meta_len,
+					      lv->vg->extent_size,
+					      lv->vg->cmd->use_linear_target,
+					      lv->vg->name, lv->name))
+			return_0;
+
+		/* add seg_area to prev load_seg: offset 0 maps to cachepool lv offset 0 */
+		if (!dm_tree_node_add_target_area(dnode_meta, NULL, dlid_pool, 0))
+			return_0;
+
+		/* add data dnode */
+		if (!(dnode_data = dm_tree_add_new_dev_with_udev_flags(dtree,
+								  name_data,
+								  dlid_data,
+								  -1, -1,
+								  read_only_lv(lv, laopts, layer),
+								  ((lv->vg->status & PRECOMMITTED) | laopts->revert) ? 1 : 0,
+								  lvlayer,
+								  udev_flags)))
+			return_0;
+
+		/* add load_segment to data dnode: linear, size of data area */
+		if (!add_linear_area_to_dtree(dnode_data,
+					      data_len,
+					      lv->vg->extent_size,
+					      lv->vg->cmd->use_linear_target,
+					      lv->vg->name, lv->name))
+			return_0;
+
+		/* add seg_area to prev load_seg: offset 0 maps to cachepool lv after meta */
+		if (!dm_tree_node_add_target_area(dnode_data, NULL, dlid_pool, meta_len))
+			return_0;
+	}
+
 	/* FIXME Seek a simpler way to lay out the snapshot-merge tree. */

 	if (!layer && lv_is_merging_origin(lv)) {
@@ -3021,12 +3324,6 @@ static int _add_new_lv_to_dtree(struct dev_manager *dm, struct dm_tree *dtree,
 	    dm_tree_node_get_context(dnode))
 		return 1;

-	if (!(lvlayer = dm_pool_alloc(dm->mem, sizeof(*lvlayer)))) {
-		log_error("_add_new_lv_to_dtree: pool alloc failed for %s %s.",
-			  display_lvname(lv), layer);
-		return 0;
-	}
-
 	lvlayer->lv = lv;
 	lvlayer->visible_component = (laopts->component_lv == lv) ? 1 : 0;

@@ -3117,7 +3414,7 @@ static int _add_new_lv_to_dtree(struct dev_manager *dm, struct dm_tree *dtree,
 	    !_pool_register_callback(dm, dnode, lv))
 		return_0;

-	if (lv_is_cache(lv) &&
+	if (lv_is_cache(lv) && !lv_is_cache_single(first_seg(lv)->pool_lv) &&
 	    /* Register callback only for layer activation or non-layered cache LV */
 	    (layer || !lv_layer(lv)) &&
 	    /* Register callback when metadata LV is NOT already active */
--- a/lib/activate/dev_manager.h
+++ b/lib/activate/dev_manager.h
@@ -63,6 +63,9 @@ int dev_manager_raid_status(struct dev_manager *dm,
 int dev_manager_raid_message(struct dev_manager *dm,
 			     const struct logical_volume *lv,
 			     const char *msg);
+int dev_manager_writecache_message(struct dev_manager *dm,
+				   const struct logical_volume *lv,
+				   const char *msg);
 int dev_manager_cache_status(struct dev_manager *dm,
 			     const struct logical_volume *lv,
 			     struct lv_status_cache **status);
@@ -81,9 +84,8 @@ int dev_manager_thin_device_id(struct dev_manager *dm,
 			       uint32_t *device_id);
 int dev_manager_vdo_pool_status(struct dev_manager *dm,
 				const struct logical_volume *lv,
-				int flush,
-				char **vdo_params,
-				struct lv_status_vdo **vdo_status);
+				struct lv_status_vdo **vdo_status,
+				int flush);
 int dev_manager_suspend(struct dev_manager *dm, const struct logical_volume *lv,
 			struct lv_activate_opts *laopts, int lockfs, int flush_required);
 int dev_manager_activate(struct dev_manager *dm, const struct logical_volume *lv,
@@ -103,4 +105,14 @@ int dev_manager_execute(struct dev_manager *dm);
 int dev_manager_device_uses_vg(struct device *dev,
 			       struct volume_group *vg);

+int dev_manager_remove_dm_major_minor(uint32_t major, uint32_t minor);
+
+int get_cache_single_meta_data(struct cmd_context *cmd,
+			       struct logical_volume *lv,
+			       struct logical_volume *pool_lv,
+			       struct dm_info *info_meta, struct dm_info *info_data);
+
+int remove_cache_single_meta_data(struct cmd_context *cmd,
+				  struct dm_info *info_meta, struct dm_info *info_data);
+
 #endif
--- a/lib/cache/lvmcache.c
+++ b/lib/cache/lvmcache.c
@@ -39,12 +39,19 @@ struct lvmcache_info {
 	uint32_t ext_version;   /* Extension version */
 	uint32_t ext_flags;	/* Extension flags */
 	uint32_t status;
+	int summary_seqno;	/* vg seqno found on this dev during scan */
+	int mda1_seqno;
+	int mda2_seqno;
+	unsigned summary_seqno_mismatch:1; /* two mdas on this dev have mismatching metadata */
+	unsigned mda1_bad:1;	/* label scan found bad metadata in mda1 */
+	unsigned mda2_bad:1;	/* label scan found bad metadata in mda2 */
 };

 /* One per VG */
 struct lvmcache_vginfo {
 	struct dm_list list;	/* Join these vginfos together */
 	struct dm_list infos;	/* List head for lvmcache_infos */
+	struct dm_list outdated_infos; /* vg_read moves info from infos to outdated_infos */
 	const struct format_type *fmt;
 	char *vgname;		/* "" == orphan */
 	uint32_t status;
@@ -67,9 +74,9 @@ static DM_LIST_INIT(_vginfos);
 static DM_LIST_INIT(_found_duplicate_devs);
 static DM_LIST_INIT(_unused_duplicate_devs);
 static int _scanning_in_progress = 0;
-static int _has_scanned = 0;
 static int _vgs_locked = 0;
 static int _found_duplicate_pvs = 0;	/* If we never see a duplicate PV we can skip checking for them later. */
+static int _found_duplicate_vgnames = 0;

 int lvmcache_init(struct cmd_context *cmd)
 {
@@ -132,6 +139,11 @@ int lvmcache_found_duplicate_pvs(void)
 	return _found_duplicate_pvs;
 }

+int lvmcache_found_duplicate_vgnames(void)
+{
+	return _found_duplicate_vgnames;
+}
+
 int lvmcache_get_unused_duplicate_devs(struct cmd_context *cmd, struct dm_list *head)
 {
 	struct device_list *devl, *devl2;
@@ -170,6 +182,33 @@ static void _destroy_duplicate_device_list(struct dm_list *head)
 	dm_list_init(head);
 }

+int lvmcache_has_bad_metadata(struct device *dev)
+{
+	struct lvmcache_info *info;
+
+	if (!(info = lvmcache_info_from_pvid(dev->pvid, dev, 0))) {
+		/* shouldn't happen */
+		log_error("No lvmcache info for checking bad metadata on %s", dev_name(dev));
+		return 0;
+	}
+
+	if (info->mda1_bad || info->mda2_bad)
+		return 1;
+	return 0;
+}
+
+/*
+ * "bad" metadata cannot be used/processed by lvm, e.g.
+ * it has a bad checksum or invalid/unrecognizable content.
+ */
+void lvmcache_set_bad_metadata(struct lvmcache_info *info, int mda1_bad, int mda2_bad)
+{
+	if (mda1_bad)
+		info->mda1_bad = 1;
+	if (mda2_bad)
+		info->mda2_bad = 1;
+}
+
 static void _vginfo_attach_info(struct lvmcache_vginfo *vginfo,
 				struct lvmcache_info *info)
 {
@@ -858,8 +897,8 @@ int lvmcache_label_scan(struct cmd_context *cmd)
 	_scanning_in_progress = 1;

 	/* FIXME: can this happen? */
-	if (!cmd->full_filter) {
-		log_error("label scan is missing full filter");
+	if (!cmd->filter) {
+		log_error("label scan is missing filter");
 		goto out;
 	}

@@ -1226,6 +1265,8 @@ static int _insert_vginfo(struct lvmcache_vginfo *new_vginfo, const char *vgid,
 				     sizeof(uuid_primary)))
 			return_0;

+		_found_duplicate_vgnames = 1;
+
 		/*
 		 * vginfo is kept for each VG with the same name.
 		 * They are saved with the vginfo->next list.
@@ -1336,6 +1377,7 @@ static int _lvmcache_update_vgname(struct lvmcache_info *info,
 			return 0;
 		}
 		dm_list_init(&vginfo->infos);
+		dm_list_init(&vginfo->outdated_infos);

 		/*
 		 * A different VG (different uuid) can exist with the same name.
@@ -1460,12 +1502,9 @@ int lvmcache_add_orphan_vginfo(const char *vgname, struct format_type *fmt)
 }

 /*
- * FIXME: get rid of other callers of this function which call it
- * in odd cases to "fix up" some bit of lvmcache state.  Make those
- * callers fix up what they need to directly, and leave this function
- * with one purpose and caller.
+ * Returning 0 causes the caller to remove the info struct for this
+ * device from lvmcache, which will make it look like a missing device.
 */
-
 int lvmcache_update_vgname_and_id(struct lvmcache_info *info, struct lvmcache_vgsummary *vgsummary)
 {
 	const char *vgname = vgsummary->vgname;
@@ -1491,6 +1530,7 @@ int lvmcache_update_vgname_and_id(struct lvmcache_info *info, struct lvmcache_vg
 	 * Puts the vginfo into the vgname hash table.
 	 */
 	if (!_lvmcache_update_vgname(info, vgname, vgid, vgsummary->vgstatus, vgsummary->creation_host, info->fmt)) {
+		/* shouldn't happen, internal error */
 		log_error("Failed to update VG %s info in lvmcache.", vgname);
 		return 0;
 	}
@@ -1499,6 +1539,7 @@ int lvmcache_update_vgname_and_id(struct lvmcache_info *info, struct lvmcache_vg
 	 * Puts the vginfo into the vgid hash table.
 	 */
 	if (!_lvmcache_update_vgid(info, info->vginfo, vgid)) {
+		/* shouldn't happen, internal error */
 		log_error("Failed to update VG %s info in lvmcache.", vgname);
 		return 0;
 	}
@@ -1514,56 +1555,140 @@ int lvmcache_update_vgname_and_id(struct lvmcache_info *info, struct lvmcache_vg
 	if (!vgsummary->seqno && !vgsummary->mda_size && !vgsummary->mda_checksum)
 		return 1;

+	/*
+	 * Keep track of which devs/mdas have old versions of the metadata.
+	 * The values we keep in vginfo are from the metadata with the largest
+	 * seqno.  One dev may have more recent metadata than another dev, and
+	 * one mda may have more recent metadata than the other mda on the same
+	 * device.
+	 *
+	 * When a device holds old metadata, the info struct for the device
+	 * remains in lvmcache, so the device is not treated as missing.
+	 * Also the mda struct containing the old metadata is kept on
+	 * info->mdas.  This means that vg_read will read metadata from
+	 * the mda again (and probably see the same old metadata).  It
+	 * also means that vg_write will use the mda to write new metadata
+	 * into the mda that currently has the old metadata.
+	 */
+	if (vgsummary->mda_num == 1)
+		info->mda1_seqno = vgsummary->seqno;
+	else if (vgsummary->mda_num == 2)
+		info->mda2_seqno = vgsummary->seqno;
+
+	if (!info->summary_seqno)
+		info->summary_seqno = vgsummary->seqno;
+	else {
+		if (info->summary_seqno == vgsummary->seqno) {
+			/* This mda has the same metadata as the prev mda on this dev. */
+			return 1;
+
+		} else if (info->summary_seqno > vgsummary->seqno) {
+			/* This mda has older metadata than the prev mda on this dev. */
+			info->summary_seqno_mismatch = 1;
+
+		} else if (info->summary_seqno < vgsummary->seqno) {
+			/* This mda has newer metadata than the prev mda on this dev. */
+			info->summary_seqno_mismatch = 1;
+			info->summary_seqno = vgsummary->seqno;
+		}
+	}
+
+	/* this shouldn't happen */
 	if (!(vginfo = info->vginfo))
 		return 1;

 	if (!vginfo->seqno) {
 		vginfo->seqno = vgsummary->seqno;
-
-		log_debug_cache("lvmcache %s: VG %s: set seqno to %d",
-				dev_name(info->dev), vginfo->vgname, vginfo->seqno);
-
-	} else if (vgsummary->seqno != vginfo->seqno) {
-		log_warn("Scan of VG %s from %s found metadata seqno %d vs previous %d.",
-			 vgname, dev_name(info->dev), vgsummary->seqno, vginfo->seqno);
-		vginfo->scan_summary_mismatch = 1;
-		/* If we don't return success, this dev info will be removed from lvmcache,
-		   and then we won't be able to rescan it or repair it. */
-		return 1;
-	}
-
-	if (!vginfo->mda_size) {
 		vginfo->mda_checksum = vgsummary->mda_checksum;
 		vginfo->mda_size = vgsummary->mda_size;

-		log_debug_cache("lvmcache %s: VG %s: set mda_checksum to %x mda_size to %zu",
-				dev_name(info->dev), vginfo->vgname,
-				vginfo->mda_checksum, vginfo->mda_size);
+		log_debug_cache("lvmcache %s mda%d VG %s set seqno %u checksum %x mda_size %zu",
+				dev_name(info->dev), vgsummary->mda_num, vgname,
+				vgsummary->seqno, vgsummary->mda_checksum, vgsummary->mda_size);
+		goto update_vginfo;

-	} else if ((vginfo->mda_size != vgsummary->mda_size) || (vginfo->mda_checksum != vgsummary->mda_checksum)) {
-		log_warn("Scan of VG %s from %s found mda_checksum %x mda_size %zu vs previous %x %zu",
-			 vgname, dev_name(info->dev), vgsummary->mda_checksum, vgsummary->mda_size,
-			 vginfo->mda_checksum, vginfo->mda_size);
+	} else if (vgsummary->seqno < vginfo->seqno) {
 		vginfo->scan_summary_mismatch = 1;
-		/* If we don't return success, this dev info will be removed from lvmcache,
-		   and then we won't be able to rescan it or repair it. */
+
+		log_debug_cache("lvmcache %s mda%d VG %s older seqno %u checksum %x mda_size %zu",
+				dev_name(info->dev), vgsummary->mda_num, vgname,
+				vgsummary->seqno, vgsummary->mda_checksum, vgsummary->mda_size);
+		return 1;
+
+	} else if (vgsummary->seqno > vginfo->seqno) {
+		vginfo->scan_summary_mismatch = 1;
+
+		/* Replace vginfo values with values from newer metadata. */
+		vginfo->seqno = vgsummary->seqno;
+		vginfo->mda_checksum = vgsummary->mda_checksum;
+		vginfo->mda_size = vgsummary->mda_size;
+
+		log_debug_cache("lvmcache %s mda%d VG %s newer seqno %u checksum %x mda_size %zu",
+				dev_name(info->dev), vgsummary->mda_num, vgname,
+				vgsummary->seqno, vgsummary->mda_checksum, vgsummary->mda_size);
+
+		goto update_vginfo;
+	} else {
+		/*
+		 * Same seqno as previous metadata we saw for this VG.
+		 * If the metadata somehow has a different checksum or size,
+		 * even though it has the same seqno, something has gone wrong.
+		 * FIXME: test this case: VG has two PVs, first goes missing,
+		 * second updated to seqno 4, first comes back and second goes
+		 * missing, first updated to seqno 4, second comes back, now
+		 * both are present with same seqno but different checksums.
+		 */
+
+		if ((vginfo->mda_size != vgsummary->mda_size) || (vginfo->mda_checksum != vgsummary->mda_checksum)) {
+			log_warn("WARNING: scan of VG %s from %s mda%d found mda_checksum %x mda_size %zu vs %x %zu",
+				 vgname, dev_name(info->dev), vgsummary->mda_num,
+				 vgsummary->mda_checksum, vgsummary->mda_size,
+				 vginfo->mda_checksum, vginfo->mda_size);
+			vginfo->scan_summary_mismatch = 1;
+			return 0;
+		}
+
+		/*
+		 * The seqno and checksum match what was previously seen;
+		 * the summary values have already been saved in vginfo.
+		 */
 		return 1;
 	}

-	/*
-	 * If a dev has an unmatching checksum, ignore the other
-	 * info from it, keeping the info we already saved.
-	 */
+ update_vginfo:
 	if (!_lvmcache_update_vgstatus(info, vgsummary->vgstatus, vgsummary->creation_host,
 				       vgsummary->lock_type, vgsummary->system_id)) {
+		/*
+		 * This shouldn't happen, it's an internal error, and we can leave
+		 * the info in place without saving the summary values in vginfo.
+		 */
 		log_error("Failed to update VG %s info in lvmcache.", vgname);
-		return 0;
 	}

 	return 1;
 }

-int lvmcache_update_vg(struct volume_group *vg, unsigned precommitted)
+/*
+ * FIXME: quit trying to mirror changes that a command is making into lvmcache.
+ *
+ * First, it's complicated and hard to ensure it's done correctly in every case
+ * (it would be much easier and safer to just toss out what's in lvmcache and
+ * reread the info to recreate it from scratch instead of trying to make sure
+ * every possible discrete state change is correct.)
+ *
+ * Second, it's unnecessary if commands just use the vg they are modifying
+ * rather than also trying to get info from lvmcache.  The lvmcache state
+ * should be populated by label_scan, used to perform vg_read's, and then
+ * ignored (or dropped so it can't be used).
+ *
+ * lvmcache info is already used very little after a command begins its
+ * operation.  The code that's supposed to keep the lvmcache in sync with
+ * changes being made to disk could be half wrong and we wouldn't know it.
+ * That creates a landmine for someone who might try to use a bit of it that
+ * isn't being updated correctly.
+ */
+
+int lvmcache_update_vg_from_write(struct volume_group *vg)
 {
 	struct pv_list *pvl;
 	struct lvmcache_info *info;
@@ -1587,6 +1712,110 @@ int lvmcache_update_vg(struct volume_group *vg, unsigned precommitted)
 	return 1;
 }

+/*
+ * The lvmcache representation of a VG after label_scan can be incorrect
+ * because the label_scan does not use the full VG metadata to construct
+ * vginfo/info.  PVs that don't hold VG metadata weren't attached to the vginfo
+ * during label scan, and PVs with outdated metadata (claiming to be in the VG,
+ * but not listed in the latest metadata) were attached to the vginfo, but
+ * shouldn't be.  After vg_read() gets the full metadata in the form of a 'vg',
+ * this function is called to fix up the lvmcache representation of the VG
+ * using the 'vg'.
+ */
+
+int lvmcache_update_vg_from_read(struct volume_group *vg, unsigned precommitted)
+{
+	struct pv_list *pvl;
+	struct lvmcache_vginfo *vginfo;
+	struct lvmcache_info *info, *info2;
+	struct metadata_area *mda;
+	char pvid_s[ID_LEN + 1] __attribute__((aligned(8)));
+	struct lvmcache_vgsummary vgsummary = {
+		.vgname = vg->name,
+		.vgstatus = vg->status,
+		.vgid = vg->id,
+		.system_id = vg->system_id,
+		.lock_type = vg->lock_type
+	};
+
+	if (!(vginfo = lvmcache_vginfo_from_vgname(vg->name, (const char *)&vg->id))) {
+		log_error(INTERNAL_ERROR "lvmcache_update_vg %s no vginfo", vg->name);
+		return 0;
+	}
+
+	/*
+	 * The label scan doesn't know when a PV with old metadata has been
+	 * removed from the VG.  Now with the vg we can tell, so remove the
+	 * info for a PV that has been removed from the VG with
+	 * vgreduce --removemissing.
+	 */
+	dm_list_iterate_items_safe(info, info2, &vginfo->infos) {
+		int found = 0;
+		dm_list_iterate_items(pvl, &vg->pvs) {
+			if (pvl->pv->dev != info->dev)
+				continue;
+			found = 1;
+			break;
+		}
+
+		if (found)
+			continue;
+
+		log_warn("WARNING: outdated PV %s seqno %d has been removed in current VG %s seqno %d.",
+			 dev_name(info->dev), info->summary_seqno, vg->name, vginfo->seqno);
+
+		_drop_vginfo(info, vginfo); /* remove from vginfo->infos */
+		dm_list_add(&vginfo->outdated_infos, &info->list);
+	}
+
+	dm_list_iterate_items(pvl, &vg->pvs) {
+		(void) dm_strncpy(pvid_s, (char *) &pvl->pv->id, sizeof(pvid_s));
+
+		if (!(info = lvmcache_info_from_pvid(pvid_s, pvl->pv->dev, 0))) {
+			log_debug_cache("lvmcache_update_vg %s no info for %s %s",
+					vg->name,
+					(char *) &pvl->pv->id,
+					pvl->pv->dev ? dev_name(pvl->pv->dev) : "missing");
+			continue;
+		}
+
+		log_debug_cache("lvmcache_update_vg %s for info %s",
+				vg->name, dev_name(info->dev));
+
+		/*
+		 * FIXME: use a different function that just attaches info's that
+		 * had no metadata onto the correct vginfo.
+		 *
+		 * info's for PVs without metadata were not connected to the
+		 * vginfo by label_scan, so do it here.
+		 */
+		if (!lvmcache_update_vgname_and_id(info, &vgsummary)) {
+			log_debug_cache("lvmcache_update_vg %s failed to update info for %s",
+					vg->name, dev_name(info->dev));
+		}
+
+		/*
+		 * Ignored mdas were not copied from info->mdas to
+		 * fid->metadata_areas... when create_text_instance (at the
+		 * start of vg_read) called lvmcache_fid_add_mdas_vg, because
+		 * at that point the infos were not yet connected to the vginfo
+		 * (label_scan could not know this without metadata.)
+		 */
+		dm_list_iterate_items(mda, &info->mdas) {
+			if (!mda_is_ignored(mda))
+				continue;
+			log_debug("lvmcache_update_vg %s copy ignored mdas for %s", vg->name, dev_name(info->dev));
+			if (!lvmcache_fid_add_mdas_pv(info, vg->fid)) {
+				log_debug_cache("lvmcache_update_vg %s failed to update mdas for %s",
+					        vg->name, dev_name(info->dev));
+			}
+			break;
+		}
+	}
+
+	return 1;
+}
+
 /*
 * We can see multiple different devices with the
 * same pvid, i.e. duplicates.
@@ -1638,7 +1867,7 @@ int lvmcache_update_vg(struct volume_group *vg, unsigned precommitted)
 *   transient duplicate?
 */

-static struct lvmcache_info * _create_info(struct labeller *labeller, struct device *dev)
+static struct lvmcache_info * _create_info(struct labeller *labeller, struct device *dev, uint64_t label_sector)
 {
 	struct lvmcache_info *info;
 	struct label *label;
@@ -1651,6 +1880,9 @@ static struct lvmcache_info * _create_info(struct labeller *labeller, struct dev
 		return NULL;
 	}

+	label->dev = dev;
+	label->sector = label_sector;
+
 	info->dev = dev;
 	info->fmt = labeller->fmt;

@@ -1666,8 +1898,9 @@ static struct lvmcache_info * _create_info(struct labeller *labeller, struct dev
 }

 struct lvmcache_info *lvmcache_add(struct labeller *labeller,
-				   const char *pvid, struct device *dev,
-				   const char *vgname, const char *vgid, uint32_t vgstatus)
+				   const char *pvid, struct device *dev, uint64_t label_sector,
+				   const char *vgname, const char *vgid, uint32_t vgstatus,
+				   int *is_duplicate)
 {
 	char pvid_s[ID_LEN + 1] __attribute__((aligned(8)));
 	char uuid[64] __attribute__((aligned(8)));
@@ -1695,7 +1928,7 @@ struct lvmcache_info *lvmcache_add(struct labeller *labeller,
 		info = lvmcache_info_from_pvid(dev->pvid, NULL, 0);

 	if (!info) {
-		info = _create_info(labeller, dev);
+		info = _create_info(labeller, dev, label_sector);
 		created = 1;
 	}

@@ -1727,6 +1960,8 @@ struct lvmcache_info *lvmcache_add(struct labeller *labeller,

 			dm_list_add(&_found_duplicate_devs, &devl->list);
 			_found_duplicate_pvs = 1;
+			if (is_duplicate)
+				*is_duplicate = 1;
 			return NULL;
 		}

@@ -1808,8 +2043,6 @@ void lvmcache_destroy(struct cmd_context *cmd, int retain_orphans, int reset)
 {
 	log_debug_cache("Dropping VG info");

-	_has_scanned = 0;
-
 	if (_vgid_hash) {
 		dm_hash_destroy(_vgid_hash);
 		_vgid_hash = NULL;
@@ -1851,7 +2084,8 @@ void lvmcache_destroy(struct cmd_context *cmd, int retain_orphans, int reset)
 	if (retain_orphans) {
 		struct format_type *fmt;

-		lvmcache_init(cmd);
+		if (!lvmcache_init(cmd))
+			stack;

 		dm_list_iterate_items(fmt, &cmd->formats) {
 			if (!lvmcache_add_orphan_vginfo(fmt->orphan_vg_name, fmt))
@@ -1871,6 +2105,14 @@ int lvmcache_fid_add_mdas_pv(struct lvmcache_info *info, struct format_instance
 	return lvmcache_fid_add_mdas(info, fid, info->dev->pvid, ID_LEN);
 }

+/*
+ * This is the linkage where information is passed from
+ * the label_scan to vg_read.
+ *
+ * Called by create_text_instance in vg_read to copy the
+ * mda's found during label_scan and saved in info->mdas,
+ * to fid->metadata_areas_in_use which is used by vg_read.
+ */
 int lvmcache_fid_add_mdas_vg(struct lvmcache_vginfo *vginfo, struct format_instance *fid)
 {
 	struct lvmcache_info *info;
@@ -1961,9 +2203,10 @@ void lvmcache_del_bas(struct lvmcache_info *info)
 }

 int lvmcache_add_mda(struct lvmcache_info *info, struct device *dev,
-		     uint64_t start, uint64_t size, unsigned ignored)
+		     uint64_t start, uint64_t size, unsigned ignored,
+		     struct metadata_area **mda_new)
 {
-	return add_mda(info->fmt, NULL, &info->mdas, dev, start, size, ignored);
+	return add_mda(info->fmt, NULL, &info->mdas, dev, start, size, ignored, mda_new);
 }

 int lvmcache_add_da(struct lvmcache_info *info, uint64_t start, uint64_t size)
@@ -2280,3 +2523,129 @@ int lvmcache_scan_mismatch(struct cmd_context *cmd, const char *vgname, const ch
 	return 1;
 }

+int lvmcache_vginfo_has_pvid(struct lvmcache_vginfo *vginfo, char *pvid)
+{
+	struct lvmcache_info *info;
+
+	dm_list_iterate_items(info, &vginfo->infos) {
+		if (!strcmp(info->dev->pvid, pvid))
+			return 1;
+	}
+	return 0;
+}
+
+/*
+ * This is used by the metadata repair command to check if
+ * the metadata on a dev needs repair because it's old.
+ */
+int lvmcache_has_old_metadata(struct cmd_context *cmd, const char *vgname, const char *vgid, struct device *dev)
+{
+	struct lvmcache_vginfo *vginfo;
+	struct lvmcache_info *info;
+
+	/* shouldn't happen */
+	if (!vgname || !vgid)
+		return 0;
+
+	/* shouldn't happen */
+	if (!(vginfo = lvmcache_vginfo_from_vgid(vgid)))
+		return 0;
+
+	/* shouldn't happen */
+	if (!(info = lvmcache_info_from_pvid(dev->pvid, NULL, 0)))
+		return 0;
+
+	/* writing to a new PV */
+	if (!info->summary_seqno)
+		return 0;
+
+	/* on same dev, one mda has newer metadata than the other */
+	if (info->summary_seqno_mismatch)
+		return 1;
+
+	/* one or both mdas on this dev have older metadata than another dev */
+	if (vginfo->seqno > info->summary_seqno)
+		return 1;
+
+	return 0;
+}
+
+void lvmcache_get_outdated_devs(struct cmd_context *cmd,
+				const char *vgname, const char *vgid,
+				struct dm_list *devs)
+{
+	struct lvmcache_vginfo *vginfo;
+	struct lvmcache_info *info;
+	struct device_list *devl;
+
+	if (!(vginfo = lvmcache_vginfo_from_vgname(vgname, vgid))) {
+		log_error(INTERNAL_ERROR "lvmcache_get_outdated_devs no vginfo %s", vgname);
+		return;
+	}
+
+	dm_list_iterate_items(info, &vginfo->outdated_infos) {
+		if (!(devl = zalloc(sizeof(*devl))))
+			return;
+		devl->dev = info->dev;
+		dm_list_add(devs, &devl->list);
+	}
+}
+
+void lvmcache_del_outdated_devs(struct cmd_context *cmd,
+				const char *vgname, const char *vgid)
+{
+	struct lvmcache_vginfo *vginfo;
+	struct lvmcache_info *info, *info2;
+
+	if (!(vginfo = lvmcache_vginfo_from_vgname(vgname, vgid))) {
+		log_error(INTERNAL_ERROR "lvmcache_del_outdated_devs no vginfo");
+		return;
+	}
+
+	dm_list_iterate_items_safe(info, info2, &vginfo->outdated_infos)
+		lvmcache_del(info);
+}
+
+void lvmcache_get_outdated_mdas(struct cmd_context *cmd,
+				const char *vgname, const char *vgid,
+				struct device *dev,
+				struct dm_list **mdas)
+{
+	struct lvmcache_vginfo *vginfo;
+	struct lvmcache_info *info;
+
+	*mdas = NULL;
+
+	if (!(vginfo = lvmcache_vginfo_from_vgname(vgname, vgid))) {
+		log_error(INTERNAL_ERROR "lvmcache_get_outdated_mdas no vginfo");
+		return;
+	}
+
+	dm_list_iterate_items(info, &vginfo->outdated_infos) {
+		if (info->dev != dev)
+			continue;
+		*mdas = &info->mdas;
+		return;
+	}
+}
+
+int lvmcache_is_outdated_dev(struct cmd_context *cmd,
+			     const char *vgname, const char *vgid,
+			     struct device *dev)
+{
+	struct lvmcache_vginfo *vginfo;
+	struct lvmcache_info *info;
+
+	if (!(vginfo = lvmcache_vginfo_from_vgname(vgname, vgid))) {
+		log_error(INTERNAL_ERROR "lvmcache_is_outdated_dev no vginfo");
+		return 0;
+	}
+
+	dm_list_iterate_items(info, &vginfo->outdated_infos) {
+		if (info->dev == dev)
+			return 1;
+	}
+
+	return 0;
+}
+
--- a/lib/cache/lvmcache.h
+++ b/lib/cache/lvmcache.h
@@ -57,10 +57,12 @@ struct lvmcache_vgsummary {
 	char *creation_host;
 	const char *system_id;
 	const char *lock_type;
+	uint32_t seqno;
 	uint32_t mda_checksum;
 	size_t mda_size;
-	int zero_offset;
-	int seqno;
+	int mda_num; /* 1 = summary from mda1, 2 = summary from mda2 */
+	unsigned mda_ignored:1;
+	unsigned zero_offset:1;
 };

 int lvmcache_init(struct cmd_context *cmd);
@@ -72,9 +74,9 @@ int lvmcache_label_rescan_vg(struct cmd_context *cmd, const char *vgname, const

 /* Add/delete a device */
 struct lvmcache_info *lvmcache_add(struct labeller *labeller, const char *pvid,
-				   struct device *dev,
+				   struct device *dev, uint64_t label_sector,
 				   const char *vgname, const char *vgid,
-				   uint32_t vgstatus);
+				   uint32_t vgstatus, int *is_duplicate);
 int lvmcache_add_orphan_vginfo(const char *vgname, struct format_type *fmt);
 void lvmcache_del(struct lvmcache_info *info);
 void lvmcache_del_dev(struct device *dev);
@@ -82,7 +84,8 @@ void lvmcache_del_dev(struct device *dev);
 /* Update things */
 int lvmcache_update_vgname_and_id(struct lvmcache_info *info,
 				  struct lvmcache_vgsummary *vgsummary);
-int lvmcache_update_vg(struct volume_group *vg, unsigned precommitted);
+int lvmcache_update_vg_from_read(struct volume_group *vg, unsigned precommitted);
+int lvmcache_update_vg_from_write(struct volume_group *vg);

 void lvmcache_lock_vgname(const char *vgname, int read_only);
 void lvmcache_unlock_vgname(const char *vgname);
@@ -127,7 +130,8 @@ void lvmcache_del_mdas(struct lvmcache_info *info);
 void lvmcache_del_das(struct lvmcache_info *info);
 void lvmcache_del_bas(struct lvmcache_info *info);
 int lvmcache_add_mda(struct lvmcache_info *info, struct device *dev,
-		     uint64_t start, uint64_t size, unsigned ignored);
+		     uint64_t start, uint64_t size, unsigned ignored,
+		     struct metadata_area **mda_new);
 int lvmcache_add_da(struct lvmcache_info *info, uint64_t start, uint64_t size);
 int lvmcache_add_ba(struct lvmcache_info *info, uint64_t start, uint64_t size);

@@ -169,6 +173,7 @@ int lvmcache_vgid_is_cached(const char *vgid);
 uint64_t lvmcache_smallest_mda_size(struct lvmcache_info *info);

 int lvmcache_found_duplicate_pvs(void);
+int lvmcache_found_duplicate_vgnames(void);

 void lvmcache_pvscan_duplicate_check(struct cmd_context *cmd);

@@ -198,6 +203,8 @@ void lvmcache_set_independent_location(const char *vgname);

 int lvmcache_scan_mismatch(struct cmd_context *cmd, const char *vgname, const char *vgid);

+int lvmcache_vginfo_has_pvid(struct lvmcache_vginfo *vginfo, char *pvid);
+
 /*
 * These are clvmd-specific functions and are not related to lvmcache.
 * FIXME: rename these with a clvm_ prefix in place of lvmcache_
@@ -209,4 +216,23 @@ void lvmcache_drop_saved_vgid(const char *vgid);

 int dev_in_device_list(struct device *dev, struct dm_list *head);

+void lvmcache_set_bad_metadata(struct lvmcache_info *info, int mda1_bad, int mda2_bad);
+int lvmcache_has_bad_metadata(struct device *dev);
+int lvmcache_has_old_metadata(struct cmd_context *cmd, const char *vgname, const char *vgid, struct device *dev);
+
+void lvmcache_get_outdated_devs(struct cmd_context *cmd,
+                                const char *vgname, const char *vgid,
+                                struct dm_list *devs);
+void lvmcache_get_outdated_mdas(struct cmd_context *cmd,
+                                const char *vgname, const char *vgid,
+                                struct device *dev,
+                                struct dm_list **mdas);
+
+int lvmcache_is_outdated_dev(struct cmd_context *cmd,
+                             const char *vgname, const char *vgid,
+                             struct device *dev);
+
+void lvmcache_del_outdated_devs(struct cmd_context *cmd,
+                                const char *vgname, const char *vgid);
+
 #endif
--- a/lib/cache_segtype/cache.c
+++ b/lib/cache_segtype/cache.c
@@ -47,23 +47,33 @@ static int _cache_out_line(const char *line, void *_f)
 static void _cache_display(const struct lv_segment *seg)
 {
 	const struct dm_config_node *n;
-	const struct lv_segment *pool_seg =
-		seg_is_cache_pool(seg) ? seg : first_seg(seg->pool_lv);
+	const struct lv_segment *setting_seg = NULL;
+
+	if (seg_is_cache(seg) && lv_is_cache_single(seg->pool_lv))
+		setting_seg = seg;
+
+	else if (seg_is_cache_pool(seg))
+		setting_seg = seg;
+
+	else if (seg_is_cache(seg))
+		setting_seg = first_seg(seg->pool_lv);
+	else
+		return;

 	log_print("  Chunk size\t\t%s",
-		  display_size(seg->lv->vg->cmd, pool_seg->chunk_size));
+		  display_size(seg->lv->vg->cmd, setting_seg->chunk_size));

-	if (pool_seg->cache_metadata_format != CACHE_METADATA_FORMAT_UNSELECTED)
-		log_print("  Metadata format\t%u", pool_seg->cache_metadata_format);
+	if (setting_seg->cache_metadata_format != CACHE_METADATA_FORMAT_UNSELECTED)
+		log_print("  Metadata format\t%u", setting_seg->cache_metadata_format);

-	if (pool_seg->cache_mode != CACHE_MODE_UNSELECTED)
-		log_print("  Mode\t\t%s", get_cache_mode_name(pool_seg));
+	if (setting_seg->cache_mode != CACHE_MODE_UNSELECTED)
+		log_print("  Mode\t\t%s", get_cache_mode_name(setting_seg));

-	if (pool_seg->policy_name)
-		log_print("  Policy\t\t%s", pool_seg->policy_name);
+	if (setting_seg->policy_name)
+		log_print("  Policy\t\t%s", setting_seg->policy_name);

-	if (pool_seg->policy_settings &&
-	    (n = pool_seg->policy_settings->child))
+	if (setting_seg->policy_settings &&
+	    (n = setting_seg->policy_settings->child))
 		dm_config_write_node(n, _cache_out_line, NULL);

 	log_print(" ");
@@ -99,32 +109,16 @@ static void _fix_missing_defaults(struct lv_segment *cpool_seg)
 	}
 }

-static int _cache_pool_text_import(struct lv_segment *seg,
-				   const struct dm_config_node *sn,
-				   struct dm_hash_table *pv_hash __attribute__((unused)))
+static int _settings_text_import(struct lv_segment *seg,
+				 const struct dm_config_node *sn)
 {
-	struct logical_volume *data_lv, *meta_lv;
 	const char *str = NULL;
 	struct dm_pool *mem = seg->lv->vg->vgmem;

-	if (!dm_config_has_node(sn, "data"))
-		return SEG_LOG_ERROR("Cache data not specified in");
-	if (!(str = dm_config_find_str(sn, "data", NULL)))
-		return SEG_LOG_ERROR("Cache data must be a string in");
-	if (!(data_lv = find_lv(seg->lv->vg, str)))
-		return SEG_LOG_ERROR("Unknown logical volume %s specified for "
-				     "cache data in", str);
-
-	if (!dm_config_has_node(sn, "metadata"))
-		return SEG_LOG_ERROR("Cache metadata not specified in");
-	if (!(str = dm_config_find_str(sn, "metadata", NULL)))
-		return SEG_LOG_ERROR("Cache metadata must be a string in");
-	if (!(meta_lv = find_lv(seg->lv->vg, str)))
-		return SEG_LOG_ERROR("Unknown logical volume %s specified for "
-				     "cache metadata in", str);
-
-	if (!dm_config_get_uint32(sn, "chunk_size", &seg->chunk_size))
-		return SEG_LOG_ERROR("Couldn't read cache chunk_size in");
+	if (dm_config_has_node(sn, "chunk_size")) {
+		if (!dm_config_get_uint32(sn, "chunk_size", &seg->chunk_size))
+			return SEG_LOG_ERROR("Couldn't read cache chunk_size in");
+	}

 	/*
 	 * Read in features:
@@ -146,16 +140,6 @@ static int _cache_pool_text_import(struct lv_segment *seg,
 			return SEG_LOG_ERROR("Failed to duplicate policy in");
 	}

-	if (dm_config_has_node(sn, "metadata_format")) {
-		if (!dm_config_get_uint32(sn, "metadata_format", &seg->cache_metadata_format) ||
-		    ((seg->cache_metadata_format != CACHE_METADATA_FORMAT_1) &&
-		     (seg->cache_metadata_format != CACHE_METADATA_FORMAT_2)))
-			return SEG_LOG_ERROR("Unknown cache metadata format %u number in",
-					     seg->cache_metadata_format);
-		if (seg->cache_metadata_format == CACHE_METADATA_FORMAT_2)
-			seg->lv->status |= LV_METADATA_FORMAT;
-	}
-
 	/*
 	 * Read in policy args:
 	 *   policy_settings {
@@ -184,6 +168,75 @@ static int _cache_pool_text_import(struct lv_segment *seg,
 			return_0;
 	}

+	return 1;
+}
+
+static int _settings_text_export(const struct lv_segment *seg,
+				 struct formatter *f)
+{
+	if (seg->chunk_size)
+		outf(f, "chunk_size = %" PRIu32, seg->chunk_size);
+
+	if (seg->cache_mode != CACHE_MODE_UNSELECTED) {
+		const char *cache_mode;
+		if (!(cache_mode = cache_mode_num_to_str(seg->cache_mode)))
+			return_0;
+		outf(f, "cache_mode = \"%s\"", cache_mode);
+	}
+
+	if (seg->policy_name) {
+		outf(f, "policy = \"%s\"", seg->policy_name);
+
+		if (seg->policy_settings) {
+			if (strcmp(seg->policy_settings->key, "policy_settings")) {
+				log_error(INTERNAL_ERROR "Incorrect policy_settings tree, %s.",
+					  seg->policy_settings->key);
+				return 0;
+			}
+			if (seg->policy_settings->child)
+				out_config_node(f, seg->policy_settings);
+		}
+	}
+
+	return 1;
+}
+
+static int _cache_pool_text_import(struct lv_segment *seg,
+				   const struct dm_config_node *sn,
+				   struct dm_hash_table *pv_hash __attribute__((unused)))
+{
+	struct logical_volume *data_lv, *meta_lv;
+	const char *str = NULL;
+
+	if (!dm_config_has_node(sn, "data"))
+		return SEG_LOG_ERROR("Cache data not specified in");
+	if (!(str = dm_config_find_str(sn, "data", NULL)))
+		return SEG_LOG_ERROR("Cache data must be a string in");
+	if (!(data_lv = find_lv(seg->lv->vg, str)))
+		return SEG_LOG_ERROR("Unknown logical volume %s specified for "
+				     "cache data in", str);
+
+	if (!dm_config_has_node(sn, "metadata"))
+		return SEG_LOG_ERROR("Cache metadata not specified in");
+	if (!(str = dm_config_find_str(sn, "metadata", NULL)))
+		return SEG_LOG_ERROR("Cache metadata must be a string in");
+	if (!(meta_lv = find_lv(seg->lv->vg, str)))
+		return SEG_LOG_ERROR("Unknown logical volume %s specified for "
+				     "cache metadata in", str);
+
+	if (dm_config_has_node(sn, "metadata_format")) {
+		if (!dm_config_get_uint32(sn, "metadata_format", &seg->cache_metadata_format) ||
+		    ((seg->cache_metadata_format != CACHE_METADATA_FORMAT_1) &&
+		     (seg->cache_metadata_format != CACHE_METADATA_FORMAT_2)))
+			return SEG_LOG_ERROR("Unknown cache metadata format %u number in",
+					     seg->cache_metadata_format);
+		if (seg->cache_metadata_format == CACHE_METADATA_FORMAT_2)
+			seg->lv->status |= LV_METADATA_FORMAT;
+	}
+
+	if (!_settings_text_import(seg, sn))
+		return_0;
+
 	if (!attach_pool_data_lv(seg, data_lv))
 		return_0;
 	if (!attach_pool_metadata_lv(seg, meta_lv))
@@ -207,11 +260,8 @@ static int _cache_pool_text_import_area_count(const struct dm_config_node *sn,
 static int _cache_pool_text_export(const struct lv_segment *seg,
 				   struct formatter *f)
 {
-	const char *cache_mode;
-
 	outf(f, "data = \"%s\"", seg_lv(seg, 0)->name);
 	outf(f, "metadata = \"%s\"", seg->metadata_lv->name);
-	outf(f, "chunk_size = %" PRIu32, seg->chunk_size);

 	switch (seg->cache_metadata_format) {
 	case CACHE_METADATA_FORMAT_UNSELECTED:
@@ -237,25 +287,9 @@ static int _cache_pool_text_export(const struct lv_segment *seg,
 	 * but not worth to break backward compatibility, by shifting
 	 * content to cache segment
 	 */
-	if (seg->cache_mode != CACHE_MODE_UNSELECTED) {
-		if (!(cache_mode = get_cache_mode_name(seg)))
-			return_0;
-		outf(f, "cache_mode = \"%s\"", cache_mode);
-	}

-	if (seg->policy_name) {
-		outf(f, "policy = \"%s\"", seg->policy_name);
-
-		if (seg->policy_settings) {
-			if (strcmp(seg->policy_settings->key, "policy_settings")) {
-				log_error(INTERNAL_ERROR "Incorrect policy_settings tree, %s.",
-					  seg->policy_settings->key);
-				return 0;
-			}
-			if (seg->policy_settings->child)
-				out_config_node(f, seg->policy_settings);
-		}
-	}
+	if (!_settings_text_export(seg, f))
+		return_0;

 	return 1;
 }
@@ -443,6 +477,7 @@ static int _cache_text_import(struct lv_segment *seg,
 {
 	struct logical_volume *pool_lv, *origin_lv;
 	const char *name;
+	const char *uuid;

 	if (!dm_config_has_node(sn, "cache_pool"))
 		return SEG_LOG_ERROR("cache_pool not specified in");
@@ -472,9 +507,44 @@ static int _cache_text_import(struct lv_segment *seg,
 	if (!attach_pool_lv(seg, pool_lv, NULL, NULL, NULL))
 		return_0;

-	/* load order is unknown, could be cache origin or pool LV, so check for both */
-	if (!dm_list_empty(&pool_lv->segments))
-		_fix_missing_defaults(first_seg(pool_lv));
+	if (!_settings_text_import(seg, sn))
+		return_0;
+
+	if (dm_config_has_node(sn, "metadata_format")) {
+		if (!dm_config_get_uint32(sn, "metadata_format", &seg->cache_metadata_format))
+			return SEG_LOG_ERROR("Couldn't read cache metadata_format in");
+		if (seg->cache_metadata_format != CACHE_METADATA_FORMAT_2)
+			return SEG_LOG_ERROR("Unknown cache metadata format %u number in",
+					     seg->cache_metadata_format);
+	}
+
+	if (dm_config_has_node(sn, "metadata_start")) {
+		if (!dm_config_get_uint64(sn, "metadata_start", &seg->metadata_start))
+			return SEG_LOG_ERROR("Couldn't read metadata_start in");
+		if (!dm_config_get_uint64(sn, "metadata_len", &seg->metadata_len))
+			return SEG_LOG_ERROR("Couldn't read metadata_len in");
+		if (!dm_config_get_uint64(sn, "data_start", &seg->data_start))
+			return SEG_LOG_ERROR("Couldn't read data_start in");
+		if (!dm_config_get_uint64(sn, "data_len", &seg->data_len))
+			return SEG_LOG_ERROR("Couldn't read data_len in");
+
+		if (!dm_config_get_str(sn, "metadata_id", &uuid))
+			return SEG_LOG_ERROR("Couldn't read metadata_id in");
+
+		if (!id_read_format(&seg->metadata_id, uuid))
+			return SEG_LOG_ERROR("Couldn't format metadata_id in");
+
+		if (!dm_config_get_str(sn, "data_id", &uuid))
+			return SEG_LOG_ERROR("Couldn't read data_id in");
+
+		if (!id_read_format(&seg->data_id, uuid))
+			return SEG_LOG_ERROR("Couldn't format data_id in");
+	} else {
+		/* Do not call this when LV is cache_single. */
+		/* load order is unknown, could be cache origin or pool LV, so check for both */
+		if (!dm_list_empty(&pool_lv->segments))
+			_fix_missing_defaults(first_seg(pool_lv));
+	}

 	return 1;
 }
@@ -489,6 +559,8 @@ static int _cache_text_import_area_count(const struct dm_config_node *sn,

 static int _cache_text_export(const struct lv_segment *seg, struct formatter *f)
 {
+	char buffer[40];
+
 	if (!seg_lv(seg, 0))
 		return_0;

@@ -498,6 +570,26 @@ static int _cache_text_export(const struct lv_segment *seg, struct formatter *f)
 	if (seg->cleaner_policy)
 		outf(f, "cleaner = 1");

+	if (lv_is_cache_single(seg->pool_lv)) {
+		outf(f, "metadata_format = " FMTu32, seg->cache_metadata_format);
+
+		if (!_settings_text_export(seg, f))
+			return_0;
+
+		outf(f, "metadata_start = " FMTu64, seg->metadata_start);
+		outf(f, "metadata_len = " FMTu64, seg->metadata_len);
+		outf(f, "data_start = " FMTu64, seg->data_start);
+		outf(f, "data_len = " FMTu64, seg->data_len);
+
+		if (!id_write_format(&seg->metadata_id, buffer, sizeof(buffer)))
+			return_0;
+		outf(f, "metadata_id = \"%s\"", buffer);
+
+		if (!id_write_format(&seg->data_id, buffer, sizeof(buffer)))
+			return_0;
+		outf(f, "data_id = \"%s\"", buffer);
+	}
+
 	return 1;
 }

@@ -512,6 +604,9 @@ static int _cache_add_target_line(struct dev_manager *dm,
 				 uint32_t *pvmove_mirror_count __attribute__((unused)))
 {
 	struct lv_segment *cache_pool_seg;
+	struct lv_segment *setting_seg;
+	union lvid metadata_lvid;
+	union lvid data_lvid;
 	char *metadata_uuid, *data_uuid, *origin_uuid;
 	uint64_t feature_flags = 0;
 	unsigned attr;
@@ -521,15 +616,23 @@ static int _cache_add_target_line(struct dev_manager *dm,
 		return 0;
 	}

+	log_debug("cache_add_target_line lv %s pool %s", seg->lv->name, seg->pool_lv->name);
+
 	cache_pool_seg = first_seg(seg->pool_lv);
+
+	if (lv_is_cache_single(seg->pool_lv))
+		setting_seg = seg;
+	else
+		setting_seg = cache_pool_seg;
+
 	if (seg->cleaner_policy)
 		/* With cleaner policy always pass writethrough */
 		feature_flags |= DM_CACHE_FEATURE_WRITETHROUGH;
 	else
-		switch (cache_pool_seg->cache_mode) {
+		switch (setting_seg->cache_mode) {
 		default:
 			log_error(INTERNAL_ERROR "LV %s has unknown cache mode %d.",
-				  display_lvname(seg->lv), cache_pool_seg->cache_mode);
+				  display_lvname(seg->lv), setting_seg->cache_mode);
 			/* Fall through */
 		case CACHE_MODE_WRITETHROUGH:
 			feature_flags |= DM_CACHE_FEATURE_WRITETHROUGH;
@@ -542,7 +645,7 @@ static int _cache_add_target_line(struct dev_manager *dm,
 			break;
 		}

-	switch (cache_pool_seg->cache_metadata_format) {
+	switch (setting_seg->cache_metadata_format) {
 	case CACHE_METADATA_FORMAT_1: break;
 	case CACHE_METADATA_FORMAT_2:
 		if (!_target_present(cmd, NULL, &attr))
@@ -550,7 +653,7 @@ static int _cache_add_target_line(struct dev_manager *dm,

 		if (!(attr & CACHE_FEATURE_METADATA2)) {
 			log_error("LV %s has metadata format %u unsuported by kernel.",
-				  display_lvname(seg->lv), cache_pool_seg->cache_metadata_format);
+				  display_lvname(seg->lv), setting_seg->cache_metadata_format);
 			return 0;
 		}
 		feature_flags |= DM_CACHE_FEATURE_METADATA2;
@@ -558,19 +661,50 @@ static int _cache_add_target_line(struct dev_manager *dm,
 		break;
 	default:
 		log_error(INTERNAL_ERROR "LV %s has unknown metadata format %u.",
-			  display_lvname(seg->lv), cache_pool_seg->cache_metadata_format);
+			  display_lvname(seg->lv), setting_seg->cache_metadata_format);
 		return 0;
 	}

-	if (!(metadata_uuid = build_dm_uuid(mem, cache_pool_seg->metadata_lv, NULL)))
-		return_0;
-
-	if (!(data_uuid = build_dm_uuid(mem, seg_lv(cache_pool_seg, 0), NULL)))
-		return_0;
-
 	if (!(origin_uuid = build_dm_uuid(mem, seg_lv(seg, 0), NULL)))
 		return_0;

+	if (!lv_is_cache_single(seg->pool_lv)) {
+		/* We don't use start/len when using separate data/meta devices. */
+		if (seg->metadata_len || seg->data_len) {
+			log_error(INTERNAL_ERROR "LV %s using unsupported ranges with cache pool.",
+				 display_lvname(seg->lv));
+			return 0;
+		}
+
+		if (!(metadata_uuid = build_dm_uuid(mem, cache_pool_seg->metadata_lv, NULL)))
+			return_0;
+
+		if (!(data_uuid = build_dm_uuid(mem, seg_lv(cache_pool_seg, 0), NULL)))
+			return_0;
+	} else {
+		if (!seg->metadata_len || !seg->data_len || (seg->metadata_start == seg->data_start)) {
+			log_error(INTERNAL_ERROR "LV %s has invalid ranges metadata %llu %llu data %llu %llu.",
+				 display_lvname(seg->lv),
+				 (unsigned long long)seg->metadata_start,
+				 (unsigned long long)seg->metadata_len,
+				 (unsigned long long)seg->data_start,
+				 (unsigned long long)seg->data_len);
+			return 0;
+		}
+
+		memset(&metadata_lvid, 0, sizeof(metadata_lvid));
+		memset(&data_lvid, 0, sizeof(data_lvid));
+		memcpy(&metadata_lvid.id[0], &seg->lv->vg->id, sizeof(struct id));
+		memcpy(&metadata_lvid.id[1], &seg->metadata_id, sizeof(struct id));
+		memcpy(&data_lvid.id[0], &seg->lv->vg->id, sizeof(struct id));
+		memcpy(&data_lvid.id[1], &seg->data_id, sizeof(struct id));
+
+		if (!(metadata_uuid = dm_build_dm_uuid(mem, UUID_PREFIX, (const char *)&metadata_lvid.s, NULL)))
+			return_0;
+		if (!(data_uuid = dm_build_dm_uuid(mem, UUID_PREFIX, (const char *)&data_lvid.s, NULL)))
+			return_0;
+	}
+
 	if (!dm_tree_node_add_cache_target(node, len,
 					   feature_flags,
 					   metadata_uuid,
@@ -579,8 +713,12 @@ static int _cache_add_target_line(struct dev_manager *dm,
 					   seg->cleaner_policy ? "cleaner" :
 						   /* undefined policy name -> likely an old "mq" */
 						   cache_pool_seg->policy_name ? : "mq",
-					   seg->cleaner_policy ? NULL : cache_pool_seg->policy_settings,
-					   cache_pool_seg->chunk_size))
+					   seg->cleaner_policy ? NULL : setting_seg->policy_settings,
+					   seg->metadata_start,
+					   seg->metadata_len,
+					   seg->data_start,
+					   seg->data_len,
+					   setting_seg->chunk_size))
 		return_0;

 	return 1;
--- a/lib/commands/toolcontext.c
+++ b/lib/commands/toolcontext.c
@@ -22,6 +22,7 @@
 #include "lib/activate/activate.h"
 #include "lib/filters/filter.h"
 #include "lib/label/label.h"
+#include "lib/label/hints.h"
 #include "lib/misc/lvm-file.h"
 #include "lib/format_text/format-text.h"
 #include "lib/display/display.h"
@@ -32,10 +33,6 @@
 #include "lib/format_text/archiver.h"
 #include "lib/lvmpolld/lvmpolld-client.h"

-#ifdef HAVE_LIBDL
-#include "lib/misc/sharedlib.h"
-#endif
-
 #include <locale.h>
 #include <sys/stat.h>
 #include <sys/syscall.h>
@@ -1032,7 +1029,7 @@ static int _init_dev_cache(struct cmd_context *cmd)

 #define MAX_FILTERS 10

-static struct dev_filter *_init_lvmetad_filter_chain(struct cmd_context *cmd)
+static struct dev_filter *_init_filter_chain(struct cmd_context *cmd)
 {
 	int nr_filt = 0;
 	const struct dm_config_node *cn;
@@ -1141,65 +1138,45 @@ bad:
 }

 /*
- * The way the filtering is initialized depends on whether lvmetad is uesd or not.
- *
- * If lvmetad is used, there are three filter chains:
- *
- *   - cmd->lvmetad_filter - the lvmetad filter chain used when scanning devs for lvmetad update:
- *     sysfs filter -> internal filter -> global regex filter -> type filter ->
- *     usable device filter(FILTER_MODE_PRE_LVMETAD) ->
- *     mpath component filter -> partitioned filter ->
- *     md component filter -> fw raid filter
- *
- *   - cmd->filter - the filter chain used for lvmetad responses:
- *     persistent filter -> regex_filter -> usable device filter(FILTER_MODE_POST_LVMETAD)
- *
- *   - cmd->full_filter - the filter chain used for all the remaining situations:
- *     cmd->lvmetad_filter -> cmd->filter
- *
- * If lvmetad is not used, there's just one filter chain:
- *
- *   - cmd->filter == cmd->full_filter:
- *     persistent filter -> sysfs filter -> internal filter -> global regex filter ->
- *     regex_filter -> type filter -> usable device filter(FILTER_MODE_NO_LVMETAD) ->
+ *   cmd->filter ==
+ *     persistent(cache) filter -> sysfs filter -> internal filter -> global regex filter ->
+ *     regex_filter -> type filter -> usable device filter ->
 *     mpath component filter -> partitioned filter -> md component filter -> fw raid filter
 *
 */
 int init_filters(struct cmd_context *cmd, unsigned load_persistent_cache)
 {
-	struct dev_filter *filter = NULL, *filter_components[2] = {0};
+	struct dev_filter *pfilter, *filter = NULL, *filter_components[2] = {0};

 	if (!cmd->initialized.connections) {
 		log_error(INTERNAL_ERROR "connections must be initialized before filters");
 		return 0;
 	}

-	cmd->lvmetad_filter = _init_lvmetad_filter_chain(cmd);
-	if (!cmd->lvmetad_filter)
+	filter = _init_filter_chain(cmd);
+	if (!filter)
 		goto_bad;

 	init_ignore_suspended_devices(find_config_tree_bool(cmd, devices_ignore_suspended_devices_CFG, NULL));
 	init_ignore_lvm_mirrors(find_config_tree_bool(cmd, devices_ignore_lvm_mirrors_CFG, NULL));

 	/*
-	 * If lvmetad is used, there's a separation between pre-lvmetad filter chain
-	 * ("cmd->lvmetad_filter") applied only if scanning for lvmetad update and
-	 * post-lvmetad filter chain ("filter") applied on each lvmetad response.
-	 * However, if lvmetad is not used, these two chains are not separated
-	 * and we use exactly one filter chain during device scanning ("filter"
-	 * that includes also "cmd->lvmetad_filter" chain).
+	 * The persistent filter is a cache of the previous real filter result.
+	 * If a dev is found in the persistent filter, the pass/fail result saved
+	 * by the pfilter is used.  If a dev does not exist in the persistent
+	 * filter, the dev is passed on to the real filter, and the result
+	 * of the real filter is then saved in the persistent filter.
+	 *
+	 * FIXME: we should apply the filter once at the start of the command,
+	 * and not call the filters repeatedly.  In that case we would not need
+	 * the persistent/caching filter layer.
 	 */
-	filter = cmd->lvmetad_filter;
-	cmd->lvmetad_filter = NULL;
-
-	if (!(filter = persistent_filter_create(cmd->dev_types, filter))) {
+	if (!(pfilter = persistent_filter_create(cmd->dev_types, filter))) {
 		log_verbose("Failed to create persistent device filter.");
 		goto bad;
 	}

-	cmd->filter = filter;
-
-	cmd->full_filter = filter;
+	cmd->filter = pfilter;

 	cmd->initialized.filters = 1;
 	return 1;
@@ -1221,10 +1198,6 @@ bad:
 		filter->destroy(filter);
 	}

-	/* if lvmetad is used, the cmd->lvmetad_filter is separate */
-	if (cmd->lvmetad_filter)
-		cmd->lvmetad_filter->destroy(cmd->lvmetad_filter);
-
 	cmd->initialized.filters = 0;
 	return 0;
 }
@@ -1298,24 +1271,6 @@ int lvm_register_segtype(struct segtype_library *seglib,
 	return 1;
 }

-static int _init_single_segtype(struct cmd_context *cmd,
-				struct segtype_library *seglib)
-{
-	struct segment_type *(*init_segtype_fn) (struct cmd_context *);
-	struct segment_type *segtype;
-
-	if (!(init_segtype_fn = dlsym(seglib->lib, "init_segtype"))) {
-		log_error("Shared library %s does not contain segment type "
-			  "functions", seglib->libname);
-		return 0;
-	}
-
-	if (!(segtype = init_segtype_fn(seglib->cmd)))
-		return_0;
-
-	return lvm_register_segtype(seglib, segtype);
-}
-
 static int _init_segtypes(struct cmd_context *cmd)
 {
 	int i;
@@ -1336,10 +1291,6 @@ static int _init_segtypes(struct cmd_context *cmd)
 		NULL
 	};

-#ifdef HAVE_LIBDL
-	const struct dm_config_node *cn;
-#endif
-
 	for (i = 0; init_segtype_array[i]; i++) {
 		if (!(segtype = init_segtype_array[i](cmd)))
 			return 0;
@@ -1367,55 +1318,9 @@ static int _init_segtypes(struct cmd_context *cmd)
 		return_0;
 #endif

-#ifdef HAVE_LIBDL
-	/* Load any formats in shared libs unless static */
-	if (!is_static() &&
-	    (cn = find_config_tree_array(cmd, global_segment_libraries_CFG, NULL))) {
-
-		const struct dm_config_value *cv;
-		int (*init_multiple_segtypes_fn) (struct cmd_context *,
-						  struct segtype_library *);
-
-		for (cv = cn->v; cv; cv = cv->next) {
-			if (cv->type != DM_CFG_STRING) {
-				log_error("Invalid string in config file: "
-					  "global/segment_libraries");
-				return 0;
-			}
-			seglib.libname = cv->v.str;
-			if (!(seglib.lib = load_shared_library(cmd,
-							seglib.libname,
-							"segment type", 0)))
-				return_0;
-
-			if ((init_multiple_segtypes_fn =
-			    dlsym(seglib.lib, "init_multiple_segtypes"))) {
-				if (dlsym(seglib.lib, "init_segtype"))
-					log_warn("WARNING: Shared lib %s has "
-						 "conflicting init fns.  Using"
-						 " init_multiple_segtypes().",
-						 seglib.libname);
-			} else
-				init_multiple_segtypes_fn =
-				    _init_single_segtype;
- 
-			if (!init_multiple_segtypes_fn(cmd, &seglib)) {
-				struct dm_list *sgtl, *tmp;
-				log_error("init_multiple_segtypes() failed: "
-					  "Unloading shared library %s",
-					  seglib.libname);
-				dm_list_iterate_safe(sgtl, tmp, &cmd->segtypes) {
-					segtype = dm_list_item(sgtl, struct segment_type);
-					if (segtype->library == seglib.lib) {
-						dm_list_del(&segtype->list);
-						segtype->ops->destroy(segtype);
-					}
-				}
-				dlclose(seglib.lib);
-				return_0;
-			}
-		}
-	}
+#ifdef WRITECACHE_INTERNAL
+	if (!init_writecache_segtypes(cmd, &seglib))
+		return 0;
 #endif

 	return 1;
@@ -1578,6 +1483,7 @@ struct cmd_context *create_config_context(void)

 	dm_list_init(&cmd->config_files);
 	dm_list_init(&cmd->tags);
+	dm_list_init(&cmd->hints);

 	if (!_init_lvm_conf(cmd))
 		goto_out;
@@ -1753,6 +1659,8 @@ struct cmd_context *create_toolcontext(unsigned is_clvmd,
 						find_config_tree_array(cmd, devices_types_CFG, NULL))))
 		goto_out;

+	init_use_aio(find_config_tree_bool(cmd, global_use_aio_CFG, NULL));
+
 	if (!_init_dev_cache(cmd))
 		goto_out;

@@ -1814,27 +1722,11 @@ static void _destroy_segtypes(struct dm_list *segtypes)
 {
 	struct dm_list *sgtl, *tmp;
 	struct segment_type *segtype;
-	void *lib;

 	dm_list_iterate_safe(sgtl, tmp, segtypes) {
 		segtype = dm_list_item(sgtl, struct segment_type);
 		dm_list_del(&segtype->list);
-		lib = segtype->library;
 		segtype->ops->destroy(segtype);
-#ifdef HAVE_LIBDL
-		/*
-		 * If no segtypes remain from this library, close it.
-		 */
-		if (lib) {
-			struct segment_type *segtype2;
-			dm_list_iterate_items(segtype2, segtypes)
-				if (segtype2->library == lib)
-					goto skip_dlclose;
-			dlclose(lib);
-skip_dlclose:
-			;
-		}
-#endif
 	}
 }

@@ -1849,9 +1741,9 @@ static void _destroy_dev_types(struct cmd_context *cmd)

 static void _destroy_filters(struct cmd_context *cmd)
 {
-	if (cmd->full_filter) {
-		cmd->full_filter->destroy(cmd->full_filter);
-		cmd->lvmetad_filter = cmd->filter = cmd->full_filter = NULL;
+	if (cmd->filter) {
+		cmd->filter->destroy(cmd->filter);
+		cmd->filter = NULL;
 	}
 	cmd->initialized.filters = 0;
 }
@@ -1890,6 +1782,7 @@ int refresh_toolcontext(struct cmd_context *cmd)
 	 */

 	activation_release();
+	hints_exit();
 	lvmcache_destroy(cmd, 0, 0);
 	label_scan_destroy(cmd);
 	label_exit();
@@ -2009,6 +1902,7 @@ void destroy_toolcontext(struct cmd_context *cmd)

 	archive_exit(cmd);
 	backup_exit(cmd);
+	hints_exit();
 	lvmcache_destroy(cmd, 0, 0);
 	label_scan_destroy(cmd);
 	label_exit();
--- a/lib/commands/toolcontext.h
+++ b/lib/commands/toolcontext.h
@@ -172,13 +172,17 @@ struct cmd_context {
 	unsigned is_clvmd:1;
 	unsigned use_full_md_check:1;
 	unsigned is_activating:1;
+	unsigned enable_hints:1;		/* hints are enabled for cmds in general */
+	unsigned use_hints:1;			/* if hints are enabled this cmd can use them */
+	unsigned pvscan_recreate_hints:1;	/* enable special case hint handling for pvscan --cache */
+	unsigned scan_lvs:1;
+	unsigned wipe_outdated_pvs:1;

 	/*
-	 * Filtering.
+	 * Devices and filtering.
 	 */
-	struct dev_filter *lvmetad_filter;	/* pre-lvmetad filter chain */
-	struct dev_filter *filter;		/* post-lvmetad filter chain */
-	struct dev_filter *full_filter;		/* lvmetad_filter + filter */
+	struct dev_filter *filter;
+	struct dm_list hints;

 	/*
 	 * Configuration.
--- a/lib/config/config.c
+++ b/lib/config/config.c
@@ -24,6 +24,7 @@
 #include "lib/misc/lvm-file.h"
 #include "lib/mm/memlock.h"
 #include "lib/label/label.h"
+#include "lib/metadata/metadata.h"

 #include <sys/stat.h>
 #include <sys/mman.h>
@@ -2333,6 +2334,11 @@ int load_pending_profiles(struct cmd_context *cmd)
 	return r;
 }

+int get_default_metadata_pvmetadatasize_CFG(struct cmd_context *cmd, struct profile *profile)
+{
+	return get_default_pvmetadatasize_sectors();
+}
+
 const char *get_default_devices_cache_dir_CFG(struct cmd_context *cmd, struct profile *profile)
 {
 	static char buf[PATH_MAX];
--- a/lib/config/config.h
+++ b/lib/config/config.h
@@ -311,5 +311,7 @@ int get_default_allocation_cache_pool_chunk_size_CFG(struct cmd_context *cmd, st
 const char *get_default_allocation_cache_policy_CFG(struct cmd_context *cmd, struct profile *profile);
 #define get_default_unconfigured_allocation_cache_policy_CFG NULL
 uint64_t get_default_allocation_cache_pool_max_chunks_CFG(struct cmd_context *cmd, struct profile *profile);
+int get_default_metadata_pvmetadatasize_CFG(struct cmd_context *cmd, struct profile *profile);
+#define get_default_unconfigured_metadata_pvmetadatasize_CFG NULL

 #endif
--- a/lib/config/config_settings.h
+++ b/lib/config/config_settings.h
@@ -226,7 +226,7 @@ cfg(devices_dir_CFG, "dir", devices_CFG_SECTION, CFG_ADVANCED, CFG_TYPE_STRING,
 cfg_array(devices_scan_CFG, "scan", devices_CFG_SECTION, CFG_ADVANCED, CFG_TYPE_STRING, "#S/dev", vsn(1, 0, 0), NULL, 0, NULL,
 	"Directories containing device nodes to use with LVM.\n")

-cfg_array(devices_loopfiles_CFG, "loopfiles", devices_CFG_SECTION, CFG_DEFAULT_UNDEFINED | CFG_UNSUPPORTED, CFG_TYPE_STRING, NULL, vsn(1, 2, 0), NULL, vsn(3, 0, 0), NULL, NULL)
+cfg_array(devices_loopfiles_CFG, "loopfiles", devices_CFG_SECTION, CFG_DEFAULT_UNDEFINED | CFG_UNSUPPORTED, CFG_TYPE_STRING, NULL, vsn(1, 2, 0), NULL, vsn(2, 3, 0), NULL, NULL)

 cfg(devices_obtain_device_list_from_udev_CFG, "obtain_device_list_from_udev", devices_CFG_SECTION, 0, CFG_TYPE_BOOL, DEFAULT_OBTAIN_DEVICE_LIST_FROM_UDEV, vsn(2, 2, 85), NULL, 0, NULL,
 	"Obtain the list of available devices from udev.\n"
@@ -255,6 +255,20 @@ cfg(devices_external_device_info_source_CFG, "external_device_info_source", devi
 	"    compiled with udev support.\n"
 	"#\n")

+cfg(devices_hints_CFG, "hints", devices_CFG_SECTION, 0, CFG_TYPE_STRING, DEFAULT_HINTS, vsn(2, 3, 2), NULL, 0, NULL,
+	"Use a local file to remember which devices have PVs on them.\n"
+	"Some commands will use this as an optimization to reduce device\n"
+	"scanning, and will only scan the listed PVs. Removing the hint file\n"
+	"will cause lvm to generate a new one. Disable hints if PVs will\n"
+	"be copied onto devices using non-lvm commands, like dd.\n"
+	"#\n"
+	"Accepted values:\n"
+	"  all\n"
+	"    Use all hints.\n"
+	"  none\n"
+	"    Use no hints.\n"
+	"#\n")
+
 cfg_array(devices_preferred_names_CFG, "preferred_names", devices_CFG_SECTION, CFG_ALLOW_EMPTY | CFG_DEFAULT_UNDEFINED , CFG_TYPE_STRING, NULL, vsn(1, 2, 19), NULL, 0, NULL,
 	"Select which path name to display for a block device.\n"
 	"If multiple path names exist for a block device, and LVM needs to\n"
@@ -314,13 +328,13 @@ cfg_array(devices_global_filter_CFG, "global_filter", devices_CFG_SECTION, CFG_D
 cfg_runtime(devices_cache_CFG, "cache", devices_CFG_SECTION, 0, CFG_TYPE_STRING, vsn(1, 0, 0), vsn(1, 2, 19), NULL,
 	"This setting is no longer used.\n")

-cfg_runtime(devices_cache_dir_CFG, "cache_dir", devices_CFG_SECTION, 0, CFG_TYPE_STRING, vsn(1, 2, 19), vsn(3, 0, 0), NULL,
+cfg_runtime(devices_cache_dir_CFG, "cache_dir", devices_CFG_SECTION, 0, CFG_TYPE_STRING, vsn(1, 2, 19), vsn(2, 3, 0), NULL,
 	"This setting is no longer used.\n")

-cfg(devices_cache_file_prefix_CFG, "cache_file_prefix", devices_CFG_SECTION, CFG_ALLOW_EMPTY, CFG_TYPE_STRING, DEFAULT_CACHE_FILE_PREFIX, vsn(1, 2, 19), NULL, vsn(3, 0, 0), NULL,
+cfg(devices_cache_file_prefix_CFG, "cache_file_prefix", devices_CFG_SECTION, CFG_ALLOW_EMPTY, CFG_TYPE_STRING, DEFAULT_CACHE_FILE_PREFIX, vsn(1, 2, 19), NULL, vsn(2, 3, 0), NULL,
 	"This setting is no longer used.\n")

-cfg(devices_write_cache_state_CFG, "write_cache_state", devices_CFG_SECTION, 0, CFG_TYPE_BOOL, 1, vsn(1, 0, 0), NULL, vsn(3, 0, 0), NULL,
+cfg(devices_write_cache_state_CFG, "write_cache_state", devices_CFG_SECTION, 0, CFG_TYPE_BOOL, 1, vsn(1, 0, 0), NULL, vsn(2, 3, 0), NULL,
 	"This setting is no longer used.\n")

 cfg_array(devices_types_CFG, "types", devices_CFG_SECTION, CFG_DEFAULT_UNDEFINED | CFG_ADVANCED, CFG_TYPE_INT | CFG_TYPE_STRING, NULL, vsn(1, 0, 0), NULL, 0, NULL,
@@ -352,16 +366,21 @@ cfg(devices_fw_raid_component_detection_CFG, "fw_raid_component_detection", devi
 	"detection to execute.\n")

 cfg(devices_md_chunk_alignment_CFG, "md_chunk_alignment", devices_CFG_SECTION, 0, CFG_TYPE_BOOL, DEFAULT_MD_CHUNK_ALIGNMENT, vsn(2, 2, 48), NULL, 0, NULL,
-	"Align PV data blocks with md device's stripe-width.\n"
-	"This applies if a PV is placed directly on an md device.\n")
+	"Align the start of a PV data area with md device's stripe-width.\n"
+	"This applies if a PV is placed directly on an md device.\n"
+	"default_data_alignment will be overridden if it is not aligned\n"
+	"with the value detected for this setting.\n"
+	"This setting is overridden by data_alignment_detection,\n"
+	"data_alignment, and the --dataalignment option.\n")

-cfg(devices_default_data_alignment_CFG, "default_data_alignment", devices_CFG_SECTION, CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, DEFAULT_DATA_ALIGNMENT, vsn(2, 2, 75), NULL, 0, NULL,
-	"Default alignment of the start of a PV data area in MB.\n"
-	"If set to 0, a value of 64KiB will be used.\n"
-	"Set to 1 for 1MiB, 2 for 2MiB, etc.\n")
+cfg(devices_default_data_alignment_CFG, "default_data_alignment", devices_CFG_SECTION, CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, FIRST_PE_AT_ONE_MB_IN_MB, vsn(2, 2, 75), NULL, 0, NULL,
+	"Align the start of a PV data area with this number of MiB.\n"
+	"Set to 1 for 1MiB, 2 for 2MiB, etc. Set to 0 to disable.\n"
+	"This setting is overridden by data_alignment and the --dataalignment\n"
+	"option.\n")

 cfg(devices_data_alignment_detection_CFG, "data_alignment_detection", devices_CFG_SECTION, 0, CFG_TYPE_BOOL, DEFAULT_DATA_ALIGNMENT_DETECTION, vsn(2, 2, 51), NULL, 0, NULL,
-	"Detect PV data alignment based on sysfs device information.\n"
+	"Align the start of a PV data area with sysfs io properties.\n"
 	"The start of a PV data area will be a multiple of minimum_io_size or\n"
 	"optimal_io_size exposed in sysfs. minimum_io_size is the smallest\n"
 	"request the device can perform without incurring a read-modify-write\n"
@@ -369,25 +388,27 @@ cfg(devices_data_alignment_detection_CFG, "data_alignment_detection", devices_CF
 	"preferred unit of receiving I/O, e.g. MD stripe width.\n"
 	"minimum_io_size is used if optimal_io_size is undefined (0).\n"
 	"If md_chunk_alignment is enabled, that detects the optimal_io_size.\n"
-	"This setting takes precedence over md_chunk_alignment.\n")
+	"default_data_alignment and md_chunk_alignment will be overridden\n"
+	"if they are not aligned with the value detected for this setting.\n"
+	"This setting is overridden by data_alignment and the --dataalignment\n"
+	"option.\n")

 cfg(devices_data_alignment_CFG, "data_alignment", devices_CFG_SECTION, 0, CFG_TYPE_INT, 0, vsn(2, 2, 45), NULL, 0, NULL,
-	"Alignment of the start of a PV data area in KiB.\n"
-	"If a PV is placed directly on an md device and md_chunk_alignment or\n"
-	"data_alignment_detection are enabled, then this setting is ignored.\n"
-	"Otherwise, md_chunk_alignment and data_alignment_detection are\n"
-	"disabled if this is set. Set to 0 to use the default alignment or the\n"
-	"page size, if larger.\n")
+	"Align the start of a PV data area with this number of KiB.\n"
+	"When non-zero, this setting overrides default_data_alignment.\n"
+	"Set to 0 to disable, in which case default_data_alignment\n"
+	"is used to align the first PE in units of MiB.\n"
+	"This setting is overridden by the --dataalignment option.\n")

 cfg(devices_data_alignment_offset_detection_CFG, "data_alignment_offset_detection", devices_CFG_SECTION, 0, CFG_TYPE_BOOL, DEFAULT_DATA_ALIGNMENT_OFFSET_DETECTION, vsn(2, 2, 50), NULL, 0, NULL,
-	"Detect PV data alignment offset based on sysfs device information.\n"
-	"The start of a PV aligned data area will be shifted by the\n"
+	"Shift the start of an aligned PV data area based on sysfs information.\n"
+	"After a PV data area is aligned, it will be shifted by the\n"
 	"alignment_offset exposed in sysfs. This offset is often 0, but may\n"
 	"be non-zero. Certain 4KiB sector drives that compensate for windows\n"
 	"partitioning will have an alignment_offset of 3584 bytes (sector 7\n"
 	"is the lowest aligned logical block, the 4KiB sectors start at\n"
 	"LBA -1, and consequently sector 63 is aligned on a 4KiB boundary).\n"
-	"pvcreate --dataalignmentoffset will skip this detection.\n")
+	"This setting is overridden by the --dataalignmentoffset option.\n")

 cfg(devices_ignore_suspended_devices_CFG, "ignore_suspended_devices", devices_CFG_SECTION, 0, CFG_TYPE_BOOL, DEFAULT_IGNORE_SUSPENDED_DEVICES, vsn(1, 2, 19), NULL, 0, NULL,
 	"Ignore DM devices that have I/O suspended while scanning devices.\n"
@@ -412,7 +433,7 @@ cfg(devices_ignore_lvm_mirrors_CFG, "ignore_lvm_mirrors", devices_CFG_SECTION, 0
 	"apply to LVM RAID types like 'raid1' which handle failures in a\n"
 	"different way, making them a better choice for VG stacking.\n")

-cfg(devices_disable_after_error_count_CFG, "disable_after_error_count", devices_CFG_SECTION, 0, CFG_TYPE_INT, 0, vsn(2, 2, 75), NULL, vsn(3, 0, 0), NULL,
+cfg(devices_disable_after_error_count_CFG, "disable_after_error_count", devices_CFG_SECTION, 0, CFG_TYPE_INT, 0, vsn(2, 2, 75), NULL, vsn(2, 3, 0), NULL,
 	"This setting is no longer used.\n")

 cfg(devices_require_restorefile_with_uuid_CFG, "require_restorefile_with_uuid", devices_CFG_SECTION, 0, CFG_TYPE_BOOL, DEFAULT_REQUIRE_RESTOREFILE_WITH_UUID, vsn(2, 2, 73), NULL, 0, NULL,
@@ -609,7 +630,7 @@ cfg_runtime(allocation_thin_pool_chunk_size_CFG, "thin_pool_chunk_size", allocat
 cfg(allocation_physical_extent_size_CFG, "physical_extent_size", allocation_CFG_SECTION, CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, DEFAULT_EXTENT_SIZE, vsn(2, 2, 112), NULL, 0, NULL,
 	"Default physical extent size in KiB to use for new VGs.\n")

-#define VDO_1ST_VSN vsn(3, 0, 0)
+#define VDO_1ST_VSN vsn(2, 3, 0)
 cfg(allocation_vdo_use_compression_CFG, "vdo_use_compression", allocation_CFG_SECTION, CFG_PROFILABLE | CFG_PROFILABLE_METADATA | CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, DEFAULT_VDO_USE_COMPRESSION, VDO_1ST_VSN, NULL, 0, NULL,
 	"Enables or disables compression when creating a VDO volume.\n"
 	"Compression may be disabled if necessary to maximize performance\n"
@@ -618,10 +639,17 @@ cfg(allocation_vdo_use_compression_CFG, "vdo_use_compression", allocation_CFG_SE
 cfg(allocation_vdo_use_deduplication_CFG, "vdo_use_deduplication", allocation_CFG_SECTION, CFG_PROFILABLE | CFG_PROFILABLE_METADATA | CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, DEFAULT_VDO_USE_DEDUPLICATION, VDO_1ST_VSN, NULL, 0, NULL,
 	"Enables or disables deduplication when creating a VDO volume.\n"
 	"Deduplication may be disabled in instances where data is not expected\n"
-	"to have good deduplication rates but compression is still desired.")
+	"to have good deduplication rates but compression is still desired.\n")

-cfg(allocation_vdo_emulate_512_sectors_CFG, "vdo_emulate_512_sectors", allocation_CFG_SECTION, CFG_PROFILABLE | CFG_PROFILABLE_METADATA | CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, DEFAULT_VDO_EMULATE_512_SECTORS, VDO_1ST_VSN, NULL, 0, NULL,
-	"Specifies that the VDO volume is to emulate a 512 byte block device.\n")
+cfg(allocation_vdo_use_metadata_hints_CFG, "vdo_use_metadata_hints", allocation_CFG_SECTION, CFG_PROFILABLE | CFG_PROFILABLE_METADATA | CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, DEFAULT_VDO_USE_METADATA_HINTS, VDO_1ST_VSN, NULL, 0, NULL,
+	"Enables or disables tagging of a VDO volume's latency-critical\n"
+	"writes with the REQ_SYNC flag. Some device mapper targets such as dm-raid5\n"
+	"process writes with this flag at a higher priority.\n"
+	"Default is enabled.\n")
+
+cfg(allocation_vdo_minimum_io_size_CFG, "vdo_minimum_io_size", allocation_CFG_SECTION, CFG_PROFILABLE | CFG_PROFILABLE_METADATA | CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, DEFAULT_VDO_MINIMUM_IO_SIZE, VDO_1ST_VSN, NULL, 0, NULL,
+	"The minimum IO size for VDO volume to accept, in bytes.\n"
+	"Valid values are 512 or 4096. The recommended and default value is 4096.\n")

 cfg(allocation_vdo_block_map_cache_size_mb_CFG, "vdo_block_map_cache_size_mb", allocation_CFG_SECTION, CFG_PROFILABLE | CFG_PROFILABLE_METADATA | CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, DEFAULT_VDO_BLOCK_MAP_CACHE_SIZE_MB, VDO_1ST_VSN, NULL, 0, NULL,
 	"Specifies the amount of memory in MiB allocated for caching block map\n"
@@ -629,34 +657,25 @@ cfg(allocation_vdo_block_map_cache_size_mb_CFG, "vdo_block_map_cache_size_mb", a
 	"at least 128MiB and less than 16TiB. The cache must be at least 16MiB\n"
 	"per logical thread. Note that there is a memory overhead of 15%.\n")

-cfg(allocation_vdo_block_map_period_CFG, "vdo_block_map_period", allocation_CFG_SECTION, CFG_PROFILABLE | CFG_PROFILABLE_METADATA | CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, DEFAULT_VDO_BLOCK_MAP_PERIOD, VDO_1ST_VSN, NULL, 0, NULL,
-	"Tunes the quantity of block map updates that can accumulate\n"
-	"before cache pages are flushed to disk. The value must be\n"
-	"at least 1 and less then 16380.\n"
-	"A lower value means shorter recovery time but lower performance.\n")
+// vdo format --blockMapPeriod
+cfg(allocation_vdo_block_map_era_length_CFG, "vdo_block_map_period", allocation_CFG_SECTION, CFG_PROFILABLE | CFG_PROFILABLE_METADATA | CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, DEFAULT_VDO_BLOCK_MAP_ERA_LENGTH, VDO_1ST_VSN, NULL, 0, NULL,
+	"The speed with which the block map cache writes out modified block map pages.\n"
+	"A smaller era length is likely to reduce the amount of time spent rebuilding,\n"
+	"at the cost of increased block map writes during normal operation.\n"
+	"The maximum and recommended value is 16380; the minimum value is 1.\n")

 cfg(allocation_vdo_check_point_frequency_CFG, "vdo_check_point_frequency", allocation_CFG_SECTION, CFG_PROFILABLE | CFG_PROFILABLE_METADATA | CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, DEFAULT_VDO_CHECK_POINT_FREQUENCY, VDO_1ST_VSN, NULL, 0, NULL,
 	"The default check point frequency for VDO volume.\n")

+// vdo format
 cfg(allocation_vdo_use_sparse_index_CFG, "vdo_use_sparse_index", allocation_CFG_SECTION, CFG_PROFILABLE | CFG_PROFILABLE_METADATA | CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, DEFAULT_VDO_USE_SPARSE_INDEX, VDO_1ST_VSN, NULL, 0, NULL,
 	"Enables sparse indexing for VDO volume.\n")

+// vdo format
 cfg(allocation_vdo_index_memory_size_mb_CFG, "vdo_index_memory_size_mb", allocation_CFG_SECTION, CFG_PROFILABLE | CFG_PROFILABLE_METADATA | CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, DEFAULT_VDO_INDEX_MEMORY_SIZE_MB, VDO_1ST_VSN, NULL, 0, NULL,
 	"Specifies the amount of index memory in MiB for VDO volume.\n"
 	"The value must be at least 256MiB and at most 1TiB.\n")

-cfg(allocation_vdo_use_read_cache_CFG, "vdo_use_read_cache", allocation_CFG_SECTION, CFG_PROFILABLE | CFG_PROFILABLE_METADATA | CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, DEFAULT_VDO_USE_READ_CACHE, VDO_1ST_VSN, NULL, 0, NULL,
-	"Enables or disables the read cache within the VDO volume.\n"
-	"The cache should be enabled if write workloads are expected\n"
-	"to have high levels of deduplication, or for read intensive\n"
-	"workloads of highly compressible data.\n")
-
-cfg(allocation_vdo_read_cache_size_mb_CFG, "vdo_read_cache_size_mb", allocation_CFG_SECTION, CFG_PROFILABLE | CFG_PROFILABLE_METADATA | CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, DEFAULT_VDO_READ_CACHE_SIZE_MB, VDO_1ST_VSN, NULL, 0, NULL,
-	"Specifies the extra VDO volume read cache size in MiB.\n"
-	"This space is in addition to a system-defined minimum.\n"
-	"The value must be less then 16TiB and 1.12 MiB of memory\n"
-	"will be used per MiB of read cache specified, per bio thread.\n")
-
 cfg(allocation_vdo_slab_size_mb_CFG, "vdo_slab_size_mb", allocation_CFG_SECTION, CFG_PROFILABLE | CFG_PROFILABLE_METADATA | CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, DEFAULT_VDO_SLAB_SIZE_MB, VDO_1ST_VSN, NULL, 0, NULL,
 	"Specifies the size in MiB of the increment by which a VDO is grown.\n"
 	"Using a smaller size constrains the total maximum physical size\n"
@@ -687,7 +706,7 @@ cfg(allocation_vdo_hash_zone_threads_CFG, "vdo_hash_zone_threads", allocation_CF
 	"processing based on the hash value computed from the block data.\n"
 	"The value must be at in range [0..100].\n"
 	"vdo_hash_zone_threads, vdo_logical_threads and vdo_physical_threads must be\n"
-	"either all zero or all non-zero.")
+	"either all zero or all non-zero.\n")

 cfg(allocation_vdo_logical_threads_CFG, "vdo_logical_threads", allocation_CFG_SECTION, CFG_PROFILABLE | CFG_PROFILABLE_METADATA | CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, DEFAULT_VDO_LOGICAL_THREADS, VDO_1ST_VSN, NULL, 0, NULL,
 	"Specifies the number of threads across which to subdivide parts of the VDO\n"
@@ -715,6 +734,16 @@ cfg(allocation_vdo_write_policy_CFG, "vdo_write_policy", allocation_CFG_SECTION,
 	"async - Writes are acknowledged after data has been cached for writing to stable storage.\n"
 	"        Data which has not been flushed is not guaranteed to persist in this mode.\n")

+cfg(allocation_vdo_max_discard_CFG, "vdo_max_discard", allocation_CFG_SECTION, CFG_PROFILABLE | CFG_PROFILABLE_METADATA | CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, DEFAULT_VDO_MAX_DISCARD, VDO_1ST_VSN, NULL, 0, NULL,
+	"Specifies the maximum size of discard bio accepted, in 4096-byte blocks.\n"
+	"I/O requests to a VDO volume are normally split into 4096-byte blocks,\n"
+	"and processed up to 2048 at a time. However, discard requests to a VDO volume\n"
+	"can be automatically split to a larger size, up to <max discard> 4096-byte blocks\n"
+	"in a single bio, and are limited to 1500 at a time.\n"
+	"Increasing this value may provide better overall performance, at the cost of\n"
+	"increased latency for the individual discard requests.\n"
+	"The default and minimum is 1. The maximum is UINT_MAX / 4096.\n")
+
 cfg(log_report_command_log_CFG, "report_command_log", log_CFG_SECTION, CFG_PROFILABLE | CFG_DEFAULT_COMMENTED | CFG_DISALLOW_INTERACTIVE, CFG_TYPE_BOOL, DEFAULT_COMMAND_LOG_REPORT, vsn(2, 2, 158), NULL, 0, NULL,
 	"Enable or disable LVM log reporting.\n"
 	"If enabled, LVM will collect a log of operations, messages,\n"
@@ -864,13 +893,13 @@ cfg(global_activation_CFG, "activation", global_CFG_SECTION, 0, CFG_TYPE_BOOL, D
 	"is not present in the kernel, disabling this should suppress\n"
 	"the error messages.\n")

-cfg(global_fallback_to_lvm1_CFG, "fallback_to_lvm1", global_CFG_SECTION, CFG_DEFAULT_COMMENTED, CFG_TYPE_BOOL, 0, vsn(1, 0, 18), NULL, vsn(3, 0, 0), NULL,
+cfg(global_fallback_to_lvm1_CFG, "fallback_to_lvm1", global_CFG_SECTION, CFG_DEFAULT_COMMENTED, CFG_TYPE_BOOL, 0, vsn(1, 0, 18), NULL, vsn(2, 3, 0), NULL,
 	"This setting is no longer used.\n")

-cfg(global_format_CFG, "format", global_CFG_SECTION, CFG_DEFAULT_COMMENTED, CFG_TYPE_STRING, DEFAULT_FORMAT, vsn(1, 0, 0), NULL, vsn(3, 0, 0), NULL,
+cfg(global_format_CFG, "format", global_CFG_SECTION, CFG_DEFAULT_COMMENTED, CFG_TYPE_STRING, DEFAULT_FORMAT, vsn(1, 0, 0), NULL, vsn(2, 3, 0), NULL,
 	"This setting is no longer used.\n")

-cfg_array(global_format_libraries_CFG, "format_libraries", global_CFG_SECTION, CFG_DEFAULT_UNDEFINED, CFG_TYPE_STRING, NULL, vsn(1, 0, 0), NULL, vsn(3, 0, 0), NULL,
+cfg_array(global_format_libraries_CFG, "format_libraries", global_CFG_SECTION, CFG_DEFAULT_UNDEFINED, CFG_TYPE_STRING, NULL, vsn(1, 0, 0), NULL, vsn(2, 3, 0), NULL,
 	"This setting is no longer used.")

 cfg_array(global_segment_libraries_CFG, "segment_libraries", global_CFG_SECTION, CFG_DEFAULT_UNDEFINED, CFG_TYPE_STRING, NULL, vsn(1, 0, 18), NULL, 0, NULL, NULL)
@@ -881,16 +910,16 @@ cfg(global_proc_CFG, "proc", global_CFG_SECTION, CFG_ADVANCED, CFG_TYPE_STRING,
 cfg(global_etc_CFG, "etc", global_CFG_SECTION, 0, CFG_TYPE_STRING, DEFAULT_ETC_DIR, vsn(2, 2, 117), "@CONFDIR@", 0, NULL,
 	"Location of /etc system configuration directory.\n")

-cfg(global_locking_type_CFG, "locking_type", global_CFG_SECTION, 0, CFG_TYPE_INT, 1, vsn(1, 0, 0), NULL, vsn(3, 0, 0), NULL,
+cfg(global_locking_type_CFG, "locking_type", global_CFG_SECTION, 0, CFG_TYPE_INT, 1, vsn(1, 0, 0), NULL, vsn(2, 3, 0), NULL,
 	"This setting is no longer used.")

 cfg(global_wait_for_locks_CFG, "wait_for_locks", global_CFG_SECTION, 0, CFG_TYPE_BOOL, DEFAULT_WAIT_FOR_LOCKS, vsn(2, 2, 50), NULL, 0, NULL,
 	"When disabled, fail if a lock request would block.\n")

-cfg(global_fallback_to_clustered_locking_CFG, "fallback_to_clustered_locking", global_CFG_SECTION, 0, CFG_TYPE_BOOL, DEFAULT_FALLBACK_TO_CLUSTERED_LOCKING, vsn(2, 2, 42), NULL, vsn(3, 0, 0), NULL,
+cfg(global_fallback_to_clustered_locking_CFG, "fallback_to_clustered_locking", global_CFG_SECTION, 0, CFG_TYPE_BOOL, DEFAULT_FALLBACK_TO_CLUSTERED_LOCKING, vsn(2, 2, 42), NULL, vsn(2, 3, 0), NULL,
 	"This setting is no longer used.\n")

-cfg(global_fallback_to_local_locking_CFG, "fallback_to_local_locking", global_CFG_SECTION, 0, CFG_TYPE_BOOL, DEFAULT_FALLBACK_TO_LOCAL_LOCKING, vsn(2, 2, 42), NULL, vsn(3, 0, 0), NULL,
+cfg(global_fallback_to_local_locking_CFG, "fallback_to_local_locking", global_CFG_SECTION, 0, CFG_TYPE_BOOL, DEFAULT_FALLBACK_TO_LOCAL_LOCKING, vsn(2, 2, 42), NULL, vsn(2, 3, 0), NULL,
 	"This setting is no longer used.\n")

 cfg(global_locking_dir_CFG, "locking_dir", global_CFG_SECTION, 0, CFG_TYPE_STRING, DEFAULT_LOCK_DIR, vsn(1, 0, 0), "@DEFAULT_LOCK_DIR@", 0, NULL,
@@ -910,7 +939,7 @@ cfg(global_prioritise_write_locks_CFG, "prioritise_write_locks", global_CFG_SECT
 cfg(global_library_dir_CFG, "library_dir", global_CFG_SECTION, CFG_DEFAULT_UNDEFINED, CFG_TYPE_STRING, NULL, vsn(1, 0, 0), NULL, 0, NULL,
 	"Search this directory first for shared libraries.\n")

-cfg(global_locking_library_CFG, "locking_library", global_CFG_SECTION, CFG_ALLOW_EMPTY | CFG_DEFAULT_COMMENTED, CFG_TYPE_STRING, DEFAULT_LOCKING_LIB, vsn(1, 0, 0), NULL, vsn(3, 0, 0), NULL,
+cfg(global_locking_library_CFG, "locking_library", global_CFG_SECTION, CFG_ALLOW_EMPTY | CFG_DEFAULT_COMMENTED, CFG_TYPE_STRING, DEFAULT_LOCKING_LIB, vsn(1, 0, 0), NULL, vsn(2, 3, 0), NULL,
 	"This setting is no longer used.\n")

 cfg(global_abort_on_internal_errors_CFG, "abort_on_internal_errors", global_CFG_SECTION, 0, CFG_TYPE_BOOL, DEFAULT_ABORT_ON_INTERNAL_ERRORS, vsn(2, 2, 57), NULL, 0, NULL,
@@ -953,6 +982,16 @@ cfg(global_mirror_segtype_default_CFG, "mirror_segtype_default", global_CFG_SECT
 	"    fashion in a cluster.\n"
 	"#\n")

+cfg(global_support_mirrored_mirror_log_CFG, "support_mirrored_mirror_log", global_CFG_SECTION, 0, CFG_TYPE_BOOL, 0, vsn(2, 3, 2), NULL, 0, NULL,
+	"Enable mirrored 'mirror' log type for testing.\n"
+	"#\n"
+	"Creating or converting to this type is deprecated, but it can\n"
+	"be enabled to test that activation of existing mirrored\n"
+	"logs and conversion to disk/core works.\n"
+	"#\n"
+	"Not supported for regular operation!\n"
+	"\n")
+
 cfg(global_raid10_segtype_default_CFG, "raid10_segtype_default", global_CFG_SECTION, 0, CFG_TYPE_STRING, DEFAULT_RAID10_SEGTYPE, vsn(2, 2, 99), "@DEFAULT_RAID10_SEGTYPE@", 0, NULL,
 	"The segment type used by the -i -m combination.\n"
 	"The --type raid10|mirror option overrides this setting.\n"
@@ -997,12 +1036,24 @@ cfg(global_lvdisplay_shows_full_device_path_CFG, "lvdisplay_shows_full_device_pa
 	"Previously this was always shown as /dev/vgname/lvname even when that\n"
 	"was never a valid path in the /dev filesystem.\n")

-cfg(global_use_lvmetad_CFG, "use_lvmetad", global_CFG_SECTION, 0, CFG_TYPE_BOOL, 0, vsn(2, 2, 93), 0, vsn(3, 0, 0), NULL,
+cfg(global_event_activation_CFG, "event_activation", global_CFG_SECTION, 0, CFG_TYPE_BOOL, 1, vsn(2, 3, 1), 0, 0, NULL,
+	"Activate LVs based on system-generated device events.\n"
+	"When a device appears on the system, a system-generated event runs\n"
+	"the pvscan command to activate LVs if the new PV completes the VG.\n"
+	"Use auto_activation_volume_list to select which LVs should be\n"
+	"activated from these events (the default is all.)\n"
+	"When event_activation is disabled, the system will generally run\n"
+	"a direct activation command to activate LVs in complete VGs.\n")
+
+cfg(global_use_lvmetad_CFG, "use_lvmetad", global_CFG_SECTION, 0, CFG_TYPE_BOOL, 0, vsn(2, 2, 93), 0, vsn(2, 3, 0), NULL,
 	"This setting is no longer used.\n")

-cfg(global_lvmetad_update_wait_time_CFG, "lvmetad_update_wait_time", global_CFG_SECTION, CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, 0, vsn(2, 2, 151), NULL, vsn(3, 0, 0), NULL,
+cfg(global_lvmetad_update_wait_time_CFG, "lvmetad_update_wait_time", global_CFG_SECTION, CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, 0, vsn(2, 2, 151), NULL, vsn(2, 3, 0), NULL,
 	"This setting is no longer used.\n")

+cfg(global_use_aio_CFG, "use_aio", global_CFG_SECTION, CFG_DEFAULT_COMMENTED, CFG_TYPE_BOOL, DEFAULT_USE_AIO, vsn(2, 2, 183), NULL, 0, NULL,
+	"Use async I/O when reading and writing devices.\n")
+
 cfg(global_use_lvmlockd_CFG, "use_lvmlockd", global_CFG_SECTION, 0, CFG_TYPE_BOOL, 0, vsn(2, 2, 124), NULL, 0, NULL,
 	"Use lvmlockd for locking among hosts using LVM on shared storage.\n"
 	"Applicable only if LVM is compiled with lockd support in which\n"
@@ -1594,12 +1645,19 @@ cfg(metadata_vgmetadatacopies_CFG, "vgmetadatacopies", metadata_CFG_SECTION, CFG
 	"and allows you to control which metadata areas are used at the\n"
 	"individual PV level using pvchange --metadataignore y|n.\n")

-cfg(metadata_pvmetadatasize_CFG, "pvmetadatasize", metadata_CFG_SECTION, CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, DEFAULT_PVMETADATASIZE, vsn(1, 0, 0), NULL, 0, NULL,
-	"Approximate number of sectors to use for each metadata copy.\n"
-	"VGs with large numbers of PVs or LVs, or VGs containing complex LV\n"
-	"structures, may need additional space for VG metadata. The metadata\n"
-	"areas are treated as circular buffers, so unused space becomes filled\n"
-	"with an archive of the most recent previous versions of the metadata.\n")
+cfg_runtime(metadata_pvmetadatasize_CFG, "pvmetadatasize", metadata_CFG_SECTION, CFG_DEFAULT_COMMENTED | CFG_DEFAULT_UNDEFINED, CFG_TYPE_INT, vsn(1, 0, 0), 0, NULL,
+	"The default size of the metadata area in units of 512 byte sectors.\n"
+	"The metadata area begins at an offset of the page size from the start\n"
+	"of the device. The first PE is by default at 1 MiB from the start of\n"
+	"the device. The space between these is the default metadata area size.\n"
+	"The actual size of the metadata area may be larger than what is set\n"
+	"here due to default_data_alignment making the first PE a MiB multiple.\n"
+	"The metadata area begins with a 512 byte header and is followed by a\n"
+	"circular buffer used for VG metadata text. The maximum size of the VG\n"
+	"metadata is about half the size of the metadata buffer. VGs with large\n"
+	"numbers of PVs or LVs, or VGs containing complex LV structures, may need\n"
+	"additional space for VG metadata. The --metadatasize option overrides\n"
+	"this setting.\n")

 cfg(metadata_pvmetadataignore_CFG, "pvmetadataignore", metadata_CFG_SECTION, CFG_ADVANCED | CFG_DEFAULT_COMMENTED, CFG_TYPE_BOOL, DEFAULT_PVMETADATAIGNORE, vsn(2, 2, 69), NULL, 0, NULL,
 	"Ignore metadata areas on a new PV.\n"
@@ -1609,14 +1667,14 @@ cfg(metadata_pvmetadataignore_CFG, "pvmetadataignore", metadata_CFG_SECTION, CFG

 cfg(metadata_stripesize_CFG, "stripesize", metadata_CFG_SECTION, CFG_ADVANCED | CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, DEFAULT_STRIPESIZE, vsn(1, 0, 0), NULL, 0, NULL, NULL)

-cfg_array(metadata_dirs_CFG, "dirs", metadata_CFG_SECTION, CFG_ADVANCED | CFG_DEFAULT_UNDEFINED, CFG_TYPE_STRING, NULL, vsn(1, 0, 0), NULL, vsn(3, 0, 0), NULL,
+cfg_array(metadata_dirs_CFG, "dirs", metadata_CFG_SECTION, CFG_ADVANCED | CFG_DEFAULT_UNDEFINED, CFG_TYPE_STRING, NULL, vsn(1, 0, 0), NULL, vsn(2, 3, 0), NULL,
 	  "This setting is no longer used.\n")

-cfg_section(metadata_disk_areas_CFG_SUBSECTION, "disk_areas", metadata_CFG_SECTION, CFG_UNSUPPORTED | CFG_DEFAULT_COMMENTED, vsn(1, 0, 0), vsn(3, 0, 0), NULL, NULL)
-cfg_section(disk_area_CFG_SUBSECTION, "disk_area", metadata_disk_areas_CFG_SUBSECTION, CFG_NAME_VARIABLE | CFG_UNSUPPORTED | CFG_DEFAULT_COMMENTED, vsn(1, 0, 0), vsn(3, 0, 0), NULL, NULL)
-cfg(disk_area_start_sector_CFG, "start_sector", disk_area_CFG_SUBSECTION, CFG_UNSUPPORTED | CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, 0, vsn(1, 0, 0), NULL, vsn(3, 0, 0), NULL, NULL)
-cfg(disk_area_size_CFG, "size", disk_area_CFG_SUBSECTION, CFG_UNSUPPORTED | CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, 0, vsn(1, 0, 0), NULL, vsn(3, 0, 0), NULL, NULL)
-cfg(disk_area_id_CFG, "id", disk_area_CFG_SUBSECTION, CFG_UNSUPPORTED | CFG_DEFAULT_UNDEFINED, CFG_TYPE_STRING, NULL, vsn(1, 0, 0), NULL, vsn(3, 0, 0), NULL, NULL)
+cfg_section(metadata_disk_areas_CFG_SUBSECTION, "disk_areas", metadata_CFG_SECTION, CFG_UNSUPPORTED | CFG_DEFAULT_COMMENTED, vsn(1, 0, 0), vsn(2, 3, 0), NULL, NULL)
+cfg_section(disk_area_CFG_SUBSECTION, "disk_area", metadata_disk_areas_CFG_SUBSECTION, CFG_NAME_VARIABLE | CFG_UNSUPPORTED | CFG_DEFAULT_COMMENTED, vsn(1, 0, 0), vsn(2, 3, 0), NULL, NULL)
+cfg(disk_area_start_sector_CFG, "start_sector", disk_area_CFG_SUBSECTION, CFG_UNSUPPORTED | CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, 0, vsn(1, 0, 0), NULL, vsn(2, 3, 0), NULL, NULL)
+cfg(disk_area_size_CFG, "size", disk_area_CFG_SUBSECTION, CFG_UNSUPPORTED | CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, 0, vsn(1, 0, 0), NULL, vsn(2, 3, 0), NULL, NULL)
+cfg(disk_area_id_CFG, "id", disk_area_CFG_SUBSECTION, CFG_UNSUPPORTED | CFG_DEFAULT_UNDEFINED, CFG_TYPE_STRING, NULL, vsn(1, 0, 0), NULL, vsn(2, 3, 0), NULL, NULL)

 cfg(report_output_format_CFG, "output_format", report_CFG_SECTION, CFG_PROFILABLE | CFG_DEFAULT_COMMENTED | CFG_DISALLOW_INTERACTIVE, CFG_TYPE_STRING, DEFAULT_REP_OUTPUT_FORMAT, vsn(2, 2, 158), NULL, 0, NULL,
 	"Format of LVM command's report output.\n"
--- a/lib/config/defaults.h
+++ b/lib/config/defaults.h
@@ -18,8 +18,18 @@

 #include "device_mapper/vdo/vdo_limits.h"

-#define DEFAULT_PE_ALIGN 2048
-#define DEFAULT_PE_ALIGN_OLD 128
+
+/*
+ * By default the first PE is placed at 1 MiB.
+ *
+ * If default_data_alignment is 2, then the first PE
+ * is placed at 2 * 1 MiB.
+ *
+ * If default_data_alignment is 3, then the first PE
+ * is placed at 3 * 1 MiB.
+ */
+#define FIRST_PE_AT_ONE_MB_IN_SECTORS 2048  /* 1 MiB in 512 byte sectors */
+#define FIRST_PE_AT_ONE_MB_IN_MB         1

 #define DEFAULT_ARCHIVE_ENABLED 1
 #define DEFAULT_BACKUP_ENABLED 1
@@ -60,6 +70,7 @@
 #define DEFAULT_METADATA_READ_ONLY 0
 #define DEFAULT_LVDISPLAY_SHOWS_FULL_DEVICE_PATH 0
 #define DEFAULT_UNKNOWN_DEVICE_NAME "[unknown]"
+#define DEFAULT_USE_AIO 1

 #define DEFAULT_SANLOCK_LV_EXTEND_MB 256

@@ -142,14 +153,13 @@
 /* VDO defaults */
 #define DEFAULT_VDO_USE_COMPRESSION	(true)
 #define DEFAULT_VDO_USE_DEDUPLICATION	(true)
-#define DEFAULT_VDO_EMULATE_512_SECTORS	(false)
+#define DEFAULT_VDO_USE_METADATA_HINTS	(true)
+#define DEFAULT_VDO_MINIMUM_IO_SIZE	(4096)
 #define DEFAULT_VDO_BLOCK_MAP_CACHE_SIZE_MB	(DM_VDO_BLOCK_MAP_CACHE_SIZE_MINIMUM_MB)
-#define DEFAULT_VDO_BLOCK_MAP_PERIOD	(DM_VDO_BLOCK_MAP_PERIOD_MAXIMUM)
+#define DEFAULT_VDO_BLOCK_MAP_ERA_LENGTH (DM_VDO_BLOCK_MAP_ERA_LENGTH_MAXIMUM)
 #define DEFAULT_VDO_USE_SPARSE_INDEX	(false)
 #define DEFAULT_VDO_CHECK_POINT_FREQUENCY	(0)
 #define DEFAULT_VDO_INDEX_MEMORY_SIZE_MB	(DM_VDO_INDEX_MEMORY_SIZE_MINIMUM_MB)
-#define DEFAULT_VDO_USE_READ_CACHE	(false)
-#define DEFAULT_VDO_READ_CACHE_SIZE_MB	(0)
 #define DEFAULT_VDO_SLAB_SIZE_MB	(2 * 1024)  // 2GiB ... 19 slabbits
 #define DEFAULT_VDO_ACK_THREADS		(1)
 #define DEFAULT_VDO_BIO_THREADS		(1)
@@ -159,6 +169,7 @@
 #define DEFAULT_VDO_LOGICAL_THREADS	(1)
 #define DEFAULT_VDO_PHYSICAL_THREADS	(1)
 #define DEFAULT_VDO_WRITE_POLICY	"auto"
+#define DEFAULT_VDO_MAX_DISCARD		(DM_VDO_MAX_DISCARD_MINIMUM)

 #define DEFAULT_VDO_FORMAT_OPTIONS_CONFIG "#S" ""
 /*
@@ -179,7 +190,6 @@
 #define DEFAULT_RECORD_LVS_HISTORY 0
 #define DEFAULT_LVS_HISTORY_RETENTION_TIME 0
 #define DEFAULT_PVMETADATAIGNORE 0
-#define DEFAULT_PVMETADATASIZE 255
 #define DEFAULT_PVMETADATACOPIES 1
 #define DEFAULT_VGMETADATACOPIES 0
 #define DEFAULT_LABELSECTOR UINT64_C(1)
@@ -302,4 +312,6 @@

 #define DEFAULT_SCAN_LVS 1

+#define DEFAULT_HINTS "all"
+
 #endif				/* _LVM_DEFAULTS_H */
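Note on the alignment and metadata-size help text above: the arithmetic it describes can be sketched in a few lines of C. This is only an illustration of the wording in the config comments and in the defaults.h comment, not the code lvm uses. The helper names (default_first_pe_sectors, default_mda_sectors, detected_overrides, effective_alignment_sectors) are hypothetical, and only FIRST_PE_AT_ONE_MB_IN_SECTORS comes from the patch. The sketch assumes default_data_alignment is at least 1, does not special-case a value of 0, and leaves out data_alignment and --dataalignment, which the help text says override everything shown here.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u
#define FIRST_PE_AT_ONE_MB_IN_SECTORS 2048u	/* 1 MiB in 512 byte sectors */

/* Hypothetical: first PE offset when nothing overrides default_data_alignment. */
static uint64_t default_first_pe_sectors(uint64_t default_data_alignment_mib)
{
	return default_data_alignment_mib * FIRST_PE_AT_ONE_MB_IN_SECTORS;
}

/*
 * Hypothetical: default metadata area size in sectors, read from the help
 * text as the space between one page from the start of the device and
 * the first PE.
 */
static uint64_t default_mda_sectors(uint64_t page_size_bytes,
				    uint64_t default_data_alignment_mib)
{
	uint64_t first_pe = default_first_pe_sectors(default_data_alignment_mib);
	uint64_t mda_start = page_size_bytes / SECTOR_SIZE;

	return (first_pe > mda_start) ? first_pe - mda_start : 0;
}

/*
 * Hypothetical: a detected alignment replaces the current one only when it
 * is non-zero and the current value is not already a multiple of it.
 */
static int detected_overrides(uint64_t detected, uint64_t current)
{
	return detected && (current % detected);
}

/*
 * Hypothetical: apply md_chunk_alignment first, then
 * data_alignment_detection, following the order given in the help text.
 * All values are in 512 byte sectors; 0 means "not detected or disabled".
 */
static uint64_t effective_alignment_sectors(uint64_t default_data_alignment_mib,
					    uint64_t md_stripe_sectors,
					    uint64_t io_size_sectors)
{
	uint64_t align = default_data_alignment_mib * FIRST_PE_AT_ONE_MB_IN_SECTORS;

	if (detected_overrides(md_stripe_sectors, align))
		align = md_stripe_sectors;
	if (detected_overrides(io_size_sectors, align))
		align = io_size_sectors;

	return align;
}
```

With a 4096 byte page and the default of 1 MiB, this gives a metadata area of 2040 sectors. A detected 512 KiB io size leaves the 1 MiB default in place because 1 MiB is already a multiple of it, while a 1.5 MiB md stripe width replaces it.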
--- a/lib/device/bcache.c
+++ b/lib/device/bcache.c
@@ -12,8 +12,6 @@
 * Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 */

-#define _GNU_SOURCE
-
 #include "lib/device/bcache.h"

 #include "base/data-struct/radix-tree.h"
@@ -31,7 +29,6 @@
 #include <libaio.h>
 #include <unistd.h>
 #include <linux/fs.h>
-#include <sys/ioctl.h>
 #include <sys/user.h>

 #define SECTOR_SHIFT 9L
@@ -158,6 +155,10 @@ static void _async_destroy(struct io_engine *ioe)
 	free(e);
 }

+static int _last_byte_fd;
+static uint64_t _last_byte_offset;
+static int _last_byte_sector_size;
+
 static bool _async_issue(struct io_engine *ioe, enum dir d, int fd,
 			 sector_t sb, sector_t se, void *data, void *context)
 {
@@ -165,12 +166,53 @@ static bool _async_issue(struct io_engine *ioe, enum dir d, int fd,
 	struct iocb *cb_array[1];
 	struct control_block *cb;
 	struct async_engine *e = _to_async(ioe);
+	sector_t offset;
+	sector_t nbytes;
+	sector_t limit_nbytes;
+	sector_t extra_nbytes = 0;

 	if (((uintptr_t) data) & e->page_mask) {
 		log_warn("misaligned data buffer");
 		return false;
 	}

+	offset = sb << SECTOR_SHIFT;
+	nbytes = (se - sb) << SECTOR_SHIFT;
+
+	/*
+	 * If bcache block goes past where lvm wants to write, then clamp it.
+	 */
+	if ((d == DIR_WRITE) && _last_byte_offset && (fd == _last_byte_fd)) {
+		if (offset > _last_byte_offset) {
+			log_error("Limit write at %llu len %llu beyond last byte %llu",
+				  (unsigned long long)offset,
+				  (unsigned long long)nbytes,
+				  (unsigned long long)_last_byte_offset);
+			return false;
+		}
+
+		if (offset + nbytes > _last_byte_offset) {
+			limit_nbytes = _last_byte_offset - offset;
+			if (limit_nbytes % _last_byte_sector_size)
+				extra_nbytes = _last_byte_sector_size - (limit_nbytes % _last_byte_sector_size);
+
+			if (extra_nbytes) {
+				log_debug("Limit write at %llu len %llu to len %llu rounded to %llu",
+					  (unsigned long long)offset,
+					  (unsigned long long)nbytes,
+					  (unsigned long long)limit_nbytes,
+					  (unsigned long long)(limit_nbytes + extra_nbytes));
+				nbytes = limit_nbytes + extra_nbytes;
+			} else {
+				log_debug("Limit write at %llu len %llu to len %llu",
+					  (unsigned long long)offset,
+					  (unsigned long long)nbytes,
+					  (unsigned long long)limit_nbytes);
+				nbytes = limit_nbytes;
+			}
+		}
+	}
+
 	cb = _cb_alloc(e->cbs, context);
 	if (!cb) {
 		log_warn("couldn't allocate control block");
@@ -181,10 +223,22 @@ static bool _async_issue(struct io_engine *ioe, enum dir d, int fd,

 	cb->cb.aio_fildes = (int) fd;
 	cb->cb.u.c.buf = data;
-	cb->cb.u.c.offset = sb << SECTOR_SHIFT;
-	cb->cb.u.c.nbytes = (se - sb) << SECTOR_SHIFT;
+	cb->cb.u.c.offset = offset;
+	cb->cb.u.c.nbytes = nbytes;
 	cb->cb.aio_lio_opcode = (d == DIR_READ) ? IO_CMD_PREAD : IO_CMD_PWRITE;

+#if 0
+	if (d == DIR_READ) {
+		log_debug("io R off %llu bytes %llu",
+			  (unsigned long long)cb->cb.u.c.offset,
+			  (unsigned long long)cb->cb.u.c.nbytes);
+	} else {
+		log_debug("io W off %llu bytes %llu",
+			  (unsigned long long)cb->cb.u.c.offset,
+			  (unsigned long long)cb->cb.u.c.nbytes);
+	}
+#endif
+
 	cb_array[0] = &cb->cb;
 	do {
 		r = io_submit(e->aio_context, 1, cb_array);
@@ -273,7 +327,7 @@ struct io_engine *create_async_io_engine(void)
 	e->aio_context = 0;
 	r = io_setup(MAX_IO, &e->aio_context);
 	if (r < 0) {
-		log_warn("io_setup failed");
+		log_debug("io_setup failed %d", r);
 		free(e);
 		return NULL;
 	}
@@ -316,8 +370,11 @@ static void _sync_destroy(struct io_engine *ioe)
 static bool _sync_issue(struct io_engine *ioe, enum dir d, int fd,
                        sector_t sb, sector_t se, void *data, void *context)
 {
-        int r;
-        uint64_t len = (se - sb) * 512, where;
+	int rv;
+	off_t off;
+	uint64_t where;
+	uint64_t pos = 0;
+	uint64_t len = (se - sb) * 512;
 	struct sync_engine *e = _to_sync(ioe);
 	struct sync_io *io = malloc(sizeof(*io));
 	if (!io) {
@@ -326,32 +383,99 @@ static bool _sync_issue(struct io_engine *ioe, enum dir d, int fd,
 	}

 	where = sb * 512;
-	r = lseek(fd, where, SEEK_SET);
-	if (r < 0) {
-        	log_warn("unable to seek to position %llu", (unsigned long long) where);
-        	return false;
+	off = lseek(fd, where, SEEK_SET);
+	if (off == (off_t) -1) {
+		log_warn("Device seek error %d for offset %llu", errno, (unsigned long long)where);
+		free(io);
+		return false;
+	}
+	if (off != (off_t) where) {
+		log_warn("Device seek failed for offset %llu", (unsigned long long)where);
+		free(io);
+		return false;
 	}

-	while (len) {
-        	do {
-                	if (d == DIR_READ)
-                                r = read(fd, data, len);
-                        else
-                                r = write(fd, data, len);
+	/*
+	 * If bcache block goes past where lvm wants to write, then clamp it.
+	 */
+	if ((d == DIR_WRITE) && _last_byte_offset && (fd == _last_byte_fd)) {
+		uint64_t offset = where;
+		uint64_t nbytes = len;
+		sector_t limit_nbytes = 0;
+		sector_t extra_nbytes = 0;

-        	} while ((r < 0) && ((r == EINTR) || (r == EAGAIN)));
+		if (offset > _last_byte_offset) {
+			log_error("Limit write at %llu len %llu beyond last byte %llu",
+				  (unsigned long long)offset,
+				  (unsigned long long)nbytes,
+				  (unsigned long long)_last_byte_offset);
+			free(io);
+			return false;
+		}

-        	if (r < 0) {
-                	log_warn("io failed %d", r);
-                	return false;
-        	}
+		if (offset + nbytes > _last_byte_offset) {
+			limit_nbytes = _last_byte_offset - offset;
+			if (limit_nbytes % _last_byte_sector_size)
+				extra_nbytes = _last_byte_sector_size - (limit_nbytes % _last_byte_sector_size);

-                len -= r;
+			if (extra_nbytes) {
+				log_debug("Limit write at %llu len %llu to len %llu rounded to %llu",
+					  (unsigned long long)offset,
+					  (unsigned long long)nbytes,
+					  (unsigned long long)limit_nbytes,
+					  (unsigned long long)(limit_nbytes + extra_nbytes));
+				nbytes = limit_nbytes + extra_nbytes;
+			} else {
+				log_debug("Limit write at %llu len %llu to len %llu",
+					  (unsigned long long)offset,
+					  (unsigned long long)nbytes,
+					  (unsigned long long)limit_nbytes);
+				nbytes = limit_nbytes;
+			}
+		}
+
+		where = offset;
+		len = nbytes;
 	}

-	if (len) {
-        	log_warn("short io %u bytes remaining", (unsigned) len);
+	while (pos < len) {
+		if (d == DIR_READ)
+			rv = read(fd, (char *)data + pos, len - pos);
+		else
+			rv = write(fd, (char *)data + pos, len - pos);
+
+		if (rv == -1 && errno == EINTR)
+			continue;
+		if (rv == -1 && errno == EAGAIN)
+			continue;
+
+		if (!rv)
+			break;
+
+		if (rv < 0) {
+			if (d == DIR_READ)
+				log_debug("Device read error %d offset %llu len %llu", errno,
+					  (unsigned long long)(where + pos),
+					  (unsigned long long)(len - pos));
+			else
+				log_debug("Device write error %d offset %llu len %llu", errno,
+					  (unsigned long long)(where + pos),
+					  (unsigned long long)(len - pos));
+			free(io);
+			return false;
+		}
+		pos += rv;
+	}
+
+	if (pos < len) {
+		if (d == DIR_READ)
+			log_warn("Device read short %u bytes remaining", (unsigned)(len - pos));
+		else
+			log_warn("Device write short %u bytes remaining", (unsigned)(len - pos));
+		/*
+        	free(io);
        	return false;
+		*/
 	}


@@ -878,6 +1002,11 @@ struct bcache *bcache_create(sector_t block_sectors, unsigned nr_cache_blocks,
 	unsigned max_io = engine->max_io(engine);
 	long pgsize = sysconf(_SC_PAGESIZE);

+	if (pgsize < 0) {
+		log_warn("WARNING: _SC_PAGESIZE returns negative value.");
+		return NULL;
+	}
+
 	if (!nr_cache_blocks) {
 		log_warn("bcache must have at least one cache block");
 		return NULL;
@@ -940,7 +1069,8 @@ void bcache_destroy(struct bcache *cache)
 	if (cache->nr_locked)
 		log_warn("some blocks are still locked");

-	bcache_flush(cache);
+	if (!bcache_flush(cache))
+		stack;
 	_wait_all(cache);
 	_exit_free_list(cache);
 	radix_tree_destroy(cache->rtree);
@@ -1167,3 +1297,21 @@ bool bcache_invalidate_fd(struct bcache *cache, int fd)

 //----------------------------------------------------------------

+void bcache_set_last_byte(struct bcache *cache, int fd, uint64_t offset, int sector_size)
+{
+	_last_byte_fd = fd;
+	_last_byte_offset = offset;
+	_last_byte_sector_size = sector_size;
+	if (!sector_size)
+		_last_byte_sector_size = 512;
+}
+
+void bcache_unset_last_byte(struct bcache *cache, int fd)
+{
+	if (_last_byte_fd == fd) {
+		_last_byte_fd = 0;
+		_last_byte_offset = 0;
+		_last_byte_sector_size = 0;
+	}
+}
+
--- a/lib/device/bcache.h
+++ b/lib/device/bcache.h
@@ -15,7 +15,6 @@
 #ifndef BCACHE_H
 #define BCACHE_H

-#include "configure.h"
 #include "device_mapper/all.h"

 #include <linux/fs.h>
@@ -158,6 +157,9 @@ bool bcache_write_bytes(struct bcache *cache, int fd, uint64_t start, size_t len
 bool bcache_zero_bytes(struct bcache *cache, int fd, uint64_t start, size_t len);
 bool bcache_set_bytes(struct bcache *cache, int fd, uint64_t start, size_t len, uint8_t val);

+void bcache_set_last_byte(struct bcache *cache, int fd, uint64_t offset, int sector_size);
+void bcache_unset_last_byte(struct bcache *cache, int fd);
+
 //----------------------------------------------------------------

 #endif
--- a/lib/device/dev-cache.c
+++ b/lib/device/dev-cache.c
@@ -25,7 +25,6 @@
 #include <libudev.h>
 #endif
 #include <unistd.h>
-#include <sys/param.h>
 #include <dirent.h>

 struct dev_iter {
@@ -480,7 +479,7 @@ static struct device *_get_device_for_sysfs_dev_name_using_devno(const char *dev
 		return NULL;
 	}

-	devno = MKDEV((dev_t)major, (dev_t)minor);
+	devno = MKDEV(major, minor);
 	if (!(dev = (struct device *) btree_lookup(_cache.devices, (uint32_t) devno))) {
 		/*
 		 * If we get here, it means the device is referenced in sysfs, but it's not yet in /dev.
@@ -667,10 +666,9 @@ struct dm_list *dev_cache_get_dev_list_for_lvid(const char *lvid)

 void dev_cache_failed_path(struct device *dev, const char *path)
 {
-	struct device *dev_by_path;
 	struct dm_str_list *strl;

-	if ((dev_by_path = (struct device *) dm_hash_lookup(_cache.names, path)))
+	if (dm_hash_lookup(_cache.names, path))
 		dm_hash_remove(_cache.names, path);

 	dm_list_iterate_items(strl, &dev->aliases) {
@@ -949,7 +947,7 @@ static int _dev_cache_iterate_sysfs_for_index(const char *path)
 			continue;
 		}

-		devno = MKDEV((dev_t)major, (dev_t)minor);
+		devno = MKDEV(major, minor);
 		if (!(dev = (struct device *) btree_lookup(_cache.devices, (uint32_t) devno)) &&
 		    !(dev = (struct device *) btree_lookup(_cache.sysfs_only_devices, (uint32_t) devno))) {
 			if (!dm_device_get_name(major, minor, 1, devname, sizeof(devname)) ||
@@ -1302,8 +1300,8 @@ static int _check_for_open_devices(int close_immediate)
 			log_error("Device '%s' has been left open (%d remaining references).",
 				  dev_name(dev), dev->open_count);
 			num_open++;
-			if (close_immediate)
-				dev_close_immediate(dev);
+			if (close_immediate && !dev_close_immediate(dev))
+				stack;
 		}
 	}

@@ -1475,7 +1473,7 @@ struct device *dev_cache_get(struct cmd_context *cmd, const char *name, struct d
 		return d;

 	if (f && !(d->flags & DEV_REGULAR)) {
-		ret = f->passes_filter(cmd, f, d);
+		ret = f->passes_filter(cmd, f, d, NULL);

 		if (ret == -EAGAIN) {
 			log_debug_devs("get device by name defer filter %s", dev_name(d));
@@ -1548,7 +1546,7 @@ struct device *dev_cache_get_by_devt(struct cmd_context *cmd, dev_t dev, struct
 	if (!f)
 		return d;

-	ret = f->passes_filter(cmd, f, d);
+	ret = f->passes_filter(cmd, f, d, NULL);

 	if (ret == -EAGAIN) {
 		log_debug_devs("get device by number defer filter %s", dev_name(d));
@@ -1605,7 +1603,7 @@ struct device *dev_iter_get(struct cmd_context *cmd, struct dev_iter *iter)
 		f = iter->filter;

 		if (f && !(d->flags & DEV_REGULAR)) {
-			ret = f->passes_filter(cmd, f, d);
+			ret = f->passes_filter(cmd, f, d, NULL);

 			if (ret == -EAGAIN) {
 				log_debug_devs("get device by iter defer filter %s", dev_name(d));
--- a/lib/device/dev-cache.h
+++ b/lib/device/dev-cache.h
@@ -25,11 +25,12 @@ struct cmd_context;
 * predicate for devices.
 */
 struct dev_filter {
-	int (*passes_filter) (struct cmd_context *cmd, struct dev_filter *f, struct device *dev);
+	int (*passes_filter) (struct cmd_context *cmd, struct dev_filter *f, struct device *dev, const char *use_filter_name);
 	void (*destroy) (struct dev_filter *f);
 	void (*wipe) (struct dev_filter *f);
 	void *private;
 	unsigned use_count;
+	const char *name;
 };

 int dev_cache_index_devs(void);
--- a/lib/device/dev-io.c
+++ b/lib/device/dev-io.c
@@ -17,7 +17,6 @@
 #include "lib/device/device.h"
 #include "lib/metadata/metadata.h"
 #include "lib/mm/memlock.h"
-#include "lib/locking/locking.h"

 #include <limits.h>
 #include <sys/stat.h>
@@ -149,16 +148,27 @@ static int _io(struct device_area *where, char *buffer, int should_write, dev_io
 int dev_get_block_size(struct device *dev, unsigned int *physical_block_size, unsigned int *block_size)
 {
 	const char *name = dev_name(dev);
-	int needs_open;
+	int fd = dev->bcache_fd;
+	int do_close = 0;
 	int r = 1;

-	needs_open = (!dev->open_count && (dev->phys_block_size == -1 || dev->block_size == -1));
+	if ((dev->phys_block_size > 0) && (dev->block_size > 0)) {
+		*physical_block_size = (unsigned int)dev->phys_block_size;
+		*block_size = (unsigned int)dev->block_size;
+		return 1;
+	}

-	if (needs_open && !dev_open_readonly(dev))
-		return_0;
+	if (fd <= 0) {
+		if (!dev->open_count) {
+			if (!dev_open_readonly(dev))
+				return_0;
+			do_close = 1;
+		}
+		fd = dev_fd(dev);
+	}

 	if (dev->block_size == -1) {
-		if (ioctl(dev_fd(dev), BLKBSZGET, &dev->block_size) < 0) {
+		if (ioctl(fd, BLKBSZGET, &dev->block_size) < 0) {
 			log_sys_error("ioctl BLKBSZGET", name);
 			r = 0;
 			goto out;
@@ -169,7 +179,7 @@ int dev_get_block_size(struct device *dev, unsigned int *physical_block_size, un
 #ifdef BLKPBSZGET
 	/* BLKPBSZGET is available in kernel >= 2.6.32 only */
 	if (dev->phys_block_size == -1) {
-		if (ioctl(dev_fd(dev), BLKPBSZGET, &dev->phys_block_size) < 0) {
+		if (ioctl(fd, BLKPBSZGET, &dev->phys_block_size) < 0) {
 			log_sys_error("ioctl BLKPBSZGET", name);
 			r = 0;
 			goto out;
@@ -179,7 +189,7 @@ int dev_get_block_size(struct device *dev, unsigned int *physical_block_size, un
 #elif defined (BLKSSZGET)
 	/* if we can't get physical block size, just use logical block size instead */
 	if (dev->phys_block_size == -1) {
-		if (ioctl(dev_fd(dev), BLKSSZGET, &dev->phys_block_size) < 0) {
+		if (ioctl(fd, BLKSSZGET, &dev->phys_block_size) < 0) {
 			log_sys_error("ioctl BLKSSZGET", name);
 			r = 0;
 			goto out;
@@ -197,7 +207,7 @@ int dev_get_block_size(struct device *dev, unsigned int *physical_block_size, un
 	*physical_block_size = (unsigned int) dev->phys_block_size;
 	*block_size = (unsigned int) dev->block_size;
 out:
-	if (needs_open && !dev_close_immediate(dev))
+	if (do_close && !dev_close_immediate(dev))
 		stack;

 	return r;
@@ -504,7 +514,8 @@ int dev_open_flags(struct device *dev, int flags, int direct, int quiet)
 		/* dev_close_immediate will decrement this */
 		dev->open_count++;

-		dev_close_immediate(dev);
+		if (!dev_close_immediate(dev))
+			return_0;
 		// FIXME: dev with DEV_ALLOCED is released
 		// but code is referencing it
 	}
@@ -585,7 +596,8 @@ int dev_open_flags(struct device *dev, int flags, int direct, int quiet)
 	if (!(dev->flags & DEV_REGULAR) &&
 	    ((fstat(dev->fd, &buf) < 0) || (buf.st_rdev != dev->dev))) {
 		log_error("%s: fstat failed: Has device name changed?", name);
-		dev_close_immediate(dev);
+		if (!dev_close_immediate(dev))
+			stack;
 		return 0;
 	}

--- a/lib/device/dev-md.c
+++ b/lib/device/dev-md.c
@@ -190,14 +190,24 @@ out:

 int dev_is_md(struct device *dev, uint64_t *offset_found, int full)
 {
+	int ret;

 	/*
 	 * If non-native device status source is selected, use it
 	 * only if offset_found is not requested as this
 	 * information is not in udev db.
 	 */
-	if ((dev->ext.src == DEV_EXT_NONE) || offset_found)
-		return _native_dev_is_md(dev, offset_found, full);
+	if ((dev->ext.src == DEV_EXT_NONE) || offset_found) {
+		ret = _native_dev_is_md(dev, offset_found, full);
+
+		if (!full) {
+			if (!ret || (ret == -EAGAIN)) {
+				if (udev_dev_is_md_component(dev))
+					return 1;
+			}
+		}
+		return ret;
+	}

 	if (dev->ext.src == DEV_EXT_UDEV)
 		return _udev_dev_is_md(dev);
@@ -422,7 +432,7 @@ int dev_is_md_with_end_superblock(struct dev_types *dt, struct device *dev)
 	log_very_verbose("Device %s %s is %s.",
 			 dev_name(dev), attribute, version_string);

-	if (!strcmp(version_string, "1.0"))
+	if (!strcmp(version_string, "1.0") || !strcmp(version_string, "0.90"))
 		return 1;
 	return 0;
 }
--- a/lib/device/dev-type.c
+++ b/lib/device/dev-type.c
@@ -15,15 +15,13 @@
 #include "base/memory/zalloc.h"
 #include "lib/misc/lib.h"
 #include "lib/device/dev-type.h"
+#include "lib/device/device-types.h"
 #include "lib/mm/xlate.h"
 #include "lib/config/config.h"
 #include "lib/metadata/metadata.h"
 #include "lib/device/bcache.h"
 #include "lib/label/label.h"

-#include <libgen.h>
-#include <ctype.h>
-
 #ifdef BLKID_WIPING_SUPPORT
 #include <blkid.h>
 #endif
@@ -33,7 +31,83 @@
 #include "lib/device/dev-ext-udev-constants.h"
 #endif

-#include "lib/device/device-types.h"
+#include <libgen.h>
+#include <ctype.h>
+
+/*
+ * dev is pmem if /sys/dev/block/<major>:<minor>/queue/dax is 1
+ */
+
+int dev_is_pmem(struct device *dev)
+{
+	FILE *fp;
+	char path[PATH_MAX];
+	char buffer[64];
+	int is_pmem = 0;
+
+	if (dm_snprintf(path, sizeof(path), "%sdev/block/%d:%d/queue/dax",
+			dm_sysfs_dir(),
+			(int) MAJOR(dev->dev),
+			(int) MINOR(dev->dev)) < 0) {
+		log_warn("Sysfs path for %s dax is too long.", dev_name(dev));
+		return 0;
+	}
+
+	if (!(fp = fopen(path, "r")))
+		return 0;
+
+	if (!fgets(buffer, sizeof(buffer), fp)) {
+		log_warn("Failed to read %s.", path);
+		if (fclose(fp))
+			log_sys_debug("fclose", path);
+		return 0;
+	} else if (sscanf(buffer, "%d", &is_pmem) != 1) {
+		log_warn("Failed to parse %s '%s'.", path, buffer);
+		if (fclose(fp))
+			log_sys_debug("fclose", path);
+		return 0;
+	}
+
+	if (fclose(fp))
+		log_sys_debug("fclose", path);
+
+	if (is_pmem) {
+		log_debug("%s is pmem", dev_name(dev));
+		return 1;
+	}
+
+	return 0;
+}
+
+int dev_is_lv(struct device *dev)
+{
+	FILE *fp;
+	char path[PATH_MAX];
+	char buffer[64];
+
+	if (dm_snprintf(path, sizeof(path), "%sdev/block/%d:%d/dm/uuid",
+			dm_sysfs_dir(),
+			(int) MAJOR(dev->dev),
+			(int) MINOR(dev->dev)) < 0) {
+		log_warn("Sysfs dm uuid path for %s is too long.", dev_name(dev));
+		return 0;
+	}
+
+	if (!(fp = fopen(path, "r")))
+		return 0;
+
+	if (!fgets(buffer, sizeof(buffer), fp)) {
+		log_warn("Failed to read %s.", path);
+		fclose(fp);
+		return 0;
+	}
+
+	fclose(fp);
+
+	if (!strncmp(buffer, "LVM-", 4))
+		return 1;
+	return 0;
+}

 struct dev_types *create_dev_types(const char *proc_dir,
 				   const struct dm_config_node *cn)
@@ -506,7 +580,7 @@ int dev_get_primary_dev(struct dev_types *dt, struct device *dev, dev_t *result)
 	 */
 	if ((parts = dt->dev_type_array[major].max_partitions) > 1) {
 		if ((residue = minor % parts)) {
-			*result = MKDEV((dev_t)major, (dev_t)(minor - residue));
+			*result = MKDEV(major, (minor - residue));
 			ret = 2;
 		} else {
 			*result = dev->dev;
@@ -576,7 +650,7 @@ int dev_get_primary_dev(struct dev_types *dt, struct device *dev, dev_t *result)
 			  path, buffer);
 		goto out;
 	}
-	*result = MKDEV((dev_t)major, (dev_t)minor);
+	*result = MKDEV(major, minor);
 	ret = 2;
 out:
 	if (fp && fclose(fp))
@@ -1005,25 +1079,23 @@ int dev_is_rotational(struct dev_types *dt, struct device *dev)
 *        failed already due to timeout in udev - in both cases the
 *        udev_device_get_is_initialized returns 0.
 */
-#define UDEV_DEV_IS_MPATH_COMPONENT_ITERATION_COUNT 100
-#define UDEV_DEV_IS_MPATH_COMPONENT_USLEEP 100000
+#define UDEV_DEV_IS_COMPONENT_ITERATION_COUNT 100
+#define UDEV_DEV_IS_COMPONENT_USLEEP 100000

-int udev_dev_is_mpath_component(struct device *dev)
+static struct udev_device *_udev_get_dev(struct device *dev)
 {
 	struct udev *udev_context = udev_get_library_context();
 	struct udev_device *udev_device = NULL;
-	const char *value;
 	int initialized = 0;
 	unsigned i = 0;
-	int ret = 0;

 	if (!udev_context) {
 		log_warn("WARNING: No udev context available to check if device %s is multipath component.", dev_name(dev));
-		return 0;
+		return NULL;
 	}

 	while (1) {
-		if (i >= UDEV_DEV_IS_MPATH_COMPONENT_ITERATION_COUNT)
+		if (i >= UDEV_DEV_IS_COMPONENT_ITERATION_COUNT)
 			break;

 		if (udev_device)
@@ -1031,7 +1103,7 @@ int udev_dev_is_mpath_component(struct device *dev)

 		if (!(udev_device = udev_device_new_from_devnum(udev_context, 'b', dev->dev))) {
 			log_warn("WARNING: Failed to get udev device handler for device %s.", dev_name(dev));
-			return 0;
+			return NULL;
 		}

 #ifdef HAVE_LIBUDEV_UDEV_DEVICE_GET_IS_INITIALIZED
@@ -1043,19 +1115,32 @@ int udev_dev_is_mpath_component(struct device *dev)
 #endif

 		log_debug("Device %s not initialized in udev database (%u/%u, %u microseconds).", dev_name(dev),
-			   i + 1, UDEV_DEV_IS_MPATH_COMPONENT_ITERATION_COUNT,
-			   i * UDEV_DEV_IS_MPATH_COMPONENT_USLEEP);
+			   i + 1, UDEV_DEV_IS_COMPONENT_ITERATION_COUNT,
+			   i * UDEV_DEV_IS_COMPONENT_USLEEP);

-		usleep(UDEV_DEV_IS_MPATH_COMPONENT_USLEEP);
+		usleep(UDEV_DEV_IS_COMPONENT_USLEEP);
 		i++;
 	}

 	if (!initialized) {
 		log_warn("WARNING: Device %s not initialized in udev database even after waiting %u microseconds.",
-			  dev_name(dev), i * UDEV_DEV_IS_MPATH_COMPONENT_USLEEP);
+			  dev_name(dev), i * UDEV_DEV_IS_COMPONENT_USLEEP);
 		goto out;
 	}

+out:
+	return udev_device;
+}
+
+int udev_dev_is_mpath_component(struct device *dev)
+{
+	struct udev_device *udev_device;
+	const char *value;
+	int ret = 0;
+
+	if (!(udev_device = _udev_get_dev(dev)))
+		return 0;
+
 	value = udev_device_get_property_value(udev_device, DEV_EXT_UDEV_BLKID_TYPE);
 	if (value && !strcmp(value, DEV_EXT_UDEV_BLKID_TYPE_MPATH)) {
 		log_debug("Device %s is multipath component based on blkid variable in udev db (%s=\"%s\").",
@@ -1075,6 +1160,28 @@ out:
 	udev_device_unref(udev_device);
 	return ret;
 }
+
+int udev_dev_is_md_component(struct device *dev)
+{
+	struct udev_device *udev_device;
+	const char *value;
+	int ret = 0;
+
+	if (!(udev_device = _udev_get_dev(dev)))
+		return 0;
+
+	value = udev_device_get_property_value(udev_device, DEV_EXT_UDEV_BLKID_TYPE);
+	if (value && !strcmp(value, DEV_EXT_UDEV_BLKID_TYPE_SW_RAID)) {
+		log_debug("Device %s is md raid component based on blkid variable in udev db (%s=\"%s\").",
+			   dev_name(dev), DEV_EXT_UDEV_BLKID_TYPE, value);
+		ret = 1;
+		goto out;
+	}
+out:
+	udev_device_unref(udev_device);
+	return ret;
+}
+
 #else

 int udev_dev_is_mpath_component(struct device *dev)
@@ -1082,4 +1189,9 @@ int udev_dev_is_mpath_component(struct device *dev)
 	return 0;
 }

+int udev_dev_is_md_component(struct device *dev)
+{
+	return 0;
+}
+
 #endif
--- a/lib/device/dev-type.h
+++ b/lib/device/dev-type.h
@@ -62,6 +62,7 @@ int dev_is_swap(struct device *dev, uint64_t *signature, int full);
 int dev_is_luks(struct device *dev, uint64_t *signature, int full);
 int dasd_is_cdl_formatted(struct device *dev);
 int udev_dev_is_mpath_component(struct device *dev);
+int udev_dev_is_md_component(struct device *dev);

 int dev_is_lvm1(struct device *dev, char *buf, int buflen);
 int dev_is_pool(struct device *dev, char *buf, int buflen);
@@ -92,4 +93,8 @@ unsigned long dev_discard_granularity(struct dev_types *dt, struct device *dev);

 int dev_is_rotational(struct dev_types *dt, struct device *dev);

+int dev_is_pmem(struct device *dev);
+
+int dev_is_lv(struct device *dev);
+
 #endif
--- a/lib/device/device.h
+++ b/lib/device/device.h
@@ -36,6 +36,7 @@
 #define DEV_FILTER_AFTER_SCAN	0x00002000	/* apply filter after bcache has data */
 #define DEV_FILTER_OUT_SCAN	0x00004000	/* filtered out during label scan */
 #define DEV_BCACHE_WRITE	0x00008000      /* bcache_fd is open with RDWR */
+#define DEV_SCAN_FOUND_LABEL	0x00010000      /* label scan read dev and found label */

 /*
 * Support for external device info.
--- a/lib/filters/filter-composite.c
+++ b/lib/filters/filter-composite.c
@@ -18,13 +18,15 @@
 #include "lib/filters/filter.h"
 #include "lib/device/device.h"

-static int _and_p(struct cmd_context *cmd, struct dev_filter *f, struct device *dev)
+static int _and_p(struct cmd_context *cmd, struct dev_filter *f, struct device *dev, const char *use_filter_name)
 {
 	struct dev_filter **filters;
 	int ret;

 	for (filters = (struct dev_filter **) f->private; *filters; ++filters) {
-		ret = (*filters)->passes_filter(cmd, *filters, dev);
+		if (use_filter_name && strcmp((*filters)->name, use_filter_name))
+			continue;
+		ret = (*filters)->passes_filter(cmd, *filters, dev, use_filter_name);

 		if (!ret)
 			return 0;	/* No 'stack': a filter, not an error. */
@@ -33,12 +35,12 @@ static int _and_p(struct cmd_context *cmd, struct dev_filter *f, struct device *
 	return 1;
 }

-static int _and_p_with_dev_ext_info(struct cmd_context *cmd, struct dev_filter *f, struct device *dev)
+static int _and_p_with_dev_ext_info(struct cmd_context *cmd, struct dev_filter *f, struct device *dev, const char *use_filter_name)
 {
 	int r;

 	dev_ext_enable(dev, external_device_info_source());
-	r = _and_p(cmd, f, dev);
+	r = _and_p(cmd, f, dev, use_filter_name);
 	dev_ext_disable(dev);

 	return r;
@@ -93,6 +95,7 @@ struct dev_filter *composite_filter_create(int n, int use_dev_ext_info, struct d
 	cft->wipe = _wipe;
 	cft->use_count = 0;
 	cft->private = filters_copy;
+	cft->name = "composite";

 	log_debug_devs("Composite filter initialised.");

--- a/lib/filters/filter-fwraid.c
+++ b/lib/filters/filter-fwraid.c
@@ -65,7 +65,7 @@ static int _dev_is_fwraid(struct device *dev)
 #define MSG_SKIPPING "%s: Skipping firmware RAID component device"

 static int _ignore_fwraid(struct cmd_context *cmd, struct dev_filter *f __attribute__((unused)),
-			   struct device *dev)
+			   struct device *dev, const char *use_filter_name)
 {
 	int ret;

@@ -113,6 +113,7 @@ struct dev_filter *fwraid_filter_create(struct dev_types *dt __attribute__((unus
 	f->destroy = _destroy;
 	f->use_count = 0;
 	f->private = NULL;
+	f->name = "fwraid";

 	log_debug_devs("Firmware RAID filter initialised.");

--- a/lib/filters/filter-internal.c
+++ b/lib/filters/filter-internal.c
@@ -38,7 +38,7 @@ void internal_filter_clear(void)
 }

 static int _passes_internal(struct cmd_context *cmd, struct dev_filter *f __attribute__((unused)),
-			    struct device *dev)
+			    struct device *dev, const char *use_filter_name)
 {
 	struct device_list *devl;

@@ -74,6 +74,7 @@ struct dev_filter *internal_filter_create(void)
 	f->passes_filter = _passes_internal;
 	f->destroy = _destroy;
 	f->use_count = 0;
+	f->name = "internal";

 	log_debug_devs("Internal filter initialised.");

--- a/lib/filters/filter-md.c
+++ b/lib/filters/filter-md.c
@@ -16,6 +16,7 @@
 #include "base/memory/zalloc.h"
 #include "lib/misc/lib.h"
 #include "lib/filters/filter.h"
+#include "lib/commands/toolcontext.h"

 #ifdef __linux__

@@ -45,7 +46,7 @@
 * 3. use udev to detect components
 *
 * mode 1 will not detect and exclude components of md devices
- * that use superblock version 1.0 which is at the end of the device.
+ * that use superblock version 0.9 or 1.0 which is at the end of the device.
 *
 * mode 2 will detect these, but mode 2 doubles the i/o done by label
 * scan, since there's a read at both the start and end of every device.
@@ -58,11 +59,11 @@
 *
 * - the command is pvcreate/vgcreate/vgextend, which format new
 *   devices, and if the user ran these commands on a component
- *   device of an md device 1.0, then it would cause problems.
+ *   device of an md device 0.9 or 1.0, then it would cause problems.
 *   FIXME: this would only really need to scan the end of the
 *   devices being formatted, not all devices.
 *
- * - it sees an md device on the system using version 1.0.
+ * - it sees an md device on the system using version 0.9 or 1.0.
 *   The point of this is just to avoid displaying md components
 *   from the 'pvs' command.
 *   FIXME: the cost (double i/o) may not be worth the benefit
@@ -81,7 +82,7 @@
 * that will not pass.
 */

-static int _passes_md_filter(struct cmd_context *cmd, struct dev_filter *f __attribute__((unused)), struct device *dev)
+static int _passes_md_filter(struct cmd_context *cmd, struct dev_filter *f __attribute__((unused)), struct device *dev, const char *use_filter_name)
 {
 	int ret;

@@ -105,6 +106,7 @@ static int _passes_md_filter(struct cmd_context *cmd, struct dev_filter *f __att
 		return 1;

 	if (ret == 1) {
+		log_debug_devs("md filter full %d excluding md component %s", cmd->use_full_md_check, dev_name(dev));
 		if (dev->ext.src == DEV_EXT_NONE)
 			log_debug_devs(MSG_SKIPPING, dev_name(dev));
 		else
@@ -143,6 +145,7 @@ struct dev_filter *md_filter_create(struct cmd_context *cmd, struct dev_types *d
 	f->destroy = _destroy;
 	f->use_count = 0;
 	f->private = dt;
+	f->name = "md";

 	log_debug_devs("MD filter initialised.");

--- a/lib/filters/filter-mpath.c
+++ b/lib/filters/filter-mpath.c
@@ -247,7 +247,7 @@ static int _dev_is_mpath(struct dev_filter *f, struct device *dev)

 #define MSG_SKIPPING "%s: Skipping mpath component device"

-static int _ignore_mpath(struct cmd_context *cmd, struct dev_filter *f, struct device *dev)
+static int _ignore_mpath(struct cmd_context *cmd, struct dev_filter *f, struct device *dev, const char *use_filter_name)
 {
 	if (_dev_is_mpath(f, dev) == 1) {
 		if (dev->ext.src == DEV_EXT_NONE)
@@ -288,6 +288,7 @@ struct dev_filter *mpath_filter_create(struct dev_types *dt)
 	f->destroy = _destroy;
 	f->use_count = 0;
 	f->private = dt;
+	f->name = "mpath";

 	log_debug_devs("mpath filter initialised.");

--- a/lib/filters/filter-partitioned.c
+++ b/lib/filters/filter-partitioned.c
@@ -19,7 +19,7 @@

 #define MSG_SKIPPING "%s: Skipping: Partition table signature found"

-static int _passes_partitioned_filter(struct cmd_context *cmd, struct dev_filter *f, struct device *dev)
+static int _passes_partitioned_filter(struct cmd_context *cmd, struct dev_filter *f, struct device *dev, const char *use_filter_name)
 {
 	struct dev_types *dt = (struct dev_types *) f->private;
 	int ret;
@@ -66,6 +66,7 @@ struct dev_filter *partitioned_filter_create(struct dev_types *dt)
 	f->destroy = _partitioned_filter_destroy;
 	f->use_count = 0;
 	f->private = dt;
+	f->name = "partitioned";

 	log_debug_devs("Partitioned filter initialised.");

--- a/lib/filters/filter-persistent.c
+++ b/lib/filters/filter-persistent.c
@@ -71,13 +71,16 @@ static void _persistent_filter_wipe(struct dev_filter *f)
 	dm_hash_wipe(pf->devices);
 }

-static int _lookup_p(struct cmd_context *cmd, struct dev_filter *f, struct device *dev)
+static int _lookup_p(struct cmd_context *cmd, struct dev_filter *f, struct device *dev, const char *use_filter_name)
 {
 	struct pfilter *pf = (struct pfilter *) f->private;
 	void *l;
 	struct dm_str_list *sl;
 	int pass = 1;

+	if (use_filter_name && strcmp(f->name, use_filter_name))
+		return pf->real->passes_filter(cmd, pf->real, dev, use_filter_name);
+
 	if (dm_list_empty(&dev->aliases)) {
 		log_debug_devs("%d:%d: filter cache skipping (no name)",
 				(int)MAJOR(dev->dev), (int)MINOR(dev->dev));
@@ -102,7 +105,7 @@ static int _lookup_p(struct cmd_context *cmd, struct dev_filter *f, struct devic
 	if (!l) {
 		dev->flags &= ~DEV_FILTER_AFTER_SCAN;

-		pass = pf->real->passes_filter(cmd, pf->real, dev);
+		pass = pf->real->passes_filter(cmd, pf->real, dev, use_filter_name);

 		if (!pass) {
 			/*
@@ -182,6 +185,7 @@ struct dev_filter *persistent_filter_create(struct dev_types *dt, struct dev_fil
 	f->use_count = 0;
 	f->private = pf;
 	f->wipe = _persistent_filter_wipe;
+	f->name = "persistent";

 	log_debug_devs("Persistent filter initialised.");

--- a/lib/filters/filter-regex.c
+++ b/lib/filters/filter-regex.c
@@ -145,7 +145,7 @@ static int _build_matcher(struct rfilter *rf, const struct dm_config_value *val)
 	return r;
 }

-static int _accept_p(struct cmd_context *cmd, struct dev_filter *f, struct device *dev)
+static int _accept_p(struct cmd_context *cmd, struct dev_filter *f, struct device *dev, const char *use_filter_name)
 {
 	int m, first = 1, rejected = 0;
 	struct rfilter *rf = (struct rfilter *) f->private;
@@ -212,6 +212,7 @@ struct dev_filter *regex_filter_create(const struct dm_config_value *patterns)
 	f->destroy = _regex_destroy;
 	f->use_count = 0;
 	f->private = rf;
+	f->name = "regex";

 	log_debug_devs("Regex filter initialised.");
