shaba/lvm2

mirror of git://sourceware.org/git/lvm2.git synced 2024-12-22 17:35:59 +03:00

Author	SHA1	Message	Date
Zdenek Kabelac	d6ac039b65	cov: widen before calculating min_chunk_size Although we expect min_chunk_size to be 32bit value, for large size of caches it might be useful to do calcs 64bit. So to avoid doing shift as signed 32bit - use unsigned 64bit from the start.	2020-02-04 17:22:06 +01:00
Zdenek Kabelac	de43527f94	cov: unused header file removal cov: unused header removed Also ensure library header file with config settings goes first. Move inclusion of format-text.h into layout.h	2020-02-04 17:22:06 +01:00
David Teigland	bddbbcb98c	writecache: report status fields reporting fields (-o) directly from kernel: writecache_total_blocks writecache_free_blocks writecache_writeback_blocks writecache_error The data_percent field shows used cache blocks / total cache blocks.	2020-01-31 11:52:49 -06:00
Zdenek Kabelac	cf844941d4	vdo: adapt for multi line vdo_format output Do not close pipeline after 1st. line parsed from vdo_format. Also reprint the output for a user so new messages from vdo_format can be seen by users.	2020-01-23 10:32:15 +01:00
Zdenek Kabelac	d7bf7091c3	raid: more limited prohibition of stacked raid usage We actually need to prohibit only reshaping cases which are running over multiple commands.	2020-01-23 10:32:15 +01:00
Zdenek Kabelac	7737ffb11c	raid: disallow reshape of stacked LVs Until we resolve reshape for 'stacked' devices, we need to disable it. So users can no longer reshape i.e. thin-pool data volumes, causing ATM bad thin-pool problems.	2020-01-13 17:42:31 +01:00
David Teigland	2173bdb821	drop warnings about missing pvs in foreign vgs When a foreign VG is ignored, don't print warnings that it is missing PVs.	2019-12-11 12:56:15 -06:00
Zdenek Kabelac	89d839e541	cleanup: simpler form	2019-12-10 15:44:16 +01:00
Zdenek Kabelac	abc0a8faba	vg_read: use else for 3 case Make it visible we check for ==, >, < of same var.	2019-12-10 15:44:16 +01:00
Zdenek Kabelac	5555765cfc	debug: enhance messages Drop 'extra' stack trace where errors are already logged from function. Add some missing dots in messages.	2019-12-10 15:44:16 +01:00
Nikhil Kshirsagar	e70d5d470c	debug: print VG name in log messages for segment errors Signed-off-by: Nikhil Kshirsagar <nkshirsa@redhat.com>	2019-12-10 15:44:06 +01:00
David Teigland	74ad2cd76f	metadata: add vg_from_config_tree Add cmd/fmt args to import functions so that they can be used without the fid arg which.	2019-11-27 11:13:47 -06:00
David Teigland	98a8099da9	scanning: use bool type for _scan_text_mismatch	2019-11-27 09:26:49 -06:00
David Teigland	0c1316cda8	scanning: optimize by checking text offset and checksum After the VG lock is taken for vg_read, reread the mda_header and compare the metadata text offset and checksum to what was seen during label scan. If it is unchanged, then the metadata has not changed since the label scan, and the metadata does not need to be reread under the lock for command processing. For commands that do not make changes (e.g. reporting), the mda_header is reread and checked on one mda to decide if the full metadata rereading can be skipped. For other commands (e.g. modifying the vg) the mda_header is reread and checked from all PVs. (These could probably just check one mda also.)	2019-11-26 16:52:28 -06:00
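The check described in commit 0c1316cda8 can be sketched roughly as follows. This is an illustrative model only, not lvm2 code: `MdaHeaderInfo` and `can_skip_full_reread` are hypothetical names, and the assumption is that each mda_header yields a (text offset, checksum) pair that can be compared before and after the VG lock is taken.

```python
from typing import NamedTuple, Sequence


class MdaHeaderInfo(NamedTuple):
    """Hypothetical record of what one mda_header says about the metadata text."""
    text_offset: int
    text_checksum: int


def can_skip_full_reread(
    scanned: Sequence[MdaHeaderInfo],
    reread: Sequence[MdaHeaderInfo],
    command_modifies_vg: bool,
) -> bool:
    """Return True if the metadata seen during label scan is still current.

    Read-only commands compare a single mda; modifying commands compare
    every mda, following the commit message.
    """
    if not scanned or len(scanned) != len(reread):
        return False
    if command_modifies_vg:
        pairs = list(zip(scanned, reread))
    else:
        pairs = [(scanned[0], reread[0])]
    # Unchanged offset and checksum means the text has not been rewritten.
    return all(before == after for before, after in pairs)
```

In this model a reporting command tolerates a stale second mda as long as the first one matches, while a modifying command falls back to the full reread.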
Zdenek Kabelac	33c1d2e921	cov: add explicit ret value ignoring We don't need to check for any error result codes here.	2019-11-14 18:06:42 +01:00
Zdenek Kabelac	ad0343d8cb	cov: remove unused headers	2019-11-14 18:06:42 +01:00
Heming Zhao	13c254fc05	fix dev_unset_last_byte after write error dev_unset_last_byte() must be called while the fd is still valid. After a write error, dev_unset_last_byte() must be called before closing the dev and resetting the fd. In the write error path, dev_unset_last_byte() was being called after label_scan_invalidate() which meant that it would not unset the last_byte values. After a write error, dev_unset_last_byte() is now called in dev_write_bytes() before label_scan_invalidate(), instead of by the caller of dev_write_bytes(). In the common case of a successful write, the sequence is still: dev_set_last_byte(); dev_write_bytes(); dev_unset_last_byte(); Signed-off-by: Zhao Heming <heming.zhao@suse.com>	2019-11-13 09:36:58 -06:00
Zdenek Kabelac	08f36dd093	lvextend: fix resizing volumes of different segtype When resizing 2 volumes like thin-pool and its metadata and they would be of a different type - command would be actually expecting both LVs being of the same segtype - and would throw an error in case they are different. This patch fixes it by setting a new segtype from the last segment of the 2nd. extended device. Also it fixes the possible 'percentage' extension setup that might have been used for 'primary' volume - while the 'secondary' LV always goes with direct size - as we do not support 'percentage' setup for them. This affects mainly usage of thin-pool where the extension of thin-pool data size may also lead to extension of metadata size.	2019-11-11 22:44:25 +01:00
Zdenek Kabelac	8689b4ed82	raid: drop internal error Fix some internal error reports and debug trace returns	2019-10-31 15:31:30 +01:00
Zdenek Kabelac	3d9fc7d6f3	manip: optimize lvs_using_lv Instead of checking all LVs in a VG - do just a direct copy of LVs from the existing list ->segs_using_thin_lv. TODO: maybe it could be better to expose seg_list to /tools...	2019-10-31 15:31:30 +01:00
Zdenek Kabelac	c21440536d	mirror: remove unused code	2019-10-31 15:31:30 +01:00
Zdenek Kabelac	ab315e7a81	mirror: directly activate updated mirror	2019-10-31 15:31:30 +01:00
Zdenek Kabelac	0e5f39a5ac	snapshot: use single merging sequence The resume of 'released' 'COW' should precede the resume of origin. The fact we need to do the sequence differently for merge was caused by bugs fixed in 2 previous commits - so we no longer need to recognize 'merging' and we should always go with single sequence. The importance of this order is - to properly remove '-real' device from origin LV. When COW is activated as 2nd. '-real' device is kept in table as it cannot be removed during 1st. resume of origin, and later activation of COW LV no longer builds tree associated with origin LV.	2019-10-26 00:49:16 +02:00
David Teigland	6a8bd0c509	lvmlockd: fix cachevol locking When a cachevol LV is attached, have the LV keep its lock allocated. The lock on the cachevol won't be used while it's attached. When the cachevol is split a new lock does not need to be allocated. (Applies to cachevol usage by both dm-cache and dm-writecache.)	2019-10-25 14:08:59 -05:00
David Teigland	c08704cee7	cachevol: use cachepool code for metadata size Based on a more detailed calculation, but because of extent size rounding, the final result is about the same.	2019-10-21 12:13:33 -05:00
Zdenek Kabelac	0c01a4c2a6	gcc: avoid warning: declaration of xxx shadows a global declaration Fix some gcc complaints again shadowing global declarations	2019-10-21 15:32:35 +02:00
Zdenek Kabelac	dd7629ea09	cache: use _cpool for used cache-pools When LV gets cached and uses cache-pool - such cache-pool will now get _cpool suffix automatically. Thus 'Pool' column for cached LV will now show either _cvol or _cpool LV.	2019-10-21 15:31:33 +02:00
Zdenek Kabelac	2266a1863f	lv_manip: add lv_uniq_rename_update Add function to rename LV to either passed name or if the name is already in use, generate new lvol% name.	2019-10-21 12:14:15 +02:00
Zdenek Kabelac	ec85dfe0f8	cachevol: support removal of cachevol Removal of cachevol is equivalent of lvconvert --uncache and works the same way as with cachepool.	2019-10-17 13:03:50 +02:00
Zdenek Kabelac	5938cde11b	cache: single code for removal of cached volume Use same routine for dropping cached LV for cachevol and cachepool.	2019-10-17 13:03:50 +02:00
Zdenek Kabelac	9969361b51	debug: missing trace	2019-10-17 13:03:50 +02:00
Zdenek Kabelac	dab4a2c893	cachevol: move flag setting after taking archive Before 'archive()' is called, lvm2 must not touch/modify metadata. So move setting CACHE_VOL related flags past this point. Also make sure reading of cache segtype always restores this flag properly (even if compatible flag would be lost).	2019-10-17 13:03:50 +02:00
Zdenek Kabelac	f63e20ebcc	cache: drop validation check Since now we can cache either with cache-pool LV or any other LV (being used as cachevol LV) drop the validation condition.	2019-10-17 13:03:49 +02:00
Zdenek Kabelac	af8cfa90d9	cache: add more comments for min meta size Enhance source code with better explanation how the minimal metadata size is evaluated from data size and chunk size.	2019-10-17 13:03:49 +02:00
Zdenek Kabelac	2a08d6d1d4	cachevol: use CVOL UUID for cdata and cmeta layered devices Since code is using -cdata and -cmeta UUID suffixes, it does not need any new 'extra' ID to be generated and stored in metadata. Since the introduction of new 'segtype' cache+CACHE_USES_CACHEVOL we can safely assume 'new' cache with cachevol will now be created without extra metadata_id and data_id in metadata. For backward compatibility, code still reads them in case older version of metadata have them - so it still should be able to activate such volumes. Bonus is lowered size of lv structure used to store info about LV (noticeable with big volume groups).	2019-10-17 13:03:49 +02:00
David Teigland	81fe045714	cache: change default cachevol metadata sizes The first part of a cachevol LV is used for metadata, and the rest of the space is used for data. The division of space between metadata and data depends on the total size of the cachevol. The previous division gave more space than needed to metadata, it was: cachevol size 8M to 128M -> metadata size 16M * cachevol size 128M to 1G -> metadata size 32M cachevol size 1G and up -> metadata size 64M (* if this resulted in over half the LV used as metadata, then half the cachevol would be used for metadata, and the other half for data.) The division of space now gives less space to metadata, it is: cachevol size 8M to 16M -> metadata size 4M cachevol size 16M to 4G -> metadata size 8M cachevol size 4G to 16G -> metadata size 16M cachevol size 16G to 32G -> metadata size 32M cachevol size 32G and up -> metadata size 64M	2019-10-15 14:36:03 -05:00
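The new size tiers listed in commit 81fe045714 can be expressed as a small lookup. The sketch below is illustrative only and is not the lvm2 implementation: the function name is hypothetical, "M" and "G" are assumed to be binary units, lower bounds are assumed inclusive, and sizes below 8M (not covered by the commit message) are rejected by assumption.

```python
MiB = 1024 * 1024
GiB = 1024 * MiB

# (inclusive lower bound of cachevol size, metadata size), largest tier first.
_TIERS = [
    (32 * GiB, 64 * MiB),
    (16 * GiB, 32 * MiB),
    (4 * GiB, 16 * MiB),
    (16 * MiB, 8 * MiB),
    (8 * MiB, 4 * MiB),
]


def cachevol_metadata_size(cachevol_size: int) -> int:
    """Return the metadata size in bytes for a cachevol of the given size."""
    for lower_bound, metadata_size in _TIERS:
        if cachevol_size >= lower_bound:
            return metadata_size
    # Assumption: the commit message lists no tier below 8M.
    raise ValueError("cachevol smaller than 8M is not covered by the listed tiers")
```

For example, a 1G cachevol maps to 8M of metadata here, where the previous division described in the commit would have used 64M.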
David Teigland	0443d00ff1	allow activating known LVs when other LVs have unknown segtypes When a VG contains some LVs with unknown segtypes, the user should still be allowed to activate other LVs in the VG that are understood. $ lvs foo WARNING: Unrecognised flag CACHE_USES_CACHEVOL in segment type cache+CACHE_USES_CACHEVOL. WARNING: Unrecognised segment type cache+CACHE_USES_CACHEVOL LV VG Attr LSize lvol0 foo -wi------- 4.00m other foo vwi---u--- 48.00m $ lvcreate -l1 foo WARNING: Unrecognised flag CACHE_USES_CACHEVOL in segment type cache+CACHE_USES_CACHEVOL. WARNING: Unrecognised segment type cache+CACHE_USES_CACHEVOL Cannot change VG foo with unknown segments in it! Cannot process volume group foo $ lvchange -ay foo/lvol0 WARNING: Unrecognised flag CACHE_USES_CACHEVOL in segment type cache+CACHE_USES_CACHEVOL. WARNING: Unrecognised segment type cache+CACHE_USES_CACHEVOL $ lvchange -ay foo/other WARNING: Unrecognised flag CACHE_USES_CACHEVOL in segment type cache+CACHE_USES_CACHEVOL. WARNING: Unrecognised segment type cache+CACHE_USES_CACHEVOL Refusing activation of LV foo/other containing an unrecognised segment. $ lvs foo WARNING: Unrecognised flag CACHE_USES_CACHEVOL in segment type cache+CACHE_USES_CACHEVOL. WARNING: Unrecognised segment type cache+CACHE_USES_CACHEVOL LV VG Attr LSize lvol0 foo -wi-a----- 4.00m other foo vwi---u--- 48.00m	2019-10-15 14:34:53 -05:00
David Teigland	91ee025d5b	cache: change cachevol flags for backward compat A cachevol LV had the CACHE_VOL status flag in metadata, and the cache LV using it had no new flag. This caused problems if the new metadata was used by an old version of lvm. An old version of lvm would have two problems processing the new metadata: . The old lvm would return an error when reading the VG metadata when it saw the unknown CACHE_VOL status flag. . The old lvm would return an error when reading the VG metadata because it would not find an expected cache pool attached to the cache LV (since the cache LV had a cachevol attached instead.) Change the use of flags: . Change the CACHE_VOL flag to be a COMPATIBLE flag (instead of a STATUS flag) so that old versions will not fail when they see it. . When a cache LV is using a cachevol, the cache LV gets a new SEGTYPE flag CACHE_USES_CACHEVOL. This flag is appended to the segtype name, so that old lvm versions will fail to use the LV because of an unknown segtype, as opposed to failing to read the VG.	2019-10-15 09:05:52 -05:00
Zdenek Kabelac	1cd308d640	cachevol: drop no longer needed functions Code is no longer used/needed.	2019-10-14 15:20:25 +02:00
Zdenek Kabelac	201ffbd04a	cachevol: use lv_cache_remove Use same routine for dropping cache.	2019-10-14 15:20:25 +02:00
Zdenek Kabelac	77deadd3af	cachevol: drop LV_CACHE_VOL on detach automatically Move dropping of cachevol flag into detach function. TODO: this flag should be internal to lvm2.	2019-10-14 15:15:14 +02:00
Zdenek Kabelac	615e18f5b2	cache: enhance removal function to work with cvol To keep things simple, use same code for all cache removal functions, not just for cachepools but also cachevols.	2019-10-14 15:14:25 +02:00
Zdenek Kabelac	6ee83f699b	cache: correct condition	2019-10-14 15:14:25 +02:00
Zdenek Kabelac	bc35ccd174	cache: recognize cachevol with lv_cache_remove	2019-10-14 15:14:25 +02:00
Zdenek Kabelac	36944e1009	cache: reload only when switched to cleaner policy Reload the cache target only when lvm2 reloads the table with the cache switched to the cleaner policy.	2019-10-14 15:14:22 +02:00
David Teigland	bd21736e8b	vgck: let updatemetadata repair mismatched metadata Let vgck --updatemetadata repair cases where different mdas hold independently valid but unmatching copies of the metadata, i.e. different text metadata checksums or text metadata sizes.	2019-10-11 12:57:39 -05:00
David Teigland	fe16d296b0	pvmove: remove some cmirror related code which is no longer used	2019-10-11 11:31:42 -05:00
Zdenek Kabelac	cf8aee096f	vdo: introduce get_vdo_write_policy_name	2019-10-04 17:31:55 +02:00
Zdenek Kabelac	c756f76802	vdo: correct internal API for set_vdo_write_policy This is 'setting' function.	2019-10-04 17:31:55 +02:00
Zdenek Kabelac	9d8a028e8c	vdo: keep minimum_io_size in sectors	2019-10-04 17:31:55 +02:00
Zdenek Kabelac	6a9a4b4534	resize: continue change for getting vdo status before resize Continue commit `a98b77c164`. There needs to be error reported when status can't be obtained.	2019-10-04 17:31:55 +02:00
David Teigland	a68258339d	lvmlockd: set failure flag for test mode Set a failure flag when vg_read returns an error for test mode. The caller can segfault if there's an error with no flag set.	2019-10-04 10:09:49 -05:00
David Teigland	3a8e41a67b	metadata: import device name hint from metadata Start by using it in a comment for a missing PV.	2019-09-30 11:38:10 -05:00
Zdenek Kabelac	a98b77c164	vdo: properly check percentage for resize Avoid checking 'lv_is_active()' since special LV types do this validation anyway when calling the _percent() function, and call it ONLY when none of the special types is queried. This restores support for VDO resize (as with support for separate VDO pool activation, plain query for lv_is_active() is not working in this case).	2019-09-30 13:34:34 +02:00
David Teigland	26596ce7fa	writecache: allow removing LV with attached writecache	2019-09-24 15:51:05 -05:00
David Teigland	76dd9b2b51	writecache: move code into new file put writecache specific code in writecache_manip.c should be no functional change	2019-09-24 15:51:05 -05:00
David Teigland	56aadd7fe2	lvremove: remove attached cachevol with removed LV When an LV is removed that has an attached cachevol, also remove the cachevol LV.	2019-09-24 15:51:05 -05:00
David Teigland	27c3c1d7c8	writecache: display layout and role fields	2019-09-20 14:55:11 -05:00
David Teigland	6f7d7089b4	writecache: use dm suffixes and lv attributes - use internal CACHE_VOL flag on cachevol LV - add suffixes to dm uuids for internal LVs - display appropriate letters in the LV attr field - display writecache's cachevol in lvs output	2019-09-20 14:08:51 -05:00
David Teigland	5d3bced5ea	lvconvert: detaching cachevol with missing PVs . For dm-cache in writethrough, always allow splitcache, whether the cache is missing PVs or not. . For dm-cache in writeback, if the cache is missing PVs, allow splitcache with force and yes. . For dm-writecache, if the cache is missing PVs, allow splitcache with force and yes.	2019-09-20 09:59:37 -05:00
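The three rules in commit 5d3bced5ea amount to a small decision function. The following is a minimal sketch under stated assumptions, not lvm2 code: the function and parameter names are hypothetical, and the case of a writeback or writecache cache with no missing PVs is assumed to be allowed without extra options, which the commit message implies but does not state.

```python
def splitcache_allowed(
    target: str,          # "dm-cache" or "dm-writecache"
    cache_mode: str,      # "writethrough" or "writeback" (dm-cache only)
    missing_pvs: bool,
    force: bool,
    yes: bool,
) -> bool:
    """Decide whether detaching the cachevol is permitted."""
    if target == "dm-cache" and cache_mode == "writethrough":
        # Always allowed, whether the cache is missing PVs or not.
        return True
    if not missing_pvs:
        # Assumption: nothing special is required when no PVs are missing.
        return True
    # dm-cache in writeback, or dm-writecache, with missing PVs:
    # both force and yes are required.
    return force and yes
```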
David Teigland	d2c065789c	lvconvert: cachevol LV can have multiple segments	2019-09-20 09:59:37 -05:00
Zdenek Kabelac	6612d8dd5e	vdo: enhance activation with layer -vpool Enhance 'activation' experience for VDO pool to more closely match what happens for thin-pools where we do use a 'fake' LV to keep pool running even when no thinLVs are active. This gives user a choice whether he wants to keep thin-pool running (without possibly lengthy activation/deactivation process). As we do plan to support multiple VDO LVs to be mapped into a single VDO, we want to give user same experience and 'use-pattern' as with thin-pools. This patch gives option to activate VDO pool only without activating VDO LV. Also due to 'fake' layering LV we can protect usage of VDO pool from command like 'mkfs' which do require exclusive access to the volume, which is no longer possible. Note: VDO pool contains 1024 initial sectors as 'empty' header - such header is also exposed in layered LV (as read-only LV). For blkid we are identified as LV with UUID suffix - thus private DM device of lvm2 - so we do not need to store any extra info in this header space (aka zero is good enough).	2019-09-17 13:17:19 +02:00
David Teigland	25b58310e3	pvscan: avoid full scan for activation When an online PV completed a VG, the standard activation functions were used to activate the VG. These functions use a full scan of all devs. When many pvscans are run during startup and need to activate many VGs, scanning all devs from all the pvscans can take a long time. Optimize VG activation in pvscan to scan only the devs in the VG being activated. This makes use of the online file info that was used to determine the VG was complete. The downside of this approach is that pvscan activation will not detect duplicate PVs and block activation, where a normal activation command (which scans all devices) would.	2019-09-03 10:11:16 -05:00
David Teigland	98d420200e	vgextend: check missing device during block size check Checking the block size when a device is missing could trigger a segfault.	2019-09-03 10:07:56 -05:00
David Teigland	7cfbf3a394	fix segfault for invalid characters in vg name Fixes a regression from commit `ba7ff96faf` "improve reading and repairing vg metadata" where the error path for a vg name with invalid characters was missing an error flag, which led to the caller not recognizing an error occurred. Previously, an error flag was hidden in the old _vg_make_handle function.	2019-08-29 11:35:46 -05:00
Zdenek Kabelac	4b1dcc2eeb	lv_manip: add synchronizations New udev in rawhide seems to be 'dropping' udev rule operations for devices that are no longer existing - while this is 'probably' a bug - it's revealing moments in lvm2 that likely should not run in a single transaction and we should wait for a cookie before submitting more work. TODO: it seems more 'error' paths should always include synchronization before starting deactivating 'just activated' devices. We should probably figure out some 'automatic' solution for this instead of placing sync_local_dev_name() all over the place...	2019-08-26 15:32:19 +02:00
Zdenek Kabelac	c98e34e4d0	cache: improve vgremove loop Support internal removal of 'cache origin' volume - which we do not normally expose to a user - however internal processing loops may hit this condition (depending on order of list LVs). So when this operation is internally requested - we automatically try to remove its 'holding' LV (cache LV) - which will also remove the origin.	2019-08-26 15:32:12 +02:00
Zdenek Kabelac	af0b84ccc8	snapshot: always activate Drop the 'cluster-only' optimization so we do resume ALL devices before we try to wait on cookie before 'removal' operation. It's more correct order of operation - although possibly slightly less efficient - but until we have correct list of operations 'in-progress' we can't do anything better.	2019-08-26 15:23:44 +02:00
David Teigland	677833ce6f	lvmcache: renaming functions and variables related to duplicates, no functional changes.	2019-08-16 13:26:11 -05:00
David Teigland	65bcd16be2	md component detection addition in vg_read Usually md components are eliminated in label scan and/or duplicate resolution, but they could sometimes get into the vg_read stage, where set_pv_devices compares the device to the PV. If set_pv_devices runs an md component check and finds one, vg_read should eliminate the components. In set_pv_devices, run an md component check always if the PV is smaller than the device (this is not very common.) If the PV is larger than the device, (more common), do the component check when the config setting is "auto" (the default).	2019-08-16 13:24:34 -05:00
David Teigland	09bc2d0fd1	devices: clean up block size functions Replace calls to the old dev_get_block_size function with calls to the new dev_get_direct_block_size function, and remove the old function.	2019-08-07 11:48:10 -05:00
David Teigland	0404539edb	vgcreate/vgextend: restrict PVs with mixed block sizes Avoid having PVs with different logical block sizes in the same VG. This prevents LVs from having mixed block sizes, which can produce file system errors. The new config setting devices/allow_mixed_block_sizes (default 0) can be changed to 1 to return to the unrestricted mode.	2019-08-01 10:06:47 -05:00
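The restriction introduced by commit 0404539edb can be modelled as a simple set check. This sketch is an illustration rather than lvm2 code: the function name and arguments are hypothetical, and `allow_mixed` stands in for the devices/allow_mixed_block_sizes setting named in the commit message.

```python
from typing import Iterable


def block_sizes_acceptable(
    existing_pv_block_sizes: Iterable[int],
    new_pv_block_sizes: Iterable[int],
    allow_mixed: bool = False,
) -> bool:
    """Return True if the PVs may share one VG.

    For vgcreate the existing list is empty; for vgextend it holds the
    logical block sizes of the PVs already in the VG.
    """
    sizes = set(existing_pv_block_sizes) | set(new_pv_block_sizes)
    # With the default setting (0), all PVs must share one logical block size.
    return allow_mixed or len(sizes) <= 1
```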
David Teigland	f17353e3e6	md component detection for differing PV and device sizes This check was mistakenly removed when shifting code in commit "separate code for setting devices from metadata parsing". Put it back with some new conditions.	2019-07-09 13:40:41 -05:00
David Teigland	b4402bd821	exported vg handling The exported VG checking/enforcement was scattered and inconsistent. This centralizes it and makes it consistent, following the existing approach for foreign and shared VGs/PVs, which are very similar to exported VGs/PVs. The access policy that now applies to foreign/shared/exported VGs/PVs, is that if a foreign/shared/exported VG/PV is named on the command line (i.e. explicitly requested by the user), and the command is not permitted to operate on it because it is foreign/shared/exported, then an access error is reported and the command exits with an error. But, if the command is processing all VGs/PVs, and happens to come across a foreign/shared/exported VG/PV (that is not explicitly named on the command line), then the command silently skips it and does not produce an error. A command using tags or --select handles inaccessible VGs/PVs the same way as a command processing all VGs/PVs, and will not report/return errors if these inaccessible VGs/PVs exist. The new policy fixes the exit codes on a somewhat random set of commands that previously exited with an error if they were looking at all VGs/PVs and an exported VG existed on the system. There should be no change to which commands are allowed/disallowed on exported VGs/PVs. Certain LV commands (lvs/lvdisplay/lvscan) would previously not display LVs from an exported VG (for unknown reasons). This has not changed. The lvm fullreport command would previously report info about an exported VG but not about the LVs in it. This has changed to include all info from the exported VG.	2019-06-25 15:39:08 -05:00
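The access policy set out in commit b4402bd821 reduces to one decision per VG/PV. A minimal sketch, assuming hypothetical names throughout (this is not the lvm2 API): `restricted` means the VG/PV is foreign, shared, or exported, and selection by tags or --select counts as not explicitly named.

```python
from enum import Enum


class Outcome(Enum):
    PROCESS = "process"
    SKIP_SILENTLY = "skip"
    ACCESS_ERROR = "error"


def access_outcome(
    restricted: bool,
    command_permitted: bool,
    named_on_command_line: bool,
) -> Outcome:
    """Apply the foreign/shared/exported access policy to one VG/PV."""
    if not restricted or command_permitted:
        return Outcome.PROCESS
    if named_on_command_line:
        # The user asked for it explicitly: report an access error.
        return Outcome.ACCESS_ERROR
    # Found while processing all VGs/PVs, tags, or --select: skip, no error.
    return Outcome.SKIP_SILENTLY
```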
David Teigland	d16142f90f	scanning: open devs rw when rescanning for write When vg_read rescans devices with the intention of writing the VG, the label rescan can open the devs RW so they do not need to be closed and reopened RW in dev_write_bytes.	2019-06-21 10:57:49 -05:00
David Teigland	8fecd9c14e	metadata: include description with command in metadata areas Previously the VG metadata description field (which contains the command line) was only included in backup/archive copies of the metadata. Now also include it in the metadata written to the metadata areas.	2019-06-20 16:09:05 -05:00
David Teigland	4bb7d3da0e	lvmcache: remove wrapper around lvmcache_get_vgnameids This was left over from when there was an lvmetad version of the function.	2019-06-11 14:10:14 -05:00
David Teigland	550536474f	vgsplit: simplify vg creation The way that this command now uses the global lock followed by a label scan, it can simply check if the new VG name exists, and if not lock it and create it.	2019-06-10 10:38:32 -05:00
David Teigland	a3a676e0e7	metadata.c: removed unused code if 0 was placed around old vg_read code by the previous commit.	2019-06-07 15:54:04 -05:00
David Teigland	ba7ff96faf	improve reading and repairing vg metadata The fact that vg repair is implemented as a part of vg read has led to a messy and complicated implementation of vg_read, and limited and uncontrolled repair capability. This splits read and repair apart. Summary ------- - take all kinds of various repairs out of vg_read - vg_read no longer writes anything - vg_read now simply reads and returns vg metadata - vg_read ignores bad or old copies of metadata - vg_read proceeds with a single good copy of metadata - improve error checks and handling when reading - keep track of bad (corrupt) copies of metadata in lvmcache - keep track of old (seqno) copies of metadata in lvmcache - keep track of outdated PVs in lvmcache - vg_write will do basic repairs - new command vgck --updatemetadata will do all repairs Details ------- - In scan, do not delete dev from lvmcache if reading/processing fails; the dev is still present, and removing it makes it look like the dev is not there. Records are now kept about the problems with each PV so they can be fixed/repaired in the appropriate places. - In scan, record a bad mda on failure, and delete the mda from mda in use list so it will not be used by vg_read or vg_write, only by repair. - In scan, succeed if any good mda on a device is found, instead of failing if any is bad. The bad/old copies of metadata should not interfere with normal usage while good copies can be used. - In scan, add a record of old mdas in lvmcache for later, do not repair them while reading, and do not let them prevent us from finding and using a good copy of metadata from elsewhere. One result is that "inconsistent metadata" is no longer a read error, but instead a record in lvmcache that can be addressed separate from the read. - Treat a dev with no good mdas like a dev with no mdas, which is an existing case we already handle. 
- Don't use a fake vg "handle" for returning an error from vg_read, or the vg_read_error function for getting that error number; just return null if the vg cannot be read or used, and an error_flags arg with flags set for the specific kind of error (which can be used later for determining the kind of repair.) - Saving an original copy of the vg metadata, for purposes of reverting a write, is now done explicitly in vg_read instead of being hidden in the vg_make_handle function. - When a vg is not accessible due to "access restrictions" but is otherwise fine, return the vg through the new error_vg arg so that process_each_pv can skip the PVs in the VG while processing. (This is a temporary accommodation for the way process_each_pv tracks which devs have been looked at, and can be dropped later when process_each_pv implementation dev tracking is changed.) - vg_read does not try to fix or recover a vg, but now just reads the metadata, checks access restrictions and returns it. (Checking access restrictions might be better done outside of vg_read, but this is a later improvement.) - _vg_read now simply makes one attempt to read metadata from each mda, and uses the most recent copy to return to the caller in the form of a 'vg' struct. (bad mdas were excluded during the scan and are not retried) (old mdas were not excluded during scan and are retried here) - vg_read uses _vg_read to get the latest copy of metadata from mdas, and then makes various checks against it to produce warnings, and to check if VG access is allowed (access restrictions include: writable, foreign, shared, clustered, missing pvs). - Things that were previously silently/automatically written by vg_read that are now done by vg_write, based on the records made in lvmcache during the scan and read: . clearing the missing flag . updating old copies of metadata . clearing outdated pvs . updating pv header flags - Bad/corrupt metadata are now repaired; they were not before. 
Test changes ------------ - A read command no longer writes the VG to repair it, so add a write command to do a repair. (inconsistent-metadata, unlost-pv) - When a missing PV is removed from a VG, and then the device is enabled again, vgck --updatemetadata is needed to clear the outdated PV before it can be used again, where it wasn't before. (lvconvert-repair-policy, lvconvert-repair-raid, lvconvert-repair, mirror-vgreduce-removemissing, pv-ext-flags, unlost-pv) Reading bad/old metadata ------------------------ - "bad metadata": the mda_header or metadata text has invalid fields or can't be parsed by lvm. This is a form of corruption that would not be caused by known failure scenarios. A checksum error is typically included among the errors reported. - "old metadata": a valid copy of the metadata that has a smaller seqno than other copies of the metadata. This can happen if the device failed, or io failed, or lvm failed while committing new metadata to all the metadata areas. Old metadata on a PV that has been removed from the VG is the "outdated" case below. When a VG has some PVs with bad/old metadata, lvm can simply ignore the bad/old copies, and use a good copy. This is why there are multiple copies of the metadata -- so it's available even when some of the copies cannot be used. The bad/old copies do not have to be repaired before the VG can be used (the repair can happen later.) A PV with no good copies of the metadata simply falls back to being treated like a PV with no mdas; a common and harmless configuration. When bad/old metadata exists, lvm warns the user about it, and suggests repairing it using a new metadata repair command. Bad metadata in particular is something that users will want to investigate and repair themselves, since it should not happen and may indicate some other problem that needs to be fixed. PVs with bad/old metadata are not the same as missing devices. 
Missing devices will block various kinds of VG modification or activation, but bad/old metadata will not. Previously, lvm would attempt to repair bad/old metadata whenever it was read. This was unnecessary since lvm does not require every copy of the metadata to be used. It would also hide potential problems that should be investigated by the user. It was also dangerous in cases where the VG was on shared storage. The user is now allowed to investigate potential problems and decide how and when to repair them. Repairing bad/old metadata -------------------------- When label scan sees bad metadata in an mda, that mda is removed from the lvmcache info->mdas list. This means that vg_read will skip it, and not attempt to read/process it again. If it was the only in-use mda on a PV, that PV is treated like a PV with no mdas. It also means that vg_write will also skip the bad mda, and not attempt to write new metadata to it. The only way to repair bad metadata is with the metadata repair command. When label scan sees old metadata in an mda, that mda is kept in the lvmcache info->mdas list. This means that vg_read will read/process it again, and likely see the same mismatch with the other copies of the metadata. Like the label_scan, the vg_read will simply ignore the old copy of the metadata and use the latest copy. If the command is modifying the vg (e.g. lvcreate), then vg_write, which writes new metadata to every mda on info->mdas, will write the new metadata to the mda that had the old version. If successful, this will resolve the old metadata problem (without needing to run a metadata repair command.) Outdated PVs ------------ An outdated PV is a PV that has an old copy of VG metadata that shows it is a member of the VG, but the latest copy of the VG metadata does not include this PV. This happens if the PV is disconnected, vgreduce --removemissing is run to remove the PV from the VG, then the PV is reconnected. 
In this case, the outdated PV needs to have its outdated metadata removed and the PV used flag needs to be cleared. This repair will be done by the subsequent repair command. It is also done if vgremove is run on the VG. MISSING PVs ----------- When a device is missing, most commands will refuse to modify the VG. This is the simple case. More complicated is when a command is allowed to modify the VG while it is missing a device. When a VG is written while a device is missing for one of its PVs, the VG metadata is written to disk with the MISSING flag on the PV with the missing device. When the VG is next used, it is treated as if the PV with the MISSING flag still has a missing device, even if that device has reappeared. If all LVs that were using a PV with the MISSING flag are removed or repaired so that the MISSING PV is no longer used, then the next time the VG metadata is written, the MISSING flag will be dropped. Alternative methods of clearing the MISSING flag are: vgreduce --removemissing will remove PVs with missing devices, or PVs with the MISSING flag where the device has reappeared. vgextend --restoremissing will clear the MISSING flag on PVs where the device has reappeared, allowing the VG to be used normally. This must be done with caution since the reappeared device may have old data that is inconsistent with data on other PVs. Bad mda repair -------------- The new command: vgck --updatemetadata VG first uses vg_write to repair old metadata, and other basic issues mentioned above (old metadata, outdated PVs, pv_header flags, MISSING_PV flags). It will also go further and repair bad metadata: . text metadata that has a bad checksum . text metadata that is not parsable . corrupt mda_header checksum and version fields (To keep a clean diff, #if 0 is added around functions that are replaced by new code. These commented functions are removed by the following commit.)	2019-06-07 15:54:04 -05:00
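The bad/old distinction described in this commit message can be illustrated with a minimal sketch. It is not lvm code: the `MdaCopy` type, its fields, and the `classify` function are hypothetical names made up for illustration, and a boolean stands in for "the metadata could be parsed and checksummed".

```python
from dataclasses import dataclass


@dataclass
class MdaCopy:
    device: str     # hypothetical device path holding this metadata area
    seqno: int      # VG metadata sequence number found in this copy
    parsable: bool  # False models "bad" metadata (unparsable or bad checksum)


def classify(copies):
    """Split metadata copies into (good, old, bad) lists.

    "bad" copies cannot be parsed; "old" copies are valid but carry a
    smaller seqno than the newest valid copy; "good" copies carry the
    newest seqno and are the ones a reader would use.
    """
    bad = [c for c in copies if not c.parsable]
    valid = [c for c in copies if c.parsable]
    if not valid:
        return [], [], bad
    latest = max(c.seqno for c in valid)
    good = [c for c in valid if c.seqno == latest]
    old = [c for c in valid if c.seqno < latest]
    return good, old, bad
```

In this model a reader uses any entry of `good` and only warns about `old` and `bad`; nothing is rewritten until a write command or the repair command runs.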
David Teigland	015b906069	add a warning message when updating old metadata in an mda that had previously not been updated	2019-06-07 15:54:04 -05:00
David Teigland	47effdc025	vgck --updatemetadata is a new command uses vg_write to correct more common or less severe issues, and also adds the ability to repair some metadata corruption that couldn't be handled previously.	2019-06-07 15:54:04 -05:00
David Teigland	de3d3b11f4	move pv header repairs to vg_write Correct PV header in-use or version fields from vg_write instead of vg_read.	2019-06-07 15:54:04 -05:00
David Teigland	ab61a6d85d	move wipe_outdated_pvs to vg_write and implement it based on a device, not based on a pv struct (which is not available when the device is not a part of the vg.) currently only the vgremove command wipes outdated pvs until more advanced recovery is added in a subsequent commit	2019-06-07 15:54:04 -05:00
David Teigland	45b164f62c	create separate lvmcache update functions for read and write The vg read and vg write cases need to update lvmcache differently, so create separate functions for them. The read case now handles checking for outdated mdas and moves them aside into a new list to be repaired in a subsequent commit.	2019-06-07 15:54:04 -05:00
David Teigland	027e0e92e6	fix vg_commit return value The existing comment was describing the correct behavior, but the code didn't match. The commit is successful if one mda was committed. Making it depend on the result of the internal lvmcache update was wrong.	2019-06-07 15:54:04 -05:00
David Teigland	650524b955	ability to keep track of bad mdas in lvmcache mda's that cannot be processed by lvm because of some corruption can be kept on a separate list. These will be used for more advanced repair in a subsequent commit.	2019-06-07 15:54:04 -05:00
David Teigland	aeafdc1f45	add flags to keep track of bad metadata When reading metadata headers and text, use a new set of flags to identify specific errors that are seen. These will be used for more advanced repair in a subsequent commit.	2019-06-07 15:54:04 -05:00
David Teigland	2b241eb1f6	pvck: use new dump routines for old output Use the recently added dump routines to produce the old/traditional pvck output, and remove the code that had been used for that. The validation/checking done by the new routines means that new lines prefixed with CHECK are printed for incorrect values.	2019-06-05 16:28:52 -05:00
Zdenek Kabelac	e3c4ab0cc7	cache: support no_discard_passdown Recent kernel versions, starting from kernel commit: de7180ff908b2bc0342e832dbdaa9a5f1ecaa33a, report a new flag in the cache status line: no_discard_passdown Whenever lvm spots an unknown status it reports: Unknown feature in status: So add recognition of this feature flag and also report it with 'lvs -o+kernel_discards' When no_discard_passdown is found in status 'nopassdown' gets reported for this field (roughly matching what we report for thin-pools).	2019-06-05 15:48:41 +02:00
David Teigland	645dd27604	separate code for setting devices from metadata parsing Pull the code that sets devs for PVs out of the metadata parsing code and call it separately.	2019-05-23 11:57:38 -05:00
Zdenek Kabelac	d60d59a5f3	cleanup: use unsigned type	2019-05-03 13:17:22 +02:00
David Teigland	8c87dda195	locking: unify global lock for flock and lockd There have been two file locks used to protect lvm "global state": "ORPHANS" and "GLOBAL". Commands that used the ORPHAN flock in exclusive mode: pvcreate, pvremove, vgcreate, vgextend, vgremove, vgcfgrestore Commands that used the ORPHAN flock in shared mode: vgimportclone, pvs, pvscan, pvresize, pvmove, pvdisplay, pvchange, fullreport Commands that used the GLOBAL flock in exclusive mode: pvchange, pvscan, vgimportclone, vgscan Commands that used the GLOBAL flock in shared mode: pvscan --cache, pvs The ORPHAN lock covers the important cases of serializing the use of orphan PVs. It also partially covers the reporting of orphan PVs (although not correctly as explained below.) The GLOBAL lock doesn't seem to have a clear purpose (it may have eroded over time.) Neither lock correctly protects the VG namespace, or orphan PV properties. To simplify and correct these issues, the two separate flocks are combined into the one GLOBAL flock, and this flock is used from the locking sites that are in place for the lvmlockd global lock. The logic behind the lvmlockd (distributed) global lock is that any command that changes "global state" needs to take the global lock in ex mode. Global state in lvm is: the list of VG names, the set of orphan PVs, and any properties of orphan PVs. Reading this global state can use the global lock in sh mode to ensure it doesn't change while being reported. The locking of global state now looks like: lockd_global() previously named lockd_gl(), acquires the distributed global lock through lvmlockd. This is unchanged. It serializes distributed lvm commands that are changing global state. This is a no-op when lvmlockd is not in use. lockf_global() acquires an flock on a local file. It serializes local lvm commands that are changing global state. 
lock_global() first calls lockf_global() to acquire the local flock for global state, and if this succeeds, it calls lockd_global() to acquire the distributed lock for global state. Replace instances of lockd_gl() with lock_global(), so that the existing sites for lvmlockd global state locking are now also used for local file locking of global state. Remove the previous file locking calls lock_vol(GLOBAL) and lock_vol(ORPHAN). The following commands which change global state are now serialized with the exclusive global flock: pvchange (of orphan), pvresize (of orphan), pvcreate, pvremove, vgcreate, vgextend, vgremove, vgreduce, vgrename, vgcfgrestore, vgimportclone, vgmerge, vgsplit Commands that use a shared flock to read global state (and will be serialized against the prior list) are those that use process_each functions that are based on processing a list of all VG names, or all PVs. The list of all VGs or all PVs is global state and the shared lock prevents those lists from changing while the command is processing them. The ORPHAN lock previously attempted to produce an accurate listing of orphan PVs, but it was only acquired at the end of the command during the fake vg_read of the fake orphan vg. This is not when orphan PVs were determined; they were determined by elimination beforehand by processing all real VGs, and subtracting the PVs in the real VGs from the list of all PVs that had been identified during the initial scan. This is fixed by holding the single global lock in shared mode while processing all VGs to determine the list of orphan PVs.	2019-04-29 13:01:05 -05:00
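The lock ordering described in this commit message (local flock first, distributed lock second) can be sketched as follows. This is an illustrative model only, not the lvm implementation: the Python function bodies, the lock file path argument, and the `unlock_global` helper are assumptions, and `lockd_global` is a stub that behaves like the "lvmlockd not in use" no-op case.

```python
import fcntl
import os
import tempfile


def lockd_global(mode):
    """Stand-in for the distributed lock; a no-op, as when lvmlockd is not in use."""
    return True


def lockf_global(lock_path, mode):
    """Take a local flock: "ex" for commands changing global state, "sh" for readers."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX if mode == "ex" else fcntl.LOCK_SH)
    except OSError:
        os.close(fd)
        return None
    return fd


def unlock_global(fd):
    """Release the local flock (hypothetical helper for this sketch)."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)


def lock_global(lock_path, mode):
    """Acquire the local flock first, then the distributed lock."""
    fd = lockf_global(lock_path, mode)
    if fd is None:
        return None
    if not lockd_global(mode):
        # Distributed lock failed: drop the local lock again.
        unlock_global(fd)
        return None
    return fd
```

A command that changes global state would call `lock_global(path, "ex")`, while a reporting command that walks all VG names or all PVs would call it with `"sh"`, so several readers can run together but never alongside a writer.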
David Teigland	ccd1386070	wipe_lv: initially open LV in writable mode wipe_lv knows it's going to write the device, so it can open rw from the start. It was opening readonly, and then dev_write needed to reopen it readwrite.	2019-04-26 14:49:27 -05:00
David Teigland	c33770c02d	lvmlockd: do not allow mirror LV to be activated shared This reverts `518a8e8cfb` "lvmlockd: activate mirror LVs in shared mode with cmirrord" because while activating a mirror LV with cmirrord worked, changes to the active cmirror did not work.	2019-04-04 13:21:38 -05:00
Zdenek Kabelac	fcec6691f0	thin: fix maintenance of _pmspare When metadata grows lvm2 may need to extend also _pmspare volume.	2019-04-03 13:28:54 +02:00
Zdenek Kabelac	e27d027155	thin: resize metadata with data When data are growing, adapt also size of metadata. We get way too many reports from users doing huge growths of the data portion while keeping metadata small and avoiding using monitoring. So to enhance the user-experience in case the user requests growth of a thin-pool (without passing a PV list for growth) - lvm2 will automatically grow also the metadata part of the thin-pool (if possible).	2019-04-03 13:28:22 +02:00
Zdenek Kabelac	7c3de2fd93	thin: introduce estimate_thin_pool_metadata_size Add function for estimation of thin-pool metadata size for given size of data. Function is using already existing internal API so it can be reused for resize of thin-pool data.	2019-04-03 13:27:17 +02:00
David Teigland	85e68a8333	lvextend: refresh shared LV remotely using dlm/corosync When lvextend extends an LV that is active with a shared lock, use this as a signal that other hosts may also have the LV active, with gfs2 mounted, and should have the LV refreshed to reflect the new size. Use the libdlmcontrol run api, which uses dlm_controld/corosync to run an lvchange --refresh command on other cluster nodes.	2019-03-21 12:38:20 -05:00
David Teigland	d369de8399	lvextend: allow on LV active with a shared lock Detect when a shared lock exists, don't require the normal exclusive lock, and allow the lvextend.	2019-03-21 12:38:20 -05:00
Zdenek Kabelac	677aa84be3	vdo: enable caching for vdopool LV and vdo LV Allow using caching with VDO. User can cache either a single vdopool or a vdo LV - where the caching is put in depends on the use-case and it's up to the user to decide which kind of speed is expected.	2019-03-20 14:38:31 +01:00
Zdenek Kabelac	0db22c5f81	lv_manip: insert remove layer skips pools Fixing renaming of subLVs when removing and inserting layers - this got visible when using stacked VDO pools.	2019-03-20 14:38:05 +01:00
Zdenek Kabelac	1cc690e911	thin: max thin	2019-03-20 14:37:44 +01:00
David Teigland	4e20ebd6a1	pvscan: ignore online for shared and foreign PVs Activation would not be allowed anyway, but we can check for these cases early and avoid wasted time in pvscan managing online files and attempting activation.	2019-03-05 15:19:05 -06:00
David Teigland	a9eaab6beb	Use "cachevol" to refer to cache on a single LV and "cachepool" to refer to a cache on a cache pool object. The problem was that the --cachepool option was being used to refer to both a cache pool object, and to a standard LV used for caching. This could be somewhat confusing, and it made it less clear when each kind would be used. By separating them, it's clear when a cachepool or a cachevol should be used. Previously: - lvm would use the cache pool approach when the user passed a cache-pool LV to the --cachepool option. - lvm would use the cache vol approach when the user passed a standard LV in the --cachepool option. Now: - lvm will always use the cache pool approach when the user uses the --cachepool option. - lvm will always use the cache vol approach when the user uses the --cachevol option.	2019-02-27 08:52:34 -06:00
Zdenek Kabelac	d19e372795	cleanup: indent	2019-01-28 22:39:10 +01:00
Zdenek Kabelac	78dd9d820d	thin: select chunk size as power of 2 Whenever thin-pool chunk size is unspecified and left for lvm calculation try to select the size as nearest highest power-of-2 instead of just being a multiple of 64KiB.	2019-01-28 22:17:25 +01:00
Zdenek Kabelac	58ad831c72	cache: select chunk size as power of 2 When cache chunk size is not configured, and left for lvm deduction, select the value which is power-of-2.	2019-01-28 22:17:14 +01:00
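The power-of-2 selection described in the two chunk-size commits above can be sketched as a simple round-up. This is an illustration, not the lvm2 code: the function name is hypothetical and the 64 KiB floor is an assumption taken from the "multiple of 64KiB" wording.

```python
def round_up_pow2_kib(size_kib, minimum_kib=64):
    """Return the smallest power-of-2 multiple of minimum_kib that is >= size_kib.

    minimum_kib is assumed to be a power of 2 itself (64 KiB here).
    """
    chunk = minimum_kib
    while chunk < size_kib:
        chunk *= 2
    return chunk
```

For example, a calculated size of 192 KiB (a multiple of 64 KiB but not a power of 2) would become 256 KiB under this rule.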
Zdenek Kabelac	105a8edea1	lv_manip: better work with PERCENT_VG modifier with lvresize Fixing recent commit `022ebb0cfe` Resize already has size that needs to be counted with, otherwise upsizing operation could turn into size reduction one.	2019-01-21 15:39:24 +01:00
Zdenek Kabelac	f3c52a515b	vdo: enable dmeventd resize	2019-01-21 12:53:16 +01:00
Zdenek Kabelac	a16d914d34	cleanup: better naming	2019-01-21 12:53:16 +01:00
Zdenek Kabelac	08cabe9b83	vdo: allow resize of VDO and VDO pool volumes Now with newer VDO kvdo target we can start to use standard mechanism to enable resize of VDO volumes. VDO pool can be grown. Virtual volume grows on top of VDO pool when is not big enough. Reduced VDOLV is calling discard for reduced areas - this can take long time! TODO: implement some pollable mechanism for out-of-lock TRIM.	2019-01-21 12:53:16 +01:00
Zdenek Kabelac	bd6709cec6	vdo: size reduction requires VDO to be active To be able to send discard to reduced areas - the VDO LV needs to be active.	2019-01-21 12:53:16 +01:00
Zdenek Kabelac	f1ad4b0679	vdo: discard reduced area Implement sending discard to reduced LV area.	2019-01-21 12:53:16 +01:00
Zdenek Kabelac	ca72d19691	vdo: estimate virtual size after resize	2019-01-21 12:53:16 +01:00
Zdenek Kabelac	ab031d673d	vdo: introduce function for estimation of virtual size	2019-01-21 12:53:16 +01:00
Zdenek Kabelac	022ebb0cfe	lv_manip: better work with PERCENT_VG modifier When using 'lvcreate -l100%VG' and there is a big disproportion between real available space and requested setting - automatically fallback to 100%FREE. Difference can be seen when VG is big and already most space was allocated, so the requested 100%VG can end (and by spec for % modifier it's correct) as LV with size of 1%VG. Usually this is not a big problem - but in some cases - like cache-pool allocation, this can result in a big difference for chunksize selection. With this patch it more closely matches common-sense logic without the need of reiteration of too big changes in lvm2 core ATM. TODO: in the future there should be allocator solving all allocations in a single call.	2019-01-21 12:53:15 +01:00
Zdenek Kabelac	26ead4bf45	cov: extent_size cannot be 0 Make this obvious to coverity.	2018-12-21 21:45:08 +01:00
Zdenek Kabelac	9dfb1a11b7	cov: drop unneeded header file MAX macro no longer needed in pe_align.	2018-12-21 21:45:08 +01:00
Zdenek Kabelac	3320ab8334	lib: move towards v2 version of VDO format Drop very old original format of VDO target and focus on V2 version. So some variables were renamed or replaced. There is no compatibility preserved (with assumption so far this is experimental feature and there is no real user). Note - version currently VDO calls this version 6.2.	2018-12-20 13:26:55 +01:00
Heinz Mauelshagen	e82303fd6a	lvcreate/lvconvert: optionally reenable mirrored mirror log for testing purposes only This is a followup patch to commit `edb72cb70c` to support related lvm2 test suite tests. A 'global/support_mirrored_mirror_log' bool configuration variable gets introduced allowing the creation of, or conversion to mirrored 'mirror' logs if set. The capability to create these in turn allows the rest of the tests to perform activation of such existing LVs and their conversions to disk/core 'mirror' logs. Display a disclaimer warning if enabled that this is not for regular use. Add definition of the enabled config option to respective test scripts. Related: rhbz1643562	2018-12-17 19:28:54 +01:00
Ming-Hung Tsai	859feb81e5	lvmanip: uninitialized members in struct pv_list (#10) Scenario: Given an existing LV `lvol0`, I want to create another LV on the PVs used by `lvol0`. I use `build_parallel_areas_from_lv()` to obtain the `pv_list` of each segment. However, the returned `pv_list` is not properly initialized, which causes a segfault in subsequent operations.	2018-12-14 15:23:18 +01:00
Heinz Mauelshagen	dd5716ddf2	raid: fix (de)activation of RaidLVs with visible SubLVs There's a small window during creation of a new RaidLV when rmeta SubLVs are made visible to wipe them in order to prevent erroneous discovery of stale RAID metadata. In case a crash prevents the SubLVs from being committed hidden after such wiping, the RaidLV can still be activated with the SubLVs visible. During deactivation though, a deadlock occurs because the visible SubLVs are deactivated before the RaidLV. The patch adds _check_raid_sublvs to the raid validation in merge.c, an activation check to activate.c (paranoid, because the merge.c check will prevent activation in case of visible SubLVs) and shares the existing wiping function _clear_lvs in raid_manip.c moved to lv_manip.c and renamed to activate_and_wipe_lvlist to remove code duplication. Whilst on it, introduce activate_and_wipe_lv to share with (lvconvert\|lvchange).c. Resolves: rhbz1633167	2018-12-11 16:35:34 +01:00
Heinz Mauelshagen	edb72cb70c	lvcreate/lvconvert: prohibit creation of/conversion to mirrored mirror logs In RHEL7 we marked mirrored mirror logs as deprecated and added a related message. This patch prohibits creating new 'mirror' LVs with that log type or converting existing LVs to have one. Existing LVs with mirrored mirror log can be activated and converted to disk/core logs. Avoid double deprecation message when running lvconvert. Resolves: rhbz1643562	2018-12-08 02:52:50 +01:00
David Teigland	904e1e3d26	Place the first PE at 1 MiB for all defaults . When using default settings, this commit should change nothing. The first PE continues to be placed at 1 MiB resulting in a metadata area size of 1020 KiB (for 4K page sizes; slightly smaller for larger page sizes.) . When default_data_alignment is disabled in lvm.conf, align pe_start at 1 MiB, based on a default metadata area size that adapts to the page size. Previously, disabling this option would result in mda_size that was too small for common use, and produced a 64 KiB aligned pe_start. . Customized pe_start and mda_size values continue to be set as before in lvm.conf and command line. . Remove the configure option for setting default_data_alignment at build time. . Improve alignment related option descriptions. . Add section about alignment to pvcreate man page. Previously, DEFAULT_PVMETADATASIZE was 255 sectors. However, the fact that the config setting named "default_data_alignment" has a default value of 1 (MiB) meant that DEFAULT_PVMETADATASIZE was having no effect. The metadata area size is the space between the start of the metadata area (page size offset from the start of the device) and the first PE (1 MiB by default due to default_data_alignment 1.) The result is a 1020 KiB metadata area on machines with 4KiB page size (1024 KiB - 4 KiB), and smaller on machines with larger page size. If default_data_alignment was set to 0 (disabled), then DEFAULT_PVMETADATASIZE 255 would take effect, and produce a metadata area that was 188 KiB and pe_start of 192 KiB. This was too small for common use. This is fixed by making the default metadata area size a computed value that matches the value produced by default_data_alignment.	2018-11-26 16:36:50 -06:00
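The metadata area sizes quoted in this commit message follow from one subtraction: the area runs from one page past the start of the device up to the first PE. A small worked check, with a hypothetical function name:

```python
def mda_size_kib(pe_start_kib, page_size_kib):
    """Metadata area size: space between the page-size offset and the first PE."""
    return pe_start_kib - page_size_kib


# Default: first PE at 1 MiB, 4 KiB pages.
default_size = mda_size_kib(1024, 4)

# Old behaviour with default_data_alignment disabled: pe_start of 192 KiB.
old_unaligned_size = mda_size_kib(192, 4)

# Larger pages give a slightly smaller area for the same 1 MiB pe_start.
large_page_size = mda_size_kib(1024, 64)
```

The first two values reproduce the 1020 KiB and 188 KiB figures given in the message; the 64 KiB page size in the third is only an example of a "larger page size".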
David Teigland	3ae5569570	Add dm-writecache support dm-writecache is used like dm-cache with a standard LV as the cache. $ lvcreate -n main -L 128M -an foo /dev/loop0 $ lvcreate -n fast -L 32M -an foo /dev/pmem0 $ lvconvert --type writecache --cachepool fast foo/main $ lvs -a foo -o+devices LV VG Attr LSize Origin Devices [fast] foo -wi------- 32.00m /dev/pmem0(0) main foo Cwi------- 128.00m [main_wcorig] main_wcorig(0) [main_wcorig] foo -wi------- 128.00m /dev/loop0(0) $ lvchange -ay foo/main $ dmsetup table foo-main_wcorig: 0 262144 linear 7:0 2048 foo-main: 0 262144 writecache p 253:4 253:3 4096 0 foo-fast: 0 65536 linear 259:0 2048 $ lvchange -an foo/main $ lvconvert --splitcache foo/main $ lvs -a foo -o+devices LV VG Attr LSize Devices fast foo -wi------- 32.00m /dev/pmem0(0) main foo -wi------- 128.00m /dev/loop0(0)	2018-11-06 14:18:41 -06:00
David Teigland	cac4a9743a	Allow dm-cache cache device to be standard LV If a single, standard LV is specified as the cache, use it directly instead of converting it into a cache-pool object with two separate LVs (for data and metadata). With a single LV as the cache, lvm will use blocks at the beginning for metadata, and the rest for data. Separate dm linear devices are set up to point at the metadata and data areas of the LV. These dm devs are given to the dm-cache target to use. The single LV cache cannot be resized without recreating it. If the --poolmetadata option is used to specify an LV for metadata, then a cache pool will be created (with separate LVs for data and metadata.) Usage: $ lvcreate -n main -L 128M vg /dev/loop0 $ lvcreate -n fast -L 64M vg /dev/loop1 $ lvs -a vg LV VG Attr LSize Type Devices main vg -wi-a----- 128.00m linear /dev/loop0(0) fast vg -wi-a----- 64.00m linear /dev/loop1(0) $ lvconvert --type cache --cachepool fast vg/main $ lvs -a vg LV VG Attr LSize Origin Pool Type Devices [fast] vg Cwi---C--- 64.00m linear /dev/loop1(0) main vg Cwi---C--- 128.00m [main_corig] [fast] cache main_corig(0) [main_corig] vg owi---C--- 128.00m linear /dev/loop0(0) $ lvchange -ay vg/main $ dmsetup ls vg-fast_cdata (253:4) vg-fast_cmeta (253:5) vg-main_corig (253:6) vg-main (253:24) vg-fast (253:3) $ dmsetup table vg-fast_cdata: 0 98304 linear 253:3 32768 vg-fast_cmeta: 0 32768 linear 253:3 0 vg-main_corig: 0 262144 linear 7:0 2048 vg-main: 0 262144 cache 253:5 253:4 253:6 128 2 metadata2 writethrough mq 0 vg-fast: 0 131072 linear 7:1 2048 $ lvchange -an vg/main $ lvconvert --splitcache vg/main $ lvs -a vg LV VG Attr LSize Type Devices fast vg -wi------- 64.00m linear /dev/loop1(0) main vg -wi------- 128.00m linear /dev/loop0(0)	2018-11-06 13:44:54 -06:00
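The sector numbers in the `dmsetup table` output of this commit message can be checked with a short sketch. The function names are hypothetical, and the metadata length is taken as an input copied from the example output; how lvm chooses that length is not stated in the message.

```python
SECTOR_BYTES = 512  # device-mapper tables count 512-byte sectors


def mib_to_sectors(mib):
    """Convert a size in MiB to 512-byte sectors."""
    return mib * 1024 * 1024 // SECTOR_BYTES


def split_single_lv_cache(total_sectors, meta_sectors):
    """Return (offset, length) pairs for the metadata and data linear mappings.

    Metadata occupies the beginning of the LV and data takes the rest.
    """
    cmeta = (0, meta_sectors)
    cdata = (meta_sectors, total_sectors - meta_sectors)
    return cmeta, cdata


fast_sectors = mib_to_sectors(64)
cmeta, cdata = split_single_lv_cache(fast_sectors, 32768)
```

With the 64 MiB `fast` LV and the 32768-sector metadata area from the example, this gives a data mapping that starts at sector 32768 with length 98304, matching the `vg-fast_cdata` and `vg-fast_cmeta` lines.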
David Teigland	a686391eca	cache: reorganize cache_set_policy to prepare for future addition	2018-11-06 11:36:29 -06:00
David Teigland	23948e99b3	cache: improve error message about flush	2018-11-06 11:36:29 -06:00
David Teigland	3e547fa952	cache: improve warning message about cached thin data	2018-11-06 11:36:28 -06:00
David Teigland	e26dacf30a	cache: factor getting cache mode so part can be called separately	2018-11-06 11:36:28 -06:00
David Teigland	8d7075528f	cache: add cache_mode_num_to_str Requires only string and number, no specific lv/seg type.	2018-11-06 11:36:28 -06:00
Zdenek Kabelac	70e3d0a613	cov: remove unused assigns	2018-11-05 17:25:11 +01:00
David Teigland	aecf542126	metadata: prevent writing beyond metadata area lvm uses a bcache block size of 128K. A bcache block at the end of the metadata area will overlap the PEs from which LVs are allocated. How much depends on alignments. When lvm reads and writes one of these bcache blocks to update VG metadata, it can also be reading and writing PEs that belong to an LV. If these overlapping PEs are being written to by the LV user (e.g. filesystem) at the same time that lvm is modifying VG metadata in the overlapping bcache block, then the user's updates to the PEs can be lost. This patch is a quick hack to prevent lvm from writing past the end of the metadata area.	2018-10-29 16:53:17 -05:00
Heinz Mauelshagen	8df2dd66ce	Revert "raid: fix left behind SubLVs" This reverts commit `16ae968d24`. We need to come up with a better fix, because we fall short wiping all known signatures when not using the wipe_lv API.	2018-10-25 14:35:56 +02:00
Heinz Mauelshagen	16ae968d24	raid: fix left behind SubLVs lvm metadata writes, commits and activations are performed for (newly) allocated RAID metadata SubLVs to wipe any preexisting data and thus avoid false raid superblock positives on RaidLV activation. This process can be interrupted by command or system crashes, thus leaving stale SubLVs in the lvm metadata as a problem. Because we hold an exclusive lock in this metadata SubLV wiping process, we can address this problem by avoiding aforementioned commits/writes/activations altogether, wiping the respective first sector of the first physical extent allocated to any metadata SubLV directly via the existing dev_set() API. Succeeds all LVM RAID tests. Related: rhbz1633167	2018-10-24 16:35:30 +02:00
Zdenek Kabelac	fdd76da33d	cov: drop unneeded header files	2018-10-15 17:49:44 +02:00
Zdenek Kabelac	253989ecd9	cov: fix error path Avoid calling 'bad:' section since we have not set 'fd' yet and instead directly return failing 0 value.	2018-10-15 17:49:44 +02:00
Zdenek Kabelac	fbfbbf6d6a	cov: drop check for pointer Pointer must be always set and it's been already dereferenced.	2018-10-15 14:24:28 +02:00
Heinz Mauelshagen	989626926c	lvconvert: allow raid4 -> linear conversion request Allow "lvconvert --type linear RaidLV" on a raid4 LV providing convenient interim steps to convert to linear. Add respective new test lvconvert-raid-takeover-raid4_to_linear.sh and lvconvert-raid-takeover-linear_to_raid4.sh for linear to raid4 once on it.	2018-09-10 18:43:21 +02:00
Heinz Mauelshagen	e2e30a64ab	lvconvert: fix interim segtype regression on raid6 conversions When converting from striped/raid0/raid0_meta to raid6 with > 2 stripes, allow possible direct conversion (to raid6_n_6). In case of 2 stripes, first convert to raid5_n to restripe to at least 3 data stripes (the raid6 minimum in lvm2) in a second conversion before finally converting to raid6_n_6. As before, raid6_n_6 then can be converted to any other raid6 layout. Enhance lvconvert-raid-takeover.sh to test the 2 stripes conversions to raid6. Resolves: rhbz1624038	2018-09-07 13:48:19 +02:00
Heinz Mauelshagen	22a1304368	lvconvert: avoid superfluous interim raid type When converting striped/raid0*/raid6_n_6 <-> raid4, avoid superfluous interim raid5_n layout. Related: rhbz1447809	2018-08-31 19:04:19 +02:00
Heinz Mauelshagen	e83c4f07ca	lvconvert: fix conversion attempts to linear "lvconvert --type linear RaidLV" on striped and raid4/5/6/10 have to provide the convenient interim layouts. Fix involves a cleanup to the convenience type function. As a result of testing, add missing sync waits to lvconvert-raid-reshape-linear_to_raid6-single-type.sh. Resolves: rhbz1447809	2018-08-22 17:12:43 +02:00
Heinz Mauelshagen	4578411633	lvconvert: fix regression preventing direct striped conversion Conversion to striped from raid0/raid0_meta is directly possible. Fix a regression setting superfluous interim raid5_n conversion type introduced by commit `bd7cdd0b09`. Add new test script lvconvert-raid0-striped.sh. Resolves: rhbz1608067	2018-08-21 17:28:56 +02:00
Zdenek Kabelac	acab591378	mirror: fix splitmirrors for mirror type With improved mirror activation code a --splitmirrors issue popped up since proper preload code and deactivation for the split mirror leg were missing.	2018-08-07 17:58:30 +02:00
Zdenek Kabelac	c34291e3bf	cache: drop metadata_format validation Allow to use any combination of cache metadata format for policy.	2018-08-07 17:57:00 +02:00
David Teigland	778ce8d808	lvconvert: improve text about splitmirrors in messages and man page.	2018-07-23 12:28:48 -05:00
David Teigland	117160b27e	Remove lvmetad Native disk scanning is now both reduced and async/parallel, which makes it comparable in performance (and often faster) when compared to lvm using lvmetad. Autoactivation now uses local temp files to record online PVs, and no longer requires lvmetad. There should be no apparent command-level change in behavior.	2018-07-11 11:26:42 -05:00
Zdenek Kabelac	12213445b5	vgchange: vdo support Support vgchange usage with VDO segtype. Also changing extent size need small update for vdo virtual extent. TODO: API needs enhancements so it's not about adding ifs() everywhere.	2018-07-09 15:29:16 +02:00
Zdenek Kabelac	c58733ca15	lvcreate: vdo support Supports basic: 'lvcreate --vdo -LXXXG -VYYYG vg/vdoname -n lvname' Allows to create basic VDO pool volume and virtual VDO volume.	2018-07-09 15:29:12 +02:00
Zdenek Kabelac	6945bbdbc6	lvresize: vdo support Unsupported ATM. Wait till VDO kernel target starts to use updated resize sequence, LOAD, SUSPEND, RESUME.	2018-07-09 15:28:35 +02:00
Zdenek Kabelac	44c99a8822	vdo: data percentage Display percentage of used virtual size of vdo-pool volume.	2018-07-09 15:28:35 +02:00
Zdenek Kabelac	5807993bbf	display: basic vdo segment lvdisplay and lvs support Print some basic info about vdo segment. 'lvdisplay -m' ATM shows the most. lvs shows usage percentage.	2018-07-09 15:28:35 +02:00
Zdenek Kabelac	493ffe7a0f	lv_manip: layout and role support for vdo segment	2018-07-09 15:28:35 +02:00
Zdenek Kabelac	00990ed53e	check_lv_segment: internal vdo segment validation Check if settings for vdo segment are correct.	2018-07-09 15:28:35 +02:00
Zdenek Kabelac	0dafd159a8	vdo_manip: parsing status of VDO device	2018-07-09 15:28:35 +02:00
Zdenek Kabelac	aa63dfbe39	vdo: support functions to map enums to string names Translate VDO enums to printable strings.	2018-07-09 15:28:35 +02:00
Zdenek Kabelac	aff69ecf39	vdo: component activation of VDO data LV Allow component activation of VDO data LV.	2018-07-09 15:28:35 +02:00
Zdenek Kabelac	4b7a57c9ed	vdo: with created names use vpool When a user creates a vdo-pool - use a different automatic name. So unlike with traditional LVs using lvol0, lvol1 use vpool0, vpool1... TODO: apply similar for thin-pool & cache-pool...	2018-07-09 15:28:35 +02:00
Zdenek Kabelac	a8f84f7801	vdo: introduce segment types and manip functions Core functionality introducing lvm VDO support.	2018-07-09 15:28:35 +02:00
Zdenek Kabelac	e9d1f676b3	allocation: add check for passing log allocation Updates previous commit.	2018-07-09 00:59:34 +02:00
Zdenek Kabelac	6d1c983122	cleanup: use last_seg More readable code.	2018-07-09 00:23:35 +02:00
Zdenek Kabelac	b697aa9646	allocator: fix thin-pool allocation When allocating thin-pool with more than 1 device - try to allocate 'metadataLV' with reuse of log-type allocation for mirror LV. It should be naturally placed on a different device than 'dataLV'. However, due to somewhat hard to follow allocation logic code, allocation was rejected in cases where there was not enough space for data or metadata on a single PV, thus to succeed, usage of segments was mandatory. While the user may use: allocation/thin_pool_metadata_require_separate_pvs=1 to enforce separate meta and data LVs - on default settings this is not enabled, thus segment allocation is meant to work. NOTE: As already said - the original intention of this whole 'if()' is unclear, so try to split this test into multiple more simple tests that are more readable. TODO: more validation.	2018-07-09 00:19:30 +02:00
Zdenek Kabelac	f2b856c994	lv_manip: do not check extents for any virtual target Allow creation of any virtual segment type with just --virtualsize specified without any real extent size given. TODO: likely --type error,zero might be later enhanced to use -V (along with -L) - but since those targets do not allocate real space, supporting -V makes sense with them.	2018-07-02 10:24:23 +02:00
Zdenek Kabelac	2bb9627d01	lv_manip: add name of failing LV into error message	2018-07-02 10:24:23 +02:00
Zdenek Kabelac	cea88a9e4e	lv_manip: use vgmem pool Switch to vgmem pool for allocation associated with modification of particular VG.	2018-06-25 15:07:55 +02:00
Zdenek Kabelac	357e9f9572	cache: use new api function	2018-06-25 15:07:55 +02:00
Zdenek Kabelac	9c0d92d957	lv_manip: add new internal api function	2018-06-25 15:07:55 +02:00
Zdenek Kabelac	8949903fbb	cache: set areas count prior using it Set correct counter, so it's not failing on internal error check.	2018-06-25 15:07:32 +02:00
Zdenek Kabelac	106ee05ba0	lv_manip: add extra internal error Catch error early, when trying to store data into non-allocated area.	2018-06-22 23:37:02 +02:00
David Teigland	e166d2b14c	lvmlockd: fix another missing lock_type null check Same as `347c807f8`.	2018-06-21 09:24:51 -05:00
David Teigland	428514a07f	Drop --ignoreskippedcluster option It's no longer needed. Clustered VGs are now handled in the same way as foreign VGs, and as shared VGs that can't be accessed: - A command processing all VGs sees a clustered VG, prints a message ("Skipping clustered VG foo."), skips it, and does not fail. - A command where the clustered VG is explicitly named on the command line, prints a message and fails. "Cannot access clustered VG foo, see lvmlockd(8)." The option is listed in the set of ignored options for the commands that previously accepted it. (Removing it entirely would cause commands/scripts to fail if they set it.)	2018-06-15 15:59:34 -05:00
David Teigland	8eab37593e	Add cmd arg to more functions so that it can be used in the filter code	2018-06-15 11:03:55 -05:00
David Teigland	e53cfc6a88	lvmlockd: update method for changing clustered VG The previous method for forcibly changing a clustered VG to a local VG involved using -cn and locking_type 0. Since those options are deprecated, replace it with the same command used for other forced lock type changes: vgchange --locktype none --lockopt force.	2018-06-13 15:30:28 -05:00
David Teigland	17f5572bc9	Remove independent metadata areas in which metadata is stored in files on the local fs instead of on PVs.	2018-06-13 12:25:19 -05:00
David Teigland	981a3ba98e	Clean up repair and result values in vg_read Fix the confusing mix of input and output values in the single variable.	2018-06-12 11:08:26 -05:00
David Teigland	9a8c36b891	Fix use of orphan lock in commands vgreduce, vgremove and vgcfgrestore were acquiring the orphan lock in the midst of command processing instead of at the start of the command. (The orphan lock moved to being acquired at the start of the command back when pvcreate/vgcreate/vgextend were reworked based on pvcreate_each_device.) vgsplit also needed a small update to avoid reacquiring a VG lock that it already held (for the new VG name).	2018-06-12 09:46:11 -05:00
David Teigland	c4153a8dfc	Remove checking for locked VGs A few places were calling a function to check if a VG lock was held. The only place it was actually needed is for pvcreate which wants to do its own locking (and scanning) around process_each_pv. The locking/scanning exceptions for pvcreate in process_each_pv/vg_read can be enabled by just passing a couple of flags instead of checking if the VG is already locked. This also means that these special cases won't be enabled unknowingly in other places where they shouldn't be used.	2018-06-12 09:46:04 -05:00
David Teigland	3b6b7f8f9b	lvmlockd: skip repair lock upgrade for non shared vgs Only attempt lvmlockd lock upgrade for shared VGs.	2018-06-12 09:44:05 -05:00
Zdenek Kabelac	77d5caae90	snapshot: improve checking of merging snapshot Add runtime detection for 'lvs -o+seg_monitor' and 'vgchange --monitor'. This fix should avoid unnecessary timeout on systemd shutdown.	2018-06-11 22:25:42 +02:00
David Teigland	a8759dc7a6	Remove unused cache management from locking This code was for managing lvmcache for clvm and it no longer does anything.	2018-06-08 12:30:43 -05:00
David Teigland	669b1295ae	Remove header declarations for removed functions	2018-06-08 10:01:05 -05:00
David Teigland	73b7e6fde7	Remove more code that was only used by liblvm2app	2018-06-08 09:29:11 -05:00
Joe Thornber	7c4b19c335	Merge branch '2018-06-04-data-structs'	2018-06-08 14:21:07 +01:00
Joe Thornber	d5da55ed85	device_mapper: remove dbg_malloc. I wrote dbg_malloc before we had valgrind. These days there's just no need.	2018-06-08 13:40:53 +01:00
Zdenek Kabelac	5cb4b2a424	cache: cleaner policy also uses fmt2 Format 2 is also with cleaner policy.	2018-06-08 14:37:29 +02:00
Zdenek Kabelac	fb171edd45	pvresize: add missing return Log error path missed return 0. Also fix some unneeded backtraces (since log_error already shows position).	2018-06-08 14:36:56 +02:00
Joe Thornber	286c1ba336	device_mapper: rename libdevmapper.h -> all.h I'm paranoid a file will include the global one in /usr/include by accident.	2018-06-08 12:31:45 +01:00
David Teigland	18259d5559	Remove unused clvm variations for active LVs Different flavors of activate_lv() and lv_is_active() which are meaningful in a clustered VG can be eliminated and replaced with whatever that flavor already falls back to in a local VG. e.g. lv_is_active_exclusive_locally() is distinct from lv_is_active() in a clustered VG, but in a local VG they are equivalent. So, all instances of the variant are replaced with the basic local equivalent. For local VGs, the same behavior remains as before. For shared VGs, lvmlockd was written with the explicit requirement of local behavior from these functions (lvmlockd requires locking_type 1), so the behavior in shared VGs also remains the same.	2018-06-07 16:17:04 +01:00
David Teigland	e4d9099e19	Remove more clvm code	2018-06-07 16:17:04 +01:00
David Teigland	d154dd6638	lvmlockd: fix missing lock_type null check Missed checking if vg->lock_type is NULL in commit `db8d3bdfa`: lvmlockd: enable mirror split and merge with dlm lock_type	2018-06-07 16:17:04 +01:00
David Teigland	3e781ea446	Remove clvmd and associated code More code reduction and simplification can follow.	2018-06-05 11:09:13 -05:00
Heinz Mauelshagen	bd7cdd0b09	lvconvert: support linear <-> striped convenience conversions "lvconvert --type {linear|striped|raid} ..." on a striped/linear LV provides a convenience interim type to convert to the requested final layout, similar to the given raid <-> raid* convenience types. Whilst on it, add missing raid5_n convenience type from raid5* to raid10. Resolves: rhbz1439925 Resolves: rhbz1447809 Resolves: rhbz1573255	2018-06-05 16:23:18 +02:00
Heinz Mauelshagen	de66704253	segtype: add linear Add linear segtype addressing FIXME in preparation for linear <-> striped convenience conversion support	2018-06-05 16:23:18 +02:00
Zdenek Kabelac	1140d70893	build: fixes	2018-06-04 12:28:13 +02:00
Zdenek Kabelac	6a1f458bb7	build: compile fixes	2018-06-01 21:12:31 +02:00
David Teigland	09177b53dd	lvmlockd: clarify lock_type use for coverity Make it clearer when vg->lock_type will be used so coverity doesn't worry about it.	2018-06-01 13:15:22 -05:00
David Teigland	b6f0f20da2	lvmlockd: primarily use vg_is_shared to check if a vg uses an lvmlockd lock_type, instead of the equivalent but longer is_lockd_type.	2018-06-01 13:15:22 -05:00
Joe Thornber	dbba1e9b93	Merge branch 'master' into 2018-05-11-fork-libdm	2018-06-01 13:04:12 +01:00
David Teigland	b9c1cef817	lvmlockd: fix reverting new lv in error path The wrong name was being used to free the LV lock in lvmlockd in the error exit path.	2018-05-31 15:35:48 -05:00
David Teigland	fdaa7e2e87	vgs: add report field for shared equivalent to a non-empty -o locktype.	2018-05-31 10:23:03 -05:00
David Teigland	c516321325	lvmlockd: enable lvcreate of new LV plus existing cache pool In this command, lvcreate creates a new LV and then combines it with an existing cache pool, producing a cache LV. This command was previously not allowed in a shared VG.	2018-05-30 15:24:24 -05:00
David Teigland	6cd0523337	lvmlockd: enable repairing shared VG while reading it When the lvmlockd lock is shared, upgrade it to ex when repair (writing) is needed during vg_read. Pass the lockd state through additional read-related functions so the instances of repair scattered through vg_read can be handled. (Temporary solution until the ad hoc repairs can be pulled out of vg_read into a top level, centralized repair function.)	2018-05-30 12:56:46 -05:00
David Teigland	948f2d9979	lvmlockd: enable lvcreate of thin pool and thin lv in one command Previously, thin pools and thin lvs needed to be created with separate commands, now the combined command is permitted.	2018-05-30 09:25:45 -05:00
David Teigland	db8d3bdfa9	lvmlockd: enable mirror split and merge with dlm lock_type	2018-05-30 09:25:45 -05:00
David Teigland	0253f5a21d	fix id_write_format on non-uuid string orphan vgs using the vgname "#orphans" as the vgid, and valgrind complains about calling id_write_format on that invalid uuid.	2018-05-18 13:41:20 -05:00
David Teigland	286c9c78b4	liblvm2app: fix valgrind memory warning	2018-05-17 15:18:11 -05:00
Rick Elrod	8c453e2e5e	cleanup: fix grammar in output - less then -> less than This minor patch fixes grammar in a few messages which get printed to users. It also fixes the same grammar mistake in several comments. Signed-off-by: Rick Elrod <relrod@redhat.com>	2018-05-17 10:37:45 +02:00
David Teigland	28d35e5c59	scan: fix missing close in lib lib was using dev_test_excl which wasn't closing the device. Switch code to new io layer with excl open. Also use exclusive open in some other places.	2018-05-16 14:48:30 -05:00
Joe Thornber	89fdc0b588	Merge branch 'master' into 2018-05-11-fork-libdm	2018-05-16 13:43:02 +01:00
Joe Thornber	ccc35e2647	device-mapper: Fork libdm internally. The device-mapper directory now holds a copy of libdm source. At the moment this code is identical to libdm. Over time code will migrate out to appropriate places (see doc/refactoring.txt). The libdm directory still exists, and contains the source for the libdevmapper shared library, which we will continue to ship (though not necessarily update). All code using libdm should now use the version in device-mapper.	2018-05-16 13:00:50 +01:00
Joe Thornber	7f97c7ea9a	build: Don't generate symlinks in include/ dir As we start refactoring the code to break dependencies (see doc/refactoring.txt), I want us to use full paths in the includes (eg, #include "base/data-struct/list.h"). This makes it more obvious when we're breaking abstraction boundaries, eg, including a file in metadata/ from base/	2018-05-14 10:30:20 +01:00
David Teigland	5c9dcd99fd	scan: remove unused args from label_read	2018-05-11 14:16:49 -05:00
David Teigland	bbb8040456	dev_cache: drop open_list devices are now held open only in bcache, so drop the dev_cache list of open devices which is unused.	2018-05-11 12:47:56 -05:00
David Teigland	9ad42e5f06	io: write log header with bcache	2018-05-10 16:25:33 -05:00
David Teigland	57bb46c5e7	filter: use bcache for filter reads Filters are still applied before any device reading or the label scan, but any filter checks that want to read the device are skipped and the device is flagged. After bcache is populated, but before lvm looks for devices (i.e. before label scan), the filters are reapplied to the devices that were flagged above. The filters will then find the data they need in bcache.	2018-05-10 16:03:19 -05:00
Joe Thornber	39ce38eb88	label/lv_manip: squash some warnings	2018-05-10 15:14:39 +01:00
David Teigland	9a5bd01b0c	io: replace dev_set with bcache equivalents	2018-05-09 11:29:52 -05:00
David Teigland	c016b573ee	clvmd: separate saved_vg from vginfo The clvmd saved_vg data is independent from the normal lvm lvmcache vginfo data, so separate saved_vg from vginfo. Normal lvm doesn't need to use save_vg at all, and in clvmd, lvmcache changes on vginfo can be made without worrying about unwanted effects on saved_vg.	2018-05-03 14:54:48 -05:00
Heinz Mauelshagen	88fe07ad0a	raid: use new internal APIs Use APIs introduced with commit `4ebfd8e8eb` where appropriate to minimize redundant code.	2018-05-03 21:36:50 +02:00
Heinz Mauelshagen	4ebfd8e8eb	lvconvert: don't return success on degraded -m raid1 conversion In case "lvconvert -mN RaidLV" was used on a degraded raid1 LV, success was returned instead of an error. Provide message to inform about the need to repair first before changing number of mirrors and exit with error. Add new lvconvert-m-raid1-degraded.sh test. Resolves: rhbz1573960	2018-05-03 18:48:00 +02:00
David Teigland	c1cd18f21e	Remove lvm1 and pool disk formats There are likely more bits of code that can be removed, e.g. lvm1/pool-specific bits of code that were identified using FMT flags. The vgconvert command can likely be reduced further. The lvm1-specific config settings should probably have some other fields set for proper deprecation.	2018-04-30 16:55:02 -05:00
David Teigland	029a76b4f8	clvmd: don't repair vg from vg_read in clvmd The mixed up vg repair code in vg_read was trying to repair a vg when vg_read was called by clvmd. The clvmd daemon isn't supposed to be repairing or writing a vg. (This is a temporary workaround; vg repair will soon be pulled out of vg_read so it can be called in a controlled way and consolidated instead of spread around.)	2018-04-30 15:56:51 -05:00
Joe Thornber	65d6118e47	[metadata-liblvm.c] comment out some dead code and add a FIXME	2018-04-30 09:45:39 +01:00
David Teigland	5b6e62dc1f	clvmd: drop old saved_vg when returning new saved_vg In some pvmove tests, clvmd uses the new (precommitted) saved_vg, but then requests the old saved_vg, and expects that the new saved_vg be returned instead of the old. So, when returning the new saved_vg, forget the old one so we don't return it again.	2018-04-26 14:57:45 -05:00
David Teigland	47bfac21ca	clvmd: skip dev rescan after full scan When clvmd does a full label scan just prior to calling _vg_read(), pass a new flag into _vg_read to indicate that the normal rescan of VG devs is not needed.	2018-04-25 16:39:43 -05:00
David Teigland	1fec86571f	clvmd: reuse a vg struct for sequential LV operations After reading a VG, stash it in lvmcache as "saved_vg". Before reading the VG again, try to use the saved_vg. The saved_vg is dropped on VG lock operations.	2018-04-25 16:39:43 -05:00
Zdenek Kabelac	c492fbb51c	debug: more explanatory error message	2018-04-23 22:42:18 +02:00
David Teigland	1409c4a1c2	clvm: rescan when VG or PV not found Rescan devices to update lvmcache content when clvmd vg_read doesn't find a VG or PV.	2018-04-20 16:09:49 -05:00
David Teigland	aee27dc7ba	scan: skip device rescan in vg_read For reporting commands (pvs,vgs,lvs,pvdisplay,vgdisplay,lvdisplay) we do not need to repeat the label scan of devices in vg_read if they all had matching metadata in the initial label scan. The data read by label scan can just be reused for the vg_read. This cuts the amount of device i/o in half, from two reads of each device to one. We have to be careful to avoid repairing the VG if we've skipped rescanning. (The VG repair code is very poor, and will be redone soon.)	2018-04-20 11:23:14 -05:00
David Teigland	9b6a62f944	lvmcache: simplify Recent changes allow some major simplification of the way lvmcache works and is used. lvmcache_label_scan is now called in a controlled fashion at the start of commands, and not via various unpredictable side effects. Remove various calls to it from other places. lvmcache_label_scan should not be called from anywhere during a command, because it produces an incorrect representation of PVs with no MDAs, and misclassifies them as orphans. This has been a long standing problem. The invalid flag and rescanning based on that is no longer used and removed. The 'force' variation is no longer needed and removed.	2018-04-20 11:22:48 -05:00
David Teigland	a9b0aa5c17	lvmetad: more fixes related to bcache Need to open devs prior to bcache io.	2018-04-20 11:22:48 -05:00
David Teigland	ddb5de7a98	clvm: fix bcache scan handling We can't let clvmd keep all scanned devs open, which prevents them from being removed. So drop the bcache data (and close fds) after doing a label scan. Also set up bcache before the clvm-specific vg_read (which needs to rescan the vg's devs using bcache) and destroy the bcache after.	2018-04-20 11:22:48 -05:00
David Teigland	e49b114f7e	bcache: use wrappers for bcache read write in lvm Using a wrapper makes it easier to disable bcache if needed.	2018-04-20 11:22:47 -05:00
David Teigland	8065492046	bcache: do all writes through bcache	2018-04-20 11:22:47 -05:00
David Teigland	37471bb477	scan: skip extra scan in vg_read Drop an extra label scan in the recovery part of vg_read. This is a temporary improvement until the pending replacement for the broken recovery code buried in vg_read.	2018-04-20 11:22:46 -05:00
David Teigland	6c67c7557c	scan: use separate fd for bcache Create a new dev->bcache_fd that the scanning code owns and is in charge of opening/closing. This prevents other parts of lvm code (which do various open/close) from interfering with the bcache fd. A number of dev_open and dev_close are removed from the reading path since the read path now uses the bcache. With that in place, open(O_EXCL) for pvcreate/pvremove can then be fixed. That wouldn't work previously because of other open fds.	2018-04-20 11:22:46 -05:00
David Teigland	d9a77e8bb4	lvmcache: simplify metadata cache The copy of VG metadata stored in lvmcache was not being used in general. It pretended to be a generic VG metadata cache, but was not being used except for clvmd activation. There it was used to avoid reading from disk while devices were suspended, i.e. in resume. This removes the code that attempted to make this look like a generic metadata cache, and replaces it with something narrowly targeted to what it's actually used for. This is a way of passing the VG from suspend to resume in clvmd. Since in the case of clvmd one caller can't simply pass the same VG to both suspend and resume, suspend needs to stash the VG somewhere that resume can grab it from. (resume doesn't want to read it from disk since devices are suspended.) The lvmcache vginfo struct is used as a convenient place to stash the VG to pass it from suspend to resume, even though it isn't related to the lvmcache or vginfo. These suspended_vg* vginfo fields should not be used or touched anywhere else, they are only to be used for passing the VG data from suspend to resume in clvmd. The VG data being passed between suspend and resume is never modified, and will only exist in the brief period between suspend and resume in clvmd. suspend has both old (current) and new (precommitted) copies of the VG metadata. It stashes both of these in the vginfo prior to suspending devices. When vg_commit is successful, it sets a flag in vginfo as before, signaling the transition from old to new metadata. resume grabs the VG stashed by suspend. If the vg_commit happened, it grabs the new VG, and if the vg_commit didn't happen it grabs the old VG. The VG is then used to resume LVs. This isolates clvmd-specific code and usage from the normal lvm vg_read code, making the code simpler and the behavior easier to verify.
Sequence of operations: - lv_suspend() has both vg_old and vg_new and stashes a copy of each onto the vginfo: lvmcache_save_suspended_vg(vg_old); lvmcache_save_suspended_vg(vg_new); - vg_commit() happens, which causes all clvmd instances to call lvmcache_commit_metadata(vg). A flag is set in the vginfo indicating the transition from the old to new VG: vginfo->suspended_vg_committed = 1; - lv_resume() needs either vg_old or vg_new to use in resuming LVs. It doesn't want to read the VG from disk since devices are suspended, so it gets the VG stashed by lv_suspend: vg = lvmcache_get_suspended_vg(vgid); If the vg_commit did not happen, suspended_vg_committed will not be set, and in this case, lvmcache_get_suspended_vg() will return the old VG instead of the new VG, and it will resume LVs based on the old metadata.	2018-04-20 11:22:45 -05:00
David Teigland	79c4971210	label_scan: remove extra label scan and read for orphan PVs When process_each_pv() calls vg_read() on the orphan VG, the internal implementation was doing an unnecessary lvmcache_label_scan() and two unnecessary label_read() calls on each orphan. Some of those unnecessary label scans/reads would sometimes be skipped due to caching, but the code was always doing at least one unnecessary read on each orphan. The common format_text case was also unnecessarily calling into the format-specific pv_read() function which actually did nothing. By analyzing each case in which vg_read() was being called on the orphan VG, we can say that all of the label scans/reads in vg_read_orphans are unnecessary: 1. reporting commands: the information saved in lvmcache by the original label scan can be reported. There is no advantage to repeating the label scan on the orphans a second time before reporting it. 2. pvcreate/vgcreate/vgextend: these all share a common implementation in pvcreate_each_device(). That function already rescans labels after acquiring the orphan VG lock, which ensures that the command is using valid lvmcache information.	2018-04-20 11:22:45 -05:00
David Teigland	748f29b42a	scan: do scanning at the start of a command Move the location of scans to make it clearer and avoid unnecessary repeated scanning. There should be one scan at the start of a command which is then used through the rest of command processing. Previously, the initial label scan was called as a side effect from various utility functions. This would lead to it being called unnecessarily. It is an expensive operation, and should only be called when necessary. Also, this is a primary step in the function of the command, and as such it should be called prominently at the top level of command processing, not as a hidden side effect of a utility function. lvm knows exactly where and when the label scan needs to be done. Because of this, move the label scan calls from the internal functions to the top level of processing. Other specific instances of lvmcache_label_scan() are still called unnecessarily or unclearly by specific commands that do not use the common process_each functions. These will be improved in future commits. During the processing phase, rescanning labels for devices in a VG needs to be done after the VG lock is acquired in case things have changed since the initial label scan. This was being done by way of rescanning devices that had the INVALID flag set in lvmcache. This usually approximated the right set of devices, but it was not exact, and obfuscated the real requirement. Correct this by using a new function that rescans the devices in the VG: lvmcache_label_rescan_vg(). Apart from being inexact, the rescanning was extremely well hidden. _vg_read() would call ->create_instance(), _text_create_text_instance(), _create_vg_text_instance() which would call lvmcache_label_scan() which would call _scan_invalid() which repeats the label scan on devices flagged INVALID. lvmcache_label_rescan_vg() is now called prominently by _vg_read() directly.	2018-04-20 11:21:38 -05:00
David Teigland	4507ba3596	scan: use new label_scan for lvmcache_label_scan To do label scanning, lvm code calls lvmcache_label_scan(). Change lvmcache_label_scan() to use the new label_scan() based on bcache. Also add lvmcache_label_rescan_vg() which calls the new label_scan_devs() which does label scanning on only the specified devices. This is for a subsequent commit and is not yet used.	2018-04-20 11:19:32 -05:00
David Teigland	a7cb76ae94	scan: use bcache for label scan and vg read New label_scan function populates bcache for each device on the system. The two read paths are updated to get data from bcache. The bcache is not yet used for writing. bcache blocks for a device are invalidated when the device is written.	2018-04-20 11:19:24 -05:00
Joe Thornber	00f1b208a1	[io paths] Unpick agk's aio stuff	2018-04-20 11:03:58 -05:00
Zdenek Kabelac	73cda0437f	cleanup: correcting macro wrapping Use proper do {} while(0) so ';' after macros are correctly interpreted.	2018-04-20 12:17:01 +02:00
Zdenek Kabelac	9731d48691	cleanup: enhance debug message	2018-04-20 12:17:01 +02:00
Zdenek Kabelac	d437bd86ff	cleanup: display_lvname update message Add more display_lvname usage. Update some error messages. Indent.	2018-04-20 12:17:01 +02:00
Zdenek Kabelac	7323557379	cleanup: add _mb_ to regionsize option Just like the others, mention the default unit in the function name.	2018-04-20 12:17:01 +02:00
Zdenek Kabelac	27a1a0e5c0	cleanup: reorder condition There is no point to wait for sync for non-locally active LV.	2018-04-20 12:17:01 +02:00
Zdenek Kabelac	d81e3f9b06	mirror: use vg mempool Use vg mempool with mirror log metadata update.	2018-04-20 12:16:14 +02:00
Zdenek Kabelac	05f954ee9b	mirror: checking for mirror segtype Checking more correctly for mirror segtype here instead of mirrored one which can be also 'raid'.	2018-04-20 12:16:14 +02:00


3058 Commits