shaba/lvm2

mirror of git://sourceware.org/git/lvm2.git synced 2024-12-22 17:35:59 +03:00

Author	SHA1	Message	Date
David Teigland	a61272a6f0	Revert "lvs: disable scanning optimization" This reverts commit `7474440d3b`. lvs can use the scanning optimization again since it has been changed in: "scanning: optimize by checking text offset and checksum"	2019-11-26 16:52:28 -06:00
Heinz Mauelshagen	29db9c6325	lvcreate: ensure striped raid region size is at least stripe size The kernel MD runtime requires region size to be larger than stripe size on striped raid layouts, thus the dm-raid target's constructor rejects such request. This causes e.g. an 'lvcreate --type raid10 -i3 -I4096 -R2048 -n lv vg' to fail. Avoid failing late in the kernel by enforcing region size to be larger or equal to stripe size. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1698225	2019-11-26 22:31:58 +01:00
David Teigland	2037476008	pvcreate,pvremove: fix reacquiring global lock after prompt When pvcreate/pvremove prompt the user, they first release the global lock, then acquire it again after the prompt, to avoid blocking other commands while waiting for a user response. This release/reacquire changes the locking order with respect to the hints flock (and potentially other locks). So, to avoid deadlock, use a nonblocking request when reacquiring the global lock.	2019-11-26 14:34:43 -06:00
David Teigland	7474440d3b	lvs: disable scanning optimization The scanning optimization can produce warnings from 'lvs' when run concurrently with commands modifying LVs, so disable the optimization until it can be improved. Without the scanning optimization, lvs will always read all PVs twice: 1. read metadata from all PVs, saving it in memory 2. for each VG 3. lock VG 4. reread metadata from all PVs in VG, replacing metadata saved from step 1 5. run command on VG 6. unlock VG The optimization would usually cause step 4 to be skipped, and PVs would be read only once. Running the command in step 5 using metadata that was not read under the VG lock is usually fine, except for the fact that lvs attempts to validate the metadata by comparing it to current dm state. If other commands are modifying dm state while lvs is running, lvs may see differences between metadata from step 1 and dm state checked during step 5, and print warnings. (A better fix may be to detect the concurrent change and fall back to rereading metadata in step 4 only when needed.)	2019-11-19 10:56:12 -06:00
Zdenek Kabelac	9af1d63b4d	cov: use zalloc Instead of malloc() memset() -> zalloc()	2019-11-14 18:06:42 +01:00
Zdenek Kabelac	1f4968289c	pvck: check result of dev_get_size Don't use garbage value for later computations.	2019-11-14 18:06:42 +01:00
Zdenek Kabelac	d67ce9e140	cov: fix mem leaking buffer Free allocated buffer on function's exit. Also check for fwrite() results.	2019-11-14 18:06:42 +01:00
Zdenek Kabelac	0bad3977df	cov: avoid passing NULL to strstr function When 'str1' would be NULL, there is no point to run 2nd. strstr().	2019-11-14 18:06:42 +01:00
Zdenek Kabelac	153e55c20e	cov: check for retvalue	2019-11-14 18:06:42 +01:00
Zdenek Kabelac	82e6b820b8	cov: check for NULL Since we check for NULL pointers earlier, we need to be consistent across the function, since a NULL would apply across the whole function. When dropping the 'mda' check, we are actually already dereferencing it before, so it can't be NULL at those places (and it's validated before entering _read_mda_header_and_metadata).	2019-11-14 18:06:42 +01:00
David Teigland	6a8bd0c509	lvmlockd: fix cachevol locking When a cachevol LV is attached, have the LV keep its lock allocated. The lock on the cachevol won't be used while it's attached. When the cachevol is split a new lock does not need to be allocated. (Applies to cachevol usage by both dm-cache and dm-writecache.)	2019-10-25 14:08:59 -05:00
David Teigland	5706764885	improve command definition matching using type When a user includes "--type foo" in a command, only look at command definitions with matching type, as opposed to using matching/mismatching --type as a vote for/against a given command def. This means a command with --type foo will prioritize a command def with --type foo over other command defs that have more matching options but an unmatching type. This makes it more likely that a closely matching command def will be recommended.	2019-10-22 09:35:10 -05:00
David Teigland	967e2decd2	vgchange: remove bogus option restriction for -A with -a	2019-10-21 13:29:57 -05:00
Zdenek Kabelac	644186e920	gcc: all paths will set ret Set success on common path. Fixes random failure on writecache uncaching path.	2019-10-21 15:32:35 +02:00
Zdenek Kabelac	dd7629ea09	cache: use _cpool for used cache-pools When LV gets cached and uses cache-pool - such cache-pool will now get _cpool suffix automatically. Thus 'Pool' column for cached LV will now show either _cvol or _cpool LV.	2019-10-21 15:31:33 +02:00
Zdenek Kabelac	23f660cf98	cache: drop _cpool suffix from unused cache-pool Drop _cpool suffix if present and cache-pool is going to be unused.	2019-10-21 12:14:15 +02:00
Zdenek Kabelac	a5f8e7a96c	lvconvert: use new functions	2019-10-21 12:14:15 +02:00
David Teigland	5714c8c9cc	pvck: dump metadata search Improve the implementation of extracting all text metadata copies from the metadata area. Use this for the existing metadata_all dump option. Add a new metadata_search dump option which does not use lvm headers to find metadata, but looks in standard locations. This is useful if headers are damaged and can't be used to locate metadata. Adding '-v' to metadata_all or metadata_search will add the description and creation_time to the printed list of metadata instances that are found.	2019-10-18 12:26:29 -05:00
Zdenek Kabelac	a255385e3a	cachevol: move cvol rename Move rename of CVOL after archive().	2019-10-17 13:03:50 +02:00
Zdenek Kabelac	dab4a2c893	cachevol: move flag setting after taking archive Before 'archive()' is called, lvm2 must not touch/modify metadata. So move setting CACHE_VOL related flags past this point. Also make sure reading of cache segtype always restores this flag properly (even if compatible flag would be lost).	2019-10-17 13:03:50 +02:00
David Teigland	998e7b1075	writecache: add cvol suffix to attached cachevol When an LV is used as a writecache cachevol, give the LV name a _cvol suffix. Remove the suffix when the cachevol is detached, restoring the original LV name.	2019-10-15 16:03:34 -05:00
David Teigland	91ee025d5b	cache: change cachevol flags for backward compat A cachevol LV had the CACHE_VOL status flag in metadata, and the cache LV using it had no new flag. This caused problems if the new metadata was used by an old version of lvm. An old version of lvm would have two problems processing the new metadata: . The old lvm would return an error when reading the VG metadata when it saw the unknown CACHE_VOL status flag. . The old lvm would return an error when reading the VG metadata because it would not find an expected cache pool attached to the cache LV (since the cache LV had a cachevol attached instead.) Change the use of flags: . Change the CACHE_VOL flag to be a COMPATIBLE flag (instead of a STATUS flag) so that old versions will not fail when they see it. . When a cache LV is using a cachevol, the cache LV gets a new SEGTYPE flag CACHE_USES_CACHEVOL. This flag is appended to the segtype name, so that old lvm versions will fail to use the LV because of an unknown segtype, as opposed to failing to read the VG.	2019-10-15 09:05:52 -05:00
Zdenek Kabelac	201ffbd04a	cachevol: use lv_cache_remove Use same routine for dropping cache.	2019-10-14 15:20:25 +02:00
Zdenek Kabelac	8d8047883e	cachevol: use writethrough for partial removal Instead of using 'noflush' option, switch cache_mode into WRITETHROUGH which does not require flushing, when user confirmed he does not want flushing for WRITEBACK (because of (partially) missing caching PV)	2019-10-14 15:15:14 +02:00
Zdenek Kabelac	8a8e6ebba2	cachevol: rename converted LV to _cvol When converting existing public LV to internally used 'CacheVol' LV - rename LV to LV_cvol. When splitting CacheVol, remove _cvol suffix.	2019-10-14 15:15:12 +02:00
Zdenek Kabelac	f6d171ffe3	cachevol: wipe 'normal' device For wiping we activate and clear 'regular' devices, since in case of whole process interruption (i.e. kill -9) we leave metadata & DM table in a workable state all the time.	2019-10-14 15:14:46 +02:00
Zdenek Kabelac	ddaf2002c9	lvconvert: use struct initializer Always good to keep rest of structure initialized with zeros.	2019-10-14 15:13:47 +02:00
Zdenek Kabelac	76a9a86fd3	lvconvert: fix return value when zeroing fails Use correct error return code for fail path.	2019-10-14 15:13:33 +02:00
David Teigland	d6ffc99052	vgck: fix updatemetadata writing different descriptions vgck --updatemetadata would write the same correct metadata to good mdas, and then to bad mdas, but the sequence of vg_write/vg_commit calls between good and bad mdas could cause a different description field to be generated for good/bad mdas. (The description field describing the command was recently included in the ondisk copy of the metadata text.)	2019-10-11 12:57:32 -05:00
David Teigland	fe16d296b0	pvmove: remove some cmirror related code which is no longer used	2019-10-11 11:31:42 -05:00
David Teigland	7368cf8e7d	pvck: handle PVs with zero metadata copies	2019-09-30 16:20:17 -05:00
David Teigland	0c23d3fc84	pvscan: use quick activation only with matching PV device names When the PV device names in the VG metadata do not match the current PV device names seen on the system, do not use the optimized activation function (that avoids extra device scanning.) When the device names do not match, it's a clue that there could be duplicate PVs, in which case we want to scan all devices to find any duplicates and stop the activation if found. This does not prevent autoactivating a VG from the incorrect duplicate PV, because the incorrect duplicate may appear by itself first. At that point its duplicate PV does not exist to be seen. (A future enhancement could use the WWID to strengthen this detection.)	2019-09-30 11:38:10 -05:00
Zdenek Kabelac	5c0264d689	vdo: restore monitoring of vdo pool Switch to -vpool layered name needs to monitor proper device.	2019-09-30 13:34:34 +02:00
David Teigland	9a8e6ad014	lvconvert: enable --uncache with dm-writecache cachevol splitcache followed by an automatic lvremove of the cachevol LV	2019-09-24 15:51:05 -05:00
David Teigland	76dd9b2b51	writecache: move code into new file put writecache specific code in writecache_manip.c should be no functional change	2019-09-24 15:51:05 -05:00
David Teigland	f27625f005	lvconvert: enable --uncache with dm-cache cachevol splitcache followed by an automatic lvremove of the cachevol LV	2019-09-24 15:50:58 -05:00
David Teigland	4464004362	lvconvert: separate splitcache and uncache functions Reorg code so there are separate functions for splitcache and uncache for both cachepool and cachevol. Should be no functional change.	2019-09-24 13:55:21 -05:00
David Teigland	4fe4c30e7a	lvconvert: allow --cache shortcut for --type cache with cachevol	2019-09-23 14:21:09 -05:00
David Teigland	6f7d7089b4	writecache: use dm suffixes and lv attributes - use internal CACHE_VOL flag on cachevol LV - add suffixes to dm uuids for internal LVs - display appropriate letters in the LV attr field - display writecache's cachevol in lvs output	2019-09-20 14:08:51 -05:00
David Teigland	5d3bced5ea	lvconvert: detaching cachevol with missing PVs . For dm-cache in writethrough, always allow splitcache, whether the cache is missing PVs or not. . For dm-cache in writeback, if the cache is missing PVs, allow splitcache with force and yes. . For dm-writecache, if the cache is missing PVs, allow splitcache with force and yes.	2019-09-20 09:59:37 -05:00
David Teigland	b46dce0bad	lvchange: allow activating cachevol	2019-09-20 09:59:37 -05:00
Zdenek Kabelac	6612d8dd5e	vdo: enhance activation with layer -vpool Enhance 'activation' experience for VDO pool to more closely match what happens for thin-pools where we do use a 'fake' LV to keep pool running even when no thinLVs are active. This gives the user a choice whether to keep the thin-pool running (without a possibly lengthy activation/deactivation process). As we plan to support multiple VDO LVs mapped into a single VDO, we want to give the user the same experience and 'use-pattern' as with thin-pools. This patch gives option to activate VDO pool only without activating VDO LV. Also due to 'fake' layering LV we can protect usage of VDO pool from commands like 'mkfs' which require exclusive access to the volume, which is no longer possible. Note: VDO pool contains 1024 initial sectors as 'empty' header - such header is also exposed in layered LV (as read-only LV). For blkid we are identified as LV with UUID suffix - thus private DM device of lvm2 - so we do not need to store any extra info in this header space (aka zero is good enough).	2019-09-17 13:17:19 +02:00
Zdenek Kabelac	7612c21f55	lvconvert: improve validation thin and cache pool conversion Limit convertible LVs to thin-pool and cache-pools. Also fix return code on internal error path to return ECMD_FAILED.	2019-09-17 13:13:49 +02:00
David Teigland	3e5e7fd6c9	pvscan: allow use of noudevsync option When pvscan is used to activate a VG via an asynchronous service (i.e. lvm2-pvscan), there is no requirement that the command wait for udev to create device nodes before returning. It's possible that waiting for udev is slow enough to cause the service running the command to time out. So, allow the --noudevsync option to be given to pvscan to skip waiting for udev. (This commit is not changing the lvm2-pvscan service itself to use --noudevsync.) Still unknown is whether there are any complex LV activation cases in which lvm itself requires access to a device node, in which case the udev wait could be needed by lvm itself. (When running an activation command directly from the command line, it's generally expected that the activated LVs are ready to use when the command is finished, so lvm waits for udev to finish creating the dev nodes.)	2019-09-10 09:47:33 -05:00
Heinz Mauelshagen	aae2e872b4	lvchange: add --resync help/manual text relative to 'R' attribute Add information that --resync clears the 'R' attribute on not initially synchronized mirror/RAID LVs. Related: 1708299	2019-09-06 14:18:29 +02:00
David Teigland	25b58310e3	pvscan: avoid full scan for activation When an online PV completed a VG, the standard activation functions were used to activate the VG. These functions use a full scan of all devs. When many pvscans are run during startup and need to activate many VGs, scanning all devs from all the pvscans can take a long time. Optimize VG activation in pvscan to scan only the devs in the VG being activated. This makes use of the online file info that was used to determine the VG was complete. The downside of this approach is that pvscan activation will not detect duplicate PVs and block activation, where a normal activation command (which scans all devices) would.	2019-09-03 10:11:16 -05:00
Zdenek Kabelac	4b1dcc2eeb	lv_manip: add synchronizations New udev in rawhide seems to be 'dropping' udev rule operations for devices that are no longer existing - while this is 'probably' a bug - it's revealing moments in lvm2 that likely should not run in a single transaction and we should wait for a cookie before submitting more work. TODO: it seems more 'error' paths should always include synchronization before starting deactivating 'just activated' devices. We should probably figure out some 'automatic' solution for this instead of placing sync_local_dev_name() all over the place...	2019-08-26 15:32:19 +02:00
Zdenek Kabelac	0bdd6d6240	pvmove: add missing synchronization Between 'resume' and 'remove' we need to wait for udev to synchronize, otherwise udev may 'skip' resume event processing if the udev node is already gone.	2019-08-20 12:44:39 +02:00
David Teigland	0534cd9cd4	pvscan: disable sleeping and retrying for udev When systemd is running pvscans, udev may not be entirely initialized, so the pvscan should not sleep and retry waiting for udev info.	2019-08-16 14:41:26 -05:00
David Teigland	83261b79b5	pvscan cache: use lvmcache_label_scan instead of the lower level label_scan. The lvmcache wrapper around label_scan checks for and eliminates more duplicate devs and md components.	2019-08-16 13:26:12 -05:00
David Teigland	677833ce6f	lvmcache: renaming functions and variables related to duplicates, no functional changes.	2019-08-16 13:26:11 -05:00
David Teigland	65bcd16be2	md component detection addition in vg_read Usually md components are eliminated in label scan and/or duplicate resolution, but they could sometimes get into the vg_read stage, where set_pv_devices compares the device to the PV. If set_pv_devices runs an md component check and finds one, vg_read should eliminate the components. In set_pv_devices, run an md component check always if the PV is smaller than the device (this is not very common.) If the PV is larger than the device, (more common), do the component check when the config setting is "auto" (the default).	2019-08-16 13:24:34 -05:00
Zdenek Kabelac	cc4a92b13c	cov: ensure cname exists before dereferencing it Just make it clear to analyzers cname can't be NULL. TODO: maybe exclude NULL at front of the function...	2019-08-09 12:57:07 +02:00
David Teigland	0404539edb	vgcreate/vgextend: restrict PVs with mixed block sizes Avoid having PVs with different logical block sizes in the same VG. This prevents LVs from having mixed block sizes, which can produce file system errors. The new config setting devices/allow_mixed_block_sizes (default 0) can be changed to 1 to return to the unrestricted mode.	2019-08-01 10:06:47 -05:00
David Teigland	7657313740	pvck: fix looping dump metadata_all dump metadata_all wouldn't quit if the metadata wrapped.	2019-07-12 14:09:06 -05:00
David Teigland	4567c6a2b2	enable full md component detection at the right time An active md device with an end superblock causes lvm to enable full md component detection. This was being done within the filter loop instead of before, so the full filtering of some devs could be missed. Also incorporate the recently added config setting that controls the md component detection.	2019-07-10 13:30:50 -05:00
David Teigland	b16abb3816	pvscan: fix PV online when device has a different size Fix commit `7836e7aa1c` "pvscan: ignore device with incorrect size" which caused pvscan to not consider a PV online (for purposes of event based activation) if the PV and device sizes differed. This helped to avoid mistaking MD components for PVs, and is replaced by triggering an md component check when PV and device sizes differ (which happens in set_pv_device).	2019-07-09 13:45:09 -05:00
Heinz Mauelshagen	1b63a219f4	lvconvert: allow --stripes/--stripesize in 'mirror' conversions This allows the creation of a striped mirror leg(s) during upconvert by adding lvconvert command line options --stripes/--stripesize for 'mirror' to tools/command-lines.in. In case multiple mirror legs are being added, all will have the same requested striped layout. Resolves: rhbz1720705	2019-07-08 19:32:17 +02:00
David Teigland	f938545687	cache: warn and prompt for writeback with cachevol The cache repair utility does not yet work with a cachevol (where metadata and data exist on the same LV.) So, warn and prompt if writeback is specified with a cachevol.	2019-07-02 11:03:03 -05:00
David Teigland	b4402bd821	exported vg handling The exported VG checking/enforcement was scattered and inconsistent. This centralizes it and makes it consistent, following the existing approach for foreign and shared VGs/PVs, which are very similar to exported VGs/PVs. The access policy that now applies to foreign/shared/exported VGs/PVs, is that if a foreign/shared/exported VG/PV is named on the command line (i.e. explicitly requested by the user), and the command is not permitted to operate on it because it is foreign/shared/exported, then an access error is reported and the command exits with an error. But, if the command is processing all VGs/PVs, and happens to come across a foreign/shared/exported VG/PV (that is not explicitly named on the command line), then the command silently skips it and does not produce an error. A command using tags or --select handles inaccessible VGs/PVs the same way as a command processing all VGs/PVs, and will not report/return errors if these inaccessible VGs/PVs exist. The new policy fixes the exit codes on a somewhat random set of commands that previously exited with an error if they were looking at all VGs/PVs and an exported VG existed on the system. There should be no change to which commands are allowed/disallowed on exported VGs/PVs. Certain LV commands (lvs/lvdisplay/lvscan) would previously not display LVs from an exported VG (for unknown reasons). This has not changed. The lvm fullreport command would previously report info about an exported VG but not about the LVs in it. This has changed to include all info from the exported VG.	2019-06-25 15:39:08 -05:00
David Teigland	d16142f90f	scanning: open devs rw when rescanning for write When vg_read rescans devices with the intention of writing the VG, the label rescan can open the devs RW so they do not need to be closed and reopened RW in dev_write_bytes.	2019-06-21 10:57:49 -05:00
David Teigland	82b137ef2f	vgchange: don't fail monitor command if vg is exported When monitoring, skip exported VGs without causing a command failure. The lvm2-monitor service runs 'vgchange --monitor y', so any exported VG on the system would cause the service to fail.	2019-06-20 15:59:36 -05:00
David Teigland	9f5e46965b	fix man page generation The man page generation for pvchange/lvchange/vgchange was incorrect (leaving out some option listings) as a result of commit `e225bf5` "fix command definition for pvchange -a"	2019-06-14 09:26:08 -05:00
David Teigland	7eaa3adedf	vgchange: change debug message level A debug message was mistakenly left visible.	2019-06-11 16:14:07 -05:00
David Teigland	4bb7d3da0e	lvmcache: remove wrapper around lvmcache_get_vgnameids This was left over from when there was an lvmetad version of the function.	2019-06-11 14:10:14 -05:00
David Teigland	0f350ba890	remove unused trustcache option	2019-06-11 11:42:49 -05:00
David Teigland	e225bf59ff	fix command definition for pvchange -a The -a was being included in the set of "one or more" options instead of an actual required option. Even though the cmd def was not implementing the restrictions correctly, the command internally was. Adjust the cmd def code which did not support a command with some real required options and a set of "one or more" options.	2019-06-10 13:43:20 -05:00
David Teigland	49b8846567	lvmcache: remove unused function Drop lvmcache_fmt_from_vgname(), the way it was called made it identical to the existing lvmcache_vginfo_from_vgname().	2019-06-10 10:38:32 -05:00
David Teigland	550536474f	vgsplit: simplify vg creation The way that this command now uses the global lock followed by a label scan, it can simply check if the new VG name exists, and if not lock it and create it.	2019-06-10 10:38:32 -05:00
David Teigland	a07cc8dbef	reset cmd wipe_outdated_pvs at the start of a command, which is needed in case the cmd struct is reused.	2019-06-10 10:34:58 -05:00
David Teigland	36cbc6db24	locking: reset global_ex flag at end of cmd These two flags may be not reset at the end of the command when the unlock is implicit, which is a problem if the cmd struct is reused. Clear the flags in the general fin_locking.	2019-06-10 10:34:58 -05:00
David Teigland	ba7ff96faf	improve reading and repairing vg metadata The fact that vg repair is implemented as a part of vg read has led to a messy and complicated implementation of vg_read, and limited and uncontrolled repair capability. This splits read and repair apart. Summary ------- - take all kinds of various repairs out of vg_read - vg_read no longer writes anything - vg_read now simply reads and returns vg metadata - vg_read ignores bad or old copies of metadata - vg_read proceeds with a single good copy of metadata - improve error checks and handling when reading - keep track of bad (corrupt) copies of metadata in lvmcache - keep track of old (seqno) copies of metadata in lvmcache - keep track of outdated PVs in lvmcache - vg_write will do basic repairs - new command vgck --updatemetadata will do all repairs Details ------- - In scan, do not delete dev from lvmcache if reading/processing fails; the dev is still present, and removing it makes it look like the dev is not there. Records are now kept about the problems with each PV so they can be fixed/repaired in the appropriate places. - In scan, record a bad mda on failure, and delete the mda from mda in use list so it will not be used by vg_read or vg_write, only by repair. - In scan, succeed if any good mda on a device is found, instead of failing if any is bad. The bad/old copies of metadata should not interfere with normal usage while good copies can be used. - In scan, add a record of old mdas in lvmcache for later, do not repair them while reading, and do not let them prevent us from finding and using a good copy of metadata from elsewhere. One result is that "inconsistent metadata" is no longer a read error, but instead a record in lvmcache that can be addressed separate from the read. - Treat a dev with no good mdas like a dev with no mdas, which is an existing case we already handle.
- Don't use a fake vg "handle" for returning an error from vg_read, or the vg_read_error function for getting that error number; just return null if the vg cannot be read or used, and an error_flags arg with flags set for the specific kind of error (which can be used later for determining the kind of repair.) - Saving an original copy of the vg metadata, for purposes of reverting a write, is now done explicitly in vg_read instead of being hidden in the vg_make_handle function. - When a vg is not accessible due to "access restrictions" but is otherwise fine, return the vg through the new error_vg arg so that process_each_pv can skip the PVs in the VG while processing. (This is a temporary accommodation for the way process_each_pv tracks which devs have been looked at, and can be dropped later when process_each_pv implementation dev tracking is changed.) - vg_read does not try to fix or recover a vg, but now just reads the metadata, checks access restrictions and returns it. (Checking access restrictions might be better done outside of vg_read, but this is a later improvement.) - _vg_read now simply makes one attempt to read metadata from each mda, and uses the most recent copy to return to the caller in the form of a 'vg' struct. (bad mdas were excluded during the scan and are not retried) (old mdas were not excluded during scan and are retried here) - vg_read uses _vg_read to get the latest copy of metadata from mdas, and then makes various checks against it to produce warnings, and to check if VG access is allowed (access restrictions include: writable, foreign, shared, clustered, missing pvs). - Things that were previously silently/automatically written by vg_read that are now done by vg_write, based on the records made in lvmcache during the scan and read: . clearing the missing flag . updating old copies of metadata . clearing outdated pvs . updating pv header flags - Bad/corrupt metadata are now repaired; they were not before.
Test changes ------------ - A read command no longer writes the VG to repair it, so add a write command to do a repair. (inconsistent-metadata, unlost-pv) - When a missing PV is removed from a VG, and then the device is enabled again, vgck --updatemetadata is needed to clear the outdated PV before it can be used again, where it wasn't before. (lvconvert-repair-policy, lvconvert-repair-raid, lvconvert-repair, mirror-vgreduce-removemissing, pv-ext-flags, unlost-pv) Reading bad/old metadata ------------------------ - "bad metadata": the mda_header or metadata text has invalid fields or can't be parsed by lvm. This is a form of corruption that would not be caused by known failure scenarios. A checksum error is typically included among the errors reported. - "old metadata": a valid copy of the metadata that has a smaller seqno than other copies of the metadata. This can happen if the device failed, or io failed, or lvm failed while committing new metadata to all the metadata areas. Old metadata on a PV that has been removed from the VG is the "outdated" case below. When a VG has some PVs with bad/old metadata, lvm can simply ignore the bad/old copies, and use a good copy. This is why there are multiple copies of the metadata -- so it's available even when some of the copies cannot be used. The bad/old copies do not have to be repaired before the VG can be used (the repair can happen later.) A PV with no good copies of the metadata simply falls back to being treated like a PV with no mdas; a common and harmless configuration. When bad/old metadata exists, lvm warns the user about it, and suggests repairing it using a new metadata repair command. Bad metadata in particular is something that users will want to investigate and repair themselves, since it should not happen and may indicate some other problem that needs to be fixed. PVs with bad/old metadata are not the same as missing devices.
Missing devices will block various kinds of VG modification or activation, but bad/old metadata will not. Previously, lvm would attempt to repair bad/old metadata whenever it was read. This was unnecessary since lvm does not require every copy of the metadata to be used. It would also hide potential problems that should be investigated by the user. It was also dangerous in cases where the VG was on shared storage. The user is now allowed to investigate potential problems and decide how and when to repair them. Repairing bad/old metadata -------------------------- When label scan sees bad metadata in an mda, that mda is removed from the lvmcache info->mdas list. This means that vg_read will skip it, and not attempt to read/process it again. If it was the only in-use mda on a PV, that PV is treated like a PV with no mdas. It also means that vg_write will also skip the bad mda, and not attempt to write new metadata to it. The only way to repair bad metadata is with the metadata repair command. When label scan sees old metadata in an mda, that mda is kept in the lvmcache info->mdas list. This means that vg_read will read/process it again, and likely see the same mismatch with the other copies of the metadata. Like the label_scan, the vg_read will simply ignore the old copy of the metadata and use the latest copy. If the command is modifying the vg (e.g. lvcreate), then vg_write, which writes new metadata to every mda on info->mdas, will write the new metadata to the mda that had the old version. If successful, this will resolve the old metadata problem (without needing to run a metadata repair command.) Outdated PVs ------------ An outdated PV is a PV that has an old copy of VG metadata that shows it is a member of the VG, but the latest copy of the VG metadata does not include this PV. This happens if the PV is disconnected, vgreduce --removemissing is run to remove the PV from the VG, then the PV is reconnected. 
In this case, the outdated PV needs to have its outdated metadata removed and the PV used flag needs to be cleared. This repair will be done by the subsequent repair command. It is also done if vgremove is run on the VG. MISSING PVs ----------- When a device is missing, most commands will refuse to modify the VG. This is the simple case. More complicated is when a command is allowed to modify the VG while it is missing a device. When a VG is written while a device is missing for one of its PVs, the VG metadata is written to disk with the MISSING flag on the PV with the missing device. When the VG is next used, it is treated as if the PV with the MISSING flag still has a missing device, even if that device has reappeared. If all LVs that were using a PV with the MISSING flag are removed or repaired so that the MISSING PV is no longer used, then the next time the VG metadata is written, the MISSING flag will be dropped. Alternative methods of clearing the MISSING flag are: vgreduce --removemissing will remove PVs with missing devices, or PVs with the MISSING flag where the device has reappeared. vgextend --restoremissing will clear the MISSING flag on PVs where the device has reappeared, allowing the VG to be used normally. This must be done with caution since the reappeared device may have old data that is inconsistent with data on other PVs. Bad mda repair -------------- The new command: vgck --updatemetadata VG first uses vg_write to repair old metadata, and other basic issues mentioned above (old metadata, outdated PVs, pv_header flags, MISSING_PV flags). It will also go further and repair bad metadata: . text metadata that has a bad checksum . text metadata that is not parsable . corrupt mda_header checksum and version fields (To keep a clean diff, #if 0 is added around functions that are replaced by new code. These commented functions are removed by the following commit.)	2019-06-07 15:54:04 -05:00
David Teigland	5dd32680b0	vgcfgbackup add error messages	2019-06-07 15:54:04 -05:00
David Teigland	47effdc025	vgck --updatemetadata is a new command that uses vg_write to correct more common or less severe issues, and also adds the ability to repair some metadata corruption that couldn't be handled previously.	2019-06-07 15:54:04 -05:00
David Teigland	89914a541f	process_each_pv handle outdated pvs process_each_pv should account for outdated pvs in the list of all devices it is processing.	2019-06-07 15:54:04 -05:00
David Teigland	ab61a6d85d	move wipe_outdated_pvs to vg_write and implement it based on a device, not based on a pv struct (which is not available when the device is not a part of the vg). Currently only the vgremove command wipes outdated pvs, until more advanced recovery is added in a subsequent commit.	2019-06-07 15:54:04 -05:00
David Teigland	db98a6e362	Additional MD component checking If udev info is missing for a device, (which would indicate if it's an MD component), then do an end-of-device read to check if a PV is an MD component. (This is skipped when using hints since we already know devs in hints are good.) A new config setting md_component_checks can be used to disable the additional end-of-device MD checks, or to always enable end-of-device MD checks. When both hints and udev info are disabled/unavailable, the end of PVs will now be scanned by default. If md devices with end-of-device superblocks are not being used, the extra I/O overhead can be avoided by setting md_component_checks="start".	2019-06-07 13:27:16 -05:00
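The decision described in the commit above, whether to do the extra end-of-device read, can be sketched as a small function. This is an illustrative Python model, not lvm2 code. Only the value "start" appears in the commit message; the names "auto" (for the default behavior) and "full" (for always checking) are assumptions made for this sketch, as is the function name.

```python
def needs_end_of_device_md_check(setting: str, using_hints: bool,
                                 have_udev_info: bool) -> bool:
    """Decide whether to read the end of a device to look for an MD superblock.

    setting is a hypothetical stand-in for md_component_checks:
      "start" - never do the additional end-of-device check
      "full"  - always do it (assumed name)
      "auto"  - do it only when nothing else vouches for the device (assumed name)
    """
    if setting == "start":
        return False
    if setting == "full":
        return True
    if setting == "auto":
        if using_hints:
            # Devices listed in hints are already known to be good.
            return False
        # Without udev info there is no other indication of an MD component.
        return not have_udev_info
    raise ValueError(f"unknown md_component_checks value: {setting!r}")
```

With both hints and udev info unavailable, the default case returns `True`, which corresponds to the end of PVs being scanned by default.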
David Teigland	2b241eb1f6	pvck: use new dump routines for old output Use the recently added dump routines to produce the old/traditional pvck output, and remove the code that had been used for that. The validation/checking done by the new routines means that new lines prefixed with CHECK are printed for incorrect values.	2019-06-05 16:28:52 -05:00
David Teigland	bada89a224	pvck: dump metadata_all This searches the entire metadata area for any copy of the metadata and dumps it to file.	2019-06-05 12:25:34 -05:00
David Teigland	d18e491f68	pvck: dump headers and metadata Add 'pvck --dump headers' to print all the lvm ondisk structs. Also checks the values and prints any problems. The previous dump metadata is also converted to use these same routines, which do not depend on lvm fully scanning/reading/processing the headers and metadata on disk. This makes it useful to get data in cases where there is corruption that would otherwise prevent the normal functions from working.	2019-06-03 15:13:32 -05:00
David Teigland	645dd27604	separate code for setting devices from metadata parsing Pull the code that sets devs for PVs out of the metadata parsing code and call it separately.	2019-05-23 11:57:38 -05:00
David Teigland	52586b1039	pvck: new dump option to extract metadata The new command 'pvck --dump metadata PV' will extract the current version of VG metadata from a PV for testing and debugging. --dump metadata_area extracts the entire text metadata area.	2019-05-23 11:49:06 -05:00
David Teigland	6422b9ddc5	move the setting of use_full_md_check flag from each command to one location in command init. No functional change.	2019-05-21 11:51:58 -05:00
David Teigland	9f561f2206	pvscan: fix segfault in recent commit commit `aa75b31db5` "pvscan: handle case of scanning PV without metadata last" failed to recognize that an arg may be null in the case of 'pvscan --cache' (without -aay) which does not keep track of complete VGs because it does not need to activate them.	2019-05-03 16:51:34 -05:00
David Teigland	3405ead1e0	pvs: remove unnecessary label scan The scanning rework missed removing this instance of label scan. It's no longer needed because of the way that label scan is always run once from the start of the command. This unnecessary scan would be triggered by running 'pvs @tag'.	2019-05-03 16:16:29 -05:00
David Teigland	1e9e21a171	pvscan: don't record PV online after error reading metadata	2019-05-03 14:39:42 -05:00
Zdenek Kabelac	3c70ae1803	clean: avoid cleaning iterator on error path Return error directly instead of using 'out' code path.	2019-05-03 13:17:22 +02:00
David Teigland	d7054cd28a	vgcreate: remove the lvmcache locking workaround Recent cleanups and simplifications to lvmcache and locking mean that the odd locking to workaround other issues is now unnecessary.	2019-04-30 14:26:16 -05:00
David Teigland	366c1ac15b	pvcreate: call label scan prior to pvcreate_each_device and don't call it from inside pvcreate_each_device. This avoids having to repeat it for users of pvcreate_each_device (pvcreate/pvremove/vgcreate/vgextend.)	2019-04-30 14:10:27 -05:00
David Teigland	6d0f09f478	pvscan: remove fixme comment that is fixed Remove the fixme comment describing the case that was fixed by `aa75b31db5` "pvscan: handle case of scanning PV without metadata last"	2019-04-29 15:44:57 -05:00
David Teigland	c3e385c108	hints: skip hint flock if nolocking option is set	2019-04-29 13:01:15 -05:00
David Teigland	a519be8d4b	remove retry for missed PVs in process_each_pv This is no longer needed with the change to orphan and global locks.	2019-04-29 13:01:15 -05:00
David Teigland	8c87dda195	locking: unify global lock for flock and lockd There have been two file locks used to protect lvm "global state": "ORPHANS" and "GLOBAL". Commands that used the ORPHAN flock in exclusive mode: pvcreate, pvremove, vgcreate, vgextend, vgremove, vgcfgrestore Commands that used the ORPHAN flock in shared mode: vgimportclone, pvs, pvscan, pvresize, pvmove, pvdisplay, pvchange, fullreport Commands that used the GLOBAL flock in exclusive mode: pvchange, pvscan, vgimportclone, vgscan Commands that used the GLOBAL flock in shared mode: pvscan --cache, pvs The ORPHAN lock covers the important cases of serializing the use of orphan PVs. It also partially covers the reporting of orphan PVs (although not correctly as explained below.) The GLOBAL lock doesn't seem to have a clear purpose (it may have eroded over time.) Neither lock correctly protects the VG namespace, or orphan PV properties. To simplify and correct these issues, the two separate flocks are combined into the one GLOBAL flock, and this flock is used from the locking sites that are in place for the lvmlockd global lock. The logic behind the lvmlockd (distributed) global lock is that any command that changes "global state" needs to take the global lock in ex mode. Global state in lvm is: the list of VG names, the set of orphan PVs, and any properties of orphan PVs. Reading this global state can use the global lock in sh mode to ensure it doesn't change while being reported. The locking of global state now looks like: lockd_global(), previously named lockd_gl(), acquires the distributed global lock through lvmlockd. This is unchanged. It serializes distributed lvm commands that are changing global state. This is a no-op when lvmlockd is not in use. lockf_global() acquires an flock on a local file. It serializes local lvm commands that are changing global state. 
lock_global() first calls lockf_global() to acquire the local flock for global state, and if this succeeds, it calls lockd_global() to acquire the distributed lock for global state. Replace instances of lockd_gl() with lock_global(), so that the existing sites for lvmlockd global state locking are now also used for local file locking of global state. Remove the previous file locking calls lock_vol(GLOBAL) and lock_vol(ORPHAN). The following commands which change global state are now serialized with the exclusive global flock: pvchange (of orphan), pvresize (of orphan), pvcreate, pvremove, vgcreate, vgextend, vgremove, vgreduce, vgrename, vgcfgrestore, vgimportclone, vgmerge, vgsplit Commands that use a shared flock to read global state (and will be serialized against the prior list) are those that use process_each functions that are based on processing a list of all VG names, or all PVs. The list of all VGs or all PVs is global state and the shared lock prevents those lists from changing while the command is processing them. The ORPHAN lock previously attempted to produce an accurate listing of orphan PVs, but it was only acquired at the end of the command during the fake vg_read of the fake orphan vg. This is not when orphan PVs were determined; they were determined by elimination beforehand by processing all real VGs, and subtracting the PVs in the real VGs from the list of all PVs that had been identified during the initial scan. This is fixed by holding the single global lock in shared mode while processing all VGs to determine the list of orphan PVs.	2019-04-29 13:01:05 -05:00
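The layering of the three functions named in the commit above can be sketched as follows. This is a minimal Python model under stated assumptions, not the lvm2 implementation: the function names are borrowed from the commit message, but the signatures, the lock file path argument, the `use_lvmlockd` switch and `unlock_global()` are hypothetical, and the distributed lock is reduced to a no-op. It relies on `fcntl.flock`, so it runs on Unix-like systems only.

```python
import fcntl
import os


def lockf_global(path, exclusive, nonblocking=False):
    """Take a local flock on path; return the open fd, or None on failure."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    op = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    if nonblocking:
        op |= fcntl.LOCK_NB
    try:
        fcntl.flock(fd, op)
    except OSError:
        # With LOCK_NB this is raised when another holder conflicts.
        os.close(fd)
        return None
    return fd


def lockd_global(exclusive, use_lvmlockd=False):
    """Stand-in for the distributed lock; a no-op when lvmlockd is not in use."""
    if not use_lvmlockd:
        return True
    raise NotImplementedError("distributed locking is outside this sketch")


def unlock_global(fd):
    """Release the local flock and close the descriptor."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)


def lock_global(path, exclusive, use_lvmlockd=False):
    """Local flock first; only if that succeeds, take the distributed lock."""
    fd = lockf_global(path, exclusive)
    if fd is None:
        return None
    if not lockd_global(exclusive, use_lvmlockd):
        unlock_global(fd)
        return None
    return fd
```

A command that changes global state would call `lock_global(path, exclusive=True)`, while a reporting command that walks all VGs or all PVs would pass `exclusive=False`, so several readers can proceed together but are serialized against writers.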
David Teigland	aa75b31db5	pvscan: handle case of scanning PV without metadata last Handle the case where pvscan --cache -aay (with no dev args) gets to the final PV, completing the VG, but that final PV does not have VG metadata. In this case, we need to use VG metadata from a previously scanned PV in the same VG, which we saved for this possibility. Using this saved metadata, we can find which VG this PVID belongs to, and then check if that VG is now complete, and if so add the VG name to the list of complete VGs to be autoactivated.	2019-04-15 11:27:49 -05:00
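The case handled by the commit above, where the PV that completes a VG carries no metadata itself, can be modeled in a few lines. This is a hedged Python sketch rather than the pvscan code: the function name, the `(pvid, metadata)` input shape and the dictionary layout with `name` and `pvids` keys are all assumptions made for illustration.

```python
def find_complete_vgs(scanned_pvs):
    """Return names of VGs that become complete while scanning PVs in order.

    scanned_pvs: iterable of (pvid, metadata) pairs, where metadata is either
    None (the PV holds no VG metadata) or a dict of the hypothetical form
    {"name": <vg name>, "pvids": <set of PVIDs in the VG>}.
    """
    online = set()   # PVIDs seen so far
    saved = {}       # VG name -> metadata saved from an earlier PV
    complete = []    # VG names to autoactivate, in completion order

    for pvid, metadata in scanned_pvs:
        online.add(pvid)
        if metadata is not None:
            saved[metadata["name"]] = metadata
        else:
            # No metadata on this PV: look up its VG in the saved metadata.
            metadata = next(
                (m for m in saved.values() if pvid in m["pvids"]), None)
            if metadata is None:
                continue  # VG not known yet; a later PV may supply it
        if metadata["pvids"] <= online and metadata["name"] not in complete:
            complete.append(metadata["name"])
    return complete
```

For example, scanning a PV with metadata for a two-PV VG followed by the second PV without metadata yields that VG as complete, because the saved metadata identifies which VG the second PVID belongs to.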
David Teigland	41ba2b568b	tests: disable unworking pvscan case and add corresponding fixme in the code	2019-04-12 15:40:38 -05:00
David Teigland	7836e7aa1c	pvscan: ignore device with incorrect size If a device looks like a PV, but its size does not match the PV size in the metadata, then skip it for purposes of autoactivation. It's probably the wrong device for the PV.	2019-04-05 16:44:00 -05:00
David Teigland	6f18186bfd	pvscan: print more reasons for ignoring devices	2019-04-05 15:48:12 -05:00
David Teigland	f58a70c168	pvscan: don't print warning about lvmlockd not running pvscan --cache ignores shared VGs, so it doesn't need to consider lvmlockd, and shouldn't include a warning about it.	2019-04-05 14:04:42 -05:00
David Teigland	0ba316f102	pvscan: remove initialization case In the past, the first 'pvscan --cache -aay dev' command to run on the system would initialize the pvs_online dir by scanning all devs and creating online files for all pvs it found, and then autoactivating the VG (if complete) for the named dev. The idea was that the system may not have been able to run pvscan commands for early devices, so the first pvscan to run would need to "make up" for any devices that had appeared previously, which the system was unable to scan. The problem or idea of making up for missed scans is historical and should no longer be needed, so remove this special init case.	2019-04-05 14:04:02 -05:00
David Teigland	6b89c0d4b7	pvscan: for init only autoactivate vg for named dev When pvscan is run for the initialization case (the first pvscan run on the system), it scans all devs and creates online files for all PVs it finds. Previously it would then autoactivate every complete VG, but change this to only autoactivate the (complete) VG corresponding to the named device arg(s).	2019-04-05 12:46:39 -05:00

3870 Commits