shaba/lvm2

mirror of git://sourceware.org/git/lvm2.git synced 2024-12-22 17:35:59 +03:00

Author	SHA1	Message	Date
David Teigland	87ee401eea	md component detection changes Move extra md component detection into the label scan phase. It had been in set_pv_devices which was deep within the vg_read phase, which wasn't a good place (better to detect that earlier.) Now that pv metadata info is available in the scan phase, the pv details (size and device_hint) can be used for extra md checking. Use the device_hint from the pv metadata to trigger a full md component check if the device_hint begins with /dev/md. Stop triggering full md component checks based on missing udev info for a dev. Changes to tests to reflect that the code is now detecting md components in some test cases where it wasn't before.	2021-02-05 16:23:51 -06:00
David Teigland	74ad2cd76f	metadata: add vg_from_config_tree Add cmd/fmt args to import functions so that they can be used without the fid arg which.	2019-11-27 11:13:47 -06:00
David Teigland	65bcd16be2	md component detection addition in vg_read Usually md components are eliminated in label scan and/or duplicate resolution, but they could sometimes get into the vg_read stage, where set_pv_devices compares the device to the PV. If set_pv_devices runs an md component check and finds one, vg_read should eliminate the components. In set_pv_devices, run an md component check always if the PV is smaller than the device (this is not very common.) If the PV is larger than the device, (more common), do the component check when the config setting is "auto" (the default).	2019-08-16 13:24:34 -05:00
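The size-based rule in the commit above can be sketched as a small predicate. This is only an illustration under stated assumptions, not the lvm2 code: the function name, the enum and its values are hypothetical, and only the "auto" setting named in the message is modelled. The message does not say what happens when the sizes are equal, so the sketch skips the check in that case.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the md component detection config setting. */
enum md_check_setting { MD_CHECK_OTHER, MD_CHECK_AUTO };

/*
 * Decide whether to run an md component check for a PV/device pair,
 * following the rule in the commit message:
 *  - PV smaller than the device: always check
 *  - PV larger than the device: check only when the setting is "auto"
 * Equal sizes are not covered by the message; this sketch skips the check.
 */
static bool should_check_md_component(uint64_t pv_size, uint64_t dev_size,
                                      enum md_check_setting setting)
{
	if (pv_size < dev_size)
		return true;
	if (pv_size > dev_size)
		return setting == MD_CHECK_AUTO;
	return false;
}
```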
David Teigland	ba7ff96faf	improve reading and repairing vg metadata The fact that vg repair is implemented as a part of vg read has led to a messy and complicated implementation of vg_read, and limited and uncontrolled repair capability. This splits read and repair apart. Summary ------- - take all kinds of various repairs out of vg_read - vg_read no longer writes anything - vg_read now simply reads and returns vg metadata - vg_read ignores bad or old copies of metadata - vg_read proceeds with a single good copy of metadata - improve error checks and handling when reading - keep track of bad (corrupt) copies of metadata in lvmcache - keep track of old (seqno) copies of metadata in lvmcache - keep track of outdated PVs in lvmcache - vg_write will do basic repairs - new command vgck --updatemetadata will do all repairs Details ------- - In scan, do not delete dev from lvmcache if reading/processing fails; the dev is still present, and removing it makes it look like the dev is not there. Records are now kept about the problems with each PV so they can be fixed/repaired in the appropriate places. - In scan, record a bad mda on failure, and delete the mda from mda in use list so it will not be used by vg_read or vg_write, only by repair. - In scan, succeed if any good mda on a device is found, instead of failing if any is bad. The bad/old copies of metadata should not interfere with normal usage while good copies can be used. - In scan, add a record of old mdas in lvmcache for later, do not repair them while reading, and do not let them prevent us from finding and using a good copy of metadata from elsewhere. One result is that "inconsistent metadata" is no longer a read error, but instead a record in lvmcache that can be addressed separate from the read. - Treat a dev with no good mdas like a dev with no mdas, which is an existing case we already handle.
- Don't use a fake vg "handle" for returning an error from vg_read, or the vg_read_error function for getting that error number; just return null if the vg cannot be read or used, and an error_flags arg with flags set for the specific kind of error (which can be used later for determining the kind of repair.) - Saving an original copy of the vg metadata, for purposes of reverting a write, is now done explicitly in vg_read instead of being hidden in the vg_make_handle function. - When a vg is not accessible due to "access restrictions" but is otherwise fine, return the vg through the new error_vg arg so that process_each_pv can skip the PVs in the VG while processing. (This is a temporary accommodation for the way process_each_pv tracks which devs have been looked at, and can be dropped later when process_each_pv implementation dev tracking is changed.) - vg_read does not try to fix or recover a vg, but now just reads the metadata, checks access restrictions and returns it. (Checking access restrictions might be better done outside of vg_read, but this is a later improvement.) - _vg_read now simply makes one attempt to read metadata from each mda, and uses the most recent copy to return to the caller in the form of a 'vg' struct. (bad mdas were excluded during the scan and are not retried) (old mdas were not excluded during scan and are retried here) - vg_read uses _vg_read to get the latest copy of metadata from mdas, and then makes various checks against it to produce warnings, and to check if VG access is allowed (access restrictions include: writable, foreign, shared, clustered, missing pvs). - Things that were previously silently/automatically written by vg_read that are now done by vg_write, based on the records made in lvmcache during the scan and read: . clearing the missing flag . updating old copies of metadata . clearing outdated pvs . updating pv header flags - Bad/corrupt metadata are now repaired; they were not before.
Test changes ------------ - A read command no longer writes the VG to repair it, so add a write command to do a repair. (inconsistent-metadata, unlost-pv) - When a missing PV is removed from a VG, and then the device is enabled again, vgck --updatemetadata is needed to clear the outdated PV before it can be used again, where it wasn't before. (lvconvert-repair-policy, lvconvert-repair-raid, lvconvert-repair, mirror-vgreduce-removemissing, pv-ext-flags, unlost-pv) Reading bad/old metadata ------------------------ - "bad metadata": the mda_header or metadata text has invalid fields or can't be parsed by lvm. This is a form of corruption that would not be caused by known failure scenarios. A checksum error is typically included among the errors reported. - "old metadata": a valid copy of the metadata that has a smaller seqno than other copies of the metadata. This can happen if the device failed, or io failed, or lvm failed while committing new metadata to all the metadata areas. Old metadata on a PV that has been removed from the VG is the "outdated" case below. When a VG has some PVs with bad/old metadata, lvm can simply ignore the bad/old copies, and use a good copy. This is why there are multiple copies of the metadata -- so it's available even when some of the copies cannot be used. The bad/old copies do not have to be repaired before the VG can be used (the repair can happen later.) A PV with no good copies of the metadata simply falls back to being treated like a PV with no mdas; a common and harmless configuration. When bad/old metadata exists, lvm warns the user about it, and suggests repairing it using a new metadata repair command. Bad metadata in particular is something that users will want to investigate and repair themselves, since it should not happen and may indicate some other problem that needs to be fixed. PVs with bad/old metadata are not the same as missing devices.
Missing devices will block various kinds of VG modification or activation, but bad/old metadata will not. Previously, lvm would attempt to repair bad/old metadata whenever it was read. This was unnecessary since lvm does not require every copy of the metadata to be used. It would also hide potential problems that should be investigated by the user. It was also dangerous in cases where the VG was on shared storage. The user is now allowed to investigate potential problems and decide how and when to repair them. Repairing bad/old metadata -------------------------- When label scan sees bad metadata in an mda, that mda is removed from the lvmcache info->mdas list. This means that vg_read will skip it, and not attempt to read/process it again. If it was the only in-use mda on a PV, that PV is treated like a PV with no mdas. It also means that vg_write will also skip the bad mda, and not attempt to write new metadata to it. The only way to repair bad metadata is with the metadata repair command. When label scan sees old metadata in an mda, that mda is kept in the lvmcache info->mdas list. This means that vg_read will read/process it again, and likely see the same mismatch with the other copies of the metadata. Like the label_scan, the vg_read will simply ignore the old copy of the metadata and use the latest copy. If the command is modifying the vg (e.g. lvcreate), then vg_write, which writes new metadata to every mda on info->mdas, will write the new metadata to the mda that had the old version. If successful, this will resolve the old metadata problem (without needing to run a metadata repair command.) Outdated PVs ------------ An outdated PV is a PV that has an old copy of VG metadata that shows it is a member of the VG, but the latest copy of the VG metadata does not include this PV. This happens if the PV is disconnected, vgreduce --removemissing is run to remove the PV from the VG, then the PV is reconnected. 
In this case, the outdated PV needs to have its outdated metadata removed and the PV used flag needs to be cleared. This repair will be done by the subsequent repair command. It is also done if vgremove is run on the VG. MISSING PVs ----------- When a device is missing, most commands will refuse to modify the VG. This is the simple case. More complicated is when a command is allowed to modify the VG while it is missing a device. When a VG is written while a device is missing for one of its PVs, the VG metadata is written to disk with the MISSING flag on the PV with the missing device. When the VG is next used, it is treated as if the PV with the MISSING flag still has a missing device, even if that device has reappeared. If all LVs that were using a PV with the MISSING flag are removed or repaired so that the MISSING PV is no longer used, then the next time the VG metadata is written, the MISSING flag will be dropped. Alternative methods of clearing the MISSING flag are: vgreduce --removemissing will remove PVs with missing devices, or PVs with the MISSING flag where the device has reappeared. vgextend --restoremissing will clear the MISSING flag on PVs where the device has reappeared, allowing the VG to be used normally. This must be done with caution since the reappeared device may have old data that is inconsistent with data on other PVs. Bad mda repair -------------- The new command: vgck --updatemetadata VG first uses vg_write to repair old metadata, and other basic issues mentioned above (old metadata, outdated PVs, pv_header flags, MISSING_PV flags). It will also go further and repair bad metadata: . text metadata that has a bad checksum . text metadata that is not parsable . corrupt mda_header checksum and version fields (To keep a clean diff, #if 0 is added around functions that are replaced by new code. These commented functions are removed by the following commit.)	2019-06-07 15:54:04 -05:00
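The read-side policy described in the commit above (skip bad copies, use the copy with the highest seqno, leave older copies for a later write to update) might look roughly like the following. This is a minimal sketch under assumptions: `struct mda_copy` and `pick_latest_copy` are hypothetical names and are not the lvmcache or vg_read data structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-mda record; not the lvmcache data structure. */
struct mda_copy {
	int bad;          /* nonzero if the copy could not be parsed */
	uint32_t seqno;   /* metadata sequence number (valid if !bad) */
	int old;          /* set by pick_latest_copy() for stale copies */
};

/*
 * Return the index of the good copy with the highest seqno, or -1 if
 * no good copy exists. Bad copies are skipped; good copies with a
 * smaller seqno are flagged as old so a later write can update them.
 */
static int pick_latest_copy(struct mda_copy *copies, size_t n)
{
	int best = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (copies[i].bad)
			continue;
		if (best < 0 || copies[i].seqno > copies[best].seqno)
			best = (int)i;
	}
	if (best < 0)
		return -1;
	for (i = 0; i < n; i++)
		copies[i].old = !copies[i].bad &&
				copies[i].seqno < copies[best].seqno;
	return best;
}
```

Nothing is written or repaired in this function, which mirrors the split the commit describes: reading only records what it found, and repair happens elsewhere.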
David Teigland	645dd27604	separate code for setting devices from metadata parsing Pull the code that sets devs for PVs out of the metadata parsing code and call it separately.	2019-05-23 11:57:38 -05:00
David Teigland	117160b27e	Remove lvmetad Native disk scanning is now both reduced and async/parallel, which makes it comparable in performance (and often faster) when compared to lvm using lvmetad. Autoactivation now uses local temp files to record online PVs, and no longer requires lvmetad. There should be no apparent command-level change in behavior.	2018-07-11 11:26:42 -05:00
Joe Thornber	7f97c7ea9a	build: Don't generate symlinks in include/ dir As we start refactoring the code to break dependencies (see doc/refactoring.txt), I want us to use full paths in the includes (eg, #include "base/data-struct/list.h"). This makes it more obvious when we're breaking abstraction boundaries, eg, including a file in metadata/ from base/	2018-05-14 10:30:20 +01:00
David Teigland	a7cb76ae94	scan: use bcache for label scan and vg read New label_scan function populates bcache for each device on the system. The two read paths are updated to get data from bcache. The bcache is not yet used for writing. bcache blocks for a device are invalidated when the device is written.	2018-04-20 11:19:24 -05:00
Joe Thornber	00f1b208a1	[io paths] Unpick agk's aio stuff	2018-04-20 11:03:58 -05:00
Alasdair G Kergon	d6cabbbc53	device: Fix basic async I/O error handling	2018-02-08 20:19:21 +00:00
Alasdair G Kergon	9194610f42	device: Add ioflags parameter to transfer additional state. Flags are set on the initial I/O and passed to any callbacks that may in turn issue further I/O using the inherited flags.	2018-01-21 21:10:23 +00:00
Alasdair G Kergon	6210c1ec28	device: Mark read-only device buffers const.	2018-01-10 19:57:10 +00:00
Alasdair G Kergon	f4675af4cf	format_text: Use vgsummary callbacks	2018-01-09 03:14:30 +00:00
Alasdair G Kergon	5e7d3ad749	device: Introduce dev_read_callback If it obtains the data, it passes it into the supplied callback function and returns 1. Otherwise the callback receives failed = 1. Updated config_file_read_fd to use this and similarly return the data via a callback fn of its own.	2018-01-06 02:40:12 +00:00
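The callback convention described in the commit above can be illustrated with a self-contained sketch. Everything here is hypothetical: `read_with_callback`, `struct fake_dev` and the callback signature are illustrative names, not the dev_read_callback API. The message states only that success returns 1, so returning 0 on failure is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: 'failed' is nonzero when no data was read. */
typedef void (*read_done_fn)(int failed, void *context, const char *data);

/* Hypothetical in-memory "device" so the sketch is self-contained. */
struct fake_dev {
	const char *contents;   /* NULL simulates a read failure */
};

/*
 * On success, hand the data to the callback and return 1.
 * On failure, still invoke the callback, with failed = 1, and
 * return 0 (the failure return value is an assumption).
 */
static int read_with_callback(const struct fake_dev *dev,
                              read_done_fn done, void *context)
{
	if (!dev->contents) {
		done(1, context, NULL);
		return 0;
	}
	done(0, context, dev->contents);
	return 1;
}

/* Example callback: records the outcome in a caller-supplied struct. */
struct read_result {
	int failed;
	size_t len;
};

static void record_result(int failed, void *context, const char *data)
{
	struct read_result *res = context;

	res->failed = failed;
	res->len = failed ? 0 : strlen(data);
}
```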
Alasdair G Kergon	946f07af3e	metadata: Use a consistent format for callback fn parameters	2018-01-05 14:24:56 +00:00
Alasdair G Kergon	a0ddfad94b	metadata: Change the new data processing fns to void. Move the existing fn return codes into the new structs.	2018-01-05 03:12:22 +00:00
Alasdair G Kergon	5a846e0929	format_text: Split the text import fns into two pieces.	2018-01-03 20:48:02 +00:00
Alasdair G Kergon	22b6c482ec	config: Split config buffer processing into new fn. Wrap its parameters into struct process_config_file_params allocated from a mempool now passed into the config_file_read* fns.	2018-01-02 21:10:46 +00:00
Alasdair G Kergon	d591d04103	device: Tag I/O for each mda on a device separately in log messages. Mark the first metadata area on each text format PV as MDA_PRIMARY. Pass this information down to the device layer so that when there are two metadata areas on a block device, we can easily distinguish two independent streams of I/O.	2017-12-07 03:48:11 +00:00
Peter Rajnoha	e40fbd08c8	config: parse config tree without dup node checking if it's metadata tree	2016-09-21 18:16:05 +02:00
David Teigland	147c9c01a2	rename function read_vgname to read_vgsummary The name did not clearly represent what it does.	2016-04-11 13:07:48 -05:00
Zdenek Kabelac	fcbef05aae	doc: change fsf address Hmm rpmlint suggests fsf is using a different address these days, so let's keep it up-to-date	2016-01-21 12:11:37 +01:00
Petr Rockai	c78b6f18d4	metadata: Reject lvmetad metadata extensions when reading from disk.	2015-06-10 16:25:57 +02:00
Alasdair G Kergon	6407d184d1	cache: Store metadata size and checksum. Refactor the recent metadata-reading optimisation patches. Remove the recently-added cache fields from struct labeller and struct format_instance. Instead, introduce struct lvmcache_vgsummary to wrap the VG information that lvmcache holds and add the metadata size and checksum to it. Allow this VG summary information to be looked up by metadata size + checksum. Adjust the debug log messages to make it clear when this shortcut has been successful. (This changes the optimisation slightly, and might be extendable further.) Add struct cached_vg_fmtdata to format-specific vg_read calls to preserve state alongside the VG across separate calls and indicate if the details supplied match, avoiding the need to read and process the VG metadata again.	2015-03-18 23:43:02 +00:00
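The lookup by metadata size and checksum described in the commit above can be sketched as a simple table search. This is an illustration only: `struct vg_summary` and `find_summary` are hypothetical and do not reproduce struct lvmcache_vgsummary or the lvmcache API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary record; field names are illustrative only. */
struct vg_summary {
	const char *vgname;
	size_t mda_size;
	uint32_t mda_checksum;
};

/*
 * Look up a previously parsed summary by metadata size and checksum.
 * A hit means the metadata text need not be parsed again; NULL
 * means the caller has to parse it and add a new entry.
 */
static const struct vg_summary *
find_summary(const struct vg_summary *cache, size_t n,
             size_t mda_size, uint32_t mda_checksum)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (cache[i].mda_size == mda_size &&
		    cache[i].mda_checksum == mda_checksum)
			return &cache[i];
	return NULL;
}
```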
Zdenek Kabelac	a9b28a4f21	lib: reduce parsing in vgname_from_mda Use similar logic as with text_vg_import_fd() and avoid repeated parsing of same mda and its config tree for vgname_from_mda(). Remember last parsed vgname, vgid and creation_host in labeller structure and if the metadata have the same size and checksum, return this stored info. TODO: The reuse of labeller struct is not ideal, some lvmcache API for this functionality would be nicer.	2015-03-06 13:53:13 +01:00
Zdenek Kabelac	7e7411966a	lib: avoid reparsing same metadata When reading VG mda from multiple PVs - do all the validation only when mda is seen for the first time and when mda checksum and length are the same just return already existing VG pointer. (i.e. using 300PVs for a VG would lead to creating and destroying 300 config trees....)	2015-03-06 13:53:12 +01:00
Zdenek Kabelac	6a2ae250ff	cleanup: add stack trace Missed stack in error path.	2015-03-06 13:51:54 +01:00
David Teigland	8dc5f42254	metadata: Use flags to control warnings. The warnings arg was used to enable logging of warnings when reading a PV. This arg is turned into a set of flags with the WARN_PV_READ flag matching the existing behavior. A new flag WARN_INCONSISTENT is added that will cause vg_read_internal() to log the "VG is not consistent" warning so the various callers do not need to log this warning themselves. A new vg_read flag READ_WARN_INCONSISTENT is used from reporting to enable the WARN_INCONSISTENT flag in vg_read_internal. [Committed by agk with cosmetic changes and tweaks.]	2014-10-07 01:15:43 +01:00
Peter Rajnoha	ff9d27a1c7	config: add CONFIG_FILE_SPECIAL config source id Add CONFIG_FILE_SPECIAL config source id to distinguish between a real configuration tree (like lvm.conf and tag configs) and a special-purpose configuration tree (like LVM metadata, persistent filter). This makes it easier to attach correct customized data to the config tree that is then created out of the source.	2014-05-19 15:37:41 +02:00
Peter Rajnoha	da3ea66a96	config: add config_source_t type to identify configuration source A helper type that helps with identification of the configuration source which makes handling the configuration cascade a bit easier, mainly removing and adding configuration trees to cascade dynamically. Currently, the possible types are: CONFIG_UNDEFINED - configuration is not defined yet (not initialized) CONFIG_FILE - one file configuration CONFIG_MERGED_FILES - configuration that is a result of merging more files into one CONFIG_STRING - configuration string typed on cmd line directly CONFIG_PROFILE - profile configuration (the new type of configuration, patches will follow...) Also, generalize existing "remove_overridden_config_tree" to work with configuration type identification in a cascade. Before, it was just the CONFIG_STRING we used. Now, we need some more to add in a cascade (like the CONFIG_PROFILE). So, we have: struct dm_config_tree *remove_config_tree_by_source(struct cmd_context *cmd, config_source_t source); config_source_t config_get_source_type(struct dm_config_tree *cft); ... for removing the tree by its source type from the cascade and simply getting the source type.	2013-07-02 15:19:08 +02:00
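Removing a tree from the cascade by its source type, as the commit above describes, can be sketched with a plain linked list. The enum names come from the commit message, but their numeric values are assumed; `struct cascade_node` and `remove_by_source` are hypothetical and stand in for struct dm_config_tree and the real function.

```c
#include <stddef.h>

/* Source types as listed in the commit message; numeric values are assumed. */
typedef enum {
	CONFIG_UNDEFINED,
	CONFIG_FILE,
	CONFIG_MERGED_FILES,
	CONFIG_STRING,
	CONFIG_PROFILE
} config_source_t;

/* Hypothetical cascade node; the real trees are struct dm_config_tree. */
struct cascade_node {
	config_source_t source;
	struct cascade_node *next;   /* next tree in the cascade */
};

/*
 * Unlink the first node with the given source type from the cascade.
 * Returns the removed node (or NULL if none matched); *head is updated
 * when the head itself is removed.
 */
static struct cascade_node *
remove_by_source(struct cascade_node **head, config_source_t source)
{
	struct cascade_node **link;

	for (link = head; *link; link = &(*link)->next) {
		if ((*link)->source == source) {
			struct cascade_node *found = *link;

			*link = found->next;
			found->next = NULL;
			return found;
		}
	}
	return NULL;
}
```

Walking a pointer-to-pointer avoids a special case for removing the head of the cascade.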
Zdenek Kabelac	286cd2006b	cleanup: drop unneeded included header files These headers were not resolving anything used for compiled .c files. Remove unused util.c file.	2012-08-23 14:37:20 +02:00
Alasdair Kergon	5b613cff97	Pass 'single_device' parameter down to suppress 'Can't find uuid' messages when reading VG text metadata and called from pvscan --lvmetad. (Longer-term, that check needs moving outside of that code.)	2012-02-29 02:35:35 +00:00
Petr Rockai	845b1df617	Make a cleaner split between config tree and config file functionality. Move the latter out of libdm.	2011-12-18 21:56:03 +00:00
Petr Rockai	e59e2f7c3c	Move the core of the lib/config/config.c functionality into libdevmapper, leaving behind the LVM-specific parts of the code (convenience wrappers that handle `struct device` and `struct cmd_context`, basically). A number of functions have been renamed (in addition to getting a dm_ prefix) -- namely, all of the config interface now has a dm_config_ prefix.	2011-08-30 14:55:15 +00:00
Zdenek Kabelac	bebe60b70c	Code move of vg_mark_partial() up in stack It's useful to keep the partial flag cached - so just move the call for vg_mark_partial_lvs() into import_vg_from_config_tree() so it gets evaluated before it goes through the lvmcache. This patch should not present any functional change. Note: It is rather a temporary solution - proper place is probably inside the 'read' call back - but needs some more discussion. For now using this minor hack.	2011-06-17 14:39:10 +00:00
Zdenek Kabelac	6feecf76d4	Change import_vg_from_buffer to use config_tree Change function import_vg_from_buffer() to import_vg_from_config_tree(). Instead of creating config tree inside the function allow config tree to be passed as parameter - usable later for caching.	2011-01-10 13:13:42 +00:00
Zdenek Kabelac	ba96eb24fa	Some const cleanups Minor const warning fixes and internal API updates.	2010-12-20 13:19:13 +00:00
Zdenek Kabelac	9f926fd060	Use void parameter for function definition.	2010-08-03 13:06:35 +00:00
Milan Broz	6733116a19	Fix all segments memory is allocated from vg private mempool. Physical segments were still allocated from global command context mempool. This leads to very high memory usage when activating large VG (vgchange). (Memory usage was about 2G when >3000LVs). Fix it by properly using vg->vgmem private pool, so all the memory is released early. New memory pool parameter is needed here for pv_split_segment function. Also fix the same problem in some minor allocations (vg description, lv segment split).	2010-03-31 17:23:18 +00:00
Alasdair Kergon	0a5182fc97	Suppress repeated errors about the same missing PV uuids. Bypass full device scans when using internally-cached VG metadata.	2010-03-17 02:11:18 +00:00
Mike Snitzer	a2552d4f59	Switch status from 32-bit to 64-bit The physical_volume, volume_group, logical_volume and lv_segment structures' 'status' member is now uint64_t. The alignment of these structures was also audited to remove holes. The movement of some members in 'volume_group' and 'lv_segment' eliminates holes. The 'physical_volume' structure still has one 4-byte hole after 'pe_size'; the other structures no longer have any holes. Each structure's size has not changed.	2009-11-24 22:55:55 +00:00
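The "holes" mentioned in the commit above are alignment padding that a compiler inserts around a wider member. The sketch below uses two hypothetical structs, not the lvm2 structures, to show how reordering members can remove that padding. Exact sizes depend on the target ABI, so none are stated here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout with a 64-bit member between two 32-bit ones. */
struct with_holes {
	uint32_t a;
	uint64_t status;
	uint32_t b;
};

/* Same members, reordered so the 32-bit fields sit next to each other. */
struct reordered {
	uint64_t status;
	uint32_t a;
	uint32_t b;
};

/* Padding bytes = total struct size minus the sum of the member sizes. */
static size_t padding_with_holes(void)
{
	return sizeof(struct with_holes) -
	       (sizeof(uint32_t) + sizeof(uint64_t) + sizeof(uint32_t));
}

static size_t padding_reordered(void)
{
	return sizeof(struct reordered) -
	       (sizeof(uint64_t) + sizeof(uint32_t) + sizeof(uint32_t));
}
```

On ABIs where uint64_t needs 8-byte alignment, the first layout typically carries padding after `a` and after `b`, while the second carries none.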
Alasdair Kergon	6e210a6c54	Cache VG metadata internally while VG lock is held.	2008-04-01 22:40:13 +00:00
Alasdair Kergon	dc2bdce11e	Refactor text format initialisation into _init_text_import.	2008-03-13 12:33:22 +00:00
Alasdair Kergon	67cdbd7e4d	Some whitespace tidy-ups.	2008-01-30 14:00:02 +00:00
Alasdair Kergon	c51b9fff19	Use stack return macros throughout.	2008-01-30 13:19:47 +00:00
Alasdair Kergon	be6845999b	Fix inconsistent licence notices: executables are GPLv2; libraries LGPLv2.1.	2007-08-20 20:55:30 +00:00
Alasdair Kergon	c1c16a8f01	Protect .cache manipulations with fcntl locking. Change .cache timestamp comparisons to use ctime.	2006-11-04 03:34:10 +00:00
Alasdair Kergon	898e6f8e41	Add mirror_library description to example.conf. More compile-time cleanup.	2006-05-11 17:58:58 +00:00
Alasdair Kergon	4b52511f50	fix last commit	2006-05-10 17:51:02 +00:00
Alasdair Kergon	2c7fbadeef	more coverity fixes	2006-05-10 17:49:25 +00:00

88 Commits