shaba/lvm2

mirror of git://sourceware.org/git/lvm2.git synced 2025-12-09 12:23:50 +03:00

Author	SHA1	Message	Date
David Teigland	a761f9fc9d	label_scan/vg_read: use label_read_data to avoid disk reads The new label_scan() function reads a large buffer of data from the start of the disk, and saves it so that multiple structs can be read from it. Previously, only the label_header was read from this buffer, and the code which needed data structures that immediately followed the label_header would read those from disk separately. This created a large number of small, unnecessary disk reads. In each place that the two read paths (label_scan and vg_read) need to read data from disk, first check if that data is already available from the label_read_data buffer, and if so just copy it from the buffer instead of reading from disk. Code changes ------------ - passing the label_read_data struct down through both read paths to make it available. - before every disk read, first check if the location and size of the desired piece of data exists fully in the label_read_data buffer, and if so copy it from there. Otherwise, use the existing code to read the data from disk. - adding some log_error messages on existing error paths that were already being updated for the reasons above. - using similar naming for parallel functions on the two parallel read paths that are being updated above. label_scan path calls: read_metadata_location_summary, text_read_metadata_summary vg_read path calls: read_metadata_location_vg, text_read_metadata_file Previously, those functions were named: label_scan path calls: vgname_from_mda, text_vgsummary_import vg_read path calls: _find_vg_rlocn, text_vg_import_fd I/O changes ----------- In the label_scan path, the following data is either copied from label_read_data or read from disk for each PV: - label_header and pv_header - mda_header (in _raw_read_mda_header) - vg metadata name (in read_metadata_location_summary) - vg metadata (in config_file_read_fd) Total of 4 reads per PV in the label_scan path. 
In the vg_read path, the following data is either copied from label_read_data or read from disk for each PV: - mda_header (in _raw_read_mda_header) - vg metadata name (in read_metadata_location_vg) - vg metadata (in config_file_read_fd) Total of 3 reads per PV in the vg_read path. For a common read/reporting command, each PV will be: - read by the command's initial lvmcache_label_scan() - read by lvmcache_label_rescan_vg() at the start of vg_read() - read by vg_read() Previously, this would cause 11 synchronous disk reads per PV: 4 from lvmcache_label_scan(), 4 from lvmcache_label_rescan_vg() and 3 from vg_read(). With this commit's optimization, there are now 2 async disk reads per PV: 1 from lvmcache_label_scan() and 1 from lvmcache_label_rescan_vg(). When a second mda is used on a PV, it is located at the end of the PV. This second mda and copy of metadata will not be found in the label_read_data buffer, and will always require separate disk reads.	2017-10-27 12:32:44 -05:00
David Teigland	654525b33d	independent metadata areas: fix bogus code Fix mixing bitwise & and logical && which was always 1 in any case.	2017-10-27 12:13:25 -05:00
David Teigland	2773767924	label_scan: fix independent metadata areas This fixes the use of lvmcache_label_rescan_vg() in the previous commit for the special case of independent metadata areas. label scan is about discovering VG name to device associations using information from disks, but devices in VGs with independent metadata areas have no information on disk, so the label scan does nothing for these VGs/devices. With independent metadata areas, only the VG metadata found in files is used. This metadata is found and read in vg_read in the processing phase. lvmcache_label_rescan_vg() drops lvmcache info for the VG devices before repeating the label scan on them. In the case of independent metadata areas, there is no metadata on devices, so the label scan of the devices will find nothing, so will not recreate the necessary vginfo/info data in lvmcache for the VG. Fix this by setting a flag in the lvmcache vginfo struct indicating that the VG uses independent metadata areas, and label rescanning should be skipped. In the case of independent metadata areas, it is the metadata processing in the vg_read phase that sets up the lvmcache vginfo/info information, and label scan has no role.	2017-10-27 12:13:25 -05:00
David Teigland	606dc06e43	label_scan: move to start of command LVM's general design for scanning/reading of metadata from disks is that a command begins with a discovery phase, called "label scan", in which it discovers which devices belong to lvm, what VGs exist on those devices, and which devices are associated with each VG. After this comes the processing phase, which is based around processing specific VGs. In this phase, lvm acquires a lock on the VG, and rescans the devices associated with that VG, i.e. it repeats the label scan steps on the devices in the VG in case something has changed between the initial label scan and taking the VG lock. This ensures that the command is processing the latest, unchanging data on disk. This commit moves the location of these label scans to make them clearer and avoid unnecessary repeated calls to them. Previously, the initial label scan was called as a side effect from various utility functions. This would lead to it being called unnecessarily. It is an expensive operation, and should only be called when necessary. Also, this is a primary step in the function of the command, and as such it should be called prominently at the top level of command processing, not as a hidden side effect of a utility function. lvm knows exactly where and when the label scan needs to be done. Because of this, move the label scan calls from the internal functions to the top level of processing. Other specific instances of lvmcache_label_scan() are still called unnecessarily or unclearly by specific commands that do not use the common process_each functions. These will be improved in future commits. During the processing phase, rescanning labels for devices in a VG needs to be done after the VG lock is acquired in case things have changed since the initial label scan. This was being done by way of rescanning devices that had the INVALID flag set in lvmcache. 
This usually approximated the right set of devices, but it was not exact, and obfuscated the real requirement. Correct this by using a new function that rescans the devices in the VG: lvmcache_label_rescan_vg(). Apart from being inexact, the rescanning was extremely well hidden. _vg_read() would call ->create_instance(), _text_create_text_instance(), _create_vg_text_instance() which would call lvmcache_label_scan() which would call _scan_invalid() which repeats the label scan on devices flagged INVALID. lvmcache_label_rescan_vg() is now called prominently by _vg_read() directly.	2017-10-27 12:13:25 -05:00
David Teigland	cfe35ec877	label_scan: call new label_scan from lvmcache_label_scan To do label scanning, lvm code calls lvmcache_label_scan(). Change lvmcache_label_scan() to use the new label_scan() which can use async io, rather than implementing its own dev iter loop and calling the synchronous label_read() on each device. Also add lvmcache_label_rescan_vg() which calls the new label_scan_devs() which does label scanning on only the specified devices. This is for a subsequent commit and is not yet used.	2017-10-27 12:13:25 -05:00
Alasdair G Kergon	486ed10848	vgmerge: Fix intermediate metadata corruption vgmerge suffers from a similar problem to the one fixed in commit `8146548d25` ("vgsplit: Fix intermediate metadata corruption.") When merging, splitting or renaming VGs, use a new PV status flag PV_MOVED_VG to mark the PVs that hold metadata with the old VG name and use this to provide PV-level granularity instead of incorrectly assuming all PVs in the VG are the same.	2017-10-06 02:20:45 +01:00
Alasdair G Kergon	8146548d25	vgsplit: Fix intermediate metadata corruption. Changing the VG of a PV uses the same on-disk mechanism as vgrename. This relies on recognising both the old and new VG names. Prior to this patch the vgsplit code incorrectly provided the new VG name twice instead of the old and new ones. This led the low-level mechanism not to recognise the device as already belonging to a VG and so to pay no attention to the location of its existing metadata, sometimes partly overwriting it and then later trying to read the corrupt metadata and issuing a checksum error.	2017-09-22 18:34:34 +01:00
Peter Rajnoha	3c978f7bcc	pvcreate: fix check for 2nd mda at end of disk fits if using pvcreate --restorefile Fix code checking that the 2nd mda which is at the end of disk really fits the available free space and avoid any DA and MDA interleaving when we already have DA preallocated. This mainly applies when we're restoring a PV from VG backup using pvcreate --restorefile where we may already have some DA preallocated - this means the PV was in a VG before with already allocated space from it (the LVs were created). Hence we need to avoid stepping into DA - the MDA can never ever be inside in such case! The code responsible for this calculation was already in _text_pv_add_metadata_area fn, but it had a bug in the calculation where we subtracted one more sector by mistake and then the code could still incorrectly allocate the MDA inside existing DA. The patch also renames the variable in the code so it doesn't confuse us in future. Also, if the 2nd mda doesn't fit, don't silently continue with just 1 MDA (at the start of the disk). If 2nd mda was requested and we can't create that due to unavailable space, error out correctly (the patch also adds a test to shell/pvcreate-operation.sh for this case).	2017-08-15 13:40:25 +02:00
Zdenek Kabelac	48ce8c7a49	tidy: drop unneeded cast Avoid casting to the same type.	2017-07-20 11:20:44 +02:00
Zdenek Kabelac	0bf836aa14	tidy: prefer not using else after return clang-tidy: avoid using 'else' after return - gives more readable code and also saves an indentation level.	2017-07-20 11:18:29 +02:00
Zdenek Kabelac	f7e62bc55c	cleanup: drop extra compare dm_free() already validates for NULL itself.	2017-07-17 12:32:18 +02:00
Alasdair G Kergon	5027c3c7ee	format_text: Extend FIXME to reduce label scans It's unnecessarily scanning all invalid labels even when nothing changed instead of first just scanning the ones under the lock.	2017-07-13 17:05:49 +01:00
Zdenek Kabelac	419e8284c8	coverity: validate length of renaming path Make sure path fits into buffer on stack.	2017-06-27 12:15:42 +02:00
David Teigland	01156de6f7	lvmcache: add optional dev arg to lvmcache_info_from_pvid A number of places are working on a specific dev when they call lvmcache_info_from_pvid() to look up an info struct based on a pvid. In those cases, pass the dev being used to lvmcache_info_from_pvid(). When a dev is specified, lvmcache_info_from_pvid() will verify that the cached info it's using matches the dev being processed before returning the info. Calling code will not mistakenly get info for the wrong dev when duplicate devs exist. This confusion was happening when scanning labels when duplicate devs existed. label_read for the first dev would add an info struct to lvmcache for that dev/pvid. label_read for the second dev would see the pvid in lvmcache from first dev, and mistakenly conclude that the label_read from the second dev can be skipped because it's already been done. By verifying that the dev for the cached pvid matches the dev being read, this mismatch is avoided and the label is actually read from the second duplicate.	2016-06-07 15:15:47 -05:00
Zdenek Kabelac	509b2e5247	debug: move misplaced log_debug It should log action before taking it instead of only in error path.	2016-04-21 00:34:01 +02:00
David Teigland	5e9e43074a	lvmetad: rework command connection setup and checking The lvmetad connection is created within the init_connections() path during command startup, rather than via the old lvmetad_active() check. The old lvmetad_active() checks are replaced with lvmetad_used() which is a simple check that tests if the command is using/connected to lvmetad. The old lvmetad_set_active(cmd, 0) calls, which stopped the command from using lvmetad (to revert to disk scanning), are replaced with lvmetad_make_unused(cmd).	2016-04-19 14:00:02 -05:00
Zdenek Kabelac	a28c81cbae	debug: unify some tracing messages Introduce FMTVGID - although it might be possibly better to ensure vgid is always \0 ended string. Unify some lvmcache reported messages.	2016-04-12 13:06:16 +02:00
David Teigland	147c9c01a2	rename function read_vgname to read_vgsummary The name did not clearly represent what it does.	2016-04-11 13:07:48 -05:00
David Teigland	4de6caf5b5	redefine pvcreate structs New pv_create_args struct contains all the specific parameters for creating a PV, independent of the command.	2016-02-25 09:14:10 -06:00
Peter Rajnoha	8ad93874d6	tests: fix tests checking pv_attr - there's a new bit now	2016-02-15 12:44:46 +01:00
Peter Rajnoha	9b9f1ae772	format: format_text: add pv_needs_rewrite to format_handler and implementation for format_text	2016-02-15 12:44:46 +01:00
Peter Rajnoha	d320d9c52b	pv: format-text: store PV_EXT_USED flag if PV is used and unset it otherwise When adding a PV to VG, set the PV_EXT_USED flag in PV header and vice versa - if the PV is no longer in a VG, unset the flag.	2016-02-15 12:44:46 +01:00
Peter Rajnoha	a522af93b7	format: add FMT_PV_FLAGS to indicate format supports PV flags	2016-02-15 12:44:46 +01:00
Zdenek Kabelac	fcbef05aae	doc: change fsf address rpmlint suggests the FSF is using a different address these days, so let's keep it up to date.	2016-01-21 12:11:37 +01:00
Alasdair G Kergon	01228b692b	vgcfgrestore: Retain allocatable PV attribute. pvchange -xn was getting lost. All PVs were set to allocatable again after restore. Moved setting ALLOCATABLE_PV outside pv_setup().	2016-01-14 00:46:45 +00:00
David Teigland	796461a912	vgrename: use process_each_vg Use process_each_vg() to lock and read the old VG, and then call the main vgrename code. When real VG names are used (not a UUID in place of the old name), the command still pre-locks the new name (when strcmp wants it locked first), before calling process_each_vg on the old name. In the case where the old name is replaced with a UUID, process_each_vg now translates that UUID into the real VG name, which it locks and reads. In this case, we cannot do pre-locking to maintain lock ordering because the old name is unknown. So, in this case the strcmp based lock ordering is suppressed and the old name is always locked first. This opens a remote chance for lock ordering conflict between racing vgrenames between two names where one or both commands use the UUID.	2015-12-14 14:26:47 -06:00
Zdenek Kabelac	c3b292a4a9	format-text: ensure no division by zero Coverity likes here to be 100% sure no division by zero is possible. Add check for alignment !=0 which is made on other code paths here.	2015-11-16 01:16:11 +01:00
Peter Rajnoha	ccfc09f79b	metadata: format_text: also count with calculated mda size of 0 When checking minimum mda size, make sure the mda_size after alignment and calculation is more than 0 - if there's no place for an MDA at the end of the disk, the _text_pv_add_metadata_area does not try to add it there and it returns (because we already have the MDA at the start of the disk at least).	2015-10-30 12:02:34 +01:00
Peter Rajnoha	c2e88d1107	metadata: format_text: better check for metadata overlap Actually, we don't need extra condition as introduced in commit `00348c0a63`. We should fix the last condition: (mdac->rlocn.size >= mdah->size) ...which should be: (MDA_HEADER_SIZE + (rlocn ? rlocn->size : 0) + mdac->rlocn.size >= mdah->size)) Where the "mdac" is new metadata, the "rlocn" is old metadata. So the main problem with the previous condition was that it didn't count in MDA_HEADER_SIZE properly (and possible existing metadata - the "rlocn"). This could have caused the error state where metadata in ring buffer overlap to not be hit. Replace the new condition introduced in `00348c0a63` with the improved one for the condition that existed there already but it was just incomplete.	2015-10-30 08:57:34 +01:00
Peter Rajnoha	00348c0a63	metadata: format_text: check VG metadata do not overlap themselves We're already checking whether old and new meta do not overlap in ring buffer (as we need to keep both old and new meta during vg_write up until vg_commit). We also need to check whether the new metadata do not overlap themselves in case we don't have old metadata yet (...because we're in vgcreate). This could happen if we're creating a VG so that the very first metadata written are long enough that it wraps themselves in metadata ring buffer. Although we limited the minimum metadata area size better with the previous commit `ccb8da404d` which makes the initial VG metadata overlap in ring buffer to be less probable, the risk of hitting this overlap condition is still there if we still manage to generate big enough metadata somehow. For example, users can provide many and/or long VG tags during vgcreate so that the VG metadata is long enough to start to wrap in the ring buffer again...	2015-10-29 16:46:41 +01:00
Peter Rajnoha	ccb8da404d	metadata: format_text: check metadata area size is at least MDA_SIZE_MIN	2015-10-29 16:00:32 +01:00
Peter Rajnoha	b3c81d02c9	revert: `3d03e504cd`: message about VG metadata size vs. PV mda size The message needs refinement - it's not correct in all situations.	2015-10-29 11:10:48 +01:00
Peter Rajnoha	3d03e504cd	metadata: format_text: provide more detailed error message when metadata too large for PV mda Also, leave out the note about "circular buffer" which is an internal implementation detail anyway and not quite informational for users: Before this patch: $ vgcreate vg1 /dev/sda VG vg1 metadata too large for circular buffer Failed to write VG vg1. With this patch applied: $ vgcreate vg1 /dev/sda VG vg1 metadata too large: size of metadata to write is 691 bytes while PV metadata area size on /dev/sda is 512 bytes. Failed to write VG vg1.	2015-10-08 16:27:03 +02:00
Peter Rajnoha	fcfca57e2e	format-text: label: fix missing dev assignment for struct label in _text_pv_write When using lvm shell, some structures which are cached in memory may be reused. This happens for the struct label (a part of lvmcache_info structure) when lvmetad is used in which case the PV scan is not done that would normally overwrite these label structures in memory and making them up-to-date. This is all consequence of the fact that struct lvmcache_info and struct label are not always assigned in the same part of the code. For example, if lvmetad is not used, parts of the struct label are reassigned in label_read fn while struct lvmcache_info is created elsewhere. No part of the code reused struct label (and its "dev" field) before calling label_read fn. That's why the real bug is hidden when using lvm shell without lvmetad. However, with lvmetad and lvm shell, the situation is a bit different. The label_read fn is not called if lvmetad is used, hence the struct label may have ended up not initialized properly. There was missing assignment for the dev field in struct label in _text_pv_write fn which caused this problem to appear in lvm shell with lvmetad, for example: Before this patch: lvm> pvcreate /dev/sda Physical volume "/dev/sda" successfully created lvm> pvs /dev/sda PV VG Fmt Attr PSize PFree unknown device lvm2 --- 128.00m 128.00m With this patch applied: lvm> pvcreate /dev/sda Physical volume "/dev/sda" successfully created lvm> pvs /dev/sda PV VG Fmt Attr PSize PFree /dev/sda lvm2 --- 128.00m 128.00m Also, this problem had not appeared before changes introduced by commits `e1a63905d1` through `3a6f91d713` which, among other things, added proper label field type reporting. Before, label reporting was the same as using struct physical_volume which has its own dev field assigned and so this problem was not exposed.	2015-09-15 18:07:32 +02:00
Zdenek Kabelac	a8fd88463e	cleanup: trace error from lvmcache_update_vgname_and_id Check result value from lvmcache_update_vgname_and_id().	2015-08-18 15:00:08 +02:00
Peter Rajnoha	3b6840e099	config: replace find_config_tree_node with find_config_tree_array where appropriate	2015-07-08 13:03:08 +02:00
Alasdair G Kergon	810ab095e6	macros: Wrap PRI with FMT. Create a set of wrappers with embedded % such as #define FMTu64 "%" PRIu64	2015-07-06 15:09:17 +01:00
Zdenek Kabelac	05934d2538	format_text: properly validate PV size for restore Use 64-bit arithmetic for PV size calculation (Coverity). Also remove sector shift for compared PV size, since all values are already held in sectors. This fixes validation of PV size when restoring PV from vg metadata backup file.	2015-05-08 15:12:35 +02:00
Alasdair G Kergon	cc26085b62	alloc: Respect cling_tag_list in contig alloc. When performing initial allocation (so there is nothing yet to cling to), use the list of tags in allocation/cling_tag_list to partition the PVs. We implement this by maintaining a list of tags that have been "used up" as we proceed and ignoring further devices that have a tag on the list. https://bugzilla.redhat.com/983600	2015-04-11 01:55:24 +01:00
Alasdair G Kergon	a9d48bae2f	cache: Set correct vgid when changing PV header. pv_write is called both to write orphans and to rewrite PV headers of PVs in VGs. It needs to select the correct VG id so that the internal cache state gets updated correctly. It only affected commands that involved further steps after the pv_write and was often masked because the metadata would be re-read off disk and correct itself. "Incorrect metadata area header checksum" warnings appeared. Example: Create vg1 containing dev1, dev2 and dev3. Hide dev1 and dev2 from the system. Fix up vg1 with vgreduce --removemissing. Bring back dev1 and dev2. In a single operation reinstate dev1 and dev2 into vg1 (vgextend). Done as separate operations (automatically fix-up dev1 and dev2 as orphans, then vgextend) it worked, but done all in one go the internal cache got corrupted and warnings about checksum errors appeared.	2015-04-09 21:13:55 +01:00
Alasdair G Kergon	a515a91fcc	format_text: Fix precommitted segfault. The code never mixes reads of committed and precommitted metadata, so there's no need to attempt to set PRECOMMITTED when *use_previous_vg is being set.	2015-03-19 11:14:47 +00:00
Alasdair G Kergon	6407d184d1	cache: Store metadata size and checksum. Refactor the recent metadata-reading optimisation patches. Remove the recently-added cache fields from struct labeller and struct format_instance. Instead, introduce struct lvmcache_vgsummary to wrap the VG information that lvmcache holds and add the metadata size and checksum to it. Allow this VG summary information to be looked up by metadata size + checksum. Adjust the debug log messages to make it clear when this shortcut has been successful. (This changes the optimisation slightly, and might be extendable further.) Add struct cached_vg_fmtdata to format-specific vg_read calls to preserve state alongside the VG across separate calls and indicate if the details supplied match, avoiding the need to read and process the VG metadata again.	2015-03-18 23:43:02 +00:00
Zdenek Kabelac	a9b28a4f21	lib: reduce parsing in vgname_from_mda Use similar logic as with text_vg_import_fd() and avoid repeated parsing of same mda and its config tree for vgname_from_mda(). Remember last parsed vgname, vgid and creation_host in labeller structure and if the metadata have the same size and checksum, return this stored info. TODO: The reuse of labeller struct is not ideal, some lvmcache API for this functionality would be nicer.	2015-03-06 13:53:13 +01:00
Zdenek Kabelac	60427d5d42	lib: return value Drop label out: with goto and return NULL directly. Add log_debug() for zero metadata offset.	2015-03-06 13:51:43 +01:00
Alasdair G Kergon	5e6e2d6b1b	vgcreate: Permit non-power-of-2 extent sizes. Relax validation to permit extent sizes > 128KB that are not powers of 2 with lvm2 format. Existing code was already capable of handling this.	2014-10-14 18:12:15 +01:00
Peter Rajnoha	05eb6a167e	tests: add separate test file for bootloader area support and enhance tests Enhance bootloader area test to check whether restoring values from backup works correctly.	2014-04-10 14:18:59 +02:00
Alasdair G Kergon	5d7614fcf9	format_text: Report failed close.	2014-04-04 02:28:10 +01:00
Zdenek Kabelac	65bbfdf74d	lvmetad: add missing dev_close in error path Fixes missing dev_close() in dev_read error path introduced in commit `a368698672` `3e5bec37e9` (in-release fix)	2014-03-25 14:55:58 +01:00
Zdenek Kabelac	89575d6895	cleanup: drop init of already zalloced mem	2014-03-25 11:22:59 +01:00
Zdenek Kabelac	406ec4162f	cleanup: use dm_free without extra test It's ok to free(NULL).	2014-03-25 11:22:59 +01:00
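
The "check the buffer before every disk read" step described in commit a761f9fc9d can be sketched roughly as follows. This is an illustration only, not the lvm2 code: struct scan_buf, buf_contains() and read_range() are hypothetical names, and the real label_read_data struct and its callers may differ.

```c
#define _XOPEN_SOURCE 700 /* for pread() under strict standards modes */
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the saved scan buffer; not the lvm2 struct. */
struct scan_buf {
	const char *data; /* bytes saved from the start of the device */
	uint64_t start;   /* device offset of data[0] */
	uint64_t len;     /* number of valid bytes in data */
};

/* Return 1 if [offset, offset + size) lies fully inside the buffer. */
static int buf_contains(const struct scan_buf *b, uint64_t offset, uint64_t size)
{
	if (!b || !b->data)
		return 0;
	if (offset < b->start || size > b->len)
		return 0;
	/* Written this way to avoid overflow in offset + size. */
	return (offset - b->start) <= (b->len - size);
}

/*
 * Copy the range from the buffer when it is fully present there;
 * otherwise fall back to a read from the device.
 * Returns 1 on success, 0 on failure. *from_buf reports which path ran.
 */
static int read_range(int fd, const struct scan_buf *b, uint64_t offset,
		      uint64_t size, void *dst, int *from_buf)
{
	ssize_t n;

	assert(dst && from_buf);

	if (buf_contains(b, offset, size)) {
		memcpy(dst, b->data + (offset - b->start), size);
		*from_buf = 1;
		return 1;
	}

	*from_buf = 0;
	n = pread(fd, dst, size, (off_t) offset);
	return n == (ssize_t) size;
}
```

A range that is only partly covered by the buffer takes the disk path, matching the commit's wording that the data must exist "fully" in the buffer. A second mda at the end of the PV would always fall outside a buffer taken from the start of the disk, so it would always take the pread() path in this sketch.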

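The corrected overlap condition quoted in commit c2e88d1107 can be written as a small standalone check. This is a sketch under assumptions: the function name and parameters are hypothetical, the sizes are plain integers rather than the mdac/rlocn/mdah structs, and the header size of 512 bytes is assumed here for illustration.

```c
#include <stdint.h>

/* Assumed header size for this sketch; stands in for MDA_HEADER_SIZE. */
#define SKETCH_MDA_HEADER_SIZE 512u

/*
 * Return 1 if the new metadata cannot be written without overlap.
 *
 * old_size:  size of the existing metadata (0 if there is none yet,
 *            as in vgcreate), i.e. "rlocn ? rlocn->size : 0"
 * new_size:  size of the metadata about to be written ("mdac->rlocn.size")
 * area_size: total size of the metadata area ("mdah->size")
 *
 * The header and both copies must fit, because old and new metadata
 * are both kept between vg_write and vg_commit.
 */
static int metadata_would_overlap(uint64_t old_size, uint64_t new_size,
				  uint64_t area_size)
{
	return SKETCH_MDA_HEADER_SIZE + old_size + new_size >= area_size;
}
```

With old_size set to 0 the same expression covers the vgcreate case from commit 00348c0a63, where the first metadata written could otherwise wrap onto itself. The earlier condition compared only new_size against area_size, so it ignored both the header and any existing metadata.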

280 Commits