
shaba/lvm2

mirror of git://sourceware.org/git/lvm2.git synced 2025-01-21 22:04:19 +03:00

Author	SHA1	Message	Date
Dave Wysochanski	821f0cc5ea	Add vg get/set methods for VG metadata copies. This patch adds the get and partially implemented set function. The 'set' function should probably ignore or un-ignore metadata areas based on new values. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:36:56 +00:00
Dave Wysochanski	88d7dc1af8	Add mda_copies to VG structures and initialization. Add a field to struct volume_group to later implement metadata balancing: - mda_copies: target # of non-ignored mdas in the VG; default 0 (do not control pv 'ignore mdas' bit). This patch just adds the parameter to the structures with the default values but does not modify any commands. Should be no functional change. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:36:37 +00:00
Dave Wysochanski	0f2f8a5c3a	Before committing each mda, arrange mdas so ignored mdas get committed first. Arrange mdas so mdas that are to be ignored come first. This is an optimization that ensures consistency on disk for the longest period of time. This was noted by agk in review of the v4 patchset of pvchange-based mda balance. Note the following example for an explanation of the background: Assume the initial state on disk is as follows: PV0 (v1, non-ignored) PV1 (v1, non-ignored) PV2 (v1, non-ignored) PV3 (v1, non-ignored) If we did not sort the list, we would have a commit sequence something like this: PV0 (v2, non-ignored) PV1 (v2, ignored) PV2 (v2, ignored) PV3 (v2, non-ignored) After the commit of PV0's mdas, we'd have an on-disk state like this: PV0 (v2, non-ignored) PV1 (v1, non-ignored) PV2 (v1, non-ignored) PV3 (v1, non-ignored) This is an inconsistent state of the disk. If the machine fails, the next time it was brought back up, the auto-correct mechanism in vg_read would update the metadata on PV1-PV3. However, if possible we try to avoid inconsistent on-disk states. Clearly, because we did not sort, we have a greater chance of on-disk inconsistency - from the time the commit of PV0 is complete until the time PV3 is complete. We could improve the amount of time the on-disk state is consistent by simply sorting the commit order as follows: PV1 (v2, ignored) PV2 (v2, ignored) PV0 (v2, non-ignored) PV3 (v2, non-ignored) Thus, after the first PV is committed (in this case PV1), on-disk we would have: PV0 (v1, non-ignored) PV1 (v2, ignored) PV2 (v1, non-ignored) PV3 (v1, non-ignored) This is clearly a consistent state. PV1 will be read but the mda will be ignored. All other PVs contain v1 metadata, and no auto-correct will be required. 
In fact, if we commit all PVs with ignored mdas first, we'll only have an inconsistent state when we start writing non-ignored PVs, and thus the chances we'll get an inconsistent state on disk is much less with the sorted method. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:35:49 +00:00
Dave Wysochanski	77e0ed4be7	Refactor vg_commit() to add _vg_commit_mdas(). Factor out calling mda->ops->vg_commit() for each mda. No functional change. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:35:33 +00:00
Dave Wysochanski	69d1732334	Update _vg_read and _text_create_text_instance to use fid_add_mda[s]. When we are constructing the vg, we may need to adjust the list of metadata_areas if there are ignored mdas. At label read time, we do not read the metadata of ignored mdas, and as a result, they do not get placed on vg->fid->metadata_areas inside _text_create_text_instance since lvmcache does not have these areas attached to vginfo->infos. However, when we're checking the pvids inside _vg_read, after having read another metadata area from another PV, we do have the opportunity to update the metadata_area and metadata_areas_ignored lists based on the read metadata_area. We need accurate mda lists for the reporting functions that count the ignored mdas, as well as general correctness of mda balancing. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:35:17 +00:00
Dave Wysochanski	bb723d7897	Use mdas_empty_or_ignored() in place of checks for empty mda list. With the addition of ignored mdas, we replace all checks for an empty mda list with a new function to look for either an empty mda list or ignored mdas. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:34:58 +00:00
Dave Wysochanski	f9c307cd07	Add mdas_empty_or_ignored() helper function. Add a helper function to consolidate checking for an empty mdas list or ignored mdas. Ignored mdas should behave almost identically to an empty mda list - the metadata areas should not be read or written to. This function will make it easier to implement metadata balancing and easier to track pvs with an empty mda list or ignored mdas. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:34:40 +00:00
Dave Wysochanski	cdbe475fe3	Define new functions and vgs/pvs fields related to mda ignore. Define a new pvs field, pv_mda_used_count, and a new vgs field, vg_mda_used_count to match the existing pv_mda_count and vg_mda_count. These new fields count the number of mdas that have the 'ignored' bit clear (they are in use on the PV / VG). Also define various supporting functions to implement the counting as well as setting the ignored flag and determining if an mda is ignored. These high level functions call into the lower level location independent mda ignore functions defined by earlier patches. Note that counting ignored mdas in a vg requires traversing both lists and checking for the ignored bit on the mda. The count of 'ignored' mdas then is defined by having the bit set, not by which list the mda is on. The list does determine whether LVM actually does read/write to the mda, though we must count the bits in order to return accurate numbers for the various counts. Also, pv_mda_set_ignored must search both vg lists for ignored mda. If the state changes and needs to be committed to disk, the ignored mda will be on the non-ignored list. Note also in pv_mda_set_ignored(), we must properly manage the mda lists. If we change the ignored state of an mda, we must change any mdas on vg->fid->metadata_areas that correspond to this pv. Also, we may need to allocate a copy of the mda, as is done when fid->metadata_areas is populated from _vg_read(), if we are un-ignoring an ignored mda. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:33:44 +00:00
Dave Wysochanski	9ccac021a7	Add metadata_areas_ignored list and functions to manage ignored mdas. Add a second mda list, metadata_areas_ignored to fid, and a couple functions, fid_add_mda() and fid_add_mdas() to help manage the lists. These functions are needed to properly count the ignored mdas and manage the lists attached to the 'fid' and ultimately the 'vg'. Ensure metadata_areas_ignored is initialized in other formats, even if the list is never used. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:33:22 +00:00
Dave Wysochanski	f55a20eb36	Rename fid->metadata_areas to fid->metadata_areas_in_use. Rename the metadata_areas list to an 'in_use' list to prepare for future 'ignored' list.	2010-06-28 20:32:44 +00:00
Dave Wysochanski	ef4fa155a5	Add mda location specific mda_copy constructor. Because of the way mdas are handled internally, where a PV in a VG has mdas on both info->mdas and vg->fid->metadata_areas list, we need a location independent copy constructor for struct metadata_area. Break up the existing format-text specific copy constructor into a format independent piece and a format dependent piece. This function is necessary to properly implement pv_set_mda_ignored(). Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Reviewed-by: Alasdair G Kergon <agk@redhat.com>	2010-06-28 20:31:59 +00:00
Dave Wysochanski	29f24d4634	Add mda_locns_match() internal library function for mapping pv/device to VG mda. A metadata_area is defined independent of the location. One downside is that there is no obvious mapping from a pv to an mda. For a PV in a VG, we need a way to start with a PV and end up with an MDA, if we are to manage mdas starting with a device/pv. This function provides us a way to go down the list of PVs on a VG, and identify which ones match a particular PV. I'm not entirely happy with this approach, but it does fit into the existing structures in a reasonable way. An alternative solution might be to refactor the VG - PV interface such that mdas are a list tied to a PV. However, this seemed a bit tricky since a PV does not come into existence until after the list of mdas is constructed (see _vg_read() - we create a 'fid' and attach mdas to it, then we go through them and attach pvs). Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Reviewed-by: Alasdair G Kergon <agk@redhat.com>	2010-06-28 20:31:38 +00:00
Dave Wysochanski	322c5868b3	Add location independent flag and functions to ignore mdas. First we add a 'flags' field to the location independent metadata_area structure, and a MDA_IGNORE flag. The mda_is_ignored and mda_set_ignored functions are added to manage the flag. Adding the flag and functions gives a library interface to ignore metadata areas independent of the underlying location (disk, file, etc). The location specific read/write functions must then handle the specifics of what this flag means to the location. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Reviewed-by: Alasdair G Kergon <agk@redhat.com>	2010-06-28 20:30:14 +00:00
Milan Broz	f9e177d281	Fix "allocated" warning typo.	2010-06-22 21:10:53 +00:00
Petr Rockai	d345bf2cd3	Account for mirror transient status when doing lvconvert --repair.	2010-05-24 15:32:20 +00:00
Zdenek Kabelac	e86e45f7ea	Replicator: extend _lv_each_dependency() with dependencies for Replicator devices	2010-05-21 12:45:18 +00:00
Dave Wysochanski	dd2a0e940d	Add find_vgname_from_{pvname|pvid} functions. Some commands start with a pvname, but we'd like to force users to start with a vg handle to obtain a pv handle. Our best option seems to be providing a way to look up the vgname from the pvname, and then require them to use vg_read/vg_open. In addition to the pvname lookup function, this patch also provides a lookup by pvid. The lookup by pvid can be used in conjunction with lvmcache_get_pvids to process all pvs in the system. The pvid find function first calls lvmcache_vgname_from_pvid, which may cause the label to be read if it is not in the cache. If the vgname returned is an orphan, we then check to see if there are metadata areas, and if not, we scan every PV on the system by calling scan_vgs_for_pvs(). In most cases we should not need to do this, and by using the info->mdas count, we avoid calling pv_read() as prior code did. So this patch is a bit cleaner and should allow us to refactor more of the pv code. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-05-19 11:52:37 +00:00
Alasdair Kergon	1d837442bf	Add is_global_vg and split out from is_orphan_vg.	2010-05-19 02:36:33 +00:00
Alasdair Kergon	34220fe292	Validate orphan and VG_GLOBAL lock order too.	2010-05-19 02:08:50 +00:00
Alasdair Kergon	fa305e2ec6	Accept orphan VG names as parameters to lock_vol() and related functions.	2010-05-19 01:16:40 +00:00
Jonathan Earl Brassow	a932c2b61f	Disallow toggling the cluster attribute of a volume group if there are active mirrors or snapshots. We don't have the mechanisms in place to change the device-mapper tables for those targets that have behavioral differences between cluster and single machine instances. Allowing users to change the attribute but not changing the target's behavior can lead to data corruption. The following bugs are fixed/avoided by this patch: 235123 - vgchange -c [ny] do not change target types when necessary 289331 - RFE: switching from cluster domain to local domain needs to deactivate volume somehow 289541 - when changing from local to cluster, volumes can not appear to be deactivated	2010-05-14 15:19:42 +00:00
Mike Snitzer	60267bdce8	Disallow the direct removal of a merging snapshot. Allow lv_remove_with_dependencies() to know the top-level LV that was requested to be removed (otherwise it recurses and we lose context). A merging snapshot cannot be removed directly but the associated origin can be. Disallow removal of a merging snapshot unless the associated origin is also being removed.	2010-04-23 19:27:10 +00:00
Peter Rajnoha	1e696b0c15	Do not reset position in metadata ring buffer on vgrename and vgcfgrestore. We should write metadata into next position in the ring buffer while calling vgrename and vgcfgrestore. At this code level (_vg_write_raw), we were not able to determine if this is a rename or not. If yes, then accompanying VG structure passed here has a new name set, not the old one. When looking for a location where to put metadata next, we were given a NULL value because of failed VG name comparison (in _find_vg_rlocn) between the name in existing metadata and metadata we're just about to write. This resets the position in the ring buffer, overwriting any existing metadata (and also incorrectly updates the cache to "orphan" afterwards). This patch just adds old_name item in struct volume_group that we can check and use if necessary and detect renames at lower layers as well. The same applies for vgcfgrestore, but here we're using a special value of old_name, an empty string, to disable the check with existing metadata totally.	2010-04-14 13:09:16 +00:00
Dave Wysochanski	af46c894d0	Add pv->vg to solidify link between a pv and a vg. lvm2app needs a link back to the vg in order to use the vg handle for memory allocations as well as other things. This patch adds the field to struct physical_volume, and sets pv->vg when reading a vg from disk or extending a vg by using the helper function previously added, add_pvl_to_vgs(). Moves and renames are handled with separate code inside move_pv() and vgmerge(). Add pv->vg check to vg_validate(). A NULL value in pv->vg signifies membership in the orphan VG. Note though in the case of pv_read() on a device with metadatacopies == 0, more devices may need to be read for an authoritative answer. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-04-13 17:26:36 +00:00
Dave Wysochanski	11647ad01c	Use del_pvl_from_vgs() in vgreduce paths. Somehow these got missed in earlier patches. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-04-13 17:26:20 +00:00
Dave Wysochanski	0adfbfd5ea	Call add_pvl_to_vgs() and del_pvl_from_vgs() from more places. Now that we have library functions to add/delete a pv from the vg->pvs list, call them from everywhere. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-04-13 17:26:03 +00:00
Dave Wysochanski	8cfd64de78	Add del_pvl_from_vgs() and move prototypes into metadata-exported.h Add a delete function to manage the vg->pvs list. NOTE: It may be possible to do further cleanup to these add/del functions by passing a 'pv' as input instead of 'pv_list'. The pv_list is used for functions which do allocations (lvcreate) while other places in the code just manage a list of 'pv' (e.g. import functions, vgextend, etc). Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-04-13 17:25:44 +00:00
Dave Wysochanski	fddc256a02	Check for duplicate paths (pvids) on the commandline of vgcreate. A user specifying duplicate paths on the cmdline of vgcreate will get a message similar to the following: vgcreate vgtest2 /dev/loop3 /dev/loop5 Found duplicate PV jk1lXsKzwyOKlXq6bhaFFKMQQ06oPgu8: using /dev/loop5 not /dev/loop3 Found duplicate PV jk1lXsKzwyOKlXq6bhaFFKMQQ06oPgu8: using /dev/loop3 not /dev/loop5 Internal error: Duplicate PV id jk1lXs-Kzwy-OKlX-q6bh-aFFK-MQQ0-6oPgu8 detected for /dev/loop3 in vgtest2. This is caught by vg_validate(), but it would be good to find this condition earlier in the vgcreate code. add_pv_to_vg() currently checks by pvname, but does not look for duplicate pvids. This patch adds the check for duplicate pvids and results in new error output as follows: vgcreate vgtest2 /dev/loop3 /dev/loop5 Found duplicate PV jk1lXsKzwyOKlXq6bhaFFKMQQ06oPgu8: using /dev/loop5 not /dev/loop3 Found duplicate PV jk1lXsKzwyOKlXq6bhaFFKMQQ06oPgu8: using /dev/loop3 not /dev/loop5 Physical volume '/dev/loop5 (jk1lXs-Kzwy-OKlX-q6bh-aFFK-MQQ0-6oPgu8)' listed more than once. Unable to add physical volume '/dev/loop5' to volume group 'vgtest2'. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-04-08 15:18:35 +00:00
Dave Wysochanski	9e82787da2	Add add_pvl_to_vgs() - helper function to add a pv to a vg list. Small refactor of main places in the code where a pv is added to a vg into a small function which adds the pv to the list and updates the vg counts. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-04-06 14:04:54 +00:00
Dave Wysochanski	53ad3cad14	Add pv to vg->pvs after check for maximum value of vg->extent_count. In add_pv_to_vg(), we should only add the pv to vg->pvs after all internal checks have passed. The check for vg->extent_count exceeding maximum was after we added the pv to the list, so this function could return a state of vg->pvs that did not reflect other parameters such as vg->pv_count. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-04-06 14:03:43 +00:00
Alasdair Kergon	0c67893ce9	Avoid endless loop if lv->segments list is corrupted	2010-04-01 13:08:06 +00:00
Alasdair Kergon	a1192f17ba	Improve vg_validate to detect some loops in lists.	2010-04-01 11:45:36 +00:00
Alasdair Kergon	0640232acd	Improve vg_validate to detect some loops in lists.	2010-04-01 11:43:24 +00:00
Milan Broz	80b96a8974	Optimise PV segments search. The function find_peg_by_pe is incredibly inefficient for PVs with many segments. In shiny future there should be binary (or interval) tree instead of sorted linked list (volunteers?). Anyway, for now, we can use dirty trick here to optimise this case: - Allocations are usually applied from the beginning of PV (we have no allocation policy which allocates areas "backwards") - The only user of find_peg_by_pe is pv_split_segment() call. In most cases it needs to split last PV segment. So if we search sorted pv segment list backwards, we hit the requested segment immediately. This patch applies this tiny change. (and saves >30% of processing time when >3000LVs segments are on one PV!) To discourage using this inefficient function from other code, it is moved to pv_manip.c and used static for now:-)	2010-03-31 17:21:40 +00:00
Mikulas Patocka	655849fb14	A missing space in the error message. Add missing parentheses to an error message Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2010-03-31 12:06:30 +00:00
Dave Wysochanski	15fdc8d3ee	Avoid scanning all pvs in the system if operating on a device with mdas. When we pv_read() a device that has an orphan vgname, we might need to scan the system to be sure this is true. However, if the PV has mdas, there's no way possible for it to have an orphan vgname unless it is a true orphan. Some areas of the code were optimized to take advantage of this fact, while others were not (we would still do the expensive scan if a device had mdas but had an orphan VG). This patch unifies the code so that every place we are operating on such a PV, we skip the expensive scan if there are mdas. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Acked-by: Petr Rockai <prockai@redhat.com> Acked-by: Alasdair G Kergon <agk@redhat.com>	2010-03-18 17:29:12 +00:00
Milan Broz	acb4b5e4de	Fix pvcreate device check. If the user tries to vgcreate or vgextend a non-existent VG, these messages appear: # vgcreate xxx /dev/xxx Internal error: Volume Group xxx was not unlocked Device /dev/xxx not found (or ignored by filtering). Unable to add physical volume '/dev/xxx' to volume group 'xxx'. Internal error: Attempt to unlock unlocked VG xxx. (the same with existing VG and non-existing PV & vgextend) # vgextend vg_test /dev/xxx ... It is caused because code tries to "refresh" cache if md filter is switched on using cache destroy. But we can change filters and rescan even without this machinery now, just use refresh_filters (and reset md filter afterwards). (Patch also discovers cache alias bug in vgsplit test, fix it by using better filter line.)	2010-03-17 14:44:18 +00:00
Alasdair Kergon	b1f9a2f5d1	Only do one full device scan during each read of text format metadata.	2010-03-16 17:30:00 +00:00
Alasdair Kergon	38220f9fe9	Remove unnecessary full_scan parameter from get_vgids and get_vgnames calls.	2010-03-16 16:57:03 +00:00
Alasdair Kergon	cccae7e633	Look up missing PVs by uuid not dev_name in _pvs_single to avoid invalid stat. Make find_pv_in_vg_by_uuid() return same type as related functions.	2010-03-16 15:30:48 +00:00
Alasdair Kergon	770dc81b8e	Introduce is_missing_pv().	2010-03-16 14:37:38 +00:00
Mike Snitzer	c485fe183e	Handle a misaligned device that reports a -1 alignment_offset. The kernel's blk_stack_limits() function may flag a device as 'misaligned'. If it does the alignment_offset will be -1. Update set_pe_align_offset() to accommodate this corner case.	2010-03-02 21:56:14 +00:00
Dave Wysochanski	3c23ff0f2e	Add dm_pool_strdup to allocate memory and copy a tag in {lv|vg}_change_tag() We need to allocate memory for the tag and copy the tag value before we add it to the list of tags. We could put this inside lvm2app since the tools keep their memory around until vg_write/vg_commit is called, but we put it inside the internal library to minimize code in lvm2app. We need to copy the tag passed in by the caller to ensure the lifetime of the memory until the {vg|lv} handle is released. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-02-24 18:15:57 +00:00
Dave Wysochanski	cd69ee7453	Refactor lvchange_tag() to call lv_change_tag() library function. Similar refactoring to vgchange - pull out common parts and put into library function for reuse. Should be no functional change. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-02-24 18:15:49 +00:00
Dave Wysochanski	e17bcc7432	Refactor _vgchange_tag() to vg_change_tag() library function. Pull out common code to be called from tools as well as lvm2app. Leave archive() at tool level so we can use from vgcreate as well as vgchange. Should be no functional change. - add stack macro in vgchange Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-02-24 18:15:05 +00:00
Dave Wysochanski	629efc6a89	Export lvm_pv_get_size(), lvm_pv_get_free(), lvm_pv_get_dev_size in lvm2app. We add these exports to show the pv_size and pv_free and dev_size fields. Fixes rhbz561423. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-02-14 03:21:37 +00:00
Peter Rajnoha	04fa77c3be	This is related to liblvm and its lvm_list_vg_names() and lvm_list_vg_uuids() functions where we should not expose internal VG names/uuids (the ones with "#" prefix )through the interface. Otherwise, we could end up with library users opening internal VGs which will initiate locking mechanism that won't be cleaned up properly. "#orphans_{lvm1, lvm2, pool}" names are treated in a special way, they are truncated first to "orphans" and this is used as a part of the lock name then (e.g. while calling lvm_vg_open()). When library user calls lvm_vg_close(), the original name "orphans_{lvm1, lvm2, pool}" is used directly and therefore no unlock occurs. We should exclude internal VG names and uuids in the lists provided by lvmcache: lvmcache_get_vgids() and lvmcache_get_vgnames().	2010-02-03 14:08:39 +00:00
Dave Wysochanski	a7ca101517	Call _alloc_pv() inside _pv_read() and clean up error paths. We should be consistent with pv constructors so call _alloc_pv() here as we do from pv_create().	2010-01-21 21:09:23 +00:00
Dave Wysochanski	1d749d01fb	Remove useless memory allocation for pv->vg_name in _alloc_pv(). All this seems to do is provide a memory leak so remove it. The only caller of _alloc_pv() later explicitly sets pv->vg_name = fmt->orphan_vg_name so clearly this allocation should be removed. I also saw nowhere in the code where strncpy was used to assign pv->vg_name - only direct assignments and strdup's.	2010-01-21 21:04:44 +00:00
Dave Wysochanski	2b1446c7d6	Correct 'void *' usage in pvcreate_single. Remove needless cast.	2010-01-21 21:04:20 +00:00
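
The commit-ordering argument in 0f2f8a5c3a (commit ignored mdas first) can be checked with a small model. The sketch below is only an illustration in Python, not LVM2 code: the `Mda` class and the `commit_order`, `is_consistent`, and `inconsistent_steps` helpers are hypothetical names. It assumes, as the commit message describes, that only non-ignored mdas are read, so the on-disk state counts as consistent when all non-ignored mdas carry the same metadata version.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mda:
    """Hypothetical stand-in for one metadata area on a PV."""
    pv: str
    ignored: bool  # state the mda will have after the v2 commit


def commit_order(mdas):
    """Return mdas with to-be-ignored ones first (stable sort keeps the rest in order)."""
    return sorted(mdas, key=lambda m: not m.ignored)


def is_consistent(on_disk):
    """on_disk maps pv -> (version, ignored); ignored mdas are skipped when reading."""
    versions = {version for version, ignored in on_disk.values() if not ignored}
    return len(versions) <= 1


def inconsistent_steps(mdas, order):
    """Count how many per-mda commits leave the disk in an inconsistent state."""
    # Initial state from the commit message: every PV holds v1, non-ignored.
    on_disk = {m.pv: (1, False) for m in mdas}
    count = 0
    for m in order:
        on_disk[m.pv] = (2, m.ignored)  # commit v2 to this PV's mda
        if not is_consistent(on_disk):
            count += 1
    return count


# The four-PV example from the commit message: PV1 and PV2 become ignored.
mdas = [Mda("PV0", False), Mda("PV1", True), Mda("PV2", True), Mda("PV3", False)]
unsorted_steps = inconsistent_steps(mdas, mdas)
sorted_steps = inconsistent_steps(mdas, commit_order(mdas))
```

In this model the unsorted order (PV0, PV1, PV2, PV3) is inconsistent after three of the four commits, while the sorted order (PV1, PV2, PV0, PV3) is inconsistent only after the commit of PV0, the first non-ignored mda written.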


459 Commits