shaba/lvm2 - lvm2 - Gitea: Git with a cup of tea

shaba/lvm2

mirror of git://sourceware.org/git/lvm2.git synced 2024-12-22 17:35:59 +03:00

Author	SHA1	Message	Date
Alasdair Kergon	d8886386bd	more mda ignore cleanups	2010-06-30 19:28:35 +00:00
Dave Wysochanski	40b4d1c3ae	Refactor vg_remove_check to place pv removal into separate function.	2010-06-30 18:03:52 +00:00
Alasdair Kergon	23177eda88	more metadataignore message/code cleanup	2010-06-30 17:13:05 +00:00
Alasdair Kergon	efe75fd705	revert that	2010-06-30 14:54:29 +00:00
Alasdair Kergon	a6c4427188	suppress useless compiler warning	2010-06-30 14:52:29 +00:00
Dave Wysochanski	ef7b409966	Only attempt to guarantee 1 mda ignored if there's at least one mda in the vg.	2010-06-30 14:48:07 +00:00
Alasdair Kergon	67b91d0848	Only attempt to guarantee 1 mda ignored if there's at least one mda in the vg.	2010-06-30 14:27:40 +00:00
Alasdair Kergon	647c64c796	Improve various log messages.	2010-06-30 13:51:11 +00:00
Dave Wysochanski	a5bf70018b	Add --metadataignore to pvcreate. Allow metadataignore flag to be passed in to pvcreate. Ideally, more refactoring of the mda allocation / initialization is warranted, but for now, we just add another parameter to 'add_mda' to take an existing mda ignored flag. We need to do this or pv_write loses the state of the mda 'ignored' flag before copying and writing to disk.	2010-06-30 12:17:24 +00:00
Dave Wysochanski	6af5155529	Improve logging for setting --vgmetadatacopies. Example of logging: metadata/metadata.c:1127 Setting mda_copies = 3 on vg vgtest metadata/pv_manip.c:296 /dev/loop2 0: 0 25: NULL(0:0) metadata/pv_manip.c:296 /dev/loop3 0: 0 25: NULL(0:0) metadata/pv_manip.c:296 /dev/loop4 0: 0 25: NULL(0:0) metadata/metadata.c:1072 Adjusting ignored mdas on vg vgtest, vg_mda_used_count=5, vg_mda_copies=3 metadata/metadata.c:1015 Setting ignore flag for 2 mdas on vg vgtest metadata/metadata.c:4151 Setting mda ignored flag for metadata_locn /dev/loop2. metadata/metadata.c:4151 Setting mda ignored flag for metadata_locn /dev/loop3.	2010-06-29 22:41:28 +00:00
Dave Wysochanski	d37dd5b2d3	Improve logging for metadata ignore by printing device name. Print device name when setting or clearing metadata ignore bit. Example: label/label.c:160 /dev/loop2: lvm2 label detected cache/lvmcache.c:1136 lvmcache: /dev/loop2: now in VG #orphans_lvm2 (#orphans_lvm2) metadata/metadata.c:4142 Setting mda ignored flag for metadata_locn /dev/loop2. format_text/text_label.c:318 Skipping mda with ignored flag on device /dev/loop2 at offset 4096	2010-06-29 22:37:32 +00:00
Dave Wysochanski	710c9373bf	Add some log_verbose debug statements related to metadataignore. Logging isn't ideal, especially for mda_set_ignore. Ideally we'd like to display the device name and offset in this case but this requires a bit more work and a per-format 'mda_description' function pointer definition (we don't have access to mda_context in metadata.c).	2010-06-29 22:25:58 +00:00
Dave Wysochanski	a375ced300	Move code into pv_change_metadataignore library function. In preparation to call this from both pvcreate as well as pvchange, move the guts of metadataignore into a library function. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-29 21:32:44 +00:00
Dave Wysochanski	a9d8bf269a	Allow 'all' and 'unmanaged' values for --vgmetadatacopies. Allowing an 'all' and 'unmanaged' value is more intuitive, and provides a simple way for users to get back to original LVM behavior of metadata written to all PVs in the volume group. If the user requests "--vgmetadatacopies unmanaged", this instructs LVM not to manage the ignore bits to achieve a specific number of metadata copies in the volume group. The user is free to use "pvchange --metadataignore" to control the mdas on a per-PV basis. If the user requests "--vgmetadatacopies all", this instructs LVM to do 2 things: 1) clear all ignore bits, and 2) set the "unmanaged" policy going forward. Internally, we use the special MAX_UINT32 value to indicate 'all'. This 'just' works since it's the largest value possible for the field and so all 'ignore' bits on all mdas in the VG will get cleared inside _vg_metadata_balance(). However, after we've called the _vg_metadata_balance function, we check for the special 'all' value, and if set, we write the "unmanaged" value into the metadata. As such, the 'all' value is never written to disk. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:40:01 +00:00
Dave Wysochanski	a09a8efb66	Update check in vg_split_mdas to account for ignored mdas list. The check in vg_split_mdas will trigger an error if the 'from' vg list is empty. However, this might be ok in some instances now that we have ignored mdas. Relax this check so an error is triggered only in the case where there's truly no more mdas in the 'from' vg. One example of where this makes a difference is with vgreduce. If we try to vgreduce a PV with un-ignored mdas, this should trigger the balancing function to un-ignore mdas on another PV in the VG. However, we don't get to vg_write() before we fail because this list size check fails, and we see an error message indicating: "Cannot remove final metadata area ..." Another example is with vgsplit into a new VG, where the PVs being moved contain all ignored mdas. We must move the mdas on fid->metadata_areas_ignored from 'vg_from' to 'vg_to'. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:38:56 +00:00
Dave Wysochanski	f61cd7b249	Ensure fid mda lists are populated correctly during vgextend. The vgextend path calls add_pv_to_vg(). Inside add_pv_to_vg(), we must ensure we pass the correct mdas list into pv_setup(), as copies of mdas are placed on the vg->fid list. If we don't place the mdas on the correct vg->fid list, the various counts may be incorrect and the metadata balance algorithm will not work when called from vg_write() path. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:38:39 +00:00
Dave Wysochanski	1b54343328	Implement _vg_adjust_ignored_mdas and call from vg_write() path. Compare the value of the newly added vg_mda_copies field (--vgmetadatacopies parameter) with the current count of in-use mdas and ignoring or unignoring mdas as necessary to get to the target count. Also, as a safety check before returning, ensure we have at least one mda enabled. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:37:54 +00:00
Dave Wysochanski	821f0cc5ea	Add vg get/set methods for VG metadata copies. This patch adds the get and partially implemented set function. The 'set' function should probably ignore or un-ignore metadata areas based on new values. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:36:56 +00:00
Dave Wysochanski	88d7dc1af8	Add mda_copies to VG structures and initialization. Add a field to struct volume_group to later implement metadata balancing: - mda_copies: target # of non-ignored mdas in the VG; default 0 (do not control pv 'ignore mdas' bit. This patch just adds the parameter to the structures with the default values but does not modify any commands. Should be no functional change. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:36:37 +00:00
Dave Wysochanski	0f2f8a5c3a	Before committing each mda, arrange mdas so ignored mdas get committed first. Arrange mdas so mdas that are to be ignored come first. This is an optimization that ensures consistency on disk for the longest period of time. This was noted by agk in review of the v4 patchset of pvchange-based mda balance. Note the following example for an explanation of the background: Assume the initial state on disk is as follows: PV0 (v1, non-ignored) PV1 (v1, non-ignored) PV2 (v1, non-ignored) PV3 (v1, non-ignored) If we did not sort the list, we would have a commit sequence something like this: PV0 (v2, non-ignored) PV1 (v2, ignored) PV2 (v2, ignored) PV3 (v2, non-ignored) After the commit of PV0's mdas, we'd have an on-disk state like this: PV0 (v2, non-ignored) PV1 (v1, non-ignored) PV2 (v1, non-ignored) PV3 (v1, non-ignored) This is an inconsistent state of the disk. If the machine fails, the next time it was brought back up, the auto-correct mechanism in vg_read would update the metadata on PV1-PV3. However, if possible we try to avoid inconsistent on-disk states. Clearly, because we did not sort, we have a greater chance of on-disk inconsistency - from the time the commit of PV0 is complete until the time PV3 is complete. We could improve the amount of time the on-disk state is consistent by simply sorting the commit order as follows: PV1 (v2, ignored) PV2 (v2, ignored) PV0 (v2, non-ignored) PV3 (v2, non-ignored) Thus, after the first PV is committed (in this case PV1), on-disk we would have: PV0 (v1, non-ignored) PV1 (v2, ignored) PV2 (v1, non-ignored) PV3 (v1, non-ignored) This is clearly a consistent state. PV1 will be read but the mda will be ignored. All other PVs contain v1 metadata, and no auto-correct will be required. In fact, if we commit all PVs with ignored mdas first, we'll only have an inconsistent state when we start writing non-ignored PVs, and thus the chances we'll get an inconsistent state on disk is much less with the sorted method. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:35:49 +00:00
Dave Wysochanski	77e0ed4be7	Refactor vg_commit() to add _vg_commit_mdas(). Factor out calling mda->ops->vg_commit() for each mda. No functional change. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:35:33 +00:00
Dave Wysochanski	69d1732334	Update _vg_read and _text_create_text_instance to use fid_add_mda[s]. When we are constructing the vg, we may need to adjust the list of metadata_areas if there are ignored mdas. At label read time, we do not read the metadata of ignored mdas, and as a result, they do not get placed on vg->fid->metadata_areas inside _text_create_text_instance since lvmcache does not have these areas attached to vginfo->infos. However, when we're checking the pvids inside _vg_read, after having read another metadata area from another PV, we do have the opportunity to update the metadata_area and metadata_areas_ignored lists based on the read metadata_area. We need accurate mda lists for the reporting functions that count the ignored mdas, as well as general correctness of mda balancing. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:35:17 +00:00
Dave Wysochanski	bb723d7897	Use mdas_empty_or_ignored() in place of checks for empty mda list. With the addition of ignored mdas, we replace all checks for an empty mda list with a new function to look for either an empty mda list or ignored mdas. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:34:58 +00:00
Dave Wysochanski	f9c307cd07	Add mdas_empty_or_ignored() helper function. Add a helper function to consolidate checking for an empty mdas list or ignored mdas. Ignored mdas should behave almost identically to an empty mda list - the metadata areas should not be read or written to. This function will make it easier to implement metadata balancing and easier to track pvs with an empty mda list or ignored mdas. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:34:40 +00:00
Dave Wysochanski	cdbe475fe3	Define new functions and vgs/pvs fields related to mda ignore. Define a new pvs field, pv_mda_used_count, and a new vgs field, vg_mda_used_count to match the existing pv_mda_count and vg_mda_count. These new fields count the number of mdas that have the 'ignored' bit clear (they are in use on the PV / VG). Also define various supporting functions to implement the counting as well as setting the ignored flag and determining if an mda is ignored. These high level functions call into the lower level location independent mda ignore functions defined by earlier patches. Note that counting ignored mdas in a vg requires traversing both lists and checking for the ignored bit on the mda. The count of 'ignored' mdas then is defined by having the bit set, not by which list the mda is on. The list does determine whether LVM actually does read/write to the mda, though we must count the bits in order to return accurate numbers for the various counts. Also, pv_mda_set_ignored must search both vg lists for ignored mda. If the state changes and needs to be committed to disk, the ignored mda will be on the non-ignored list. Note also in pv_mda_set_ignored(), we must properly manage the mda lists. If we change the ignored state of an mda, we must change any mdas on vg->fid->metadata_areas that correspond to this pv. Also, we may need to allocate a copy of the mda, as is done when fid->metadata_areas is populated from _vg_read(), if we are un-ignoring an ignored mda. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:33:44 +00:00
Dave Wysochanski	9ccac021a7	Add metadata_areas_ignored list and functions to manage ignored mdas. Add a second mda list, metadata_areas_ignored to fid, and a couple functions, fid_add_mda() and fid_add_mdas() to help manage the lists. These functions are needed to properly count the ignored mdas and manage the lists attached to the 'fid' and ultimately the 'vg'. Ensure metadata_areas_ignored is initialized in other formats, even if the list is never used. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:33:22 +00:00
Dave Wysochanski	f55a20eb36	Rename fid->metadata_areas to fid->metadata_areas_in_use. Rename the metadata_areas list to an 'in_use' list to prepare for future 'ignored' list.	2010-06-28 20:32:44 +00:00
Dave Wysochanski	ef4fa155a5	Add mda location specific mda_copy constructor. Because of the way mdas are handled internally, where a PV in a VG has mdas on both info->mdas and vg->fid->metadata_areas list, we need a location independent copy constructor for struct metadata_area. Break up the existing format-text specific copy constructor into a format independent piece and a format dependent piece. This function is necessary to properly implement pv_set_mda_ignored(). Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Reviewed-by: Alasdair G Kergon <agk@redhat.com>	2010-06-28 20:31:59 +00:00
Dave Wysochanski	29f24d4634	Add mda_locns_match() internal library function for mapping pv/device to VG mda. A metadata_area is defined independent of the location. One downside is that there is no obvious mapping from a pv to an mda. For a PV in a VG, we need a way to start with a PV and end up with an MDA, if we are to manage mdas starting with a device/pv. This function provides us a way to go down the list of PVs on a VG, and identify which ones match a particular PV. I'm not entirely happy with this approach, but it does fit into the existing structures in a reasonable way. An alternative solution might be to refactor the VG - PV interface such that mdas are a list tied to a PV. However, this seemed a bit tricky since a PV does not come into existence until after the list of mdas is constructed (see _vg_read() - we create a 'fid' and attach mdas to it, then we go through them and attach pvs). Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Reviewed-by: Alasdair G Kergon <agk@redhat.com>	2010-06-28 20:31:38 +00:00
Dave Wysochanski	322c5868b3	Add location independent flag and functions to ignore mdas. First we add a 'flags' field to the location independent metadata_area structure, and a MDA_IGNORE flag. The mda_is_ignored and mda_set_ignored functions are added to manage the flag. Adding the flag and functions gives a library interface to ignore metadata areas independent of the underlying location (disk, file, etc). The location specific read/write functions must then handle the specifics of what this flag means to the location. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Reviewed-by: Alasdair G Kergon <agk@redhat.com>	2010-06-28 20:30:14 +00:00
Jonathan Earl Brassow	68c31a2a36	Fix for bz608048 from Taka... The same region size is used for both mirror volume and mirrored log volume, but when the physical extent size is bigger than region size, the size of mirror leg for mirrored log is smaller than the region size and lvcreate command fails. This patch adjusts a region size of mirrored log to a smaller value of region size or physical extent size. [This patch ensures that the region_size of the mirrored log does not exceed the size of the mirrored log itself, which would violate the kernel constraint: (region_size <= ti->len).] Signed-off-by: Takahiro Yasui <takahiro.yasui@hds.com> Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>	2010-06-28 14:19:41 +00:00
Jonathan Earl Brassow	42f7fd0590	The function that runs to compress a stacked mirror after converting from 2-way to 3-way mirror (collapse_mirrored_lv) was calling '_remove_mirror_images' with the 'remove_log' parameter set. When the code was put in to fix 599898 to honor log parameters during conversion, this argument was suddenly being honored. Thus, when someone would convert from a 2-way to 3-way mirror, the log would get removed. 'collapse_mirrored_lv' should not be calling '_remove_mirror_images' with 'remove_log' set.	2010-06-23 13:57:26 +00:00
Milan Broz	f9e177d281	Fix "allocated" warning typo.	2010-06-22 21:10:53 +00:00
Jonathan Earl Brassow	a7d355a28c	Mirrors can be layered - as in the case of an converting 2-way to 3-way mirror. When conversion operations are performed on these types of mirrors, log options can be confused/ignored. In the case of a converting 3-way mirror, we have a top-level 2-way corelog mirror whose legs are 1) a 2-way disk-log mirror and 2) a linear device. If we wish to convert this 3-way mirror to a 2-way mirror, the linear device is removed and the extra top layer is eliminated. If we also wished to convert the disk log to a core log in the same step, ambiguity creeps in. It is somewhat obvious what the user wants - a 2-way mirror with a corelog. However, looking at the top level mirror before compression, it seems that the mirror already has a core log. This is why the operation seemed to fail. This patch simply re-evaluates what mirrored_seg points to after a compression and then considers the log argument. This is a fix for bug 599898.	2010-06-21 16:12:33 +00:00
Petr Rockai	d345bf2cd3	Account for mirror transient status when doing lvconvert --repair.	2010-05-24 15:32:20 +00:00
Zdenek Kabelac	4ef2bf27a7	Update Copyright date for resently modifed files	2010-05-24 09:04:27 +00:00
Zdenek Kabelac	65928349e7	Replicator: add read and release VGs for rsites Add functions to read and release remote VGs for replicator sites in activation context.	2010-05-21 14:07:16 +00:00
Zdenek Kabelac	f6d7e637c3	Add toolcontext.h header file.	2010-05-21 13:34:09 +00:00
Zdenek Kabelac	6222635b38	Replicator: add find_replicator_vgs Adding find_replicator_vgs() function to find all needed VGs for replicator-dev LV. This function is later called before taking lock_vol().	2010-05-21 12:55:25 +00:00
Zdenek Kabelac	12569ccb03	Replicator: add sorted cmd_vg list Introduce struct cmd_vg to store information about needed volume group name, vgid, flags and the pointer to opened VG. Keep VGs list in alphabetical order for locking order. Introduce functions: cmd_vg_add() add new cmd_vg entry. cmd_vg_lookup() search cmd_vgs for vg_name. cmd_vg_read() open VGs in cmd_vgs list. cmd_vg_release() close VGs in reversed order.	2010-05-21 12:52:01 +00:00
Zdenek Kabelac	0a02d30ea4	Replicator: extend volume_group with list of VGs and flag Add pointer to linked list of opened VGs. List temporarily keeps the information about needed or locked and opened VGs for replicator target. Also add cmd_missing_vgs flag information for quick check and also for possible continuos process_each_lv() usage where we need to detect whether failure has been caused by missing VG or some other reason.	2010-05-21 12:47:46 +00:00
Zdenek Kabelac	e86e45f7ea	Replicator: extend _lv_each_dependency() with dependencies for Replicator devices	2010-05-21 12:45:18 +00:00
Zdenek Kabelac	651cae3c5c	Replicator: check replicator segment Check for possible problems within replicator structures. Used also by vg_validate.	2010-05-21 12:43:02 +00:00
Zdenek Kabelac	1207106fbc	Replicator: new files for Replicator target	2010-05-21 12:40:05 +00:00
Zdenek Kabelac	8fea97b7e7	Replicator: base lvm2 support Adding configure.in support for Replicators. Adding basic lib lvm support for Replicators. Adding flags REPLICATOR and REPLICATOR_LOG. Adding segments SEG_REPLICATOR and SEG_REPLICATOR_DEV. Adding basic methods for handling replicator metadata.	2010-05-21 12:36:30 +00:00
Dave Wysochanski	dd2a0e940d	Add find_vgname_from_{pvname\|pvid} functions. Some commands start with a pvname, but we'd like to force users to start with a vg handle to obtain a pv handle. Our best option seems to be providing a way to look up the vgname from the pvname, and then require them to use vg_read/vg_open. In addition to the pvname lookup function, this patch also provides a lookup by pvid. The lookup by pvid can be used in conjunction with lvmcache_get_pvids to process all pvs in the system. The pvid find function first calls lvmcache_vgname_from_pvid, which may cause the label to be read if it is not in the cache. If the vgname is returned is an orphan, we then check to see if there are metadata areas, and if not, we scan every PV on the system by calling scan_vgs_for_pvs(). In most cases we should not need to do this, and by using the info->mdas count, we avoid calling pv_read() as prior code did. So this patch is a bit cleaner and should allow us to refactor more of the pv code. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-05-19 11:52:37 +00:00
Alasdair Kergon	1d837442bf	Add is_global_vg and split out from is_orphan_vg.	2010-05-19 02:36:33 +00:00
Alasdair Kergon	34220fe292	Validate orphan and VG_GLOBAL lock order too.	2010-05-19 02:08:50 +00:00
Alasdair Kergon	fa305e2ec6	Accept orphan VG names as parameters to lock_vol() and related functions.	2010-05-19 01:16:40 +00:00
Jonathan Earl Brassow	a932c2b61f	Disallow toggling the cluster attribute of a volume group if there are active mirrors or snapshots. We don't have the mechanisms in place to change the device-mapper tables for those targets that have behavioral differences between cluster and single machine instances. Allowing users to change the attribute but not changing the target's behavior can lead to data corruption. The following bugs are fixed/avoided by this patch: 235123 - vgchange -c [ny] do not change target types when necessary 289331 - RFE: switching from cluster domain to local domain needs to deactivate volume somehow 289541 - when changing from local to cluster, volumes can not appear to be deactivated	2010-05-14 15:19:42 +00:00

1 2 3 4 5 ...

745 Commits