shaba/lvm2 - lvm2 - Gitea: Git with a cup of tea

shaba/lvm2

mirror of git://sourceware.org/git/lvm2.git synced 2025-01-21 22:04:19 +03:00

Author	SHA1	Message	Date
Jonathan Earl Brassow	d3582e0252	Add the ability to convert linear LVs to RAID1 Example: ~> lvconvert --type raid1 -m 1 vg/lv The following steps are performed to convert linear to RAID1: 1) Allocate a metadata device from the same PV as the linear device to provide the metadata/data LV pair required for all RAID components. 2) Allocate the required number of metadata/data LV pairs for the remaining additional images. 3) Clear the metadata LVs. This performs a LVM metadata update. 4) Create the top-level RAID LV and add the component devices. We want to make any failure easy to unwind. This is why we don't create the top-level LV and add the components until the last step. Should anything happen before that, the user could simply remove the unnecessary images. Also, we want to ensure that the metadata LVs are cleared before forming the array to prevent stale information from polluting the new array. A new macro 'seg_is_linear' was added to allow us to distinguish linear LVs from striped LVs.	2011-10-07 14:52:26 +00:00
Jonathan Earl Brassow	a80192b6a7	Allow 'nosync' extension of mirrors. This patch allows a mirror to be extended without an initial resync of the extended portion. It compliments the existing '--nosync' option to lvcreate. This action can be done implicitly if the mirror was created with the '--nosync' option, or explicitly if the '--nosync' option is used when extending the device. Here are the operational criteria: 1) A mirror created with '--nosync' should extend with 'nosync' implicitly [EXAMPLE]# lvs vg; lvextend -L +5G vg/lv ; lvs vg LV VG Attr LSize Pool Origin Snap% Move Log Copy% Convert lv vg Mwi-a-m- 5.00g lv_mlog 100.00 Extending 2 mirror images. Extending logical volume lv to 10.00 GiB Logical volume lv successfully resized LV VG Attr LSize Pool Origin Snap% Move Log Copy% Convert lv vg Mwi-a-m- 10.00g lv_mlog 100.00 2) The 'M' attribute ('M' signifies a mirror created with '--nosync', while 'm' signifies a mirror created w/o '--nosync') must be preserved when extending a mirror created with '--nosync'. See #1 for example of 'M' attribute. 3) A mirror created without '--nosync' should extend with 'nosync' only when '--nosync' is explicitly used when extending. [EXAMPLE]# lvs vg; lvextend -L +5G vg/lv; lvs vg LV VG Attr LSize Pool Origin Snap% Move Log Copy% Convert lv vg mwi-a-m- 20.00m lv_mlog 100.00 Extending 2 mirror images. Extending logical volume lv to 5.02 GiB Logical volume lv successfully resized LV VG Attr LSize Pool Origin Snap% Move Log Copy% Convert lv vg mwi-a-m- 5.02g lv_mlog 0.39 vs. [EXAMPLE]# lvs vg; lvextend -L +5G vg/lv --nosync; lvs vg LV VG Attr LSize Pool Origin Snap% Move Log Copy% Convert lv vg mwi-a-m- 20.00m lv_mlog 100.00 Extending 2 mirror images. Extending logical volume lv to 5.02 GiB Logical volume lv successfully resized LV VG Attr LSize Pool Origin Snap% Move Log Copy% Convert lv vg Mwi-a-m- 5.02g lv_mlog 100.00 4) The 'm' attribute must change to 'M' when extending a mirror created without '--nosync' is extended with the '--nosync' option. (See #3 examples above.) 5) An inactive mirror's sync percent cannot be determined definitively, so it must not be allowed to skip resync. Instead, the extend should ask the user if they want to extend while performing a resync. [EXAMPLE]# lvchange -an vg/lv [EXAMPLE]# lvextend -L +5G vg/lv Extending 2 mirror images. Extending logical volume lv to 10.00 GiB vg/lv is not active. Unable to get sync percent. Do full resync of extended portion of vg/lv? [y/n]: y Logical volume lv successfully resized 6) A mirror that is performing recovery (as opposed to an initial sync) - like after a failure - is not allowed to extend with either an implicit or explicit nosync option. [You can simulate this with a 'corelog' mirror because when it is reactivated, it must be recovered every time.] [EXAMPLE]# lvcreate -m1 -L 5G -n lv vg --nosync --corelog WARNING: New mirror won't be synchronised. Don't read what you didn't write! Logical volume "lv" created [EXAMPLE]# lvs vg LV VG Attr LSize Pool Origin Snap% Move Log Copy% Convert lv vg Mwi-a-m- 5.00g 100.00 [EXAMPLE]# lvchange -an vg/lv; lvchange -ay vg/lv; lvs vg LV VG Attr LSize Pool Origin Snap% Move Log Copy% Convert lv vg Mwi-a-m- 5.00g 0.08 [EXAMPLE]# lvextend -L +5G vg/lv Extending 2 mirror images. Extending logical volume lv to 10.00 GiB vg/lv cannot be extended while it is recovering. 7) If 'no' is selected in #5 or if the condition in #6 is hit, it should not result in the mirror being resized or the 'm/M' attribute being changed. NOTE: A mirror created with '--nosync' behaves differently than one created without it when performing an extension. The former cannot be extended when the mirror is recovering (unless in-active), while the latter can. This is a reasonable thing to do since recovery of a mirror doesn't take long (at least in the case of an on-disk log) and it would cause far more time in degraded mode if the extension w/o '--nosync' was allowed. It might be reasonable to add the ability to force the operation in the future. This should /not/ force a nosync extension, but rather force a sync'ed extension. IOW, the user would be saying, "Yes, yes... I know recovery won't take long and that I'll be adding significantly to the time spent in degraded mode, but I need the extra space right now!".	2011-10-06 15:32:26 +00:00
Jonathan Earl Brassow	b19f01212e	Fix splitmirror in cluster having different DM/LVM views of storage. This patch also does some clean-up of the splitmirrors code. I've attempted to clean-up the splitmirrors code to make it easier to understand with fewer operations. I've tried to reduce the number of metadata operations without compromising the intermediate stages which are necessary for easy clean-up in the even of failure. These changes now correctly handle cluster situations - including exclusive cluster mirrors. Whereas before, a splitmirror operation would result in remote nodes having LVM commands report the newly split LV with a proper name while DM commands would report the old (pre-split) names of the device. IOW, there was a kernel/userspace mismatch.	2011-10-06 14:55:39 +00:00
Jonathan Earl Brassow	6c0b0e5d9a	Revert initial solution to bug 733114 - I/O error message during splitmirror The original commit comments can be located via this git commit ID: 7d8e615c0b30fc2ef300c90378a51f01c328128c There were three possible solutions to the original problem proposed in the initial check-in. The one chosen was as follows: 2) Do like _remove_mirror_images does and suspend the original, then suspend the sub-lv (the error target), then resume the sub-lv, and finally resume the original LV. This seems like extra pointless operations to me, but it doesn't produce the error message (although, I'm not sure why) and it allows us to leave the visible flag in place. Turns out, the cluster also views the extra suspend/resume operations as pointless too and ignores them. So, this solution doesn't work in a cluster. Further, I've noticed that in addition to the remote cluster nodes still getting I/O errors from scanning the error target, they also have a different LVM and DM views of the same LV. IOW, while the LVM level (gotten from the LVM metadata) sees the correct name for the newly split LV, device-mapper still maintains the old names. Because the original fix failed to completely fix the problem (or work-around it) and because a better solution must be found to address the additional cluster issue of device renaming, I am reverting the above mentioned commit.	2011-10-06 14:49:16 +00:00
Jonathan Earl Brassow	83c606ae30	This patch fixes issues with improper udev flags on sub-LVs. The current code does not always assign proper udev flags to sub-LVs (e.g. mirror images and log LVs). This shows up especially during a splitmirror operation in which an image is split off from a mirror to form a new LV. A mirror with a disk log is actually composed of 4 different LVs: the 2 mirror images, the log, and the top-level LV that "glues" them all together. When a 2-way mirror is split into two linear LVs, two of those LVs must be removed. The segments of the image which is not split off to form the new LV are transferred to the top-level LV. This is done so that the original LV can maintain its major/minor, UUID, and name. The sub-lv from which the segments were transferred gets an error segment as a transitory process before it is eventually removed. (Note that if the error target was not put in place, a resume_lv would result in two LVs pointing to the same segment! If the machine crashes before the eventual removal of the sub-LV, the result would be a residual LV with the same mapping as the original (now linear) LV.) So, the two LVs that need to be removed are now the log device and the sub-LV with the error segment. If udev_flags are not properly set, a resume will cause the error LV to come up and be scanned by udev. This causes I/O errors. Additionally, when udev scans sub-LVs (or former sub-LVs), it can cause races when we are trying to remove those LVs. This is especially bad during failure conditions. When the mirror is suspended, the top-level along with its sub-LVs are suspended. The changes (now 2 linear devices and the yet-to-be-removed log and error LV) are committed. When the resume takes place on the original LV, there are no longer links to the other sub-lvs through the LVM metadata. The links are implicitly handled by querying the kernel for a list of dependencies. This is done in the '_add_dev' function (which is recursively called for each dependency found) - called through the following chain: _add_dev dm_tree_add_dev_with_udev_flags <* DM / LVM divide *> _add_dev_to_dtree _add_lv_to_dtree _create_partial_dtree _tree_action dev_manager_activate _lv_activate_lv _lv_resume lv_resume_if_active When udev flags are calculated by '_get_udev_flags', it is done by referencing the 'logical_volume' structure. Those flags are then passed down into 'dm_tree_add_dev_with_udev_flags', which in turn passes them to '_add_dev'. Unfortunately, when '_add_dev' is finding the dependencies, it has no way to calculate their proper udev_flags. This is because it is below the DM/LVM divide - it doesn't have access to the logical_volume structure. In fact, '_add_dev' simply reuses the udev_flags given for the initial device! This virtually guarentees the udev_flags are wrong for all the dependencies unless they are reset by some other mechanism. The current code provides no such mechanism. Even if '_add_new_lv_to_dtree' were called on the sub-devices - which it isn't - entries already in the tree are simply passed over, failing to reset any udev_flags. The solution must retain its implicit nature of discovering dependencies and be able to go back over the dependencies found to properly set the udev_flags. My solution simply calls a new function before leaving '_add_new_lv_to_dtree' that iterates over the dtree nodes to properly reset the udev_flags of any children. It is important that this function occur after the '_add_dev' has done its job of querying the kernel for a list of dependencies. It is this list of children that we use to look up their respective LVs and properly calculate the udev_flags. This solution has worked for single machine, cluster, and cluster w/ exclusive activation.	2011-10-06 14:45:40 +00:00
Zdenek Kabelac	151ed8d935	Add more validation to config parser Do not leave it for vgvalidate().	2011-10-06 11:06:36 +00:00
Zdenek Kabelac	565a4bfc49	Move defines to header Make limits for thin data_block_size and device_id part of public API. FIXME: read them possible from some kernel header file in the future ? But we may need to support different values for different versions ?	2011-10-06 11:05:56 +00:00
Zdenek Kabelac	c0b9c64a77	Use capital letters	2011-10-04 12:39:59 +00:00
Zdenek Kabelac	01ef6510b0	Missed rename pool->thin_pool Fix compilation	2011-10-03 19:10:52 +00:00
Zdenek Kabelac	04a4715cb8	Add code to activate thin target Code to zero pool metadata lv when pool is created. Add code to create thin target via message sending. (Revert is missing)	2011-10-03 18:43:39 +00:00
Zdenek Kabelac	d35a117e4b	Add simple function for lookup of some free device_id Initial simple implementation for finding some free device_id.	2011-10-03 18:39:17 +00:00
Zdenek Kabelac	a00cb3a6b0	Add lvm functions for sending messages. Functions are currently only needed for thin provissioning.	2011-10-03 18:37:47 +00:00
Zdenek Kabelac	97bde15a9f	Display transaction_id for thin_pool	2011-10-03 18:31:03 +00:00
Zdenek Kabelac	1419bf1c98	Transaction_id is property of thin_pool Remove Transaction_id from thin target. Store device_id for thin target.	2011-10-03 18:26:07 +00:00
Zdenek Kabelac	87663d5f88	Add preload support for thin and thin_pool	2011-10-03 18:24:47 +00:00
Zdenek Kabelac	38796c3d47	Fix bad error message for thinp validation	2011-09-29 09:03:36 +00:00
Zdenek Kabelac	aebf2d5cdc	Add experimental code for activation of thinp targets No dm messages yes - just a base functionality in the steps of other targets. For now usable only for debugging and tracing.	2011-09-29 08:56:38 +00:00
Alasdair Kergon	10d0d9c7c4	Introduce revert_lv for better pvmove cleanup. (One further fix needed to remove the stray pvmove LVs left behind.)	2011-09-27 22:43:40 +00:00
Alasdair Kergon	1c26860d82	Abort if _finish_pvmove suspend_lvs fails instead of cleaning up incompletely. Change suspend_lvs to call vg_revert internally. Change vg_revert to void and remove superfluous calls after failed vg_commit.	2011-09-27 17:09:42 +00:00
Alasdair Kergon	d71fd30e5d	typo	2011-09-27 12:34:14 +00:00
Alasdair Kergon	7c67d33dd4	correct thin_pool width	2011-09-27 12:33:36 +00:00
Zdenek Kabelac	1d526c8585	Show some Thin related info in lvdisplay	2011-09-26 13:11:02 +00:00
Peter Rajnoha	c3e5b4976d	Add log_error even for general device in use when we can't do the sysfs checks.	2011-09-26 10:17:51 +00:00
Zdenek Kabelac	f1ab501a58	Fix log_error() usage Cosmetic - skip <bactrace> when error has been just printed in raid segtype. Add missing log_error if allocation would fail for unknown segtype.	2011-09-24 21:19:30 +00:00
Jonathan Earl Brassow	efa3621a59	Add 'Volume Type' lv_attr characters for RAID and RAID_IMAGE. RAID_META is already handled.	2011-09-23 15:17:54 +00:00
Peter Rajnoha	9fa1d30a1c	Add activation/retry_deactivation to lvm.conf to retry deactivation of an LV.	2011-09-22 17:39:56 +00:00
Peter Rajnoha	125712bea0	Replace open_count check with holders/mounted_fs check on lvremove path. Before, we used to display "Can't remove open logical volume" which was generic. There 3 possibilities of how a device could be opened: - used by another device - having a filesystem on that device which is mounted - opened directly by an application With the help of sysfs info, we can distinguish the first two situations. The third one will be subject to "remove retry" logic - if it's opened quickly (e.g. a parallel scan from within a udev rule run), this will finish quickly and we can remove it once it has finished. If it's a legitimate application that keeps the device opened, we'll do our best to remove the device, but we will fail finally after a few retries.	2011-09-22 17:33:50 +00:00
Jonathan Earl Brassow	40c85cf1d7	When up-converting a RAID1 array, we need to allocate new larger arrays for seg->areas and seg->meta_areas. We also need to copy the memory from the old arrays to the newly allocated arrays. The amount of memory to copy was determined by seg->area_count. However, seg->area_count was being set to the higher value after copying the 'seg->areas' information, but before copying the 'seg->meta_areas' information. This means we were copying more memory than necessary for 'seg->meta_areas' - something that could lead to a segfault.	2011-09-22 15:33:21 +00:00
Zdenek Kabelac	ce840163c0	Revert patch Caller of exec must report log_error when rstatus is passed.	2011-09-19 18:38:43 +00:00
Zdenek Kabelac	4eeff46bf2	Use log_error instead of log_verbose when executed command fails	2011-09-19 14:54:23 +00:00
Jonathan Earl Brassow	4026cb6fd1	fix compiler warning. Compiler says variable may be used uninitialized. It can't be, but we initialize the variable to NULL anyway. Also, remove the double initialization of another variable.	2011-09-19 14:28:23 +00:00
Zdenek Kabelac	5f3f06db66	Move debug message so it does not look like we are executing command in the middle of critical_section in log trace.	2011-09-19 12:48:02 +00:00
Jonathan Earl Brassow	eb607100ef	Fix Bug 738832 - core to disk log conversion fails with internal error This bug showed up when trying to add a log to a mirror whose images are on multiple devices. This is an intra-release regression and no WHATS_NEW entry will be added. The error was introduce in the following commit: 2d8a2f35c77fdeef1dbe0ef791db8530d07826eb The solution is to recognise in _alloc_init that if there are no mirrors or stripes specified, then 'new_extents' should be zero.	2011-09-16 18:39:03 +00:00
Jonathan Earl Brassow	a514067448	After suspend/resume following a splitmirror op, call sync_local_dev_names to settle udev before calling deactivate_lv. This is an intra-release regression (no WHATS_NEW entry required). It is part of the fix for the current WHATS_NEW entry: Work around resume_lv causing error LV scanning during splitmirror operation.	2011-09-16 16:41:37 +00:00
Zdenek Kabelac	a6d50bef2f	Remove thin volumes before thin pools When user wants to remove thin pool - check if there are no thin volumes using it. If so - query before removal (or -ff for no question) and remove them first.	2011-09-16 12:12:51 +00:00
Zdenek Kabelac	4a0c6df8df	Reset LV status when unlinking LV from VG When LV is unlinked, we want to catch problem in vg_validate, that LV has changed. i.e. catch LV has been removed and is no long thin_pool while still being referenced by some thin volume.	2011-09-16 11:59:22 +00:00
Zdenek Kabelac	94147f3f29	Trim spaces on EOL	2011-09-16 11:53:14 +00:00
Petr Rockai	fd7d4adc57	Fix the divisibility check in the allocator for the mirror+stripe case (require divisibility by stripe count alone, not by (mirror*stripe)).	2011-09-16 09:59:42 +00:00
Milan Broz	c81a322337	Activate virtual snapshot origin exclusively (only on local node in cluster).	2011-09-14 14:20:16 +00:00
Zdenek Kabelac	e24be2abe4	Add suggest parentheses around '&&' Follow gcc suggestion.	2011-09-14 10:03:15 +00:00
Zdenek Kabelac	886d005616	LVM_WRITE and LVM_READ are 64bit constants Revert John patch, which fixed only 1 place where ~LVM_WRITE was in use and convert ommited LVM_READ/WRITE flags to 64bit constants as well. (Since both 'status' flags for LV and VG are 64bit.)	2011-09-14 09:57:35 +00:00
Zdenek Kabelac	3e25de05a9	Add missing underscores to local static functions	2011-09-14 09:54:21 +00:00
Jonathan Earl Brassow	462579d54e	Additional fixes for lv_mirror_count. Changing lv_mirror_count to only count the AREA_LVs made the function stop working for PVMOVE mirrors. A conditional has been added to fix that problem. Additionally, when counting the images in a mirror stack, we don't need to subtract 1 from the count we get back from the lv_mirror_count call on the temporary mirror layer. (This is because we are no falsely counting the top layer of the temporary mirror.)	2011-09-14 04:10:26 +00:00
Jonathan Earl Brassow	9cb27929e9	Fix for bug 734252 - problem up converting striped mirror after image failure lv_mirror_count was not able to handle mirrors of stripes properly. When a failed device is removed, the MIRRORED status flag is removed from the LV conditionally based on the results of lv_mirror_count. However, lv_mirror_count trusted the MIRRORED flag - thinking any such LV must be mirrored. It would happily assign first_seg(lv)->area_count as the number of mirrors, but when a mirrored striped LV was reduced to a simple striped LV area_count would be the number of /stripes/ not the number of /mirrors/. A result higher than 1 would be returned from lv_mirror_count, the MIRRORED flag would not be cleared, and the LV would fail to be up-converted properly in lvconvert_mirrors_aux because of it.	2011-09-14 02:45:36 +00:00
Jonathan Earl Brassow	46f0efbfce	Fix bug 733400 - Mirror down conversion when specifying the secondary leg is broke The operation of deactivating the residual error target LV after removing a mirror layer can cause a "device in-use" conflict with udev. Giving udev a poke before calling deactivate_lv eliminates the conflict. The stick used to poke udev is 'sync_local_dev_names'.	2011-09-13 21:13:33 +00:00
Jonathan Earl Brassow	c94c47abd7	Fix for bug 737200 - Can't create mirrored-log mirror on a VG with small extents Kernel requires a mirror to be at least 1 region large. So, if our mirror log is itself a mirror, it must be at least 1 region large. This restriction may not be necessary for non-mirrored logs, but we apply the rule anyway. (The other option is to make the region size of the log mirror smaller than the mirror it is acting as a log for, but that really complicates things. It's much easier to keep the region_size the same for both.)	2011-09-13 18:42:57 +00:00
Jonathan Earl Brassow	f5e43f061a	Better fix for bug 737125 - unable to create mirror on 1K extent size VG WHATS_NEW entry: Fix log size calculation when only a log is being added to a mirror. The original fix pass the mirror LV to allocate_extents (rather than passing NULL) so that _alloc_init could correctly determine the necessary size of the mirror log. In the previous check-in, I noted: In order to get a decent value computed, we need to pass in the 'lv' argument to allocate_extents. This would normally imply a desire for cling/contiguous allocation to the given LV, but since we are not allocating any parallel extents and only log extents, it works fine. However, passing in the LV did have unintended consequences on the placement of the log. The better solution is to pass in the number of extext that are in the mirror LV instead of the LV itself. This will not cause the allocator to reserve that number of extents, because 'stripes' and 'mirrors' are specified as 0. Thus, 'extents' is used to calculate the size of the log, but won't affect how much is allocated.	2011-09-13 18:11:38 +00:00
Jonathan Earl Brassow	0c89ef513a	Changing RAID status flags to 64-bit broke some binary flag operations. LVM_WRITE is a 32-bit flag. Now that RAID[_IMAGE\|_META] are 64-bit, and'ing a RAID LV's status against LVM_WRITE can reset the higher order flags. A similar thing will affect thinp flags if not careful.	2011-09-13 16:33:21 +00:00
Jonathan Earl Brassow	cc9dc919e6	Fix for bug 737125 - unable to create mirror on 1K extent size VG _alloc_init calculates the number of necessary log extents via 'mirror_log_extents'. 'mirror_log_extents' takes 3 arguments: region_size, pe_size, and size of the mirror LV. Unfortunately, _alloc_init is guessing at the mirror size by using 'ah->new_extents / ah->area_multiple' - the number of extents that the mirror images have. However, this is /always/ wrong when allocating the log separately. Further, the log is always allocated separately unless we are up-converting the mirror at the same time. It was by luck alone that a default value of '1' reflects what we want in most cases. In order to get a decent value computed, we need to pass in the 'lv' argument to allocate_extents. This would normally imply a desire for cling/contiguous allocation to the given LV, but since we are not allocating any parallel extents and only log extents, it works fine.	2011-09-13 14:37:48 +00:00
Jonathan Earl Brassow	6d0aa801a0	Fix for bug 733114. When an image is split from a 2-way mirror, the original mirror is converted to a linear device. To do this, the top "layer" must be removed. The segments are transferred from the sub-lv to the top-level LV and the link is severed. The former sub-lv - having its segments transferred - now contains a temporary error target. When the original LV is resumed, the old sub-lv that now contains an error segment is activated and scanned. This is what causes the I/O error messages. There are three ways to fix this problem: 1) Do not set the sub-lv which contains the error target as "visible" before suspending the original LV. This way, when the original is resumed, the sub-lv device node is not created and it is not scanned - avoiding the error messages. The problem with this approach is that if the machine crashes after the resume, it leaves the hidden LV in place and the user has a more difficult time noticing that it needs to be cleaned up. Thus, this type of processing is frowned upon. 2) Do like _remove_mirror_images does and suspend the original, then suspend the sub-lv (the error target), then resume the sub-lv, and finally resume the original LV. This seems like extra pointless operations to me, but it does not produce the error message (although, I'm not sure why) and it allows us to leave the visible flag in place. 3) Flag the sub-lv (error target) with a "do not scan" flag. This seems like the cleanest approach, but I have been unable to find the method for doing this. LVs get tagged in such a way by _get_udev_flags, but in this case the resume of the original LV also resumes the error target LV without running it through _get_udev_flags (likely because they are no longer linked). Could there be something wrong in resume_lv? Option #2 was chosen to fix this bug, but it seems like more of a workaround for now.	2011-09-13 13:59:19 +00:00

1 2 3 4 5 ...

2525 Commits