shaba/lvm2

mirror of git://sourceware.org/git/lvm2.git synced 2025-01-07 21:18:59 +03:00

Author	SHA1	Message	Date
Zdenek Kabelac	efe228a42e	Just reindent with tabs	2012-02-27 09:51:31 +00:00
Zdenek Kabelac	2960604178	Add explicit cast for time() ret value To keep all numbers with same sign	2012-02-23 22:31:23 +00:00
Zdenek Kabelac	219e040062	Drop backtrace after log_error Just a minor change to not give backtrace when log_error has been just reported.	2012-02-23 22:24:47 +00:00
Jonathan Earl Brassow	870762d8e3	Require number of stripes to be greater than parity devices in higher RAID. Also, add some comments to code that I recently added that may be unclear otherwise.	2012-02-23 17:36:35 +00:00
Petr Rockai	dae0822698	The lvmetad client-side integration. Only active when use_lvmetad = 1 is set in lvm.conf and lvmetad is running.	2012-02-23 13:11:07 +00:00
Jonathan Earl Brassow	9bdfb30720	Fix allocation code to allow replacement of single RAID 4/5/6 device. The code failed to account for the case where we just need a single device in a RAID 4/5/6 array. There is no good way to tell the allocation functions that we don't need parity devices when we are allocating just a single device. So, I've used a bit of a hack. If we are allocating an area_count that is <= the parity count, then we can assume we are simply allocating a replacement device (i.e. no need to include parity devices in the calculations). This should make sense in most cases. If we need to allocate replacement devices due to failure (or moving), we will never allocate more than the parity count; or we would cause the array to become unusable. If we are creating a new device, we should always create more stripes than parity devices.	2012-02-23 03:57:23 +00:00
Alasdair Kergon	d860272b00	Check all tags and LV names are in a valid form in vg_validate.	2012-02-23 00:11:01 +00:00
Jonathan Earl Brassow	0e92b70f71	* empty log message *	2012-02-22 17:14:38 +00:00
Zdenek Kabelac	d81498a824	Initialize dmeventd monitoring for every command Read lvm.conf setting for monitoring for each command. So we should not activate monitoring if the default compilation is set to monitor during lvconvert commands. Patch also removes check for clustered VG and allows disabling monitoring for clustered VG with the assumption that the problem with monitoring and dmeventd flag passing for IGNORE is already fixed.	2012-02-15 15:18:43 +00:00
Zdenek Kabelac	73e62cdc11	Add internal error for unsupported code paths Patch mainly helps static analyzers to better work with code paths lvm code should never trigger.	2012-02-13 11:25:56 +00:00
Zdenek Kabelac	cbe6bcd593	Add check for rimage name allocation failure	2012-02-13 11:10:37 +00:00
Zdenek Kabelac	52f2f3eae4	Add free_orphan_vg Move common code to destroy orphan VG into free_orphan_vg() function. Use orphan vgmem for creation of PV lists. Remove some free_pv_fid() calls (FIXME: check all of them) FIXME: Check whether we could merge release_vg back again for all VGs.	2012-02-13 11:03:59 +00:00
Zdenek Kabelac	65079de265	If the fid is already the same, avoid ref_counting	2012-02-13 11:01:34 +00:00
Zdenek Kabelac	960ee343f3	Add missing test for failure of lvmcache_foreach_pv	2012-02-13 10:58:20 +00:00
Zdenek Kabelac	bbf98c19a8	Log error reporting for failing _alloc_pv Drop unneeded zeroing of zalloced memory region.	2012-02-13 10:51:52 +00:00
Alasdair Kergon	b719e3d323	FMT_INSTANCE_VG is redundant now	2012-02-12 23:01:19 +00:00
Alasdair Kergon	ba14fff2af	FMT_INSTANCE_PV is no longer used	2012-02-12 22:37:24 +00:00
Alasdair Kergon	f3f6c17250	use stack consistently if 0 is considered an error	2012-02-12 21:42:43 +00:00
Alasdair Kergon	2a434c479d	missing error mesg	2012-02-12 21:37:03 +00:00
Alasdair Kergon	10670c641b	remove unused bits after fid changes	2012-02-12 20:19:39 +00:00
Petr Rockai	6e41729eb8	Keep a global (per-format) orphan_vg and keep any and all orphan PVs linked to it. Avoids the need for FMT_INSTANCE_PV and enables further simplifications. No functional change, internal refactor only.	2012-02-10 02:53:03 +00:00
Petr Rockai	8e5f7cf3dc	Move lvmcache data structures behind an API (making the structures private to lvmcache.c). No functional change.	2012-02-10 01:28:27 +00:00
Peter Rajnoha	5fa417a9c0	Stop processing lvextend if trying to extend a mirror that is being recovered. Missing correct return value in lv_extend fn.	2012-02-09 15:13:42 +00:00
Zdenek Kabelac	a7e2da0585	Thin add pool_below_threshold Test both data and metadata percent usage.	2012-02-08 13:05:38 +00:00
Zdenek Kabelac	94f88a4f14	Fix test for lv_snapshot_percent Do not check for PERCENT_MERGE_FAILED if the lv_snapshot_percent() failed. (test for snap_percent would be testing uninitialized value).	2012-02-08 13:02:07 +00:00
Zdenek Kabelac	462835faa0	Switch to return void List delete cannot fail, so there is no reason to test for error.	2012-02-08 12:52:58 +00:00
Zdenek Kabelac	d75c5f06f0	Replace snprintf with dm_snprintf snprintf testing for negative is replaced with dm_snprintf where this test really works. Add missing test for result of dm_snprintf().	2012-02-08 11:40:02 +00:00
Alasdair Kergon	b167ca28b0	Adjust comments	2012-02-01 15:05:53 +00:00
Zdenek Kabelac	42b5c54092	Add synchronization point in mirror log init. Put extra sync point when mirror log is deactivated and before it's activated for the second time.	2012-02-01 13:50:36 +00:00
Alasdair Kergon	1368dc905a	lost line	2012-02-01 02:11:43 +00:00
Alasdair Kergon	72abf1d880	Track unreserved space for all alloc policies and then permit NORMAL to place log and data on same single PV.	2012-02-01 02:10:45 +00:00
Zdenek Kabelac	7012268499	Thin for_each_sub_lv Adapt to scan thin dependency LVs	2012-01-26 21:39:32 +00:00
Zdenek Kabelac	42beb34826	Set missing header define	2012-01-25 22:37:48 +00:00
Zdenek Kabelac	9b1fe5a062	Thin clear stacked message for thin pool Before removing thin pool LV always make sure stacked messages from the previous run are cleared - but allow removal of any device that should have been created (i.e. creation of snapshot failed - so the message for snapshot creation may be replaced with delete message within unfinished transaction). Also commit messages after lv remove - so free space is released in pool.	2012-01-25 11:27:42 +00:00
Zdenek Kabelac	89764fd494	Thin skip activation when there are no thin message If the list with thin messages is empty, do not touch thin pool device.	2012-01-25 09:17:15 +00:00
Zdenek Kabelac	c771a70882	Thin correct activation order When the message is passed only in resume path the order needs to be corrected.	2012-01-25 09:15:44 +00:00
Zdenek Kabelac	3dadb176ce	Thin use suspend/resume_lv_origin Use origin_only support for thin volume when thin snapshot is created.	2012-01-25 09:14:25 +00:00
Zdenek Kabelac	2258242f6c	Thin use origin_only for thin pools as well Extend the usage of origin_only flag to allow resume of thin pool LV (when it's active) to pass only the messages. origin_only flag will skip detection of already resumed tree for thin_pool, so we do not need to suspend the tree and we just send messages.	2012-01-25 09:13:10 +00:00
Zdenek Kabelac	1dede50c85	Thin check for lv_thin_pool_percent error status Check has been missing.	2012-01-25 09:02:35 +00:00
Zdenek Kabelac	0926438aad	Thin prevent removal of its data and metadata LVs LVs cannot be removed while they are linked to a thin pool. (Gives a better error message than validation.)	2012-01-25 08:57:25 +00:00
Zdenek Kabelac	d55aa53816	Thin fix transaction_id incrementation and code refactoring Add pool_has_message and use it in attach_pool_message. Also update header to make more obvious which segment type is expected as parameter. Rename 'read_only' to 'no_update' (no auto update transaction_id) to better fit how it's used. Fix problem when there was only one stacked message replaced with delete message that caused unwanted transaction_id increase.	2012-01-25 08:55:19 +00:00
Zdenek Kabelac	c217690f4c	Thin dependency scan support Go through pool_lv and metadata_lv LVs when doing recursive scan.	2012-01-25 08:50:10 +00:00
Alasdair Kergon	3f61871f38	Caller is still entitled to reference an LV that's unlinked, so don't tamper with struct contents.	2012-01-24 14:53:59 +00:00
Jonathan Earl Brassow	6cf3274732	Use suspend|resume_origin_only when up-converting RAID LVs, as mirrors do. Failure to do so results in "Performing unsafe table load while X device(s) are known to be suspended" errors. While fixing the problem in this way works and is consistent with the way the mirror segment type does it, it would be nice to find a solution that uses the generic suspend/resume calls. Also included in this check-in are additions to the test suite that perform conversions on RAID LVs under a snapshot. These tests are disabled for the time being due to a kernel bug that is yet to be tracked down.	2012-01-24 14:33:38 +00:00
Milan Broz	095d95d0a8	Properly show LV removal message. (Fix regression in commit `6e181ba96d`)	2012-01-24 14:15:52 +00:00
Alasdair Kergon	46c67b5279	Use chunk_size consistently for thin_pool within LVM.	2012-01-24 00:55:03 +00:00
Alasdair Kergon	5c9eae9647	Reorder fns in libdm-deptree. Tweak dm_config interface and remove FIXMEs.	2012-01-23 17:46:31 +00:00
Mike Snitzer	fc0f2d5031	Prompt if request is made to remove a snapshot whose "Merge failed".	2012-01-20 22:04:16 +00:00
Mike Snitzer	27e21a4adc	Allow removal of an invalid snapshot that was to be merged on next activation. Don't allow a user to merge an invalid snapshot.	2012-01-20 22:03:48 +00:00
Mike Snitzer	d658922f36	Use m and M lv_attr to indicate that a snapshot merge failed in lvs. snapshot (m)erge failed, suspended snapshot (M)erge failed	2012-01-20 22:03:03 +00:00
Zdenek Kabelac	6515946e4d	Thin cleanup Reorder condition so the code is better readable (and shorter).	2012-01-20 10:56:30 +00:00
Zdenek Kabelac	f881095a69	Drop hack in segtype reporting Since striped name function knows when to report 'linear' instead of 'stripe' type name - drop it from this place. This fixes problem when reporting segtype e.g. for thin-pool which is also using area_count=1 to store thin data device reference. It also returns properly strduped memory instead of badly casted const char*.	2012-01-20 10:55:28 +00:00
Zdenek Kabelac	f82bddb76c	Thin disable snapshot creation when pool is over the threshold. Since snapshot needs to suspend origin - it might lead to pool userspace deadlock (as the pool will wait for new space in case it would be overfilled, but dmeventd would not be able to resize it, as the lvcreate operation would have kept the VG lock.) To minimize the risk of such a scenario - we prevent creation of a new snapshot when we are over the threshold - but beware, there is still a small time window, so keep the threshold at some reasonable level!	2012-01-19 15:39:41 +00:00
Zdenek Kabelac	e58b5dd8e8	Thin add new display field for lvs New field Data% is able to display info about thin_pool, thin, snapshot and has generic meaning here. Simple Time/Host field are here to display host and time creation.	2012-01-19 15:34:32 +00:00
Zdenek Kabelac	53d7985fa1	Add support to keep info about creation time and host for each LV Basic support to keep info when the LV was created. Host and time is stored into LV mda section. FIXME: Current version doesn't support a configurable string via lvm.conf and uses the fixed strftime format "%Y-%m-%d %T %z".	2012-01-19 15:31:45 +00:00
Zdenek Kabelac	d8106dfee2	Thin rename seg var pool_metadata_lv to metadata_lv Better fits the code.	2012-01-19 15:23:50 +00:00
Alasdair Kergon	8f95d94b4f	Show read-only activation in display tools.	2012-01-12 16:58:43 +00:00
Zdenek Kabelac	f582793f1b	Thin rename internal thin pool segment Use matching name as kernel target - useful when function like _percent is using this for validation.	2011-12-21 12:54:19 +00:00
Alasdair Kergon	66e5b7f53c	Reinstate support for format1 snapshots, but issue deprecated warning. I anticipate removing support for snapshots with lvm1-formatted metadata in a future release.	2011-12-20 00:02:18 +00:00
Alasdair Kergon	289ed221d0	update FIXMEs	2011-12-10 00:47:23 +00:00
Jonathan Earl Brassow	9711057499	Don't allow two images to be split and tracked from a RAID LV at one time Also, don't allow a splitmirror operation on a RAID LV that is already tracking a split, unless the operation is to stop the tracking and complete the split. Example: ~> lvconvert --splitmirrors 1 --trackchanges vg/lv /dev/sdc1 # Now tracking changes - image can be merged back or split-off for good ~> lvconvert --splitmirrors 1 -n new_name vg/lv /dev/sdc1 # ^ Completes split ^ If a split is performed on a RAID that is tracking an already split image and PVs are provided, we must ensure that 1) the already split LV is represented in the PVs 2) we are careful to split only the tracked image	2011-12-01 00:21:04 +00:00
Jonathan Earl Brassow	a927e401f1	Do not allow users to change the name of RAID sub-LVs or the name of the RAID LV if it is tracking changes for a split image.	2011-12-01 00:09:34 +00:00
Jonathan Earl Brassow	2ba1e8fccc	The LV_REBUILD flag is not internal - bad comments in metadata-exported.h updated	2011-11-30 02:20:13 +00:00
Jonathan Earl Brassow	0c506d9a40	Support the ability to replace specific devices in a RAID array. RAID is not like traditional LVM mirroring. LVM mirroring required failed devices to be removed or the logical volume would simply hang. RAID arrays can keep on running with failed devices. In fact, for RAID types other than RAID1, removing a device would mean substituting an error target or converting to a lower level RAID (e.g. RAID6 -> RAID5, or RAID4/5 to RAID0). Therefore, rather than removing a failed device unconditionally and potentially allocating a replacement, RAID allows the user to "replace" a device with a new one. This approach is a 1-step solution vs the current 2-step solution. example> lvconvert --replace <dev_to_remove> vg/lv [possible_replacement_PVs] '--replace' can be specified more than once. example> lvconvert --replace /dev/sdb1 --replace /dev/sdc1 vg/lv	2011-11-30 02:02:10 +00:00
Zdenek Kabelac	900f5f8187	Replace dynamic buffer allocations for PATH_MAX Use static buffer instead of stack allocated buffer. This reduces stack size usage of lvm tool and the change is very simple. Since the whole library is not thread safe - it should not add any new problems - and if there will be some conversion it's easy to convert this to use some preallocated buffer.	2011-11-18 19:31:09 +00:00
Zdenek Kabelac	8deeeb07ea	Unlock memory for vg_write For write we do not need to hold memory locked. This relaxes many conditions and avoids problems when allocating a lot of memory for writing metadata buffers. (In case of huge MDA size this would lead to mismatch between locked and unlocked memory region size). Add also internal check we are not writing in critical section.	2011-11-18 19:28:00 +00:00
Zdenek Kabelac	37f274ced9	Query before removing inactive snapshots Removal of an inactive origin removes also all related snapshots. When we now support 'old' external snapshots with thin volumes, removal of pool will not only drop all thin volumes, but as a consequence also all snapshots - which might be seen a bit unexpected for the user - so add a query to confirm such action. lvremove -f will skip the prompt.	2011-11-18 19:25:20 +00:00
Zdenek Kabelac	91e4512619	Adjusted mirror region size only for mirrors and raids Update region_size only for mirror and raid targets. This fixes warning messages when vg is using small extent size like 1KiB and no mirror/raid is created, but the user still got the message: $> vgcreate -s 1K vg <pvs> $> lvcreate -L10K vg Using reduced mirror region size of 4 sectors	2011-11-15 17:32:12 +00:00
Zdenek Kabelac	5f129d15b1	Thin update prompt message Enhance message with info about how many thin volumes are going to be removed with thin pool removal.	2011-11-15 17:29:52 +00:00
Zdenek Kabelac	8542953f74	Reorder AND test condition Take the easiest condition for checking first since they must apply all together, check local conditions first before doing more expensive tests.	2011-11-15 17:27:41 +00:00
Zdenek Kabelac	25de8ca372	Thin supports only thin volumes as snapshot origins It's currently out of the scope to properly solve the snapshotting of internal thin devs so prevent non-toplevel snapshots here.	2011-11-15 17:23:51 +00:00
Zdenek Kabelac	dd0c58c69b	Add missing stack reporting also remove unneeded {}	2011-11-12 22:53:23 +00:00
Zdenek Kabelac	3af072cc63	Thin use items iterator and stack reporting	2011-11-12 22:52:18 +00:00
Zdenek Kabelac	651ef6be82	Missing stack printing	2011-11-12 22:51:20 +00:00
Zdenek Kabelac	6e89eb9a52	Small comment and indent updates	2011-11-10 12:43:05 +00:00
Zdenek Kabelac	f201498f99	Thin test min thin_pool size for at least 1 chunk	2011-11-10 12:42:36 +00:00
Zdenek Kabelac	39fc633957	Thin align volume size on chunk boundary size If the extent_size is smaller than the chunk_size we may try to find better alignment (wasting less space). i.e. using 4KB extent_size and 64KB chunk size will lead to creation of 64KB aligned thin volume.	2011-11-10 12:42:15 +00:00
Zdenek Kabelac	74e53e8bc0	Thin disable pool create without activation	2011-11-10 12:39:01 +00:00
Alasdair Kergon	3da4ed712e	Must not override alloc policy specified by user.	2011-11-07 13:54:54 +00:00
Zdenek Kabelac	65e88e6b3c	Thin add error message for double delete Add few more internal error messages.	2011-11-07 11:04:45 +00:00
Zdenek Kabelac	97d7e5aedb	Thin supports snapshots Full support for thin snapshots. Create and remove is supported. TODO: lvconvert support is not yet available.	2011-11-07 11:03:47 +00:00
Zdenek Kabelac	11721819a7	Thin reindent code Drop indention level Add extra internal error.	2011-11-07 10:59:07 +00:00
Zdenek Kabelac	87371d48cc	Thin revert code for exclusive pool activation There are no limits on thin-pool activation now. Revert code that is no longer needed.	2011-11-07 10:58:13 +00:00
Zdenek Kabelac	4079a8f298	Avoid lvextend to overflow Add extra check to extent_count overflow. Use internal define MAX_EXTENT_COUNT instead UINT32_MAX.	2011-11-04 22:49:53 +00:00
Zdenek Kabelac	83baa0b778	Thin pool allocation simplified Support allocation of metadata from the same PV, if the VG is built only from one PV. As thinp is not mirror - we do not require 2 PVs for basic thin usage as user is losing only performance.	2011-11-04 22:45:52 +00:00
Zdenek Kabelac	bd15208cd7	Thin add thin_pool_metadata_require_separate_pvs Allow to set different policy for pool from mirrors.	2011-11-04 22:44:21 +00:00
Zdenek Kabelac	b8cac455bd	Thin supports poolmetadatasize setting Add option to set pool metadatasize. For passing size parameter reuse region_size.	2011-11-04 22:43:10 +00:00
Alasdair Kergon	13dc67cda7	Add missing lvrename mirrored log recursion in for_each_sub_lv.	2011-11-04 01:31:23 +00:00
Zdenek Kabelac	1cae10a36c	Thin keep pool device in the same state Leave the optimization to be done differently and preserve availability state of the pool device.	2011-11-03 15:58:20 +00:00
Zdenek Kabelac	9aa24bd034	Thin no device is created - so nothing to revert here	2011-11-03 15:46:51 +00:00
Zdenek Kabelac	466a8ebf9d	Thin removing unused detach_pool_messages	2011-11-03 14:57:04 +00:00
Zdenek Kabelac	92384bfd0b	Thin using update_pool_lv Replace detach_pool_messages with update_pool_lv. Move creation code from two 'if' conditions into one. Ensure creation has finished all previous message operations.	2011-11-03 14:56:20 +00:00
Zdenek Kabelac	73b7bf961b	Thin generic update_pool_lv function Function to trigger pool message passing via resume, or resize of the pool itself independently of other thins.	2011-11-03 14:53:58 +00:00
Zdenek Kabelac	dc964ab0d3	Thin uses _tdata instead of _tpool for data LV Switch to different suffix and keep -tpool reserved for overlay device name.	2011-11-03 14:38:36 +00:00
Zdenek Kabelac	1f5c98270d	Thin code cleanup Use iterate_items for list processing.	2011-11-03 14:36:40 +00:00
Zdenek Kabelac	25de9addb6	Thin fix compile warns Test for dm_snprintf < 0. Add header for moved backup.	2011-10-30 22:52:08 +00:00
Zdenek Kabelac	7654abc26f	Thin creation without activation All thins are created with the next activation and VG is updated without messages. Only some basic commands work. (i.e. lvcreate -an -V10 -T mvg/pool) There can be some combination to confuse this system. This functionality for snapshots is going to be interesting.	2011-10-30 22:07:38 +00:00
Zdenek Kabelac	f0df05e1dd	Cleanup unsuccessfully created thin LV If something fails during creation of thin LV remove such LV and deactivate in case it's been already tried to activate (i.e. thin kernel driver fails for some reason.)	2011-10-30 22:02:18 +00:00
Zdenek Kabelac	96279ac1c0	Make detach_pool_message visible for tools Move there also vg_write and vg_commit.	2011-10-30 22:01:39 +00:00
Zdenek Kabelac	f8d46bd256	Thin cleanups Fix/cleanup several error messages. Remove test for seg_is_thin which could never be true there. Replace (1<<24) with predefined constant.	2011-10-30 22:00:57 +00:00
Zdenek Kabelac	0968dfcd03	Thin support for stripe Support stripe options to create thin data pool LV. TODO: combine chunk size and stripe size.	2011-10-28 20:32:54 +00:00
Zdenek Kabelac	daa10ad0fd	Thin pool resize support for data LV Support for extension of pool data LV. TODO: figure out thin volume for suspend/resume in cluster.	2011-10-28 20:31:01 +00:00
Zdenek Kabelac	e5b12b305f	Thin support for lvrename Rename pool's metadata lv _tmeta together with pool and _tdata.	2011-10-28 20:29:32 +00:00
Zdenek Kabelac	a1d5aaf725	Thin pool activation change To ensure we properly handle LV cluster locking - explicitly do not allow to change the availability of the thin pool that is in use for some thin LV. As soon as the thin volume is created the only way to activate pool is via implicit dependency. Ignore thinpool open count for lv/vgchange operations.	2011-10-28 20:28:00 +00:00
Zdenek Kabelac	2b71bcd0cb	Improve lv_extend stack reporting and some code cleanup with setting return value.	2011-10-28 20:23:24 +00:00
Zdenek Kabelac	c590a9cdbc	Thin error messages cleanup and some indent	2011-10-28 20:19:26 +00:00
Zdenek Kabelac	dd3bb2bac3	Remove thin code from mirror/raid lv_extend	2011-10-28 20:18:32 +00:00
Zdenek Kabelac	2fa836e843	Extend virtual segment instead of adding new one Before adding a new virtual segment to LV, check first whether the last segment isn't already of the same type. In this case extend last segment instead of creating the new one. Thin volumes should have always only 1 virtual segment, but it helps also to virtual snapshot or error segtype.	2011-10-28 20:17:55 +00:00
Zdenek Kabelac	bd4b840879	Add last_seg Implement a function to return the last segment in a LV. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2011-10-28 20:12:54 +00:00
Jonathan Earl Brassow	682309e0b8	Disallow 'mirrored' log for cluster mirrors. Git commit ID `0864378250` was meant to disallow 'mirrored' logs for cluster mirrors. However, when add_mirror_log is used to create the log (as is now the case when using 'lvcreate' or converting only the log) the check is bypassed. This patch adds the check to add_mirror_log.	2011-10-25 13:17:04 +00:00
Zdenek Kabelac	eafbdf3029	Don't print char type[8] as a plain string pvck prints 'extra' character from the label since there is no '\0' after the struct label entry and just uint64_t follows directly. So avoid it by limiting 8 chars to be printed. https://www.redhat.com/archives/lvm-devel/2011-January/msg00109.html Signed-off-by: Paul Bolle <pebolle tiscali nl>	2011-10-24 10:24:39 +00:00
Zdenek Kabelac	72ff89d279	Always use vg memory pool for allocated lv segment Remove mem pool parameter from alloc_lv_segment() Since we should always allocate LV segment from the vg mempool.	2011-10-23 16:02:01 +00:00
Zdenek Kabelac	aef13649ea	Remove old thin code from _lv_insert_empty_sublvs Since thin is not able to use _lv_insert_empty_sublvs, remove its appearance from this function. Start to use extend_pool() function for desired functionality and modify lv_extend() for this.	2011-10-22 16:48:59 +00:00
Zdenek Kabelac	dc225f58a9	Remove extra empty check dm_list_splice handles empty list itself, no need to duplicate code.	2011-10-22 16:46:34 +00:00
Zdenek Kabelac	f4c77bd0e3	Recoded way to insert thin pool into vg Code in _lv_insert_empty_sublvs was not able to provide proper initialization order for thin pool LV. New function extend_pool() first adds metadata segment to pool LV which is still visible. Such LV is activated and cleared. Then new meta LV is created and metadata segments are moved there. Now the preallocated pool data segment is attached to the pool LV and layer _tpool is created. Finally segment is marked as thin_pool.	2011-10-22 16:44:23 +00:00
Zdenek Kabelac	06b8248d63	Make move_lv_segment non-static This function could be useful for other _manip source files. Use dm_list manipulation function for provided functionality, which make the code more readable and avoid touching list internal details here.	2011-10-22 16:42:10 +00:00
Zdenek Kabelac	f0c9160df4	Store transaction_id with created thin lv So we know the creation history and this should be useful with vgcfgrestore.	2011-10-21 11:38:35 +00:00
Zdenek Kabelac	4d925f5785	Remove double-hack for setting metadata size Drop the second lv_extend and set 128MB directly in the first hack place.	2011-10-21 09:55:50 +00:00
Zdenek Kabelac	3bc417488d	Thin pool now support chunk size as well Use chunksize option to specify data_block_size for thin pool target. Drop low_water_mark to zero.	2011-10-21 09:55:07 +00:00
Zdenek Kabelac	22f40c4efe	Ensure right activation order Couple FIXMEs put into the code for parts of the code which may be improved later, since we might be able to add 'lazy' device creation later. For now require exclusive activation.	2011-10-20 10:35:14 +00:00
Zdenek Kabelac	3f53c059e9	Add _BLOCK_ to define Use DM_THIN_MIN_DATA_BLOCK_SIZE and DM_THIN_MAX_DATA_BLOCK_SIZE to make it more obvious, for which this define is useful in thin API.	2011-10-20 10:28:41 +00:00
Zdenek Kabelac	759b9592ba	Update error message Drop INTERNAL_ERROR from public API functions. Improve some messages.	2011-10-19 16:42:14 +00:00
Zdenek Kabelac	8de912b677	Simple validation of messages in mda Check we do not combine multiple messages for same LV target and switch to use 'delete_id' to make it clear for what this device_id is being used.	2011-10-19 16:39:09 +00:00
Zdenek Kabelac	3dcce042f6	Drop messages referencing deleted LV lvremove may remove problematic LV for thin target.	2011-10-19 16:37:30 +00:00
Zdenek Kabelac	97d0f72c92	Just indent changes Some tabs & spaces.	2011-10-19 16:36:39 +00:00
Zdenek Kabelac	b04e977851	Remove test for thin_pool Since both functions are called during mda read - we don't have full LV info at this moment.	2011-10-19 16:32:34 +00:00
Zdenek Kabelac	a25434a3a3	Message support for thin provisioning lvm part of messaging. Each message is now stored in its own thin pool section: message1 { create = lv } Messages are queued to thin pool dm target when this target is going to be resumed or used through some dependency. Currently 'delete' messages are purely queued and processed with next thin pool resume operation (i.e. create_thin). WARNING - thin provisioning support is developmental code.	2011-10-17 14:17:09 +00:00
Jonathan Earl Brassow	a551de6152	Use a more correct macro for 'seg_is_linear' It is better to check 'seg->area_count == 1' than '!seg->stripe_size'.	2011-10-14 14:21:32 +00:00
Zdenek Kabelac	d4f134b8f6	Check for refresh_filter failure Properly detect if the filters were refreshed properly. (May need a few more fixes ??) Filter refresh may fail because it may be out of free file descriptors when clvmd gets overloaded.	2011-10-11 09:09:00 +00:00
Jonathan Earl Brassow	f60175c308	Add the ability to convert LVs of "mirror" segtype to "raid1" segtype. Example: ~> lvconvert --type raid1 vg/mirror_lv Steps to convert "mirror" to "raid1" 1) Allocate a RAID metadata LV for each mirror image from the same PVs on which they are located. 2) Clear the metadata LVs. This involves writing LVM metadata, so we don't change any aspects of the mirror LV before this so that the user can easily remove LVs from the failed convert attempt while retaining the original mirror. 3) Remove the mirror log, if it exists. 4) Add metadata LVs to mirror LV 5) Rename mirror sub-lvs (s/mimage/rimage/) 6) Change flags and segtype from mirror to raid1	2011-10-07 14:56:01 +00:00
Jonathan Earl Brassow	d3582e0252	Add the ability to convert linear LVs to RAID1 Example: ~> lvconvert --type raid1 -m 1 vg/lv The following steps are performed to convert linear to RAID1: 1) Allocate a metadata device from the same PV as the linear device to provide the metadata/data LV pair required for all RAID components. 2) Allocate the required number of metadata/data LV pairs for the remaining additional images. 3) Clear the metadata LVs. This performs a LVM metadata update. 4) Create the top-level RAID LV and add the component devices. We want to make any failure easy to unwind. This is why we don't create the top-level LV and add the components until the last step. Should anything happen before that, the user could simply remove the unnecessary images. Also, we want to ensure that the metadata LVs are cleared before forming the array to prevent stale information from polluting the new array. A new macro 'seg_is_linear' was added to allow us to distinguish linear LVs from striped LVs.	2011-10-07 14:52:26 +00:00
Jonathan Earl Brassow	a80192b6a7	Allow 'nosync' extension of mirrors. This patch allows a mirror to be extended without an initial resync of the extended portion. It complements the existing '--nosync' option to lvcreate. This action can be done implicitly if the mirror was created with the '--nosync' option, or explicitly if the '--nosync' option is used when extending the device. Here are the operational criteria: 1) A mirror created with '--nosync' should extend with 'nosync' implicitly [EXAMPLE]# lvs vg; lvextend -L +5G vg/lv ; lvs vg LV VG Attr LSize Pool Origin Snap% Move Log Copy% Convert lv vg Mwi-a-m- 5.00g lv_mlog 100.00 Extending 2 mirror images. Extending logical volume lv to 10.00 GiB Logical volume lv successfully resized LV VG Attr LSize Pool Origin Snap% Move Log Copy% Convert lv vg Mwi-a-m- 10.00g lv_mlog 100.00 2) The 'M' attribute ('M' signifies a mirror created with '--nosync', while 'm' signifies a mirror created w/o '--nosync') must be preserved when extending a mirror created with '--nosync'. See #1 for example of 'M' attribute. 3) A mirror created without '--nosync' should extend with 'nosync' only when '--nosync' is explicitly used when extending. [EXAMPLE]# lvs vg; lvextend -L +5G vg/lv; lvs vg LV VG Attr LSize Pool Origin Snap% Move Log Copy% Convert lv vg mwi-a-m- 20.00m lv_mlog 100.00 Extending 2 mirror images. Extending logical volume lv to 5.02 GiB Logical volume lv successfully resized LV VG Attr LSize Pool Origin Snap% Move Log Copy% Convert lv vg mwi-a-m- 5.02g lv_mlog 0.39 vs. [EXAMPLE]# lvs vg; lvextend -L +5G vg/lv --nosync; lvs vg LV VG Attr LSize Pool Origin Snap% Move Log Copy% Convert lv vg mwi-a-m- 20.00m lv_mlog 100.00 Extending 2 mirror images. 
Extending logical volume lv to 5.02 GiB Logical volume lv successfully resized LV VG Attr LSize Pool Origin Snap% Move Log Copy% Convert lv vg Mwi-a-m- 5.02g lv_mlog 100.00 4) The 'm' attribute must change to 'M' when a mirror created without '--nosync' is extended with the '--nosync' option. (See #3 examples above.) 5) An inactive mirror's sync percent cannot be determined definitively, so it must not be allowed to skip resync. Instead, the extend should ask the user if they want to extend while performing a resync. [EXAMPLE]# lvchange -an vg/lv [EXAMPLE]# lvextend -L +5G vg/lv Extending 2 mirror images. Extending logical volume lv to 10.00 GiB vg/lv is not active. Unable to get sync percent. Do full resync of extended portion of vg/lv? [y/n]: y Logical volume lv successfully resized 6) A mirror that is performing recovery (as opposed to an initial sync) - like after a failure - is not allowed to extend with either an implicit or explicit nosync option. [You can simulate this with a 'corelog' mirror because when it is reactivated, it must be recovered every time.] [EXAMPLE]# lvcreate -m1 -L 5G -n lv vg --nosync --corelog WARNING: New mirror won't be synchronised. Don't read what you didn't write! Logical volume "lv" created [EXAMPLE]# lvs vg LV VG Attr LSize Pool Origin Snap% Move Log Copy% Convert lv vg Mwi-a-m- 5.00g 100.00 [EXAMPLE]# lvchange -an vg/lv; lvchange -ay vg/lv; lvs vg LV VG Attr LSize Pool Origin Snap% Move Log Copy% Convert lv vg Mwi-a-m- 5.00g 0.08 [EXAMPLE]# lvextend -L +5G vg/lv Extending 2 mirror images. Extending logical volume lv to 10.00 GiB vg/lv cannot be extended while it is recovering. 7) If 'no' is selected in #5 or if the condition in #6 is hit, it should not result in the mirror being resized or the 'm/M' attribute being changed. NOTE: A mirror created with '--nosync' behaves differently than one created without it when performing an extension. 
The former cannot be extended when the mirror is recovering (unless in-active), while the latter can. This is a reasonable thing to do since recovery of a mirror doesn't take long (at least in the case of an on-disk log) and it would cause far more time in degraded mode if the extension w/o '--nosync' was allowed. It might be reasonable to add the ability to force the operation in the future. This should /not/ force a nosync extension, but rather force a sync'ed extension. IOW, the user would be saying, "Yes, yes... I know recovery won't take long and that I'll be adding significantly to the time spent in degraded mode, but I need the extra space right now!".	2011-10-06 15:32:26 +00:00
Jonathan Earl Brassow	b19f01212e	Fix splitmirror in cluster having different DM/LVM views of storage. This patch also does some clean-up of the splitmirrors code. I've attempted to clean-up the splitmirrors code to make it easier to understand with fewer operations. I've tried to reduce the number of metadata operations without compromising the intermediate stages which are necessary for easy clean-up in the event of failure. These changes now correctly handle cluster situations - including exclusive cluster mirrors. Whereas before, a splitmirror operation would result in remote nodes having LVM commands report the newly split LV with a proper name while DM commands would report the old (pre-split) names of the device. IOW, there was a kernel/userspace mismatch.	2011-10-06 14:55:39 +00:00
Jonathan Earl Brassow	6c0b0e5d9a	Revert initial solution to bug 733114 - I/O error message during splitmirror The original commit comments can be located via this git commit ID: `7d8e615c0b` There were three possible solutions to the original problem proposed in the initial check-in. The one chosen was as follows: 2) Do like _remove_mirror_images does and suspend the original, then suspend the sub-lv (the error target), then resume the sub-lv, and finally resume the original LV. This seems like extra pointless operations to me, but it doesn't produce the error message (although, I'm not sure why) and it allows us to leave the visible flag in place. Turns out, the cluster also views the extra suspend/resume operations as pointless and ignores them. So, this solution doesn't work in a cluster. Further, I've noticed that in addition to the remote cluster nodes still getting I/O errors from scanning the error target, they also have different LVM and DM views of the same LV. IOW, while the LVM level (gotten from the LVM metadata) sees the correct name for the newly split LV, device-mapper still maintains the old names. Because the original fix failed to completely fix the problem (or work-around it) and because a better solution must be found to address the additional cluster issue of device renaming, I am reverting the above mentioned commit.	2011-10-06 14:49:16 +00:00
Zdenek Kabelac	565a4bfc49	Move defines to header Make limits for thin data_block_size and device_id part of public API. FIXME: possibly read them from some kernel header file in the future? But we may need to support different values for different versions?	2011-10-06 11:05:56 +00:00
Zdenek Kabelac	01ef6510b0	Missed rename pool->thin_pool Fix compilation	2011-10-03 19:10:52 +00:00
Zdenek Kabelac	04a4715cb8	Add code to activate thin target Code to zero pool metadata lv when pool is created. Add code to create thin target via message sending. (Revert is missing)	2011-10-03 18:43:39 +00:00
Zdenek Kabelac	d35a117e4b	Add simple function for lookup of some free device_id Initial simple implementation for finding some free device_id.	2011-10-03 18:39:17 +00:00
Zdenek Kabelac	38796c3d47	Fix bad error message for thinp validation	2011-09-29 09:03:36 +00:00
Zdenek Kabelac	aebf2d5cdc	Add experimental code for activation of thinp targets No dm messages yet - just a base functionality in the steps of other targets. For now usable only for debugging and tracing.	2011-09-29 08:56:38 +00:00
Alasdair Kergon	1c26860d82	Abort if _finish_pvmove suspend_lvs fails instead of cleaning up incompletely. Change suspend_lvs to call vg_revert internally. Change vg_revert to void and remove superfluous calls after failed vg_commit.	2011-09-27 17:09:42 +00:00
Jonathan Earl Brassow	efa3621a59	Add 'Volume Type' lv_attr characters for RAID and RAID_IMAGE. RAID_META is already handled.	2011-09-23 15:17:54 +00:00
Peter Rajnoha	125712bea0	Replace open_count check with holders/mounted_fs check on lvremove path. Before, we used to display "Can't remove open logical volume" which was generic. There are 3 possibilities of how a device could be opened: - used by another device - having a filesystem on that device which is mounted - opened directly by an application With the help of sysfs info, we can distinguish the first two situations. The third one will be subject to "remove retry" logic - if it's opened quickly (e.g. a parallel scan from within a udev rule run), this will finish quickly and we can remove it once it has finished. If it's a legitimate application that keeps the device opened, we'll do our best to remove the device, but we will fail finally after a few retries.	2011-09-22 17:33:50 +00:00
Jonathan Earl Brassow	40c85cf1d7	When up-converting a RAID1 array, we need to allocate new larger arrays for seg->areas and seg->meta_areas. We also need to copy the memory from the old arrays to the newly allocated arrays. The amount of memory to copy was determined by seg->area_count. However, seg->area_count was being set to the higher value after copying the 'seg->areas' information, but before copying the 'seg->meta_areas' information. This means we were copying more memory than necessary for 'seg->meta_areas' - something that could lead to a segfault.	2011-09-22 15:33:21 +00:00
Jonathan Earl Brassow	4026cb6fd1	fix compiler warning. Compiler says variable may be used uninitialized. It can't be, but we initialize the variable to NULL anyway. Also, remove the double initialization of another variable.	2011-09-19 14:28:23 +00:00
Jonathan Earl Brassow	eb607100ef	Fix Bug 738832 - core to disk log conversion fails with internal error This bug showed up when trying to add a log to a mirror whose images are on multiple devices. This is an intra-release regression and no WHATS_NEW entry will be added. The error was introduced in the following commit: `2d8a2f35c7` The solution is to recognise in _alloc_init that if there are no mirrors or stripes specified, then 'new_extents' should be zero.	2011-09-16 18:39:03 +00:00
Jonathan Earl Brassow	a514067448	After suspend/resume following a splitmirror op, call sync_local_dev_names to settle udev before calling deactivate_lv. This is an intra-release regression (no WHATS_NEW entry required). It is part of the fix for the current WHATS_NEW entry: Work around resume_lv causing error LV scanning during splitmirror operation.	2011-09-16 16:41:37 +00:00
Zdenek Kabelac	a6d50bef2f	Remove thin volumes before thin pools When user wants to remove thin pool - check if there are no thin volumes using it. If so - query before removal (or -ff for no question) and remove them first.	2011-09-16 12:12:51 +00:00
Zdenek Kabelac	4a0c6df8df	Reset LV status when unlinking LV from VG When LV is unlinked, we want to catch problem in vg_validate, that LV has changed. i.e. catch LV has been removed and is no longer a thin_pool while still being referenced by some thin volume.	2011-09-16 11:59:22 +00:00
Zdenek Kabelac	94147f3f29	Trim spaces on EOL	2011-09-16 11:53:14 +00:00
Petr Rockai	fd7d4adc57	Fix the divisibility check in the allocator for the mirror+stripe case (require divisibility by stripe count alone, not by (mirror*stripe)).	2011-09-16 09:59:42 +00:00
Milan Broz	c81a322337	Activate virtual snapshot origin exclusively (only on local node in cluster).	2011-09-14 14:20:16 +00:00
Zdenek Kabelac	e24be2abe4	Add suggested parentheses around '&&' Follow gcc suggestion.	2011-09-14 10:03:15 +00:00
Zdenek Kabelac	886d005616	LVM_WRITE and LVM_READ are 64bit constants Revert John patch, which fixed only 1 place where ~LVM_WRITE was in use and convert omitted LVM_READ/WRITE flags to 64bit constants as well. (Since both 'status' flags for LV and VG are 64bit.)	2011-09-14 09:57:35 +00:00
Zdenek Kabelac	3e25de05a9	Add missing underscores to local static functions	2011-09-14 09:54:21 +00:00
Jonathan Earl Brassow	462579d54e	Additional fixes for lv_mirror_count. Changing lv_mirror_count to only count the AREA_LVs made the function stop working for PVMOVE mirrors. A conditional has been added to fix that problem. Additionally, when counting the images in a mirror stack, we don't need to subtract 1 from the count we get back from the lv_mirror_count call on the temporary mirror layer. (This is because we are no longer falsely counting the top layer of the temporary mirror.)	2011-09-14 04:10:26 +00:00
Jonathan Earl Brassow	9cb27929e9	Fix for bug 734252 - problem up converting striped mirror after image failure lv_mirror_count was not able to handle mirrors of stripes properly. When a failed device is removed, the MIRRORED status flag is removed from the LV conditionally based on the results of lv_mirror_count. However, lv_mirror_count trusted the MIRRORED flag - thinking any such LV must be mirrored. It would happily assign first_seg(lv)->area_count as the number of mirrors, but when a mirrored striped LV was reduced to a simple striped LV area_count would be the number of /stripes/ not the number of /mirrors/. A result higher than 1 would be returned from lv_mirror_count, the MIRRORED flag would not be cleared, and the LV would fail to be up-converted properly in lvconvert_mirrors_aux because of it.	2011-09-14 02:45:36 +00:00
Jonathan Earl Brassow	46f0efbfce	Fix bug 733400 - Mirror down conversion when specifying the secondary leg is broke The operation of deactivating the residual error target LV after removing a mirror layer can cause a "device in-use" conflict with udev. Giving udev a poke before calling deactivate_lv eliminates the conflict. The stick used to poke udev is 'sync_local_dev_names'.	2011-09-13 21:13:33 +00:00
Jonathan Earl Brassow	c94c47abd7	Fix for bug 737200 - Can't create mirrored-log mirror on a VG with small extents Kernel requires a mirror to be at least 1 region large. So, if our mirror log is itself a mirror, it must be at least 1 region large. This restriction may not be necessary for non-mirrored logs, but we apply the rule anyway. (The other option is to make the region size of the log mirror smaller than the mirror it is acting as a log for, but that really complicates things. It's much easier to keep the region_size the same for both.)	2011-09-13 18:42:57 +00:00
Jonathan Earl Brassow	f5e43f061a	Better fix for bug 737125 - unable to create mirror on 1K extent size VG WHATS_NEW entry: Fix log size calculation when only a log is being added to a mirror. The original fix passed the mirror LV to allocate_extents (rather than passing NULL) so that _alloc_init could correctly determine the necessary size of the mirror log. In the previous check-in, I noted: In order to get a decent value computed, we need to pass in the 'lv' argument to allocate_extents. This would normally imply a desire for cling/contiguous allocation to the given LV, but since we are not allocating any parallel extents and only log extents, it works fine. However, passing in the LV did have unintended consequences on the placement of the log. The better solution is to pass in the number of extents that are in the mirror LV instead of the LV itself. This will not cause the allocator to reserve that number of extents, because 'stripes' and 'mirrors' are specified as 0. Thus, 'extents' is used to calculate the size of the log, but won't affect how much is allocated.	2011-09-13 18:11:38 +00:00
Jonathan Earl Brassow	0c89ef513a	Changing RAID status flags to 64-bit broke some binary flag operations. LVM_WRITE is a 32-bit flag. Now that RAID[_IMAGE\|_META] are 64-bit, and'ing a RAID LV's status against LVM_WRITE can reset the higher order flags. A similar thing will affect thinp flags if not careful.	2011-09-13 16:33:21 +00:00
Jonathan Earl Brassow	cc9dc919e6	Fix for bug 737125 - unable to create mirror on 1K extent size VG _alloc_init calculates the number of necessary log extents via 'mirror_log_extents'. 'mirror_log_extents' takes 3 arguments: region_size, pe_size, and size of the mirror LV. Unfortunately, _alloc_init is guessing at the mirror size by using 'ah->new_extents / ah->area_multiple' - the number of extents that the mirror images have. However, this is /always/ wrong when allocating the log separately. Further, the log is always allocated separately unless we are up-converting the mirror at the same time. It was by luck alone that a default value of '1' reflects what we want in most cases. In order to get a decent value computed, we need to pass in the 'lv' argument to allocate_extents. This would normally imply a desire for cling/contiguous allocation to the given LV, but since we are not allocating any parallel extents and only log extents, it works fine.	2011-09-13 14:37:48 +00:00
Jonathan Earl Brassow	6d0aa801a0	Fix for bug 733114. When an image is split from a 2-way mirror, the original mirror is converted to a linear device. To do this, the top "layer" must be removed. The segments are transferred from the sub-lv to the top-level LV and the link is severed. The former sub-lv - having its segments transferred - now contains a temporary error target. When the original LV is resumed, the old sub-lv that now contains an error segment is activated and scanned. This is what causes the I/O error messages. There are three ways to fix this problem: 1) Do not set the sub-lv which contains the error target as "visible" before suspending the original LV. This way, when the original is resumed, the sub-lv device node is not created and it is not scanned - avoiding the error messages. The problem with this approach is that if the machine crashes after the resume, it leaves the hidden LV in place and the user has a more difficult time noticing that it needs to be cleaned up. Thus, this type of processing is frowned upon. 2) Do like _remove_mirror_images does and suspend the original, then suspend the sub-lv (the error target), then resume the sub-lv, and finally resume the original LV. This seems like extra pointless operations to me, but it does not produce the error message (although, I'm not sure why) and it allows us to leave the visible flag in place. 3) Flag the sub-lv (error target) with a "do not scan" flag. This seems like the cleanest approach, but I have been unable to find the method for doing this. LVs get tagged in such a way by _get_udev_flags, but in this case the resume of the original LV also resumes the error target LV without running it through _get_udev_flags (likely because they are no longer linked). Could there be something wrong in resume_lv? Option #2 was chosen to fix this bug, but it seems like more of a workaround for now.	2011-09-13 13:59:19 +00:00
Alasdair Kergon	5081181b5d	Append z to lv_attr if new blocks will be zeroed.	2011-09-09 01:15:18 +00:00
Alasdair Kergon	dbb48de507	Add a new 'thin_pool' output field to 'lvs'. A gentle reminder that anyone relying on the output of reporting commands like lvs in scripts must use -o to guarantee they get the fields they expect. The default sequence of fields can change from release to release. Equally, the 'attr' fields can have new values introduced and/or characters appended to them.	2011-09-09 00:54:49 +00:00
Alasdair Kergon	52e3f9dd5e	Add 7th lv_attr char to show the related kernel target. Add thin volume types to lv_attr.	2011-09-08 20:55:39 +00:00
Alasdair Kergon	ef78ebf35a	lvcreate/remove thin_pool and thin volumes (--driverloaded n only)	2011-09-08 16:41:18 +00:00
Alasdair Kergon	1abaaab1bc	Terminate pv_attr field correctly. (2.02.86)	2011-09-07 13:42:00 +00:00
Zdenek Kabelac	f32b76a193	Minor change for pv_create api Switch int to unsigned type.	2011-09-07 08:34:21 +00:00
Alasdair Kergon	bb6f9b10db	pool attach fns & more field renaming	2011-09-06 22:43:56 +00:00
Alasdair Kergon	b88362ff95	add thin_manip.c like the other manip files move basic lv_is_* to macros data_lv -> pool_lv - we decided to call it 'pool' everywhere now	2011-09-06 19:25:42 +00:00
Alasdair Kergon	2ef5b7cca6	Start using 64-bit status flags - most of the code already handles them. tdata -> tpool remove commented out definitions from metadata.h formatting clean-ups	2011-09-06 18:49:31 +00:00
Alasdair Kergon	dd44cccefe	else	2011-09-06 15:39:46 +00:00
Alasdair Kergon	9ac61d2ba2	lvcreate parsing for thin provisioning. The rest is incomplete so this isn't usable yet.	2011-09-06 00:26:42 +00:00
Jonathan Earl Brassow	da23255cc9	Fix for bug 732142: Unsafe table load during mirror image split There was a bad sequence: *) Make changes to LV layout to split images (e.g. 4-way -> 2-way/2-way) 1) vg_write, suspend_lv(original_mirror), vg_commit 2) activate_lv(newly_split_lv) 3) resume_lv(original_mirror) Step #2 is not allowed. However, without it, the resume of the original mirror will also resume its former sub-LVs - making it impossible to activate the newly split LV due to the changes in layering, pointers, and names that had already been made. Additionally, the resume of the original brings the sub-lv's online with names that differ from the metadata on disk - also a no-no. Thus, the split must be done in stages such that the active LVs always reflect what is in the committed LVM metadata. First, alter the original mirror by releasing the images. The images are made visible and independent as an intermediate stage. (This way, we can have consistency between LVM metadata and active LVs.) The second stage collects the recently split LVs, deactivates them, forms them into a mirror if necessary, and then activates them. It is a bit of a circuitous method, but it is the only way to split a mirror from a mirror and obey these general rules: 1) Never [de]activate sub-lvs when the top-level LV is suspended 2) Avoid having active LVs that differ from the description in the LVM metadata Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>	2011-09-01 19:22:11 +00:00
Zdenek Kabelac	3caa77f831	Use size_t return type Since these function returns buffer size - use size_t type for them.	2011-09-01 10:25:22 +00:00
Petr Rockai	e59e2f7c3c	Move the core of the lib/config/config.c functionality into libdevmapper, leaving behind the LVM-specific parts of the code (convenience wrappers that handle `struct device` and `struct cmd_context`, basically). A number of functions have been renamed (in addition to getting a dm_ prefix) -- namely, all of the config interface now has a dm_config_ prefix.	2011-08-30 14:55:15 +00:00
Alasdair Kergon	11bfaa1df8	same for segtype_is_thin	2011-08-26 18:17:05 +00:00
Alasdair Kergon	6fbf1c6b56	seg_is_thin includes both thin_pool and thin_volume	2011-08-26 18:15:14 +00:00
Alasdair Kergon	42914557d5	thin - hide unimplemented dso fn; remove duplicate origin_lv field; add some lvcreate struct parms	2011-08-26 17:40:53 +00:00
Zdenek Kabelac	e82bd6249b	Initial code for read/write of thin metadata lv segments	2011-08-26 13:37:47 +00:00
Zdenek Kabelac	9d32170d5c	Add registration of thin_pool segment Register thin and thin_pool segment via multiple_segtypes.	2011-08-25 10:00:09 +00:00
Alasdair Kergon	f9b92564a7	Fix raid shared lib segtype registration (2.02.87).	2011-08-24 13:41:46 +00:00
Zdenek Kabelac	3ba4a19510	Initial code layout for thin provisioning target Only registers init_thin_segtype Option --with-thin=internal needed for compilation. For now useful only for development!	2011-08-24 08:27:49 +00:00
Alasdair Kergon	c31d14d786	Remove incorrect error message added in 2.02.87.	2011-08-19 22:55:07 +00:00
Alasdair Kergon	1d64dcfbf7	clarify comment	2011-08-19 19:35:50 +00:00
Alasdair Kergon	ba7df3de88	avoid multi-line calc with incorrect intermediate var contents	2011-08-19 16:41:26 +00:00
Alasdair Kergon	3250b38583	_ for static fns	2011-08-19 15:59:15 +00:00
Jonathan Earl Brassow	a2facf4ad4	Add ability to merge back a RAID1 image that has been split w/ --trackchanges Argument layout is very similar to the merge command for snapshots.	2011-08-18 19:43:08 +00:00
Jonathan Earl Brassow	f439e65b64	Add support for m-way to n-way up-convert in RAID1 (no linear to n-way yet) This patch adds the ability to upconvert a raid1 array - say from 2-way to 3-way. It does not yet support upconverting linear to n-way. The 'raid' device-mapper target allows for individual components (images) of an array to be specified for rebuild. This mechanism is used when adding new images to the array so that the new images can be resync'ed while the rest of the images in the array can remain 'in-sync'. (There is no mirror-on-mirror layering required.)	2011-08-18 19:41:21 +00:00
Jonathan Earl Brassow	6d04311efa	Add the ability to split an image from the mirror and track changes. ~> lvconvert --splitmirrors 1 --trackchanges vg/lv The '--trackchanges' option allows a user the ability to use an image of a RAID1 array for the purposes of temporary read-only access. The image can be merged back into the array at a later time and only the blocks that have changed in the array since the split will be resync'ed. This operation can be thought of as a partial split. The image is never completely extracted from the array, in that the array reserves the position the device occupied and tracks the differences between the array and the split image via a bitmap. The image itself is rendered read-only and the name (<LV>_rimage_*) cannot be changed. The user can complete the split (permanently splitting the image from the array) by re-issuing the 'lvconvert' command without the '--trackchanges' argument and specifying the '--name' argument. ~> lvconvert --splitmirrors 1 --name my_split vg/lv Merging the tracked image back into the array is done with the '--merge' option (included in a follow-on patch). ~> lvconvert --merge vg/lv_rimage_<n> The internal mechanics of this are relatively simple. The 'raid' device-mapper target allows for the specification of an empty slot in an array via '- -'. This is what will be used if a partial activation of an array is ever required. (It would also be possible to use 'error' targets in place of the '- -'.) If a RAID image is found to be both read-only and visible, then it is considered separate from the array and '- -' is used to hold its position in the array. So, all that needs to be done to temporarily split an image from the array /and/ cause the kernel target's bitmap to track (aka "mark") changes made is to make the specified image visible and read-only. To merge the device back into the array, the image needs to be returned to the read/write state of the top-level LV and made invisible.	2011-08-18 19:38:26 +00:00
Jonathan Earl Brassow	a324baf6a1	Add --splitmirrors support for RAID1 (1 image only) Users already have the ability to split an image from an LV of "mirror" segtype. This patch extends that ability to LVs of "raid1" segtype. This patch only allows a single image to be split off, however. (The "mirror" segtype allows an arbitrary number of images to be split off. e.g. 4-way => 3-way/linear, 2-way/2-way, linear,3-way)	2011-08-18 19:34:18 +00:00
Jonathan Earl Brassow	63d32fb6a6	When down-converting RAID1, don't activate sub-lvs between suspend/resume of top-level LV. We can't activate sub-lv's that are being removed from a RAID1 LV while it is suspended. However, this is what was being used to have them show-up so we could remove them. 'sync_local_dev_names' is a sufficient and proper replacement and can be done after the top-level LV is resumed.	2011-08-18 19:31:33 +00:00
Jonathan Earl Brassow	4903b85d23	Compiler warning fixes, better error messaging, and cosmetic changes. 1) add new function 'raid_remove_top_layer' which will be useful to other conversion functions later (also cleans up code) 2) Add error messages if raid_[extract\|add]_images fails 3) Add function prototypes to prevent compiler warnings when compiling with '--with-raid=shared'	2011-08-13 04:28:34 +00:00
Jonathan Earl Brassow	a22515c87f	Various code clean-ups (s/malloc/zalloc/, new msgs, etc) Fix a couple more issues that kabi found. - Add some error messages in failure cases - s/malloc/zalloc/ - use vg->vgmem for lv names instead of vg->cmd->mem	2011-08-11 21:32:18 +00:00
Jonathan Earl Brassow	2100c90dd7	Add missing checks for function return codes. Some functions were being called without having their return values checked.	2011-08-11 19:38:00 +00:00
Jonathan Earl Brassow	b2fa9b43dc	Add some log_error msg's and fix potential segfault Thanks to kabi for spotting these - especially the possibility for segfault if a loop runs all the way through without finding a match.	2011-08-11 19:17:10 +00:00
Jonathan Earl Brassow	4aebd52c4c	Add ability to down-convert RAID1 arrays. Also, add some simple RAID tests to testsuite.	2011-08-11 18:24:40 +00:00
Zdenek Kabelac	031c986ea8	Lock memory for shared VG Use debug pool locking functionality. So the command could check, whether the memory in the pool has not been modified. For lv_postorder() instead of unlocking and locking for every changed struct status member do it once when entering and leaving function. (mprotect would trap each such memory access). Currently lv_postorder() does not modify other part of vg structure than status flags of each LV with flags that are reverted back to its original state after function exit.	2011-08-11 17:34:30 +00:00
Zdenek Kabelac	bb115a7a6c	Cache and share generated VG structs Extend vginfo cache with cached VG structure. So if the same metadata are used, skip mda decoding. This helps for operations like activation of all LVs in one VG, where same data were decoded giving the same output result. Patch adds 1-to-1 connection between volume_group and lvmcache_vginfo.	2011-08-11 17:24:23 +00:00
Peter Rajnoha	47d7f00e16	Fix possible format instance memory leaks and premature releases in _vg_read.	2011-08-11 16:31:40 +00:00
Jonathan Earl Brassow	66d9675559	Fix renaming of RAID logical volumes. The function 'for_each_sub_lv', which rename uses, was not handling the RAID metadata areas. Thus, the metadata LVs were not being renamed.	2011-08-11 03:29:51 +00:00
Zdenek Kabelac	530b00a652	Just add new lines between header comment	2011-08-10 20:26:41 +00:00
Zdenek Kabelac	077a6755ff	Replace free_vg with release_vg Move the free_vg() to vg.c and replace free_vg with release_vg and make the _free_vg internal. Patch is needed for sharing VG in vginfo cache so the release_vg function name is a better fit here.	2011-08-10 20:25:29 +00:00
Zdenek Kabelac	789f9c55e5	Remove INCONSISTENT_VG flag As this flag could not have been set by the current code - removing it. Note: because of the wrong code logic this call: lvmcache_update_vg(correct_vg, correct_vg->status & PRECOMMITTED & (inconsistent ? INCONSISTENT_VG : 0)); had always passed '0' - now after flag removal it's passing PRECOMMITTED flag in - this presents a functional change in this patch. To match the original functionality - 0 had to be always passed. More testing is needed here.	2011-08-10 20:17:33 +00:00
Jonathan Earl Brassow	e01bcc6884	Fix compiler warning. Compiler complaining that meta_lv could be used uninitialized. (Not true because it is protected by 'clear_metadata'.) I switched to using 'lv->vg', as it makes no difference to vg_[write\|commit].	2011-08-10 16:44:17 +00:00
Peter Rajnoha	0127a9a525	Remove unused 'origin' variable in lv_remove_single function.	2011-08-05 09:21:13 +00:00
Zdenek Kabelac	425862fb95	Remove unused inconsistent_seqno Last usage was removed in Petr's commit related to VG mda repair fix where relaxed check starts to ignore inconsistencies coming from PVs that are marked MISSING - thus removing unused variable.	2011-08-04 15:18:10 +00:00
Jonathan Earl Brassow	cac52ca4ce	Add basic RAID segment type(s) support. Implementation described in doc/lvm2-raid.txt. Basic support includes: - ability to create RAID 1/4/5/6 arrays - ability to delete RAID arrays - ability to display RAID arrays Notable missing features (not included in this patch): - ability to clean-up/repair failures - ability to convert RAID segment types - ability to monitor RAID segment types	2011-08-02 22:07:20 +00:00
Jonathan Earl Brassow	7411a44871	Remove an unneeded parameter from build_parallel_areas_from_lv()	2011-07-19 16:37:42 +00:00
Jonathan Earl Brassow	aa6599e687	Fix potential null ptr deref in 'origin_from_cow' return NULL rather than segfaulting if lv->snapshot is not set	2011-07-19 16:23:52 +00:00
Alasdair Kergon	ee840ff14c	Move snapshot deactivation logic into lib/activate, fixing the teardown sequence. (Previously the snapshot was deactivated while its origin was active and before its removal was committed to disk, so restarting after a crash at that point would leave corruption.)	2011-07-08 12:48:41 +00:00
Alasdair Kergon	0f2a4ca2b5	When suspending, automatically preload newly-visible existing LVs Let's find out if this makes things better or worse overall...	2011-06-30 18:25:18 +00:00
Alasdair Kergon	1d7649f36b	Reinstate correct permissions when creating mirrors.	2011-06-29 17:05:53 +00:00
Alasdair Kergon	e189a84f57	Append 'm' attribute to pv_attr for missing PVs.	2011-06-29 14:56:33 +00:00
Alasdair Kergon	140615dafb	remove unused var after recent patch	2011-06-24 23:39:09 +00:00
Jonathan Earl Brassow	9e0edb7ee5	Fix to preserve exclusive activation of mirror while up-converting. When an LVM mirror is up-converted (an additional image added), it creates a temporary mirror stack. The lower-level mirror in the stack that is created was not being activated exclusively - violating the exclusive nature of the original mirror. We now check for exclusive activation of a mirror before converting it, and if found, we ensure that the temporary mirror is also exclusively activated.	2011-06-23 14:00:58 +00:00
Milan Broz	6adbb95b82	Fail allocation if number of extents not divisible by area count Allocation should fail early if this condition is not met. Quick fix for https://bugzilla.redhat.com/show_bug.cgi?id=707779	2011-06-23 10:53:24 +00:00
Jonathan Earl Brassow	9e277b9e2c	Fix issue preventing cluster mirror creation. Mirrors used to be created by first creating a linear device and then adding the other images plus the log. Now mirrors are created by creating all the images in one go and then adding the log separately. The new way ran into the condition that cluster mirrors cannot change the log type (in the case of creation, from core -> disk) while the mirror is not active. (It isn't active because it is in the process of being created.) The reason this condition is in place is because a remote node may have the mirror active, and we don't want to alter the log underneath it. What we really needed was a way of checking if the mirror was active remotely but not locally, and in that case do not allow a change of the log. I've added this check, and cluster mirrors can now be created again.	2011-06-22 21:31:21 +00:00
Zdenek Kabelac	bebe60b70c	Code move of vg_mark_partial() up in stack It's useful to keep the partial flag cached - so just move the call for vg_mark_partial_lvs() into import_vg_from_config_tree() so it gets evaluated before it goes through the lvmcache. This patch should not present any functional change. Note: It is rather a temporary solution - proper place is probably inside the 'read' call back - but needs some more discussion. For now using this minor hack.	2011-06-17 14:39:10 +00:00
Zdenek Kabelac	93a98c2672	Remove unused internal flag ACTIVATE_EXCL from the code	2011-06-17 14:30:58 +00:00
Zdenek Kabelac	f50a76379a	Remove test for status flag As the ACTIVATE_EXCL could be set only in clvmd code - there is no use for this test in lv_add_mirrors() function only called from tools context. FIXME: Add cluster test case for this.	2011-06-17 14:27:34 +00:00
Zdenek Kabelac	f3d8974dc9	Add couple FIXMEs around suspicious code	2011-06-17 14:24:18 +00:00
Zdenek Kabelac	81beded3af	Add lv_activate_opts structure To avoid modification of 'read-only' volume group structure add a new structure to pass local data around the code for LV activation. As origin_only is one such flag - replace this parameter with new struct lv_activate_opts. More parameters might eventually become part of lv_activate_opts.	2011-06-17 14:14:19 +00:00
Petr Rockai	6d25c0d26f	Fix RHBZ 651590 (failure to lock LV results in failure to repair mirror after transient error), stemming from the following sequence of events: 1) devices fail IO, triggering repair 2) dmeventd starts fixing up the mirror 3) during the downconversion, a new metadata version is written --> the devices come back online here 4) the mirror device suspend/resume is called to update DM tables 5) during the suspend/resume cycle, pre-commit metadata is read; however, since the failed devices are now back online, we get back inconsistent set of precommit metadata and the whole operation fails The patch relaxes the check that fails in step 5 above, namely by ignoring inconsistencies coming from PVs that are marked MISSING.	2011-06-15 17:45:02 +00:00
Alasdair Kergon	7df72b3c88	Fix last snapshot removal to avoid table reload while a device is suspended.	2011-06-13 22:28:04 +00:00
Alasdair Kergon	df390f1799	Major pvmove fix to issue ioctls in the correct order when multiple LVs are affected by the move. (Currently it's possible for I/O to become trapped between suspended devices amongst other problems. The current fix was selected so as to minimise the testing surface. I hope eventually to replace it with a cleaner one that extends the deptree code. Some lvconvert scenarios still suffer from related problems.)	2011-06-11 00:03:06 +00:00
Milan Broz	4fb39ae074	Validate mirror segments size Currently some operation with striped mirrors lead to corrupted metadata, this patch just add detection of such situation. Example: # lvcreate -i2 -l10 -n lvs vg_test # lvconvert -m1 vg_test/lvs # lvreduce -f -l1 vg_test/lvs Reducing logical volume lvs to 4.00 MiB Segment extent reduction 9not divisible by #stripes 2 Logical volume lvs successfully resized # lvremove vg_test/lvs Segment extent reduction 1not divisible by #stripes 2 LV segment lvs:0-4294967295 is incorrectly listed as being used by LV lvs_mimage_0 Internal error: LV segments corrupted in lvs_mimage_0.	2011-06-09 19:36:16 +00:00
Alasdair Kergon	bb056af3c9	missing space in mesg	2011-06-06 12:08:42 +00:00
Alasdair Kergon	3cac20f850	Defer writing PV labels to vg_write. Store label_sector only in struct physical_volume.	2011-06-01 19:29:31 +00:00
Alasdair Kergon	453cdee51c	Permit --available with lvcreate so non-snapshot LVs need not be activated.	2011-06-01 19:21:03 +00:00
Petr Rockai	833a287337	Make vg_mark_partial_lvs also clear existing PARTIAL_LV flags, so it can be issued repeatedly on the same VG, keeping the PARTIAL_LV flags up to date.	2011-05-07 13:32:05 +00:00
Alasdair Kergon	5510b4e7d7	test update without WHATS_NEW to check it gives warning now	2011-04-29 19:06:17 +00:00
Alasdair Kergon	9cda028a96	clean up critical section patch	2011-04-28 20:29:59 +00:00
Zdenek Kabelac	b680d5bf7b	Fix use of released vgname and vgid Avoid use of already released memory when duplicated MDA is found. As get_pv_from_vg_by_id() may call lvmcache_label_scan() use the local copy of the vgname and vgid on the stack as vginfo may disappear and code was then accessing garbage in memory. i.e. pvs /dev/loop0 (when /dev/loop0 and /dev/loop1 have the same MDA content) Invalid read of size 1 at 0x523C986: dm_hash_lookup (hash.c:325) by 0x440C8C: vginfo_from_vgname (lvmcache.c:399) by 0x4605C0: _create_vg_text_instance (format-text.c:1882) by 0x46140D: _text_create_text_instance (format-text.c:2243) by 0x47EB49: _vg_read (metadata.c:2887) by 0x47FBD8: vg_read_internal (metadata.c:3231) by 0x477594: get_pv_from_vg_by_id (metadata.c:344) by 0x45F07A: _get_pv_if_in_vg (format-text.c:1400) by 0x45F0B9: _populate_pv_fields (format-text.c:1414) by 0x45F40F: _text_pv_read (format-text.c:1493) by 0x480431: _pv_read (metadata.c:3500) by 0x4802B2: pv_read (metadata.c:3462) Address 0x652ab80 is 0 bytes inside a block of size 4 free'd at 0x4C2756E: free (vg_replace_malloc.c:366) by 0x442277: _free_vginfo (lvmcache.c:963) by 0x44235E: _drop_vginfo (lvmcache.c:992) by 0x442B23: _lvmcache_update_vgname (lvmcache.c:1165) by 0x443449: lvmcache_update_vgname_and_id (lvmcache.c:1358) by 0x443C07: lvmcache_add (lvmcache.c:1492) by 0x46588C: _text_read (text_label.c:271) by 0x466A65: label_read (label.c:289) by 0x4413FC: lvmcache_label_scan (lvmcache.c:635) by 0x4605AD: _create_vg_text_instance (format-text.c:1881) by 0x46140D: _text_create_text_instance (format-text.c:2243) by 0x47EB49: _vg_read (metadata.c:2887) Add testing script	2011-04-21 13:13:40 +00:00
Mike Snitzer	ffcb1b9c2c	Improve the discard documentation. Also improve discard code in pv_manip.c to properly account for case when pe_start=0 and the first physical extent is to be released (currently skip the first extent to avoid discarding the PV label).	2011-04-13 18:26:39 +00:00
Mike Snitzer	727373c176	Use uint32_t rather than uint64_t.	2011-04-12 22:04:04 +00:00
Mike Snitzer	fdc8670327	Add "devices/issue_discards" to lvm.conf. Issue discards on lvremove if enabled and both storage and kernel have support.	2011-04-12 21:59:01 +00:00
Zdenek Kabelac	96077265c4	Replace dm_snprintf with strncpy My previous patch fixed incorrect error check for dm_snprintf. However in this particular case - dm_snprintf has been used differently - just like strncpy + setting last char with '\0' - so the code had to return error - because the buffer was too short for the whole string. Patch replaces it with real strncpy. Also test for alloca() failure is removed - as the program behaviour is rather undefined in this case - it never returns NULL.	2011-04-12 14:13:17 +00:00
Petr Rockai	db22d9b978	This patchset refactors some reporting code and completes the remaining lvseg properties for lvm2app, 'devices' and 'seg_pe_ranges'. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Reviewed-by: Petr Rockai <prockai@redhat.com>	2011-04-12 12:24:29 +00:00
Zdenek Kabelac	c67d2b4dd4	Fix incorrect tests for dm_snprintf() failure As the memory is preallocated based on arg size in these cases, the error would be quite hard to trigger here anyway.	2011-04-09 19:05:23 +00:00
Zdenek Kabelac	a1eba521e3	Fix some unmatching sign comparison gcc warnings Simple replacement for unsigned type - usually in for() loops.	2011-04-08 14:40:18 +00:00
Jonathan Earl Brassow	532e6c8ae3	Thanks to Zdenek Kabelac (kabi) for pointing out that I was using dm_pool_free incorrectly. This check-in fixes that incorrect usage. I've also added a WHATS_NEW line to reflect the changes I made to allow lv_extend to operate on 0 length intrinsically layered LVs (i.e mirrors and RAID). I forgot that in the last commit.	2011-04-07 21:49:29 +00:00
Jonathan Earl Brassow	fe93c99ad9	This patch adds the ability to extend 0 length layered LVs. This allows us to allocate all images of a mirror (or RAID array) at one time during create. The current mirror implementation still requires a separate allocation for the log, however.	2011-04-06 21:32:20 +00:00
Peter Rajnoha	29684f590c	Cleanup fid finalization code in free_vg and allow exactly the same fid to be set again for a PV/VG. Actually, we can call vg_set_fid(vg, NULL) instead of calling destroy_instance for all PV structs and a VG struct - it's the same code we already have in the vg_set_fid. Also, allow exactly the same fid to be set again for the same PV/VG. Before, this could end up with the fid destroyed because we destroyed existing fid first and then we used the new one and we didn't care whether existing one == new one by chance.	2011-04-01 14:54:20 +00:00
Zdenek Kabelac	3d04380691	Use created hash tables for quick check of LV, PV. Instead of searching linear list of all LVs, PVs - use created hash tables also for quick mapping between LV. (Note - for small number of PVs or LVs the overhead of the hash is bigger). TODO: Use hash tables in volume_group structure directly.	2011-03-30 13:35:51 +00:00
Zdenek Kabelac	1bedd3a97b	Use id_equal instead of strncmp() More consistent and easier to read.	2011-03-29 21:57:56 +00:00
Zdenek Kabelac	f77736cab5	Remove double braces Clang gives notice about possible confusion as commonly double braces are used when some assignment is done inside them.	2011-03-29 20:19:03 +00:00
Jonathan Earl Brassow	60c10a45ce	s/MIRROR_NOTSYNCED/LV_NOTSYNCED/ - Flag may refer to more than just mirrors	2011-03-29 12:51:57 +00:00
Jonathan Earl Brassow	be226be635	Fix unhandled condition in _move_lv_segments If _move_lv_segments is passed a 'lv_from' that does not yet have any segments, it will screw things up because the code that does the segment copy assumes there is at least one segment. See copy code here: lv_to->segments = lv_from->segments; lv_to->segments.n->p = &lv_to->segments; lv_to->segments.p->n = &lv_to->segments; If 'segments' is an empty list, the first statement copies over the values, but the next two reset those values to point to the other LV's list structure. 'lv_to' now appears to have one segment, but it is really an ill-set pointer.	2011-03-25 22:02:27 +00:00
Petr Rockai	5ef2808bc7	In some cases, we could end up with a mirrored LV without a MIRRORED flag. In other cases, the code could wind up removing wrong number of mirrors. In yet other cases, we could remove the right number of mirrors, but fail to respect the removal preferences (i.e. keep an image that was requested to be removed while removing an image that was requested to be kept). Under some circumstances, remove_mirror_images could also get stuck in an infinite loop. This patch should fix all of the above undesirable behaviours. Signed-off-by: Petr Rockai <prockai@redhat.com> Reviewed-by: Jonathan Brassow <jbrassow@redhat.com>	2011-03-24 12:28:02 +00:00
Zdenek Kabelac	b8ccce3500	Add missing \0 for grown debug object Attach \0 for proper char* display - otherwise a somewhat random message could be displayed in debug mode and unpredictable reads of uninitialized memory values could happen.	2011-03-14 17:00:57 +00:00
Zdenek Kabelac	844b75f4d6	Fix allocation of system_id As code uses strncpy(system_id, NAME_LEN) and doesn't set '\0' Fix it by always allocating NAME_LEN + 1 buffer size and with zalloc we always get '\0' as the last byte. This bug may trigger some unexpected behavior of the string operation code - depends on the pool allocator. FIXME: refactor this code to alloc_vg.	2011-03-13 23:05:48 +00:00
Peter Rajnoha	ff4479414c	Use format instance mempool where possible and adequate.	2011-03-11 15:10:16 +00:00
Peter Rajnoha	e8d4946ec7	Various cleanups for fid mem and ref_count changes. Missing free_vg on error_path in lvmcache_get_vg fn. Call destroy_instance only if the fid is not part of the vg in backup_read_vg fn (otherwise it's part of the VG we're returning and we definitely don't want to destroy it!).	2011-03-11 15:08:31 +00:00
Peter Rajnoha	2feb2a66fd	Call destroy_instance for any PVs found in VG structure during vg_free call. This is necessary for proper format instance ref_count support. We iterate over vg->pvs and vg->removed_pvs list and the ref_count is decremented and then it is destroyed if not referenced anymore.	2011-03-11 15:06:13 +00:00
Peter Rajnoha	84f48499a3	Add new free_pv_fid fn and use it throughout to free all attached fids. Since format instances will use own memory pool, it's necessary to properly deallocate it. For now, only fid is deallocated. The PV structure itself still uses cmd mempool mostly, but anytime we'd like to add a mempool in the struct physical_volume, we can just rename this fn to free_pv and add the code (like we have free_vg fn for VGs).	2011-03-11 14:56:56 +00:00
Peter Rajnoha	1307ddf4cf	Use only vg_set_fid and new pv_set_fid fn to assign the format instance. This is essential for proper format instance ref_count support. We must use these functions to set the fid everywhere from now on, even the NULL value!	2011-03-11 14:50:13 +00:00
Peter Rajnoha	a1bec4e685	Add mem and ref_count fields to struct format_instance for own mempool use. Format instances can be created anytime on demand and it contains metadata area information mostly (at least for now, but in the future, we may store more things here to update/edit in a PV/VG). In case we have lots of metadata areas, memory consumption will rise. Using cmd context mempool is not quite optimal here because it is destroyed too late. So let's use a separate mempool for format instances. Reference counting is used because fids could be shared, e.g. each PV has either a PV-based fid or VG-based fid. If it's VG-based, each PV has a shared fid with the VG - a reference to VG's fid.	2011-03-11 14:38:38 +00:00
Peter Rajnoha	56f5b12eed	Use new alloc_fid fn for common format instance initialisation.	2011-03-11 14:30:27 +00:00
Zdenek Kabelac	a6f38f9d6a	Missed merge fix in vg_validate patch	2011-03-10 22:39:36 +00:00
Zdenek Kabelac	442dbf9ad8	Refactor code for _lv_postorder Add _lv_postorder_vg() - for calling _lv_postorder() for every LV from VG. We use this in 2 places - vg_mark_partial_lvs() and vg_validate() so make it one function. Benefit here is - to use only one cleanup code and avoid potentially duplicate scans of same LVs.	2011-03-10 14:40:32 +00:00
Zdenek Kabelac	4ee2b4965f	Use hash tables for validating names Accelerate validation loop by using lvname, lvid, pvid hash tables. Also merge pvl loop into one cycle now - no need to scan the list twice. List scan is stopped when dm_hash_insert fails. The error message with loop_counter1 is no longer provided - however the message has been misleading anyway.	2011-03-10 13:11:59 +00:00
Zdenek Kabelac	3019419e95	Refactor vg allocation code Create new function alloc_vg() to allocate VG structure. It takes pool_name (for easier debugging) and also takes vg_name to further simplify code. Move remainder of _build_vg_from_pds to _pool_vg_read and use vg memory pool for import functions. (it's been using smem -> fid mempool -> cmd mempool) (FIXME: remove mempool parameter for import functions and use vg). Move remainder of the _build_vg to _format1_vg_read	2011-03-10 12:43:29 +00:00
Alasdair Kergon	2f25c320fb	Use empty string instead of /dev// for LV path when there's no VG. Don't allocate unused VG mempool in _pvsegs_sub_single.	2011-03-09 12:44:42 +00:00
Zdenek Kabelac	55f6627427	Fix reading of released memory lvseg_segtype_dup used the vg memory pool for string duplication. However this one gets released before reporting happens so a command like: pvs -o segtype prints data from an already released memory pool. Thanks to the fact there is not much allocation happening after the VG is released, the memory stays unmodified and correct result is printed. Fix adds support for mempool passed parameter (like other similar query commands) and uses dm_report memory pool for string duplication.	2011-03-05 12:14:00 +00:00
Milan Broz	be3510b204	PE size overflows; on most architectures it is caught by "PE cannot be 0" but s390x unfortunately returns something usable. Always use uint64 in initial parameter check.	2011-03-02 20:00:09 +00:00
Zdenek Kabelac	36653e8903	Add fall through comments Add comments to switch case construct.	2011-02-28 19:53:03 +00:00
Peter Rajnoha	3b97e8d643	Allow non-orphan PVs with two metadata areas to be resized. We allow writing non-orphan PVs only for resize now. The "orphan PV" assert in pv_write fn uses the "allow_non_orphan" parameter to control this assert. However, we should find a more elaborate solution so we can remove this restriction altogether (pv_write together with vg_write is not atomic, we need to find a safe mechanism so there's an easy revert possible in case of an error).	2011-02-28 13:19:02 +00:00
Alasdair Kergon	1a52fa6858	Fix check for log-only allocation in new alloc normal loop.	2011-02-27 01:16:52 +00:00
Alasdair Kergon	92ffcda183	Various changes to the allocation algorithms: Expect some fallout. There is a lot to test. Two new config settings added that are intended to make the code behave closely to the way it did before - worth a try if you find problems.	2011-02-27 00:38:31 +00:00
Peter Rajnoha	4a304dc1d8	Allow only orphan PVs to be resized even with two metadata areas.	2011-02-25 14:08:54 +00:00
Peter Rajnoha	f74bd57ec9	Revert the patch for vgconvert to work with recent changes in metadata area handling. This should work now with the help of the patch from previous commit.	2011-02-25 14:02:53 +00:00
Peter Rajnoha	38b0564cab	Read PV metadata information from cache if pv_setup called with pv->fid == vg->fid. If the PV is already part of the VG (so the pv->fid == vg->fid), it makes no sense to attach the mdas information from PV to a VG. Instead, we read new PV metadata information from cache and attach it to the VG fid.	2011-02-25 13:59:47 +00:00
Peter Rajnoha	c901a92aa5	%ld -> PRIu64	2011-02-21 13:09:27 +00:00
Peter Rajnoha	9c0035c129	Fix metadata balance code to work with recent changes in metadata handling interface (with the changes in format_instance).	2011-02-21 12:33:16 +00:00
Peter Rajnoha	51aed1992f	Add old_uuid field to struct physical_volume so we can still reference a PV with its old UUID when we're changing it (the cache as well as metadata area index has the old uuid that we need to use to access the information!)	2011-02-21 12:31:28 +00:00
Peter Rajnoha	6bdc80743e	Fix vgconvert code to work with changes in metadata area handling and changes in format_instance. Add new 'vg_convert' function.	2011-02-21 12:29:21 +00:00
Peter Rajnoha	cb2396730a	Change pvresize code to work with new metadata handling interface and allow resizing a PV with two metadata areas.	2011-02-21 12:27:26 +00:00
Peter Rajnoha	17ad2b1115	Change pv_write code to work with the changes in metadata handling interface and changes in format_instance.	2011-02-21 12:26:27 +00:00
Peter Rajnoha	94d91fdda1	Change the code throughout to use new pv_initialise and modified pv_setup fn. Change pv_create code to work with these changes together with using new pv_add_metadata_area fn to add metadata areas for a PV being created.	2011-02-21 12:24:15 +00:00
Peter Rajnoha	617b900d85	Separate new pv_initialise function out of the original pv_setup code. pv_initialise initialises a new PV; pv_setup sets up an existing PV with a VG	2011-02-21 12:20:18 +00:00
Peter Rajnoha	981895a860	Add new pv_remove_metadata_area interface function.	2011-02-21 12:17:54 +00:00
Peter Rajnoha	8d5d20a526	Add new pv_add_metadata_area interface function.	2011-02-21 12:17:26 +00:00
Peter Rajnoha	305816232d	Remove useless mdas parameter for pv_read (from now on, we store mdas in a format instance)	2011-02-21 12:15:59 +00:00
Peter Rajnoha	6e0b348d34	Add format instance support for pv_read code.	2011-02-21 12:13:40 +00:00
Peter Rajnoha	56280d0d3a	Initialise a new PV-based format instance for a PV that is being created.	2011-02-21 12:12:32 +00:00
Peter Rajnoha	f8b78ec613	Add vg_set_fid function to change VG format instance. This function also sets a reference to a new VG format instance for all PVs that are part of the VG so the PV-VG interconnection is consistent after the change.	2011-02-21 12:10:58 +00:00
Peter Rajnoha	c0c21864c6	Change the code throughout for recent changes in format_instance handling.	2011-02-21 12:07:03 +00:00
Peter Rajnoha	88129db5e1	Change create_instance to create PV-based as well as VG-based format instances. Add supporting functions to work with the format instance and metadata area structures stored within the format instance. Add support for simple indexing of metadata areas using PV id and mda order (for on-disk PV only for now, we can extend the indexing even for other mdas if needed - we only need to define a proper key for the index).	2011-02-21 12:05:49 +00:00
Peter Rajnoha	716c4ebe52	Change and generalise struct format_instance for PV and VG use.	2011-02-21 12:01:22 +00:00
Zdenek Kabelac	aec2115410	Const fixing Fixing some const warnings - with API change in: int vg_extend(struct volume_group *vg, int pv_count, const char *const *pv_names, Change is needed - as lvm2api expects const behaviour here. So vg_extend() is doing local strdup for unescaping. skip_dev_dir returns const char* from const char* vg_name. Rest of the patch is cleanup of related warnings. Also using dm_report_field_string() API change to simplify casting in _string_disp and _lvname_disp.	2011-02-18 14:47:28 +00:00
Zdenek Kabelac	b1bcff7424	Critical section New strategy for memory locking to decrease the number of calls to un/lock memory when processing critical lvm functions. Introducing functions for critical section. Inside the critical section - memory is always locked. When leaving the critical section, the memory stays locked until memlock_unlock() is called - this happens with sync_local_dev_names() and sync_dev_names() function call. memlock_reset() is needed to reset locking numbers after fork (polldaemon). The patch itself is mostly rename: memlock_inc -> critical_section_inc memlock_dec -> critical_section_dec memlock -> critical_section Daemons (clvmd, dmeventd) are using memlock_daemon_inc&dec (mlockall()) thus they will never release or relock memory they've already locked. Macros sync_local_dev_names() and sync_dev_names() are now functions. It's better for debugging - and also we do not need to add memlock.h to locking.h header (for memlock_unlock() prototype).	2011-02-18 14:16:11 +00:00
Zdenek Kabelac	794e94fe16	Replace PV_MIN_SIZE with function pv_min_size() Add configurable option to define minimal size of a block device usable as a PV. pv_min_size() is added to lvm-globals and it's being initialized through _process_config. Macro PV_MIN_SIZE is unused and removed. New define DEFAULT_PV_MIN_SIZE_KB is added to lvm-global and unlike PV_MIN_SIZE it uses KB units. Should help users with various slow devices attached to the system, which cannot be easily filtered out (like FDD on /dev/sdX): https://bugzilla.redhat.com/show_bug.cgi?id=644578	2011-02-18 14:11:22 +00:00
Petr Rockai	21849a8587	Fix an lv_postorder bug where it failed to clear temporary flags, making it impossible to use twice with the same LV(s). Discovered by Milan.	2011-02-14 19:27:05 +00:00
Jonathan Earl Brassow	27ff8813da	Allow snapshots in a cluster as long as they are exclusively activated. In order to achieve this, we need to be able to query whether the origin is active exclusively (a condition of being able to add an exclusive snapshot). Once we are able to query the exclusive activation of an LV, we can safely create/activate the snapshot. A change to 'hold_lock' was also made so that a request to acquire a WRITE lock did not replace an EX lock, which is already a form of write lock.	2011-02-04 20:30:17 +00:00
Mike Snitzer	3e3591904b	Improve lvcreate "insufficient extents" errors to "insufficient free space".	2011-01-28 02:58:00 +00:00
Alasdair Kergon	cef065f63f	Fix lvchange --test to exit cleanly.	2011-01-24 14:19:05 +00:00
Alasdair Kergon	a8de276520	Replace fs_unlock by sync_local_dev_names to notify local clvmd. (2.02.80) Introduce sync_local_dev_names and CLVMD_CMD_SYNC_NAMES to issue fs_unlock.	2011-01-12 20:42:50 +00:00
Jonathan Earl Brassow	6a095ca99f	s/log_verbose/log_error/ - Increase log level on error message.	2011-01-11 17:21:01 +00:00
Jonathan Earl Brassow	025e69a15a	Add disk to mirrored log type conversion.	2011-01-11 17:05:08 +00:00
Zdenek Kabelac	937a21f0d2	Speedup consequent activation calls Stop calling fs_unlock() from lv_de/activate(). Start using internal lvm fs cookie for dm_tree. Stop directly calling dm_udev_wait() and dm_tree_set/get_cookie() from activate code - it's now called through fs_unlock() function. Add lvm_do_fs_unlock() Call fs_unlock() when unlocking vg where implicit unlock solves the problem also for cluster - thus no extra command for clustering environment is required - only lvm_do_fs_unlock() function is added to call lvm's fs_unlock() while holding lvm_lock mutex in clvmd. Add fs_unlock() also to set_lv() so the command waits until devices are ready for regular open (i.e. wiping its beginning). Move fs_unlock() prototype to activation.h to keep fs.h private in lib/activate dir and not expose other functions from this header.	2011-01-10 14:02:30 +00:00
Zdenek Kabelac	6feecf76d4	Change import_vg_from_buffer to use config_tree Change function import_vg_from_buffer() to import_vg_from_config_tree(). Instead of creating config tree inside the function allow config tree to be passed as parameter - usable later for caching.	2011-01-10 13:13:42 +00:00
Zdenek Kabelac	2ae2ca89bf	Add backtraces for backup and backup_remove fail paths	2010-12-22 15:36:41 +00:00
Zdenek Kabelac	b7149bbe45	Add missing test for reallocation error.	2010-12-20 14:38:22 +00:00
Zdenek Kabelac	9b30dfb967	Use const char * for name and old_name in vg Switch to use const char pointers to avoid changes of these structure members and to have better control over where these members could be modified.	2010-12-20 13:40:46 +00:00
Zdenek Kabelac	9d9de35dca	Remove const usage from destroy callbacks As const segment_type or const format_type are never released use their non-const version and remove const downcast from dm_free calls. This change fixes many gcc warnings we were getting from them.	2010-12-20 13:32:49 +00:00
Zdenek Kabelac	ba96eb24fa	Some const cleanups Minor const warning fixes and internal API updates.	2010-12-20 13:19:13 +00:00
Zdenek Kabelac	760d1fac55	Add more strict const pointers around config tree To have better control where the config tree could be modified use more const pointers and very carefully downcast them back to non-const (for config tree merge).	2010-12-20 13:12:55 +00:00
Petr Rockai	ebfe96cad5	Add further consistency checking to vg_validate, ensuring that all segment areas point to LVs or PVs that are listed in the respective VG.	2010-12-14 17:51:09 +00:00
Petr Rockai	75b2f3507a	Add a validation step for pvmoveN internal LVs to vg_validate.	2010-12-14 17:07:35 +00:00
Alasdair Kergon	acb037657c	Fix scanning of VGs without in-PV mdas. Set cmd->independent_metadata_areas if metadata/dirs or disk_areas in use. - Identify and record this state. Don't skip full scan when independent mdas are present even if memlock is set. - Clusters and OOM aren't supported, so no problem doing the proper scans. Avoid revalidating the label cache immediately after scanning. - A simple optimisation. Support scanning for a single VG in independent mdas. - Not used by the fix but I left it in anyway as later patches might use it.	2010-12-10 22:39:52 +00:00
Alasdair Kergon	2b82bd79f5	Rename vg_release to free_vg.	2010-12-08 20:50:48 +00:00
Zdenek Kabelac	54fca7b1ca	Remove reset of vg->vgmem pointer as it is access of already released memory This reset of vgmem pointer causes access of already released memory. (_vg_make_handle allocates vg from vgmem pool itself - which is a bit tricky) Interestingly this memory fault was missed by our test suite.	2010-12-08 10:45:37 +00:00
Zdenek Kabelac	166597d998	Add backtraces for errors Add stack; backtraces when error is reported from dev_set() or dev_close_immediate().	2010-12-01 12:56:39 +00:00
Petr Rockai	8191fe4f4a	Refactor the percent (mirror sync, snapshot usage) handling code to use fixed-point values instead of a combination of a float value and an enum.	2010-11-30 11:53:31 +00:00
Petr Rockai	97e8048e05	Avoid the automatic MISSING_PV recovery path in commands with special MISSING_PV handling (cmd->handles_missing_pvs is set).	2010-11-30 11:15:54 +00:00
Alasdair Kergon	1415afcdba	Fix memory leak when VG allocation policy in metadata is invalid. Ignore unrecognised allocation policy found in metadata instead of aborting. Fix another missing vg_release() in _vg_read_by_vgid.	2010-11-29 18:35:37 +00:00
Zdenek Kabelac	201222ebad	Reset vg pointer after release Set vg to NULL after releasing it as the following memlock() test may lead to goto for the second call of vg_release() with the already released vg pointer.	2010-11-29 11:08:14 +00:00
Alasdair Kergon	728074ac83	Suppress 'No PV label' message when removing several PVs without mdas.	2010-11-23 01:55:53 +00:00
Petr Rockai	c1abd569f2	Add the macro and specific 'get' functions for lvsegs. Signed-off-by: Dave Wysochanski <wysochanski@pobox.com> Reviewed-by: Petr Rockai <prockai@redhat.com>	2010-11-17 20:08:14 +00:00
Alasdair Kergon	f8452d8cfd	Support repetition of --addtag and --deltag arguments. Add infrastructure for specific cmdline arguments to be repeated in groups. Split the_args cmdline arguments and values into arg_props and arg_values.	2010-11-11 17:29:05 +00:00
Zdenek Kabelac	64dff85ce4	Preserve const for char pointer Keep char pointers 'const' (introduced with cling commit).	2010-11-11 12:32:33 +00:00
Alasdair Kergon	eb82bd0525	Extend cling allocation policy to recognise PV tags (cling_by_tags). Add allocation/cling_tag_list to lvm.conf.	2010-11-09 12:34:40 +00:00
Peter Rajnoha	f7e3a19f75	Clarify error messages when activation fails due to activation filter use.	2010-11-05 18:18:11 +00:00
Alasdair Kergon	2aa06d73ca	pre-release	2010-10-25 13:54:29 +00:00
Zdenek Kabelac	91e56ffb29	Fix constness warning for _vg_read_by_vgid() uuid usage	2010-10-25 13:35:13 +00:00
Alasdair Kergon	eacd3a0916	fix header #defines	2010-10-25 12:01:59 +00:00
Alasdair Kergon	b83af51668	Add global/metadata_read_only to use unrepaired metadata in read-only cmds.	2010-10-25 11:20:54 +00:00
Dave Wysochanski	d53d92f2e1	Add lv_read_ahead and lv_kernel_read_ahead 'get' functions.	2010-10-21 14:49:31 +00:00
Dave Wysochanski	f1fc310730	Refactor and add code for (lv) 'lv_origin' get function.	2010-10-21 14:49:20 +00:00
Dave Wysochanski	6103254393	Refactor and add code for (lv) 'lv_name' get function.	2010-10-21 14:49:10 +00:00
Jonathan Earl Brassow	2c33c8b80c	Fix for bug 637936: killing both redundant logs causes deadlock Problem: When both legs of a mirrored log fail, neither the log nor the parent mirror can proceed. The repair code must be careful to replace the log with an error target before operating on the parent - otherwise, the parent can get stuck trying to suspend because it can't push through any writes. The steps to replace the log device with an error target were incomplete and resulted in the replacement not happening at all! The code originally had all the necessary logic to complete the replacement task, but it was pulled out in an effort to clean up that section of code, while fixing another bug: <offending commit msg> In addition, I added following three changes. - Removed tmp_orphan_lvs handling procedure It seems that _delete_lv() can handle detached_log_lv properly without adding mirror legs in mirrored log to tmp_orphan_lvs. Therefore, I removed the procedure. - Removed vg_write()/vg_commit() Metadata is saved by vg_write()/vg_commit() just after detached_log_lv is handled. Therefore, I removed vg_write()/vg_commit(). </offending commit msg> http://sources.redhat.com/cgi-bin/cvsweb.cgi/LVM2/lib/metadata/mirror.c?cvsroot=lvm2&f=h#rev1.130 I've reverted the "clean-up" changes associated with that fix, but not what that commit was actually fixing. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Reviewed-by: Petr Rockai <prockai@redhat.com>	2010-10-14 20:03:12 +00:00
Mike Snitzer	9443b5d4cd	Convey need for snapshot-merge target in lvconvert error message and man page. Add ->target_name to segtype_handler to allow a more specific target name to be returned based on the state of the segment. Result of trying to merge a snapshot using a kernel that doesn't have the snapshot-merge target: Before: # lvconvert --merge vg/snap Can't expand LV lv: snapshot target support missing from kernel? Failed to suspend origin lv After: # lvconvert --merge vg/snap Can't process LV lv: snapshot-merge target support missing from kernel? Failed to suspend origin lv Unable to merge LV "snap" into it's origin.	2010-10-13 21:26:37 +00:00
Petr Rockai	042312952c	Give correct error message when creating a too-small snapshot (BZ 587063)	2010-10-13 13:52:53 +00:00
Zdenek Kabelac	7c9fd3ea84	Don't use floor() in _bitset_with_random_bits Use _even_rand() function instead of floor() in _bitset_with_random_bits(). floor() function is missing in dietlibc (on architectures other than x86). Moreover using floor() to clip rand results does not assure even result distribution. _even_rand() uses integer arithmetic only and is designed to return evenly distributed results. > Looks OK to me. It took a while to decipher what is the exact meaning of > the loop in _even_rand (to a non-pseudorandomness-expert) but I am > fairly comfortable with it now. If I understand this correctly, it > rejects numbers that come from an "incomplete" slice of the RAND_MAX > space (considering the number space [0, RAND_MAX] is divided into some > "max"-sized slices and at most a single smaller slice, between [n*max, > RAND_MAX] for suitable n -- numbers from this last slice are discarded > because they could distort the distribution in favour of smaller > numbers). Signed-off-by: Przemyslaw Iskra <sparky <at> pld-linux.org> Reviewed-by: Petr Rockai <prockai <at> redhat.com>	2010-10-13 12:18:53 +00:00
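The slice-rejection idea described in the review comment above can be sketched in C as follows. This is a minimal illustration only, not the actual `_even_rand()` from LVM2: the name `even_rand_sketch` and its signature are hypothetical, and it uses plain `rand()` from the C standard library.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical sketch, not the LVM2 implementation.
 * Returns a value in [0, max) using integer arithmetic only.
 * Values from the last, incomplete slice of [0, RAND_MAX] are
 * rejected so that every result is equally likely.
 */
unsigned even_rand_sketch(unsigned max)
{
	/* Number of distinct values rand() can return. */
	unsigned long range = (unsigned long) RAND_MAX + 1;
	/* First value belonging to the incomplete slice (if any). */
	unsigned long limit;
	unsigned long r;

	assert(max > 0 && max <= range);
	limit = range - (range % max);

	do {
		r = (unsigned long) rand();
	} while (r >= limit);	/* discard numbers from the short slice */

	return (unsigned) (r % max);
}
```

Because only complete `max`-sized slices are accepted, each remainder occurs the same number of times among the accepted values. The loop may retry, but at least half of the draws are always accepted.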
Dave Wysochanski	f70468ce0b	Fix lv_modules_dup segfault.	2010-10-12 17:09:23 +00:00
Petr Rockai	98351ffbd5	Make lvconvert respect --yes/--force in the inactive log conversion prompt. Fixes BZs 642055, 621281. Patch by Taka. Signed-off-by: Takahiro Yasui <tyasui@redhat.com> Reviewed-by: Petr Rockai <prockai@redhat.com>	2010-10-12 16:41:17 +00:00
Dave Wysochanski	2eba846043	Refactor and add code for (lv) 'modules' get function.	2010-10-12 16:13:06 +00:00
Dave Wysochanski	d88090b0ae	Refactor and add code for (lv) 'mirror_log' get function. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Reviewed-By: Petr Rockai <prockai@redhat.com>	2010-10-12 16:12:50 +00:00
Dave Wysochanski	40c6c80723	Refactor and add code for (lv) 'lv_kernel_{major|minor}' get functions. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Reviewed-By: Petr Rockai <prockai@redhat.com>	2010-10-12 16:12:33 +00:00
Dave Wysochanski	e27833fb9c	Refactor and add code for (lv) 'convert_lv' get function. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Reviewed-By: Petr Rockai <prockai@redhat.com>	2010-10-12 16:12:18 +00:00
Dave Wysochanski	af579eccc3	Refactor and add code for (lv) 'move_pv' get function. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Reviewed-By: Petr Rockai <prockai@redhat.com>	2010-10-12 16:12:02 +00:00
Dave Wysochanski	29636f38e3	Refactor and add code for (lv) 'origin_size' get function. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Reviewed-By: Petr Rockai <prockai@redhat.com>	2010-10-12 16:11:48 +00:00
Dave Wysochanski	802e252b29	Refactor and add code for (lv) 'lv_path' get function.	2010-10-12 16:11:34 +00:00
Dave Wysochanski	637ac19e60	Rename 'flags' to 'status' for struct metadata_area. In other LVM memory structures such as volume_group, the field used to store flags is called "status", and on-disk fields are called 'flags', so rename the one inside metadata_area to be consistent. Not only is it more consistent with existing code but is cleaner to say "the status of this mda is ignored". Background for this patch - prajnoha pinged me on IRC this morning about a fix he was working on related to metadataignore when metadata/dirs was set. I was reviewing my patches from this year and realized the 'flags' field was probably not the best choice when I originally did the metadataignore patches.	2010-10-05 17:34:05 +00:00
Dave Wysochanski	0ca1492ca5	Fix copyright dates on new files lib/metadata/{lv|vg|pv}.[ch].	2010-09-30 20:47:18 +00:00
Dave Wysochanski	b184f791d4	Add pv_name_dup() and pv_fmt_dup() helper functions.	2010-09-30 14:09:22 +00:00
Dave Wysochanski	1cd292af8f	Add pv_mda_size, pv_mda_free, and pv_used functions, call from 'disp' functions.	2010-09-30 14:09:10 +00:00
Dave Wysochanski	b1ef78d000	Add supporting functions vg_name_dup, vg_fmt_dup, vg_system_id_dup. Add supporting functions for vg_name, vg_fmt, vg_system_id. Append "_dup" to end of supporting functions to make clear the strings are dup'd and to avoid namespace conflict with vg_name.	2010-09-30 14:08:33 +00:00
Dave Wysochanski	c508945ca9	Add pv_tags_dup, vg_tags_dup, lv_tags_dup functions that call tags_format_and_copy.	2010-09-30 14:08:19 +00:00
Dave Wysochanski	f15033c0e1	Add tags_format_and_copy() common function and call from _tags_disp. Add a common function to allocate memory and format a string of tags. Call tags_format_and_copy() from _tags_disp().	2010-09-30 14:08:07 +00:00
Dave Wysochanski	254d672dcc	Add pv_uuid_dup, vg_uuid_dup, and lv_uuid_dup, and call id_format_and_copy. Add supporting functions for pv_uuid, vg_uuid, and lv_uuid. Call new function id_format_and_copy. Use 'const' where appropriate. Add "_dup" suffix to indicate memory is being allocated. Call {pv|vg|lv}_uuid_dup from lvm2app uuid functions.	2010-09-30 14:07:47 +00:00
Dave Wysochanski	4bbadbe1cf	Simplify logic to create 'attr' strings. This patch addresses code review request to simplify creation of 'attr' strings. The simplification is done in this separate patch to more easily review and ensure the simplification is done without error.	2010-09-30 14:07:19 +00:00
Dave Wysochanski	14663348d0	Add {pv|vg|lv}_attr_dup() functions and refactor 'disp' functions. Move the creating of the 'attr' strings into a common function so they can be called from the 'disp' functions as well as the new 'get' property functions. Add "_dup" suffix to indicate memory is allocated. Refactor pvstatus_disp to take pv argument and call pv_attr_dup().	2010-09-30 13:52:55 +00:00
Dave Wysochanski	e32e2eb011	Add lib/metadata/vg.[ch] and lib/metadata/lv.[ch]. These got missed when git cvsexportcommit was used.	2010-09-30 13:16:55 +00:00
Dave Wysochanski	b88b638d6e	Add lib/metadata/pv.[ch] new files. Apparently git cvsexportcommit does not properly add new files from a git commit.	2010-09-30 13:15:42 +00:00
Dave Wysochanski	b171907fc5	Refactor metadata.[ch] into lv.[ch] for lv functions. This patch is similar to the other patches for pv and vg functionality, and separates lv functionality into separate files, concentrating on reporting fields and simple functions.	2010-09-30 13:05:45 +00:00
Dave Wysochanski	f42b708eae	Refactor metadata.[ch] into pv.[ch] for pv functions. The metadata.[ch] files are very large. This patch makes a first attempt at separating out pv functions and data, particularly related to the reporting fields calculations. More code could be moved here but for now I'm stopping at reporting functions 'get' / 'set' functions.	2010-09-30 13:05:20 +00:00
Dave Wysochanski	81f0124a58	Refactor metadata.[ch] into vg.[ch] for vg functions. The metadata.[ch] files are very large. This patch makes a first attempt at separating out vg functions and data, particularly related to the reporting fields calculations.	2010-09-30 13:04:55 +00:00
Peter Rajnoha	bad35c6554	Add escape sequence for ':' and '@' found in device names used as PVs.	2010-09-23 12:02:33 +00:00
Milan Broz	c7af31dbd7	Fix return type qualifier to avoid compiler warning. introduced in commit `b16b4d92a7` "Improve various log messages." fixes a lot of ../include/metadata.h:148: warning: type qualifiers ignored on function return type	2010-08-26 12:08:19 +00:00
Mike Snitzer	4efb1d9cbb	Update heuristic used for default and detected data alignment. Add "devices/default_data_alignment" to lvm.conf to control the internal default that LVM2 uses: 0==64k, 1==1MB, 2==2MB, etc. If --dataalignment (or lvm.conf's "devices/data_alignment") is specified then it is always used to align the start of the data area. This means the md_chunk_alignment and data_alignment_detection are disabled if set. (Same now applies to pvcreate --dataalignmentoffset, the specified value will be used instead of the result from data_alignment_offset_detection) set_pe_align() still looks to use the determined default alignment (based on lvm.conf's default_data_alignment) if the default is a multiple of the MD or topology detected values.	2010-08-20 20:59:05 +00:00
Dave Wysochanski	69d67dc2ca	Add vg_mda_size and vg_mda_free functions. Add supporting functions to get vg_mda_size and vg_mda_free fields. Should be no functional change.	2010-08-20 12:43:49 +00:00
Milan Broz	586b56b18c	Fix wrong use of LCK_WRITE In all top vg read functions only LCK_VG_READ/WRITE can be used. All other vg lock definitions are low-level backend machinery. Moreover, LCK_WRITE cannot be tested through bitmask. This patch fixes these mistakes. For _recover_vg() we do not need lock_flags, it can only be one of the two above and we are always upgrading to the LCK_VG_WRITE lock there. (N.B. that code is racy) There is no functional change in code (despite wrong masking it produces correct bits:-)	2010-08-19 23:26:31 +00:00
Milan Broz	727f7bfa49	Detect LUKS signature in pvcreate One shiny day we should use libblkid here. But now using LUKS is very common together with LVM and pvcreate destroys LUKS completely. So for user's convenience, try to detect LUKS signature and allow abort.	2010-08-19 23:08:18 +00:00
Milan Broz	2d5e2b52ca	Change the pvcreate swap/md logic pvcreate detects MD and swap signatures. The logic hidden there is not only undocumented but it is also user unfriendly. Whoever invented this logic should run pvcreate on their own critical MD device to see why;-) This patch - creates one function instead of duplicated code - asks if the user wants to overwrite the signature - allows aborting (!) (Please note that writing an LVM signature without wiping the old one is wrong: it confuses blkid, MD will not work anyway, and swap and LUKS are broken too.)	2010-08-19 23:03:34 +00:00
Alasdair Kergon	22149572e8	Use 'SINGLENODE' instead of 'dead' in clvmd singlenode messages. Ignore snapshots when performing mirror recovery beneath an origin. Pass LCK_ORIGIN_ONLY flag around cluster. Add suspend_lv_origin and resume_lv_origin using LCK_ORIGIN_ONLY.	2010-08-17 19:25:05 +00:00
Alasdair Kergon	2d6fcbf67d	Allow internal suspend and resume of origin without its snapshots.	2010-08-17 16:25:32 +00:00
Jonathan Earl Brassow	d0191bf9f4	Fix for bug 612291: dm devices of split off mirror images are not removed DM devices were not handled properly on nodes in a cluster that were not where the splitmirrors command was issued. This was happening because suspend_lv/resume_lv were being used in a place where activate_lv should have been used. When the suspend/resume are issued on (effectively) new LVs, their 'resource' (UUID) is not located in the lv_hash. Thus, both operations turn into no-ops. You can see this from the output of clvmd from one of the remote nodes: <snip> do_suspend_lv, lock not already held <snip> do_resume_lv, lock not already held 'activate_lv' enjoins the other nodes in the cluster to process the lock and activate the new LV. clvmd output from remote node as follows: do_lock_lv: resource 'zMseY7CBuO3Ty09vXlplPAHzD0Y0CovjrTdv0R1VcwggMwPdYhutHErRcwm5Nd2S', cmd = 0x19 LCK_LV_ACTIVATE (READ|LV|NONBLOCK), flags = 0x84 (DMEVENTD_MONITOR ), memlock = 1 sync_lock: 'zMseY7CBuO3Ty09vXlplPAHzD0Y0CovjrTdv0R1VcwggMwPdYhutHErRcwm5Nd2S' mode:1 flags=1 sync_lock: returning lkid 27b0001 Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Reviewed-by: Petr Rockai <prockai@redhat.com>	2010-08-16 18:02:14 +00:00
Mike Snitzer	b123a82d73	Change default alignment of pe_start to 1MB. The new standard in the storage industry is to default alignment of data areas to 1MB. fdisk, parted, and mdadm have all been updated to this default. Update LVM to align the PV's data area start (pe_start) to 1MB. This provides a more useful default than the previous default of 64K (which generally ended up being a 192K pe_start once the first metadata area was created). Before this patch: # pvs -o name,vg_mda_size,pe_start PV VMdaSize 1st PE /dev/sdd 188.00k 192.00k After this patch: # pvs -o name,vg_mda_size,pe_start PV VMdaSize 1st PE /dev/sdd 1020.00k 1.00m The heuristic for setting the default alignment for LVM data areas is: - If the default value (1MB) is a multiple of the detected alignment then just use the default. - Otherwise, use the detected value. In practice this means we'll almost always use 1MB -- that is unless: - the alignment was explicitly specified with --dataalignment - or MD's full stripe width, or the {minimum,optimal}_io_size exceeds 1MB - or the specified/detected value is not a power-of-2	2010-08-12 04:11:48 +00:00
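The heuristic listed in the commit message above can be illustrated with a small C sketch. The function name `choose_pe_align_kb`, the use of KiB units, and the convention that `0` means "not specified" or "nothing detected" are assumptions made for this example; they are not taken from the LVM2 sources, and the real logic lives in `set_pe_align()`.

```c
#include <stdint.h>

/* 1MB default named in the commit message, expressed in KiB. */
#define DEFAULT_PE_ALIGN_KB 1024u

/*
 * Hypothetical sketch of the default-alignment heuristic.
 * explicit_kb: value from --dataalignment (0 = not given, an assumption)
 * detected_kb: MD/topology detected alignment (0 = none, an assumption)
 */
uint64_t choose_pe_align_kb(uint64_t explicit_kb, uint64_t detected_kb)
{
	/* An explicitly specified alignment is always used. */
	if (explicit_kb)
		return explicit_kb;

	/* Nothing detected: fall back to the default. */
	if (!detected_kb)
		return DEFAULT_PE_ALIGN_KB;

	/* Default is a multiple of the detected value: use the default. */
	if (DEFAULT_PE_ALIGN_KB % detected_kb == 0)
		return DEFAULT_PE_ALIGN_KB;

	/* Otherwise (larger than 1MB, or not a power of 2): use detected. */
	return detected_kb;
}
```

With these assumptions, a detected 64K chunk yields 1MB, while a detected 2MB full stripe width or a 768K (non-power-of-2) value is used as detected.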
Jonathan Earl Brassow	8d2d4f1fa0	Fix for bug 619221 - log device splitting regression An incorrect fix on July 13, 2010 for an annoyance has caused a regression. The offending check-in was part of the 2.02.71 release of LVM. That check-in caused any PVs specified on the command line to be ignored when performing a mirror split. This patch reverses the aforementioned check-in (solving the regressions) and posits a new solution to the list reversal problem. The original problem was that we would always take the lowest mimage LVs from a mirror when performing a split, but what we really want is to take the highest mimage LVs. This patch accomplishes that by working through the list in reverse order - choosing the higher numbered mimages first. (This also reduces the amount of processing necessary.) Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Reviewed-by: Takahiro Yasui <takahiro.yasui@hds.com>	2010-08-06 15:38:32 +00:00
Jonathan Earl Brassow	cbd41292a4	Taka's fix for handling failure of all mirrored log devices and all but one mirror leg. <patch header> To handle a double failure of a mirrored log, Jon's two patches are committed, however, the lvconvert command still can't handle an error when a mirror leg and the mirrored log fail at the same time. [Patch]: Handle both devices of a mirrored log failing (bug 607347) posted: https://www.redhat.com/archives/lvm-devel/2010-July/msg00009.html commit: https://www.redhat.com/archives/lvm-devel/2010-July/msg00027.html [Patch]: Handle both devices of a mirrored log failing (bug 607347) - additional fix posted: https://www.redhat.com/archives/lvm-devel/2010-July/msg00093.html commit: https://www.redhat.com/archives/lvm-devel/2010-July/msg00101.html In the second patch, the target type of mirrored log is replaced with error target when remove_log is set to 1, but this procedure should also be used in other cases such as when the number of mirror legs is 1. This patch relocates the procedure to the main path. In addition, I added following three changes. - Removed tmp_orphan_lvs handling procedure It seems that _delete_lv() can handle detached_log_lv properly without adding mirror legs in mirrored log to tmp_orphan_lvs. Therefore, I removed the procedure. - Removed vg_write()/vg_commit() Metadata is saved by vg_write()/vg_commit() just after detached_log_lv is handled. Therefore, I removed vg_write()/vg_commit(). - With Jon's second patch, we think that we don't have to call remove_mirror_log() in _lv_update_mirrored_log() because it will be handled by remove_mirror_images() in _lvconvert_mirrors_repaire(). </patch header> Signed-off-by: Takahiro Yasui <takahiro.yasui@hds.com> Reviewed-by: Petr Rockai <prockai@redhat.com> Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>	2010-08-02 21:07:40 +00:00
Jonathan Earl Brassow	efaaf3146d	Disallow mirrored logs in cluster mirrors. The cluster log daemon (cmirrord) is not multi-threaded and can handle only one request at a time. When a log is stacked on top of a mirror (which itself contains a 'core' log), it creates a situation that cannot be solved without threading. When the top level mirror issues a "resume", the log daemon attempts to read from the log device to retrieve the log state. However, the log is a mirror which, before issuing the read, attempts to determine the 'sync' status of the region of the mirror which is to be read. This sync status request cannot be completed by the daemon because it is blocked on a read I/O to the very mirror requesting the sync status.	2010-08-02 19:03:45 +00:00
Dave Wysochanski	936541ec56	Remove irrelevant comments relating to vg_mda_copies.	2010-07-30 16:47:27 +00:00
Jonathan Earl Brassow	405c4a45d8	It's not enough to check for the kernel module in the case of cluster mirrors, we must also check that the log daemon (cmirrord) is running. The log module can be auto-loaded, but the daemon cannot be "auto-started". Failing to check for the daemon produces cryptic messages that customers have a hard time deciphering. (The system messages do report that the log daemon is not running, but people don't seem to find this message easily.) Here are examples of what is printed when the module is available, but the log daemon has not been started. [root@bp-01 LVM2]# lvcreate -m1 -l1 -n lv vg Shared cluster mirrors are not available. [root@bp-01 LVM2]# lvcreate -m1 -l1 -n lv vg -v Setting logging type to disk Finding volume group "vg" Archiving volume group "vg" metadata (seqno 3). Creating logical volume lv Executing: /sbin/modprobe dm-log-userspace Cluster mirror log daemon is not running Shared cluster mirrors are not available. Creating volume group backup "/etc/lvm/backup/vg" (seqno 4).	2010-07-21 13:40:21 +00:00
Jonathan Earl Brassow	60f425d1b3	Fix for bug 614164: No check for existing name when splitting mirror The user could use the same name as an existing LV when specifying a name for an LV split off from a mirror. This causes all sorts of issues.	2010-07-13 22:24:39 +00:00
Jonathan Earl Brassow	c42b084793	Fix for bugs: 612248 & 612291 Split mirror issues The main problem with these bugs was that the newly split off LV was not being suspended properly. This meant that the memlock count was not being balanced, the DM devices were not being renamed, and some DM devices which should have been removed were not. I've also renamed some of the variables and added comments to make things clearer as to what is going on. (I can break this patch in two if it means easier review.)	2010-07-13 21:48:16 +00:00
Jonathan Earl Brassow	a93fb6299f	Failed to test for the case where a log was requested to be removed even though there was no log. A simple run through the in-tree test suite would have caught this. :( - if (lv_is_mirrored(detached_log_lv) && + if (detached_log_lv && lv_is_mirrored(detached_log_lv) && Also, made some cosmetic changes suggested by kabi after my last check-in (e.g. s/return 0/return_0/ and adding an error message).	2010-07-09 17:57:51 +00:00
Dave Wysochanski	f77fb62b2a	Add log_error when strdup fails in {vg|lv}_change_tag(). Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-07-09 16:57:44 +00:00
Alasdair Kergon	08f1ddea6c	Use __attribute__ consistently throughout.	2010-07-09 15:34:40 +00:00
Alasdair Kergon	80e569104b	Remove superfluous fn prototypes.	2010-07-09 15:21:10 +00:00
Jonathan Earl Brassow	aa5734f2a3	Finish fix for bug 607347: failing both redundant mirror log legs... A previous check-in added logic to handle the case where both images of a mirrored log failed. It solved the problem by simply removing the log entirely - leaving the parent mirror with a 'core' log. This worked for most cases. However, if there was a small delay between the failures of the two mirrored log devices, the mirror would hang, LVM would hang, and no additional LVM commands could be issued. When the first leg of the log fails, it signals the need for repair. Before 'lvconvert --repair' is run by dmeventd, the second leg fails. 'lvconvert' would see both devices as failed and try to remove the log entirely. When it came time to suspend the parent mirror to update the configuration, the suspend would hang because it couldn't get any I/O through the mirrored log, which was plugged waiting for corrective action. The solution is to replace the log with an error target to clear any pending writes before removing it. This allows the parent mirror to suspend and make the proper changes.	2010-07-09 15:08:12 +00:00
Dave Wysochanski	a5fb2bbff3	Pass metadataignore to pv_create, pv_setup, _mda_setup, and add_mda. Pass metadataignore through PV creation / setup paths. As a result of this cleanup, we can remove the unnecessary setting of mda_ignore bits inside pvcreate_single(), after call to pv_create. For now, just set metadataignore to '0' in some places. This is equivalent to the prior functionality, although the 0 is given by the caller not hardcoded in _mda_setup() call. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-07-08 18:24:29 +00:00
Dave Wysochanski	dce204cec5	Init mda->list in mda_copy. This patch should be no functional change as all callers initialize mda->list.	2010-07-08 17:41:46 +00:00
Dave Wysochanski	7041b476ac	Add warning to vgextend and pvchange if metadataignore given on cmdline. Warn the user then change the value of vg_mda_copies. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-07-07 18:59:45 +00:00
Alasdair Kergon	7f7af46862	Adjust auto-metadata repair and caching logic to try to cope with empty mdas. - If a PV contained empty mdas, the auto-recovery code was not kicking in. - The 'inconsistent' state was getting lost when metadata was cached so recovery didn't kick in. But leave the behaviour alone when using precommitted metadata because of a warning in a confusing FIXME. In my testing, pvs and vgs didn't repair inconsistent metadata like they used to do. (How many other tools fail similarly now?) And there should be no need to cache inconsistent metadata because it is supposed to get repaired under the protection of a write lock immediately it is discovered. This code is in need of a redesign based on first principles. I still see bugs in this code and this commit is risky.	2010-07-07 02:53:16 +00:00
Alasdair Kergon	6c8655ce9b	fix code in 2nd mda unignore loop to match 1st loop	2010-07-06 20:09:38 +00:00
Alasdair Kergon	68f4e0c734	s/flags/mda/	2010-07-06 17:29:50 +00:00
Alasdair Kergon	0db1bbc3c3	shorten mesg	2010-07-06 17:27:32 +00:00
Alasdair Kergon	643f234119	fix jumbled args in 'Adjusting' message	2010-07-06 17:26:08 +00:00
Alasdair Kergon	d911ec67a9	Randomly select which mdas to use or ignore. Add some missing standard configure.in checks.	2010-07-05 22:23:15 +00:00
Alasdair Kergon	db3c1ac1c8	Add printf format attributes to yes_no_prompt & dm_{sn,as}printf and fix a calle	2010-07-02 21:16:50 +00:00
Alasdair Kergon	12eadbabdd	improve vgmetadatacopies unmanaged message	2010-06-30 20:03:52 +00:00
Dave Wysochanski	3b9d1b1a96	Check for missing_pv in vg_remove loop. If a pv is missing, we should just skip it rather than checking the device size and failing the vgremove.	2010-06-30 19:55:43 +00:00
Alasdair Kergon	d8886386bd	more mda ignore cleanups	2010-06-30 19:28:35 +00:00
Dave Wysochanski	40b4d1c3ae	Refactor vg_remove_check to place pv removal into separate function.	2010-06-30 18:03:52 +00:00
Alasdair Kergon	23177eda88	more metadataignore message/code cleanup	2010-06-30 17:13:05 +00:00
Alasdair Kergon	efe75fd705	revert that	2010-06-30 14:54:29 +00:00
Alasdair Kergon	a6c4427188	suppress useless compiler warning	2010-06-30 14:52:29 +00:00
Dave Wysochanski	ef7b409966	Only attempt to guarantee 1 mda ignored if there's at least one mda in the vg.	2010-06-30 14:48:07 +00:00
Alasdair Kergon	67b91d0848	Only attempt to guarantee 1 mda ignored if there's at least one mda in the vg.	2010-06-30 14:27:40 +00:00
Alasdair Kergon	647c64c796	Improve various log messages.	2010-06-30 13:51:11 +00:00
Dave Wysochanski	a5bf70018b	Add --metadataignore to pvcreate. Allow metadataignore flag to be passed in to pvcreate. Ideally, more refactoring of the mda allocation / initialization is warranted, but for now, we just add another parameter to 'add_mda' to take an existing mda ignored flag. We need to do this or pv_write loses the state of the mda 'ignored' flag before copying and writing to disk.	2010-06-30 12:17:24 +00:00
Dave Wysochanski	6af5155529	Improve logging for setting --vgmetadatacopies. Example of logging: metadata/metadata.c:1127 Setting mda_copies = 3 on vg vgtest metadata/pv_manip.c:296 /dev/loop2 0: 0 25: NULL(0:0) metadata/pv_manip.c:296 /dev/loop3 0: 0 25: NULL(0:0) metadata/pv_manip.c:296 /dev/loop4 0: 0 25: NULL(0:0) metadata/metadata.c:1072 Adjusting ignored mdas on vg vgtest, vg_mda_used_count=5, vg_mda_copies=3 metadata/metadata.c:1015 Setting ignore flag for 2 mdas on vg vgtest metadata/metadata.c:4151 Setting mda ignored flag for metadata_locn /dev/loop2. metadata/metadata.c:4151 Setting mda ignored flag for metadata_locn /dev/loop3.	2010-06-29 22:41:28 +00:00
Dave Wysochanski	d37dd5b2d3	Improve logging for metadata ignore by printing device name. Print device name when setting or clearing metadata ignore bit. Example: label/label.c:160 /dev/loop2: lvm2 label detected cache/lvmcache.c:1136 lvmcache: /dev/loop2: now in VG #orphans_lvm2 (#orphans_lvm2) metadata/metadata.c:4142 Setting mda ignored flag for metadata_locn /dev/loop2. format_text/text_label.c:318 Skipping mda with ignored flag on device /dev/loop2 at offset 4096	2010-06-29 22:37:32 +00:00
Dave Wysochanski	710c9373bf	Add some log_verbose debug statements related to metadataignore. Logging isn't ideal, especially for mda_set_ignore. Ideally we'd like to display the device name and offset in this case but this requires a bit more work and a per-format 'mda_description' function pointer definition (we don't have access to mda_context in metadata.c).	2010-06-29 22:25:58 +00:00
Dave Wysochanski	a375ced300	Move code into pv_change_metadataignore library function. In preparation to call this from both pvcreate as well as pvchange, move the guts of metadataignore into a library function. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-29 21:32:44 +00:00
Dave Wysochanski	a9d8bf269a	Allow 'all' and 'unmanaged' values for --vgmetadatacopies. Allowing an 'all' and 'unmanaged' value is more intuitive, and provides a simple way for users to get back to original LVM behavior of metadata written to all PVs in the volume group. If the user requests "--vgmetadatacopies unmanaged", this instructs LVM not to manage the ignore bits to achieve a specific number of metadata copies in the volume group. The user is free to use "pvchange --metadataignore" to control the mdas on a per-PV basis. If the user requests "--vgmetadatacopies all", this instructs LVM to do 2 things: 1) clear all ignore bits, and 2) set the "unmanaged" policy going forward. Internally, we use the special MAX_UINT32 value to indicate 'all'. This 'just' works since it's the largest value possible for the field and so all 'ignore' bits on all mdas in the VG will get cleared inside _vg_metadata_balance(). However, after we've called the _vg_metadata_balance function, we check for the special 'all' value, and if set, we write the "unmanaged" value into the metadata. As such, the 'all' value is never written to disk. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:40:01 +00:00
Dave Wysochanski	a09a8efb66	Update check in vg_split_mdas to account for ignored mdas list. The check in vg_split_mdas will trigger an error if the 'from' vg list is empty. However, this might be ok in some instances now that we have ignored mdas. Relax this check so an error is triggered only in the case where there's truly no more mdas in the 'from' vg. One example of where this makes a difference is with vgreduce. If we try to vgreduce a PV with un-ignored mdas, this should trigger the balancing function to un-ignore mdas on another PV in the VG. However, we don't get to vg_write() before we fail because this list size check fails, and we see an error message indicating: "Cannot remove final metadata area ..." Another example is with vgsplit into a new VG, where the PVs being moved contain all ignored mdas. We must move the mdas on fid->metadata_areas_ignored from 'vg_from' to 'vg_to'. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:38:56 +00:00
Dave Wysochanski	f61cd7b249	Ensure fid mda lists are populated correctly during vgextend. The vgextend path calls add_pv_to_vg(). Inside add_pv_to_vg(), we must ensure we pass the correct mdas list into pv_setup(), as copies of mdas are placed on the vg->fid list. If we don't place the mdas on the correct vg->fid list, the various counts may be incorrect and the metadata balance algorithm will not work when called from vg_write() path. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:38:39 +00:00
Dave Wysochanski	1b54343328	Implement _vg_adjust_ignored_mdas and call from vg_write() path. Compare the value of the newly added vg_mda_copies field (--vgmetadatacopies parameter) with the current count of in-use mdas, and ignore or unignore mdas as necessary to get to the target count. Also, as a safety check before returning, ensure we have at least one mda enabled. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:37:54 +00:00
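The adjustment described in the commit message above can be sketched roughly as follows. This is not the code of `_vg_adjust_ignored_mdas()`: the function name `adjust_ignored_sketch`, the flat `bool` array standing in for the mda lists, and the choice to pick mdas in array order are all assumptions made for illustration. A target of 0 is treated as "do not manage the ignore bits", following the mda_copies default described elsewhere in this log.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical sketch: ignored[i] is true when mda i has its ignore bit set.
 * Returns the number of in-use (non-ignored) mdas after the adjustment.
 */
size_t adjust_ignored_sketch(bool *ignored, size_t n, size_t target_copies)
{
	size_t used = 0, i;

	for (i = 0; i < n; i++)
		if (!ignored[i])
			used++;

	/* Assumption: 0 means the ignore bits are not managed at all. */
	if (target_copies == 0)
		return used;

	/* Too many mdas in use: set ignore bits until the target is met. */
	for (i = 0; i < n && used > target_copies; i++)
		if (!ignored[i]) {
			ignored[i] = true;
			used--;
		}

	/* Too few mdas in use: clear ignore bits until the target is met. */
	for (i = 0; i < n && used < target_copies; i++)
		if (ignored[i]) {
			ignored[i] = false;
			used++;
		}

	/* Safety check: keep at least one mda enabled. */
	if (n > 0 && used == 0) {
		ignored[0] = false;
		used = 1;
	}

	return used;
}
```

Starting from five in-use mdas with a target of 3, the sketch sets the ignore flag on two of them, which matches the counts in the "Adjusting ignored mdas" logging example elsewhere in this log.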
Dave Wysochanski	821f0cc5ea	Add vg get/set methods for VG metadata copies. This patch adds the get and partially implemented set function. The 'set' function should probably ignore or un-ignore metadata areas based on new values. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:36:56 +00:00
Dave Wysochanski	88d7dc1af8	Add mda_copies to VG structures and initialization. Add a field to struct volume_group to later implement metadata balancing: - mda_copies: target # of non-ignored mdas in the VG; default 0 (do not control pv 'ignore mdas' bit). This patch just adds the parameter to the structures with the default values but does not modify any commands. Should be no functional change. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:36:37 +00:00
Dave Wysochanski	0f2f8a5c3a	Before committing each mda, arrange mdas so ignored mdas get committed first. Arrange mdas so mdas that are to be ignored come first. This is an optimization that ensures consistency on disk for the longest period of time. This was noted by agk in review of the v4 patchset of pvchange-based mda balance. Note the following example for an explanation of the background: Assume the initial state on disk is as follows: PV0 (v1, non-ignored) PV1 (v1, non-ignored) PV2 (v1, non-ignored) PV3 (v1, non-ignored) If we did not sort the list, we would have a commit sequence something like this: PV0 (v2, non-ignored) PV1 (v2, ignored) PV2 (v2, ignored) PV3 (v2, non-ignored) After the commit of PV0's mdas, we'd have an on-disk state like this: PV0 (v2, non-ignored) PV1 (v1, non-ignored) PV2 (v1, non-ignored) PV3 (v1, non-ignored) This is an inconsistent state of the disk. If the machine fails, the next time it was brought back up, the auto-correct mechanism in vg_read would update the metadata on PV1-PV3. However, if possible we try to avoid inconsistent on-disk states. Clearly, because we did not sort, we have a greater chance of on-disk inconsistency - from the time the commit of PV0 is complete until the time PV3 is complete. We could improve the amount of time the on-disk state is consistent by simply sorting the commit order as follows: PV1 (v2, ignored) PV2 (v2, ignored) PV0 (v2, non-ignored) PV3 (v2, non-ignored) Thus, after the first PV is committed (in this case PV1), on-disk we would have: PV0 (v1, non-ignored) PV1 (v2, ignored) PV2 (v1, non-ignored) PV3 (v1, non-ignored) This is clearly a consistent state. PV1 will be read but the mda will be ignored. All other PVs contain v1 metadata, and no auto-correct will be required. 
In fact, if we commit all PVs with ignored mdas first, we'll only have an inconsistent state when we start writing non-ignored PVs, and thus the chances we'll get an inconsistent state on disk is much less with the sorted method. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:35:49 +00:00
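The commit ordering argued for above amounts to a stable partition that moves ignored mdas to the front. The following C sketch shows one way to do that on a plain array; `struct mda_sketch` and `order_ignored_first` are hypothetical names, and the real code operates on LVM2's mda lists rather than an array.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a metadata area and its ignore bit. */
struct mda_sketch {
	const char *pv_name;
	bool ignored;
};

/*
 * Stable partition: ignored mdas first, relative order otherwise preserved.
 * Each ignored entry is shifted left past any non-ignored entries before it.
 */
void order_ignored_first(struct mda_sketch *mdas, size_t n)
{
	size_t i, j;

	for (i = 1; i < n; i++) {
		struct mda_sketch cur = mdas[i];

		if (!cur.ignored)
			continue;

		j = i;
		while (j > 0 && !mdas[j - 1].ignored) {
			mdas[j] = mdas[j - 1];
			j--;
		}
		mdas[j] = cur;
	}
}
```

Applied to the example from the commit message (PV0 and PV3 non-ignored, PV1 and PV2 ignored), the resulting commit order is PV1, PV2, PV0, PV3.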
Dave Wysochanski	77e0ed4be7	Refactor vg_commit() to add _vg_commit_mdas(). Factor out calling mda->ops->vg_commit() for each mda. No functional change. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:35:33 +00:00
Dave Wysochanski	69d1732334	Update _vg_read and _text_create_text_instance to use fid_add_mda[s]. When we are constructing the vg, we may need to adjust the list of metadata_areas if there are ignored mdas. At label read time, we do not read the metadata of ignored mdas, and as a result, they do not get placed on vg->fid->metadata_areas inside _text_create_text_instance since lvmcache does not have these areas attached to vginfo->infos. However, when we're checking the pvids inside _vg_read, after having read another metadata area from another PV, we do have the opportunity to update the metadata_area and metadata_areas_ignored lists based on the read metadata_area. We need accurate mda lists for the reporting functions that count the ignored mdas, as well as general correctness of mda balancing. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:35:17 +00:00
Dave Wysochanski	bb723d7897	Use mdas_empty_or_ignored() in place of checks for empty mda list. With the addition of ignored mdas, we replace all checks for an empty mda list with a new function to look for either an empty mda list or ignored mdas. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:34:58 +00:00
Dave Wysochanski	f9c307cd07	Add mdas_empty_or_ignored() helper function. Add a helper function to consolidate checking for an empty mdas list or ignored mdas. Ignored mdas should behave almost identically to an empty mda list - the metadata areas should not be read or written to. This function will make it easier to implement metadata balancing and easier to track pvs with an empty mda list or ignored mdas. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:34:40 +00:00
Dave Wysochanski	cdbe475fe3	Define new functions and vgs/pvs fields related to mda ignore. Define a new pvs field, pv_mda_used_count, and a new vgs field, vg_mda_used_count to match the existing pv_mda_count and vg_mda_count. These new fields count the number of mdas that have the 'ignored' bit clear (they are in use on the PV / VG). Also define various supporting functions to implement the counting as well as setting the ignored flag and determining if an mda is ignored. These high level functions call into the lower level location independent mda ignore functions defined by earlier patches. Note that counting ignored mdas in a vg requires traversing both lists and checking for the ignored bit on the mda. The count of 'ignored' mdas then is defined by having the bit set, not by which list the mda is on. The list does determine whether LVM actually does read/write to the mda, though we must count the bits in order to return accurate numbers for the various counts. Also, pv_mda_set_ignored must search both vg lists for ignored mda. If the state changes and needs to be committed to disk, the ignored mda will be on the non-ignored list. Note also in pv_mda_set_ignored(), we must properly manage the mda lists. If we change the ignored state of an mda, we must change any mdas on vg->fid->metadata_areas that correspond to this pv. Also, we may need to allocate a copy of the mda, as is done when fid->metadata_areas is populated from _vg_read(), if we are un-ignoring an ignored mda. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:33:44 +00:00
Dave Wysochanski	9ccac021a7	Add metadata_areas_ignored list and functions to manage ignored mdas. Add a second mda list, metadata_areas_ignored to fid, and a couple functions, fid_add_mda() and fid_add_mdas() to help manage the lists. These functions are needed to properly count the ignored mdas and manage the lists attached to the 'fid' and ultimately the 'vg'. Ensure metadata_areas_ignored is initialized in other formats, even if the list is never used. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-06-28 20:33:22 +00:00
Dave Wysochanski	f55a20eb36	Rename fid->metadata_areas to fid->metadata_areas_in_use. Rename the metadata_areas list to an 'in_use' list to prepare for future 'ignored' list.	2010-06-28 20:32:44 +00:00
Dave Wysochanski	ef4fa155a5	Add mda location specific mda_copy constructor. Because of the way mdas are handled internally, where a PV in a VG has mdas on both info->mdas and vg->fid->metadata_areas list, we need a location independent copy constructor for struct metadata_area. Break up the existing format-text specific copy constructor into a format independent piece and a format dependent piece. This function is necessary to properly implement pv_set_mda_ignored(). Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Reviewed-by: Alasdair G Kergon <agk@redhat.com>	2010-06-28 20:31:59 +00:00
Dave Wysochanski	29f24d4634	Add mda_locns_match() internal library function for mapping pv/device to VG mda. A metadata_area is defined independent of the location. One downside is that there is no obvious mapping from a pv to an mda. For a PV in a VG, we need a way to start with a PV and end up with an MDA, if we are to manage mdas starting with a device/pv. This function provides us a way to go down the list of PVs on a VG, and identify which ones match a particular PV. I'm not entirely happy with this approach, but it does fit into the existing structures in a reasonable way. An alternative solution might be to refactor the VG - PV interface such that mdas are a list tied to a PV. However, this seemed a bit tricky since a PV does not come into existence until after the list of mdas is constructed (see _vg_read() - we create a 'fid' and attach mdas to it, then we go through them and attach pvs). Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Reviewed-by: Alasdair G Kergon <agk@redhat.com>	2010-06-28 20:31:38 +00:00
Dave Wysochanski	322c5868b3	Add location independent flag and functions to ignore mdas. First we add a 'flags' field to the location independent metadata_area structure, and a MDA_IGNORE flag. The mda_is_ignored and mda_set_ignored functions are added to manage the flag. Adding the flag and functions gives a library interface to ignore metadata areas independent of the underlying location (disk, file, etc). The location specific read/write functions must then handle the specifics of what this flag means to the location. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Reviewed-by: Alasdair G Kergon <agk@redhat.com>	2010-06-28 20:30:14 +00:00
Jonathan Earl Brassow	68c31a2a36	Fix for bz608048 from Taka... The same region size is used for both mirror volume and mirrored log volume, but when the physical extent size is bigger than region size, the size of mirror leg for mirrored log is smaller than the region size and lvcreate command fails. This patch adjusts a region size of mirrored log to a smaller value of region size or physical extent size. [This patch ensures that the region_size of the mirrored log does not exceed the size of the mirrored log itself, which would violate the kernel constraint: (region_size <= ti->len).] Signed-off-by: Takahiro Yasui <takahiro.yasui@hds.com> Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>	2010-06-28 14:19:41 +00:00
Jonathan Earl Brassow	42f7fd0590	The function that runs to compress a stacked mirror after converting from 2-way to 3-way mirror (collapse_mirrored_lv) was calling '_remove_mirror_images' with the 'remove_log' parameter set. When the code was put in to fix 599898 to honor log parameters during conversion, this argument was suddenly being honored. Thus, when someone would convert from a 2-way to 3-way mirror, the log would get removed. 'collapse_mirrored_lv' should not be calling '_remove_mirror_images' with 'remove_log' set.	2010-06-23 13:57:26 +00:00
Milan Broz	f9e177d281	Fix "allocated" warning typo.	2010-06-22 21:10:53 +00:00
Jonathan Earl Brassow	a7d355a28c	Mirrors can be layered - as in the case of converting a 2-way to a 3-way mirror. When conversion operations are performed on these types of mirrors, log options can be confused/ignored. In the case of a converting 3-way mirror, we have a top-level 2-way corelog mirror whose legs are 1) a 2-way disk-log mirror and 2) a linear device. If we wish to convert this 3-way mirror to a 2-way mirror, the linear device is removed and the extra top layer is eliminated. If we also wished to convert the disk log to a core log in the same step, ambiguity creeps in. It is somewhat obvious what the user wants - a 2-way mirror with a corelog. However, looking at the top level mirror before compression, it seems that the mirror already has a core log. This is why the operation seemed to fail. This patch simply re-evaluates what mirrored_seg points to after a compression and then considers the log argument. This is a fix for bug 599898.	2010-06-21 16:12:33 +00:00
Petr Rockai	d345bf2cd3	Account for mirror transient status when doing lvconvert --repair.	2010-05-24 15:32:20 +00:00
Zdenek Kabelac	4ef2bf27a7	Update Copyright date for recently modified files	2010-05-24 09:04:27 +00:00
Zdenek Kabelac	65928349e7	Replicator: add read and release VGs for rsites Add functions to read and release remote VGs for replicator sites in activation context.	2010-05-21 14:07:16 +00:00
Zdenek Kabelac	f6d7e637c3	Add toolcontext.h header file.	2010-05-21 13:34:09 +00:00
Zdenek Kabelac	6222635b38	Replicator: add find_replicator_vgs Adding find_replicator_vgs() function to find all needed VGs for replicator-dev LV. This function is later called before taking lock_vol().	2010-05-21 12:55:25 +00:00
Zdenek Kabelac	12569ccb03	Replicator: add sorted cmd_vg list Introduce struct cmd_vg to store information about needed volume group name, vgid, flags and the pointer to opened VG. Keep VGs list in alphabetical order for locking order. Introduce functions: cmd_vg_add() add new cmd_vg entry. cmd_vg_lookup() search cmd_vgs for vg_name. cmd_vg_read() open VGs in cmd_vgs list. cmd_vg_release() close VGs in reversed order.	2010-05-21 12:52:01 +00:00
Zdenek Kabelac	0a02d30ea4	Replicator: extend volume_group with list of VGs and flag Add pointer to linked list of opened VGs. List temporarily keeps the information about needed or locked and opened VGs for replicator target. Also add cmd_missing_vgs flag information for quick check and also for possible continuous process_each_lv() usage where we need to detect whether failure has been caused by missing VG or some other reason.	2010-05-21 12:47:46 +00:00
Zdenek Kabelac	e86e45f7ea	Replicator: extend _lv_each_dependency() with dependencies for Replicator devices	2010-05-21 12:45:18 +00:00
Zdenek Kabelac	651cae3c5c	Replicator: check replicator segment Check for possible problems within replicator structures. Used also by vg_validate.	2010-05-21 12:43:02 +00:00
Zdenek Kabelac	1207106fbc	Replicator: new files for Replicator target	2010-05-21 12:40:05 +00:00
Zdenek Kabelac	8fea97b7e7	Replicator: base lvm2 support Adding configure.in support for Replicators. Adding basic lib lvm support for Replicators. Adding flags REPLICATOR and REPLICATOR_LOG. Adding segments SEG_REPLICATOR and SEG_REPLICATOR_DEV. Adding basic methods for handling replicator metadata.	2010-05-21 12:36:30 +00:00
Dave Wysochanski	dd2a0e940d	Add find_vgname_from_{pvname|pvid} functions. Some commands start with a pvname, but we'd like to force users to start with a vg handle to obtain a pv handle. Our best option seems to be providing a way to look up the vgname from the pvname, and then require them to use vg_read/vg_open. In addition to the pvname lookup function, this patch also provides a lookup by pvid. The lookup by pvid can be used in conjunction with lvmcache_get_pvids to process all pvs in the system. The pvid find function first calls lvmcache_vgname_from_pvid, which may cause the label to be read if it is not in the cache. If the vgname returned is an orphan, we then check to see if there are metadata areas, and if not, we scan every PV on the system by calling scan_vgs_for_pvs(). In most cases we should not need to do this, and by using the info->mdas count, we avoid calling pv_read() as prior code did. So this patch is a bit cleaner and should allow us to refactor more of the pv code. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-05-19 11:52:37 +00:00
Alasdair Kergon	1d837442bf	Add is_global_vg and split out from is_orphan_vg.	2010-05-19 02:36:33 +00:00
Alasdair Kergon	34220fe292	Validate orphan and VG_GLOBAL lock order too.	2010-05-19 02:08:50 +00:00
Alasdair Kergon	fa305e2ec6	Accept orphan VG names as parameters to lock_vol() and related functions.	2010-05-19 01:16:40 +00:00
Jonathan Earl Brassow	a932c2b61f	Disallow toggling the cluster attribute of a volume group if there are active mirrors or snapshots. We don't have the mechanisms in place to change the device-mapper tables for those targets that have behavioral differences between cluster and single machine instances. Allowing users to change the attribute but not changing the target's behavior can lead to data corruption. The following bugs are fixed/avoided by this patch: 235123 - vgchange -c [ny] do not change target types when necessary 289331 - RFE: switching from cluster domain to local domain needs to deactivate volume somehow 289541 - when changing from local to cluster, volumes can not appear to be deactivated	2010-05-14 15:19:42 +00:00
Jonathan Earl Brassow	56a5925aed	Fix comment from last commit. Additionally, there is no need to put a comment into the WHATS_NEW file if it is a regression that was created and fixed inside the same release window.	2010-04-27 15:26:58 +00:00
Jonathan Earl Brassow	d7c9d72390	Patch to fix bug 586021 and maintain historical behavior of being able to remove more images from a mirror than the number of PVs directly specified for removal. The effort to fix bug 581611 corrected a bug that was unnoticed at the time. The loop in _remove_mirror_images that looks over the specified PVs was allowing devices that were previously counted and moved to the end of the list to be double-counted. This resulted in the number of devices needed for removal always being satisfied - even if the user did not specify enough PVs for removal to satisfy the request. When 581611 was fixed, this double-counting no longer took place and the result was to remove only the minimum of the number of PVs specified or the number that was asked to be removed. By simply always setting 'new_area_count' (as used to be done only in the else statement), we return to the previous behavior. Indeed, this is exactly what the double-counting was allowing to happen before the fix of 581611.	2010-04-27 14:57:49 +00:00
Mike Snitzer	60267bdce8	Disallow the direct removal of a merging snapshot. Allow lv_remove_with_dependencies() to know the top-level LV that was requested to be removed (otherwise it recurses and we lose context). A merging snapshot cannot be removed directly but the associated origin can be. Disallow removal of a merging snapshot unless the associated origin is also being removed.	2010-04-23 19:27:10 +00:00
Mike Snitzer	1f661c5dd8	When removing a snapshot avoid preloading the origin if the snapshot-merge target is not active.	2010-04-23 02:57:39 +00:00
Jonathan Earl Brassow	66f79d05eb	Disallow the primary mirror image from being removed when the mirror is not in-sync. This restriction is not extended to repair operations (i.e. it will not limit what 'lvconvert --repair' can do).	2010-04-21 13:55:08 +00:00
Alasdair Kergon	ee90b8197f	Move function up file	2010-04-20 12:14:28 +00:00
Peter Rajnoha	1e696b0c15	Do not reset position in metadata ring buffer on vgrename and vgcfgrestore. We should write metadata into next position in the ring buffer while calling vgrename and vgcfgrestore. At this code level (_vg_write_raw), we were not able to determine if this is a rename or not. If yes, then accompanying VG structure passed here has a new name set, not the old one. When looking for a location where to put metadata next, we were given a NULL value because of failed VG name comparison (in _find_vg_rlocn) between the name in existing metadata and metadata we're just about to write. This resets the position in the ring buffer, overwriting any existing metadata (and also incorrectly updates the cache to "orphan" afterwards). This patch just adds old_name item in struct volume_group that we can check and use if necessary and detect renames at lower layers as well. The same applies for vgcfgrestore, but here we're using a special value of old_name, an empty string, to disable the check with existing metadata totally.	2010-04-14 13:09:16 +00:00
Dave Wysochanski	af46c894d0	Add pv->vg to solidify link between a pv and a vg. lvm2app needs a link back to the vg in order to use the vg handle for memory allocations as well as other things. This patch adds the field to struct physical_volume, and sets pv->vg when reading a vg from disk or extending a vg by using the helper function previously added, add_pvl_to_vgs(). Moves and renames are handled with separate code inside move_pv() and vgmerge(). Add pv->vg check to vg_validate(). A NULL value in pv->vg signifies membership in the orphan VG. Note though in the case of pv_read() on a device with metadatacopies == 0, more devices may need to be read for an authoritative answer. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-04-13 17:26:36 +00:00
Dave Wysochanski	11647ad01c	Use del_pvl_from_vgs() in vgreduce paths. Somehow these got missed in earlier patches. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-04-13 17:26:20 +00:00
Dave Wysochanski	0adfbfd5ea	Call add_pvl_to_vgs() and del_pvl_from_vgs() from more places. Now that we have library functions to add/delete a pv from the vg->pvs list, call them from everywhere. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-04-13 17:26:03 +00:00
Dave Wysochanski	8cfd64de78	Add del_pvl_from_vgs() and move prototypes into metadata-exported.h Add a delete function to manage the vg->pvs list. NOTE: It may be possible to do further cleanup to these add/del functions by passing a 'pv' as input instead of 'pv_list'. The pv_list is used for functions which do allocations (lvcreate) while other places in the code just manage a list of 'pv' (e.g. import functions, vgextend, etc). Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-04-13 17:25:44 +00:00
Alasdair Kergon	1485ce69c4	Permit mimage LVs to be striped in lvcreate and lvresize.	2010-04-09 01:00:10 +00:00
Dave Wysochanski	fddc256a02	Check for duplicate paths (pvids) on the commandline of vgcreate. A user specifying duplicate paths on the cmdline of vgcreate will get a message similar to the following: vgcreate vgtest2 /dev/loop3 /dev/loop5 Found duplicate PV jk1lXsKzwyOKlXq6bhaFFKMQQ06oPgu8: using /dev/loop5 not /dev/loop3 Found duplicate PV jk1lXsKzwyOKlXq6bhaFFKMQQ06oPgu8: using /dev/loop3 not /dev/loop5 Internal error: Duplicate PV id jk1lXs-Kzwy-OKlX-q6bh-aFFK-MQQ0-6oPgu8 detected for /dev/loop3 in vgtest2. This is caught by vg_validate(), but it would be good to find this condition earlier in the vgcreate code. add_pv_to_vg() currently checks by pvname, but does not look for duplicate pvids. This patch adds the check for duplicate pvids and results in new error output as follows: vgcreate vgtest2 /dev/loop3 /dev/loop5 Found duplicate PV jk1lXsKzwyOKlXq6bhaFFKMQQ06oPgu8: using /dev/loop5 not /dev/loop3 Found duplicate PV jk1lXsKzwyOKlXq6bhaFFKMQQ06oPgu8: using /dev/loop3 not /dev/loop5 Physical volume '/dev/loop5 (jk1lXs-Kzwy-OKlX-q6bh-aFFK-MQQ0-6oPgu8)' listed more than once. Unable to add physical volume '/dev/loop5' to volume group 'vgtest2'. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-04-08 15:18:35 +00:00
Alasdair Kergon	4d0e07a799	missing ?:	2010-04-08 00:56:26 +00:00
Alasdair Kergon	b3302a0c3c	suppress bogus compiler warning	2010-04-08 00:52:41 +00:00
Alasdair Kergon	aab7a3978b	Fix pvmove allocation to take existing parallel stripes into account. When moving parts of striped LVs, pvmove wouldn't care about leaving you with two stripes on the same disk. Now --alloc anywhere is needed for that. (Tried and gave up on two alternative approaches before the one committed here.)	2010-04-08 00:28:57 +00:00
Dave Wysochanski	9e82787da2	Add add_pvl_to_vgs() - helper function to add a pv to a vg list. Small refactor of main places in the code where a pv is added to a vg into a small function which adds the pv to the list and updates the vg counts. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-04-06 14:04:54 +00:00
Dave Wysochanski	53ad3cad14	Add pv to vg->pvs after check for maximum value of vg->extent_count. In add_pv_to_vg(), we should only add the pv to vg->pvs after all internal checks have passed. The check for vg->extent_count exceeding maximum was after we added the pv to the list, so this function could return a state of vg->pvs that did not reflect other parameters such as vg->pv_count. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-04-06 14:03:43 +00:00
Alasdair Kergon	d27c8b5660	remove compiler warning	2010-04-02 01:35:34 +00:00
Alasdair Kergon	6ec52a01ee	A few more log_error to log_warn changes for mirrors.	2010-04-01 14:54:37 +00:00
Alasdair Kergon	abb9fb8370	Try to fix tracking of whether or not log extents need allocating.	2010-04-01 13:58:13 +00:00
Alasdair Kergon	0c67893ce9	Avoid endless loop if lv->segments list is corrupted	2010-04-01 13:08:06 +00:00
Alasdair Kergon	e7159c828b	initialise log_allocated to 0	2010-04-01 12:29:07 +00:00
Alasdair Kergon	d723636d52	Limit number of error messages when checking LV segments.	2010-04-01 12:14:20 +00:00
Alasdair Kergon	a1192f17ba	Improve vg_validate to detect some loops in lists.	2010-04-01 11:45:36 +00:00
Alasdair Kergon	0640232acd	Improve vg_validate to detect some loops in lists.	2010-04-01 11:43:24 +00:00
Alasdair Kergon	258db3ad8e	Change most remaining log_error WARNING messages to log_warn.	2010-04-01 10:34:09 +00:00
Alasdair Kergon	bce2869d92	Attempt to fix non-ALLOC_ANYWHERE allocation code after recent changes broke the preference given to the PVs with the largest free areas.	2010-03-31 20:26:04 +00:00
Milan Broz	6733116a19	Fix all segments memory is allocated from vg private mempool. Physical segments were still allocated from global command context mempool. This leads to very high memory usage when activating large VG (vgchange). (Memory usage was about 2G when >3000LVs). Fix it by properly using vg->vgmem private pool, so all the memory is released early. New memory pool parameter is needed here for pv_split_segment function. Also fix the same problem in some minor allocations (vg description, lv segment split).	2010-03-31 17:23:18 +00:00
Milan Broz	0423887528	Do not traverse PV segment list twice. In addition to previous patch, we really do not need to search for segment which was just allocated in split request. Make pv_split_segment function return newly allocated (split) segment also. (So after this patch, there is only one user of slow find_peg_by_pe).	2010-03-31 17:22:26 +00:00
Milan Broz	80b96a8974	Optimise PV segments search. The function find_peg_by_pe is incredibly inefficient for PVs with many segments. In shiny future there should be binary (or interval) tree instead of sorted linked list (volunteers?). Anyway, for now, we can use dirty trick here to optimise this case: - Allocations are usually applied from the beginning of PV (we have no allocation policy which allocates areas "backwards") - The only user of find_peg_by_pe is pv_split_segment() call. In most cases it needs to split last PV segment. So if we search sorted pv segment list backwards, we hit the requested segment immediately. This patch applies this tiny change. (and saves >30% of processing time when >3000LVs segments are on one PV!) To discourage using this inefficient function from other code, it is moved to pv_manip.c and used static for now:-)	2010-03-31 17:21:40 +00:00
Mikulas Patocka	655849fb14	A missing space in the error message. Add missing parentheses to an error message Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2010-03-31 12:06:30 +00:00
Alasdair Kergon	1dee5eb625	Fix --alloc contiguous policy only to allocate one set of parallel areas.	2010-03-29 17:59:46 +00:00
Jonathan Earl Brassow	7a369d3704	Add ability to create mirrored logs for mirror LVs. This check-in enables the 'mirrored' log type. It can be specified by using the '--mirrorlog' option as follows: #> lvcreate -m1 --mirrorlog mirrored -L 5G -n lv vg I've also included a couple updates to the testsuite. These updates include tests for the new log type, and some fixes to some of the lvconvert tests.	2010-03-26 22:15:43 +00:00
Alasdair Kergon	2abbc07f3c	Allow ALLOC_ANYWHERE to split contiguous areas.	2010-03-25 21:19:26 +00:00
Alasdair Kergon	a7ca334681	Add some assertions to allocation code.	2010-03-25 18:16:54 +00:00
Alasdair Kergon	f4cea344b1	improve a few comments in last check-in	2010-03-25 02:40:09 +00:00
Alasdair Kergon	8d6722c8ad	Introduce pv_area_used into allocation algorithm and add debug messages. This is the next preparatory step towards better --alloc anywhere support and is not intended to break anything that currently works so please report any problems - segfaults, bogus data in the new debug messages, or if the code now chooses bizarre allocation layouts.	2010-03-25 02:31:48 +00:00
Mike Snitzer	a6bc975a24	Improve activation monitoring option processing . Add "monitoring" option to "activation" section of lvm.conf . Have clvmd consult the lvm.conf "activation/monitoring" too. . Introduce toollib.c:get_activation_monitoring_mode(). . Error out when both --monitor and --ignoremonitoring are provided. . Add --monitor and --ignoremonitoring support to lvcreate. Update lvcreate man page accordingly. . Clarify that '--monitor' controls the start and stop of monitoring in the {vg,lv}change man pages. Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2010-03-23 22:30:18 +00:00
Alasdair Kergon	36f9d53b60	Allow dynamic extension of array of areas selected as allocation candidates.	2010-03-23 15:07:55 +00:00
Dave Wysochanski	15fdc8d3ee	Avoid scanning all pvs in the system if operating on a device with mdas. When we pv_read() a device that has an orphan vgname, we might need to scan the system to be sure this is true. However, if the PV has mdas, there's no way possible for it to have an orphan vgname unless it is a true orphan. Some areas of the code were optimized to take advantage of this fact, while others were not (we would still do the expensive scan if a device had mdas but had an orphan VG). This patch unifies the code so that every place we are operating on such a PV, we skip the expensive scan if there are mdas. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Acked-by: Petr Rockai <prockai@redhat.com> Acked-by: Alasdair G Kergon <agk@redhat.com>	2010-03-18 17:29:12 +00:00
Milan Broz	acb4b5e4de	Fix pvcreate device check. If a user tries to vgcreate or vgextend a non-existent VG, these messages appear: # vgcreate xxx /dev/xxx Internal error: Volume Group xxx was not unlocked Device /dev/xxx not found (or ignored by filtering). Unable to add physical volume '/dev/xxx' to volume group 'xxx'. Internal error: Attempt to unlock unlocked VG xxx. (the same with existing VG and non-existing PV & vgextend) # vgextend vg_test /dev/xxx ... It is caused because code tries to "refresh" cache if md filter is switched on using cache destroy. But we can change filters and rescan even without this machinery now, just use refresh_filters (and reset md filter afterwards). (Patch also discovers cache alias bug in vgsplit test, fix it by using better filter line.)	2010-03-17 14:44:18 +00:00
Alasdair Kergon	b1f9a2f5d1	Only do one full device scan during each read of text format metadata.	2010-03-16 17:30:00 +00:00
Alasdair Kergon	38220f9fe9	Remove unnecessary full_scan parameter from get_vgids and get_vgnames calls.	2010-03-16 16:57:03 +00:00
Alasdair Kergon	cccae7e633	Look up missing PVs by uuid not dev_name in _pvs_single to avoid invalid stat. Make find_pv_in_vg_by_uuid() return same type as related functions.	2010-03-16 15:30:48 +00:00
Alasdair Kergon	770dc81b8e	Introduce is_missing_pv().	2010-03-16 14:37:38 +00:00
Mike Snitzer	c485fe183e	Handle a misaligned device that reports a -1 alignment_offset. The kernel's blk_stack_limits() function may flag a device as 'misaligned'. If it does the alignment_offset will be -1. Update set_pe_align_offset() to accommodate this corner case.	2010-03-02 21:56:14 +00:00
Alasdair Kergon	16d9293bd7	Extend core allocation code in preparation for mirrored log areas.	2010-03-01 20:00:20 +00:00
Dave Wysochanski	3c23ff0f2e	Add dm_pool_strdup to allocate memory and copy a tag in {lv|vg}_change_tag() We need to allocate memory for the tag and copy the tag value before we add it to the list of tags. We could put this inside lvm2app since the tools keep their memory around until vg_write/vg_commit is called, but we put it inside the internal library to minimize code in lvm2app. We need to copy the tag passed in by the caller to ensure the lifetime of the memory until the {vg|lv} handle is released. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-02-24 18:15:57 +00:00
Dave Wysochanski	cd69ee7453	Refactor lvchange_tag() to call lv_change_tag() library function. Similar refactoring to vgchange - pull out common parts and put into library function for reuse. Should be no functional change. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-02-24 18:15:49 +00:00
Dave Wysochanski	e17bcc7432	Refactor _vgchange_tag() to vg_change_tag() library function. Pull out common code to be called from tools as well as lvm2app. Leave archive() at tool level so we can use from vgcreate as well as vgchange. Should be no functional change. - add stack macro in vgchange Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-02-24 18:15:05 +00:00
Mike Snitzer	4bdebfd151	Do not reload origin again in lv_remove_single() if it had a merging snapshot. vg_remove_snapshot() will have already performed the required reload.	2010-02-17 23:36:45 +00:00
Mike Snitzer	a5ec3e3827	Refactor snapshot-merge deptree and device removal to support info-by-uuid Add a merging snapshot to the deptree, using the "error" target, rather than avoid adding it entirely. This allows proper cleanup of the -cow device without having to rename the -cow to use the origin's name as a prefix. Move the preloading of the origin LV, after a merge, from lv_remove_single() to vg_remove_snapshot(). Having vg_remove_snapshot() preload the origin allows the -cow device to be released so that it can be removed via deactivate_lv(). lv_remove_single()'s deactivate_lv() reliably removes the -cow device because the associated snapshot LV, that is to be removed when a snapshot-merge completes, is always added to the deptree (and kernel -- via "error" target). Now when the snapshot LV is removed both the -cow and -real devices get removed using uuid rather than device name. This paves the way for us to switch over to info-by-uuid queries. Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2010-02-17 22:59:46 +00:00
Dave Wysochanski	629efc6a89	Export lvm_pv_get_size(), lvm_pv_get_free(), lvm_pv_get_dev_size in lvm2app. We add these exports to show the pv_size and pv_free and dev_size fields. Fixes rhbz561423. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-02-14 03:21:37 +00:00
Dave Wysochanski	ed3329eb45	Fix off by 512 sizes for lvm2app. Internally we store sizes in sectors, but lvm2app exports sizes in bytes. We could get fancier and allow units configuration but this fix should do for now. Fixes rhbz561422. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>	2010-02-14 03:21:06 +00:00
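
Note on commit 0f2f8a5c3a (commit ignored mdas first): the ordering argument in that message can be checked with a small simulation. The sketch below is not lvm2 code; the `Mda` class and both helper functions are hypothetical names invented for illustration. It assumes the on-disk state is "consistent" whenever all non-ignored mdas carry the same metadata version, which is how the commit message describes it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mda:
    """Hypothetical stand-in for one PV's metadata area (not the lvm2 struct)."""
    pv: str
    version: int
    ignored: bool


def is_consistent(disk):
    # Assumption: only non-ignored mdas count; they must all share one version.
    versions = {m.version for m in disk.values() if not m.ignored}
    return len(versions) <= 1


def inconsistent_steps(initial, commits):
    """Apply commits one PV at a time and count inconsistent on-disk states."""
    disk = {m.pv: m for m in initial}
    count = 0
    for m in commits:
        disk[m.pv] = m
        if not is_consistent(disk):
            count += 1
    return count


# Initial state from the commit message: PV0..PV3 all at v1, none ignored.
initial = [Mda(f"PV{i}", 1, False) for i in range(4)]

# New state: everything at v2, with PV1 and PV2 marked ignored.
new_state = [
    Mda("PV0", 2, False),
    Mda("PV1", 2, True),
    Mda("PV2", 2, True),
    Mda("PV3", 2, False),
]

# Stable sort that moves ignored mdas to the front, keeping relative order.
sorted_commits = sorted(new_state, key=lambda m: not m.ignored)

unsorted_count = inconsistent_steps(initial, new_state)
sorted_count = inconsistent_steps(initial, sorted_commits)
print(unsorted_count, sorted_count)
```

Under these assumptions the unsorted order leaves the disk inconsistent after three of the four writes, while the sorted order does so after only one (the window between writing PV0 and PV3), matching the reasoning in the commit message.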

1590 Commits