
shaba/lvm2

mirror of git://sourceware.org/git/lvm2.git synced 2024-12-21 13:34:40 +03:00

Author	SHA1	Message	Date
Jonathan Brassow	d5896f0afd	Mirror: Fix hangs and lock-ups caused by attempting label reads of mirrors There is a problem with the way mirrors have been designed to handle failures that is resulting in stuck LVM processes and hung I/O. When mirrors encounter a write failure, they block I/O and notify userspace to reconfigure the mirror to remove failed devices. This process is open to a couple of races: 1) Any LVM process other than the one that is meant to deal with the mirror failure can attempt to read the mirror, fail, and block other LVM commands (including the repair command) from proceeding due to holding a lock on the volume group. 2) If there are multiple mirrors that suffer a failure in the same volume group, a repair can block while attempting to read the LVM label from one mirror while trying to repair the other. Mitigation of these races has been attempted by disallowing label reading of mirrors that are either suspended or are indicated as blocking by the kernel. While this has closed the window of opportunity for hitting the above problems considerably, it hasn't closed it completely. This is because it is still possible to start an LVM command, read the status of the mirror as healthy, and then perform the read for the label at the moment after the failure is discovered by the kernel. I can see two solutions to this problem: 1) Allow users to configure whether mirrors can be candidates for LVM labels (i.e. whether PVs can be created on mirror LVs). If the user chooses to allow label scanning of mirror LVs, it will be at the expense of a possible hang in I/O or LVM processes. 2) Instrument a way to allow asynchronous label reading - allowing blocked label reads to be ignored while continuing to process the LVM command. This action would allow LVM commands to continue even though they would have otherwise blocked trying to read a mirror. They can then release their lock and allow a repair command to commence. In the event of #2 above, the repair command already in progress can continue and repair the failed mirror. This patch brings solution #1. If solution #2 is developed later on, the configuration option created in #1 can be negated - allowing mirrors to be scanned for labels by default once again.	2013-10-22 19:14:33 -05:00
Peter Rajnoha	039bdad732	activation: flag temporary LVs internally Add LV_TEMPORARY flag for LVs with limited existence during command execution. Such LVs are temporary in the way that they need to be activated, some action done and then removed immediately. Such LVs are just like any normal LV - the only difference is that they are removed during LVM command execution. This is also the case for LVs representing future pool metadata spare LVs which we need to initialize by using the usual LV before they are declared as pool metadata spare. We can optimize some other parts like udev to do a better job if it knows that the LV is temporary and any processing on it is just useless. This flag is orthogonal to LV_NOSCAN flag introduced recently as LV_NOSCAN flag is primarily used to mark an LV for the scanning to be avoided before the zeroing of the device happens. The LV_TEMPORARY flag makes a difference between a full-fledged LV visible in the system and the LV just used as a temporary overlay for some action that needs to be done on underlying PVs. For example: lvcreate --thinpool POOL --zero n -L 1G vg - first, the usual LV is created to do a clean up for pool metadata spare. The LV is activated, zeroed, deactivated. - between "activated" and "zeroed" stage, the LV_NOSCAN flag is used to avoid any scanning in udev - between "zeroed" and "deactivated" stage, we need to avoid the WATCH udev rule, but since the LV is just a usual LV, we can't make a difference. The LV_TEMPORARY internal LV flag helps here. If we create the LV with this flag, the DM_UDEV_DISABLE_DISK_RULES and DM_UDEV_DISABLE_OTHER_RULES flags are set (just as it is with "invisible" and non-top-level LVs) - udev is directed to skip WATCH rule use. - if the LV_TEMPORARY flag was not used, there would normally be a WATCH event generated once the LV is closed after "zeroed" stage. This would cause problems with the immediate deactivation that follows.	2013-10-23 14:09:37 +02:00
Peter Rajnoha	9883bffb04	WHATS_NEW: typo	2013-10-22 16:37:02 +02:00
Peter Rajnoha	b109bfc1ef	blkdeactivate: fix endless loop if device(s) given and unable to umount/deactivate The blkdeactivate script iterates over the list of devices if they're given as an argument and it tries to umount/deactivate them one by one. This iteration failed to proceed if any of the umount/deactivation was unsuccessful - there was a missing "shift" call to move to the next argument (device) for processing. As a result of this, the same device was tried again and again, causing an endless loop, never proceeding to the next device given.	2013-10-22 16:24:39 +02:00
Peter Rajnoha	3fee661028	udev+systemd: refine lvm2-pvscan@.service to better track device existence When using ENV{SYSTEMD_WANTS}=lvm2-pvscan@... to instantiate a service for lvmetad scan when the new PV appears in the system, the service is started and executed. However, to track device removal, we need to bind it (the "BindsTo" systemd directive) to a certain .device systemd unit. In default systemd setup, the device is tracked by its name and sysfs path (there's normally a sysfs path .device systemd unit for a device and then the device name .device unit as an alias for it). Neither of these two is useful for lvmetad update as we need to bind it to device's <major>:<minor> pair. The /dev/block/<major>:<minor> is the essential symlink under /dev that exists for each block device (created by default udev rules provided by udev directly). So let's use this as an alias for the device's .device unit as well by means of "ENV{SYSTEMD_ALIAS}" declaration within udev rules which systemd understands (this will create a new alias "dev-block-<major>:<minor>.device"). Then we can easily bind the "dev-block-<major>:<minor>" device systemd unit with instantiated lvm2-pvscan@<major>:<minor>.service. So once the device is removed from the system, the lvm2-pvscan@<major>:<minor>.service executes its ExecStop action (which in turn notifies lvmetad about the device being gone). This completes the udev-systemd-lvmetad interaction then.	2013-10-22 14:22:40 +02:00
Peter Rajnoha	0a48137d39	pvscan: use major:minor as short form of --major and --minor arg for pvscan --cache Before, pvscan recognized either: pvscan --cache --major <major> --minor <minor> or pvscan --cache <DevicePath> When the device is gone and we need to notify lvmetad about device removal, only --major/--minor works as we can't translate DevicePath into major/minor pair anymore. The device does not exist in the system and we don't keep DevicePath index in lvmetad cache to make the translation internally into original major/minor pair. It would be useless to keep this index just for this one exact case. There's nothing bad about using "--major <major> --minor <minor>", but it makes our life a bit harder when trying to make an interconnection with systemd units, mainly with instantiated services where only one and only one arg can be passed (which is encoded in the service name). This patch tries to make this easier by adding support for recognizing the "<major>:<minor>" as a shortcut for the longer form "--major <major> --minor <minor>". The rule here is simple: if the argument starts with "/", it's a DevicePath, otherwise it's a <major>:<minor> pair.	2013-10-22 13:52:18 +02:00
Mike Snitzer	65456a4a29	vgimportclone: remove 2>/dev/null from three lvm commands There is no point eating stderr for these commands. In fact the redirect causes confusion and hurts debugging. Also reword an error message if the pvs command fails so as not to be certain that a device is not a PV. Coupled with removing the stderr redirect this will improve the user experience in the face of errors.	2013-10-21 18:04:14 -04:00
Peter Rajnoha	546db1c4be	udev+systemd: make pvscan --cache -aay run as systemd background job from udev The new lvm2-pvscan@.service is responsible for on-demand execution of "pvscan --cache --activate ay" which causes lvmetad to be updated and LVM activation done if the VG is complete. Also, use udev-systemd mechanism to instantiate the job as the lvm2-pvscan@$devnode.service on each newly appeared PV in the system. This prevents the background job from being killed (that would happen if it was directly forked from udev rule - this behaviour is seen in recent versions of udev with the help of systemd that can track detached processes - the detached process would still be in the same cgroup). To enable this official udev-systemd protocol for instantiating background jobs, use new --enable-udev-systemd-background-jobs configure switch (it's disabled by default). This option is highly recommended wherever systemd is used!	2013-10-18 11:38:49 +02:00
Zdenek Kabelac	1b7631101b	thin: fix lvconvert for active pool. Prohibit conversion of pool device with active thin volumes. Properly restore active states only for active thin pool volume. Use new LV_NOSCAN when converting volume into thin pool's metadata.	2013-10-16 10:53:01 +02:00
Peter Rajnoha	48df36b8c5	activation: check for open count with a timeout before removal/deactivation of an LV This patch reinstates the lv_info call to check for open count of the LV we're removing/deactivating - this was changed with commit `125712b` some time ago and we relied on the ioctl retry logic deeper in the libdm while calling the exact 'remove' ioctl. However, there are still some situations in which it's still required to check for open count before we do any 'remove' actions - this mainly applies to LVs which consist of several sub LVs, like it is for virtual snapshot devices. The commit `1146691` fixed the issue with ordering of actions during virtual snapshot removal while the snapshot is still open. But the check for the open status of the snapshot is still prone to marking the snapshot as in use with an immediate exit even though this could be a temporary asynchronous open only, most notably because of udev and its WATCH udev rule with accompanying scans for the event which is asynchronous. The situation where this crops up most often is when we're closing the LV that was open for read-write and then calling lvremove immediately. This patch reinstates the original lv_info call for the open status of the LV in the lv_check_not_in_use fn that gets called before we do any LV removal/deactivation. In addition to original logic, this patch adds its own retry loop with a delay (25x0.2 seconds) besides the existing ioctl retry loop.	2013-10-15 12:44:42 +02:00
Jonathan Brassow	f58b26b633	RAID: Report RAID images split with tracking as out-of-sync ("I"). Split image should have an out-of-sync attr ('I') - always. Even if the RAID LV has not been written to since the LV was split off, it is still not part of the group that makes up the RAID and is therefore "out-of-sync".	2013-10-14 10:48:44 -05:00
Zdenek Kabelac	851bba258c	snapshot: rework parsing of snapshot metadata Add better parsing code for snapshot metadata, which properly describes errors found in the snapshot segment.	2013-10-14 00:26:58 +02:00
Zdenek Kabelac	1146691afc	snapshot: deactivate virtual snapshot first Since the virtual snapshot has no reason to stay alive once we detach the related snapshot - deactivate the whole thing before snapshot removal - otherwise the code would get tricky to support in cluster. The correct full solution would require transactions for libdm operations. Also enable the check for the snapshot being open prior to the origin deactivation, otherwise we could easily end up with the origin being deactivated, but the snapshot still kept active, desynchronizing locking state in cluster.	2013-10-14 00:25:15 +02:00
Zdenek Kabelac	ac961087b0	snapshot: disable merging for virtual snaps Merging into virtual origin is not supposed to work.	2013-10-12 00:15:55 +02:00
Zdenek Kabelac	81504ba70c	snapshot: move virtsnap code from tool to lib Move code for removal dependency from tool's remove.c into lib's manipulation code. Same code then works with lvm2app.	2013-10-12 00:14:52 +02:00
Peter Rajnoha	304159c99a	cleanup: WHATS_NEW + compiler warning about discarding const	2013-10-10 09:09:16 +02:00
Alasdair G Kergon	7bed6d1263	filters: Add NVM Express (nvme).	2013-10-09 20:08:07 +01:00
Peter Rajnoha	1b91847beb	WHATS_NEW: commit `0decd75`	2013-10-09 15:59:19 +02:00
Peter Rajnoha	863be9d9c6	WHATS_NEW: commit `d888a05` and `808a5d9`	2013-10-09 12:11:12 +02:00
Peter Rajnoha	2f5ddfbade	udev: add support for "NOSCAN" flag Recognize DM_SUBSYSTEM_UDEV_FLAG0 which for LVM is the "LVM_NOSCAN" flag that causes the scanning to be skipped (mainly blkid) and also directs all the foreign rules to be skipped as well. Important thing here is that the "watch" udev rule is still set as well as the /dev/disk/by-id content created (which does not require any scanning to be done). Also, the flag is dropped on any subsequent event and scanning done...	2013-10-08 13:43:14 +02:00
Peter Rajnoha	ce7489ed22	activation: add support for flagging an LV to skip udev scanning during activation A common scenario is during new LV creation when we need to wipe the newly created LV and avoid any udev scanning before this stage otherwise it could cause the device (the LV) to be claimed by some other subsystem for which there were stale metadata within LV data. This patch adds possibility to mark the LV we're just about to wipe with a flag that gets passed to udev via DM_COOKIE as a subsystem specific flag - DM_SUBSYSTEM_UDEV_FLAG0 (in this case the subsystem is "LVM") so LVM udev rules will take care of handling that.	2013-10-08 13:43:14 +02:00
Zdenek Kabelac	92bafade60	thin: fix lvconvert in external origin conversion Patch `562ad293fd` introduced code regression when LV was converted to a thin LV with external origin and at the same time, conversion of LV to a thin pool has been requested. (RHBZ: #997704) data_lv needs to be assigned after test for external conversion find pool.	2013-10-08 13:41:06 +02:00
Zdenek Kabelac	30746f31dd	vgrename: run fullscan For vgrename run full scan so the command is able to properly detect name collision.	2013-10-08 13:39:11 +02:00
Alasdair G Kergon	4806f38d70	lvchange: improve discards when pool active error Existing message deemed misleading: Cannot change discards state for active pool volume https://bugzilla.redhat.com/show_bug.cgi?id=994315	2013-10-07 23:50:09 +01:00
Alasdair G Kergon	761b524519	post-release	2013-10-04 14:41:32 +01:00
Alasdair G Kergon	04d9a52684	release 2.02.103 52 files changed, 598 insertions(+), 264 deletions(-)	2013-10-04 14:32:23 +01:00
Peter Rajnoha	a7ff7aee4f	WHATS_NEW: renamed thin_pool_chunk_size_calculation -> policy	2013-10-04 12:36:32 +02:00
Alasdair G Kergon	baf95bbff7	cmdline: Add --ignoreskippedcluster. Accept --ignoreskippedcluster with pvs, vgs, lvs, pvdisplay, vgdisplay, lvdisplay, vgchange and lvchange to avoid the 'Skipping clustered VG' errors when requesting information about a clustered VG without using clustered locking and still exit with success. The messages can still be seen with -v.	2013-10-01 21:20:10 +01:00
Peter Rajnoha	e4c7236c07	udev: fix 3min udev timeout so that it is applied for all LVM volumes The timeout should be set before any volume skipping.	2013-09-27 15:37:16 +02:00
Jonathan Brassow	acdc731e83	RAID: Fix _sufficient_pes_free calculation for RAID lib/metadata/lv_manip.c:_sufficient_pes_free() was calculating the required space for RAID allocations incorrectly due to double accounting. This resulted in failure to allocate when available space was tight. When RAID data and metadata areas are allocated together, the total amount is stored in ah->new_extents and ah->alloc_and_split_meta is set. '_sufficient_pes_free' was adding the necessary metadata extents to ah->new_extents without ever checking ah->alloc_and_split_meta. This often led to double accounting of the metadata extents. This patch checks 'ah->alloc_and_split_meta' to perform proper calculations for RAID. This error is only present in the function that checks for the needed space, not in the functions that do the actual allocation.	2013-09-26 11:30:07 -05:00
Jonathan Brassow	d6516d2f79	WHATS_NEW: description for previous commit commit `098896fb29` failed to include description of what was fixed. "Conversion from linear to mirror or RAID1 now honors mirror_segtype_default."	2013-09-25 22:35:52 -05:00
Peter Rajnoha	dd796d6a94	profile: add thin-performance.profile Define a "performance" profile for thin pools which is exactly: - allocation/thin_pool_zero = 0 - thin_pool_chunk_size_calculation = "performance"	2013-09-25 16:07:35 +02:00
Peter Rajnoha	8bf425005c	conf: add allocation/thin_pool_chunk_size_calculation Add allocation/thin_pool_chunk_size_calculation lvm.conf option to select a method for calculating thin pool chunk sizes and define two possible values - "default" and "performance".	2013-09-25 16:06:38 +02:00
Jonathan Brassow	5ded7314ae	RAID: Fix broken allocation policies for parity RAID types A previous commit (`b6bfddcd0a`) which was designed to prevent segfaults during lvextend when trying to extend striped logical volumes forgot to include calculations for RAID4/5/6 parity devices. This was causing the 'contiguous' and 'cling_by_tags' allocation policies to fail for RAID 4/5/6. The solution is to remember that while we can compare ah->area_count == prev_lvseg->area_count for non-RAID, we should compare (ah->area_count + ah->parity_count) == prev_lvseg->area_count for a general solution.	2013-09-24 21:32:10 -05:00
Peter Rajnoha	6553f86818	lvmconf: use_lvmetad=0 on --enable-cluster, reset to default on --disable-cluster lvmetad is not yet supported in clustered environment so disable it automatically if using lvmconf --enable-cluster and reset it to default value if using lvmconf --disable-cluster. Also, add a few comments in lvm.conf about locking_type vs. use_lvmetad if setting it for clustered environment.	2013-09-24 14:03:42 +02:00
Peter Rajnoha	f050278a35	tools: don't install separate command symlink for lvm devtypes	2013-09-24 09:35:20 +02:00
Alasdair G Kergon	11dc6a03c4	lvs: Add seg_size_pe field. Requested https://www.redhat.com/archives/linux-lvm/2013-July/msg00112.html	2013-09-23 21:50:14 +01:00
Alasdair G Kergon	7233e584ad	pvmove: Accept PE ranges as start+length.	2013-09-23 19:50:34 +01:00
Alasdair G Kergon	bbcc120e5a	pvmove: clean exit on failed pvmove restart At present, before the pvmove command can be used to restart pvmove polling, the LVs concerned need to be activated e.g. with lvchange -ay.	2013-09-23 19:46:28 +01:00
Alasdair G Kergon	229e0752f1	post-release	2013-09-23 15:55:11 +01:00
Alasdair G Kergon	c8057aec36	release 2.02.102 18 files changed, 137 insertions(+), 203 deletions(-)	2013-09-23 15:43:37 +01:00
Christine Caulfield	431eda63cc	clvmd: Fix node up/down handling in corosync module The corosync cluster interface for clvmd did not correctly deal with node up/down events so that when a node was removed from the cluster clvmd would prevent remote operations from happening, as it thought the node was up but not running clvmd. This patch fixes that code by simplifying the case to node being up or down - which was the original intention and is supported by pacemaker and CPG in the higher layers. Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>	2013-09-23 13:23:00 +01:00
Zdenek Kabelac	ebf66ac316	Makefile: add missing deps Add missing deps for device-mapper build of scripts dir. Cleanup multiple SUBDIR lines together.	2013-09-23 12:13:51 +02:00
Zdenek Kabelac	3b604e5c8e	lvinfo: allow to use lv_info with NULL info When NULL info struct is passed in - function is usable as a quick query for lv_is_active_locally() - with a bonus we may query for layered device. So it could be seen as a more efficient lv_is_active_locally().	2013-09-23 12:13:06 +02:00
Alasdair G Kergon	bd75844024	release 2.02.101 112 files changed, 4131 insertions(+), 1312 deletions(-)	2013-09-20 13:56:29 +01:00
Alasdair G Kergon	6e912d949b	tools: Avoid overflow in _get_int_arg. Use strtoull instead of strtol so that argument size is not cut to 31 bits on machines with 32-bit long. (Mikulas)	2013-09-18 01:16:48 +01:00
Alasdair G Kergon	a3a5f58c21	reporting: Add devtypes command. Add internal devtypes reporting command to display built-in recognised block device types. (The output does not include any additional types added by a configuration file.) > lvm devtypes -o help Device Types Fields ------------------- devtype_all - All fields in this section. devtype_name - Name of Device Type exactly as it appears in /proc/devices. devtype_max_partitions - Maximum number of partitions. (How many device minor numbers get reserved for each device.) devtype_description - Description of Device Type. > lvm devtypes DevType MaxParts Description aoe 16 ATA over Ethernet ataraid 16 ATA Raid bcache 1 bcache block device cache blkext 1 Extended device partitions ...	2013-09-18 01:09:15 +01:00
Jonathan Brassow	d1bcb21e02	WHATS_NEW: Better description for commit `82228ac` More correct description of changes made to disallow thin+mirror.	2013-09-16 15:37:48 -05:00
Alasdair G Kergon	36c5bb40a2	Makefiles: Fix CC variable override. The CC override in commit `f42b2d4bbf` caused the built-in value to be used instead of the configured value when it wasn't being overridden. The behaviour is explained here: http://stackoverflow.com/questions/18007326/how-to-change-default-values-of-variables-like-cc-in-makefile	2013-09-16 19:57:14 +01:00
Alasdair G Kergon	97ba18f4cb	filters: Add bcache. N.B. Using bcache devices as PVs is still experimental. Problems should be reported to the appropriate mailing lists.	2013-09-16 16:56:55 +01:00
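
Commit 0a48137d39 above states a simple rule for the single argument of pvscan --cache: if it starts with "/", it is a DevicePath, otherwise it is a <major>:<minor> pair. The rule can be sketched as below. This is an illustration in Python, not the code from the patch; the function name `parse_pvscan_cache_arg` and the tuple return shape are hypothetical.

```python
from typing import Tuple, Union

ParsedArg = Union[Tuple[str, str], Tuple[str, int, int]]


def parse_pvscan_cache_arg(arg: str) -> ParsedArg:
    """Classify one argument using the rule from the commit message.

    Hypothetical helper: the name and return shape are assumptions
    made for this sketch only.
    """
    # Rule from the commit: a leading "/" means a device path.
    if arg.startswith("/"):
        return ("path", arg)

    # Anything else is expected to be "<major>:<minor>".
    major_text, sep, minor_text = arg.partition(":")
    if not sep or not major_text.isdecimal() or not minor_text.isdecimal():
        raise ValueError(f"expected <major>:<minor>, got {arg!r}")
    return ("devno", int(major_text), int(minor_text))
```

A single-argument form like this is what lets an instantiated systemd service, which can only carry one argument in its name, pass the device number through.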


2797 Commits
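
Commit 48df36b8c5 above describes a retry loop with a delay (25x0.2 seconds) around the open count check, so that a temporary asynchronous open (for example from the udev WATCH rule) does not make the LV look in use. A minimal sketch of that idea follows. It is written in Python for illustration and is not the lvm2 implementation; `check_not_in_use`, `get_open_count` and the injectable `sleep` parameter are hypothetical names.

```python
import time
from typing import Callable


def check_not_in_use(get_open_count: Callable[[], int],
                     retries: int = 25,
                     delay: float = 0.2,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Return True once the open count reaches zero, False if it never does.

    Defaults mirror the 25 x 0.2 s figures quoted in the commit message;
    everything else here is an assumption made for the sketch.
    """
    for attempt in range(retries):
        if get_open_count() == 0:
            return True
        # Wait before the next check, but not after the final attempt.
        if attempt < retries - 1:
            sleep(delay)
    return False
```

Passing `sleep` as a parameter is only a convenience for trying the sketch without real delays, for example `check_not_in_use(lambda: 0)` returns True immediately.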