Mirror of git://sourceware.org/git/lvm2.git

Compare commits


149 Commits

Author SHA1 Message Date
Marian Csontos
7cbee7e9cf pre_release 2019-03-22 11:04:15 +01:00
Marian Csontos
717957ddc5 build: make generate 2019-03-22 11:01:04 +01:00
Heinz Mauelshagen
9b04851fc5 raid: fix (de)activation of RaidLVs with visible SubLVs
There's a small window during creation of a new RaidLV when
rmeta SubLVs are made visible to wipe them in order to prevent
erroneous discovery of stale RAID metadata.  In case a crash
prevents the SubLVs from being committed hidden after such
wiping, the RaidLV can still be activated with the SubLVs visible.
During deactivation though, a deadlock occurs because the visible
SubLVs are deactivated before the RaidLV.

The patch adds _check_raid_sublvs to the raid validation in merge.c,
an activation check to activate.c (paranoid, because the merge.c check
will prevent activation in case of visible SubLVs), and shares the
existing wiping function _clear_lvs (moved from raid_manip.c to
lv_manip.c and renamed to activate_and_wipe_lvlist) to remove code
duplication.
Whilst on it, introduce activate_and_wipe_lv to share with
(lvconvert|lvchange).c.

Resolves: rhbz1633167
(cherry picked from commit dd5716ddf2)

Conflicts:
	WHATS_NEW
	lib/activate/activate.c
	lib/metadata/lv_manip.c
	lib/metadata/raid_manip.c
	tools/lvchange.c
	tools/lvconvert.c
2019-03-21 08:05:23 +01:00
David Teigland
dcf8f3111a pvscan: lvmetad init should set updating before scanning
When pvscan needs to initialize lvmetad (e.g. lvmetad has just
started and is empty), it should set the lvmetad state to "updating"
before it scans any devices.  Otherwise, many parallel pvscans
will try to initialize lvmetad, and in some cases an earlier pvscan
with less device information may replace a newer pvscan with
more recent information.
2019-03-07 11:07:27 -06:00
David Teigland
ece0b131e5 config: improve scan_lvs description 2019-03-06 13:38:33 -06:00
Alasdair G Kergon
519f4453a5 dmsetup: Fix multi-line concise table parsing
Use the correct loop variable within the loop, instead of reusing the
initial value; otherwise, table lines after the first don't get
terminated in the right place.

Signed-off-by: Kurt Garloff <kurt@garloff.de>
(cherry picked from commit ccfbd505fe)
2019-03-05 13:01:40 +01:00
Zdenek Kabelac
bc6ae7030a dm: migration_threshold for old linked tools
Just like with the preceding lvm2 device_mapper patch, ensure
that old users of libdm also get the fixed migration threshold
for caches.

(cherry picked from commit 74ae1c5bc1)

Conflicts:
	WHATS_NEW_DM
2019-03-05 13:01:19 +01:00
Zdenek Kabelac
167aa34926 stats: initialize regions to NULL
Commit 3750b0cff5 uses the 'bad:' error
path in more places, so it now needs regions initialized to NULL.

(cherry picked from commit 83c6f7e7e6)
2019-03-05 12:57:50 +01:00
Zdenek Kabelac
8fc64c5ee6 stats: fix error path when region is NULL
We should not call _stats_cleanup_region_ids() when regions
is NULL.
Also add backtracing for the goto.

(cherry picked from commit 3750b0cff5)
2019-03-05 12:57:40 +01:00
Zdenek Kabelac
8d44cd3e47 libdm: add memory barrier
Just in case, ensure the compiler is not able to optimize away the
memset() for resources that are released.

The idea of using a memory barrier is taken from openssl.

Another option would be to check for the 'explicit_bzero' function.

(cherry picked from commit 55a8d6c86b)

Conflicts:
	device_mapper/ioctl/libdm-iface.c
2019-03-05 12:48:15 +01:00
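As an illustration of the idea (a minimal C sketch with a hypothetical helper name, not the actual libdm change), an empty asm statement with a "memory" clobber acts as a compiler barrier, so the preceding memset() cannot be eliminated as a dead store:

  #include <string.h>

  /* Wipe a buffer before releasing it; the barrier keeps the
   * compiler from optimizing the memset() away as a dead store. */
  static void wipe_before_release(void *buf, size_t len)
  {
          memset(buf, 0, len);
          __asm__ volatile ("" : : "r" (buf) : "memory");
  }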
Zdenek Kabelac
40f57155a3 libdm: print params only for ioctls using them
When preparing the ioctl buffer and flattening all parameters,
add table parameters only to the ioctls that actually process them.

Note: the list of ioctls should be kept in sync with the kernel code.
(cherry picked from commit 43f8da7699)

Conflicts:
	WHATS_NEW_DM
	device_mapper/ioctl/libdm-iface.c
2019-03-05 12:47:50 +01:00
Zdenek Kabelac
1bef4dfab3 libdm: add DM_DEVICE_ARM_POLL
Expose DM_DEVICE_ARM_POLL via standard API enum.

(cherry picked from commit 1ae5bf2b83)

Conflicts:
	WHATS_NEW_DM
	device_mapper/all.h
	device_mapper/ioctl/libdm-iface.c
2019-03-05 12:46:21 +01:00
Zdenek Kabelac
e974f6866a cleanup: move cast to dev_t into MKDEV macro
(cherry picked from commit aa8b2d6a0f)

Conflicts:
	daemons/clvmd/clvmd-common.h
	device_mapper/ioctl/libdm-iface.c
	device_mapper/libdm-common.c
	device_mapper/libdm-deptree.c
2019-03-05 12:39:17 +01:00
Zdenek Kabelac
a93699ece9 cov: remove unused assigns
(cherry picked from commit 70e3d0a613)

Conflicts:
	tools/pvscan.c
	tools/vgchange.c
2019-03-05 12:28:31 +01:00
Zdenek Kabelac
e4bb94a93e cov: hide intentional ptr arithmetic report
Only a single region count is ever replaced with an on-stack uint64_t.

(cherry picked from commit a91ac41b93)
2019-03-05 12:17:18 +01:00
Zdenek Kabelac
93ac80037a cov: mark warning as expected one
(cherry picked from commit 9238b972c5)

Conflicts:
	base/data-struct/radix-tree-adaptive.c
	device_mapper/libdm-file.c
2019-03-05 12:16:33 +01:00
Zdenek Kabelac
c9e5e6800c cov: split check for type assignment
Check that type is always defined; if not, make it an explicit internal
error (although logged as debug, so it is caught only with the proper
lvm.conf setting).
This ensures a type that is NULL can't later be dereferenced, causing a
coredump.

(cherry picked from commit 79879bd201)
2019-03-05 12:16:07 +01:00
Zdenek Kabelac
f0f68791f3 cov: shutdown warning
Since the previous patch reverted a coverity patch because this case is
intentional, provide an override for this coverity warning.

(cherry picked from commit 05b5774827)
2019-03-05 12:15:40 +01:00
Zdenek Kabelac
b39c26ddc3 revert "cov: dm stats missed terminating null"
This reverts commit 20971f7034,
as the parsing of 'dmstatus' started to fail on the now-present \0 char.

(cherry picked from commit 6179cab877)
2019-03-05 12:15:16 +01:00
Zdenek Kabelac
d1ae1455b4 cov: ensure vars are set
Make sure tmp_begin and tmp_end are always set, even for blind
coverity.

(cherry picked from commit 2513661467)

Conflicts:
	device_mapper/libdm-report.c
2019-03-05 12:14:43 +01:00
Marian Csontos
c115d92287 cov: dmstats check for failing malloc
Add missing check for allocation success.

Backported from: 9b71212262
2019-03-05 12:13:08 +01:00
Zdenek Kabelac
ece117ee10 cov: dm stats missed terminating null
Coverity noticed allocating insufficient memory
for the terminating null of the string.

(cherry picked from commit 20971f7034)
2019-03-05 12:10:36 +01:00
David Teigland
590a1ebcf7 io: increase the default io memory from 4 to 8 MiB
This is the default bcache size that is created at the
start of the command.  It needs to be large enough to
hold a single copy of the metadata for a given VG, or the
VG cannot be read or written (since the entire metadata would
not fit into available memory).

Increasing the default reduces the chances of anyone
needing to raise it just to use their VG.

The size can be set in lvm.conf global/io_memory_size;
the lower limit is 4 MiB and the upper limit is 128 MiB.
2019-03-04 11:18:34 -06:00
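A minimal lvm.conf sketch of the setting described above (the value is in KiB, so 8192 corresponds to the new 8 MiB default):

  global {
      # Memory lvm allocates for bcache io, in KiB.
      # Lower limit is 4 MiB, upper limit is 128 MiB.
      io_memory_size = 8192
  }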
David Teigland
863a2e693e io: warn when metadata size approaches io memory size
When a single copy of metadata gets within 1MB of the
current io_memory_size value, begin printing a warning
that the io_memory_size should be increased.
2019-03-04 10:57:52 -06:00
David Teigland
8dbfdb5b73 config: add new setting io_memory_size
which defines the amount of memory that lvm will allocate
for bcache.  Increasing this setting is required if it is
smaller than a single copy of VG metadata.
2019-03-04 10:31:47 -06:00
Marcos Paulo de Souza
675b94a11b pvscan.service.in: Move StartLimitInterval to Service section
Without this patch, the pvscan service file contains StartLimitInterval in
the Unit section, which triggers an error:

 Unknown lvalue 'StartLimitInterval' in section 'Unit'

Moving StartLimitInterval to the Service section fixes the issue.

Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.de>
2019-02-28 09:38:34 -06:00
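A sketch of the resulting unit layout (the directive placement follows the fix above; the other entries are illustrative, not the verbatim file contents):

  [Unit]
  Description=LVM2 PV scan on device %i
  # StartLimitInterval no longer lives here.

  [Service]
  Type=oneshot
  StartLimitInterval=0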
David Teigland
850e95f24a tests: add scan_lvs.sh 2019-02-20 14:38:01 -06:00
David Teigland
083f162e8e WHATS_NEW: scan_lvs 2019-02-20 14:33:09 -06:00
David Teigland
7f56908c2b tests: set scan_lvs=1 in tests that stack PVs on LVs 2019-02-20 14:32:01 -06:00
David Teigland
427e8ba3e3 config: change scan_lvs default to 0
so that lvm does not scan LVs for PVs by default.
2019-02-20 14:31:03 -06:00
David Teigland
6a5575e959 filter: add config setting to skip scanning LVs
devices/scan_lvs (default 1) determines whether lvm
will scan LVs for layered PVs.  The lvm behavior has
always been to scan LVs, but it's rare for LVs to have
layered PVs, and much more common for there to be many
LVs that substantially slow down scanning with no benefit.

This is implemented in the usable filter, and has the
same effect as listing all LVs in the global_filter.
2019-02-20 14:30:24 -06:00
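For the rare setups that do stack PVs on LVs, the old behavior can be restored with a one-line lvm.conf change (a sketch based on the setting described above):

  devices {
      # Scan active LVs for layered PVs (the new default is 0).
      scan_lvs = 1
  }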
David Teigland
57cde6063f apply obtain_device_list_from_udev to all libudev usage
udev_dev_is_md_component and udev_dev_is_mpath_component
are not used for obtaining the device list, but they still
use libudev for device info.  When there are problems with
udev, these functions can get stuck. So, use the existing
obtain_device_list_from_udev config setting to also control
whether these "is component" functions are used, which gives
us a way to avoid using libudev entirely when it's causing
problems.
2019-02-05 10:20:24 -06:00
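So, on systems where udev misbehaves, a single existing setting now avoids libudev entirely (a minimal lvm.conf sketch):

  devices {
      # 0 also disables the libudev-based md/mpath
      # "is component" checks described above.
      obtain_device_list_from_udev = 0
  }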
David Teigland
d0cb672466 lvmlockd: fix make lockstart wait
when building without lvmlockd
2019-01-31 09:56:29 -06:00
David Teigland
75886f59e4 lvmlockd: make lockstart wait for existing start
If there are two independent scripts doing:
  vgchange --lockstart vg
  lvchange -ay vg/lv

The first vgchange to do the lockstart will wait for
the lockstart to complete before returning.
The second vgchange to do the lockstart will see that
the start is already in progress (from the first) and
will do nothing.  This means the second does not wait
for any lockstart to complete, and moves on to the
lvchange which may find the lockspace still starting
and fail.

To fix this, make the vgchange lockstart command
wait for any lockstarts in progress to complete.
2019-01-31 09:38:50 -06:00
Marian Csontos
1d2de5dd13 spec: Use python3 setuptools with python3 2019-01-03 14:39:28 +01:00
Ming-Hung Tsai
df0797db8c lvmanip: uninitialized members in struct pv_list (#10)
Scenario: given an existing LV `lvol0`, I want to create another LV
on the PVs used by `lvol0`.

I use `build_parallel_areas_from_lv()` to obtain the `pv_list` of each segment.
However, the returned `pv_list` is not properly initialized, which causes
a segfault in subsequent operations.

(cherry picked from commit 859feb81e5)
(cherry picked from commit 219ba4f54a)

Conflicts:
	WHATS_NEW
2018-12-19 09:18:25 +01:00
Marian Csontos
2d077286b9 post-release 2018-12-07 15:19:10 +01:00
Marian Csontos
f5ea02ffee pre-release 2018-12-07 15:12:34 +01:00
Marian Csontos
262a42025f build: make generate 2018-12-07 15:05:50 +01:00
Zdenek Kabelac
d5234e1b7e libdm: do not add params for resume and remove
DM_DEVICE_CREATE with a table performs several ioctl operations,
however only some of them take parameters.
Since _create_and_load_v4() reused the already existing dm task from
DM_DEVICE_RELOAD, it also kept passing its table parameters
to the DM_DEVICE_RESUME ioctl - but this ioctl is supposed to take
no arguments, so the passed data are not wiped - and
since the kernel returns a buffer and shortens dmi->data_size accordingly,
anything past the returned data size remained uncleared in the zfree()
function.

This is a problem if the user used dm_task_secure_data (i.e. cryptsetup),
as in this case the binary expects secured data to be erased from main
memory after use, but they may have been left in place.

This patch also closes a possible hole in the error path,
which reuses the same dm task structure for DM_DEVICE_REMOVE.
2018-12-06 17:46:51 +01:00
David Teigland
a188b1e513 pvscan lvmetad: use udev info to improve md component detection
When no md devs are started, pvscan will only scan the start of
an md component, and if it has a superblock at the end, it may not
be excluded.  udev may already have info identifying it as an
md component, so use that.
2018-12-03 11:05:35 -06:00
David Teigland
9764ee0b3f lvmetad: fix disabling in previous commit
it broke the case where a connection already exists.
2018-11-30 15:49:03 -06:00
David Teigland
322d4ed05e lvmetad: only disable if repair will do something
lvconvert --repair would disable lvmetad at the start of
the command.  This would leave lvmetad disabled even if the
command did nothing.  Move the step to disable lvmetad until
later, just before some actual repair is done.  There are
now numerous cases where nothing is actually done and lvmetad
is not disabled.
2018-11-30 14:54:19 -06:00
David Teigland
a01e1fec0f pvscan lvmetad: use full md filter when md 1.0 devices are present
Apply the same logic to pvscan/lvmetad that was added to
the non-lvmetad label_scan in commit 3fd75d1b:
  scan: use full md filter when md 1.0 devices are present

Before scanning, check if any of the devs on the system are
md 0.90/1.0, and if so make the scan read both the start and
the end of the device so that the components of those md
versions can be ignored.
2018-11-29 14:08:46 -06:00
Peter Rajnoha
0e42ebd6d4 scan: md metadata version 0.90 is at the end of disk
commit de28637
  scan: use full md filter when md 1.0 devices are present

missed the fact that md superblock version 0.90 also puts
metadata at the end of the device, so the full md filter
needs to be used when either 0.90 or 1.0 is present.
2018-11-29 12:16:37 -06:00
David Teigland
fe1cabfa34 WHATS_NEW: sync io 2018-11-20 09:04:37 -06:00
David Teigland
cb5405ded8 bcache: sync io fixes
fix lseek error check
fix read/write error checks
handle zero return from read and write
don't return an error for short io
fix partial read/write loop
2018-11-20 09:04:37 -06:00
David Teigland
f8ce9bf3bc io: use sync io if aio fails
io_setup() for aio may fail if a system has reached the
aio request limit.  In this case, fall back to using
sync io.  Also, lvm use of aio can be disabled entirely
with config setting global/use_aio=0.

The system limit for aio requests can be seen from
  /proc/sys/fs/aio-max-nr

The current usage of aio requests can be seen from
  /proc/sys/fs/aio-nr

The system limit for aio requests can be increased by
setting fs.aio-max-nr using sysctl.

Also add last-byte limit to the sync io code.
2018-11-20 09:00:26 -06:00
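For example (paths as given in the message above; the limit value is purely illustrative):

  # Inspect current aio usage and the system limit:
  cat /proc/sys/fs/aio-nr
  cat /proc/sys/fs/aio-max-nr

  # Raise the limit if lvm's io_setup() keeps failing:
  sysctl -w fs.aio-max-nr=1048576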
David Teigland
9fda169077 update WHATS_NEW 2018-11-06 16:41:43 -06:00
David Teigland
9799c8da07 devices: reuse bcache fd when getting block size
This avoids an unnecessary open() on the device.
2018-11-06 16:41:04 -06:00
Bryn M. Reeves
613466aa8f dmsetup: fix stats report command output
Since the stats handle is neither bound nor listed before the
attempt to call dm_stats_get_nr_regions(), it will always return
zero: this prevents reporting of any dmstats regions on any
device.

Remove the dm_stats_get_nr_regions() check and instead rely on
the correct return status from dm_stats_populate() which only
returns 0 in the case that there are regions to inspect (and
which logs a specific error for all other cases).

Reported-by: Bryan Gurney <bgurney@redhat.com>
2018-11-01 16:49:55 +00:00
Bryn M. Reeves
813a83b2d6 libdm-stats: move no regions warning after dm_stats_list()
It doesn't make sense to test or warn about the region count until
the stats handle has been listed: at this point it may or may not
contain valid information (but is guaranteed to be correct after
the list).
2018-11-01 16:47:56 +00:00
Marian Csontos
fa8d5e4e81 post-release 2018-10-30 10:01:01 +01:00
Marian Csontos
b93aded021 pre-release 2018-10-30 10:00:58 +01:00
Marian Csontos
efa281685a Update WHATS_NEW 2018-10-30 10:00:24 +01:00
David Teigland
ab27d5dc2a metadata: prevent writing beyond metadata area
lvm uses a bcache block size of 128K.  A bcache block
at the end of the metadata area will overlap the PEs
from which LVs are allocated.  How much depends on
alignments.  When lvm reads and writes one of these
bcache blocks to update VG metadata, it can also be
reading and writing PEs that belong to an LV.

If these overlapping PEs are being written to by the
LV user (e.g. filesystem) at the same time that lvm
is modifying VG metadata in the overlapping bcache
block, then the user's updates to the PEs can be lost.

This patch is a quick hack to prevent lvm from writing
past the end of the metadata area.
2018-10-29 16:46:03 -05:00
Marian Csontos
bd872064a2 spec: Fix python and applib interactions
When python3 is not present, the macro expands to --disable-applib.
2018-10-29 16:48:42 +01:00
David Teigland
d1b652143a tests: add new test for lvm on md devices 2018-10-18 12:36:11 -05:00
David Teigland
e7bb508809 scan: enable full md filter when md 1.0 devices are present
The previous commit de2863739f
    scan: use full md filter when md 1.0 devices are present

needs the use_full_md_check flag in the md filter, but
the cmd struct is not available when the filter is run,
so that commit wasn't working.  Fix this by setting the
flag in a global variable.

(This was fixed in the master branch with commit 8eab37593
in which the cmd struct was passed to the filters, but it
was an intrusive change, so this commit is using the less
intrusive global variable.)
2018-10-18 12:35:57 -05:00
David Teigland
de2863739f scan: use full md filter when md 1.0 devices are present
The md filter can operate in two native modes:
- normal: reads only the start of each device
- full: reads both the start and end of each device

md 1.0 devices place the superblock at the end of the device,
so components of this version will only be identified and
excluded when lvm uses the full md filter.

Previously, the full md filter was only used in commands
that could write to the device.  Now, the full md filter
is also applied when there is an md 1.0 device present
on the system.  This means the 'pvs' command can avoid
displaying md 1.0 components (at the cost of doubling
the i/o to every device on the system.)

(The md filter can operate in a third mode, using udev,
but this is disabled by default because there have been
problems with reliability of the info returned from udev.)
2018-10-17 13:49:40 -05:00
Heinz Mauelshagen
c26bde42af lvconvert: fix interim segtype regression on raid6 conversions
When converting from striped/raid0/raid0_meta
to raid6 with > 2 stripes, allow possible
direct conversion (to raid6_n_6).

In case of 2 stripes, first convert to raid5_n to restripe
to at least 3 data stripes (the raid6 minimum in lvm2) in
a second conversion before finally converting to raid6_n_6.

As before, raid6_n_6 then can be converted
to any other raid6 layout.

Enhance lvconvert-raid-takeover.sh to test the
2-stripe conversions to raid6.

Resolves: rhbz1624038
(cherry picked from commit e2e30a64ab)

Conflicts:
	WHATS_NEW
2018-09-10 11:21:02 +02:00
Heinz Mauelshagen
0e03c68619 lvconvert: avoid superfluous interim raid type
When converting striped/raid0*/raid6_n_6 <-> raid4,
avoid superfluous interim raid5_n layout.

Related: rhbz1447809
(cherry picked from commit 22a1304368)
2018-09-05 16:41:14 +02:00
Peter Rajnoha
3374a59250 scripts: add After=rbdmap.service to {lvm2-activation-net,blk-availability}.service
We need Ceph RBD devices mapped before use in a stack
where LVM is on top, so make sure rbdmap.service is started before
the generated lvm2-activation-net.service.

On shutdown, we need to stop blk-availability before we stop
rbdmap.service.

Resolves: rhbz1623479
(cherry picked from commit cb17ef221b)

Conflicts:
	WHATS_NEW
2018-09-05 14:41:55 +02:00
Zdenek Kabelac
6afb911252 tests: check activation of many thin-pools
Artificial testing of monitoring of many thin-pools with a low number
of resources in use (only a few pools are needed to actually hit the race).
2018-09-05 14:40:01 +02:00
Zdenek Kabelac
a8d59404f7 dmeventd: lvm2 plugin uses envvar registry
The thin plugin started to use a configurable setting to allow configuring
the usage of external scripts - however, to read this value it needed to
execute an internal command, as dmeventd itself has no access to lvm.conf
and the API for dmeventd plugins has been kept stable.

The call of the command itself was not normally 'a big issue' until users
started to use a higher number of monitored LVs, and execution of the
command got stuck because another monitored resource had already started
to execute some other lvm2 command and became blocked waiting on the VG lock.

This scenario revealed the necessity of somehow avoiding lvm2 command
calls during resource registration - but that requires bigger changes - so
meanwhile this patch tries to minimize the chance of hitting this race
by obtaining any configurable setting just once - such a patch is small
and covers the majority of the problem - yet a better solution needs to be
introduced, likely with a bigger rework of dmeventd.

TODO: Avoid blocking registration of resources on execution of lvm2
commands, since those can get stuck waiting on mutexes.
2018-09-05 14:39:14 +02:00
Marian Csontos
a1a89a453f Update WHATS_NEW 2018-08-28 15:31:55 +02:00
David Teigland
ed749cdb5b WHATS_NEW: recent fixes 2018-08-27 14:41:29 -05:00
David Teigland
5502f72e41 lvmetad: fix pvs for many devices
When using lvmetad, 'pvs' still evaluates full filters
on all devices (lvmetad only provides info about PVs,
but pvs needs to report info about all devices, at
least sometimes.)

Because some filters read the devices, pvs still reads
every device, even with lvmetad (i.e. lvmetad is no help
for the pvs command.)  Because the device reads are not
being managed by the standard label scan layer, but only
happen incidentally through the filters, there is nothing
to control and limit the bcache content and the open file
descriptors for the devices.  When there are a lot of devs
on the system, the number of open fds exceeds the limit
and all opens begin failing.

The proper solution for this would be for pvs to really
use lvmetad and not scan devs, or for pvs to do a proper
label scan even when lvmetad is enabled.  To avoid any
major changes to the way this has worked, just work around
this problem by dropping bcache and closing the fd after
pvs evaluates the filter on each device.
2018-08-27 14:39:49 -05:00
David Teigland
c527a0cbfc lvmetad: improve scan for pvscan all
For 'pvscan --cache' avoid using dev_iter in the loop
after the label_scan by passing the necessary devs back
from the label_scan for the continued pvscan.
The dev_iter functions reapply the filters which will
trigger more io when we don't need or want it.  With
many devs, incidental opens from the filters (not controlled
by the label scan) can lead to too many open files.
2018-08-27 14:39:49 -05:00
Marian Csontos
63d4983890 spec: Disable python bindings on newer versions 2018-08-27 16:17:11 +02:00
David Teigland
a991664dec bcache: reduce MAX_IO to 256
This is the number of concurrent async io requests that
the scan layer will submit to the bcache layer.  There
will be an open fd for each of these, so it is best to
keep this well below the default limit for max open files
(1024), otherwise lvm may get EMFILE from open(2) when
there are around 1024 devices to scan on the system.
2018-08-24 14:50:53 -05:00
Heinz Mauelshagen
ab1aa0a4fb test: add striped -> raid0 test script
(cherry picked from commit 3c966e637f)
2018-08-23 11:29:24 +02:00
Heinz Mauelshagen
d910f75d89 lvconvert: fix conversion attempts to linear
"lvconvert --type linear RaidLV" on striped and raid4/5/6/10
have to provide the convenient interim layouts.  Fix involves
a cleanup to the convenience type function.

As a result of testing, add missing sync waits to
lvconvert-raid-reshape-linear_to_raid6-single-type.sh.

Resolves: rhbz1447809
(cherry picked from commit e83c4f07ca)

Conflicts:
	WHATS_NEW
2018-08-23 11:29:16 +02:00
Marian Csontos
94362423c4 spec: Add vdo plugin for dmeventd 2018-08-23 11:27:17 +02:00
Heinz Mauelshagen
acf40f5587 lvconvert: fix regression preventing direct striped conversion
Conversion to striped from raid0/raid0_meta is directly possible.

Fix a regression setting superfluous interim raid5_n conversion type
introduced by commit bd7cdd0b09.

Add new test script lvconvert-raid0-striped.sh.

Resolves: rhbz1608067
(cherry picked from commit 4578411633)

Conflicts:
	WHATS_NEW
2018-08-21 18:13:51 +02:00
Zdenek Kabelac
227a0d7336 tests: check policy mq can be used with format2 2018-08-07 18:05:35 +02:00
Zdenek Kabelac
a41968c4b4 tests: splitmirror for mirror type 2018-08-07 18:04:41 +02:00
Zdenek Kabelac
672b8c196b mirror: fix splitmirrors for mirror type
With improved mirror activation code --splitmirror issue poppedup
since there was missing proper preload code and deactivation
for splitted mirror leg.
2018-08-07 18:04:39 +02:00
Zdenek Kabelac
cc96eea029 cache: drop metadata_format validation
Allow any combination of cache metadata format and policy.
2018-08-07 18:04:14 +02:00
David Teigland
5f648406b0 mirrors: fix read_only_volume_list
If a mirror LV is listed in read_only_volume_list, it would
still be activated rw.  The activation would initially be
readonly, but the monitoring function would immediately
change it to rw.  This was a regression from commit

fade45b1d1 mirror: improve table update

The monitoring function needs to copy the read_only setting
into the new set of mirror activation options it uses.
2018-08-02 11:39:08 -05:00
Marian Csontos
3ebc745f53 Merge branch '2018-06-01-stable' of git://sourceware.org/git/lvm2 into 2018-06-01-stable
* '2018-06-01-stable' of git://sourceware.org/git/lvm2:
  vgcreate: close exclusive fd after pvcreate
2018-08-02 08:08:51 +02:00
Marian Csontos
acd2c6f256 post-release 2018-08-02 08:08:34 +02:00
Marian Csontos
b10b462fde pre-release 2018-08-01 17:30:40 +02:00
David Teigland
a75eb8d74c vgcreate: close exclusive fd after pvcreate
When vgcreate does an automatic pvcreate, it opens the
dev with O_EXCL to ensure no other subsystem is using
the device.  This exclusive fd remained in bcache and
prevented activation parts of lvm from using the dev.

This appeared with vgcreate of a sanlock VG because of
the unique combination where the dev is not yet a PV,
so pvcreate is needed, and the vgcreate also creates
and activates an internal LV for sanlock.

Fix this by closing the exclusive fd after it's used
by pvcreate so that it won't interfere with other
bits of lvm that may try to use the device.
2018-08-01 10:26:28 -05:00
Marian Csontos
0569add94c pre-release 2018-08-01 16:47:09 +02:00
Marian Csontos
12dfd0ed02 build: make generate 2018-07-31 17:41:35 +02:00
Marian Csontos
ad10d42671 WHATS_NEW 2018-07-31 17:41:31 +02:00
Zdenek Kabelac
f7645995da dmeventd: rebase to stable branch
A minimal set of changes to make the vdo plugin compilable in the stable
branch:

Use older headers.
Implement a simple vdo status parser that only resolves the use-percentage.
2018-07-31 14:55:03 +02:00
Zdenek Kabelac
4ed9b07380 dmeventd: base vdo plugin
Introduce a VDO plugin for monitoring VDO devices.

This plugin can also be used by other users, as the plugin checks
for the UUID prefix 'LVM-' and runs lvm actions only on those
devices.

Non-LVM devices are only monitored, with warnings logged
when the usage threshold reaches 80%.
2018-07-31 14:53:27 +02:00
Marian Csontos
0174ba692c Add BSD 2-Clause License
This is required by the C++ test harness.
2018-07-27 17:09:03 +02:00
Marian Csontos
48594d007a test: Check flavour is used and exists
(cherry picked from commit 9cd05d1f1e)
2018-07-26 15:04:16 +02:00
Heinz Mauelshagen
50a603de6f lvconvert: reject conversions on raid1 split trackchanges LVs
Prohibit this, because the tracking can't continue and
further conversions may fail with bogus error messages.

Resolves: rhbz1579072
(cherry picked from commit a004bb07f1)

Conflicts:
	WHATS_NEW
2018-07-26 14:02:20 +02:00
Heinz Mauelshagen
e4fe0d1b8f lvconvert: reject conversions on raid1 split trackchanges SubLVs
Prohibit conversions of raid1 split trackchanges SubLVs
because they will fail to get merged back into the RaidLV.

Resolves: rhbz1579438
(cherry picked from commit 8b0729af0f)

Conflicts:
	WHATS_NEW
2018-07-26 14:01:37 +02:00
Bryn M. Reeves
951676a59e dmsetup: fix error propagation in _display_info_cols()
Commit 3f35146 added a check on the value returned by the
_display_info_cols() function:

  1024         if (!_switches[COLS_ARG])
  1025                 _display_info_long(dmt, &info);
  1026         else
  1027                 r = _display_info_cols(dmt, &info);
  1028
  1029         return r;

This exposes a bug in the dmstats code in _display_info_cols:
the fact that a device has no regions is explicitly not an error
(and is documented as such in the code), but since the return
code is not changed before leaving the function it is now treated
as an error leading to:

  # dmstats list
  Command failed.

when no regions exist.

Set the return code to the correct value before returning.

(cherry picked from commit 29b9ccd261)
2018-07-25 10:55:28 +02:00
Heinz Mauelshagen
4456d9aa77 lvconvert: reject conversions of LVs under snapshot
Conversions of LVs under snapshot to thinpool or cachepool
correctly fail but leave them inactive and provide cryptic
error messages like 'Internal error: #LVs (10) != #visible
LVs (2) + #snapshots (1) + #internal LVs (5) in VG VG'.

Reject and provide better error message.

Resolves: rhbz1514146
(cherry picked from commit 2214dc12c3)
2018-07-25 10:52:58 +02:00
David Teigland
b394a9f63f lvconvert: improve text about splitmirrors
in messages and man page.
2018-07-23 12:31:28 -05:00
David Teigland
9e296c9c6f lvconvert: restrict command matching for no option variant
The 'lvconvert LV' command def has caused multiple problems
for command matching because it matches the required options
of any lvconvert command.  Any lvconvert with incorrect options
ends up matching 'lvconvert LV', which then produces an error
about incorrect options being used for 'lvconvert LV'.  This
prevents suggestions from nearest-command partial command matches.

Add a special case for 'lvconvert LV' so that it won't be used
as a partial match for a command that has options specified.
2018-07-23 12:31:23 -05:00
Marian Csontos
5b87f5fb72 post-release 2018-07-19 18:43:10 +02:00
Marian Csontos
bb384f8488 pre-release 2018-07-19 18:35:42 +02:00
Marian Csontos
82feb5f111 WHATS_NEW 2018-07-19 18:33:59 +02:00
Zdenek Kabelac
66990bc7c8 allocation: add check for passing log allocation
Updates previous commit.
2018-07-09 00:58:30 +02:00
Zdenek Kabelac
6fcb2ba440 WHATS_NEW: update 2018-07-09 00:36:11 +02:00
Zdenek Kabelac
b8a7f6ba3d dev_io: no discard in testmode
When an lvm2 command is executed in test mode, the discard ioctl is skipped.
It could otherwise even cause data loss, in case issuing discards for
released areas was enabled and the user 'tested' lvreduce.
2018-07-09 00:35:34 +02:00
Zdenek Kabelac
0851ee5301 allocator: fix thin-pool allocation
When allocating a thin-pool with more than one device, try to
allocate the 'metadataLV' by reusing the log-type allocation used for
mirror LVs.  It should then naturally be placed on a different device
than the 'dataLV'.

However, due to the somewhat hard to follow allocation logic, the
allocation was rejected in cases where there was not enough space for
data or metadata on a single PV, so that to succeed, usage of separate
segments was mandatory.

While the user may use:

allocation/thin_pool_metadata_require_separate_pvs=1

to enforce separate metadata and data LVs - with default settings this
is not enabled, so segment allocation is meant to work.

NOTE:

As already said, the original intention of this whole 'if()' is unclear,
so try to split this test into multiple simpler tests that are more readable.

TODO: more validation.
2018-07-09 00:35:34 +02:00
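For reference, a minimal lvm.conf sketch of the setting mentioned above:

  allocation {
      # Force thin-pool metadata and data LVs onto separate PVs.
      thin_pool_metadata_require_separate_pvs = 1
  }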
Zdenek Kabelac
df8eef7096 memlock: extend exception list
The number of linked libraries grows.
Most of them we don't need to lock in memory, since we are not using
them in the locked section, so skip locking them.
2018-07-04 13:41:08 +02:00
Zdenek Kabelac
c1dbb22ba4 tests: update with --yes
vgcfgrestore needs to confirm the restore while LVs from the VG are present.
2018-07-04 13:41:00 +02:00
Zdenek Kabelac
99cddd67a9 vgcfgrestore: add prompt with active volumes
Add a check for active devices with names matching the restored VG.
When such devices are present in the dm table, prompt the user
whether they wish to continue.
2018-07-04 13:40:50 +02:00
David Teigland
814dd84e07 Revert "man: fix lvreduce example"
-l -3 is correct, meaning reduce by 3.

This reverts commit d5bcc56eef.
2018-06-27 09:19:01 -05:00
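A short worked example of the distinction (hypothetical VG/LV names):

  # Reduce vg/lv BY 3 logical extents (relative; note the sign):
  lvreduce -l -3 vg/lv

  # Without the sign, the size is set TO 3 extents:
  lvreduce -l 3 vg/lv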
David Teigland
d5bcc56eef man: fix lvreduce example 2018-06-27 08:58:22 -05:00
David Teigland
f7ffba204e devs: use bcache fd for read ahead ioctl
to avoid an unnecessary open of the device in
most cases.
2018-06-26 12:15:43 -05:00
David Teigland
90e419c645 scan: reopen RDWR during rescan
Commit a30e622279:
  "scan: work around udev problems by avoiding open RDWR"

had us reopen a device RDWR in the write function.  Since
we know earlier that the command intends to write to devices
in the VG, we can reopen the VG's devices RDWR during the
rescan instead of waiting until the writes to happen.
2018-06-26 12:15:43 -05:00
David Teigland
49147cbaa7 bcache.c add missing { 2018-06-26 12:15:43 -05:00
Marian Csontos
69907e0780 bcache: Fix null pointer dereferencing
(cherry picked from commit a14f21bf1d)

Conflicts:
	lib/device/bcache.c
2018-06-26 17:09:58 +02:00
Heinz Mauelshagen
b90d4b38e5 WHATS_NEW
(cherry picked from commit 11384637fb)

Conflicts:
	WHATS_NEW
2018-06-26 12:18:39 +02:00
Heinz Mauelshagen
befdfc245b test: add convenience conversion tests linear <-> striped
Add tests for linear <-> striped|raid* conversions.

Add region_size config to reshape tests to avoid test
failures in case of it being defined unexpectedly in lvm.conf.

Related: rhbz1439925
Related: rhbz1447809
(cherry picked from commit 3810fd8d0d)
2018-06-26 12:15:56 +02:00
Heinz Mauelshagen
0d78e4c1e9 lvconvert: support linear <-> striped convenience conversions
"lvconvert --type {linear|striped|raid*} ..." on a striped/linear
LV provides convenience interim type to convert to the requested
final layout similar to the given raid* <-> raid* conveninece types.

Whilst on it, add missing raid5_n convenince type from raid5* to raid10.

Resolves: rhbz1439925
Resolves: rhbz1447809
Resolves: rhbz1573255
(cherry picked from commit bd7cdd0b09)
2018-06-26 12:15:50 +02:00
Heinz Mauelshagen
763c65314e segtype: add linear
Add a linear segtype, addressing a FIXME, in preparation
for linear <-> striped convenience conversion support.

(cherry picked from commit de66704253)
2018-06-26 12:15:44 +02:00
Marian Csontos
24aee732a5 filter: make pointers distinguishable
This amends commit 4afb5971b9 with suggestions from Nir Soffer
to improve debugging.
2018-06-22 15:13:24 +02:00
Zdenek Kabelac
ba6ed5c90c snapshot: improve checking of merging snapshot
Add runtime detection for 'lvs -o+seg_monitor' and 'vgchange --monitor'.
This fix should avoid an unnecessary timeout on systemd shutdown.
2018-06-22 15:05:22 +02:00
Zdenek Kabelac
e0c94d883a fsadm: missing -l description 2018-06-22 15:00:52 +02:00
Zdenek Kabelac
39e3b5d8ac cache: cleaner policy also uses fmt2
Format 2 can also be used with the cleaner policy.
2018-06-22 15:00:10 +02:00
Zdenek Kabelac
39fc98d731 pvmove: improve lvs
When pvmoving an LV, the target LV is a mirror, so the validation
that checked for a matching type was incorrect.

While we need a more generic enhancement of the lvs output for pvmoved LVs,
for now at least stop showing internal errors and 'X' symbols in attrs.
2018-06-22 12:37:59 +02:00
Zdenek Kabelac
5503699c37 pvresize: add missing return
A log error path missed 'return 0'.
Also fix some unneeded backtraces (since log_error already shows the
position).
2018-06-22 12:37:09 +02:00
Zdenek Kabelac
e0bfc946cb pvresize: update message
There is always at least a PV header update, even if the size
of the PV remains the same (so it's not really resized).
Try to make the message slightly less confusing.
2018-06-22 12:34:24 +02:00
Zdenek Kabelac
9546edeef9 systemd: add conflicting sockets
Since we are using "DefaultDependencies=no", we do not get an automatic STOP
job on socket connection - so automatically refuse connections on
shutdown by adding this Conflicts definition to the socket unit.
2018-06-22 12:32:31 +02:00
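A sketch of the socket unit change (assuming the usual Conflicts=shutdown.target form; not the verbatim file contents):

  [Unit]
  DefaultDependencies=no
  # Refuse new connections once shutdown begins:
  Conflicts=shutdown.target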
Zdenek Kabelac
716199334c pvscan: code reshape 2018-06-22 12:31:32 +02:00
Zdenek Kabelac
4479228d32 vgchange: fix error code in error path
This rather hard-to-hit error path used the wrong return value to signal
a real error.
2018-06-22 12:29:42 +02:00
David Teigland
4afb5971b9 filter: use pointers to real addresses
instead of casting the values 1 and 2 to pointers,
which gcc optimization can have problems with.
2018-06-21 10:52:35 -05:00
David Teigland
dd075e93c1 coverity warnings about null info in lvmcache.c 2018-06-21 09:22:05 -05:00
David Teigland
d4fd39f64c lvmlockd: fix another missing lock_type null check
Same as 347c807f8.
2018-06-21 09:00:23 -05:00
Marian Csontos
acb784e2a8 bcache: fix memory leaks 2018-06-21 10:22:35 +02:00
Marian Csontos
8a0af1bec8 libdm: fix buffer overflow 2018-06-21 10:22:24 +02:00
David Teigland
8bd9a89c14 WHATS_NEW: recent changes 2018-06-20 14:32:29 -05:00
David Teigland
a30e622279 scan: work around udev problems by avoiding open RDWR
udev creates a train wreck of events if we open devices
with RDWR.  Until we can fix/disable/scrap udev, work around
this by opening RDONLY and then closing/reopening RDWR when
a write is needed.  This invalidates the bcache blocks for
the device before writing so it can trigger unnecessary
rereading.
2018-06-20 12:05:04 -05:00
David Teigland
76075ff55d clvmd: fix leak of saved_vg struct
Commit c016b573ee "clvmd: separate saved_vg from vginfo"
created a separate hash table for the saved_vg structs.
The vg's referenced by the saved_vg struct were all being
freed properly, but the svg wrapper struct itself was not
being freed.
2018-06-18 14:14:38 -05:00
David Teigland
bfb904af1c bcache: remove extraneous error message
an error from io_submit is already recognized by
the caller, like errors during completion.
2018-06-18 11:59:57 -05:00
Marian Csontos
d88376ca78 post-release 2018-06-18 07:30:09 +02:00
Marian Csontos
6283f5ea3f pre-release 2018-06-18 07:21:51 +02:00
David Teigland
43ce357ebc man: update lvmsystemid wording
to refer to "shared VG" instead of "lockd VG".
2018-06-14 13:34:35 -05:00
David Teigland
d136790bab man: updates to lvmlockd
The terminology has migrated toward using "shared VG"
rather than "lockd VG".

Also improve the wording in a number of places.
2018-06-14 12:53:50 -05:00
David Teigland
214de62b5d lvmlockd: update method for changing clustered VG
The previous method for forcibly changing a clustered VG to
a local VG involved using -cn and --config locking_type=0.
Add an alternative that is consistent with other forced
lock type changes:
vgchange --locktype none --lockopt force.
2018-06-13 15:58:57 -05:00
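Side by side (hypothetical VG name; option forms as described above):

  # Previous method:
  vgchange -cn vg --config 'global {locking_type = 0}'

  # New, consistent method:
  vgchange --locktype none --lockopt force vg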
David Teigland
e9c0a64fb5 WHATS_NEW for recent changes 2018-06-13 15:42:15 -05:00
Marian Csontos
7ac8e21f3c Merge branch 'dev-mcsontos-release' into stable 2018-06-13 19:13:52 +02:00
Marian Csontos
fdb362b998 post-release 2018-06-13 19:09:07 +02:00
Marian Csontos
06accf1395 pre-release 2018-06-13 14:13:35 +02:00
David Teigland
d3dcca639c lvmlockd: skip repair lock upgrade for non shared vgs
Only attempt lvmlockd lock upgrade for shared VGs.
2018-06-12 09:25:51 -05:00
David Teigland
98eb9e5754 man lvmlockd: remove unnecessary reference to lvmetad
it's optional to use it with lvmlockd
2018-06-07 13:42:11 -05:00
David Teigland
347c807f86 lvmlockd: fix missing lock_type null check
Missed checking if vg->lock_type is NULL in commit db8d3bdfa:
  lvmlockd: enable mirror split and merge with dlm lock_type
2018-06-06 13:56:02 -05:00
David Teigland
1e5f6887b1 devices: clean up io error messages
Remove the io error message from bcache.c since it is not
very useful without the device path.

Make the io error messages from dev_read_bytes/dev_write_bytes
more user friendly.
2018-06-06 10:05:08 -05:00
121 changed files with 3356 additions and 966 deletions

COPYING.BSD (new file)

@@ -0,0 +1,25 @@
BSD 2-Clause License
Copyright (c) 2014, Red Hat, Inc.
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

VERSION

@@ -1 +1 @@
2.02.178(2)-git (2018-05-24)
2.02.184(2) (2019-03-22)

VERSION_DM

@@ -1 +1 @@
1.02.147-git (2018-05-24)
1.02.156 (2019-03-22)

WHATS_NEW

@@ -1,7 +1,73 @@
Version 2.02.178 -
Version 2.02.184 - 22nd March 2019
==================================
Fix (de)activation of RaidLVs with visible SubLVs
Change scan_lvs default to 0 so LVs are not scanned for PVs.
Add scan_lvs config setting to control if lvm scans LVs for PVs.
Fix missing proper initialization of pv_list struct when adding pv.
Version 2.02.183 - 07th December 2018
=====================================
Avoid disabling lvmetad when repair does nothing.
Fix component detection for md version 0.90.
Use sync io if async io_setup fails, or use_aio=0 is set in config.
Avoid opening devices to get block size by using existing open fd.
Version 2.02.182 - 30th October 2018
====================================
Fix possible write race between last metadata block and the first extent.
Fix filtering of md 1.0 devices so they are not seen as duplicate PVs.
Fix lvconvert striped/raid0/raid0_meta -> raid6 regression.
Add After=rbdmap.service to {lvm2-activation-net,blk-availability}.service.
Fix pvs with lvmetad to avoid too many open files from filter reads.
Fix pvscan --cache to avoid too many open files from filter reads.
Reduce max concurrent aios to avoid EMFILE with many devices.
Fix lvconvert conversion attempts to linear.
Fix lvconvert raid0/raid0_meta -> striped regression.
Fix lvconvert --splitmirror for mirror type (2.02.178).
Do not pair cache policy and cache metadata format.
Fix mirrors honoring read_only_volume_list.
Version 2.02.181 - 01 August 2018
=================================
Reject conversions on raid1 LVs with split tracked SubLVs.
Reject conversions on raid1 split tracked SubLVs.
Fix dmstats list failing when no regions exist.
Reject conversions of LVs under snapshot.
Limit suggested options on incorrect option for lvconvert subcommand.
Version 2.02.180 - 19th July 2018
=================================
Never send any discard ioctl with test mode.
Fix thin-pool alloc which needs same PV for data and metadata.
Extend list of non-memlocked areas with newly linked libs.
Enhance vgcfgrestore to check for active LVs in restored VG.
lvconvert: provide possible layouts between linear and striped/raid
Fix unmonitoring of merging snapshots.
Add missing -l description in fsadm man page.
Cache can uses metadata format 2 with cleaner policy.
Avoid showing internal error in lvs output or pvmoved LVs.
Fix check if resized PV can also fit metadata area.
Reopen devices RDWR only before writing to avoid udev issues.
Change pvresize output confusing when no resize took place.
Fix lvmetad hanging on shutdown.
Fix mem leak in clvmd and more coverity issues.
Version 2.02.179 - 18th June 2018
=================================
Allow forced vgchange to lock type none on clustered VG.
Add the report field "shared".
Enable automatic metadata consistency repair on a shared VG.
Fix pvremove force on a PV with a shared VG.
Fixed vgimportclone of a PV with a shared VG.
Enable previously disallowed thin/cache commands in shared VGs.
Enable metadata-related changes on LVs active with shared lock.
Do not continue trying to use a device that cannot be opened.
Fix problems opening a device that fails and returns.
Use versionsort to fix archive file expiry beyond 100000 files.
Version 2.02.178 - 13th June 2018
=================================
Version 2.02.178-rc1 - 24th May 2018
====================================
Add libaio dependency for build.

WHATS_NEW_DM

@@ -1,5 +1,30 @@
Version 1.02.147 -
Version 1.02.156 - 22nd March 2019
==================================
Ensure migration_threshold for cache is at least 8 chunks.
Enhance ioctl flattening and add parameters only when needed.
Add DM_DEVICE_ARM_POLL for API completness matching kernel.
Version 1.02.154 - 07th December 2018
=====================================
Do not add parameters for RESUME with DM_DEVICE_CREATE dm task.
Fix dmstats report printing no output.
Version 1.02.152 - 30th October 2018
====================================
Add hot fix to avoiding locking collision when monitoring thin-pools.
Version 1.02.150 - 01 August 2018
=================================
Add vdo plugin for monitoring VDO devices.
Version 1.02.149 - 19th July 2018
=================================
Version 1.02.148 - 18th June 2018
=================================
Version 1.02.147 - 13th June 2018
=================================
Version 1.02.147-rc1 - 24th May 2018
====================================

conf/example.conf.in

@@ -185,6 +185,20 @@ devices {
# present on the system. sysfs must be part of the kernel and mounted.)
sysfs_scan = 1
# Configuration option devices/scan_lvs.
# Scan LVM LVs for layered PVs, allowing LVs to be used as PVs.
# When 1, LVM will detect PVs layered on LVs, and caution must be
# taken to avoid a host accessing a layered VG that may not belong
# to it, e.g. from a guest image. This generally requires excluding
# the LVs with device filters. Also, when this setting is enabled,
# every LVM command will scan every active LV on the system (unless
# filtered), which can cause performance problems on systems with
# many active LVs. When this setting is 0, LVM will not detect or
# use PVs that exist on LVs, and will not allow a PV to be created on
# an LV. The LVs are ignored using a built in device filter that
# identifies and excludes LVs.
scan_lvs = 0
# Configuration option devices/multipath_component_detection.
# Ignore devices that are components of DM multipath devices.
multipath_component_detection = 1
@@ -891,6 +905,11 @@ global {
# This configuration option has an automatic default value.
# lvdisplay_shows_full_device_path = 0
# Configuration option global/use_aio.
# Use async I/O when reading and writing devices.
# This configuration option has an automatic default value.
# use_aio = 1
# Configuration option global/use_lvmetad.
# Use lvmetad to cache metadata and reduce disk scanning.
# When enabled (and running), lvmetad provides LVM commands with VG
@@ -1108,6 +1127,16 @@ global {
# When enabled, an LVM command that changes PVs, changes VG metadata,
# or changes the activation state of an LV will send a notification.
notify_dbus = 1
# Configuration option global/io_memory_size.
# The amount of memory in KiB that LVM allocates to perform disk io.
# LVM performance may benefit from more io memory when there are many
# disks or VG metadata is large. Increasing this size may be necessary
# when a single copy of VG metadata is larger than the current setting.
# This value should usually not be decreased from the default; setting
# it too low can result in lvm failing to read VGs.
# This configuration option has an automatic default value.
# io_memory_size = 8192
}
# Configuration section activation.

configure (vendored)

@@ -15559,7 +15559,7 @@ _ACEOF
################################################################################
ac_config_files="$ac_config_files Makefile make.tmpl daemons/Makefile daemons/clvmd/Makefile daemons/cmirrord/Makefile daemons/dmeventd/Makefile daemons/dmeventd/libdevmapper-event.pc daemons/dmeventd/plugins/Makefile daemons/dmeventd/plugins/lvm2/Makefile daemons/dmeventd/plugins/raid/Makefile daemons/dmeventd/plugins/mirror/Makefile daemons/dmeventd/plugins/snapshot/Makefile daemons/dmeventd/plugins/thin/Makefile daemons/dmfilemapd/Makefile daemons/lvmdbusd/Makefile daemons/lvmdbusd/lvmdbusd daemons/lvmdbusd/lvmdb.py daemons/lvmdbusd/lvm_shell_proxy.py daemons/lvmdbusd/path.py daemons/lvmetad/Makefile daemons/lvmpolld/Makefile daemons/lvmlockd/Makefile device_mapper/Makefile conf/Makefile conf/example.conf conf/lvmlocal.conf conf/command_profile_template.profile conf/metadata_profile_template.profile include/.symlinks include/Makefile lib/Makefile lib/locking/Makefile include/lvm-version.h libdaemon/Makefile libdaemon/client/Makefile libdaemon/server/Makefile libdm/Makefile libdm/libdevmapper.pc liblvm/Makefile liblvm/liblvm2app.pc man/Makefile po/Makefile python/Makefile python/setup.py scripts/blkdeactivate.sh scripts/blk_availability_init_red_hat scripts/blk_availability_systemd_red_hat.service scripts/clvmd_init_red_hat scripts/cmirrord_init_red_hat scripts/com.redhat.lvmdbus1.service scripts/dm_event_systemd_red_hat.service scripts/dm_event_systemd_red_hat.socket scripts/lvm2_cluster_activation_red_hat.sh scripts/lvm2_cluster_activation_systemd_red_hat.service scripts/lvm2_clvmd_systemd_red_hat.service scripts/lvm2_cmirrord_systemd_red_hat.service scripts/lvm2_lvmdbusd_systemd_red_hat.service scripts/lvm2_lvmetad_init_red_hat scripts/lvm2_lvmetad_systemd_red_hat.service scripts/lvm2_lvmetad_systemd_red_hat.socket scripts/lvm2_lvmpolld_init_red_hat scripts/lvm2_lvmpolld_systemd_red_hat.service scripts/lvm2_lvmpolld_systemd_red_hat.socket scripts/lvm2_lvmlockd_systemd_red_hat.service scripts/lvm2_lvmlocking_systemd_red_hat.service scripts/lvm2_monitoring_init_red_hat scripts/lvm2_monitoring_systemd_red_hat.service scripts/lvm2_pvscan_systemd_red_hat@.service scripts/lvm2_tmpfiles_red_hat.conf scripts/lvmdump.sh scripts/Makefile test/Makefile test/api/Makefile test/api/python_lvm_unit.py test/unit/Makefile tools/Makefile udev/Makefile"
ac_config_files="$ac_config_files Makefile make.tmpl daemons/Makefile daemons/clvmd/Makefile daemons/cmirrord/Makefile daemons/dmeventd/Makefile daemons/dmeventd/libdevmapper-event.pc daemons/dmeventd/plugins/Makefile daemons/dmeventd/plugins/lvm2/Makefile daemons/dmeventd/plugins/raid/Makefile daemons/dmeventd/plugins/mirror/Makefile daemons/dmeventd/plugins/snapshot/Makefile daemons/dmeventd/plugins/thin/Makefile daemons/dmeventd/plugins/vdo/Makefile daemons/dmfilemapd/Makefile daemons/lvmdbusd/Makefile daemons/lvmdbusd/lvmdbusd daemons/lvmdbusd/lvmdb.py daemons/lvmdbusd/lvm_shell_proxy.py daemons/lvmdbusd/path.py daemons/lvmetad/Makefile daemons/lvmpolld/Makefile daemons/lvmlockd/Makefile device_mapper/Makefile conf/Makefile conf/example.conf conf/lvmlocal.conf conf/command_profile_template.profile conf/metadata_profile_template.profile include/.symlinks include/Makefile lib/Makefile lib/locking/Makefile include/lvm-version.h libdaemon/Makefile libdaemon/client/Makefile libdaemon/server/Makefile libdm/Makefile libdm/libdevmapper.pc liblvm/Makefile liblvm/liblvm2app.pc man/Makefile po/Makefile python/Makefile python/setup.py scripts/blkdeactivate.sh scripts/blk_availability_init_red_hat scripts/blk_availability_systemd_red_hat.service scripts/clvmd_init_red_hat scripts/cmirrord_init_red_hat scripts/com.redhat.lvmdbus1.service scripts/dm_event_systemd_red_hat.service scripts/dm_event_systemd_red_hat.socket scripts/lvm2_cluster_activation_red_hat.sh scripts/lvm2_cluster_activation_systemd_red_hat.service scripts/lvm2_clvmd_systemd_red_hat.service scripts/lvm2_cmirrord_systemd_red_hat.service scripts/lvm2_lvmdbusd_systemd_red_hat.service scripts/lvm2_lvmetad_init_red_hat scripts/lvm2_lvmetad_systemd_red_hat.service scripts/lvm2_lvmetad_systemd_red_hat.socket scripts/lvm2_lvmpolld_init_red_hat scripts/lvm2_lvmpolld_systemd_red_hat.service scripts/lvm2_lvmpolld_systemd_red_hat.socket scripts/lvm2_lvmlockd_systemd_red_hat.service scripts/lvm2_lvmlocking_systemd_red_hat.service scripts/lvm2_monitoring_init_red_hat scripts/lvm2_monitoring_systemd_red_hat.service scripts/lvm2_pvscan_systemd_red_hat@.service scripts/lvm2_tmpfiles_red_hat.conf scripts/lvmdump.sh scripts/Makefile test/Makefile test/api/Makefile test/api/python_lvm_unit.py test/unit/Makefile tools/Makefile udev/Makefile"
cat >confcache <<\_ACEOF
# This file is a shell script that caches the results of configure
@@ -16267,6 +16267,7 @@ do
"daemons/dmeventd/plugins/mirror/Makefile") CONFIG_FILES="$CONFIG_FILES daemons/dmeventd/plugins/mirror/Makefile" ;;
"daemons/dmeventd/plugins/snapshot/Makefile") CONFIG_FILES="$CONFIG_FILES daemons/dmeventd/plugins/snapshot/Makefile" ;;
"daemons/dmeventd/plugins/thin/Makefile") CONFIG_FILES="$CONFIG_FILES daemons/dmeventd/plugins/thin/Makefile" ;;
"daemons/dmeventd/plugins/vdo/Makefile") CONFIG_FILES="$CONFIG_FILES daemons/dmeventd/plugins/vdo/Makefile" ;;
"daemons/dmfilemapd/Makefile") CONFIG_FILES="$CONFIG_FILES daemons/dmfilemapd/Makefile" ;;
"daemons/lvmdbusd/Makefile") CONFIG_FILES="$CONFIG_FILES daemons/lvmdbusd/Makefile" ;;
"daemons/lvmdbusd/lvmdbusd") CONFIG_FILES="$CONFIG_FILES daemons/lvmdbusd/lvmdbusd" ;;

configure.ac

@@ -2099,6 +2099,7 @@ daemons/dmeventd/plugins/raid/Makefile
daemons/dmeventd/plugins/mirror/Makefile
daemons/dmeventd/plugins/snapshot/Makefile
daemons/dmeventd/plugins/thin/Makefile
daemons/dmeventd/plugins/vdo/Makefile
daemons/dmfilemapd/Makefile
daemons/lvmdbusd/Makefile
daemons/lvmdbusd/lvmdbusd

daemons/clvmd/lvm-functions.c

@@ -832,7 +832,7 @@ void lvm_do_backup(const char *vgname)
pthread_mutex_lock(&lvm_lock);
vg = vg_read_internal(cmd, vgname, NULL /*vgid*/, 0, WARN_PV_READ, &consistent);
vg = vg_read_internal(cmd, vgname, NULL /*vgid*/, 0, 0, WARN_PV_READ, &consistent);
if (vg && consistent)
check_current_backup(vg);

daemons/dmeventd/libdevmapper-event.c

@@ -645,6 +645,7 @@ int dm_event_register_handler(const struct dm_event_handler *dmevh)
uuid = dm_task_get_uuid(dmt);
if (!strstr(dmevh->dso, "libdevmapper-event-lvm2thin.so") &&
!strstr(dmevh->dso, "libdevmapper-event-lvm2vdo.so") &&
!strstr(dmevh->dso, "libdevmapper-event-lvm2snapshot.so") &&
!strstr(dmevh->dso, "libdevmapper-event-lvm2mirror.so") &&
!strstr(dmevh->dso, "libdevmapper-event-lvm2raid.so"))

daemons/dmeventd/plugins/Makefile.in

@@ -1,6 +1,6 @@
#
# Copyright (C) 2001-2004 Sistina Software, Inc. All rights reserved.
# Copyright (C) 2004-2005, 2011 Red Hat, Inc. All rights reserved.
# Copyright (C) 2004-2018 Red Hat, Inc. All rights reserved.
#
# This file is part of LVM2.
#
@@ -16,11 +16,7 @@ srcdir = @srcdir@
top_srcdir = @top_srcdir@
top_builddir = @top_builddir@
SUBDIRS += lvm2 snapshot raid thin mirror
ifeq ($(MAKECMDGOALS),distclean)
SUBDIRS = lvm2 mirror snapshot raid thin
endif
SUBDIRS += lvm2 snapshot raid thin mirror vdo
include $(top_builddir)/make.tmpl
@@ -28,3 +24,4 @@ snapshot: lvm2
mirror: lvm2
raid: lvm2
thin: lvm2
vdo: lvm2

daemons/dmeventd/plugins/lvm2/dmeventd_lvm.c

@@ -31,6 +31,13 @@ static pthread_mutex_t _register_mutex = PTHREAD_MUTEX_INITIALIZER;
static int _register_count = 0;
static struct dm_pool *_mem_pool = NULL;
static void *_lvm_handle = NULL;
static DM_LIST_INIT(_env_registry);
struct env_data {
struct dm_list list;
const char *cmd;
const char *data;
};
DM_EVENT_LOG_FN("#lvm")
@@ -100,6 +107,7 @@ void dmeventd_lvm2_exit(void)
lvm2_run(_lvm_handle, "_memlock_dec");
dm_pool_destroy(_mem_pool);
_mem_pool = NULL;
dm_list_init(&_env_registry);
lvm2_exit(_lvm_handle);
_lvm_handle = NULL;
log_debug("lvm plugin exited.");
@@ -124,6 +132,8 @@ int dmeventd_lvm2_command(struct dm_pool *mem, char *buffer, size_t size,
static char _internal_prefix[] = "_dmeventd_";
char *vg = NULL, *lv = NULL, *layer;
int r;
struct env_data *env_data;
const char *env = NULL;
if (!dm_split_lvm_name(mem, device, &vg, &lv, &layer)) {
log_error("Unable to determine VG name from %s.",
@@ -137,18 +147,35 @@ int dmeventd_lvm2_command(struct dm_pool *mem, char *buffer, size_t size,
*layer = '\0';
if (!strncmp(cmd, _internal_prefix, sizeof(_internal_prefix) - 1)) {
dmeventd_lvm2_lock();
/* output of internal command passed via env var */
if (!dmeventd_lvm2_run(cmd))
cmd = NULL;
else if ((cmd = getenv(cmd)))
cmd = dm_pool_strdup(mem, cmd); /* copy with lock */
dmeventd_lvm2_unlock();
/* check if ENVVAR wasn't already resolved */
dm_list_iterate_items(env_data, &_env_registry)
if (!strcmp(cmd, env_data->cmd)) {
env = env_data->data;
break;
}
if (!cmd) {
log_error("Unable to find configured command.");
return 0;
if (!env) {
/* run lvm2 command to find out setting value */
dmeventd_lvm2_lock();
if (!dmeventd_lvm2_run(cmd) ||
!(env = getenv(cmd))) {
log_error("Unable to find configured command.");
return 0;
}
/* output of internal command passed via env var */
env = dm_pool_strdup(_mem_pool, env); /* copy with lock */
dmeventd_lvm2_unlock();
if (!env ||
!(env_data = dm_pool_zalloc(_mem_pool, sizeof(*env_data))) ||
!(env_data->cmd = dm_pool_strdup(_mem_pool, cmd))) {
log_error("Unable to allocate env memory.");
return 0;
}
env_data->data = env;
/* add to ENVVAR registry */
dm_list_add(&_env_registry, &env_data->list);
}
cmd = env;
}
r = dm_snprintf(buffer, size, "%s %s/%s", cmd, vg, lv);
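The hunk above replaces a per-call lvm2 round trip with a registry: each "_dmeventd_" setting is resolved once and then served from _env_registry for the daemon's lifetime. A minimal standalone sketch of the same lookup-then-cache pattern, using a plain C list in place of lvm2's dm_list and memory pools (resolve_setting is a hypothetical stand-in for running the lvm2 command and reading its answer back from the environment):

#include <stdlib.h>
#include <string.h>

struct env_entry {
	struct env_entry *next;
	char *cmd;	/* setting name, e.g. "_dmeventd_vdo_command" */
	char *data;	/* resolved value, cached for the daemon's lifetime */
};

static struct env_entry *_registry;

/* Hypothetical stand-in for running the lvm2 command and reading
 * its output back via an env var. */
static const char *resolve_setting(const char *cmd)
{
	return getenv(cmd);
}

static const char *cached_setting(const char *cmd)
{
	struct env_entry *e;
	const char *val;

	for (e = _registry; e; e = e->next)	/* hit: no subprocess, no lock */
		if (!strcmp(cmd, e->cmd))
			return e->data;

	if (!(val = resolve_setting(cmd)))	/* miss: resolve once */
		return NULL;

	if (!(e = malloc(sizeof(*e))) ||
	    !(e->cmd = strdup(cmd)) ||
	    !(e->data = strdup(val)))
		return NULL;	/* error handling simplified for the sketch */

	e->next = _registry;	/* add to the registry */
	_registry = e;
	return e->data;
}

int main(void)
{
	/* the second call is answered from the registry */
	return (cached_setting("PATH") && cached_setting("PATH")) ? 0 : 1;
}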

View File

@@ -0,0 +1,3 @@
process_event
register_device
unregister_device

View File

@@ -0,0 +1,36 @@
#
# Copyright (C) 2018 Red Hat, Inc. All rights reserved.
#
# This file is part of LVM2.
#
# This copyrighted material is made available to anyone wishing to use,
# modify, copy, or redistribute it subject to the terms and conditions
# of the GNU General Public License v.2.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software Foundation,
# Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
srcdir = @srcdir@
top_srcdir = @top_srcdir@
top_builddir = @top_builddir@
INCLUDES += -I$(top_srcdir)/daemons/dmeventd/plugins/lvm2
CLDFLAGS += -L$(top_builddir)/daemons/dmeventd/plugins/lvm2
SOURCES = dmeventd_vdo.c
LIB_NAME = libdevmapper-event-lvm2vdo
LIB_SHARED = $(LIB_NAME).$(LIB_SUFFIX)
LIB_VERSION = $(LIB_VERSION_LVM)
CFLOW_LIST = $(SOURCES)
CFLOW_LIST_TARGET = $(LIB_NAME).cflow
include $(top_builddir)/make.tmpl
LIBS += -ldevmapper-event-lvm2 $(INTERNAL_LIBS)
install_lvm2: install_dm_plugin
install: install_lvm2

View File

@@ -0,0 +1,419 @@
/*
* Copyright (C) 2018 Red Hat, Inc. All rights reserved.
*
* This file is part of LVM2.
*
* This copyrighted material is made available to anyone wishing to use,
* modify, copy, or redistribute it subject to the terms and conditions
* of the GNU Lesser General Public License v.2.1.
*
* You should have received a copy of the GNU Lesser General Public License
* along with this program; if not, write to the Free Software Foundation,
* Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
*/
#include "lib.h"
#include "dmeventd_lvm.h"
#include "libdevmapper-event.h"
#include <sys/wait.h>
#include <stdarg.h>
/* First warning when VDO pool is 80% full. */
#define WARNING_THRESH (DM_PERCENT_1 * 80)
/* Run a check every 5%. */
#define CHECK_STEP (DM_PERCENT_1 * 5)
/* Do not bother checking when the VDO pool is less than 50% full. */
#define CHECK_MINIMUM (DM_PERCENT_1 * 50)
#define MAX_FAILS (256) /* ~42 mins between cmd call retry with 10s delay */
#define VDO_DEBUG 0
struct dso_state {
struct dm_pool *mem;
int percent_check;
int percent;
uint64_t known_data_size;
unsigned fails;
unsigned max_fails;
int restore_sigset;
sigset_t old_sigset;
pid_t pid;
char *argv[3];
const char *cmd_str;
const char *name;
};
struct vdo_status {
uint64_t used_blocks;
uint64_t total_blocks;
};
static int _vdo_status_parse(const char *params, struct vdo_status *status)
{
if (sscanf(params, "%*s %*s %*s %*s %*s %" PRIu64 " %" PRIu64,
&status->used_blocks,
&status->total_blocks) < 2) {
log_error("Failed to parse vdo params: %s.", params);
return 0;
}
return 1;
}
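_vdo_status_parse() skips the first five whitespace-separated fields of the vdo status params and keeps only the used/total block counts. A standalone run of the same sscanf on an illustrative params string (the field values are made up for the demo, not real vdo output):

#include <stdio.h>
#include <inttypes.h>

int main(void)
{
	/* illustrative: device mode recovering index compression used total */
	const char *params = "253:0 normal - online online 51200 262144";
	uint64_t used, total;

	if (sscanf(params, "%*s %*s %*s %*s %*s %" SCNu64 " %" SCNu64,
		   &used, &total) != 2)
		return 1;

	printf("used %" PRIu64 " of %" PRIu64 " blocks\n", used, total);
	return 0;
}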
DM_EVENT_LOG_FN("vdo")
static int _run_command(struct dso_state *state)
{
char val[16];
int i;
/* Mark that a possible lvm2 command is being run from dmeventd;
* lvm2 will then not try to talk back to dmeventd while processing it */
(void) setenv("LVM_RUN_BY_DMEVENTD", "1", 1);
if (state->percent) {
/* Pass some known data via env vars for easy use */
if (dm_snprintf(val, sizeof(val), "%d",
state->percent / DM_PERCENT_1) != -1)
(void) setenv("DMEVENTD_VDO_POOL", val, 1);
} else {
/* For an error event it's up to the user to check status and decide */
log_debug("Error event processing.");
}
log_verbose("Executing command: %s", state->cmd_str);
/* TODO:
* Support parallel runs of 'task' and their waitpid maintenance.
* ATM we can't handle signaling with SIGALRM,
* as signalling is not allowed while 'process_event()' is running
*/
if (!(state->pid = fork())) {
/* child */
(void) close(0);
for (i = 3; i < 255; ++i) (void) close(i);
execvp(state->argv[0], state->argv);
_exit(errno);
} else if (state->pid == -1) {
log_error("Can't fork command %s.", state->cmd_str);
state->fails = 1;
return 0;
}
return 1;
}
static int _use_policy(struct dm_task *dmt, struct dso_state *state)
{
#if VDO_DEBUG
log_debug("dmeventd executes: %s.", state->cmd_str);
#endif
if (state->argv[0])
return _run_command(state);
if (!dmeventd_lvm2_run_with_lock(state->cmd_str)) {
log_error("Failed command for %s.", dm_task_get_name(dmt));
state->fails = 1;
return 0;
}
state->fails = 0;
return 1;
}
/* Check if the executed command has finished.
* Only 1 command may run at a time */
static int _wait_for_pid(struct dso_state *state)
{
int status = 0;
if (state->pid == -1)
return 1;
if (!waitpid(state->pid, &status, WNOHANG))
return 0;
/* Wait for finish */
if (WIFEXITED(status)) {
log_verbose("Child %d exited with status %d.",
state->pid, WEXITSTATUS(status));
state->fails = WEXITSTATUS(status) ? 1 : 0;
} else {
if (WIFSIGNALED(status))
log_verbose("Child %d was terminated with status %d.",
state->pid, WTERMSIG(status));
state->fails = 1;
}
state->pid = -1;
return 1;
}
void process_event(struct dm_task *dmt,
enum dm_event_mask event __attribute__((unused)),
void **user)
{
const char *device = dm_task_get_name(dmt);
struct dso_state *state = *user;
void *next = NULL;
uint64_t start, length;
char *target_type = NULL;
char *params;
int needs_policy = 0;
struct dm_task *new_dmt = NULL;
struct vdo_status status;
#if VDO_DEBUG
log_debug("Watch for VDO %s:%.2f%%.", state->name,
dm_percent_to_round_float(state->percent_check, 2));
#endif
if (!_wait_for_pid(state)) {
log_warn("WARNING: Skipping event, child %d is still running (%s).",
state->pid, state->cmd_str);
return;
}
if (event & DM_EVENT_DEVICE_ERROR) {
#if VDO_DEBUG
log_debug("VDO event error.");
#endif
/* Error -> no need to check and do instant resize */
state->percent = 0;
if (_use_policy(dmt, state))
goto out;
stack;
if (!(new_dmt = dm_task_create(DM_DEVICE_STATUS)))
goto_out;
if (!dm_task_set_uuid(new_dmt, dm_task_get_uuid(dmt)))
goto_out;
/* Non-blocking status read */
if (!dm_task_no_flush(new_dmt))
log_warn("WARNING: Can't set no_flush for dm status.");
if (!dm_task_run(new_dmt))
goto_out;
dmt = new_dmt;
}
dm_get_next_target(dmt, next, &start, &length, &target_type, &params);
if (!target_type || (strcmp(target_type, "vdo") != 0)) {
log_error("Invalid target type.");
goto out;
}
if (!_vdo_status_parse(params, &status)) {
log_error("Failed to parse status.");
goto out;
}
state->percent = dm_make_percent(status.used_blocks,
status.total_blocks);
#if VDO_DEBUG
log_debug("VDO %s status %.2f%% " FMTu64 "/" FMTu64 ".",
state->name, dm_percent_to_round_float(state->percent, 2),
status.used_blocks, status.total_blocks);
#endif
/* VDO pool size has changed. Clear the threshold. */
if (state->known_data_size != status.total_blocks) {
state->percent_check = CHECK_MINIMUM;
state->known_data_size = status.total_blocks;
state->fails = 0;
}
/*
* Trigger action when a threshold boundary is exceeded.
* Report the 80% threshold warning when usage is above 80%.
* Only 100% is an exception, as it cannot be surpassed, so the
* policy action is called for: >50%, >55% ... >95%, 100%
*/
if ((state->percent > WARNING_THRESH) &&
(state->percent > state->percent_check))
log_warn("WARNING: VDO %s %s is now %.2f%% full.",
state->name, device,
dm_percent_to_round_float(state->percent, 2));
if (state->percent > CHECK_MINIMUM) {
/* Run action when usage has risen more than CHECK_STEP since the last time */
if (state->percent > state->percent_check)
needs_policy = 1;
state->percent_check = (state->percent / CHECK_STEP + 1) * CHECK_STEP;
if (state->percent_check == DM_PERCENT_100)
state->percent_check--; /* Can't get bigger than 100% */
} else
state->percent_check = CHECK_MINIMUM;
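The stepping above can be exercised in isolation. A sketch of the same arithmetic; PERCENT_1 is a local stand-in unit so the sketch is self-contained (libdm's dm_percent_t uses a finer fixed-point unit, but the arithmetic is unchanged):

#include <stdio.h>

#define PERCENT_1   100	/* stand-in unit; libdm's dm_percent_t is finer */
#define PERCENT_100 (100 * PERCENT_1)
#define CHECK_STEP  (PERCENT_1 * 5)
#define CHECK_MIN   (PERCENT_1 * 50)

int main(void)
{
	int percent, check = CHECK_MIN;

	for (percent = 0; percent <= PERCENT_100; percent += PERCENT_1) {
		if (percent <= CHECK_MIN) {
			check = CHECK_MIN;	/* below minimum: reset */
			continue;
		}
		if (percent > check)
			printf("policy runs at %d%%\n", percent / PERCENT_1);
		check = (percent / CHECK_STEP + 1) * CHECK_STEP;
		if (check == PERCENT_100)
			check--;	/* can't get bigger than 100% */
	}
	return 0;
}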
/* Reduce the number of _use_policy() calls by a power-of-2 factor until MAX_FAILS is reached.
* This avoids an excessive number of error retries, yet still logs status messages regularly,
* e.g. a PV could have been pvmoved and the VG/LV locked for a while...
*/
if (state->fails) {
if (state->fails++ <= state->max_fails) {
log_debug("Postponing frequently failing policy (%u <= %u).",
state->fails - 1, state->max_fails);
return;
}
if (state->max_fails < MAX_FAILS)
state->max_fails <<= 1;
state->fails = needs_policy = 1; /* Retry failing command */
} else
state->max_fails = 1; /* Reset on success */
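The fail counters above implement a power-of-2 backoff: the command is retried less and less often while it keeps failing, capped at MAX_FAILS. The counter behaviour on its own, one loop iteration per monitored event:

#include <stdio.h>

#define MAX_FAILS 256

int main(void)
{
	unsigned fails = 1, max_fails = 1, tick;

	for (tick = 0; tick < 40; tick++) {
		if (fails++ <= max_fails)
			continue;	/* postponed: not retried this tick */
		if (max_fails < MAX_FAILS)
			max_fails <<= 1;	/* double the backoff window */
		fails = 1;			/* retry the failing command */
		printf("tick %2u: retry (next window %u ticks)\n",
		       tick, max_fails);
	}
	return 0;
}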
/* FIXME: ATM nothing can be done, drop 0, once it becomes useful */
if (0 && needs_policy)
_use_policy(dmt, state);
out:
if (new_dmt)
dm_task_destroy(new_dmt);
}
/* Handle SIGCHLD for a thread */
static void _sig_child(int signum __attribute__((unused)))
{
/* empty SIG_IGN */;
}
/* Set up a handler for SIGCHLD when executing an external command
* to get a quick 'waitpid()' reaction.
* It will interrupt a syscall just like SIGALRM and
* invoke process_event().
*/
static void _init_thread_signals(struct dso_state *state)
{
struct sigaction act = { .sa_handler = _sig_child };
sigset_t my_sigset;
sigemptyset(&my_sigset);
if (sigaction(SIGCHLD, &act, NULL))
log_warn("WARNING: Failed to set SIGCHLD action.");
else if (sigaddset(&my_sigset, SIGCHLD))
log_warn("WARNING: Failed to add SIGCHLD to set.");
else if (pthread_sigmask(SIG_UNBLOCK, &my_sigset, &state->old_sigset))
log_warn("WARNING: Failed to unblock SIGCHLD.");
else
state->restore_sigset = 1;
}
static void _restore_thread_signals(struct dso_state *state)
{
if (state->restore_sigset &&
pthread_sigmask(SIG_SETMASK, &state->old_sigset, NULL))
log_warn("WARNING: Failed to block SIGCHLD.");
}
int register_device(const char *device,
const char *uuid,
int major __attribute__((unused)),
int minor __attribute__((unused)),
void **user)
{
struct dso_state *state;
const char *cmd;
char *str;
char cmd_str[PATH_MAX + 128 + 2]; /* cmd ' ' vg/lv \0 */
const char *name = "pool";
if (!dmeventd_lvm2_init_with_pool("vdo_pool_state", state))
goto_bad;
state->cmd_str = "";
/* Search for command for LVM- prefixed devices only */
cmd = (strncmp(uuid, "LVM-", 4) == 0) ? "_dmeventd_vdo_command" : "";
if (!dmeventd_lvm2_command(state->mem, cmd_str, sizeof(cmd_str), cmd, device))
goto_bad;
if (strncmp(cmd_str, "lvm ", 4) == 0) {
if (!(state->cmd_str = dm_pool_strdup(state->mem, cmd_str + 4))) {
log_error("Failed to copy lvm VDO command.");
goto bad;
}
} else if (cmd_str[0] == '/') {
if (!(state->cmd_str = dm_pool_strdup(state->mem, cmd_str))) {
log_error("Failed to copy VDO command.");
goto bad;
}
/* Find last space before 'vg/lv' */
if (!(str = strrchr(state->cmd_str, ' ')))
goto inval;
if (!(state->argv[0] = dm_pool_strndup(state->mem, state->cmd_str,
str - state->cmd_str))) {
log_error("Failed to copy command.");
goto bad;
}
state->argv[1] = str + 1; /* 1 argument - vg/lv */
_init_thread_signals(state);
} else if (cmd[0] == 0) {
state->name = "volume"; /* What to use with 'others?' */
} else /* Unsupported command format */
goto inval;
state->pid = -1;
state->name = name;
*user = state;
log_info("Monitoring VDO %s %s.", name, device);
return 1;
inval:
log_error("Invalid command for monitoring: %s.", cmd_str);
bad:
log_error("Failed to monitor VDO %s %s.", name, device);
if (state)
dmeventd_lvm2_exit_with_pool(state);
return 0;
}
int unregister_device(const char *device,
const char *uuid __attribute__((unused)),
int major __attribute__((unused)),
int minor __attribute__((unused)),
void **user)
{
struct dso_state *state = *user;
const char *name = state->name;
int i;
for (i = 0; !_wait_for_pid(state) && (i < 6); ++i) {
if (i == 0)
/* Give it 2 seconds, then try to terminate & kill it */
log_verbose("Child %d still not finished (%s) waiting.",
state->pid, state->cmd_str);
else if (i == 3) {
log_warn("WARNING: Terminating child %d.", state->pid);
kill(state->pid, SIGINT);
kill(state->pid, SIGTERM);
} else if (i == 5) {
log_warn("WARNING: Killing child %d.", state->pid);
kill(state->pid, SIGKILL);
}
sleep(1);
}
if (state->pid != -1)
log_warn("WARNING: Cannot kill child %d!", state->pid);
_restore_thread_signals(state);
dmeventd_lvm2_exit_with_pool(state);
log_info("No longer monitoring VDO %s %s.", name, device);
return 1;
}
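unregister_device() escalates from a polite wait to SIGINT/SIGTERM and finally SIGKILL. The same escalation as a runnable toy, where the child deliberately ignores the polite signals (timing and signal choices mirror the loop above; this is a sketch, not lvm's code):

#include <signal.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
	pid_t pid = fork();
	int i, status;

	if (!pid) {	/* stubborn child: ignores SIGINT/SIGTERM */
		signal(SIGINT, SIG_IGN);
		signal(SIGTERM, SIG_IGN);
		pause();
		_exit(0);
	}

	for (i = 0; waitpid(pid, &status, WNOHANG) == 0 && i < 6; ++i) {
		if (i == 3) {			/* be polite first */
			kill(pid, SIGINT);
			kill(pid, SIGTERM);
		} else if (i == 5)		/* then stop asking */
			kill(pid, SIGKILL);
		sleep(1);
	}
	printf("child %s\n", waitpid(pid, &status, WNOHANG) ? "gone" : "stuck");
	return 0;
}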

View File

@@ -33,7 +33,7 @@
#else
# define MAJOR(x) major((x))
# define MINOR(x) minor((x))
# define MKDEV(x,y) makedev((x),(y))
# define MKDEV(x,y) makedev((dev_t)(x),(dev_t)(y))
#endif
/* limit to two updates/sec */
@@ -629,10 +629,10 @@ check_unlinked:
static int _daemonise(struct filemap_monitor *fm)
{
pid_t pid = 0, sid;
pid_t pid = 0;
int fd;
if (!(sid = setsid())) {
if (!setsid()) {
_early_log("setsid failed.");
return 0;
}

View File

@@ -2815,6 +2815,9 @@ static int add_lockspace_thread(const char *ls_name,
if (ls2->thread_stop) {
log_debug("add_lockspace_thread %s exists and stopping", ls->name);
rv = -EAGAIN;
} else if (!ls2->create_fail && !ls2->create_done) {
log_debug("add_lockspace_thread %s exists and starting", ls->name);
rv = -ESTARTING;
} else {
log_debug("add_lockspace_thread %s exists", ls->name);
rv = -EEXIST;
@@ -3056,7 +3059,7 @@ static int count_lockspace_starting(uint32_t client_id)
pthread_mutex_lock(&lockspaces_mutex);
list_for_each_entry(ls, &lockspaces, list) {
if (ls->start_client_id != client_id)
if (client_id && (ls->start_client_id != client_id))
continue;
if (!ls->create_done && !ls->create_fail) {
@@ -3457,7 +3460,7 @@ static void *worker_thread_main(void *arg_in)
add_client_result(act);
} else if (act->op == LD_OP_START_WAIT) {
act->result = count_lockspace_starting(act->client_id);
act->result = count_lockspace_starting(0);
if (!act->result)
add_client_result(act);
else
@@ -3491,7 +3494,7 @@ static void *worker_thread_main(void *arg_in)
list_for_each_entry_safe(act, safe, &delayed_list, list) {
if (act->op == LD_OP_START_WAIT) {
log_debug("work delayed start_wait for client %u", act->client_id);
act->result = count_lockspace_starting(act->client_id);
act->result = count_lockspace_starting(0);
if (!act->result) {
list_del(&act->list);
add_client_result(act);

View File

@@ -1851,6 +1851,8 @@ int monitor_dev_for_events(struct cmd_context *cmd, const struct logical_volume
if (!laopts)
laopts = &zlaopts;
else
mirr_laopts.read_only = laopts->read_only;
/* skip dmeventd code altogether */
if (dmeventd_monitor_mode() == DMEVENTD_MONITOR_IGNORE)
@@ -1907,7 +1909,8 @@ int monitor_dev_for_events(struct cmd_context *cmd, const struct logical_volume
* In case of a snapshot device, we monitor lv->snapshot->lv,
* not the actual LV itself.
*/
if (lv_is_cow(lv) && (laopts->no_merging || !lv_is_merging_cow(lv))) {
if (lv_is_cow(lv) && (laopts->no_merging || !lv_is_merging_cow(lv) ||
lv_has_target_type(lv->vg->cmd->mem, lv, NULL, TARGET_NAME_SNAPSHOT))) {
if (!(r = monitor_dev_for_events(cmd, lv->snapshot->lv, NULL, monitor)))
stack;
return r;
@@ -2109,6 +2112,11 @@ static int _preload_detached_lv(struct logical_volume *lv, void *data)
!lv_is_raid_metadata(lv_pre) && lv_is_active(lv) &&
!_lv_preload(lv_pre, detached->laopts, detached->flush_required))
return_0;
} else if (lv_is_mirror_image(lv)) {
if ((lv_pre = find_lv_in_vg_by_lvid(detached->lv_pre->vg, &lv->lvid)) &&
!lv_is_mirror_image(lv_pre) && lv_is_active(lv) &&
!_lv_preload(lv_pre, detached->laopts, detached->flush_required))
return_0;
}
if (!lv_is_visible(lv) && (lv_pre = find_lv(detached->lv_pre->vg, lv->name)) &&
@@ -2781,6 +2789,12 @@ static int _lv_activate(struct cmd_context *cmd, const char *lvid_s,
goto out;
}
if (lv_raid_has_visible_sublvs(lv)) {
log_error("Refusing activation of RAID LV %s with "
"visible SubLVs.", display_lvname(lv));
goto out;
}
if (test_mode()) {
_skip("Activating %s.", display_lvname(lv));
r = 1;

View File

@@ -234,6 +234,7 @@ struct dev_usable_check_params {
unsigned int check_suspended:1;
unsigned int check_error_target:1;
unsigned int check_reserved:1;
unsigned int check_lv:1;
};
/*

View File

@@ -178,7 +178,8 @@ static int _get_segment_status_from_target_params(const char *target_name,
}
/* Validate target_name segtype from DM table with lvm2 metadata segtype */
if (strcmp(segtype->name, target_name) &&
if (!lv_is_locked(seg->lv) &&
strcmp(segtype->name, target_name) &&
/* If kernel's type isn't an exact match is it compatible? */
(!segtype->ops->target_status_compatible ||
!segtype->ops->target_status_compatible(target_name))) {
@@ -365,7 +366,7 @@ static int _ignore_blocked_mirror_devices(struct device *dev,
if (!(tmp_dev = dev_create_file(buf, NULL, NULL, 0)))
goto_out;
tmp_dev->dev = MKDEV((dev_t)sm->logs[0].major, (dev_t)sm->logs[0].minor);
tmp_dev->dev = MKDEV(sm->logs[0].major, sm->logs[0].minor);
if (device_is_usable(tmp_dev, (struct dev_usable_check_params)
{ .check_empty = 1,
.check_blocked = 1,
@@ -639,6 +640,11 @@ int device_is_usable(struct device *dev, struct dev_usable_check_params check)
}
}
if (check.check_lv && uuid && !strncmp(uuid, "LVM-", 4)) {
/* Skip LVs */
goto out;
}
if (check.check_reserved && uuid &&
(!strncmp(uuid, CRYPT_TEMP, sizeof(CRYPT_TEMP) - 1) ||
!strncmp(uuid, STRATIS, sizeof(STRATIS) - 1))) {

lib/cache/lvmcache.c
View File

@@ -295,6 +295,11 @@ static void _drop_metadata(const char *vgname, int drop_precommitted)
_saved_vg_free(svg, 0, 1);
else
_saved_vg_free(svg, 1, 1);
if (!svg->saved_vg_old && !svg->saved_vg_new) {
dm_hash_remove(_saved_vg_hash, svg->vgid);
dm_free(svg);
}
}
void lvmcache_save_vg(struct volume_group *vg, int precommitted)
@@ -993,7 +998,7 @@ int lvmcache_dev_is_unchosen_duplicate(struct device *dev)
* unused_duplicate_devs list, and restrict what we allow done with it.
*
* In the case of md components, we usually filter these out in filter-md,
* but in the special case of md superblocks <= 1.0 where the superblock
* but in the special case of md superblock version 1.0 where the superblock
* is at the end of the device, filter-md doesn't always eliminate them
* first, so we eliminate them here.
*
@@ -1010,7 +1015,8 @@ static void _filter_duplicate_devs(struct cmd_context *cmd)
dm_list_iterate_items_safe(devl, devl2, &_unused_duplicate_devs) {
info = lvmcache_info_from_pvid(devl->dev->pvid, NULL, 0);
if (!(info = lvmcache_info_from_pvid(devl->dev->pvid, NULL, 0)))
continue;
if (MAJOR(info->dev->dev) == dt->md_major) {
log_debug_devs("Ignoring md component duplicate %s", dev_name(devl->dev));
@@ -1038,7 +1044,8 @@ static void _warn_duplicate_devs(struct cmd_context *cmd)
dm_list_iterate_items_safe(devl, devl2, &_unused_duplicate_devs) {
/* info for the preferred device that we're actually using */
info = lvmcache_info_from_pvid(devl->dev->pvid, NULL, 0);
if (!(info = lvmcache_info_from_pvid(devl->dev->pvid, NULL, 0)))
continue;
if (!id_write_format((const struct id *)info->dev->pvid, uuid, sizeof(uuid)))
stack;
@@ -1344,7 +1351,7 @@ next:
* comes directly from files.)
*/
int lvmcache_label_rescan_vg(struct cmd_context *cmd, const char *vgname, const char *vgid)
int lvmcache_label_rescan_vg(struct cmd_context *cmd, const char *vgname, const char *vgid, int open_rw)
{
struct dm_list devs;
struct device_list *devl, *devl2;
@@ -1389,7 +1396,10 @@ int lvmcache_label_rescan_vg(struct cmd_context *cmd, const char *vgname, const
/* FIXME: should we also rescan unused_duplicate_devs for devs
being rescanned here and then repeat resolving the duplicates? */
label_scan_devs(cmd, cmd->filter, &devs);
if (open_rw)
label_scan_devs_rw(cmd, cmd->filter, &devs);
else
label_scan_devs(cmd, cmd->filter, &devs);
dm_list_iterate_items_safe(devl, devl2, &devs) {
dm_list_del(&devl->list);
@@ -2515,6 +2525,7 @@ static void _lvmcache_destroy_lockname(struct dm_hash_node *n)
static void _destroy_saved_vg(struct saved_vg *svg)
{
_saved_vg_free(svg, 1, 1);
dm_free(svg);
}
void lvmcache_destroy(struct cmd_context *cmd, int retain_orphans, int reset)
@@ -3033,3 +3044,18 @@ int lvmcache_scan_mismatch(struct cmd_context *cmd, const char *vgname, const ch
return 1;
}
static uint64_t _max_metadata_size;
void lvmcache_save_metadata_size(uint64_t val)
{
if (!_max_metadata_size)
_max_metadata_size = val;
else if (_max_metadata_size < val)
_max_metadata_size = val;
}
uint64_t lvmcache_max_metadata_size(void)
{
return _max_metadata_size;
}

View File

@@ -69,7 +69,7 @@ void lvmcache_allow_reads_with_lvmetad(void);
void lvmcache_destroy(struct cmd_context *cmd, int retain_orphans, int reset);
int lvmcache_label_scan(struct cmd_context *cmd);
int lvmcache_label_rescan_vg(struct cmd_context *cmd, const char *vgname, const char *vgid);
int lvmcache_label_rescan_vg(struct cmd_context *cmd, const char *vgname, const char *vgid, int open_rw);
/* Add/delete a device */
struct lvmcache_info *lvmcache_add(struct labeller *labeller, const char *pvid,
@@ -225,4 +225,7 @@ struct volume_group *lvmcache_get_saved_vg(const char *vgid, int precommitted);
struct volume_group *lvmcache_get_saved_vg_latest(const char *vgid);
void lvmcache_drop_saved_vgid(const char *vgid);
uint64_t lvmcache_max_metadata_size(void);
void lvmcache_save_metadata_size(uint64_t val);
#endif

lib/cache/lvmetad.c
View File

@@ -31,6 +31,7 @@ static daemon_handle _lvmetad = { .error = 0 };
static int _lvmetad_use = 0;
static int _lvmetad_connected = 0;
static int _lvmetad_daemon_pid = 0;
static int _was_connected = 0;
static char *_lvmetad_token = NULL;
static const char *_lvmetad_socket = NULL;
@@ -114,8 +115,10 @@ static int _log_debug_inequality(const char *name, struct dm_config_node *a, str
void lvmetad_disconnect(void)
{
if (_lvmetad_connected)
if (_lvmetad_connected) {
daemon_close(_lvmetad);
_was_connected = 1;
}
_lvmetad_connected = 0;
_lvmetad_use = 0;
@@ -310,6 +313,7 @@ retry:
* The caller should do a disk scan to populate lvmetad.
*/
if (!strcmp(daemon_token, "none")) {
log_debug_lvmetad("lvmetad initialization needed.");
ret = 0;
goto out;
}
@@ -321,10 +325,16 @@ retry:
* our global filter.
*/
if (strcmp(daemon_token, _lvmetad_token)) {
log_debug_lvmetad("lvmetad initialization needed for different filter.");
ret = 0;
goto out;
}
if (wait_start)
log_debug_lvmetad("lvmetad initialized during wait.");
else
log_debug_lvmetad("lvmetad initialized previously.");
out:
daemon_reply_destroy(reply);
return ret;
@@ -2322,8 +2332,8 @@ bad:
int lvmetad_pvscan_all_devs(struct cmd_context *cmd, int do_wait)
{
struct dev_iter *iter;
struct device *dev;
struct device_list *devl, *devl2;
struct dm_list scan_devs;
daemon_reply reply;
char *future_token;
const char *reason;
@@ -2339,6 +2349,8 @@ int lvmetad_pvscan_all_devs(struct cmd_context *cmd, int do_wait)
}
retry:
dm_list_init(&scan_devs);
/*
* If another update is in progress, delay to allow it to finish,
* rather than interrupting it with our own update.
@@ -2348,28 +2360,11 @@ int lvmetad_pvscan_all_devs(struct cmd_context *cmd, int do_wait)
replacing_other_update = 1;
}
label_scan(cmd);
lvmcache_pvscan_duplicate_check(cmd);
if (lvmcache_found_duplicate_pvs()) {
log_warn("WARNING: Scan found duplicate PVs.");
return 0;
}
log_verbose("Scanning all devices to update lvmetad.");
if (!(iter = dev_iter_create(cmd->lvmetad_filter, 1))) {
log_error("dev_iter creation failed");
return 0;
}
future_token = _lvmetad_token;
_lvmetad_token = (char *) LVMETAD_TOKEN_UPDATE_IN_PROGRESS;
if (!_token_update(&replaced_update)) {
log_error("Failed to update lvmetad which had an update in progress.");
dev_iter_destroy(iter);
log_error("Failed to start lvmetad update.");
_lvmetad_token = future_token;
return 0;
}
@@ -2385,16 +2380,18 @@ int lvmetad_pvscan_all_devs(struct cmd_context *cmd, int do_wait)
if (do_wait && !retries) {
retries = 1;
log_warn("WARNING: lvmetad update in progress, retrying update.");
dev_iter_destroy(iter);
_lvmetad_token = future_token;
goto retry;
}
log_warn("WARNING: lvmetad update in progress, skipping update.");
dev_iter_destroy(iter);
_lvmetad_token = future_token;
return 0;
}
log_verbose("Scanning all devices to initialize lvmetad.");
label_scan_pvscan_all(cmd, &scan_devs);
log_debug_lvmetad("Telling lvmetad to clear its cache");
reply = _lvmetad_send(cmd, "pv_clear_all", NULL);
if (!_lvmetad_handle_reply(reply, "pv_clear_all", "", NULL))
@@ -2404,15 +2401,24 @@ int lvmetad_pvscan_all_devs(struct cmd_context *cmd, int do_wait)
was_silent = silent_mode();
init_silent(1);
while ((dev = dev_iter_get(iter))) {
log_debug_lvmetad("Sending %d devices to lvmetad.", dm_list_size(&scan_devs));
dm_list_iterate_items_safe(devl, devl2, &scan_devs) {
if (sigint_caught()) {
ret = 0;
stack;
break;
}
if (!lvmetad_pvscan_single(cmd, dev, NULL, NULL)) {
ret = 0;
dm_list_del(&devl->list);
ret = lvmetad_pvscan_single(cmd, devl->dev, NULL, NULL);
label_scan_invalidate(devl->dev);
dm_free(devl);
if (!ret) {
stack;
break;
}
@@ -2420,8 +2426,6 @@ int lvmetad_pvscan_all_devs(struct cmd_context *cmd, int do_wait)
init_silent(was_silent);
dev_iter_destroy(iter);
_lvmetad_token = future_token;
/*
@@ -2439,6 +2443,13 @@ int lvmetad_pvscan_all_devs(struct cmd_context *cmd, int do_wait)
return 0;
}
/* This will disable lvmetad if label scan found duplicates. */
lvmcache_pvscan_duplicate_check(cmd);
if (lvmcache_found_duplicate_pvs()) {
log_warn("WARNING: Scan found duplicate PVs.");
return 0;
}
/*
* If lvmetad is disabled, and no duplicate PVs were seen, then re-enable lvmetad.
*/
@@ -2973,20 +2984,49 @@ int lvmetad_vg_is_foreign(struct cmd_context *cmd, const char *vgname, const cha
*/
void lvmetad_set_disabled(struct cmd_context *cmd, const char *reason)
{
daemon_handle tmph = { .error = 0 };
daemon_reply reply;
int tmp_con = 0;
if (!_lvmetad_use)
return;
/*
* If we were using lvmetad at the start of the command, but are not
* now, then _was_connected should still be set. In this case we
* want to make a temp connection just to disable it.
*/
if (!_lvmetad_use) {
if (_was_connected) {
/* Create a special temp connection just to send disable */
tmph = lvmetad_open(_lvmetad_socket);
if (tmph.socket_fd < 0 || tmph.error) {
log_warn("Failed to connect to lvmetad to disable.");
return;
}
tmp_con = 1;
} else {
/* We were never using lvmetad, don't start now. */
return;
}
}
log_debug_lvmetad("Sending lvmetad disabled %s", reason);
reply = daemon_send_simple(_lvmetad, "set_global_info",
if (tmp_con)
reply = daemon_send_simple(tmph, "set_global_info",
"token = %s", "skip",
"global_disable = " FMTd64, (int64_t)1,
"disable_reason = %s", reason,
"pid = " FMTd64, (int64_t)getpid(),
"cmd = %s", get_cmd_name(),
NULL);
else
reply = daemon_send_simple(_lvmetad, "set_global_info",
"token = %s", "skip",
"global_disable = " FMTd64, (int64_t)1,
"disable_reason = %s", reason,
"pid = " FMTd64, (int64_t)getpid(),
"cmd = %s", get_cmd_name(),
NULL);
if (reply.error)
log_error("Failed to send message to lvmetad %d", reply.error);
@@ -2994,6 +3034,9 @@ void lvmetad_set_disabled(struct cmd_context *cmd, const char *reason)
log_error("Failed response from lvmetad.");
daemon_reply_destroy(reply);
if (tmp_con)
daemon_close(tmph);
}
void lvmetad_clear_disabled(struct cmd_context *cmd)

View File

@@ -333,6 +333,8 @@ static void _init_logging(struct cmd_context *cmd)
find_config_tree_bool(cmd, global_test_CFG, NULL);
init_test(cmd->default_settings.test);
init_use_aio(find_config_tree_bool(cmd, global_use_aio_CFG, NULL));
/* Settings for logging to file */
if (find_config_tree_bool(cmd, log_overwrite_CFG, NULL))
append = 0;
@@ -683,6 +685,8 @@ static int _process_config(struct cmd_context *cmd)
if (!_init_system_id(cmd))
return_0;
init_io_memory_size(find_config_tree_int(cmd, global_io_memory_size_CFG, NULL));
return 1;
}
@@ -1113,7 +1117,7 @@ static struct dev_filter *_init_lvmetad_filter_chain(struct cmd_context *cmd)
nr_filt++;
/* usable device filter. Required. */
if (!(filters[nr_filt] = usable_filter_create(cmd->dev_types,
if (!(filters[nr_filt] = usable_filter_create(cmd, cmd->dev_types,
lvmetad_used() ? FILTER_MODE_PRE_LVMETAD
: FILTER_MODE_NO_LVMETAD))) {
log_error("Failed to create usabled device filter");
@@ -1233,7 +1237,7 @@ int init_filters(struct cmd_context *cmd, unsigned load_persistent_cache)
}
nr_filt++;
}
if (!(filter_components[nr_filt] = usable_filter_create(cmd->dev_types, FILTER_MODE_POST_LVMETAD))) {
if (!(filter_components[nr_filt] = usable_filter_create(cmd, cmd->dev_types, FILTER_MODE_POST_LVMETAD))) {
log_verbose("Failed to create usable device filter.");
goto bad;
}
@@ -1462,6 +1466,7 @@ static int _init_segtypes(struct cmd_context *cmd)
struct segment_type *segtype;
struct segtype_library seglib = { .cmd = cmd, .lib = NULL };
struct segment_type *(*init_segtype_array[])(struct cmd_context *cmd) = {
init_linear_segtype,
init_striped_segtype,
init_zero_segtype,
init_error_segtype,

View File

@@ -95,6 +95,7 @@ struct cmd_context {
char **argv;
struct arg_values *opt_arg_values;
struct dm_list arg_value_groups;
int opt_count; /* total number of options (beginning with - or --) */
/*
* Position args remaining after command name
@@ -154,6 +155,7 @@ struct cmd_context {
unsigned include_shared_vgs:1; /* report/display cmds can reveal lockd VGs */
unsigned include_active_foreign_vgs:1; /* cmd should process foreign VGs with active LVs */
unsigned vg_read_print_access_error:1; /* print access errors from vg_read */
unsigned force_access_clustered:1;
unsigned lockd_gl_disable:1;
unsigned lockd_vg_disable:1;
unsigned lockd_lv_disable:1;

View File

@@ -345,6 +345,19 @@ cfg(devices_sysfs_scan_CFG, "sysfs_scan", devices_CFG_SECTION, 0, CFG_TYPE_BOOL,
"This is a quick way of filtering out block devices that are not\n"
"present on the system. sysfs must be part of the kernel and mounted.)\n")
cfg(devices_scan_lvs_CFG, "scan_lvs", devices_CFG_SECTION, 0, CFG_TYPE_BOOL, DEFAULT_SCAN_LVS, vsn(2, 2, 182), NULL, 0, NULL,
"Scan LVM LVs for layered PVs, allowing LVs to be used as PVs.\n"
"When 1, LVM will detect PVs layered on LVs, and caution must be\n"
"taken to avoid a host accessing a layered VG that may not belong\n"
"to it, e.g. from a guest image. This generally requires excluding\n"
"the LVs with device filters. Also, when this setting is enabled,\n"
"every LVM command will scan every active LV on the system (unless\n"
"filtered), which can cause performance problems on systems with\n"
"many active LVs. When this setting is 0, LVM will not detect or\n"
"use PVs that exist on LVs, and will not allow a PV to be created on\n"
"an LV. The LVs are ignored using a built in device filter that\n"
"identifies and excludes LVs.\n")
cfg(devices_multipath_component_detection_CFG, "multipath_component_detection", devices_CFG_SECTION, 0, CFG_TYPE_BOOL, DEFAULT_MULTIPATH_COMPONENT_DETECTION, vsn(2, 2, 89), NULL, 0, NULL,
"Ignore devices that are components of DM multipath devices.\n")
@@ -935,6 +948,9 @@ cfg(global_lvdisplay_shows_full_device_path_CFG, "lvdisplay_shows_full_device_pa
"Previously this was always shown as /dev/vgname/lvname even when that\n"
"was never a valid path in the /dev filesystem.\n")
cfg(global_use_aio_CFG, "use_aio", global_CFG_SECTION, CFG_DEFAULT_COMMENTED, CFG_TYPE_BOOL, DEFAULT_USE_AIO, vsn(2, 2, 183), NULL, 0, NULL,
"Use async I/O when reading and writing devices.\n")
cfg(global_use_lvmetad_CFG, "use_lvmetad", global_CFG_SECTION, 0, CFG_TYPE_BOOL, DEFAULT_USE_LVMETAD, vsn(2, 2, 93), "@DEFAULT_USE_LVMETAD@", 0, NULL,
"Use lvmetad to cache metadata and reduce disk scanning.\n"
"When enabled (and running), lvmetad provides LVM commands with VG\n"
@@ -1123,6 +1139,14 @@ cfg(global_notify_dbus_CFG, "notify_dbus", global_CFG_SECTION, 0, CFG_TYPE_BOOL,
"When enabled, an LVM command that changes PVs, changes VG metadata,\n"
"or changes the activation state of an LV will send a notification.\n")
cfg(global_io_memory_size_CFG, "io_memory_size", global_CFG_SECTION, CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, DEFAULT_IO_MEMORY_SIZE_KB, vsn(2, 2, 184), NULL, 0, NULL,
"The amount of memory in KiB that LVM allocates to perform disk io.\n"
"LVM performance may benefit from more io memory when there are many\n"
"disks or VG metadata is large. Increasing this size may be necessary\n"
"when a single copy of VG metadata is larger than the current setting.\n"
"This value should usually not be decreased from the default; setting\n"
"it too low can result in lvm failing to read VGs.\n")
cfg(activation_udev_sync_CFG, "udev_sync", activation_CFG_SECTION, 0, CFG_TYPE_BOOL, DEFAULT_UDEV_SYNC, vsn(2, 2, 51), NULL, 0, NULL,
"Use udev notifications to synchronize udev and LVM.\n"
"The --nodevsync option overrides this setting.\n"

View File

@@ -59,6 +59,7 @@
#define DEFAULT_METADATA_READ_ONLY 0
#define DEFAULT_LVDISPLAY_SHOWS_FULL_DEVICE_PATH 0
#define DEFAULT_UNKNOWN_DEVICE_NAME "[unknown]"
#define DEFAULT_USE_AIO 1
#define DEFAULT_SANLOCK_LV_EXTEND_MB 256
@@ -266,4 +267,8 @@
#define DEFAULT_THIN_POOL_AUTOEXTEND_THRESHOLD 100
#define DEFAULT_THIN_POOL_AUTOEXTEND_PERCENT 20
#define DEFAULT_SCAN_LVS 0
#define DEFAULT_IO_MEMORY_SIZE_KB 8192
#endif /* _LVM_DEFAULTS_H */

View File

@@ -156,6 +156,10 @@ static void _async_destroy(struct io_engine *ioe)
dm_free(e);
}
static int _last_byte_fd;
static uint64_t _last_byte_offset;
static int _last_byte_sector_size;
static bool _async_issue(struct io_engine *ioe, enum dir d, int fd,
sector_t sb, sector_t se, void *data, void *context)
{
@@ -163,12 +167,53 @@ static bool _async_issue(struct io_engine *ioe, enum dir d, int fd,
struct iocb *cb_array[1];
struct control_block *cb;
struct async_engine *e = _to_async(ioe);
sector_t offset;
sector_t nbytes;
sector_t limit_nbytes;
sector_t extra_nbytes = 0;
if (((uintptr_t) data) & e->page_mask) {
log_warn("misaligned data buffer");
return false;
}
offset = sb << SECTOR_SHIFT;
nbytes = (se - sb) << SECTOR_SHIFT;
/*
* If bcache block goes past where lvm wants to write, then clamp it.
*/
if ((d == DIR_WRITE) && _last_byte_offset && (fd == _last_byte_fd)) {
if (offset > _last_byte_offset) {
log_error("Limit write at %llu len %llu beyond last byte %llu",
(unsigned long long)offset,
(unsigned long long)nbytes,
(unsigned long long)_last_byte_offset);
return false;
}
if (offset + nbytes > _last_byte_offset) {
limit_nbytes = _last_byte_offset - offset;
if (limit_nbytes % _last_byte_sector_size)
extra_nbytes = _last_byte_sector_size - (limit_nbytes % _last_byte_sector_size);
if (extra_nbytes) {
log_debug("Limit write at %llu len %llu to len %llu rounded to %llu",
(unsigned long long)offset,
(unsigned long long)nbytes,
(unsigned long long)limit_nbytes,
(unsigned long long)(limit_nbytes + extra_nbytes));
nbytes = limit_nbytes + extra_nbytes;
} else {
log_debug("Limit write at %llu len %llu to len %llu",
(unsigned long long)offset,
(unsigned long long)nbytes,
(unsigned long long)limit_nbytes);
nbytes = limit_nbytes;
}
}
}
cb = _cb_alloc(e->cbs, context);
if (!cb) {
log_warn("couldn't allocate control block");
@@ -179,17 +224,28 @@ static bool _async_issue(struct io_engine *ioe, enum dir d, int fd,
cb->cb.aio_fildes = (int) fd;
cb->cb.u.c.buf = data;
cb->cb.u.c.offset = sb << SECTOR_SHIFT;
cb->cb.u.c.nbytes = (se - sb) << SECTOR_SHIFT;
cb->cb.u.c.offset = offset;
cb->cb.u.c.nbytes = nbytes;
cb->cb.aio_lio_opcode = (d == DIR_READ) ? IO_CMD_PREAD : IO_CMD_PWRITE;
#if 0
if (d == DIR_READ) {
log_debug("io R off %llu bytes %llu",
(unsigned long long)cb->cb.u.c.offset,
(unsigned long long)cb->cb.u.c.nbytes);
} else {
log_debug("io W off %llu bytes %llu",
(unsigned long long)cb->cb.u.c.offset,
(unsigned long long)cb->cb.u.c.nbytes);
}
#endif
cb_array[0] = &cb->cb;
do {
r = io_submit(e->aio_context, 1, cb_array);
} while (r == -EAGAIN);
if (r < 0) {
log_sys_warn("io_submit");
_cb_free(e->cbs, cb);
return false;
}
@@ -197,7 +253,15 @@ static bool _async_issue(struct io_engine *ioe, enum dir d, int fd,
return true;
}
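The clamping added to _async_issue() (and mirrored in _sync_issue() further down) trims a bcache block write so it cannot run past the last byte lvm owns, then rounds the trimmed length back up to a whole sector, since a sub-sector write is impossible. The arithmetic in isolation:

#include <stdio.h>
#include <stdint.h>

/* Clamp a write of [offset, offset + nbytes) so it does not run past
 * last_byte, then round the clamped length up to a whole sector (the
 * final partial sector is still written whole).
 * Returns the adjusted length, 0 if the write starts beyond last_byte. */
static uint64_t clamp_write(uint64_t offset, uint64_t nbytes,
			    uint64_t last_byte, uint64_t sector_size)
{
	uint64_t limit, extra = 0;

	if (offset > last_byte)
		return 0;
	if (offset + nbytes <= last_byte)
		return nbytes;		/* nothing to clamp */

	limit = last_byte - offset;
	if (limit % sector_size)
		extra = sector_size - (limit % sector_size);
	return limit + extra;
}

int main(void)
{
	/* 4096-byte block write at offset 4096; lvm's data ends at byte 6000 */
	printf("%llu\n", (unsigned long long)
	       clamp_write(4096, 4096, 6000, 512));	/* prints 2048 */
	return 0;
}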
#define MAX_IO 1024
/*
* MAX_IO is returned to the layer above via bcache_max_prefetches() which
* tells the caller how many devices to submit io for concurrently. There will
* be an open file descriptor for each of these, so keep it low enough to avoid
* reaching the default max open file limit (1024) when there are over 1024
* devices being scanned.
*/
#define MAX_IO 256
#define MAX_EVENT 64
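The new comment ties MAX_IO to the default open-file limit. A quick standalone check of that limit on a given system (plain getrlimit, nothing lvm-specific):

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_NOFILE, &rl))
		return 1;
	/* the soft limit is what concurrent scanning has to stay under */
	printf("open files: soft %llu hard %llu\n",
	       (unsigned long long)rl.rlim_cur,
	       (unsigned long long)rl.rlim_max);
	return 0;
}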
static bool _async_wait(struct io_engine *ioe, io_complete_fn fn)
@@ -264,7 +328,7 @@ struct io_engine *create_async_io_engine(void)
e->aio_context = 0;
r = io_setup(MAX_IO, &e->aio_context);
if (r < 0) {
log_warn("io_setup failed");
log_debug("io_setup failed %d", r);
dm_free(e);
return NULL;
}
@@ -307,8 +371,11 @@ static void _sync_destroy(struct io_engine *ioe)
static bool _sync_issue(struct io_engine *ioe, enum dir d, int fd,
sector_t sb, sector_t se, void *data, void *context)
{
int r;
uint64_t len = (se - sb) * 512, where;
int rv;
off_t off;
uint64_t where;
uint64_t pos = 0;
uint64_t len = (se - sb) * 512;
struct sync_engine *e = _to_sync(ioe);
struct sync_io *io = malloc(sizeof(*io));
if (!io) {
@@ -317,32 +384,99 @@ static bool _sync_issue(struct io_engine *ioe, enum dir d, int fd,
}
where = sb * 512;
r = lseek(fd, where, SEEK_SET);
if (r < 0) {
log_warn("unable to seek to position %llu", (unsigned long long) where);
return false;
off = lseek(fd, where, SEEK_SET);
if (off == (off_t) -1) {
log_warn("Device seek error %d for offset %llu", errno, (unsigned long long)where);
free(io);
return false;
}
if (off != (off_t) where) {
log_warn("Device seek failed for offset %llu", (unsigned long long)where);
free(io);
return false;
}
while (len) {
do {
if (d == DIR_READ)
r = read(fd, data, len);
else
r = write(fd, data, len);
/*
* If bcache block goes past where lvm wants to write, then clamp it.
*/
if ((d == DIR_WRITE) && _last_byte_offset && (fd == _last_byte_fd)) {
uint64_t offset = where;
uint64_t nbytes = len;
sector_t limit_nbytes = 0;
sector_t extra_nbytes = 0;
} while ((r < 0) && ((r == EINTR) || (r == EAGAIN)));
if (offset > _last_byte_offset) {
log_error("Limit write at %llu len %llu beyond last byte %llu",
(unsigned long long)offset,
(unsigned long long)nbytes,
(unsigned long long)_last_byte_offset);
return false;
}
if (r < 0) {
log_warn("io failed %d", r);
if (offset + nbytes > _last_byte_offset) {
limit_nbytes = _last_byte_offset - offset;
if (limit_nbytes % _last_byte_sector_size)
extra_nbytes = _last_byte_sector_size - (limit_nbytes % _last_byte_sector_size);
if (extra_nbytes) {
log_debug("Limit write at %llu len %llu to len %llu rounded to %llu",
(unsigned long long)offset,
(unsigned long long)nbytes,
(unsigned long long)limit_nbytes,
(unsigned long long)(limit_nbytes + extra_nbytes));
nbytes = limit_nbytes + extra_nbytes;
} else {
log_debug("Limit write at %llu len %llu to len %llu",
(unsigned long long)offset,
(unsigned long long)nbytes,
(unsigned long long)limit_nbytes);
nbytes = limit_nbytes;
}
}
where = offset;
len = nbytes;
}
while (pos < len) {
if (d == DIR_READ)
rv = read(fd, (char *)data + pos, len - pos);
else
rv = write(fd, (char *)data + pos, len - pos);
if (rv == -1 && errno == EINTR)
continue;
if (rv == -1 && errno == EAGAIN)
continue;
if (!rv)
break;
if (rv < 0) {
if (d == DIR_READ)
log_debug("Device read error %d offset %llu len %llu", errno,
(unsigned long long)(where + pos),
(unsigned long long)(len - pos));
else
log_debug("Device write error %d offset %llu len %llu", errno,
(unsigned long long)(where + pos),
(unsigned long long)(len - pos));
free(io);
return false;
}
len -= r;
}
pos += rv;
}
if (len) {
log_warn("short io %u bytes remaining", (unsigned) len);
if (pos < len) {
if (d == DIR_READ)
log_warn("Device read short %u bytes remaining", (unsigned)(len - pos));
else
log_warn("Device write short %u bytes remaining", (unsigned)(len - pos));
/*
free(io);
return false;
*/
}
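The rewritten loop is the classic full-transfer pattern: retry on EINTR/EAGAIN, advance a position on short transfers, and stop on zero progress or a real error (the old code compared the return value against errno constants, which could never match). The same pattern as a reusable helper; a generic sketch, not lvm's code:

#include <errno.h>
#include <unistd.h>

/* Write the whole buffer, tolerating signals and short writes.
 * Returns the number of bytes written; less than len means an
 * error or zero progress. */
static ssize_t full_write(int fd, const char *buf, size_t len)
{
	size_t pos = 0;
	ssize_t rv;

	while (pos < len) {
		rv = write(fd, buf + pos, len - pos);
		if (rv == -1 && (errno == EINTR || errno == EAGAIN))
			continue;	/* interrupted: retry */
		if (rv <= 0)
			break;		/* error or no progress */
		pos += rv;		/* short write: advance and retry */
	}
	return pos;
}

int main(void)
{
	const char msg[] = "hello\n";

	return full_write(1, msg, sizeof(msg) - 1) ==
	       (ssize_t)(sizeof(msg) - 1) ? 0 : 1;
}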
@@ -557,11 +691,13 @@ static bool _init_free_list(struct bcache *cache, unsigned count, unsigned pgsiz
if (!data)
return false;
cache->raw_data = data;
cache->raw_blocks = dm_malloc(count * sizeof(*cache->raw_blocks));
if (!cache->raw_blocks) {
free(data);
return false;
}
if (!cache->raw_blocks)
dm_free(cache->raw_data);
cache->raw_data = data;
for (i = 0; i < count; i++) {
struct block *b = cache->raw_blocks + i;
@@ -646,7 +782,6 @@ static void _complete_io(void *context, int err)
dm_list_del(&b->list);
if (b->error) {
log_warn("bcache io error %d fd %d", b->error, b->fd);
dm_list_add(&cache->errored, &b->list);
} else {
@@ -1142,3 +1277,21 @@ bool bcache_invalidate_fd(struct bcache *cache, int fd)
//----------------------------------------------------------------
void bcache_set_last_byte(struct bcache *cache, int fd, uint64_t offset, int sector_size)
{
_last_byte_fd = fd;
_last_byte_offset = offset;
_last_byte_sector_size = sector_size;
if (!sector_size)
_last_byte_sector_size = 512;
}
void bcache_unset_last_byte(struct bcache *cache, int fd)
{
if (_last_byte_fd == fd) {
_last_byte_fd = 0;
_last_byte_offset = 0;
_last_byte_sector_size = 0;
}
}

View File

@@ -158,6 +158,9 @@ bool bcache_write_bytes(struct bcache *cache, int fd, uint64_t start, size_t len
bool bcache_zero_bytes(struct bcache *cache, int fd, uint64_t start, size_t len);
bool bcache_set_bytes(struct bcache *cache, int fd, uint64_t start, size_t len, uint8_t val);
void bcache_set_last_byte(struct bcache *cache, int fd, uint64_t offset, int sector_size);
void bcache_unset_last_byte(struct bcache *cache, int fd);
//----------------------------------------------------------------
#endif

View File

@@ -480,7 +480,7 @@ static struct device *_get_device_for_sysfs_dev_name_using_devno(const char *dev
return NULL;
}
devno = MKDEV((dev_t)major, (dev_t)minor);
devno = MKDEV(major, minor);
if (!(dev = (struct device *) btree_lookup(_cache.devices, (uint32_t) devno))) {
/*
* If we get here, it means the device is referenced in sysfs, but it's not yet in /dev.
@@ -667,10 +667,9 @@ struct dm_list *dev_cache_get_dev_list_for_lvid(const char *lvid)
void dev_cache_failed_path(struct device *dev, const char *path)
{
struct device *dev_by_path;
struct dm_str_list *strl;
if ((dev_by_path = (struct device *) dm_hash_lookup(_cache.names, path)))
if (dm_hash_lookup(_cache.names, path))
dm_hash_remove(_cache.names, path);
dm_list_iterate_items(strl, &dev->aliases) {
@@ -985,7 +984,7 @@ static int _dev_cache_iterate_sysfs_for_index(const char *path)
continue;
}
devno = MKDEV((dev_t)major, (dev_t)minor);
devno = MKDEV(major, minor);
if (!(dev = (struct device *) btree_lookup(_cache.devices, (uint32_t) devno)) &&
!(dev = (struct device *) btree_lookup(_cache.sysfs_only_devices, (uint32_t) devno))) {
if (!dm_device_get_name(major, minor, 1, devname, sizeof(devname)) ||

View File

@@ -149,16 +149,27 @@ static int _io(struct device_area *where, char *buffer, int should_write, dev_io
int dev_get_block_size(struct device *dev, unsigned int *physical_block_size, unsigned int *block_size)
{
const char *name = dev_name(dev);
int needs_open;
int fd = dev->bcache_fd;
int do_close = 0;
int r = 1;
needs_open = (!dev->open_count && (dev->phys_block_size == -1 || dev->block_size == -1));
if ((dev->phys_block_size > 0) && (dev->block_size > 0)) {
*physical_block_size = (unsigned int)dev->phys_block_size;
*block_size = (unsigned int)dev->block_size;
return 1;
}
if (needs_open && !dev_open_readonly(dev))
return_0;
if (fd <= 0) {
if (!dev->open_count) {
if (!dev_open_readonly(dev))
return_0;
do_close = 1;
}
fd = dev_fd(dev);
}
if (dev->block_size == -1) {
if (ioctl(dev_fd(dev), BLKBSZGET, &dev->block_size) < 0) {
if (ioctl(fd, BLKBSZGET, &dev->block_size) < 0) {
log_sys_error("ioctl BLKBSZGET", name);
r = 0;
goto out;
@@ -169,7 +180,7 @@ int dev_get_block_size(struct device *dev, unsigned int *physical_block_size, un
#ifdef BLKPBSZGET
/* BLKPBSZGET is available in kernel >= 2.6.32 only */
if (dev->phys_block_size == -1) {
if (ioctl(dev_fd(dev), BLKPBSZGET, &dev->phys_block_size) < 0) {
if (ioctl(fd, BLKPBSZGET, &dev->phys_block_size) < 0) {
log_sys_error("ioctl BLKPBSZGET", name);
r = 0;
goto out;
@@ -179,7 +190,7 @@ int dev_get_block_size(struct device *dev, unsigned int *physical_block_size, un
#elif defined (BLKSSZGET)
/* if we can't get physical block size, just use logical block size instead */
if (dev->phys_block_size == -1) {
if (ioctl(dev_fd(dev), BLKSSZGET, &dev->phys_block_size) < 0) {
if (ioctl(fd, BLKSSZGET, &dev->phys_block_size) < 0) {
log_sys_error("ioctl BLKSSZGET", name);
r = 0;
goto out;
@@ -197,7 +208,7 @@ int dev_get_block_size(struct device *dev, unsigned int *physical_block_size, un
*physical_block_size = (unsigned int) dev->phys_block_size;
*block_size = (unsigned int) dev->block_size;
out:
if (needs_open && !dev_close_immediate(dev))
if (do_close && !dev_close_immediate(dev))
stack;
return r;
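dev_get_block_size() now prefers the already-open bcache_fd and caches the two ioctl answers. The ioctls themselves can be exercised directly; a standalone sketch (BLKBSZGET is long-standing; BLKPBSZGET needs kernel >= 2.6.32, as the hunk notes):

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

int main(int argc, char **argv)
{
	int fd, bs = 0, pbs = 0;

	if (argc < 2 || (fd = open(argv[1], O_RDONLY)) < 0)
		return 1;
	if (ioctl(fd, BLKBSZGET, &bs))		/* soft block size */
		perror("BLKBSZGET");
#ifdef BLKPBSZGET
	if (ioctl(fd, BLKPBSZGET, &pbs))	/* physical block size */
		perror("BLKPBSZGET");
#endif
	printf("%s: block size %d physical %d\n", argv[1], bs, pbs);
	close(fd);
	return 0;
}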
@@ -367,18 +378,24 @@ static int _dev_get_size_dev(struct device *dev, uint64_t *size)
static int _dev_read_ahead_dev(struct device *dev, uint32_t *read_ahead)
{
long read_ahead_long;
int fd = dev->bcache_fd;
int do_close = 0;
if (dev->read_ahead != -1) {
*read_ahead = (uint32_t) dev->read_ahead;
return 1;
}
if (!dev_open_readonly(dev))
return_0;
if (fd <= 0) {
if (!dev_open_readonly(dev))
return_0;
fd = dev_fd(dev);
do_close = 1;
}
if (ioctl(dev->fd, BLKRAGET, &read_ahead_long) < 0) {
if (ioctl(fd, BLKRAGET, &read_ahead_long) < 0) {
log_sys_error("ioctl BLKRAGET", dev_name(dev));
if (!dev_close_immediate(dev))
if (do_close && !dev_close_immediate(dev))
stack;
return 0;
}
@@ -389,8 +406,8 @@ static int _dev_read_ahead_dev(struct device *dev, uint32_t *read_ahead)
log_very_verbose("%s: read_ahead is %u sectors",
dev_name(dev), *read_ahead);
if (!dev_close_immediate(dev))
stack;
if (do_close && !dev_close_immediate(dev))
log_sys_error("close", dev_name(dev));
return 1;
}
@@ -405,9 +422,11 @@ static int _dev_discard_blocks(struct device *dev, uint64_t offset_bytes, uint64
discard_range[0] = offset_bytes;
discard_range[1] = size_bytes;
log_debug_devs("Discarding %" PRIu64 " bytes offset %" PRIu64 " bytes on %s.",
size_bytes, offset_bytes, dev_name(dev));
if (ioctl(dev->fd, BLKDISCARD, &discard_range) < 0) {
log_debug_devs("Discarding %" PRIu64 " bytes offset %" PRIu64 " bytes on %s. %s",
size_bytes, offset_bytes, dev_name(dev),
test_mode() ? " (test mode - suppressed)" : "");
if (!test_mode() && ioctl(dev->fd, BLKDISCARD, &discard_range) < 0) {
log_error("%s: BLKDISCARD ioctl at offset %" PRIu64 " size %" PRIu64 " failed: %s.",
dev_name(dev), offset_bytes, size_bytes, strerror(errno));
if (!dev_close_immediate(dev))
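For reference, the BLKDISCARD ioctl being gated behind test_mode() above takes an offset/length pair in bytes. A minimal standalone caller (destructive: it discards data, so point it only at a scratch device; an illustration, not an lvm tool):

#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

int main(int argc, char **argv)
{
	uint64_t range[2] = { 0, 1024 * 1024 };	/* offset, length in bytes */
	int fd;

	if (argc < 2 || (fd = open(argv[1], O_RDWR)) < 0)
		return 1;
	if (ioctl(fd, BLKDISCARD, &range))	/* discards range[1] bytes */
		perror("BLKDISCARD");
	close(fd);
	return 0;
}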

View File

@@ -142,13 +142,6 @@ static int _native_dev_is_md(struct device *dev, uint64_t *offset_found, int ful
* command if it should do a full check (cmd->use_full_md_check),
* and set it for commands that could possibly write to an md dev
* (pvcreate/vgcreate/vgextend).
*
* For old md versions with magic numbers at the end of devices,
* the md dev components won't be filtered out here when full is 0,
* so they will be scanned, and appear as duplicate PVs in lvmcache.
* The md device itself will be chosen as the primary duplicate,
* and the components are dropped from the list of duplicates,
* i.e. a kind of post-scan filtering.
*/
if (!full) {
sb_offset = 0;
@@ -197,14 +190,24 @@ out:
int dev_is_md(struct device *dev, uint64_t *offset_found, int full)
{
int ret;
/*
* If a non-native device status source is selected, use it
* only if offset_found is not requested, as this
* information is not in the udev db.
*/
if ((dev->ext.src == DEV_EXT_NONE) || offset_found)
return _native_dev_is_md(dev, offset_found, full);
if ((dev->ext.src == DEV_EXT_NONE) || offset_found) {
ret = _native_dev_is_md(dev, offset_found, full);
if (!full) {
if (!ret || (ret == -EAGAIN)) {
if (udev_dev_is_md_component(dev))
return 1;
}
}
return ret;
}
if (dev->ext.src == DEV_EXT_UDEV)
return _udev_dev_is_md(dev);
@@ -414,6 +417,26 @@ unsigned long dev_md_stripe_width(struct dev_types *dt, struct device *dev)
return stripe_width_sectors;
}
int dev_is_md_with_end_superblock(struct dev_types *dt, struct device *dev)
{
char version_string[MD_MAX_SYSFS_SIZE];
const char *attribute = "metadata_version";
if (MAJOR(dev->dev) != dt->md_major)
return 0;
if (_md_sysfs_attribute_scanf(dt, dev, attribute,
"%s", &version_string) != 1)
return -1;
log_very_verbose("Device %s %s is %s.",
dev_name(dev), attribute, version_string);
if (!strcmp(version_string, "1.0") || !strcmp(version_string, "0.90"))
return 1;
return 0;
}
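dev_is_md_with_end_superblock() reduces to a single sysfs read. A standalone equivalent (the md0 path is illustrative; lvm builds the sysfs path from the device's major:minor):

#include <stdio.h>
#include <string.h>

int main(void)
{
	char ver[32] = "";
	FILE *fp = fopen("/sys/block/md0/md/metadata_version", "r");

	if (!fp || fscanf(fp, "%31s", ver) != 1) {
		if (fp)
			fclose(fp);
		return 1;
	}
	fclose(fp);
	/* 0.90 and 1.0 place the superblock at the end of the device */
	printf("md0: metadata %s, end superblock: %s\n", ver,
	       (!strcmp(ver, "1.0") || !strcmp(ver, "0.90")) ? "yes" : "no");
	return 0;
}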
#else
int dev_is_md(struct device *dev __attribute__((unused)),

View File

@@ -505,7 +505,7 @@ int dev_get_primary_dev(struct dev_types *dt, struct device *dev, dev_t *result)
*/
if ((parts = dt->dev_type_array[major].max_partitions) > 1) {
if ((residue = minor % parts)) {
*result = MKDEV((dev_t)major, (dev_t)(minor - residue));
*result = MKDEV(major, (minor - residue));
ret = 2;
} else {
*result = dev->dev;
@@ -575,7 +575,7 @@ int dev_get_primary_dev(struct dev_types *dt, struct device *dev, dev_t *result)
path, buffer);
goto out;
}
*result = MKDEV((dev_t)major, (dev_t)minor);
*result = MKDEV(major, minor);
ret = 2;
out:
if (fp && fclose(fp))
@@ -1004,25 +1004,23 @@ int dev_is_rotational(struct dev_types *dt, struct device *dev)
* failed already due to timeout in udev - in both cases the
* udev_device_get_is_initialized returns 0.
*/
#define UDEV_DEV_IS_MPATH_COMPONENT_ITERATION_COUNT 100
#define UDEV_DEV_IS_MPATH_COMPONENT_USLEEP 100000
#define UDEV_DEV_IS_COMPONENT_ITERATION_COUNT 100
#define UDEV_DEV_IS_COMPONENT_USLEEP 100000
int udev_dev_is_mpath_component(struct device *dev)
static struct udev_device *_udev_get_dev(struct device *dev)
{
struct udev *udev_context = udev_get_library_context();
struct udev_device *udev_device = NULL;
const char *value;
int initialized = 0;
unsigned i = 0;
int ret = 0;
if (!udev_context) {
log_warn("WARNING: No udev context available to check if device %s is multipath component.", dev_name(dev));
return 0;
return NULL;
}
while (1) {
if (i >= UDEV_DEV_IS_MPATH_COMPONENT_ITERATION_COUNT)
if (i >= UDEV_DEV_IS_COMPONENT_ITERATION_COUNT)
break;
if (udev_device)
@@ -1030,7 +1028,7 @@ int udev_dev_is_mpath_component(struct device *dev)
if (!(udev_device = udev_device_new_from_devnum(udev_context, 'b', dev->dev))) {
log_warn("WARNING: Failed to get udev device handler for device %s.", dev_name(dev));
return 0;
return NULL;
}
#ifdef HAVE_LIBUDEV_UDEV_DEVICE_GET_IS_INITIALIZED
@@ -1042,19 +1040,35 @@ int udev_dev_is_mpath_component(struct device *dev)
#endif
log_debug("Device %s not initialized in udev database (%u/%u, %u microseconds).", dev_name(dev),
i + 1, UDEV_DEV_IS_MPATH_COMPONENT_ITERATION_COUNT,
i * UDEV_DEV_IS_MPATH_COMPONENT_USLEEP);
i + 1, UDEV_DEV_IS_COMPONENT_ITERATION_COUNT,
i * UDEV_DEV_IS_COMPONENT_USLEEP);
usleep(UDEV_DEV_IS_MPATH_COMPONENT_USLEEP);
usleep(UDEV_DEV_IS_COMPONENT_USLEEP);
i++;
}
if (!initialized) {
log_warn("WARNING: Device %s not initialized in udev database even after waiting %u microseconds.",
dev_name(dev), i * UDEV_DEV_IS_MPATH_COMPONENT_USLEEP);
dev_name(dev), i * UDEV_DEV_IS_COMPONENT_USLEEP);
goto out;
}
out:
return udev_device;
}
int udev_dev_is_mpath_component(struct device *dev)
{
struct udev_device *udev_device;
const char *value;
int ret = 0;
if (!obtain_device_list_from_udev())
return 0;
if (!(udev_device = _udev_get_dev(dev)))
return 0;
value = udev_device_get_property_value(udev_device, DEV_EXT_UDEV_BLKID_TYPE);
if (value && !strcmp(value, DEV_EXT_UDEV_BLKID_TYPE_MPATH)) {
log_debug("Device %s is multipath component based on blkid variable in udev db (%s=\"%s\").",
@@ -1074,6 +1088,31 @@ out:
udev_device_unref(udev_device);
return ret;
}
int udev_dev_is_md_component(struct device *dev)
{
struct udev_device *udev_device;
const char *value;
int ret = 0;
if (!obtain_device_list_from_udev())
return 0;
if (!(udev_device = _udev_get_dev(dev)))
return 0;
value = udev_device_get_property_value(udev_device, DEV_EXT_UDEV_BLKID_TYPE);
if (value && !strcmp(value, DEV_EXT_UDEV_BLKID_TYPE_SW_RAID)) {
log_debug("Device %s is md raid component based on blkid variable in udev db (%s=\"%s\").",
dev_name(dev), DEV_EXT_UDEV_BLKID_TYPE, value);
ret = 1;
goto out;
}
out:
udev_device_unref(udev_device);
return ret;
}
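Both udev checks read one blkid-supplied property from the udev database. A standalone libudev query of the same property (link with -ludev; "sda" is illustrative, and the literal ID_FS_TYPE / "linux_raid_member" strings are assumed to be what the DEV_EXT_UDEV_BLKID_TYPE* constants in the hunk expand to):

#include <stdio.h>
#include <libudev.h>

int main(void)
{
	struct udev *udev = udev_new();
	struct udev_device *dev;
	const char *type;

	if (!udev)
		return 1;
	if (!(dev = udev_device_new_from_subsystem_sysname(udev, "block", "sda"))) {
		udev_unref(udev);
		return 1;
	}
	/* blkid exports ID_FS_TYPE; "linux_raid_member" marks md components */
	type = udev_device_get_property_value(dev, "ID_FS_TYPE");
	printf("sda ID_FS_TYPE: %s\n", type ? type : "(unset)");
	udev_device_unref(dev);
	udev_unref(udev);
	return 0;
}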
#else
int udev_dev_is_mpath_component(struct device *dev)
@@ -1081,4 +1120,9 @@ int udev_dev_is_mpath_component(struct device *dev)
return 0;
}
int udev_dev_is_md_component(struct device *dev)
{
return 0;
}
#endif

View File

@@ -26,7 +26,7 @@
#else
# define MAJOR(x) major((x))
# define MINOR(x) minor((x))
# define MKDEV(x,y) makedev((x),(y))
# define MKDEV(x,y) makedev((dev_t)(x),(dev_t)(y))
#endif
#define PARTITION_SCSI_DEVICE (1 << 0)
@@ -62,6 +62,7 @@ int dev_is_swap(struct device *dev, uint64_t *signature, int full);
int dev_is_luks(struct device *dev, uint64_t *signature, int full);
int dasd_is_cdl_formatted(struct device *dev);
int udev_dev_is_mpath_component(struct device *dev);
int udev_dev_is_md_component(struct device *dev);
int dev_is_lvm1(struct device *dev, char *buf, int buflen);
int dev_is_pool(struct device *dev, char *buf, int buflen);
@@ -76,6 +77,7 @@ int wipe_known_signatures(struct cmd_context *cmd, struct device *dev, const cha
/* Type-specific device properties */
unsigned long dev_md_stripe_width(struct dev_types *dt, struct device *dev);
int dev_is_md_with_end_superblock(struct dev_types *dt, struct device *dev);
/* Partitioning */
int major_max_partitions(struct dev_types *dt, int major);

View File

@@ -35,6 +35,7 @@
#define DEV_BCACHE_EXCL 0x00001000 /* bcache_fd should be open EXCL */
#define DEV_FILTER_AFTER_SCAN 0x00002000 /* apply filter after bcache has data */
#define DEV_FILTER_OUT_SCAN 0x00004000 /* filtered out during label scan */
#define DEV_BCACHE_WRITE 0x00008000 /* bcache_fd is open with RDWR */
/*
* Support for external device info.

View File

@@ -16,6 +16,9 @@
#include "lib.h"
#include "filter.h"
/* See label.c comment about this hack. */
extern int use_full_md_check;
#ifdef __linux__
#define MSG_SKIPPING "%s: Skipping md component device"
@@ -29,43 +32,43 @@
*
* (This is assuming lvm.conf md_component_detection=1.)
*
* If lvm does *not* ignore the components, then lvm will read lvm
* labels from the md dev and from the component devs, and will see
* them all as duplicates of each other. LVM duplicate resolution
* will then kick in and keep the md dev around to use and ignore
* the components.
* If lvm does *not* ignore the components, then lvm may read lvm
* labels from the component devs and potentially the md dev,
* which can trigger duplicate detection, and/or cause lvm to display
* md components as PVs rather than ignoring them.
*
* It is better to exclude the components as early as possible during
* lvm processing, ideally before lvm even looks for labels on the
* components, so that duplicate resolution can be avoided. There are
* a number of ways that md components can be excluded earlier than
* the duplicate resolution phase:
* If scanning md components causes duplicates to be seen, then
* the lvm duplicate resolution will exclude the components.
*
* - When external_device_info_source="udev", lvm discovers a device is
* an md component by asking udev during the initial filtering phase.
* However, lvm's default is to not use udev for this. The
* alternative is "native" detection in which lvm tries to detect
* md components itself.
* The lvm md filter has three modes:
*
* - When using native detection, lvm's md filter looks for the md
* superblock at the start of devices. It will see the md superblock
* on the components, exclude them in the md filter, and avoid
* handling them later in duplicate resolution.
* 1. look for md superblock at the start of the device
* 2. look for md superblock at the start and end of the device
* 3. use udev to detect components
*
* - When using native detection, lvm's md filter will not detect
* components when the md device has an older superblock version that
* places the superblock at the end of the device. This case will
* fall back to duplicate resolution to exclude components.
* mode 1 will not detect and exclude components of md devices
* that use superblock version 0.9 or 1.0, which place the superblock at the end of the device.
*
* A variation of the description above occurs for lvm commands that
* intend to create new PVs on devices (pvcreate, vgcreate, vgextend).
* For these commands, the native md filter also reads the end of all
* devices to check for the odd md superblocks.
* mode 2 will detect these, but mode 2 doubles the i/o done by label
* scan, since there's a read at both the start and end of every device.
*
* (The reason that external_device_info_source is not set to udev by
* default is that there have been issues with udev not being promptly
* or reliably updated about md state changes, causing the udev info
* that lvm uses to be occasionally wrong.)
* mode 3 is used when external_device_info_source="udev". It does
* not require any io from lvm, but this mode is not used by default
* because there have been problems getting reliable info from udev.
*
* lvm uses mode 2 when:
*
* - the command is pvcreate/vgcreate/vgextend, which format new
* devices, and if the user ran these commands on a component
* device of an md device using superblock 0.9 or 1.0, it would cause problems.
* FIXME: this would only really need to scan the end of the
* devices being formatted, not all devices.
*
* - it sees an md device on the system using version 0.9 or 1.0.
* The point of this is just to avoid displaying md components
* from the 'pvs' command.
* FIXME: the cost (double i/o) may not be worth the benefit
* (not showing md components).
*/
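To make the mode selection described above concrete, here is a minimal standalone sketch (hypothetical names, not lvm2 code):

#include <stdbool.h>
#include <stdio.h>

enum md_check_mode {
	MD_CHECK_START,     /* mode 1: read md superblock at the device start */
	MD_CHECK_START_END, /* mode 2: read start and end (doubles scan i/o) */
	MD_CHECK_UDEV,      /* mode 3: ask udev, no extra i/o from lvm */
};

/* Hypothetical summary of the mode selection described in the comment above. */
static enum md_check_mode choose_mode(bool source_is_udev,
				      bool formats_new_devs,
				      bool saw_end_superblock_md)
{
	if (source_is_udev)
		return MD_CHECK_UDEV;
	if (formats_new_devs || saw_end_superblock_md)
		return MD_CHECK_START_END;
	return MD_CHECK_START;
}

int main(void)
{
	/* pvcreate-style command, native detection, no 0.9/1.0 md devs seen */
	printf("%d\n", choose_mode(false, true, false)); /* prints 1 (mode 2) */
	return 0;
}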
/*
@@ -80,7 +83,7 @@
* that will not pass.
*/
static int _passes_md_filter(struct device *dev, int full)
static int _passes_md_filter(struct dev_filter *f, struct device *dev)
{
int ret;
@@ -91,7 +94,7 @@ static int _passes_md_filter(struct device *dev, int full)
if (!md_filtering())
return 1;
ret = dev_is_md(dev, NULL, full);
ret = dev_is_md(dev, NULL, use_full_md_check);
if (ret == -EAGAIN) {
/* let pass, call again after scan */
@@ -104,6 +107,7 @@ static int _passes_md_filter(struct device *dev, int full)
return 1;
if (ret == 1) {
log_debug_devs("md filter full %d excluding md component %s", use_full_md_check, dev_name(dev));
if (dev->ext.src == DEV_EXT_NONE)
log_debug_devs(MSG_SKIPPING, dev_name(dev));
else
@@ -121,18 +125,6 @@ static int _passes_md_filter(struct device *dev, int full)
return 1;
}
static int _passes_md_filter_lite(struct dev_filter *f __attribute__((unused)),
struct device *dev)
{
return _passes_md_filter(dev, 0);
}
static int _passes_md_filter_full(struct dev_filter *f __attribute__((unused)),
struct device *dev)
{
return _passes_md_filter(dev, 1);
}
static void _destroy(struct dev_filter *f)
{
if (f->use_count)
@@ -150,18 +142,7 @@ struct dev_filter *md_filter_create(struct cmd_context *cmd, struct dev_types *d
return NULL;
}
/*
* FIXME: for commands that want a full md check (pvcreate, vgcreate,
* vgextend), we do an extra read at the end of every device that the
* filter looks at. This isn't necessary; we only need to do the full
* md check on the PVs that these commands are trying to use.
*/
if (cmd->use_full_md_check)
f->passes_filter = _passes_md_filter_full;
else
f->passes_filter = _passes_md_filter_lite;
f->passes_filter = _passes_md_filter;
f->destroy = _destroy;
f->use_count = 0;
f->private = dt;

View File

@@ -50,12 +50,15 @@ struct pfilter {
* by default. The old code for it should be removed.
*/
static char* _good_device = "good";
static char* _bad_device = "bad";
/*
* The hash table holds one of these two states
* against each entry.
*/
#define PF_BAD_DEVICE ((void *) 1)
#define PF_GOOD_DEVICE ((void *) 2)
#define PF_BAD_DEVICE ((void *) &_bad_device)
#define PF_GOOD_DEVICE ((void *) &_good_device)
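The point of taking the addresses of two static strings rather than casting small integers: a hash table that stores void * needs two values that are guaranteed to be distinct, valid pointers. A minimal standalone illustration (the names here are not the lvm2 ones):

#include <assert.h>

static char *_good = "good";
static char *_bad = "bad";
#define STATE_GOOD ((void *) &_good)
#define STATE_BAD  ((void *) &_bad)

int main(void)
{
	/* Addresses of two distinct static objects are guaranteed unique,
	 * valid pointers, unlike integer constants cast to void *. */
	assert(STATE_GOOD != STATE_BAD);
	return 0;
}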
static int _init_hash(struct pfilter *pf)
{

View File

@@ -20,6 +20,11 @@
#include "dev-ext-udev-constants.h"
#endif
struct filter_data {
filter_mode_t mode;
int skip_lvs;
};
static const char *_too_small_to_hold_pv_msg = "Too small to hold a PV";
static int _native_check_pv_min_size(struct device *dev)
@@ -101,7 +106,9 @@ static int _check_pv_min_size(struct device *dev)
static int _passes_usable_filter(struct dev_filter *f, struct device *dev)
{
filter_mode_t mode = *((filter_mode_t *) f->private);
struct filter_data *data = f->private;
filter_mode_t mode = data->mode;
int skip_lvs = data->skip_lvs;
struct dev_usable_check_params ucp = {0};
int r = 1;
@@ -114,6 +121,7 @@ static int _passes_usable_filter(struct dev_filter *f, struct device *dev)
ucp.check_suspended = ignore_suspended_devices();
ucp.check_error_target = 1;
ucp.check_reserved = 1;
ucp.check_lv = skip_lvs;
break;
case FILTER_MODE_PRE_LVMETAD:
ucp.check_empty = 1;
@@ -121,6 +129,7 @@ static int _passes_usable_filter(struct dev_filter *f, struct device *dev)
ucp.check_suspended = 0;
ucp.check_error_target = 1;
ucp.check_reserved = 1;
ucp.check_lv = skip_lvs;
break;
case FILTER_MODE_POST_LVMETAD:
ucp.check_empty = 0;
@@ -128,6 +137,7 @@ static int _passes_usable_filter(struct dev_filter *f, struct device *dev)
ucp.check_suspended = ignore_suspended_devices();
ucp.check_error_target = 0;
ucp.check_reserved = 0;
ucp.check_lv = skip_lvs;
break;
}
@@ -161,8 +171,9 @@ static void _usable_filter_destroy(struct dev_filter *f)
dm_free(f);
}
struct dev_filter *usable_filter_create(struct dev_types *dt __attribute__((unused)), filter_mode_t mode)
struct dev_filter *usable_filter_create(struct cmd_context *cmd, struct dev_types *dt __attribute__((unused)), filter_mode_t mode)
{
struct filter_data *data;
struct dev_filter *f;
if (!(f = dm_zalloc(sizeof(struct dev_filter)))) {
@@ -173,14 +184,20 @@ struct dev_filter *usable_filter_create(struct dev_types *dt __attribute__((unus
f->passes_filter = _passes_usable_filter;
f->destroy = _usable_filter_destroy;
f->use_count = 0;
if (!(f->private = dm_zalloc(sizeof(filter_mode_t)))) {
if (!(data = dm_zalloc(sizeof(struct filter_data)))) {
log_error("Usable device filter mode allocation failed");
dm_free(f);
return NULL;
}
*((filter_mode_t *) f->private) = mode;
log_debug_devs("Usable device filter initialised.");
data->mode = mode;
data->skip_lvs = !find_config_tree_bool(cmd, devices_scan_lvs_CFG, NULL);
f->private = data;
log_debug_devs("Usable device filter initialised (scan_lvs %d).", !data->skip_lvs);
return f;
}
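The change above replaces a single-enum private pointer with a struct so the filter can carry both the mode and the new skip_lvs setting. A toy standalone version of the same pattern (my_filter/my_filter_data are illustrative names, not lvm2 API):

#include <stdio.h>

struct my_filter_data {
	int mode;
	int skip_lvs;
};

struct my_filter {
	int (*passes)(struct my_filter *f, const char *name);
	void *private;
};

static int _passes(struct my_filter *f, const char *name)
{
	/* Private data now carries several settings, not one enum. */
	struct my_filter_data *data = f->private;

	printf("%s: mode %d skip_lvs %d\n", name, data->mode, data->skip_lvs);
	return 1;
}

int main(void)
{
	struct my_filter_data data = { .mode = 0, .skip_lvs = 1 };
	struct my_filter f = { .passes = _passes, .private = &data };

	return !f.passes(&f, "/dev/sda");
}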

View File

@@ -52,7 +52,7 @@ typedef enum {
FILTER_MODE_PRE_LVMETAD,
FILTER_MODE_POST_LVMETAD
} filter_mode_t;
struct dev_filter *usable_filter_create(struct dev_types *dt, filter_mode_t mode);
struct dev_filter *usable_filter_create(struct cmd_context *cmd, struct dev_types *dt, filter_mode_t mode);
int persistent_filter_load(struct dev_filter *f, struct dm_config_tree **cft_out);

View File

@@ -400,10 +400,14 @@ static int _raw_write_mda_header(const struct format_type *fmt,
MDA_HEADER_SIZE -
sizeof(mdah->checksum_xl)));
dev_set_last_byte(dev, start_byte + MDA_HEADER_SIZE);
if (!dev_write_bytes(dev, start_byte, MDA_HEADER_SIZE, mdah)) {
dev_unset_last_byte(dev);
log_error("Failed to write mda header to %s fd %d", dev_name(dev), dev->bcache_fd);
return 0;
}
dev_unset_last_byte(dev);
return 1;
}
@@ -677,10 +681,13 @@ static int _vg_write_raw(struct format_instance *fid, struct volume_group *vg,
(unsigned long long)(mdac->rlocn.size - new_wrap),
(unsigned long long)new_wrap);
dev_set_last_byte(mdac->area.dev, mdac->area.start + mdah->size);
if (!dev_write_bytes(mdac->area.dev, mdac->area.start + mdac->rlocn.offset,
(size_t) (mdac->rlocn.size - new_wrap),
fidtc->raw_metadata_buf)) {
log_error("Failed to write metadata to %s fd %d", dev_name(mdac->area.dev), mdac->area.dev->bcache_fd);
dev_unset_last_byte(mdac->area.dev);
goto out;
}
@@ -694,10 +701,13 @@ static int _vg_write_raw(struct format_instance *fid, struct volume_group *vg,
(size_t) new_wrap,
fidtc->raw_metadata_buf + mdac->rlocn.size - new_wrap)) {
log_error("Failed to write metadata wrap to %s fd %d", dev_name(mdac->area.dev), mdac->area.dev->bcache_fd);
dev_unset_last_byte(mdac->area.dev);
goto out;
}
}
dev_unset_last_byte(mdac->area.dev);
mdac->rlocn.checksum = calc_crc(INITIAL_CRC, (uint8_t *)fidtc->raw_metadata_buf,
(uint32_t) (mdac->rlocn.size -
new_wrap));
@@ -1284,6 +1294,10 @@ int read_metadata_location_summary(const struct format_type *fmt,
*/
vgsummary->mda_checksum = rlocn->checksum;
vgsummary->mda_size = rlocn->size;
/* Keep track of largest metadata size we find. */
lvmcache_save_metadata_size(rlocn->size);
lvmcache_lookup_mda(vgsummary);
if (!text_read_metadata_summary(fmt, dev_area->dev, MDA_CONTENT_REASON(primary_mda),

View File

@@ -21,12 +21,16 @@
#include "bcache.h"
#include "toolcontext.h"
#include "activate.h"
#include "metadata.h"
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/time.h>
int use_full_md_check;
static uint64_t _current_bcache_size_bytes;
/* FIXME Allow for larger labels? Restricted to single sector currently */
@@ -172,6 +176,7 @@ int label_write(struct device *dev, struct label *label)
{
char buf[LABEL_SIZE] __attribute__((aligned(8)));
struct label_header *lh = (struct label_header *) buf;
uint64_t offset;
int r = 1;
if (!label->labeller->ops->write) {
@@ -206,11 +211,17 @@ int label_write(struct device *dev, struct label *label)
return 0;
}
if (!dev_write_bytes(dev, label->sector << SECTOR_SHIFT, LABEL_SIZE, buf)) {
offset = label->sector << SECTOR_SHIFT;
dev_set_last_byte(dev, offset + LABEL_SIZE);
if (!dev_write_bytes(dev, offset, LABEL_SIZE, buf)) {
log_debug_devs("Failed to write label to %s", dev_name(dev));
r = 0;
}
dev_unset_last_byte(dev);
return r;
}
@@ -464,12 +475,24 @@ static int _scan_dev_open(struct device *dev)
name_sl = dm_list_item(name_list, struct dm_str_list);
name = name_sl->str;
flags |= O_RDWR;
flags |= O_DIRECT;
flags |= O_NOATIME;
if (dev->flags & DEV_BCACHE_EXCL)
/*
* FIXME: udev is a train wreck when we open RDWR and close, so we
* need to only use RDWR when we actually need to write, and use
* RDONLY otherwise. Fix, disable or scrap udev nonsense so we can
* just open with RDWR by default.
*/
if (dev->flags & DEV_BCACHE_EXCL) {
flags |= O_EXCL;
flags |= O_RDWR;
} else if (dev->flags & DEV_BCACHE_WRITE) {
flags |= O_RDWR;
} else {
flags |= O_RDONLY;
}
retry_open:
@@ -752,33 +775,33 @@ out:
}
/*
* How many blocks to set up in bcache? Is 1024 a good max?
* num_devs is the number of devices the caller is going to scan.
* When 0 the caller doesn't know, and we use the default cache size.
* When non-zero, allocate at least num_devs bcache blocks.
* num_devs doesn't really tell us how many bcache blocks we'll use
* because it includes lvm devs and non-lvm devs, and each lvm dev
* will often use a number of bcache blocks.
*
* Currently, we tell bcache to set up N blocks where N
* is the number of devices that are going to be scanned.
* Reasons why this number may not be a good choice:
*
* - there may be a lot of non-lvm devices, which
* would make this number larger than necessary
*
* - each lvm device may use more than one cache
* block if the metadata is large enough or it
* uses more than one metadata area, which
* would make this number smaller than it
* should be for the best performance.
*
* This is even more tricky to estimate when lvmetad
* is used, because it's hard to predict how many
* devs might need to be scanned when using lvmetad.
* This currently just sets up bcache with MIN blocks.
* We don't know ahead of time if we will find some VG metadata
* that is larger than the total size of the bcache, which would
* prevent us from reading/writing the VG since we do not dynamically
* increase the bcache size when we find it's too small. In these
* cases the user would need to set io_memory_size to be larger
* than the max VG metadata size (lvm does not impose any limit on
* the metadata size.)
*/
#define MIN_BCACHE_BLOCKS 32
#define MIN_BCACHE_BLOCKS 32 /* 4MB */
#define MAX_BCACHE_BLOCKS 1024
static int _setup_bcache(int cache_blocks)
static int _setup_bcache(int num_devs)
{
struct io_engine *ioe;
struct io_engine *ioe = NULL;
int iomem_kb = io_memory_size();
int block_size_kb = (BCACHE_BLOCK_SIZE_IN_SECTORS * 512) / 1024;
int cache_blocks;
cache_blocks = iomem_kb / block_size_kb;
if (cache_blocks < MIN_BCACHE_BLOCKS)
cache_blocks = MIN_BCACHE_BLOCKS;
@@ -786,9 +809,20 @@ static int _setup_bcache(int cache_blocks)
if (cache_blocks > MAX_BCACHE_BLOCKS)
cache_blocks = MAX_BCACHE_BLOCKS;
if (!(ioe = create_async_io_engine())) {
log_error("Failed to create bcache io engine.");
return 0;
_current_bcache_size_bytes = cache_blocks * BCACHE_BLOCK_SIZE_IN_SECTORS * 512;
if (use_aio()) {
if (!(ioe = create_async_io_engine())) {
log_warn("Failed to set up async io, using sync io.");
init_use_aio(0);
}
}
if (!ioe) {
if (!(ioe = create_sync_io_engine())) {
log_error("Failed to set up sync io.");
return 0;
}
}
if (!(scan_bcache = bcache_create(BCACHE_BLOCK_SIZE_IN_SECTORS, cache_blocks, ioe))) {
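The sizing logic above derives the block count from io_memory_size and clamps it between MIN and MAX. A standalone restatement with a worked value, assuming 128 KiB bcache blocks (so MIN_BCACHE_BLOCKS 32 = 4 MiB):

#include <stdio.h>

#define MIN_BLOCKS 32
#define MAX_BLOCKS 1024

/* Mirrors the clamping above: io_memory_size (KiB) / block size (KiB). */
static int cache_blocks_for(int iomem_kb, int block_size_kb)
{
	int blocks = iomem_kb / block_size_kb;

	if (blocks < MIN_BLOCKS)
		blocks = MIN_BLOCKS;
	if (blocks > MAX_BLOCKS)
		blocks = MAX_BLOCKS;
	return blocks;
}

int main(void)
{
	/* e.g. io_memory_size = 8192 KiB with 128 KiB blocks -> 64 blocks */
	printf("%d\n", cache_blocks_for(8192, 128));
	return 0;
}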
@@ -810,6 +844,7 @@ int label_scan(struct cmd_context *cmd)
struct dev_iter *iter;
struct device_list *devl, *devl2;
struct device *dev;
uint64_t max_metadata_size_bytes;
log_debug_devs("Finding devices to scan");
@@ -844,6 +879,30 @@ int label_scan(struct cmd_context *cmd)
bcache_invalidate_fd(scan_bcache, dev->bcache_fd);
_scan_dev_close(dev);
}
/*
* When md devices exist that use the old superblock at the
* end of the device, then in order to detect and filter out
* the component devices of those md devs, we need to enable
* the full md filter which scans both the start and the end
* of every device. This doubles the amount of scanning i/o,
* which we want to avoid. FIXME: it may not be worth the
* cost of double i/o just to avoid displaying md component
* devs in 'pvs', which is a pretty harmless effect from a
* pretty uncommon situation.
*/
if (dev_is_md_with_end_superblock(cmd->dev_types, dev)) {
cmd->use_full_md_check = 1;
/* This is a hack because 'cmd' is not passed
into the filters so we can't check the flag
in the cmd struct. The master branch has
changed the filters in commit 8eab37593eccb
to accept cmd, but it's a complex change
that I'm trying to avoid in the stable branch. */
use_full_md_check = 1;
}
};
dev_iter_destroy(iter);
@@ -856,6 +915,41 @@ int label_scan(struct cmd_context *cmd)
_scan_list(cmd, cmd->full_filter, &all_devs, NULL);
/*
* Metadata could be larger than total size of bcache, and bcache
* cannot currently be resized during the command. If this is the
* case (or within reach), warn that io_memory_size needs to be
* set larger.
*
* Even if running out of bcache space did not cause a failure during scan, it
* may cause a failure during the next vg_read phase or during vg_write.
*
* If there was an error during scan, we could recreate bcache here
* with a larger size and then restart label_scan. But, this does not
* address the problem of writing new metadata that exceeds the bcache
* size and failing, which would often be hit first, i.e. we'll fail
* to write new metadata exceeding the max size before we have a chance
* to read any metadata with that size, unless we find an existing vg
* that has been previously created with the larger size.
*
* If the largest metadata is within 1MB of the bcache size, then start
* warning.
*/
max_metadata_size_bytes = lvmcache_max_metadata_size();
if (max_metadata_size_bytes + (1024 * 1024) > _current_bcache_size_bytes) {
/* we want bcache to be 1MB larger than the max metadata seen */
uint64_t want_size_kb = (max_metadata_size_bytes / 1024) + 1024;
uint64_t remainder;
if ((remainder = (want_size_kb % 1024)))
want_size_kb = want_size_kb + 1024 - remainder;
log_warn("WARNING: metadata may not be usable with current io_memory_size %d KiB",
io_memory_size());
log_warn("WARNING: increase lvm.conf io_memory_size to at least %llu KiB",
(unsigned long long)want_size_kb);
}
dm_list_iterate_items_safe(devl, devl2, &all_devs) {
dm_list_del(&devl->list);
dm_free(devl);
@@ -864,6 +958,85 @@ int label_scan(struct cmd_context *cmd)
return 1;
}
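The headroom calculation in the warning above asks for 1 MiB more than the largest metadata seen, rounded up to a whole MiB (expressed in KiB). A standalone restatement with a worked value:

#include <stdio.h>
#include <stdint.h>

/* Same rounding as above: 1 MiB headroom, rounded up to a whole MiB. */
static uint64_t want_io_memory_kb(uint64_t max_metadata_size_bytes)
{
	uint64_t want_kb = (max_metadata_size_bytes / 1024) + 1024;
	uint64_t rem = want_kb % 1024;

	if (rem)
		want_kb += 1024 - rem;
	return want_kb;
}

int main(void)
{
	/* e.g. 3.5 MiB of metadata -> 3584 + 1024 = 4608 KiB -> 5120 KiB */
	printf("%llu\n", (unsigned long long)want_io_memory_kb(3670016));
	return 0;
}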
int label_scan_pvscan_all(struct cmd_context *cmd, struct dm_list *scan_devs)
{
struct dm_list all_devs;
struct dev_iter *iter;
struct device_list *devl, *devl2;
struct device *dev;
log_debug_devs("Finding devices to scan");
dm_list_init(&all_devs);
/*
* Iterate through all the devices in dev-cache (block devs that appear
* under /dev that could possibly hold a PV and are not excluded by
* filters). Read each to see if it's an lvm device, and if so
* populate lvmcache with some basic info about the device and the VG
* on it. This info will be used by the vg_read() phase of the
* command.
*/
dev_cache_scan();
if (!(iter = dev_iter_create(cmd->lvmetad_filter, 0))) {
log_error("Scanning failed to get devices.");
return 0;
}
while ((dev = dev_iter_get(iter))) {
if (!(devl = dm_zalloc(sizeof(*devl))))
return 0;
devl->dev = dev;
dm_list_add(&all_devs, &devl->list);
/*
* label_scan should not generally be called a second time,
* so this will usually not be true.
*/
if (_in_bcache(dev)) {
bcache_invalidate_fd(scan_bcache, dev->bcache_fd);
_scan_dev_close(dev);
}
if (dev_is_md_with_end_superblock(cmd->dev_types, dev)) {
cmd->use_full_md_check = 1;
use_full_md_check = 1;
log_debug("Found md component in sysfs with end superblock %s", dev_name(dev));
}
};
dev_iter_destroy(iter);
log_debug_devs("Found %d devices to scan", dm_list_size(&all_devs));
if (!scan_bcache) {
if (!_setup_bcache(dm_list_size(&all_devs)))
return 0;
}
_scan_list(cmd, cmd->lvmetad_filter, &all_devs, NULL);
dm_list_iterate_items_safe(devl, devl2, &all_devs) {
dm_list_del(&devl->list);
/*
* If this device is lvm's, then return it to pvscan
* to do the further pvscan. (We could have _scan_list
* just set a flag in devl indicating the result, but
* instead we're just checking indirectly if _scan_list
* saved lvmcache info for the dev which also means it's
* an lvm device.)
*/
if (lvmcache_has_dev_info(devl->dev))
dm_list_add(scan_devs, &devl->list);
else
dm_free(devl);
}
return 1;
}
/*
* Scan and cache lvm data from the listed devices. If a device is already
* scanned and cached, this replaces the previously cached lvm data for the
@@ -897,6 +1070,28 @@ int label_scan_devs(struct cmd_context *cmd, struct dev_filter *f, struct dm_lis
return 1;
}
int label_scan_devs_rw(struct cmd_context *cmd, struct dev_filter *f, struct dm_list *devs)
{
struct device_list *devl;
int failed = 0;
dm_list_iterate_items(devl, devs) {
if (_in_bcache(devl->dev)) {
bcache_invalidate_fd(scan_bcache, devl->dev->bcache_fd);
_scan_dev_close(devl->dev);
}
/* _scan_dev_open will open(RDWR) when this flag is set */
devl->dev->flags |= DEV_BCACHE_WRITE;
}
_scan_list(cmd, f, devs, &failed);
/* FIXME: this function should probably fail if any devs couldn't be scanned */
return 1;
}
int label_scan_devs_excl(struct dm_list *devs)
{
struct device_list *devl;
@@ -1107,7 +1302,14 @@ int label_scan_open(struct device *dev)
int label_scan_open_excl(struct device *dev)
{
if (_in_bcache(dev) && !(dev->flags & DEV_BCACHE_EXCL)) {
/* FIXME: avoid tossing out bcache blocks just to replace fd. */
log_debug("Close and reopen excl %s", dev_name(dev));
bcache_invalidate_fd(scan_bcache, dev->bcache_fd);
_scan_dev_close(dev);
}
dev->flags |= DEV_BCACHE_EXCL;
dev->flags |= DEV_BCACHE_WRITE;
return label_scan_open(dev);
}
@@ -1122,14 +1324,15 @@ bool dev_read_bytes(struct device *dev, uint64_t start, size_t len, void *data)
if (dev->bcache_fd <= 0) {
/* This is not often needed, perhaps only with lvmetad. */
if (!label_scan_open(dev)) {
log_error("dev_read_bytes %s cannot open dev", dev_name(dev));
log_error("Error opening device %s for reading at %llu length %u.",
dev_name(dev), (unsigned long long)start, (uint32_t)len);
return false;
}
}
if (!bcache_read_bytes(scan_bcache, dev->bcache_fd, start, len, data)) {
log_error("dev_read_bytes %s at %u failed invalidate fd %d",
dev_name(dev), (uint32_t)start, dev->bcache_fd);
log_error("Error reading device %s at %llu length %u.",
dev_name(dev), (unsigned long long)start, (uint32_t)len);
label_scan_invalidate(dev);
return false;
}
@@ -1148,24 +1351,36 @@ bool dev_write_bytes(struct device *dev, uint64_t start, size_t len, void *data)
return false;
}
if (_in_bcache(dev) && !(dev->flags & DEV_BCACHE_WRITE)) {
/* FIXME: avoid tossing out bcache blocks just to replace fd. */
log_debug("Close and reopen to write %s", dev_name(dev));
bcache_invalidate_fd(scan_bcache, dev->bcache_fd);
_scan_dev_close(dev);
dev->flags |= DEV_BCACHE_WRITE;
label_scan_open(dev);
}
if (dev->bcache_fd <= 0) {
/* This is not often needed, perhaps only with lvmetad. */
dev->flags |= DEV_BCACHE_WRITE;
if (!label_scan_open(dev)) {
log_error("dev_write_bytes %s cannot open dev", dev_name(dev));
log_error("Error opening device %s for writing at %llu length %u.",
dev_name(dev), (unsigned long long)start, (uint32_t)len);
return false;
}
}
if (!bcache_write_bytes(scan_bcache, dev->bcache_fd, start, len, data)) {
log_error("dev_write_bytes %s at %u bcache write failed invalidate fd %d",
dev_name(dev), (uint32_t)start, dev->bcache_fd);
log_error("Error writing device %s at %llu length %u.",
dev_name(dev), (unsigned long long)start, (uint32_t)len);
label_scan_invalidate(dev);
return false;
}
if (!bcache_flush(scan_bcache)) {
log_error("dev_write_bytes %s at %u bcache flush failed invalidate fd %d",
dev_name(dev), (uint32_t)start, dev->bcache_fd);
log_error("Error writing device %s at %llu length %u.",
dev_name(dev), (unsigned long long)start, (uint32_t)len);
label_scan_invalidate(dev);
return false;
}
@@ -1182,27 +1397,44 @@ bool dev_write_zeros(struct device *dev, uint64_t start, size_t len)
return false;
}
if (_in_bcache(dev) && !(dev->flags & DEV_BCACHE_WRITE)) {
/* FIXME: avoid tossing out bcache blocks just to replace fd. */
log_debug("Close and reopen to write %s", dev_name(dev));
bcache_invalidate_fd(scan_bcache, dev->bcache_fd);
_scan_dev_close(dev);
dev->flags |= DEV_BCACHE_WRITE;
label_scan_open(dev);
}
if (dev->bcache_fd <= 0) {
/* This is not often needed, perhaps only with lvmetad. */
dev->flags |= DEV_BCACHE_WRITE;
if (!label_scan_open(dev)) {
log_error("dev_write_zeros %s cannot open dev", dev_name(dev));
log_error("Error opening device %s for writing at %llu length %u.",
dev_name(dev), (unsigned long long)start, (uint32_t)len);
return false;
}
}
dev_set_last_byte(dev, start + len);
if (!bcache_zero_bytes(scan_bcache, dev->bcache_fd, start, len)) {
log_error("dev_write_zeros %s at %u bcache write failed invalidate fd %d",
dev_name(dev), (uint32_t)start, dev->bcache_fd);
log_error("Error writing device %s at %llu length %u.",
dev_name(dev), (unsigned long long)start, (uint32_t)len);
dev_unset_last_byte(dev);
label_scan_invalidate(dev);
return false;
}
if (!bcache_flush(scan_bcache)) {
log_error("dev_write_zeros %s at %u bcache flush failed invalidate fd %d",
dev_name(dev), (uint32_t)start, dev->bcache_fd);
log_error("Error writing device %s at %llu length %u.",
dev_name(dev), (unsigned long long)start, (uint32_t)len);
dev_unset_last_byte(dev);
label_scan_invalidate(dev);
return false;
}
dev_unset_last_byte(dev);
return true;
}
@@ -1216,27 +1448,60 @@ bool dev_set_bytes(struct device *dev, uint64_t start, size_t len, uint8_t val)
return false;
}
if (_in_bcache(dev) && !(dev->flags & DEV_BCACHE_WRITE)) {
/* FIXME: avoid tossing out bcache blocks just to replace fd. */
log_debug("Close and reopen to write %s", dev_name(dev));
bcache_invalidate_fd(scan_bcache, dev->bcache_fd);
_scan_dev_close(dev);
dev->flags |= DEV_BCACHE_WRITE;
label_scan_open(dev);
}
if (dev->bcache_fd <= 0) {
/* This is not often needed, perhaps only with lvmetad. */
dev->flags |= DEV_BCACHE_WRITE;
if (!label_scan_open(dev)) {
log_error("dev_set_bytes %s cannot open dev", dev_name(dev));
log_error("Error opening device %s for writing at %llu length %u.",
dev_name(dev), (unsigned long long)start, (uint32_t)len);
return false;
}
}
dev_set_last_byte(dev, start + len);
if (!bcache_set_bytes(scan_bcache, dev->bcache_fd, start, len, val)) {
log_error("dev_set_bytes %s at %u bcache write failed invalidate fd %d",
dev_name(dev), (uint32_t)start, dev->bcache_fd);
log_error("Error writing device %s at %llu length %u.",
dev_name(dev), (unsigned long long)start, (uint32_t)len);
dev_unset_last_byte(dev);
label_scan_invalidate(dev);
return false;
}
if (!bcache_flush(scan_bcache)) {
log_error("dev_set_bytes %s at %u bcache flush failed invalidate fd %d",
dev_name(dev), (uint32_t)start, dev->bcache_fd);
log_error("Error writing device %s at %llu length %u.",
dev_name(dev), (unsigned long long)start, (uint32_t)len);
dev_unset_last_byte(dev);
label_scan_invalidate(dev);
return false;
}
dev_unset_last_byte(dev);
return true;
}
void dev_set_last_byte(struct device *dev, uint64_t offset)
{
unsigned int phys_block_size = 0;
unsigned int block_size = 0;
dev_get_block_size(dev, &phys_block_size, &block_size);
bcache_set_last_byte(scan_bcache, dev->bcache_fd, offset, phys_block_size);
}
void dev_unset_last_byte(struct device *dev)
{
bcache_unset_last_byte(scan_bcache, dev->bcache_fd);
}
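These two helpers bracket every write above: dev_set_last_byte tells bcache where the caller's data ends so the remainder of the final sector can be zero-filled rather than written back with stale cache contents, and dev_unset_last_byte clears the marker. A conceptual standalone sketch of the zero-fill, assuming 512-byte sectors (not the bcache implementation):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Zero from last_byte to the end of its sector within a cached block
 * that starts at block_start. */
static void zero_tail(uint8_t *block, uint64_t block_start,
		      uint64_t last_byte, unsigned sector_size)
{
	uint64_t off = last_byte - block_start;
	uint64_t end = (off + sector_size - 1) / sector_size * sector_size;

	memset(block + off, 0, end - off);
}

int main(void)
{
	uint8_t block[1024];

	memset(block, 0xff, sizeof(block));
	zero_tail(block, 0, 700, 512);             /* zeroes bytes 700..1023 */
	printf("%u %u\n", block[699], block[700]); /* prints 255 0 */
	return 0;
}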

View File

@@ -104,6 +104,7 @@ extern struct bcache *scan_bcache;
int label_scan(struct cmd_context *cmd);
int label_scan_devs(struct cmd_context *cmd, struct dev_filter *f, struct dm_list *devs);
int label_scan_devs_rw(struct cmd_context *cmd, struct dev_filter *f, struct dm_list *devs);
int label_scan_devs_excl(struct dm_list *devs);
void label_scan_invalidate(struct device *dev);
void label_scan_invalidate_lv(struct cmd_context *cmd, struct logical_volume *lv);
@@ -115,6 +116,7 @@ void label_scan_confirm(struct device *dev);
int label_scan_setup_bcache(void);
int label_scan_open(struct device *dev);
int label_scan_open_excl(struct device *dev);
int label_scan_pvscan_all(struct cmd_context *cmd, struct dm_list *scan_devs);
/*
* Wrappers around bcache equivalents.
@@ -124,5 +126,7 @@ bool dev_read_bytes(struct device *dev, uint64_t start, size_t len, void *data);
bool dev_write_bytes(struct device *dev, uint64_t start, size_t len, void *data);
bool dev_write_zeros(struct device *dev, uint64_t start, size_t len);
bool dev_set_bytes(struct device *dev, uint64_t start, size_t len, uint8_t val);
void dev_set_last_byte(struct device *dev, uint64_t offset);
void dev_unset_last_byte(struct device *dev);
#endif

View File

@@ -1009,7 +1009,7 @@ void lockd_free_vg_final(struct cmd_context *cmd, struct volume_group *vg)
* that the VG lockspace being started is new.
*/
int lockd_start_vg(struct cmd_context *cmd, struct volume_group *vg, int start_init)
int lockd_start_vg(struct cmd_context *cmd, struct volume_group *vg, int start_init, int *exists)
{
char uuid[64] __attribute__((aligned(8)));
daemon_reply reply;
@@ -1084,6 +1084,12 @@ int lockd_start_vg(struct cmd_context *cmd, struct volume_group *vg, int start_i
log_debug("VG %s start error: already started", vg->name);
ret = 1;
break;
case -ESTARTING:
log_debug("VG %s start error: already starting", vg->name);
if (exists)
*exists = 1;
ret = 1;
break;
case -EARGS:
log_error("VG %s start failed: invalid parameters for %s", vg->name, vg->lock_type);
break;
@@ -2662,7 +2668,7 @@ int lockd_rename_vg_final(struct cmd_context *cmd, struct volume_group *vg, int
* Depending on the problem that caused the rename to
* fail, it may make sense to not restart the VG here.
*/
if (!lockd_start_vg(cmd, vg, 0))
if (!lockd_start_vg(cmd, vg, 0, NULL))
log_error("Failed to restart VG %s lockspace.", vg->name);
return 1;
}
@@ -2702,7 +2708,7 @@ int lockd_rename_vg_final(struct cmd_context *cmd, struct volume_group *vg, int
}
}
if (!lockd_start_vg(cmd, vg, 1))
if (!lockd_start_vg(cmd, vg, 1, NULL))
log_error("Failed to start VG %s lockspace.", vg->name);
return 1;

View File

@@ -63,7 +63,7 @@ int lockd_rename_vg_final(struct cmd_context *cmd, struct volume_group *vg, int
/* start and stop the lockspace for a vg */
int lockd_start_vg(struct cmd_context *cmd, struct volume_group *vg, int start_init);
int lockd_start_vg(struct cmd_context *cmd, struct volume_group *vg, int start_init, int *exists);
int lockd_stop_vg(struct cmd_context *cmd, struct volume_group *vg);
int lockd_start_wait(struct cmd_context *cmd);
@@ -148,7 +148,7 @@ static inline int lockd_rename_vg_final(struct cmd_context *cmd, struct volume_g
return 1;
}
static inline int lockd_start_vg(struct cmd_context *cmd, struct volume_group *vg, int start_init)
static inline int lockd_start_vg(struct cmd_context *cmd, struct volume_group *vg, int start_init, int *exists)
{
return 0;
}
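The new 'exists' out-parameter lets a caller distinguish "lockspace already starting elsewhere" from a plain failure. A standalone sketch of the contract, using stand-in result codes rather than the real lvmlockd error numbers:

#include <stdio.h>

#define LS_OK        0
#define LS_EXISTS   -1  /* stand-in for "already started" */
#define LS_STARTING -2  /* stand-in for -ESTARTING: start in progress */

/* Mirrors the new contract: an in-progress start reports success and
 * sets *exists so the caller can wait instead of failing. */
static int start_lockspace(int result, int *exists)
{
	switch (result) {
	case LS_OK:
	case LS_EXISTS:
		return 1;
	case LS_STARTING:
		if (exists)
			*exists = 1;
		return 1;
	default:
		return 0;
	}
}

int main(void)
{
	int exists = 0;

	if (start_lockspace(LS_STARTING, &exists) && exists)
		printf("lockspace already starting; wait for it\n");
	return 0;
}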

View File

@@ -843,14 +843,10 @@ int cache_set_metadata_format(struct lv_segment *seg, cache_metadata_format_t fo
/*
* If policy is unselected, but format 2 is selected, policy smq is enforced.
* Currently no policy other than smq is allowed to select format 2.
*/
if (!seg->policy_name) {
if (format == CACHE_METADATA_FORMAT_2)
seg->policy_name = "smq";
} else if (strcmp(seg->policy_name, "smq")) {
seg->cache_metadata_format = CACHE_METADATA_FORMAT_1;
return 1;
}
/* Check if we need to search for configured cache metadata format */

View File

@@ -301,7 +301,8 @@ char *lvseg_monitor_dup(struct dm_pool *mem, const struct lv_segment *seg)
int pending = 0, monitored = 0;
struct lv_segment *segm = (struct lv_segment *) seg;
if (lv_is_cow(seg->lv) && !lv_is_merging_cow(seg->lv))
if (lv_is_cow(seg->lv) && (!lv_is_merging_cow(seg->lv) ||
lv_has_target_type(seg->lv->vg->cmd->mem, seg->lv, NULL, TARGET_NAME_SNAPSHOT)))
segm = first_seg(seg->lv->snapshot->lv);
// log_debug("Query LV:%s mon:%s segm:%s tgtm:%p segmon:%d statusm:%d", seg->lv->name, segm->lv->name, segm->segtype->name, segm->segtype->ops->target_monitored, seg_monitored(segm), (int)(segm->status & PVMOVE));

View File

@@ -2959,12 +2959,16 @@ static int _find_some_parallel_space(struct alloc_handle *ah,
(*(alloc_state->areas + alloc_state->num_positional_areas + ix - 1 -
too_small_for_log_count)).used < ah->log_len)
too_small_for_log_count++;
ix_log_offset = alloc_state->num_positional_areas + ix - too_small_for_log_count - ah->log_area_count;
if (ah->mirror_logs_separate &&
too_small_for_log_count &&
(too_small_for_log_count >= devices_needed))
return 1;
if ((alloc_state->num_positional_areas + ix) < (too_small_for_log_count + ah->log_area_count))
return 1;
ix_log_offset = alloc_state->num_positional_areas + ix - (too_small_for_log_count + ah->log_area_count);
}
if (ix + alloc_state->num_positional_areas < devices_needed +
(alloc_state->log_area_count_still_needed ? alloc_state->log_area_count_still_needed +
too_small_for_log_count : 0))
if (ix + alloc_state->num_positional_areas < devices_needed)
return 1;
/*
@@ -3954,6 +3958,25 @@ bad:
return 0;
}
/* Add all rmeta SubLVs for @seg to @lvs and return allocated @lvl to free by caller. */
static struct lv_list *_raid_list_metalvs(struct lv_segment *seg, struct dm_list *lvs)
{
uint32_t s;
struct lv_list *lvl;
dm_list_init(lvs);
if (!(lvl = dm_pool_alloc(seg->lv->vg->vgmem, sizeof(*lvl) * seg->area_count)))
return_NULL;
for (s = 0; s < seg->area_count; s++) {
lvl[s].lv = seg_metalv(seg, s);
dm_list_add(lvs, &lvl[s].list);
}
return lvl;
}
static int _lv_extend_layered_lv(struct alloc_handle *ah,
struct logical_volume *lv,
uint32_t extents, uint32_t first_area,
@@ -3965,7 +3988,6 @@ static int _lv_extend_layered_lv(struct alloc_handle *ah,
uint32_t fa, s;
int clear_metadata = 0;
uint32_t area_multiple = 1;
int fail;
if (!(segtype = get_segtype_from_string(lv->vg->cmd, SEG_TYPE_NAME_STRIPED)))
return_0;
@@ -4043,74 +4065,50 @@ static int _lv_extend_layered_lv(struct alloc_handle *ah,
return_0;
if (clear_metadata) {
struct volume_group *vg = lv->vg;
/*
* We must clear the metadata areas upon creation.
*/
/* FIXME VG is not in a fully-consistent state here and should not be committed! */
if (!vg_write(lv->vg) || !vg_commit(lv->vg))
return_0;
if (test_mode())
/*
* Declare the new RaidLV as temporary to avoid visible SubLV
* failures on activation until after we have wiped them, so
* that we avoid activating crashed, potentially partially
* wiped RaidLVs.
*/
lv->status |= LV_ACTIVATION_SKIP;
if (test_mode()) {
/* FIXME VG is not in a fully-consistent state here and should not be committed! */
if (!vg_write(vg) || !vg_commit(vg))
return_0;
log_verbose("Test mode: Skipping wiping of metadata areas.");
else {
fail = 0;
/* Activate all rmeta devices locally first (more efficient) */
for (s = 0; !fail && s < seg->area_count; s++) {
meta_lv = seg_metalv(seg, s);
} else {
struct dm_list meta_lvs;
struct lv_list *lvl;
if (!activate_lv_local(meta_lv->vg->cmd, meta_lv)) {
log_error("Failed to activate %s for clearing.",
display_lvname(meta_lv));
fail = 1;
}
}
/* Clear all rmeta devices */
for (s = 0; !fail && s < seg->area_count; s++) {
meta_lv = seg_metalv(seg, s);
log_verbose("Clearing metadata area of %s.",
display_lvname(meta_lv));
/*
* Rather than wiping meta_lv->size, we can simply
* wipe '1' to remove the superblock of any previous
* RAID devices. It is much quicker.
*/
if (!wipe_lv(meta_lv, (struct wipe_params)
{ .do_zero = 1, .zero_sectors = 1 })) {
stack;
fail = 1;
}
}
/* Deactivate all rmeta devices */
for (s = 0; s < seg->area_count; s++) {
meta_lv = seg_metalv(seg, s);
if (!deactivate_lv(meta_lv->vg->cmd, meta_lv)) {
log_error("Failed to deactivate %s after clearing.",
display_lvname(meta_lv));
fail = 1;
}
/* Wipe any temporary tags required for activation. */
str_list_wipe(&meta_lv->tags);
}
if (fail) {
/* Fail, after trying to deactivate all we could */
struct volume_group *vg = lv->vg;
if (!(lvl = _raid_list_metalvs(seg, &meta_lvs)))
return 0;
/* Wipe the LV list, committing metadata */
if (!activate_and_wipe_lvlist(&meta_lvs, 1)) {
/* If we failed clearing rmeta SubLVs, try removing the new RaidLV */
if (!lv_remove(lv))
log_error("Failed to remove LV");
else if (!vg_write(vg) || !vg_commit(vg))
log_error("Failed to commit VG %s", vg->name);
return_0;
}
dm_pool_free(vg->vgmem, lvl);
}
for (s = 0; s < seg->area_count; s++)
lv_set_hidden(seg_metalv(seg, s));
lv->status &= ~LV_ACTIVATION_SKIP;
}
return 1;
@@ -5856,7 +5854,7 @@ static int _add_pvs(struct cmd_context *cmd, struct pv_segment *peg,
if (find_pv_in_pv_list(&spvs->pvs, peg->pv))
return 1;
if (!(pvl = dm_pool_alloc(cmd->mem, sizeof(*pvl)))) {
if (!(pvl = dm_pool_zalloc(cmd->mem, sizeof(*pvl)))) {
log_error("pv_list allocation failed");
return 0;
}
@@ -7196,6 +7194,100 @@ out:
return 1;
}
/*
* Optionally makes on-disk metadata changes when @commit is set.
*
* If LV is active:
* wipe any signatures and clear first sector of LVs listed on @lv_list
* otherwise:
* activate, wipe (as above), deactivate
*
* Returns: 1 on success, 0 on failure
*/
int activate_and_wipe_lvlist(struct dm_list *lv_list, int commit)
{
struct lv_list *lvl;
struct volume_group *vg = NULL;
unsigned i = 0, sz = dm_list_size(lv_list);
char *was_active;
int r = 1;
if (!sz) {
log_debug_metadata(INTERNAL_ERROR "Empty list of LVs given for wiping.");
return 1;
}
dm_list_iterate_items(lvl, lv_list) {
if (!lv_is_visible(lvl->lv)) {
log_error(INTERNAL_ERROR
"LVs must be set visible before wiping.");
return 0;
}
vg = lvl->lv->vg;
}
if (test_mode())
return 1;
/*
* FIXME: only vg_[write|commit] if LVs are not already written
* as visible in the LVM metadata (which is never the case yet).
*/
if (commit &&
(!vg || !vg_write(vg) || !vg_commit(vg)))
return_0;
was_active = alloca(sz);
dm_list_iterate_items(lvl, lv_list)
if (!(was_active[i++] = lv_is_active(lvl->lv))) {
lvl->lv->status |= LV_TEMPORARY;
if (!activate_lv(vg->cmd, lvl->lv)) {
log_error("Failed to activate localy %s for wiping.",
display_lvname(lvl->lv));
r = 0;
goto out;
}
lvl->lv->status &= ~LV_TEMPORARY;
}
dm_list_iterate_items(lvl, lv_list) {
log_verbose("Wiping metadata area %s.", display_lvname(lvl->lv));
/* Wipe any known signatures */
if (!wipe_lv(lvl->lv, (struct wipe_params) { .do_wipe_signatures = 1, .do_zero = 1, .zero_sectors = 1 })) {
log_error("Failed to wipe %s.", display_lvname(lvl->lv));
r = 0;
goto out;
}
}
out:
/* TODO: deactivation is only needed with clustered locking
* in the normal case we should keep the device active
*/
sz = 0;
dm_list_iterate_items(lvl, lv_list)
if ((i > sz) && !was_active[sz++] &&
!deactivate_lv(vg->cmd, lvl->lv)) {
log_error("Failed to deactivate %s.", display_lvname(lvl->lv));
r = 0; /* Continue deactivating as many as possible. */
}
return r;
}
/* Wipe logical volume @lv, optionally with @commit of metadata */
int activate_and_wipe_lv(struct logical_volume *lv, int commit)
{
struct dm_list lv_list;
struct lv_list lvl;
lvl.lv = lv;
dm_list_init(&lv_list);
dm_list_add(&lv_list, &lvl.list);
return activate_and_wipe_lvlist(&lv_list, commit);
}
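The bookkeeping inside activate_and_wipe_lvlist (remember which LVs were already active, activate the rest, wipe everything, then deactivate only what was activated here) reduces to this pattern, shown with toy types (not the lvm2 implementation; list limited to 8 entries for the sketch):

#include <stdio.h>

struct toy_lv { const char *name; int active; };

static void wipe_list(struct toy_lv *lvs, unsigned n)
{
	char was_active[8];
	unsigned i;

	for (i = 0; i < n; i++)
		if (!(was_active[i] = lvs[i].active))
			lvs[i].active = 1;          /* activate for wiping */

	for (i = 0; i < n; i++)
		printf("wiping %s\n", lvs[i].name); /* wipe signatures */

	for (i = 0; i < n; i++)
		if (!was_active[i])
			lvs[i].active = 0;          /* deactivate only what we activated */
}

int main(void)
{
	struct toy_lv lvs[2] = { { "rmeta_0", 0 }, { "rmeta_1", 1 } };

	wipe_list(lvs, 2);
	return 0;
}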
static struct logical_volume *_create_virtual_origin(struct cmd_context *cmd,
struct volume_group *vg,
const char *lv_name,

View File

@@ -234,6 +234,30 @@ static void _check_non_raid_seg_members(struct lv_segment *seg, int *error_count
/* .... more members? */
}
static void _check_raid_sublvs(struct lv_segment *seg, int *error_count)
{
unsigned s;
for (s = 0; s < seg->area_count; s++) {
if (seg_type(seg, s) != AREA_LV)
raid_seg_error("no raid image SubLV");
if ((seg_lv(seg, s)->status & LVM_WRITE) &&
!(seg->lv->status & LV_ACTIVATION_SKIP) &&
lv_is_visible(seg_lv(seg, s)))
raid_seg_error("visible raid image LV");
if (!seg_is_raid_with_meta(seg) || !seg->meta_areas)
continue;
if (seg_metatype(seg, s) != AREA_LV)
raid_seg_error("no raid meta SubLV");
else if (!(seg->lv->status & LV_ACTIVATION_SKIP) &&
lv_is_visible(seg_metalv(seg, s)))
raid_seg_error("visible raid meta LV");
}
}
/*
* Check RAID segment struct members of @seg for acceptable
* properties and increment @error_count for any bogus ones.
@@ -287,10 +311,14 @@ static void _check_raid_seg(struct lv_segment *seg, int *error_count)
/* Check for any MetaLV flaws like non-existing ones or size variations */
if (seg->meta_areas)
for (area_len = s = 0; s < seg->area_count; s++) {
if (seg_metatype(seg, s) == AREA_UNASSIGNED)
continue;
if (seg_metatype(seg, s) != AREA_LV) {
raid_seg_error("no MetaLV");
continue;
}
if (!lv_is_raid_metadata(seg_metalv(seg, s)))
raid_seg_error("MetaLV without RAID metadata flag");
if (area_len &&
@@ -314,6 +342,8 @@ static void _check_raid_seg(struct lv_segment *seg, int *error_count)
_check_raid45610_seg(seg, error_count);
else
raid_seg_error("bogus RAID segment type");
_check_raid_sublvs(seg, error_count);
}
/* END: RAID segment property checks. */

View File

@@ -651,8 +651,12 @@ void pvcreate_params_set_defaults(struct pvcreate_params *pp);
int vg_write(struct volume_group *vg);
int vg_commit(struct volume_group *vg);
void vg_revert(struct volume_group *vg);
struct volume_group *vg_read_internal(struct cmd_context *cmd, const char *vg_name,
const char *vgid, uint32_t lockd_state, uint32_t warn_flags, int *consistent);
struct volume_group *vg_read_internal(struct cmd_context *cmd, const char *vg_name, const char *vgid,
int write_lock_held,
uint32_t lockd_state,
uint32_t warn_flags,
int *consistent);
#define get_pvs( cmd ) get_pvs_internal((cmd), NULL, NULL)
#define get_pvs_perserve_vg( cmd, pv_list, vg_list ) get_pvs_internal((cmd), (pv_list), (vg_list))
@@ -780,6 +784,12 @@ struct wipe_params {
/* Zero out LV and/or wipe signatures */
int wipe_lv(struct logical_volume *lv, struct wipe_params params);
/* Wipe any signatures and zero first sector on @lv */
int activate_and_wipe_lv(struct logical_volume *lv, int commit);
/* Wipe any signatures and zero first sector of LVs listed on @lv_list */
int activate_and_wipe_lvlist(struct dm_list *lv_list, int commit);
int lv_change_tag(struct logical_volume *lv, const char *tag, int add_tag);
/* Reduce the size of an LV by extents */
@@ -1209,6 +1219,8 @@ int lv_raid_change_region_size(struct logical_volume *lv,
int lv_raid_in_sync(const struct logical_volume *lv);
uint32_t lv_raid_data_copies(const struct segment_type *segtype, uint32_t area_count);
int lv_raid_free_reshape_space(const struct logical_volume *lv);
int lv_raid_clear_lv(struct logical_volume *lv, int commit);
int lv_raid_has_visible_sublvs(const struct logical_volume *lv);
/* -- metadata/raid_manip.c */
/* ++ metadata/cache_manip.c */

View File

@@ -1720,15 +1720,13 @@ struct generic_logical_volume *find_historical_glv(const struct volume_group *vg
int lv_name_is_used_in_vg(const struct volume_group *vg, const char *name, int *historical)
{
struct generic_logical_volume *historical_lv;
struct logical_volume *lv;
int found = 0;
if ((lv = find_lv(vg, name))) {
if (find_lv(vg, name)) {
found = 1;
if (historical)
*historical = 0;
} else if ((historical_lv = find_historical_glv(vg, name, 0, NULL))) {
} else if (find_historical_glv(vg, name, 0, NULL)) {
found = 1;
if (historical)
*historical = 1;
@@ -3731,6 +3729,7 @@ out:
static struct volume_group *_vg_read(struct cmd_context *cmd,
const char *vgname,
const char *vgid,
int write_lock_held,
uint32_t lockd_state,
uint32_t warn_flags,
int *consistent, unsigned precommitted)
@@ -3863,8 +3862,15 @@ static struct volume_group *_vg_read(struct cmd_context *cmd,
if (warn_flags & SKIP_RESCAN)
goto find_vg;
skipped_rescan = 0;
/*
* When a write lock is held, it implies we are going to be
* writing to the devs in the VG, so when we rescan the VG
* we should reopen the devices in RDWR (since they were
* open RDONLY from the initial scan).
*/
log_debug_metadata("Rescanning devices for %s", vgname);
lvmcache_label_rescan_vg(cmd, vgname, vgid);
lvmcache_label_rescan_vg(cmd, vgname, vgid, write_lock_held);
} else {
log_debug_metadata("Skipped rescanning devices for %s", vgname);
skipped_rescan = 1;
@@ -4498,13 +4504,15 @@ static int _check_devs_used_correspond_with_vg(struct volume_group *vg)
struct volume_group *vg_read_internal(struct cmd_context *cmd,
const char *vgname, const char *vgid,
uint32_t lockd_state, uint32_t warn_flags,
int write_lock_held,
uint32_t lockd_state,
uint32_t warn_flags,
int *consistent)
{
struct volume_group *vg;
struct lv_list *lvl;
if (!(vg = _vg_read(cmd, vgname, vgid, lockd_state, warn_flags, consistent, 0)))
if (!(vg = _vg_read(cmd, vgname, vgid, write_lock_held, lockd_state, warn_flags, consistent, 0)))
goto_out;
if (!check_pv_dev_sizes(vg))
@@ -4612,7 +4620,7 @@ struct volume_group *vg_read_by_vgid(struct cmd_context *cmd,
label_scan_setup_bcache();
if (!(vg = _vg_read(cmd, vgname, vgid, 0, warn_flags, &consistent, precommitted))) {
if (!(vg = _vg_read(cmd, vgname, vgid, 0, 0, warn_flags, &consistent, precommitted))) {
log_error("Rescan devices to look for missing VG.");
goto scan;
}
@@ -4633,7 +4641,7 @@ struct volume_group *vg_read_by_vgid(struct cmd_context *cmd,
lvmcache_label_scan(cmd);
warn_flags |= SKIP_RESCAN;
if (!(vg = _vg_read(cmd, vgname, vgid, 0, warn_flags, &consistent, precommitted)))
if (!(vg = _vg_read(cmd, vgname, vgid, 0, 0, warn_flags, &consistent, precommitted)))
goto fail;
label_scan_destroy(cmd); /* drop bcache to close devs, keep lvmcache */
@@ -4872,7 +4880,7 @@ static int _get_pvs(struct cmd_context *cmd, uint32_t warn_flags,
warn_flags |= WARN_INCONSISTENT;
if (!(vg = vg_read_internal(cmd, vgname, (!vgslist) ? vgid : NULL, 0, warn_flags, &consistent))) {
if (!(vg = vg_read_internal(cmd, vgname, (!vgslist) ? vgid : NULL, 0, 0, warn_flags, &consistent))) {
stack;
continue;
}
@@ -5126,6 +5134,15 @@ int vg_flag_write_locked(struct volume_group *vg)
static int _access_vg_clustered(struct cmd_context *cmd, const struct volume_group *vg)
{
if (vg_is_clustered(vg) && !locking_is_clustered()) {
/*
* force_access_clustered is only set when forcibly
* converting a clustered vg to lock type none.
*/
if (cmd->force_access_clustered) {
log_debug("Allowing forced access to clustered vg %s", vg->name);
return 1;
}
if (!cmd->ignore_clustered_vgs)
log_error("Skipping clustered volume group %s", vg->name);
else
@@ -5185,7 +5202,8 @@ int vg_check_status(const struct volume_group *vg, uint64_t status)
* VG is left unlocked on failure
*/
static struct volume_group *_recover_vg(struct cmd_context *cmd,
const char *vg_name, const char *vgid, uint32_t lockd_state)
const char *vg_name, const char *vgid,
int is_shared, uint32_t lockd_state)
{
int consistent = 1;
struct volume_group *vg;
@@ -5199,7 +5217,7 @@ static struct volume_group *_recover_vg(struct cmd_context *cmd,
/*
* Convert vg lock in lvmlockd from sh to ex.
*/
if (!(lockd_state & LDST_FAIL) && !(lockd_state & LDST_EX)) {
if (is_shared && !(lockd_state & LDST_FAIL) && !(lockd_state & LDST_EX)) {
log_debug("Upgrade lvmlockd lock to repair vg %s.", vg_name);
if (!lockd_vg(cmd, vg_name, "ex", 0, &state)) {
log_warn("Skip repair for shared VG without exclusive lock.");
@@ -5208,7 +5226,7 @@ static struct volume_group *_recover_vg(struct cmd_context *cmd,
lockd_state |= LDST_EX;
}
if (!(vg = vg_read_internal(cmd, vg_name, vgid, lockd_state, WARN_PV_READ, &consistent))) {
if (!(vg = vg_read_internal(cmd, vg_name, vgid, 1, lockd_state, WARN_PV_READ, &consistent))) {
unlock_vg(cmd, NULL, vg_name);
return_NULL;
}
@@ -5450,7 +5468,9 @@ static struct volume_group *_vg_lock_and_read(struct cmd_context *cmd, const cha
int consistent_in;
uint32_t failure = 0;
uint32_t warn_flags = 0;
int is_shared = 0;
int already_locked;
int write_lock_held = (lock_flags == LCK_VG_WRITE);
if ((read_flags & READ_ALLOW_INCONSISTENT) || (lock_flags != LCK_VG_WRITE))
consistent = 0;
@@ -5482,7 +5502,7 @@ static struct volume_group *_vg_lock_and_read(struct cmd_context *cmd, const cha
warn_flags |= WARN_INCONSISTENT;
/* If consistent == 1, we get NULL here if correction fails. */
if (!(vg = vg_read_internal(cmd, vg_name, vgid, lockd_state, warn_flags, &consistent))) {
if (!(vg = vg_read_internal(cmd, vg_name, vgid, write_lock_held, lockd_state, warn_flags, &consistent))) {
if (consistent_in && !consistent) {
failure |= FAILED_INCONSISTENT;
goto bad;
@@ -5498,8 +5518,9 @@ static struct volume_group *_vg_lock_and_read(struct cmd_context *cmd, const cha
/* consistent == 0 when VG is not found, but failed == FAILED_NOTFOUND */
if (!consistent && !failure) {
is_shared = vg_is_shared(vg);
release_vg(vg);
if (!(vg = _recover_vg(cmd, vg_name, vgid, lockd_state))) {
if (!(vg = _recover_vg(cmd, vg_name, vgid, is_shared, lockd_state))) {
if (is_orphan_vg(vg_name))
log_error("Recovery of standalone physical volumes failed.");
else

View File

@@ -302,10 +302,14 @@ static int _write_log_header(struct cmd_context *cmd, struct logical_volume *lv)
return 0;
}
dev_set_last_byte(dev, sizeof(log_header));
if (!dev_write_bytes(dev, UINT64_C(0), sizeof(log_header), &log_header)) {
dev_unset_last_byte(dev);
log_error("Failed to write log header to %s.", name);
return 0;
}
dev_unset_last_byte(dev);
label_scan_invalidate(dev);
@@ -710,7 +714,7 @@ static int _split_mirror_images(struct logical_volume *lv,
return 0;
}
if (!strcmp(lv->vg->lock_type, "dlm"))
if (lv->vg->lock_type && !strcmp(lv->vg->lock_type, "dlm"))
new_lv->lock_args = lv->lock_args;
if (!dm_list_empty(&split_images)) {
@@ -786,7 +790,7 @@ static int _split_mirror_images(struct logical_volume *lv,
act = lv_is_active(lv_lock_holder(lv));
if (act && !_activate_lv_like_model(lv, new_lv)) {
if (act && (!deactivate_lv(cmd, new_lv) || !_activate_lv_like_model(lv, new_lv))) {
log_error("Failed to rename newly split LV in the kernel");
return 0;
}

View File

@@ -566,6 +566,7 @@ static int _pv_resize(struct physical_volume *pv, struct volume_group *vg, uint6
log_error("Size must exceed physical extent start "
"of %" PRIu64 " sectors on PV %s.",
pv_pe_start(pv), pv_dev_name(pv));
return 0;
}
old_pe_count = pv->pe_count;
@@ -645,7 +646,7 @@ int pv_resize_single(struct cmd_context *cmd,
pv_name, display_size(cmd, new_size),
display_size(cmd, size)) == 'n') {
log_error("Physical Volume %s not resized.", pv_name);
goto_out;
goto out;
}
} else if (new_size < size)
@@ -653,7 +654,7 @@ int pv_resize_single(struct cmd_context *cmd,
pv_name, display_size(cmd, new_size),
display_size(cmd, size)) == 'n') {
log_error("Physical Volume %s not resized.", pv_name);
goto_out;
goto out;
}
if (new_size == size)

View File

@@ -689,86 +689,33 @@ static int _lv_update_and_reload_list(struct logical_volume *lv, int origin_only
return r;
}
/* Makes on-disk metadata changes
* If LV is active:
* clear first block of device
* otherwise:
* activate, clear, deactivate
*
* Returns: 1 on success, 0 on failure
*/
/* Wipe all LVs listed on @lv_list, committing lvm metadata */
static int _clear_lvs(struct dm_list *lv_list)
{
struct lv_list *lvl;
struct volume_group *vg = NULL;
unsigned i = 0, sz = dm_list_size(lv_list);
char *was_active;
int r = 1;
return activate_and_wipe_lvlist(lv_list, 1);
}
if (!sz) {
log_debug_metadata(INTERNAL_ERROR "Empty list of LVs given for clearing.");
return 1;
/* External interface to clear logical volumes on @lv_list */
int lv_raid_has_visible_sublvs(const struct logical_volume *lv)
{
unsigned s;
struct lv_segment *seg = first_seg(lv);
if (!lv_is_raid(lv) || (lv->status & LV_TEMPORARY) || !seg)
return 0;
if (lv_is_raid_image(lv) || lv_is_raid_metadata(lv))
return 0;
for (s = 0; s < seg->area_count; s++) {
if ((seg_lv(seg, s)->status & LVM_WRITE) && /* Split off track changes raid1 leg */
lv_is_visible(seg_lv(seg, s)))
return 1;
if (seg->meta_areas && lv_is_visible(seg_metalv(seg, s)))
return 1;
}
dm_list_iterate_items(lvl, lv_list) {
if (!lv_is_visible(lvl->lv)) {
log_error(INTERNAL_ERROR
"LVs must be set visible before clearing.");
return 0;
}
vg = lvl->lv->vg;
}
if (test_mode())
return 1;
/*
* FIXME: only vg_[write|commit] if LVs are not already written
* as visible in the LVM metadata (which is never the case yet).
*/
if (!vg || !vg_write(vg) || !vg_commit(vg))
return_0;
was_active = alloca(sz);
dm_list_iterate_items(lvl, lv_list)
if (!(was_active[i++] = lv_is_active_locally(lvl->lv))) {
lvl->lv->status |= LV_TEMPORARY;
if (!activate_lv_excl_local(vg->cmd, lvl->lv)) {
log_error("Failed to activate localy %s for clearing.",
display_lvname(lvl->lv));
r = 0;
goto out;
}
lvl->lv->status &= ~LV_TEMPORARY;
}
dm_list_iterate_items(lvl, lv_list) {
log_verbose("Clearing metadata area %s.", display_lvname(lvl->lv));
/*
* Rather than wiping lv->size, we can simply
* wipe the first sector to remove the superblock of any previous
* RAID devices. It is much quicker.
*/
if (!wipe_lv(lvl->lv, (struct wipe_params) { .do_zero = 1, .zero_sectors = 1 })) {
log_error("Failed to zero %s.", display_lvname(lvl->lv));
r = 0;
goto out;
}
}
out:
/* TODO: deactivation is only needed with clustered locking
* in normal case we should keep device active
*/
sz = 0;
dm_list_iterate_items(lvl, lv_list)
if ((i > sz) && !was_active[sz++] &&
!deactivate_lv(vg->cmd, lvl->lv)) {
log_error("Failed to deactivate %s.", display_lvname(lvl->lv));
r = 0; /* continue deactivating */
}
return r;
return 0;
}
/* raid0* <-> raid10_near area reorder helper: swap 2 LV segment areas @a1 and @a2 */
@@ -3395,7 +3342,7 @@ int lv_raid_split(struct logical_volume *lv, int yes, const char *split_name,
lvl->lv->name = split_name;
if (!strcmp(lv->vg->lock_type, "dlm"))
if (lv->vg->lock_type && !strcmp(lv->vg->lock_type, "dlm"))
lvl->lv->lock_args = lv->lock_args;
if (!vg_write(lv->vg)) {
@@ -3563,7 +3510,7 @@ int lv_raid_merge(struct logical_volume *image_lv)
struct volume_group *vg = image_lv->vg;
if (image_lv->status & LVM_WRITE) {
log_error("%s is not read-only - refusing to merge.",
log_error("%s cannot be merged because --trackchanges was not used.",
display_lvname(image_lv));
return 0;
}
@@ -3572,7 +3519,7 @@ int lv_raid_merge(struct logical_volume *image_lv)
return_0;
if (!(p = strstr(lv_name, "_rimage_"))) {
log_error("Unable to merge non-mirror image %s.",
log_error("Unable to merge non-raid image %s.",
display_lvname(image_lv));
return 0;
}
@@ -4526,17 +4473,18 @@ static struct possible_takeover_reshape_type _possible_takeover_reshape_types[]
.current_areas = 1,
.options = ALLOW_REGION_SIZE },
{ .current_types = SEG_STRIPED_TARGET, /* linear, i.e. seg->area_count = 1 */
.possible_types = SEG_RAID0|SEG_RAID0_META,
.current_areas = 1,
.options = ALLOW_STRIPE_SIZE },
/* raid0* -> raid1 */
{ .current_types = SEG_RAID0|SEG_RAID0_META, /* seg->area_count = 1 */
.possible_types = SEG_RAID1,
.current_areas = 1,
.options = ALLOW_REGION_SIZE },
/* raid5_n -> linear through interim raid1 */
{ .current_types = SEG_RAID5_N,
.possible_types = SEG_STRIPED_TARGET,
.current_areas = 2,
.options = ALLOW_NONE },
/* striped,raid0* <-> striped,raid0* */
{ .current_types = SEG_STRIPED_TARGET|SEG_RAID0|SEG_RAID0_META,
.possible_types = SEG_STRIPED_TARGET|SEG_RAID0|SEG_RAID0_META,
@@ -4547,13 +4495,13 @@ static struct possible_takeover_reshape_type _possible_takeover_reshape_types[]
{ .current_types = SEG_STRIPED_TARGET|SEG_RAID0|SEG_RAID0_META,
.possible_types = SEG_RAID4|SEG_RAID5_N|SEG_RAID6_N_6|SEG_RAID10_NEAR,
.current_areas = ~0U,
.options = ALLOW_REGION_SIZE },
.options = ALLOW_REGION_SIZE|ALLOW_STRIPES },
/* raid4,raid5_n,raid6_n_6,raid10_near -> striped/raid0* */
{ .current_types = SEG_RAID4|SEG_RAID5_N|SEG_RAID6_N_6|SEG_RAID10_NEAR,
.possible_types = SEG_STRIPED_TARGET|SEG_RAID0|SEG_RAID0_META,
.current_areas = ~0U,
.options = ALLOW_NONE },
.options = ALLOW_STRIPES },
/* raid4,raid5_n,raid6_n_6 <-> raid4,raid5_n,raid6_n_6 */
{ .current_types = SEG_RAID4|SEG_RAID5_N|SEG_RAID6_N_6,
@@ -4640,7 +4588,8 @@ static struct possible_takeover_reshape_type *_get_possible_takeover_reshape_typ
for ( ; pt->current_types; pt++)
if ((seg_from->segtype->flags & pt->current_types) &&
(segtype_to ? (segtype_to->flags & pt->possible_types) : 1))
if (seg_from->area_count <= pt->current_areas)
if ((seg_from->area_count == pt->current_areas) ||
(seg_from->area_count > 1 && seg_from->area_count <= pt->current_areas))
return pt;
return NULL;
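The revised matching rule above accepts an exact area_count match, or a range match only for LVs with more than one area, so single-area (linear) LVs no longer match the ~0U wildcard rows. A standalone restatement of the predicate:

#include <stdio.h>
#include <stdint.h>

/* Exact match always qualifies; a range match (wildcard rows with
 * current_areas = ~0U) only applies to LVs with more than one area. */
static int areas_match(uint32_t area_count, uint32_t current_areas)
{
	return (area_count == current_areas) ||
	       (area_count > 1 && area_count <= current_areas);
}

int main(void)
{
	printf("%d\n", areas_match(1, ~0U)); /* 0: linear skips wildcard rows */
	printf("%d\n", areas_match(3, ~0U)); /* 1: multi-area still matches */
	return 0;
}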
@@ -4816,7 +4765,7 @@ typedef int (*takeover_fn_t)(TAKEOVER_FN_ARGS);
/*
* Unsupported takeover functions.
*/
static int _takeover_noop(TAKEOVER_FN_ARGS)
static int _takeover_same_layout(const struct logical_volume *lv)
{
log_error("Logical volume %s is already of requested type %s.",
display_lvname(lv), lvseg_name(first_seg(lv)));
@@ -4824,6 +4773,11 @@ static int _takeover_noop(TAKEOVER_FN_ARGS)
return 0;
}
static int _takeover_noop(TAKEOVER_FN_ARGS)
{
return _takeover_same_layout(lv);
}
static int _takeover_unsupported(TAKEOVER_FN_ARGS)
{
struct lv_segment *seg = first_seg(lv);
@@ -5618,7 +5572,9 @@ static int _takeover_from_linear_to_raid0(TAKEOVER_FN_ARGS)
static int _takeover_from_linear_to_raid1(TAKEOVER_FN_ARGS)
{
return _takeover_unsupported_yet(lv, new_stripes, new_segtype);
first_seg(lv)->region_size = new_region_size;
return _lv_raid_change_image_count(lv, 1, 2, allocate_pvs, NULL, 1, 0);
}
static int _takeover_from_linear_to_raid10(TAKEOVER_FN_ARGS)
@@ -6102,28 +6058,37 @@ static uint64_t _raid_segtype_flag_5_to_6(const struct segment_type *segtype)
/* FIXME: do this like _conversion_options_allowed()? */
static int _set_convenient_raid145610_segtype_to(const struct lv_segment *seg_from,
const struct segment_type **segtype,
uint32_t *new_image_count,
uint32_t *stripes,
int yes)
{
uint64_t seg_flag = 0;
struct cmd_context *cmd = seg_from->lv->vg->cmd;
const struct segment_type *segtype_sav = *segtype;
/* Linear -> striped request */
if (seg_is_linear(seg_from) &&
segtype_is_striped(*segtype))
;
/* Bail out if same RAID level is requested. */
if (_is_same_level(seg_from->segtype, *segtype))
else if (_is_same_level(seg_from->segtype, *segtype))
return 1;
log_debug("Checking LV %s requested %s segment type for convenience",
display_lvname(seg_from->lv), (*segtype)->name);
/* striped/raid0 -> raid5/6 */
if (seg_is_striped(seg_from) || seg_is_any_raid0(seg_from)) {
/* If this is any raid5 conversion request -> enforce raid5_n, because we convert from striped */
if (segtype_is_any_raid5(*segtype) && !segtype_is_raid5_n(*segtype))
seg_flag = SEG_RAID5_N;
/* linear -> */
if (seg_is_linear(seg_from)) {
seg_flag = SEG_RAID1;
/* If this is any raid6 conversion request -> enforce raid6_n_6, because we convert from striped */
else if (segtype_is_any_raid6(*segtype) && !segtype_is_raid6_n_6(*segtype))
seg_flag = SEG_RAID6_N_6;
/* striped/raid0 -> */
} else if (seg_is_striped(seg_from) || seg_is_any_raid0(seg_from)) {
if (segtype_is_any_raid6(*segtype))
seg_flag = seg_from->area_count < 3 ? SEG_RAID5_N : SEG_RAID6_N_6;
else if (segtype_is_linear(*segtype) ||
(!segtype_is_raid4(*segtype) && !segtype_is_raid10(*segtype) && !segtype_is_striped(*segtype)))
seg_flag = SEG_RAID5_N;
/* raid1 -> */
} else if (seg_is_raid1(seg_from) && !segtype_is_mirror(*segtype)) {
@@ -6134,54 +6099,67 @@ static int _set_convenient_raid145610_segtype_to(const struct lv_segment *seg_fr
}
if (segtype_is_striped(*segtype) ||
segtype_is_any_raid0(*segtype) ||
segtype_is_raid10(*segtype))
segtype_is_any_raid0(*segtype) ||
segtype_is_raid10(*segtype))
seg_flag = SEG_RAID5_N;
else if (!segtype_is_raid4(*segtype) && !segtype_is_any_raid5(*segtype))
seg_flag = SEG_RAID5_LS;
/* raid4/raid5 -> striped/raid0/raid1/raid6/raid10 */
} else if (seg_is_raid4(seg_from) || seg_is_any_raid5(seg_from)) {
if (segtype_is_raid1(*segtype) &&
seg_from->area_count != 2) {
log_error("Convert %s LV %s to 2 stripes first (i.e. --stripes 1).",
lvseg_name(seg_from), display_lvname(seg_from->lv));
return 0;
}
if (seg_is_raid4(seg_from) &&
segtype_is_any_raid5(*segtype) &&
!segtype_is_raid5_n(*segtype))
seg_flag = SEG_RAID5_N;
else if (seg_is_any_raid5(seg_from) &&
segtype_is_raid4(*segtype) &&
!segtype_is_raid5_n(*segtype))
seg_flag = SEG_RAID5_N;
else if (segtype_is_raid10(*segtype)) {
if (seg_from->area_count < 3) {
log_error("Convert %s LV %s to minimum 3 stripes first (i.e. --stripes 2).",
/* raid5* -> */
} else if (seg_is_any_raid5(seg_from)) {
if (segtype_is_raid1(*segtype) || segtype_is_linear(*segtype)) {
if (seg_from->area_count != 2) {
log_error("Converting %s LV %s to 2 stripes first.",
lvseg_name(seg_from), display_lvname(seg_from->lv));
return 0;
}
seg_flag = seg_is_raid5_n(seg_from) ? SEG_RAID0_META : SEG_RAID5_N;
*new_image_count = 2;
*segtype = seg_from->segtype;
seg_flag = 0;
} else
seg_flag = SEG_RAID1;
} else if (segtype_is_any_raid6(*segtype)) {
if (seg_from->area_count < 4) {
log_error("Convert %s LV %s to minimum 4 stripes first (i.e. --stripes 3).",
lvseg_name(seg_from), display_lvname(seg_from->lv));
return 0;
}
if (*stripes > 3)
*new_image_count = *stripes + seg_from->segtype->parity_devs;
else
*new_image_count = 4;
if (seg_is_raid4(seg_from) && !segtype_is_raid6_n_6(*segtype))
seg_flag = SEG_RAID6_N_6;
else
*segtype = seg_from->segtype;
log_error("Converting %s LV %s to %u stripes first.",
lvseg_name(seg_from), display_lvname(seg_from->lv), *new_image_count);
} else
seg_flag = _raid_seg_flag_5_to_6(seg_from);
} else if (segtype_is_striped(*segtype) || segtype_is_raid10(*segtype)) {
int change = 0;
if (!seg_is_raid5_n(seg_from)) {
seg_flag = SEG_RAID5_N;
} else if (*stripes > 2 && *stripes != seg_from->area_count - seg_from->segtype->parity_devs) {
change = 1;
*new_image_count = *stripes + seg_from->segtype->parity_devs;
seg_flag = SEG_RAID5_N;
} else if (seg_from->area_count < 3) {
change = 1;
*new_image_count = 3;
seg_flag = SEG_RAID5_N;
} else if (!segtype_is_striped(*segtype))
seg_flag = SEG_RAID0_META;
if (change)
log_error("Converting %s LV %s to %u stripes first.",
lvseg_name(seg_from), display_lvname(seg_from->lv), *new_image_count);
}
/* raid4 -> * */
} else if (seg_is_raid4(seg_from) && !segtype_is_raid4(*segtype) && !segtype_is_striped(*segtype)) {
seg_flag = segtype_is_any_raid6(*segtype) ? SEG_RAID6_N_6 : SEG_RAID5_N;
/* raid6 -> striped/raid0/raid5/raid10 */
} else if (seg_is_any_raid6(seg_from)) {
if (segtype_is_raid1(*segtype)) {
@@ -6193,9 +6171,12 @@ static int _set_convenient_raid145610_segtype_to(const struct lv_segment *seg_fr
} else if (segtype_is_any_raid10(*segtype)) {
seg_flag = seg_is_raid6_n_6(seg_from) ? SEG_RAID0_META : SEG_RAID6_N_6;
} else if ((segtype_is_striped(*segtype) || segtype_is_any_raid0(*segtype)) &&
!seg_is_raid6_n_6(seg_from)) {
seg_flag = SEG_RAID6_N_6;
} else if (segtype_is_linear(*segtype)) {
seg_flag = seg_is_raid6_n_6(seg_from) ? SEG_RAID5_N : SEG_RAID6_N_6;
} else if (segtype_is_striped(*segtype) || segtype_is_any_raid0(*segtype)) {
if (!seg_is_raid6_n_6(seg_from))
seg_flag = SEG_RAID6_N_6;
} else if (segtype_is_raid4(*segtype) && !seg_is_raid6_n_6(seg_from)) {
seg_flag = SEG_RAID6_N_6;
@@ -6223,12 +6204,16 @@ static int _set_convenient_raid145610_segtype_to(const struct lv_segment *seg_fr
return 0;
}
/* raid10 -> ... */
} else if (seg_is_raid10(seg_from) &&
!segtype_is_striped(*segtype) &&
!segtype_is_any_raid0(*segtype))
seg_flag = SEG_RAID0_META;
} else if (seg_is_raid10(seg_from)) {
if (segtype_is_linear(*segtype) ||
(!segtype_is_striped(*segtype) &&
!segtype_is_any_raid0(*segtype))) {
seg_flag = SEG_RAID0_META;
}
}
/* raid10 -> ... */
if (seg_flag) {
if (!(*segtype = get_segtype_from_flag(cmd, seg_flag)))
return_0;
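A minimal standalone sketch of the idea this function implements: each lvconvert invocation maps (current layout, requested layout) to the nearest layout the kernel can take over directly, so repeating the same command walks the LV through the whole chain. The mapping below is illustrative only — a tiny subset of the real table, with hypothetical helper names:

#include <stdio.h>
#include <string.h>

static const char *next_step(const char *cur, const char *want)
{
    if (!strcmp(cur, want))
        return want;                 /* nothing left to do */
    if (!strcmp(cur, "linear"))
        return "raid1";              /* linear can only grow a mirror first */
    if (!strcmp(cur, "raid1"))
        return "raid5_ls";           /* a 2-leg raid1 is a 2-device raid5 */
    if (!strcmp(cur, "raid5_ls") && strstr(want, "raid6"))
        return "raid6_ls_6";         /* raid5_* takes over to raid6_*_6 */
    return want;
}

int main(void)
{
    const char *lv = "linear", *want = "raid6_ls_6";

    while (strcmp(lv, want)) {       /* re-run "lvconvert" until done */
        lv = next_step(lv, want);
        printf("-> %s\n", lv);
    }
    return 0;
}

This mirrors the intermediate layouts that the "Converting ... first" messages announce.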
@@ -6331,41 +6316,48 @@ static int _conversion_options_allowed(const struct lv_segment *seg_from,
int yes,
uint32_t new_image_count,
int new_data_copies, int new_region_size,
int stripes, unsigned new_stripe_size_supplied)
uint32_t *stripes, unsigned new_stripe_size_supplied)
{
int r = 1;
uint32_t opts;
uint32_t count = new_image_count, opts;
if (!new_image_count && !_set_convenient_raid145610_segtype_to(seg_from, segtype_to, yes))
/* Linear -> linear rejection */
if ((seg_is_linear(seg_from) || seg_is_striped(seg_from)) &&
seg_from->area_count == 1 &&
segtype_is_striped(*segtype_to) &&
*stripes < 2)
return _takeover_same_layout(seg_from->lv);
if (!new_image_count && !_set_convenient_raid145610_segtype_to(seg_from, segtype_to, &count, stripes, yes))
return_0;
if (new_image_count != count)
*stripes = count - seg_from->segtype->parity_devs;
if (!_get_allowed_conversion_options(seg_from, *segtype_to, new_image_count, &opts)) {
log_error("Unable to convert LV %s from %s to %s.",
display_lvname(seg_from->lv), lvseg_name(seg_from), (*segtype_to)->name);
if (strcmp(lvseg_name(seg_from), (*segtype_to)->name))
log_error("Unable to convert LV %s from %s to %s.",
display_lvname(seg_from->lv), lvseg_name(seg_from), (*segtype_to)->name);
else
_takeover_same_layout(seg_from->lv);
return 0;
}
if (stripes > 1 && !(opts & ALLOW_STRIPES)) {
if (!_log_prohibited_option(seg_from, *segtype_to, "--stripes"))
stack;
r = 0;
if (*stripes > 1 && !(opts & ALLOW_STRIPES)) {
_log_prohibited_option(seg_from, *segtype_to, "--stripes");
*stripes = seg_from->area_count;
}
if (new_stripe_size_supplied && !(opts & ALLOW_STRIPE_SIZE)) {
if (!_log_prohibited_option(seg_from, *segtype_to, "-I/--stripesize"))
stack;
r = 0;
}
if (new_stripe_size_supplied && !(opts & ALLOW_STRIPE_SIZE))
_log_prohibited_option(seg_from, *segtype_to, "-I/--stripesize");
if (new_region_size && !(opts & ALLOW_REGION_SIZE)) {
if (!_log_prohibited_option(seg_from, *segtype_to, "-R/--regionsize"))
stack;
r = 0;
}
if (new_region_size && new_region_size != seg_from->region_size && !(opts & ALLOW_REGION_SIZE))
_log_prohibited_option(seg_from, *segtype_to, "-R/--regionsize");
/* Can't reshape stripes or stripe size when performing a takeover! */
if (!_is_same_level(seg_from->segtype, *segtype_to)) {
if (stripes && stripes != _data_rimages_count(seg_from, seg_from->area_count))
if (*stripes && *stripes != _data_rimages_count(seg_from, seg_from->area_count))
log_warn("WARNING: ignoring --stripes option on takeover of %s (reshape afterwards).",
display_lvname(seg_from->lv));
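A worked example of the image-count/stripes bookkeeping introduced above (values illustrative): the convenience helper hands back an image count, and the caller derives the user-visible stripe count by subtracting the segment type's parity devices.

#include <stdio.h>

int main(void)
{
    unsigned parity_devs = 2;   /* raid6 carries two parity devices */
    unsigned stripes = 3;       /* user asked for --stripes 3 */
    unsigned images = stripes + parity_devs;

    printf("raid6 with %u data stripes needs %u images\n", stripes, images);
    printf("an image count of %u implies --stripes %u\n",
           images, images - parity_devs);
    return 0;
}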
@@ -6501,7 +6493,7 @@ int lv_raid_convert(struct logical_volume *lv,
*/
if (!_conversion_options_allowed(seg, &new_segtype, yes,
0 /* Takeover */, 0 /*new_data_copies*/, new_region_size,
new_stripes, new_stripe_size_supplied))
&stripes, new_stripe_size_supplied))
return _log_possible_conversion_types(lv, new_segtype);
/* https://bugzilla.redhat.com/1439399 */

View File

@@ -22,10 +22,6 @@ struct segment_type *get_segtype_from_string(struct cmd_context *cmd,
{
struct segment_type *segtype;
/* FIXME Register this properly within striped.c */
if (!strcmp(str, SEG_TYPE_NAME_LINEAR))
str = SEG_TYPE_NAME_STRIPED;
dm_list_iterate_items(segtype, &cmd->segtypes)
if (!strcmp(segtype->name, str))
return segtype;

View File

@@ -68,6 +68,7 @@ struct dev_manager;
#define SEG_RAID6 SEG_RAID6_ZR
#define SEG_STRIPED_TARGET (1ULL << 39)
#define SEG_LINEAR_TARGET (1ULL << 40)
#define SEG_UNKNOWN (1ULL << 63)
@@ -105,7 +106,7 @@ struct dev_manager;
#define SEG_TYPE_NAME_RAID6_RS_6 "raid6_rs_6"
#define SEG_TYPE_NAME_RAID6_N_6 "raid6_n_6"
#define segtype_is_linear(segtype) (!strcmp(segtype->name, SEG_TYPE_NAME_LINEAR))
#define segtype_is_linear(segtype) (!strcmp((segtype)->name, SEG_TYPE_NAME_LINEAR))
#define segtype_is_striped_target(segtype) ((segtype)->flags & SEG_STRIPED_TARGET ? 1 : 0)
#define segtype_is_cache(segtype) ((segtype)->flags & SEG_CACHE ? 1 : 0)
#define segtype_is_cache_pool(segtype) ((segtype)->flags & SEG_CACHE_POOL ? 1 : 0)
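The extra parentheses in segtype_is_linear() matter because '->' binds tighter than most argument expressions. A contrived but compilable illustration (names hypothetical, not lvm2 code):

#include <stdio.h>
#include <string.h>

struct segment_type { const char *name; };

/* #define IS_LINEAR_BAD(segtype)  (!strcmp(segtype->name, "linear"))
 * IS_LINEAR_BAD(*v) would expand to !strcmp(*v->name, ...), applying
 * '->' to v before the dereference, and fail to compile. */
#define IS_LINEAR_GOOD(segtype) (!strcmp((segtype)->name, "linear"))

int main(void)
{
    struct segment_type s = { "linear" };
    struct segment_type *v[1] = { &s };

    printf("%d\n", IS_LINEAR_GOOD(*v));   /* prints 1 */
    return 0;
}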
@@ -274,6 +275,7 @@ struct segtype_library;
int lvm_register_segtype(struct segtype_library *seglib,
struct segment_type *segtype);
struct segment_type *init_linear_segtype(struct cmd_context *cmd);
struct segment_type *init_striped_segtype(struct cmd_context *cmd);
struct segment_type *init_zero_segtype(struct cmd_context *cmd);
struct segment_type *init_error_segtype(struct cmd_context *cmd);

View File

@@ -24,6 +24,7 @@
static int _verbose_level = VERBOSE_BASE_LEVEL;
static int _silent = 0;
static int _test = 0;
static int _use_aio = 0;
static int _md_filtering = 0;
static int _internal_filtering = 0;
static int _fwraid_filtering = 0;
@@ -53,6 +54,7 @@ static char _sysfs_dir_path[PATH_MAX] = "";
static int _dev_disable_after_error_count = DEFAULT_DISABLE_AFTER_ERROR_COUNT;
static uint64_t _pv_min_size = (DEFAULT_PV_MIN_SIZE_KB * 1024L >> SECTOR_SHIFT);
static const char *_unknown_device_name = DEFAULT_UNKNOWN_DEVICE_NAME;
static int _io_memory_size_kb = DEFAULT_IO_MEMORY_SIZE_KB;
void init_verbose(int level)
{
@@ -71,6 +73,11 @@ void init_test(int level)
_test = level;
}
void init_use_aio(int use_aio)
{
_use_aio = use_aio;
}
void init_md_filtering(int level)
{
_md_filtering = level;
@@ -227,6 +234,11 @@ int test_mode(void)
return _test;
}
int use_aio(void)
{
return _use_aio;
}
int md_filtering(void)
{
return _md_filtering;
@@ -376,3 +388,12 @@ void init_unknown_device_name(const char *name)
_unknown_device_name = name;
}
int io_memory_size(void)
{
return _io_memory_size_kb;
}
void init_io_memory_size(int val)
{
_io_memory_size_kb = val;
}

View File

@@ -25,6 +25,7 @@ enum dev_ext_e;
void init_verbose(int level);
void init_silent(int silent);
void init_test(int level);
void init_use_aio(int use_aio);
void init_md_filtering(int level);
void init_internal_filtering(int level);
void init_fwraid_filtering(int level);
@@ -52,12 +53,14 @@ void init_pv_min_size(uint64_t sectors);
void init_activation_checks(int checks);
void init_retry_deactivation(int retry);
void init_unknown_device_name(const char *name);
void init_io_memory_size(int val);
void set_cmd_name(const char *cmd_name);
const char *get_cmd_name(void);
void set_sysfs_dir_path(const char *path);
int test_mode(void);
int use_aio(void);
int md_filtering(void);
int internal_filtering(void);
int fwraid_filtering(void);
@@ -84,6 +87,7 @@ uint64_t pv_min_size(void);
int activation_checks(void);
int retry_deactivation(void);
const char *unknown_device_name(void);
int io_memory_size(void);
#define DMEVENTD_MONITOR_IGNORE -1
int dmeventd_monitor_mode(void);
@@ -91,4 +95,5 @@ int dmeventd_monitor_mode(void);
#define NO_DEV_ERROR_COUNT_LIMIT 0
int dev_disable_after_error_count(void);
#endif

View File

@@ -105,23 +105,30 @@ static const char * const _blacklist_maps[] = {
"/LC_MESSAGES/",
"gconv/gconv-modules.cache",
"/ld-2.", /* not using dlopen,dlsym during mlock */
"/libaio.so.", /* not using aio during mlock */
"/libattr.so.", /* not using during mlock (udev) */
"/libblkid.so.", /* not using lzma during mlock (selinux) */
"/libblkid.so.", /* not using blkid during mlock (udev) */
"/libbz2.so.", /* not using during mlock (udev) */
"/libcap.so.", /* not using during mlock (udev) */
"/libcap.so.", /* not using during mlock (systemd) */
"/libdl-", /* not using dlopen,dlsym during mlock */
"/libdw-", /* not using during mlock (udev) */
"/libelf-", /* not using during mlock (udev) */
"/liblzma.so.", /* not using lzma during mlock (selinux) */
"/libgcrypt.so.", /* not using during mlock (systemd) */
"/libgpg-error.so.", /* not using gpg-error during mlock (systemd) */
"/liblz4.so.", /* not using lz4 during mlock (systemd) */
"/liblzma.so.", /* not using lzma during mlock (systemd) */
"/libmount.so.", /* not using mount during mlock (udev) */
"/libncurses.so.", /* not using ncurses during mlock */
"/libpcre.so.", /* not using pcre during mlock (selinux) */
"/libpcre.so.", /* not using pcre during mlock (selinux) */
"/libpcre2-", /* not using pcre during mlock (selinux) */
"/libreadline.so.", /* not using readline during mlock */
"/libresolv-", /* not using during mlock (udev) */
"/libresolv-", /* not using during mlock (udev) */
"/libselinux.so.", /* not using selinux during mlock */
"/libsepol.so.", /* not using sepol during mlock */
"/libsystemd.so.", /* not using systemd during mlock */
"/libtinfo.so.", /* not using tinfo during mlock */
"/libudev.so.", /* not using udev during mlock */
"/libuuid.so.", /* not using uuid during mlock (blkid) */
"/libdl-", /* not using dlopen,dlsym during mlock */
"/libz.so.", /* not using during mlock (udev) */
"/etc/selinux", /* not using selinux during mlock */
/* "/libdevmapper-event.so" */

View File

@@ -230,7 +230,7 @@ static struct segtype_handler _striped_ops = {
.destroy = _striped_destroy,
};
struct segment_type *init_striped_segtype(struct cmd_context *cmd)
static struct segment_type *_init_segtype(struct cmd_context *cmd, const char *name, uint64_t target)
{
struct segment_type *segtype = dm_zalloc(sizeof(*segtype));
@@ -238,11 +238,20 @@ struct segment_type *init_striped_segtype(struct cmd_context *cmd)
return_NULL;
segtype->ops = &_striped_ops;
segtype->name = SEG_TYPE_NAME_STRIPED;
segtype->flags = SEG_STRIPED_TARGET |
SEG_CAN_SPLIT | SEG_AREAS_STRIPED;
segtype->name = name;
segtype->flags = target | SEG_CAN_SPLIT | SEG_AREAS_STRIPED;
log_very_verbose("Initialised segtype: %s", segtype->name);
return segtype;
}
struct segment_type *init_striped_segtype(struct cmd_context *cmd)
{
return _init_segtype(cmd, SEG_TYPE_NAME_STRIPED, SEG_STRIPED_TARGET);
}
struct segment_type *init_linear_segtype(struct cmd_context *cmd)
{
return _init_segtype(cmd, SEG_TYPE_NAME_LINEAR, SEG_LINEAR_TARGET);
}

View File

@@ -30,7 +30,7 @@
#else
# define MAJOR(x) major((x))
# define MINOR(x) minor((x))
# define MKDEV(x,y) makedev((x),(y))
# define MKDEV(x,y) makedev(((dev_t)x),((dev_t)y))
#endif
#include "dm-ioctl.h"
@@ -115,6 +115,9 @@ static struct cmd_data _cmd_data_v4[] = {
#ifdef DM_DEV_SET_GEOMETRY
{"setgeometry", DM_DEV_SET_GEOMETRY, {4, 6, 0}},
#endif
#ifdef DM_DEV_ARM_POLL
{"armpoll", DM_DEV_ARM_POLL, {4, 36, 0}},
#endif
};
/* *INDENT-ON* */
@@ -259,7 +262,7 @@ static int _control_exists(const char *control, uint32_t major, uint32_t minor)
return -1;
}
if (major && buf.st_rdev != MKDEV((dev_t)major, (dev_t)minor)) {
if (major && buf.st_rdev != MKDEV(major, minor)) {
log_verbose("%s: Wrong device number: (%u, %u) instead of "
"(%u, %u)", control,
MAJOR(buf.st_mode), MINOR(buf.st_mode),
@@ -302,7 +305,7 @@ static int _create_control(const char *control, uint32_t major, uint32_t minor)
(void) dm_prepare_selinux_context(control, S_IFCHR);
old_umask = umask(DM_CONTROL_NODE_UMASK);
if (mknod(control, S_IFCHR | S_IRUSR | S_IWUSR,
MKDEV((dev_t)major, (dev_t)minor)) < 0) {
MKDEV(major, minor)) < 0) {
log_sys_error("mknod", control);
ret = 0;
}
@@ -466,6 +469,7 @@ static void _dm_zfree_string(char *string)
{
if (string) {
memset(string, 0, strlen(string));
asm volatile ("" ::: "memory"); /* Compiler barrier. */
dm_free(string);
}
}
@@ -474,6 +478,7 @@ static void _dm_zfree_dmi(struct dm_ioctl *dmi)
{
if (dmi) {
memset(dmi, 0, dmi->data_size);
asm volatile ("" ::: "memory"); /* Compiler barrier. */
dm_free(dmi);
}
}
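The zero-then-barrier pattern in isolation: the empty asm statement clobbers "memory", so the compiler cannot prove the buffer is dead and must keep the memset() even though free() follows immediately. A self-contained sketch, not the libdm code itself:

#include <stdlib.h>
#include <string.h>

static void zfree(void *buf, size_t len)
{
    if (!buf)
        return;
    memset(buf, 0, len);              /* wipe secrets */
    asm volatile ("" ::: "memory");   /* compiler barrier keeps the memset */
    free(buf);
}

int main(void)
{
    char *secret = strdup("passphrase");

    if (secret)
        zfree(secret, strlen(secret) + 1);
    return 0;
}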
@@ -1094,6 +1099,22 @@ static int _lookup_dev_name(uint64_t dev, char *buf, size_t len)
return r;
}
static int _add_params(int type)
{
switch (type) {
case DM_DEVICE_REMOVE_ALL:
case DM_DEVICE_CREATE:
case DM_DEVICE_REMOVE:
case DM_DEVICE_SUSPEND:
case DM_DEVICE_STATUS:
case DM_DEVICE_CLEAR:
case DM_DEVICE_ARM_POLL:
return 0; /* IOCTL_FLAGS_NO_PARAMS in drivers/md/dm-ioctl.c */
default:
return 1;
}
}
static struct dm_ioctl *_flatten(struct dm_task *dmt, unsigned repeat_count)
{
const size_t min_size = 16 * 1024;
@@ -1106,11 +1127,15 @@ static struct dm_ioctl *_flatten(struct dm_task *dmt, unsigned repeat_count)
char *b, *e;
int count = 0;
for (t = dmt->head; t; t = t->next) {
len += sizeof(struct dm_target_spec);
len += strlen(t->params) + 1 + ALIGNMENT;
count++;
}
if (_add_params(dmt->type))
for (t = dmt->head; t; t = t->next) {
len += sizeof(struct dm_target_spec);
len += strlen(t->params) + 1 + ALIGNMENT;
count++;
}
else if (dmt->head)
log_debug_activation(INTERNAL_ERROR "dm '%s' ioctl should not define parameters.",
_cmd_data_v4[dmt->type].name);
if (count && (dmt->sector || dmt->message)) {
log_error("targets and message are incompatible");
@@ -1194,7 +1219,7 @@ static struct dm_ioctl *_flatten(struct dm_task *dmt, unsigned repeat_count)
}
dmi->flags |= DM_PERSISTENT_DEV_FLAG;
dmi->dev = MKDEV((dev_t)dmt->major, (dev_t)dmt->minor);
dmi->dev = MKDEV(dmt->major, dmt->minor);
}
/* Does driver support device number referencing? */
@@ -1260,9 +1285,10 @@ static struct dm_ioctl *_flatten(struct dm_task *dmt, unsigned repeat_count)
b = (char *) (dmi + 1);
e = (char *) dmi + len;
for (t = dmt->head; t; t = t->next)
if (!(b = _add_target(t, b, e)))
goto_bad;
if (_add_params(dmt->type))
for (t = dmt->head; t; t = t->next)
if (!(b = _add_target(t, b, e)))
goto_bad;
if (dmt->newname)
strcpy(b, dmt->newname);
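A miniature of the guard added above: parameter blocks are serialised only for ioctls that accept them, and table lines attached to a no-params ioctl are reported rather than packed. Enum values and messages below are stand-ins, not the libdm definitions:

#include <stdio.h>

enum { CMD_CREATE, CMD_STATUS };        /* stand-ins for DM_DEVICE_* */

static int add_params(int type)
{
    return type == CMD_CREATE;          /* CMD_STATUS takes no params */
}

int main(void)
{
    int type = CMD_STATUS, have_targets = 1;

    if (add_params(type))
        puts("packing target specs into the ioctl buffer");
    else if (have_targets)
        puts("INTERNAL ERROR: this ioctl should not define parameters");
    return 0;
}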
@@ -1466,6 +1492,7 @@ static int _create_and_load_v4(struct dm_task *dmt)
dmt->uuid = NULL;
dm_free(dmt->mangled_uuid);
dmt->mangled_uuid = NULL;
_dm_task_free_targets(dmt);
if (dm_task_run(dmt))
return 1;
@@ -1476,6 +1503,7 @@ static int _create_and_load_v4(struct dm_task *dmt)
dmt->uuid = NULL;
dm_free(dmt->mangled_uuid);
dmt->mangled_uuid = NULL;
_dm_task_free_targets(dmt);
/*
* Also udev-synchronize "remove" dm task that is a part of this revert!

View File

@@ -119,7 +119,9 @@ enum {
DM_DEVICE_TARGET_MSG,
DM_DEVICE_SET_GEOMETRY
DM_DEVICE_SET_GEOMETRY,
DM_DEVICE_ARM_POLL
};
/*

View File

@@ -1040,7 +1040,7 @@ static int _add_dev_node(const char *dev_name, uint32_t major, uint32_t minor,
{
char path[PATH_MAX];
struct stat info;
dev_t dev = MKDEV((dev_t)major, (dev_t)minor);
dev_t dev = MKDEV(major, minor);
mode_t old_mask;
if (!_build_dev_path(path, sizeof(path), dev_name))
@@ -1763,7 +1763,7 @@ static int _mountinfo_parse_line(const char *line, unsigned *maj, unsigned *min,
return 0;
}
devmapper += 12; /* skip fixed prefix */
for (i = 0; devmapper[i] && devmapper[i] != ' ' && i < sizeof(root); ++i)
for (i = 0; devmapper[i] && devmapper[i] != ' ' && i < sizeof(root)-1; ++i)
root[i] = devmapper[i];
root[i] = 0;
_unmangle_mountinfo_string(root, buf);
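The off-by-one being fixed above, shown in isolation: the copy loop must stop at sizeof(buf) - 1 so the terminating zero written after it stays inside the array.

#include <stdio.h>

int main(void)
{
    const char *src = "/dev/mapper/vg-lv some-very-long-mount-line";
    char root[16];
    size_t i;

    for (i = 0; src[i] && src[i] != ' ' && i < sizeof(root) - 1; ++i)
        root[i] = src[i];
    root[i] = 0;            /* always in bounds now */

    printf("%s\n", root);   /* "/dev/mapper/vg-", safely truncated */
    return 0;
}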

View File

@@ -192,6 +192,7 @@ struct load_segment {
uint64_t transaction_id; /* Thin_pool */
uint64_t low_water_mark; /* Thin_pool */
uint32_t data_block_size; /* Thin_pool + cache */
uint32_t migration_threshold; /* Cache */
unsigned skip_block_zeroing; /* Thin_pool */
unsigned ignore_discard; /* Thin_pool target vsn 1.1 */
unsigned no_discard_passdown; /* Thin_pool target vsn 1.1 */
@@ -523,7 +524,7 @@ static struct dm_tree_node *_create_dm_tree_node(struct dm_tree *dtree,
dm_list_init(&node->activated);
dm_list_init(&node->props.segs);
dev = MKDEV((dev_t)info->major, (dev_t)info->minor);
dev = MKDEV(info->major, info->minor);
if (!dm_hash_insert_binary(dtree->devs, (const char *) &dev,
sizeof(dev), node)) {
@@ -546,7 +547,7 @@ static struct dm_tree_node *_create_dm_tree_node(struct dm_tree *dtree,
static struct dm_tree_node *_find_dm_tree_node(struct dm_tree *dtree,
uint32_t major, uint32_t minor)
{
dev_t dev = MKDEV((dev_t)major, (dev_t)minor);
dev_t dev = MKDEV(major, minor);
return dm_hash_lookup_binary(dtree->devs, (const char *) &dev,
sizeof(dev));
@@ -2462,10 +2463,14 @@ static int _cache_emit_segment_line(struct dm_task *dmt,
EMIT_PARAMS(pos, " %s", name);
EMIT_PARAMS(pos, " %u", seg->policy_argc * 2);
/* Do not pass migration_threshold 2048 which is default */
EMIT_PARAMS(pos, " %u", (seg->policy_argc + (seg->migration_threshold != 2048) ? 1 : 0) * 2);
if (seg->migration_threshold != 2048)
EMIT_PARAMS(pos, " migration_threshold %u", seg->migration_threshold);
if (seg->policy_settings)
for (cn = seg->policy_settings->child; cn; cn = cn->sib)
EMIT_PARAMS(pos, " %s %" PRIu64, cn->key, cn->v->v.i);
if (cn->v) /* Skip deleted entry */
EMIT_PARAMS(pos, " %s %" PRIu64, cn->key, cn->v->v.i);
return 1;
}
@@ -3373,6 +3378,7 @@ int dm_tree_node_add_cache_target(struct dm_tree_node *node,
seg->data_block_size = data_block_size;
seg->flags = feature_flags;
seg->policy_name = policy_name;
seg->migration_threshold = 2048; /* Default migration threshold 1MiB */
/* FIXME: better validation missing */
if (policy_settings) {
@@ -3385,10 +3391,18 @@ int dm_tree_node_add_cache_target(struct dm_tree_node *node,
log_error("Cache policy parameter %s is without integer value.", cn->key);
return 0;
}
seg->policy_argc++;
if (strcmp(cn->key, "migration_threshold") == 0) {
seg->migration_threshold = cn->v->v.i;
cn->v = NULL; /* skip this entry */
} else
seg->policy_argc++;
}
}
/* Always some throughput available for cache to proceed */
if (seg->migration_threshold < data_block_size * 8)
seg->migration_threshold = data_block_size * 8;
return 1;
}
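A worked example of the parameter accounting above (illustrative values): each policy setting is emitted as a "key value" pair, i.e. two words on the table line, and a non-default migration_threshold contributes one more pair. The clamp at the end guarantees the threshold never drops below eight data blocks, so cache migration always has some throughput available.

#include <stdio.h>

int main(void)
{
    unsigned policy_argc = 2;              /* e.g. two mq policy settings */
    unsigned migration_threshold = 4096;   /* non-default (default is 2048) */
    unsigned data_block_size = 128;        /* sectors, illustrative */
    unsigned pairs = policy_argc + ((migration_threshold != 2048) ? 1 : 0);

    if (migration_threshold < data_block_size * 8)
        migration_threshold = data_block_size * 8;

    printf("%u argument words follow the policy name\n", pairs * 2);
    return 0;
}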

View File

@@ -221,6 +221,8 @@ retry_fcntl:
goto fail_close_unlink;
}
/* coverity[leaked_handle] intentional leak of fd handle here */
return 1;
fail_close_unlink:

View File

@@ -748,10 +748,11 @@ static void _display_fields_more(struct dm_report *rh,
id_len = strlen(type->prefix) + 3;
for (f = 0; fields[f].report_fn; f++) {
if ((type = _find_type(rh, fields[f].type)) && type->desc)
desc = type->desc;
else
desc = " ";
if (!(type = _find_type(rh, fields[f].type))) {
log_debug(INTERNAL_ERROR "Field type undefined.");
continue;
}
desc = (type->desc) ? : " ";
if (desc != last_desc) {
if (*last_desc)
log_warn(" ");
@@ -2380,7 +2381,7 @@ static const char *_get_reserved(struct dm_report *rh, unsigned type,
{
const struct dm_report_reserved_value *iter = implicit ? NULL : rh->reserved_values;
const struct dm_report_field_reserved_value *frv;
const char *tmp_begin, *tmp_end, *tmp_s = s;
const char *tmp_begin = NULL, *tmp_end = NULL, *tmp_s = s;
const char *name = NULL;
char c;

View File

@@ -1009,6 +1009,7 @@ static int _stats_parse_list(struct dm_stats *dms, const char *resp)
* dm_task_get_message_response() returns a 'const char *' but
* since fmemopen also permits "w" it expects a 'char *'.
*/
/* coverity[alloc_strlen] intentional */
if (!(list_rows = fmemopen((char *)resp, strlen(resp), "r")))
return_0;
@@ -1240,6 +1241,7 @@ static int _stats_parse_region(struct dm_stats *dms, const char *resp,
* dm_task_get_message_response() returns a 'const char *' but
* since fmemopen also permits "w" it expects a 'char *'.
*/
/* coverity[alloc_strlen] intentional */
stats_rows = fmemopen((char *)resp, strlen(resp), "r");
if (!stats_rows)
goto_bad;
@@ -2336,11 +2338,6 @@ int dm_stats_populate(struct dm_stats *dms, const char *program_id,
return 0;
}
if (!dms->nr_regions) {
log_error("No regions registered.");
return 0;
}
/* allow zero-length program_id for populate */
if (!program_id)
program_id = dms->program_id;
@@ -2352,6 +2349,11 @@ int dm_stats_populate(struct dm_stats *dms, const char *program_id,
goto_bad;
}
if (!dms->nr_regions) {
log_verbose("No stats regions registered: %s", dms->name);
return 0;
}
dms->walk_flags = DM_STATS_WALK_REGION;
dm_stats_walk_start(dms);
do {
@@ -4807,7 +4809,7 @@ uint64_t *dm_stats_update_regions_from_fd(struct dm_stats *dms, int fd,
{
struct dm_histogram *bounds = NULL;
int nr_bins, precise, regroup;
uint64_t *regions, count = 0;
uint64_t *regions = NULL, count = 0;
const char *alias = NULL;
if (!dms->regions || !dm_stats_group_present(dms, group_id)) {
@@ -4867,24 +4869,24 @@ uint64_t *dm_stats_update_regions_from_fd(struct dm_stats *dms, int fd,
group_id, &count, &regroup);
if (!regions)
goto bad;
goto_out;
if (!dm_stats_list(dms, NULL))
goto bad;
goto_bad;
/* regroup if there are regions to group */
if (regroup && (*regions != DM_STATS_REGION_NOT_PRESENT))
if (!_stats_group_file_regions(dms, regions, count, alias))
goto bad;
goto_bad;
dm_free(bounds);
dm_free((char *) alias);
return regions;
bad:
_stats_cleanup_region_ids(dms, regions, count);
dm_free(bounds);
dm_free(regions);
out:
dm_free(regions);
dm_free(bounds);
dm_free((char *) alias);
return NULL;
}
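The cleanup idiom these fixes converge on, in miniature: initialise every resource pointer to NULL so one error label can release them unconditionally, and keep resource-specific teardown (like _stats_cleanup_region_ids()) off the path where the resource was never created. A generic sketch, not the dmstats code:

#include <stdlib.h>

static char *do_work(void)
{
    char *regions = NULL, *bounds = NULL;   /* hypothetical resources */

    if (!(bounds = malloc(64)))
        goto bad;
    if (!(regions = malloc(64)))
        goto bad;
    /* ... fill regions ... */
    free(bounds);
    return regions;
bad:
    free(regions);   /* free(NULL) is safe, which the NULL init guarantees */
    free(bounds);
    return NULL;
}

int main(void)
{
    char *r = do_work();

    free(r);
    return 0;
}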

View File

@@ -17,6 +17,6 @@
#define MAJOR(dev) ((dev & 0xfff00) >> 8)
#define MINOR(dev) ((dev & 0xff) | ((dev >> 12) & 0xfff00))
#define MKDEV(ma,mi) ((mi & 0xff) | (ma << 8) | ((mi & ~0xff) << 12))
#define MKDEV(ma,mi) (((dev_t)mi & 0xff) | ((dev_t)ma << 8) | (((dev_t)mi & ~0xff) << 12))
#endif

View File

@@ -40,6 +40,11 @@ filesystem.
Unmount ext2/ext3/ext4 filesystem before doing resize.
.
.HP
.BR -l | --lvresize
.br
Resize the given device if it is an LVM device.
.
.HP
.BR -f | --force
.br
Bypass some sanity checks.

View File

@@ -475,7 +475,7 @@ Split images from a raid1 or mirror LV and use them to create a new LV.
.RE
-
Split images from a raid1 LV and track changes to origin.
Split images from a raid1 LV and track changes to origin for later merge.
.br
.P
\fBlvconvert\fP \fB--splitmirrors\fP \fINumber\fP \fB--trackchanges\fP \fILV\fP\fI_cache_raid1\fP
@@ -1281,6 +1281,8 @@ Before the separation, the cache is flushed. Also see --uncache.
Splits the specified number of images from a raid1 or mirror LV
and uses them to create a new LV. If --trackchanges is also specified,
changes to the raid1 LV are tracked while the split LV remains detached.
If --name is specified, then the images are permanently split from the
original LV and changes are not tracked.
.ad b
.HP
.ad l
@@ -1354,10 +1356,12 @@ The name of a thin pool LV.
.br
Can be used with --splitmirrors on a raid1 LV. This causes
changes to the original raid1 LV to be tracked while the split images
remain detached. This allows the read-only detached image(s) to be
merged efficiently back into the raid1 LV later. Only the regions with
changed data are resynchronized during merge. (This option only applies
when using the raid1 LV type.)
remain detached. This is a temporary state that allows the read-only
detached image to be merged efficiently back into the raid1 LV later.
Only the regions with changed data are resynchronized during merge.
While a raid1 LV is tracking changes, operations on it are limited to
merging the split image (see --mergemirrors) or permanently splitting
the image (see --splitmirrors with --name).
.ad b
.HP
.ad l

View File

@@ -84,8 +84,8 @@ For default settings, see lvmlockd -h.
.SS Initial set up
Using LVM with lvmlockd for the first time includes some one-time set up
steps:
Setting up LVM to use lvmlockd and a shared VG for the first time includes
some one-time set up steps:
.SS 1. choose a lock manager
@@ -94,7 +94,7 @@ steps:
If dlm (or corosync) are already being used by other cluster
software, then select dlm. dlm uses corosync which requires additional
configuration beyond the scope of this document. See corosync and dlm
documentation for instructions on configuration, setup and usage.
documentation for instructions on configuration, set up and usage.
.I sanlock
.br
@@ -117,7 +117,9 @@ Assign each host a unique host_id in the range 1-2000 by setting
.SS 3. start lvmlockd
Use a unit/init file, or run the lvmlockd daemon directly:
Start the lvmlockd daemon.
.br
Use systemctl, a cluster resource agent, or run directly, e.g.
.br
systemctl start lvm2-lvmlockd
@@ -125,14 +127,17 @@ systemctl start lvm2-lvmlockd
.I sanlock
.br
Use unit/init files, or start wdmd and sanlock daemons directly:
Start the sanlock and wdmd daemons.
.br
Use systemctl or run directly, e.g.
.br
systemctl start wdmd sanlock
.I dlm
.br
Follow external clustering documentation when applicable, or use
unit/init files:
Start the dlm and corosync daemons.
.br
Use systemctl, a cluster resource agent, or run directly, e.g.
.br
systemctl start corosync dlm
@@ -141,18 +146,17 @@ systemctl start corosync dlm
vgcreate --shared <vgname> <devices>
The shared option sets the VG lock type to sanlock or dlm depending on
which lock manager is running. LVM commands will perform locking for the
VG using lvmlockd. lvmlockd will use the chosen lock manager.
which lock manager is running. LVM commands acquire locks from lvmlockd,
and lvmlockd uses the chosen lock manager.
.SS 6. start VG on all hosts
vgchange --lock-start
lvmlockd requires shared VGs to be started before they are used. This is
a lock manager operation to start (join) the VG lockspace, and it may take
some time. Until the start completes, locks for the VG are not available.
LVM commands are allowed to read the VG while start is in progress. (A
unit/init file can also be used to start VGs.)
Shared VGs must be started before they are used. Starting the VG performs
lock manager initialization that is necessary to begin using locks (i.e.
creating and joining a lockspace). Starting the VG may take some time,
and until the start completes the VG may not be modified or activated.
.SS 7. create and activate LVs
@@ -168,13 +172,10 @@ multiple hosts.)
.SS Normal start up and shut down
After initial set up, start up and shut down include the following general
steps. They can be performed manually or using the system service
manager.
After initial set up, start up and shut down include the following steps.
They can be performed directly or may be automated using systemd or a
cluster resource manager/agents.
\[bu]
start lvmetad
.br
\[bu]
start lvmlockd
.br
@@ -202,114 +203,69 @@ stop lock manager
\[bu]
stop lvmlockd
.br
\[bu]
stop lvmetad
.br
.P
.SH TOPICS
.SS VG access control
.SS Protecting VGs on shared devices
The following terms are used to describe different forms of VG access
control.
The following terms are used to describe the different ways of accessing
VGs on shared devices.
.I "lockd VG"
.I "shared VG"
A "lockd VG" is a shared VG that has a "lock type" of dlm or sanlock.
Using it requires lvmlockd. These VGs exist on shared storage that is
visible to multiple hosts. LVM commands use lvmlockd to perform locking
for these VGs when they are used.
A shared VG exists on shared storage that is visible to multiple hosts.
LVM acquires locks through lvmlockd to coordinate access to shared VGs.
A shared VG has lock_type "dlm" or "sanlock", which specifies the lock
manager lvmlockd will use.
If the lock manager for the lock type is not available (e.g. not started
or failed), lvmlockd is unable to acquire locks for LVM commands. LVM
commands that only read the VG will generally be allowed to continue
without locks in this case (with a warning). Commands to modify or
activate the VG will fail without the necessary locks.
When the lock manager for the lock type is not available (e.g. not started
or failed), lvmlockd is unable to acquire locks for LVM commands. In this
situation, LVM commands are only allowed to read and display the VG;
changes and activation will fail.
.I "local VG"
A "local VG" is meant to be used by a single host. It has no lock type or
lock type "none". LVM commands and lvmlockd do not perform locking for
these VGs. A local VG typically exists on local (non-shared) devices and
cannot be used concurrently from different hosts.
A local VG is meant to be used by a single host. It has no lock type or
lock type "none". A local VG typically exists on local (non-shared)
devices and cannot be used concurrently from different hosts.
If a local VG does exist on shared devices, it should be owned by a single
host by having its system ID set, see
host by having the system ID set, see
.BR lvmsystemid (7).
Only the host with a matching system ID can use the local VG. A VG
with no lock type and no system ID should be excluded from all but one
host using lvm.conf filters. Without any of these protections, a local VG
on shared devices can be easily damaged or destroyed.
The host with a matching system ID can use the local VG and other hosts
will ignore it. A VG with no lock type and no system ID should be
excluded from all but one host using lvm.conf filters. Without any of
these protections, a local VG on shared devices can be easily damaged or
destroyed.
.I "clvm VG"
A "clvm VG" is a VG on shared storage (like a lockd VG) that requires
clvmd for clustering. See below for converting a clvm VG to a lockd VG.
A clvm VG (or clustered VG) is a VG on shared storage (like a shared VG)
that requires clvmd for clustering and locking. See below for converting
a clvm/clustered VG to a shared VG.
.SS lockd VGs from hosts not using lvmlockd
.SS shared VGs from hosts not using lvmlockd
Only hosts that use lockd VGs should be configured to run lvmlockd.
However, shared devices in lockd VGs may be visible from hosts not
using lvmlockd. From a host not using lvmlockd, lockd VGs are ignored
in the same way as foreign VGs (see
Hosts that do not use shared VGs will not be running lvmlockd. In this
case, shared VGs that are still visible to the host will be ignored
(like foreign VGs, see
.BR lvmsystemid (7).)
The --shared option for reporting and display commands causes lockd VGs
The --shared option for reporting and display commands causes shared VGs
to be displayed on a host not using lvmlockd, like the --foreign option
does for foreign VGs.
.SS vgcreate comparison
The type of VG access control is specified in the vgcreate command.
See
.BR vgcreate (8)
for all vgcreate options.
.B vgcreate <vgname> <devices>
.IP \[bu] 2
Creates a local VG with the local host's system ID when neither lvmlockd nor clvm are configured.
.IP \[bu] 2
Creates a local VG with the local host's system ID when lvmlockd is configured.
.IP \[bu] 2
Creates a clvm VG when clvm is configured.
.P
.B vgcreate --shared <vgname> <devices>
.IP \[bu] 2
Requires lvmlockd to be configured and running.
.IP \[bu] 2
Creates a lockd VG with lock type sanlock|dlm depending on which lock
manager is running.
.IP \[bu] 2
LVM commands request locks from lvmlockd to use the VG.
.IP \[bu] 2
lvmlockd obtains locks from the selected lock manager.
.P
.B vgcreate -c|--clustered y <vgname> <devices>
.IP \[bu] 2
Requires clvm to be configured and running.
.IP \[bu] 2
Creates a clvm VG with the "clustered" flag.
.IP \[bu] 2
LVM commands request locks from clvmd to use the VG.
.P
.SS creating the first sanlock VG
Creating the first sanlock VG is not protected by locking, so it requires
special attention. This is because sanlock locks exist on storage within
the VG, so they are not available until the VG exists. The first sanlock
VG created will automatically contain the "global lock". Be aware of the
following special considerations:
the VG, so they are not available until after the VG is created. The
first sanlock VG that is created will automatically contain the "global
lock". Be aware of the following special considerations:
.IP \[bu] 2
The first vgcreate command needs to be given the path to a device that has
@@ -324,54 +280,48 @@ to be accessible to all hosts that will use sanlock shared VGs. All hosts
will need to use the global lock from the first sanlock VG.
.IP \[bu] 2
While running vgcreate for the first sanlock VG, ensure that the device
being used is not used by another LVM command. Allocation of shared
devices is usually protected by the global lock, but this cannot be done
for the first sanlock VG which will hold the global lock.
.IP \[bu] 2
While running vgcreate for the first sanlock VG, ensure that the VG name
being used is not used by another LVM command. Uniqueness of VG names is
usually ensured by the global lock.
The device and VG name used by the initial vgcreate will not be protected
from concurrent use by another vgcreate on another host.
See below for more information about managing the sanlock global lock.
.SS using lockd VGs
.SS using shared VGs
There are some special considerations when using lockd VGs.
There are some special considerations when using shared VGs.
When use_lvmlockd is first enabled in lvm.conf, and before the first lockd
VG is created, no global lock will exist. In this initial state, LVM
commands try and fail to acquire the global lock, producing a warning, and
some commands are disallowed. Once the first lockd VG is created, the
global lock will be available, and LVM will be fully operational.
When use_lvmlockd is first enabled in lvm.conf, and before the first
shared VG is created, no global lock will exist. In this initial state,
LVM commands try and fail to acquire the global lock, producing a warning,
and some commands are disallowed. Once the first shared VG is created,
the global lock will be available, and LVM will be fully operational.
When a new lockd VG is created, its lockspace is automatically started on
the host that creates it. Other hosts need to run 'vgchange
--lock-start' to start the new VG before they can use it.
When a new shared VG is created, its lockspace is automatically started on
the host that creates it. Other hosts need to run 'vgchange --lock-start'
to start the new VG before they can use it.
From the 'vgs' command, lockd VGs are indicated by "s" (for shared) in the
sixth attr field. The specific lock type and lock args for a lockd VG can
be displayed with 'vgs -o+locktype,lockargs'.
From the 'vgs' command, shared VGs are indicated by "s" (for shared) in
the sixth attr field, and by "shared" in the "--options shared" report
field. The specific lock type and lock args for a shared VG can be
displayed with 'vgs -o+locktype,lockargs'.
lockd VGs need to be "started" and "stopped", unlike other types of VGs.
Shared VGs need to be "started" and "stopped", unlike other types of VGs.
See the following section for a full description of starting and stopping.
vgremove of a lockd VG will fail if other hosts have the VG started.
Run vgchange --lock-stop <vgname> on all other hosts before vgremove.
(It may take several seconds before vgremove recognizes that all hosts
have stopped a sanlock VG.)
Removing a shared VG will fail if other hosts have the VG started. Run
vgchange --lock-stop <vgname> on all other hosts before vgremove. (It may
take several seconds before vgremove recognizes that all hosts have
stopped a sanlock VG.)
.SS starting and stopping VGs
Starting a lockd VG (vgchange --lock-start) causes the lock manager to
Starting a shared VG (vgchange --lock-start) causes the lock manager to
start (join) the lockspace for the VG on the host where it is run. This
makes locks for the VG available to LVM commands on the host. Before a VG
is started, only LVM commands that read/display the VG are allowed to
continue without locks (and with a warning).
Stopping a lockd VG (vgchange --lock-stop) causes the lock manager to
Stopping a shared VG (vgchange --lock-stop) causes the lock manager to
stop (leave) the lockspace for the VG on the host where it is run. This
makes locks for the VG inaccessible to the host. A VG cannot be stopped
while it has active LVs.
@@ -380,7 +330,7 @@ When using the lock type sanlock, starting a VG can take a long time
(potentially minutes if the host was previously shut down without cleanly
stopping the VG.)
A lockd VG can be started after all the following are true:
A shared VG can be started after all the following are true:
.br
\[bu]
lvmlockd is running
@@ -392,9 +342,9 @@ the lock manager is running
the VG's devices are visible on the system
.br
A lockd VG can be stopped if all LVs are deactivated.
A shared VG can be stopped if all LVs are deactivated.
All lockd VGs can be started/stopped using:
All shared VGs can be started/stopped using:
.br
vgchange --lock-start
.br
@@ -413,12 +363,12 @@ vgchange --lock-start --lock-opt nowait ...
lvmlockd can be asked directly to stop all lockspaces:
.br
lvmlockctl --stop-lockspaces
lvmlockctl -S|--stop-lockspaces
To start only selected lockd VGs, use the lvm.conf
To start only selected shared VGs, use the lvm.conf
activation/lock_start_list. When defined, only VG names in this list are
started by vgchange. If the list is not defined (the default), all
visible lockd VGs are started. To start only "vg1", use the following
visible shared VGs are started. To start only "vg1", use the following
lvm.conf configuration:
.nf
@@ -441,7 +391,7 @@ The "auto" option causes the command to follow the lvm.conf
activation/auto_lock_start_list. If auto_lock_start_list is undefined,
all VGs are started, just as if the auto option was not used.
When auto_lock_start_list is defined, it lists the lockd VGs that should
When auto_lock_start_list is defined, it lists the shared VGs that should
be started by the auto command. VG names that do not match an item in the
list will be ignored by the auto start command.
@@ -449,23 +399,20 @@ list will be ignored by the auto start command.
commands, i.e. with or without the auto option. When the lock_start_list
is defined, only VGs matching a list item can be started with vgchange.)
The auto_lock_start_list allows a user to select certain lockd VGs that
The auto_lock_start_list allows a user to select certain shared VGs that
should be automatically started by the system (or indirectly, those that
should not).
To use auto activation of lockd LVs (see auto_activation_volume_list),
auto starting of the corresponding lockd VGs is necessary.
.SS internal command locking
To optimize the use of LVM with lvmlockd, be aware of the three kinds of
locks and when they are used:
.I GL lock
.I Global lock
The global lock (GL lock) is associated with global information, which is
information not isolated to a single VG. This includes:
The global lock is associated with global information, which is information
not isolated to a single VG. This includes:
\[bu]
The global VG namespace.
@@ -490,61 +437,58 @@ acquired.
.I VG lock
A VG lock is associated with each lockd VG. The VG lock is acquired in
shared mode to read the VG and in exclusive mode to change the VG (modify
the VG metadata or activating LVs). This lock serializes access to a VG
with all other LVM commands accessing the VG from all hosts.
A VG lock is associated with each shared VG. The VG lock is acquired in
shared mode to read the VG and in exclusive mode to change the VG or
activate LVs. This lock serializes access to a VG with all other LVM
commands accessing the VG from all hosts.
The command 'vgs' will not only acquire the GL lock to read the list of
all VG names, but will acquire the VG lock for each VG prior to reading
it.
The command 'vgs <vgname>' does not acquire the GL lock (it does not need
the list of all VG names), but will acquire the VG lock on each VG name
argument.
The command 'vgs <vgname>' does not acquire the global lock (it does not
need the list of all VG names), but will acquire the VG lock on each VG
name argument.
.I LV lock
An LV lock is acquired before the LV is activated, and is released after
the LV is deactivated. If the LV lock cannot be acquired, the LV is not
activated. LV locks are persistent and remain in place when the
activation command is done. GL and VG locks are transient, and are held
only while an LVM command is running.
activated. (LV locks are persistent and remain in place when the
activation command is done. Global and VG locks are transient, and are
held only while an LVM command is running.)
.I lock retries
If a request for a GL or VG lock fails due to a lock conflict with another
host, lvmlockd automatically retries for a short time before returning a
failure to the LVM command. If those retries are insufficient, the LVM
command will retry the entire lock request a number of times specified by
global/lvmlockd_lock_retries before failing. If a request for an LV lock
fails due to a lock conflict, the command fails immediately.
If a request for a Global or VG lock fails due to a lock conflict with
another host, lvmlockd automatically retries for a short time before
returning a failure to the LVM command. If those retries are
insufficient, the LVM command will retry the entire lock request a number
of times specified by global/lvmlockd_lock_retries before failing. If a
request for an LV lock fails due to a lock conflict, the command fails
immediately.
.SS managing the global lock in sanlock VGs
The global lock exists in one of the sanlock VGs. The first sanlock VG
created will contain the global lock. Subsequent sanlock VGs will each
contain disabled global locks that can be enabled later if necessary.
contain a disabled global lock that can be enabled later if necessary.
The VG containing the global lock must be visible to all hosts using
sanlock VGs. This can be a reason to create a small sanlock VG, visible
to all hosts, and dedicated to just holding the global lock. While not
required, this strategy can help to avoid difficulty in the future if VGs
are moved or removed.
sanlock VGs. For this reason, it can be useful to create a small sanlock
VG, visible to all hosts, and dedicated to just holding the global lock.
While not required, this strategy can help to avoid difficulty in the
future if VGs are moved or removed.
The vgcreate command typically acquires the global lock, but in the case
of the first sanlock VG, there will be no global lock to acquire until the
first vgcreate is complete. So, creating the first sanlock VG is a
special case that skips the global lock.
vgcreate for a sanlock VG determines it is the first one to exist if no
other sanlock VGs are visible. It is possible that other sanlock VGs do
exist but are not visible on the host running vgcreate. In this case,
vgcreate would create a new sanlock VG with the global lock enabled. When
the other VG containing a global lock appears, lvmlockd will see more than
one VG with a global lock enabled, and LVM commands will report that there
are duplicate global locks.
vgcreate determines that it's creating the first sanlock VG when no other
sanlock VGs are visible on the system. It is possible that other sanlock
VGs do exist, but are not visible when vgcreate checks for them. In this
case, vgcreate will create a new sanlock VG with the global lock enabled.
When another VG containing a global lock appears, lvmlockd will then
see more than one VG with a global lock enabled. LVM commands will report
that there are duplicate global locks.
If the situation arises where more than one sanlock VG contains a global
lock, the global lock should be manually disabled in all but one of them
@@ -562,8 +506,8 @@ VGs with the command:
lvmlockctl --gl-enable <vgname>
A small sanlock VG dedicated to holding the global lock can avoid the case
where the GL lock must be manually enabled after a vgremove.
(Using a small sanlock VG dedicated to holding the global lock can avoid
the case where the global lock must be manually enabled after a vgremove.)
.SS internal lvmlock LV
@@ -580,8 +524,8 @@ device, then use vgextend to add other devices.
.SS LV activation
In a shared VG, activation changes involve locking through lvmlockd, and
the following values are possible with lvchange/vgchange -a:
In a shared VG, LV activation involves locking through lvmlockd, and the
following values are possible with lvchange/vgchange -a:
.IP \fBy\fP|\fBey\fP
The command activates the LV in exclusive mode, allowing a single host
@@ -602,10 +546,6 @@ The shared mode is intended for a multi-host/cluster application or
file system.
LV types that cannot be used concurrently
from multiple hosts include thin, cache, raid, and snapshot.
lvextend on LV with shared locks is not yet allowed. The LV must be
deactivated, or activated exclusively to run lvextend. (LVs with
the mirror type can be activated in shared mode from multiple hosts
when using the dlm lock type and cmirrord.)
.IP \fBn\fP
The command deactivates the LV. After deactivating the LV, the command
@@ -660,7 +600,7 @@ with the expiring lease before other hosts can acquire its locks.
When the sanlock daemon detects that the lease storage is lost, it runs
the command lvmlockctl --kill <vgname>. This command emits a syslog
message stating that lease storage is lost for the VG and LVs must be
message stating that lease storage is lost for the VG, and LVs must be
immediately deactivated.
If no LVs are active in the VG, then the lockspace with an expiring lease
@@ -672,10 +612,10 @@ If the VG has active LVs when the lock storage is lost, the LVs must be
quickly deactivated before the lockspace lease expires. After all LVs are
deactivated, run lvmlockctl --drop <vgname> to clear the expiring
lockspace from lvmlockd. If all LVs in the VG are not deactivated within
about 40 seconds, sanlock will reset the host using the local watchdog.
The machine reset is effectively a severe form of "deactivating" LVs
before they can be activated on other hosts. The reset is considered a
better alternative than having LVs used by multiple hosts at once, which
about 40 seconds, sanlock uses wdmd and the local watchdog to reset the
host. The machine reset is effectively a severe form of "deactivating"
LVs before they can be activated on other hosts. The reset is considered
a better alternative than having LVs used by multiple hosts at once, which
could easily damage or destroy their content.
In the future, the lvmlockctl kill command may automatically attempt to
@@ -687,8 +627,7 @@ sanlock resets the machine.
If the sanlock daemon fails or exits while a lockspace is started, the
local watchdog will reset the host. This is necessary to protect any
application resources that depend on sanlock leases which will be lost
without sanlock running.
application resources that depend on sanlock leases.
.SS changing dlm cluster name
@@ -768,14 +707,14 @@ Start the VG on hosts to use it:
vgchange --lock-start <vgname>
.SS changing a local VG to a lockd VG
.SS changing a local VG to a shared VG
All LVs must be inactive to change the lock type.
lvmlockd must be configured and running as described in USAGE.
.IP \[bu] 2
Change a local VG to a lockd VG with the command:
Change a local VG to a shared VG with the command:
.br
vgchange --lock-type sanlock|dlm <vgname>
@@ -786,7 +725,7 @@ vgchange --lock-start <vgname>
.P
.SS changing a lockd VG to a local VG
.SS changing a shared VG to a local VG
All LVs must be inactive to change the lock type.
@@ -812,16 +751,16 @@ type can be forcibly changed to none with:
vgchange --lock-type none --lock-opt force <vgname>
To change a VG from one lockd type to another (i.e. between sanlock and
To change a VG from one lock type to another (i.e. between sanlock and
dlm), first change it to a local VG, then to the new type.
.SS changing a clvm VG to a lockd VG
.SS changing a clvm/clustered VG to a shared VG
All LVs must be inactive to change the lock type.
First change the clvm VG to a local VG. Within a running clvm cluster,
change a clvm VG to a local VG with the command:
First change the clvm/clustered VG to a local VG. Within a running clvm
cluster, change a clustered VG to a local VG with the command:
vgchange -cn <vgname>
@@ -829,18 +768,15 @@ If the clvm cluster is no longer running on any nodes, then extra options
can be used to forcibly make the VG local. Caution: this is only safe if
all nodes have stopped using the VG:
vgchange --config 'global/locking_type=0 global/use_lvmlockd=0'
.RS
-cn <vgname>
.RE
vgchange --lock-type none --lock-opt force <vgname>
After the VG is local, follow the steps described in "changing a local VG
to a lockd VG".
to a shared VG".
.SS limitations of lockd VGs
.SS limitations of shared VGs
Things that do not yet work in lockd VGs:
Things that do not yet work in shared VGs:
.br
\[bu]
using external origins for thin LVs
@@ -860,22 +796,22 @@ vgsplit and vgmerge (convert to a local VG to do this)
.SS lvmlockd changes from clvmd
(See above for converting an existing clvm VG to a lockd VG.)
(See above for converting an existing clvm VG to a shared VG.)
While lvmlockd and clvmd are entirely different systems, LVM command usage
remains similar. Differences are more notable when using lvmlockd's
sanlock option.
Visible usage differences between lockd VGs (using lvmlockd) and clvm VGs
(using clvmd):
Visible usage differences between shared VGs (using lvmlockd) and
clvm/clustered VGs (using clvmd):
.IP \[bu] 2
lvm.conf must be configured to use either lvmlockd (use_lvmlockd=1) or
clvmd (locking_type=3), but not both.
.IP \[bu] 2
vgcreate --shared creates a lockd VG, and vgcreate --clustered y
creates a clvm VG.
vgcreate --shared creates a shared VG, and vgcreate --clustered y
creates a clvm/clustered VG.
.IP \[bu] 2
lvmlockd adds the option of using sanlock for locking, avoiding the
@@ -896,11 +832,11 @@ lvmlockd works with thin and cache pools and LVs.
lvmlockd works with lvmetad.
.IP \[bu] 2
lvmlockd saves the cluster name for a lockd VG using dlm. Only hosts in
lvmlockd saves the cluster name for a shared VG using dlm. Only hosts in
the matching cluster can use the VG.
.IP \[bu] 2
lvmlockd requires starting/stopping lockd VGs with vgchange --lock-start
lvmlockd requires starting/stopping shared VGs with vgchange --lock-start
and --lock-stop.
.IP \[bu] 2
@@ -923,7 +859,7 @@ reporting option lock_args to view the corresponding metadata fields.
.IP \[bu] 2
In the 'vgs' command's sixth VG attr field, "s" for "shared" is displayed
for lockd VGs.
for shared VGs.
.IP \[bu] 2
If lvmlockd fails or is killed while in use, locks it held remain but are

View File

@@ -346,9 +346,9 @@ of the foreign VG to its own. See Overriding system ID above.
.SS shared VGs
A shared/lockd VG has no system ID set, allowing multiple hosts to use it
via lvmlockd. Changing a VG to a lockd type will clear the existing
system ID. Applicable only if LVM is compiled with lockd support.
A shared VG has no system ID set, allowing multiple hosts to use it
via lvmlockd. Changing a VG to shared will clear the existing
system ID. Applicable only if LVM is compiled with lvmlockd support.
.SS clustered VGs

View File

@@ -1,6 +1,6 @@
[Unit]
Description=Availability of block devices
After=lvm2-activation.service lvm2-lvmetad.service iscsi-shutdown.service iscsi.service iscsid.service fcoe.service
After=lvm2-activation.service lvm2-lvmetad.service iscsi-shutdown.service iscsi.service iscsid.service fcoe.service rbdmap.service
DefaultDependencies=no
Conflicts=shutdown.target

View File

@@ -128,7 +128,7 @@ static int generate_unit(const char *dir, int unit, int sysinit_needed)
"DefaultDependencies=no\n", f);
if (unit == UNIT_NET) {
fprintf(f, "After=%s iscsi.service fcoe.service\n"
fprintf(f, "After=%s iscsi.service fcoe.service rbdmap.service\n"
"Before=remote-fs-pre.target shutdown.target\n\n"
"[Service]\n"
"ExecStartPre=/usr/bin/udevadm settle\n", unit_names[UNIT_MAIN]);

View File

@@ -2,6 +2,7 @@
Description=LVM2 metadata daemon socket
Documentation=man:lvmetad(8)
DefaultDependencies=no
Conflicts=shutdown.target
[Socket]
ListenStream=@DEFAULT_RUN_DIR@/lvmetad.socket

View File

@@ -2,6 +2,7 @@
Description=LVM2 poll daemon socket
Documentation=man:lvmpolld(8)
DefaultDependencies=no
Conflicts=shutdown.target
[Socket]
ListenStream=@DEFAULT_RUN_DIR@/lvmpolld.socket

View File

@@ -2,7 +2,6 @@
Description=LVM2 PV scan on device %i
Documentation=man:pvscan(8)
DefaultDependencies=no
StartLimitInterval=0
BindsTo=dev-block-%i.device
Requires=lvm2-lvmetad.socket
After=lvm2-lvmetad.socket lvm2-lvmetad.service
@@ -14,3 +13,4 @@ Type=oneshot
RemainAfterExit=yes
ExecStart=@SBINDIR@/lvm pvscan --cache --activate ay %i
ExecStop=@SBINDIR@/lvm pvscan --cache %i
StartLimitInterval=0

View File

@@ -32,7 +32,10 @@
%endif
%enableif %{enable_python} python2-bindings
%enableif %{enable_python3} python3-bindings
%enableif %{enable_python} applib
# Must use this, or applib will be enabled and disabled depending on python[23] availability
%if %{enable_python3} || %{enable_python}
%enableif 1 applib
%endif
%enableif %{enable_dbusd} dbus-service
%enableif %{enable_dbusd} notify-dbus
%enableif %{enable_dmfilemapd} dmfilemapd

View File

@@ -258,6 +258,7 @@ This package contains shared lvm2 libraries for applications.
%{_libdir}/device-mapper/libdevmapper-event-lvm2mirror.so
%{_libdir}/device-mapper/libdevmapper-event-lvm2snapshot.so
%{_libdir}/device-mapper/libdevmapper-event-lvm2raid.so
%{_libdir}/device-mapper/libdevmapper-event-lvm2vdo.so
%if %{have_with thin}
%{_libdir}/device-mapper/libdevmapper-event-lvm2thin.so
%{_libdir}/libdevmapper-event-lvm2thin.so
@@ -265,6 +266,7 @@ This package contains shared lvm2 libraries for applications.
%{_libdir}/libdevmapper-event-lvm2mirror.so
%{_libdir}/libdevmapper-event-lvm2snapshot.so
%{_libdir}/libdevmapper-event-lvm2raid.so
%{_libdir}/libdevmapper-event-lvm2vdo.so
##############################################################################

View File

@@ -79,17 +79,26 @@
%global enable_python 0
%endif
%if %{rhel} >= 8 || %{fedora} >= 20
%if %{rhel} > 7 || %{fedora} >= 20
%global enable_python3 1
%endif
%if %{rhel} > 7 || %{fedora} >= 29
%global enable_python 0
%endif
%if %{rhel} > 7 || %{fedora} >= 23
%global enable_dbusd 1
%endif
%if %{enable_python}
%global buildreq_python2_devel python2-devel
%global buildreq_python_setuptools python-setuptools
%endif
%if %{enable_python3}
%if %{enable_python3} || %{enable_dbusd}
%global buildreq_python3_devel python3-devel
%global buildreq_python_setuptools python-setuptools
%global buildreq_python_setuptools python3-setuptools
%endif
##############################################################
@@ -100,15 +109,6 @@
##############################################################
%if %{rhel} >= 8 || %{fedora} >= 23
%if %{enable_python3}
%global enable_dbusd 1
%else
# dbusd requires python3
false
%endif
%endif
%if %{enable_dbusd}
%global buildreq_python3_dbus python3-dbus
%global buildreq_python3_pyudev python3-pyudev

View File

@@ -25,6 +25,8 @@ TESTNAME=${0##*/}
PS4='#${BASH_SOURCE[0]##*/}:${LINENO}+ '
export TESTNAME PS4
LVM_TEST_FLAVOUR=${LVM_TEST_FLAVOUR-}
LVM_TEST_BACKING_DEVICE=${LVM_TEST_BACKING_DEVICE-}
LVM_TEST_DEVDIR=${LVM_TEST_DEVDIR-}
LVM_TEST_NODEBUG=${LVM_TEST_NODEBUG-}
@@ -49,9 +51,9 @@ SKIP_WITH_LVMPOLLD=${SKIP_WITH_LVMPOLLD-}
SKIP_WITH_LVMLOCKD=${SKIP_WITH_LVMLOCKD-}
SKIP_ROOT_DM_CHECK=${SKIP_ROOT_DM_CHECK-}
if test -n "$LVM_TEST_FLAVOUR"; then
. "lib/flavour-$LVM_TEST_FLAVOUR"
fi
test -n "$LVM_TEST_FLAVOUR" || { echo "NOTE: Empty flavour">&2; initskip; }
test -f "lib/flavour-$LVM_TEST_FLAVOUR" || { echo "NOTE: Flavour '$LVM_TEST_FLAVOUR' does not exist">&2; initskip; }
. "lib/flavour-$LVM_TEST_FLAVOUR"
test -n "$SKIP_WITHOUT_CLVMD" && test "$LVM_TEST_LOCKING" -ne 3 && initskip
test -n "$SKIP_WITH_CLVMD" && test "$LVM_TEST_LOCKING" = 3 && initskip

View File

@@ -78,6 +78,12 @@ lvconvert --config 'allocation/cache_metadata_format=1' -y -H --cachepool $vg/cp
check lv_field $vg/$lv1 cachemetadataformat "1"
lvremove -f $vg
lvcreate --type cache-pool --cachepolicy mq --cachemetadataformat 1 -L1 $vg/cpool
check lv_field $vg/cpool cachemetadataformat "1"
lvcreate -H -L10 -n $lv1 --cachemetadataformat 2 --cachepool $vg/cpool
check lv_field $vg/$lv1 cachemetadataformat "2"
lvremove -f $vg
fi
#lvs -a -o name,cachemetadataformat,kernelmetadataformat,chunksize,cachepolicy,cachemode $vg


@@ -25,6 +25,8 @@ export LVM_TEST_THIN_REPAIR_CMD=${LVM_TEST_THIN_REPAIR_CMD-/bin/false}
#
aux have_thin 1 1 0 || skip
aux lvmconf 'devices/scan_lvs = 1'
aux prepare_vg 2 64
get_devs
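The scan_lvs override lets lvm scan active LVs themselves as potential PVs, which tests that stack PVs on top of LVs rely on; in recent releases the default is 0. A quick sanity check of the effective value, using standard lvmconfig syntax:

# should print: scan_lvs=1
lvmconfig devices/scan_lvs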


@@ -17,7 +17,7 @@ SKIP_WITH_LVMPOLLD=1
aux have_raid 1 3 2 || skip
v1_9_0=0
aux have_raid 1 9 && v1_9_0=1
aux have_raid 1 9 0 && v1_9_0=1
aux prepare_vg 8
get_devs


@@ -0,0 +1,32 @@
#!/usr/bin/env bash
# Copyright (C) 2018 Red Hat, Inc. All rights reserved.
#
# This copyrighted material is made available to anyone wishing to use,
# modify, copy, or redistribute it subject to the terms and conditions
# of the GNU General Public License v.2.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software Foundation,
# Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
# Check --splitmirrors for mirror segtype
. lib/inittest
aux prepare_vg 3
###########################################
# Mirror split tests
###########################################
# 3-way to 2-way/linear
lvcreate --type mirror -m 2 -l 2 -n $lv1 $vg
aux wait_for_sync $vg $lv1
lvconvert --splitmirrors 1 -n $lv2 -vvvv $vg/$lv1
check lv_exists $vg $lv1
check linear $vg $lv2
check active $vg $lv2
# FIXME: ensure no residual devices
vgremove -ff $vg


@@ -0,0 +1,104 @@
#!/usr/bin/env bash
# Copyright (C) 2018 Red Hat, Inc. All rights reserved.
#
# This copyrighted material is made available to anyone wishing to use,
# modify, copy, or redistribute it subject to the terms and conditions
# of the GNU General Public License v.2.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software Foundation,
# Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
SKIP_WITH_LVMLOCKD=1
SKIP_WITH_LVMPOLLD=1
. lib/inittest
# Ensure expected default region size
aux lvmconf 'activation/raid_region_size = 512'
which mkfs.ext4 || skip
aux have_raid 1 13 1 || skip
# Temporarily skip reshape tests on single-core CPUs until there's a fix for
# https://bugzilla.redhat.com/1443999 - AGK 2017/04/20
aux have_multi_core || skip
aux prepare_vg 5
#
# Test multi step linear -> raid6 conversion
# (each repeat of the identical lvconvert command advances one takeover/reshape step)
#
# Create linear LV
lvcreate -aey -L 16M -n $lv $vg
check lv_field $vg/$lv segtype "linear"
check lv_field $vg/$lv stripes 1
check lv_field $vg/$lv data_stripes 1
echo y|mkfs -t ext4 $DM_DEV_DIR/$vg/$lv
fsck -fn $DM_DEV_DIR/$vg/$lv
# Convert linear -> raid1 (takeover)
lvconvert -y --type raid6 --stripes 3 --stripesize 64K --regionsize 128K $vg/$lv
fsck -fn $DM_DEV_DIR/$vg/$lv
check lv_field $vg/$lv segtype "raid1"
check lv_field $vg/$lv stripes 2
check lv_field $vg/$lv data_stripes 2
check lv_field $vg/$lv regionsize "128.00k"
aux wait_for_sync $vg $lv
fsck -fn $DM_DEV_DIR/$vg/$lv
# Convert raid1 -> raid5_ls (takeover)
lvconvert -y --type raid6 --stripes 3 --stripesize 64K --regionsize 128K $vg/$lv
fsck -fn $DM_DEV_DIR/$vg/$lv
check lv_field $vg/$lv segtype "raid5_ls"
check lv_field $vg/$lv stripes 2
check lv_field $vg/$lv data_stripes 1
check lv_field $vg/$lv stripesize "64.00k"
check lv_field $vg/$lv regionsize "128.00k"
# Convert raid5_ls adding stripes (reshape)
lvconvert -y --type raid6 --stripes 3 --stripesize 64K --regionsize 128K $vg/$lv
fsck -fn $DM_DEV_DIR/$vg/$lv
check lv_first_seg_field $vg/$lv segtype "raid5_ls"
check lv_first_seg_field $vg/$lv stripes 4
check lv_first_seg_field $vg/$lv data_stripes 3
check lv_first_seg_field $vg/$lv stripesize "64.00k"
check lv_first_seg_field $vg/$lv regionsize "128.00k"
check lv_first_seg_field $vg/$lv reshape_len_le 8
aux wait_for_sync $vg $lv
fsck -fn $DM_DEV_DIR/$vg/$lv
# Convert raid5_ls -> raid6_ls_6 (takeover)
lvconvert -y --type raid6 --stripes 3 --stripesize 64K --regionsize 128K $vg/$lv
fsck -fn $DM_DEV_DIR/$vg/$lv
check lv_first_seg_field $vg/$lv segtype "raid6_ls_6"
check lv_first_seg_field $vg/$lv stripes 5
check lv_first_seg_field $vg/$lv data_stripes 3
check lv_first_seg_field $vg/$lv stripesize "64.00k"
check lv_first_seg_field $vg/$lv regionsize "128.00k"
check lv_first_seg_field $vg/$lv reshape_len_le 0
aux wait_for_sync $vg $lv
# Convert raid6_ls_6 -> raid6(_zr) (reshape)
lvconvert -y --type raid6 --stripes 3 --stripesize 64K --regionsize 128K $vg/$lv
fsck -fn $DM_DEV_DIR/$vg/$lv
check lv_first_seg_field $vg/$lv segtype "raid6"
check lv_first_seg_field $vg/$lv stripes 5
check lv_first_seg_field $vg/$lv data_stripes 3
check lv_first_seg_field $vg/$lv stripesize "64.00k"
check lv_first_seg_field $vg/$lv regionsize "128.00k"
check lv_first_seg_field $vg/$lv reshape_len_le 10
aux wait_for_sync $vg $lv
# Remove reshape space
lvconvert -y --type raid6 --stripes 3 --stripesize 64K --regionsize 128K $vg/$lv
fsck -fn $DM_DEV_DIR/$vg/$lv
check lv_first_seg_field $vg/$lv segtype "raid6"
check lv_first_seg_field $vg/$lv stripes 5
check lv_first_seg_field $vg/$lv data_stripes 3
check lv_first_seg_field $vg/$lv stripesize "64.00k"
check lv_first_seg_field $vg/$lv regionsize "128.00k"
check lv_first_seg_field $vg/$lv reshape_len_le 0
vgremove -ff $vg
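While any of the step conversions above is resynchronizing or reshaping, progress can be watched with an lvs report; a sketch using the same fields the checks read, plus the standard sync_percent field:

# segtype/reshape_len_le are the fields checked above; sync_percent shows progress
lvs -a -o name,segtype,reshape_len_le,sync_percent $vg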


@@ -0,0 +1,80 @@
#!/usr/bin/env bash
# Copyright (C) 2018 Red Hat, Inc. All rights reserved.
#
# This copyrighted material is made available to anyone wishing to use,
# modify, copy, or redistribute it subject to the terms and conditions
# of the GNU General Public License v.2.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software Foundation,
# Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
SKIP_WITH_LVMLOCKD=1
SKIP_WITH_LVMPOLLD=1
. lib/inittest
aux lvmconf 'activation/raid_region_size = 512'
which mkfs.ext4 || skip
aux have_raid 1 13 1 || skip
# Temporarily skip reshape tests on single-core CPUs until there's a fix for
# https://bugzilla.redhat.com/1443999 - AGK 2017/04/20
aux have_multi_core || skip
aux prepare_vg 5
#
# Test multi step linear -> striped conversion
#
# Create linear LV
lvcreate -aey -L 16M -n $lv $vg
check lv_field $vg/$lv segtype "linear"
check lv_field $vg/$lv stripes 1
check lv_field $vg/$lv data_stripes 1
echo y|mkfs -t ext4 $DM_DEV_DIR/$vg/$lv
fsck -fn $DM_DEV_DIR/$vg/$lv
# Convert linear -> raid1
lvconvert -y --type striped --stripes 4 --stripesize 64K --regionsize 128K $vg/$lv
fsck -fn $DM_DEV_DIR/$vg/$lv
check lv_field $vg/$lv segtype "raid1"
check lv_field $vg/$lv stripes 2
check lv_field $vg/$lv data_stripes 2
check lv_field $vg/$lv regionsize "128.00k"
aux wait_for_sync $vg $lv
fsck -fn $DM_DEV_DIR/$vg/$lv
# Convert raid1 -> raid5_n
lvconvert -y --type striped --stripes 4 --stripesize 64K --regionsize 128K $vg/$lv
fsck -fn $DM_DEV_DIR/$vg/$lv
check lv_field $vg/$lv segtype "raid5_n"
check lv_field $vg/$lv stripes 2
check lv_field $vg/$lv data_stripes 1
check lv_field $vg/$lv stripesize "64.00k"
check lv_field $vg/$lv regionsize "128.00k"
# Convert raid5_n adding stripes
lvconvert -y --type striped --stripes 4 --stripesize 64K --regionsize 128K $vg/$lv
fsck -fn $DM_DEV_DIR/$vg/$lv
check lv_first_seg_field $vg/$lv segtype "raid5_n"
check lv_first_seg_field $vg/$lv data_stripes 4
check lv_first_seg_field $vg/$lv stripes 5
check lv_first_seg_field $vg/$lv data_stripes 4
check lv_first_seg_field $vg/$lv stripesize "64.00k"
check lv_first_seg_field $vg/$lv regionsize "128.00k"
check lv_first_seg_field $vg/$lv reshape_len_le 10
aux wait_for_sync $vg $lv
fsck -fn $DM_DEV_DIR/$vg/$lv
# Convert raid5_n -> striped
lvconvert -y --type striped --stripes 4 --stripesize 64K --regionsize 128K $vg/$lv
fsck -fn $DM_DEV_DIR/$vg/$lv
check lv_first_seg_field $vg/$lv segtype "striped"
check lv_first_seg_field $vg/$lv stripes 4
check lv_first_seg_field $vg/$lv data_stripes 4
check lv_first_seg_field $vg/$lv stripesize "64.00k"
vgremove -ff $vg


@@ -14,6 +14,8 @@ SKIP_WITH_LVMPOLLD=1
. lib/inittest
aux lvmconf 'activation/raid_region_size = 512'
which mkfs.ext4 || skip
aux have_raid 1 12 0 || skip


@@ -0,0 +1,89 @@
#!/usr/bin/env bash
# Copyright (C) 2018 Red Hat, Inc. All rights reserved.
#
# This copyrighted material is made available to anyone wishing to use,
# modify, copy, or redistribute it subject to the terms and conditions
# of the GNU General Public License v.2.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software Foundation,
# Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
SKIP_WITH_LVMLOCKD=1
SKIP_WITH_LVMPOLLD=1
. lib/inittest
aux lvmconf 'activation/raid_region_size = 512'
which mkfs.ext4 || skip
aux have_raid 1 13 1 || skip
# Temporarily skip reshape tests on single-core CPUs until there's a fix for
# https://bugzilla.redhat.com/1443999 - AGK 2017/04/20
aux have_multi_core || skip
aux prepare_vg 5
#
# Test multi step striped -> linear conversion
#
# Create 4-way striped LV
lvcreate -aey --type striped -L 16M --stripes 4 --stripesize 64K -n $lv $vg
check lv_first_seg_field $vg/$lv segtype "striped"
check lv_first_seg_field $vg/$lv stripes 4
check lv_first_seg_field $vg/$lv data_stripes 4
check lv_first_seg_field $vg/$lv stripesize "64.00k"
echo y|mkfs -t ext4 $DM_DEV_DIR/$vg/$lv
fsck -fn $DM_DEV_DIR/$vg/$lv
lvextend -y -L64M $DM_DEV_DIR/$vg/$lv
# Convert striped -> raid5_n
lvconvert -y --type linear $vg/$lv
check lv_field $vg/$lv segtype "raid5_n"
check lv_field $vg/$lv data_stripes 4
check lv_field $vg/$lv stripes 5
check lv_field $vg/$lv data_stripes 4
check lv_field $vg/$lv stripesize "64.00k"
check lv_field $vg/$lv regionsize "512.00k"
check lv_field $vg/$lv reshape_len_le 0
aux wait_for_sync $vg $lv
fsck -fn $DM_DEV_DIR/$vg/$lv
# Restripe raid5_n LV to single data stripe
#
# Need --force in order to remove stripes thus shrinking LV size!
lvconvert -y --force --type linear $vg/$lv
aux wait_for_sync $vg $lv 1
fsck -fn $DM_DEV_DIR/$vg/$lv
# Remove the now freed stripes
lvconvert -y --type linear $vg/$lv
check lv_field $vg/$lv segtype "raid5_n"
check lv_field $vg/$lv stripes 2
check lv_field $vg/$lv data_stripes 1
check lv_field $vg/$lv stripesize "64.00k"
check lv_field $vg/$lv regionsize "512.00k"
check lv_field $vg/$lv reshape_len_le 4
# Convert raid5_n -> raid1
lvconvert -y --type linear $vg/$lv
check lv_field $vg/$lv segtype "raid1"
check lv_field $vg/$lv stripes 2
check lv_field $vg/$lv data_stripes 2
check lv_field $vg/$lv stripesize 0
check lv_field $vg/$lv regionsize "512.00k"
check lv_field $vg/$lv reshape_len_le ""
fsck -fn $DM_DEV_DIR/$vg/$lv
# Convert raid1 -> linear
lvconvert -y --type linear $vg/$lv
check lv_first_seg_field $vg/$lv segtype "linear"
check lv_first_seg_field $vg/$lv stripes 1
check lv_first_seg_field $vg/$lv data_stripes 1
check lv_first_seg_field $vg/$lv stripesize 0
check lv_first_seg_field $vg/$lv regionsize 0
fsck -fn $DM_DEV_DIR/$vg/$lv
vgremove -ff $vg


@@ -15,6 +15,8 @@ SKIP_WITH_LVMPOLLD=1
. lib/inittest
aux lvmconf 'activation/raid_region_size = 512'
which mkfs.ext4 || skip
aux have_raid 1 12 0 || skip
@@ -51,7 +53,9 @@ aux wait_for_sync $vg $lv1
fsck -fn $DM_DEV_DIR/$vg/$lv1
# Extend raid5_n LV by factor 4 to keep size once linear
lvresize -y -L 64 $vg/$lv1
lvresize -y -L 64M $vg/$lv1
aux wait_for_sync $vg $lv1
check lv_field $vg/$lv1 segtype "raid5_n"
check lv_field $vg/$lv1 data_stripes 4
check lv_field $vg/$lv1 stripes 5
@@ -87,6 +91,7 @@ check lv_first_seg_field $vg/$lv1 stripes 2
check lv_first_seg_field $vg/$lv1 stripesize "32.00k"
check lv_first_seg_field $vg/$lv1 regionsize "1.00m"
check lv_first_seg_field $vg/$lv1 reshape_len_le 4
fsck -fn $DM_DEV_DIR/$vg/$lv1
# Convert raid5_n to raid1
lvconvert -y --type raid1 $vg/$lv1
@@ -97,6 +102,7 @@ check lv_first_seg_field $vg/$lv1 stripes 2
check lv_first_seg_field $vg/$lv1 stripesize "0"
check lv_first_seg_field $vg/$lv1 regionsize "1.00m"
check lv_first_seg_field $vg/$lv1 reshape_len_le ""
fsck -fn $DM_DEV_DIR/$vg/$lv1
# Convert raid1 -> linear
lvconvert -y --type linear $vg/$lv1
@@ -107,5 +113,6 @@ check lv_first_seg_field $vg/$lv1 stripes 1
check lv_first_seg_field $vg/$lv1 stripesize "0"
check lv_first_seg_field $vg/$lv1 regionsize "0"
check lv_first_seg_field $vg/$lv1 reshape_len_le ""
fsck -fn $DM_DEV_DIR/$vg/$lv1
vgremove -ff $vg


@@ -46,6 +46,7 @@ check lv_first_seg_field $vg/$lv1 stripesize "64.00k"
check lv_first_seg_field $vg/$lv1 data_stripes 10
check lv_first_seg_field $vg/$lv1 stripes 11
echo y|mkfs -t ext4 /dev/$vg/$lv1
fsck -fn /dev/$vg/$lv1
mkdir -p $mount_dir
mount "$DM_DEV_DIR/$vg/$lv1" $mount_dir
@@ -53,8 +54,8 @@ mkdir -p $mount_dir/1 $mount_dir/2
echo 3 >/proc/sys/vm/drop_caches
cp -r /usr/bin $mount_dir/1 >/dev/null 2>/dev/null &
cp -r /usr/bin $mount_dir/2 >/dev/null 2>/dev/null &
cp -r /usr/bin $mount_dir/1 &>/dev/null &
cp -r /usr/bin $mount_dir/2 &>/dev/null &
sync &
aux wait_for_sync $vg $lv1
@@ -69,11 +70,11 @@ check lv_first_seg_field $vg/$lv1 stripesize "64.00k"
check lv_first_seg_field $vg/$lv1 data_stripes 15
check lv_first_seg_field $vg/$lv1 stripes 16
rm -fr $mount_dir/2
sync
kill -9 %%
wait
rm -fr $mount_dir/[12]
sync
umount $mount_dir
fsck -fn "$DM_DEV_DIR/$vg/$lv1"


@@ -0,0 +1,76 @@
#!/usr/bin/env bash
# Copyright (C) 2017 Red Hat, Inc. All rights reserved.
#
# This copyrighted material is made available to anyone wishing to use,
# modify, copy, or redistribute it subject to the terms and conditions
# of the GNU General Public License v.2.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software Foundation,
# Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
SKIP_WITH_LVMLOCKD=1
SKIP_WITH_LVMPOLLD=1
. lib/inittest
which mkfs.ext4 || skip
aux have_raid 1 12 0 || skip
# Temporarily skip reshape tests on single-core CPUs until there's a fix for
# https://bugzilla.redhat.com/1443999 - AGK 2017/04/20
aux have_multi_core || skip
aux prepare_vg 5
#
# Test single step linear -> striped conversion
#
# Create linear LV
lvcreate -aey -L 16M -n $lv $vg
check lv_field $vg/$lv segtype "linear"
check lv_field $vg/$lv stripes 1
check lv_field $vg/$lv data_stripes 1
echo y|mkfs -t ext4 $DM_DEV_DIR/$vg/$lv
fsck -fn $DM_DEV_DIR/$vg/$lv
# Convert linear -> raid1
not lvconvert -y --stripes 4 $vg/$lv
not lvconvert -y --stripes 4 --stripesize 64K $vg/$lv
not lvconvert -y --stripes 4 --stripesize 64K --regionsize 512K $vg/$lv
lvconvert -y --type striped --stripes 4 --stripesize 64K --regionsize 512K $vg/$lv
fsck -fn $DM_DEV_DIR/$vg/$lv
check lv_field $vg/$lv segtype "raid1"
check lv_field $vg/$lv stripes 2
check lv_field $vg/$lv data_stripes 2
check lv_field $vg/$lv regionsize "512.00k"
aux wait_for_sync $vg $lv
fsck -fn $DM_DEV_DIR/$vg/$lv
# Convert raid1 -> raid5_n
lvconvert -y --type striped --stripes 4 --stripesize 64K --regionsize 512K $vg/$lv
check lv_field $vg/$lv segtype "raid5_n"
check lv_field $vg/$lv stripes 2
check lv_field $vg/$lv data_stripes 1
check lv_field $vg/$lv stripesize "64.00k"
check lv_field $vg/$lv regionsize "512.00k"
fsck -fn $DM_DEV_DIR/$vg/$lv
# Convert raid5_n adding stripes
lvconvert -y --type striped --stripes 4 --stripesize 64K --regionsize 512K $vg/$lv
check lv_first_seg_field $vg/$lv segtype "raid5_n"
check lv_first_seg_field $vg/$lv data_stripes 4
check lv_first_seg_field $vg/$lv stripes 5
check lv_first_seg_field $vg/$lv stripesize "64.00k"
check lv_first_seg_field $vg/$lv regionsize "512.00k"
check lv_first_seg_field $vg/$lv reshape_len_le 10
aux wait_for_sync $vg $lv
fsck -fn $DM_DEV_DIR/$vg/$lv
resize2fs $DM_DEV_DIR/$vg/$lv
# Convert raid5_n -> striped
lvconvert -y --type striped $vg/$lv
fsck -fn $DM_DEV_DIR/$vg/$lv
vgremove -ff $vg


@@ -108,11 +108,19 @@ function _invalid_raid5_conversions
not _lvconvert raid6 raid6_n_6 4 6 $vg $lv1
}
# Check raid6 conversion constraints of minimum 3 stripes
_lvcreate striped 2 2 4m $vg $lv1
not _lvconvert raid6 raid6_n_6 2 4 $vg $lv1
lvremove -y $vg
# Check raid6 conversion constraints for 2 stripes
for type in striped raid0 raid0_meta
do
_lvcreate $type 2 2 4m $vg $lv1
not _lvconvert raid6 raid6_n_6 2 4 $vg $lv1
_lvconvert raid6 raid5_n 2 3 $vg $lv1
_lvconvert raid6 raid5_n 3 4 $vg $lv1
_lvconvert raid6 raid6_n_6 3 5 $vg $lv1
lvremove -y $vg
done
# Check raid6 conversion constraints of minimum 3 stripes
_lvcreate raid0 3 3 4m $vg $lv1
_lvconvert raid6 raid6_n_6 3 5 $vg $lv1
lvremove -y $vg


@@ -0,0 +1,25 @@
#!/usr/bin/env bash
# Copyright (C) 2018 Red Hat, Inc. All rights reserved.
#
# This copyrighted material is made available to anyone wishing to use,
# modify, copy, or redistribute it subject to the terms and conditions
# of the GNU General Public License v.2.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software Foundation,
# Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
SKIP_WITH_LVMPOLLD=1
. lib/inittest
aux have_raid 1 7 0 || skip
aux prepare_vg 3 16
lvcreate -aey --type raid0 -i 3 -l3 -n $lv $vg
lvconvert -y --type striped $vg/$lv
check lv_field $vg/$lv segtype "striped"
vgremove -ff $vg


@@ -0,0 +1,43 @@
#!/usr/bin/env bash
# Copyright (C) 2018 Red Hat, Inc. All rights reserved.
#
# This copyrighted material is made available to anyone wishing to use,
# modify, copy, or redistribute it subject to the terms and conditions
# of the GNU General Public License v.2.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software Foundation,
# Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
SKIP_WITH_LVMPOLLD=1
. lib/inittest
# rhbz1579072/rhbz1579438
aux have_raid 1 3 0 || skip
# 8 PVs needed for RAID10 testing (4-stripes/2-mirror)
aux prepare_pvs 4 2
get_devs
vgcreate $SHARED -s 512k "$vg" "${DEVICES[@]}"
lvcreate -y --ty raid1 -m 2 -n $lv1 -l 1 $vg
lvconvert -y --splitmirrors 1 --trackchanges $vg/$lv1
not lvconvert -y --ty linear $vg/$lv1
not lvconvert -y --ty striped -i 3 $vg/$lv1
not lvconvert -y --ty mirror $vg/$lv1
not lvconvert -y --ty raid4 $vg/$lv1
not lvconvert -y --ty raid5 $vg/$lv1
not lvconvert -y --ty raid6 $vg/$lv1
not lvconvert -y --ty raid10 $vg/$lv1
not lvconvert -y --ty striped -m 1 $vg/${lv1}_rimage_2
not lvconvert -y --ty raid1 -m 1 $vg/${lv1}_rimage_2
not lvconvert -y --ty mirror -m 1 $vg/${lv1}_rimage_2
not lvconvert -y --ty cache-pool $vg/${lv1}_rimage_2
not lvconvert -y --ty thin-pool $vg/${lv1}_rimage_2
vgremove -ff $vg
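These refusals are the point of the test: while an image is tracked, the RaidLV may only be converted back, not sideways. For reference, a tracked image split off as above is normally rejoined with --merge, per standard lvconvert raid1 usage (a sketch using this test's names):

# rejoin the image that --trackchanges split off
lvconvert --merge $vg/${lv1}_rimage_2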


@@ -0,0 +1,25 @@
#!/usr/bin/env bash
# Copyright (C) 2018 Red Hat, Inc. All rights reserved.
#
# This copyrighted material is made available to anyone wishing to use,
# modify, copy, or redistribute it subject to the terms and conditions
# of the GNU General Public License v.2.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software Foundation,
# Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
SKIP_WITH_LVMPOLLD=1
. lib/inittest
aux have_raid 1 7 0 || skip
aux prepare_vg 3 16
lvcreate -aey --type striped -i 3 -l3 -n $lv $vg
lvconvert -y --type raid0_meta $vg/$lv
check lv_field $vg/$lv segtype "raid0_meta"
vgremove -ff $vg


@@ -17,6 +17,8 @@ export LVM_TEST_THIN_REPAIR_CMD=${LVM_TEST_THIN_REPAIR_CMD-/bin/false}
. lib/inittest
aux lvmconf 'devices/scan_lvs = 1'
prepare_lvs() {
lvremove -f $vg
lvcreate -L10M -n $lv1 $vg


@@ -20,6 +20,7 @@ SKIP_WITH_LVMPOLLD=1
# FIXME update test to make something useful on <16T
aux can_use_16T || skip
aux have_raid 1 3 0 || skip
aux lvmconf 'devices/scan_lvs = 1'
aux prepare_vg 5


@@ -17,6 +17,8 @@ SKIP_WITH_LVMPOLLD=1
. lib/inittest
aux lvmconf 'devices/scan_lvs = 1'
# FIXME update test to make something useful on <16T
aux can_use_16T || skip

test/shell/lvm-on-md.sh (new file, 87 lines)

@@ -0,0 +1,87 @@
#!/usr/bin/env bash
# Copyright (C) 2018 Red Hat, Inc. All rights reserved.
#
# This copyrighted material is made available to anyone wishing to use,
# modify, copy, or redistribute it subject to the terms and conditions
# of the GNU General Public License v.2.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software Foundation,
# Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
SKIP_WITH_LVMPOLLD=1
. lib/inittest
test -f /proc/mdstat && grep -q raid1 /proc/mdstat || \
modprobe raid1 || skip
aux lvmconf 'devices/md_component_detection = 1'
aux extend_filter_LVMTEST "a|/dev/md|"
aux prepare_devs 2
# create 2 disk MD raid1 array
# by default using metadata format 1.0 with data at the end of device
aux prepare_md_dev 1 64 2 "$dev1" "$dev2"
mddev=$(< MD_DEV)
pvdev=$(< MD_DEV_PV)
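prepare_md_dev wraps mdadm; an assumed rough equivalent of the call above, with /dev/md33 standing in for whatever node the helper actually picks:

# 2-leg 64MiB raid1 with 1.0 metadata: the superblock sits at the end of
# each leg, so a leg's start is bit-identical to the PV and can be mis-scanned
mdadm --create /dev/md33 --run --metadata=1.0 --level=1 --raid-devices=2 \
      --size=64M "$dev1" "$dev2"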
vgcreate $vg "$mddev"
lvs $vg
lvcreate -n $lv1 -l 2 $vg
lvcreate -n $lv2 -l 2 -an $vg
lvchange -ay $vg/$lv2
lvs $vg
pvs -vvvv 2>&1|tee pvs.out
vgchange -an $vg
vgchange -ay -vvvv $vg 2>&1| tee vgchange.out
lvs $vg
pvs
vgchange -an $vg
mdadm --stop "$mddev"
# with md superblock 1.0 this pvs will report duplicates
# for the two md legs since the md device itself is not
# started
pvs 2>&1 |tee out
cat out
grep "prefers device" out
pvs -vvvv 2>&1| tee pvs2.out
# should not activate from the md legs
not vgchange -ay -vvvv $vg 2>&1|tee vgchange-fail.out
# should not show an active lv
lvs $vg
# start the md dev
mdadm --assemble "$mddev" "$dev1" "$dev2"
# Now that the md dev is online, pvs can see it and
# ignore the two legs, so there's no duplicate warning
pvs 2>&1 |tee out
cat out
not grep "prefers device" out
vgchange -ay $vg 2>&1 |tee out
cat out
not grep "prefers device" out
vgchange -an $vg
vgremove -f $vg


@@ -89,6 +89,7 @@ sleep 1
# (when mdadm supports repair)
if mdadm --action=repair "$mddev" ; then
sleep 1
pvscan -vvvv
# should be showing correctly PV3 & PV4
pvs
pvs -vvvv "$dev3" "$dev4"
fi

Some files were not shown because too many files have changed in this diff.