1
0
mirror of git://sourceware.org/git/lvm2.git synced 2024-12-21 13:34:40 +03:00
Commit Graph

5646 Commits

Author SHA1 Message Date
Zdenek Kabelac
2b18be87aa raid: recognize transient failed raid leg
When raid leg rimage device is marked as 'D'ead by mdcore,
lvm2 was not able to replace such device with allocate policy,
as device has not appared as missing.

Add detection of transiently failing devices.
2017-06-23 23:27:07 +02:00
Zdenek Kabelac
cc03a872c0 cleanup: update messages 2017-06-23 18:44:01 +02:00
Zdenek Kabelac
a7c7d53543 debug: add missing internal error message
Do not just 'return_0'  log error would need to be shown.
2017-06-23 18:44:01 +02:00
Zdenek Kabelac
1bdcd156fd cache: restore origin only reload
Basically reverting commit 58a9f88b8c.
We can use origin_only  in case we are snapshot's origin,
as we do support this stack.

So when we are 'uncaching'  origin+snaps - we do need to reload only
origin and we do not need to play with snaps.
2017-06-23 18:44:01 +02:00
Zdenek Kabelac
63ecbcd1b7 raid: switch message to verbose
As this is not 'error' resulting query, decrease reported level.
2017-06-23 18:44:01 +02:00
Zdenek Kabelac
6d30350dd1 raid: improving messages for regionsize change
Handle change of 'region size' better and follow also standard rule
if the command can't success (i.e. size is already same) we return
error for all such cases.

Also log_pring more info about adjusted value (just like we
do for rounding)

Also avoid keep pointers on 'display_*' values - they are in
ringbuffer for immediate use - not to be kept across multiple calls
(as they could be already overwritten by later calls) - so dropped
seg_region_size_str
2017-06-23 18:44:00 +02:00
Zdenek Kabelac
41c10034aa debug: show message only when origin_only was set 2017-06-22 20:17:20 +02:00
Zdenek Kabelac
58e075f5fb cache: fix lvdisplay output
Unused cache pool may have lots of fields actually undefined,
so avoid printing them, if they are not specified in metadata.
2017-06-22 20:17:18 +02:00
Zdenek Kabelac
732928dda8 cache: fix lvdisplay --maps
'lvdisplay -m' tried to go through NULL policy settings,
when such policy was not defined for CachedLV.

Patch is fixing display of cache-pool without defined settings,
as this is now a valid pool and we mostly want users to define
these settings when actually really caching a LV.
2017-06-22 20:15:12 +02:00
Zdenek Kabelac
58a9f88b8c cache: drop usage of origin_only
Since cache LV can be a stacked device, there is no real reason
trying to use slight optimised tree for origin_only cache reload
(it could be even wrongly implemented in this case).

We can easily go with stardard tree load here.
2017-06-22 20:14:31 +02:00
Zdenek Kabelac
ca9e6cec61 cache: make syncing abortable by user
When user runs command like 'lvconvert --splitcache' the operation
might be actually either slow or not making any progress in kernel,
so lets give user a chance to abort such operation.

When user press 'Ctrl+C' device table is restored to pre-flushing state.
2017-06-22 20:11:43 +02:00
Heinz Mauelshagen
2df9a78684 mirror: reformat conditional 2017-06-22 00:57:16 +02:00
Heinz Mauelshagen
64fac77e8a raid: fix segfault
Add missing else clause
(already missing in initial commit fe18e5e77a).

Resolves: rhbz1463794
2017-06-22 00:49:00 +02:00
Zdenek Kabelac
e3f63693a4 lvresize: support passing --yes to fsadm
Since fsadm now needs --yes to pass prompting operations,
we need to pass --yes from  lvresize to fsadm.
2017-06-21 14:03:29 +02:00
Zdenek Kabelac
48f06005ab raid: update path for repair
Updating path from commit 61980bcf06.

When repair is running, no removing PVS are given so it shall return
success in such case.
2017-06-21 14:00:50 +02:00
Zdenek Kabelac
5f4cfa7c4a debug: missing traces 2017-06-21 12:36:01 +02:00
Zdenek Kabelac
07fe64b473 raid: use log_error on error path
Converting log_warn to log_error since error must be logged
when tool returns error.
2017-06-21 12:35:17 +02:00
Zdenek Kabelac
61980bcf06 raid: report error when specified devices are not contained
lvm2 always return non-zero error code when action cannot happen.
2017-06-21 12:35:17 +02:00
Zdenek Kabelac
31d153ced0 raid: drop debug code 2017-06-21 12:35:16 +02:00
Zdenek Kabelac
49fa2bea1c raid: more origin_only updates
Seems the code is multiplied - so keep it consistent for now.

TODO:  drop all uneeded code
2017-06-21 12:35:16 +02:00
Heinz Mauelshagen
1766eaec4b lvconvert: provide better reshape reject message for open RaidLV
On commits
5e611c700b and
601ad1c73f.

Related: rhbz1447812
2017-06-20 19:06:18 +02:00
Heinz Mauelshagen
76314183e2 raid: avoid explicit activation of SubLVs on reshape/takeover
Remove explicit activation of SubLVs and let lv_update_and_reload()
perform the proper (pre-)loading sequencing of tables.
This avoids related callback functions which are removed.

Related: rhbz1448116
Related: rhbz1461526
Related: rhbz1448123
2017-06-20 18:56:45 +02:00
Heinz Mauelshagen
0dfe1bc29d raid: provide clickable URL BZ references 2017-06-20 18:43:26 +02:00
Zdenek Kabelac
1ea41b6d48 activation: fix usage of origin_only
When lock-holding LV differs from actually request locked LV,
we drop  origin_only flag as it has no use - it'd be applied
on completely different LV.

Example of problem:

Raid is  thin-pool _tdata LV.
Raid run  origin_only locking on stacked device.
As lock holder is discovered thinLV.
Whole origin_only operation is then applied only on thinLV
changing the meaning of whole operation.

NOTE: this patch does not change anything for LV that are
already top-level lock holding LVs (i.e. thinLVs, snahoshots/origins).
2017-06-20 18:23:24 +02:00
Heinz Mauelshagen
5e611c700b lvconvert: check open count to disable reshaping of open RAID LV
Also check LV open count in addition to opening the RaidLV
exclusively as of commit 601ad1c73f.

Related: rhbz1447812
2017-06-20 17:59:10 +02:00
Heinz Mauelshagen
601ad1c73f lvconvert: enhance disable reshaping of open RAID LV
Enhance commit 9e9163618a
to use dev_open_flags/dev_close API.

Related: rhbz1447812
2017-06-20 17:27:58 +02:00
Zdenek Kabelac
19cc03fa52 thin: restore conversion to raid
Since commit  1bc546269a we've disabled
coversion of raid. This however already got fixed, so reenable
commands like:  'lvconvert --type raid1 vg/pool_tdata'.
2017-06-19 23:30:08 +02:00
Heinz Mauelshagen
9e9163618a lvconvert: disable reshaping of open RAID LV
Disable until we have a proper fix for reshape space allocation,
switching it to begin/end of rimages and activation.

Related: rhbz1447812
2017-06-19 22:25:54 +02:00
Heinz Mauelshagen
e1a1c20e95 lvconvert: enhance message
Enhance message introduced by last
commit f342e803ba.

Related: rhbz1439399
2017-06-19 21:40:38 +02:00
Heinz Mauelshagen
f342e803ba lvconvert: disable conversion of RAID LV under snapshot
Disable until we have a proper fix for reshape space allocation,
switching it to begin/end of rimages and activation.

Related: rhbz1439399
2017-06-19 21:08:52 +02:00
Heinz Mauelshagen
fb46175ce7 lvconvert: disable reshaping of RAID LVs in the cluster
Disable until we have a proper fix for reshape space allocation,
switching it to begin/end of rimages and activation in the cluster.

Related: rhbz1448116
Related: rhbz1461526
Related: rhbz1448123
2017-06-19 21:06:53 +02:00
Zdenek Kabelac
fbb3bffb22 debug: passing non-raid seg would be internal error 2017-06-16 17:04:02 +02:00
Zdenek Kabelac
9e96f96a41 cleanup: drop unused parameter 2017-06-16 17:04:02 +02:00
Zdenek Kabelac
cdb55c19cd cleanup: show what happens when passed prompt
When we show prompt and user passes --yes - we still
do tell user which action is going to happen.
2017-06-16 17:04:02 +02:00
Zdenek Kabelac
14816222a1 cleanup: improve debug tracing 2017-06-16 17:04:02 +02:00
Zdenek Kabelac
b7c9ec8a24 cleanup: use 'dm_get_status_raid'
Use single 'dm' call to parse raid status.
(Avoiding multiple parsers - even when we know it's slighly
less efficient).
2017-06-16 17:04:01 +02:00
Zdenek Kabelac
59d646167f raid: report percent with segtype info
Enhance reporting code, so it does not need to do 'extra' ioctl to
get 'status' of normal raid and provide percentage directly.

When we have 'merging' snapshot into raid origin, we still need to get
this secondary number with extra status call - however, since  'raid'
is always a single segment LV - we may skip 'copy_percent' call as
we directly know the percent and also with better precision.

NOTE: for mirror we still base reported number on the percetage of
transferred extents which might get quite imprecisse if big size
of extent is used while volume itself is smaller as reporting jump
steps are much bigger the actual reported number provides.

2nd.NOTE: raid lvs line report already requires quite a few extra status
calls for the same device - but fix will be need slight code improval.
2017-06-16 17:04:01 +02:00
Heinz Mauelshagen
40e0dcf70d raid: adjust reshape feature flag check
Relative to last comit ddf2a1d656:

adjust the dm-raid target version to 1.12.0 which shows
mandatory kernel MD deadlock fixes related to reshaping
are presant in the kernel.

Related: rhbz1443999
2017-06-16 15:58:47 +02:00
Heinz Mauelshagen
ddf2a1d656 Revert "lvconvert: reject changing number of stripes on single core
This reverts commit 3719f4bc54
to allow for single core testing on kernels with deadlock
fixes relative to rhbz1443999."
2017-06-16 15:43:23 +02:00
Jonathan Brassow
6c4b2a6aa1 clean-up: Very picky update to comment - hopefully making it clearer 2017-06-14 15:22:04 -05:00
Jonathan Brassow
1f57a5263e clean-ups: remove unused var, add 'static' for local fn, adjust test
For the test clean-up, I was providing too many devices to the first
command - possibly allowing it to allocate in the wrong place.  I was
also not providing a device for the second command - virtually ensuring
the test was not performing correctly at times.
2017-06-14 14:49:42 -05:00
Jonathan Brassow
ddb14b6b05 lvconvert: Disallow removal of primary when up-converting (recovering)
This patch ensures that under normal conditions (i.e. not during repair
operations) that users are prevented from removing devices that would
cause data loss.

When a RAID1 is undergoing its initial sync, it is ok to remove all but
one of the images because they have all existed since creation and
contain all the data written since the array was created.  OTOH, if the
RAID1 was created as a result of an up-convert from linear, it is very
important not to let the user remove the primary image (the source of
all the data).  They should be allowed to remove any devices they want
and as many as they want as long as one original (primary) device is left
during a "recover" (aka up-convert).

This fixes bug 1461187 and includes the necessary regression tests.
2017-06-14 08:41:05 -05:00
Jonathan Brassow
4c0e908b0a RAID (lvconvert/dmeventd): Cleanly handle primary failure during 'recover' op
Add the checks necessary to distiguish the state of a RAID when the primary
source for syncing fails during the "recover" process.

It has been possible to hit this condition before (like when converting from
2-way RAID1 to 3-way and having the first two devices die during the "recover"
process).  However, this condition is now more likely since we treat linear ->
RAID1 conversions as "recover" now - so it is especially important we cleanly
handle this condition.
2017-06-14 08:39:50 -05:00
Jonathan Brassow
d34d2068dd lvconvert: Don't require a 'force' option during RAID repair.
Previously, we were treating non-RAID to RAID up-converts as a "resync"
operation.  (The most common example being 'linear -> RAID1'.)  RAID to
RAID up-converts or rebuilds of specific RAID images are properly treated
as a "recover" operation.

Since we were treating some up-convert operations as "resync", it was
possible to have scenarios where data corruption or data loss were
possibilities if the RAID hadn't been able to sync completely before a
loss of the primary source devices.  In order to ensure that the user took
the proper precautions in such scenarios, we required a '--force' option
to be present.  Unfortuneately, the force option was rendered useless
because there was no way to distiguish the failure state of a potentially
destructive repair from a nominal one - making the '--force' option a
requirement for any RAID1 repair!

We now treat non-RAID to RAID up-converts properly as "recover" operations.
This eliminates the scenarios that can potentially cause data loss or
data corruption; and this eliminates the need for the '--force' requirement.
This patch removes the requirement to specify '--force' for RAID repairs.
2017-06-14 08:39:07 -05:00
Jonathan Brassow
c87907dcd5 lvconvert: linear -> raid1 upconvert should cause "recover" not "resync"
Two of the sync actions performed by the kernel (aka MD runtime) are
"resync" and "recover".  The "resync" refers to when an entirely new array
is going through the process of initializing (or resynchronizing after an
unexpected shutdown).  The "recover" is the process of initializing a new
member device to the array.  So, a brand new array with all new devices
will undergo "resync".  An array with replaced or added sub-LVs will undergo
"recover".

These two states are treated very differently when failures happen.  If any
device is lost or replaced while "resync", there are no worries.  This is
because any writes created from the inception of the array have occurred to
all the devices and can be safely recovered.  Even though non-initialized
portions will still be resync'ed with uninitialized data, it is ok.  However,
if a pre-existing device is lost (aka, the original linear device in a
linear -> raid1 convert) during a "recover", data loss can be the result.
Thus, writes are errored by the kernel and recovery is halted.  The failed
device must be restored or removed.  This is the correct behavior.

Unfortunately, we were treating an up-convert from linear as a "resync"
when we should have been treating it as a "recover".  This patch
removes the special case for linear upconvert.  It allows each new image
sub-LV to be marked with a rebuild flag and treats the array as 'in-sync'.
This has the correct effect of causing the upconvert to be treated as a
"recover" rather than a "resync".  There is no need to flag these two states
differently in LVM metadata, because they are already considered differently
by the kernel RAID metadata.  (Any activation/deactivation will properly
resume the "recover" process and not a "resync" process.)

We make this behavior change based on the presense of dm-raid target
version 1.9.0+.
2017-06-14 08:35:22 -05:00
Heinz Mauelshagen
14d563accc raid: change reshape segtype flags
Commit 1c916ec5ff
missed new reshape flags.
2017-06-14 15:01:19 +02:00
Heinz Mauelshagen
08079ec420 lvconvert: fix detached SubLV deactivation in cluster
On conversion from raid10 to raid0 (takeover), all rmeta
devices and the rimage devices of mirrored stripes are
detached from the raid10 LV. The remaining rimage areas
are being shifted down into the slots of the detached
ones hence requiring renames to show proper _N suffix
sequences (e.g. 0,1,2,3 instead of 0,2,4,6).  Only the
top-level raid10 LV has a cluster lock, not the detached
SubLVs thus their deactivation is impossible and e.g the
rename from *_rimage_6 to *_rimage_3 will fail.  Fix by
activating exclusively before deactivating and removing.

Resolves: rhbz1448123
2017-06-13 23:15:51 +02:00
Heinz Mauelshagen
1c916ec5ff raid: add reshape segtype flag support
Prohibit activation of reshaping RaidLVs on incompatible
lvm2 runtime by storing e.g. 'raid5+RESHAPE' segment type
strings in the lvm2 metadata.  Incompatible runtime not
supporting reshaping won't be able to activate those thus
avoiding potential data corruption.

Any new non-reshaping lvconvert command will reset the
segment type string from 'raid5+RESHAPE' to 'raid5'.

See commits
0299a7af1e and
4141409eb0
for segtype flag support.
2017-06-09 22:23:04 +02:00
Zdenek Kabelac
57379157f4 cleanup: update message 2017-06-09 21:49:19 +02:00
Zdenek Kabelac
db5938a4f8 cleanup: define really uses KB
Cleanup also units for DEFAULT_THIN_POOL_OPTIMAL_METADATA_SIZE define
(128MB) and update calcs for it.
2017-06-09 21:49:19 +02:00