mirror of git://sourceware.org/git/lvm2.git
Commit Graph

43 Commits

Author SHA1 Message Date
Heinz Mauelshagen
c4aba47dd0 test: add checks for not 100% sync ratio after initiation of check/repair
Related: rhbz1640630
2019-10-02 15:25:30 +02:00
David Teigland
595196bc29 tests: enable lvmlockd for passing tests 2018-05-30 09:25:45 -05:00
Jonathan Brassow
4129cf5090 testsuite: Forgot to pull 'should's after fixing RAID4/5/6 mismatch test
Test will now fail rather than warn if conditions are not met.
2017-11-02 10:25:46 -05:00
Jonathan Brassow
9e8dec2f38 testsuite: Fix problem when checking RAID4/5/6 for mismatches.
The lvchange-raid[456].sh test checks that mismatches can be detected
properly.  It does this by writing garbage to the back half of one of
the legs directly.  When performing a "check" or "repair" of mismatches,
MD does a good job going directly to disk and bypassing any buffers that
may prevent it from seeing mismatches.  However, in the case of RAID4/5/6
we have the stripe cache to contend with and this is not bypassed.  Thus,
mismatches which have /just/ happened to an area that now populates the
stripe cache may be overlooked.  This isn't a serious issue, however,
because the stripe cache is short-lived and reasonably small.  So, while
there may be a small window of time between the disk changing underneath
the RAID array and when you run a "check"/"repair" - causing a mismatch
to be missed - that would be no worse than if a user had simply run a
"check" a few seconds before the disk changed.  IOW, it simply isn't worth
making a fuss over dropping the stripe cache before beginning a "check" or
"repair" (which we actually did attempt to do a while back).

So, to get the test running smoothly, we simply deactivate and reactivate
the LV to force the stripe cache to be dropped and then proceed.  We could
just as easily wait a few seconds for the stripe cache to empty also.
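
A minimal sketch of that workaround, assuming $vg and $lv name the
RAID LV under test:

        # deactivate/reactivate to drop the MD stripe cache,
        # then scrub so the injected mismatches are visible
        lvchange -an $vg/$lv
        lvchange -ay $vg/$lv
        lvchange --syncaction check $vg/$lv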
2017-11-02 09:49:35 -05:00
Zdenek Kabelac
1c9789b0cc tests: use well defined test
Prefer [ p ] && [ q ] as [ p -a q ] is not well defined.

Apparently && and || "short-circuit" while -a and -o do not.
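
For example (only the first form short-circuits; POSIX marks -a/-o
as obsolescent with ambiguous parsing):

        f=/etc/passwd
        [ -r "$f" ] && [ -s "$f" ]    # preferred, well defined
        [ -r "$f" -a -s "$f" ]        # avoid, not well defined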
2017-07-10 14:23:53 +02:00
Zdenek Kabelac
146bfb2417 tests: math: drop unnecessary $/${}
$/${} is unnecessary on arithmetic variables.

Use $((..)) instead of deprecated $[..]
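
For example:

        i=5
        j=$((i + 1))    # preferred; no $/${} needed inside $((..))
        j=$[i + 1]      # deprecated form being replaced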
2017-07-10 14:23:53 +02:00
Zdenek Kabelac
e55bae2b2c tests: use bash 2017-07-10 14:23:53 +02:00
Zdenek Kabelac
24751b45bd tests: double quote 2017-07-10 14:23:53 +02:00
Zdenek Kabelac
5ca4fd0478 tests: correcting usage of '==' in bash 2017-07-10 14:23:53 +02:00
Zdenek Kabelac
597b3576c7 tests: wait for raid in sync
Lvchange needs synchronized raid.
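
A sketch of such a wait, polling the sync_percent field until the
array reports fully in sync:

        # wait for the raid LV to finish synchronization before lvchange
        while [ "$(lvs --noheadings -o sync_percent "$vg/$lv" | tr -d ' ')" != "100.00" ]; do
                sleep 1
        done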
2017-05-29 14:41:53 +02:00
Heinz Mauelshagen
0f65d7ec3a lvconvert: prompt on raid1 image changes
Don't change resilience of raid1 LVs without --yes.

Adjust respective tests.
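
For example (a sketch; -m 0 reduces raid1 to linear, which now
prompts unless --yes is given):

        lvconvert --yes -m 0 $vg/$lv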
2017-04-06 18:47:41 +02:00
Heinz Mauelshagen
4b0ae266c2 test: raid1 down convert to linear
Add/adjust more tests for commit 7fbe6ef16b.
2017-03-09 13:16:08 +01:00
Heinz Mauelshagen
48778bc503 lvconvert: add new reporting fields for reshaping
During an ongoing reshape, the MD kernel runtime reads stripes relative
to data_offset and starts storing the reshaped stripes (with new raid
layout and/or new stripesize  and/or new number of stripes) relative
to new_data_offset.  This is to avoid writing over any data in place
which is non-atomic by nature and thus be recoverable without data loss
in the transition.  MD uses the term out-of-place reshaping for it.

There are two other areas we don't have reporting capability for:
- number of data stripes vs. total stripes
  (e.g. raid6 with 7 stripes total has 5 data stripes)
- number of (rotating) parity/syndrome chunks
  (e.g. raid6 with 7 stripes total has 2 parity chunks; one
   per stripe for P-Syndrome and another one for Q-Syndrome)

Thus, add the following reportable keys:

- reshape_len      (in current units)
- reshape_len_le   (in logical extents)
- data_offset      (in sectors)
- new_data_offset  (     "    )
- data_stripes
- parity_chunks

Enhance lvchange-raid.sh, lvconvert-raid-reshape-linear_to_striped.sh,
lvconvert-raid-reshape-striped_to_linear.sh, lvconvert-raid-reshape.sh
and lvconvert-raid-takeover.sh to make use of new keys.
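
For example, the new fields can be queried like any other lvs keys:

        lvs -a -o +reshape_len,reshape_len_le,data_offset,new_data_offset,data_stripes,parity_chunks $vg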

Related: rhbz834579
Related: rhbz1191935
Related: rhbz1191978
2017-03-01 18:50:35 +01:00
David Teigland
f54253d396 tests: add SKIP_WITH_LVMLOCKD
to all tests that don't already use vgcreate $SHARED
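
In a test script this is the usual preamble (a sketch of the
suite's convention):

        SKIP_WITH_LVMLOCKD=1

        . lib/inittest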
2016-02-23 09:28:48 -06:00
Zdenek Kabelac
fcbef05aae doc: change fsf address
Hmm, rpmlint suggests the FSF is using a different address these days,
so let's keep it up-to-date.
2016-01-21 12:11:37 +01:00
Zdenek Kabelac
4159680a0e tests: use more SKIP
Speed-up check_lvmpolld.
2015-10-27 16:00:09 +01:00
Zdenek Kabelac
79844b9066 tests: minor simplifications
minor updates
2015-05-01 15:07:59 +02:00
Zdenek Kabelac
c3bb9629a8 tests: syncaction needs kernel fix
Add 'should' as we currently cannot pass this test.

FIXME:
Add a proper wrapper to avoid using 'should' once the kernel is fixed.
2014-10-24 16:39:32 +02:00
Zdenek Kabelac
b22ab4dab0 tests: avoid hiding results in local
There is a difference between:

local a=$(shell)

and

local a
a=$(shell)

The first returns the exit code of the shell's 'local' builtin
(always 0), hiding the exit code of the substituted command.
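
A quick demonstration (using 'false' to stand in for any failing
command):

        f() {
                local a=$(false); echo "a: $?"    # prints 0, 'local' succeeded
                local b; b=$(false); echo "b: $?" # prints 1, from 'false'
        }
        f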
2014-07-02 10:45:43 +02:00
Zdenek Kabelac
5c5177c37c tests: rename test to inittest
We run into problems when we use 'test' alongside commands like
should/not/...

So avoid overloading the name 'test' and change it to inittest.
2014-06-10 10:51:27 +02:00
Zdenek Kabelac
171a668e81 tests: dd needs to hit disk
Unsure if this is a feature or a bug of syncaction,
but the data needs to be physically present on the media;
syncaction ignores the content of the buffer cache...

(Maybe lvchange should implicitly fsync all disks that
are members of the raid array before starting the test??)
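
A hedged sketch of writing the garbage so it actually reaches the
media ($dev is a placeholder for the raid member device):

        dd if=/dev/urandom of="$dev" bs=1M count=1 oflag=direct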
2014-05-28 15:33:41 +02:00
Zdenek Kabelac
2e9792121f tests: add have_cache and have_raid
Need to be aware of build options, as the system may be
configured without raid or cache support.
2014-05-20 21:50:30 +02:00
Zdenek Kabelac
76c06c7252 tests: speedup
Avoid some expensive raid/mirror synchronization when testing
just allocation sizes.
Use lv_attr_bit.
2014-05-15 12:08:35 +02:00
Zdenek Kabelac
de5683d8d9 tests: add quotes around device paths 2014-03-21 22:29:27 +01:00
Petr Rockai
3c9887467f test: Use correct path to /dev in lvchange-raid.sh. 2014-03-05 10:22:39 +01:00
Zdenek Kabelac
52007a9191 tests: split raid test
Use separate files for raid1, raid456, raid10.
They need different target versions to work, so support
more precise test selection.

Optimize duplicate tests of target availability and skip
unsupported test cases sooner.
2014-03-03 11:23:57 +01:00
Marian Csontos
445c0a5585 test: Remove incorrect evaluation 2014-03-03 08:31:33 +01:00
Zdenek Kabelac
47b15b805e tests: updates
Add some vgremove calls.
Remove unneeded tests for some unused commands.
Add tests for missing commands.
2014-02-27 13:01:04 +01:00
Zdenek Kabelac
d68e5d5ab9 tests: utilize check and get
Replace some in-test use of lvs commands with their check
and get equivalent.

The advantage is that these 'checking' commands need not always be
validated via extensive valgrind testing, and the output noise is
significantly reduced since the output of check/get is suppressed.
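
For example (a sketch using the suite's check/get helpers):

        check lv_field $vg/$lv segtype "raid1"
        size=$(get lv_field $vg/$lv size)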
2014-02-11 19:00:07 +01:00
Jonathan Brassow
1d06868240 TEST: Unaccounted possible output causing failure
lvchange-raid.sh checks to ensure that the 'p'artial flag takes
precedence over the 'w'ritemostly flag by disabling and reenabling
a device in the array.  Most of the time this works fine, but
sometimes the kernel can notice the device failure before it is
reenabled.  In that case, the attr flag will not return to 'w', but
to 'r'efresh.  This is because 'r'efresh also takes precedence over
the 'w'ritemostly flag.  So, we also do a quick check for 'r' and
not just 'w'.
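
A sketch of the relaxed assertion (names are placeholders; the
flag is the last lv_attr character):

        attr=$(get lv_field $vg/$lv lv_attr)
        case "${attr: -1}" in
        w|r) ;;                           # writemostly or refresh, both fine
        *) die "unexpected attr: $attr" ;;
        esac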
2013-09-12 13:23:53 -05:00
Jonathan Brassow
82228acfc9 Mirror/Thin: Disallow thinpools on mirror logical volumes
The same corner cases that exist for snapshots on mirrors exist for
any logical volume layered on top of mirror.  (One example is when
a mirror image fails and a non-repair LVM command is the first to
detect it via label reading.  In this case, the LVM command will hang
and prevent the necessary LVM repair command from running.)  When
a better alternative exists, it makes no sense to allow a new target
to stack on mirrors as a new feature.  Since RAID is now capable of
running EX in a cluster and thin is not active-active aware, it makes
sense to pair these two rather than mirror+thinpool.

As further background, here are some additional comments that I made
when addressing a bug related to mirror+thinpool:
(https://bugzilla.redhat.com/show_bug.cgi?id=919604#c9)
I am going to disallow thin* on top of mirror logical volumes.
Users will have to use the "raid1" segment type if they want this.

This bug has come down to a choice between:
1) Disallowing thin-LVs from being used as PVs.
2) Disallowing thinpools on top of mirrors.

The problem is that the code in dev_manager.c:device_is_usable() is unable
to tell whether there is a mirror device lower in the stack from the device
being checked.  Pretty much anything layered on top of a mirror will suffer
from this problem.  (Snapshots are a good example of this; and option #1
above has been chosen to deal with them.  This can also be seen in
dev_manager.c:device_is_usable().)  When a mirror failure occurs, the
kernel blocks all I/O to it.  If there is an LVM command that comes along
to do the repair (or a different operation that requires label reading), it
would normally avoid the mirror when it sees that it is blocked.  However,
if there is a snapshot or a thin-LV that is on a mirror, the above code
will not detect the mirror underneath and will issue label reading I/O.
This causes the command to hang.

Choosing #1 would mean that thin-LVs could never be used as PVs - even if
they are stacked on something other than mirrors.

Choosing #2 means that thinpools can never be placed on mirrors.  This is
probably better than we think, since it is preferred that people use the
"raid1" segment type in the first place.  However, RAID* cannot currently
be used in a cluster volume group - even in EX-only mode.  Thus, a complete
solution for option #2 must include the ability to activate RAID logical
volumes (and perform RAID operations) in a cluster volume group.  I've
already begun working on this.
2013-09-11 15:58:44 -05:00
Jonathan Brassow
2691f1d764 RAID: Make RAID single-machine-exclusive capable in a cluster
Creation, deletion, [de]activation, repair, conversion, scrubbing
and changing operations are all now available for RAID LVs in a
cluster - provided that they are activated exclusively.

The code has been changed to ensure that no LV or sub-LV activation
is attempted cluster-wide.  This includes the often overlooked
operations of activating metadata areas for the brief time it takes
to clear them.  Additionally, some 'resume_lv' operations were
replaced with 'activate_lv_excl_local' when sub-LVs were promoted
to top-level LVs for removal, clearing or extraction.  This was
necessary because it forces the appropriate renaming actions that
occur via resume in the single-machine case, but won't happen in
a cluster due to the necessity of acquiring a lock first.

The *raid* tests have been updated to allow testing in a cluster.
For the most part, this meant creating devices with '-aey' if they
were to be converted to RAID.  (RAID requires the converting LV to
be EX because it is a condition of activation for the RAID LV in
a cluster.)
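
For example, the tests now create to-be-converted LVs exclusively
(a sketch):

        lvcreate -aey -L 64M -n $lv $vg
        lvconvert --type raid1 -m 1 $vg/$lv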
2013-09-10 16:33:22 -05:00
Jonathan Brassow
e72b2d047d TEST: Add tests for lvchange actions of RAID under thin
Patch includes RAID1,4,5,6,10 tests for:
- setting writemostly/writebehind
* syncaction changes (i.e. scrubbing operations)
- refresh (i.e. reviving devices after transient failures)
- setting recovery rate (sync I/O throttling)
while the RAID LVs are under a thin-pool (both data and metadata)

* not fully tested because I haven't found a way to force bad
  blocks to be noticed in the testsuite yet.  Works just fine
  when dealing with "real" devices.
2013-08-27 16:46:40 -05:00
Jonathan Brassow
8615234c0f RAID: Fix bug making lvchange unable to change recovery rate for RAID
1) Since the min|maxrecoveryrate args are size_kb_ARGs and they
   are recorded (and sent to the kernel) in terms of kB/sec/disk,
   we must back out the factor multiple done by size_kb_arg.  This
   is already performed by 'lvcreate' for these arguments.
2) Allow all RAID types, not just RAID1, to change these values.
3) Add min|maxrecoveryrate_ARG to the list of 'update_partial_unsafe'
   commands so that lvchange will not complain about needing at
   least one of a certain set of arguments and failing.
4) Add tests that check that these values can be set via lvchange
   and lvcreate and that 'lvs' reports back the proper results
   (see the sketch below).
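
A hedged sketch of such a test step (the lvs field names here are
assumed from the renamed reporting keys):

        lvchange --minrecoveryrate 10 --maxrecoveryrate 128 $vg/$lv
        lvs -o +raid_min_recovery_rate,raid_max_recovery_rate $vg/$lv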
2013-08-09 17:09:47 -05:00
Jonathan Brassow
081308af30 TEST: Support testing new RAID features in RHEL6 kernels
We check the version number of dm-raid before testing certain
features to make sure they are present.  However, this has
become somewhat complicated by the fact that the version #'s
in the upstream kernel and the RHEL6 kernel have been diverging.
This has been a necessity because the upstream kernel has
undergone ABI changes that have necessitated a bump in the
'Y' component of the version #, while the RHEL6 kernel has not.
Thus, we need to know that the ABI has not changed but the
features have been added.  So, the current version #'ing stands
as follows:

RHEL6   Upstream   Comment
======|==========|========
** Same until version 1.3.1 **
------|----------|--------
 N/A  |   1.4.0  | Non-functional change.
      |          | Removes arg from mapping function.
------|----------|--------
1.3.2 |   1.4.1  | RAID10 fix redundancy validation checks.
------|----------|--------
1.3.5 |   1.4.2  | Add RAID10 "far" and "offset" algorithm support.
      |          | Note this feature came later in RHEL6 as part of
      |          | a separate update/feature.
------|----------|--------
1.3.3 |   1.5.0  | Add message interface to allow manipulation of
      |          | the sync_action.
      |          | New status (STATUSTYPE_INFO) fields: sync_action
      |          | and mismatch_cnt.
------|----------|--------
1.3.4 |   1.5.1  | Add ability to restore transiently failed devices
      |          | on resume.
------|----------|--------
1.3.5 |   1.5.2  | 'mismatch_cnt' is zero unless [last_]sync_action
      |          | is "check".
------|----------|--------

To simplify, writemostly/writebehind, scrubbing, and transient device
failure restoration are all tested based on the same version
requirements: (1.3.5 < V < 1.4.0) || (V > 1.5.2).  Since kernel
support for writemostly/writebehind has been around for some time,
this could mean a reduction in the scope of kernels tested for this
feature.  I don't view this as much of a problem, since support for
this feature was only recently added to LVM.  Thus, the user would
have to be using a very recent LVM version with an older kernel.
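
The tests read the running target version like this (a sketch):

        # dmsetup reports the loaded dm-raid target version
        dmsetup targets | grep ^raid
        # e.g. "raid  v1.5.2", compared against the table above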
2013-07-22 08:50:27 -05:00
Jonathan Brassow
6aeb54c77c TEST: Update syncaction test to match latest kernel updates
The mismatch count reported by a dm-raid kernel target used
to be effectively random, unless it was checked after a
"check" scrubbing action had been performed.  Updates to the
kernel now mean that the mismatch count will be 0 unless a
check has been performed and discrepancies had been found.
This has been the intended behaviour all along.

This patch updates the test suite to handle the change.
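
A sketch of checking the count only after a scrub (field name from
the renamed reporting keys):

        lvchange --syncaction check $vg/$lv
        # wait for the check to complete, then:
        lvs -o +raid_mismatch_count $vg/$lv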
2013-07-19 15:24:34 -05:00
Alasdair G Kergon
da79fe4c1d reporting: tidy recent new fields
Add underscores and prefixes to recently-added fields.
(Might add more alias functionality in future.)
2013-07-19 01:30:02 +01:00
Peter Rajnoha
55d418fb20 tests: fix tests to cope with latest changes
- lvs -o lv_attr has now 10 indicator bits
- use '--ignoremonitoring' instead of the shortcut '--ig' used before (since
it would be ambiguous with new '--ignoreactivationskip')
2013-07-12 20:54:17 +02:00
Jonathan Brassow
bdcfe8c6de TEST: Test RAID syncaction, writemostly, & refresh under snapshots
Test the different RAID lvchange scenarios under snapshot as well.

This patch also updates calculations for where to write to an
underlying PV when testing various syncactions.
2013-06-20 11:48:15 -05:00
Jonathan Brassow
7a4fdc1902 TEST: Fix 'dd' overrunning device size and causing test failure
Assumed size of 4M was too large and the test was failing because
'dd' was failing to perform its write.

Calculate the size we need to write with 'dd' instead, so we
don't overrun the device.
2013-06-17 12:38:09 -05:00
Zdenek Kabelac
362d8ead64 tests: more test run in cluster mode
aux updates:

prepare_vg now creates a clustered VG for cluster tests.

Since dm-raid doesn't work in a cluster, skip the cluster
test when a test checks for the dm-raid target, until fixed.
2013-06-16 00:07:33 +02:00
Jonathan Brassow
2e0740f7ef RAID: Add writemostly/writebehind support for RAID1
'lvchange' is used to alter a RAID 1 logical volume's write-mostly and
write-behind characteristics.  The '--writemostly' parameter takes a
PV as an argument with an optional trailing character to specify whether
to set ('y'), unset ('n'), or toggle ('t') the value.  If no trailing
character is given, it will set the flag.
Synopsis:
        lvchange [--writemostly <PV>:{t|y|n}] [--writebehind <count>] vg/lv
Example:
        lvchange --writemostly /dev/sdb1:y --writebehind 512 vg/raid1_lv

The last character in the 'lv_attr' field is used to show whether a device
has the WriteMostly flag set.  It is signified with a 'w'.  If the device
has failed, the 'p'artial flag has priority.

Example ("nosync" raid1 with mismatch_cnt and writemostly):
[~]# lvs -a --segment vg
  LV                VG   Attr      #Str Type   SSize
  raid1             vg   Rwi---r-m    2 raid1  500.00m
  [raid1_rimage_0]  vg   Iwi---r--    1 linear 500.00m
  [raid1_rimage_1]  vg   Iwi---r-w    1 linear 500.00m
  [raid1_rmeta_0]   vg   ewi---r--    1 linear   4.00m
  [raid1_rmeta_1]   vg   ewi---r--    1 linear   4.00m

Example (raid1 with mismatch_cnt, writemostly - but failed drive):
[~]# lvs -a --segment vg
  LV                VG   Attr      #Str Type   SSize
  raid1             vg   rwi---r-p    2 raid1  500.00m
  [raid1_rimage_0]  vg   Iwi---r--    1 linear 500.00m
  [raid1_rimage_1]  vg   Iwi---r-p    1 linear 500.00m
  [raid1_rmeta_0]   vg   ewi---r--    1 linear   4.00m
  [raid1_rmeta_1]   vg   ewi---r-p    1 linear   4.00m

A new reportable field has been added for writebehind as well.  If
write-behind has not been set or the LV is not RAID1, the field will
be blank.
Example (writebehind is set):
[~]# lvs -a -o name,attr,writebehind vg
  LV            Attr      WBehind
  lv            rwi-a-r--     512
  [lv_rimage_0] iwi-aor-w
  [lv_rimage_1] iwi-aor--
  [lv_rmeta_0]  ewi-aor--
  [lv_rmeta_1]  ewi-aor--

Example (writebehind is not set):
[~]# lvs -a -o name,attr,writebehind vg
  LV            Attr      WBehind
  lv            rwi-a-r--
  [lv_rimage_0] iwi-aor-w
  [lv_rimage_1] iwi-aor--
  [lv_rmeta_0]  ewi-aor--
  [lv_rmeta_1]  ewi-aor--
2013-04-15 13:59:46 -05:00
Jonathan Brassow
ff64e3500f RAID: Add scrubbing support for RAID LVs
New options to 'lvchange' allow users to scrub their RAID LVs.
Synopsis:
	lvchange --syncaction {check|repair} vg/raid_lv

RAID scrubbing is the process of reading all the data and parity blocks in
an array and checking to see whether they are coherent.  'lvchange' can
now initiate the two scrubbing operations: "check" and "repair".  "check"
will go over the array and record the number of discrepancies but not
repair them.  "repair" will correct the discrepancies as it finds them.

'lvchange --syncaction repair vg/raid_lv' is not to be confused with
'lvconvert --repair vg/raid_lv'.  The former initiates a background
synchronization operation on the array, while the latter is designed to
repair/replace failed devices in a mirror or RAID logical volume.

Additional reporting has been added for 'lvs' to support the new
operations.  Two new printable fields (which are not printed by
default) have been added: "syncaction" and "mismatches".  These
can be accessed using the '-o' option to 'lvs', like:
	lvs -o +syncaction,mismatches vg/lv
"syncaction" will print the current synchronization operation that the
RAID volume is performing.  It can be one of the following:
        - idle:   All sync operations complete (doing nothing)
        - resync: Initializing an array or recovering after a machine failure
        - recover: Replacing a device in the array
        - check: Looking for array inconsistencies
        - repair: Looking for and repairing inconsistencies
The "mismatches" field with print the number of descrepancies found during
a check or repair operation.

The 'Cpy%Sync' field already available to 'lvs' will print the progress
of any of the above syncactions, including check and repair.

Finally, the lv_attr field has changed to accommodate the scrubbing operations
as well.  The role of the 'p'artial character in the lv_attr report field
has been expanded.  "Partial" is really an indicator of the health of a
logical volume, and it makes sense to extend this to include other health
indicators as well, specifically:
        'm'ismatches:  Indicates that there are discrepancies in a RAID
                       LV.  This character is shown after a scrubbing
                       operation has detected that portions of the RAID
                       are not coherent.
        'r'efresh   :  Indicates that a device in a RAID array has suffered
                       a failure and the kernel regards it as failed -
                       even though LVM can read the device label and
                       considers the device to be ok.  The LV should be
                       'r'efreshed to notify the kernel that the device is
                       now available, or the device should be 'r'eplaced
                       if it is suspected of failing.
2013-04-11 15:33:59 -05:00