1
0
mirror of git://sourceware.org/git/lvm2.git synced 2025-01-06 17:18:29 +03:00
Commit Graph

3843 Commits

Author SHA1 Message Date
Heinz Mauelshagen
8ab0725077 lvchange: reject writemostly/writebehind on raid1 during resync
The MD kernel raid1 personality does no use any writemostly leg as the primary.

In case a previous linear LV holding data gets upconverted to
raid1 it becomes the primary leg of the new raid1 LV and a full
resynchronization is started to update the new legs.

No writemostly and/or writebehind setting may be allowed during
this initial, full synchronization period of this new raid1 LV
(using the lvchange(8) command), because that would change the
primary (i.e the previous linear LV) thus causing data loss.

lvchange has a bug not preventing this scenario.

Fix rejects setting writemostly and/or writebehind on resychronizing raid1 LVs.

Once we have status in the lvm2 metadata about the linear -> raid upconversion,
we may relax this constraint for other types of resynchronization
(e.g. for user requested "lvchange --resync ").

New lvchange-raid1-writemostly.sh test is added to the test suite.

Resolves: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=855895
2017-02-23 15:09:29 +01:00
Zdenek Kabelac
9712995edd snapshot: improve removal of active origin volume
Previously when lvremove tried to remove 'active' origin,
it had been asking for every 'snapshot' LV separately
and doing individual single snapshot removals first.

To be faster it now deactivates origin before removal
all connected snapshots.

This avoids multiple reloads of dm table for origin volume
which were unnecessary as origin was meant to be removed as well.
2017-02-22 15:35:04 +01:00
Zdenek Kabelac
6716f5a2f4 WHATS_NEW: add entry 2017-02-22 15:20:52 +01:00
David Teigland
5ab051df7a WHATS_NEW: changes for recent command defs 2017-02-13 15:36:17 -06:00
David Teigland
c816e8b636 WHATS_NEW: items from command definitions patch set 2017-02-13 15:22:26 -06:00
Zdenek Kabelac
2a9eda1229 mem: add extra mem pages for pthread stack
Some archs can use even 64K pages and then lvm2 runs into trouble if
the stack is 'too small' to fit extra page capturing stack overwrite.

So when lvm2 limits stack - add extra mem page - be it 4K or 64K.

Relates to ppc64le bug: https://bugzilla.redhat.com/1387279
2017-02-11 18:23:15 +01:00
Heinz Mauelshagen
baba3f8e2a lvconvert: add conversion from/to raid10
Add:
- conversion support from striped/raid0/raid0_meta to/from raid10;
  raid10 goes by the near format (same as used in creation of
  raid10 LVs), which groups data copies together with their original
  blocks (e.g. 3-way striped, 2 data copies resulting in 112233 in the
  first stripe followed by 445566 in the second etc.) and is limited
  to even numbers of legs for now
- related tests to lvconvert-raid-takeover.sh
- typo

Related: rhbz1366296
2017-02-10 19:13:02 +01:00
Heinz Mauelshagen
55eaabd118 lvreduce/lvresize: add ability to reduce the size of a RaidLV
- support shrinking of raid0/1/4/5/6/10 LVs
- enhance lvresize-raid.sh tests: add raid0* and raid10
- fix have_raid4 in aux.sh to allow lv_resize-raid.sh
  and other scripts to test raid4

Resolves: rhbz1394048
2017-02-09 22:42:03 +01:00
Heinz Mauelshagen
cfb6ef654d lvconvert: add support to change RAID region size
Add:
- support to change region size of existing RaidLVs
  (all RAID LV types but raid0/raid0_meta)
- lvconvert-raid-regionsize.sh with test variations
  for different RAID types and region sizes

Resolves: rhbz1392947
2017-02-07 01:01:19 +01:00
Zdenek Kabelac
dae4f53acb clvmd: add mutex protection for cpg_ call
The library for corosync multicasting is not supporting multithread
usage - add local mutex to avoid parallel call of cpg_mcast_joined().
2017-02-05 17:55:37 +01:00
Heinz Mauelshagen
a4bbaa3b89 lvconvert: add segtypes raid6_{ls,rs,la,ra}_6 and conversions to/from it
Add:
- support for segment types raid6_{ls,rs,la,ra}_6
  (striped raid with dedicated last Q-Syndrome SubLVs)
- conversion support from raid5_{ls,rs,la,ra} to/from raid6_{ls,rs,la,ra}_6
- setting convenient segtypes on conversions from/to raid4/5/6
- related tests to lvconvert-raid-takeover.sh factoring
  out _lvcreate,_lvconvert funxtions

Related: rhbz1366296
2017-02-05 00:56:27 +01:00
Heinz Mauelshagen
d8568552e4 WHATS_NEW: New segment type raid6_n_6 2017-02-04 14:09:26 +01:00
Heinz Mauelshagen
96f331fe05 WHATS_NEW: New segment type raid5_n 2017-02-03 23:41:48 +01:00
Zdenek Kabelac
d80f9a107f lvmcmd2lib: support new command
Internal command which reads lvm.conf settins and passes it
via envvar to dmeventd monitoring thread.
2017-01-20 23:55:07 +01:00
Zdenek Kabelac
04a9cad499 config: new option dmeventd/thin_command
This setting will allowing configuring which command gets executed
when thin-pool fullness goes from 50%..100%
2017-01-20 23:53:26 +01:00
Zdenek Kabelac
f8234d6e5f libdm: add human R|readable units
When showing sizes with 'H|human' units we do use standard rounding.
This however is confusing users from time to time,
when the printed number uses some biger units i.e. GiB and there is just
tiny fraction of space missing.

So here is some real-life example with new 'r' unit.

$lvs

  LV    VG Attr       LSize  Pool Origin
  lvol0 vg -wi-a-----  1.99g
  lvol1 vg -wi-a----- <2.00g
  lvol2 vg -wi-a----- <2.01g

Meaning is - lvol1 has 'slightly' less then 2.00g - from sign '<' user
can be aware the LV doesn't have full 2.00GiB in size so he
will be less surpriced allocation of 2G volume will not succeed.

$ vgs
  VG #PV #LV #SN Attr   VSize  VFree
  vg   2   2   0 wz--n- <6,00g <2,01g

For uses needing  'old'  undecorated human unit simply will continue
to use 'H|h' units.

The new R|r  may further change when we would recongnize some
other way how to improve readability.
2017-01-20 23:52:17 +01:00
Peter Rajnoha
d90320f4f1 blkdeactivate: also unmount mount point on top of MD device if using blkdeactivate -u
The blkdeactivate script processes MD devices too so we should unmount
any mount point on top of an MD device if blkdeactivate -u|--umount is
called.

Diagnosed and reported by: Rick Warner <rick@microway.com>
See also https://bugzilla.redhat.com/show_bug.cgi?id=1410585.
2017-01-06 11:16:07 +01:00
Zdenek Kabelac
3e9c03cbbc cache: resize is still unsupported
During rework of resize code this validation check
has been lost (in my resize branch). Upstream
is still not supporting resize of any cache type LV
so needs to be prevented.
2017-01-05 15:34:22 +01:00
Zdenek Kabelac
95d5877f7a cache: add missing udev wait
When we need to clear dirty cache content of cached LV, there
is table reload which usually is shortly followed by next metadata
change.  However  udev  can't (as of now)  process   udev event
while device is 'suspended'.

So whenever sequence of  'suspend/resume/suspend' is needed,
we need to wait first for finishing of 'resume' processing before
starting next 'suspend'. Otherwise there is  'race' danger of triggering
unwantend umount by systemd as  such event will trigger
SYSTEMD_READY=0 state for a moment for such changed device.

Such race is pretty ugly to trace so we may need to review more
sequencies for missing 'sync'.

(Other option is to enhnace 'udev' rules processing to avoid
such dramatic actions to be happening for suspended devices).
2017-01-03 14:55:16 +01:00
Zdenek Kabelac
4fd41cf67f vgchange: max_pv limited to uint32
Solves: https://bugzilla.redhat.com/1280496

The only reasonable behaviour here is to error on
any number out of accepted range (i.e. now numbers
wrapping around with some hidden logic).

As this is plain bug there is no support for
backward compatibility since noone should
set numbers >UINT32_MAX and expect 0 or error
depending on how big number was used....

TODO: more fields might need to be converted.
2017-01-03 14:55:16 +01:00
Zdenek Kabelac
77997c7673 report: show proper info for merging origin
When there is 'merging' of an origin in progress, but metadata stil
do provide both origin and snapshot, we should show data from merged
snapshot.  This is important mainly for thin case, where there was
a window, where i.e. 'lvs -o+device_id' would report information
about 'already gone' origin thin LV.

This race window is usually hard to trigger but can be ocasionally hit.
Usually shortly after activation, but before polling process manages
to update metadata after merge.
2016-12-22 23:37:07 +01:00
Zdenek Kabelac
2aee4769b4 snapshot: validate merge has started
Before starting polling process, validate the merge has actually started
so there is not pointless invoke of lvmpolld.

This also fixes reported message from command, so user has
correct info whether merging has already started or
if it's delayed for next activation.
2016-12-22 23:37:07 +01:00
Zdenek Kabelac
95e3dd5fb1 lv: more exact check for merging origin
Merging origin has 'MERGE_LV' and should also have its merging snapshot.
2016-12-22 23:37:07 +01:00
Zdenek Kabelac
9491ab41cd validation: rework segment validation
Move individual segment validation to a separate function
executed for 'complete_vg'.

Move some 'extra' validation bits from 'raid' validation to global
segtype validation (so extending existing normal validation)

TODO: still some test are left to be moved.
Reduce some duplication in validation process - there are still
some left thought so still room for improving overal speed.
2016-12-22 23:37:07 +01:00
Zdenek Kabelac
0c56eb8f43 cache: support cached origin for snapshot
Enable  'lvcreate/lvconvert -s' for cached LV.
and supported operations:

Create a snapshot of cached LV

Split/Join snapshot LV to cached origin LV.
2016-12-19 14:41:42 +01:00
Zdenek Kabelac
eb3f83357a lvconvert: fix shown lv name for snapshot split
We can't keep 'display_lvname' for too long - it's using
ringbuffer and keeps limited number of names. So it's
safe only per few simple tests,  but can't be used anymore
after some function calls..
(Fixes 00e641ef37)
2016-12-19 14:41:16 +01:00
Zdenek Kabelac
75f2388093 backup: show warning once per command
When command calls backup() more then once (which is actually not
wanted) this warning message is shown repeatedly:

"WARNING: This metadata update is NOT backed up."

Instead now print message just once and less confuse user.
2016-12-18 19:38:30 +01:00
Zdenek Kabelac
5bb6266046 lvconvert: support cache to external origin conversion
Add this functionality to lvconvert:

'lvconvert --thin cachedLV --thinpool vg/poll'

Converts cachedLV to external origin (which will be read-only).
New thin volume is created in thinpool LV and it's using external
origin as source for unprovisioned chunks.
This conversion happens  online (while volume is in use).
Thin LV remains fully writable.
Cached external origin no longer could be written so cache will be used
ONLY for read operations. For this limitation we require cache mode
to be writethrough (as writeback cannot write to read-only volumes).

When  thinLV is later removed  cached external origin is again
fully usable, just note, LV remain in 'read-only' mode.
When read-write is needed,  'lvchange -prw' has to be used.

Single external origin could be user by multiple thinLV in
multiple differen thin pool.
2016-12-18 19:35:27 +01:00
Zdenek Kabelac
69434c2eca cache: improve activation with -real
When cache volume may be converted from normal to -real layer LV
we need to improve logic for call cache_check.

With this patch, we register call for cache_check only when metadata LV
is not yet present in active table slot (should match initial table
load).
This avoids unwanted checking when cache would become layer device
online.
2016-12-18 19:30:50 +01:00
Zdenek Kabelac
29b0e42be3 lv: fix lock holder for external origin
External origin could be reloaded via more locks.
It's actually even more complex then thin-pool,
as it may be active on more nodes for linear LVs
(and maybe even more types).

External origin is always read-only thus unmodifiable
device so there should not be a problem accesing it
through multiple nodes.

Also for thin-pool check first presence of active thin-pool.

FIXME:
It's not easy to detect on which nodes this device is active
Thus manipulation with such device may require checking every
node and it active state and refresh.

But since such setup is quite complex to prepare and use,
hopefully there are not user trying to 'explore' this usage yet.
2016-12-18 19:25:25 +01:00
Zdenek Kabelac
a24eae6e82 cache: prepare status checking for layer
To be ready to show status of cache volume, call the status
with layer.  Layer is automatically detected in this case when
cache volume is used in 'layered' form (needs -real suffix).
2016-12-18 19:23:13 +01:00
Zdenek Kabelac
bf157ed833 cache: improve wait for cache clear
Avoid printing misleading message about single dirty block.
Instead properly detect condition where the 'cleaner' policy
needs to be installed without 'overloading' dirty variable.

Also print warning if we would be clearing read-only volume.
(it really shouldn't happen).
2016-12-18 19:22:11 +01:00
Zdenek Kabelac
36f609e513 validation: check external property is matching
Detect if number of external_count is matching
referencing devices for  external_origin LV.
2016-12-18 19:17:59 +01:00
Zdenek Kabelac
7db46c4a39 thin: reload external origin with last thin
External origin could be activated as stand-alone device.
When the last thin LV is removed, external origin is no longer
the external origin and it's layer property was dropped.

Ensure dm table is correct by reloading external origin
(when it's active).
2016-12-18 19:13:34 +01:00
Zdenek Kabelac
c71fefad8d lvs: show status for layer
When LV is external origin, show info for LV but
status for -layer.  So we expose more info to a user
as otherwise active external origin is only linear
mapping of -real layer.

We do the same for i.e. old snaphost origin.
2016-12-18 19:12:12 +01:00
Zdenek Kabelac
bdfc96cb08 raid: fix activation of tracked image
Activation of raid has brough up also splitted image with tracing
(without taking lock for this).

So when raid is now activate - such image is not put into
table (with _rmeta).  When user needs such device, just active it.
2016-12-18 19:10:38 +01:00
Zdenek Kabelac
fecd043cca raid: split preserves local exlusive activation 2016-12-14 11:40:01 +01:00
Zdenek Kabelac
d0fe3ec0c5 raid: avoid manipulation of segment status
RAID is LV property

TODO: only 2 flags are seg->status: PVMOVE & MERGING
At least the second one should be soon elimanted as again
we merge LV not a segment.
2016-12-13 22:07:52 +01:00
Zdenek Kabelac
d1e398c474 segtype: check for seg type instead of status
RAID is LV property - which has single segment of raid type.
2016-12-13 22:07:52 +01:00
Zdenek Kabelac
0690392040 raid: improve table reload sequence
This is another place for 'common' use pattern or
reload and activation of deleted devices.
(Moving the exclusive activation to _deactivate_and_remove_lvs()).

TODO: looks like halve of raid function is reloading
just 'origin' - and the other full LV.
2016-12-13 22:07:52 +01:00
Zdenek Kabelac
3903f915f8 pvmove: fix activation order
For proper locking we need to gain lock first for mirror which
needs to be deactivated later to be working in cluster.
2016-12-11 23:22:36 +01:00
Zdenek Kabelac
67f9e6b175 raid: avoid _ at end of name of extracted metadata LV
Do not generate @PREFIX@vg/LV1_rmeta_1_extracted_.
2016-12-11 23:20:51 +01:00
Zdenek Kabelac
55ca8043d4 raid: optimize clearing of lvs
Activate whole list of metadata lvs first before clearing them.
(Similar to commit ada5733c56)

TODO: make this clearing in a single common function.
2016-12-11 23:19:41 +01:00
Zdenek Kabelac
8831a541a8 raid: fix delete on clustered vg
For clustered VG ensure lock is grabbed first,
so later deactivation works.

TODO: fix tree to solve device removal automatically.
2016-12-11 23:18:22 +01:00
Zdenek Kabelac
0c8369099b raid: fix raid1 to mirror conversion
Fix order of operation when converting raid1 into old mirror.
Before any later metadata modification are initiated prepare
mirror_log device with all clearing.
Then directly convert  raid1 into mirror with mirror_log.
This convertion now properly see as precommitted metadata
new 'mirror' and committed old 'raid' and is able to
preload all LVs.
2016-12-11 23:17:22 +01:00
Zdenek Kabelac
31564834db mirror: add prepare_mirror_log
Function prepares new mirror log LV in-sync optionaly.
This is useful to have such device ready when converting
raid1 to mirror.
2016-12-11 23:16:16 +01:00
David Teigland
c459f23565 lvmetad: fix segfault in daemon_reply_simple
missing NULL termination
2016-12-09 15:22:30 -06:00
Zdenek Kabelac
114f7e6285 dev_manager: use setup_task_run for mknod
Simplify info run for use only for INFO & STATUS.
Drop handling MKNODES within _info_run() call
and use more advanced _setup_task_run() directly.

This allows to further simplify _info_run().
2016-12-05 17:12:39 +01:00
Zdenek Kabelac
5163b8f697 dev_manager: extend setup_task
Integrate also query for inactive table and
handle dm_task_run() and dm_task_get_info()
(thus switching to setup_task_run)

Add one exception case for DM_DEVICE_TARGET_MSG.

This allows further shortening and simplification of all
other users of this function.
2016-12-05 17:11:49 +01:00
Zdenek Kabelac
e2c7e0ad11 activation: optimize away lv_has_target_type
It's actually not needed to call extra lv_has_target_type() to detect
snapshot merge is in progress - decode this right during status
capturing and save even few extra ioctl calls.
2016-12-05 17:10:14 +01:00