IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
In order to support striped raid5/6/10 LV reshaping (change
of LV type, stripesize or number of legs), this patch
introduces infrastructure prerequisites to be used
by raid_manip.c extensions in followup patches.
This base is needed for allocation of out-of-place
reshape space required by the MD raid personalities to
avoid writing over data in-place when reading off the
current RAID layout or number of legs and writing out
the new layout or to a different number of legs
(i.e. restripe)
Changes:
- add members reshape_len to 'struct lv_segment' to store
out-of-place reshape length per component rimage
- add member data_copies to struct lv_segment
to support more than 2 raid10 data copies
- make alloc_lv_segment() aware of both reshape_len and data_copies
- adjust all alloc_lv_segment() callers to the new API
- add functions to retrieve the current data offset (needed for
out-of-place reshaping space allocation) and the devices count
from the kernel
- make libdm deptree code aware of reshape_len
- add LV flags for disk add/remove reshaping
- support import/export of the new 'struct lv_segment' members
- enhance lv_extend/_lv_reduce to cope with reshape_len
- add seg_is_*/segtype_is_* macros related to reshaping
- add target version check for reshaping
- grow rebuilds/writemostly bitmaps to 246 bit to support kernel maximal
- enhance libdm deptree code to support data_offset (out-of-place reshaping)
and delta_disk (legs add/remove reshaping) target arguments
Related: rhbz834579
Related: rhbz1191935
Related: rhbz1191978
The MD kernel raid1 personality does no use any writemostly leg as the primary.
In case a previous linear LV holding data gets upconverted to
raid1 it becomes the primary leg of the new raid1 LV and a full
resynchronization is started to update the new legs.
No writemostly and/or writebehind setting may be allowed during
this initial, full synchronization period of this new raid1 LV
(using the lvchange(8) command), because that would change the
primary (i.e the previous linear LV) thus causing data loss.
lvchange has a bug not preventing this scenario.
Fix rejects setting writemostly and/or writebehind on resychronizing raid1 LVs.
Once we have status in the lvm2 metadata about the linear -> raid upconversion,
we may relax this constraint for other types of resynchronization
(e.g. for user requested "lvchange --resync ").
New lvchange-raid1-writemostly.sh test is added to the test suite.
Resolves: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=855895
Previously when lvremove tried to remove 'active' origin,
it had been asking for every 'snapshot' LV separately
and doing individual single snapshot removals first.
To be faster it now deactivates origin before removal
all connected snapshots.
This avoids multiple reloads of dm table for origin volume
which were unnecessary as origin was meant to be removed as well.
. Define a prototype for every lvm command.
. Match every user command with one definition.
. Generate help text and man pages from them.
The new file command-lines.in defines a prototype for every
unique lvm command. A unique lvm command is a unique
combination of: command name + required option args +
required positional args. Each of these prototypes also
includes the optional option args and optional positional
args that the command will accept, a description, and a
unique string ID for the definition. Any valid command
will match one of the prototypes.
Here's an example of the lvresize command definitions from
command-lines.in, there are three unique lvresize commands:
lvresize --size SizeMB LV
OO: --alloc Alloc, --autobackup Bool, --force,
--nofsck, --nosync, --noudevsync, --reportformat String, --resizefs,
--stripes Number, --stripesize SizeKB, --poolmetadatasize SizeMB
OP: PV ...
ID: lvresize_by_size
DESC: Resize an LV by a specified size.
lvresize LV PV ...
OO: --alloc Alloc, --autobackup Bool, --force,
--nofsck, --nosync, --noudevsync,
--reportformat String, --resizefs, --stripes Number, --stripesize SizeKB
ID: lvresize_by_pv
DESC: Resize an LV by specified PV extents.
FLAGS: SECONDARY_SYNTAX
lvresize --poolmetadatasize SizeMB LV_thinpool
OO: --alloc Alloc, --autobackup Bool, --force,
--nofsck, --nosync, --noudevsync,
--reportformat String, --stripes Number, --stripesize SizeKB
OP: PV ...
ID: lvresize_pool_metadata_by_size
DESC: Resize a pool metadata SubLV by a specified size.
The three commands have separate definitions because they have
different required parameters. Required parameters are specified
on the first line of the definition. Optional options are
listed after OO, and optional positional args are listed after OP.
This data is used to generate corresponding command definition
structures for lvm in command-lines.h. usage/help output is also
auto generated, so it is always in sync with the definitions.
Every user-entered command is compared against the set of
command structures, and matched with one. An error is
reported if an entered command does not have the required
parameters for any definition. The closest match is printed
as a suggestion, and running lvresize --help will display
the usage for each possible lvresize command.
The prototype syntax used for help/man output includes
required --option and positional args on the first line,
and optional --option and positional args enclosed in [ ]
on subsequent lines.
command_name <required_opt_args> <required_pos_args>
[ <optional_opt_args> ]
[ <optional_pos_args> ]
Command definitions that are not to be advertised/suggested
have the flag SECONDARY_SYNTAX. These commands will not be
printed in the normal help output.
Man page prototypes are also generated from the same original
command definitions, and are always in sync with the code
and help text.
Very early in command execution, a matching command definition
is found. lvm then knows the operation being done, and that
the provided args conform to the definition. This will allow
lots of ad hoc checking/validation to be removed throughout
the code.
Each command definition can also be routed to a specific
function to implement it. The function is associated with
an enum value for the command definition (generated from
the ID string.) These per-command-definition implementation
functions have not yet been created, so all commands
currently fall back to the existing per-command-name
implementation functions.
Using per-command-definition functions will allow lots of
code to be removed which tries to figure out what the
command is meant to do. This is currently based on ad hoc
and complicated option analysis. When using the new
functions, what the command is doing is already known
from the associated command definition.
Remove allocate_pvs from raid_manip.c:_region_size_change_request() API
and lv_extend() using it added for temporary test purpose.
Related: rhbz1366296
Add:
- region size checks to raid_manip.c types array and supporting functions
- tests to lvconvert-raid-takeover.sh to check bogus
"lvconvert --type --regionsize N " requests
Related: rhbz1366296
Add:
- conversion support from striped/raid0/raid0_meta to/from raid10;
raid10 goes by the near format (same as used in creation of
raid10 LVs), which groups data copies together with their original
blocks (e.g. 3-way striped, 2 data copies resulting in 112233 in the
first stripe followed by 445566 in the second etc.) and is limited
to even numbers of legs for now
- related tests to lvconvert-raid-takeover.sh
- typo
Related: rhbz1366296
- support shrinking of raid0/1/4/5/6/10 LVs
- enhance lvresize-raid.sh tests: add raid0* and raid10
- fix have_raid4 in aux.sh to allow lv_resize-raid.sh
and other scripts to test raid4
Resolves: rhbz1394048
Commit cfb6ef654d introduced
support to change RAID region size.
Fix:
- don't change region_size until after prompting the user
- use log_print_unless_silent() instead of log_warn()
- avoid superfluous sigint() calls which are already
covered in yes_no_prompt()
- typo
Related: rhbz1392947
Commit cfb6ef654d introduced
support to change RAID region size.
Add:
- missing conditions to support any types to function with
it in lv_raid_convert(); temporary workaround used until
cli validation patches get merged
- tests requesting "-R " to lvconvert-raid-takeover.sh
involving a cleanup of the script
Related: rhbz1392947
Cleanups as of Jons review:
- enhance comment about mandatory raid4 <-> raid5_n activation w/o metadata SubLVs
- remove bogus segment flag setting
- fix to sync related comments on conversions to raid0/striped and amongst raid4/5
- add missing error message for non-synced conversion to raid0/striped
Related: rhbz1366296
Add:
- support to change region size of existing RaidLVs
(all RAID LV types but raid0/raid0_meta)
- lvconvert-raid-regionsize.sh with test variations
for different RAID types and region sizes
Resolves: rhbz1392947
Add:
- support for segment types raid6_{ls,rs,la,ra}_6
(striped raid with dedicated last Q-Syndrome SubLVs)
- conversion support from raid5_{ls,rs,la,ra} to/from raid6_{ls,rs,la,ra}_6
- setting convenient segtypes on conversions from/to raid4/5/6
- related tests to lvconvert-raid-takeover.sh factoring
out _lvcreate,_lvconvert funxtions
Related: rhbz1366296
Add:
- support for segment type raid6_n_6 (striped raid with dedicated last parity/Q-Syndrome SubLVs)
- conversion support from striped/raid0/raid0_meta/raid4 to/from raid6_n_6
- related tests to lvconvert-raid-takeover.sh
Related: rhbz1366296
Add:
- support for segment type raid5_n (striped raid with dedicated last parity SubLVs)
- conversion support from striped/raid0/raid0_meta/raid4 to/from raid5_n
- related tests to lvconvert-raid-takeover.sh
Related: rhbz1366296
With recent commit d6a74025df using
INTERNAL_ERROR while cheking layer LV - it's been noticed mirror
logic currently doesn't do a correct thing during upconversion and
does a full-try instead of checking only allocator capabilities.
This leads to invalid usage of layer.
To keep existing code running before providing a fix, relax
INTERNAL_ERROR just an error and keep the 'code' running.
Once mirror code is fixed, these all check should be switched
to internal errors.
Show proper internal error for failing command when there are some
inconsitencies in sizes of LV and its layer instead of rather
meaningless error code 5.
(Could be hit i.e. if user tried to 'resize' cached LV and then
uncache such LV.)
During rework of resize code this validation check
has been lost (in my resize branch). Upstream
is still not supporting resize of any cache type LV
so needs to be prevented.
When we need to clear dirty cache content of cached LV, there
is table reload which usually is shortly followed by next metadata
change. However udev can't (as of now) process udev event
while device is 'suspended'.
So whenever sequence of 'suspend/resume/suspend' is needed,
we need to wait first for finishing of 'resume' processing before
starting next 'suspend'. Otherwise there is 'race' danger of triggering
unwantend umount by systemd as such event will trigger
SYSTEMD_READY=0 state for a moment for such changed device.
Such race is pretty ugly to trace so we may need to review more
sequencies for missing 'sync'.
(Other option is to enhnace 'udev' rules processing to avoid
such dramatic actions to be happening for suspended devices).
Add to commits 87117c2b25 and 0b8bf73a63 to avoid refreshing two
times altogether, thus avoiding issues related to clustered, remotely
activated RaidLV. Avoid need to repeat "lvchange --refresh RaidLV"
two times as a workaround to refresh a RaidLV. Fix handles removal
of temporary *-missing-* devices created for any missing segments
in RAID SubLVs during activation.
Because the kernel dm-raid target isn't able to handle transiently
failing devices properly we need
"[dm-devel][PATCH] dm raid: fix transient device failure processing"
as well.
test: add lvchange-raid-transient-failures.sh
and enhance lvconvert-raid.sh
Resolves: rhbz1025322
Related: rhbz1265191
Related: rhbz1399844
Related: rhbz1404425
Move individual segment validation to a separate function
executed for 'complete_vg'.
Move some 'extra' validation bits from 'raid' validation to global
segtype validation (so extending existing normal validation)
TODO: still some test are left to be moved.
Reduce some duplication in validation process - there are still
some left thought so still room for improving overal speed.
It could be actually better to use even cache origin in
read-only mode so there could no be some 'acidental'
change being done on this volume.
This however need further tools enhancment - where we would need
to handle whole subtree on 'lvchange -pr/-prw'.
Add this functionality to lvconvert:
'lvconvert --thin cachedLV --thinpool vg/poll'
Converts cachedLV to external origin (which will be read-only).
New thin volume is created in thinpool LV and it's using external
origin as source for unprovisioned chunks.
This conversion happens online (while volume is in use).
Thin LV remains fully writable.
Cached external origin no longer could be written so cache will be used
ONLY for read operations. For this limitation we require cache mode
to be writethrough (as writeback cannot write to read-only volumes).
When thinLV is later removed cached external origin is again
fully usable, just note, LV remain in 'read-only' mode.
When read-write is needed, 'lvchange -prw' has to be used.
Single external origin could be user by multiple thinLV in
multiple differen thin pool.
External origin could be reloaded via more locks.
It's actually even more complex then thin-pool,
as it may be active on more nodes for linear LVs
(and maybe even more types).
External origin is always read-only thus unmodifiable
device so there should not be a problem accesing it
through multiple nodes.
Also for thin-pool check first presence of active thin-pool.
FIXME:
It's not easy to detect on which nodes this device is active
Thus manipulation with such device may require checking every
node and it active state and refresh.
But since such setup is quite complex to prepare and use,
hopefully there are not user trying to 'explore' this usage yet.
To be ready to show status of cache volume, call the status
with layer. Layer is automatically detected in this case when
cache volume is used in 'layered' form (needs -real suffix).
Avoid printing misleading message about single dirty block.
Instead properly detect condition where the 'cleaner' policy
needs to be installed without 'overloading' dirty variable.
Also print warning if we would be clearing read-only volume.
(it really shouldn't happen).
External origin could be activated as stand-alone device.
When the last thin LV is removed, external origin is no longer
the external origin and it's layer property was dropped.
Ensure dm table is correct by reloading external origin
(when it's active).
Activation of raid has brough up also splitted image with tracing
(without taking lock for this).
So when raid is now activate - such image is not put into
table (with _rmeta). When user needs such device, just active it.
Commit 0690392040 revealed a problem
in raid metadata manipulation.
We do two operations in one table reload:
- raid leg/image extraction
- rename remaining raid legs
This should be made in separate steps. Otherwise we do an
uncorrectable table change on error path (leaving tables
for admin and dmsetup).
As a hotfix - restore the previous logic and use a single
new function _lv_update_and_reload_list which activates exclusively
extracted LVs on the list before resuming suspended raid LV.
This restore 'rename' functionality upon resume.
Also still preserve the 'origin_only' logic - although we know
it's not working properly for cluster and LV stacking.
Further fixes are needed.
backup is not 'tested' for success and also it should
actually happen just when command is finished.
We do not target to make backups with each inter-step
metadata change.