IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
This patch fixes mostly cluster behavior but also updates
non-cluster reaction where calls like 'lvchange -aln'
lead to incorrect errors for some segment types.
Fix the implicit activation rules where some segment types could
be activated only in exclusive mode in cluster.
lvm2 command was not preserver 'local' property and incorrectly
converted local activations in to plain exclusive, so the local
activation could have activate volumes exclusively, but remotely.
If the volume_list filters out volume from activation,
it is still success result for this function.
Change the error message back to verbose level.
Detect if the volume is active localy before zeroing,
so we report error a bit later for cases, where volume
could not be activated because it doesn't pass through volume
list (but user still could create volume when he disables
zeroing)
Add LV_TEMPORARY flag for LVs with limited existence during command
execution. Such LVs are temporary in way that they need to be activated,
some action done and then removed immediately. Such LVs are just like
any normal LV - the only difference is that they are removed during
LVM command execution. This is also the case for LVs representing
future pool metadata spare LVs which we need to initialize by using
the usual LV before they are declared as pool metadata spare.
We can optimize some other parts like udev to do a better job if
it knows that the LV is temporary and any processing on it is just
useless.
This flag is orthogonal to LV_NOSCAN flag introduced recently
as LV_NOSCAN flag is primarily used to mark an LV for the scanning
to be avoided before the zeroing of the device happens. The LV_TEMPORARY
flag makes a difference between a full-fledged LV visible in the system
and the LV just used as a temporary overlay for some action that needs to
be done on underlying PVs.
For example: lvcreate --thinpool POOL --zero n -L 1G vg
- first, the usual LV is created to do a clean up for pool metadata
spare. The LV is activated, zeroed, deactivated.
- between "activated" and "zeroed" stage, the LV_NOSCAN flag is used
to avoid any scanning in udev
- betwen "zeroed" and "deactivated" stage, we need to avoid the WATCH
udev rule, but since the LV is just a usual LV, we can't make a
difference. The LV_TEMPORARY internal LV flag helps here. If we
create the LV with this flag, the DM_UDEV_DISABLE_DISK_RULES
and DM_UDEV_DISABLE_OTHER_RULES flag are set (just like as it is
with "invisible" and non-top-level LVs) - udev is directed to
skip WATCH rule use.
- if the LV_TEMPORARY flag was not used, there would normally be
a WATCH event generated once the LV is closed after "zeroed"
stage. This will make problems with immediated deactivation that
follows.
A common scenario is during new LV creation when we need to wipe the
newly created LV and avoid any udev scanning before this stage otherwise
it could cause the device (the LV) to be claimed by some other subsystem
for which there were stale metadata within LV data.
This patch adds possibility to mark the LV we're just about to wipe with
a flag that gets passed to udev via DM_COOKIE as a subsystem specific
flag - DM_SUBSYSTEM_UDEV_FLAG0 (in this case the subsystem is "LVM")
so LVM udev rules will take care of handling that.
lib/metadata/lv_manip.c:_sufficient_pes_free() was calculating the
required space for RAID allocations incorrectly due to double
accounting. This resulted in failure to allocate when available
space was tight.
When RAID data and metadata areas are allocated together, the total
amount is stored in ah->new_extents and ah->alloc_and_split_meta is
set. '_sufficient_pes_free' was adding the necessary metadata extents
to ah->new_extents without ever checking ah->alloc_and_split_meta.
This often led to double accounting of the metadata extents. This
patch checks 'ah->alloc_and_split_meta' to perform proper calculations
for RAID.
This error is only present in the function that checks for the needed
space, not in the functions that do the actual allocation.
If "default" thin pool chunk size calculation method is selected,
use minimum_io_size, otherwise optimal_io_size for "performance"
device hint exposed in sysfs. If there appear to be PVs with
different hints presented, use their least common multiple.
If the hint is less than the default value defined for the
calculation method, use the default value instead.
A previous commit (b6bfddcd0a) which
was designed to prevent segfaults during lvextend when trying to
extend striped logical volumes forgot to include calculations for
RAID4/5/6 parity devices. This was causing the 'contiguous' and
'cling_by_tags' allocation policies to fail for RAID 4/5/6.
The solution is to remember that while we can compare
ah->area_count == prev_lvseg->area_count
for non-RAID, we should compare
(ah->area_count + ah->parity_count) == prev_lvseg->area_count
for a general solution.
Older gcc is giving misleading warning:
metadata/lv_manip.c:4018: warning: ‘seg’ may be used uninitialized in
this function
But warning free compilation is better.
Creation, deletion, [de]activation, repair, conversion, scrubbing
and changing operations are all now available for RAID LVs in a
cluster - provided that they are activated exclusively.
The code has been changed to ensure that no LV or sub-LV activation
is attempted cluster-wide. This includes the often overlooked
operations of activating metadata areas for the brief time it takes
to clear them. Additionally, some 'resume_lv' operations were
replaced with 'activate_lv_excl_local' when sub-LVs were promoted
to top-level LVs for removal, clearing or extraction. This was
necessary because it forces the appropriate renaming actions the
occur via resume in the single-machine case, but won't happen in
a cluster due to the necessity of acquiring a lock first.
The *raid* tests have been updated to allow testing in a cluster.
For the most part, this meant creating devices with '-aey' if they
were to be converted to RAID. (RAID requires the converting LV to
be EX because it is a condition of activation for the RAID LV in
a cluster.)
When the pool is created from non-linear target the more complex rules
have to be used and stacking needs to properly decode args for _tdata
LV. Also proper allocation policies are being used according to those
set in lvm2 metadata for data and metadata LVs.
Also properly check for active pool and extra code to active it
temporarily.
With this fix it's now possible to use:
lvcreate -L20 -m2 -n pool vg --alloc anywhere
lvcreate -L10 -m2 -n poolm vg --alloc anywhere
lvconvert --thinpool vg/pool --poolmetadata vg/poolm
lvresize -L+10 vg/pool
The pool metadata LV must be accounted for when determining what PVs
are in a thin-pool. The pool LV must also be accounted for when
checking thin volumes.
This is a prerequisite for pvmove working with thin types.
The function 'get_pv_list_for_lv' will assemble all the PVs that are
used by the specified LV. It uses 'for_each_sub_lv' to traverse all
of the sub-lvs which may compose it.
The PREFERRED allocation mechanism requires the number of areas in the
previous LV segment to match the number in the new segment being
allocated. If they do not match, the code may crash.
E.g. https://bugzilla.redhat.com/989347
Introduce A_AREA_COUNT_MATCHES and when not set avoid referring
to the previous segment with the contiguous and cling policies.
Three fixme's addressed in this commit:
1) lib/metadata/lv_manip.c:_calc_area_multiple() - this could be
safely changed to a comment explaining that currently because
RAID10 can only have a 2-way mirror, we don't need to know the
number of stripes. However, we will need to know that in the
future if RAID10 is to support more than 2-way mirroring.
2) lib/metadata/mirror.c:_delete_lv() - should have been calling
_activate_lv_like_model() with 'mirror_lv'. This is because
'mirror_lv' is the LV that the overall operation is being
performed on. We need to use this LV as the basis for
determining whether to activate locally, or across the
cluster, etc.
3) tools/lvcreate.c:_lvcreate_params() - Minor clean-up. If
'-m 0' is given, treat it as though the mirroring argument
was not given (i.e. as though the requested segment type
was 'stripe' and not mirror).
Pool creation involves clearing of metadata device
which triggers udev watch rule we cannot udev synchronize with
in current code.
This metadata devices needs to be activated localy,
so in cluster mode deactivation and reactivation
is always needed.
However for non-clustered mode we may reload table
via suspend/resume path which avoids collision with
udev watch rule which was occasionaly triggering
retry deactivation loop.
Code has been also split into 2 separate code paths
for thin pools and thin volumes which improved readability
of the code as well.
Deactivation has been moved out of extend_pool() and
decision is now in _lv_create_an_lv() which knows
the change mode.
Remove backup() call from update_pool_lv() as it's been there
duplicated and preperly order backup() call after lvresize,
so there is just one such call.
If the thin pool is known to be active, messages can be passed
to the pool even when the created thin volume is not going to be
activated.
So we do not need to stack large list of message and validate
and catch creation errors earlier in this case.
Replace the test for valid activation combination with simpler list of
deactivation combinations.
The activation/auto_set_activation_skip enables/disables automatic
adding of the ACTIVATION_SKIP LV flag. By default thin snapshots
are flagged to be skipped during activation.
And by default, the auto_set_activation_skip is enabled.
Also add -k/--setactivationskip y/n and -K/--ignoreactivationskip
options to lvcreate.
The --setactivationskip y sets the flag in metadata for an LV to
skip the LV during activation. Also, the newly created LV is not
activated.
Thin snapsots have this flag set automatically if not specified
directly by the --setactivationskip y/n option.
The --ignoreactivationskip overrides the activation skip flag set
in metadata for an LV (just for the run of the command - the flag
is not changed in metadata!)
A few examples for the lvcreate with the new options:
(non-thin snap LV => skip flag not set in MDA + LV activated)
raw/~ $ lvcreate -l1 vg
Logical volume "lvol0" created
raw/~ $ lvs -o lv_name,attr vg/lvol0
LV Attr
lvol0 -wi-a----
(non-thin snap LV + -ky => skip flag set in MDA + LV not activated)
raw/~ $ lvcreate -l1 -ky vg
Logical volume "lvol1" created
raw/~ $ lvs -o lv_name,attr vg/lvol1
LV Attr
lvol1 -wi------
(non-thin snap LV + -ky + -K => skip flag set in MDA + LV activated)
raw/~ $ lvcreate -l1 -ky -K vg
Logical volume "lvol2" created
raw/~ $ lvs -o lv_name,attr vg/lvol2
LV Attr
lvol2 -wi-a----
(thin snap LV => skip flag set in MDA (default behaviour) + LV not activated)
raw/~ $ lvcreate -L100M -T vg/pool -V 1T -n thin_lv
Logical volume "thin_lv" created
raw/~ $ lvcreate -s vg/thin_lv -n thin_snap
Logical volume "thin_snap" created
raw/~ $ lvs -o name,attr vg
LV Attr
pool twi-a-tz-
thin_lv Vwi-a-tz-
thin_snap Vwi---tz-
(thin snap LV + -K => skip flag set in MDA (default behaviour) + LV activated)
raw/~ $ lvcreate -s vg/thin_lv -n thin_snap -K
Logical volume "thin_snap" created
raw/~ $ lvs -o name,attr vg/thin_lv
LV Attr
thin_lv Vwi-a-tz-
(thins snap LV + -kn => no skip flag in MDA (default behaviour overridden) + LV activated)
[0] raw/~ # lvcreate -s vg/thin_lv -n thin_snap -kn
Logical volume "thin_snap" created
[0] raw/~ # lvs -o name,attr vg/thin_snap
LV Attr
thin_snap Vwi-a-tz-
Start separating the validation from the action in the basic lvresize
code moved to the library.
Remove incorrect use of command line error codes from lvresize library
functions. Move errors.h to tools directory to reinforce this,
exporting public versions of the error codes in lvm2cmd.h for dmeventd
plugins to use.
Fix and improve handling on sigint.
Always check for signal presence *before* calling of command,
so it will not call the command when break was hit.
If the command has been finished succesfully there is
no problem to mark the command ok and not report interrupt at all.
Fix cuple related stack; reports and assignments.
If "vgcreate/lvcreate --profile <profile_name>" is used, the profile
name is automatically stored in metadata for making it possible to
load it automatically next time the VG/LV is used.
Since reduce the message has informational character and doesn't lead
to exit of the command - reduce the log level to info print as we
use for other similar types.
Reindent next print message.
The special suspend/resume code in lv_remove for LVM1 snapshots was interpsersed
with a vg_commit call. However, while with LVM1 metadata, vg_commit is
technically a no-op, the activation code relied on the ondisk and incore
metadata being the same, since on LVM1, a "commit" happens in vg_write
already. Since the "ondisk" metadata was previously not available with format1
(and incore was silently used instead, via lvmcache), the problem was masked.