IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Currently it is impossible to remove a failed PV which has a RAID LV
on it. This patch fixes the issue by replacing the failed PV with an
'error' segment within the affected sub-LVs. Once there is no longer
a RAID LV using the PV, it can be removed.
Most often, it is better to replace a failed RAID device with a spare.
(You can use 'lvconvert --repair <vg>/<LV>' to accomplish that.)
However, if there are no spares in the volume group and none will be
added, it is useful to be able to removed the failed device.
Following patches address the ability to perform 'lvconvert' operations
on RAID LVs that contain sub-LVs composed of 'error' segments.
We have been using 'mirror_region_size' in lvm.conf as the default region
size for RAID logical volumes as well as mirror logical volumes. Since,
"raid" is more inclusive and representative than "mirror", I have changed
the name of this setting. We must still check for the old setting and warn
the user if we are overriding it with the new setting if both happen to be
present.
Instead of check for lv_is_active() for thin pool LV,
query the whole pool via new pool_is_active().
Fixes a problem when we cannot change discards settings
for active pool device where the actual layer for pool
device was inactive, but thin volumes using thin pool
have been active.
Update the error path after problems with suspend_lv or vg_commit.
It's not exactly well defined what should happen, and this
code seems to appear in many different instancies<F2> in the
whole source code tree - we should probably pick the best version.
Rename lvmetad_warning() to lvmetad_connect_or_warn().
Log all connection attempts on the client side, whether successful or not.
Reduce some nesting and remove a redundant assertion.
We need to call sync_local_dev_names directly as pvscan uses
VG_GLOBAL lock and this one *does not* cause the synchronization
(sync_dev_names) to be called on unlock (VG_GLOBAL is not a real VG):
define unlock_vg(cmd, vol)
do { \
if (is_real_vg(vol)) \
sync_dev_names(cmd); \
(void) lock_vol(cmd, vol, LCK_VG_UNLOCK); \
} while (0)
Without this fix, we end up without udev synchronization for the
pvscan --cache (mainly for -aay that causes the VGs/LVs to be
autoactivated) and also udev synchronization cookies are then left
in the system since they're not managed properly (code before sets
up udev sync cookies, but we have to call dm_udev_wait at least once
after that to do the wait and cleanup).
If a RAID array is not in-sync, replacing devices should not be allowed
as a general rule. This is because the contents used to populate the
incoming device may be undefined because the devices being read where
not in-sync. The kernel enforces this rule unless overridden by not
allowing the creation of an array that is not in-sync and includes a
devices that needs to be rebuilt.
Since we cannot know the sync state of an LV if it is inactive, we must
also enforce the rule that an array must be active to replace devices.
That leaves us with the following conditions:
1) never allow replacement or repair of devices if the LV is in-active
2) never allow replacement if the LV is not in-sync
3) allow repair if the LV is not in-sync, but warn that contents may
not be recoverable.
In the case where a user is performing the repair on the command line via
'lvconvert --repair', the warning is printed before the user is prompted
if they would like to replace the device(s). If the repair is automated
(i.e. via dmeventd and policy is "allocate"), then the device is replaced
if possible and the warning is printed.
We can also use this for conversion between different mirror segment
types. Each new segment type converter then needs to check itself
whether the --stripes is applicable.
The motivation to grab the global lock is to avoid a scan and metadata parsing
for each PV, but the cost of obtaining metadata is _mostly_ mitigated by having
lvmetad around. Not taking the global lock improves throughput when multiple pvs
or related commands are running in parallel, like in RHEV.
Calling pvscan --cache with -aay on a PV without an MDA would spuriously fail
with an internal error, because of an incorrect assumption that a parsed VG
structure was always available. This is not true and the autoactivation handler
needs to call vg_read to obtain metadata in cases where the PV had no MDAs to
parse. Therefore, we pass vgid into the handler instead of the (possibly NULL)
VG coming from the PV's MDA.
Remove no longer needed warning for unsuppoted discards
for non-power-2 lvcreate commands.
(Missed from the patch for the same update in lvchange made
by commit dde5a6c52b)
Attempting pvmove on RAID LVs replaces the kernel RAID target with
a temporary pvmove target, ultimately destroying the RAID LV. pvmove
must be prevented on RAID LVs for now.
Use 'lvconvert --replace old_pv vg/lv new_pv' if you want to move
an image of the RAID LV.
Support swapping of metadata device if the thin pool already
exists. This way it's easy to i.e. resize metadata or their
repair operation.
User may create some empty LV, replace existing metadata
or dump and restore them into bigger LV.
If udev synchronization is disabled by means of --noudevsync
option, we should disable just the synchronization and nothing else.
The udev fallback (verifying udev operations and fixing the
nodes/symlinks if found incorrect) is orthogonal and controlled
by a separate activation/verify_udev_operations configuration option.
Allow restoring metadata with thin pool volumes.
No validation is done for this case within vgcfgrestore tool -
thus incorrect metadata may lead to destruction of pool content.
Configurable settings for thin pool create
if they are not specified on command line.
New supported lvm.conf options are:
allocation/thin_pool_chunk_size
allocation/thin_pool_discards
allocation/thin_pool_zero
Similar to the way the 'mirror', 'raid1' and 'raid10' segment types set
the number of mirrors to 2 ('-m 1') if the argument is not specified,
here we set the number of stripes to 2 if not given on the command line
when creating a RAID10 LV.
Move common functions for lvcreate and lvconvert.
get_pool_params() - read thin pool args.
update_pool_params() - updates/validates some thin args.
It is getting complicated and even few more things will be
implemented, so to avoid reimplementing things differently
in lvcreate and lvconvert code has been splitted
into 2 common functions that allow some future extension.
Target tells us its version, and we may allow different set of options
to be supported with different version of driver.
Idea is to provide individual feature flags and later be
able to query for them.
This patch is intended to fix bug 825323 - FS turns read-only during a double
fault of a mirror leg and mirrored log's leg at the same time. It only
affects a 2-way mirror with a mirrored log. 3+-way mirrors and mirrors
without a mirrored log are not affected.
The problem resulted from the fact that the top level mirror was not
using 'noflush' when suspending before its "down-convert". When a
mirror image fails, the bios are queue until a suspend is recieved. If
it is a 'noflush' suspend, the bios can be safely requeued in the DM
core. If 'noflush' is not used, the bios must be pushed through the
target and if a device is failed for a mirror, that means issuing an
error. When an error is received by a file system, it results in it
turning read-only (depending on the FS).
Part of the problem was is due to the nature of the stacking involved in
using a mirror as a mirror's log. When an image in each fail, the top
level mirror stalls because it is waiting for a log flush. The other
stalls waiting for corrective action. When the repair command is issued,
the entire stacked arrangement is collapsed to a linear LV. The log
flush then fails (somewhat uncleanly) and the top-level mirror is suspended
without 'noflush' because it is a linear device.
This patch allows the log to be repaired first, which in turn allows the
top-level mirror's log flush to complete cleanly. The top-level mirror
is then secondarily reduced to a linear device - at which time this mirror
is suspended properly with 'noflush'.
Use log_warn to print non-fatal warning messages.
Use of log_error would confuse checker for testing
whether proper error has been reported for some real error.
When valgrind usage is desired by user (--enable-valgrind-pool)
skip playing/closing/reopenning with descriptors - it makes
valgridng useless.
Make sleep delay for clvmd start longer.
A while back, the behavior of LVM changed from allowing metadata changes
when PVs were missing to not allowing changes. Until recently, this
change was tolerated by HA-LVM by forcing a 'vgreduce --removemissing'
before trying (again) to add tags to an LV and then activate it. LVM
mirroring requires that failed devices are removed anyway, so this was
largely harmless. However, RAID LVs do not require devices to be removed
from the array in order to be activated. In fact, in an HA-LVM
environment this would be very undesirable. Device failures in such an
environment can often be transient and it would be much better to restore
the device to the array than synchronize an entirely new device.
There are two methods that can be used to setup an HA-LVM environment:
"clvm" or "tagging". For RAID LVs, "clvm" is out of the question because
RAID LVs are not supported in clustered VGs - not even in an exclusively
activated manner. That leaves "tagging". HA-LVM uses tagging - coupled
with 'volume_list' - to ensure that only one machine can have an LV active
at a time. If updates are not allowed when a PV is missing, it is
impossible to add or remove tags to allow for activation. This removes
one of the most basic functionalities of HA-LVM - site redundancy. If
mirroring or RAID is used to replicate the storage in two data centers
and one of them goes down, a server and a storage device are lost. When
the service fails-over to the alternate site, the VG will be "partial".
Unable to add a tag to the VG/LV, the RAID device will be unable to
activate.
The solution is to allow vgchange and lvchange to alter the LVM metadata
for a limited set of options - --[add|del]tag included. The set of
allowable options are ones that do not cause changes to the DM kernel
target (like --resync would) or could alter the structure of the LV
(like allocation or conversion).
Compared to names, UUIDs can't be renamed once they are created
for a device. The 'mangle' command will just issue an error message
about a need for manual intervention in this case - reactivating the
device (remove + create) does the job as the defualt mangling mode
used is "auto" and that will assign a correct mangled form the UUID.
For now this convertions is not supported, thus disabled.
The only supported conversion for now is to create mirrored thin pools
from mirrored devices.
Update code for lvconvert.
Change the lvconvert user interface a bit - now we require 2 specifiers
--thinpool takes LV name for data device (and makes the name)
--poolmetadata takes LV name for metadata device.
Fix type in thin help text -z -> -Z.
Supported is also new flag --discards for thinpools.
When reformatting the 'lvchange_resync' code in commit
05131f5853, a '!' should have been removed
from the condition that checks for the LV_NOTSYNCED flag on a corelog
mirror LV. The presence of this '!' caused the LV_NOTSYNCED flag to be
cleared when it wasn't present and left when it was present.
It is not allowed to add images to a 'mirror' or 'raid1' LV if the
LV_NOTSYNCED flag is set. We add some up-convert tests to ensure this
behavior is being enforced and that the LV_NOTSYNCED flag is being
properly cleared by 'lvchange --resync'.
(Not updating WHATS_NEW because this is intrarelease.)
Don't try to issue discards to a missing PV to avoid segfault.
Prevent lvremove from removing LVs that have any part missing.
https://bugzilla.redhat.com/857554
Using 'activation/auto_activation_volume_list = [ "vg/lvol1" ]'.
Before this patch:
3 logical volume(s) in volume group "vg" now active
LV VG Attr LSize Pool Origin Data% Move Log Copy% Convert
lvol0 vg -wi----- 4.00m
lvol1 vg -wi-a--- 4.00m
lvol2 vg -wi-a--- 4.00m
lvol3 vg -wi-a--- 4.00m
(vg/lvol1 activated as it passes the list and all subsequent volumes too - wrong!)
With this patch:
1 logical volume(s) in volume group "vg" now active
LV VG Attr LSize Pool Origin Data% Move Log Copy% Convert
lvol0 vg -wi----- 4.00m
lvol1 vg -wi-a--- 4.00m
lvol2 vg -wi----- 4.00m
lvol3 vg -wi----- 4.00m
(only vg/lvol1 activated as it passes the list and no other - correct!)
Issuing a 'lvchange --resync <VG>/<RAID_LV>' had no effect. This is
because the code to handle RAID LVs was not present. This patch adds
the code that will clear the metadata areas of RAID LVs - causing them
to resync upon activation.
When an LV is to be resynced, the metadata areas are cleared and the
LV is reactivated. This is true for mirroring and will also be true
for RAID LVs. We restructure the code in lvchange_resync() so that we
keep all the common steps necessary (validation of ability to resync,
deactivation, activation of meta/log devices, clearing of those devices,
etc) and place the code that will be divergent in separate functions:
detach_metadata_devices()
attach_metadata_devices()
The common steps will be processed on lists of metadata devices. Before
RAID capability is added, this will simply be the mirror log device (if
found).
This patch lays the ground-work for adding resync of RAID LVs.
By changing the conditional for resyncing mirrors with core-logs a
bit, we can short-circuit the rest of the function for that case
and reduce the amount of indenting in the rest of the function.
This cleanup will simplify future patches aimed at properly handling
the resync of RAID LVs.
When printing a message for the user and the lv_segment pointer is available,
use segtype->ops->name() instead of segtype->name. This gives a better
user-readable name for the segment. This is especially true for the
'striped' segment type, which prints "linear" if there is an area_count of
one.
We should check whether the fd is opened before trying to reopen it.
For example, the stdin is closed in test/lib/harness.c causing the
test suite to fail.
Accept -q as the short form of --quiet.
Suppress non-essential standard output if -q is given twice.
Treat log/silent in lvm.conf as equivalent to -qq.
Review all log_print messages and change some to
log_print_unless_silent.
When silent, the following commands still produce output:
dumpconfig, lvdisplay, lvmdiskscan, lvs, pvck, pvdisplay,
pvs, version, vgcfgrestore -l, vgdisplay, vgs.
[Needs checking.]
Non-essential messages are shifted from log level 4 to log level 5
for syslog and lvm2_log_fn purposes.
This patch adds support for RAID10. It is not the default at this
stage. The user needs to specify '--type raid10' if they would like
RAID10 instead of stacked mirror over stripe.
Adding couple INTERNAL_ERROR reports for unwanted parameters:
Ensure the 'top' metadata node cannot be NULL for lvmetad.
Make obvious vginfo2 cannot be NULL.
Report internal error if handler and vg is undefined.
Check for handle in poll_vg().
Ensure seg is not NULL in dev_manager_transient().
Report missing read_ahead for _lv_read_ahead_single().
Check for report handler in dm_report_object().
Check missing VG in _vgreduce_single().