IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
This patch adds the ability to upconvert a raid1 array - say from 2-way to
3-way. It does not yet support upconverting linear to n-way.
The 'raid' device-mapper target allows for individual components (images) of
an array to be specified for rebuild. This mechanism is used when adding
new images to the array so that the new images can be resync'ed while the
rest of the images in the array can remain 'in-sync'. (There is no
mirror-on-mirror layering required.)
~> lvconvert --splitmirrors 1 --trackchanges vg/lv
The '--trackchanges' option allows a user the ability to use an image of
a RAID1 array for the purposes of temporary read-only access. The image
can be merged back into the array at a later time and only the blocks that
have changed in the array since the split will be resync'ed. This
operation can be thought of as a partial split. The image is never completely
extracted from the array, in that the array reserves the position the device
occupied and tracks the differences between the array and the split image via
a bitmap. The image itself is rendered read-only and the name (<LV>_rimage_*)
cannot be changed. The user can complete the split (permanently splitting the
image from the array) by re-issuing the 'lvconvert' command without the
'--trackchanges' argument and specifying the '--name' argument.
~> lvconvert --splitmirrors 1 --name my_split vg/lv
Merging the tracked image back into the array is done with the '--merge'
option (included in a follow-on patch).
~> lvconvert --merge vg/lv_rimage_<n>
The internal mechanics of this are relatively simple. The 'raid' device-
mapper target allows for the specification of an empty slot in an array
via '- -'. This is what will be used if a partial activation of an array
is ever required. (It would also be possible to use 'error' targets in
place of the '- -'.) If a RAID image is found to be both read-only and
visible, then it is considered separate from the array and '- -' is used
to hold it's position in the array. So, all that needs to be done to
temporarily split an image from the array /and/ cause the kernel target's
bitmap to track (aka "mark") changes made is to make the specified image
visible and read-only. To merge the device back into the array, the image
needs to be returned to the read/write state of the top-level LV and made
invisible.
Users already have the ability to split an image from an LV of "mirror"
segtype. This patch extends that ability to LVs of "raid1" segtype.
This patch only allows a single image to be split off, however. (The
"mirror" segtype allows an arbitrary number of images to be split off.
e.g. 4-way => 3-way/linear, 2-way/2-way, linear,3-way)
of top-level LV.
We can't activate sub-lv's that are being removed from a RAID1 LV while it
is suspended. However, this is what was being used to have them show-up
so we could remove them. 'sync_local_dev_names' is a sufficient and
proper replacement and can be done after the top-level LV is resumed.
1) add new function 'raid_remove_top_layer' which will be useful
to other conversion functions later (also cleans up code)
2) Add error messages if raid_[extract|add]_images fails
3) Add function prototypes to prevent compiler warnings when
compiling with '--with-raid=shared'
Fix a couple more issues that kabi found.
- Add some error messages in failure cases
- s/malloc/zalloc/
- use vg->vgmem for lv names instead of vg->cmd->mem
Add config option to enable crc checking of VG structures.
Currently it's disabled by default.
For the internal test-suite this check it is enabled.
Note: In the case the internal error is detected, debug build with
compile option DEBUG_ENFORCE_POOL_LOCKING helps to catch the source
of the problem.
Use debug pool locking functionality. So the command could check,
whether the memory in the pool has not been modified.
For lv_postoder() instead of unlocking and locking for every changed
struct status member do it once when entering and leaving function.
(mprotect would trap each such memory access).
Currently lv_postoder() does not modify other part of vg structure
then status flags of each LV with flags that are reverted back to
its original state after function exit.
Extend vginfo cache with cached VG structure. So if the same metadata
are use, skip mda decoding in the case, the same data are in use.
This helps for operations like activation of all LVs in one VG,
where same data were decoded giving the same output result.
Patch adds 1-to-1 connection between volume_group and lvmcache_vginfo.
Move the free_vg() to vg.c and replace free_vg with release_vg
and make the _free_vg internal.
Patch is needed for sharing VG in vginfo cache so the release_vg function name
is a better fit here.
As this flag could not have been set by the current code - removing it.
Note: because of the wrong code logic this call:
lvmcache_update_vg(correct_vg, correct_vg->status & PRECOMMITTED &
(inconsistent ? INCONSISTENT_VG : 0));
had always passed '0' - now after flag removal it's passing
PRECOMMITTED flag in - this present functinal change in this patch.
To match the original functionality - 0 had to be always passed.
More testing is needed here.
Compiler complaining that meta_lv could be used uninitialized. (Not true
because it is protected by 'clear_metadata'.) I switched to using 'lv->vg',
as it makes no difference to vg_[write|commit].
(here clvmd crashed in the middle of operation),
lock is not removed from cache - here is one example:
locking/cluster_locking.c:497 Locking VG V_vg_test UN (VG) (0x6)
locking/cluster_locking.c:113 Error writing data to clvmd: Broken pipe
locking/locking.c:399 <backtrace>
locking/locking.c:461 <backtrace>
Internal error: Volume Group vg_test was not unlocked
Code should always remove lock info from lvmcache and update counters
on unlock, even if unlock fails.
Today, we use "suppress_messages" flag (set internally in init_locking fn based
on 'ignorelockingfailure() && getenv("LVM_SUPPRESS_LOCKING_FAILURE_MESSAGES")'.
This way, we can suppress high level messages like "File-based locking
initialisation failed" or "Internal cluster locking initialisation failed".
However, each locking has its own sequence of initialization steps and these
could log some errors as well. It's quite misleading for the user to see such
errors and warnings if the "--sysinit" is used (and so the ignorelockingfailure
&& LVM_SUPPRESS_LOCKING_FAILURE_MESSAGES environment variable). Errors and
warnings from these intermediary steps should be suppressed as well if requested.
This patch propagates the "suppress_messages" flag deeper into locking init
functions. I've also added these flags for other locking types for consistency,
though it's not actually used for no_locking and readonly_locking.
Last usage was removed in Petr's commit related to VG mda repair fix
where relaxed check starts to ignore inconsistencies coming from
PVs that are marked MISSING - thus removing unused variable.
Implementation described in doc/lvm2-raid.txt.
Basic support includes:
- ability to create RAID 1/4/5/6 arrays
- ability to delete RAID arrays
- ability to display RAID arrays
Notable missing features (not included in this patch):
- ability to clean-up/repair failures
- ability to convert RAID segment types
- ability to monitor RAID segment types
This should be set by default! Normally we have "activation/udev_sync = 1"
in lvm.conf (example.conf.in). But if we use lvm2 without any config file
(or without a definition within '--config' option) the DEFAULT_UDEV_SYNC
is used instead. Together with verify_udev_operations=0 (when we rely on
udev fully), this can cause races as the node could be missing when needed.
(See also https://bugzilla.redhat.com/show_bug.cgi?id=723144)
Clvmd detects modifed config file before it takes lv_lock.
If the config file is changed rapidly - the change was ignored within
a seocnd ranged. This patch adds also compare of file size.
So change like some flag for 0 to 1 would pass unnoticed - but
it's quick fix for failing test suite.
FIXME: Implement inotify solution.
If MD linear device has set rounding (overload chunk size attribute),
the pvcreate command prints this warning:
/dev/md0 sysfs attr level not in expected format: linear
teardown sequence. (Previously the snapshot was deactivated while its
origin was active and before its removal was committed to disk, so
restarting after a crash at the point would leave corruption.)
When an LVM mirror is up-converted (an additional image added), it creates
a temporary mirror stack. The lower-level mirror in the stack that is
created was not being activated exclusively - violating the exclusive nature
of the original mirror. We now check for exclusive activation of a mirror
before converting it, and if found, we ensure that the temporary mirror
is also exclusively activated.
Mirrors used to be created by first creating a linear device and then adding
the other images plus the log. Now mirrors are created by creating all the
images in one go and then adding the log separately. The new way ran into
the condition that cluster mirrors cannot change the log type (in the case
of creation, from core -> disk) while the mirror is not active. (It isn't
active because it is in the process of being created.) The reason this
condition is in place is because a remote node may have the mirror active, and
we don't want to alter the log underneath it.
What we really needed was a way of checking if the mirror was active remotely
but not locally, and in that case do not allow a change of the log. I've added
this check, and cluster mirrors can now be created again.
We've used udev fallback code till now to check whether udev
created/removed the entries in /dev correctly and if not,
a repair was done (giving a warning messagea about that).
This patch adds a possibility to enable this additional check
and subsequent fallback only when required (debugging purposes
mostly) and trust udev completely.
So let's disable the fallback code by default and add a new
configuration option "activation/udev_fallback".
(The original code for creating the nodes will still be used
in case the device directory that is set in lvm.conf differs
from the one that udev uses and also when activation/udev_rules
is set to 0 - otherwise we would end up with no nodes/symlinks
at all)
It's useful to keep the partial flag cached - so just move the call
for vg_mark_partil_lvs() into import_vg_from_config_tree() so it gets
evaluated before it goes through the lvmcache.
This patch should not present any functional change.
Note: It is rather temporal solution - proper place is probably inside the
'read' call back - but needs some more discussion.
For now using this minor hack.
As the ACTIVATE_EXCL could be set only in clvmd code - there is no
use for this test in lv_add_mirrors() function only called from
tools context.
FIXME: Add cluster test case for this.
To avoid modification of 'read-only' volume group structure
add a new structure to pass local data around the code for LV
activation.
As origin_only is one such flag - replace this parameter with new
struct lv_activate_opts.
More parameters might eventually become part of lv_activate_opts.
transient error), stemming from the following sequence of events:
1) devices fail IO, triggering repair
2) dmeventd starts fixing up the mirror
3) during the downconversion, a new metadata version is written
--> the devices come back online here
4) the mirror device suspend/resume is called to update DM tables
5) during the suspend/resume cycle, *pre*-commit metadata is read;
however, since the failed devices are now back online, we get back
inconsistent set of precommit metadata and the whole operation fails
The patch relaxes the check that fails in step 5 above, namely by ignoring
inconsistencies coming from PVs that are marked MISSING.
and use this for the LVM critical section logic. Also report an error if
code tries to load a table while any device is known to be in the
suspended state.
(If the variety of problems these changes are showing up can't be fixed
before the next release, the error messages can be reduced to debug
level.)
are affected by the move. (Currently it's possible for I/O to become
trapped between suspended devices amongst other problems.
The current fix was selected so as to minimise the testing surface. I
hope eventually to replace it with a cleaner one that extends the
deptree code.
Some lvconvert scenarios still suffer from related problems.
Currently some operation with striped mirrors lead
to corrupted metadata, this patch just add detection of such
situation.
Example:
# lvcreate -i2 -l10 -n lvs vg_test
# lvconvert -m1 vg_test/lvs
# lvreduce -f -l1 vg_test/lvs
Reducing logical volume lvs to 4.00 MiB
Segment extent reduction 9not divisible by #stripes 2
Logical volume lvs successfully resized
# lvremove vg_test/lvs
Segment extent reduction 1not divisible by #stripes 2
LV segment lvs:0-4294967295 is incorrectly listed as being used by LV lvs_mimage_0
Internal error: LV segments corrupted in lvs_mimage_0.
There's a possibility someone will use the '/' in the hostname. Since we
generate a temporary file name (path) including the hostname, any '/' would
be ambiguous.
We can always set such hostname using 'sethostname' from unistd.h. But the
'hostname' command already includes the check and removes the '/' char.
However, some old versions still allow that.
See: https://bugzilla.redhat.com/show_bug.cgi?id=711445.
Since this is only a temporary name and the possibility of this error is
quite negligible, we don't need any complex escape sequence here, just a
simple char replace.
Before, we used vg_write_lock_held call to determnine the way a device is
opened. Unfortunately, this opened many devices in RW mode when it was not
really necessary. With the OPTIONS+="watch" rule used in the udev rules,
this could fire numerous events while closing such devices (and it caused
useless scans from within udev rules in return).
A common bug we hit with this was with the lvremove command which was unable
to remove the LV since it was being opened from within the udev rules. This
patch should minimize such situations (at least with respect to LVM handling
of devices).
Though there's still a possibility someone will open a device 'outside' in
parallel and fire the event based on the watch rule when closing a device
once opened for RW.