IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Let's use the size of origin as the real base for percenta calculation,
and 'silenly' add needed metadata space for snapshot.
So now command 'lvcreate -s -l100%ORIGIN vg/lv' should always create a
snapshot to handle full device overwrite.
Expresing -lXX%LV is not valid for snapshot, but error message for
snapshost case was not complete and missed %ORIGIN.
Also document correct settings for in manpage properly where
it missed %PVS.
While polling for snapshot, detect first the snapshot still
exits. It's valid to have multiple polling threads watching
for the same thing and just 1 can 'win' the finish part.
All others should nicely 'fail'.
If the we are polling an LV due to some sort of conversion and it
becomes inactive, a rather worrisome message is produced, e.g.:
" ABORTING: Mirror percentage check failed."
We can cleanly exit if we do a simple check to see if the LV is
active before performing the check. This eliminates the scary
message.
Internal reporting function cannot handle NULL reporting value,
so ensure there is at least dummy label.
So move dummy_lable from tools/reporter.c and use it for all
report_object() calls in lib/report/report.c.
(Fixes RHBZ 1108394)
Simlify lvm_report_object initialization.
And use ifdefs there, not exposing it in the tool code itself.
Later in the future, we should probably make the PIDFILE and
daemon checking code available also in case the daemon itself
is not built.
If the user supplies a '--yes' argument, then don't bother them with
a question to confirm whether to change the cluster attribute (even
if clvmd isn't running).
If clvmd is not running or the locking type is not clustered and someone
attempts to set the cluster attribute on a volume group, prompt them to
see if they are sure. (Only prompt for one though. If neither are true,
simply ask them once.)
Warn when --udevcookie/DM_UDEV_COOKIE is used with 'dmsetup remove --force'.
When command is doing multiple ioctl operations on a single device,
it may invoke udev activity, that is colliding with further ioctl commands.
The result of such operation becomes unpredictable.
Use of --retry could partially help...
LVM has restricter character set that is allowed for VG-LV names
and the dm names constructed do not contain any blacklisted characters
that would require name mangling.
Also, when any other device-mapper device is scanned that could
possibly contain such blacklisted characters, we reference the
device by its major:minor instead of dm name (e.g. _device_is_usable fn).
Warn user before converting volume to different type.
WARNING: Converting vg/lvol0 logical volume to pool's meta/data volume.
THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)
Since the content of volume is lost we have to query user to confirm
such operation. If user is 100% sure, he may use '--yes' to avoid prompts.
The dumpconfig now understands --commandprofile/--profile/--metadataprofile
The --commandprofile and --profile functionality is almost the same
with only one difference and that is that the --profile is just used
for dumping the content, it's not applied for the command itself
(while the --commandprofile profile is applied like it is done for
any other LVM command).
We also allow --metadataprofile for dumpconfig - dumpconfig *does not*
touch VG/LV and metadata in any way so it's OK to use it here (just for
dumping the content, checking the profile validity etc.).
The validity of the profile can be checked with:
dumpconfig --commandprofile/--profile/--metadataprofile --validate
...depending on the profile type.
Also, mention --config in the dumpconfig help string so users know
that dumpconfig handles this too (it did even before, but it was not
documented in the help string).
- When defining configuration source, the code now uses separate
CONFIG_PROFILE_COMMAND and CONFIG_PROFILE_METADATA markers
(before, it was just CONFIG_PROFILE that did not make the
difference between the two). This helps when checking the
configuration if it contains correct set of options which
are all in either command-profilable or metadata-profilable
group without mixing these groups together - so it's a firm
distinction. The "command profile" can't contain
"metadata profile" and vice versa! This is strictly checked
and if the settings are mixed, such profile is rejected and
it's not used. So in the end, the CONFIG_PROFILE_COMMAND
set of options and CONFIG_PROFILE_METADATA are mutually exclusive
sets.
- Marking configuration with one or the other marker will also
determine the way these configuration sources are positioned
in the configuration cascade which is now:
CONFIG_STRING -> CONFIG_PROFILE_COMMAND -> CONFIG_PROFILE_METADATA -> CONFIG_FILE/CONFIG_MERGED_FILES
- Marking configuration with one or the other marker will also make
it possible to issue a command context refresh (will be probably
a part of a future patch) if needed for settings in global profile
set. For settings in metadata profile set this is impossible since
we can't refresh cmd context in the middle of reading VG/LV metadata
and for each VG/LV separately because each VG/LV can have a different
metadata profile assinged and it's not possible to change these
settings at this level.
- When command profile is incorrect, it's rejected *and also* the
command exits immediately - the profile *must* be correct for the
command that was run with a profile to be executed. Before this
patch, when the profile was found incorrect, there was just the
warning message and the command continued without profile applied.
But it's more correct to exit immediately in this case.
- When metadata profile is incorrect, we reject it during command
runtime (as we know the profile name from metadata and not early
from command line as it is in case of command profiles) and we
*do continue* with the command as we're in the middle of operation.
Also, the metadata profile is applied directly and on the fly on
find_config_tree_* fn call and even if the metadata profile is
found incorrect, we still need to return the non-profiled value
as found in the other configuration provided or default value.
To exit immediately even in this case, we'd need to refactor
existing find_config_tree_* fns so they can return error. Currently,
these fns return only config values (which end up with default
values in the end if the config is not found).
- To check the profile validity before use to be sure it's correct,
one can use :
lvm dumpconfig --commandprofile/--metadataprofile ProfileName --validate
(the --commandprofile/--metadataprofile for dumpconfig will come
as part of the subsequent patch)
- This patch also adds a reference to --commandprofile and
--metadataprofile in the cmd help string (which was missing before
for the --profile for some commands). We do not mention --profile
now as people should use --commandprofile or --metadataprofile
directly. However, the --profile is still supported for backward
compatibility and it's translated as:
--profile == --metadataprofile for lvcreate, vgcreate, lvchange and vgchange
(as these commands are able to attach profile to metadata)
--profile == --commandprofile for all the other commands
(--metadataprofile is not allowed there as it makes no sense)
- This patch also contains some cleanups to make the code handling
the profiles more readable...
The dumpconfig reuses existing config_def_check results in case
the check is done during general lvm command context initialization
(when enabled by config/checks=1) so dumpconfig does not need to run
the same check again during its execution, hence saving some time.
However, we don't check for differences from defaults during general
lvm command initialization as it's useless at that time. It makes
sense only in case when such a check is directly requested (like in
the case of lvm dumpconfig --type diff). We need to take care that
the reused information was already produced with this "diff" checking
before and if not, we need to force the check so the check status also
gathers the new "diff" info now.
Also, do not do diff checking for any other dumpconfig command that
is run after dumpconfig --type diff.
When cmd refresh is called, we need to move any already loaded profiles
to profiles_to_load list which will cause their reload on subsequent
use. In addition to that, we need to take into account any change
in config/profile configuration setting on cmd context refresh
since this setting could be overriden with --config.
Also, when running commands in the shell, we need to remove the
global profile used from the configuration cascade so the profile
is not incorrectly reused next time when the --profile option is
not specified anymore for the next command in the shell.
This bug only affected profile specified by --profile cmd line
arg, not profiles referenced from LVM metadata.
Before, the cft_check_handle used to direct configuration checking
was part of cmd_context. It's better to attach this as part of the
exact config tree against which the check is done. This patch moves
the cft_check_handle out of cmd_context and it attaches it to the
config tree directly as dm_config_tree->custom->config_source->check_handle.
This change makes it easier to track the config tree check results
and provides less space for bugs as the results are directly attached
to the tree and we don't need to be cautious whether the global value
is correct or not (and whether it needs reinitialization) as it was
in the case when the cft_check_handle was part of cmd_context.
Share DM_REPORT_FIELD_RESERVED_NAME_{HELP,HELP_ALT} between libdm and
any libdm user to handle reserved field names, in this case the virtual
field name to show help instead of failing on unrecognized field.
The libdm user also needs to check the field name so it can fire
proper code in this case (cleanup, exit etc.).
When showing ACTIVE status for snapshot's origin,
avoid testing all its snapshot - it's not useful
to tell user origin is inactivate, while it's clearly
available and running - just one of its snapshot leg
is invalid...
While the 'raid1/10' segment types were being handled inadvertently
by '_move_mirrors()', the parity RAIDs were not being properly checked
to ensure that the user had specified all necessary PVs when moving
them. Thus, internal errors were being triggered when only part of
a RAID LV was moved to the new VG. I've added a new function,
'_move_raid()', which properly checks over any affected RAID LVs and
ensures that all the necessary PVs are being moved.
Config variables that are processed during setup prior to calling into
particular tools must not be accessed directly afterwards in case the
values already got overridden.
_process_config() already used the tests I'm removing here to call
lvmetad_set_active() and set up lvmetad_used().
There were two bugs before when using pvcreate --restorefile together
with data alignment and its offset specified:
- the --dataalignment was always ignored due to missing braces in the
code when validating the divisibility of supplied --dataalignment
argument with pe_start which we're just restoring:
if (pp->rp.pe_start % pp->data_alignment)
log_warn("WARNING: Ignoring data alignment %" PRIu64
" incompatible with --restorefile value (%"
PRIu64").", pp->data_alignment, pp->rp.pe_start);
pp->data_alignment = 0
The pp->data_alignment should be zeroed only if the pe_start is not
divisible with data_alignment.
- the check for compatibility of restored pe_start was incorrect too
since it did not properly count with the dataalignmentoffset that
could be supplied together with dataalignment
The proper formula is:
X * dataalignment + dataalignmentoffset == pe_start
So it should be:
if ((pp->rp.pe_start % pp->data_alignment) != pp->data_alignment_offset) {
...ignore supplied dataalignment and dataalignment offset...
}
This test for LV name restriction check name of device is below 128
chars (which is enforced by dm target).
Thus it should not count with device name.
(Though the test for PATH_MAX size should be probably also added,
but this is runtime test, since theoretically devpath might differ in cluster)
It's unclear why we should prohibit use of -v output.
So reenable (like with other 'display' tools)
But -c -m is really unsupported - return invalid cmd.
Commit 1a832398a7 moved
some code from _pvchange_single() to main pvchange() and
introduced exit code regression as return codes have not
been properly changed, thus pvchange command exited
with '0' exit code, even though it has reported error.
Also there is a missing vg unlock in error path.
Fix it by counting the total number of expected calls before
checking for pvname and also unlock and relase vg when
pv is not found.
Since commit f12ee43f2e call destroy,
it start to check all VGs are unlocked. However when we become_daemon,
we simply reset locking (since lock is still kept by parent process).
So implement a simple 'reset' flag.
pvscan --uuid was broken since it was using only 128 char buffers
without checking any write size, so any longer device path leads to
crash.
Also ansure format is properly aligned into columns with this option.
Check whether lvm dumpconfig --mergedconfig is used only
with --type current (where we're merging current config and
the config supplied on command line). With other types
the config was merged, but it was thrown away since we're
generating other type of config anyway. This lead to a memleak.
Error out if --mergedconfig is used with anything else than
--type current (or without specifying --type in which case
the --type current is used by default).
Delay archiving of metadata until we really start to
update metadata when converting volume into a snapshot.
Archive is not necessary when we abort operation early.
Do not allow conversion of too small LV into a COW snapshot device.
Without this patch snapshot target is generating these kernel
messages before creation fails:
attempt to access beyond end of device
dm-9: rw=16, want=8, limit=2
attempt to access beyond end of device
...
device-mapper: table: 253:11: snapshot: Failed to read snapshot metadata
device-mapper: ioctl: error adding target to table
device-mapper: reload ioctl on failed: Input/output error
Usage of origin as a snapshot 'COW' volume is unsupported.
Without this test lvm2 is able to generate this ugly internal error message.
To test this:
lvcreate -L1 -n lv1 vg
lvcreate -L1 -n lv2 -s vg/lv1
lvcreate -L1 -n lv3 vg
lvconvert -s vg/lv3 vg/lv1
Internal error: LVs (5) != visible LVs (1) + snapshots (1) + internal LVs (0) in VG vg
This prevents numerous VG refreshes on each "pvscan --cache -aay" call
if the VG is found complete. We need to issue the refresh only if the PV:
- is new
- was gone before and now it reappears (device "unplug/plug back" scenario)
- the metadata has changed
Don't add blkid to every linkage.
Link udev library just with lvm tools.
Drop extra linkage of udev library, since deps from libdevmapper
are already resolved in linked -ldevmapper.
Based on patch:
https://www.redhat.com/archives/lvm-devel/2014-March/msg00015.html
The CPPFunction typedef (among others) have been deprecated in favour of
specific prototyped typedefs since readline 4.2 (circa 2001).
It's been working since because compatibility typedefs have been in
place until they where removed in the recent readline 6.3 release.
Switch to the new style to avoid build breakage.
But also add full backward compatibility with define.
Signed-off-by: Gustavo Zacarias <gustavo zacarias com ar>
Previously, we declared a default value as undefined ("NULL") for
settings which require runtime context to be set first (e.g. settings
for paths that rely on SYSTEM_DIR environment variable or they depend
on any other setting in some way).
If we want to output default values as they are really used in runtime,
we should make it possible to define a default value as function which
is evaluated, not just providing a firm constant value as it was before.
This patch defines simple prototypes for such functions. Also, there's
new helper macros "cfg_runtime" and "cfg_array_runtime" - they provide
exactly the same functionality as the original "cfg" and "cfg_array"
macros when defining the configuration settings in config_settings.h,
but they don't set the constant default value. Instead, they automatically
link the configuration setting definition with one of these functions:
typedef int (*t_fn_CFG_TYPE_BOOL) (struct cmd_context *cmd, struct profile *profile);
typedef int (*t_fn_CFG_TYPE_INT) (struct cmd_context *cmd, struct profile *profile);
typedef float (*t_fn_CFG_TYPE_FLOAT) (struct cmd_context *cmd, struct profile *profile);
typedef const char* (*t_fn_CFG_TYPE_STRING) (struct cmd_context *cmd, struct profile *profile);
typedef const char* (*t_fn_CFG_TYPE_ARRAY) (struct cmd_context *cmd, struct profile *profile);
(The new macros actually set the CFG_DEFAULT_RUNTIME flag properly and
set the default value link to the function accordingly).
Then such configuration setting requires a function of selected type to
be defined. This function has a predefined name:
get_default_<id>
...where the <id> is the id of the setting as defined in
config_settings.h. For example "backup_archive_dir_CFG" if defined
as a setting with default value evaluated in runtime with "cfg_runtime"
will automatically have "get_default_backup_archive_dir_CFG" function
linked to this setting to get the default value.
When read-only snapshot was created, tool was skipping header
initialization of cow device. If it happened device has been
already containing header from some previous snapshot, it's
been 'reused' for a newly created snapshot instead of being cleared.
To make "lvm dumpconfig --type default" output to be usable like any
other config, we need to comment out lines that have no default value
defined. Otherwise, we'd have the output with config options
with blank or zero values which is not the same as when the value
is not defined! And such configuration can't be feed into lvm again
without further edits. So let's fix this.
Currently this covers these configuration options exactly:
devices/loopfiles
devices/preferred_names
devices/filter
devices/global_filter
devices/types
allocation/cling_tag_list
global/format_libraries
global/segment_libraries
activation/volume_list
activation/auto_activation_volume_list
activation/read_only_volume_list
activation/mlock_filter
metadata/dirs
metadata/disk_areas
metadata/disk_areas/<disk_area>
metadata/disk_areas/<disk_area>/start_sector
metadata/disk_areas/<disk_area>/size
metadata/disk_areas/<disk_area>/id
tags/<tag>
tags/<tag>/host_list
The code seems to work fine for the most trivial case - moving a
simple cache LV. However, it can cause problems when trying to
split out other LVs on different VGs and there hasn't been sufficient
testing for LV stacks that contain cache to enable the code. So,
we actively disable what is already broken and wait for the next
release to fix it.
Start to convert percentage size handling in lvresize to the new
standard. Note in the man pages that this code is incomplete.
Fix a regression in non-percentage allocation in my last check in.
This is what I am aiming for:
-l<extents>
-l<percent> LV/ORIGIN
sets or changes the LV size based on the specified quantity
of logical logical extents (that might be backed by
a higher number of physical extents)
-l<percent> PVS/VG/FREE
sets or changes the LV size so as to allocate or free the
desired quantity of physical extents (that might amount to a
lower number of logical extents for the LV concerned)
-l+50%FREE - Use up half the remaining free space in the VG when
carrying out this operation.
-l50%VG - After this operation, this LV should be using up half the
space in the VG.
-l200%LV - Double the logical size of this LV.
-l+100%LV - Double the logical size of this LV.
-l-50%LV - Reduce the logical size of this LV by half.
Test raid10 availability as a target feature (instead of doing
it in all the places where raid10 should be checked).
TODO: activation needs runtime validation - so metadata with raid10
are skipped from activation in user-friendly way in lvm2.
Skip over LVs that have a cache LV in their tree of LV dependencies
when performing a pvmove.
This means that users cannot move a cache pool or a cache LV's origin -
even when that cache LV is used as part of another LV (e.g. a thin pool).
The new test (pvmove-cache-segtypes.sh) currently builds up various LV
stacks that incorporate cache LVs. pvmove tests are then performed to
ensure that cache related LVs are /not/ moved. Once pvmove is enabled
for cache, those tests will switch to ensuring that the LVs /are/
moved.
Several fixes for the recent changes that treat allocation percentages
as upper limits.
Improve messages to make it easier to see what is happening.
Fix some cases that failed with errors when they didn't need to.
Fix crashes when first_seg() returns NULL.
Remove a couple of log_errors that were actually debugging messages.
Remove 'skip' argument passed into the function.
We always used '0' - as this is the only supported
option (-K) and there is no complementary option.
Also add some testing for behaviour of skipping.
There are typically 2 functions for the more advanced segment types that
deal with parameters in lvcreate.c: _get_*_params() and _check_*_params().
(Not all segment types name their functions according to this scheme.)
The former function is responsible for reading parameters before the VG
has been read. The latter is for sanity checking and possibly setting
parameters after the VG has been read.
This patch adds a _check_raid_parameters() function that will determine
if the user has specified 'stripe' or 'mirror' parameters. If not, the
proper number is computed from the list of PVs the user has supplied or
the number that are available in the VG. Now that _check_raid_parameters()
is available, we move the check for proper number of stripes from
_get_* to _check_*.
This gives the user the ability to create RAID LVs as follows:
# 5-device RAID5, 4-data, 1-parity (i.e. implicit '-i 4')
~> lvcreate --type raid5 -L 100G -n lv vg /dev/sd[abcde]1
# 5-device RAID6, 3-data, 2-parity (i.e. implicit '-i 3')
~> lvcreate --type raid6 -L 100G -n lv vg /dev/sd[abcde]1
# If 5 PVs in VG, 4-data, 1-parity RAID5
~> lvcreate --type raid5 -L 100G -n lv vg
Considerations:
This patch only affects RAID. It might also be useful to apply this to
the 'stripe' segment type. LVM RAID may include RAID0 at some point in
the future and the implicit stripes would apply there. It would be odd
to have RAID0 be able to auto-determine the stripe count while 'stripe'
could not.
The only draw-back of this patch that I can see is that there might be
less error checking. Rather than informing the user that they forgot
to supply an argument (e.g. '-i'), the value would be computed and it
may differ from what the user actually wanted. I don't see this as a
problem, because the user can check the device count after creation
and remove the LV if they have made an error.
Introduce a new parameter called "approx_alloc" that is set when the
desired size of a new LV is specified in percentage terms. If set,
the allocation code tries to get as much space as it can but does not
fail if can at least get some.
One of the practical implications is that users can now specify 100%FREE
when creating RAID LVs, like this:
~> lvcreate --type raid5 -i 2 -l 100%FREE -n lv vg
Users now have the ability to convert their existing logical volumes
into cached logical volumes. A cache pool LV must be specified using
the '--cachepool' argument. The cachepool is the small, fast LV used
to cache the large, slow LV that is being converted.
This patch allows users to convert existing logical volumes into
cache pool LVs. Since cache pool LVs consist of data and metadata
sub-LVs, there is also the '--poolmetadata' (similar to thin_pool)
which allows for the specification of the metadata device.
This patch allows users to create cache LVs with 'lvcreate'. An origin
or a cache pool LV must be created first. Then, while supplying the
origin or cache pool to the lvcreate command, the cache can be created.
Ex1:
Here the cache pool is created first, followed by the origin which will
be cached.
~> lvcreate --type cache_pool -L 500M -n cachepool vg /dev/small_n_fast
~> lvcreate --type cache -L 1G -n lv vg/cachepool /dev/large_n_slow
Ex2:
Here the origin is created first, followed by the cache pool - allowing
a cache LV to be created covering the origin.
~> lvcreate -L 1G -n lv vg /dev/large_n_slow
~> lvcreate --type cache -L 500M -n cachepool vg/lv /dev/small_n_fast
The code determines which type of LV was supplied (cache pool or origin)
by checking its type. It ensures the right argument was given by ensuring
that the origin is larger than the cache pool.
If the user wants to remove just the cache for an LV. They specify
the LV's associated cache pool when removing:
~> lvremove vg/cachepool
If the user wishes to remove the origin, but leave the cachepool to be
used for another LV, they specify the cache LV.
~> lvremove vg/lv
In order to remove it all, specify both LVs.
This patch also includes tests to create and remove cache pools and
cache LVs.
This patch allows the creation and removal of cache pools. Users are not
yet able to create cache LVs. They are only able to define the space used
for the cache and its characteristics (chunk_size and cache mode ATM) by
creating the cache pool.
Avoid use of external origin with size unaligned/incompatible with
thin pool chunk size, since the last chunk is not correctly provisioned
when it is overwritten.
Avoid starting conversion of the LV to the thin pool and thin volume
at the same time. Since this is mostly a user mistake, do not try
to just convert to one of those type, since we cannot assume if the
user wanted LV to become thin volume or thin pool.
Before the fix tool reported pretty strange internal error:
Internal error: Referenced LV lvol1_tdata not listed in VG mvg.
Fixed output:
lvconvert --thinpool lvol0 -T mvg/lvol0
Can't use same LV mvg/lvol0 for thin pool and thin volume.
In preparation for other segment types that create and use "pools", we
s/create_thin_pool/create_pool/. This way it is not awkward when creating
a cachepool, for example, to use "create_thin_pool".
Several fields used to display 0 if undefined. Recent changes
to the way the fields are reported threw away some tests for
valid pointers, leading to segfaults with 'pvs -o all'.
Reinstate the original behaviour.
Boolean algebra changes for process_each_lv_in_vg().
1st.
Drop process_lv variable since it's not needed.
2nd.
process_lv was always initilized to 0 - so the condition was always true.
It the condition (!tags_supplied && !lvargs_supplied) evaluates as "true",
process_all is already set to 1, so skip vg tags evaluation.
3rd.
Move check for matching lv name in the front of lv tags check
since this check can't be skipped for lvargs_matched counter.
If this filter evaluates to true, skip lv tags evaluation.
Since activation takes only read-lock, there could be
multiple activation running in parallel.
So instead of checking before taking any real lock,
let the locking resolve the problem and just
detect if the reason for failure has been remote
exlusive activation.
It should be also faster, since each activation does
not need to do explicit lock query.
For merging thin snapshot we have to do couple extra
checks before we allow this operation.
We pretend thin-snapshot and thin-origin
are tied together and we have to properly
maintain locking.
At the end of lvconvert --snapshot with an active origin, the origin
gets reloaded.
Commit 57c0f72b1d ("lvconvert: use
_reload_lv on more places") accidentally replaced this with a snapshot
LV reload (which does nothing because only the origin is active).
Replacement of pv_read by find_pv_by_name in commit
651d5093ed caused spurious
error messages when running pvcreate or vgextend against an
unformatted device.
Physical volume /dev/loop4 not found
Physical volume "/dev/loop4" successfully created
Physical volume /dev/loop4 not found
Physical volume /dev/loop4 not found
Physical volume "/dev/loop4" successfully created
Volume group "vg1" successfully extended
Optimize and cleanup recently introduced new function wipe_lv.
Use compound literals to get nicely initialized wipe_params struct.
Pass in lv as explicit argument for wipe_lv.
Use cmd from lv structure.
Initialize only non-null members so it's easy to see what
is the special arg.
Drop find_merging_snapshot() function. Use find_snapshot()
called after check for lv_is_merging_origin() which
is the commonly used code path - so we avoid duplicated
tests and potential risk of derefering NULL point
in unhandled error path.
Use common wipe_lv (former set_lv) fn to do zeroing as well as signature
wiping if needed. Provide new struct wipe_lv_params to define the
functionality.
Bind "lvcreate -W/--wipesignatures y" with proper wipe_lv call.
Also, add "yes" and "force" to lvcreate_params so it's possible
to apply them for the prompt: "WARNING: %s detected on %s. Wipe it? [y/n]".
If the refresh fails for any reason before autoactivation, let's not
make this a stopper for autoactivation itself - just log the error
message if it appears.
The reason is that in some rare situations, we can still hit the
problem with the suspend call to fail (as already described in
commit d8085edf65, also
https://bugzilla.redhat.com/show_bug.cgi?id=1027314). The refresh
itself is done for only one reason - to refresh any dm tables
for LVs for which the underlying PVs got unplugged/disconnected
and then plugged/connected back (see also
https://bugzilla.redhat.com/show_bug.cgi?id=954061 for more info).
In this case, the major:minor pair is changed and we need to
update dm tables for LVs accordingly.
Now if refresh fails, the error is still logged, but autoactivation
continues.
If using lv/vgchange --sysinit -aay and lvmetad is enabled, we'd like to
avoid the direct activation and rely on autoactivation instead so
it fits system initialization scripts.
But if we're calling lv/vgchange --sysinit -aay too early when even
lvmetad service is not started yet, we just need to do the direct
activation instead without printing any error messages (while
trying to connect to lvmetad and not finding its socket).
This patch adds two helper functions - "lvmetad_socket_present" and
"lvmetad_used" which can be used to check for this condition properly
and avoid these lvmetad connections when the socket is not present
(and hence lvmetad is not yet running).
Whole struct will be set to 0, just
if the first member is array, gcc gives warning
we should initialized this element as array,
so pick any later simple type.
Revert 4777eb6872 which put
target_present check into init_snapshot_merge(). However
this function is also used when parsing metadata. So we would
get this present test performed even when target is not really
needed. So move this target_present test directly into lvconvert.
Add a PV create which takes a paramters object that
has get/set method to configure PV creation.
Current get/set operations include:
- size
- pvmetadatacopies
- pvmetadatasize
- data_alignment
- data_alignment_offset
- zero
Reference: https://bugzilla.redhat.com/show_bug.cgi?id=880395
Signed-off-by: Tony Asleson <tasleson@redhat.com>
Replace the code with the refactored vgreduce_single instead
of calling its own implementation.
Corrects bug: https://bugzilla.redhat.com/show_bug.cgi?id=989174
Signed-off-by: Tony Asleson <tasleson@redhat.com>
Moving the core functionality of vgreduce single into
lib/metadata/vg.c so that the command line and lvm2app library
can call the same core functionality. New function is
vgreduce_single.
Signed-off-by: Tony Asleson <tasleson@redhat.com>
Only reading a single PV works correctly only in very limited circumstances.
Moreover, we can't rely on the MDA available on the PV either, since it may be
out of date in some circumstances (until now, we believed that PVs that have an
empty MDA are always orphans, but this is not 100% reliable either).
There's a tiny race when suspending the device which is part
of the refresh because when suspend ioctl is performed, the
dm kernel driver executes (do_suspend and dm_suspend kernel fn):
step 1: a check whether the dev is already suspended and
if yes it returns success immediately as there's
nothing to do
step 2: it grabs the suspend lock
step 3: another check whether the dev is already suspended
and if found suspended, it exits with -EINVAL now
The race can occur in between step 1 and step 2. To prevent
premature autoactivation failure, we're using a simple retry
logic here before we fail completely. For a complete solution,
we need to fix the locking so there's no possibility for suspend
calls to interleave each other to cause this kind of race.
This is just a workaround. Remove it and replace it with proper
locking once we have that in!
Send error message on stdout, since after _display_info_long()
command return errors.
Patch makes consistent behavior for command:
dmsetup info -c non-existing-dev
&
dmsetup info non-existing-dev
Now both commands report error on stderr when they return error status
for non-existing device.
This is an addition to original patch for lvcreate - commit 039bdad.
The same principle applies to lvconvert where there are several steps
during which we need to wipe the existing LV that's being converted
to thin pool, making sure there's no other interference from outside (udev).
Remove conditional that boils down to "if yes or no, then do". The
previous condition in the statement is sufficient and the extra
(always true) condition is unnecessary.
Before, pvscan recognized either:
pvscan --cache --major <major> --minor <minor>
or
pvscan --cache <DevicePath>
When the device is gone and we need to notify lvmetad about device
removal, only --major/--minor works as we can't translate DevicePath
into major/minor pair anymore. The device does not exist in the system
and we don't keep DevicePath index in lvmetad cache to make the
translation internally into original major/minor pair. It would be
useless to keep this index just for this one exact case.
There's nothing bad about using "--major <major> --minor <minor>",
but it makes our life a bit harder when trying to make an
interconnection with systemd units, mainly with instantiated services
where only one and only one arg can be passed (which is encoded in the
service name).
This patch tries to make this easier by adding support for recognizing
the "<major>:<minor>" as a shortcut for the longer form
"--major <major> --minor <minor>". The rule here is simple: if the argument
starts with "/", it's a DevicePath, otherwise it's a <major>:<minor> pair.
Prohibit conversion of pool device with active thin volumes.
Properly restore active states only for active thin pool volume.
Use new LV_NOSCAN when converting volume into thin pool's metadata.
Patch 562ad293fd introduced code regression
when LV was converted to a thin LV with external origin and at the same time,
conversion of LV to a thin pool has been requested.
(RHBZ: #997704)
data_lv needs to be assigned after test for external conversion find pool.
Accept --ignoreskippedcluster with pvs, vgs, lvs, pvdisplay, vgdisplay,
lvdisplay, vgchange and lvchange to avoid the 'Skipping clustered
VG' errors when requesting information about a clustered VG
without using clustered locking and still exit with success.
The messages can still be seen with -v.
1) When converting from an x-way mirror/raid1 to a y-way mirror/raid1,
the default behaviour should be to stay the same segment type.
2) When converting from linear to mirror or raid1, the default behaviour
should honor the mirror_segtype_default.
3) When converting and the '--type' argument is specified, the '--type'
argument should be honored.
catch such conditions, but errors in the tests caused the issue to go
unnoticed. The code has been fixed to perform #2 properly, the tests
have been corrected to properly test for #2, and a few other tests
were changed to explicitly specify the '--type mirror' when necessary.
Add internal devtypes reporting command to display built-in recognised
block device types. (The output does not include any additional
types added by a configuration file.)
> lvm devtypes -o help
Device Types Fields
-------------------
devtype_all - All fields in this section.
devtype_name - Name of Device Type exactly as it appears in /proc/devices.
devtype_max_partitions - Maximum number of partitions. (How many device minor numbers get reserved for each device.)
devtype_description - Description of Device Type.
> lvm devtypes
DevType MaxParts Description
aoe 16 ATA over Ethernet
ataraid 16 ATA Raid
bcache 1 bcache block device cache
blkext 1 Extended device partitions
...
The traditional style used for optional editable definitions
/* #define X /* */
produces a bogus warning from gcc -Wall.
Rather than suppressing this with -Wno-comment, switch over to
the // comment style.
The same corner cases that exist for snapshots on mirrors exist for
any logical volume layered on top of mirror. (One example is when
a mirror image fails and a non-repair LVM command is the first to
detect it via label reading. In this case, the LVM command will hang
and prevent the necessary LVM repair command from running.) When
a better alternative exists, it makes no sense to allow a new target
to stack on mirrors as a new feature. Since, RAID is now capable of
running EX in a cluster and thin is not active-active aware, it makes
sense to pair these two rather than mirror+thinpool.
As further background, here are some additional comments that I made
when addressing a bug related to mirror+thinpool:
(https://bugzilla.redhat.com/show_bug.cgi?id=919604#c9)
I am going to disallow thin* on top of mirror logical volumes.
Users will have to use the "raid1" segment type if they want this.
This bug has come down to a choice between:
1) Disallowing thin-LVs from being used as PVs.
2) Disallowing thinpools on top of mirrors.
The problem is that the code in dev_manager.c:device_is_usable() is unable
to tell whether there is a mirror device lower in the stack from the device
being checked. Pretty much anything layered on top of a mirror will suffer
from this problem. (Snapshots are a good example of this; and option #1
above has been chosen to deal with them. This can also be seen in
dev_manager.c:device_is_usable().) When a mirror failure occurs, the
kernel blocks all I/O to it. If there is an LVM command that comes along
to do the repair (or a different operation that requires label reading), it
would normally avoid the mirror when it sees that it is blocked. However,
if there is a snapshot or a thin-LV that is on a mirror, the above code
will not detect the mirror underneath and will issue label reading I/O.
This causes the command to hang.
Choosing #1 would mean that thin-LVs could never be used as PVs - even if
they are stacked on something other than mirrors.
Choosing #2 means that thinpools can never be placed on mirrors. This is
probably better than we think, since it is preferred that people use the
"raid1" segment type in the first place. However, RAID* cannot currently
be used in a cluster volume group - even in EX-only mode. Thus, a complete
solution for option #2 must include the ability to activate RAID logical
volumes (and perform RAID operations) in a cluster volume group. I've
already begun working on this.
Creation, deletion, [de]activation, repair, conversion, scrubbing
and changing operations are all now available for RAID LVs in a
cluster - provided that they are activated exclusively.
The code has been changed to ensure that no LV or sub-LV activation
is attempted cluster-wide. This includes the often overlooked
operations of activating metadata areas for the brief time it takes
to clear them. Additionally, some 'resume_lv' operations were
replaced with 'activate_lv_excl_local' when sub-LVs were promoted
to top-level LVs for removal, clearing or extraction. This was
necessary because it forces the appropriate renaming actions the
occur via resume in the single-machine case, but won't happen in
a cluster due to the necessity of acquiring a lock first.
The *raid* tests have been updated to allow testing in a cluster.
For the most part, this meant creating devices with '-aey' if they
were to be converted to RAID. (RAID requires the converting LV to
be EX because it is a condition of activation for the RAID LV in
a cluster.)
Udev daemon has recently introduced a limit on the number of udev
processes (there was no limit before). This causes a problem
when calling pvscan --cache -aay in lvmetad udev rules which
is supposed to activate the volumes. This activation is itself
synced with udev and so it waits for the activation to complete
before the pvscan finishes. The event processing can't continue
until this pvscan call is finished.
But if we're at the limit with the udev process count, we can't
instatiate any more udev processes, all such events are queued
and so we can't process the lvm activation event for which the
pvscan is waiting.
Then we're in a deadlock since the udev process with the
pvscan --cache -aay call waits for the lvm activation udev
processing to complete, but that will never happen as there's
this limit hit with the number of udev processes.
The process with pvscan --cache -aay actually times out eventually
(3min or 30sec, depends on the version of udev).
This patch makes it possible to run the pvscan --cache -aay
in the background so the udev processing can continue and hence
we can avoid the deadlock mentioned above.
The commit 82d83a01ce
"autoactivation: refresh existing VG before autoactivation"
causes problems (dangling udev_sync cookies, slow processing
of the pvscan --cache --major --minor call from udev rules)
when the autoactivation handler is run in parallel on
several PVs that belong to the same VG. Revert this patch
until the exact source of the problem is found and then
properly fixed and handled.
The patch allows the user to also pvmove snapshots and origin logical
volumes. This means pvmove should be able to move all segment types.
I have, however, disallowed moving converting or merging logical volumes.
Top-level LVs (like RAID, mirror or thin) are ignored when determining which
portions of an LV to pvmove. If the user specified the name of an LV to
move and it was one of the above types, it would be skipped. The code would
never move on to check whether its sub-LVs needed moving because their names
did not match what the user specified.
The solution is to check whether a sub-LVs is part of the LV whose name was
specified by the user - not just if there was a name match.
This patch allows pvmove to operate on RAID, mirror and thin LVs.
The key component is the ability to avoid moving a RAID or mirror
sub-LV onto a PV that already has another RAID sub-LV on it.
(e.g. Avoid placing both images of a RAID1 LV on the same PV.)
Top-level LVs are processed to determine which PVs to avoid for
the sake of redundancy, while bottom-level LVs are processed
to determine which segments/extents to move.
This approach does have some drawbacks. By eliminating whole PVs
from the allocation list, we might miss the opportunity to perform
pvmove in some senarios. For example, if we have 3 devices and
a linear uses half of the first, a RAID1 uses half of the first and
half of the second, and a linear uses half of the third (FIGURE 1);
we should be able to pvmove the first device (FIGURE 2).
FIGURE 1:
[ linear ] [ -RAID- ] [ linear ]
[ -RAID- ] [ ] [ ]
FIGURE 2:
[ moved ] [ -RAID- ] [ linear ]
[ moved ] [ linear ] [ -RAID- ]
However, the approach we are using would eliminate the second
device from consideration and would leave us with too little space
for allocation. In these situations, the user does have the ability
to specify LVs and move them one at a time.
Recent kernels allow messages to respond with a string.
Add dm_task_get_message_response() to libdevmapper to perform some
basic sanity checks and return this.
Have 'dmsetup message' display any response.
DM statistics will make extensive use of this.
(From Mikulas.)
When autoactivating a VG, there could be an existing VG with exactly
the same PV UUIDs. The PVs could be reappeared after previous
loss/disconnect (for example disconnecting and reconnecting iscsi).
Since there's no "autodeactivation" yet, the mappings for the LVs
from the VG were left in the system even if the device was disconnected.
These mappings also hold the major:minor of the underlying device.
So if the device reappears, it is assigned a different major:minor
pair (...and kernel name). We need to cope with this during
autoactivation so any existing mappings are corrected for any changes.
The VG refresh does that (the vgchange --refresh functionality) -
call this before VG autoactivation.
(If the VG does not exist yet, the VG refresh is NOP)
Split out the partitioned device filter that needs to open the device
and move the multipath filter in front of it.
When a device is multipathed, sending I/O to the underlying paths may
cause problems, the most obvious being I/O errors visible to lvm if a
path is down.
Revert the incorrect <backtrace> messages added when a device doesn't
pass a filter.
Log each filter initialisation to show sequence.
Avoid duplicate 'Using $device' debug messages.
Commit ID 8615234c0f failed to include
the actual code changes that were made to fix the bug. Instead, all
tests went in to validate the bug fix. This patch adds the missing
code changes.
1) Since the min|maxrecoveryrate args are size_kb_ARGs and they
are recorded (and sent to the kernel) in terms of kB/sec/disk,
we must back out the factor multiple done by size_kb_arg. This
is already performed by 'lvcreate' for these arguments.
2) Allow all RAID types, not just RAID1, to change these values.
3) Add min|maxrecoveryrate_ARG to the list of 'update_partial_unsafe'
commands so that lvchange will not complain about needing at
least one of a certain set of arguments and failing.
4) Add tests that check that these values can be set via lvchange
and lvcreate and that 'lvs' reports back the proper results.
If there is no RAID support in the kernel but the default mirror
segtype is "raid1", converting legacy mirrors can be problematic.
For example, changing the log type or converting a mirror to a linear
LV does not require the RAID modules to be present. However, because
lp->segtype is set to be RAID1 by the configuration file, the command
fails.
We should only be setting lp->segtype when converting mirrors if it is
going to change (e.g. to linear or between mirror types).
When creating a new thin pool and there's no profile requested
via "lvcreate --profile ...", inherit any VG profile if it's attached.
Currently this applies to these settings:
allocation/thin_pool_chunk_size
allocation/thin_pool_discards
allocation/thin_pool_zero
Initial basic support for repair.
It currently takes pool metadata spare volume, which
is used for recovery. New spare is created if the volume
is successfuly repaired.
After the operation the previous _tmeta volume is moved
into _tmeta%d volume and if everything is ok, this volume
could be removed.
New _tmeta needs to be pvmoved to proper place and also
converted to i.e. mirror if it should be mirrored.
Later version will try to automate some steps here.
Suggest to use _tdata and _tmeta devices for that.
This fixes regression from too relaxed change in
f1d5f6ae81
Without this patch there are some empty LVs created before
mirror code recognizes it cannot continue.
(in release fix)
Three fixme's addressed in this commit:
1) lib/metadata/lv_manip.c:_calc_area_multiple() - this could be
safely changed to a comment explaining that currently because
RAID10 can only have a 2-way mirror, we don't need to know the
number of stripes. However, we will need to know that in the
future if RAID10 is to support more than 2-way mirroring.
2) lib/metadata/mirror.c:_delete_lv() - should have been calling
_activate_lv_like_model() with 'mirror_lv'. This is because
'mirror_lv' is the LV that the overall operation is being
performed on. We need to use this LV as the basis for
determining whether to activate locally, or across the
cluster, etc.
3) tools/lvcreate.c:_lvcreate_params() - Minor clean-up. If
'-m 0' is given, treat it as though the mirroring argument
was not given (i.e. as though the requested segment type
was 'stripe' and not mirror).
The --type mirror requires -m/--mirrrors:
lvconvert --type mirror vg/lvol0
--type mirror requires -m/--mirrors
Run `lvconvert --help' for more information.
The --type raid* is allowed (the checks already existed):
lvconvert --type raid10 vg/lvol0
Converting the segment type for vg/lvol0 from linear to raid10 is not yet supported.
The --type snapshot is a synonym to -s/--snapshot:
lvconvert -s vg/lvol0 vg/lvol1
Logical volume lvol1 converted to snapshot.
lvconvert --type snapshot vg/lvol0 vg/lvol1
Logical volume lvol1 converted to snapshot.
All the other segment types are not supported, e.g.:
lvconvert --type zero vg/lvol0
Conversion using --type zero is not supported.
Run `lvconvert --help' for more information.
Add --poolmetadataspare option and creates and handles
pool metadata spare lv when thin pool is created.
With default setting 'y' it tries to ensure, spare has
at least the size of created LV.
The lvchange has both -k/--setactivationskip and
-K/--ignoreactivationskip option available for use.
The vgchange has only -K/--ignoreactivationskip, but
not the -k/--setactivationskip as the ACTIVATION_SKIP
flag is an LV property, not a VG one and so we change it
only by using the lvchange...
Also add -k/--setactivationskip y/n and -K/--ignoreactivationskip
options to lvcreate.
The --setactivationskip y sets the flag in metadata for an LV to
skip the LV during activation. Also, the newly created LV is not
activated.
Thin snapsots have this flag set automatically if not specified
directly by the --setactivationskip y/n option.
The --ignoreactivationskip overrides the activation skip flag set
in metadata for an LV (just for the run of the command - the flag
is not changed in metadata!)
A few examples for the lvcreate with the new options:
(non-thin snap LV => skip flag not set in MDA + LV activated)
raw/~ $ lvcreate -l1 vg
Logical volume "lvol0" created
raw/~ $ lvs -o lv_name,attr vg/lvol0
LV Attr
lvol0 -wi-a----
(non-thin snap LV + -ky => skip flag set in MDA + LV not activated)
raw/~ $ lvcreate -l1 -ky vg
Logical volume "lvol1" created
raw/~ $ lvs -o lv_name,attr vg/lvol1
LV Attr
lvol1 -wi------
(non-thin snap LV + -ky + -K => skip flag set in MDA + LV activated)
raw/~ $ lvcreate -l1 -ky -K vg
Logical volume "lvol2" created
raw/~ $ lvs -o lv_name,attr vg/lvol2
LV Attr
lvol2 -wi-a----
(thin snap LV => skip flag set in MDA (default behaviour) + LV not activated)
raw/~ $ lvcreate -L100M -T vg/pool -V 1T -n thin_lv
Logical volume "thin_lv" created
raw/~ $ lvcreate -s vg/thin_lv -n thin_snap
Logical volume "thin_snap" created
raw/~ $ lvs -o name,attr vg
LV Attr
pool twi-a-tz-
thin_lv Vwi-a-tz-
thin_snap Vwi---tz-
(thin snap LV + -K => skip flag set in MDA (default behaviour) + LV activated)
raw/~ $ lvcreate -s vg/thin_lv -n thin_snap -K
Logical volume "thin_snap" created
raw/~ $ lvs -o name,attr vg/thin_lv
LV Attr
thin_lv Vwi-a-tz-
(thins snap LV + -kn => no skip flag in MDA (default behaviour overridden) + LV activated)
[0] raw/~ # lvcreate -s vg/thin_lv -n thin_snap -kn
Logical volume "thin_snap" created
[0] raw/~ # lvs -o name,attr vg/thin_snap
LV Attr
thin_snap Vwi-a-tz-