IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
This patch allows users to convert existing logical volumes into
cache pool LVs. Since cache pool LVs consist of data and metadata
sub-LVs, there is also the '--poolmetadata' (similar to thin_pool)
which allows for the specification of the metadata device.
This patch allows the creation and removal of cache pools. Users are not
yet able to create cache LVs. They are only able to define the space used
for the cache and its characteristics (chunk_size and cache mode ATM) by
creating the cache pool.
Accept --ignoreskippedcluster with pvs, vgs, lvs, pvdisplay, vgdisplay,
lvdisplay, vgchange and lvchange to avoid the 'Skipping clustered
VG' errors when requesting information about a clustered VG
without using clustered locking and still exit with success.
The messages can still be seen with -v.
Add internal devtypes reporting command to display built-in recognised
block device types. (The output does not include any additional
types added by a configuration file.)
> lvm devtypes -o help
Device Types Fields
-------------------
devtype_all - All fields in this section.
devtype_name - Name of Device Type exactly as it appears in /proc/devices.
devtype_max_partitions - Maximum number of partitions. (How many device minor numbers get reserved for each device.)
devtype_description - Description of Device Type.
> lvm devtypes
DevType MaxParts Description
aoe 16 ATA over Ethernet
ataraid 16 ATA Raid
bcache 1 bcache block device cache
blkext 1 Extended device partitions
...
Udev daemon has recently introduced a limit on the number of udev
processes (there was no limit before). This causes a problem
when calling pvscan --cache -aay in lvmetad udev rules which
is supposed to activate the volumes. This activation is itself
synced with udev and so it waits for the activation to complete
before the pvscan finishes. The event processing can't continue
until this pvscan call is finished.
But if we're at the limit with the udev process count, we can't
instatiate any more udev processes, all such events are queued
and so we can't process the lvm activation event for which the
pvscan is waiting.
Then we're in a deadlock since the udev process with the
pvscan --cache -aay call waits for the lvm activation udev
processing to complete, but that will never happen as there's
this limit hit with the number of udev processes.
The process with pvscan --cache -aay actually times out eventually
(3min or 30sec, depends on the version of udev).
This patch makes it possible to run the pvscan --cache -aay
in the background so the udev processing can continue and hence
we can avoid the deadlock mentioned above.
Add --poolmetadataspare option and creates and handles
pool metadata spare lv when thin pool is created.
With default setting 'y' it tries to ensure, spare has
at least the size of created LV.
The lvchange has both -k/--setactivationskip and
-K/--ignoreactivationskip option available for use.
The vgchange has only -K/--ignoreactivationskip, but
not the -k/--setactivationskip as the ACTIVATION_SKIP
flag is an LV property, not a VG one and so we change it
only by using the lvchange...
Also add -k/--setactivationskip y/n and -K/--ignoreactivationskip
options to lvcreate.
The --setactivationskip y sets the flag in metadata for an LV to
skip the LV during activation. Also, the newly created LV is not
activated.
Thin snapsots have this flag set automatically if not specified
directly by the --setactivationskip y/n option.
The --ignoreactivationskip overrides the activation skip flag set
in metadata for an LV (just for the run of the command - the flag
is not changed in metadata!)
A few examples for the lvcreate with the new options:
(non-thin snap LV => skip flag not set in MDA + LV activated)
raw/~ $ lvcreate -l1 vg
Logical volume "lvol0" created
raw/~ $ lvs -o lv_name,attr vg/lvol0
LV Attr
lvol0 -wi-a----
(non-thin snap LV + -ky => skip flag set in MDA + LV not activated)
raw/~ $ lvcreate -l1 -ky vg
Logical volume "lvol1" created
raw/~ $ lvs -o lv_name,attr vg/lvol1
LV Attr
lvol1 -wi------
(non-thin snap LV + -ky + -K => skip flag set in MDA + LV activated)
raw/~ $ lvcreate -l1 -ky -K vg
Logical volume "lvol2" created
raw/~ $ lvs -o lv_name,attr vg/lvol2
LV Attr
lvol2 -wi-a----
(thin snap LV => skip flag set in MDA (default behaviour) + LV not activated)
raw/~ $ lvcreate -L100M -T vg/pool -V 1T -n thin_lv
Logical volume "thin_lv" created
raw/~ $ lvcreate -s vg/thin_lv -n thin_snap
Logical volume "thin_snap" created
raw/~ $ lvs -o name,attr vg
LV Attr
pool twi-a-tz-
thin_lv Vwi-a-tz-
thin_snap Vwi---tz-
(thin snap LV + -K => skip flag set in MDA (default behaviour) + LV activated)
raw/~ $ lvcreate -s vg/thin_lv -n thin_snap -K
Logical volume "thin_snap" created
raw/~ $ lvs -o name,attr vg/thin_lv
LV Attr
thin_lv Vwi-a-tz-
(thins snap LV + -kn => no skip flag in MDA (default behaviour overridden) + LV activated)
[0] raw/~ # lvcreate -s vg/thin_lv -n thin_snap -kn
Logical volume "thin_snap" created
[0] raw/~ # lvs -o name,attr vg/thin_snap
LV Attr
thin_snap Vwi-a-tz-
Normally, the lvm dumpconfig processes only the configuration tree
that is at the top of the cascade. Considering the cascade is:
CONFIG_STRING -> CONFIG_PROFILE -> CONFIG_MERGED_FILES/CONFIG_FILE
...then:
(dumpconfig of lvm.conf only)
raw/~ $ lvm dumpconfig allocation
allocation {
maximise_cling=1
mirror_logs_require_separate_pvs=0
thin_pool_metadata_require_separate_pvs=0
thin_pool_chunk_size=64
}
(dumpconfig of selected profile configuration only)
raw/~ $ lvm dumpconfig --profile test allocation
allocation {
thin_pool_chunk_size=8
thin_pool_discards="passdown"
thin_pool_zero=1
}
(dumpconfig of given --config configuration only)
raw/~ $ lvm dumpconfig --config 'allocation{thin_pool_chunk_size=16}' allocation
allocation {
thin_pool_chunk_size=16
}
The --mergedconfig option causes the configuration cascade to be
merged before processing it with dumpconfig:
(dumpconfig of merged selected profile and lvm.conf)
raw/~ $ lvm dumpconfig --profile test allocation --mergedconfig
allocation {
maximise_cling=1
thin_pool_zero=1
thin_pool_discards="passdown"
mirror_logs_require_separate_pvs=0
thin_pool_metadata_require_separate_pvs=0
thin_pool_chunk_size=8
}
(dumpconfig merged given --config and selected profile and lvm.conf)
raw/~ $ lvm dumpconfig --profile test --config 'allocation{thin_pool_chunk_size=16}' allocation --mergedconfig
allocation {
maximise_cling=1
thin_pool_zero=1
thin_pool_discards="passdown"
mirror_logs_require_separate_pvs=0
thin_pool_metadata_require_separate_pvs=0
thin_pool_chunk_size=16
}
Hence with the --mergedconfig, we are able to see the
configuration that is actually used when processing any
LVM command while using any combination of --config/--profile
options together with lvm.conf file.
The command to change the profile for existing VG/LV:
"vgchange/lvchange --profile <profile_name>"
The command to detach any existing profile from VG/LV:
"vgchange/lvchange --detachprofile"
Add support for lvresize of thin pool metadata device.
lvresize --poolmetadatasize +20 vgname/thinpool_lv
or
lvresize -L +20 vgname/thinpool_lv_tmeta
Where the second one allows all the args for resize (striping...)
and the first option resizes accoding to the last metadata lv segment.
This patch adds the ability to set the minimum and maximum I/O rate for
sync operations in RAID LVs. The options are available for 'lvcreate' and
'lvchange' and are as follows:
--minrecoveryrate <Rate> [bBsSkKmMgG]
--maxrecoveryrate <Rate> [bBsSkKmMgG]
The rate is specified in size/sec/device. If a suffix is not given,
kiB/sec/device is assumed. Setting the rate to 0 removes the preference.
Accept --yes on all commands, even ones that don't today have prompts,
so that test scripts that don't care about interactive prompts no
longer need to deal with them.
But continue to mention --yes only in the command prototypes that
actually use it.
'lvchange' is used to alter a RAID 1 logical volume's write-mostly and
write-behind characteristics. The '--writemostly' parameter takes a
PV as an argument with an optional trailing character to specify whether
to set ('y'), unset ('n'), or toggle ('t') the value. If no trailing
character is given, it will set the flag.
Synopsis:
lvchange [--writemostly <PV>:{t|y|n}] [--writebehind <count>] vg/lv
Example:
lvchange --writemostly /dev/sdb1:y --writebehind 512 vg/raid1_lv
The last character in the 'lv_attr' field is used to show whether a device
has the WriteMostly flag set. It is signified with a 'w'. If the device
has failed, the 'p'artial flag has priority.
Example ("nosync" raid1 with mismatch_cnt and writemostly):
[~]# lvs -a --segment vg
LV VG Attr #Str Type SSize
raid1 vg Rwi---r-m 2 raid1 500.00m
[raid1_rimage_0] vg Iwi---r-- 1 linear 500.00m
[raid1_rimage_1] vg Iwi---r-w 1 linear 500.00m
[raid1_rmeta_0] vg ewi---r-- 1 linear 4.00m
[raid1_rmeta_1] vg ewi---r-- 1 linear 4.00m
Example (raid1 with mismatch_cnt, writemostly - but failed drive):
[~]# lvs -a --segment vg
LV VG Attr #Str Type SSize
raid1 vg rwi---r-p 2 raid1 500.00m
[raid1_rimage_0] vg Iwi---r-- 1 linear 500.00m
[raid1_rimage_1] vg Iwi---r-p 1 linear 500.00m
[raid1_rmeta_0] vg ewi---r-- 1 linear 4.00m
[raid1_rmeta_1] vg ewi---r-p 1 linear 4.00m
A new reportable field has been added for writebehind as well. If
write-behind has not been set or the LV is not RAID1, the field will
be blank.
Example (writebehind is set):
[~]# lvs -a -o name,attr,writebehind vg
LV Attr WBehind
lv rwi-a-r-- 512
[lv_rimage_0] iwi-aor-w
[lv_rimage_1] iwi-aor--
[lv_rmeta_0] ewi-aor--
[lv_rmeta_1] ewi-aor--
Example (writebehind is not set):
[~]# lvs -a -o name,attr,writebehind vg
LV Attr WBehind
lv rwi-a-r--
[lv_rimage_0] iwi-aor-w
[lv_rimage_1] iwi-aor--
[lv_rmeta_0] ewi-aor--
[lv_rmeta_1] ewi-aor--
New options to 'lvchange' allow users to scrub their RAID LVs.
Synopsis:
lvchange --syncaction {check|repair} vg/raid_lv
RAID scrubbing is the process of reading all the data and parity blocks in
an array and checking to see whether they are coherent. 'lvchange' can
now initaite the two scrubbing operations: "check" and "repair". "check"
will go over the array and recored the number of discrepancies but not
repair them. "repair" will correct the discrepancies as it finds them.
'lvchange --syncaction repair vg/raid_lv' is not to be confused with
'lvconvert --repair vg/raid_lv'. The former initiates a background
synchronization operation on the array, while the latter is designed to
repair/replace failed devices in a mirror or RAID logical volume.
Additional reporting has been added for 'lvs' to support the new
operations. Two new printable fields (which are not printed by
default) have been added: "syncaction" and "mismatches". These
can be accessed using the '-o' option to 'lvs', like:
lvs -o +syncaction,mismatches vg/lv
"syncaction" will print the current synchronization operation that the
RAID volume is performing. It can be one of the following:
- idle: All sync operations complete (doing nothing)
- resync: Initializing an array or recovering after a machine failure
- recover: Replacing a device in the array
- check: Looking for array inconsistencies
- repair: Looking for and repairing inconsistencies
The "mismatches" field with print the number of descrepancies found during
a check or repair operation.
The 'Cpy%Sync' field already available to 'lvs' will print the progress
of any of the above syncactions, including check and repair.
Finally, the lv_attr field has changed to accomadate the scrubbing operations
as well. The role of the 'p'artial character in the lv_attr report field
as expanded. "Partial" is really an indicator for the health of a
logical volume and it makes sense to extend this include other health
indicators as well, specifically:
'm'ismatches: Indicates that there are discrepancies in a RAID
LV. This character is shown after a scrubbing
operation has detected that portions of the RAID
are not coherent.
'r'efresh : Indicates that a device in a RAID array has suffered
a failure and the kernel regards it as failed -
even though LVM can read the device label and
considers the device to be ok. The LV should be
'r'efreshed to notify the kernel that the device is
now available, or the device should be 'r'eplaced
if it is suspected of failing.
lvm dumpconfig [--ignoreadvanced] [--ignoreunsupported]
--ignoreadvanced causes the advanced configuration options to be left
out on dumpconfig output
--ignoreunsupported causes the options that are not officially supported
to be lef out on dumpconfig output
lvm dumpconfig [--withcomments] [--withversions]
The --withcomments causes the comments to appear on output before each
config node (if they were defined in config_settings.h).
The --withversions causes a one line extra comment to appear on output
before each config node with the version information in which the
configuration setting first appeared.
lvm dumpconfig [--type {current|default|missing|new}] [--atversion] [--validate]
This patch adds above-mentioned args to lvm dumpconfig and it maps them
to creation and writing out a configuration tree of a specific type
(see also previous commit):
- current maps to CFG_TYPE_CURRENT
- default maps to CFG_TYPE_DEFAULT
- missing maps to CFG_TYPE_MISSING
- new maps to CFG_TYPE_NEW
If --type is not defined, dumpconfig defaults to "--type current"
which is the original behaviour of dumpconfig before all these changes.
The --validate option just validates current configuration tree
(lvm.conf/--config) and it writes a simple status message:
"LVM configuration valid" or "LVM configuration invalid"
To create an Embedding Area during PV creation (pvcreate or as part of
the vgconvert operation), we need to define the Embedding Area size.
The Embedding Area start will be calculated automatically by the tools.
This patch adds --embeddingareasize argument to pvcreate and vgconvert.
Add basic support for converting LV into an external origin volume.
Syntax:
lvconvert --thinpool vg/pool --originname renamed_origin -T origin
It will convert volume 'origin' into a thin volume, which will
use 'renamed_origin' as an external read-only origin.
All read/write into origin will go via 'pool'.
renamed_origin volume is read-only volume, that could be activated
only in read-only mode, and cannot be modified.
Allow restoring metadata with thin pool volumes.
No validation is done for this case within vgcfgrestore tool -
thus incorrect metadata may lead to destruction of pool content.
Update code for lvconvert.
Change the lvconvert user interface a bit - now we require 2 specifiers
--thinpool takes LV name for data device (and makes the name)
--poolmetadata takes LV name for metadata device.
Fix type in thin help text -z -> -Z.
Supported is also new flag --discards for thinpools.
Update lvchange to allow change of 'zero' flag for thinpool.
Add support for changing discard handling.
N.B. from/to ignore could be only changed for inactive pool.
Add arg support for discard.
Add discard ignore, nopassdown, passdown (=default) support.
Flags could be set per pool.
lvcreate [--discard {ignore|no_passdown|passdown}] vg/thinlv
One can use "lvcreate --aay" to have the newly created volume
activated or not activated based on the activation/auto_activation_volume_list
this way.
Note: -Z/--zero is not compatible with -aay, zeroing is not used in this case!
When using lvcreate -aay, a default warning message is also issued that zeroing
is not done.
Define auto_activation_handler that activates VGs/LVs automatically
based on the activation/auto_activation_volume_list (activating all
volumes by default if the list is not defined).
The autoactivation is done within the pvscan call in 69-dm-lvmetad.rules
that watches for udev events (device appearance/removal).
For now, this works for non-clustered and complete VGs only.
Normally, the 'vgchange -ay' activates all volume groups (that pass
the activation/volume_list filter if set).
This call can appear in two scenarios:
- system boot (so activation within a script in general)
- manual call on command line (so activaton on user's direct request)
For the former one, we would like to select which VGs should be actually
activated. One can define the list of VGs directly to do that. But that
would require the same list to be provided in all the scripts.
The 'vgchange -aay' will check for the activation/auto_activation_volume_list
in adition and it will activate only those VGs/LVs that pass this
filter (assuming all to be activated if the list is not defined - the
same logic we already have for activation/volume_list).
Init/boot scripts should use this form of activation primarily
(which, anyway, becomes only a fallback now with autoactivation done
on PV appearance in tandem with lvmetad in place).
We're refererring to 'activation' all over the code and we're talking
about 'LVs being activated' all the time so let's use 'activation/activate'
everywhere for clarity and consistency (still providing the old
'available' keyword as a synonym for backward compatibility with
existing environments).
Support has many limitations and lots of FIXMEs inside,
however it makes initial task when user creates a separate LV for
thin pool data and thin metadata already usable, so let's enable
it for testing.
Easiest API:
lvconvert --chunksize XX --thinpool data_lv metadata_lv
More functionality extensions will follow up.
TODO: Code needs some rework since a lot of same code is getting copied.
Calling vgscan alone should reuse information from the lvmetad (if running).
The --cache option should initiate direct device scan and update lvmetad
appropriately (if running).
This is mainly for vgscan to behave consistently compared to pvscan.
Hold global lock in pvscan --lvmetad. (This might need refinement.)
Add PV name to "PV gone" messages.
Adjust some log message severities. (More changes needed.)
RAID is not like traditional LVM mirroring. LVM mirroring required failed
devices to be removed or the logical volume would simply hang. RAID arrays can
keep on running with failed devices. In fact, for RAID types other than RAID1,
removing a device would mean substituting an error target or converting to a
lower level RAID (e.g. RAID6 -> RAID5, or RAID4/5 to RAID0). Therefore, rather
than removing a failed device unconditionally and potentially allocating a
replacement, RAID allows the user to "replace" a device with a new one. This
approach is a 1-step solution vs the current 2-step solution.
example> lvconvert --replace <dev_to_remove> vg/lv [possible_replacement_PVs]
'--replace' can be specified more than once.
example> lvconvert --replace /dev/sdb1 --replace /dev/sdc1 vg/lv
The '--merge' option to lvconvert works on snapshots and RAID1. The man
pages correctly reflect this, but the CLI help output still used the term,
'SnapshotLogicalVolume'.
Example:
~> lvconvert --type raid1 -m 1 vg/lv
The following steps are performed to convert linear to RAID1:
1) Allocate a metadata device from the same PV as the linear device
to provide the metadata/data LV pair required for all RAID components.
2) Allocate the required number of metadata/data LV pairs for the
remaining additional images.
3) Clear the metadata LVs. This performs a LVM metadata update.
4) Create the top-level RAID LV and add the component devices.
We want to make any failure easy to unwind. This is why we don't create the
top-level LV and add the components until the last step. Should anything
happen before that, the user could simply remove the unnecessary images. Also,
we want to ensure that the metadata LVs are cleared before forming the array to
prevent stale information from polluting the new array.
A new macro 'seg_is_linear' was added to allow us to distinguish linear LVs
from striped LVs.
This patch allows a mirror to be extended without an initial resync of the
extended portion. It compliments the existing '--nosync' option to lvcreate.
This action can be done implicitly if the mirror was created with the '--nosync'
option, or explicitly if the '--nosync' option is used when extending the device.
Here are the operational criteria:
1) A mirror created with '--nosync' should extend with 'nosync' implicitly
[EXAMPLE]# lvs vg; lvextend -L +5G vg/lv ; lvs vg
LV VG Attr LSize Pool Origin Snap% Move Log Copy% Convert
lv vg Mwi-a-m- 5.00g lv_mlog 100.00
Extending 2 mirror images.
Extending logical volume lv to 10.00 GiB
Logical volume lv successfully resized
LV VG Attr LSize Pool Origin Snap% Move Log Copy% Convert
lv vg Mwi-a-m- 10.00g lv_mlog 100.00
2) The 'M' attribute ('M' signifies a mirror created with '--nosync', while 'm'
signifies a mirror created w/o '--nosync') must be preserved when extending a
mirror created with '--nosync'. See #1 for example of 'M' attribute.
3) A mirror created without '--nosync' should extend with 'nosync' only when
'--nosync' is explicitly used when extending.
[EXAMPLE]# lvs vg; lvextend -L +5G vg/lv; lvs vg
LV VG Attr LSize Pool Origin Snap% Move Log Copy% Convert
lv vg mwi-a-m- 20.00m lv_mlog 100.00
Extending 2 mirror images.
Extending logical volume lv to 5.02 GiB
Logical volume lv successfully resized
LV VG Attr LSize Pool Origin Snap% Move Log Copy% Convert
lv vg mwi-a-m- 5.02g lv_mlog 0.39
vs.
[EXAMPLE]# lvs vg; lvextend -L +5G vg/lv --nosync; lvs vg
LV VG Attr LSize Pool Origin Snap% Move Log Copy% Convert
lv vg mwi-a-m- 20.00m lv_mlog 100.00
Extending 2 mirror images.
Extending logical volume lv to 5.02 GiB
Logical volume lv successfully resized
LV VG Attr LSize Pool Origin Snap% Move Log Copy% Convert
lv vg Mwi-a-m- 5.02g lv_mlog 100.00
4) The 'm' attribute must change to 'M' when extending a mirror created without
'--nosync' is extended with the '--nosync' option. (See #3 examples above.)
5) An inactive mirror's sync percent cannot be determined definitively, so it
must not be allowed to skip resync. Instead, the extend should ask the user if
they want to extend while performing a resync.
[EXAMPLE]# lvchange -an vg/lv
[EXAMPLE]# lvextend -L +5G vg/lv
Extending 2 mirror images.
Extending logical volume lv to 10.00 GiB
vg/lv is not active. Unable to get sync percent.
Do full resync of extended portion of vg/lv? [y/n]: y
Logical volume lv successfully resized
6) A mirror that is performing recovery (as opposed to an initial sync) - like
after a failure - is not allowed to extend with either an implicit or
explicit nosync option. [You can simulate this with a 'corelog' mirror because
when it is reactivated, it must be recovered every time.]
[EXAMPLE]# lvcreate -m1 -L 5G -n lv vg --nosync --corelog
WARNING: New mirror won't be synchronised. Don't read what you didn't write!
Logical volume "lv" created
[EXAMPLE]# lvs vg
LV VG Attr LSize Pool Origin Snap% Move Log Copy% Convert
lv vg Mwi-a-m- 5.00g 100.00
[EXAMPLE]# lvchange -an vg/lv; lvchange -ay vg/lv; lvs vg
LV VG Attr LSize Pool Origin Snap% Move Log Copy% Convert
lv vg Mwi-a-m- 5.00g 0.08
[EXAMPLE]# lvextend -L +5G vg/lv
Extending 2 mirror images.
Extending logical volume lv to 10.00 GiB
vg/lv cannot be extended while it is recovering.
7) If 'no' is selected in #5 or if the condition in #6 is hit, it should not
result in the mirror being resized or the 'm/M' attribute being changed.
NOTE: A mirror created with '--nosync' behaves differently than one created
without it when performing an extension. The former cannot be extended when
the mirror is recovering (unless in-active), while the latter can. This is
a reasonable thing to do since recovery of a mirror doesn't take long (at
least in the case of an on-disk log) and it would cause far more time in
degraded mode if the extension w/o '--nosync' was allowed. It might be
reasonable to add the ability to force the operation in the future. This
should /not/ force a nosync extension, but rather force a sync'ed extension.
IOW, the user would be saying, "Yes, yes... I know recovery won't take long
and that I'll be adding significantly to the time spent in degraded mode, but
I need the extra space right now!".
~> lvconvert --splitmirrors 1 --trackchanges vg/lv
The '--trackchanges' option allows a user the ability to use an image of
a RAID1 array for the purposes of temporary read-only access. The image
can be merged back into the array at a later time and only the blocks that
have changed in the array since the split will be resync'ed. This
operation can be thought of as a partial split. The image is never completely
extracted from the array, in that the array reserves the position the device
occupied and tracks the differences between the array and the split image via
a bitmap. The image itself is rendered read-only and the name (<LV>_rimage_*)
cannot be changed. The user can complete the split (permanently splitting the
image from the array) by re-issuing the 'lvconvert' command without the
'--trackchanges' argument and specifying the '--name' argument.
~> lvconvert --splitmirrors 1 --name my_split vg/lv
Merging the tracked image back into the array is done with the '--merge'
option (included in a follow-on patch).
~> lvconvert --merge vg/lv_rimage_<n>
The internal mechanics of this are relatively simple. The 'raid' device-
mapper target allows for the specification of an empty slot in an array
via '- -'. This is what will be used if a partial activation of an array
is ever required. (It would also be possible to use 'error' targets in
place of the '- -'.) If a RAID image is found to be both read-only and
visible, then it is considered separate from the array and '- -' is used
to hold it's position in the array. So, all that needs to be done to
temporarily split an image from the array /and/ cause the kernel target's
bitmap to track (aka "mark") changes made is to make the specified image
visible and read-only. To merge the device back into the array, the image
needs to be returned to the read/write state of the top-level LV and made
invisible.
to lvm.conf in the activation section: 'snapshot_autoextend_threshold' and
'snapshot_autoextend_percent', that define how to handle automatic snapshot
extension. The former defines when the snapshot should be extended: when its
space usage exceeds this many percent. The latter defines how much extra space
should be allocated for the snapshot, in percent of its current size.
re-add a physical volume that has gone missing previously, due to a transient
device failure, without re-initialising it.
Signed-off-by: Petr Rockai <prockai@redhat.com>
Reviewed-by: Alasdair Kergon <agk@redhat.com>
Introduce --norestorefile to allow user to override the new requirement.
This can also be overridden with "devices/require_restorefile_with_uuid"
in lvm.conf -- however the default is 1.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Allow metadataignore flag to be passed in to pvcreate.
Ideally, more refactoring of the mda allocation / initialization
is warranted, but for now, we just add another parameter to 'add_mda'
to take an existing mda ignored flag. We need to do this or pv_write
loses the state of the mda 'ignored' flag before copying and writing
to disk.
Allow parsing of --vgmetadatacopies for vgcreate. Accept
--metadatacopies as a synonym for --vgmetadatacopies.
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Update logic in vgchange to handle --vgmetadatacopies, allow
--metadatacopies as a synonym to --vgmetadatacopies,
and add these parameters to args.h and commands.h
Forbit both --vgmetadatacopies and --metadatacopies as only
one allowed.
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
This patch just modifies pvchange to call the underlying ignore
functions for mdas. Ensure special cases do not reflect changes
in metadata (PVs with 0 mdas, setting ignored when already ignored,
clearing ignored when not ignored).
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
A shortcut for --ignorelockingfailure, --ignoremonitoring, --poll n options
and LVM_SUPPRESS_LOCKING_FAILURE_MESSAGES environment variable used all at
once in initialisation scripts (e.g. rc.sysinit or initrd).
This check-in enables the 'mirrored' log type. It can be specified
by using the '--mirrorlog' option as follows:
#> lvcreate -m1 --mirrorlog mirrored -L 5G -n lv vg
I've also included a couple updates to the testsuite. These updates
include tests for the new log type, and some fixes to some of the
*lvconvert* tests.
. Add "monitoring" option to "activation" section of lvm.conf
. Have clvmd consult the lvm.conf "activation/monitoring" too.
. Introduce toollib.c:get_activation_monitoring_mode().
. Error out when both --monitor and --ignoremonitoring are provided.
. Add --monitor and --ignoremonitoring support to lvcreate. Update
lvcreate man page accordingly.
. Clarify that '--monitor' controls the start and stop of monitoring in
the {vg,lv}change man pages.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Allow the number of logical extents to be expressed (for a snapshot) as
a percentage of the total space in the Origin Logical Volume with the
suffix %ORIGIN.
Update the relevant man pages accordingly. Eliminate inconsistencies
between the man pages and tools/commands.h
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
It is pretty much the same as reducing the number of
mirror legs, but we just don't delete them afterwards.
The following command line interface is enforced:
prompt> lvconvert --splitmirror <n> -n <name> <VG>/<LV>
where 'n' is the number of images to split off, and
where 'name' is the name of the newly split off logical volume.
If more than one leg is split off, a new mirror will be the
result. The newly split off mirror will have a 'core' log.
Example:
[root@bp-01 LVM2]# !lvs
lvs -a -o name,copy_percent,devices
LV Copy% Devices
lv 100.00 lv_mimage_0(0),lv_mimage_1(0),lv_mimage_2(0),lv_mimage_3(0)
[lv_mimage_0] /dev/sdb1(0)
[lv_mimage_1] /dev/sdc1(0)
[lv_mimage_2] /dev/sdd1(0)
[lv_mimage_3] /dev/sde1(0)
[lv_mlog] /dev/sdi1(0)
[root@bp-01 LVM2]# lvconvert --splitmirrors 2 --name split vg/lv /dev/sd[ce]1
Logical volume lv converted.
[root@bp-01 LVM2]# !lvs
lvs -a -o name,copy_percent,devices
LV Copy% Devices
lv 100.00 lv_mimage_0(0),lv_mimage_2(0)
[lv_mimage_0] /dev/sdb1(0)
[lv_mimage_2] /dev/sdd1(0)
[lv_mlog] /dev/sdi1(0)
split 100.00 split_mimage_0(0),split_mimage_1(0)
[split_mimage_0] /dev/sde1(0)
[split_mimage_1] /dev/sdc1(0)
It can be seen that '--splitmirror <n>' is exactly the same
as '--mirrors -<n>' (note the minus sign), except there is the
additional notion to keep the image being detached from the
mirror instead of just throwing it away.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
the background polldaemon is allowed to start. It can be used
standalone or in conjunction with --refresh or --available y.
Control over when the background polldaemon starts will be particularly
important for snapshot-merge of a root filesystem.
Dracut will be updated to activate all LVs with: --poll n
The lvm2-monitor initscript will start polling with: --poll y
NOTE: Because we currently have no way of knowing if a background
polldaemon is active for a given LV the following limitations exist and
have been deemed acceptable:
1) it is not possible to stop an active polldaemon; so the lvm2-monitor
initscript doesn't stop running polldaemon(s)
2) redundant polldaemon instances will be started for all specified LVs
if vgchange or lvchange are repeatedly used with '--poll y'
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Option --all is only partially documented currently, so document in all
commands. Also make {pv|vg|lv}{display|s} man pages consistent with help
output. Remove ununsed 'disk_ARG' parameter. Leave --trustcache out of
the man page output. Update --units argument to show all possible units.
Going forward, we would like to allow users to specify the total
number of metadatacopies in a VG rather than on a per-PV basis. In
order to facilitate that, introduce --pvmetadatacopes to replace
--metadatacopies everywhere. We still allow --metadatacopies for
pv commands, but require --pvmetadatacopies for vg commands.
Eventually we will introduce --vgmetadatacopies. Once we do that,
we should either deprecate --metadatacopies or make it a synonym based
on the command (pvmetadatacopies for pv commands, and vgmetadatacopies
for vg commands). The latter option would likely just require a simple
'strncpy' check against cmd->command->name to qualify the merge_synonym
call.
Update nightly tests to cover the pvmetadatacopies synonym.
Note that this patch is the result of an eariler review comment for
the implicit pvcreate patches. Should apply cleanly on top of the
implicit pvcreate patches (I applied after patch 10/10 in that series).
NOTE: This patch will require --pvmetadatacopies for vgconvert as
--metadatacopies is no longer accepted.
Adds implicit pvcreate support when calling vgcreate or vgextend with
device paths that are not yet PVs. This changes the behavior of vgcreate
and vgextend from failing with an error message to implicitly pvcreating.
Adds pe_align_offset to 'struct physical_volume'; is initialized with
set_pe_align_offset(). After pe_start is established pe_align_offset is
added to it.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Add lvs origin_size field.
Fix linux configure --enable-debug to exclude -O2.
Still a few rough edges, but hopefully usable now:
lvcreate -s vg1 -L 100M --virtualoriginsize 1T
Currently PV commands, which performs full device scan, repeatly
re-reads PVs and scans for all devices.
This behaviour can lead to OOM for large VG.
This patch allows using internal metadata cache for pvs & pvdisplay,
so the commands scan the PVs only once.
(We have to use VG_GLOBAL otherwise cache is invalidated on every
VG unlock in process_single PV call.)
This patch is not fully tested and leaves some related bugs unfixed.
Intended behaviour of the code now:
pe_start in the lvm2 format PV label header is set only by pvcreate (or
vgconvert -M2) and then preserved in *all* operations thereafter.
In some specialist cases, after the PV is added to a VG, the pe_start
field in the VG metadata may hold a different value and if so, it
overrides the other one for as long as the PV is in such a VG.
Currently, the field storing the size of the data area in the PV label
header always holds 0. As it only has meaning in the context of a
volume group, it is calculated whenever the PV is added to a VG (and can
be derived from extent_size and pe_count in the VG metadata).
log type. Previously, we had a '--corelog' argument that
would change the default type from 'disk' to 'core'. I
think that creates too much confusion - especially when
doing conversions on mirrors.
The new argument '--log' takes either "disk" or "core"
as a parameter. This could be expanded in the future
for additional logging types as well.
Examples:
# Creating a 2-way mirror
$> lvcreate -m1 ... # implicitly use default disk logging
$> lvcreate -m1 --log disk ... # explicit disk logging
$> lvcreate -m1 --log core ... # specify core logging
$> lvcreate -m1 --corelog ... # old way still works
# Conversion examples
$> lvconvert --log core ... # convert to core logging
$> lvconvert --log disk ... # convert to disk logging
$> lvconvert -mX --corelog ... # old way still works
$> lvconvert -mX ... # old way of converting to disk logging still works
Changes are reflected in the man pages.
e.g. lvcreate -l 100%FREE to create an LV using all available space.
lvextend -l 50%LV to increase an LV by 50% of its existing size.
lvcreate -l 20%VG to create an LV using 20% of the total VG size.
event-driven model. Without changes to the way the cache gets updated, the
option is currently unreliable without a global lock to prevent any lvm2
commands from running concurrently.
Fix some memory leaks in error paths found by coverity.
Use C99 struct initialisers.
Move DEFS into configure.h.
Clean-ups to remove miscellaneous compiler warnings.