IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
The PREFERRED allocation mechanism requires the number of areas in the
previous LV segment to match the number in the new segment being
allocated. If they do not match, the code may crash.
E.g. https://bugzilla.redhat.com/989347
Introduce A_AREA_COUNT_MATCHES and when not set avoid referring
to the previous segment with the contiguous and cling policies.
When using a global_filter and if this filter is incorrectly
specified, we ended up with a segfault:
raw/~ $ pvs
Invalid filter pattern "r|/dev/sda".
Segmentation fault (core dumped)
In the example above a closing '|' character is missing at the end
of the regex. The segfault itself was caused by trying to destroy
the same filter twice in _init_filters fn within the error path
(the "bad" goto target):
bad:
if (f3)
f3->destroy(f3);
if (f4)
f4->destroy(f4);
Where f3 is the composite filter (sysfs + regex + type + md + mpath filter)
and f4 is the persistent filter which encompasses this composite filter
within persistent filter's 'real' field in 'struct pfilter'.
So in the end, we need to destroy the persistent filter only as
this will also destroy any 'real' filter attached to it.
363 files changed, 19863 insertions(+), 9055 deletions(-)
This is a very large release - so expect bugs!
Please treat this release like a release candidate.
Changes to the external interfaces since 2.02.98 are not yet frozen.
Updated releases will follow quickly (days not weeks) as any problems
are handled.
The --type mirror requires -m/--mirrrors:
lvconvert --type mirror vg/lvol0
--type mirror requires -m/--mirrors
Run `lvconvert --help' for more information.
The --type raid* is allowed (the checks already existed):
lvconvert --type raid10 vg/lvol0
Converting the segment type for vg/lvol0 from linear to raid10 is not yet supported.
The --type snapshot is a synonym to -s/--snapshot:
lvconvert -s vg/lvol0 vg/lvol1
Logical volume lvol1 converted to snapshot.
lvconvert --type snapshot vg/lvol0 vg/lvol1
Logical volume lvol1 converted to snapshot.
All the other segment types are not supported, e.g.:
lvconvert --type zero vg/lvol0
Conversion using --type zero is not supported.
Run `lvconvert --help' for more information.
The status printed for dm-raid targets on older kernels does not include
the syncaction field. This is handled by dev_manager_raid_status() just
fine by populating the raid status structure with NULL for that field.
However, lv_raid_sync_action() does not properly handle that field being
NULL. So, check for it and return 0 if it is NULL.
Pool creation involves clearing of metadata device
which triggers udev watch rule we cannot udev synchronize with
in current code.
This metadata devices needs to be activated localy,
so in cluster mode deactivation and reactivation
is always needed.
However for non-clustered mode we may reload table
via suspend/resume path which avoids collision with
udev watch rule which was occasionaly triggering
retry deactivation loop.
Code has been also split into 2 separate code paths
for thin pools and thin volumes which improved readability
of the code as well.
Deactivation has been moved out of extend_pool() and
decision is now in _lv_create_an_lv() which knows
the change mode.
The new lvm2-activation-net.service activates LVM volumes
after network-attached devices are set up (iSCSI and FCoE)
if lvmetad is disabled and hence the autoactivation is not
used.
Created dlid for test is not needed afterward, so lower a memory
usage of this call is repeatedly used for building some large tree.
TODO: create function to use given buffer on stack as much cheaper.
Code needs to check if the layer origin device is suspended,
It's valid to create thinvolume snapshot of thinvolume which is also
used as an old-style snapshot. In this case we need to check -real
is suspended.
When adding origin_only - add only layer thin volume.
(in case it's also old-snapshot add only -real device)
Remove backup() call from update_pool_lv() as it's been there
duplicated and preperly order backup() call after lvresize,
so there is just one such call.
If the thin pool is known to be active, messages can be passed
to the pool even when the created thin volume is not going to be
activated.
So we do not need to stack large list of message and validate
and catch creation errors earlier in this case.
Replace the test for valid activation combination with simpler list of
deactivation combinations.
Normally, the lvm dumpconfig processes only the configuration tree
that is at the top of the cascade. Considering the cascade is:
CONFIG_STRING -> CONFIG_PROFILE -> CONFIG_MERGED_FILES/CONFIG_FILE
...then:
(dumpconfig of lvm.conf only)
raw/~ $ lvm dumpconfig allocation
allocation {
maximise_cling=1
mirror_logs_require_separate_pvs=0
thin_pool_metadata_require_separate_pvs=0
thin_pool_chunk_size=64
}
(dumpconfig of selected profile configuration only)
raw/~ $ lvm dumpconfig --profile test allocation
allocation {
thin_pool_chunk_size=8
thin_pool_discards="passdown"
thin_pool_zero=1
}
(dumpconfig of given --config configuration only)
raw/~ $ lvm dumpconfig --config 'allocation{thin_pool_chunk_size=16}' allocation
allocation {
thin_pool_chunk_size=16
}
The --mergedconfig option causes the configuration cascade to be
merged before processing it with dumpconfig:
(dumpconfig of merged selected profile and lvm.conf)
raw/~ $ lvm dumpconfig --profile test allocation --mergedconfig
allocation {
maximise_cling=1
thin_pool_zero=1
thin_pool_discards="passdown"
mirror_logs_require_separate_pvs=0
thin_pool_metadata_require_separate_pvs=0
thin_pool_chunk_size=8
}
(dumpconfig merged given --config and selected profile and lvm.conf)
raw/~ $ lvm dumpconfig --profile test --config 'allocation{thin_pool_chunk_size=16}' allocation --mergedconfig
allocation {
maximise_cling=1
thin_pool_zero=1
thin_pool_discards="passdown"
mirror_logs_require_separate_pvs=0
thin_pool_metadata_require_separate_pvs=0
thin_pool_chunk_size=16
}
Hence with the --mergedconfig, we are able to see the
configuration that is actually used when processing any
LVM command while using any combination of --config/--profile
options together with lvm.conf file.
Start separating the validation from the action in the basic lvresize
code moved to the library.
Remove incorrect use of command line error codes from lvresize library
functions. Move errors.h to tools directory to reinforce this,
exporting public versions of the error codes in lvm2cmd.h for dmeventd
plugins to use.
Document following items:
configuration cascade (man lvm.conf)
--profile ProfileName (man lvm)
--detachprofile (man vgchange/lvchange)
-o vg_profile/lv_profile (man vgs/lvs)
Also document --config a bit so we can see where it fits in the
configuration cascade - will be documented more in following commit...
Fix and improve handling on sigint.
Always check for signal presence *before* calling of command,
so it will not call the command when break was hit.
If the command has been finished succesfully there is
no problem to mark the command ok and not report interrupt at all.
Fix cuple related stack; reports and assignments.
Do not keep multiple archives for the executed command.
Reuse the ALLOCATABLE_PV from pv status for
ARCHIVED_VG vg status. Mark VG with the bit with the
first archivation.
If the user would upconvert a linear LV to a mirror without specifying
the segment type ("--type mirror" vs "--type raid1"), the "mirror"
segment type would be chosen without consulting the 'default_mirror_segtype'
setting in lvm.conf. This is now used as the basis for determining
which should be used if left unspecified.
Check for enough space in preallocated buffer.
Fixes problem, when lvm code started to suddenly allocate
too big memory chunks.
TODO: lvmetad protocol should announce needed size ahead,
so if metadata have 1MB we are not reallocating memory...
When vgname has not existed in metadata, it has crashed on double free
in format_instance destroy() - since VG was created, used FID and was
released - which also released FID, so further use was accessing bad
memory.
Fix it for this code path before release_vg() so FID will exists
when _vg_read_file_name() returns NULL.
Support vgsplit for VGs with thin pools and thin volumes.
In case the thin data and thin metadata volumes are moved to a new VG,
move there also all related thin volumes and check that external origins
are also present in this new VG.
We use mpath filtering (enabled by devices/multipath_component_detection=1
lvm.conf setting) to avoid a situation in which we could end up with
duplicate PVs found. We need to filter out the mpath components and
use only the top-level multipath mapping instead for PV scans.
However, if the there are partitions on multipath components, we need
to filter out these partitions. This patch fixes it so those
partitions found on multipath components are filtered as well.
For example, let's consider following configuration:
The sda and sdb are mpath components, sda1 and sdb1 the partitions
on these components, mpath-test the mpath mapping and mpath-test1
the partition mapping - created automatically by kpartx right
after mpath-test creation. The PV resides on top.
(LVM PV)
|
mpath-test1
|
mpath-test
|
sda1 ---------- sdb1
\ | |/
sda sdb
E.g. for sda1 and sdb1, the code will detect this and it skips
the partition that belongs to the multipath component:
<snippet from the log>
#filters/filter-mpath.c:156 /dev/sda1: Device is a partition, using primary device /dev/sda for mpath component detection
130 #ioctl/libdm-iface.c:1724 dm status (253:2) OF[16384](*1)
131 #filters/filter-mpath.c:196 /dev/sda1: Skipping mpath component device
</snippet from the log>
Othewise, we'd see the same PV label on sda1/sdb1 and mpath-test1
at the same time ending up with "Duplicate PV found...".
Add support for lvresize of thin pool metadata device.
lvresize --poolmetadatasize +20 vgname/thinpool_lv
or
lvresize -L +20 vgname/thinpool_lv_tmeta
Where the second one allows all the args for resize (striping...)
and the first option resizes accoding to the last metadata lv segment.
Giving volume type information about being 'metadata' type of volume
has higher priority then i.e. 'mirror' or 'thin' flag - for those
type we have 'target attr' (7th. field).
Last commit made dump filter only partially composable.
Add remaining functionality and also support composable wipe,
which is needed, when i.e. vgscan needs to remove cache.
(in release fix)
Add a generic dump operation to filters and make the composite filter call
through to its components. Previously, when global filter was set, the code
would treat the toplevel composite filter's private area as if it belonged a
persistent filter, trying to write nonsense into a non-sensical file.
Also deal with NULL cmd->filter gracefully.
The global filter in system's lvm.conf may conflict with the custom filter we
set up in vgimportclone (they can easily fail to intersect). Since we explicitly
avoid talking to lvmetad in vgimportclone, it is safe and reasonable to do so.
There is no point in creation of 2chunks snapshot,
since the snapshot is invalidated immeditelly with the first write
as there is no free chunk for COW blocks
(2 chunks are used by the snap header and the 1st. metadata chunk).
Enhance error message about the lowest usable size.
Instead of seeing wierd overflows inside the lvm code,
giving false error messages, kill the user experiment in the begining.
Who needs to use more then 16EiB with lvm2 and 64bit anyway...
Avoid hitting memory corruption (double free) in code path,
where PV FID has been already destroyed and the released pointer
was left in PV structure and could have been tried to be released
from there 2nd. time with final context destruction.
Check for mounted fs also for vgchange command, not just lvchange.
NOTE: Code is using lv_info() just like lvs_in_vg_opened().
It should be probably converted into lv_is_active_locally().
There are places where 'lv_is_active' was being used where it was
more correct to use 'lv_is_active_locally'. For example, when checking
for the existance of a kernel instance before asking for its status.
Most of the time these would work correctly. (RAID is only allowed on
non-clustered VGs at the moment, which means that 'lv_is_active' and
'lv_is_active_locally' would give the same result.) However, it is
more correct to use the proper variant and it helps with future
scenarios where targets might be allowed exclusively (or clustered) in
a cluster VG.
Accept --yes on all commands, even ones that don't today have prompts,
so that test scripts that don't care about interactive prompts no
longer need to deal with them.
But continue to mention --yes only in the command prototypes that
actually use it.
This fixes a long standing regression since LVM2 2.02.74 (commit 4efb1d9c,
"Update heuristic used for default and detected data alignment.")
The default PE alignment could be used (via MAX()) even if it was
determined that the device's MD stripe width, or minimal_io_size or
optimal_io_size were not factors of the default PE alignment (either 64K
or the newer default of 1MB, etc). This bug would manifest if the
default PE alignment was larger than the overriding hint that the
device provided (e.g. default of 1MB vs optimal_io_size of 768K).
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
If the dm_realloc would fail, the already allocate _maps_buffer
memory would have been lost (overwritten with NULL).
Fix this by using temporary line buffer.
Also add a minor cleanup to set end of buffer to '\0',
only when we really know the file size fits the preallocated buffer.
Setting the cmd->default_settings.udev_fallback also requires DM
driver version check. However, this caused useless mapper/control
access with ioctl if not needed actually. For example if we're not
using activation code, we don't need to know the udev_fallback as
there's no node and symlink processing.
For example, this premature mapper/control access caused problems
when using lvm2app even when no activation happens - there are
situations in which we don't need to use mapper/control, but still
need some of the lvm2app functionality. This is also the case for
lvm2-activation systemd generator which just needs to look at the
lvm2 configuration, but it shouldn't touch mapper/control.
Add new lvs segment field 'Monitor' showing 3 states:
"monitored" - LV is monitored by dmeventd.
"not monitored" - LV is currently not being monitored by dmeventd
"" (empty) - LV does not support monitoring, or dmeventd support
is not compiled in.
Support for exclusive activation of snapshots revealed some problems.
When snapshot is created, COW LV is activated first (for clearing) and
then it's transformed into snapshot's COW LV, but it has left the lock
for such LV active in cluster and this lock could not have been removed
from dlm, unless snapshot has been removed within same dlm session.
If the user tried to remove snapshot after rebooting node, the lock was
missing, and COW LV could not have been detached.
Patch modifes the approach in this way:
Always deactivate COW LV for clustered vg after clearing (so it's
activated again via imlicit snapshot activation rule when snapshot is activated).
When snapshot is removed, activate COW LV as independend LV, so the lock
will exist for such LV, but only when the snapshot is active.
Also add test case for testing snapshot removal after cluster reboot.
Patch fixes hidden problem with lvm metadata caching.
When the pretest was made, only the commited data have been cached back
since the call lv_info_by_lvid() triggers mda read operation.
However call of lv_suspend_if_active() also reads precommited metadata.
The problem is visible in this sequence of calls:
vg_write(), suspend_lv(), vg_commit(), resume_lv()
which may end with leaving outdated mda in lvm cache, since vg_write()
drops cached metadata and vg_commit() only transforms precommited
to commited metadata, but in the case of pretesting we have
no precommited mda available so the cache will continue to use
old metadata. This happens, when suspend LV is inactive.
Merging multiple config files together needs to know newest (highest)
timestamp of merged files. Persistent cache file is being used
only in case, the config file is older then .cache file.
Assign fid as the last step before returning VG.
Make the format reader for 'lvm1' and 'pool' equal to 'lvm2' format reader.
It has caused memory corruption to lvmetad as it later calls
destroy_instance() to allocated fid. This patch should fix problems
with crashing test lvmetad-lvm1.sh.
'lvchange' is used to alter a RAID 1 logical volume's write-mostly and
write-behind characteristics. The '--writemostly' parameter takes a
PV as an argument with an optional trailing character to specify whether
to set ('y'), unset ('n'), or toggle ('t') the value. If no trailing
character is given, it will set the flag.
Synopsis:
lvchange [--writemostly <PV>:{t|y|n}] [--writebehind <count>] vg/lv
Example:
lvchange --writemostly /dev/sdb1:y --writebehind 512 vg/raid1_lv
The last character in the 'lv_attr' field is used to show whether a device
has the WriteMostly flag set. It is signified with a 'w'. If the device
has failed, the 'p'artial flag has priority.
Example ("nosync" raid1 with mismatch_cnt and writemostly):
[~]# lvs -a --segment vg
LV VG Attr #Str Type SSize
raid1 vg Rwi---r-m 2 raid1 500.00m
[raid1_rimage_0] vg Iwi---r-- 1 linear 500.00m
[raid1_rimage_1] vg Iwi---r-w 1 linear 500.00m
[raid1_rmeta_0] vg ewi---r-- 1 linear 4.00m
[raid1_rmeta_1] vg ewi---r-- 1 linear 4.00m
Example (raid1 with mismatch_cnt, writemostly - but failed drive):
[~]# lvs -a --segment vg
LV VG Attr #Str Type SSize
raid1 vg rwi---r-p 2 raid1 500.00m
[raid1_rimage_0] vg Iwi---r-- 1 linear 500.00m
[raid1_rimage_1] vg Iwi---r-p 1 linear 500.00m
[raid1_rmeta_0] vg ewi---r-- 1 linear 4.00m
[raid1_rmeta_1] vg ewi---r-p 1 linear 4.00m
A new reportable field has been added for writebehind as well. If
write-behind has not been set or the LV is not RAID1, the field will
be blank.
Example (writebehind is set):
[~]# lvs -a -o name,attr,writebehind vg
LV Attr WBehind
lv rwi-a-r-- 512
[lv_rimage_0] iwi-aor-w
[lv_rimage_1] iwi-aor--
[lv_rmeta_0] ewi-aor--
[lv_rmeta_1] ewi-aor--
Example (writebehind is not set):
[~]# lvs -a -o name,attr,writebehind vg
LV Attr WBehind
lv rwi-a-r--
[lv_rimage_0] iwi-aor-w
[lv_rimage_1] iwi-aor--
[lv_rmeta_0] ewi-aor--
[lv_rmeta_1] ewi-aor--
Move common code for changing activation state from
vgchange and lvchange to one function.
Fix the order of checks - so we always implicitelly
activate snapshots and thin volumes in exclusive mode,
and we do not allow local deactivation for them.
New options to 'lvchange' allow users to scrub their RAID LVs.
Synopsis:
lvchange --syncaction {check|repair} vg/raid_lv
RAID scrubbing is the process of reading all the data and parity blocks in
an array and checking to see whether they are coherent. 'lvchange' can
now initaite the two scrubbing operations: "check" and "repair". "check"
will go over the array and recored the number of discrepancies but not
repair them. "repair" will correct the discrepancies as it finds them.
'lvchange --syncaction repair vg/raid_lv' is not to be confused with
'lvconvert --repair vg/raid_lv'. The former initiates a background
synchronization operation on the array, while the latter is designed to
repair/replace failed devices in a mirror or RAID logical volume.
Additional reporting has been added for 'lvs' to support the new
operations. Two new printable fields (which are not printed by
default) have been added: "syncaction" and "mismatches". These
can be accessed using the '-o' option to 'lvs', like:
lvs -o +syncaction,mismatches vg/lv
"syncaction" will print the current synchronization operation that the
RAID volume is performing. It can be one of the following:
- idle: All sync operations complete (doing nothing)
- resync: Initializing an array or recovering after a machine failure
- recover: Replacing a device in the array
- check: Looking for array inconsistencies
- repair: Looking for and repairing inconsistencies
The "mismatches" field with print the number of descrepancies found during
a check or repair operation.
The 'Cpy%Sync' field already available to 'lvs' will print the progress
of any of the above syncactions, including check and repair.
Finally, the lv_attr field has changed to accomadate the scrubbing operations
as well. The role of the 'p'artial character in the lv_attr report field
as expanded. "Partial" is really an indicator for the health of a
logical volume and it makes sense to extend this include other health
indicators as well, specifically:
'm'ismatches: Indicates that there are discrepancies in a RAID
LV. This character is shown after a scrubbing
operation has detected that portions of the RAID
are not coherent.
'r'efresh : Indicates that a device in a RAID array has suffered
a failure and the kernel regards it as failed -
even though LVM can read the device label and
considers the device to be ok. The LV should be
'r'efreshed to notify the kernel that the device is
now available, or the device should be 'r'eplaced
if it is suspected of failing.