Detect when a dos partition table mixes regular partitions with a GPT
PMBR partition.
This is not a sane configuration, but detect it anyway, just in case
someone configures such a partition layout manually and forcefully, and
incorrectly defines one of the partition types to be the GPT's PMBR.
For example:
❯ fdisk -l /dev/sdc
Device     Boot Start    End Sectors Size Id Type
/dev/sdc1        2048  67583   65536  32M 83 Linux
/dev/sdc2       67584 262143  194560  95M ee GPT
Before:
(The partition filter passes even though there is a real, existing dos
partition; the empty GPT PMBR overrides it.)
❯ pvcreate /dev/sdc
WARNING: PMBR signature detected on /dev/sdc at offset 510. Wipe it? [y/n]:
Wiping PMBR signature on /dev/sdc.
Physical volume "/dev/sdc" successfully created.
With this patch applied:
(The GPT PMBR does not override the existence of the dos partition.)
❯ pvcreate /dev/sdc
Cannot use /dev/sdc: device is partitioned
This provides better hints when trying to resize the fs on top of an LV.
Also needs a3f6d2f593 for proper operation.
❯ lvs -o name,size vg/swap
lv_name lv_size
swap 60.00m
Before:
❯ lvextend -L72m vg/swap
Size of logical volume vg/swap changed from 60.00 MiB (15 extents) to 72.00 MiB (18 extents).
Logical volume vg/swap successfully resized.
❯ lvreduce -L60m vg/swap
File system swap found on vg/swap.
File system device usage is not available from libblkid.
❯ lvreduce -L50m vg/swap
Rounding size to boundary between physical extents: 52.00 MiB.
File system swap found on vg/swap.
File system device usage is not available from libblkid.
After:
❯ lvextend -L72m vg/swap
Size of logical volume vg/swap changed from 60.00 MiB (15 extents) to 72.00 MiB (18 extents).
Logical volume vg/swap successfully resized.
❯ lvreduce -L60m vg/swap
File system swap found on vg/swap.
File system size (60.00 MiB) is equal to the requested size (60.00 MiB).
File system reduce is not needed, skipping.
Size of logical volume vg/swap changed from 72.00 MiB (18 extents) to 60.00 MiB (15 extents).
Logical volume vg/swap successfully resized.
❯ lvreduce -L50m vg/swap
Rounding size to boundary between physical extents: 52.00 MiB.
File system swap found on vg/swap.
File system size (60.00 MiB) is larger than the requested size (52.00 MiB).
File system reduce is required and not supported (swap).
blkid does not report FSLASTBLOCK for a swap device. However, blkid
does report FSSIZE for swap devices, so use this field instead (plus
the header size, which is FSBLOCKSIZE for swap) to set the
"filesystem last block", which is then used for further calculations
and conditions.
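The calculation described above can be sketched as follows. This is only an illustration, not the lvm2 code: the function name `swap_fs_last_block` is hypothetical, and it assumes that FSSIZE is given in bytes and excludes a header of exactly one FSBLOCKSIZE.

```python
def swap_fs_last_block(fssize: int, fsblocksize: int) -> int:
    """Estimate the "filesystem last block" of a swap device.

    Assumption: FSSIZE (bytes) does not include the swap header, which
    occupies one block of FSBLOCKSIZE bytes.
    """
    if fsblocksize <= 0:
        raise ValueError("FSBLOCKSIZE must be positive")
    total_bytes = fssize + fsblocksize   # usable swap area plus header
    return total_bytes // fsblocksize    # total size expressed in blocks


# Hypothetical values for a 60 MiB swap LV with 4 KiB blocks.
MIB = 1024 * 1024
blocks = swap_fs_last_block(60 * MIB - 4096, 4096)
print(blocks, blocks * 4096 // MIB)
```

With these sample values the result covers the whole 60 MiB LV, which is consistent with the "File system size (60.00 MiB)" messages in the transcript above.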
We already detect an msdos partition table. If it is empty, that is,
there is just the partition header and no actual partitions defined,
then filter-partitioned passes; otherwise it does not.
Do the same for a GPT partition table.
New config setting sanlock_align_size can be used to configure
the sanlock lease size that lvmlockd will use on 4K disks.
By default, lvmlockd and sanlock use 8MiB align_size (lease size)
on 4K disks, which supports up to 2000 hosts (and max host_id.)
This can be reduced to 1, 2 or 4 (in MiB), to reduce lease i/o.
The reduced sizes correspond to smaller max hosts/host_id:
1 MiB = 250 hosts
2 MiB = 500 hosts
4 MiB = 1000 hosts
8 MiB = 2000 hosts (default)
(Disks with 512 byte sectors always use 1MiB leases and support
2000 hosts/host_id, and are not affected by this.)
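The relation between lease size and host count listed above can be captured in a small lookup. This is a sketch for illustration only; the function name `max_hosts` and its parameters are hypothetical and not part of lvmlockd or sanlock.

```python
# Maximum hosts (and host_id) per lease size in MiB on 4K-sector disks,
# copied from the table above.
MAX_HOSTS_4K = {1: 250, 2: 500, 4: 1000, 8: 2000}


def max_hosts(sector_size: int, align_mib: int = 8) -> int:
    """Return the maximum number of hosts for a given disk and lease size."""
    if sector_size == 512:
        # 512-byte-sector disks always use 1 MiB leases and support
        # 2000 hosts; the align size setting does not affect them.
        return 2000
    if sector_size != 4096:
        raise ValueError(f"unsupported sector size: {sector_size}")
    if align_mib not in MAX_HOSTS_4K:
        raise ValueError(f"align size must be 1, 2, 4 or 8 MiB, got {align_mib}")
    return MAX_HOSTS_4K[align_mib]


print(max_hosts(4096, 2))   # 4K disk with 2 MiB leases
print(max_hosts(512, 2))    # 512-byte disk, setting is ignored
```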
If the user is sure that neither the 'rootfs' nor 'swap' is on LVs
managed by this command, it is possible to completely bypass pinning
the process in RAM, which may slightly speed up command execution
(at the risk that the process may be delayed by swapping).
Basically, use this only at your own risk...
TODO: add some dmeventd support for this.
Previously, lvmlockd detected the end of the lvmlock LV
by doing i/o to it until an i/o error was returned.
This triggered sanlock warning messages, so use the LV
size to avoid accessing beyond the end of the device.
Previously, every lvcreate would refresh the lvmlock LV
in case another machine had extended it. This involves
a lot of unnecessary work in most cases, so now compare
the LV size and device size to detect when a refresh is
needed.
lvremove of a thin lv while the pool is inactive would
leave the pool locked but inactive.
lvcreate of a thin snapshot while the pool is inactive
would leave the pool locked but inactive.
lvcreate of a thin lv could activate the pool to check
a threshold before the pool lock was acquired in lvmlockd.
The lv_hash wasn't being passed to the seg-specific text import
functions, so they were doing many find_lv() calls, which consume
a lot of time when there are many LVs in the metadata.
Older gcc doesn't really like complex types (buffer, struct) being
initialized without extra {} around such a type.
So pick any other 'single type' member of the struct and set it to 0;
the compiler zeroes the rest without emitting a warning.
Revert 373372c8ab and instead update
our validation code to handle LVs with an empty segment. Currently
this should be needed only for the pvmove operation, thus such an LV
should have the name 'pvmove%u'.
This fixes a problem where a user tried e.g. pvmove on a VG with a single
PV, as reported: https://github.com/lvmteam/lvm2/issues/148
Reported-by: bob@redhat.com
The option can be used in multiple ways (like --cachesettings):
--integritysettings key=val
--integritysettings 'key1=val1 key2=val2'
--integritysettings key1=val1 --integritysettings key2=val2
Use with lvcreate or lvconvert when integrity is first enabled
to configure:
journal_sectors
journal_watermark
commit_time
bitmap_flush_interval
allow_discards
Use with lvchange to configure (only while inactive):
journal_watermark
commit_time
bitmap_flush_interval
allow_discards
lvchange --integritysettings "" clears any previously configured
settings, so dm-integrity will use its own defaults.
lvs -a -o integritysettings displays configured settings.
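The three ways of passing the option, and the different key sets accepted at creation and change time, can be sketched as follows. This is an illustrative parser only, not the lvm2 implementation; `parse_integrity_settings`, `CREATE_KEYS` and `CHANGE_KEYS` are hypothetical names, and the sample values are made up.

```python
# Keys accepted when integrity is first enabled (lvcreate/lvconvert).
CREATE_KEYS = {
    "journal_sectors", "journal_watermark", "commit_time",
    "bitmap_flush_interval", "allow_discards",
}
# Keys accepted by lvchange: everything except journal_sectors.
CHANGE_KEYS = CREATE_KEYS - {"journal_sectors"}


def parse_integrity_settings(values, allowed):
    """Merge the argument of each --integritysettings occurrence.

    `values` holds one string per occurrence; each string may contain
    several space-separated key=val pairs. An empty string adds nothing,
    so [""] yields an empty dict (settings cleared).
    """
    settings = {}
    for value in values:
        for token in value.split():
            key, sep, val = token.partition("=")
            if not sep or not key or not val:
                raise ValueError(f"expected key=val, got {token!r}")
            if key not in allowed:
                raise ValueError(f"setting {key!r} is not allowed here")
            settings[key] = val
    return settings


# Both spellings produce the same result.
print(parse_integrity_settings(["journal_watermark=50 commit_time=10"], CHANGE_KEYS))
print(parse_integrity_settings(["journal_watermark=50", "commit_time=10"], CHANGE_KEYS))
```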
log/command_log_report config setting defaults to 1 now if json or json_std
output format is used (either by setting report/output_format config
setting or using --reportformat cmd line arg).
This means that if we use json/json_std output format, the command log
messages are then part of the json output too, not interleaved as
unstructured text mixed with the json output.
If log/command_log_report is set explicitly in the config, then we still
respect that, no matter what output format is used currently. In this
case, users can still separate and redirect the output by using
LVM_OUT_FD, LVM_ERR_FD and LVM_REPORT_FD so that the different types
do not interleave with the json/json_std output.
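The resulting precedence can be summarized in a few lines. This is a sketch of the described behaviour, not the lvm2 code; the function name is hypothetical, and the value 0 for other output formats is an assumption about the previous default.

```python
def effective_command_log_report(output_format, configured=None):
    """Return the effective log/command_log_report value.

    `configured` is the value set explicitly in the config, or None if
    the setting is not present there.
    """
    if configured is not None:
        # An explicit setting always wins, whatever the output format.
        return configured
    # New default: enabled for json and json_std output.
    # Assumption: other formats keep a default of 0.
    return 1 if output_format in ("json", "json_std") else 0


print(effective_command_log_report("json"))
print(effective_command_log_report("json_std", configured=0))
print(effective_command_log_report("basic"))
```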
When PV sizes differ within a VG, the lvm2 allocator falls short of
defining resilient extended segments when a RaidLV extension to 100%FREE is
requested, and a RAID distinct allocation check fails. The fix is to release
a memory pool on the resulting error path.
Until the lvm2 allocator is enhanced (WIP) to do such complex (and other)
allocations properly, a workaround is to extend a RaidLV into any free space
on its already allocated PVs by listing those PVs on the lvextend command line,
then iteratively run further such lvextend commands to extend it to its
final intended size. Mind, this may be a non-trivial iteration of extensions.
The cmd struct is now required in many more functions, and
it's added as a function arg for most direct dev-cache function
calls. The cmd struct is added to struct device (dev->cmd) so
that it can be accessed in many other cases where dev-cache
functions are being called from places where getting the cmd
struct is too difficult.
The dm devs cache is separate from the ordinary dev cache,
so give the function names distinct prefixes, using
"dm_devs_cache" to prefix dm devs cache functions.
When a PV is stacked on an LV, the PV needs to be
dropped from bcache before the LV is processed.
The LV can be found in dev-cache using its name
rather than the devno.
The list of dm devs was in the cmd struct and had a
different lifetime than the radix trees referencing
those dm devs. Now the list and radix trees are
created and destroyed together.
In the context of dm, 'device' refers to a dm device, but
in the context of lvm, 'device' refers to struct device.
Change some lvm function names to make that difference clearer.
dev_manager_get_device_list() -> dev_manager_get_dm_active_devices()
get_device_list() -> get_dm_active_devices()
device_get_uuid() -> dev_dm_uuid(), devno_dm_uuid()
vgchange -an vg is permitted when the vg lockspace
is not available, because LVs could still be active
for some reason, and they should be inactive when not
properly locked. In case lvmlockd was not running, or
the lockspace was not started, the command was
unnecessarily trying and failing to unlock every LV,
printing errors for every LV. We can skip this when
the lockspace is known to not be available.
Lock adoption is not part of standard command behavior, but can
be used for manual recovery or cleanup from unexpected failure
cases. Like other lockopt values, they are hidden options for
--lockopt. Different lock managers will behave differently.
Adopting locks with lvmlockd -A1 is more accurate and automatic.
--lockopt adoptls
. for vgchange --lockstart
. adopt existing ls, or fail if no existing lockspace is found
--lockopt adoptgl | adoptvg | adoptlv
. for commands using lvmlockd locks
. adopt orphan gl/vg/lv lock, or fail the lock request if
no orphan lock is found
. will fail if orphan lock exists with a different lock mode
. command may still continue with a failed shared lock request
--lockopt adopt
. for lockstart or any command using lvmlockd locks
. adopt existing lockspace, or start lockspace if none exists
. adopt orphan gl/vg/lv lock, or acquire new lock if no orphan found
. will fail if orphan lock exists with a different lock mode
. command may still continue with a failed shared lock request
. with dlm this option only works for ls
Stop printing "Skipping global lock: lockspace not found or started"
for vgchange --lockstart, since it's generally an inherent limitation
that the global lock isn't available until after locking is started.
Update the start delay warning to "a few seconds".
The lvb is used to hold lock versions, but lock versions are
no longer used (since the removal of lvmetad), so the lvb
is not actually useful. Disable its use for sanlock to
avoid the extra i/o required to maintain the lvb.
vgremove with --lockopt force should skip lvmlockd-related
steps and allow a forced vg cleanup, in addition to using
--nolocking to skip normal locking calls.
Previously, a command would call lockd_vg() for a local VG,
which would go to lvmlockd, which would send back ENOLS,
and the command would not care when it saw the VG was local.
The pointless back-and-forth to lvmlockd for local VGs can
be avoided by checking the VG lock_type in lvmcache (which
label_scan now saves there; this wasn't the case back when
the original lockd_vg logic was added.) If the lock_type
saved during label_scan indicates a local VG, then the
lockd_vg step is skipped.
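The shortcut described above amounts to a check like the one below. This is a sketch only: the function name is hypothetical, and treating a missing cache entry as "ask lvmlockd anyway" is an assumption, not something stated in the commit.

```python
def should_call_lockd_vg(cached_lock_type):
    """Decide whether a command needs to contact lvmlockd for a VG.

    `cached_lock_type` is the lock_type saved during label_scan, or
    None when nothing was cached for this VG.
    """
    if cached_lock_type is None:
        # Assumption: without cached information, fall back to the
        # old behaviour and let lvmlockd answer.
        return True
    # An empty or "none" lock_type marks a local VG: skip lockd_vg.
    return cached_lock_type not in ("", "none")


print(should_call_lockd_vg("none"))     # local VG
print(should_call_lockd_vg("sanlock"))  # shared VG
```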
The idea in patch 6e6d4c62b of handling a -suffix as an
indication of a private device needs to be disabled.
Some problematic cases are currently not resolvable and some
more thinking is needed.
Once fixed, we can revert this patch.
For large device sets, our dm_hash can produce a larger number of mapping
collisions and we would need to further increase our hash size.
So instead use the radix_tree code, which is immune to a growing number
of devices and uses memory more efficiently to store all the paths.
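To illustrate why a prefix tree suits device paths, here is a minimal per-character trie. It is a toy stand-in, not lvm2's radix_tree (which also compresses chains of single-child nodes); the class name `PathTrie` and the sample values are hypothetical.

```python
class PathTrie:
    """Toy prefix tree mapping path strings to values."""

    def __init__(self):
        self._root = {}
        self.nodes = 0  # number of character nodes allocated so far

    def insert(self, path, value):
        node = self._root
        for ch in path:
            if ch not in node:
                node[ch] = {}
                self.nodes += 1
            node = node[ch]
        # "" can never be a single character, so it marks a stored value.
        node[""] = value

    def lookup(self, path):
        node = self._root
        for ch in path:
            node = node.get(ch)
            if node is None:
                return None
        return node.get("")


trie = PathTrie()
trie.insert("/dev/sda", 1)
trie.insert("/dev/sdb", 2)
# The second path shares the "/dev/sd" prefix and adds only one node.
print(trie.nodes, trie.lookup("/dev/sdb"), trie.lookup("/dev/sdc"))
```

There are no buckets and therefore no collisions to tune for; lookup cost depends on the key length rather than on how many paths are stored.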
Switch dev_cache from the less efficient 'btree' to
radix_tree, which generates a more efficient tree mapping.
Replace some direct uses of btree iteration with our dev_iter code.
Convert the persistent filter to use the more memory-compact radix_tree, as
dm_hash is bound to a preallocated number of slots and stores the whole
key together with the value.
When the DM uuid cache is available, we can possibly avoid an unnecessary
status ioctl() when we check the device for a 'usable' uuid.
If this test passes, the existing code will go through the full check.
Move the code for caching the devno, name and uuid of active dm devices
from device_mapper/libdm-iface to the dev_cache file: the libdm layer
takes care of 'decoding' ioctl data from the kernel, while caching for
use by lvm stays within lvm.
Introduce:
dev_cache_update_dm_devs
dev_cache_get_dm_dev_by_devno
dev_cache_get_dm_dev_by_uuid
Use radix_tree for searching.
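The shape of such a cache, one list of active dm devices indexed both by devno and by uuid, can be sketched like this. It is an illustration only: plain dicts stand in for the radix trees, and the class, method names and sample device values are hypothetical rather than the functions introduced above.

```python
from typing import NamedTuple


class DmDev(NamedTuple):
    major: int
    minor: int
    name: str
    uuid: str


class DmDevsCache:
    """Active dm devices, searchable by devno or by uuid."""

    def __init__(self):
        self._by_devno = {}
        self._by_uuid = {}

    def update(self, active_devs):
        # Rebuild both indexes together so they never disagree.
        by_devno = {(d.major, d.minor): d for d in active_devs}
        by_uuid = {d.uuid: d for d in by_devno.values() if d.uuid}
        self._by_devno, self._by_uuid = by_devno, by_uuid

    def get_by_devno(self, major, minor):
        return self._by_devno.get((major, minor))

    def get_by_uuid(self, uuid):
        return self._by_uuid.get(uuid)


cache = DmDevsCache()
cache.update([DmDev(253, 0, "vg-swap", "LVM-example-uuid-0")])
print(cache.get_by_devno(253, 0).name)
print(cache.get_by_uuid("LVM-example-uuid-0").minor)
```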
Do not attach the layer suffix to the UUID when activating a component LV.
In this case we want to allow this LV to be public, thus
such an LV should not use the -layer suffix in its UUID.
This also requires that our 'cached' access checks for
both UUIDs (with & without suffix), which was an unnoticed issue before.
This change is now necessary since our udev rule automatically expects
that any LV with a -layer suffix is private and will prevent generating
any systemd unit, even when there are no 'DM' flag bits passed via the
cookie mechanism while creating such an LV.
In order to free SubLVs after a stripe-removing reshape, lvconvert has
to be run without layout changes. Prevent a layout-changing request
if any such freed SubLVs exist, and inform the user that they must be
released first.
Instead of using 'key start & key end' uint8_t*, switch to
void* key & size_t keylen. This makes the API easier to adopt in the
lvm code base, which otherwise needs far too much casting at every use
of the function.
Also correctly mark const buffers to avoid compiler warnings and
casting.
Adapt the only bcache user ATM for API change.
Adapt unit test to match changed API.
The size of these hashes was quite small, so raise the number of
hashed entries to reduce the amount of hash collisions.
Select some unique/unused number for hash_create below 8192.
Replace the call to get_dm_uuid_from_sysfs() with use of
device_get_uuid(), which gets the same information,
but instead of several syscalls it needs either 1 or even 0
when the information is cached with newer kernels.
We get the cached DM list before grabbing the lock, so there
is some chance that the DM table has changed and we would
need to refresh this info.
TODO: benchmark whether it would even make sense to refresh the cache
and keep its content instead of using individual ioctl() calls for the tree build.