IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
When data are growing, adapt also size of metadata.
As we get way too many reports from users doing huge growths of
data portion while keep metadata small and avoiding using monitoring.
So to enhance the user-experience in case user requests grown of
thin-pool (without passing PV list for growth) - lvm2 will automaticaly
grown also the metadata part of thin-pool (if possible).
Add function for estimation of thin-pool metadata size for given size of
data. Function is using already existing internal API so it can
be reused for resize of thin-pool data.
When lvextend extends an LV that is active with a shared
lock, use this as a signal that other hosts may also have
the LV active, with gfs2 mounted, and should have the LV
refreshed to reflect the new size. Use the libdlmcontrol
run api, which uses dlm_controld/corosync to run an
lvchange --refresh command on other cluster nodes.
When an LV is active with a shared lock, a command can be
run to change the LV with --lockopt skiplv (to override the
exclusive lock the command ordinarily requires which is not
compatible with the outstanding shared lock.)
In this case, other commands may have the LV active and may
need to refresh the LV, so print warning stating this.
Udev is running udev-rule action upon 'resume'.
However lvm2 in special case is doing replacement of
'soon-to-be-removed' device with 'error' target for resuming
and then follows actual removal - the sequence is usually quick,
so when udev start action - it can result in 'strange' error
message in kernel log like:
Process '/usr/sbin/dmsetup info -j 253 -m 17 -c --nameprefixes --noheadings --rows -o name,uuid,suspended' failed with exit code 1.
To avoid this - we need to ensure there is synchronization wait for udev
between 'resume' and 'remove' part of this process.
However existing code put strict requirement to avoid synchronizing with
udev inside critical section - but this originally came from requirement
to not do anything special while there could be devices in
suspend-state. Now we are able to see differnce between critical section
with or without suspended devices. For udev synchronization only
suspended devices are prohibited to be there - so slightly relax
condition and allow calling and using 'fs_sync()' even inside critical
section - but there must not be any suspended device.
Allow using caching with VDO.
User can either cache a single vdopool or
a vdo LV - difference when the caching is put-in depends on a use-case
and it's upto user to decide which kind of speed is expected.
Internal detection of SCSI device being in-use by DM mpath has been
performed several times for each component device - this could be
eventually racy - so instead when we do remember 1st. checked result
for device being mpath and use it consistenly over the filter runtime.
Move DM usage into dev_manager.c source file.
Also convert STATUS to INFO ioctl - as that's enough
to obtain UUID - this also avoid issuing unwanted flush on checked DM
device for being mpath.
Activation would not be allowed anyway, but we can
check for these cases early and avoid wasted time in
pvscan managing online files an attempting activation.
This is the default bcache size that is created at the
start of the command. It needs to be large enough to
hold a single copy of metadata for a given VG, or the
VG cannot be read or written (since the entire VG would
not fit into available memory.)
Increasing the default reduces the chances of anyone
needing to increase the default to use their VG.
The size can be set in lvm.conf global/io_memory_size;
the lower limit is 4 MiB and the upper limit is 128 MiB.
When a single copy of metadata gets within 1MB of the
current io_memory_size value, begin printing a warning
that the io_memory_size should be increased.
which defines the amount of memory that lvm will allocate
for bcache. Increasing this setting is required if it is
smaller than a single copy of VG metadata.
and "cachepool" to refer to a cache on a cache pool object.
The problem was that the --cachepool option was being used
to refer to both a cache pool object, and to a standard LV
used for caching. This could be somewhat confusing, and it
made it less clear when each kind would be used. By
separating them, it's clear when a cachepool or a cachevol
should be used.
Previously:
- lvm would use the cache pool approach when the user passed
a cache-pool LV to the --cachepool option.
- lvm would use the cache vol approach when the user passed
a standard LV in the --cachepool option.
Now:
- lvm will always use the cache pool approach when the user
uses the --cachepool option.
- lvm will always use the cache vol approach when the user
uses the --cachevol option.
For users who do not want all of the fields included
in debug lines, let them specify in lvm.conf which
fields to include. timestamp, command[pid], and
file:line fields can all be disabled.
Without this, the output from different commands in a single
log file could not be separated.
Change the default "indent" setting to 0 so that the default
debug output does not include variable spaces in the middle
of debug lines.
When aay was included in the pvscan --cache command,
the activation part was complaining about the unusual
state of the hint file since it had been recreated
just prior.
udev_dev_is_md_component and udev_dev_is_mpath_component
are not used for obtaining the device list, but they still
use libudev for device info. When there are problems with
udev, these functions can get stuck. So, use the existing
obtain_device_list_from_udev config setting to also control
whether these "is component" functions are used, which gives
us a way to avoid using libudev entirely when it's causing
problems.
Whenever thin-pool chunk size is unspecified and left for lvm calculation
try to select the size as nearest highest power-of-2 instead of
just being a multiple of 64KiB.
Fixing recent commit 022ebb0cfe
Resize already has size that needs to be counted with,
otherwise upsizing operation could turn into size reduction one.
Since the parse_vdo_pool_status() become vdo_manip API part,
and there will be no 'dm' matching status parser,
the API can be simplified and closely match thin API here.
Now with newer VDO kvdo target we can start to use standard mechanism
to enable resize of VDO volumes.
VDO pool can be grown.
Virtual volume grows on top of VDO pool when is not big enough.
Reduced VDOLV is calling discard for reduced areas - this can
take long time!
TODO: implement some pollable mechanism for out-of-lock TRIM.
When using 'lvcreate -l100%VG' and there is big disproportion between
real available space and requested setting - automatically fallback
to 100%FREE.
Difference can be seen when VG is big and already most space was
allocated, so the requestion 100%VG can end (and by spec for % modifier
it's correct) as LV with size of 1%VG. Usually this is not a big
problem - buit in some cases - like cache-pool allocation, this
can result a big difference for chunksize selection.
With this patch it's more closely match common-sense logic without
the need of reitteration of too big changes in lvm2 core ATM.
TODO: in the future there should be allocator solving all allocations
in a single call.
lvm uses 'minimum_io_size' name to exactly match VDO naming here,
however in all common cases _size is using 'sector/512b' unit.
But in this case the value is in bytes and can have only 2 values:
either 512 or 4096.
It's probably not worth to rename it internaly, so we can just
drop comment - instead of using 1 or 8.
Thought let's think about it....
An idea from Zdenek for better ensuring valid hints by invalidating
them when pvscan --cache <device> sees a new PV, which is a case
where we know that hints should be invalidated. This is triggered
from systemd/udev logic, and there may be some cases where it would
invalidate hints that the existing methods wouldn't detect.
If there are two independent scripts doing:
vgchange --lockstart vg
lvchange -ay vg/lv
The first vgchange to do the lockstart will wait for
the lockstart to complete before returning.
The second vgchange to do the lockstart will see that
the start is already in progress (from the first) and
will do nothing. This means the second does not wait
for any lockstart to complete, and moves on to the
lvchange which may find the lockspace still starting
and fail.
To fix this, make the vgchange lockstart command
wait for any lockstart's in progress to complete.
Save the list of PVs in /run/lvm/hints. These hints
are used to reduce scanning in a number of commands
to only the PVs on the system, or only the PVs in a
requested VG (rather than all devices on the system.)
The systemd generators are executed very early during the switch
from initramfs to system partition and the syslog is not yet fully
operational - it may cause blocking, if some debug logging is enabled
at the same time in /etc/lvm/lvm.conf log{} section.
To avoid timeouting and killing this generator - rather enhance lvm
code to suppress any syslog communication when LVM_SUPPRESS_SYSLOG
envvar is set.
Use of this envvar is needed since the parsing of i.e. cmdline options
that could eventually override lvm.conf setting happens in this case
way too late and number of lines could have been already streamed to
syslog.
In few cases error paths from initialization were returned as
'success == 1'.
Also assing num_mb with single compare checking valid sector_size.
For dumb compiler make num_mb always defined.
Drop very old original format of VDO target and focus on V2 version.
So some variables were renamed or replaced.
There is no compatibility preserved (with assumption so far this is
experimental feature and there is no real user).
Note - version currently VDO calls this version 6.2.
This is a followup patch to commit edb72cb70c
to support related lvm2 test suite tests.
A 'global/support_mirrored_mirror_log' bool configuration variable gets
introduced allowing the creation of, or conversion to mirrored 'mirror'
logs if set. The capability to create these in turn allows the rest of
the tests to perform activation of such existing LVs and their conversions
to disk/core 'mirror' logs.
Display a disclaimer warning if enabled that this is not for regular use.
Add definition of the enabled config option to respective test scripts.
Related: rhbz1643562
Scenario: Given an existed LV `lvol0`, I want to create another LV
on the PVs used by `lvol0`.
I use `build_parallel_areas_from_lv()` to obtain the `pv_list` of each segments.
However, the returned `pv_list` is not properly initialized, which causes
segfault in subsequent operations.
Ensure configure.h is always 1st. included header.
Maybe we could eventually introduce gcc -include option, but for now
this better uses dependency tracking.
Also move _REENTRANT and _GNU_SOURCE into configure.h so it
doesn't need to be present in various source files.
This ensures consistent compilation of headers like stdio.h since
it may produce different declaration.
There's a small window during creation of a new RaidLV when
rmeta SubLVs are made visible to wipe them in order to prevent
erroneous discovery of stale RAID metadata. In case a crash
prevents the SubLVs from being committed hidden after such
wiping, the RaidLV can still be activated with the SubLVs visible.
During deactivation though, a deadlock occurs because the visible
SubLVs are deactivated before the RaidLV.
The patch adds _check_raid_sublvs to the raid validation in merge.c,
an activation check to activate.c (paranoid, because the merge.c check
will prevent activation in case of visible SubLVs) and shares the
existing wiping function _clear_lvs in raid_manip.c moved to lv_manip.c
and renamed to activate_and_wipe_lvlist to remove code duplication.
Whilst on it, introduce activate_and_wipe_lv to share with
(lvconvert|lvchange).c.
Resolves: rhbz1633167
In RHEL7 we marked mirrored mirror logs as deprecated and
added a related message. This patch prohibits creating new
'mirror' LVs with that log type or converting existing LVs
to have one.
Existing LVs with mirrored mirror log can be activated
and converted to disk/core logs.
Avoid double deprecation message when running lvconvert.
Resolves: rhbz1643562
commit de28637
scan: use full md filter when md 1.0 devices are present
missed the fact that md superblock version 0.90 also puts
metadata at the end of the device, so the full md filter
needs to be used when either 0.90 or 1.0 is present.
. When using default settings, this commit should change
nothing. The first PE continues to be placed at 1 MiB
resulting in a metadata area size of 1020 KiB (for
4K page sizes; slightly smaller for larger page sizes.)
. When default_data_alignment is disabled in lvm.conf,
align pe_start at 1 MiB, based on a default metadata area
size that adapts to the page size. Previously, disabling
this option would result in mda_size that was too small
for common use, and produced a 64 KiB aligned pe_start.
. Customized pe_start and mda_size values continue to be
set as before in lvm.conf and command line.
. Remove the configure option for setting default_data_alignment
at build time.
. Improve alignment related option descriptions.
. Add section about alignment to pvcreate man page.
Previously, DEFAULT_PVMETADATASIZE was 255 sectors.
However, the fact that the config setting named
"default_data_alignment" has a default value of 1 (MiB)
meant that DEFAULT_PVMETADATASIZE was having no effect.
The metadata area size is the space between the start of
the metadata area (page size offset from the start of the
device) and the first PE (1 MiB by default due to
default_data_alignment 1.) The result is a 1020 KiB metadata
area on machines with 4KiB page size (1024 KiB - 4 KiB),
and smaller on machines with larger page size.
If default_data_alignment was set to 0 (disabled), then
DEFAULT_PVMETADATASIZE 255 would take effect, and produce a
metadata area that was 188 KiB and pe_start of 192 KiB.
This was too small for common use.
This is fixed by making the default metadata area size a
computed value that matches the value produced by
default_data_alignment.
The pvscan systemd service for autoactivation was
mistakenly dropped along with the lvmetad related
services.
The activation generator program now looks at the new
lvm.conf setting "event_activation" (default 1) to
switch between event activation and direct activation.
Previously, the old use_lvmetad setting was used to
switch between event and direct activation.
fix lseek error check
fix read/write error checks
handle zero return from read and write
don't return an error for short io
fix partial read/write loop
io_setup() for aio may fail if a system has reached the
aio request limit. In this case, fall back to using
sync io. Also, lvm use of aio can be disabled entirely
with config setting global/use_aio=0.
The system limit for aio requests can be seen from
/proc/sys/fs/aio-max-nr
The current usage of aio requests can be seen from
/proc/sys/fs/aio-nr
The system limit for aio requests can be increased by
setting fs.aio-max-nr using sysctl.
Also add last-byte limit to the sync io code.
Commit 813347cf84 added extra validation,
however in this particular we do want to trim suffix out so rather ignore
resulting error code here intentionaly.
If a single, standard LV is specified as the cache, use
it directly instead of converting it into a cache-pool
object with two separate LVs (for data and metadata).
With a single LV as the cache, lvm will use blocks at the
beginning for metadata, and the rest for data. Separate
dm linear devices are set up to point at the metadata and
data areas of the LV. These dm devs are given to the
dm-cache target to use.
The single LV cache cannot be resized without recreating it.
If the --poolmetadata option is used to specify an LV for
metadata, then a cache pool will be created (with separate
LVs for data and metadata.)
Usage:
$ lvcreate -n main -L 128M vg /dev/loop0
$ lvcreate -n fast -L 64M vg /dev/loop1
$ lvs -a vg
LV VG Attr LSize Type Devices
main vg -wi-a----- 128.00m linear /dev/loop0(0)
fast vg -wi-a----- 64.00m linear /dev/loop1(0)
$ lvconvert --type cache --cachepool fast vg/main
$ lvs -a vg
LV VG Attr LSize Origin Pool Type Devices
[fast] vg Cwi---C--- 64.00m linear /dev/loop1(0)
main vg Cwi---C--- 128.00m [main_corig] [fast] cache main_corig(0)
[main_corig] vg owi---C--- 128.00m linear /dev/loop0(0)
$ lvchange -ay vg/main
$ dmsetup ls
vg-fast_cdata (253:4)
vg-fast_cmeta (253:5)
vg-main_corig (253:6)
vg-main (253:24)
vg-fast (253:3)
$ dmsetup table
vg-fast_cdata: 0 98304 linear 253:3 32768
vg-fast_cmeta: 0 32768 linear 253:3 0
vg-main_corig: 0 262144 linear 7:0 2048
vg-main: 0 262144 cache 253:5 253:4 253:6 128 2 metadata2 writethrough mq 0
vg-fast: 0 131072 linear 7:1 2048
$ lvchange -an vg/min
$ lvconvert --splitcache vg/main
$ lvs -a vg
LV VG Attr LSize Type Devices
fast vg -wi------- 64.00m linear /dev/loop1(0)
main vg -wi------- 128.00m linear /dev/loop0(0)