During removal of a lot of locking code, the signal blocking got lost
and signal processing got broken, leading to unpredictable
behavior, e.g. activation code that can get interrupted in the
middle of DM table processing.
lvm2 code always expects signals to be blocked while a lock is held,
unless the code is explicitly placed into a section of:
sigint_allow();....;sigint_restore();
For checking for a caught interrupt there is sigint_catched();
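Purely as an illustration of the pattern - not lvm2's actual helpers
(the sigint_*() calls named above are the internal API; everything
else here is a self-contained sketch using raw sigprocmask()):

    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    static volatile sig_atomic_t _sigint_seen;

    static void _handler(int sig) { (void) sig; _sigint_seen = 1; }

    int main(void)
    {
        sigset_t set, old;
        struct sigaction sa = { .sa_handler = _handler };

        sigemptyset(&sa.sa_mask);
        sigaction(SIGINT, &sa, NULL);

        /* "Lock taken": block SIGINT so critical work is not interrupted. */
        sigemptyset(&set);
        sigaddset(&set, SIGINT);
        sigprocmask(SIG_BLOCK, &set, &old);

        /* ... critical section, e.g. DM table processing ... */

        /* sigint_allow() analogue: open a window where SIGINT may interrupt. */
        sigprocmask(SIG_SETMASK, &old, NULL);
        sleep(1);                       /* interruptible work */
        /* sigint_restore() analogue: block again before touching shared state. */
        sigprocmask(SIG_BLOCK, &set, NULL);

        if (_sigint_seen)               /* sigint_catched() analogue */
                fprintf(stderr, "Interrupted, aborting cleanly.\n");

        /* "Lock released": restore the original mask. */
        sigprocmask(SIG_SETMASK, &old, NULL);
        return 0;
    }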
Metadata size was calculated correctly only for raids.
This fixes a crash during lvcreate when a thin-pool was created
on a VG whose remaining free space was only big enough to fit a single
metadata LV, but not also its _pmspare.
Lvcreate crashed with this assert message:
lvcreate: metadata/pv_map.c:198: consume_pv_area: Assertion `to_go <= pva->count' failed.
Aborted (core dumped)
TODO: there is probably too large an overload of several alloc_handle
variables.
Reported-by: Wu Guanghao <wuguanghao3@huawei.com>
Reported-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
When using --use-policy for automatic extension of a thin-pool,
the extension of the thin-pool's metadata itself can actually take
some extra space.
Since I'm not aware of an exact compensation formula, add just
1% extra to the calculated amount and hope it fits.
The goal is to always have a usable thin-pool that fits
below pool_metadata_min_threshold().
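A hypothetical sketch of the arithmetic (the function name and the
round-up are illustrative, not the actual lvm2 code):

    #include <stdint.h>

    /* Illustrative only: add 1% (rounded up) to the computed extension,
     * since extending the thin-pool metadata itself consumes space. */
    static uint64_t _compensate_extension(uint64_t computed_sectors)
    {
            return computed_sectors + (computed_sectors + 99) / 100;
    }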
Correct the rounding rules for percentage evaluation.
Validate the supported range of the percentage
(although ranges are already validated earlier on the code path).
Enable compilation of vdo and writecache support as internally
supported segment types by default.
To disable them, use:
--with-vdo=none
--with-writecache=none
This is probably a somewhat experimental patch - but when e.g. a raid device
is just extended, there should not be a technical need for a flush,
unless the target strictly needs it. It should allow faster
processing of lvm commands by not being blocked by a possibly longer flush.
Since we do not support rimage & rmeta for snapshots - we can
avoid querying for -cow devices and add them as origin_only -
since their snapshots (-cow) could never have existed.
This reduces several ioctl operations during table preloading.
Switch the remaining zero-sized structs to flexible arrays to be C99
compliant.
These simple rules should apply (sketched in the example below):
- The incomplete array type must be the last element within the structure.
- There cannot be an array of structures that contain a flexible array member.
- Structures that contain a flexible array member cannot be used as a member of another structure.
- The structure must contain at least one named member in addition to the flexible array member.
Although some of the code pieces should still be improved.
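The conversion in question, sketched on a hypothetical struct
(the names are illustrative, not taken from the lvm2 tree):

    #include <stdlib.h>
    #include <string.h>

    /* Before: pre-C99 GNU extension with a zero-sized trailing array. */
    struct name_old {
            unsigned count;
            char name[0];           /* not standard C */
    };

    /* After: C99 flexible array member - it has no size, must be the
     * last member, and the struct must have at least one other named
     * member, matching the rules above. */
    struct name {
            unsigned count;
            char name[];
    };

    static struct name *_name_alloc(const char *s)
    {
            size_t len = strlen(s) + 1;
            /* sizeof(struct name) does not count the flexible member. */
            struct name *n = malloc(sizeof(*n) + len);

            if (!n)
                    return NULL;
            n->count = 1;
            memcpy(n->name, s, len);
            return n;
    }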
merge.c:_check_lv_segment() was checking regionsize vs. mirrored LV size on
any 'mirror/raid1/raid10' segment type including type 'mirrored' mirror logs.
Avoid the check only for 'mirrored' mirror logs to allow conversion from log
type 'disk' with regionsize > mirror log SubLV size.
As we disabled support for 'mirrored' mirror logs with
commit e82303fd6a, which still conditionally
allows enabling it via global/support_mirrored_mirror_logs=1,
this patch is mandatory for all distributions.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1712983
To avoid pollution of metadata with 'garbage' content, or possibly
a leak of stale data in case a user wants to upload metadata somewhere,
ensure the metadata device is fully zeroed upon allocation.
This behaviour may slow down allocation of a thin-pool or cache-pool a bit,
so the old behaviour can be restored with the lvm.conf setting:
allocation/zero_metadata=0
TODO: add zeroing for extension of the metadata volume.
A failure in wiping/zeroing stops the command.
If the user wants to avoid command abortion, they should use -Zn or -Wn
to avoid wiping.
Note: there is no easy way to distinguish which kind of failure has
happened - so it's safest not to proceed any further.
When a larger write request was initiated, it could happen that bcache
ran out of free chunks - fix the loop that is supposed to wait
until the next free chunk becomes available again.
Fix the annoying kernel message reported:
device-mapper: cache: 253:2: metadata operation 'dm_cache_commit' failed: error = -5
which has been reported while a cachevol has been removed.
It happened via a confusing variable - so switch the variable to the commonly
used '_size', which presents a value in sector units, and avoid 'scaling' this
as an extent length by the VG extent size when placing the 'error' target on
the removal path.
The patch shouldn't have an impact on actual user data, since at this moment
of removal all data should have already been flushed to the origin device.
The kernel MD runtime requires the region size to be larger than the stripe
size on striped raid layouts, thus the dm-raid target's constructor rejects
such requests.
This causes e.g. 'lvcreate --type raid10 -i3 -I4096 -R2048 -n lv vg' to fail.
Avoid failing late in the kernel by enforcing the region size to be
larger than or equal to the stripe size.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1698225
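A hedged sketch of such an early check (illustrative names and message,
not the actual lvm2 function):

    #include <stdint.h>
    #include <stdio.h>

    /* Illustrative: raise region_size to stripe_size up front, instead
     * of letting the dm-raid constructor reject the table in the kernel. */
    static uint32_t _adjusted_region_size(uint32_t region_size,
                                          uint32_t stripe_size)
    {
            if (region_size < stripe_size) {
                    fprintf(stderr, "Increasing region size to stripe "
                            "size %u sectors.\n", stripe_size);
                    return stripe_size;
            }

            return region_size;
    }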
When an LV gets cached and uses a cache-pool - such a cache-pool
will now get the _cpool suffix automatically.
Thus the 'Pool' column for a cached LV will now show either a _cvol
or a _cpool LV.
Since the code is using -cdata and -cmeta UUID suffixes, it does not need
any new 'extra' ID to be generated and stored in metadata.
Since the introduction of the new 'segtype' cache+CACHE_USES_CACHEVOL, we can
safely assume a 'new' cache with a cachevol will now be created
without extra metadata_id and data_id in metadata.
For backward compatibility, the code still reads them in case an older
version of the metadata has them - so it should still be able
to activate such volumes.
A bonus is the lowered size of the lv structure used to store info about an LV
(noticeable with big volume groups).
Enhance activation of cached devices using a cachevol.
Correctly instantiate cachevol -cdata & -cmeta devices with
'-' in the name (as they are only layered devices).
The code is also a bit more compact (although still not ideal,
as the usage of extra UUIDs stored in metadata is troublesome
and will be repaired later).
NOTE: this patch may bring a potentially minor incompatibility for a 'runtime' upgrade.
For wiping, we activate and clear 'regular' devices,
since in case of whole-process interruption (e.g. kill -9)
we leave the metadata & DM table in a workable state all the time.
Enhance the 'activation' experience for a VDO pool to more closely match
what happens for thin-pools, where we use a 'fake' LV to keep the pool
running even when no thinLVs are active. This gives the user a choice
whether they want to keep the thin-pool running (without a possibly lengthy
activation/deactivation process).
As we do plan to support multiple VDO LVs mapped into a single VDO,
we want to give the user the same experience and 'use-pattern' as with thin-pools.
This patch gives the option to activate only the VDO pool, without activating
the VDO LV.
Also, due to the 'fake' layering LV, we can protect the VDO pool from
commands like 'mkfs' which require exclusive access to the volume,
which is no longer possible.
Note: a VDO pool contains 1024 initial sectors as an 'empty' header - such
a header is also exposed in the layered LV (as a read-only LV).
For blkid we are identified as an LV with a UUID suffix - thus a private DM
device of lvm2 - so we do not need to store any extra info in this
header space (aka zero is good enough).
When lvm2 is activating a layered pool LV (to basically keep the pool
opened; the other function used to be 'locking', i.e. staying in sync with
the DM table), use this LV in read-only mode - this prevents 'write' access
into the data volume content of the thin-pool.
Note: since an EMPTY/unused thin-pool is created as a 'public LV' for generic
use by any user who e.g. wishes to maintain the thin-pool and thins themselves,
at this moment the thin-pool appears as a writable LV. As soon as the first
thinLV is created, the layer volume will appear as a 'read-only' LV from that moment on.
Support internal removal of a 'cache origin' volume - which we
do not normally expose to a user - however internal processing
loops may hit this condition (depending on the order of listed LVs).
So when this operation is internally requested - we automatically
try to remove its 'holding' LV (the cache LV) - which will also
remove the origin.
Resuming an 'error' table entry followed by its direct removal
is now troublesome with the latest udev, as it may skip processing of
udev rules for already 'dropped' device nodes.
As we cannot 'synchronize' with udev while we know we have devices
in a suspended state - rework 'cleanup' so it collects nodes
for removal into a pending_delete list and processes the list with
synchronization once we are without any suspended nodes.
Between 'resume' and 'remove' we need to wait for udev to synchronize,
otherwise udev may 'skip' resume event processing if the udev node
is already gone.
When pvmove is finished, we do a tricky operation, since we try to
resume multiple different devices that were all joined into one big tree.
Currently we use the information from the existing live DM table,
where we can get the list of all holders of the pvmove device.
We look for these nodes (by uuid) in the new metadata, and we now do a full
regular device add into the dm tree structure. All devices should
already be PRELOADED with the correct table before entering the suspend state;
however, for correctly working readahead we also need to put the correct info
into the RESUME tree. Since the tables are preloaded, the same table
is skipped on resume, but the correct readahead is now set.
The OPTIONS+="event_timeout" rule is unsupported since systemd/udev version 216,
that is, ~5 years ago.
Since systemd/udev version 243, there's a new message printed if an unsupported
OPTIONS value is used:
Invalid value for OPTIONS key, ignoring: 'event_timeout=180'
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1740666
A recent kernel version, from kernel commit:
de7180ff908b2bc0342e832dbdaa9a5f1ecaa33a
started to report a new flag in the cache status line:
no_discard_passdown
Whenever lvm spots an unknown status it reports:
Unknown feature in status:
So add recognition of this feature flag and also report it with
'lvs -o+kernel_discards'.
When no_discard_passdown is found in the status, 'nopassdown' gets reported
for this field (roughly matching what we report for thin-pools).
To avoid a tiny race between checking the arrival of a signal and entering
select() (which can later remain stuck as the signal was already delivered),
switch to using pselect().
If needed, we can eventually add extra code for older systems
without pselect(), but there are probably no such ancient systems in
use.
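For illustration, a self-contained sketch of the race-free pattern:
the signal stays blocked while the flag is tested, and pselect()
unblocks it atomically only for the duration of the wait:

    #include <errno.h>
    #include <signal.h>
    #include <sys/select.h>
    #include <unistd.h>

    static volatile sig_atomic_t _caught;

    static void _handler(int sig) { (void) sig; _caught = 1; }

    int main(void)
    {
            sigset_t set, orig;
            fd_set rfds;
            struct sigaction sa = { .sa_handler = _handler };

            sigemptyset(&sa.sa_mask);
            sigaction(SIGINT, &sa, NULL);

            /* Keep SIGINT blocked while the flag is tested... */
            sigemptyset(&set);
            sigaddset(&set, SIGINT);
            sigprocmask(SIG_BLOCK, &set, &orig);

            while (!_caught) {  /* no window: the signal cannot sneak in here */
                    FD_ZERO(&rfds);
                    FD_SET(STDIN_FILENO, &rfds);

                    /* ...and unblock it atomically only while waiting.
                     * With plain select() a signal delivered between the
                     * flag check and the call would leave us stuck until
                     * the next fd event. */
                    if (pselect(STDIN_FILENO + 1, &rfds, NULL, NULL,
                                NULL, &orig) < 0) {
                            if (errno == EINTR)
                                    continue; /* loop re-checks the flag */
                            break;
                    }
                    /* ... handle ready file descriptors ... */
            }

            sigprocmask(SIG_SETMASK, &orig, NULL);
            return 0;
    }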
We already used Conflicts=shutdown.target to stop LVM2 services on shutdown.
But we still missed the ordering - shutdown.target should be reached
only after all the services are really stopped.
Reported here: https://github.com/lvmteam/lvm2/issues/17
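For illustration, the relevant unit directives (a sketch, not a specific
lvm2 unit file): Conflicts= alone only makes the stop job part of the
shutdown transaction, while Before=shutdown.target adds the missing
ordering, so the service's stop job completes before shutdown.target
is reached:

    [Unit]
    # Stop this service when shutdown.target is started...
    Conflicts=shutdown.target
    # ...and make sure shutdown.target is reached only after
    # this service has fully stopped.
    Before=shutdown.target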
When the data are growing, adapt also the size of the metadata.
We get way too many reports from users doing huge growths of the
data portion while keeping the metadata small and avoiding the use of monitoring.
So to enhance the user experience, in case the user requests growth of the
thin-pool (without passing a PV list for the growth), lvm2 will automatically
grow also the metadata part of the thin-pool (if possible).
Udev runs udev-rule actions upon 'resume'.
However, in a special case lvm2 replaces a
'soon-to-be-removed' device with an 'error' target for resuming,
and the actual removal then follows - the sequence is usually quick,
so when udev starts the action - it can result in a 'strange' error
message in the kernel log like:
Process '/usr/sbin/dmsetup info -j 253 -m 17 -c --nameprefixes --noheadings --rows -o name,uuid,suspended' failed with exit code 1.
To avoid this - we need to ensure there is a synchronization wait for udev
between the 'resume' and 'remove' parts of this process.
However, the existing code placed a strict requirement to avoid synchronizing
with udev inside a critical section - but this originally came from the
requirement not to do anything special while there could be devices in
a suspended state. Now we are able to see the difference between a critical
section with and without suspended devices. For udev synchronization, only
suspended devices are prohibited from being present - so slightly relax the
condition and allow calling and using 'fs_sync()' even inside a critical
section - as long as there is no suspended device.
Allow using caching with VDO.
A user can either cache a single vdopool or
a vdo LV - the difference is where the caching is put in; it depends on the
use-case, and it's up to the user to decide which kind of speed is expected.
Internal detection of a SCSI device being in-use by DM mpath had been
performed several times for each component device - this could
eventually be racy - so instead we now remember the first checked result
for a device being mpath and use it consistently over the filter's runtime.
Move DM usage into the dev_manager.c source file.
Also convert the STATUS to an INFO ioctl - as that's enough
to obtain the UUID - this also avoids issuing an unwanted flush on the DM
device being checked for mpath.
Whenever the thin-pool chunk size is unspecified and left for lvm to calculate,
try to select the size as the nearest higher power-of-2 instead of
just a multiple of 64KiB.
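A minimal sketch of such a selection (illustrative, not the actual
lvm2 code):

    #include <stdint.h>

    /* Illustrative: round a computed chunk size up to the nearest
     * power of two, instead of only to the next multiple of 64KiB. */
    static uint64_t _next_pow2(uint64_t n)
    {
            uint64_t p = 1;

            while (p < n)
                    p <<= 1;

            return p;
    }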
Now with the newer VDO kvdo target we can start to use the standard mechanism
to enable resizing of VDO volumes.
A VDO pool can be grown.
A virtual volume grows on top of the VDO pool when it is not big enough.
A reduced VDO LV calls discard for the reduced areas - this can
take a long time!
TODO: implement some pollable mechanism for out-of-lock TRIM.
When using 'lvcreate -l100%VG' and there is a big disproportion between
the real available space and the requested setting - automatically fall back
to 100%FREE.
The difference can be seen when the VG is big and most space was already
allocated, so the requested 100%VG can end up (and by the spec for the % modifier
it's correct) as an LV with a size of 1%VG. Usually this is not a big
problem - but in some cases - like cache-pool allocation - this
can result in a big difference for chunksize selection.
With this patch it more closely matches common-sense logic, without
the need for reiteration of too big changes in the lvm2 core ATM.
TODO: in the future there should be an allocator solving all allocations
in a single call.
When using caches with a BIG pool size (>TB), a relatively huge chunk
size is required. Once the chunk size got over 1MiB - the kernel cache
target stopped writing such chunks back if the migration_threshold
remained at its default 1MiB (2048 sectors) size.
This patch ensures the DM layer will not let pass a table line whose
migration threshold is not big enough to let pass
at least 8 chunks (independently of lvm2 metadata).
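A sketch of the described constraint (values in 512-byte sectors;
the function name is illustrative):

    #include <stdint.h>

    /* Illustrative: ensure migration_threshold lets at least 8 cache
     * chunks pass; e.g. with 2MiB chunks (4096 sectors) the default of
     * 2048 sectors would be raised to 32768 sectors (16MiB). */
    static uint64_t _min_migration_threshold(uint64_t threshold,
                                             uint64_t chunk_sectors)
    {
            if (threshold < 8 * chunk_sectors)
                    threshold = 8 * chunk_sectors;

            return threshold;
    }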
lvm uses the 'minimum_io_size' name to exactly match the VDO naming here;
however, in all common cases _size is using the 'sector/512b' unit.
But in this case the value is in bytes and can have only 2 values:
either 512 or 4096.
It's probably not worth renaming it internally, so we can just
drop the comment - instead of using 1 or 8.
Though let's think about it...
The systemd generators are executed very early during the switch
from the initramfs to the system partition, and syslog is not yet fully
operational - it may cause blocking if some debug logging is enabled
at the same time in the /etc/lvm/lvm.conf log{} section.
To avoid timing out and killing this generator - rather enhance the lvm
code to suppress any syslog communication when the LVM_SUPPRESS_SYSLOG
envvar is set.
Use of this envvar is needed since the parsing of e.g. cmdline options
that could eventually override the lvm.conf setting happens in this case
way too late, and a number of lines could have already been streamed to
syslog.
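A minimal sketch of the check (the helper name is hypothetical;
LVM_SUPPRESS_SYSLOG is the envvar introduced above):

    #include <stdlib.h>

    /* Illustrative: decide early whether syslog may be used at all -
     * before lvm.conf or command-line parsing, which happens too late
     * for a generator. */
    static int _syslog_allowed(void)
    {
            return !getenv("LVM_SUPPRESS_SYSLOG");
    }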
Fix a scenario where the global/event_activation setting is not found. In
this case we need to take the default value, just like lvm tools do when
executed. So use "lvmconfig --type full".
Also, if we fail to execute lvmconfig for whatever reason, fall back to
generating the activation units as a failsafe action.
Reported by: Bastian Blank <waldi debian org>