IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Names matching internal code layout.
Functionc in thin_manip.c uses thin_pool in its name.
Keep 'pool' only for function working for both cache and thin pools.
No change of functionality.
Allow to use --vdosettings with lvcreate,lvconvert,lvchange.
Support settings currenly only configurable via lvm.conf.
With lvchange we require inactivate LV for changes to be applied.
Settings block_map_era_length has supported alias block_map_period.
Recent commit 84bd394cf9
"writecache: use block size 4096 when no fs is found"
failed to account for the case where writecache is attached
to thin pool data. Checking fs block size on the thin pool
data LV is wrong, and checking the fs block on each thin LV
would be impractical, so default to 512 which cannot break
any existing file systems, and require the user to specify
4k when appropriate.
Add profilable configurable setting for vdo pool header size, that is
used as 'extra' empty space at the front and end of vdo-pool device
to avoid having a disk in the system the may have same data is real
vdo LV.
For some conversion cases however we may need to allow using '0' header size.
TODO: in this case we may eventually avoid adding 'linear' mapping layer
in future - but this requires further modification over lvm code base.
Previously there have been necessary explicit call of backup (often
either forgotten or over-used). With this patch the necessity to
store backup is remember at vg_commit and once the VG is unlocked,
the committed metadata are automatically store in backup file.
This may possibly alter some printed messages from command when the
backup is now taken later.
Instead of calling explicit archive with command processing logic,
move this step towards 1st. vg_write() call, which will automatically
store archive of committed metadata.
This slightly changes some error path where the error in archiving
was detected earlier in the command, while now some on going command
'actions' might have been, but will be simply scratched in case
of error (since even new metadata would not have been even written).
So general effect should be only some command message ordering.
This reverts commit 8e7690b798.
Actully this was bad idea - to make it on pair.
-Zn for thin-pools is already used - so here user must have
create new pool and swap existing thin-pool metadata into.
So reverting this commit to avoid any possible regression.
Since lvm does support external users of thin-pool when thin devices
are managed outside it can be useful to support conversion to
thin pool from data and metadata LV without zeroing.
TransactionID will be 0 in lvm2 metadata.
lvconvert -Zn --thinpool vg/data --poolmetadata vg/meta
User use 'lvconvert -Zn --type vdo-pool' to convert an existing
vdo formated volume and skip lvm2 internal formating.
This however requires user is passing proper matching parameters.
For them user can use --profile|--metadataprofile option whos
support has been also enhanced.
TODO: add support to read values directly from formated volume.
When converting an existing LV to thin-pool,
user may now pass also '--errorwhenfull' option
like with 'lvcreate'.
Also recalculate chunksize when performace profile is
used with conversion (again matching lvcreate).
Adds missing flagging for uncropped metadata sizes.
A cachevol can be forcibly detached when it's missing devices.
Also allow this if it's damaged/invalid and unrepairable.
This would be needed to recover data from the origin LV after
a cachevol is lost or damaged beyond repair.
In cases where lvconvert does not detect a fs block size on the
device, it falls back to choosing a writecache block size based
on the device's LBS and PBS (tries to match those.)
If the user specifies a writecache block size on the command
line (--cachesettings block_size=4096|512), lvconvert currently
fails and reports an error if the user-specified value does not
match the value lvconvert would have chosen based on LBS and PBS.
The purpose of allowing a user-specified value on the command line
is to override what lvconvert would otherwise do, so change this
to just print a warning that the user value does not match the
value that would be chosen based on the LBS/PBS, and then take
the user-specified value as the writecache block size.
Use update_pool_metadata_min_max() which is shared with
thin-pool metadata min-max updating.
Gives improved messages when converting volumes to metadata.
Initial support for thin-pool used slightly smaller max size 15.81GiB
for thin-pool metadata. However the real limit later settled at 15.88GiB
(difference is ~64MiB - 16448 4K blocks).
lvm2 could not simply increase the size as it has been using hard cropping
of the loaded metadata device to avoid warnings printing warning of kernel
when the size was bigger (i.e. due to bigger extent_size).
This patch adds the new lvm.conf configurable setting:
allocation/thin_pool_crop_metadata
which defaults to 0 -> no crop of metadata beyond 15.81GiB.
Only user with these sizes of metadata will be affected.
Without cropping lvm2 now limits metadata allocation size to 15.88GiB.
Any space beyond is currently not used by thin-pool target.
Even if i.e. bigger LV is used for metadata via lvconvert,
or allocated bigger because of to large extent size.
With cropping enabled (=1) lvm2 preserves the old limitation
15.81GiB and should allow to work in the evironement with
older lvm2 tools (i.e. older distribution).
Thin-pool metadata with size bigger then 15.81G is now using CROP_METADATA
flag within lvm2 metadata, so older lvm2 recognizes an
incompatible thin-pool and cannot activate such pool!
Users should use uncropped version as it is not suffering
from various issues between thin_repair results and allocated
metadata LV as thin_repair limit is 15.88GiB
Users should use cropping only when really needed!
Patch also better handles resize of thin-pool metadata and prevents resize
beoyond usable size 15.88GiB. Resize beyond 15.81GiB automatically
switches pool to no-crop version. Even with existing bigger thin-pool
metadata command 'lvextend -l+1 vg/pool_tmeta' does the change.
Patch gives better controls 'coverted' metadata LV and
reports less confusing message during conversion.
Patch set also moves the code for updating min/max into pool_manip.c
for better sharing with cache_pool code.
When detaching writecache, make the first stage send a message
to dm-writecache to set the cleaner option. This is instead of
reloading the dm table with the cleaner option set. Reloading
the table causes udev to process/probe the dm dev, which gets
stalled because of the writeback activity, and the stalled udev
in turn stalls the lvconvert command when it tries to sync with
udev events.
When getting writecache status we do not need to get
open_count or read_head info, which can cause extra steps.
Fix the two-step writecache detach in commit c32d7fed4f.
In the case of uncache, the cachevol is removed after
detaching the writecache. When the detach is finished
in the second step, the remove must wait until then.
When detaching a writecache, use the cleaner setting
by default to writeback data prior to suspending the
lv to detach the writecache. This avoids potentially
blocking for a long period with the device suspended.
Detaching a writecache first sets the cleaner option, waits
for a short period of time (less than a second), and checks
if the writecache has quickly become clean. If so, the
writecache is detached immediately. This optimizes the case
where little writeback is needed.
If the writecache does not quickly become clean, then the
detach command leaves the writecache attached with the
cleaner option set. This leaves the LV in the same state
as if the user had set the cleaner option directly with
lvchange --cachesettings cleaner=1 LV.
After leaving the LV with the cleaner option set, the
detach command will wait and watch the writeback progress,
and will finally detach the writecache when the writeback
is finished. The detach command does not need to wait
during the writeback phase, and can be canceled, in which
case the LV will remain with the writecache attached and
the cleaner option set. When the user runs the detach
command again it will complete the detach.
To detach a writecache directly, without using the cleaner
step (which has been the approach previously), add the
option --cachesettings cleaner=0 to the detach command.
Cow may not be a COW type, the return value of origin_from_cow(cow) may be NULL.
Reported-by: Wu Guanghao <wuguanghao3@huawei.com>
Reported-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Cow may not be a snapshot type, the return value of origin_from_cow(cow) may be NULL
Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com>
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Use '0' for error and '1' as success.
Also drop INTERNAL_ERROR from path - as this error
is ATM used for invalid devices.
(i.e. test lvconvert-raid1-split-trackchanges.sh)
Since we declare dev_name in lib/device/device.h
and pvs in commands.h
rename local dev_name to device_name
and pvs to pvs_list to prevent shadowing warning.
m
When converting volume to pool LV use also wiping of other signatures.
For writecache & pool conversion support --yet and --force
to bypass prompting for signature wiping.
For writecache drop unneded zero_sectors.
Note: currently we have lvconvert doing convertion and prompting
for confirmation of conversion - and then again wipe_lv() prompts
for removing i.e. filesystem signature - we should unify this
prompting into 1 message - althought the 'filesystem' discovery
needs active volume - while the 1st. conversion prompt can
work without active converted volume.
To create a new cache or writecache LV with a single command:
lvcreate --type cache|writecache
-n Name -L Size --cachedevice PVfast VG [PVslow ...]
- A new main linear|striped LV is created as usual, using the
specified -n Name and -L Size, and using the optionally
specified PVslow devices.
- Then, a new cachevol LV is created internally, using PVfast
specified by the cachedevice option.
- Then, the cachevol is attached to the main LV, converting the
main LV to type cache|writecache.
Include --cachesize Size to specify the size of cache|writecache
to create from the specified --cachedevice PVs, otherwise the
entire cachedevice PV is used. The --cachedevice option can be
repeated to create the cache from multiple devices, or the
cachedevice option can contain a tag name specifying a set of PVs
to allocate the cache from.
To create a new cache or writecache LV with a single command
using an existing cachevol LV:
lvcreate --type cache|writecache
-n Name -L Size --cachevol LVfast VG [PVslow ...]
- A new main linear|striped LV is created as usual, using the
specified -n Name and -L Size, and using the optionally
specified PVslow devices.
- Then, the cachevol LVfast is attached to the main LV, converting
the main LV to type cache|writecache.
In cases where more advanced types (for the main LV or cachevol LV)
are needed, they should be created independently and then combined
with lvconvert.
Example
-------
user creates a new VG with one slow device and one fast device:
$ vgcreate vg /dev/slow1 /dev/fast1
user creates a new 8G main LV on /dev/slow1 that uses all of
/dev/fast1 as a writecache:
$ lvcreate --type writecache --cachedevice /dev/fast1
-n main -L 8G vg /dev/slow1
Example
-------
user creates a new VG with two slow devs and two fast devs:
$ vgcreate vg /dev/slow1 /dev/slow2 /dev/fast1 /dev/fast2
user creates a new 8G main LV on /dev/slow1 and /dev/slow2
that uses all of /dev/fast1 and /dev/fast2 as a writecache:
$ lvcreate --type writecache --cachedevice /dev/fast1 --cachedevice /dev/fast2
-n main -L 8G vg /dev/slow1 /dev/slow2
Example
-------
A user has several slow devices and several fast devices in their VG,
the slow devs have tag @slow, the fast devs have tag @fast.
user creates a new 8G main LV on the slow devs with a
2G writecache on the fast devs:
$ lvcreate --type writecache -n main -L 8G
--cachedevice @fast --cachesize 2G vg @slow
To add a cache or writecache to a main LV with a single command:
lvconvert --type cache|writecache --cachedevice /dev/ssd vg/main
A cachevol LV will be allocated from the specified cache device,
then attached to the main LV. Include --cachesize to specify the
size of cachevol to create, otherwise the entire cachedevice is
used. The cachedevice option can be repeated to create a cachevol
from multiple devices.
Example
-------
A user has an existing main LV that they want to speed up
using a new ssd.
user adds the new ssd to the VG:
$ vgextend vg /dev/ssd
user attaches the new ssd their main LV:
$ lvconvert --type writecache --cachedevice /dev/ssd vg/main
Example
-------
A user has two existing main LVs that they want to speed up
with a new ssd.
user adds the new 16G ssd to the VG:
$ vgextend vg /dev/ssd
user attaches some of the new ssd to the first main LV,
using half of the space:
$ lvconvert --type writecache --cachedevice /dev/ssd
--cachesize 8G vg/main1
user attaches some of the new ssd to the second main LV,
using the other half of the space:
$ lvconvert --type writecache --cachedevice /dev/ssd
--cachesize 8G vg/main2
Example
-------
A user has an existing main LV that they want to speed up using
two new ssds.
user adds the new two ssds the VG:
$ vgextend vg /dev/ssd1
$ vgextend vg /dev/ssd2
user attaches both ssds their main LV:
$ lvconvert --type writecache
--cachedevice /dev/ssd1 --cachedevice /dev/ssd2 vg/main
Use libblkid to detect sector/block size of the fs on the LV.
Use this to choose a compatible writecache block size.
Enable attaching writecache to an active LV.
When lvconvert is used to remove raid images, we can
skip calling lv_add_integrity_to_raid(), which finds
nothing to do, but the the blocksize validation would
be called unnecessarily and trigger spurious errors.
dm-integrity stores checksums of the data written to an
LV, and returns an error if data read from the LV does
not match the previously saved checksum. When used on
raid images, dm-raid will correct the error by reading
the block from another image, and the device user sees
no error. The integrity metadata (checksums) are stored
on an internal LV allocated by lvm for each linear image.
The internal LV is allocated on the same PV as the image.
Create a raid LV with an integrity layer over each
raid image (for raid levels 1,4,5,6,10):
lvcreate --type raidN --raidintegrity y [options]
Add an integrity layer to images of an existing raid LV:
lvconvert --raidintegrity y LV
Remove the integrity layer from images of a raid LV:
lvconvert --raidintegrity n LV
Settings
Use --raidintegritymode journal|bitmap (journal is default)
to configure the method used by dm-integrity to ensure
crash consistency.
Initialization
When integrity is added to an LV, the kernel needs to
initialize the integrity metadata/checksums for all blocks
in the LV. The data corruption checking performed by
dm-integrity will only operate on areas of the LV that
are already initialized. The progress of integrity
initialization is reported by the "syncpercent" LV
reporting field (and under the Cpy%Sync lvs column.)
Example: create a raid1 LV with integrity:
$ lvcreate --type raid1 -m1 --raidintegrity y -n rr -L1G foo
Creating integrity metadata LV rr_rimage_0_imeta with size 12.00 MiB.
Logical volume "rr_rimage_0_imeta" created.
Creating integrity metadata LV rr_rimage_1_imeta with size 12.00 MiB.
Logical volume "rr_rimage_1_imeta" created.
Logical volume "rr" created.
$ lvs -a foo
LV VG Attr LSize Origin Cpy%Sync
rr foo rwi-a-r--- 1.00g 4.93
[rr_rimage_0] foo gwi-aor--- 1.00g [rr_rimage_0_iorig] 41.02
[rr_rimage_0_imeta] foo ewi-ao---- 12.00m
[rr_rimage_0_iorig] foo -wi-ao---- 1.00g
[rr_rimage_1] foo gwi-aor--- 1.00g [rr_rimage_1_iorig] 39.45
[rr_rimage_1_imeta] foo ewi-ao---- 12.00m
[rr_rimage_1_iorig] foo -wi-ao---- 1.00g
[rr_rmeta_0] foo ewi-aor--- 4.00m
[rr_rmeta_1] foo ewi-aor--- 4.00m
lvm2 supports thin-pool to be later used by other tools doing
virtual volumes themself (i.e. docker) - in this case we
shall not validate transaction Id - is this is used by
other tools and lvm2 keeps value 0 - so the transationId
validation need to be skipped in this case.
Prevent attaching writecache to an active LV until
we can determine the block size of the fs on the LV,
and use that to enforce an appropriate writecache
block size. Changing the block size under a mounted
fs can cause panic/corruption.
When a cachevol LV is attached, have the LV keep it's lock
allocated. The lock on the cachevol won't be used while
it's attached. When the cachevol is split a new lock does
not need to be allocated. (Applies to cachevol usage by
both dm-cache and dm-writecache.)
When LV gets cached and uses cache-pool - such cache-pool
will now get _cpool suffix automatically.
Thus 'Pool' column for cached LV will now show either _cvol
or _cpool LV.
Before 'archive()' is called, lvm2 must not touch/modify metadata.
So move setting CACHE_VOL related flags past this point.
Also make sure reading of cache segtype always restores this
flag properly (even if compatible flag would be lost).
When an LV is used as a writecache cachevol, give
it the LV name a _cvol suffix. Remove the suffix
when the cachevol is detached, restoring the
original LV name.
A cachevol LV had the CACHE_VOL status flag in metadata,
and the cache LV using it had no new flag. This caused
problems if the new metadata was used by an old version
of lvm. An old version of lvm would have two problems
processing the new metadata:
. The old lvm would return an error when reading the VG
metadata when it saw the unknown CACHE_VOL status flag.
. The old lvm would return an error when reading the VG
metadata because it would not find an expected cache pool
attached to the cache LV (since the cache LV had a
cachevol attached instead.)
Change the use of flags:
. Change the CACHE_VOL flag to be a COMPATIBLE flag (instead
of a STATUS flag) so that old versions will not fail when
they see it.
. When a cache LV is using a cachevol, the cache LV gets
a new SEGTYPE flag CACHE_USES_CACHEVOL. This flag is
appended to the segtype name, so that old lvm versions
will fail to use the LV because of an unknown segtype,
as opposed to failing to read the VG.
Instead of using 'noflush' option, switch cache_mode into WRITETHROUGH
which does not require flushing, when user confirmed he does not
want flushing for WRITEBACK (because of (partially) missing caching PV)
For wiping we activate and clear 'regular' devices,
since in case of whole process interuption (i.e. kill -9)
we leave metadata & DM table and workable state all the time.