IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
When repairing thin pool or swapping thin pool metadata,
preserve chunk_size property and avoid to be automatically changed
later in the code to better match thin pool metadata size.
When raid leg is extracted, now the preload code handles this state
correctly and put proper new table entry into dm tree,
so the activation of extracted leg and removed metadata works
after commit.
It's not an error if the device is filtered out and hence cleared from
lvmetad cache - "pvscan --cache DevPath" has now the same behaviour in
this case as "pvscan --cache major:minor" (which is more consistent).
Before, the tests expected failure return code for "pvscan --cache DevicePath"
if the device was filtered (which is a different situation if the device
is missing in the system completely!).
Normally, if there are partitions defined on top of device-mapper
device, there should be a device-mapper device created for each
partiton on top of the old one and once the underlying DM device
is used by another devices (partition mappings in this case),
it can't be used as a PV anymore.
However, sometimes, it may happen the partition mappings are
missing - either the partitioning tool is not creating them if
it does not contain full support for device-mapper devices or
the mappings were removed.
Better safe than sorry - check for partition header on DM devs
and filter them out as unsuitable for PVs in case the check is
positive. Whatever the user is doing, let's do our best to prevent
unwanted corruption (...by running pvcreate on top of such device
that would corrupt the partition header).
We have to use empty list, not NULL if we want to denote that the list
has no items. Otherwise, the code further can segfault as it expects
there's always a sane value (= some list), including empty list,
but never NULL.
When we split leg from raid - we take a proper new lock for a new LV.
However for now activation checks only 'existince' of device UUID,
but it's not validating device has a proper name.
As a quick fix call suspend()/resume() to rename after split mirror.
When chunk size needs to be estimated, the code missed to round
to proper 64kb boundaries (or power of 2 for older thin pool driver).
So for some data and metadata size (i.e. 10GB and 4MB) it resulted
in incorrect chunk size (not being a multiple of 64KB)
Fix it by adding proper rounding and also use 1 routine for 2 places
where the same calculation is made.
Fix also incorrect printed warning that has used 'ffs()'
(which returns first 'least significant' bit in word)
and it was not really giving any useful size info and replace it
with properly estimated chunk size.
Make sure there is 'control' node before clvmd is started.
Somehow 'clvmd' is not allowed by selinux to create one.
TODO: Check is selinux policy is right here...
Since we support device stack of pools over pool
(thin-pool with cache data volume) the existing code
is no longer able to detect orphan _pmspare.
So instead do a _pmspare check after volume removal,
and remove spare afterwards.
Fixed syntax parsing means that some commands that used to work are now
failing. Particullary this case:
$ invalid lvcreate -l1 --type thin vg/pool
> Needs to fail becase thin type LV needs --virtualsize
$ invalid lvcreate --type snapshot vg/lv1
> Needs to fail because old-snapshot segment type needs --size
Some reported error messages have been also updated.
If we want to support conversion of VG to clustered type,
we currently need to relock active LV to get proper DLM lock.
So add extra loop after change of VG clustered attribute
to exlusively activate all active top level LVs.
When doing change -cy -> -cn we should validate LVs are not
active on other cluster nodes - we could be sure about this only
when with local exclusive activation - for other types
we require user to deactivate volumes first.
As a workaround for this limitation there is always
locking_type = 0 which amongs other skip the detection
of active LVs.
FIXME:
clvmd should handle looks for cluster locking type all the time.
While we could probably reacquire some type of lock when
going from non-clustered to clustered vg, we don't have any
single road back to drop the lock and keep LV active.
For now keep it safe and prohibit conversion when LV
is active in the VG.
We used to print an error message whenever we tried to deal with devices that
lvmetad knew about but were rejected by a client-side filter. Instead, we now
check whether the device is actually absent or only filtered out and only print
a warning in the latter case.
Commit 5ebff6cc9f seemed to introduce
new 'for' loop but the mode is not yet used.
But the access to /dev dir needs to go through $DM_DEV_DIR
and whole path needs to be in "".
Since the type passed LV is changed and content of data detroyed,
query user with prompt to confirm this operation.
Also add a proper wiping of header.
Using '--yes' will skip this prompt:
lvconvert -s --yes vg/lv vg/lvcow
Commit 33d69162e4 reduced the number of
PVs to a level where the test could not function. (It is impossible
to replace 3 PVs of a 4-way RAID1 LV if there are only 5 PVs.)
When repairing RAID LVs that have multiple PVs per image, allow
replacement images to be reallocated from the PVs that have not
failed in the image if there is sufficient space.
This allows for scenarios where a 2-way RAID1 is spread across 4 PVs,
where each image lives on two PVs but doesn't use the entire space
on any of them. If one PV fails and there is sufficient space on the
remaining PV in the image, the image can be reallocated on just the
remaining PV.
I've changed build_parallel_areas_from_lv to take a new parameter
that allows the caller to build parallel areas by LV vs by segment.
Previously, the function created a list of parallel areas for each
segment in the given LV. When it came time for allocation, the
parallel areas were honored on a segment basis. This was problematic
for RAID because any new RAID image must avoid being placed on any
PVs used by other images in the RAID. For example, if we have a
linear LV that has half its space on one PV and half on another, we
do not want an up-convert to use either of those PVs. It should
especially not wind up with the following, where the first portion
of one LV is paired up with the second portion of the other:
------PV1------- ------PV2-------
[ 2of2 image_1 ] [ 1of2 image_1 ]
[ 1of2 image_0 ] [ 2of2 image_0 ]
---------------- ----------------
Previously, it was possible for this to happen. The change makes
it so that the returned parallel areas list contains one "super"
segment (seg_pvs) with a list of all the PVs from every actual
segment in the given LV and covering the entire logical extent range.
This change allows RAID conversions to function properly when there
are existing images that contain multiple segments that span more
than one PV.
If a RAID LV has images that are spread across more than one PV
and you allocate a new image that requires more than one PV,
parallel_areas is only honored for one segment. This commit
adds a test for this condition.
Fix gcc warnings:
libdm-report.c:1952:5: warning: "end_op_flag_hit" may be used uninitialized in this function [-Wmaybe-uninitialized]
libdm-report.c:2232:28: warning: "custom" may be used uninitialized in this function [-Wmaybe-uninitialized]
And snap_percent is not 0% in dm < 1.10.0 so
don't test comparison with 0% here.
pvmove can be used to move single LVs by name or multiple LVs that
lie within the specified PV range (e.g. /dev/sdb1:0-1000). When
moving more than one LV, the portions of those LVs that are in the
range to be moved are added to a new temporary pvmove LV. The LVs
then point to the range in the pvmove LV, rather than the PV
range.
Example 1:
We have two LVs in this example. After they were
created, the first LV was grown, yeilding two segments
in LV1. So, there are two LVs with a total of three
segments.
Before pvmove:
--------- --------- ---------
| LV1s0 | | LV2s0 | | LV1s1 |
--------- --------- ---------
| | |
-------------------------------------
PV | 000 - 255 | 256 - 511 | 512 - 767 |
-------------------------------------
After pvmove inserts the temporary pvmove LV:
--------- --------- ---------
| LV1s0 | | LV2s0 | | LV1s1 |
--------- --------- ---------
| | |
-------------------------------------
pvmove0 | seg 0 | seg 1 | seg 2 |
-------------------------------------
| | |
-------------------------------------
PV | 000 - 255 | 256 - 511 | 512 - 767 |
-------------------------------------
Each of the affected LV segments now point to a
range of blocks in the pvmove LV, which purposefully
corresponds to the segments moved from the original
LVs into the temporary pvmove LV.
The current implementation goes on from here to mirror the temporary
pvmove LV by segment. Further, as the pvmove LV is activated, only
one of its segments is actually mirrored (i.e. "moving") at a time.
The rest are either complete or not addressed yet. If the pvmove
is aborted, those segments that are completed will remain on the
destination and those that are not yet addressed or in the process
of moving will stay on the source PV. Thus, it is possible to have
a partially completed move - some LVs (or certain segments of LVs)
on the source PV and some on the destination.
Example 2:
What 'example 1' might look if it was half-way
through the move.
--------- --------- ---------
| LV1s0 | | LV2s0 | | LV1s1 |
--------- --------- ---------
| | |
-------------------------------------
pvmove0 | seg 0 | seg 1 | seg 2 |
-------------------------------------
| | |
| -------------------------
source PV | | 256 - 511 | 512 - 767 |
| -------------------------
| ||
-------------------------
dest PV | 000 - 255 | 256 - 511 |
-------------------------
This update allows the user to specify that they would like the
pvmove mirror created "by LV" rather than "by segment". That is,
the pvmove LV becomes an image in an encapsulating mirror along
with the allocated copy image.
Example 3:
A pvmove that is performed "by LV" rather than "by segment".
--------- ---------
| LV1s0 | | LV2s0 |
--------- ---------
| |
-------------------------
pvmove0 | * LV-level mirror * |
-------------------------
/ \
pvmove_mimage0 / pvmove_mimage1
------------------------- -------------------------
| seg 0 | seg 1 | | seg 0 | seg 1 |
------------------------- -------------------------
| | | |
------------------------- -------------------------
| 000 - 255 | 256 - 511 | | 000 - 255 | 256 - 511 |
------------------------- -------------------------
source PV dest PV
The thing that differentiates a pvmove done in this way and a simple
"up-convert" from linear to mirror is the preservation of the
distinct segments. A normal up-convert would simply allocate the
necessary space with no regard for segment boundaries. The pvmove
operation must preserve the segments because they are the critical
boundary between the segments of the LVs being moved. So, when the
pvmove copy image is allocated, all corresponding segments must be
allocated. The code that merges ajoining segments that are part of
the same LV when the metadata is written must also be avoided in
this case. This method of mirroring is unique enough to warrant its
own definitional macro, MIRROR_BY_SEGMENTED_LV. This joins the two
existing macros: MIRROR_BY_SEG (for original pvmove) and MIRROR_BY_LV
(for user created mirrors).
The advantages of performing pvmove in this way is that all of the
LVs affected can be moved together. It is an all-or-nothing approach
that leaves all LV segments on the source PV if the move is aborted.
Additionally, a mirror log can be used (in the future) to provide tracking
of progress; allowing the copy to continue where it left off in the event
there is a deactivation.
With recent changes introduced with the report selection support,
the content of lv_modules field is of string list type (before
it was just string type).
String list elements are always ordered now so update lvcreate-thin
test to expect the elements to be ordered.
When creating a cache LV with a RAID origin, we need to ensure that
the sub-LVs of that origin properly change their names to include
the "_corig" extention of the top-level LV. We do this by first
performing a 'lv_rename_update' before making the call to
'insert_layer_for_lv'.
Add more time for tests, since debug kernels are getting slower...
and we add more and more tests.
However many test should be shortened to avoid testing disk-perfomance
and focus on lvm functionality...
(Often we should probably test with inactive volumes when we check
metadata operation of lvm2)
We may need to support option for 'DEEP' longer testing.
Also something like LVM_TEST_TIMEOUT_FACTOR might be useful
though it would be much better if test suite could approximete
from system perfomance test lenght...
Instead of rereading device list via cat - keep
the list in bash array. (Also solves problem
with spaces in device path)
Move usage of "$path" out of lvm shell usage,
since we don't support such thing there...
When directly corrupting RAID images for the purpose of testing,
we must use direct I/O (or a 'sync' after the 'dd') to ensure that
the writes are not caught in the buffer cache in a way that is not
reachable by the top-level RAID device.
Unsure if this is feature or bug of syncaction,
but it needs to be present physically on the media
and it ignores content of buffer cache...
(maybe lvchange should implicitely fsync all disks
that are members of raid array before starting test??)
Check we know how to handle same UUID
Test currently does NOT work on lvmetad
(or it's unclear it even should - thus test error
is currently lowered to 'test warning')
TODO: replace lib/test with a better shell script name
disable_dev can't use transaction - since it may lead occasionaly to
weird error - example could be nomda-missing.sh test case.
Here occasionaly device instead of being removed was left as
error device and testing different code path (which is unfortunatelly
buggy)
When we want to test 'error' device - 'aux error_dev()' should be used.
Delay udev notification after the point udev transaction
is finished - since otherwise some device may still
be found missing until udev transaction is finished.
While the 'raid1/10' segment types were being handled inadvertently
by '_move_mirrors()', the parity RAIDs were not being properly checked
to ensure that the user had specified all necessary PVs when moving
them. Thus, internal errors were being triggered when only part of
a RAID LV was moved to the new VG. I've added a new function,
'_move_raid()', which properly checks over any affected RAID LVs and
ensures that all the necessary PVs are being moved.
vgsplit of RAID volumes was problematic because the metadata area
and data areas were always on the same PVs. This problem is similar
to one that was just fixed for mirrors that had log and images sharing
a PV (commit 9ac858f). We can now add RAID LVs to the tests for
vgsplit.
Note that there still seems to be an issue when specifying an
incomplete set of PVs when moving RAID LVs.
Given a named mirror LV, vgsplit will look for the PVs that compose it
and move them to a new VG. It does this by first looking at the log
and then the legs. If the log is on the same device as one of the mirror
images, a problem occurs. This is because the PV is moved to the new VG
as the log is processed and thus cannot be found in the current VG when
the image is processed. The solution is to check and see if the PV we are
looking for has already been moved to the new VG. If so, it is not an
error.
Since the kill may take various amount of time,
(especially when running with valgrind)
check it's really pvmoved LV.
Restore initial restart of clvmd - it's currently
broken at various moments - basically killed lvm2
command may leave clvmd and confusing state leading
to reports of internal errors.
TODO:
It seems commit 7e685e6c70
has changed the old logic, when 'pvs device_name' used
to work. (regression from 2.02.104)
Currently put in extra pvscan.
We need to use "--verifyudev" for dmsetup mangle command used in
the name-mangling test since without the --verifyudev, we'd end up
with the failed rename.
Also, add direct check for the dev nodes - node with old name must
be gone and node with new name must be present. Before, we checked
just the output of the command.
One bug popped up here when renaming with udev and libdevmapper
fallback checking the udev when target mangle mode is "none"
(fixme added in the libdevmapper's node rename code).
It's not so easy to recongnize unusable /dev/kmsg
Reorder the code in a way if the first regular read of /dev/kmsg
fail, fallback to klogctl interface.
Call drain_dmesg also for the case there is no user log output.
Add a bit more complexity here - Switch to use /dev/kmsg
which has been introduced in 3.5 kernels and could run without
lossing lines from /proc/kmsg.
On older systems user may set env var LVM_TEST_CAN_CLOBBER_DMESG=1
to get kernel messages via klogctl() call (which deletes dmesg buffer)
otherwise no logging of kernel messages is provided.
Since there could be multiple readers of kmsg (test & journald) it needs
to be fast, to capture things like sysrq trace.
But to capture whole output it would need to prioritize reading of kmsg,
thus we would first log kernel messages and followed by command output.
As a trade-off always log command output first and use large drain
buffer so is captures most of messages, but occasionaly miss some
lines.
Use separate files for raid1, raid456, raid10.
They need different target versions to work, so support
more precise test selection.
Optimize duplicate tests of target avalability and skip
unsupported test cases sooner.
Basically reverts commit af8580d756.
"test: Use klogctl in the harness instead of reading /var/log/messages."
Problem is - this interface clears dmesg buffer
(just like call of dmesg -c)
Thus after running lvm2 test suitedmesg is empty - while all the
messages are usually logged in the journal/message, it's still not nice to
clear dmesg buffer.
It's not a pure revert, but switch to use /proc/kmsg directly instead of
reading /var/log/messages.
In cases where PV appears on a new device without disappearing from an old one
first, the device->pvid pointers could become ambiguous. This could cause the
ambiguous PV to be lost from the cache when a different PV comes up on one of
the ambiguous devices.
This is probably not optimal, but makes the lvmetad case mimic non-lvmetad code
more closely. It also fixes vgremove of a partially corrupt VG with lvmetad, as
_vg_write_raw (and consequently, entire vg_write) currently panics when it
encounters a corrupt MDA. Ideally, we'd be able to explicitly control when it is
safe to ignore them.
%FREE allocation has been broken for RAID. At 100%FREE, there is
still an extent left for certain tests. For now, change the test
to warn rather than completely fail.
Condition was swapped - however since it's been based on 'random'
memory content it's been missed as attribute has not been set.
So now we have quite a few possible results when testing.
We have old status without separate metadata and
we have kernels with fixed snapshot leak bug.
(in-release update)
Code uses target driver version for better estimation of
max size of COW device for snapshot.
The bug can be tested with this script:
VG=vg1
lvremove -f $VG/origin
set -e
lvcreate -L 2143289344b -n origin $VG
lvcreate -n snap -c 8k -L 2304M -s $VG/origin
dd if=/dev/zero of=/dev/$VG/snap bs=1M count=2044 oflag=direct
The bug happens when these two conditions are met
* origin size is divisible by (chunk_size/16) - so that the last
metadata area is filled completely
* the miscalculated snapshot metadata size is divisible by extent size -
so that there is no padding to extent boundary which would otherwise
save us
Signed-off-by:Mikulas Patocka <mpatocka@redhat.com>
While stripe size is twice the physical extent size,
the original code will not reduce stripe size to maximum
(physical extent size).
Signed-off-by: Zhiqing Zhang <zhangzq.fnst@cn.fujitsu.com>
Switch to use ext2 to make it usable on older systems.
Previous test has not been able to catching problem.
Multiple tests are now put in.
FIXME: validate what is doing kernel target when
the header is undeleted and same chunk size is used.
It seems snapshot target successfully resumes and
just complains COW is not big enough:
kernel: dm-8: rw=0, want=40, limit=24
kernel: attempt to access beyond end of device
When chunk size is different it fails instantly.
For checking this with lvm2 and this test case use this patch:
--- a/tools/lvcreate.c
+++ b/tools/lvcreate.c
@@ -769,7 +769,7 @@ static int _read_activation_params(struct
lvcreate_params *lp,
lp->permission = arg_uint_value(cmd, permission_ARG,
LVM_READ | LVM_WRITE);
- if (lp->snapshot) {
+ if (0 && lp->snapshot) {
/* Snapshot has to zero COW header */
lp->zero = 1;
lp->wipe_signatures = 0;
---
and switch to use -c 4 for both snapshots
When read-only snapshot was created, tool was skipping header
initialization of cow device. If it happened device has been
already containing header from some previous snapshot, it's
been 'reused' for a newly created snapshot instead of being cleared.