IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Use suffixes for easier detection of private volumes.
This commit makes older volume UUIDs incompatible and
it most probably needs machine reboot after upgrade.
When creating pool's metadata - create initial LV for clearing with some
generic name and after the volume is create & cleared - rename it to
reserved name '_tmeta/_cmeta'.
We should not expose 'reserved' names for public LVs.
When repairing RAID LVs that have multiple PVs per image, allow
replacement images to be reallocated from the PVs that have not
failed in the image if there is sufficient space.
This allows for scenarios where a 2-way RAID1 is spread across 4 PVs,
where each image lives on two PVs but doesn't use the entire space
on any of them. If one PV fails and there is sufficient space on the
remaining PV in the image, the image can be reallocated on just the
remaining PV.
Previously, the seg_pvs used to track free and allocated space where left
in place after 'release_pv_segment' was called to free space from an LV.
Now, an attempt is made to combine any adjacent seg_pvs that also track
free space. Usually, this doesn't provide much benefit, but in a case
where one command might free some space and then do an allocation, it
can make a difference. One such case is during a repair of a RAID LV,
where one PV of a multi-PV image fails. This new behavior is used when
the replacement image can be allocated from the remaining space of the
PV that did not fail. (First the entire image with the failed PV is
removed. Then the image is reallocated from the remaining PVs.)
I've changed build_parallel_areas_from_lv to take a new parameter
that allows the caller to build parallel areas by LV vs by segment.
Previously, the function created a list of parallel areas for each
segment in the given LV. When it came time for allocation, the
parallel areas were honored on a segment basis. This was problematic
for RAID because any new RAID image must avoid being placed on any
PVs used by other images in the RAID. For example, if we have a
linear LV that has half its space on one PV and half on another, we
do not want an up-convert to use either of those PVs. It should
especially not wind up with the following, where the first portion
of one LV is paired up with the second portion of the other:
------PV1------- ------PV2-------
[ 2of2 image_1 ] [ 1of2 image_1 ]
[ 1of2 image_0 ] [ 2of2 image_0 ]
---------------- ----------------
Previously, it was possible for this to happen. The change makes
it so that the returned parallel areas list contains one "super"
segment (seg_pvs) with a list of all the PVs from every actual
segment in the given LV and covering the entire logical extent range.
This change allows RAID conversions to function properly when there
are existing images that contain multiple segments that span more
than one PV.
...to avoid using cached value (persistent filter) and therefore
not noticing any change made after last scan/filtering - the state
of the device may have changed, for example new signatures added.
$ lvm dumpconfig --type diff
allocation {
use_blkid_wiping=0
}
devices {
obtain_device_list_from_udev=0
}
$ cat /etc/lvm/cache/.cache | grep sda
$ vgscan
Reading all physical volumes. This may take a while...
Found volume group "fedora" using metadata type lvm2
$ cat /etc/lvm/cache/.cache | grep sda
"/dev/sda",
$ parted /dev/sda mklabel gpt
Information: You may need to update /etc/fstab.
$ parted /dev/sda print
Model: QEMU QEMU HARDDISK (scsi)
Disk /dev/sda: 134MB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:
Number Start End Size File system Name Flags
$ cat /etc/lvm/cache/.cache | grep sda
"/dev/sda",
====
Before this patch:
$ pvcreate /dev/sda
Physical volume "/dev/sda" successfully created
With this patch applied:
$ pvcreate /dev/sda
Physical volume /dev/sda not found
Device /dev/sda not found (or ignored by filtering).
Take a local file lock to prevent concurrent activation/deactivation of LVs.
Thin/cache types and an extension for cluster support are excluded for
now.
'lvchange -ay $lv' and 'lvchange -an $lv' should no longer cause trouble
if issued concurrently: the new lock should make sure they
activate/deactivate $lv one-after-the-other, instead of overlapping.
(If anyone wants to experiment with the cluster patch, please get in touch.)
'lvs' would segfault if trying to display the "move pv" if the
pvmove was run with '--atomic'. The structure of an atomic pvmove
is different and requires us to descend another level in the
LV tree to retrieve the PV information.
In 'find_pvmove_lv', separate the code that searches the atomic
pvmove LVs from the code that searches the normal pvmove LVs. This
cleans up the segment iterator code a bit.
replicator/replicator.c:338:2: warning: passing argument 2 of 'build_dm_uuid' from incompatible pointer type [enabled by default]
replicator/replicator.c:629:3: warning: passing argument 2 of 'build_dm_uuid' from incompatible pointer type [enabled by default]
replicator/replicator.c:644:6: warning: passing argument 2 of 'build_dm_uuid' from incompatible pointer type [enabled by default]
replicator/replicator.c:668:7: warning: passing argument 2 of 'build_dm_uuid' from incompatible pointer type [enabled by default]
replicator/replicator.c:677:4: warning: passing argument 2 of 'build_dm_uuid' from incompatible pointer type [enabled by default]
pvmove can be used to move single LVs by name or multiple LVs that
lie within the specified PV range (e.g. /dev/sdb1:0-1000). When
moving more than one LV, the portions of those LVs that are in the
range to be moved are added to a new temporary pvmove LV. The LVs
then point to the range in the pvmove LV, rather than the PV
range.
Example 1:
We have two LVs in this example. After they were
created, the first LV was grown, yeilding two segments
in LV1. So, there are two LVs with a total of three
segments.
Before pvmove:
--------- --------- ---------
| LV1s0 | | LV2s0 | | LV1s1 |
--------- --------- ---------
| | |
-------------------------------------
PV | 000 - 255 | 256 - 511 | 512 - 767 |
-------------------------------------
After pvmove inserts the temporary pvmove LV:
--------- --------- ---------
| LV1s0 | | LV2s0 | | LV1s1 |
--------- --------- ---------
| | |
-------------------------------------
pvmove0 | seg 0 | seg 1 | seg 2 |
-------------------------------------
| | |
-------------------------------------
PV | 000 - 255 | 256 - 511 | 512 - 767 |
-------------------------------------
Each of the affected LV segments now point to a
range of blocks in the pvmove LV, which purposefully
corresponds to the segments moved from the original
LVs into the temporary pvmove LV.
The current implementation goes on from here to mirror the temporary
pvmove LV by segment. Further, as the pvmove LV is activated, only
one of its segments is actually mirrored (i.e. "moving") at a time.
The rest are either complete or not addressed yet. If the pvmove
is aborted, those segments that are completed will remain on the
destination and those that are not yet addressed or in the process
of moving will stay on the source PV. Thus, it is possible to have
a partially completed move - some LVs (or certain segments of LVs)
on the source PV and some on the destination.
Example 2:
What 'example 1' might look if it was half-way
through the move.
--------- --------- ---------
| LV1s0 | | LV2s0 | | LV1s1 |
--------- --------- ---------
| | |
-------------------------------------
pvmove0 | seg 0 | seg 1 | seg 2 |
-------------------------------------
| | |
| -------------------------
source PV | | 256 - 511 | 512 - 767 |
| -------------------------
| ||
-------------------------
dest PV | 000 - 255 | 256 - 511 |
-------------------------
This update allows the user to specify that they would like the
pvmove mirror created "by LV" rather than "by segment". That is,
the pvmove LV becomes an image in an encapsulating mirror along
with the allocated copy image.
Example 3:
A pvmove that is performed "by LV" rather than "by segment".
--------- ---------
| LV1s0 | | LV2s0 |
--------- ---------
| |
-------------------------
pvmove0 | * LV-level mirror * |
-------------------------
/ \
pvmove_mimage0 / pvmove_mimage1
------------------------- -------------------------
| seg 0 | seg 1 | | seg 0 | seg 1 |
------------------------- -------------------------
| | | |
------------------------- -------------------------
| 000 - 255 | 256 - 511 | | 000 - 255 | 256 - 511 |
------------------------- -------------------------
source PV dest PV
The thing that differentiates a pvmove done in this way and a simple
"up-convert" from linear to mirror is the preservation of the
distinct segments. A normal up-convert would simply allocate the
necessary space with no regard for segment boundaries. The pvmove
operation must preserve the segments because they are the critical
boundary between the segments of the LVs being moved. So, when the
pvmove copy image is allocated, all corresponding segments must be
allocated. The code that merges ajoining segments that are part of
the same LV when the metadata is written must also be avoided in
this case. This method of mirroring is unique enough to warrant its
own definitional macro, MIRROR_BY_SEGMENTED_LV. This joins the two
existing macros: MIRROR_BY_SEG (for original pvmove) and MIRROR_BY_LV
(for user created mirrors).
The advantages of performing pvmove in this way is that all of the
LVs affected can be moved together. It is an all-or-nothing approach
that leaves all LV segments on the source PV if the move is aborted.
Additionally, a mirror log can be used (in the future) to provide tracking
of progress; allowing the copy to continue where it left off in the event
there is a deactivation.
The differentiation of the original number field into number, size and
percent field types has been introduced with recent changes for report
selection support.
Make dm_report_init_with_selection to accept an argument with an
array of reserved values where each element contains a triple:
{dm report field type, reserved value, array of strings representing this value}
When the selection is parsed, we always check whether a string
representation of some reserved value is not hit and if it is,
we use the reserved value assigned for this string instead of
trying to parse it as a value of certain field type.
This makes it possible to define selections like:
... --select lv_major=undefined (or -1 or unknown or undef or whatever string representations are registered for this reserved value in the future)
... --select lv_read_ahead=auto
... --select vg_mda_copies=unmanaged
With this, each time the field value of certain type is hit
and when we compare it with the selection, we use the proper
value for comparison.
For now, register these reserved values that are used at the moment
(also more descriptive names are used for the values):
const uint64_t _reserved_number_undef_64 = UINT64_MAX;
const uint64_t _reserved_number_unmanaged_64 = UINT64_MAX - 1;
const uint64_t _reserved_size_auto_64 = UINT64_MAX;
{
{DM_REPORT_FIELD_TYPE_NUMBER, _reserved_number_undef_64, {"-1", "undefined", "undef", "unknown", NULL}},
{DM_REPORT_FIELD_TYPE_NUMBER, _reserved_number_unmanaged_64, {"unmanaged", NULL}},
{DM_REPORT_FIELD_TYPE_SIZE, _reserved_size_auto_64, {"auto", NULL}},
NULL
}
Same reserved value of different field types do not collide.
All arrays are null-terminated.
The list of reserved values is automatically displayed within
selection help output:
Selection operands
------------------
...
Reserved values
---------------
-1, undefined, undef, unknown - Reserved value for undefined numeric value. [number]
unmanaged - Reserved value for unmanaged number of metadata copies in VG. [number]
auto - Reserved value for size that is automatically calculated. [size]
Selection operators
-------------------
...
The {pv,vg,lv,seg}_tags and lv_modules fields are reported as string
lists using the new dm_report_field_string_list - so we just pass
the list to the fn that takes care of reporting and item sorting itself.
The list of strings is used quite frequently and we'd like to reuse
this simple structure for report selection support too. Make it part
of libdevmapper for general reuse throughout the code.
This also simplifies the LVM code a bit since we don't need to
include and manage lvm-types.h anymore (the string list was the
only structure defined there).
This makes it easier to check against the fields (following patches for
report selection) and check whether size units are allowed or not
with the field value.
When creating a cache LV with a RAID origin, we need to ensure that
the sub-LVs of that origin properly change their names to include
the "_corig" extention of the top-level LV. We do this by first
performing a 'lv_rename_update' before making the call to
'insert_layer_for_lv'.
Internal reporting function cannot handle NULL reporting value,
so ensure there is at least dummy label.
So move dummy_lable from tools/reporter.c and use it for all
report_object() calls in lib/report/report.c.
(Fixes RHBZ 1108394)
Simlify lvm_report_object initialization.
Enable 'retry' deactivation also in 'cleanup' phase.
It shouldn't be mostly needed - however udev now produces
more and more completelny non-synchronizable device opens,
so even for orphan devices we can't easily predict where
udevd opens devices.
So it's more preferable here to log error about device being open
and retry clean, but let the command proceed.
And use ifdefs there, not exposing it in the tool code itself.
Later in the future, we should probably make the PIDFILE and
daemon checking code available also in case the daemon itself
is not built.
Accidently it's been commited - but it has also shown,
that on heavy loaded systems (like our test machine could be)
slightly bigger timeouts which waits longer for udev rules
processing does help and avoids occasional refuse of deactivation
because device is still being open.
(i.e. lvcreate...; lvchange -an...)
Unsure how we could now synchronize for this. On very slow(/loaded)
system 5 second timeout is simply not enough.
TODO: introduce at least lvm.conf configurable setting to
allow longer 'retry' loops.
Reindent lv_check_not_in_use to simplify internal loop code.
Also return always '0/1' (drop -1) - since we only
check for failure (0) - and we don't really know
why lv_info() has failed.