IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Only flag thin LV for no scanning in udev if this LV is about
to be wiped. This happens only in case the thin LV's pool was not
created with zeroing of the new blocks enabled.
The size of any metadata must be ignored when calculating the size of an
orphan PV.
Bug introduced by 603b45e0ed ("pvresize: Do
not use pv_read (get the PV from orphan VG).")
Block creations of archive and backup files for internal orphan VGs.
Bug introduced by 603b45e0ed ("pvresize: Do
not use pv_read (get the PV from orphan VG).")
DO NOT USE LVMETAD IF YOU HAVE ANY LVM1-FORMATTED PVS.
You may continue to use it without lvmetad, but do please schedule
an upgrade to the lvm2 format (with 'vgconvert').
Sending the original LVM1 formatted metadata to lvmetad is breaking
assumptions made by the code, so I am marking the format as obsolete for
now and no longer sending it to lvmetad.
This means that if you are using lvmetad, lvm1 volumes will usually
appear invisible - though not always: it depends on exactly what
sequence of commands you run!
The current situation is not satisfactory.
We'll either fix lvmetad and reenable this or we'll fix the code to
issue appropriate warning messages when lvm1 PVs are encountered
to avoid accidents.
(The latest unfixed problem is that lvmetad assumes metadata sequence
numbers exist and always increase - but the lvm1 format does not define
or store any sequence number, confusing both the daemon and client
when default values get passed to-and-fro.)
Several fields used to display 0 if undefined. Recent changes
to the way the fields are reported threw away some tests for
valid pointers, leading to segfaults with 'pvs -o all'.
Reinstate the original behaviour.
If a PV in an existing VG becomes orphaned (with 'pvcreate -ff', for
example) the VG struct cached against its vginfo must be invalidated.
This is because the struct device it references no longer contains
the PV label so becomes incorrect.
This triggers the error:
Internal error: PV $dev unexpectedly not in cache.
when the PV from the cached VG metadata is subsequently looked up
in the cache.
Bug introduced in 2.02.87 by commit 7ad0d47c3c
("Cache and share generated VG structs").
Before:
lvm> pvs
PV VG Fmt Attr PSize PFree
/dev/loop3 vg12 lvm2 a-- 28.00m 28.00m
/dev/loop4 vg12 lvm2 a-- 28.00m 28.00m
lvm> pvcreate -ff /dev/loop3
Really INITIALIZE physical volume "/dev/loop3" of volume group "vg12" [y/n]? y
WARNING: Forcing physical volume creation on /dev/loop3 of volume group "vg12"
Physical volume "/dev/loop3" successfully created
lvm> pvs
Internal error: PV /dev/loop3 unexpectedly not in cache.
PV VG Fmt Attr PSize PFree
/dev/loop3 vg12 lvm2 a-- 28.00m 28.00m
/dev/loop3 lvm2 a-- 32.00m 32.00m
/dev/loop4 vg12 lvm2 a-- 28.00m 28.00m
After:
lvm> pvs
PV VG Fmt Attr PSize PFree
/dev/loop3 vg12 lvm2 a-- 28.00m 28.00m
/dev/loop4 vg12 lvm2 a-- 28.00m 28.00m
lvm> pvcreate -ff /dev/loop3
Really INITIALIZE physical volume "/dev/loop3" of volume group "vg12" [y/n]? y
WARNING: Forcing physical volume creation on /dev/loop3 of volume group "vg12"
Physical volume "/dev/loop3" successfully created
lvm> pvs
PV VG Fmt Attr PSize PFree
/dev/loop3 lvm2 a-- 32.00m 32.00m
/dev/loop4 vg12 lvm2 a-- 28.00m 28.00m
unknown device vg12 lvm2 a-m 28.00m 28.00m
Make this code a bit more readable for Coverity as otherwise
it marks the "type" variable in the "_thin_pool_add_message" fn
as undefined for certain path (...which is normally unreachable anyway,
but let's clean this up).
Introduce FMT_OBSOLETE to identify pool metadata and use it and FMT_MDAS
instead of hard-coded format names.
Explain device accesses on pvscan --cache man page.
If there is no define for BLKPBSZGET - we have hard time how to
decrypt physical block size - we can't use here block_size,
since this is usually 4k while we need to use 512b.
FIXME: find some better way, until that enforce value 512.
Eventually we could also try to put in:
+#ifndef BLKPBSZGET
+# define BLKPBSZGET _IO(0x12,123)
+#endif
but this will still not work well on old kernels.
This reverts commit 24639be558.
Ok - seems we could be here a bit too active - and we
may remove devices which are unsuable for reasons we are not
aware of - thus taking down whole device could be way to big hammer.
So we still need some solution to recover from failing preload
and activation - but it needs more tunning.
When activation fails - we may leak large tree of partially loaded
devices in the dm table (i.e. failure in snapshot activation)
The best we can do here is try to deactivate whole device and
remove as much inactive table entries as we can.
When LV is scanned for its dependencies - scan also origin's snapshots,
and thin external origins.
So if any PV from snapshot or external origin device is missing - lvm2 will
avoid trying to activate such device.
The metadata/disk_areas setting was incorrectly registered as
"string" configuration option but it's a section where each area
is defined in its own subsection with "start_sector", "size" and "id"
setting.
This setting is not officialy supported, it's undocumented and it's
used solely for debugging.
Note: At this moment, it does not seem to be working with lvmetad!
When the device is inserted in dev_name_confirmed() stat() is
called twice as _insert() has it's own stat() call.
Extend _insert() parameter with struct stat* - which could be used
if it has been just obtained. When NULL is passed code is
doing its own stat() call as before.
Thin kernel target 1.9 still does not support online resize of
thin pool metadata properly - so disable it with expectation
for much higher version - and reenable after fixing kernel.
Replacement of pv_read by find_pv_by_name in commit
651d5093ed caused spurious
error messages when running pvcreate or vgextend against an
unformatted device.
Physical volume /dev/loop4 not found
Physical volume "/dev/loop4" successfully created
Physical volume /dev/loop4 not found
Physical volume /dev/loop4 not found
Physical volume "/dev/loop4" successfully created
Volume group "vg1" successfully extended
If we're calling pvcreate on a device that already has a PV label,
the blkid detects the existing PV and then we consider it for wiping
before we continue creating the new PV label and we issue a warning
with a prompt whether such old PV label should be removed. We don't
do this with native signature detection code. Let's make it consistent
with old behaviour.
But still keep this "PV" (identified as "LVM1_member" or "LVM2_member"
by blkid) detection when creating new LVs to avoid unexpected PV label
appeareance inside LV.
Collapse 2 ifs and replace log_error() with log_warn(), since\
the reported message is not causing tools error.
(and cannot be probably triggered anyway).
Optimize and cleanup recently introduced new function wipe_lv.
Use compound literals to get nicely initialized wipe_params struct.
Pass in lv as explicit argument for wipe_lv.
Use cmd from lv structure.
Initialize only non-null members so it's easy to see what
is the special arg.
Drop find_merging_snapshot() function. Use find_snapshot()
called after check for lv_is_merging_origin() which
is the commonly used code path - so we avoid duplicated
tests and potential risk of derefering NULL point
in unhandled error path.
This is actually the wipefs functionailty as a matter of fact
(wipefs uses the same libblkid calls).
libblkid is more rich when it comes to detecting various
signatures, including filesystems and users can better
decide what to erase and what should be kept.
The code is shared for both pvcreate (where wiping is necessary
to complete the pvcreate operation) and lvcreate where it's up
to the user to decide.
The verbose output contains a bit more information about the
signature like LABEL and UUID.
For example:
raw/~ # lvcreate -L16m vg
WARNING: linux_raid_member signature detected on /dev/vg/lvol0 at offset 4096. Wipe it? [y/n]
or more verbose one:
raw/~ # lvcreate -L16m vg -v
...
Found existing signature on /dev/vg/lvol0 at offset 4096: LABEL="raw.virt:0" UUID="da6af139-8403-5d06-b8c4-13f6f24b73b1" TYPE="linux_raid_member" USAGE="raid"
WARNING: linux_raid_member signature detected on /dev/vg/lvol0 at offset 4096. Wipe it? [y/n]
The verbose output is the same output as found in blkid.
Use common wipe_lv (former set_lv) fn to do zeroing as well as signature
wiping if needed. Provide new struct wipe_lv_params to define the
functionality.
Bind "lvcreate -W/--wipesignatures y" with proper wipe_lv call.
Also, add "yes" and "force" to lvcreate_params so it's possible
to apply them for the prompt: "WARNING: %s detected on %s. Wipe it? [y/n]".
The wipe_known_signatures fn now wraps the _wipe_signature fn that is called
for each known signature (currently md, swap and luks). This patch makes the
code more readable, not repeating the same sequence when used anywhere in the
code. We're going to reuse this code later...
If using lv/vgchange --sysinit -aay and lvmetad is enabled, we'd like to
avoid the direct activation and rely on autoactivation instead so
it fits system initialization scripts.
But if we're calling lv/vgchange --sysinit -aay too early when even
lvmetad service is not started yet, we just need to do the direct
activation instead without printing any error messages (while
trying to connect to lvmetad and not finding its socket).
This patch adds two helper functions - "lvmetad_socket_present" and
"lvmetad_used" which can be used to check for this condition properly
and avoid these lvmetad connections when the socket is not present
(and hence lvmetad is not yet running).
It will likely not fail to duplicate empty string, but
just keep the test of result of this function consistent.
Also on error path restore extent_size if in some
case someone would still use that variable.
Put common printf() case into a function and use
the string with text format as direct arg to make
the compile time validation of args easier and
code shorter.
Switch log_error() to log_warn(), since 'return 0'
doesn't cause any failure here.
Revert 4777eb6872 which put
target_present check into init_snapshot_merge(). However
this function is also used when parsing metadata. So we would
get this present test performed even when target is not really
needed. So move this target_present test directly into lvconvert.
The error buffer will stack error messages which is fine. However,
once you retrieve the error messages it doesn't make sense to keep
appending for each additional error message when running in the
context of a library call.
This patch clears and resets the buffer after the user retrieves
the error message.
Signed-off-by: Tony Asleson <tasleson@redhat.com>
Add a PV create which takes a paramters object that
has get/set method to configure PV creation.
Current get/set operations include:
- size
- pvmetadatacopies
- pvmetadatasize
- data_alignment
- data_alignment_offset
- zero
Reference: https://bugzilla.redhat.com/show_bug.cgi?id=880395
Signed-off-by: Tony Asleson <tasleson@redhat.com>
Replace the code with the refactored vgreduce_single instead
of calling its own implementation.
Corrects bug: https://bugzilla.redhat.com/show_bug.cgi?id=989174
Signed-off-by: Tony Asleson <tasleson@redhat.com>
Moving the core functionality of vgreduce single into
lib/metadata/vg.c so that the command line and lvm2app library
can call the same core functionality. New function is
vgreduce_single.
Signed-off-by: Tony Asleson <tasleson@redhat.com>
All labellers always use the "private" (void *) field as the fmt pointer. Making
this fact explicit in the type of the labeller simplifies the label reporting
code which needs to extract the format. Moreover, it removes a number of
error-prone casts from the code.
Fix buggy usage of "" (empty string) as a numerical string
value used for sorting.
On intel 64b platform this was typically resolve
as 0xffffff0000000000 - which is already 'close' to
UINT64_MAX which is used for _minusone64.
On other platforms it might have been giving
different numbers depends on aligment of strings.
Use proper &_minusone64 for sorting value when the reported
value is NUM.
Note: each numerical value needs to be thought about if it needs
default value &_zero64 or &_minusone64 since for cases, were
value of zero is valid, sorting should not be mixing entries
together.
Add wrapper function for dm_report_field_set_value() which returns void
and return 1, so the code could be shorter.
Add wrapper function for percent display _field_set_percent().
This patch fixes mostly cluster behavior but also updates
non-cluster reaction where calls like 'lvchange -aln'
lead to incorrect errors for some segment types.
Fix the implicit activation rules where some segment types could
be activated only in exclusive mode in cluster.
lvm2 command was not preserver 'local' property and incorrectly
converted local activations in to plain exclusive, so the local
activation could have activate volumes exclusively, but remotely.
If the volume_list filters out volume from activation,
it is still success result for this function.
Change the error message back to verbose level.
Detect if the volume is active localy before zeroing,
so we report error a bit later for cases, where volume
could not be activated because it doesn't pass through volume
list (but user still could create volume when he disables
zeroing)
Correct return code of activate_lv_excl().
Function is not supposed to return activation state of
activated volume, but return code of the operation.
Since i.e. when activation filter is allowing to activate
volume on current system, it is still success even though
no volume is activated.
There is a problem with the way mirrors have been designed to handle
failures that is resulting in stuck LVM processes and hung I/O. When
mirrors encounter a write failure, they block I/O and notify userspace
to reconfigure the mirror to remove failed devices. This process is
open to a couple races:
1) Any LVM process other than the one that is meant to deal with the
mirror failure can attempt to read the mirror, fail, and block other
LVM commands (including the repair command) from proceeding due to
holding a lock on the volume group.
2) If there are multiple mirrors that suffer a failure in the same
volume group, a repair can block while attempting to read the LVM
label from one mirror while trying to repair the other.
Mitigation of these races has been attempted by disallowing label reading
of mirrors that are either suspended or are indicated as blocking by
the kernel. While this has closed the window of opportunity for hitting
the above problems considerably, it hasn't closed it completely. This is
because it is still possible to start an LVM command, read the status of
the mirror as healthy, and then perform the read for the label at the
moment after a the failure is discovered by the kernel.
I can see two solutions to this problem:
1) Allow users to configure whether mirrors can be candidates for LVM
labels (i.e. whether PVs can be created on mirror LVs). If the user
chooses to allow label scanning of mirror LVs, it will be at the expense
of a possible hang in I/O or LVM processes.
2) Instrument a way to allow asynchronous label reading - allowing
blocked label reads to be ignored while continuing to process the LVM
command. This would action would allow LVM commands to continue even
though they would have otherwise blocked trying to read a mirror. They
can then release their lock and allow a repair command to commence. In
the event of #2 above, the repair command already in progress can continue
and repair the failed mirror.
This patch brings solution #1. If solution #2 is developed later on, the
configuration option created in #1 can be negated - allowing mirrors to
be scanned for labels by default once again.
Add LV_TEMPORARY flag for LVs with limited existence during command
execution. Such LVs are temporary in way that they need to be activated,
some action done and then removed immediately. Such LVs are just like
any normal LV - the only difference is that they are removed during
LVM command execution. This is also the case for LVs representing
future pool metadata spare LVs which we need to initialize by using
the usual LV before they are declared as pool metadata spare.
We can optimize some other parts like udev to do a better job if
it knows that the LV is temporary and any processing on it is just
useless.
This flag is orthogonal to LV_NOSCAN flag introduced recently
as LV_NOSCAN flag is primarily used to mark an LV for the scanning
to be avoided before the zeroing of the device happens. The LV_TEMPORARY
flag makes a difference between a full-fledged LV visible in the system
and the LV just used as a temporary overlay for some action that needs to
be done on underlying PVs.
For example: lvcreate --thinpool POOL --zero n -L 1G vg
- first, the usual LV is created to do a clean up for pool metadata
spare. The LV is activated, zeroed, deactivated.
- between "activated" and "zeroed" stage, the LV_NOSCAN flag is used
to avoid any scanning in udev
- betwen "zeroed" and "deactivated" stage, we need to avoid the WATCH
udev rule, but since the LV is just a usual LV, we can't make a
difference. The LV_TEMPORARY internal LV flag helps here. If we
create the LV with this flag, the DM_UDEV_DISABLE_DISK_RULES
and DM_UDEV_DISABLE_OTHER_RULES flag are set (just like as it is
with "invisible" and non-top-level LVs) - udev is directed to
skip WATCH rule use.
- if the LV_TEMPORARY flag was not used, there would normally be
a WATCH event generated once the LV is closed after "zeroed"
stage. This will make problems with immediated deactivation that
follows.
This patch reinstates the lv_info call to check for open count of
the LV we're removing/deactivating - this was changed with commit 125712b
some time ago and we relied on the ioctl retry logic deeper in the libdm
while calling the exact 'remove' ioctl.
However, there are still some situations in which it's still required to
check for open count before we do any 'remove' actions - this mainly
applies to LVs which consist of several sub LVs, like it is for
virtual snapshot devices.
The commit 1146691 fixed the issue with ordering of actions during
virtual snapshot removal while the snapshot is still open. But
the check for the open status of the snapshot is still prone to
marking the snapshot as in use with an immediate exit even though
this could be a temporary asynchronous open only, most notably
because of udev and its WATCH udev rule with accompanying scans
for the event which is asynchronous. The situation where this crops
up most often is when we're closing the LV that was open for read-write
and then calling lvremove immediately.
This patch reinstates the original lv_info call for the open status
of the LV in the lv_check_not_in_use fn that gets called before
we do any LV removal/deactivation. In addition to original logic,
this patch adds its own retry loop with a delay (25x0.2 seconds)
besides the existing ioctl retry loop.
Component LVs of a thinpool can be RAID LVs. Users who attempt a
scrubbing operation directly on a thinpool will be prompted to
specify the sub-LV they wish the operation to be performed on. If
neither of the sub-LVs are RAID, then a message telling them that
the operation can only be performed on a RAID LV will be given.
Split image should have an out-of-sync attr ('I') - always. Even if
the RAID LV has not been written to since the LV was split off, it is
still not part of the group that makes up the RAID and is therefore
"out-of-sync".
Since the virtual snapshot has no reason to stay alive once we
detach related snapshot - deactivate whole thing in front of
snapshot removal - otherwice the code would get tricky for
support in cluster.
The correct full solution would require to have transactions
for libdm operations.
Also enable to the check for snapshot being opened prior
the origin deactivation, otherwise we could easily end
with the origin being deactivate, but snapshot still kept
active, desynchronizing locking state in cluster.
Addendum to commit ce7489e which introduced a new *internal* LV_NOSCAN
flag and so it needs to be marked that way properly otherwise it
ends up unrecognized and improperly handled during metadata export.
A common scenario is during new LV creation when we need to wipe the
newly created LV and avoid any udev scanning before this stage otherwise
it could cause the device (the LV) to be claimed by some other subsystem
for which there were stale metadata within LV data.
This patch adds possibility to mark the LV we're just about to wipe with
a flag that gets passed to udev via DM_COOKIE as a subsystem specific
flag - DM_SUBSYSTEM_UDEV_FLAG0 (in this case the subsystem is "LVM")
so LVM udev rules will take care of handling that.
Accept --ignoreskippedcluster with pvs, vgs, lvs, pvdisplay, vgdisplay,
lvdisplay, vgchange and lvchange to avoid the 'Skipping clustered
VG' errors when requesting information about a clustered VG
without using clustered locking and still exit with success.
The messages can still be seen with -v.
Some code has been added recently which makes it impossible to compile
when "configure --disable-devmapper" is used. This patch just shuffles
the code around so it's under proper #ifdef DEVMAPPER_SUPPORT.
lib/metadata/lv_manip.c:_sufficient_pes_free() was calculating the
required space for RAID allocations incorrectly due to double
accounting. This resulted in failure to allocate when available
space was tight.
When RAID data and metadata areas are allocated together, the total
amount is stored in ah->new_extents and ah->alloc_and_split_meta is
set. '_sufficient_pes_free' was adding the necessary metadata extents
to ah->new_extents without ever checking ah->alloc_and_split_meta.
This often led to double accounting of the metadata extents. This
patch checks 'ah->alloc_and_split_meta' to perform proper calculations
for RAID.
This error is only present in the function that checks for the needed
space, not in the functions that do the actual allocation.
If "default" thin pool chunk size calculation method is selected,
use minimum_io_size, otherwise optimal_io_size for "performance"
device hint exposed in sysfs. If there appear to be PVs with
different hints presented, use their least common multiple.
If the hint is less than the default value defined for the
calculation method, use the default value instead.
Add allocation/thin_pool_chunk_size_calculation lvm.conf
option to select a method for calculating thin pool chunk
sizes and define two possible values - "default" and "performance".
A previous commit (b6bfddcd0a) which
was designed to prevent segfaults during lvextend when trying to
extend striped logical volumes forgot to include calculations for
RAID4/5/6 parity devices. This was causing the 'contiguous' and
'cling_by_tags' allocation policies to fail for RAID 4/5/6.
The solution is to remember that while we can compare
ah->area_count == prev_lvseg->area_count
for non-RAID, we should compare
(ah->area_count + ah->parity_count) == prev_lvseg->area_count
for a general solution.