1
0
mirror of git://sourceware.org/git/lvm2.git synced 2024-12-21 13:34:40 +03:00
Commit Graph

4001 Commits

Author SHA1 Message Date
David Teigland
b19b7b6111 pvck: ensure text lines are terminated 2020-03-03 13:47:07 -06:00
David Teigland
84eab461c8 writecache: check watermark value 2020-02-25 10:34:30 -06:00
David Teigland
2284f845b0 writecache: fix watermark error message 2020-02-21 08:13:32 -06:00
David Teigland
8153c5f1e6 writecache: working real dm uuid suffix for wcorig lv 2020-02-20 17:13:43 -06:00
David Teigland
db1d66859f thin: don't use writecache for poolmetadata 2020-02-13 17:22:37 -06:00
David Teigland
cba06012ac writecache: check if cachevol is writable
before trying to initialize it (since wipe_lv
does not return an error if it fails to write.)
2020-02-11 13:01:13 -06:00
David Teigland
8f794f2095 writecache: skip zeroing in test mode 2020-02-07 10:32:10 -06:00
David Teigland
744b75f881 writecache: check for invalid cachevol 2020-02-07 10:26:59 -06:00
David Teigland
b756cb3e49 writecache: fix return value 2020-02-07 10:21:07 -06:00
Zdenek Kabelac
e6a3c09017 command: validate reporting of previous argument
When reporting parsing error, report 'previous' argument
only when there is one.
2020-02-04 17:22:06 +01:00
Zdenek Kabelac
de43527f94 cov: unused header file removal
cov: unused header removed
Also ensure library header file with config settings goes first.
Move inclusion of format-text.h into layout.h
2020-02-04 17:22:06 +01:00
Zdenek Kabelac
1bde35e596 pvck: avoid memleak of vgname
clang: no vgname buffer leak.
2020-02-04 17:22:06 +01:00
David Teigland
c1ee6f0eef pvmove: prevent moving writecache device 2020-02-03 15:59:12 -06:00
David Teigland
379a7e1288 vgsplit: handle cachevol
attached to a cache or writecache LV.
Ensure PVs in cachevol are moved with the main LV.
2020-02-03 15:33:58 -06:00
David Teigland
adbb0a8d5b writecache: reject invalid high/low watermark setting 2020-02-03 11:33:30 -06:00
Heming Zhao
d53bfae273 add suggestion message for mirror LVs
Currently the error messages are not clear. This very easy to
guide user to execute "--removemissing --force", it is dangerous
and will make the LVs to be destroied.

Signed-off-by: Zhao Heming <heming.zhao@suse.com>
2020-01-15 09:46:54 -06:00
Zdenek Kabelac
5ccf3e6f30 vdo: avoid running initialization of cache pool vars
Since VDO is also pool, the old if() case missed to know about this,
and executed unnecesserily initialization of cache pool variables.
This was usually harmless when using 'smaller' sizes of VDO pools,
but for big VDO pool size, we were reporting senseless messages
about big cache chunk sizes.
2020-01-13 17:42:31 +01:00
David Teigland
2da6f01c15 pvck: show specific dump option values 2019-12-10 11:07:07 -06:00
David Teigland
3f381784f2 update option description for settings 2019-12-06 16:21:26 -06:00
David Teigland
ec71df6fec pvck: deal with coverity warnings 2019-12-02 11:16:02 -06:00
David Teigland
5a88b2ce7f pvck: use zalloc in more places 2019-11-27 11:17:15 -06:00
David Teigland
3145a85583 pvck: repair headers and metadata
To write a new/repaired pv_header and label_header:

  pvck --repairtype pv_header --file <file> <device>

This uses the metadata input file to find the PV UUID,
device size, and data offset.

To write new/repaired metadata text and mda_header:

  pvck --repairtype metadata --file <file> <device>

This requires a good pv_header which points to one or two
metadata areas.  Any metadata areas referenced by the
pv_header are updated with the specified metadata and
a new mda_header. "--settings mda_num=1|2" can be used
to select one mda to repair.

To combine all header and metadata repairs:

  pvck --repair --file <file> <device>

It's best to use a raw metadata file as input, that was
extracted from another PV in the same VG (or from another
metadata area on the same PV.)  pvck will also accept a
metadata backup file, but that will produce metadata that
is not identical to other metadata copies on other PVs
and other areas.  So, when using a backup file, consider
using it to update metadata on all PVs/areas.

To get a raw metadata file to use for the repair, see
pvck --dump metadata|metadata_search.

List all instances of metadata from the metadata area:
  pvck --dump metadata_search <device>

Save one instance of metadata at the given offset to
the specified file (this file can be used for repair):

  pvck --dump metadata_search --file <file>
    --settings "metadata_offset=<off>" <device>
2019-11-27 11:13:47 -06:00
David Teigland
2e0f273008 pvck: dump functions cleanup args and return vals 2019-11-27 11:13:47 -06:00
David Teigland
d051e899a5 pvck: dump show most recent metadata 2019-11-27 11:13:47 -06:00
David Teigland
9cf08836ef pvck: allow disk locations to be specified
using --settings:

mda_offset=<offset> mda_size=<size> can be used
in place of the offset/size that normally come
from headers.

metadata_offset=<offset> prints/saves one instance
of metadata text at the given offset, in
metadata_all or metadata_search.
2019-11-27 11:13:47 -06:00
David Teigland
53126ceada pvck: move some arg processing 2019-11-27 11:13:47 -06:00
David Teigland
13c629fb78 Revert "cov: use zalloc"
This reverts commit 9af1d63b4d.

fixes folded into subsequent pvck commit
2019-11-27 11:13:43 -06:00
David Teigland
39bd9b111b Revert "pvck: check result of dev_get_size"
This reverts commit 1f4968289c.

fixes folded into subsequent pvck commit
2019-11-27 11:13:40 -06:00
David Teigland
4485b8edca Revert "cov: fix mem leaking buffer"
This reverts commit d67ce9e140.

fixes folded into subsequent pvck commit
2019-11-27 11:13:36 -06:00
David Teigland
657d42e879 Revert "cov: avoid passing NULL to strstr function"
This reverts commit 0bad3977df.

fixes folded into subsequent pvck commit
2019-11-27 11:13:32 -06:00
David Teigland
595aa1d452 Revert "cov: check for retvalue"
This reverts commit 153e55c20e.

fixes folded into subsequent pvck commit
2019-11-27 11:13:09 -06:00
David Teigland
a61272a6f0 Revert "lvs: disable scanning optimization"
This reverts commit 7474440d3b.

lvs can use the scanning optimization again since it has
been changed in:
"scanning: optimize by checking text offset and checksum"
2019-11-26 16:52:28 -06:00
Heinz Mauelshagen
29db9c6325 lvcreate: ensure striped raid region size is at least stripe size
The kernel MD runtime requires region size to be larger than stripe size
on striped raid layouts, thus the dm-raid target's constructor rejects
such request.

This causes e.g. an 'lvcreate --type raid10 -i3 -I4096 -R2048 -n lv vg' to fail.

Avoid failing late in the kernel by enforcing region size to be
larger or equal to stripe size.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1698225
2019-11-26 22:31:58 +01:00
David Teigland
2037476008 pvcreate,pvremove: fix reacquiring global lock after prompt
When pvcreate/pvremove prompt the user, they first release
the global lock, then acquire it again after the prompt,
to avoid blocking other commands while waiting for a user
response.  This release/reacquire changes the locking
order with respect to the hints flock (and potentially other
locks).  So, to avoid deadlock, use a nonblocking request
when reacquiring the global lock.
2019-11-26 14:34:43 -06:00
David Teigland
7474440d3b lvs: disable scanning optimization
The scanning optimization can produce warnings from
'lvs' when run concurrently with commands modifying LVs,
so disable the optimization until it can be improved.

Without the scanning optimization, lvs will always
read all PVs twice:

1. read metadata from all PVs, saving it in memory
2. for each VG
3. lock VG
4. reread metadata from all PVs in VG, replacing metadata
   saved from step 1
5. run command on VG
6. unlock VG

The optimization would usually cause step 4 to be skipped,
and PVs would be read only once.

Running the command in step 5 using metadata that was not
read under the VG lock is usually fine, except for the
fact that lvs attempts to validate the metadata by comparing
it to current dm state.  If other commands are modifying dm
state while lvs is running, lvs may see differences between
metadata from step 1 and dm state checked during step 5,
and print warnings.

(A better fix may be to detect the concurrent change and
fall back to rereading metadata in step 4 only when needed.)
2019-11-19 10:56:12 -06:00
Zdenek Kabelac
9af1d63b4d cov: use zalloc
Instead of malloc() memset() -> zalloc()
2019-11-14 18:06:42 +01:00
Zdenek Kabelac
1f4968289c pvck: check result of dev_get_size
Don't use garbage value for later computations.
2019-11-14 18:06:42 +01:00
Zdenek Kabelac
d67ce9e140 cov: fix mem leaking buffer
Free allocated buffer on function's exit.
Also check for fwrite() results.
2019-11-14 18:06:42 +01:00
Zdenek Kabelac
0bad3977df cov: avoid passing NULL to strstr function
When 'str1' would be NULL, there is no point to run 2nd. strstr().
2019-11-14 18:06:42 +01:00
Zdenek Kabelac
153e55c20e cov: check for retvalue 2019-11-14 18:06:42 +01:00
Zdenek Kabelac
82e6b820b8 cov: check for NULL
Since we check for NULL pointers earlier we need
to be consistent across function - since the NULL
would applies across whole function.

When dropping 'mda' check - we are actually
already dereferencing it before - so it can't
be NULL at that places (and it's validated
before entering  _read_mda_header_and_metadata).
2019-11-14 18:06:42 +01:00
David Teigland
6a8bd0c509 lvmlockd: fix cachevol locking
When a cachevol LV is attached, have the LV keep it's lock
allocated.  The lock on the cachevol won't be used while
it's attached.  When the cachevol is split a new lock does
not need to be allocated.  (Applies to cachevol usage by
both dm-cache and dm-writecache.)
2019-10-25 14:08:59 -05:00
David Teigland
5706764885 improve command definition matching using type
When a user includes "--type foo" in a command, only
look at command definitions with matching type, as
opposed to using matching/mismatching --type as a
vote for/against a given command def.  This means a
command with --type foo will prioritize a command def
with --type foo over other command defs that have
more matching options but an unmatching type.  This
makes it more likely that a closely matching command
def will be recommended.
2019-10-22 09:35:10 -05:00
David Teigland
967e2decd2 vgchange: remove bogus option restriction
for -A with -a
2019-10-21 13:29:57 -05:00
Zdenek Kabelac
644186e920 gcc: all paths will set ret
Set success on common path.
Fixes random failure on writecache uncaching path.
2019-10-21 15:32:35 +02:00
Zdenek Kabelac
dd7629ea09 cache: use _cpool for used cache-pools
When LV gets cached and uses cache-pool - such cache-pool
will now get _cpool suffix automatically.

Thus 'Pool' column for cached LV will now show either _cvol
or _cpool LV.
2019-10-21 15:31:33 +02:00
Zdenek Kabelac
23f660cf98 cache: drop _cpool suffix from unused cache-pool
Drop _cpool prefix if present and cache-pool is going to be unused.
2019-10-21 12:14:15 +02:00
Zdenek Kabelac
a5f8e7a96c lvconvert: use new functions 2019-10-21 12:14:15 +02:00
David Teigland
5714c8c9cc pvck: dump metadata search
Improve the implementation of extracting all text metadata
copies from the metadata area.  Use this for the existing
metadata_all dump option.

Add a new metadata_search dump option which does not use
lvm headers to find metadata, but looks in standard
locations.  This is useful if headers are damaged and
can't be used to locate metadata.

Adding '-v' to metadata_all or metadata_search will add
the description and creation_time to the printed list of
metadata instances that are found.
2019-10-18 12:26:29 -05:00
Zdenek Kabelac
a255385e3a cachevol: move cvol rename
Move rename of CVOL after archive().
2019-10-17 13:03:50 +02:00
Zdenek Kabelac
dab4a2c893 cachevol: move flag setting after taking archive
Before 'archive()' is called, lvm2 must not touch/modify metadata.
So move setting  CACHE_VOL related flags past this point.

Also make sure reading of cache segtype always restores this
flag properly (even if compatible flag would be lost).
2019-10-17 13:03:50 +02:00
David Teigland
998e7b1075 writecache: add cvol suffix to attached cachevol
When an LV is used as a writecache cachevol, give
it the LV name a _cvol suffix.  Remove the suffix
when the cachevol is detached, restoring the
original LV name.
2019-10-15 16:03:34 -05:00
David Teigland
91ee025d5b cache: change cachevol flags for backward compat
A cachevol LV had the CACHE_VOL status flag in metadata,
and the cache LV using it had no new flag.  This caused
problems if the new metadata was used by an old version
of lvm.  An old version of lvm would have two problems
processing the new metadata:

. The old lvm would return an error when reading the VG
  metadata when it saw the unknown CACHE_VOL status flag.

. The old lvm would return an error when reading the VG
  metadata because it would not find an expected cache pool
  attached to the cache LV (since the cache LV had a
  cachevol attached instead.)

Change the use of flags:

. Change the CACHE_VOL flag to be a COMPATIBLE flag (instead
  of a STATUS flag) so that old versions will not fail when
  they see it.

. When a cache LV is using a cachevol, the cache LV gets
  a new SEGTYPE flag CACHE_USES_CACHEVOL.  This flag is
  appended to the segtype name, so that old lvm versions
  will fail to use the LV because of an unknown segtype,
  as opposed to failing to read the VG.
2019-10-15 09:05:52 -05:00
Zdenek Kabelac
201ffbd04a cachevol: use lv_cache_remove
Use same routine for dropping cache.
2019-10-14 15:20:25 +02:00
Zdenek Kabelac
8d8047883e cachevol: use writethrough for partial removal
Instead of using 'noflush' option, switch cache_mode into WRITETHROUGH
which does not require flushing, when user confirmed he does not
want flushing for WRITEBACK (because of (partially) missing caching PV)
2019-10-14 15:15:14 +02:00
Zdenek Kabelac
8a8e6ebba2 cachevol: rename converted LV to _cvol
When converting existing public LV to internally used
'CacheVol' LV - rename LV to LV_cvol.

When splitting CacheVol, remove _cvol suffix.
2019-10-14 15:15:12 +02:00
Zdenek Kabelac
f6d171ffe3 cachevol: wipe 'normal' device
For wiping we activate and clear 'regular' devices,
since in case of whole process interuption (i.e. kill -9)
we leave metadata & DM table and workable state all the time.
2019-10-14 15:14:46 +02:00
Zdenek Kabelac
ddaf2002c9 lvconvert: use struct initializer
Always good to keep rest of structure initilized with zeros.
2019-10-14 15:13:47 +02:00
Zdenek Kabelac
76a9a86fd3 lvconvert: fix return value when zeroing fails
Use correct error return code for fail path.
2019-10-14 15:13:33 +02:00
David Teigland
d6ffc99052 vgck: fix updatemetadata writing different descriptions
vgck --updatemetadata would write the same correct
metadata to good mdas, and then to bad mdas, but the
sequence of vg_write/vg_commit calls betwen good and
bad mdas could cause a different description field to
be generated for good/bad mdas. (The description field
describing the command was recently included in the
ondisk copy of the metadata text.)
2019-10-11 12:57:32 -05:00
David Teigland
fe16d296b0 pvmove: remove some cmirror related code
which is no longer used
2019-10-11 11:31:42 -05:00
David Teigland
7368cf8e7d pvck: handle PVs with zero metadata copies 2019-09-30 16:20:17 -05:00
David Teigland
0c23d3fc84 pvscan: use quick activation only with matching PV device names
When the PV device names in the VG metadata do not match the
current PV device names seen on the system, do not use the
optimized activation function (that avoids extra device scanning.)

When the device names do not match, it's a clue that there could
be duplicate PVs, in which case we want to scan all devicess to
find any duplicates and stop the activation if found.

This does not prevent autoactivating a VG from the incorrect
duplicate PV, because the incorrect duplicate may appear by itself
first.  At that point its duplicate PV does not exist to be seen.
(A future enhancement could use the WWID to strengthen this
detection.)
2019-09-30 11:38:10 -05:00
Zdenek Kabelac
5c0264d689 vdo: restore monitoring of vdo pool
Switch to -vpool layered name needs to monitor proper device.
2019-09-30 13:34:34 +02:00
David Teigland
9a8e6ad014 lvconvert: enable --uncache with dm-writecache cachevol
splitcache followed by an automatic lvremove of
the cachevol LV
2019-09-24 15:51:05 -05:00
David Teigland
76dd9b2b51 writecache: move code into new file
put writecache specific code in writecache_manip.c

should be no functional change
2019-09-24 15:51:05 -05:00
David Teigland
f27625f005 lvconvert: enable --uncache with dm-cache cachevol
splitcache followed by an automatic lvremove of
the cachevol LV
2019-09-24 15:50:58 -05:00
David Teigland
4464004362 lvconvert: separate splitcache and uncache functions
Reorg code so there are separate functions for splitcache
and uncache for both cachepool and cachevol.  Should be no
functional change.
2019-09-24 13:55:21 -05:00
David Teigland
4fe4c30e7a lvconvert: allow --cache shortcut for --type cache with cachevol 2019-09-23 14:21:09 -05:00
David Teigland
6f7d7089b4 writecache: use dm suffixes and lv attributes
- use internal CACHE_VOL flag on cachevol LV
- add suffixes to dm uuids for internal LVs
- display appropriate letters in the LV attr field
- display writecache's cachevol in lvs output
2019-09-20 14:08:51 -05:00
David Teigland
5d3bced5ea lvconvert: detaching cachevol with missing PVs
. For dm-cache in writethrough, always allow splitcache,
  whether the cache is missing PVs or not.

. For dm-cache in writeback, if the cache is missing PVs,
  allow splitcache with force and yes.

. For dm-writecache, if the cache is missing PVs,
  allow splitcache with force and yes.
2019-09-20 09:59:37 -05:00
David Teigland
b46dce0bad lvchange: allow activating cachevol 2019-09-20 09:59:37 -05:00
Zdenek Kabelac
6612d8dd5e vdo: enhance activation with layer -vpool
Enhance 'activation' experience for VDO pool to more closely match
what happens for thin-pools where we do use a 'fake' LV to keep pool
running even when no thinLVs are active. This gives user a choice
whether he want to keep thin-pool running (wihout possibly lenghty
activation/deactivation process)

As we do plan to support multple VDO LVs to be mapped into a single VDO,
we want to give user same experience and 'use-patter' as with thin-pools.

This patch gives option to activate VDO pool only without activating
VDO LV.

Also due to 'fake' layering LV we can protect usage of VDO pool from
command like 'mkfs' which do require exlusive access to the volume,
which is no longer possible.

Note: VDO pool contains 1024 initial sectors as 'empty' header - such
header is also exposed in layered LV (as read-only LV).
For blkid we are indentified as LV with UUID suffix - thus private DM
device of lvm2 - so we do not need to store any extra info in this
header space (aka zero is good enough).
2019-09-17 13:17:19 +02:00
Zdenek Kabelac
7612c21f55 lvconvert: improve validation thin and cache pool conversion
Limit convertible LVs to thin-pool and cache-pools.
Also fix return code on  interal error path to return ECMD_FAILED.
2019-09-17 13:13:49 +02:00
David Teigland
3e5e7fd6c9 pvscan: allow use of noudevsync option
When pvscan is used to activate a VG via an
asynchronous service (i.e. lvm2-pvscan), there
is no requirement that the command wait for
udev to create device nodes before returning.

It's possible that waiting for udev is slow
enough to cause the service running the command
to time out.  So, allow the --noudevsync option
to be given to pvscan to skip waiting for udev.

(This commit is not changing the lvm2-pvscan
service itself to use --noudevsync.)

Still unknown is whether there are any complex
LV activation cases in which lvm itself requires
access to a device node, in which case the udev
wait could be needed by lvm itself.

(When running an activation command directly
from the command line, it's generally expected
that the activated LVs are ready to use when
the command is finished, so lvm waits for
udev to finish creating the dev nodes.)
2019-09-10 09:47:33 -05:00
Heinz Mauelshagen
aae2e872b4 lvchange: add --resync help/manual text relative to 'R' attribute
Add information that --resync clears the 'R' attribute
on not initially synchronized mirror/RAID LVs.

Related: 1708299
2019-09-06 14:18:29 +02:00
David Teigland
25b58310e3 pvscan: avoid full scan for activation
When an online PV completed a VG, the standard
activation functions were used to activate the VG.
These functions use a full scan of all devs.
When many pvscans are run during startup and need
to activate many VGs, scanning all devs from all
the pvscans can take a long time.

Optimize VG activation in pvscan to scan only the
devs in the VG being activated.  This makes use of
the online file info that was used to determine
the VG was complete.

The downside of this approach is that pvscan activation
will not detect duplicate PVs and block activation,
where a normal activation command (which scans all
devices) would.
2019-09-03 10:11:16 -05:00
Zdenek Kabelac
4b1dcc2eeb lv_manip: add synchronizations
New udev in rawhide seems to be 'dropping' udev rule operations for devices
that are no longer existing - while this is 'probably' a bug - it's
revealing moments in lvm2 that likely should not run in a single
transaction and we should wait for a cookie before submitting more work.

TODO: it seem more 'error' paths should always include synchronization
before starting deactivating 'just activated' devices.
We should probably figure out some 'automatic' solution for this instead
of placing sync_local_dev_name() all over the place...
2019-08-26 15:32:19 +02:00
Zdenek Kabelac
0bdd6d6240 pvmove: add missing synchronization
Between 'resume' and 'remove' we need to wait for udev to synchronize,
otherwise udev may 'skip' resume event processing if the udev node
is already gone.
2019-08-20 12:44:39 +02:00
David Teigland
0534cd9cd4 pvscan: disable sleeping and retrying for udev
When systemd is running pvscans, udev may not be
entirely initialized, so the pvscan should not
sleep and retry waiting for udev info.
2019-08-16 14:41:26 -05:00
David Teigland
83261b79b5 pvscan cache: use lvmcache_label_scan
instead of the lower level label_scan.  The lvmcache wrapper
around label_scan checks for and eliminates more duplicate devs
and md components.
2019-08-16 13:26:12 -05:00
David Teigland
677833ce6f lvmcache: renaming functions and variables
related to duplicates, no functional changes.
2019-08-16 13:26:11 -05:00
David Teigland
65bcd16be2 md component detection addition in vg_read
Usually md components are eliminated in label scan and/or
duplicate resolution, but they could sometimes get into
the vg_read stage, where set_pv_devices compares the
device to the PV.

If set_pv_devices runs an md component check and finds
one, vg_read should eliminate the components.

In set_pv_devices, run an md component check always
if the PV is smaller than the device (this is not
very common.)  If the PV is larger than the device,
(more common), do the component check when the config
setting is "auto" (the default).
2019-08-16 13:24:34 -05:00
Zdenek Kabelac
cc4a92b13c cov: ensure cname exists before derefering it
Just make it clear to analyzers  cname can't be NULL.
TODO: maybe exclude NULL at front of the function...
2019-08-09 12:57:07 +02:00
David Teigland
0404539edb vgcreate/vgextend: restrict PVs with mixed block sizes
Avoid having PVs with different logical block sizes in the same VG.
This prevents LVs from having mixed block sizes, which can produce
file system errors.

The new config setting devices/allow_mixed_block_sizes (default 0)
can be changed to 1 to return to the unrestricted mode.
2019-08-01 10:06:47 -05:00
David Teigland
7657313740 pvck: fix looping dump metadata_all
dump metadata_all wouldn't quit if the metadata wrapped.
2019-07-12 14:09:06 -05:00
David Teigland
4567c6a2b2 enable full md component detection at the right time
An active md device with an end superblock causes lvm to
enable full md component detection.  This was being done
within the filter loop instead of before, so the full
filtering of some devs could be missed.

Also incorporate the recently added config setting that
controls the md component detection.
2019-07-10 13:30:50 -05:00
David Teigland
b16abb3816 pvscan: fix PV online when device has a different size
Fix commit 7836e7aa1c
"pvscan: ignore device with incorrect size"

which caused pvscan to not consider a PV online (for purposes
of event based activation) if the PV and device sizes differed.

This helped to avoid mistaking MD components for PVs, and is
replaced by triggering an md component check when PV and device
sizes differ (which happens in set_pv_device).
2019-07-09 13:45:09 -05:00
Heinz Mauelshagen
1b63a219f4 lvconvert: allow --stripes/--stripesize in 'mirror' conversions
This allows the creation of a striped mirror leg(s) during upconvert
by adding lvconvert command line options --stripes/--stripesize
for 'mirror' to tools/command-lines.in.

In case multiple mirror legs are being added, all will have the
same requested striped layout.

Resolves: rhbz1720705
2019-07-08 19:32:17 +02:00
David Teigland
f938545687 cache: warn and prompt for writeback with cachevol
The cache repair utility does not yet work with a cachevol
(where metadata and data exist on the same LV.)  So, warn
and prompt if writeback is specified with a cachevol.
2019-07-02 11:03:03 -05:00
David Teigland
b4402bd821 exported vg handling
The exported VG checking/enforcement was scattered and
inconsistent.  This centralizes it and makes it consistent,
following the existing approach for foreign and shared
VGs/PVs, which are very similar to exported VGs/PVs.

The access policy that now applies to foreign/shared/exported
VGs/PVs, is that if a foreign/shared/exported VG/PV is named
on the command line (i.e. explicitly requested by the user),
and the command is not permitted to operate on it because it
is foreign/shared/exported, then an access error is reported
and the command exits with an error.  But, if the command is
processing all VGs/PVs, and happens to come across a
foreign/shared/exported VG/PV (that is not explicitly named on
the command line), then the command silently skips it and does
not produce an error.

A command using tags or --select handles inaccessible VGs/PVs
the same way as a command processing all VGs/PVs, and will
not report/return errors if these inaccessible VGs/PVs exist.

The new policy fixes the exit codes on a somewhat random set of
commands that previously exited with an error if they were
looking at all VGs/PVs and an exported VG existed on the system.

There should be no change to which commands are allowed/disallowed
on exported VGs/PVs.

Certain LV commands (lvs/lvdisplay/lvscan) would previously not
display LVs from an exported VG (for unknown reasons).  This has
not changed.  The lvm fullreport command would previously report
info about an exported VG but not about the LVs in it.  This
has changed to include all info from the exported VG.
2019-06-25 15:39:08 -05:00
David Teigland
d16142f90f scanning: open devs rw when rescanning for write
When vg_read rescans devices with the intention of
writing the VG, the label rescan can open the devs
RW so they do not need to be closed and reopened
RW in dev_write_bytes.
2019-06-21 10:57:49 -05:00
David Teigland
82b137ef2f vgchange: don't fail monitor command if vg is exported
When monitoring, skip exported VGs without causing a command
failure.

The lvm2-monitor service runs 'vgchange --monitor y', so
any exported VG on the system would cause the service to
fail.
2019-06-20 15:59:36 -05:00
David Teigland
9f5e46965b fix man page generation
The man page generation for pvchange/lvchange/vgchange was
incorrect (leaving out some option listings) as a result of
commit e225bf5 "fix command definition for pvchange -a"
2019-06-14 09:26:08 -05:00
David Teigland
7eaa3adedf vgchange: change debug message level
A debug message was mistakely left visible.
2019-06-11 16:14:07 -05:00
David Teigland
4bb7d3da0e lvmcache: remove wrapper around lvmcache_get_vgnameids
This was left over from when there was an lvmetad
version of the function.
2019-06-11 14:10:14 -05:00
David Teigland
0f350ba890 remove unused trustcache option 2019-06-11 11:42:49 -05:00
David Teigland
e225bf59ff fix command definition for pvchange -a
The -a was being included in the set of "one or more"
options instead of an actual required option.  Even
though the cmd def was not implementing the restrictions
correctly, the command internally was.

Adjust the cmd def code which did not support a command
with some real required options and a set of "one or more"
options.
2019-06-10 13:43:20 -05:00
David Teigland
49b8846567 lvmcache: remove unused function
Drop lvmcache_fmt_from_vgname(), the way it was called made
it identical to the existing lvmcache_vginfo_from_vgname().
2019-06-10 10:38:32 -05:00
David Teigland
550536474f vgsplit: simplify vg creation
The way that this command now uses the global lock
followed by a label scan, it can simply check if the
new VG name exists, and if not lock it and create it.
2019-06-10 10:38:32 -05:00
David Teigland
a07cc8dbef reset cmd wipe_outdated_pvs
at the start of a command, which is needed in case the cmd
struct is reused.
2019-06-10 10:34:58 -05:00
David Teigland
36cbc6db24 locking: reset global_ex flag at end of cmd
These two flags may be not reset at the end of
the command when the unlock is implicit, which
is a problem if the cmd struct is reused.
Clear the flags in the general fin_locking.
2019-06-10 10:34:58 -05:00
David Teigland
ba7ff96faf improve reading and repairing vg metadata
The fact that vg repair is implemented as a part of vg read
has led to a messy and complicated implementation of vg_read,
and limited and uncontrolled repair capability.  This splits
read and repair apart.

Summary
-------

- take all kinds of various repairs out of vg_read
- vg_read no longer writes anything
- vg_read now simply reads and returns vg metadata
- vg_read ignores bad or old copies of metadata
- vg_read proceeds with a single good copy of metadata
- improve error checks and handling when reading
- keep track of bad (corrupt) copies of metadata in lvmcache
- keep track of old (seqno) copies of metadata in lvmcache
- keep track of outdated PVs in lvmcache
- vg_write will do basic repairs
- new command vgck --updatemetdata will do all repairs

Details
-------

- In scan, do not delete dev from lvmcache if reading/processing fails;
  the dev is still present, and removing it makes it look like the dev
  is not there.  Records are now kept about the problems with each PV
  so they be fixed/repaired in the appropriate places.

- In scan, record a bad mda on failure, and delete the mda from
  mda in use list so it will not be used by vg_read or vg_write,
  only by repair.

- In scan, succeed if any good mda on a device is found, instead of
  failing if any is bad.  The bad/old copies of metadata should not
  interfere with normal usage while good copies can be used.

- In scan, add a record of old mdas in lvmcache for later, do not repair
  them while reading, and do not let them prevent us from finding and
  using a good copy of metadata from elsewhere.  One result is that
  "inconsistent metadata" is no longer a read error, but instead a
  record in lvmcache that can be addressed separate from the read.

- Treat a dev with no good mdas like a dev with no mdas, which is an
  existing case we already handle.

- Don't use a fake vg "handle" for returning an error from vg_read,
  or the vg_read_error function for getting that error number;
  just return null if the vg cannot be read or used, and an error_flags
  arg with flags set for the specific kind of error (which can be used
  later for determining the kind of repair.)

- Saving an original copy of the vg metadata, for purposes of reverting
  a write, is now done explicitly in vg_read instead of being hidden in
  the vg_make_handle function.

- When a vg is not accessible due to "access restrictions" but is
  otherwise fine, return the vg through the new error_vg arg so that
  process_each_pv can skip the PVs in the VG while processing.
  (This is a temporary accomodation for the way process_each_pv
  tracks which devs have been looked at, and can be dropped later
  when process_each_pv implementation dev tracking is changed.)

- vg_read does not try to fix or recover a vg, but now just reads the
  metadata, checks access restrictions and returns it.
  (Checking access restrictions might be better done outside of vg_read,
   but this is a later improvement.)

- _vg_read now simply makes one attempt to read metadata from
  each mda, and uses the most recent copy to return to the caller
  in the form of a 'vg' struct.
  (bad mdas were excluded during the scan and are not retried)
  (old mdas were not excluded during scan and are retried here)

- vg_read uses _vg_read to get the latest copy of metadata from mdas,
  and then makes various checks against it to produce warnings,
  and to check if VG access is allowed (access restrictions include:
  writable, foreign, shared, clustered, missing pvs).

- Things that were previously silently/automatically written by vg_read
  that are now done by vg_write, based on the records made in lvmcache
  during the scan and read:
  . clearing the missing flag
  . updating old copies of metadata
  . clearing outdated pvs
  . updating pv header flags

- Bad/corrupt metadata are now repaired; they were not before.

Test changes
------------

- A read command no longer writes the VG to repair it, so add a write
  command to do a repair.
  (inconsistent-metadata, unlost-pv)

- When a missing PV is removed from a VG, and then the device is
  enabled again, vgck --updatemetadata is needed to clear the
  outdated PV before it can be used again, where it wasn't before.
  (lvconvert-repair-policy, lvconvert-repair-raid, lvconvert-repair,
   mirror-vgreduce-removemissing, pv-ext-flags, unlost-pv)

Reading bad/old metadata
------------------------

- "bad metadata": the mda_header or metadata text has invalid fields
  or can't be parsed by lvm.  This is a form of corruption that would
  not be caused by known failure scenarios.  A checksum error is
  typically included among the errors reported.

- "old metadata": a valid copy of the metadata that has a smaller seqno
  than other copies of the metadata.  This can happen if the device
  failed, or io failed, or lvm failed while commiting new metadata
  to all the metadata areas.  Old metadata on a PV that has been
  removed from the VG is the "outdated" case below.

When a VG has some PVs with bad/old metadata, lvm can simply ignore
the bad/old copies, and use a good copy.  This is why there are
multiple copies of the metadata -- so it's available even when some
of the copies cannot be used.  The bad/old copies do not have to be
repaired before the VG can be used (the repair can happen later.)

A PV with no good copies of the metadata simply falls back to being
treated like a PV with no mdas; a common and harmless configuration.

When bad/old metadata exists, lvm warns the user about it, and
suggests repairing it using a new metadata repair command.
Bad metadata in particular is something that users will want to
investigate and repair themselves, since it should not happen and
may indicate some other problem that needs to be fixed.

PVs with bad/old metadata are not the same as missing devices.
Missing devices will block various kinds of VG modification or
activation, but bad/old metadata will not.

Previously, lvm would attempt to repair bad/old metadata whenever
it was read.  This was unnecessary since lvm does not require every
copy of the metadata to be used.  It would also hide potential
problems that should be investigated by the user.  It was also
dangerous in cases where the VG was on shared storage.  The user
is now allowed to investigate potential problems and decide how
and when to repair them.

Repairing bad/old metadata
--------------------------

When label scan sees bad metadata in an mda, that mda is removed
from the lvmcache info->mdas list.  This means that vg_read will
skip it, and not attempt to read/process it again.  If it was
the only in-use mda on a PV, that PV is treated like a PV with
no mdas.  It also means that vg_write will also skip the bad mda,
and not attempt to write new metadata to it.  The only way to
repair bad metadata is with the metadata repair command.

When label scan sees old metadata in an mda, that mda is kept
in the lvmcache info->mdas list.  This means that vg_read will
read/process it again, and likely see the same mismatch with
the other copies of the metadata.  Like the label_scan, the
vg_read will simply ignore the old copy of the metadata and
use the latest copy.  If the command is modifying the vg
(e.g. lvcreate), then vg_write, which writes new metadata to
every mda on info->mdas, will write the new metadata to the
mda that had the old version.  If successful, this will resolve
the old metadata problem (without needing to run a metadata
repair command.)

Outdated PVs
------------

An outdated PV is a PV that has an old copy of VG metadata
that shows it is a member of the VG, but the latest copy of
the VG metadata does not include this PV.  This happens if
the PV is disconnected, vgreduce --removemissing is run to
remove the PV from the VG, then the PV is reconnected.
In this case, the outdated PV needs have its outdated metadata
removed and the PV used flag needs to be cleared.  This repair
will be done by the subsequent repair command.  It is also done
if vgremove is run on the VG.

MISSING PVs
-----------

When a device is missing, most commands will refuse to modify
the VG.  This is the simple case.  More complicated is when
a command is allowed to modify the VG while it is missing a
device.

When a VG is written while a device is missing for one of it's PVs,
the VG metadata is written to disk with the MISSING flag on the PV
with the missing device.  When the VG is next used, it is treated
as if the PV with the MISSING flag still has a missing device, even
if that device has reappeared.

If all LVs that were using a PV with the MISSING flag are removed
or repaired so that the MISSING PV is no longer used, then the
next time the VG metadata is written, the MISSING flag will be
dropped.

Alternative methods of clearing the MISSING flag are:

vgreduce --removemissing will remove PVs with missing devices,
or PVs with the MISSING flag where the device has reappeared.

vgextend --restoremissing will clear the MISSING flag on PVs
where the device has reappeared, allowing the VG to be used
normally.  This must be done with caution since the reappeared
device may have old data that is inconsistent with data on other PVs.

Bad mda repair
--------------

The new command:
vgck --updatemetadata VG

first uses vg_write to repair old metadata, and other basic
issues mentioned above (old metadata, outdated PVs, pv_header
flags, MISSING_PV flags).  It will also go further and repair
bad metadata:

. text metadata that has a bad checksum
. text metadata that is not parsable
. corrupt mda_header checksum and version fields

(To keep a clean diff, #if 0 is added around functions that
are replaced by new code.  These commented functions are
removed by the following commit.)
2019-06-07 15:54:04 -05:00
David Teigland
5dd32680b0 vgcfgbackup add error messages 2019-06-07 15:54:04 -05:00
David Teigland
47effdc025 vgck --updatemetadata is a new command
uses vg_write to correct more common or less severe issues,
and also adds the ability to repair some metadata corruption
that couldn't be handled previously.
2019-06-07 15:54:04 -05:00
David Teigland
89914a541f process_each_pv handle outdated pvs
process_each_pv should account for outdated pvs
in the list of all devices it is processing.
2019-06-07 15:54:04 -05:00
David Teigland
ab61a6d85d move wipe_outdated_pvs to vg_write
and implement it based on a device, not based
on a pv struct (which is not available when the
device is not a part of the vg.)

currently only the vgremove command wipes outdated
pvs until more advanced recovery is added in a
subsequent commit
2019-06-07 15:54:04 -05:00
David Teigland
db98a6e362 Additional MD component checking
If udev info is missing for a device, (which would indicate
if it's an MD component), then do an end-of-device read to
check if a PV is an MD component.  (This is skipped when
using hints since we already know devs in hints are good.)

A new config setting md_component_checks can be used to
disable the additional end-of-device MD checks, or to
always enable end-of-device MD checks.

When both hints and udev info are disabled/unavailable,
the end of PVs will now be scanned by default.  If md
devices with end-of-device superblocks are not being
used, the extra I/O overhead can be avoided by setting
md_component_checks="start".
2019-06-07 13:27:16 -05:00
David Teigland
2b241eb1f6 pvck: use new dump routines for old output
Use the recently added dump routines to produce the
old/traditional pvck output, and remove the code that
had been used for that.

The validation/checking done by the new routines means
that new lines prefixed with CHECK are printed for
incorrect values.
2019-06-05 16:28:52 -05:00
David Teigland
bada89a224 pvck: dump metadata_all
This searches the entire metadata area for any
copy of the metadata and dumps it to file.
2019-06-05 12:25:34 -05:00
David Teigland
d18e491f68 pvck: dump headers and metadata
Add 'pvck --dump headers' to print all the
lvm ondisk structs.  Also checks the values
and prints any problems.

The previous dump metadata is also converted to
use these same routines, which do not depend on lvm
fully scanning/reading/processing the headers and
metadata on disk.  This makes it useful to get data in
cases where there is corruption that would otherwise
prevent the normal functions from working.
2019-06-03 15:13:32 -05:00
David Teigland
645dd27604 separate code for setting devices from metadata parsing
Pull the code that sets devs for PVs out of the metadata
parsing code and call it separately.
2019-05-23 11:57:38 -05:00
David Teigland
52586b1039 pvck: new dump option to extract metadata
The new command 'pvck --dump metadata PV' will extract
the current version of VG metadata from a PV for testing
and debugging.  --dump metadata_area extracts the entire
text metadata area.
2019-05-23 11:49:06 -05:00
David Teigland
6422b9ddc5 move the setting of use_full_md_check flag
from each command to one location in command init.
No functional change.
2019-05-21 11:51:58 -05:00
David Teigland
9f561f2206 pvscan: fix segfault in recent commit
commit aa75b31db5
  "pvscan: handle case of scanning PV without metadata last"

failed to recognize that an arg may be null in the case of
'pvscan --cache' (without -aay) which does not keep track
of complete VGs because it does not need to activate them.
2019-05-03 16:51:34 -05:00
David Teigland
3405ead1e0 pvs: remove unnecessary label scan
The scanning rework missed removing this instance of label scan.
It's no longer needed because of the way that label scan is always
run once from the start of the command.  This unnecessary scan
would be triggered by running 'pvs @tag'.
2019-05-03 16:16:29 -05:00
David Teigland
1e9e21a171 pvscan: don't record PV online after error reading metadata 2019-05-03 14:39:42 -05:00
Zdenek Kabelac
3c70ae1803 clean: avoid cleaning iterator on error path
Return error dirrectly instead of using 'out' code path.
2019-05-03 13:17:22 +02:00
David Teigland
d7054cd28a vgcreate: remove the lvmcache locking workaround
Recent cleanups and simplifications to lvmcache and locking
mean that the odd locking to workaround other issues is now
unnecessary.
2019-04-30 14:26:16 -05:00
David Teigland
366c1ac15b pvcreate: call label scan prior to pvcreate_each_device
and don't call it from inside pvcreate_each_device.
This avoids having to repeat it for users of
pvcreate_each_device (pvcreate/pvremove/vgcreate/vgextend.)
2019-04-30 14:10:27 -05:00
David Teigland
6d0f09f478 pvscan: remove fixme comment that is fixed
Remove the fixme comment describing the case that was
fixed by aa75b31db5
  "pvscan: handle case of scanning PV without metadata last"
2019-04-29 15:44:57 -05:00
David Teigland
c3e385c108 hints: skip hint flock if nolocking option is set 2019-04-29 13:01:15 -05:00
David Teigland
a519be8d4b remove retry for missed PVs in process_each_pv
This is no longer needed with the change to orphan
and global locks.
2019-04-29 13:01:15 -05:00
David Teigland
8c87dda195 locking: unify global lock for flock and lockd
There have been two file locks used to protect lvm
"global state": "ORPHANS" and "GLOBAL".

Commands that used the ORPHAN flock in exclusive mode:
  pvcreate, pvremove, vgcreate, vgextend, vgremove,
  vgcfgrestore

Commands that used the ORPHAN flock in shared mode:
  vgimportclone, pvs, pvscan, pvresize, pvmove,
  pvdisplay, pvchange, fullreport

Commands that used the GLOBAL flock in exclusive mode:
  pvchange, pvscan, vgimportclone, vgscan

Commands that used the GLOBAL flock in shared mode:
  pvscan --cache, pvs

The ORPHAN lock covers the important cases of serializing
the use of orphan PVs.  It also partially covers the
reporting of orphan PVs (although not correctly as
explained below.)

The GLOBAL lock doesn't seem to have a clear purpose
(it may have eroded over time.)

Neither lock correctly protects the VG namespace, or
orphan PV properties.

To simplify and correct these issues, the two separate
flocks are combined into the one GLOBAL flock, and this flock
is used from the locking sites that are in place for the
lvmlockd global lock.

The logic behind the lvmlockd (distributed) global lock is
that any command that changes "global state" needs to take
the global lock in ex mode.  Global state in lvm is: the list
of VG names, the set of orphan PVs, and any properties of
orphan PVs.  Reading this global state can use the global lock
in sh mode to ensure it doesn't change while being reported.

The locking of global state now looks like:

lockd_global()
  previously named lockd_gl(), acquires the distributed
  global lock through lvmlockd.  This is unchanged.
  It serializes distributed lvm commands that are changing
  global state.  This is a no-op when lvmlockd is not in use.

lockf_global()
  acquires an flock on a local file.  It serializes local lvm
  commands that are changing global state.

lock_global()
  first calls lockf_global() to acquire the local flock for
  global state, and if this succeeds, it calls lockd_global()
  to acquire the distributed lock for global state.

Replace instances of lockd_gl() with lock_global(), so that the
existing sites for lvmlockd global state locking are now also
used for local file locking of global state.  Remove the previous
file locking calls lock_vol(GLOBAL) and lock_vol(ORPHAN).

The following commands which change global state are now
serialized with the exclusive global flock:

pvchange (of orphan), pvresize (of orphan), pvcreate, pvremove,
vgcreate, vgextend, vgremove, vgreduce, vgrename,
vgcfgrestore, vgimportclone, vgmerge, vgsplit

Commands that use a shared flock to read global state (and will
be serialized against the prior list) are those that use
process_each functions that are based on processing a list of
all VG names, or all PVs.  The list of all VGs or all PVs is
global state and the shared lock prevents those lists from
changing while the command is processing them.

The ORPHAN lock previously attempted to produce an accurate
listing of orphan PVs, but it was only acquired at the end of
the command during the fake vg_read of the fake orphan vg.
This is not when orphan PVs were determined; they were
determined by elimination beforehand by processing all real
VGs, and subtracting the PVs in the real VGs from the list
of all PVs that had been identified during the initial scan.
This is fixed by holding the single global lock in shared mode
while processing all VGs to determine the list of orphan PVs.
2019-04-29 13:01:05 -05:00
David Teigland
aa75b31db5 pvscan: handle case of scanning PV without metadata last
Handle the case where pvscan --cache -aay (with no dev args)
gets to the final PV, completing the VG, but that final PV does not
have VG metadata.  In this case, we need to use VG metadata from a
previously scanned PV in the same VG, which we saved for this
possibility.  Using this saved metadata, we can find which VG
this PVID belongs to, and then check if that VG is now complete,
and if so add the VG name to the list of complete VGs to be
autoactivated.
2019-04-15 11:27:49 -05:00
David Teigland
41ba2b568b tests: disable unworking pvscan case
and add corresponding fixme in the code
2019-04-12 15:40:38 -05:00
David Teigland
7836e7aa1c pvscan: ignore device with incorrect size
If a device looks like a PV, but its size does not
match the PV size in the metadata, then skip it for
purposes of autoactivation.  It's probably not wrong
device for the PV.
2019-04-05 16:44:00 -05:00
David Teigland
6f18186bfd pvscan: print more reasons for ignoring devices 2019-04-05 15:48:12 -05:00
David Teigland
f58a70c168 pvscan: don't print warning about lvmlockd not running
pvscan --cache ignores shared VGs, so it doesn't need to
consider lvmlockd, and shouldn't include a warning about it.
2019-04-05 14:04:42 -05:00
David Teigland
0ba316f102 pvscan: remove initialization case
In the past, the first 'pvscan --cache -aay dev' command
to run on the system would initialize the pvs_online dir
by scanning all devs and creating online files for all pvs
it found, and then autoactivating the VG (if complete) for
the named dev.  The idea was that the system may not have
been able to run pvscan commands for early devices, so the
first pvscan to run would need to "make up" for any devices
that had appeared previously, which the system was unable to
scan.  The problem or idea of making up for missed scans is
historical and should no longer be needed, so remove this
special init case.
2019-04-05 14:04:02 -05:00
David Teigland
6b89c0d4b7 pvscan: for init only autoactivate vg for named dev
When pvscan is run for the initialization case (the first
pvscan run on the system), it scans all devs and creates
online files for all PVs it finds.  Previously it would
then autoactivate every complete VG, but change this to
only autoactive the (complete) VG corresponding to the
named device arg(s).
2019-04-05 12:46:39 -05:00
David Teigland
417724efe2 pvscan: reorganize code
to simplify and prepare for subsequent change.
Should be no change in behavior.
2019-04-05 12:46:39 -05:00
David Teigland
85e68a8333 lvextend: refresh shared LV remotely using dlm/corosync
When lvextend extends an LV that is active with a shared
lock, use this as a signal that other hosts may also have
the LV active, with gfs2 mounted, and should have the LV
refreshed to reflect the new size.  Use the libdlmcontrol
run api, which uses dlm_controld/corosync to run an
lvchange --refresh command on other cluster nodes.
2019-03-21 12:38:20 -05:00
Zdenek Kabelac
677aa84be3 vdo: enable caching for vdopool LV and vdo LV
Allow using caching with VDO.
User can either cache a single vdopool or
a vdo LV - difference when the caching is put-in depends on a use-case
and it's upto user to decide which kind of speed is expected.
2019-03-20 14:38:31 +01:00
Zdenek Kabelac
030c39073e cache: support vgsplit
Enable vgsplit to work with VG containing cached LVs.
2019-03-20 14:38:02 +01:00
David Teigland
d84134c75b pvscan: fix ignoring foreign PVs
Fix to previous commit
  "pvscan: ignore online for shared and foreign PVs"

which was incorrectly considering a PV foreign if its
VG had no system ID when the host did have a system ID.
2019-03-13 16:03:02 -05:00
David Teigland
4e20ebd6a1 pvscan: ignore online for shared and foreign PVs
Activation would not be allowed anyway, but we can
check for these cases early and avoid wasted time in
pvscan managing online files an attempting activation.
2019-03-05 15:19:05 -06:00
David Teigland
a0c848d4e4 pvscan: ignore online for unused PV
If an unused PV comes online, ignore it from
pvscan --cache.
2019-03-04 14:25:53 -06:00
David Teigland
a9eaab6beb Use "cachevol" to refer to cache on a single LV
and "cachepool" to refer to a cache on a cache pool object.

The problem was that the --cachepool option was being used
to refer to both a cache pool object, and to a standard LV
used for caching.  This could be somewhat confusing, and it
made it less clear when each kind would be used.  By
separating them, it's clear when a cachepool or a cachevol
should be used.

Previously:

- lvm would use the cache pool approach when the user passed
  a cache-pool LV to the --cachepool option.

- lvm would use the cache vol approach when the user passed
  a standard LV in the --cachepool option.

Now:

- lvm will always use the cache pool approach when the user
  uses the --cachepool option.

- lvm will always use the cache vol approach when the user
  uses the --cachevol option.
2019-02-27 08:52:34 -06:00
David Teigland
74460f70ef pvscan: fix hint recreation
Restore part of the fix from f0089472e7 that was lost
in the process of backporting 74a388cca1.
2019-02-26 10:30:11 -06:00
David Teigland
9aea6ae956 logging: add command[pid] and timestamp to file and verbose output
Without this, the output from different commands in a single
log file could not be separated.

Change the default "indent" setting to 0 so that the default
debug output does not include variable spaces in the middle
of debug lines.
2019-02-26 10:03:44 -06:00
David Teigland
74a388cca1 pvscan: autoactivate a VG once
When a VG has multiple PVs, and all those PVs come online
at the same time, concurrent pvscans for each PV will all
create the individual pvid files, and all will often see
the VG is now complete.  This causes each of the pvscan
commands to think it should activate the VG, so there
are multiple activations of the same VG.  The vg lock
serializes them, and only the first pvscan actually does
the activation, but there is still a lot of extra overhead
and time used by the other pvscans that attempt to
activate the already active VG.  This can lead to a backlog
of pvscans and timeouts.

To fix this, this adds a new /run/lvm/vgs_online/ dir that
works like the existing /run/lvm/pvs_online/ dir.  Each pvscan
that wants to activate a VG will first try to exlusively create
the file vgs_online/<vgname>.  Only the first pvscan will
succeed, and that one will do the VG activation. The other
pvscans will find the vgname file exists and will not do the
activation step.

When a PV goes offline, the vgs_online file for the corresponding
VG is removed.  This allows the VG to be autoactivated again
when the PV comes online again.  This requires that the vgname be
stored in the pvid files.
2019-02-21 15:17:41 -06:00
David Teigland
f0089472e7 pvscan: fix autoactivation from concurrent pvscans
Use a file lock to ensure that only one pvscan will do
initialization of pvs_online, otherwise multiple concurrent
pvscans may all see an empty pvs_online directory and
do initialization.

The pvscan that is doing initialization should also only
attempt to activate complete VGs.
2019-02-20 16:33:59 -06:00
David Teigland
0aa51a2f61 hints: fix recreating hints from pvscan
When aay was included in the pvscan --cache command,
the activation part was complaining about the unusual
state of the hint file since it had been recreated
just prior.
2019-02-13 15:23:43 -06:00
Zdenek Kabelac
59b87cf7d6 vdo: document types vdo and vdo-pool 2019-01-28 22:39:10 +01:00
Zdenek Kabelac
87864f09f6 vdo: complete matching with thin syntax
Just like we support for thin-pool syntax:

lvcreate --thinpool new_tpoolname -L Size vg

add same support logic with for vdo-poo:

lvcreate --vdopool new_vpoolname -L Size vg

Also move description of syntax bellow thin-pool, so it's
correctly ordered in generated man page.
2019-01-28 22:18:17 +01:00
Zdenek Kabelac
b64021ee5f lvconvert: pass force and yes options for vdo conversion 2019-01-28 22:17:27 +01:00
David Teigland
30ad845f3d vgscan: drop 'take a while' message
every command does this
2019-01-28 11:22:42 -06:00
Zdenek Kabelac
022ebb0cfe lv_manip: better work with PERCENT_VG modifier
When using 'lvcreate -l100%VG' and there is big disproportion between
real available space and requested setting - automatically fallback
to 100%FREE.

Difference can be seen when VG is big and already most space was
allocated, so the requestion 100%VG can end (and by spec for % modifier
it's correct) as LV with size of 1%VG.  Usually this is not a big
problem - buit in some cases - like cache-pool allocation, this
can result a big difference for chunksize selection.

With this patch it's more closely match common-sense logic without
the need of reitteration of too big changes in lvm2 core ATM.

TODO: in the future there should be allocator solving all allocations
in a single call.
2019-01-21 12:53:15 +01:00
Zdenek Kabelac
e8ea3c9a61 man: missed --zero option for thin-pool creation
During man page rewrite this info got lost and remained
only for lvconvert. So restore it back for lvcreate.
2019-01-21 12:38:47 +01:00
David Teigland
5f102b3421 hints: invalidate when pvscan --cache sees a new PV
An idea from Zdenek for better ensuring valid hints by invalidating
them when pvscan --cache <device> sees a new PV, which is a case
where we know that hints should be invalidated.  This is triggered
from systemd/udev logic, and there may be some cases where it would
invalidate hints that the existing methods wouldn't detect.
2019-01-16 15:34:20 -06:00
David Teigland
e158835a05 lvmlockd: make lockstart wait for existing start
If there are two independent scripts doing:
  vgchange --lockstart vg
  lvchange -ay vg/lv

The first vgchange to do the lockstart will wait for
the lockstart to complete before returning.
The second vgchange to do the lockstart will see that
the start is already in progress (from the first) and
will do nothing.  This means the second does not wait
for any lockstart to complete, and moves on to the
lvchange which may find the lockspace still starting
and fail.

To fix this, make the vgchange lockstart command
wait for any lockstart's in progress to complete.
2019-01-16 10:49:04 -06:00
David Teigland
7b5abc3fb1 hints: fix hint flock when using lvm shell
also cmd->use_hints needs to be set for each shell command
2019-01-15 12:23:16 -06:00
David Teigland
6620dc9475 add device hints to reduce scanning
Save the list of PVs in /run/lvm/hints.  These hints
are used to reduce scanning in a number of commands
to only the PVs on the system, or only the PVs in a
requested VG (rather than all devices on the system.)
2019-01-15 10:23:47 -06:00
David Teigland
bc40391b7d writecache: use wipe_lv to warn about specific signatures
When initializing an LV to hold the writecache, use wipe_lv()
which looks for specific signatures on the LV.

Wiping signatures is not necessary, but printing a warning
that names a specific signature (in addition to the existing
generic warning/confirmation) may help if a user accidentally
specifies the wrong LV which contains something important.
2019-01-03 10:47:35 -06:00
David Teigland
938b6b8253 writecache: prompt before using an LV to hold cache 2019-01-02 11:44:03 -06:00
David Teigland
89c61f2018 Revert "lvconvert: use standard wiping code"
This reverts commit fb85d5d024.

Adding a confirmation prompt in the following commit so the
wiping confirmation won't be needed.
2019-01-02 11:21:45 -06:00
Zdenek Kabelac
88faf5a53b debug: drop some unneeded backtraces 2018-12-22 23:55:48 +01:00
Zdenek Kabelac
a355aeb17a cov: looks like cut&paste error
Fua and nofua code path should have different compares.
2018-12-21 21:45:08 +01:00
Zdenek Kabelac
8db2527c6e cov: ensure lock_type is not NULL 2018-12-21 21:45:08 +01:00
Zdenek Kabelac
e2c017fdac mangenerator: check strdup was successfull
Check for strdup != NULL
and drop unneeded zeroing when buffer is overwritten.
2018-12-21 21:45:08 +01:00
Zdenek Kabelac
2724a09e58 debug: tracing close errors 2018-12-21 21:45:08 +01:00
Zdenek Kabelac
095c9791ca debug: drop some extra backtraces
Unneeded tracking after log_*.
2018-12-21 21:45:08 +01:00
Zdenek Kabelac
65cb8efd16 lvconvert: writecache fix return code
Detach function return 0 for error and 1 for success.
Add missing log errors from failing deactivation.
Add missing log error from failing synchronization.
2018-12-21 21:42:30 +01:00
Zdenek Kabelac
fb85d5d024 lvconvert: use standard wiping code 2018-12-21 21:42:30 +01:00
Zdenek Kabelac
18aa541ca2 configure: avoid repeative inclusion of configure.h
Since configure.h is a generated header and it's missing traditional
ifdefs preambule - it can be included & parsed multiple times.
Normally compiler is fine when defines have same value and there is
no warning - yet we don't need to parse this several times
and by adding -include  directive we can ensure every file
in the package is rightly compile with configure.h as the
first header file.
2018-12-21 19:19:50 +01:00
Marian Csontos
f05104af76 cov: Close a FD on error 2018-12-19 16:29:31 +01:00
Zdenek Kabelac
5db56b36f1 makefile: fixes build for older system
With older gcc - we need to resolve symbols linked with devmapper-event
that is now using -ldevmapper.

Also add forgotten systemd library needed for dbus notification.
2018-12-17 11:41:38 +01:00
Zdenek Kabelac
94237354dd makefiles: correcting login of makefile
Fixing some ordering issue with inclusion of common make.tmpl.
Correcting dependency calculation
Simplifying inclusive makefile
2018-12-17 10:55:20 +01:00
Zdenek Kabelac
d76b4afb8e makefiles: sort 2018-12-17 10:36:52 +01:00
Zdenek Kabelac
f514e37978 lvconvert: ensure proper init of pv_list 2018-12-14 22:27:33 +01:00
Zdenek Kabelac
454024f957 makefiles: drop unneeded include path 2018-12-14 15:14:48 +01:00
Zdenek Kabelac
0b19387dae headers: use configure.h as 1st. header
Ensure configure.h is always 1st. included header.
Maybe we could eventually introduce gcc -include option, but for now
this better uses dependency tracking.

Also move _REENTRANT and _GNU_SOURCE into configure.h so it
doesn't need to be present in various source files.
This ensures consistent compilation of headers like stdio.h since
it may produce different declaration.
2018-12-14 15:09:13 +01:00
Heinz Mauelshagen
dd5716ddf2 raid: fix (de)activation of RaidLVs with visible SubLVs
There's a small window during creation of a new RaidLV when
rmeta SubLVs are made visible to wipe them in order to prevent
erroneous discovery of stale RAID metadata.  In case a crash
prevents the SubLVs from being committed hidden after such
wiping, the RaidLV can still be activated with the SubLVs visible.
During deactivation though, a deadlock occurs because the visible
SubLVs are deactivated before the RaidLV.

The patch adds _check_raid_sublvs to the raid validation in merge.c,
an activation check to activate.c (paranoid, because the merge.c check
will prevent activation in case of visible SubLVs) and shares the
existing wiping function _clear_lvs in raid_manip.c moved to lv_manip.c
and renamed to activate_and_wipe_lvlist to remove code duplication.
Whilst on it, introduce activate_and_wipe_lv to share with
(lvconvert|lvchange).c.

Resolves: rhbz1633167
2018-12-11 16:35:34 +01:00
Heinz Mauelshagen
edb72cb70c lvcreate/lvconvert: prohibit creation of/conversion to mirrored mirror logs
In RHEL7 we marked mirrored mirror logs as deprecated and
added a related message.  This patch prohibits creating new
'mirror' LVs with that log type or converting existing LVs
to have one.

Existing LVs with mirrored mirror log can be activated
and converted to disk/core logs.

Avoid double deprecation message when running lvconvert.

Resolves: rhbz1643562
2018-12-08 02:52:50 +01:00
David Teigland
a4b8377488 lvmlockd: fix missing LV lock for lvconvert repair
Add missing lvmlockd LV lock for lvconvert repair
on mirror and thin/cache pools.
2018-12-07 13:11:31 -06:00
David Teigland
3d2fd95af7 remove unused full filter
it's the same as cmd->filter
2018-12-04 14:06:46 -06:00
David Teigland
c1b2de936c pvscan: use correct dev filters
pvscan was still using lvmetad_filter which has been
null since lvmetad was removed.  Switch it to use the
full_filter.
2018-12-03 12:58:46 -06:00
Zdenek Kabelac
ceb2f0ad3b makefiles: updates for less verbosity 2018-11-29 23:05:43 +01:00
David Teigland
ea9b2c2122 lvmlockd: vgchange locktype with yes option
for auto response to yes/no prompt.
2018-11-27 14:40:24 -06:00
David Teigland
904e1e3d26 Place the first PE at 1 MiB for all defaults
. When using default settings, this commit should change
  nothing.  The first PE continues to be placed at 1 MiB
  resulting in a metadata area size of 1020 KiB (for
  4K page sizes; slightly smaller for larger page sizes.)

. When default_data_alignment is disabled in lvm.conf,
  align pe_start at 1 MiB, based on a default metadata area
  size that adapts to the page size.  Previously, disabling
  this option would result in mda_size that was too small
  for common use, and produced a 64 KiB aligned pe_start.

. Customized pe_start and mda_size values continue to be
  set as before in lvm.conf and command line.

. Remove the configure option for setting default_data_alignment
  at build time.

. Improve alignment related option descriptions.

. Add section about alignment to pvcreate man page.

Previously, DEFAULT_PVMETADATASIZE was 255 sectors.
However, the fact that the config setting named
"default_data_alignment" has a default value of 1 (MiB)
meant that DEFAULT_PVMETADATASIZE was having no effect.

The metadata area size is the space between the start of
the metadata area (page size offset from the start of the
device) and the first PE (1 MiB by default due to
default_data_alignment 1.)  The result is a 1020 KiB metadata
area on machines with 4KiB page size (1024 KiB - 4 KiB),
and smaller on machines with larger page size.

If default_data_alignment was set to 0 (disabled), then
DEFAULT_PVMETADATASIZE 255 would take effect, and produce a
metadata area that was 188 KiB and pe_start of 192 KiB.
This was too small for common use.

This is fixed by making the default metadata area size a
computed value that matches the value produced by
default_data_alignment.
2018-11-26 16:36:50 -06:00
David Teigland
229e63b638 writecache: set block_size using --cachesettings
instead of a separate --writecacheblocksize option.
writecache block_size is not technically a setting,
but it can borrow the option as a special case.
2018-11-21 15:16:23 -06:00
David Teigland
3ca8ed66a7 remove unused backgroundfork option 2018-11-14 09:34:49 -06:00
David Teigland
819b469880 pvscan: background option is not used
Move this into the list of ignored options so
it doesn't appear in the man page.
2018-11-13 17:27:53 -06:00
David Teigland
970f49dcab man: remove cluster references 2018-11-13 16:15:41 -06:00
David Teigland
3ae5569570 Add dm-writecache support
dm-writecache is used like dm-cache with a standard LV
as the cache.

$ lvcreate -n main -L 128M -an foo /dev/loop0

$ lvcreate -n fast -L 32M -an foo /dev/pmem0

$ lvconvert --type writecache --cachepool fast foo/main

$ lvs -a foo -o+devices
  LV            VG  Attr       LSize   Origin        Devices
  [fast]        foo -wi-------  32.00m               /dev/pmem0(0)
  main          foo Cwi------- 128.00m [main_wcorig] main_wcorig(0)
  [main_wcorig] foo -wi------- 128.00m               /dev/loop0(0)

$ lvchange -ay foo/main

$ dmsetup table
foo-main_wcorig: 0 262144 linear 7:0 2048
foo-main: 0 262144 writecache p 253:4 253:3 4096 0
foo-fast: 0 65536 linear 259:0 2048

$ lvchange -an foo/main

$ lvconvert --splitcache foo/main

$ lvs -a foo -o+devices
  LV   VG  Attr       LSize   Devices
  fast foo -wi-------  32.00m /dev/pmem0(0)
  main foo -wi------- 128.00m /dev/loop0(0)
2018-11-06 14:18:41 -06:00
David Teigland
cac4a9743a Allow dm-cache cache device to be standard LV
If a single, standard LV is specified as the cache, use
it directly instead of converting it into a cache-pool
object with two separate LVs (for data and metadata).

With a single LV as the cache, lvm will use blocks at the
beginning for metadata, and the rest for data.  Separate
dm linear devices are set up to point at the metadata and
data areas of the LV.  These dm devs are given to the
dm-cache target to use.

The single LV cache cannot be resized without recreating it.

If the --poolmetadata option is used to specify an LV for
metadata, then a cache pool will be created (with separate
LVs for data and metadata.)

Usage:

$ lvcreate -n main -L 128M vg /dev/loop0

$ lvcreate -n fast -L 64M vg /dev/loop1

$ lvs -a vg
  LV   VG Attr       LSize   Type   Devices
  main vg -wi-a----- 128.00m linear /dev/loop0(0)
  fast vg -wi-a-----  64.00m linear /dev/loop1(0)

$ lvconvert --type cache --cachepool fast vg/main

$ lvs -a vg
  LV           VG Attr       LSize   Origin       Pool  Type   Devices
  [fast]       vg Cwi---C---  64.00m                     linear /dev/loop1(0)
  main         vg Cwi---C--- 128.00m [main_corig] [fast] cache  main_corig(0)
  [main_corig] vg owi---C--- 128.00m                     linear /dev/loop0(0)

$ lvchange -ay vg/main

$ dmsetup ls
vg-fast_cdata   (253:4)
vg-fast_cmeta   (253:5)
vg-main_corig   (253:6)
vg-main (253:24)
vg-fast (253:3)

$ dmsetup table
vg-fast_cdata: 0 98304 linear 253:3 32768
vg-fast_cmeta: 0 32768 linear 253:3 0
vg-main_corig: 0 262144 linear 7:0 2048
vg-main: 0 262144 cache 253:5 253:4 253:6 128 2 metadata2 writethrough mq 0
vg-fast: 0 131072 linear 7:1 2048

$ lvchange -an vg/min

$ lvconvert --splitcache vg/main

$ lvs -a vg
  LV   VG Attr       LSize   Type   Devices
  fast vg -wi-------  64.00m linear /dev/loop1(0)
  main vg -wi------- 128.00m linear /dev/loop0(0)
2018-11-06 13:44:54 -06:00
David Teigland
8c9d9a7446 cache: factor lvchange_cache
to prepare for future addition
2018-11-06 11:36:34 -06:00
Zdenek Kabelac
a1b1b3dbb6 pvscan: add error checking for write of online files
When there is any write failure during writting file,
report this upward as error and fail command instead
of continuing futher.
2018-11-06 15:05:44 +01:00
Zdenek Kabelac
aa8b2d6a0f cleanup: move cast to det_t into MKDEV macro 2018-11-05 17:25:11 +01:00
Zdenek Kabelac
70e3d0a613 cov: remove unused assigns 2018-11-05 17:25:11 +01:00
David Teigland
f6a54a50a0 lvmlockd: deactivate lvmlock LV in vgchange
When changing a VG to lock_type sanlock, the internal
lvmlock LV was left active at the end of vgchange.
It shouldn't be active until lockstart.
2018-11-01 13:25:21 -05:00
Zdenek Kabelac
c4f39decc8 cov: pvscan ensure sigle_devs list is always initialized 2018-10-15 17:49:44 +02:00
Zdenek Kabelac
70950bbd97 cov: log failing unlink 2018-10-15 17:49:44 +02:00
Zdenek Kabelac
f1ac130dc1 cov: check closedir result
Log problems around failing closedir().
2018-10-15 17:49:44 +02:00
Zdenek Kabelac
b1ff52ca14 cov: check dev_close_immediate
Function can report log_error() on fail path.
2018-10-15 17:49:44 +02:00
Zdenek Kabelac
2ab784440a cov: fix leaking openned file descriptors
Once the FD is no longer needed, close it.
2018-10-15 17:49:44 +02:00
Zdenek Kabelac
1bb30a8c27 cov: warn about failing sigaction 2018-10-15 14:24:28 +02:00
David Teigland
1365f0d4c8 remove unneded check to skip filter init
There's no more persistent filter so we don't need
to check for it.
2018-09-12 16:30:50 -05:00
David Teigland
0aeca60aaa fix readonly activation override options
This fixes a problem in commit e6bb780d24, in which the
back compat handling for the old locking_type=4 was
incorrectly translated to mean the same thing as --readonly,
which prevented activation because activation uses an
exclusive vg lock.  Previously, locking_type=4 allowed
activation.

If we see locking_type 4 in an old config, translate it to
the new combination of --readonly and --sysinit, which we
now define to mean the --readonly behavior with an exception
to allow activation.
2018-09-12 16:30:50 -05:00