1
0
mirror of git://sourceware.org/git/lvm2.git synced 2024-12-22 17:35:59 +03:00
Commit Graph

6722 Commits

Author SHA1 Message Date
David Teigland
47effdc025 vgck --updatemetadata is a new command
uses vg_write to correct more common or less severe issues,
and also adds the ability to repair some metadata corruption
that couldn't be handled previously.
2019-06-07 15:54:04 -05:00
David Teigland
de3d3b11f4 move pv header repairs to vg_write
Correct PV header in-use or version fields
from vg_write instead of vg_read.
2019-06-07 15:54:04 -05:00
David Teigland
ab61a6d85d move wipe_outdated_pvs to vg_write
and implement it based on a device, not based
on a pv struct (which is not available when the
device is not a part of the vg.)

currently only the vgremove command wipes outdated
pvs until more advanced recovery is added in a
subsequent commit
2019-06-07 15:54:04 -05:00
David Teigland
45b164f62c create separate lvmcache update functions for read and write
The vg read and vg write cases need to update lvmcache
differently, so create separate functions for them.

The read case now handles checking for outdated mdas
and moves them aside into a new list to be repaired in
a subsequent commit.
2019-06-07 15:54:04 -05:00
David Teigland
027e0e92e6 fix vg_commit return value
The existing comment was desribing the correct behavior,
but the code didn't match.  The commit is successful if
one mda was committed.  Making it depend on the result of
the internal lvmcache update was wrong.
2019-06-07 15:54:04 -05:00
David Teigland
86d831b916 change args for text label read function
Have the caller pass the label_sector to the read
function so the read function can set the sector
field in the label struct, instead of having the
read function return a pointer to the label for
the caller to set the sector field.

Also have the read function return a flag indicating
to the caller that the scanned device was identified
as a duplicate pv.
2019-06-07 15:54:04 -05:00
David Teigland
889b5d3183 add mda arg to add_mda
Allow the caller of lvmcache_add_mda() to have the
new mda returned.
2019-06-07 15:54:04 -05:00
David Teigland
b2447e3538 keep track of which mdas have old metadata in lvmcache
This will be used for more advanced repair in a
subsequent commit.
2019-06-07 15:54:04 -05:00
David Teigland
0b18c25d93 ability to keep track of outdated pvs in lvmcache
Outdated PVs hold metadata for VG from which they
have been removed.  Add the ability to keep track
of these in lvmcache.
This will be used for more advanced repair in a
subsequent commit.
2019-06-07 15:54:04 -05:00
David Teigland
650524b955 ability to keep track of bad mdas in lvmcache
mda's that cannot be processed by lvm because of
some corruption can be kept on a separate list.
These will be used for more advanced repair in a
subsequent commit.
2019-06-07 15:54:04 -05:00
David Teigland
aeafdc1f45 add flags to keep track of bad metadata
When reading metadata headers and text, use a new set
of flags to identify specific errors that are seen.
These will be used for more advanced repair in a
subsequent commit.
2019-06-07 15:54:04 -05:00
David Teigland
db98a6e362 Additional MD component checking
If udev info is missing for a device, (which would indicate
if it's an MD component), then do an end-of-device read to
check if a PV is an MD component.  (This is skipped when
using hints since we already know devs in hints are good.)

A new config setting md_component_checks can be used to
disable the additional end-of-device MD checks, or to
always enable end-of-device MD checks.

When both hints and udev info are disabled/unavailable,
the end of PVs will now be scanned by default.  If md
devices with end-of-device superblocks are not being
used, the extra I/O overhead can be avoided by setting
md_component_checks="start".
2019-06-07 13:27:16 -05:00
David Teigland
2bcd43c683 lvmcache: remove unused_duplicate_devs list from cmd
Save the previous duplicate PVs in a global list instead
of a list on the cmd struct.  dmeventd reuses the cmd struct
for multiple commands, and the list entries between commands
were being freed (apparently), causing a segfault in dmeventd
when it tried to use items in cmd->unused_duplicate_devs
that had been saved there by the previous command.
2019-06-07 10:14:33 -05:00
David Teigland
2b241eb1f6 pvck: use new dump routines for old output
Use the recently added dump routines to produce the
old/traditional pvck output, and remove the code that
had been used for that.

The validation/checking done by the new routines means
that new lines prefixed with CHECK are printed for
incorrect values.
2019-06-05 16:28:52 -05:00
Zdenek Kabelac
e3c4ab0cc7 cache: support no_discard_passdown
Recent kernel version from kernel commit:
de7180ff908b2bc0342e832dbdaa9a5f1ecaa33a
started to report in cache status line new flag:
no_discard_passdown

Whenever lvm spots unknown status it reports:
Unknown feature in status:

So add reconginzing this feature flag and also report this with

'lvs -o+kernel_discards'

When no_discard_passdown is found in status 'nopassdown' gets reported
for this field  (roughly matching what we report for thin-pools).
2019-06-05 15:48:41 +02:00
David Teigland
d18e491f68 pvck: dump headers and metadata
Add 'pvck --dump headers' to print all the
lvm ondisk structs.  Also checks the values
and prints any problems.

The previous dump metadata is also converted to
use these same routines, which do not depend on lvm
fully scanning/reading/processing the headers and
metadata on disk.  This makes it useful to get data in
cases where there is corruption that would otherwise
prevent the normal functions from working.
2019-06-03 15:13:32 -05:00
David Teigland
645dd27604 separate code for setting devices from metadata parsing
Pull the code that sets devs for PVs out of the metadata
parsing code and call it separately.
2019-05-23 11:57:38 -05:00
David Teigland
52586b1039 pvck: new dump option to extract metadata
The new command 'pvck --dump metadata PV' will extract
the current version of VG metadata from a PV for testing
and debugging.  --dump metadata_area extracts the entire
text metadata area.
2019-05-23 11:49:06 -05:00
David Teigland
dc1e12dcd4 scan: expand and update label scan comments 2019-05-21 12:02:40 -05:00
David Teigland
60bf9c9f33 hints: exclude md components
In some cases md components could be included in
the hints, so add a check to hint creation to make
sure they are excluded.
2019-05-21 11:58:01 -05:00
David Teigland
19ef399ea7 devs: rename dev_is_md dev_is_md_component
The naming was confusing and misleading since
it it's testing if a device is an md component,
not an md device.
2019-05-21 11:44:39 -05:00
David Teigland
6078585381 add md component check in vg_read based on size
If an md component is not excluded by other means and
vg_read is used to read metadata from it, then this new
check compares the device size with the PV size, and runs
a full md check on the device if the sizes don't match.
2019-05-03 14:39:42 -05:00
Zdenek Kabelac
d60d59a5f3 cleanup: use unsigned type 2019-05-03 13:17:22 +02:00
Zdenek Kabelac
7a5ea681fb build: fix compilation without lvmlockd 2019-05-03 13:17:22 +02:00
Zdenek Kabelac
a520b3002c locking: validate locking mode
Ensure 'ret' is always defined and validate 'mode'.
2019-05-03 13:17:22 +02:00
David Teigland
99de816a1b scan: remove comments about lvmetad 2019-05-02 13:32:30 -05:00
David Teigland
0046c4e7a7 use memcpy for constant ondisk strings
Use memcpy/memcmp for on disk strings which are not
null terminated: FMTT_MAGIC, LVM2_LABEL and LABEL_ID.
Quiets compile warnings.
2019-05-02 12:59:50 -05:00
David Teigland
adfb9bf20c remove unused string writecache 2019-05-01 16:50:14 -05:00
David Teigland
90b94ead12 lvmcache: remove unused flag
The new label scan design is never called recursively,
so we don't need a flag to check for that.
2019-04-30 14:59:27 -05:00
David Teigland
c3e385c108 hints: skip hint flock if nolocking option is set 2019-04-29 13:01:15 -05:00
David Teigland
8c87dda195 locking: unify global lock for flock and lockd
There have been two file locks used to protect lvm
"global state": "ORPHANS" and "GLOBAL".

Commands that used the ORPHAN flock in exclusive mode:
  pvcreate, pvremove, vgcreate, vgextend, vgremove,
  vgcfgrestore

Commands that used the ORPHAN flock in shared mode:
  vgimportclone, pvs, pvscan, pvresize, pvmove,
  pvdisplay, pvchange, fullreport

Commands that used the GLOBAL flock in exclusive mode:
  pvchange, pvscan, vgimportclone, vgscan

Commands that used the GLOBAL flock in shared mode:
  pvscan --cache, pvs

The ORPHAN lock covers the important cases of serializing
the use of orphan PVs.  It also partially covers the
reporting of orphan PVs (although not correctly as
explained below.)

The GLOBAL lock doesn't seem to have a clear purpose
(it may have eroded over time.)

Neither lock correctly protects the VG namespace, or
orphan PV properties.

To simplify and correct these issues, the two separate
flocks are combined into the one GLOBAL flock, and this flock
is used from the locking sites that are in place for the
lvmlockd global lock.

The logic behind the lvmlockd (distributed) global lock is
that any command that changes "global state" needs to take
the global lock in ex mode.  Global state in lvm is: the list
of VG names, the set of orphan PVs, and any properties of
orphan PVs.  Reading this global state can use the global lock
in sh mode to ensure it doesn't change while being reported.

The locking of global state now looks like:

lockd_global()
  previously named lockd_gl(), acquires the distributed
  global lock through lvmlockd.  This is unchanged.
  It serializes distributed lvm commands that are changing
  global state.  This is a no-op when lvmlockd is not in use.

lockf_global()
  acquires an flock on a local file.  It serializes local lvm
  commands that are changing global state.

lock_global()
  first calls lockf_global() to acquire the local flock for
  global state, and if this succeeds, it calls lockd_global()
  to acquire the distributed lock for global state.

Replace instances of lockd_gl() with lock_global(), so that the
existing sites for lvmlockd global state locking are now also
used for local file locking of global state.  Remove the previous
file locking calls lock_vol(GLOBAL) and lock_vol(ORPHAN).

The following commands which change global state are now
serialized with the exclusive global flock:

pvchange (of orphan), pvresize (of orphan), pvcreate, pvremove,
vgcreate, vgextend, vgremove, vgreduce, vgrename,
vgcfgrestore, vgimportclone, vgmerge, vgsplit

Commands that use a shared flock to read global state (and will
be serialized against the prior list) are those that use
process_each functions that are based on processing a list of
all VG names, or all PVs.  The list of all VGs or all PVs is
global state and the shared lock prevents those lists from
changing while the command is processing them.

The ORPHAN lock previously attempted to produce an accurate
listing of orphan PVs, but it was only acquired at the end of
the command during the fake vg_read of the fake orphan vg.
This is not when orphan PVs were determined; they were
determined by elimination beforehand by processing all real
VGs, and subtracting the PVs in the real VGs from the list
of all PVs that had been identified during the initial scan.
This is fixed by holding the single global lock in shared mode
while processing all VGs to determine the list of orphan PVs.
2019-04-29 13:01:05 -05:00
David Teigland
ccd1386070 wipe_lv: initially open LV in writable mode
wipe_lv knows it's going to write the device, so it
can open rw from the start.  It was opening readonly,
and then dev_write needed to reopen it readwrite.
2019-04-26 14:49:27 -05:00
David Teigland
d0b869e46a hints: fix non-empty hints list when not using hints
When hints are invalid and ignored, the list of hints
could be non-empty (from additions before an invalid
hint was found).  This confused the calling code which
was checking for an empty list to see if hints were used.
Ensure the list is empty when hints are not used.
2019-04-11 11:58:51 -05:00
David Teigland
0cc80ccfd5 hints: fix case of error getting device size
When checking hints, if there's an error getting
the device size, that should be equivalent to
seeing zero size.
2019-04-11 10:32:28 -05:00
David Teigland
6f18186bfd pvscan: print more reasons for ignoring devices 2019-04-05 15:48:12 -05:00
David Teigland
c33770c02d lvmlockd: do not allow mirror LV to be activated shared
This reverts 518a8e8cfb
  "lvmlockd: activate mirror LVs in shared mode with cmirrord"

because while activating a mirror LV with cmirrord worked,
changes to the active cmirror did not work.
2019-04-04 13:21:38 -05:00
Zdenek Kabelac
fcec6691f0 thin: fix maintenance of _pmspare
When metadata grows lvm2 may need to extend also _pmspare volume.
2019-04-03 13:28:54 +02:00
Zdenek Kabelac
e27d027155 thin: resize metadata with data
When data are growing, adapt also size of metadata.
As we get way too many reports from users doing huge growths of
data portion while keep metadata small and avoiding using monitoring.

So to enhance the user-experience in case user requests grown of
thin-pool (without passing PV list for growth) - lvm2 will automaticaly
grown also the metadata part of thin-pool (if possible).
2019-04-03 13:28:22 +02:00
Zdenek Kabelac
7c3de2fd93 thin: introduce estimate_thin_pool_metadata_size
Add function for estimation of thin-pool metadata size for given size of
data. Function is using already existing internal API so it can
be reused for resize of thin-pool data.
2019-04-03 13:27:17 +02:00
Zdenek Kabelac
bca0a4df9a filter: fix mpath test
Fix bug which leaked into commit
dc6dea4033,
where the testing code got mistakenly commited.
2019-04-03 13:27:17 +02:00
David Teigland
2f471f0184 lvresize: fix when compiled without lvmlockd
The no-op result of lockd_lv_resize should be success.
2019-04-02 10:51:38 -05:00
David Teigland
85e68a8333 lvextend: refresh shared LV remotely using dlm/corosync
When lvextend extends an LV that is active with a shared
lock, use this as a signal that other hosts may also have
the LV active, with gfs2 mounted, and should have the LV
refreshed to reflect the new size.  Use the libdlmcontrol
run api, which uses dlm_controld/corosync to run an
lvchange --refresh command on other cluster nodes.
2019-03-21 12:38:20 -05:00
David Teigland
d369de8399 lvextend: allow on LV active with a shared lock
Detect when a shared lock exists, don't require the
normal exclusive lock, and allow the lvextend.
2019-03-21 12:38:20 -05:00
David Teigland
9b4926aaff warn about changes to an active lv with shared lock
When an LV is active with a shared lock, a command can be
run to change the LV with --lockopt skiplv (to override the
exclusive lock the command ordinarily requires which is not
compatible with the outstanding shared lock.)

In this case, other commands may have the LV active and may
need to refresh the LV, so print warning stating this.
2019-03-21 12:38:20 -05:00
Zdenek Kabelac
4411fe2ba8 activation: synchronize before removing devices
Udev is running udev-rule action upon 'resume'.

However lvm2 in special case is doing replacement of
'soon-to-be-removed' device with 'error' target for resuming
and then follows actual removal - the sequence is usually quick,
so when udev start action - it can result in 'strange' error
message in kernel log like:

Process '/usr/sbin/dmsetup info -j 253 -m 17 -c --nameprefixes --noheadings --rows -o name,uuid,suspended' failed with exit code 1.

To avoid this - we need to ensure there is synchronization wait for udev
between 'resume'  and 'remove' part of this process.

However existing code put strict requirement to avoid synchronizing with
udev inside critical section - but this originally came from requirement
to not do anything special while there could be devices in
suspend-state. Now we are able to see differnce between critical section
with or without suspended devices.  For udev synchronization only
suspended devices are prohibited to be there - so slightly relax
condition and allow calling and using 'fs_sync()' even inside critical
section - but there must not be any suspended device.
2019-03-20 14:39:09 +01:00
Zdenek Kabelac
677aa84be3 vdo: enable caching for vdopool LV and vdo LV
Allow using caching with VDO.
User can either cache a single vdopool or
a vdo LV - difference when the caching is put-in depends on a use-case
and it's upto user to decide which kind of speed is expected.
2019-03-20 14:38:31 +01:00
Zdenek Kabelac
0db22c5f81 lv_manip: insert remove layer skips pools
Fixing renaming of subLVs when removing and inserting layers - this
got visible when using stacked VDO pools.
2019-03-20 14:38:05 +01:00
Zdenek Kabelac
1cc690e911 thin: max thin 2019-03-20 14:37:44 +01:00
Zdenek Kabelac
74b5f22838 debug: use log_warn
This reports are not causing command failure, so report them as
warning.
2019-03-20 14:37:44 +01:00
Zdenek Kabelac
dc6dea4033 filter: enhance mpath detection
Internal detection of SCSI device being in-use by DM mpath has been
performed several times for each component device - this could be
eventually racy - so instead when we do remember  1st. checked result
for device being mpath and use it consistenly over the filter runtime.
2019-03-20 14:37:42 +01:00
Zdenek Kabelac
1eeb2fa3f6 dev_manager: add dev_manager_remove_dm_major_minor
Move DM usage into dev_manager.c source file.
Also convert STATUS to INFO ioctl - as that's enough
to obtain UUID - this also avoid issuing unwanted flush on checked DM
device for being mpath.
2019-03-20 14:37:10 +01:00
David Teigland
9b2b0fef9c config: improve scan_lvs description 2019-03-06 13:33:07 -06:00
David Teigland
4e20ebd6a1 pvscan: ignore online for shared and foreign PVs
Activation would not be allowed anyway, but we can
check for these cases early and avoid wasted time in
pvscan managing online files an attempting activation.
2019-03-05 15:19:05 -06:00
David Teigland
7edbf8a441 io: increase the default io memory from 4 to 8 MiB
This is the default bcache size that is created at the
start of the command.  It needs to be large enough to
hold a single copy of metadata for a given VG, or the
VG cannot be read or written (since the entire VG would
not fit into available memory.)

Increasing the default reduces the chances of anyone
needing to increase the default to use their VG.

The size can be set in lvm.conf global/io_memory_size;
the lower limit is 4 MiB and the upper limit is 128 MiB.
2019-03-04 12:14:06 -06:00
David Teigland
3584e0c0d5 io: warn when metadata size approaches io memory size
When a single copy of metadata gets within 1MB of the
current io_memory_size value, begin printing a warning
that the io_memory_size should be increased.
2019-03-04 12:13:09 -06:00
David Teigland
dd8d083795 config: add new setting io_memory_size
which defines the amount of memory that lvm will allocate
for bcache.  Increasing this setting is required if it is
smaller than a single copy of VG metadata.
2019-03-04 11:36:21 -06:00
David Teigland
3ed9256985 remove unused io functions 2019-02-28 10:58:00 -06:00
David Teigland
fb83719d7f logging: remove unused code
Incomplete bits of original code that's unused.
2019-02-28 10:30:54 -06:00
David Teigland
a9eaab6beb Use "cachevol" to refer to cache on a single LV
and "cachepool" to refer to a cache on a cache pool object.

The problem was that the --cachepool option was being used
to refer to both a cache pool object, and to a standard LV
used for caching.  This could be somewhat confusing, and it
made it less clear when each kind would be used.  By
separating them, it's clear when a cachepool or a cachevol
should be used.

Previously:

- lvm would use the cache pool approach when the user passed
  a cache-pool LV to the --cachepool option.

- lvm would use the cache vol approach when the user passed
  a standard LV in the --cachepool option.

Now:

- lvm will always use the cache pool approach when the user
  uses the --cachepool option.

- lvm will always use the cache vol approach when the user
  uses the --cachevol option.
2019-02-27 08:52:34 -06:00
David Teigland
c8fc18e8bf config: make hints setting commented 2019-02-26 15:54:30 -06:00
David Teigland
90149c303e logging: new config settings to specify debug fields
For users who do not want all of the fields included
in debug lines, let them specify in lvm.conf which
fields to include.  timestamp, command[pid], and
file:line fields can all be disabled.
2019-02-26 14:42:16 -06:00
David Teigland
9aea6ae956 logging: add command[pid] and timestamp to file and verbose output
Without this, the output from different commands in a single
log file could not be separated.

Change the default "indent" setting to 0 so that the default
debug output does not include variable spaces in the middle
of debug lines.
2019-02-26 10:03:44 -06:00
David Teigland
7be6791e70 config: change scan_lvs default to 0
so that lvm does not scan LVs for PVs by default.
2019-02-20 13:30:46 -06:00
David Teigland
0aa51a2f61 hints: fix recreating hints from pvscan
When aay was included in the pvscan --cache command,
the activation part was complaining about the unusual
state of the hint file since it had been recreated
just prior.
2019-02-13 15:23:43 -06:00
David Teigland
3ebce8dbd2 apply obtain_device_list_from_udev to all libudev usage
udev_dev_is_md_component and udev_dev_is_mpath_component
are not used for obtaining the device list, but they still
use libudev for device info.  When there are problems with
udev, these functions can get stuck. So, use the existing
obtain_device_list_from_udev config setting to also control
whether these "is component" functions are used, which gives
us a way to avoid using libudev entirely when it's causing
problems.
2019-02-05 10:15:40 -06:00
Zdenek Kabelac
d19e372795 cleanup: indent 2019-01-28 22:39:10 +01:00
Zdenek Kabelac
78dd9d820d thin: select chunk size as power of 2
Whenever thin-pool chunk size is unspecified and left for lvm calculation
try to select the size as nearest highest power-of-2 instead of
just being a multiple of 64KiB.
2019-01-28 22:17:25 +01:00
Zdenek Kabelac
58ad831c72 cache: select chunk size as power of 2
When cache chunk size is not configured, and left for lvm deduction,
select the value which is power-of-2.
2019-01-28 22:17:14 +01:00
Zdenek Kabelac
105a8edea1 lv_manip: better work with PERCENT_VG modifier with lvresize
Fixing recent commit 022ebb0cfe
Resize already has size that needs to be counted with,
otherwise upsizing operation could turn into size reduction one.
2019-01-21 15:39:24 +01:00
Zdenek Kabelac
e689bfb5d5 vdo: minor API cleanup
Since the parse_vdo_pool_status() become vdo_manip API part,
and there will be no 'dm' matching status parser,
the API can be simplified and closely match thin API here.
2019-01-21 12:53:16 +01:00
Zdenek Kabelac
f3c52a515b vdo: enable dmeventd resize 2019-01-21 12:53:16 +01:00
Zdenek Kabelac
3d367f3348 vdo: add simple wrapper for getting pool percentage
Just like with i.e. thins provide simple function for
getting percentage of VDO Pool usage (uses existing
status function).
2019-01-21 12:53:16 +01:00
Zdenek Kabelac
a16d914d34 cleanup: better naming 2019-01-21 12:53:16 +01:00
Zdenek Kabelac
08cabe9b83 vdo: allow resize of VDO and VDO pool volumes
Now with newer VDO kvdo target we can start to use standard mechanism
to enable resize of VDO volumes.

VDO pool can be grown.

Virtual volume grows on top of VDO pool when is not big enough.
Reduced VDOLV is calling discard for reduced areas - this can
take long time!

TODO: implement some pollable mechanism for out-of-lock TRIM.
2019-01-21 12:53:16 +01:00
Zdenek Kabelac
bd6709cec6 vdo: size reduction requires VDO to be active
To be able to send discard to reduced areas - the VDO LV needs to
be active.
2019-01-21 12:53:16 +01:00
Zdenek Kabelac
f1ad4b0679 vdo: discard reduced area
Implement sending discard to reduced LV area.
2019-01-21 12:53:16 +01:00
Zdenek Kabelac
ca72d19691 vdo: estimate virtual size after resize 2019-01-21 12:53:16 +01:00
Zdenek Kabelac
ab031d673d vdo: introduce function for estimation of virtual size 2019-01-21 12:53:16 +01:00
Zdenek Kabelac
022ebb0cfe lv_manip: better work with PERCENT_VG modifier
When using 'lvcreate -l100%VG' and there is big disproportion between
real available space and requested setting - automatically fallback
to 100%FREE.

Difference can be seen when VG is big and already most space was
allocated, so the requestion 100%VG can end (and by spec for % modifier
it's correct) as LV with size of 1%VG.  Usually this is not a big
problem - buit in some cases - like cache-pool allocation, this
can result a big difference for chunksize selection.

With this patch it's more closely match common-sense logic without
the need of reitteration of too big changes in lvm2 core ATM.

TODO: in the future there should be allocator solving all allocations
in a single call.
2019-01-21 12:53:15 +01:00
Zdenek Kabelac
f87dd7b127 vdo: fix archived metadata comment
lvm uses 'minimum_io_size' name to exactly match  VDO naming here,
however in all common cases  _size  is using 'sector/512b' unit.
But in this case the value is in bytes and can have only 2 values:
either 512 or 4096.

It's probably not worth to rename it internaly, so we can just
drop comment - instead of using 1 or 8.

Thought let's think about it....
2019-01-21 12:37:52 +01:00
David Teigland
5f102b3421 hints: invalidate when pvscan --cache sees a new PV
An idea from Zdenek for better ensuring valid hints by invalidating
them when pvscan --cache <device> sees a new PV, which is a case
where we know that hints should be invalidated.  This is triggered
from systemd/udev logic, and there may be some cases where it would
invalidate hints that the existing methods wouldn't detect.
2019-01-16 15:34:20 -06:00
David Teigland
facd520931 lvmlockd: fix make lockstart wait
when building without lvmlockd
2019-01-16 13:24:29 -06:00
David Teigland
ebaaff3590 move init_use_aio
it doesn't make sense to call from init_logging
2019-01-16 11:45:53 -06:00
David Teigland
e158835a05 lvmlockd: make lockstart wait for existing start
If there are two independent scripts doing:
  vgchange --lockstart vg
  lvchange -ay vg/lv

The first vgchange to do the lockstart will wait for
the lockstart to complete before returning.
The second vgchange to do the lockstart will see that
the start is already in progress (from the first) and
will do nothing.  This means the second does not wait
for any lockstart to complete, and moves on to the
lvchange which may find the lockspace still starting
and fail.

To fix this, make the vgchange lockstart command
wait for any lockstart's in progress to complete.
2019-01-16 10:49:04 -06:00
David Teigland
7b5abc3fb1 hints: fix hint flock when using lvm shell
also cmd->use_hints needs to be set for each shell command
2019-01-15 12:23:16 -06:00
David Teigland
6620dc9475 add device hints to reduce scanning
Save the list of PVs in /run/lvm/hints.  These hints
are used to reduce scanning in a number of commands
to only the PVs on the system, or only the PVs in a
requested VG (rather than all devices on the system.)
2019-01-15 10:23:47 -06:00
Zdenek Kabelac
c0c202e606 mirror: regenerate config
Drop extra line in source file - since this line is auto-generated
and would appear twice in resuling .in file with 'make generate'.
2019-01-08 13:13:57 +01:00
Zdenek Kabelac
54a569be40 vdo: regenerate config 2019-01-08 13:13:57 +01:00
Zdenek Kabelac
61e378c4e7 config: drop extra spaces 2019-01-08 13:13:57 +01:00
Zdenek Kabelac
fdd612b824 generators: avoid contacting syslog with generators
The systemd generators are executed very early during the switch
from initramfs to system partition and the syslog is not yet fully
operational - it may cause blocking, if some debug logging is enabled
at the same time in /etc/lvm/lvm.conf log{} section.

To avoid timeouting and killing this generator - rather enhance lvm
code to suppress any syslog communication when LVM_SUPPRESS_SYSLOG
envvar is set.

Use of this envvar is needed since the parsing of i.e. cmdline options
that could eventually override lvm.conf setting happens in this case
way too late and number of lines could have been already streamed to
syslog.
2019-01-08 13:13:54 +01:00
Zdenek Kabelac
88faf5a53b debug: drop some unneeded backtraces 2018-12-22 23:55:48 +01:00
Zdenek Kabelac
26ead4bf45 cov: extent_size cannot be 0
Make this obvious to coverity.
2018-12-21 21:45:08 +01:00
Zdenek Kabelac
9dfb1a11b7 cov: drop unneeded header file
MAX macro no longer needed in pe_align.
2018-12-21 21:45:08 +01:00
Zdenek Kabelac
2724a09e58 debug: tracing close errors 2018-12-21 21:45:08 +01:00
Zdenek Kabelac
82f66834ef bcache: fix memory leak on error path
Coverity noticed missing free of io struct on error path.
2018-12-21 21:45:03 +01:00
Zdenek Kabelac
7832d35668 lvmlockd: fix error return code for _init_vg_sanlock
In few cases error paths from initialization were returned as
'success == 1'.

Also assing num_mb with single compare checking valid sector_size.

For dumb compiler make num_mb always defined.
2018-12-21 21:42:30 +01:00
Zdenek Kabelac
3320ab8334 lib: move towards v2 version of VDO format
Drop very old original format of VDO target and focus on V2 version.
So some variables were renamed or replaced.
There is no compatibility preserved (with assumption so far this is
experimental feature and there is no real user).

Note - version currently VDO calls this version 6.2.
2018-12-20 13:26:55 +01:00
Heinz Mauelshagen
e82303fd6a lvcreate/lvconvert: optionally reenable mirrored mirror log for testing purposes only
This is a followup patch to commit edb72cb70c
to support related lvm2 test suite tests.

A 'global/support_mirrored_mirror_log' bool configuration variable gets
introduced allowing the creation of, or conversion to mirrored 'mirror'
logs if set.  The capability to create these in turn allows the rest of
the tests to perform activation of such existing LVs and their conversions
to disk/core 'mirror' logs.

Display a disclaimer warning if enabled that this is not for regular use.

Add definition of the enabled config option to respective test scripts.

Related: rhbz1643562
2018-12-17 19:28:54 +01:00
Zdenek Kabelac
701ecff0ff lvm: drop usage of dl library
Since lvm no longer supports any dlopen-able plugins
(which in practice was never really usable) drop linking
with -ldl.
2018-12-17 10:36:52 +01:00
Ming-Hung Tsai
859feb81e5 lvmanip: uninitialized members in struct pv_list (#10)
Scenario: Given an existed LV `lvol0`, I want to create another LV
on the PVs used by `lvol0`.

I use `build_parallel_areas_from_lv()` to obtain the `pv_list` of each segments.
However, the returned `pv_list` is not properly initialized, which causes
segfault in subsequent operations.
2018-12-14 15:23:18 +01:00
Zdenek Kabelac
cc5cfb88d7 cleanup: some local headers first 2018-12-14 15:14:48 +01:00
Zdenek Kabelac
0b19387dae headers: use configure.h as 1st. header
Ensure configure.h is always 1st. included header.
Maybe we could eventually introduce gcc -include option, but for now
this better uses dependency tracking.

Also move _REENTRANT and _GNU_SOURCE into configure.h so it
doesn't need to be present in various source files.
This ensures consistent compilation of headers like stdio.h since
it may produce different declaration.
2018-12-14 15:09:13 +01:00
Heinz Mauelshagen
dd5716ddf2 raid: fix (de)activation of RaidLVs with visible SubLVs
There's a small window during creation of a new RaidLV when
rmeta SubLVs are made visible to wipe them in order to prevent
erroneous discovery of stale RAID metadata.  In case a crash
prevents the SubLVs from being committed hidden after such
wiping, the RaidLV can still be activated with the SubLVs visible.
During deactivation though, a deadlock occurs because the visible
SubLVs are deactivated before the RaidLV.

The patch adds _check_raid_sublvs to the raid validation in merge.c,
an activation check to activate.c (paranoid, because the merge.c check
will prevent activation in case of visible SubLVs) and shares the
existing wiping function _clear_lvs in raid_manip.c moved to lv_manip.c
and renamed to activate_and_wipe_lvlist to remove code duplication.
Whilst on it, introduce activate_and_wipe_lv to share with
(lvconvert|lvchange).c.

Resolves: rhbz1633167
2018-12-11 16:35:34 +01:00
Heinz Mauelshagen
edb72cb70c lvcreate/lvconvert: prohibit creation of/conversion to mirrored mirror logs
In RHEL7 we marked mirrored mirror logs as deprecated and
added a related message.  This patch prohibits creating new
'mirror' LVs with that log type or converting existing LVs
to have one.

Existing LVs with mirrored mirror log can be activated
and converted to disk/core logs.

Avoid double deprecation message when running lvconvert.

Resolves: rhbz1643562
2018-12-08 02:52:50 +01:00
David Teigland
3d2fd95af7 remove unused full filter
it's the same as cmd->filter
2018-12-04 14:06:46 -06:00
David Teigland
89c11a2b49 remove unused lvmetad filter 2018-12-04 12:44:43 -06:00
David Teigland
a063d2d123 devs: use udev info to improve md component detection
Use udev info to supplement native md component detection.
2018-12-03 12:58:28 -06:00
Zdenek Kabelac
5a5e3bcf15 gcc: ensure sector is initilized
Some older gcc errnously report the variable can be used uninitlized.
Quite warning by explicit initalization.
2018-12-01 01:07:01 +01:00
Zdenek Kabelac
d8ad73e937 gcc: avoid shadowing use_aio
Function use_aio() is already declared, avoid its shadowing.
lvm-globals.h:59: warning: shadowed declaration is here
2018-12-01 01:07:01 +01:00
Zdenek Kabelac
0d61a17152 gcc: avoid shadowing activate_lv
Function activate_lv() is already declared, avoid its shadowing.
activate.h:133: warning: shadowed declaration is here
2018-12-01 01:06:57 +01:00
Peter Rajnoha
cb04b84c79 scan: md metadata version 0.90 is at the end of disk
commit de28637
  scan: use full md filter when md 1.0 devices are present

missed the fact that md superblock version 0.90 also puts
metadata at the end of the device, so the full md filter
needs to be used when either 0.90 or 1.0 is present.
2018-11-29 12:35:54 -06:00
David Teigland
cd0fb0846d config settings: fix version 3.0.0
version 3.0.0 was changed in the end to 2.3.0,
but config settings had previously been encoded
with version 3.0.0.
2018-11-28 12:16:50 -06:00
David Teigland
904e1e3d26 Place the first PE at 1 MiB for all defaults
. When using default settings, this commit should change
  nothing.  The first PE continues to be placed at 1 MiB
  resulting in a metadata area size of 1020 KiB (for
  4K page sizes; slightly smaller for larger page sizes.)

. When default_data_alignment is disabled in lvm.conf,
  align pe_start at 1 MiB, based on a default metadata area
  size that adapts to the page size.  Previously, disabling
  this option would result in mda_size that was too small
  for common use, and produced a 64 KiB aligned pe_start.

. Customized pe_start and mda_size values continue to be
  set as before in lvm.conf and command line.

. Remove the configure option for setting default_data_alignment
  at build time.

. Improve alignment related option descriptions.

. Add section about alignment to pvcreate man page.

Previously, DEFAULT_PVMETADATASIZE was 255 sectors.
However, the fact that the config setting named
"default_data_alignment" has a default value of 1 (MiB)
meant that DEFAULT_PVMETADATASIZE was having no effect.

The metadata area size is the space between the start of
the metadata area (page size offset from the start of the
device) and the first PE (1 MiB by default due to
default_data_alignment 1.)  The result is a 1020 KiB metadata
area on machines with 4KiB page size (1024 KiB - 4 KiB),
and smaller on machines with larger page size.

If default_data_alignment was set to 0 (disabled), then
DEFAULT_PVMETADATASIZE 255 would take effect, and produce a
metadata area that was 188 KiB and pe_start of 192 KiB.
This was too small for common use.

This is fixed by making the default metadata area size a
computed value that matches the value produced by
default_data_alignment.
2018-11-26 16:36:50 -06:00
David Teigland
4b5d6de86b pvscan systemd service for event based activation
The pvscan systemd service for autoactivation was
mistakenly dropped along with the lvmetad related
services.

The activation generator program now looks at the new
lvm.conf setting "event_activation" (default 1) to
switch between event activation and direct activation.

Previously, the old use_lvmetad setting was used to
switch between event and direct activation.
2018-11-26 14:33:31 -06:00
David Teigland
7e721ca048 bcache: sync io fixes
fix lseek error check
fix read/write error checks
handle zero return from read and write
don't return an error for short io
fix partial read/write loop
2018-11-20 09:19:18 -06:00
David Teigland
ca66d52032 io: use sync io if aio fails
io_setup() for aio may fail if a system has reached the
aio request limit.  In this case, fall back to using
sync io.  Also, lvm use of aio can be disabled entirely
with config setting global/use_aio=0.

The system limit for aio requests can be seen from
  /proc/sys/fs/aio-max-nr

The current usage of aio requests can be seen from
  /proc/sys/fs/aio-nr

The system limit for aio requests can be increased by
setting fs.aio-max-nr using sysctl.

Also add last-byte limit to the sync io code.
2018-11-20 09:13:20 -06:00
Zdenek Kabelac
c1703845c3 activation: trimming string is expected
Commit 813347cf84 added extra validation,
however in this particular we do want to trim suffix out so rather ignore
resulting error code here intentionaly.
2018-11-08 12:20:57 +01:00
David Teigland
1dc5603f73 devices: reuse bcache fd when getting block size
This avoids an unnecessary open() on the device.
2018-11-06 16:36:18 -06:00
David Teigland
3ae5569570 Add dm-writecache support
dm-writecache is used like dm-cache with a standard LV
as the cache.

$ lvcreate -n main -L 128M -an foo /dev/loop0

$ lvcreate -n fast -L 32M -an foo /dev/pmem0

$ lvconvert --type writecache --cachepool fast foo/main

$ lvs -a foo -o+devices
  LV            VG  Attr       LSize   Origin        Devices
  [fast]        foo -wi-------  32.00m               /dev/pmem0(0)
  main          foo Cwi------- 128.00m [main_wcorig] main_wcorig(0)
  [main_wcorig] foo -wi------- 128.00m               /dev/loop0(0)

$ lvchange -ay foo/main

$ dmsetup table
foo-main_wcorig: 0 262144 linear 7:0 2048
foo-main: 0 262144 writecache p 253:4 253:3 4096 0
foo-fast: 0 65536 linear 259:0 2048

$ lvchange -an foo/main

$ lvconvert --splitcache foo/main

$ lvs -a foo -o+devices
  LV   VG  Attr       LSize   Devices
  fast foo -wi-------  32.00m /dev/pmem0(0)
  main foo -wi------- 128.00m /dev/loop0(0)
2018-11-06 14:18:41 -06:00
David Teigland
cac4a9743a Allow dm-cache cache device to be standard LV
If a single, standard LV is specified as the cache, use
it directly instead of converting it into a cache-pool
object with two separate LVs (for data and metadata).

With a single LV as the cache, lvm will use blocks at the
beginning for metadata, and the rest for data.  Separate
dm linear devices are set up to point at the metadata and
data areas of the LV.  These dm devs are given to the
dm-cache target to use.

The single LV cache cannot be resized without recreating it.

If the --poolmetadata option is used to specify an LV for
metadata, then a cache pool will be created (with separate
LVs for data and metadata.)

Usage:

$ lvcreate -n main -L 128M vg /dev/loop0

$ lvcreate -n fast -L 64M vg /dev/loop1

$ lvs -a vg
  LV   VG Attr       LSize   Type   Devices
  main vg -wi-a----- 128.00m linear /dev/loop0(0)
  fast vg -wi-a-----  64.00m linear /dev/loop1(0)

$ lvconvert --type cache --cachepool fast vg/main

$ lvs -a vg
  LV           VG Attr       LSize   Origin       Pool  Type   Devices
  [fast]       vg Cwi---C---  64.00m                     linear /dev/loop1(0)
  main         vg Cwi---C--- 128.00m [main_corig] [fast] cache  main_corig(0)
  [main_corig] vg owi---C--- 128.00m                     linear /dev/loop0(0)

$ lvchange -ay vg/main

$ dmsetup ls
vg-fast_cdata   (253:4)
vg-fast_cmeta   (253:5)
vg-main_corig   (253:6)
vg-main (253:24)
vg-fast (253:3)

$ dmsetup table
vg-fast_cdata: 0 98304 linear 253:3 32768
vg-fast_cmeta: 0 32768 linear 253:3 0
vg-main_corig: 0 262144 linear 7:0 2048
vg-main: 0 262144 cache 253:5 253:4 253:6 128 2 metadata2 writethrough mq 0
vg-fast: 0 131072 linear 7:1 2048

$ lvchange -an vg/min

$ lvconvert --splitcache vg/main

$ lvs -a vg
  LV   VG Attr       LSize   Type   Devices
  fast vg -wi-------  64.00m linear /dev/loop1(0)
  main vg -wi------- 128.00m linear /dev/loop0(0)
2018-11-06 13:44:54 -06:00
David Teigland
e548e7c29d cache: factor report functions
to prepare for future addition
2018-11-06 11:36:29 -06:00
David Teigland
a686391eca cache: reorganize cache_set_policy
to prepare for future addition
2018-11-06 11:36:29 -06:00
David Teigland
23948e99b3 cache: improve error message about flush 2018-11-06 11:36:29 -06:00
David Teigland
3e547fa952 cache: improve warning message about cached thin data 2018-11-06 11:36:28 -06:00
David Teigland
5ee1727f80 cache: rename variable in _cache_add_target_line
so it is not specific to lv/seg type
2018-11-06 11:36:28 -06:00
David Teigland
7541e002b2 cache: rename variable in _cache_display
so it is not specific to lv/seg type
2018-11-06 11:36:28 -06:00
David Teigland
e26dacf30a cache: factor getting cache mode
so part can be called separately
2018-11-06 11:36:28 -06:00
David Teigland
f3f3d6066b cache: factor settings text import export
Pull out the export/import of settings text so
it can be used later from elsewhere.
2018-11-06 11:36:28 -06:00
David Teigland
8d7075528f cache: add cache_mode_num_to_str
Requires only string and number, no specific lv/seg type.
2018-11-06 11:36:28 -06:00
Zdenek Kabelac
9a6f0e64f9 debug: missing backtrace 2018-11-05 17:25:11 +01:00
Zdenek Kabelac
aa8b2d6a0f cleanup: move cast to det_t into MKDEV macro 2018-11-05 17:25:11 +01:00
Zdenek Kabelac
d3ebb18f40 cov: avoid unsing unchecked label_scan_open
Drop extra call too label_scan_open() without checking return value,
and let code go through next call bellow.
2018-11-05 17:25:11 +01:00
Zdenek Kabelac
70e3d0a613 cov: remove unused assigns 2018-11-05 17:25:11 +01:00
Zdenek Kabelac
813347cf84 cov: add missing check for dm_strncpy 2018-11-03 16:10:32 +01:00
Zdenek Kabelac
c7789daec0 cov: overflow before widen
Evaluate as 64bit arithmetic (instead of doing 32bit mults which can
in this case purely teoretically overflow).
2018-11-03 16:10:31 +01:00
Zdenek Kabelac
6235861e64 cov: remove uneeded code
Since clvmd was dropped this code become useless.
2018-11-03 16:09:36 +01:00
Zdenek Kabelac
1951e0db0f label: add stack trace for failing dev_set_last_byte
Temporarily add check for failure, but whole function
needs to be likely traced for error result.

FIXME
2018-11-03 16:09:36 +01:00
David Teigland
7a170873aa lvmlockd: fix size/resizing of internal lvmlock LV for sanlock
The lvmlock LV size was not adjusted correctly for 512 vs 4K
sector sizes which influence the lease size used by sanlock.

When lvmlock was automatically extended, the zeroing through
bcache wasn't working.
2018-11-01 13:25:21 -05:00
David Teigland
aecf542126 metadata: prevent writing beyond metadata area
lvm uses a bcache block size of 128K.  A bcache block
at the end of the metadata area will overlap the PEs
from which LVs are allocated.  How much depends on
alignments.  When lvm reads and writes one of these
bcache blocks to update VG metadata, it can also be
reading and writing PEs that belong to an LV.

If these overlapping PEs are being written to by the
LV user (e.g. filesystem) at the same time that lvm
is modifying VG metadata in the overlapping bcache
block, then the user's updates to the PEs can be lost.

This patch is a quick hack to prevent lvm from writing
past the end of the metadata area.
2018-10-29 16:53:17 -05:00
Heinz Mauelshagen
8df2dd66ce Revert "raid: fix left behind SubLVs"
This reverts commit 16ae968d24.

We need to come up with a better fix, because we fall short
wiping all known signatures when not using the wipe_lv API.
2018-10-25 14:35:56 +02:00
Heinz Mauelshagen
16ae968d24 raid: fix left behind SubLVs
lvm metadata writes, commits and activations are performed
for (newly) allocated RAID metadata SubLVs to wipe any preexisiting
data thus avoid false raid superblock positives on RaidLV activation.

This process can be interrupted by command or system crashs
thus leaving stale SubLVs in the lvm metadata as a problem.

Because we hold an exclusive lock in this metadata SubLV wiping
process, we can address this problem by avoiding aforementioned
commits/writes/activations altogether wiping the respective first
sector of the first physical extent allocated to any metadata SubLV
directly via the existing dev_set() API.

Succeeds all LVM RAID tests.

Related: rhbz1633167
2018-10-24 16:35:30 +02:00
David Teigland
2217d6396a fix: cov: missed return value test
use the existing error paths
2018-10-15 11:53:28 -05:00
Zdenek Kabelac
06a4a356db cov: avoid selfrecursive inclusion of toolcontext.h 2018-10-15 17:49:44 +02:00
Zdenek Kabelac
fdd76da33d cov: drop uneeded header files 2018-10-15 17:49:44 +02:00
Zdenek Kabelac
84f00f5058 cov: add missing error path check for label_scan_open 2018-10-15 17:49:44 +02:00
Zdenek Kabelac
b57e73a0f1 cov: make sure label scans valid lvinfo 2018-10-15 17:49:44 +02:00
Zdenek Kabelac
b1ff52ca14 cov: check dev_close_immediate
Function can report log_error() on fail path.
2018-10-15 17:49:44 +02:00
Zdenek Kabelac
253989ecd9 cov: fix error path
Avoid calling 'bad:' section since we have not set 'fd' yet
and instead directly return failing 0 value.
2018-10-15 17:49:44 +02:00
Zdenek Kabelac
13c49033ed cov: fix failing filter initialization
When persistent_filter_create() fails, the existing passed filter
should be preserved, so it could be properly deleted on
error path - so new pfilter is assigned instead.
2018-10-15 17:49:44 +02:00
Zdenek Kabelac
eb566e034f cov: add check for positive value
As pgsize parameter for _init_free_list() can't be negative,
report problem in case for any reason we would get negative number.
2018-10-15 17:49:44 +02:00
Zdenek Kabelac
9b85ecb85b cov: fix memleak on bcache io error path
Drop allocated IO.

merge free bache
2018-10-15 17:49:44 +02:00
Zdenek Kabelac
fbfbbf6d6a cov: drop check for pointer
Pointer must be always set and it's been already dereferenced.
2018-10-15 14:24:28 +02:00
Zdenek Kabelac
5811fa33bb cov: missed return value test
Check validity of read.
2018-10-15 14:24:28 +02:00
Marian Csontos
48768cc5be config: Fix version for VDO 2018-10-11 11:06:23 +02:00
David Teigland
a49f494c4d metadata: clarify comments about max size
Since there is now a direct limit of half the space.
2018-09-24 15:27:03 -05:00
David Teigland
6be1efd13d metadata: add direct size limit
Previously the size was limited by checking if the
old and new copies of the metadata overlapped.
This generally limited the size to about half of
the total space, but it could be larger given the
size differences between old and new.  Now add a
direct check to limit the size to half the space.
2018-09-24 14:41:58 -05:00
David Teigland
91c7e66f2b metadata: remove incorrect comment about alignment 2018-09-20 15:38:09 -05:00
David Teigland
09131e3922 metadata: add comment about negative impact of rounding 2018-09-20 14:15:49 -05:00
David Teigland
30c94b0324 metadata: remove an unused and incorrect overflow check
Remove another instance of an invalid check for metadata
overflow during read.  The previous instance was removed
in commit 5fb15b193.

This was checking for metadata that that overflowed the
circular disk metadata buffer during read, but such metadata
cannot be written, so it shouldn't be possible to find see.
Also, the check was incorrect and could trigger when there
was no overflow.
2018-09-20 13:53:50 -05:00
David Teigland
0aeca60aaa fix readonly activation override options
This fixes a problem in commit e6bb780d24, in which the
back compat handling for the old locking_type=4 was
incorrectly translated to mean the same thing as --readonly,
which prevented activation because activation uses an
exclusive vg lock.  Previously, locking_type=4 allowed
activation.

If we see locking_type 4 in an old config, translate it to
the new combination of --readonly and --sysinit, which we
now define to mean the --readonly behavior with an exception
to allow activation.
2018-09-12 16:30:50 -05:00
David Teigland
5fb15b1934 metadata: improve write and commit code
The vg_write/vg_commit code was imprecise, uncommented, and
hard to understand.  Rewrite it with clearer, cleaner code,
extensive comments, descriptions of how it works, and add
more info in debugging output.

The minor changes in behavior are to things that were
either incorrect or probably unintended:

- vg_write/vg_commit no longer check that the current vgname at
  the start of the text metadata matches the vgname being written.
  This has already been done at least twice by the time they are
  called, and repeating it again against the same cached data has
  no use.

- A fragment of old removed code had been left behind that checked
  if the old unused alignment policy would wrap.  It was still
  being checked to decide if the metadata area was full, which
  could possibly cause an incorrect full metadata failure.

- vg_remove now clears both the raw_locns in the mda_header that
  point to committed metadata (raw_locn slot 0) and precommitted
  metadata (raw_locn slot 1).  Previously it fully cleared the
  committed slot, and would only clear the offset field in the
  precommitted slot if it saw a problem with the metadata in the
  vg being removed.

- read_metadata_location_summary was wrongly comparing the number
  of wrapped bytes with an offset to report an error about the
  metadata being too large.  This wrong check is removed, it
  could have resulted in erroneous errors.
2018-09-11 10:06:25 -05:00
Joe Thornber
d0ff078e77 Merge branch 'master' of git://sourceware.org/git/lvm2 2018-09-11 13:19:08 +01:00
Joe Thornber
3255e384db [bcache] Remove unused 'hash' field from blocks.
We use a radix tree these days rather than a hash table.
2018-09-11 13:17:29 +01:00
Heinz Mauelshagen
989626926c lvconvert: allow raid4 -> linear conversion request
Allow "lvconvert --type linear RaidLV" on a raid4 LV
providing convenient interim steps to convert to linear.

Add respective new test
   lvconvert-raid-takeover-raid4_to_linear.sh
and
   lvconvert-raid-takeover-linear_to_raid4.sh
for linear to raid4 once on it.
2018-09-10 18:43:21 +02:00
Heinz Mauelshagen
e2e30a64ab lvconvert: fix interim segtype regression on raid6 conversions
When converting from striped/raid0/raid0_meta
to raid6 with > 2 stripes, allow possible
direct conversion (to raid6_n_6).

In case of 2 stripes, first convert to raid5_n to restripe
to at least 3 data stripes (the raid6 minimum in lvm2) in
a second conversion before finally converting to raid6_n_6.

As before, raid6_n_6 then can be converted
to any other raid6 layout.

Enhance lvconvert-raid-takeover.sh to test the
2 stripes conversions to raid6.

Resolves: rhbz1624038
2018-09-07 13:48:19 +02:00
Heinz Mauelshagen
22a1304368 lvconvert: avoid superfluous interim raid type
When converting striped/raid0*/raid6_n_6 <-> raid4,
avoid superfluous interim raid5_n layout.

Related: rhbz1447809
2018-08-31 19:04:19 +02:00
David Teigland
bfcecbbce1 filter: add config setting to skip scanning LVs
devices/scan_lvs (default 1) determines whether lvm
will scan LVs for layered PVs.  The lvm behavior has
always been to scan LVs, but it's rare for LVs to have
layered PVs, and much more common for there to be many
LVs that substantially slow down scanning with no benefit.

This is implemented in the usable filter, and has the
same effect as listing all LVs in the global_filter.
2018-08-30 09:59:50 -05:00
David Teigland
fade9ca3b6 bcache: reduce MAX_IO to 256
This is the number of concurrent async io requests that
the scan layer will submit to the bcache layer.  There
will be an open fd for each of these, so it is best to
keep this well below the default limit for max open files
(1024), otherwise lvm may get EMFILE from open(2) when
there are around 1024 devices to scan on the system.
2018-08-24 14:55:12 -05:00
Heinz Mauelshagen
e83c4f07ca lvconvert: fix conversion attempts to linear
"lvconvert --type linear RaidLV" on striped and raid4/5/6/10
have to provide the convenient interim layouts.  Fix involves
a cleanup to the convenience type function.

As a result of testing, add missing sync waits to
lvconvert-raid-reshape-linear_to_raid6-single-type.sh.

Resolves: rhbz1447809
2018-08-22 17:12:43 +02:00
David Teigland
10ede2cc0f config: improve use_blkid_wiping
mention that libblkid is used to both detect
and erase signatures.
2018-08-21 12:24:35 -05:00
Heinz Mauelshagen
4578411633 lvconvert: fix regression preventing direct striped conversion
Conversion to striped from raid0/raid0_meta is directly possible.

Fix a regression setting superfluous interim raid5_n conversion type
introduced by commit bd7cdd0b09.

Add new test script lvconvert-raid0-striped.sh.

Resolves: rhbz1608067
2018-08-21 17:28:56 +02:00
Zdenek Kabelac
acab591378 mirror: fix splitmirrors for mirror type
With improved mirror activation code --splitmirror issue poppedup
since there was missing proper preload code and deactivation
for splitted mirror leg.
2018-08-07 17:58:30 +02:00
Zdenek Kabelac
c34291e3bf cache: drop metadata_format validation
Allow to use any combination of cache metadata format for policy.
2018-08-07 17:57:00 +02:00
David Teigland
9adae653e9 mirrors: fix read_only_volume_list
If a mirror LV is listed in read_only_volume_list, it would
still be activated rw.  The activation would initially be
readonly, but the monitoring function would immediately
change it to rw.  This was a regression from commit

fade45b1d1 mirror: improve table update

The monitoring function needs to copy the read_only setting
into the new set of mirror activation options it uses.
2018-08-02 11:42:33 -05:00
David Teigland
763219611c vgcreate: close exclusive fd after pvcreate
When vgcreate does an automatic pvcreate, it opens the
dev with O_EXCL to ensure no other subsystem is using
the device.  This exclusive fd remained in bcache and
prevented activation parts of lvm from using the dev.

This appeared with vgcreate of a sanlock VG because of
the unique combination where the dev is not yet a PV,
so pvcreate is needed, and the vgcreate also creates
and activates an internal LV for sanlock.

Fix this by closing the exclusive fd after it's used
by pvcreate so that it won't interfere with other
bits of lvm that may try to use the device.
2018-08-01 11:22:23 -05:00
David Teigland
778ce8d808 lvconvert: improve text about splitmirrors
in messages and man page.
2018-07-23 12:28:48 -05:00
David Teigland
8a66c81b9b lvconvert: restrict command matching for no option variant
The 'lvconvert LV' command def has caused multiple problems
for command matching because it matches the required options
of any lvconvert command.  Any lvconvert with incorrect options
ends up matching 'lvconvert LV', which then produces an error
about incorrect options being used for 'lvconvert LV'.  This
prevents suggestions from nearest-command partial command matches.

Add a special case for 'lvconvert LV' so that it won't be used
as a partial match for a command that has options specified.
2018-07-23 11:12:38 -05:00
David Teigland
117160b27e Remove lvmetad
Native disk scanning is now both reduced and
async/parallel, which makes it comparable in
performance (and often faster) when compared
to lvm using lvmetad.

Autoactivation now uses local temp files to record
online PVs, and no longer requires lvmetad.

There should be no apparent command-level change
in behavior.
2018-07-11 11:26:42 -05:00
David Teigland
db741e75a2 pvscan: autoactivate without lvmetad
When lvmetad is not used, use temporary files to record
which PVs have appeared.  Use these temp files to determine
when a VG is complete, to trigger autoactivation.

This change allows us to remove lvmetad while keeping the
same autoactivation behavior that lvmetad provides.

The temp files are created in /run/lvm/pvs_online/ and are
named for the PVID of the PV.  The files contain the
major:minor of the device the PV was read from.

e.g. if VG foo has dev1 and dev2, then:

. pvscan --cache -aay dev1
  reads vg metadata from dev1
  creates /run/lvm/pvs_online/<pvid-of-dev1>
  checks if all vg->pvs are online: no

. pvscan --cache -aay dev2
  reads vg metadata from dev2
  creates /run/lvm/pvs_online/<pvid-of-dev2>
  checks if all vg->pvs are online: yes
  autoactivates vg

A 'pvscan --cache dev' (without -aay) still records that
dev is online.

A 'pvscan --cache --major X --minor Y' after a device is
gone will remove the temp file for it.

A 'pvscan --cache [-aay]' (no devs) resets the state of
temp files by removing them all, then scanning all devs
and creating temp files for PVs that are found.

If no online files exist, the first pvscan --cache scans
all devs and creates temp files for any PVs found.

The scope of the temp files is only pvscan, and they are only
used for pvscan-based autoactivation.  No other commands are
concerned with or aware of these temp files.  When lvm creates
or removes PVs, no attempt is made to update the temp files.
2018-07-09 16:11:24 -05:00
Zdenek Kabelac
faa126882a dmeventd: lvm vdo support 2018-07-09 15:29:16 +02:00
Zdenek Kabelac
12213445b5 vgchange: vdo support
Support vgchange usage with VDO segtype.
Also changing extent size need small update for vdo virtual extent.

TODO: API needs enhancements so it's not about adding ifs() everywhere.
2018-07-09 15:29:16 +02:00
Zdenek Kabelac
c58733ca15 lvcreate: vdo support
Supports basic:  'lvcreate --vdo -LXXXG -VYYYG vg/vdoname -n lvname'
Allows to create basic VDO pool volume and virtual VDO volume.
2018-07-09 15:29:12 +02:00
Zdenek Kabelac
6945bbdbc6 lvresize: vdo support
Unsupported ATM.

Wait till VDO kernel target starts to use updated resize sequence,
LOAD, SUSPEND, RESUME.
2018-07-09 15:28:35 +02:00
Zdenek Kabelac
44c99a8822 vdo: data percentage
Display percentage of used virtual size of vdo-pool volume.
2018-07-09 15:28:35 +02:00
Zdenek Kabelac
5807993bbf display: basic vdo segment lvdisplay and lvs support
Print some basic info about vdo segment.

'lvdisplay -m' ATM shows the most.
lvs  shows usage percentage.
2018-07-09 15:28:35 +02:00
Zdenek Kabelac
4f708e8709 dev_manager: add dev_manager_vdo_pool_status 2018-07-09 15:28:35 +02:00
Zdenek Kabelac
493ffe7a0f lv_manip: layout and role support for vdo segment 2018-07-09 15:28:35 +02:00
Zdenek Kabelac
00990ed53e check_lv_segment: internal vdo segment validation
Check if settings for vdo segment are correct.
2018-07-09 15:28:35 +02:00
Zdenek Kabelac
0dafd159a8 vdo_manip: parsing status of VDO device 2018-07-09 15:28:35 +02:00
Zdenek Kabelac
aa63dfbe39 vdo: support functions to map enums to string names
Translate VDO enums to printable strings.
2018-07-09 15:28:35 +02:00
Zdenek Kabelac
aff69ecf39 vdo: component activation of VDO data LV
Allow component activation of VDO data LV.
2018-07-09 15:28:35 +02:00
Zdenek Kabelac
4b7a57c9ed vdo: with created names use vpool
When user create vdo-pool - use different automatic name.
So unlike with traditional LVs using  lvol0, lvol1
use vpool0, vpool1...

TODO: apply similar for thin-pool  & cache-pool...
2018-07-09 15:28:35 +02:00
Zdenek Kabelac
a8f84f7801 vdo: introduce segment types and manip functions
Core functionality introducing lvm VDO support.
2018-07-09 15:28:35 +02:00
Zdenek Kabelac
0d9a4c6989 lib: new vdo segment configurable options
Configurable for vdo segment with their default values.
Also specify their ranges with minimal and maximal values.
2018-07-09 15:28:35 +02:00
Zdenek Kabelac
2e05f6018b activate: kvdo modprobe workaround
To support autoloading of VDO dm target driver loading of 'kvdo'
kernel module is needed - ATM it's not using 'dm-vdo' name.
So to support this strange name - add temporarily solution to
autoload  kvdo kernel module in this case.
2018-07-09 15:28:35 +02:00
Zdenek Kabelac
e9d1f676b3 allocation: add check for passing log allocation
Updates previous commit.
2018-07-09 00:59:34 +02:00
Zdenek Kabelac
6d1c983122 cleanup: use last_seg
More readable code.
2018-07-09 00:23:35 +02:00
Zdenek Kabelac
c8b4f9414c dev_io: no discard in testmode
When lvm2 command is executed in test mode, discard ioctl is skipped.
This may cause even data-loose in case, issuing discard for released
areas was enabled and user 'tested'  lvreduce.
2018-07-09 00:19:30 +02:00
Zdenek Kabelac
b697aa9646 allocator: fix thin-pool allocation
When allocating thin-pool with more then 1 device - try to
allocate 'metadataLV' with reuse of log-type allocation for mirror LV.
It should be naturally place on other device then 'dataLV'.

However due to somewhat hard to follow allocation logic code,
it's been rejected allocation in cases where there was not
enough space for data or metadata on single PV, thus to successed,
usage of segments was mandatory.

While user may use:

allocation/thin_pool_metadata_require_separate_pvs=1

to enforce separe meta and data LV - on default settings, this is not
enable thus segment allocation is meant to work.

NOTE:

As already said - the original intention of this whole  'if()' is unclear,
so try to split this test into multiple more simple tests that are more readable.

TODO: more validation.
2018-07-09 00:19:30 +02:00
Zdenek Kabelac
f2b856c994 lv_manip: do not check extents for any virtual target
Allow creation of any virtual segment type with just --virtualsize
specified without any real extent size give.

TODO: likely --type error,zero might be later enhanced to use -V
(along with -L) - but since those targets do not allocate real
space, supporting -V makes sense with them.
2018-07-02 10:24:23 +02:00
Zdenek Kabelac
2bb9627d01 lv_manip: add name of failing LV into error message 2018-07-02 10:24:23 +02:00
Zdenek Kabelac
ed3428b7ed memlock: extend exception list
Amound of linked libraries grows.
Most of them we don't need to lock in, since we are not using
them in locked section, so skip locking them in memory.
2018-07-02 10:24:20 +02:00
Zdenek Kabelac
0bae9a1bff locking: memory locking ONLY with suspending reason
It's important to lock memory beforo running SUSPEND ioctl - but whole
lvm preload runs in memory unlocked environment - as in this phase
memory allocation is allowed and is meant to happen.

Once all targets are preload and ready (confirmed from all targets)
we start suspending tree - and here the memory allocation (or i.e.
opening files) is no longer allowed - as it may cause kernel deadlock.
2018-07-02 10:21:42 +02:00
Marian Csontos
a14f21bf1d bcache: Fix null pointer dereferencing 2018-06-26 17:04:18 +02:00
Zdenek Kabelac
cea88a9e4e lv_manip: use vgmem pool
Switch to vgmem pool for allocation associated with modification
of particular VG.
2018-06-25 15:07:55 +02:00
Zdenek Kabelac
357e9f9572 cache: use new api function 2018-06-25 15:07:55 +02:00
Zdenek Kabelac
9c0d92d957 lv_manip: add new internal api function 2018-06-25 15:07:55 +02:00
Zdenek Kabelac
8949903fbb cache: set areas count prior using it
Set correct counter, so it's not failing on internal error check.
2018-06-25 15:07:32 +02:00
Zdenek Kabelac
106ee05ba0 lv_manip: add extra internal error
Catch error early, when trying to store data into non-allocated area.
2018-06-22 23:37:02 +02:00
Zdenek Kabelac
6c84a36b53 utils: add clzll
Check for __builtin_clzll and add wrapper when missing.
2018-06-22 23:37:02 +02:00
Zdenek Kabelac
c728d88e11 build: include configure.h
It's important to consistenly include  configure.h as the 1st. header.
It containts #defines influencing behavior of other included header
files.
2018-06-22 23:11:44 +02:00
David Teigland
dd7ebec120 filter: use pointers to real addresses
instead of casting values 1 and 2 to pointers
which gcc optimization can have problems with.
2018-06-21 10:54:43 -05:00
David Teigland
15826214f9 Remove code for using files as devices
It appears this has not been used in a long time,
and it seems to have no point since loop devices exist.
2018-06-21 09:33:21 -05:00
David Teigland
e166d2b14c lvmlockd: fix another missing lock_type null check
Same as 347c807f8.
2018-06-21 09:24:51 -05:00
David Teigland
42f7caf1c2 scan: work around udev problems by avoiding open RDWR
udev creates a train wreck of events if we open devices
with RDWR.  Until we can fix/disable/scrap udev, work around
this by opening RDONLY and then closing/reopening RDWR when
a write is needed.  This invalidates the bcache blocks for
the device before writing so it can trigger unnecessary
rereading.
2018-06-20 14:08:12 -05:00
David Teigland
f85a010a6b bcache: remove extraneous error message
an error from io_submit is already recognized by
the caller like errors during completion.
2018-06-18 12:02:22 -05:00
David Teigland
428514a07f Drop --ignoreskippedcluster option
It's no longer needed.  Clustered VGs are now handled in
the same way as foreign VGs, and as shared VGs that
can't be accessed:

- A command processing all VGs sees a clustered VG,
  prints a message ("Skipping clustered VG foo."),
  skips it, and does not fail.

- A command where the clustered VG is explicitly
  named on the command line, prints a message and fails.
  "Cannot access clustered VG foo, see lvmlockd(8)."

The option is listed in the set of ignored options for
the commands that previously accepted it.  (Removing it
entirely would cause commands/scripts to fail if they
set it.)
2018-06-15 15:59:34 -05:00
David Teigland
ccab4a1994 report: show empty lock_type for none
Sometimes lock_type would be displayed as "none"
(after changing it) and sometimes as empty.
Make it consistently empty.
2018-06-15 14:14:39 -05:00
David Teigland
328303d4d4 Remove unused device error counting 2018-06-15 14:04:39 -05:00
David Teigland
54f61e7dcc config: add deprecated version for recently removed settings
assumes that the next version from this branch is 3.0.0
2018-06-15 13:56:26 -05:00
David Teigland
3fd75d1bcd scan: use full md filter when md 1.0 devices are present
The md filter can operate in two native modes:
- normal: reads only the start of each device
- full: reads both the start and end of each device

md 1.0 devices place the superblock at the end of the device,
so components of this version will only be identified and
excluded when lvm uses the full md filter.

Previously, the full md filter was only used in commands
that could write to the device.  Now, the full md filter
is also applied when there is an md 1.0 device present
on the system.  This means the 'pvs' command can avoid
displaying md 1.0 components (at the cost of doubling
the i/o to every device on the system.)

(The md filter can operate in a third mode, using udev,
but this is disabled by default because there have been
problems with reliability of the info returned from udev.)
2018-06-15 12:21:25 -05:00
David Teigland
8eab37593e Add cmd arg to more functions
so that it can be used in the filter code
2018-06-15 11:03:55 -05:00
David Teigland
e53cfc6a88 lvmlockd: update method for changing clustered VG
The previous method for forcibly changing a clustered VG
to a local VG involved using -cn and locking_type 0.
Since those options are deprecated, replace it with
the same command used for other forced lock type changes:
vgchange --locktype none --lockopt force.
2018-06-13 15:30:28 -05:00
David Teigland
22c5467add filters: remove cache file in persistent filter
It creates problems because it's not always correct,
and it doesn't actually help much.
2018-06-13 14:00:47 -05:00
David Teigland
17f5572bc9 Remove independent metadata areas
in which metadata is stored in files on the local fs
instead of on PVs.
2018-06-13 12:25:19 -05:00
David Teigland
9df6f601e0 Remove code for loading other metadata formats
other formats are not used.
2018-06-13 12:03:42 -05:00
David Teigland
be3af7f93e Remove the unused lock_hash in lvmcache
It kept track of which VGs were locked, but is
no longer used, so remove it.
2018-06-12 11:29:56 -05:00
David Teigland
981a3ba98e Clean up repair and result values in vg_read
Fix the confusing mix of input and output values
in the single variable.
2018-06-12 11:08:26 -05:00
David Teigland
9a8c36b891 Fix use of orphan lock in commands
vgreduce, vgremove and vgcfgrestore were acquiring
the orphan lock in the midst of command processing
instead of at the start of the command.  (The orphan
lock moved to being acquired at the start of the
command back when pvcreate/vgcreate/vgextend were
reworked based on pvcreate_each_device.)

vgsplit also needed a small update to avoid reacquiring
a VG lock that it already held (for the new VG name).
2018-06-12 09:46:11 -05:00
David Teigland
c4153a8dfc Remove checking for locked VGs
A few places were calling a function to check if a
VG lock was held.  The only place it was actually
needed is for pvcreate which wants to do its own
locking (and scanning) around process_each_pv.

The locking/scanning exceptions for pvcreate in
process_each_pv/vg_read can be enabled by just passing
a couple of flags instead of checking if the VG is
already locked.  This also means that these special
cases won't be enabled unknowingly in other places
where they shouldn't be used.
2018-06-12 09:46:04 -05:00
David Teigland
3b6b7f8f9b lvmlockd: skip repair lock upgrade for non shared vgs
Only attempt lvmlockd lock upgrade for shared VGs.
2018-06-12 09:44:05 -05:00
Zdenek Kabelac
77d5caae90 snapshot: improve checking of merging snapshot
Add runtime detection for 'lvs -o+seg_monitor' and 'vgchange --monitor'.
This fix should avoid unnecessary timeout on systemd shutdown.
2018-06-11 22:25:42 +02:00
David Teigland
b48e10d9e6 Remove lvmcache CACHE_LOCKED flag
and the functions that set it.  It's no longer used.
2018-06-08 15:11:47 -05:00
David Teigland
ebd147ff24 Remove locking for non-vgs
Locks for VGs are the only thing that locking.[ch]
now handles, so references to other variations
can be removed.
2018-06-08 14:34:50 -05:00
David Teigland
1c59140f5f Remove unused cluster-related locking flags 2018-06-08 14:01:00 -05:00
David Teigland
a8759dc7a6 Remove unused cache management from locking
This code was for managing lvmcache for clvm
and it no longer does anything.
2018-06-08 12:30:43 -05:00
David Teigland
5e672df6ae Removing locking layer from sync_local_dev_names
the indirection is not needed without clvm
2018-06-08 12:18:57 -05:00
David Teigland
669b1295ae Remove header declarations for removed functions 2018-06-08 10:01:05 -05:00
David Teigland
73b7e6fde7 Remove more code that was only used by liblvm2app 2018-06-08 09:29:11 -05:00
Joe Thornber
7c4b19c335 Merge branch '2018-06-04-data-structs' 2018-06-08 14:21:07 +01:00
Joe Thornber
d5da55ed85 device_mapper: remove dbg_malloc.
I wrote dbg_malloc before we had valgrind.  These days there's just
no need.
2018-06-08 13:40:53 +01:00
Zdenek Kabelac
bc8c8d2f87 build: drop exported symbols
This libs are no longer possible to create,
drop maintanence of exported symbols.
2018-06-08 14:37:31 +02:00
Zdenek Kabelac
5cb4b2a424 cache: cleaner policy also uses fmt2
Format 2 is also with cleaner policy.
2018-06-08 14:37:29 +02:00
Zdenek Kabelac
fb171edd45 pvresize: add missing return
Log error path missed return 0.
Also fix some unneded bactraces (since log_error already shows
position).
2018-06-08 14:36:56 +02:00
Zdenek Kabelac
0c62ae3f89 pvmove: improve lvs
When pvmoving LV - the target for LV is a mirror so the validation
that checked the type is matching was incorrect.

While we need a more generic enhancment of LVS output for pvmoved LVs,
for now at least stop showing internal errors and  'X' symbols in attrs.
2018-06-08 14:35:42 +02:00
Joe Thornber
286c1ba336 device_mapper: rename libdevmapper.h -> all.h
I'm paranoid a file will include the global one in /usr/include
by accident.
2018-06-08 12:31:45 +01:00
David Teigland
e6bb780d24 Rework lock-override options and locking_type settings
The last commit related to this was incomplete:
  "Implement lock-override options without locking type"

This is further reworking and reduction of the locking.[ch]
layer which handled all clustering, but is now only used
for file locking.  The "locking types" that this layer
implemented were removed previously, leaving only the
standard file locking.  (Some cluster-related artifacts
remain to be cleared out later.)

Command options to override or modify locking behavior
are reimplemented here without using the locking types.
Also, deprecated locking_type values are recognized,
and implemented as if one of the equivalent override
options was set.

Options that override file locking are:

. --nolocking disables all file locking.

. --readonly grants read lock requests without actually
  taking a file lock, and refuses write lock requests.

. --ignorelockingfailure tries to set up file locks and
  uses them normally if possible.  When not possible, it
  behaves like --readonly, but allows activation.

. --sysinit is the same as ignorelockingfailure.

. global/metadata_read_only acquires actual read file
  locks, and refuses write lock requests.

(Some of these options could probably be deprecated
because they were added as workarounds to various
locking_type behaviors that are now deprecated.)

The locking_type setting now has one valid value: 1 which
refers to standard file locking.  Configs that contain
deprecated values are recognized and still work in
largely the same way:

. 0 disabled all locking, now implemented like --nolocking
  is set.  Allow the nolocking option in all commands.

. 1 is the normal file locking setting and is unchanged.

. 2 was for external locking which was not used, and
  reverts to normal file locking.

. 3 was for cluster/clvm.  This reverts to normal file
  locking, and prints messages about lvmlockd.

. 4 was equivalent to readonly, now implemented like
  --readonly is set.

. 5 disabled all locking, now implemented like
  --nolocking is set.
2018-06-07 16:47:15 -05:00
David Teigland
6e6ef95ba6 Implement lock-override options without locking type
The options: --nolocking, --readonly, --sysinit
override, or make exceptions to, the normal file locking
behavior.  Implement these by just checking for the
options in the file locking path instead of using
special locking types.
2018-06-07 16:17:04 +01:00
David Teigland
da30b4a786 Remove locking infrastructure from activation paths
Basic LV functions:

  activate_lv(), deactivate_lv(),
  suspend_lv(), resume_lv()

were routed through the locking infrastruture on the way to:

  lv_activate_with_filter(), lv_deactivate(),
  lv_suspend_if_active(), lv_resume_if_active()

This commit removes the locking infrastructure from the
middle and calls the later functions directly from the former.

There were a couple of ancillary steps that the locking
infrastructure added along the way which are still included:

  - critical section inc/dec during suspend/resume
  - checking for active component LVs during activate

The "activation" file lock (serializing activation) has not
been kept because activation commands have been changed to
take the VG file lock exclusively which makes the activation
lock unused and unnecessary.
2018-06-07 16:17:04 +01:00
David Teigland
e7aa51c70f Remove VG lock ordering check
Four commands lock two VGs at a time:

- vgsplit and vgmerge already have their own logic to
  acquire the locks in the correct order.

- vgimportclone and vgrename disable this ordering check.
2018-06-07 16:17:04 +01:00
David Teigland
18259d5559 Remove unused clvm variations for active LVs
Different flavors of activate_lv() and lv_is_active()
which are meaningful in a clustered VG can be eliminated
and replaced with whatever that flavor already falls back
to in a local VG.

e.g. lv_is_active_exclusive_locally() is distinct from
lv_is_active() in a clustered VG, but in a local VG they
are equivalent.  So, all instances of the variant are
replaced with the basic local equivalent.

For local VGs, the same behavior remains as before.
For shared VGs, lvmlockd was written with the explicit
requirement of local behavior from these functions
(lvmlockd requires locking_type 1), so the behavior
in shared VGs also remains the same.
2018-06-07 16:17:04 +01:00
David Teigland
e4d9099e19 Remove more clvm code 2018-06-07 16:17:04 +01:00
David Teigland
d154dd6638 lvmlockd: fix missing lock_type null check
Missed checking if vg->lock_type is NULL in commit db8d3bdfa:
  lvmlockd: enable mirror split and merge with dlm lock_type
2018-06-07 16:17:04 +01:00
David Teigland
1539e51721 devices: clean up io error messages
Remove the io error message from bcache.c since it is not
very useful without the device path.

Make the io error messages from dev_read_bytes/dev_write_bytes
more user friendly.
2018-06-07 16:17:04 +01:00
David Teigland
3e781ea446 Remove clvmd and associated code
More code reduction and simplification can follow.
2018-06-05 11:09:13 -05:00
Heinz Mauelshagen
bd7cdd0b09 lvconvert: support linear <-> striped convenience conversions
"lvconvert --type {linear|striped|raid*} ..." on a striped/linear
LV provides convenience interim type to convert to the requested
final layout similar to the given raid* <-> raid* conveninece types.

Whilst on it, add missing raid5_n convenince type from raid5* to raid10.

Resolves: rhbz1439925
Resolves: rhbz1447809
Resolves: rhbz1573255
2018-06-05 16:23:18 +02:00
Heinz Mauelshagen
de66704253 segtype: add linear
Add linear segtype addressing FIXME in preparation
for linear <-> striped convenience conversion support
2018-06-05 16:23:18 +02:00
Joe Thornber
232918fb86 build: libbase.a 2018-06-04 13:53:07 +01:00
Zdenek Kabelac
1140d70893 build: fixes 2018-06-04 12:28:13 +02:00
Zdenek Kabelac
6a1f458bb7 build: compile fixes 2018-06-01 21:12:31 +02:00
David Teigland
7b5b1a9b6f scan: clean exit for alloc failure 2018-06-01 13:15:22 -05:00
David Teigland
0625c7f372 devs: clear coverity warning about null info
a theoretical possibility.
2018-06-01 13:15:22 -05:00
David Teigland
09177b53dd lvmlockd: clarify lock_type use for coverity
Make it clearer when vg->lock_type will be used so
coverity doesn't worry about it.
2018-06-01 13:15:22 -05:00
David Teigland
b6f0f20da2 lvmlockd: primarily use vg_is_shared
to check if a vg uses an lvmlockd lock_type,
instead of the equivalent but longer is_lockd_type.
2018-06-01 13:15:22 -05:00
Joe Thornber
dbba1e9b93 Merge branch 'master' into 2018-05-11-fork-libdm 2018-06-01 13:04:12 +01:00
Joe Thornber
cb379c86c4 Merge branch '2018-05-30-bcache-radix-tree' 2018-06-01 12:45:33 +01:00
David Teigland
06b2e5c176 lvmlockd: improve error message for existing lockspace
When a VG/lockspace already exists with the same name
don't just print the error number.
2018-05-31 15:52:23 -05:00
David Teigland
b9c1cef817 lvmlockd: fix reverting new lv in error path
The wrong name was being used to free the LV lock
in lvmlockd in the error exit path.
2018-05-31 15:35:48 -05:00
Joe Thornber
d4d39d0f90 Merge branch 'master' into 2018-05-30-bcache-radix-tree 2018-05-31 16:36:04 +01:00
David Teigland
fdaa7e2e87 vgs: add report field for shared
equivalent to a non-empty -o locktype.
2018-05-31 10:23:03 -05:00
David Teigland
c516321325 lvmlockd: enable lvcreate of new LV plus existing cache pool
In this command, lvcreate creates a new LV and then combines
it with an existing cache pool, producing a cache LV.  This
command was previously not allowed in in a shared VG.
2018-05-30 15:24:24 -05:00
David Teigland
6cd0523337 lvmlockd: enable repairing shared VG while reading it
When the lvmlockd lock is shared, upgrade it to ex
when repair (writing) is needed during vg_read.

Pass the lockd state through additional read-related
functions so the instances of repair scattered through
vg_read can be handled.

(Temporary solution until the ad hoc repairs can be
pulled out of vg_read into a top level, centralized
repair function.)
2018-05-30 12:56:46 -05:00
David Teigland
403c87c1aa lvmlockd: enable creation of cache pool with lvcreate
Previously, cache pools needed to be created with lvconvert.
2018-05-30 09:25:45 -05:00
David Teigland
948f2d9979 lvmlockd: enable lvcreate of thin pool and thin lv in one command
Previously, thin pools and thin lvs need needed to be
created with separate commands, now the combined command
is permitted.
2018-05-30 09:25:45 -05:00
David Teigland
db8d3bdfa9 lvmlockd: enable mirror split and merge with dlm lock_type 2018-05-30 09:25:45 -05:00
David Teigland
3a4fe54ca1 config: revert to normal locking when no cluster
and suggest lvmlockd
2018-05-30 09:25:45 -05:00
David Teigland
7f7ec769d9 lvmlockd: do not use an LV lock for some lvchange options
Some lvchange options can be used even if the LV is active.
2018-05-30 09:25:45 -05:00
David Teigland
0c1d3db8db lvmlockd: accept repeated global lock requests
It's not an error if a command requests the global lock
when it has already acquired it.  It shouldn't happen,
but there could be cases we've not found.
2018-05-30 09:25:45 -05:00
David Teigland
6d14d5d16b scan: removed failed paths for devices
Drop a device path when the scan fails to open it.
2018-05-30 09:05:18 -05:00
Joe Thornber
7635df8cce bcache: switch to storing blocks in a radix tree.
Rather than a hash table.  This will make invalidate_fd() more
efficient since we can iterate just those blocks that are on
a particular dev.
2018-05-30 14:17:26 +01:00
David Teigland
28c8e95d19 scan: refresh paths and retry open
If scanning fails to open any devices, refresh the
device paths in dev cache, and retry the opens.
2018-05-25 13:09:07 -05:00
Alasdair G Kergon
9a730233c9 format_text: Use versionsort to sort archive files
Ensure that vg_100000-* follows vg_99999-* so that the expiry logic
doesn't stop too early.

   https://bugzilla.redhat.com/1481085
2018-05-24 17:51:03 +02:00
David Teigland
61583281e5 filters: clarify some parts of md filter
Rename some functions to be consistent with the return values,
and add some comments about how it works.
2018-05-22 14:07:13 -05:00
David Teigland
3c9ed33f83 scan: move warnings about duplicate devices
We have been warning about duplicate devices (and disabling lvmetad)
immediately when the dup was detected (during label_scan).  Move the
warnings (and the disabling) to happen later, after label_scan is
finished.

This lets us avoid an unwanted warning message about duplicates
in the special case were md components are eliminated during the
duplicate device resolution.
2018-05-21 16:48:02 -05:00
David Teigland
0253f5a21d fix id_write_format on non-uuid string
orphan vgs using the vgname "#orphans" as the vgid,
and valgrind complains about calling id_write_format
on that invalid uuid.
2018-05-18 13:41:20 -05:00
David Teigland
286c9c78b4 liblvm2app: fix valgrind memory warning 2018-05-17 15:18:11 -05:00
Joe Thornber
5052970da3 bcache: Don't call sysconf for every io 2018-05-17 10:05:10 +01:00
Alex Bennée
c6ca81a38d bcache: don't use PAGE_SIZE compile const
PAGE_SIZE is not a compile time constant. Use sysconf instead like
elsewhere in the code.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2018-05-17 10:38:16 +02:00
Rick Elrod
8c453e2e5e cleanup: fix grammar in output - less then -> less than
This minor patch fixes grammar in a few messages which get
printed to users. It also fixes the same grammar mistake in
several comments.

Signed-off-by: Rick Elrod <relrod@redhat.com>
--
2018-05-17 10:37:45 +02:00
David Teigland
28d35e5c59 scan: fix missing close in lib
lib was using dev_test_excl which wasn't closing the device.
Switch code to new io layer with excl open.
Also use exclusive open in some other places.
2018-05-16 14:48:30 -05:00
Joe Thornber
89fdc0b588 Merge branch 'master' into 2018-05-11-fork-libdm 2018-05-16 13:43:02 +01:00
Joe Thornber
ccc35e2647 device-mapper: Fork libdm internally.
The device-mapper directory now holds a copy of libdm source.  At
the moment this code is identical to libdm.  Over time code will
migrate out to appropriate places (see doc/refactoring.txt).

The libdm directory still exists, and contains the source for the
libdevmapper shared library, which we will continue to ship (though
not neccessarily update).

All code using libdm should now use the version in device-mapper.
2018-05-16 13:00:50 +01:00
Joe Thornber
e296f784c9 Merge branch 'master' of git://sourceware.org/git/lvm2 2018-05-16 10:11:58 +01:00
Joe Thornber
df2acbbb97 bcache: nr_ios_pending wasn't being incremented
... but it was being decremented on completion.  Which meant
it wrapped, and no prefetches were ever issued after the
first completion.
2018-05-16 10:09:17 +01:00
David Teigland
3bbc17a670 scan: use up to 1024 max bcache blocks
Create bcache with one block per device that
will be scanned up to 1024 max blocks.
2018-05-15 15:17:31 -05:00
Zdenek Kabelac
889558fedb conf: update conf
Matching patch 2eba7c7755
2018-05-15 16:58:28 +02:00
David Teigland
11ceb77867 lvmcache: fix loop freeing infos
valgrind was concerned about loop through vginfo->infos,
so grab info from dev.
2018-05-14 13:45:55 -05:00
David Teigland
517d6cc418 scan: add some missing frees
some objects had been moved out of mem pools.
2018-05-14 13:38:16 -05:00
Joe Thornber
7f97c7ea9a build: Don't generate symlinks in include/ dir
As we start refactoring the code to break dependencies (see doc/refactoring.txt),
I want us to use full paths in the includes (eg, #include "base/data-struct/list.h").
This makes it more obvious when we're breaking abstraction boundaries, eg, including a file in
metadata/ from base/
2018-05-14 10:30:20 +01:00
Zdenek Kabelac
ac768a9d2b bcache: do not use libdm header files
Logging for libdm differs from lvm logging - keep using consisten
logging function calls.
2018-05-12 18:18:23 +02:00
David Teigland
09fcc8eaa8 scan: ignore duplicates that are md component devs
md devices using an older superblock version have
superblocks at the end of the md device.  For commands
that skip reading the end of devices during filtering,
the md component devs will be scanned, and will appear
as duplicate PVs to the original md device.  Remove
these md components from the list of unused duplicate
devices, so they are treated as if they had been
ignored during filtering.  This avoids the restrictions
that are placed on using PVs with duplicates.
2018-05-11 15:52:22 -05:00
David Teigland
73578e36fa dev_cache: remove the lvmcache check when closing fd
This is no longer used since devices are not held
open in dev_cache.
2018-05-11 14:30:10 -05:00
David Teigland
3e3cb22f2a dev_cache: fix close in utility functions
All these functions are now used as utilities,
e.g. for ioctl (not for io), and need to
open/close the device each time they are called.
(Many of the opens can probably be eliminated by
just using the bcache fd for the ioctl.)
2018-05-11 14:25:08 -05:00
David Teigland
5c9dcd99fd scan: remove unused args from label_read 2018-05-11 14:16:49 -05:00
David Teigland
b5d9914628 devs: recognize md devices in subsystem check
If md components appear as duplicate PVs, let the
existing subsystem check recognize the md device.
2018-05-11 14:00:19 -05:00
David Teigland
ccab54677c dev_cache: fix close in dev_get_block_size 2018-05-11 13:53:19 -05:00
David Teigland
bbb8040456 dev_cache: drop open_list
devices are now held open only in bcache,
so drop the dev_cache list of open devices
which is unused.
2018-05-11 12:47:56 -05:00
David Teigland
4362013872 bcache: disable fallback to old io
All io has been converted to bcache.
2018-05-11 11:35:56 -05:00
David Teigland
228ed56455 pvck: allow checking at user specified offsets
with the --labelsector option.  We probably don't
need all this code to support any value for this
option; it's unclear how, when, why it would be
used.
2018-05-11 11:23:51 -05:00
Joe Thornber
3b02b35c3e Merge branch 'master' of git+ssh://sourceware.org/git/lvm2 2018-05-11 05:39:27 +01:00
Joe Thornber
5f780813f2 bcache/sync io engine: handle short ios 2018-05-11 05:37:47 +01:00
David Teigland
9ad42e5f06 io: write log header with bcache 2018-05-10 16:25:33 -05:00
David Teigland
57bb46c5e7 filter: use bcache for filter reads
Filters are still applied before any device reading or
the label scan, but any filter checks that want to read
the device are skipped and the device is flagged.

After bcache is populated, but before lvm looks for
devices (i.e. before label scan), the filters are
reapplied to the devices that were flagged above.
The filters will then find the data they need in
bcache.
2018-05-10 16:03:19 -05:00
Joe Thornber
39ce38eb88 label/lv_manip: squash some warnings 2018-05-10 15:14:39 +01:00
Joe Thornber
ae50374811 bcache: Add sync io engine
Something to fall back to when testing.
2018-05-10 14:29:26 +01:00
Joe Thornber
67b80e2d9d bcache: knock out err param.
Dave used this for debugging.  Not needed in general.
2018-05-10 13:26:08 +01:00
David Teigland
9a5bd01b0c io: replace dev_set with bcache equivalents 2018-05-09 11:29:52 -05:00
Joe Thornber
1c5c99afce bcache-utils: bcache_set_bytes() 2018-05-09 11:05:29 +01:00
David Teigland
f4a60fe004 clvmd: saved_vg code and comment formatting 2018-05-03 14:54:48 -05:00
David Teigland
822a8b62be clvmd: don't save cft and buf for saved_vg 2018-05-03 14:54:48 -05:00
David Teigland
c016b573ee clvmd: separate saved_vg from vginfo
The clvmd saved_vg data is independent from the normal lvm
lvmcache vginfo data, so separate saved_vg from vginfo.
Normal lvm doesn't need to use save_vg at all, and in clvmd,
lvmcache changes on vginfo can be made without worrying
about unwanted effects on saved_vg.
2018-05-03 14:54:48 -05:00
David Teigland
a5e13f2eef clvmd: defer freeing saved vgs
To avoid the chance of freeing a saved vg while another
code path is using it, defer freeing saved vgs until
all the lvmcache content is dropped for the vg.
2018-05-03 14:54:48 -05:00
Heinz Mauelshagen
88fe07ad0a raid: use new internal APIs
Use APIs introduced with commit 4ebfd8e8eb
where appropriate to minimize redundant code.
2018-05-03 21:36:50 +02:00
Joe Thornber
49db9b5e0b Merge branch '2018-05-03-improve-bcache-utils' 2018-05-03 20:15:13 +01:00
Heinz Mauelshagen
4ebfd8e8eb lvconvert: don't return success on degraded -m raid1 conversion
In case "lvconvert -mN RaidLV" was used on a degraded
raid1 LV, success was returned instead of an error.

Provide message to inform about the need to repair first
before changing number of mirrors and exit with error.

Add new lvconvert-m-raid1-degraded.sh test.

Resolves: rhbz1573960
2018-05-03 18:48:00 +02:00
Joe Thornber
dfc320f5b8 bcache-utils: rewrite
They take care to avoid redundant reads now.
2018-05-03 11:36:29 +01:00
Joe Thornber
2688aafefb bcache: rename bcache_write_zeroes() -> bcache_zero_bytes()
Now matches the other util functions:

bcache_{prefetch,read,write,zero}_bytes()
2018-05-03 10:21:14 +01:00
Joe Thornber
8b755f1e04 bcache: rewrite bcache_write_zeros()
It now uses GF_ZERO to avoid reading blocks that are going to be
completely zeroed.
2018-05-03 10:14:56 +01:00
Joe Thornber
dc30d4b2f2 bcache: switch off_t -> uint64_t
We always want it to be 64bit
2018-05-03 09:37:43 +01:00
Joe Thornber
efad84ebc2 bcache: Move the utils to a separate file.
This makes it clearer that they don't access the cache internals.
2018-05-03 09:34:41 +01:00
Joe Thornber
b3c41bce3d bcache: add bcache_block_sectors() query fn 2018-05-03 09:33:55 +01:00
Joe Thornber
65912ce44d bcache: add a comment 2018-05-03 09:21:10 +01:00
David Teigland
977d0a3613 filters: increase MAX_FILTERS for new filter
The new signature filter was added without increasing this.
2018-05-02 14:10:30 -05:00
Joe Thornber
90d0ff6636 bcache: reorder includes in .c file too 2018-05-02 19:45:06 +01:00
Joe Thornber
8fd300f7df device/bcache: reorder includes 2018-05-02 18:59:43 +01:00
Joe Thornber
972b535220 build: add -D_FILE_OFFSET_BITS=64
I don't like having this in a common header because it means you end
up including too much and causing unneccessary dependencies.  eg,
lib/misc/lib.h includes libdevmapper.h, internationalisation, and
logging stuff.
2018-05-02 18:40:38 +01:00
David Teigland
24e7745d7a devices: ignore lvm1 and pool devices 2018-05-01 15:18:47 -05:00
David Teigland
8dcc973bbb bcache_write_bytes needs to be followed by flush
The improved bcache_write_bytes is not flushing, so
the caller needs to do that.
2018-05-01 09:33:55 -05:00
David Teigland
a418f88b76 lvmcache: fix typo in lvmcache_get_saved_vg 2018-05-01 09:06:57 -05:00
Joe Thornber
bfc61a9543 bcache: squash some warnings on rhel6 2018-05-01 13:21:53 +01:00
Joe Thornber
f564e78d98 bcache: rewrite bcache_{write,zero}_bytes
These are utility functions so should only use the public interface.

Also write_bytes was flushing, which will kill performance.
2018-05-01 12:07:33 +01:00
David Teigland
c1cd18f21e Remove lvm1 and pool disk formats
There are likely more bits of code that can be removed,
e.g. lvm1/pool-specific bits of code that were identified
using FMT flags.

The vgconvert command can likely be reduced further.

The lvm1-specific config settings should probably have
some other fields set for proper deprecation.
2018-04-30 16:55:02 -05:00
David Teigland
029a76b4f8 clvmd: don't repair vg from vg_read in clvmd
The mixed up vg repair code in vg_read was trying
to repair a vg when vg_read was called by clvmd.
The clvmd daemon isn't supposed to be repairing
or writing a vg.

(This is a temporary workaround; vg repair will soon
be pulled out of vg_read so it can be called in a
controlled way and consolidated instead of spread
around.)
2018-04-30 15:56:51 -05:00
David Teigland
89935ace29 clvmd: keep old saved_vg if it matches new
There is no need to release the old saved_vg
if it matches the new version.
2018-04-30 13:03:15 -05:00
Joe Thornber
2bc896f2a3 build: remove --with-{snapshots,mirrors,raid,thin,cache} options from ./configure
It now behaves as if the were all set as 'internal'
2018-04-30 10:11:23 +01:00
Joe Thornber
545ca59468 Merge branch 'master' of git+ssh://sourceware.org/git/lvm2 2018-04-30 09:56:04 +01:00
Joe Thornber
65d6118e47 [metadata-liblvm.c] comment out some dead code and add a FIXME 2018-04-30 09:45:39 +01:00
Joe Thornber
513e9e3264 [lvmetad.h] Use static inline functions to stub out functions.
The macros were causing warnings because the arguments were percieved as
unused.
2018-04-30 09:45:13 +01:00
Zdenek Kabelac
fade45b1d1 mirror: improve table update
Shift refresh of mirror table right into monitor_dev_for_events().
Use  !vg_write_lock_held() to recognize use of lvchange/vgchange.
(this shall change if this would no longer work, but requires
futher some API changes).

With this patch  dm mirror table is only refreshed when necassary.

Also update WARNING message about mirror usage without monitoring
and display LV name.
2018-04-30 10:41:51 +02:00
Joe Thornber
e890c37704 [bcache] Some work on bcache_invalidate()
bcache_invalidate() now returns a bool to indicate success.  If fails
if the block is currently held, or the block is dirty and writeback
fails.

Added a bunch of unit tests for the invalidate functions.

Fixed some bugs to do with invalidating errored blocks.
2018-04-27 10:56:13 +01:00
David Teigland
5b6e62dc1f clvmd: drop old saved_vg when returning new saved_vg
In some pvmove tests, clvmd uses the new (precommitted)
saved_vg, but then requests the old saved_vg, and
expects that the new saved_vg be returned instead of
the old.  So, when returning the new saved_vg, forget
the old one so we don't return it again.
2018-04-26 14:57:45 -05:00
David Teigland
cdb8400de2 scan: refresh filters before scan
The filters save information about devices that should
be ignored, so if we need to repeat a scan  (unusual,
but happens in clvmd), we need to update the filters.
2018-04-26 14:48:13 -05:00
Joe Thornber
1c97fda425 [bcache] get all unit tests passing again 2018-04-26 13:13:27 +01:00
David Teigland
0fe4f65f65 scan: don't use cmd mem pool in scan
Make it consistent with all the other allocations
in scanning.
2018-04-25 16:40:08 -05:00
David Teigland
4670e9f698 skip some clvmd-specific code in common cases
This, or something like it, can probably be done
in many other places.
2018-04-25 16:40:08 -05:00
David Teigland
47bfac21ca clvmd: skip dev rescan after full scan
When clvmd does a full label scan just prior to
calling _vg_read(), pass a new flag into _vg_read
to indicate that the normal rescan of VG devs is
not needed.
2018-04-25 16:39:43 -05:00
David Teigland
1fec86571f clvmd: reuse a vg struct for sequential LV operations
After reading a VG, stash it in lvmcache as "saved_vg".
Before reading the VG again, try to use the saved_vg.
The saved_vg is dropped on VG lock operations.
2018-04-25 16:39:43 -05:00
David Teigland
f8616ac2d8 lvmcache: rename suspended_vg to saved_vg
The copy of the VG which clvmd stashes in lvmcache should
not only be used between suspend and resume, but between
sequential LV operations in clvmd, so that clvmd does not
need to reread the VG for each one.  Prepare for that by
renaming the stashed VG as "saved_vg".
2018-04-25 16:39:43 -05:00
Zdenek Kabelac
c492fbb51c debug: more explanatory error message 2018-04-23 22:42:18 +02:00
Zdenek Kabelac
fcdac700f9 gcc: remove duplicate typedef 2018-04-23 22:42:18 +02:00
David Teigland
1409c4a1c2 clvm: rescan when VG or PV not found
Rescan devices to update lvmcache content when
clvmd vg_read doesn't find a VG or PV.
2018-04-20 16:09:49 -05:00
David Teigland
aee27dc7ba scan: skip device rescan in vg_read
For reporting commands (pvs,vgs,lvs,pvdisplay,vgdisplay,lvdisplay)
we do not need to repeat the label scan of devices in vg_read if
they all had matching metadata in the initial label scan.  The
data read by label scan can just be reused for the vg_read.
This cuts the amount of device i/o in half, from two reads of
each device to one.  We have to be careful to avoid repairing
the VG if we've skipped rescanning.  (The VG repair code is very
poor, and will be redone soon.)
2018-04-20 11:23:14 -05:00
David Teigland
aa833bdd8a bcache: intercept test mode before write
Don't allow writes in test mode.  test mode should be
more sophisticated than just faking writes, and this
should be a last defense for cases where test mode is
not being checked correctly.
2018-04-20 11:22:48 -05:00
David Teigland
9b6a62f944 lvmcache: simplify
Recent changes allow some major simplification of the way
lvmcache works and is used.  lvmcache_label_scan is now
called in a controlled fashion at the start of commands,
and not via various unpredictable side effects.  Remove
various calls to it from other places.  lvmcache_label_scan
should not be called from anywhere during a command, because
it produces an incorrect representation of PVs with no MDAs,
and misclassifies them as orphans.  This has been a long
standing problem.  The invalid flag and rescanning based on
that is no longer used and removed.  The 'force' variation is
no longer needed and removed.
2018-04-20 11:22:48 -05:00
David Teigland
c0973e70a5 dev_cache: clean up scan
Pull out all of the twisted logic and simply call dev_cache_scan
at the start of the command prior to label scan.
2018-04-20 11:22:48 -05:00
David Teigland
45e5e702c1 scan: improve io error checking and reporting 2018-04-20 11:22:48 -05:00
David Teigland
6d05859862 bcache: let caller see an error 2018-04-20 11:22:48 -05:00
David Teigland
ae21305ee7 scan: drop bcache between lvm shell commands
A running lvm shell keeps all lvm devices open
unless the bcache is dropped.
2018-04-20 11:22:48 -05:00
David Teigland
a9b0aa5c17 lvmetad: more fixes related to bcache
Need to open devs prior to bcache io.
2018-04-20 11:22:48 -05:00
David Teigland
e351f8bc66 lvmetad: need to set up bcache in another place
We need to find one common place to set up bcache
for the lvmetad case, instead of adding calls in
various places.
2018-04-20 11:22:48 -05:00
David Teigland
ddb5de7a98 clvm: fix bcache scan handling
We can't let clvmd keep all scanned devs open,
which prevents them from being removed.  So
drop the bcache data (and close fds) affter
doing a label scan.

Also set up bcache before the clvm-specific
vg_read (which needs to rescan the vg's devs
using bcache) and destroy the bcache after.
2018-04-20 11:22:48 -05:00
David Teigland
196579af1f scan: check for errors in text layer
The scanning code in the format_text layer
has previously ignored errors.  Start checking
for and returning them.
2018-04-20 11:22:47 -05:00
David Teigland
44726ed9cb scan: remove lvmcache info for failed devs
When scanning a device fails, drop an lvmcache
info struct for it.
2018-04-20 11:22:47 -05:00
David Teigland
1717d4cb17 lvmcache: add shorter way to delete dev info
Don't make the caller look up the info first.
2018-04-20 11:22:47 -05:00
David Teigland
570c6239ee bcache: fix error handling
The error handling code wasn't working, but it
appears that just removing it is what we need.
The doesn't really need any different behavior
related to bcache blocks on an io error, it just
wants to know if there was an error.
2018-04-20 11:22:47 -05:00
David Teigland
217f3f8741 scan: add function to drop bcache blocks
which can be a little more efficient that destroy.
2018-04-20 11:22:47 -05:00
David Teigland
da2b155a9d scan: invalidate bcache for dev after errors
If there are errors reading or writing dev,
invalidate bcache for it.
2018-04-20 11:22:47 -05:00
David Teigland
4331182964 bcache: add some error messages for debugging 2018-04-20 11:22:47 -05:00
David Teigland
21057676a1 scan: create bcache with minimum number of blocks
In some odd cases (e.g. tests) there are very few devices
which results in creating too few blocks in bcache, so
create bcache with a minimum number of blocks.
2018-04-20 11:22:47 -05:00
David Teigland
e49b114f7e bcache: use wrappers for bcache read write in lvm
Using a wrapper makes it easier to disable bcache if needed.
2018-04-20 11:22:47 -05:00
David Teigland
8065492046 bcache: do all writes through bcache 2018-04-20 11:22:47 -05:00
David Teigland
8b26a007b1 misc bcache fixes from ejt 2018-04-20 11:22:47 -05:00
David Teigland
0da296003d vgchange: invalidate bcache for stacked LVs when deactivating
An LV with a stacked PV will be open in bcache and needs to be
invalidated to close the fd before attempting to deactivate.
2018-04-20 11:22:47 -05:00
David Teigland
c2b10daf69 scan: put dev back on caller's list
Commit 6e442875613915e506440e59a290b56756df2521 missed
adding devs back to caller's list.
2018-04-20 11:22:47 -05:00
David Teigland
e7670d3338 pvck: use bcache 2018-04-20 11:22:47 -05:00
David Teigland
b504bb809e scan: use 128K bcache block size 2018-04-20 11:22:46 -05:00
David Teigland
28255e3eee scan: always setup bcache for commands using lvmetad
Do this at the start of the command so that it doesn't
need to be checked and set up in every function that
could need it.
2018-04-20 11:22:46 -05:00
David Teigland
f328532f05 scan: leave the caller's dev list unchanged
When scanning the list of devs from the caller
they are moved to another temporary list, but
were never returned to the original list.
2018-04-20 11:22:46 -05:00
David Teigland
7bce66c5e8 scan: setup bcache for commands using lvmetad
Commands using lvmetad will not begin with a proper
label_scan which initializes bcache, but may later
decide they need to scan a set of devs, in which case
they'll need bcache set up at that point.
2018-04-20 11:22:46 -05:00
David Teigland
6e580465b5 vgremove: fix force remove on devs with damaged metadata
The improved detection of bad metadata when scanning
(where errors were ignored before) means we now have to
override some errors when forcibly erasing damaged metadata.
2018-04-20 11:22:46 -05:00
David Teigland
37471bb477 scan: skip extra scan in vg_read
Drop an extra label scan in the recovery part
of vg_read.  This is a temporary improvement
until the pending replacement for the broken
recovery code burried in vg_read.
2018-04-20 11:22:46 -05:00
David Teigland
e4f478d86d scan: handle request to scan missing dev 2018-04-20 11:22:46 -05:00
David Teigland
89f54a5094 remove debugging print 2018-04-20 11:22:46 -05:00
David Teigland
a1e3398ffc scan: handle no devices
Still create bcache.
2018-04-20 11:22:46 -05:00
David Teigland
9d2add1361 scan: add a dev to bcache before each read to handle write path
This is a temporary hacky workaround to the problem of
reads going through bcache and writes not using bcache.
The write path wants to read parts of data that it is
incrementally writing to disk, but the reads (using
bcache) don't work because the writes are not in the
bcache.  For now, add a dev to bcache before each attempt
to read it in case it's being used on the write path.
2018-04-20 11:22:46 -05:00
David Teigland
6c67c7557c scan: use separate fd for bcache
Create a new dev->bcache_fd that the scanning code owns
and is in charge of opening/closing.  This prevents other
parts of lvm code (which do various open/close) from
interfering with the bcache fd.  A number of dev_open
and dev_close are removed from the reading path since
the read path now uses the bcache.

With that in place, open(O_EXCL) for pvcreate/pvremove
can then be fixed.  That wouldn't work previously because
of other open fds.
2018-04-20 11:22:46 -05:00
David Teigland
f17c2cf7c6 pvremove: device check doesn't require label_read
It just needs to check if the device was found during
the scan, which means checking if it exists in lvmcache.
2018-04-20 11:22:45 -05:00
David Teigland
29c6c17121 format-text.c log message fixes 2018-04-20 11:22:45 -05:00
David Teigland
d9a77e8bb4 lvmcache: simplify metadata cache
The copy of VG metadata stored in lvmcache was not being used
in general.  It pretended to be a generic VG metadata cache,
but was not being used except for clvmd activation.  There
it was used to avoid reading from disk while devices were
suspended, i.e. in resume.

This removes the code that attempted to make this look
like a generic metadata cache, and replaces with with
something narrowly targetted to what it's actually used for.

This is a way of passing the VG from suspend to resume in
clvmd.  Since in the case of clvmd one caller can't simply
pass the same VG to both suspend and resume, suspend needs
to stash the VG somewhere that resume can grab it from.
(resume doesn't want to read it from disk since devices
are suspended.)  The lvmcache vginfo struct is used as a
convenient place to stash the VG to pass it from suspend
to resume, even though it isn't related to the lvmcache
or vginfo.  These suspended_vg* vginfo fields should
not be used or touched anywhere else, they are only to
be used for passing the VG data from suspend to resume
in clvmd.  The VG data being passed between suspend and
resume is never modified, and will only exist in the
brief period between suspend and resume in clvmd.

suspend has both old (current) and new (precommitted)
copies of the VG metadata.  It stashes both of these in
the vginfo prior to suspending devices.  When vg_commit
is successful, it sets a flag in vginfo as before,
signaling the transition from old to new metadata.

resume grabs the VG stashed by suspend.  If the vg_commit
happened, it grabs the new VG, and if the vg_commit didn't
happen it grabs the old VG.  The VG is then used to resume
LVs.

This isolates clvmd-specific code and usage from the
normal lvm vg_read code, making the code simpler and
the behavior easier to verify.

Sequence of operations:

- lv_suspend() has both vg_old and vg_new
  and stashes a copy of each onto the vginfo:
  lvmcache_save_suspended_vg(vg_old);
  lvmcache_save_suspended_vg(vg_new);

- vg_commit() happens, which causes all clvmd
  instances to call lvmcache_commit_metadata(vg).
  A flag is set in the vginfo indicating the
  transition from the old to new VG:
  vginfo->suspended_vg_committed = 1;

- lv_resume() needs either vg_old or vg_new
  to use in resuming LVs.  It doesn't want to
  read the VG from disk since devices are
  suspended, so it gets the VG stashed by
  lv_suspend:
  vg = lvmcache_get_suspended_vg(vgid);

If the vg_commit did not happen, suspended_vg_committed
will not be set, and in this case, lvmcache_get_suspended_vg()
will return the old VG instead of the new VG, and it will
resume LVs based on the old metadata.
2018-04-20 11:22:45 -05:00
David Teigland
79c4971210 label_scan: remove extra label scan and read for orphan PVs
When process_each_pv() calls vg_read() on the orphan VG, the
internal implementation was doing an unnecessary
lvmcache_label_scan() and two unnecessary label_read() calls
on each orphan.  Some of those unnecessary label scans/reads
would sometimes be skipped due to caching, but the code was
always doing at least one unnecessary read on each orphan.

The common format_text case was also unecessarily calling into
the format-specific pv_read() function which actually did nothing.

By analyzing each case in which vg_read() was being called on
the orphan VG, we can say that all of the label scans/reads
in vg_read_orphans are unnecessary:

1. reporting commands: the information saved in lvmcache by
the original label scan can be reported.  There is no advantage
to repeating the label scan on the orphans a second time before
reporting it.

2. pvcreate/vgcreate/vgextend: these all share a common
implementation in pvcreate_each_device().  That function
already rescans labels after acquiring the orphan VG lock,
which ensures that the command is using valid lvmcache
information.
2018-04-20 11:22:45 -05:00