1
0
mirror of git://sourceware.org/git/lvm2.git synced 2025-12-10 16:23:54 +03:00
Commit Graph

403 Commits

Author SHA1 Message Date
David Teigland
a761f9fc9d label_scan/vg_read: use label_read_data to avoid disk reads
The new label_scan() function reads a large buffer of data
from the start of the disk, and saves it so that multiple
structs can be read from it.  Previously, only the label_header
was read from this buffer, and the code which needed data
structures that immediately followed the label_header would
read those from disk separately.  This created a large
number of small, unnecessary disk reads.

In each place that the two read paths (label_scan and vg_read)
need to read data from disk, first check if that data is
already available from the label_read_data buffer, and if
so just copy it from the buffer instead of reading from disk.

Code changes
------------

- passing the label_read_data struct down through
  both read paths to make it available.

- before every disk read, first check if the location
  and size of the desired piece of data exists fully
  in the label_read_data buffer, and if so copy it
  from there.  Otherwise, use the existing code to
  read the data from disk.

- adding some log_error messages on existing error paths
  that were already being updated for the reasons above.

- using similar naming for parallel functions on the two
  parallel read paths that are being updated above.

  label_scan path calls:
  read_metadata_location_summary, text_read_metadata_summary

  vg_read path calls:
  read_metadata_location_vg, text_read_metadata_file

  Previously, those functions were named:

  label_scan path calls:
  vgname_from_mda, text_vgsummary_import

  vg_read path calls:
  _find_vg_rlocn, text_vg_import_fd

I/O changes
-----------

In the label_scan path, the following data is either copied
from label_read_data or read from disk for each PV:

- label_header and pv_header
- mda_header (in _raw_read_mda_header)
- vg metadata name (in read_metadata_location_summary)
- vg metadata (in config_file_read_fd)

Total of 4 reads per PV in the label_scan path.

In the vg_read path, the following data is either copied from
label_read_data or read from disk for each PV:

- mda_header (in _raw_read_mda_header)
- vg metadata name (in read_metadata_location_vg)
- vg metadata (in config_file_read_fd)

Total of 3 reads per PV in the vg_read path.

For a common read/reporting command, each PV will be:

- read by the command's initial lvmcache_label_scan()
- read by lvmcache_label_rescan_vg() at the start of vg_read()
- read by vg_read()

Previously, this would cause 11 synchronous disk reads per PV:
4 from lvmcache_label_scan(), 4 from lvmcache_label_rescan_vg()
and 3 from vg_read().

With this commit's optimization, there are now 2 async disk reads
per PV: 1 from lvmcache_label_scan() and 1 from
lvmcache_label_rescan_vg().

When a second mda is used on a PV, it is located at the
end of the PV.  This second mda and copy of metadata will
not be found in the label_read_data buffer, and will always
require separate disk reads.
2017-10-27 12:32:44 -05:00
David Teigland
606dc06e43 label_scan: move to start of command
LVM's general design for scanning/reading of metadata from disks is
that a command begins with a discovery phase, called "label scan",
in which it discovers which devices belong to lvm, what VGs exist on
those devices, and which devices are associated with each VG.
After this comes the processing phase, which is based around
processing specific VGs.  In this phase, lvm acquires a lock on
the VG, and rescans the devices associated with that VG, i.e.
it repeats the label scan steps on the devices in the VG in case
something has changed between the initial label scan and taking
the VG lock.  This ensures that the command is processing the
lastest, unchanging data on disk.

This commit moves the location of these label scans to make them
clearer and avoid unnecessary repeated calls to them.

Previously, the initial label scan was called as a side effect
from various utility functions.  This would lead to it being called
unnecessarily.  It is an expensive operation, and should only be
called when necessary.  Also, this is a primary step in the
function of the command, and as such it should be called prominently
at the top level of command processing, not as a hidden side effect
of a utility function.  lvm knows exactly where and when the
label scan needs to be done.  Because of this, move the label scan
calls from the internal functions to the top level of processing.

Other specific instances of lvmcache_label_scan() are still called
unnecessarily or unclearly by specific commands that do not use
the common process_each functions.  These will be improved in
future commits.

During the processing phase, rescanning labels for devices in a VG
needs to be done after the VG lock is acquired in case things have
changed since the initial label scan.  This was being done by way
of rescanning devices that had the INVALID flag set in lvmcache.
This usually approximated the right set of devices, but it was not
exact, and obfuscated the real requirement.  Correct this by using
a new function that rescans the devices in the VG:
lvmcache_label_rescan_vg().

Apart from being inexact, the rescanning was extremely well hidden.
_vg_read() would call ->create_instance(), _text_create_text_instance(),
_create_vg_text_instance() which would call lvmcache_label_scan()
which would call _scan_invalid() which repeats the label scan on
devices flagged INVALID.  lvmcache_label_rescan_vg() is now called
prominently by _vg_read() directly.
2017-10-27 12:13:25 -05:00
David Teigland
cfe35ec877 label_scan: call new label_scan from lvmcache_label_scan
To do label scanning, lvm code calls lvmcache_label_scan().
Change lvmcache_label_scan() to use the new label_scan()
which can use async io, rather than implementing its own
dev iter loop and calling the synchronous label_read() on
each device.

Also add lvmcache_label_rescan_vg() which calls the new
label_scan_devs() which does label scanning on only the
specified devices.  This is for a subsequent commit and
is not yet used.
2017-10-27 12:13:25 -05:00
Alasdair G Kergon
f1cc5b12fd tidy: Add missing underscores to statics. 2017-10-18 15:58:13 +01:00
Zdenek Kabelac
539a48a328 debug: add stack trace point 2017-08-22 10:23:31 +02:00
Zdenek Kabelac
876c4a1b3b tidy: declaration names match implementation
Put in sync some naming used for function declaration and
actual in-code implementation.
2017-07-20 19:16:41 +02:00
Zdenek Kabelac
1fd8785ff3 tidy: drop unneeded return 2017-07-20 11:20:22 +02:00
Zdenek Kabelac
0bf836aa14 tidy: prefer not using else after return
clang-tidy: avoid using  'else' after return - give more readable code,
and also saves indention level.
2017-07-20 11:18:29 +02:00
Zdenek Kabelac
f7e62bc55c cleanup: drop extra compare
dm_free() already validates for NULL itself.
2017-07-17 12:32:18 +02:00
Zdenek Kabelac
ea96a9d68e devcache: correct logging severity for connection
Switch from warn to log_error since this generated
failing return code for command so printing log_error()
is mandatory.

Happens with i.e. pvscan --cache meets crashing lvmetad.
2017-07-17 12:28:51 +02:00
Zdenek Kabelac
e3a3cf01eb cleanup: use more common FMTd64 type
We use 'd' for plain singed integers.
2017-03-27 20:50:19 +02:00
David Teigland
506d88a2ec lvconvert: disable lvmetad for repair
Repairing missing devices does not work reliably
with lvmetad, so disable lvmetad before repair.
A standard lvmetad refresh (pvscan --cache) will
enable lvmetad again.
2017-03-16 11:50:36 -05:00
Christian Brauner
46b735c937 lvmetad: fix segfault on i386
Sending %d as format argument in lvmetad_vg_remove_pending() will cause
segfaults in config_make_nodes_v() when va_arg() casts to int64_t. Also, it is
clearly advertised in the lvm source code that using plain %d is prohibited, so
let's switch to FMTd64.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2017-03-13 13:37:07 -05:00
Zdenek Kabelac
9fe4f2337b cleanup: drop assign before use
Drop unneeded assigns singe vars are set later in code before
their first use (Coverity).
2016-10-03 17:49:55 +02:00
David Teigland
49a1c4d4b0 lvmlockd: fix segfault in error path
The log_debug statement was ignoring the NULL vg error case.
2016-09-28 13:29:55 -05:00
Peter Rajnoha
f1cad4c710 config: use config_tree_from_string_without_dup_node_check throughout code to construct metadata trees 2016-09-21 18:18:15 +02:00
David Teigland
d455300d7b lvmlockd: fix metadata validation when rescanning VG
When rescanning a VG from disk, the metadata read from
each PV was compared as a sanity check.  The comparison
is done by exporting the vg metadata from each dev to
a config tree, and then comparing the config trees.
The function to create the config tree inserts
extraneous information along with the actual VG metadata.
This extra info includes creation_time.  The config
trees for two devs can easily be created one second
apart in which case the different creation_times would
cause the metadata comparison to fail.  The fix is to
exclude the extraneous info from the metadata comparison.
2016-09-14 10:43:33 -05:00
Alasdair G Kergon
2e4821a847 lvmetad: Unused includes. 2016-09-05 17:14:53 +01:00
Peter Rajnoha
5d323c37f3 refactor: move and rename _dev_is_mpath_component in lvmetad.c to udev_dev_is_mpath_component in dev-type.c 2016-09-05 12:55:25 +02:00
Peter Rajnoha
29d0317557 lvmetad: check udev for mpath component several times if udev record not initialized yet
It's possible (mainly during boot) that udev has not finished
processing the device and hence the udev database record for that
device is still marked as uninitialized when we're trying to look
at it as part of multipath component check in pvscan --cache code.

So check several times with a short delay to wait for the udev db
record to be initialized before giving up completely.
2016-09-05 12:43:11 +02:00
David Teigland
939f5310b9 lvmetad: use udev to ignore multipath components during scan
When scanning devs to populate lvmetad during system startup,
filter-mpath with native sysfs multipath component detection
may not detect that a dev is multipath component.  This is
because the multipath devices may not be set up yet.

Because of this, pvscan will scan multipath components during
startup, will see them as duplicate PVs, and will disable
lvmetad.  This will leave lvmetad disabled on systems using
multipath, unless something or someone runs pvscan --cache
to rescan.

To avoid this problem, the code that is scanning devices to
populate lvmetad will now check the udev db to see if a
dev is a multipath component that should be skipped.

(This may not be perfect due to inherent udev races, but will
cover most cases and will be at least as good as it's ever
been.)
2016-08-31 13:19:57 -05:00
David Teigland
6ea250e2d0 lvmetad: fix use committed metadata to update
In some cases, the command will update VG metadata
in lvmetad without writing it.  In these cases there
is no vg->vg_committed and it should use 'vg' directly.
This happens when the command finds that the lvmetad
VG has been invalidated, rereads the metadata from disk,
then updates lvmetad with that metadata.  This happens
often with lvmlockd or foreign VGs, and can happen without
lvmlockd if a previous command fails after invalidating
the VG in lvmetad.
2016-07-20 10:25:26 -05:00
David Teigland
f8872578e9 lvmetad: use committed metadata to update
This fixes a regression from commit a7c45ddc5, which moved
the lvmetad VG update from vg_commit() to unlock_vg().

The lvmetad VG update needs to send the version of metadata
that was committed rather than sending the state of struct 'vg'.
The 'vg' may have been partially modified since vg_commit(),
and contain non-committed metadata that shouldn't be sent
to lvmetad.
2016-07-18 16:18:53 -05:00
Zdenek Kabelac
fd53d86eea cleanup: gcc warns removal
Ensure vars have always defined value.
2016-07-12 10:39:33 +02:00
Alasdair G Kergon
c1a66d4fc6 coverity: Fixes for recent changes. 2016-07-06 16:09:32 +01:00
Zdenek Kabelac
37a33d7414 cleanup: warns from older gcc 2016-07-01 00:44:48 +02:00
Zdenek Kabelac
b9d3e8c8a8 cleanup: drop unused assignments
In all code paths we set a value for these variables, so drop
their initial unused assign.
2016-07-01 00:44:48 +02:00
David Teigland
ff3c4ed1c0 lvmetad: two phase vg_remove
Apply the same idea as vg_update.
Before doing the VG remove on disk, invalidate
the VG in lvmetad.  After the VG is removed,
remove the VG in lvmetad.  If the command fails
after removing the VG on disk, but before removing
the VG metadata from lvmetad, then a subsequent
command will see the INVALID flag and not use the
stale metadata from lvmetad.
2016-06-28 02:30:36 +01:00
David Teigland
a7c45ddc59 lvmetad: two phase vg_update
Previously, a command sent lvmetad new VG metadata in vg_commit().
In vg_commit(), devices are suspended, so any memory allocation
done by the command while sending to lvmetad, or by lvmetad while
updating its cache could deadlock if memory reclaim was triggered.

Now lvmetad is updated in unlock_vg(), after devices are resumed.
The new method for updating VG metadata in lvmetad is in two phases:

1. In vg_write(), before devices are suspended, the command sends
   lvmetad a short message ("set_vg_info") telling it what the new
   VG seqno will be.  lvmetad sees that the seqno is newer than
   the seqno of its cached VG, so it sets the INVALID flag for the
   cached VG.  If sending the message to lvmetad fails, the command
   fails before the metadata is committed and the change is not made.
   If sending the message succeeds, vg_commit() is called.

2. In unlock_vg(), after devices are resumed, the command sends
   lvmetad the standard vg_update message with the new metadata.
   lvmetad sees that the seqno in the new metadata matches the
   seqno it saved from set_vg_info, and knows it has the latest
   copy, so it clears the INVALID flag for the cached VG.

If a command fails between 1 and 2 (after committing the VG on disk,
but before sending lvmetad the new metadata), the cached VG retains
the INVALID flag in lvmetad.  A subsequent command will read the
cached VG from lvmetad, see the INVALID flag, ignore the cached
copy, read the VG from disk instead, update the lvmetad copy
with the latest copy from disk, (this clears the INVALID flag
in lvmetad), and use the correct VG metadata for the command.

(This INVALID mechanism already existed for use by lvmlockd.)
2016-06-28 02:30:31 +01:00
Zdenek Kabelac
e88e9ee9ed cleanup: clean warns from older gcc
Don't report uninitialized use by older gcc.
2016-06-24 01:10:04 +02:00
David Teigland
6ae22125c6 vgcfgrestore: use lvmetad disabled state
Previously, vgcfgrestore would attempt to vg_remove the
existing VG from lvmetad and then vg_update to add the
restored VG.  But, if there was a failure in the command
or with vg_update, the lvmetad cache would be left incorrect.
Now, disable lvmetad before the restore begins, and then
rescan to populate lvmetad from disk after restore has
written the new VG to disk.
2016-06-20 11:19:49 -05:00
David Teigland
899a4ca275 lvmlockd: fix dropping PVs in rescanning VG
This fixes a problem in commit ae0a8740c.  The problem
in that commit was that all existing PVs are initially
dropped from lvmetad.  This works if the VG is updated
at the end, which replaces the dropped PVs, but if the
rescan finds that the VG seqno is unchanged, it leaves
the cached VG in place.  So, we should only drop the
existing PVs in lvmetad when the VG is going to be updated.
2016-06-17 12:21:09 -05:00
David Teigland
ae0a8740c5 lvmlockd: fix rescanning VG
Previously, new PVs that were added to the VG were not scanned.
2016-06-15 11:36:30 -05:00
David Teigland
86f9271457 vgcreate, pvcreate, vgextend: don't use a device with duplicates
If duplicate orphan PVs exist, don't allow one of them to be
used for vgcreate/pvcreate/vgextend.
2016-06-07 15:21:07 -05:00
David Teigland
199b7b55c2 lvmcache: fix duplicate handling with multiple scans
Some commands scan labels to populate lvmcache multiple
times, i.e. lvmcache_init, scan labels to fill lvmcache,
lvmcache_destroy, then later repeat

Each time labels are scanned, duplicates are detected,
and preferred devices are chosen.  Each time this is done
within a single command, we want to choose the same
preferred devices.  So, check for existing preferences
when choosing preferred devices.

This also fixes a problem with the list of unused duplicate
devs when run in an lvm shell.  The devs had been allocated
from cmd memory, resulting in invalid list entries between
commands.
2016-06-07 15:15:51 -05:00
David Teigland
01156de6f7 lvmcache: add optional dev arg to lvmcache_info_from_pvid
A number of places are working on a specific dev when they
call lvmcache_info_from_pvid() to look up an info struct
based on a pvid.  In those cases, pass the dev being used
to lvmcache_info_from_pvid().  When a dev is specified,
lvmcache_info_from_pvid() will verify that the cached
info it's using matches the dev being processed before
returning the info.  Calling code will not mistakenly
get info for the wrong dev when duplicate devs exist.

This confusion was happening when scanning labels when
duplicate devs existed.  label_read for the first dev
would add an info struct to lvmcache for that dev/pvid.
label_read for the second dev would see the pvid in
lvmcache from first dev, and mistakenly conclude that
the label_read from the second dev can be skipped
because it's already been done.  By verifying that the
dev for the cached pvid matches the dev being read,
this mismatch is avoided and the label is actually read
from the second duplicate.
2016-06-07 15:15:47 -05:00
David Teigland
ed6ffc7a34 lvmetad: handle update failures
If a command gets stuck during an lvmetad update, lvmetad
will cancel that update after the timeout.  The next command
to check the lvmetad will see that lvmetad needs to be
populated because lvmetad will return token of "none" after
a timed out update (same as when lvmetad is not populated
at all after starting.)

If a command gets an error during an lvmetad update, it
will now just quit and leave its updating token in place.
That update will be cancelled after the timeout.
2016-06-07 10:17:00 -05:00
David Teigland
851ccfccaf lvmetad: remove disabled case for "scan error"
Failures while populating lvmetad will be handling
differently in a subsequent commit.
2016-06-07 10:17:00 -05:00
David Teigland
0e7f352c70 lvmetad: define special update in progress string 2016-06-07 10:17:00 -05:00
Peter Rajnoha
48877e215d coverity: missing check for id_write_format return value 2016-05-31 09:56:10 +02:00
David Teigland
9b640c3684 pvscan: use process_each_vg for autoactivate
This refactors the code for autoactivation.  Previously,
as each PV was found, it would be sent to lvmetad, and
the VG would be autoactivated using a non-standard VG
processing function (the "activation_handler") called via
a function pointer from within the lvmetad notification path.

Now, any scanning that the command needs to do (scanning
only the named device args, or scanning all devices when
there are no args), is done first, before any activation
is attempted.  During the scans, the VG names are saved.
After scanning is complete, process_each_vg is used to do
autoactivation of the saved VG names.  This makes pvscan
activation much more similar to activation done with
vgchange or lvchange.

The separate autoactivate phase also means that if lvmetad
is disabled (either before or during the scan), the command
can continue with the activation step by simply not using
lvmetad and reverting to disk scanning to do the
activation.
2016-05-23 11:57:32 -05:00
David Teigland
ba9b7b69d9 pvremove: allow clearing a duplicate PV
Add a special case to allow modifying a duplicate PV
to erase it with pvremove -ff.
2016-05-16 14:40:43 -05:00
Alasdair G Kergon
a6203657a0 lvmetad: Fix client error when socket access fails. 2016-05-12 01:54:09 +01:00
David Teigland
b1ea27b1e2 lvmetad: disable if device scan fails
If a command begins repopulating the lvmetad cache,
and fails part way through, it should set the disabled
state in lvmetad so other commands don't use bad data.
If a subsequent scan succeeds, the disabled state is
cleared.
2016-05-06 09:00:00 -05:00
David Teigland
49d9582bed lvmcache: use active LVs and device sizes to choose between duplicates
If duplicate devices exist for a PV, and one device's
size matches the PV size, but the other doesn't, then
prefer the matching device.

If one device is used by an active LV, prefer that device.
2016-05-06 09:00:00 -05:00
David Teigland
d4e434d1e6 pvs: new attr and field for unchosen duplicate device
When there are duplicate devices for a PV, one device
is preferred and chosen to exist in the VG.  The other
devices are not used by lvm, but are displayed by pvs
with a new PV attr "d", indicating that they are
unchosen duplicate PVs.

The "duplicate" reporting field is set to "duplicate"
when the PV is an unchosen duplicate, and that field
is blank for the chosen PV.
2016-05-06 09:00:00 -05:00
David Teigland
d3d13e134a lvmcache: process duplicate PVs directly
Previously, duplicate PVs were processed as a side effect
of processing the "chosen" PV in lvmcache.  The duplicate
PV would be hacked into lvmcache temporarily in place of
the chosen PV.

In the old way, we had to always process the "chosen" PV
device, even if a duplicate of it was named on the command
line.  This meant we were processing a different device than
was asked for.  This could be worked around by naming
multiple duplicate devs on the command line in which case
they were swapped in and out of lvmcache for processing.

Now, the duplicate devs are processed directly in their
own processing loop.  This means we can remove the old
hacks related to processing dups as a side effect of
processing the chosen device.  We can now simply process
the device that was named on the command line.

When the same PVID exists on two or more devices, one device
is preferred and used in the VG, and the others are duplicates
and are not used in the VG.  The preferred device exists in
lvmcache as usual.  The duplicates exist in a specical list
of unused duplicate devices.

The duplicate devs have the "d" attribute and the "duplicate"
reporting field displays "duplicate" for them.

'pvs' warns about duplicates, but the formal output only
includes the single preferred PV.

'pvs -a' has the same warnings, and the duplicate devs are
included in the output.

'pvs <path>' has the same warnings, and displays the named
device, whether it is preferred or a duplicate.
2016-05-06 09:00:00 -05:00
David Teigland
8b7a78c728 lvmcache: improve duplicate PV handling
Wait to compare and choose alternate duplicate devices until
after all devices are scanned.  During scanning, the first
duplicate dev is kept in lvmcache, and others are kept in a
new list (_found_duplicate_devs).

After all devices are scanned, compare all the duplicates
available for a given PVID and decide which is best.

If the dev used in lvmcache is changed, drop the old dev
from lvmcache entirely and rescan the replacement dev.
Previously the VG metadata from the old dev was kept in
lvmcache and only the dev was replaced.

A new config setting devices/allow_changes_with_duplicate_pvs
can be set to 0 which disallows modifying a VG or activating
LVs in it when the VG contains PVs with duplicate devices.
Set to 1 is the old behavior which allowed the VG to be
changed.

The logic for which of two devs is preferred has changed.
The primary goal is to choose a device that is currently
in use if the other isn't, e.g. by an active LV.

. prefer dev with fs mounted if the other doesn't, else
. prefer dev that is dm if the other isn't, else
. prefer dev in subsystem if the other isn't

If neither device is preferred by these rules, then don't
change devices in lvmcache, leaving the one that was found
first.

The previous logic for preferring a device was:

. prefer dev in subsystem if the other isn't, else
. prefer dev without holders if the other has holders, else
. prefer dev that is dm if the other isn't
2016-05-06 09:00:00 -05:00
David Teigland
67da017fd2 lvmetad: remove client side altdev code
This is no longer used since lvmetad no longer
keeps track of alternate devices for duplicate PVs,
but is simply disabled when duplicates appear.
2016-05-06 09:00:00 -05:00
David Teigland
9539ee8098 lvmetad: set disabled flag in lvmetad if duplicate PVs are found
When devices are being scanned, if duplicate PVs are seen,
tell lvmetad to set its disabled flag because of duplicate PVs.
2016-05-06 08:59:59 -05:00