1
0
mirror of git://sourceware.org/git/lvm2.git synced 2025-12-10 16:23:54 +03:00
Commit Graph

14746 Commits

Author SHA1 Message Date
David Teigland
a761f9fc9d label_scan/vg_read: use label_read_data to avoid disk reads
The new label_scan() function reads a large buffer of data
from the start of the disk, and saves it so that multiple
structs can be read from it.  Previously, only the label_header
was read from this buffer, and the code which needed data
structures that immediately followed the label_header would
read those from disk separately.  This created a large
number of small, unnecessary disk reads.

In each place that the two read paths (label_scan and vg_read)
need to read data from disk, first check if that data is
already available from the label_read_data buffer, and if
so just copy it from the buffer instead of reading from disk.

Code changes
------------

- passing the label_read_data struct down through
  both read paths to make it available.

- before every disk read, first check if the location
  and size of the desired piece of data exists fully
  in the label_read_data buffer, and if so copy it
  from there.  Otherwise, use the existing code to
  read the data from disk.

- adding some log_error messages on existing error paths
  that were already being updated for the reasons above.

- using similar naming for parallel functions on the two
  parallel read paths that are being updated above.

  label_scan path calls:
  read_metadata_location_summary, text_read_metadata_summary

  vg_read path calls:
  read_metadata_location_vg, text_read_metadata_file

  Previously, those functions were named:

  label_scan path calls:
  vgname_from_mda, text_vgsummary_import

  vg_read path calls:
  _find_vg_rlocn, text_vg_import_fd

I/O changes
-----------

In the label_scan path, the following data is either copied
from label_read_data or read from disk for each PV:

- label_header and pv_header
- mda_header (in _raw_read_mda_header)
- vg metadata name (in read_metadata_location_summary)
- vg metadata (in config_file_read_fd)

Total of 4 reads per PV in the label_scan path.

In the vg_read path, the following data is either copied from
label_read_data or read from disk for each PV:

- mda_header (in _raw_read_mda_header)
- vg metadata name (in read_metadata_location_vg)
- vg metadata (in config_file_read_fd)

Total of 3 reads per PV in the vg_read path.

For a common read/reporting command, each PV will be:

- read by the command's initial lvmcache_label_scan()
- read by lvmcache_label_rescan_vg() at the start of vg_read()
- read by vg_read()

Previously, this would cause 11 synchronous disk reads per PV:
4 from lvmcache_label_scan(), 4 from lvmcache_label_rescan_vg()
and 3 from vg_read().

With this commit's optimization, there are now 2 async disk reads
per PV: 1 from lvmcache_label_scan() and 1 from
lvmcache_label_rescan_vg().

When a second mda is used on a PV, it is located at the
end of the PV.  This second mda and copy of metadata will
not be found in the label_read_data buffer, and will always
require separate disk reads.
2017-10-27 12:32:44 -05:00
David Teigland
654525b33d independent metadata areas: fix bogus code
Fix mixing bitwise & and logical && which was
always 1 in any case.
2017-10-27 12:13:25 -05:00
David Teigland
2773767924 label_scan: fix independent metadata areas
This fixes the use of lvmcache_label_rescan_vg() in the previous
commit for the special case of independent metadata areas.

label scan is about discovering VG name to device associations
using information from disks, but devices in VGs with
independent metadata areas have no information on disk, so
the label scan does nothing for these VGs/devices.
With independent metadata areas, only the VG metadata found
in files is used.  This metadata is found and read in
vg_read in the processing phase.

lvmcache_label_rescan_vg() drops lvmcache info for the VG devices
before repeating the label scan on them.  In the case of
independent metadata areas, there is no metadata on devices, so the
label scan of the devices will find nothing, so will not recreate
the necessary vginfo/info data in lvmcache for the VG.  Fix this
by setting a flag in the lvmcache vginfo struct indicating that
the VG uses independent metadata areas, and label rescanning should
be skipped.

In the case of independent metadata areas, it is the metadata
processing in the vg_read phase that sets up the lvmcache
vginfo/info information, and label scan has no role.
2017-10-27 12:13:25 -05:00
David Teigland
606dc06e43 label_scan: move to start of command
LVM's general design for scanning/reading of metadata from disks is
that a command begins with a discovery phase, called "label scan",
in which it discovers which devices belong to lvm, what VGs exist on
those devices, and which devices are associated with each VG.
After this comes the processing phase, which is based around
processing specific VGs.  In this phase, lvm acquires a lock on
the VG, and rescans the devices associated with that VG, i.e.
it repeats the label scan steps on the devices in the VG in case
something has changed between the initial label scan and taking
the VG lock.  This ensures that the command is processing the
lastest, unchanging data on disk.

This commit moves the location of these label scans to make them
clearer and avoid unnecessary repeated calls to them.

Previously, the initial label scan was called as a side effect
from various utility functions.  This would lead to it being called
unnecessarily.  It is an expensive operation, and should only be
called when necessary.  Also, this is a primary step in the
function of the command, and as such it should be called prominently
at the top level of command processing, not as a hidden side effect
of a utility function.  lvm knows exactly where and when the
label scan needs to be done.  Because of this, move the label scan
calls from the internal functions to the top level of processing.

Other specific instances of lvmcache_label_scan() are still called
unnecessarily or unclearly by specific commands that do not use
the common process_each functions.  These will be improved in
future commits.

During the processing phase, rescanning labels for devices in a VG
needs to be done after the VG lock is acquired in case things have
changed since the initial label scan.  This was being done by way
of rescanning devices that had the INVALID flag set in lvmcache.
This usually approximated the right set of devices, but it was not
exact, and obfuscated the real requirement.  Correct this by using
a new function that rescans the devices in the VG:
lvmcache_label_rescan_vg().

Apart from being inexact, the rescanning was extremely well hidden.
_vg_read() would call ->create_instance(), _text_create_text_instance(),
_create_vg_text_instance() which would call lvmcache_label_scan()
which would call _scan_invalid() which repeats the label scan on
devices flagged INVALID.  lvmcache_label_rescan_vg() is now called
prominently by _vg_read() directly.
2017-10-27 12:13:25 -05:00
David Teigland
cfe35ec877 label_scan: call new label_scan from lvmcache_label_scan
To do label scanning, lvm code calls lvmcache_label_scan().
Change lvmcache_label_scan() to use the new label_scan()
which can use async io, rather than implementing its own
dev iter loop and calling the synchronous label_read() on
each device.

Also add lvmcache_label_rescan_vg() which calls the new
label_scan_devs() which does label scanning on only the
specified devices.  This is for a subsequent commit and
is not yet used.
2017-10-27 12:13:25 -05:00
David Teigland
2530dc4679 label_scan: add new implementation for async and sync
This adds the implementation without using it in the code.
The code still calls label_read() on each individual device
to do scanning.

Calling the new label_scan() will use async io if async io is
enabled in config settings.  If not enabled, or if async io fails,
label_scan() will fall back to using synchronous io.  If only some
aio ops fail, the code will attempt to perform synchronous io on just
the ios that failed.

Uses linux native aio system calls, not the posix wrappers which
are messier and may not have all the latest linux capabilities.

Internally, the same functionality is used before:

- iterate through each visible device on the system,
  provided from from dev-cache

- call _find_label_header on the dev to find the sector
  containing the label_header

- call _text_read to look at the pv_header and mda locations
  after the pv_header

- for each mda location, read the mda_header and the vg metadata

- add info/vginfo structs to lvmcache which associate the
  device name (info) with the VG name (vginfo) so that vg_read
  can know which devices to read for a given VG name

The new label scanning issues a "large" read beginning at the start
of the device, where large is configurable, but intended to cover
all the labels/headers/metadata that is located at the start of
the device.  This large data buffer from each device is saved in a
global list using a new 'label_read_data' struct.  Currently, this
buffer is only used to find the label_header from the first four
sectors of the device.  In subsequent commits, other functions
that read other structs/metadata will first try to find that data in
the saved label_read_data buffer.  In most common cases, the data
they need can simply be copied out of the existing buffer, and
they can avoid issuing another disk read to get it.
2017-10-27 12:13:25 -05:00
David Teigland
8c08b509fa command: add settings to enable async io
There are config settings to enable aio, and to configure
the concurrency and read size.
2017-10-27 12:13:21 -05:00
David Teigland
a41ffd52d6 io: add low level async io support
The interface consists of:

- A context struct, one for the entire command.

- An io struct, one per io operation (read).

- dev_async_context_setup() creates an aio context.

- dev_async_context_destroy() destroys an aio context.

- dev_async_alloc_ios() allocates a specified number of io structs,
  along with an associated buffer for the data.

- dev_async_free_ios() frees all the allocated io structs+buffers.

- dev_async_io_get() gets an available io struct from those
  allocated in alloc_ios.  If none are available, it will allocate
  a new io struct if under limit.

- dev_async_io_put() puts a used io struct back into the set
  of unused io structs, making it available for get.

- dev_async_read_submit() start an async read io.

- dev_async_getevents() collect async io completions.
2017-10-27 10:11:17 -05:00
Heinz Mauelshagen
fdcc709ed0 WHATS_NEW: ignore stripes/stripesize on RAID takover 2017-10-26 18:18:24 +02:00
Heinz Mauelshagen
adb80816fb lvcreate: error message with dot. 2017-10-26 17:25:22 +02:00
Heinz Mauelshagen
4a3884245d raid: ignore --stripes/--stripesize on takeover
Converting from one raid level to another, no changes
of stripes or stripesize can be requested because those
are subject to reshaping.  I.e. the process requires to
takeover first and secondly request raid algorithm,
stripe or stripesize changes.

Ignore any related changes display warninngs
and proceed with the takeover.

Without this patch, a takeover requesting
stripesize change causes data corruption!
2017-10-26 17:16:23 +02:00
Zdenek Kabelac
b765288bf2 tests: better clustering support
Use exclusive activation for snapshot conversion since we can only
convert exclusively active volumes.
2017-10-26 14:04:58 +02:00
Zdenek Kabelac
1e80ec8926 tests: allow override of LVM_LOG_FILE_MAX_LINES
Just like with other vars support this:

make check_local T=xyz LVM_LOG_FILE_MAX_LINES=10000000

Allows easily to override existing line limit.
Also increase limiting size of logs per command since some of
our commands are becoming very verbose....
2017-10-26 14:04:58 +02:00
Zdenek Kabelac
04186616be Makefile: help shows hint about LVM_LOG_FILE_MAX_LINES 2017-10-26 14:04:58 +02:00
Zdenek Kabelac
837bfab75c log: better message when reached log limit
Add explaining message, when command was aborted due to the reach
of configure line number count (LVM_LOG_FILE_MAX_LINES)
for logging (used mainly with testing).
2017-10-26 14:04:58 +02:00
Zdenek Kabelac
1758614f96 WHATS_NEW: missed
Last patch missed to mention, we've improved/fixed generated paths
in units and init.d shell scripts when lvm2 was plainly configured
with just i.e. --prefix.

Note: some distros might have fully specified --sbindir and
--usrsbindir - thus those very not seeing problems in generated paths.
2017-10-26 14:04:58 +02:00
Zdenek Kabelac
44c4fe8e61 commands: drop secondary for lvconvert --type snapshot
Both form were marked and secondary thus none of the supported
syntax entered manpage.

This restores appearance of snapshot conversion in man page.
2017-10-25 22:02:54 +02:00
Zdenek Kabelac
35df4b10eb shellcheck: some apostrophe changes and cleanups 2017-10-25 22:02:54 +02:00
Zdenek Kabelac
34618c2d30 scripts: paths update
Correct usage of sbindir also for scripts so the path
no longer needs resolving more vars like exec_prefix & prefix.
2017-10-25 22:02:54 +02:00
Zdenek Kabelac
d809fbb541 systemd: use proper sbindir path
Replace lowercase  @sbindir@  with  @SBINDIR@ which contains
fully decoded path.

Same with  @usrsbindir@ which is also used with clvmd and cmirrord.

Also handle SYSCONFDIR for EnvironmentFile.

Patch fixes generated unit files with strings like:
ExecStart=${exec_prefix}/sbin/lvm
2017-10-25 22:02:54 +02:00
Zdenek Kabelac
3f59969c3f configure: improve support for sbindir path
Introduce few more AC_SUBST vars for usage in *.in generation.

In some case we want to replace i.e. $sbindir with full path
instead of current ${exec_prefix}/sbin.

This patch provides:

USRSBINDIR
SBINDIR
DEFAULT_SYS_LOCK_DIR
SYSCONFDIR

At the same time properly use sbindir & usrsbindir with
lvm, fsadm, clvmd from one primary definition.
2017-10-25 22:02:54 +02:00
Zdenek Kabelac
f32ef63b6c tests: snapshot conversions
Add missing tests for snapshost conversions.
2017-10-25 22:02:24 +02:00
Zdenek Kabelac
0a0cc696ca typo: fix invalid 2017-10-25 22:02:24 +02:00
Zdenek Kabelac
0e7edd1d24 snapshot: improve validation
Do not allow to take snapshot of mirror/raid leg or log or metadata LV.
This was actually never supported, but user was able to create it,
and this put device stack in hardly fixable state (needs manual work).

This prevents such creation to pass.

Also improve validation when recreating snapshot volume type
from origin and COW volume.
2017-10-25 21:58:01 +02:00
Jonathan Brassow
38f7fbac64 clean-up: Correct the comment to match the particular test case 2017-10-24 14:06:44 -05:00
Zdenek Kabelac
10c76ce35a tests:check lvconvert with /dev in vglvname 2017-10-24 16:16:08 +02:00
Zdenek Kabelac
ea63a38f5a lvconvert: fixing extraction of vgname
Correction to function for extracting vgname out of lvconvert
parameters.

Avoid repeating some checks.

Add code to handle generic options which may provide vgname in its argument
and compare them all so they match to a single vgname (otherwise it's a
error).

Extract default (envvar) vgname only when no position nor optional vgname is
found.

Fixing regression instroduce with patchset started with commit:
1e2420bca8   (2.02.169)
2017-10-24 16:16:08 +02:00
Ondrej Kozina
dcc8f90c58 fsadm: refactor resize_crypt function
split resize_crypt function in two.

a) Detect proper dm-crypt device type and count new --size
   value for cryptsetup resize command.

b) Perform the resize
2017-10-24 13:41:16 +02:00
Ondrej Kozina
9916d8fa9a fsadm: rename local variables to avoid confusion 2017-10-24 13:41:09 +02:00
Ondrej Kozina
213cea3aaa test: add regression test for fsadm bug
the bug in LUKS grow/shrink decision in fsadm was
masked due to fact that default LVM2 extent size
was larger than LUKS1 default data offset for dm-crypt
mapping. The new test address this bug.
2017-10-24 13:41:00 +02:00
Ondrej Kozina
af781897fa fsadm: fix bug in LUKS grow/shrink decision branch 2017-10-24 13:40:54 +02:00
Ondrej Kozina
6df7917581 fsadm: add luks specific error message for small devices 2017-10-24 13:40:50 +02:00
Zdenek Kabelac
888dd33148 tests: check stacked cache dataLV of thin-pool 2017-10-23 12:01:16 +02:00
Zdenek Kabelac
df3ff32fc0 lvcreate: skip checking for name restriction for caching
lvcreate supports a 'conversion' when caching LV.
This normally worked fine, however in case passed LV was
thin-pool's data LV with suffix _tdata we have failed to early.

As the easiest fix looks dropping validation of name when
caching type is select - such name check will happen later
once the VG is opened again and properly detect if the LV
with protected name already exists and can be converted,
or will be rejected as ambigiuous operation requiring user
to specify  --type cache | --type cache-pool.
2017-10-23 12:01:15 +02:00
Zdenek Kabelac
d6fcab900b lvextend: detect stacked cache lv used for thinpool
Ensure, that cacheLV is not tried to be resize until full support is
added.
2017-10-23 12:00:43 +02:00
Zdenek Kabelac
de58df390b lvconvert: preserve names of converted LV
When prompting and warning for conversion, remember initial LV names,
so after conversion is finished, correct original names are printed.
2017-10-23 11:58:27 +02:00
Heinz Mauelshagen
d6f4563103 test: remove 'should's from test to test target status race fix 2017-10-19 17:41:44 +02:00
Alasdair G Kergon
f3ae99dcc0 liblvm: Move lib code used exclusively into metadata-liblvm.c
Also remove some redundant function definitions from metadata.h.
2017-10-18 19:29:32 +01:00
Alasdair G Kergon
f1cc5b12fd tidy: Add missing underscores to statics. 2017-10-18 15:58:13 +01:00
Zdenek Kabelac
327d9d59be libdm: fix typo in libdevmapper.pc
Fixing name for RT libraries and using RT_LIBS.
2017-10-18 00:04:06 +02:00
David Teigland
1b319f39d6 lvmlockd: check error for sanlock access to lvmlock LV
When the sanlock daemon does not have permission to access
the lvmlock LV, make the error messages more helpful.
2017-10-17 13:45:53 -05:00
Alasdair G Kergon
146745ad88 device: Separate errors for dev not found and filtered.
Replaced the confusing device error message "not found (or ignored by
filtering)" by either "not found" or "excluded by a filter".
(Later we should be able to say which filter.)

Left the the liblvm code paths alone.
2017-10-17 02:12:41 +01:00
Zdenek Kabelac
1f359f7558 tests: check external origin is monitored 2017-10-16 15:47:46 +02:00
Zdenek Kabelac
186a3da998 thin: monitor also external origin
Add missing monitoring for external origin LVs and add -real suffix
for UUID used for monitoring of external origin.
2017-10-16 15:47:46 +02:00
Marian Csontos
12aff59183 configure: autoreconf 2017-10-16 07:48:23 +02:00
David Teigland
6ac1e04b3a replicator: remove the code
It has not been used in a long time and is not
expected to be used further.
2017-10-13 16:20:42 -05:00
Marian Csontos
e14c0cabd9 Update WHATS_NEW 2017-10-13 13:11:01 +02:00
Heinz Mauelshagen
cf13a30eaa lvcreate: allow 100%FREE creation of "--type mirror" to work
Fixes the following case with 3PVs and 3 legs "mirror" LV:

# lvcreate -l100%FREE --type mirror -m2 vg3
  Insufficient free space for log allocation for logical volume .
  Unable to allocate extents for mirror log.

Related: rhbz1269533
2017-10-12 17:43:24 +02:00
Marian Csontos
ae55b1b20a test: "Disable" lvconvert-raid-reshape
...when running from ramdisk. This causes test failure, so it is kept on
eyes.
2017-10-12 10:55:02 +02:00
Ondrej Kozina
71261ae374 test: update fsadm-crypt to pass with legacy cryptsetup 2017-10-11 14:39:33 +02:00