mirror of git://sourceware.org/git/lvm2.git synced 2025-10-26 07:33:16 +03:00

Compare commits

...

36 Commits

Author SHA1 Message Date
David Teigland
87075b989a label_scan: use a single aio context
A new aio context was being created, io_setup(2), in each
call to rescan labels.  This is an expensive call, so keep
the aio context around to reuse in each call.
2017-10-11 10:48:37 -05:00
David Teigland
915c057b65 doc: add description of disk reading 2017-10-11 10:48:37 -05:00
David Teigland
33be293153 scanning: get async events from config setting 2017-10-11 10:48:37 -05:00
David Teigland
85a2eb2780 lvmetad_vg_lookup: use fid ref_count to fix unwanted free 2017-10-11 10:48:37 -05:00
David Teigland
5cbb2c8da1 pvscan: use new dev scanning code 2017-10-11 10:48:37 -05:00
David Teigland
daa88ae711 label_scan: use the new scanning for label_scan_invalid 2017-10-11 10:48:37 -05:00
David Teigland
142b13566f scanning: remove references to async reads
Now that label_read_data structs are used for both
sync and async reads.
2017-10-11 10:48:37 -05:00
David Teigland
f5323b08f0 config: move init of aio/scan settings 2017-10-11 10:48:37 -05:00
David Teigland
17514dca5a config: move scan settings to devices section 2017-10-11 10:48:37 -05:00
David Teigland
04a6020532 label_scan: fix label scan for independent metadata areas
When label_scan reads metadata from independent areas, not
from the devices, set a flag on the vginfo.  Use this flag
to avoid trying to rescan those devices in vg_read(), since
the metadata won't be found on them and the rescan would
wreck the lvmcache data set up by the label scan.
2017-10-11 10:48:37 -05:00
David Teigland
35537fef19 add comment describing the role of fid/fic 2017-10-11 10:48:37 -05:00
David Teigland
de5045c650 pvscan: quit if duplicates are found in label scan
Duplicate PVs are detected during label scan, and lvmetad
is disabled when they are detected.  pvscan --cache can quit
after label scan if duplicates were found rather than going
through the remaining metadata reading steps.
2017-10-11 10:48:37 -05:00
David Teigland
1d6e5b77de label_scan: get scan_size from config setting 2017-10-11 10:48:37 -05:00
David Teigland
6d8e87bc04 label_scan: use label_read_data for synchronous scans
We can read a large amount of data into label_read_data
synchronously also, and use this data in the processing path
instead of reading each bit of data from disk separately.
2017-10-11 10:48:37 -05:00
David Teigland
a522348f07 update configure for aio 2017-10-11 10:48:37 -05:00
David Teigland
d88aa7acc6 label_scan: add to vgcfgrestore
This command doesn't use process_each so it needs to
do a label_scan itself before trying to parse metadata,
which wants to know which PVs are on which devices.
2017-10-11 10:48:37 -05:00
David Teigland
b49c495b02 configure: improve libaio check 2017-10-11 10:48:37 -05:00
David Teigland
72c56d402f configure: autoreconf 2017-10-11 10:48:37 -05:00
David Teigland
f5062604ee conditional compile with AIO_SUPPORT 2017-10-11 10:48:37 -05:00
David Teigland
6ab316f072 label_scan: remove async/sync distinction from callers 2017-10-11 10:48:37 -05:00
David Teigland
f43aa2007c scanning: rewrite lvmetad_pvscan_vg to use new label reading
This is used to refresh the lvmetad content for a VG after
lvmlockd has invalidated the cached copy of the metadata in
lvmetad.
2017-10-11 10:48:37 -05:00
David Teigland
bf4edb27dc pvscan: use new label_scan data
'pvscan --cache' for scanning all devices now uses
label_scan_async and can reuse data like other commands.

'pvscan --cache dev' can't do a label_scan because it's
only allowed to read the single dev.  A label_read on
that single dev is added prior to reading the VG from it.
2017-10-11 10:48:37 -05:00
David Teigland
eca99f52cf scanning: allocate label data struct from mem pool
Use a separate alloc/free loop for ld structs from
mem pool vs the aio-related structs that are not
from pool.
2017-10-11 10:48:37 -05:00
David Teigland
cdd42dd78f dev-io: add layer around async io
Use the actual aio system calls only in dev-io.c
and use an abstraction layer above that in lvm.
2017-10-11 10:48:37 -05:00
David Teigland
e3230f140d label_scan: pull out to top level
label_scan is a primary step that a command performs,
following the standard pattern for command processing.
It was being called as a side effect of a utility/helper
function, which made it less obvious and made it easier
to be called without realizing it.  Pull it up to a
prominent top level in the sequence of primary steps.
label_scan is not something that should be done in
various low level places as needed.
2017-10-11 10:48:37 -05:00
David Teigland
0b53671206 vg_read: avoid another extraneous device read
when using async read data.

This is the short disk read that validates VG metadata
before reading the full metadata.  The label scan path
has a nearly identical function to this one in the
vg_read path.  That version was already adapted to use
the label read data, but this one was missed.
2017-10-11 10:48:37 -05:00
David Teigland
ec8624bfd9 vg_read: use the same async read code as label scan
Both for rescanning labels at the start of vg_read,
and then passing and using that reread data through
the vg_read() path to avoid rereading the same data
from disk.
2017-10-11 10:47:10 -05:00
David Teigland
c3fb3996db vg_read: improve messages and add comments
comments added where future error path handling should go
2017-10-11 10:47:10 -05:00
David Teigland
d35683e046 vg_read: new wording for functions and messages
Make the function names and messages parallel each other
on the two parallel vg reading paths (label reading and
vg_read).
2017-10-11 10:47:10 -05:00
David Teigland
ae1dad56f8 labels: move the label scan at the start of each vg_read
This moves a low level label scan to the start of vg_read.
2017-10-11 10:47:10 -05:00
David Teigland
bbf5b05c82 labels: avoid label_read when getting fmt in vg_read
When vg_read() begins, it looks up the format (fmt) for
the VG name in lvmcache, telling lvmcache_fmt_from_vgname()
to reread labels on all devices in the VG.

Avoid rereading the labels on all the devices, and trust
that lvmcache has correct information.  If the format of
the VG is not available, the calling code already
rescans labels and retries.
2017-10-11 10:47:10 -05:00
David Teigland
6ebe78e96f labels: avoid label_read when getting device from pvid
When the low levels of vg_read() are parsing VG metadata,
they see a PVID, and try to get the device for it, calling
lvmcache_device_from_pvid().  When this function found
the dev for this PVID in lvmcache, it would issue a
full label_read() on that device and verify that the
pvid/dev mapping in lvmcache is correct.

Remove this label_read() and trust that the pvid to dev
mapping in lvmcache is correct.  If metadata changed
between the initial label scan performed by the command,
and the locked vg_read(), then other code exists to
rescan labels.

(The lvmetad case already trusted the contents of lvmcache.)
2017-10-11 10:47:10 -05:00
David Teigland
4057e092f7 labels: avoid metadata area read using async read data
Copy the metadata out of the initial async read buffer
instead of performing another two synchronous reads
(first to check vgname, second to read all metadata.)
2017-10-11 10:47:10 -05:00
David Teigland
2be6ffca25 labels: avoid mda_header read using async read data
Extend the initial async read buffer size to cover all
the headers/metadata that need to be read from the device
during label scan.

Copy the mda_header from this buffer instead of performing
another synchronous read for it.
2017-10-11 10:47:10 -05:00
David Teigland
b968e33760 labels: add async label scan 2017-10-11 10:47:10 -05:00
David Teigland
6015e7f48d lvmcache: simplify metadata cache
The copy of VG metadata stored in lvmcache was not being used
in general.  It pretended to be a generic VG metadata cache,
but was not being used except for clvmd activation.  There
it was used to avoid reading from disk while devices were
suspended, i.e. in resume.

This removes the code that attempted to make this look
like a generic metadata cache, and replaces it with
something narrowly targeted to what it's actually used for.

This is a way of passing the VG from suspend to resume in
clvmd.  Since in the case of clvmd one caller can't simply
pass the same VG to both suspend and resume, suspend needs
to stash the VG somewhere that resume can grab it from.
(resume doesn't want to read it from disk since devices
are suspended.)  The lvmcache vginfo struct is used as a
convenient place to stash the VG to pass it from suspend
to resume, even though it isn't related to the lvmcache
or vginfo.  These suspended_vg* vginfo fields should
not be used or touched anywhere else; they are only to
be used for passing the VG data from suspend to resume
in clvmd.  The VG data being passed between suspend and
resume is never modified, and will only exist in the
brief period between suspend and resume in clvmd.

suspend has both old (current) and new (precommitted)
copies of the VG metadata.  It stashes both of these in
the vginfo prior to suspending devices.  When vg_commit
is successful, it sets a flag in vginfo as before,
signaling the transition from old to new metadata.

resume grabs the VG stashed by suspend.  If the vg_commit
happened, it grabs the new VG, and if the vg_commit didn't
happen it grabs the old VG.  The VG is then used to resume
LVs.

This isolates clvmd-specific code and usage from the
normal lvm vg_read code, making the code simpler and
the behavior easier to verify.

Sequence of operations:

- lv_suspend() has both vg_old and vg_new
  and stashes a copy of each onto the vginfo:
  lvmcache_save_suspended_vg(vg_old, 0);
  lvmcache_save_suspended_vg(vg_new, 1);

- vg_commit() happens, which causes all clvmd
  instances to call lvmcache_commit_metadata(vg).
  A flag is set in the vginfo indicating the
  transition from the old to new VG:
  vginfo->suspended_vg_committed = 1;

- lv_resume() needs either vg_old or vg_new
  to use in resuming LVs.  It doesn't want to
  read the VG from disk since devices are
  suspended, so it gets the VG stashed by
  lv_suspend:
  vg = lvmcache_get_suspended_vg(vgid);

If the vg_commit did not happen, suspended_vg_committed
will not be set, and in this case, lvmcache_get_suspended_vg()
will return the old VG instead of the new VG, and it will
resume LVs based on the old metadata.
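The sequence above can be condensed into a small sketch. It assumes heavily simplified, hypothetical types (`struct vg`, `struct vginfo_stash`) and helper names (`stash_save`, `stash_commit`, `stash_get`); the real lvm2 code also keeps a buffer and config tree per copy and handles freeing, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct volume_group. */
struct vg {
	int seqno;
};

/* Hypothetical stand-in for the suspended_vg_* fields in the vginfo. */
struct vginfo_stash {
	int committed;      /* mirrors suspended_vg_committed */
	struct vg *vg_old;  /* current metadata, saved by suspend */
	struct vg *vg_new;  /* precommitted metadata, saved by suspend */
};

/* suspend: stash one copy; 'precommitted' selects the old or new slot. */
void stash_save(struct vginfo_stash *s, struct vg *vg, int precommitted)
{
	assert(s != NULL && vg != NULL);
	if (precommitted)
		s->vg_new = vg;
	else
		s->vg_old = vg;
}

/* vg_commit: record the transition from old to new metadata. */
void stash_commit(struct vginfo_stash *s)
{
	assert(s != NULL);
	s->committed = 1;
}

/* resume: return the new VG if the commit happened, else the old VG. */
struct vg *stash_get(const struct vginfo_stash *s)
{
	assert(s != NULL);
	return s->committed ? s->vg_new : s->vg_old;
}
```

The point of the design is that resume never has to decide by reading disks: the single committed flag selects which stashed copy to use.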
2017-10-11 10:47:10 -05:00
46 changed files with 2784 additions and 845 deletions

69
configure vendored

@@ -704,7 +704,9 @@ FSADM
ELDFLAGS
DM_LIB_PATCHLEVEL
DMEVENTD_PATH
AIO_LIBS
DL_LIBS
AIO
DEVMAPPER
DEFAULT_USE_LVMLOCKD
DEFAULT_USE_LVMPOLLD
@@ -951,6 +953,7 @@ enable_profiling
enable_testing
enable_valgrind_pool
enable_devmapper
enable_aio
enable_lvmetad
enable_lvmpolld
enable_lvmlockd_sanlock
@@ -1689,6 +1692,7 @@ Optional Features:
--enable-testing enable testing targets in the makefile
--enable-valgrind-pool enable valgrind awareness of pools
--disable-devmapper disable LVM2 device-mapper interaction
--disable-aio disable async i/o
--enable-lvmetad enable the LVM Metadata Daemon
--enable-lvmpolld enable the LVM Polling Daemon
--enable-lvmlockd-sanlock
@@ -3177,6 +3181,7 @@ case "$host_os" in
LDDEPS="$LDDEPS .export.sym"
LIB_SUFFIX=so
DEVMAPPER=yes
AIO=yes
BUILD_LVMETAD=no
BUILD_LVMPOLLD=no
LOCKDSANLOCK=no
@@ -3196,6 +3201,7 @@ case "$host_os" in
CLDNOWHOLEARCHIVE=
LIB_SUFFIX=dylib
DEVMAPPER=yes
AIO=no
ODIRECT=no
DM_IOCTLS=no
SELINUX=no
@@ -11840,6 +11846,67 @@ $as_echo "#define DEVMAPPER_SUPPORT 1" >>confdefs.h
fi
################################################################################
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to use aio" >&5
$as_echo_n "checking whether to use aio... " >&6; }
# Check whether --enable-aio was given.
if test "${enable_aio+set}" = set; then :
enableval=$enable_aio; AIO=$enableval
fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $AIO" >&5
$as_echo "$AIO" >&6; }
if test "$AIO" = yes; then
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for io_setup in -laio" >&5
$as_echo_n "checking for io_setup in -laio... " >&6; }
if ${ac_cv_lib_aio_io_setup+:} false; then :
$as_echo_n "(cached) " >&6
else
ac_check_lib_save_LIBS=$LIBS
LIBS="-laio $LIBS"
cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h. */
/* Override any GCC internal prototype to avoid an error.
Use char because int might match the return type of a GCC
builtin and then its argument prototype would still apply. */
#ifdef __cplusplus
extern "C"
#endif
char io_setup ();
int
main ()
{
return io_setup ();
;
return 0;
}
_ACEOF
if ac_fn_c_try_link "$LINENO"; then :
ac_cv_lib_aio_io_setup=yes
else
ac_cv_lib_aio_io_setup=no
fi
rm -f core conftest.err conftest.$ac_objext \
conftest$ac_exeext conftest.$ac_ext
LIBS=$ac_check_lib_save_LIBS
fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_aio_io_setup" >&5
$as_echo "$ac_cv_lib_aio_io_setup" >&6; }
if test "x$ac_cv_lib_aio_io_setup" = xyes; then :
$as_echo "#define AIO_SUPPORT 1" >>confdefs.h
AIO_LIBS="-laio"
AIO_SUPPORT=yes
else
AIO_LIBS=
AIO_SUPPORT=no
fi
fi
################################################################################
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build LVMetaD" >&5
$as_echo_n "checking whether to build LVMetaD... " >&6; }
@@ -15787,6 +15854,8 @@ _ACEOF

View File

@@ -39,6 +39,7 @@ case "$host_os" in
LDDEPS="$LDDEPS .export.sym"
LIB_SUFFIX=so
DEVMAPPER=yes
AIO=yes
BUILD_LVMETAD=no
BUILD_LVMPOLLD=no
LOCKDSANLOCK=no
@@ -58,6 +59,7 @@ case "$host_os" in
CLDNOWHOLEARCHIVE=
LIB_SUFFIX=dylib
DEVMAPPER=yes
AIO=no
ODIRECT=no
DM_IOCTLS=no
SELINUX=no
@@ -1131,6 +1133,24 @@ if test "$DEVMAPPER" = yes; then
AC_DEFINE([DEVMAPPER_SUPPORT], 1, [Define to 1 to enable LVM2 device-mapper interaction.])
fi
################################################################################
dnl -- Disable aio
AC_MSG_CHECKING(whether to use aio)
AC_ARG_ENABLE(aio,
AC_HELP_STRING([--disable-aio],
[disable async i/o]),
AIO=$enableval)
AC_MSG_RESULT($AIO)
if test "$AIO" = yes; then
AC_CHECK_LIB(aio, io_setup,
[AC_DEFINE([AIO_SUPPORT], 1, [Define to 1 if aio is available.])
AIO_LIBS="-laio"
AIO_SUPPORT=yes],
[AIO_LIBS=
AIO_SUPPORT=no ])
fi
################################################################################
dnl -- Build lvmetad
AC_MSG_CHECKING(whether to build LVMetaD)
@@ -2072,9 +2092,11 @@ AC_SUBST(DEFAULT_USE_LVMETAD)
AC_SUBST(DEFAULT_USE_LVMPOLLD)
AC_SUBST(DEFAULT_USE_LVMLOCKD)
AC_SUBST(DEVMAPPER)
AC_SUBST(AIO)
AC_SUBST(DLM_CFLAGS)
AC_SUBST(DLM_LIBS)
AC_SUBST(DL_LIBS)
AC_SUBST(AIO_LIBS)
AC_SUBST(DMEVENTD_PATH)
AC_SUBST(DM_LIB_PATCHLEVEL)
AC_SUBST(ELDFLAGS)

View File

@@ -77,6 +77,10 @@ include $(top_builddir)/make.tmpl
LIBS += $(LVMINTERNAL_LIBS) -ldevmapper $(PTHREAD_LIBS)
CFLAGS += -fno-strict-aliasing $(EXTRA_EXEC_CFLAGS)
ifeq ("@AIO@", "yes")
LIBS += $(AIO_LIBS)
endif
INSTALL_TARGETS = \
install_clvmd

246
doc/lvm-disk-reading.txt Normal file

@@ -0,0 +1,246 @@
LVM disk reading
Reading disks happens in two phases. The first is a discovery phase,
which determines what's on the disks. The second is a working phase,
which does a particular job for the command.
Phase 1: Discovery
------------------
Read all the disks on the system to find out:
- What are the LVM devices?
- What VGs exist on those devices?
This phase is called "label scan" (although it reads and scans everything,
not just the label). It stores the information it discovers (what LVM
devices exist, and what VGs exist on them) in lvmcache. The devs/VGs info
in lvmcache is the starting point for phase two.
Phase 1 in outline:
For each device:
a. Read the first <N> KB of the device. (N is configurable.)
b. Look for the lvm label_header in the first four sectors.
If none exists, it's not an lvm device, so quit looking at it.
(By default, label_header is in the second sector.)
c. Look at the pv_header, which follows the label_header.
This tells us the location of VG metadata on the device.
There can be 0, 1 or 2 copies of VG metadata. The first
is always at the start of the device, the second (if used)
is at the end.
d. Look at the first mda_header (location came from pv_header
in the previous step). This is by default in sector 8,
4096 bytes from the start of the device. This tells us the
location of the actual VG metadata text.
e. Look at the first copy of the text VG metadata (location came
from mda_header in the previous step). This is by default
in sector 9, 4608 bytes from the start of the device.
The VG metadata is only partially analyzed to create a basic
summary of the VG.
f. Store an "info" entry in lvmcache for this device,
indicating that it is an lvm device, and store a "vginfo"
entry in lvmcache indicating the name of the VG seen
in the metadata in step e.
g. If the pv_header in step c shows a second mda_header
location at the end of the device, then read that as
in step d, and repeat steps e-f for it.
At the end of phase 1, lvmcache will have a list of devices
that belong to LVM, and a list of VG names that exist on
those devices. Each device (info struct) is associated
with the VG (vginfo struct) it is used in.
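Step b of the outline can be sketched as a search over the first four sectors of the scan buffer. This is a minimal illustration, assuming 512-byte sectors and assuming that the label_header begins with the 8-byte identifier "LABELONE"; `find_label_sector()` is a hypothetical name, and further validation of the header (such as checksum checks) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SECTOR_SIZE        512
#define LABEL_SCAN_SECTORS 4
#define LABEL_ID           "LABELONE"  /* assumed 8-byte identifier */
#define LABEL_ID_LEN       8

/*
 * Return the index (0-3) of the first sector in buf that starts with
 * the label identifier, or -1 if none of the first four sectors do.
 * Only sectors that fit entirely within len bytes are examined.
 */
int find_label_sector(const unsigned char *buf, size_t len)
{
	size_t s;

	assert(buf != NULL);
	for (s = 0; s < LABEL_SCAN_SECTORS; s++) {
		size_t off = s * SECTOR_SIZE;

		if (off + LABEL_ID_LEN > len)
			break;
		if (!memcmp(buf + off, LABEL_ID, LABEL_ID_LEN))
			return (int)s;
	}
	return -1;
}
```

A return value of -1 corresponds to "not an lvm device" in step b; with the default layout the function would return 1, the second sector.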
If the number <N> of KB read in step (a) was large enough, then
all the structs/metadata needed in steps b-e will be found
in the data buffer returned by a. If a particular struct
or piece of metadata needed in steps b-e is located outside the range
of the initial read, then that step needs to issue its own
read at the necessary location to get that bit of data.
(The optional second mda_header and VG metadata in step g
are located at the end of the device, and will always require
an additional read.)
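The copy-or-read decision described above comes down to a range check against the initial read buffer. The sketch below is hypothetical (`struct scan_buf` and `copy_from_scan_buf()` are illustrative names, not lvm2 functions): it copies the requested range out of the buffer when the buffer covers it, and otherwise reports that the caller has to issue its own read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of the data returned by the initial read. */
struct scan_buf {
	const unsigned char *data;  /* bytes read from the device */
	uint64_t start;             /* device offset of data[0] */
	size_t len;                 /* number of bytes in data */
};

/*
 * If [offset, offset + size) lies entirely inside the scan buffer,
 * copy it to out and return 1.  Otherwise return 0, meaning the
 * caller must read that range from disk itself.
 */
int copy_from_scan_buf(const struct scan_buf *sb, uint64_t offset,
		       size_t size, void *out)
{
	uint64_t rel;

	assert(sb != NULL && out != NULL);
	if (offset < sb->start)
		return 0;
	rel = offset - sb->start;
	if (rel > sb->len || size > sb->len - rel)
		return 0;
	memcpy(out, sb->data + rel, size);
	return 1;
}
```

With the default offsets from the outline, an 8 KB initial read covers both the mda_header at byte 4096 (sector 8) and the start of the metadata text at byte 4608 (sector 9), while a 4 KB read covers neither.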
Phase 1 in code:
The most relevant functions are listed for each step in the outline.
For each device:
lvmcache_label_scan()
label_scan()
_label_scan_async()
for each dev: dev = dev_iter_get(iter)
a. _label_read_async_start()
b. _label_read_data_process()
_find_label_header()
c. _label_read_data_process()
ops->read()
_text_read()
d. _read_mda_header_and_metadata()
raw_read_mda_header()
e. _read_mda_header_and_metadata()
read_metadata_location()
text_read_metadata_summary()
config_file_read_fd()
ops->read_vgsummary()
_read_vgsummary()
f. _text_read(): lvmcache_add()
[adds this device to list of lvm devices]
_read_mda_header_and_metadata(): lvmcache_update_vgname_and_id()
[adds the VG name to list of VGs]
Phase 1 in log messages:
For each device:
Scanning data from all devs async
a. Reading sectors from device <dev>
b. Parsing label and data from device <dev>
d. Copying mda header sector from <dev> ...
or if the mda_header needs to be read from disk:
Reading mda header sector from <dev> ...
e. Copying metadata summary for <dev> ...
or if the metadata needs to be read from disk:
Reading metadata summary for <dev> ...
f. lvmcache <dev> ...
Phase 2: Work
-------------
This phase carries out the operation requested by the command that was
run.
Whereas the first phase is based on iterating through each device on the
system, this phase is based on iterating through each VG name. The list
of VG names comes from phase 1, which stored the list in lvmcache to be
used by phase 2.
Some commands may need to iterate through all VG names, while others may
need to iterate through just one or two.
This phase includes locking each VG as work is done on it, so that two
commands do not interfere with each other.
Phase 2 in outline:
For each VG name:
a. Lock the VG.
b. Repeat the phase 1 scan steps for each device (PV) in this VG.
The phase 1 information in lvmcache may have changed because no VG lock
was held during phase 1. So, repeat the phase 1 steps, but only for the
devices in this VG.
c. Get the list of on-disk metadata locations for this VG.
Phase 1 created this list in lvmcache to be used here. At this
point we copy it out of lvmcache. In the simple/common case,
this is a list of devices in the VG. But, some devices may
have 0 or 2 metadata locations instead of the default 1, so it
is not always equal to the list of devices. We want to read
every copy of the metadata for this VG.
d. For each metadata location on each device in the VG
(the list from the previous step):
1) Look at the mda_header. The location of the mda_header was saved
in the lvmcache info struct by phase 1 (where it came from the
pv_header.) The mda_header tells us where the text VG metadata is
located.
2) Look at the text VG metadata. The location came from mda_header
in the previous step. The VG metadata is fully analyzed and used
to create an in-memory 'struct volume_group'.
Copying or reading the mda_header and VG metadata in steps d.1 and d.2
follows the same model as in phase 1: if the data read in scan step 2.b
covered these areas, then data is simply copied out of the buffer from
step 2.b, otherwise new reads are done.
e. Compare the copies of VG metadata that were found in each location.
If some copies are older, choose the newest one to use, and update
any older copies.
f. Update details about the devices/VG in lvmcache.
g. Pass the 'vg' struct to the command-specific code to work with.
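Step e can be pictured as picking the copy with the highest sequence number and flagging the rest for rewriting. This is only a sketch under the assumption that "newest" is decided by comparing seqno values; `struct mda_copy` and `choose_newest()` are hypothetical, and the real vg_read path does considerably more checking.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of one on-disk copy of the VG metadata. */
struct mda_copy {
	unsigned seqno;    /* metadata sequence number found in this copy */
	int needs_update;  /* set when this copy is older than the newest */
};

/*
 * Return the index of the copy with the highest seqno and mark every
 * older copy as needing an update.  Returns -1 if there are no copies.
 */
int choose_newest(struct mda_copy *copies, size_t n)
{
	size_t i, best = 0;

	assert(copies != NULL || n == 0);
	if (n == 0)
		return -1;
	for (i = 1; i < n; i++)
		if (copies[i].seqno > copies[best].seqno)
			best = i;
	for (i = 0; i < n; i++)
		copies[i].needs_update = copies[i].seqno < copies[best].seqno;
	return (int)best;
}
```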
Phase 2 in code:
The most relevant functions are listed for each step in the outline.
For each VG name:
process_each_vg()
a. vg_read()
lock_vol()
b. vg_read()
lvmcache_label_rescan_vg()
[insert phase 1 steps a-f]
c. vg_read()
create_instance()
_text_create_text_instance()
_create_vg_text_instance()
lvmcache_fid_add_mdas_vg()
[Copies mda locations from info->mdas where it was saved
by phase 1, into fid->metadata_areas_in_use. This is
the key connection between phase 1 and phase 2.]
d. dm_list_iterate_items(mda, &fid->metadata_areas_in_use)
d1. ops->vg_read()
_vg_read_raw()
raw_read_mda_header()
d2. _vg_read_raw()
text_read_metadata()
config_file_read_fd()
ops->read_vg()
_read_vg()
Phase 2 in log messages:
For each VG name:
Processing VG <name>
Reading VG <name>
b. Reading VG rereading labels for <name>
Scanning data from devs async
[insert log messages from phase 1 steps a-f]
Scanned data from <N> devs async
For each mda on each <dev> in the VG:
d. Reading VG <name> from <dev>
d.1. Copying|Reading mda header sector from <dev> ...
d.2. Copying|Reading metadata from <dev> ...

View File

@@ -28,6 +28,7 @@
#include "config.h"
#include "segtype.h"
#include "sharedlib.h"
#include "lvmcache.h"
#include <limits.h>
#include <fcntl.h>
@@ -2116,6 +2117,17 @@ static int _lv_suspend(struct cmd_context *cmd, const char *lvid_s,
if (!lv_info(cmd, lv, laopts->origin_only, &info, 0, 0))
goto_out;
/*
* Save old and new (current and precommitted) versions of the
* VG metadata for lv_resume() to use, since lv_resume can't
* read metadata given that devices are suspended. lv_resume()
* will resume LVs using the old/current metadata if the vg_commit
* did not happen (or failed), and it will resume LVs using the
* new/precommitted metadata if the vg_commit succeeded.
*/
lvmcache_save_suspended_vg(lv->vg, 0);
lvmcache_save_suspended_vg(lv_pre->vg, 1);
if (!info.exists || info.suspended) {
if (!error_if_not_suspended) {
r = 1;
@@ -2279,15 +2291,54 @@ static int _lv_resume(struct cmd_context *cmd, const char *lvid_s,
struct lv_activate_opts *laopts, int error_if_not_active,
const struct logical_volume *lv)
{
const struct logical_volume *lv_to_free = NULL;
struct volume_group *vg = NULL;
struct logical_volume *lv_found = NULL;
const union lvid *lvid;
const char *vgid;
struct lvinfo info;
int r = 0;
if (!activation())
return 1;
if (!lv && !(lv_to_free = lv = lv_from_lvid(cmd, lvid_s, 0)))
goto_out;
/*
* When called in clvmd, lvid_s is set and lv is not. We need to
* get the VG metadata without reading disks because devs are
* suspended. lv_suspend() saved old and new VG metadata for us
* to use here. If vg_commit() happened, lvmcache_get_suspended_vg
* will return the new metadata for us to use in resuming LVs.
* If vg_commit() did not happen, lvmcache_get_suspended_vg
* returns the old metadata which we use to resume LVs.
*/
if (!lv && lvid_s) {
lvid = (const union lvid *) lvid_s;
vgid = (const char *)lvid->id[0].uuid;
if ((vg = lvmcache_get_suspended_vg(vgid))) {
log_debug_activation("Resuming LVID %s found saved vg seqno %d %s", lvid_s, vg->seqno, vg->name);
if ((lv_found = find_lv_in_vg_by_lvid(vg, lvid))) {
log_debug_activation("Resuming LVID %s found saved LV %s", lvid_s, display_lvname(lv_found));
lv = lv_found;
} else
log_debug_activation("Resuming LVID %s did not find saved LV", lvid_s);
} else
log_debug_activation("Resuming LVID %s did not find saved VG", lvid_s);
/*
* resume must have been called without a preceding suspend,
* so we need to read the vg.
*/
if (!lv) {
log_debug_activation("Resuming LVID %s reading VG", lvid_s);
if (!(lv_found = lv_from_lvid(cmd, lvid_s, 0))) {
log_debug_activation("Resuming LVID %s failed to read VG", lvid_s);
goto out;
}
lv = lv_found;
}
}
if (!lv_is_origin(lv) && !lv_is_thin_volume(lv) && !lv_is_thin_pool(lv))
laopts->origin_only = 0;
@@ -2332,9 +2383,6 @@ static int _lv_resume(struct cmd_context *cmd, const char *lvid_s,
r = 1;
out:
if (lv_to_free)
release_vg(lv_to_free->vg);
return r;
}
@@ -2466,6 +2514,10 @@ int lv_activation_filter(struct cmd_context *cmd, const char *lvid_s,
int *activate_lv, const struct logical_volume *lv)
{
const struct logical_volume *lv_to_free = NULL;
struct volume_group *vg = NULL;
struct logical_volume *lv_found = NULL;
const union lvid *lvid;
const char *vgid;
int r = 0;
if (!activation()) {
@@ -2473,6 +2525,24 @@ int lv_activation_filter(struct cmd_context *cmd, const char *lvid_s,
return 1;
}
/*
* This function is called while devices are suspended,
* so try to use the copy of the vg that was saved in
* lv_suspend.
*/
if (!lv && lvid_s) {
lvid = (const union lvid *) lvid_s;
vgid = (const char *)lvid->id[0].uuid;
if ((vg = lvmcache_get_suspended_vg(vgid))) {
log_debug_activation("activation_filter for %s found saved VG seqno %d %s", lvid_s, vg->seqno, vg->name);
if ((lv_found = find_lv_in_vg_by_lvid(vg, lvid))) {
log_debug_activation("activation_filter for %s found saved LV %s", lvid_s, display_lvname(lv_found));
lv = lv_found;
}
}
}
if (!lv && !(lv_to_free = lv = lv_from_lvid(cmd, lvid_s, 0)))
goto_out;

630
lib/cache/lvmcache.c vendored

@@ -63,15 +63,41 @@ struct lvmcache_vginfo {
char *lock_type;
uint32_t mda_checksum;
size_t mda_size;
size_t vgmetadata_size;
char *vgmetadata; /* Copy of VG metadata as format_text string */
struct dm_config_tree *cft; /* Config tree created from vgmetadata */
/* Lifetime is directly tied to vgmetadata */
struct volume_group *cached_vg;
unsigned holders;
unsigned vg_use_count; /* Counter of vg reusage */
unsigned precommitted; /* Is vgmetadata live or precommitted? */
unsigned cached_vg_invalidated; /* Signal to regenerate cached_vg */
int independent_metadata_location; /* metadata read from independent areas */
/*
* The following are not related to lvmcache or vginfo,
* but are borrowing the vginfo to store the data.
*
* suspended_vg_* are used only by clvmd suspend/resume.
* In suspend, both old (current) and new (precommitted)
* metadata is saved. (Each in three forms: buffer, cft,
* and vg). In resume, if the vg was committed
* (suspended_vg_committed is set), then LVs are resumed
* using the new metadata, but if the vg wasn't committed,
* then LVs are resumed using the old metadata.
*
* suspended_vg_committed is set to 1 when clvmd gets
* LCK_VG_COMMIT from vg_commit().
*
* These fields are only used between suspend and resume
* in clvmd, and should never be used in any other way.
* The contents of this data are never changed. This
* data does not really belong in lvmcache, it's unrelated
* to lvmcache or vginfo, but it's just a convenient place
* for clvmd to stash the VG between suspend and resume
* (since the same caller isn't present to pass the VG to
* both suspend and resume in the case of clvmd.)
*
* This data is not really a "cache" of the VG, it is just
* a location to pass the VG between suspend and resume.
*/
int suspended_vg_committed;
char *suspended_vg_old_buf;
struct dm_config_tree *suspended_vg_old_cft;
struct volume_group *suspended_vg_old;
char *suspended_vg_new_buf;
struct dm_config_tree *suspended_vg_new_cft;
struct volume_group *suspended_vg_new;
};
static struct dm_hash_table *_pvid_hash = NULL;
@@ -138,73 +164,7 @@ void lvmcache_seed_infos_from_lvmetad(struct cmd_context *cmd)
_has_scanned = 1;
}
/* Volume Group metadata cache functions */
static void _free_cached_vgmetadata(struct lvmcache_vginfo *vginfo)
{
if (!vginfo || !vginfo->vgmetadata)
return;
dm_free(vginfo->vgmetadata);
vginfo->vgmetadata = NULL;
/* Release also cached config tree */
if (vginfo->cft) {
dm_config_destroy(vginfo->cft);
vginfo->cft = NULL;
}
log_debug_cache("Metadata cache: VG %s wiped.", vginfo->vgname);
release_vg(vginfo->cached_vg);
}
/*
* Cache VG metadata against the vginfo with matching vgid.
*/
static void _store_metadata(struct volume_group *vg, unsigned precommitted)
{
char uuid[64] __attribute__((aligned(8)));
struct lvmcache_vginfo *vginfo;
char *data;
size_t size;
if (!(vginfo = lvmcache_vginfo_from_vgid((const char *)&vg->id))) {
stack;
return;
}
if (!(size = export_vg_to_buffer(vg, &data))) {
stack;
_free_cached_vgmetadata(vginfo);
return;
}
/* Avoid reparsing of the same data string */
if (vginfo->vgmetadata && vginfo->vgmetadata_size == size &&
strcmp(vginfo->vgmetadata, data) == 0)
dm_free(data);
else {
_free_cached_vgmetadata(vginfo);
vginfo->vgmetadata_size = size;
vginfo->vgmetadata = data;
}
vginfo->precommitted = precommitted;
if (!id_write_format((const struct id *)vginfo->vgid, uuid, sizeof(uuid))) {
stack;
return;
}
log_debug_cache("Metadata cache: VG %s (%s) stored (%" PRIsize_t " bytes%s).",
vginfo->vgname, uuid, size,
precommitted ? ", precommitted" : "");
}
static void _update_cache_info_lock_state(struct lvmcache_info *info,
int locked,
int *cached_vgmetadata_valid)
static void _update_cache_info_lock_state(struct lvmcache_info *info, int locked)
{
int was_locked = (info->status & CACHE_LOCKED) ? 1 : 0;
@@ -212,10 +172,8 @@ static void _update_cache_info_lock_state(struct lvmcache_info *info,
* Cache becomes invalid whenever lock state changes unless
* exclusive VG_GLOBAL is held (i.e. while scanning).
*/
if (!lvmcache_vgname_is_locked(VG_GLOBAL) && (was_locked != locked)) {
if (!lvmcache_vgname_is_locked(VG_GLOBAL) && (was_locked != locked))
info->status |= CACHE_INVALID;
*cached_vgmetadata_valid = 0;
}
if (locked)
info->status |= CACHE_LOCKED;
@@ -227,14 +185,9 @@ static void _update_cache_vginfo_lock_state(struct lvmcache_vginfo *vginfo,
int locked)
{
struct lvmcache_info *info;
int cached_vgmetadata_valid = 1;
dm_list_iterate_items(info, &vginfo->infos)
_update_cache_info_lock_state(info, locked,
&cached_vgmetadata_valid);
if (!cached_vgmetadata_valid)
_free_cached_vgmetadata(vginfo);
_update_cache_info_lock_state(info, locked);
}
static void _update_cache_lock_state(const char *vgname, int locked)
@@ -247,6 +200,35 @@ static void _update_cache_lock_state(const char *vgname, int locked)
_update_cache_vginfo_lock_state(vginfo, locked);
}
static void _suspended_vg_free(struct lvmcache_vginfo *vginfo, int free_old, int free_new)
{
if (free_old) {
if (vginfo->suspended_vg_old_buf)
dm_free(vginfo->suspended_vg_old_buf);
if (vginfo->suspended_vg_old_cft)
dm_config_destroy(vginfo->suspended_vg_old_cft);
if (vginfo->suspended_vg_old)
release_vg(vginfo->suspended_vg_old);
vginfo->suspended_vg_old_buf = NULL;
vginfo->suspended_vg_old_cft = NULL;
vginfo->suspended_vg_old = NULL;
}
if (free_new) {
if (vginfo->suspended_vg_new_buf)
dm_free(vginfo->suspended_vg_new_buf);
if (vginfo->suspended_vg_new_cft)
dm_config_destroy(vginfo->suspended_vg_new_cft);
if (vginfo->suspended_vg_new)
release_vg(vginfo->suspended_vg_new);
vginfo->suspended_vg_new_buf = NULL;
vginfo->suspended_vg_new_cft = NULL;
vginfo->suspended_vg_new = NULL;
}
}
static void _drop_metadata(const char *vgname, int drop_precommitted)
{
struct lvmcache_vginfo *vginfo;
@@ -255,25 +237,98 @@ static void _drop_metadata(const char *vgname, int drop_precommitted)
if (!(vginfo = lvmcache_vginfo_from_vgname(vgname, NULL)))
return;
/*
* Invalidate cached PV labels.
* If cached precommitted metadata exists that means we
* already invalidated the PV labels (before caching it)
* and we must not do it again.
*/
if (!drop_precommitted && vginfo->precommitted && !vginfo->vgmetadata)
log_error(INTERNAL_ERROR "metadata commit (or revert) missing before "
"dropping metadata from cache.");
if (drop_precommitted || !vginfo->precommitted)
if (drop_precommitted)
dm_list_iterate_items(info, &vginfo->infos)
info->status |= CACHE_INVALID;
_free_cached_vgmetadata(vginfo);
/* VG revert */
if (drop_precommitted)
vginfo->precommitted = 0;
_suspended_vg_free(vginfo, 0, 1);
else
_suspended_vg_free(vginfo, 1, 1);
}
void lvmcache_save_suspended_vg(struct volume_group *vg, int precommitted)
{
struct lvmcache_vginfo *vginfo;
struct format_instance *fid;
struct format_instance_ctx fic;
struct volume_group *susp_vg = NULL;
struct dm_config_tree *susp_cft = NULL;
char *susp_buf = NULL;
size_t size;
int new = precommitted;
int old = !precommitted;
if (!(vginfo = lvmcache_vginfo_from_vgid((const char *)&vg->id)))
goto_bad;
/* already saved */
if (old && vginfo->suspended_vg_old &&
(vginfo->suspended_vg_old->seqno == vg->seqno))
return;
/* already saved */
if (new && vginfo->suspended_vg_new &&
(vginfo->suspended_vg_new->seqno == vg->seqno))
return;
_suspended_vg_free(vginfo, old, new);
if (!(size = export_vg_to_buffer(vg, &susp_buf)))
goto_bad;
fic.type = FMT_INSTANCE_MDAS | FMT_INSTANCE_AUX_MDAS;
fic.context.vg_ref.vg_name = vginfo->vgname;
fic.context.vg_ref.vg_id = vginfo->vgid;
if (!(fid = vginfo->fmt->ops->create_instance(vginfo->fmt, &fic)))
goto_bad;
if (!(susp_cft = config_tree_from_string_without_dup_node_check(susp_buf)))
goto_bad;
if (!(susp_vg = import_vg_from_config_tree(susp_cft, fid)))
goto_bad;
if (old) {
vginfo->suspended_vg_old_buf = susp_buf;
vginfo->suspended_vg_old_cft = susp_cft;
vginfo->suspended_vg_old = susp_vg;
log_debug_cache("lvmcache saved suspended vg old seqno %d %s", vg->seqno, vg->name);
} else {
vginfo->suspended_vg_new_buf = susp_buf;
vginfo->suspended_vg_new_cft = susp_cft;
vginfo->suspended_vg_new = susp_vg;
log_debug_cache("lvmcache saved suspended vg new seqno %d %s", vg->seqno, vg->name);
}
return;
bad:
_suspended_vg_free(vginfo, old, new);
log_debug_cache("lvmcache failed to save suspended pre %d vg %s", precommitted, vg->name);
}
struct volume_group *lvmcache_get_suspended_vg(const char *vgid)
{
struct lvmcache_vginfo *vginfo;
if (!(vginfo = lvmcache_vginfo_from_vgid(vgid)))
return_NULL;
if (vginfo->suspended_vg_committed)
return vginfo->suspended_vg_new;
else
return vginfo->suspended_vg_old;
}
void lvmcache_drop_suspended_vg(struct volume_group *vg)
{
struct lvmcache_vginfo *vginfo;
if (!(vginfo = lvmcache_vginfo_from_vgid((const char *)&vg->id)))
return;
_suspended_vg_free(vginfo, 1, 1);
}
/*
@@ -288,11 +343,7 @@ void lvmcache_commit_metadata(const char *vgname)
if (!(vginfo = lvmcache_vginfo_from_vgname(vgname, NULL)))
return;
if (vginfo->precommitted) {
log_debug_cache("Precommitted metadata cache: VG %s upgraded to committed.",
vginfo->vgname);
vginfo->precommitted = 0;
}
vginfo->suspended_vg_committed = 1;
}
void lvmcache_drop_metadata(const char *vgname, int drop_precommitted)
@@ -542,7 +593,6 @@ const struct format_type *lvmcache_fmt_from_vgname(struct cmd_context *cmd,
{
struct lvmcache_vginfo *vginfo;
struct lvmcache_info *info;
struct label *label;
struct dm_list *devh, *tmp;
struct dm_list devs;
struct device_list *devl;
@@ -587,7 +637,7 @@ const struct format_type *lvmcache_fmt_from_vgname(struct cmd_context *cmd,
dm_list_iterate_safe(devh, tmp, &devs) {
devl = dm_list_item(devh, struct device_list);
(void) label_read(devl->dev, &label, UINT64_C(0));
label_read(devl->dev, NULL, UINT64_C(0));
dm_list_del(&devl->list);
dm_free(devl);
}
@@ -675,18 +725,6 @@ static int _info_is_valid(struct lvmcache_info *info)
return 1;
}
static int _vginfo_is_valid(struct lvmcache_vginfo *vginfo)
{
struct lvmcache_info *info;
/* Invalid if any info is invalid */
dm_list_iterate_items(info, &vginfo->infos)
if (!_info_is_valid(info))
return 0;
return 1;
}
/* vginfo is invalid if it does not contain at least one valid info */
static int _vginfo_is_invalid(struct lvmcache_vginfo *vginfo)
{
@@ -752,7 +790,7 @@ char *lvmcache_vgname_from_pvid(struct cmd_context *cmd, const char *pvid)
struct lvmcache_info *info;
char *vgname;
if (!lvmcache_device_from_pvid(cmd, (const struct id *)pvid, NULL, NULL)) {
if (!lvmcache_device_from_pvid(cmd, (const struct id *)pvid, NULL)) {
log_error("Couldn't find device with uuid %s.", pvid);
return NULL;
}
@@ -768,19 +806,42 @@ char *lvmcache_vgname_from_pvid(struct cmd_context *cmd, const char *pvid)
return vgname;
}
static void _rescan_entry(struct lvmcache_info *info)
/*
* FIXME: get rid of the CACHE_INVALID state and rescanning
* infos with that flag. The code should just know which devices
* need scanning and when.
*/
static int _label_scan_invalid(struct cmd_context *cmd)
{
struct label *label;
struct dm_list devs;
struct dm_hash_node *n;
struct device_list *devl;
struct lvmcache_info *info;
int dev_count = 0;
int ret;
if (info->status & CACHE_INVALID)
(void) label_read(info->dev, &label, UINT64_C(0));
}
dm_list_init(&devs);
static int _scan_invalid(void)
{
dm_hash_iter(_pvid_hash, (dm_hash_iterate_fn) _rescan_entry);
dm_hash_iterate(n, _pvid_hash) {
if (!(info = dm_hash_get_data(_pvid_hash, n)))
continue;
return 1;
if (!(info->status & CACHE_INVALID))
continue;
if (!(devl = dm_pool_zalloc(cmd->mem, sizeof(*devl))))
return_0;
devl->dev = info->dev;
dm_list_add(&devs, &devl->list);
dev_count++;
}
log_debug_cache("Scanning %d devs with invalid info.", dev_count);
ret = label_scan_devs(cmd, &devs);
return ret;
}
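The collection step in `_label_scan_invalid` above (walk every cached info, keep only those flagged `CACHE_INVALID`, hand that list to the scanner) can be sketched on its own. This is a simplified illustration, not lvm2 code: `struct info`, `collect_invalid`, and `make_valid` are hypothetical stand-ins, and the flag value is an assumption made only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed flag value for illustration; the real definition lives in lvm2. */
#define CACHE_INVALID 0x1u

/* Hypothetical, reduced stand-in for struct lvmcache_info. */
struct info {
	const char *dev;	/* device name instead of struct device * */
	uint32_t status;
};

/*
 * Copy the devices whose info is flagged invalid into 'out'.
 * Returns the number of devices collected (at most out_max).
 */
static size_t collect_invalid(const struct info *infos, size_t n,
			      const char **out, size_t out_max)
{
	size_t i, count = 0;

	for (i = 0; i < n && count < out_max; i++) {
		if (!(infos[i].status & CACHE_INVALID))
			continue;
		out[count++] = infos[i].dev;
	}

	return count;
}

/* Same idea as lvmcache_make_valid(): clear the flag after a rescan. */
static void make_valid(struct info *info)
{
	info->status &= ~CACHE_INVALID;
}
```

Only the flagged devices are rescanned, so a device whose flag has been cleared drops out of the next pass without touching its other status bits.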
/*
@@ -1095,17 +1156,77 @@ next:
goto next;
}
/*
* The initial label_scan at the start of the command is done without
* holding VG locks. Then for each VG identified during the label_scan,
* vg_read(vgname) is called while holding the VG lock. The labels
* and metadata on this VG's devices could have changed between the
* initial unlocked label_scan and the current vg_read(). So, we reread
* the labels/metadata for each device in the VG now that we hold the
* lock, and use this for processing the VG.
*
* FIXME: In some cases, the data read by label_scan may be fine, and not
* need to be reread here. e.g. a reporting command, possibly with a
* special option, could skip this second reread. Or, we could look
* at the VG seqno in each copy of the metadata read in the first label
* scan, and if they all match, consider it good enough to use for
* reporting without rereading it. (A command modifying the VG would
* always want to reread while the lock is held before modifying.)
*/
int lvmcache_label_rescan_vg(struct cmd_context *cmd, const char *vgname, const char *vgid)
{
struct dm_list devs;
struct device_list *devl;
struct lvmcache_vginfo *vginfo;
struct lvmcache_info *info;
if (lvmetad_used())
return 1;
dm_list_init(&devs);
if (!(vginfo = lvmcache_vginfo_from_vgname(vgname, vgid)))
return_0;
/*
* When the VG metadata is from an independent location,
* then rescanning the devices in the VG won't find the
* metadata, and will destroy the vginfo/info associations
* that were created during label scan when the
* independent locations were read.
*/
if (vginfo->independent_metadata_location)
return 1;
dm_list_iterate_items(info, &vginfo->infos) {
if (!(devl = dm_malloc(sizeof(*devl)))) {
log_error("device_list element allocation failed");
return 0;
}
devl->dev = info->dev;
dm_list_add(&devs, &devl->list);
}
label_scan_devs(cmd, &devs);
/*
* TODO: grab vginfo again, and compare vginfo->infos
* to what was found above before rereading labels.
* If there are any info->devs now that were not in the
* first devs list, then do label_read on those also.
*/
return 1;
}
int lvmcache_label_scan(struct cmd_context *cmd)
{
struct dm_list del_cache_devs;
struct dm_list add_cache_devs;
struct lvmcache_info *info;
struct device_list *devl;
struct label *label;
struct dev_iter *iter;
struct device *dev;
struct format_type *fmt;
int dev_count = 0;
int r = 0;
@@ -1123,34 +1244,40 @@ int lvmcache_label_scan(struct cmd_context *cmd)
goto out;
}
/*
* Scan devices whose info struct has the INVALID flag set.
* When scanning has read the pv_header, mda_header and
* mda locations, it will clear the INVALID flag (via
* lvmcache_make_valid).
*/
if (_has_scanned && !_force_label_scan) {
r = _scan_invalid();
r = _label_scan_invalid(cmd);
goto out;
}
if (_force_label_scan && (cmd->full_filter && !cmd->full_filter->use_count) && !refresh_filters(cmd))
goto_out;
if (!cmd->full_filter || !(iter = dev_iter_create(cmd->full_filter, _force_label_scan))) {
log_error("dev_iter creation failed");
if (!cmd->full_filter) {
log_error("label scan is missing full filter");
goto out;
}
log_very_verbose("Scanning device labels");
/*
* Duplicates found during this label scan are added to _found_duplicate_devs.
*/
_destroy_duplicate_device_list(&_found_duplicate_devs);
while ((dev = dev_iter_get(iter))) {
(void) label_read(dev, &label, UINT64_C(0));
dev_count++;
}
dev_iter_destroy(iter);
log_very_verbose("Scanned %d device labels", dev_count);
/*
* Do the actual scanning. This populates lvmcache
* with infos/vginfos based on reading headers from
* each device, and a vg summary from each mda.
*
* Note that this will *skip* scanning a device if
* an info struct already exists in lvmcache for
* the device.
*/
label_scan(cmd);
/*
* _choose_preferred_devs() returns:
@@ -1184,7 +1311,7 @@ int lvmcache_label_scan(struct cmd_context *cmd)
dm_list_iterate_items(devl, &add_cache_devs) {
log_debug_cache("Rescan preferred device %s for lvmcache", dev_name(devl->dev));
(void) label_read(devl->dev, &label, UINT64_C(0));
label_read(devl->dev, NULL, UINT64_C(0));
}
dm_list_splice(&_unused_duplicate_devs, &del_cache_devs);
@@ -1216,129 +1343,12 @@ int lvmcache_label_scan(struct cmd_context *cmd)
return r;
}
struct volume_group *lvmcache_get_vg(struct cmd_context *cmd, const char *vgname,
const char *vgid, unsigned precommitted)
{
struct lvmcache_vginfo *vginfo;
struct volume_group *vg = NULL;
struct format_instance *fid;
struct format_instance_ctx fic;
/*
* We currently do not store precommitted metadata in lvmetad at
* all. This means that any request for precommitted metadata is served
* using the classic scanning mechanics, and read from disk or from
* lvmcache.
*/
if (lvmetad_used() && !precommitted) {
/* Still serve the locally cached VG if available */
if (vgid && (vginfo = lvmcache_vginfo_from_vgid(vgid)) &&
vginfo->vgmetadata && (vg = vginfo->cached_vg))
goto out;
return lvmetad_vg_lookup(cmd, vgname, vgid);
}
if (!vgid || !(vginfo = lvmcache_vginfo_from_vgid(vgid)) || !vginfo->vgmetadata)
return NULL;
if (!_vginfo_is_valid(vginfo))
return NULL;
/*
* Don't return cached data if either:
* (i) precommitted metadata is requested but we don't have it cached
* - caller should read it off disk;
* (ii) live metadata is requested but we have precommitted metadata cached
* and no devices are suspended so caller may read it off disk.
*
* If live metadata is requested but we have precommitted metadata cached
* and devices are suspended, we assume this precommitted metadata has
* already been preloaded and committed so it's OK to return it as live.
* Note that we do not clear the PRECOMMITTED flag.
*/
if ((precommitted && !vginfo->precommitted) ||
(!precommitted && vginfo->precommitted && !critical_section()))
return NULL;
/* Use already-cached VG struct when available */
if ((vg = vginfo->cached_vg) && !vginfo->cached_vg_invalidated)
goto out;
release_vg(vginfo->cached_vg);
fic.type = FMT_INSTANCE_MDAS | FMT_INSTANCE_AUX_MDAS;
fic.context.vg_ref.vg_name = vginfo->vgname;
fic.context.vg_ref.vg_id = vgid;
if (!(fid = vginfo->fmt->ops->create_instance(vginfo->fmt, &fic)))
return_NULL;
/* Build config tree from vgmetadata, if not yet cached */
if (!vginfo->cft &&
!(vginfo->cft =
config_tree_from_string_without_dup_node_check(vginfo->vgmetadata)))
goto_bad;
if (!(vg = import_vg_from_config_tree(vginfo->cft, fid)))
goto_bad;
/* Cache VG struct for reuse */
vginfo->cached_vg = vg;
vginfo->holders = 1;
vginfo->vg_use_count = 0;
vginfo->cached_vg_invalidated = 0;
vg->vginfo = vginfo;
if (!dm_pool_lock(vg->vgmem, detect_internal_vg_cache_corruption()))
goto_bad;
out:
vginfo->holders++;
vginfo->vg_use_count++;
log_debug_cache("Using cached %smetadata for VG %s with %u holder(s).",
vginfo->precommitted ? "pre-committed " : "",
vginfo->vgname, vginfo->holders);
return vg;
bad:
_free_cached_vgmetadata(vginfo);
return NULL;
}
// #if 0
int lvmcache_vginfo_holders_dec_and_test_for_zero(struct lvmcache_vginfo *vginfo)
{
log_debug_cache("VG %s decrementing %d holder(s) at %p.",
vginfo->cached_vg->name, vginfo->holders, vginfo->cached_vg);
if (--vginfo->holders)
return 0;
if (vginfo->vg_use_count > 1)
log_debug_cache("VG %s reused %d times.",
vginfo->cached_vg->name, vginfo->vg_use_count);
/* Debug: perform crc check only when it's been used more than once */
if (!dm_pool_unlock(vginfo->cached_vg->vgmem,
detect_internal_vg_cache_corruption() &&
(vginfo->vg_use_count > 1)))
stack;
vginfo->cached_vg->vginfo = NULL;
vginfo->cached_vg = NULL;
return 1;
}
// #endif
int lvmcache_get_vgnameids(struct cmd_context *cmd, int include_internal,
struct dm_list *vgnameids)
{
struct vgnameid_list *vgnl;
struct lvmcache_vginfo *vginfo;
lvmcache_label_scan(cmd);
dm_list_iterate_items(vginfo, &_vginfos) {
if (!include_internal && is_orphan_vg(vginfo->vgname))
continue;
@@ -1443,61 +1453,45 @@ struct dm_list *lvmcache_get_pvids(struct cmd_context *cmd, const char *vgname,
return pvids;
}
static struct device *_device_from_pvid(const struct id *pvid,
uint64_t *label_sector)
int lvmcache_get_vg_devs(struct cmd_context *cmd,
struct lvmcache_vginfo *vginfo,
struct dm_list *devs)
{
struct lvmcache_info *info;
struct device_list *devl;
dm_list_iterate_items(info, &vginfo->infos) {
if (!(devl = dm_pool_zalloc(cmd->mem, sizeof(*devl))))
return_0;
devl->dev = info->dev;
dm_list_add(devs, &devl->list);
}
return 1;
}
static struct device *_device_from_pvid(const struct id *pvid, uint64_t *label_sector)
{
struct lvmcache_info *info;
struct label *label;
if ((info = lvmcache_info_from_pvid((const char *) pvid, NULL, 0))) {
if (lvmetad_used()) {
if (info->label && label_sector)
*label_sector = info->label->sector;
return info->dev;
}
if (label_read(info->dev, &label, UINT64_C(0))) {
info = (struct lvmcache_info *) label->info;
if (id_equal(pvid, (struct id *) &info->dev->pvid)) {
if (label_sector)
*label_sector = label->sector;
return info->dev;
}
}
if (info->label && label_sector)
*label_sector = info->label->sector;
return info->dev;
}
return NULL;
}
struct device *lvmcache_device_from_pvid(struct cmd_context *cmd, const struct id *pvid,
unsigned *scan_done_once, uint64_t *label_sector)
struct device *lvmcache_device_from_pvid(struct cmd_context *cmd, const struct id *pvid, uint64_t *label_sector)
{
struct device *dev;
/* Already cached ? */
dev = _device_from_pvid(pvid, label_sector);
if (dev)
return dev;
lvmcache_label_scan(cmd);
/* Try again */
dev = _device_from_pvid(pvid, label_sector);
if (dev)
return dev;
if (critical_section() || (scan_done_once && *scan_done_once))
return NULL;
lvmcache_force_next_label_scan();
lvmcache_label_scan(cmd);
if (scan_done_once)
*scan_done_once = 1;
/* Try again */
dev = _device_from_pvid(pvid, label_sector);
if (dev)
return dev;
log_debug_devs("No device with uuid %s.", (const char *)pvid);
return NULL;
}
@@ -1505,7 +1499,6 @@ const char *lvmcache_pvid_from_devname(struct cmd_context *cmd,
const char *devname)
{
struct device *dev;
struct label *label;
if (!(dev = dev_cache_get(devname, cmd->filter))) {
log_error("%s: Couldn't find device. Check your filters?",
@@ -1513,7 +1506,7 @@ const char *lvmcache_pvid_from_devname(struct cmd_context *cmd,
return NULL;
}
if (!(label_read(dev, &label, UINT64_C(0))))
if (!label_read(dev, NULL, UINT64_C(0)))
return NULL;
return dev->pvid;
@@ -1535,8 +1528,6 @@ static int _free_vginfo(struct lvmcache_vginfo *vginfo)
struct lvmcache_vginfo *primary_vginfo, *vginfo2;
int r = 1;
_free_cached_vgmetadata(vginfo);
vginfo2 = primary_vginfo = lvmcache_vginfo_from_vgname(vginfo->vgname, NULL);
if (vginfo == primary_vginfo) {
@@ -1559,6 +1550,7 @@ static int _free_vginfo(struct lvmcache_vginfo *vginfo)
dm_free(vginfo->system_id);
dm_free(vginfo->vgname);
dm_free(vginfo->creation_host);
_suspended_vg_free(vginfo, 1, 1);
if (*vginfo->vgid && _vgid_hash &&
lvmcache_vginfo_from_vgid(vginfo->vgid) == vginfo)
@@ -1997,12 +1989,6 @@ int lvmcache_update_vgname_and_id(struct lvmcache_info *info, struct lvmcache_vg
!is_orphan_vg(info->vginfo->vgname) && critical_section())
return 1;
/* If making a PV into an orphan, any cached VG metadata may become
* invalid, incorrectly still referencing device structs.
* (Example: pvcreate -ff) */
if (is_orphan_vg(vgname) && info->vginfo && !is_orphan_vg(info->vginfo->vgname))
info->vginfo->cached_vg_invalidated = 1;
/* If moving PV from orphan to real VG, always mark it valid */
if (!is_orphan_vg(vgname))
info->status &= ~CACHE_INVALID;
@@ -2040,10 +2026,6 @@ int lvmcache_update_vg(struct volume_group *vg, unsigned precommitted)
return_0;
}
/* store text representation of vg to cache */
if (vg->cmd->current_settings.cache_vgmetadata)
_store_metadata(vg, precommitted);
return 1;
}
@@ -2607,6 +2589,10 @@ struct label *lvmcache_get_label(struct lvmcache_info *info) {
return info->label;
}
/*
* After label_scan reads pv_header, mda_header and mda locations
* from a PV, it clears the INVALID flag.
*/
void lvmcache_make_valid(struct lvmcache_info *info) {
info->status &= ~CACHE_INVALID;
}
@@ -2662,6 +2648,14 @@ int lvmcache_vgid_is_cached(const char *vgid) {
return 1;
}
void lvmcache_set_independent_location(const char *vgname)
{
struct lvmcache_vginfo *vginfo;
if ((vginfo = lvmcache_vginfo_from_vgname(vgname, NULL)))
vginfo->independent_metadata_location = 1;
}
/*
* Return true iff it is impossible to find out from this info alone whether the
* PV in question is or is not an orphan.

19
lib/cache/lvmcache.h vendored

@@ -74,6 +74,7 @@ void lvmcache_destroy(struct cmd_context *cmd, int retain_orphans, int reset);
*/
void lvmcache_force_next_label_scan(void);
int lvmcache_label_scan(struct cmd_context *cmd);
int lvmcache_label_rescan_vg(struct cmd_context *cmd, const char *vgname, const char *vgid);
/* Add/delete a device */
struct lvmcache_info *lvmcache_add(struct labeller *labeller, const char *pvid,
@@ -105,10 +106,8 @@ struct lvmcache_vginfo *lvmcache_vginfo_from_vgid(const char *vgid);
struct lvmcache_info *lvmcache_info_from_pvid(const char *pvid, struct device *dev, int valid_only);
const char *lvmcache_vgname_from_vgid(struct dm_pool *mem, const char *vgid);
const char *lvmcache_vgid_from_vgname(struct cmd_context *cmd, const char *vgname);
struct device *lvmcache_device_from_pvid(struct cmd_context *cmd, const struct id *pvid,
unsigned *scan_done_once, uint64_t *label_sector);
const char *lvmcache_pvid_from_devname(struct cmd_context *cmd,
const char *devname);
struct device *lvmcache_device_from_pvid(struct cmd_context *cmd, const struct id *pvid, uint64_t *label_sector);
const char *lvmcache_pvid_from_devname(struct cmd_context *cmd, const char *devname);
char *lvmcache_vgname_from_pvid(struct cmd_context *cmd, const char *pvid);
const char *lvmcache_vgname_from_info(struct lvmcache_info *info);
const struct format_type *lvmcache_fmt_from_info(struct lvmcache_info *info);
@@ -134,9 +133,6 @@ int lvmcache_get_vgnameids(struct cmd_context *cmd, int include_internal,
struct dm_list *lvmcache_get_pvids(struct cmd_context *cmd, const char *vgname,
const char *vgid);
/* Returns cached volume group metadata. */
struct volume_group *lvmcache_get_vg(struct cmd_context *cmd, const char *vgname,
const char *vgid, unsigned precommitted);
void lvmcache_drop_metadata(const char *vgname, int drop_precommitted);
void lvmcache_commit_metadata(const char *vgname);
@@ -215,4 +211,13 @@ void lvmcache_remove_unchosen_duplicate(struct device *dev);
int lvmcache_pvid_in_unchosen_duplicates(const char *pvid);
void lvmcache_save_suspended_vg(struct volume_group *vg, int precommitted);
struct volume_group *lvmcache_get_suspended_vg(const char *vgid);
void lvmcache_drop_suspended_vg(struct volume_group *vg);
int lvmcache_get_vg_devs(struct cmd_context *cmd,
struct lvmcache_vginfo *vginfo,
struct dm_list *devs);
void lvmcache_set_independent_location(const char *vgname);
#endif
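The `lvmcache_save_suspended_vg` / `lvmcache_get_suspended_vg` pair declared above keeps two copies of a VG while devices are suspended: the live ("old") metadata and the precommitted ("new") metadata, with a committed flag deciding which one a lookup returns. A minimal sketch of that slot selection follows. It is not lvm2 code: `struct susp_slots`, `save_suspended`, and `get_suspended` are hypothetical names, and plain strings stand in for `struct volume_group`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced version of the vginfo fields used by the real code. */
struct susp_slots {
	const char *old_vg;	/* stand-in for suspended_vg_old */
	const char *new_vg;	/* stand-in for suspended_vg_new */
	int committed;		/* stand-in for suspended_vg_committed */
};

/* Precommitted metadata goes to the "new" slot, live metadata to "old". */
static void save_suspended(struct susp_slots *s, const char *vg, int precommitted)
{
	if (precommitted)
		s->new_vg = vg;
	else
		s->old_vg = vg;
}

/* Once the commit has happened, lookups switch from the old copy to the new. */
static const char *get_suspended(const struct susp_slots *s)
{
	return s->committed ? s->new_vg : s->old_vg;
}
```

Keeping both copies means a revert only has to discard the "new" slot, which matches the `_suspended_vg_free(vginfo, 0, 1)` call on the revert path in `_drop_metadata`.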

483
lib/cache/lvmetad.c vendored

@@ -39,7 +39,7 @@ static int64_t _lvmetad_update_timeout;
static int _found_lvm1_metadata = 0;
static struct volume_group *lvmetad_pvscan_vg(struct cmd_context *cmd, struct volume_group *vg);
static struct volume_group *lvmetad_pvscan_vg(struct cmd_context *cmd, struct volume_group *vg, const char *vgid, struct format_type *fmt);
static uint64_t _monotonic_seconds(void)
{
@@ -1090,14 +1090,17 @@ struct volume_group *lvmetad_vg_lookup(struct cmd_context *cmd, const char *vgna
* invalidated the cached vg.
*/
if (rescan) {
if (!(vg2 = lvmetad_pvscan_vg(cmd, vg))) {
if (!(vg2 = lvmetad_pvscan_vg(cmd, vg, vgid, fmt))) {
log_debug_lvmetad("VG %s from lvmetad not found during rescan.", vgname);
fid = NULL;
release_vg(vg);
vg = NULL;
goto out;
}
fid->ref_count++;
release_vg(vg);
fid->ref_count--;
fmt->ops->destroy_instance(fid);
vg = vg2;
fid = vg2->fid;
}
@@ -1105,14 +1108,14 @@ struct volume_group *lvmetad_vg_lookup(struct cmd_context *cmd, const char *vgna
dm_list_iterate_items(pvl, &vg->pvs) {
if (!_pv_update_struct_pv(pvl->pv, fid)) {
vg = NULL;
goto_out; /* FIXME error path */
goto_out; /* FIXME: use an error path that disables lvmetad */
}
}
dm_list_iterate_items(pvl, &vg->pvs_outdated) {
if (!_pv_update_struct_pv(pvl->pv, fid)) {
vg = NULL;
goto_out; /* FIXME error path */
goto_out; /* FIXME: use an error path that disables lvmetad */
}
}
@@ -1756,6 +1759,7 @@ int lvmetad_pv_gone_by_dev(struct device *dev)
*/
struct _lvmetad_pvscan_baton {
struct cmd_context *cmd;
struct volume_group *vg;
struct format_instance *fid;
};
@@ -1763,10 +1767,14 @@ struct _lvmetad_pvscan_baton {
static int _lvmetad_pvscan_single(struct metadata_area *mda, void *baton)
{
struct _lvmetad_pvscan_baton *b = baton;
struct device *mda_dev = mda_get_device(mda);
struct label_read_data *ld;
struct volume_group *vg;
ld = get_label_read_data(b->cmd, mda_dev);
if (mda_is_ignored(mda) ||
!(vg = mda->ops->vg_read(b->fid, "", mda, NULL, NULL, 1)))
!(vg = mda->ops->vg_read(b->fid, "", mda, ld, NULL, NULL)))
return 1;
/* FIXME Also ensure contents match etc. */
@@ -1778,6 +1786,37 @@ static int _lvmetad_pvscan_single(struct metadata_area *mda, void *baton)
return 1;
}
/*
* FIXME: handle errors and do proper comparison of metadata from each area
* like vg_read and fall back to real vg_read from disk if there's any problem.
*/
static int _lvmetad_pvscan_vg_single(struct metadata_area *mda, void *baton)
{
struct _lvmetad_pvscan_baton *b = baton;
struct device *mda_dev = mda_get_device(mda);
struct label_read_data *ld;
struct volume_group *vg = NULL;
if (mda_is_ignored(mda))
return 1;
ld = get_label_read_data(b->cmd, mda_dev);
if (!(vg = mda->ops->vg_read(b->fid, "", mda, ld, NULL, NULL)))
return 1;
if (!b->vg)
b->vg = vg;
else if (vg->seqno > b->vg->seqno) {
release_vg(b->vg);
b->vg = vg;
} else
release_vg(vg);
return 1;
}
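The selection rule in `_lvmetad_pvscan_vg_single` above (keep whichever metadata copy has the highest seqno and release the others) can be shown in isolation. This is a simplified sketch, not lvm2 code: `struct vg_copy` and `keep_newest` are hypothetical names, and the `released` field only marks where the real code would call `release_vg()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct volume_group; only seqno matters here. */
struct vg_copy {
	uint32_t seqno;
	int released;	/* set to 1 where the real code would call release_vg() */
};

/*
 * 'best' is the copy kept so far (NULL if none yet); 'candidate' is the
 * copy just read from a metadata area. Returns the copy to keep and marks
 * the other one released. On a tie the earlier copy is kept, matching the
 * strict '>' comparison in the function above.
 */
static struct vg_copy *keep_newest(struct vg_copy *best, struct vg_copy *candidate)
{
	if (!best)
		return candidate;

	if (candidate->seqno > best->seqno) {
		best->released = 1;
		return candidate;
	}

	candidate->released = 1;
	return best;
}
```

Calling `keep_newest` once per metadata area leaves exactly one unreleased copy, so the caller has a single struct to hand back or free.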
/*
* The lock manager may detect that the vg cached in lvmetad is out of date,
* due to something like an lvcreate from another host.
@@ -1787,41 +1826,41 @@ static int _lvmetad_pvscan_single(struct metadata_area *mda, void *baton)
* the VG, and that PV may have been reused for another VG.
*/
static struct volume_group *lvmetad_pvscan_vg(struct cmd_context *cmd, struct volume_group *vg)
static struct volume_group *lvmetad_pvscan_vg(struct cmd_context *cmd, struct volume_group *vg,
const char *vgid, struct format_type *fmt)
{
char pvid_s[ID_LEN + 1] __attribute__((aligned(8)));
char uuid[64] __attribute__((aligned(8)));
struct label *label;
struct volume_group *vg_ret = NULL;
struct dm_config_tree *vgmeta_ret = NULL;
struct dm_config_tree *vgmeta;
struct pv_list *pvl, *pvl_new;
struct device_list *devl, *devl_new, *devlsafe;
struct device_list *devl, *devlsafe;
struct dm_list pvs_scan;
struct dm_list pvs_drop;
struct dm_list pvs_new;
struct lvmcache_vginfo *vginfo = NULL;
struct lvmcache_info *info = NULL;
struct format_instance *fid;
struct format_instance_ctx fic = { .type = 0 };
struct _lvmetad_pvscan_baton baton;
struct volume_group *save_vg;
struct dm_config_tree *save_meta;
struct device *save_dev = NULL;
uint32_t save_seqno = 0;
int missing_devs = 0;
int check_new_pvs = 0;
int found_new_pvs = 0;
int retried_reads = 0;
int found;
save_vg = NULL;
save_meta = NULL;
save_dev = NULL;
save_seqno = 0;
dm_list_init(&pvs_scan);
dm_list_init(&pvs_drop);
dm_list_init(&pvs_new);
log_debug_lvmetad("Rescanning VG %s (seqno %u).", vg->name, vg->seqno);
log_debug_lvmetad("Rescan VG %s to update lvmetad (seqno %u).", vg->name, vg->seqno);
/*
* Another host may have added a PV to the VG, and some
* commands do not always populate their lvmcache with
* all devs from lvmetad, so they would fail to find
* the new PV when scanning the VG. So make sure this
* command knows about all PVs from lvmetad.
* Make sure this command knows about all PVs from lvmetad.
*/
lvmcache_seed_infos_from_lvmetad(cmd);
@@ -1836,54 +1875,111 @@ static struct volume_group *lvmetad_pvscan_vg(struct cmd_context *cmd, struct vo
dm_list_add(&pvs_scan, &devl->list);
}
scan_more:
/*
* Rescan labels/metadata only from devs that we previously
* saw in the VG. If we find below that there are new PVs
* in the VG, we'll have to rescan all devices to find which
* device(s) are now being used.
*/
log_debug_lvmetad("Rescan VG %s scanning data from devs in previous metadata.", vg->name);
label_scan_devs(cmd, &pvs_scan);
/*
* Run the equivalent of lvmetad_pvscan_single on each dev in the VG.
* Check if any pvs_scan entries are no longer PVs.
* In that case, label_read/_find_label_header will have
* found no label_header, and would have dropped the
* info struct for the device from lvmcache. So, if
* we look up the info struct here and don't find it,
* we can infer it's no longer a PV.
*
* FIXME: we should record specific results from the
* label_read and then check specifically for whatever
* result means "no label was found", rather than going
* about this indirectly via the lvmcache side effects.
*/
dm_list_iterate_items_safe(devl, devlsafe, &pvs_scan) {
if (!(info = lvmcache_info_from_pvid(devl->dev->pvid, devl->dev, 0))) {
/* Another host removed this PV from the VG. */
log_debug_lvmetad("Rescan VG %s from %s dropping dev (no label).",
vg->name, dev_name(devl->dev));
dm_list_move(&pvs_drop, &devl->list);
}
}
fic.type = FMT_INSTANCE_MDAS | FMT_INSTANCE_AUX_MDAS;
fic.context.vg_ref.vg_name = vg->name;
fic.context.vg_ref.vg_id = vgid;
retry_reads:
if (!(fid = fmt->ops->create_instance(fmt, &fic))) {
/* FIXME: are there only internal reasons for failures here? */
log_error("Reading VG %s failed to create format instance.", vg->name);
return NULL;
}
/* FIXME: not sure if this is necessary */
fid->ref_count++;
baton.fid = fid;
baton.cmd = cmd;
/*
* FIXME: this vg_read path does not have the ability to repair
* any problems with the VG, e.g. VG on one dev has an older
* seqno. When vg_read() is reworked, we need to fall back
* to using that from here (and vg_read's from lvmetad) when
* there is a problem. Perhaps by disabling lvmetad when a
* VG problem is detected, causing commands to fully fall
* back to disk, which will repair the VG. Then lvmetad can
* be repopulated and re-enabled (possibly automatically.)
*/
/*
* Do a low level vg_read on each dev, verify the vg returned
* from metadata on each device is for the VG being read
* (the PV may have been removed from the VG being read and
* added to a different one), and return this vg to the caller
* as the current vg to use.
*
* The label scan above will have saved in lvmcache which
* vg each device is used in, so we could figure that part
* out without doing the vg_read.
*/
dm_list_iterate_items_safe(devl, devlsafe, &pvs_scan) {
if (!devl->dev)
continue;
log_debug_lvmetad("Rescan VG %s scanning %s.", vg->name, dev_name(devl->dev));
if (!label_read(devl->dev, &label, 0)) {
/* Another host removed this PV from the VG. */
log_debug_lvmetad("Rescan VG %s found %s was removed.", vg->name, dev_name(devl->dev));
if ((info = lvmcache_info_from_pvid(devl->dev->pvid, NULL, 0)))
lvmcache_del(info);
log_debug_lvmetad("Rescan VG %s getting metadata from %s.",
vg->name, dev_name(devl->dev));
/*
* The info struct for this dev knows what and where
* the mdas are for this dev (the label scan saved
* the mda locations for this dev on the lvmcache info struct).
*/
if (!(info = lvmcache_info_from_pvid(devl->dev->pvid, devl->dev, 0))) {
log_debug_lvmetad("Rescan VG %s from %s dropping dev (no info).",
vg->name, dev_name(devl->dev));
dm_list_move(&pvs_drop, &devl->list);
continue;
}
info = (struct lvmcache_info *) label->info;
baton.vg = NULL;
baton.fid = lvmcache_fmt(info)->ops->create_instance(lvmcache_fmt(info), &fic);
if (!baton.fid)
return_NULL;
if (baton.fid->fmt->features & FMT_OBSOLETE) {
log_debug_lvmetad("Ignoring obsolete format on PV %s in VG %s.", dev_name(devl->dev), vg->name);
lvmcache_fmt(info)->ops->destroy_instance(baton.fid);
dm_list_move(&pvs_drop, &devl->list);
continue;
}
/*
* Read VG metadata from this dev's mdas.
*/
lvmcache_foreach_mda(info, _lvmetad_pvscan_single, &baton);
lvmcache_foreach_mda(info, _lvmetad_pvscan_vg_single, &baton);
/*
* The PV may have been removed from the VG by another host
* since we last read the VG.
*/
if (!baton.vg) {
log_debug_lvmetad("Rescan VG %s did not find %s.", vg->name, dev_name(devl->dev));
lvmcache_fmt(info)->ops->destroy_instance(baton.fid);
log_debug_lvmetad("Rescan VG %s from %s dropping dev (no metadata).",
vg->name, dev_name(devl->dev));
dm_list_move(&pvs_drop, &devl->list);
continue;
}
@@ -1893,10 +1989,15 @@ scan_more:
* different VG since we last read the VG.
*/
if (strcmp(baton.vg->name, vg->name)) {
log_debug_lvmetad("Rescan VG %s found different VG %s on PV %s.",
vg->name, baton.vg->name, dev_name(devl->dev));
log_debug_lvmetad("Rescan VG %s from %s dropping dev (other VG %s).",
vg->name, dev_name(devl->dev), baton.vg->name);
release_vg(baton.vg);
continue;
}
if (!(vgmeta = export_vg_to_config_tree(baton.vg))) {
log_error("VG export to config tree failed");
release_vg(baton.vg);
dm_list_move(&pvs_drop, &devl->list);
continue;
}
@@ -1906,20 +2007,35 @@ scan_more:
* read from each other dev.
*/
if (!save_seqno)
save_seqno = baton.vg->seqno;
if (save_vg && (save_seqno != baton.vg->seqno)) {
/* FIXME: fall back to vg_read to correct this. */
log_warn("WARNING: inconsistent metadata for VG %s on devices %s seqno %u and %s seqno %u.",
vg->name, dev_name(save_dev), save_seqno,
dev_name(devl->dev), baton.vg->seqno);
log_warn("WARNING: temporarily disable lvmetad to repair metadata.");
if (!(vgmeta = export_vg_to_config_tree(baton.vg))) {
log_error("VG export to config tree failed");
release_vg(baton.vg);
return NULL;
/* Use the most recent */
if (save_seqno < baton.vg->seqno) {
release_vg(save_vg);
dm_config_destroy(save_meta);
save_vg = baton.vg;
save_meta = vgmeta;
save_seqno = baton.vg->seqno;
save_dev = devl->dev;
} else {
release_vg(baton.vg);
dm_config_destroy(vgmeta);
}
continue;
}
if (!vgmeta_ret) {
vgmeta_ret = vgmeta;
if (!save_vg) {
save_vg = baton.vg;
save_meta = vgmeta;
save_seqno = baton.vg->seqno;
save_dev = devl->dev;
} else {
struct dm_config_node *meta1 = vgmeta_ret->root;
struct dm_config_node *meta1 = save_meta->root;
struct dm_config_node *meta2 = vgmeta->root;
struct dm_config_node *sib1 = meta1->sib;
struct dm_config_node *sib2 = meta2->sib;
@@ -1944,73 +2060,128 @@ scan_more:
meta2->sib = NULL;
if (compare_config(meta1, meta2)) {
/* FIXME: fall back to vg_read to correct this. */
log_warn("WARNING: inconsistent metadata for VG %s on devices %s seqno %u and %s seqno %u.",
vg->name, dev_name(save_dev), save_seqno,
dev_name(devl->dev), baton.vg->seqno);
log_warn("WARNING: temporarily disable lvmetad to repair metadata.");
log_error("VG %s metadata comparison failed for device %s vs %s",
vg->name, dev_name(devl->dev), save_dev ? dev_name(save_dev) : "none");
_log_debug_inequality(vg->name, vgmeta_ret->root, vgmeta->root);
_log_debug_inequality(vg->name, save_meta->root, vgmeta->root);
meta1->sib = sib1;
meta2->sib = sib2;
dm_config_destroy(vgmeta);
dm_config_destroy(vgmeta_ret);
/* no right choice, just use the previous copy */
release_vg(baton.vg);
return NULL;
dm_config_destroy(vgmeta);
}
meta1->sib = sib1;
meta2->sib = sib2;
release_vg(baton.vg);
dm_config_destroy(vgmeta);
}
}
/*
* Look for any new PVs in the VG metadata that were not in our
* previous version of the VG. Add them to pvs_new to be
* scanned in this loop just like the old PVs.
*/
if (!check_new_pvs) {
check_new_pvs = 1;
dm_list_iterate_items(pvl_new, &baton.vg->pvs) {
found = 0;
dm_list_iterate_items(pvl, &vg->pvs) {
if (pvl_new->pv->dev != pvl->pv->dev)
continue;
found = 1;
break;
}
if (found)
/* FIXME: see above */
fid->ref_count--;
/*
* Look for any new PVs in the VG metadata that were not in our
* previous version of the VG.
*
* (Don't look for new PVs after a rescan and retry.)
*/
found_new_pvs = 0;
if (save_vg && !retried_reads) {
dm_list_iterate_items(pvl_new, &save_vg->pvs) {
found = 0;
dm_list_iterate_items(pvl, &vg->pvs) {
if (pvl_new->pv->dev != pvl->pv->dev)
continue;
if (!pvl_new->pv->dev) {
strncpy(pvid_s, (char *) &pvl_new->pv->id, sizeof(pvid_s) - 1);
if (!id_write_format((const struct id *)&pvid_s, uuid, sizeof(uuid)))
stack;
log_error("Device not found for PV %s in VG %s", uuid, vg->name);
missing_devs++;
continue;
}
if (!(devl_new = dm_pool_zalloc(cmd->mem, sizeof(*devl_new))))
return_NULL;
devl_new->dev = pvl_new->pv->dev;
dm_list_add(&pvs_new, &devl_new->list);
log_debug_lvmetad("Rescan VG %s found %s was added.", vg->name, dev_name(devl_new->dev));
found = 1;
break;
}
/*
* PV in new VG metadata not found in old VG metadata.
* There's a good chance we don't know about this new
* PV or what device it's on; a label scan is needed
* of all devices so we know which device the VG is
* now using.
*/
if (!found) {
found_new_pvs++;
strncpy(pvid_s, (char *) &pvl_new->pv->id, sizeof(pvid_s) - 1);
if (!id_write_format((const struct id *)&pvid_s, uuid, sizeof(uuid)))
stack;
log_debug_lvmetad("Rescan VG %s found new PV %s.", vg->name, uuid);
}
}
}
release_vg(baton.vg);
if (!save_vg && retried_reads) {
log_error("VG %s not found after rescanning devices.", vg->name);
goto out;
}
/*
* Do the same scanning above for any new PVs.
* Do a full rescan of devices, then look up which devices the
* scan found for this VG name, and select those devices to
* read metadata from in the loop above (rather than the list
* of devices we created from our last copy of the vg metadata.)
*
* Case 1: VG we knew is no longer on any of the devices we knew it
* to be on (save_vg is NULL, which means the metadata wasn't found
* when reading mdas on each of the initial pvs_scan devices).
* Rescan all devs and then retry reading metadata from the devs that
* the scan finds associated with this VG.
*
* Case 2: VG has new PVs but we don't know what devices they are
* so rescan all devs and then retry reading metadata from the devs
* that the scan finds associated with this VG.
*
* (N.B. after a retry, we don't check for found_new_pvs.)
*/
if (!dm_list_empty(&pvs_new)) {
dm_list_init(&pvs_scan);
dm_list_splice(&pvs_scan, &pvs_new);
dm_list_init(&pvs_new);
log_debug_lvmetad("Rescan VG %s found new PVs to scan.", vg->name);
goto scan_more;
}
if (!save_vg || found_new_pvs) {
if (!save_vg)
log_debug_lvmetad("Rescan VG %s did not find VG on previous devs.", vg->name);
if (found_new_pvs)
log_debug_lvmetad("Rescan VG %s scanning all devs to find new PVs.", vg->name);
if (missing_devs) {
if (vgmeta_ret)
dm_config_destroy(vgmeta_ret);
return_NULL;
label_scan_force(cmd);
if (!(vginfo = lvmcache_vginfo_from_vgname(vg->name, NULL))) {
log_error("VG %s vg info not found after rescanning devices.", vg->name);
goto out;
}
/*
* Set pvs_scan to devs that the label scan found
* in the VG and retry the metadata reading loop.
*/
dm_list_init(&pvs_scan);
if (!lvmcache_get_vg_devs(cmd, vginfo, &pvs_scan)) {
log_error("VG %s info devs not found after rescanning devices.", vg->name);
goto out;
}
log_debug_lvmetad("Rescan VG %s has %d PVs after label scan.",
vg->name, dm_list_size(&pvs_scan));
if (save_vg)
release_vg(save_vg);
if (save_meta)
dm_config_destroy(save_meta);
save_vg = NULL;
save_meta = NULL;
save_dev = NULL;
save_seqno = 0;
found_new_pvs = 0;
retried_reads = 1;
goto retry_reads;
}
/*
@@ -2019,52 +2190,50 @@ scan_more:
dm_list_iterate_items(devl, &pvs_drop) {
if (!devl->dev)
continue;
log_debug_lvmetad("Rescan VG %s dropping %s.", vg->name, dev_name(devl->dev));
if (!lvmetad_pv_gone_by_dev(devl->dev))
return_NULL;
log_debug_lvmetad("Rescan VG %s removing %s from lvmetad.", vg->name, dev_name(devl->dev));
if (!lvmetad_pv_gone_by_dev(devl->dev)) {
/* FIXME: use an error path that disables lvmetad */
log_error("Failed to remove %s from lvmetad.", dev_name(devl->dev));
}
}
/*
* Update the VG in lvmetad.
* Update lvmetad with the newly read version of the VG.
* When the seqno is unchanged the cached VG can be left.
*/
if (vgmeta_ret) {
fid = lvmcache_fmt(info)->ops->create_instance(lvmcache_fmt(info), &fic);
if (!(vg_ret = import_vg_from_config_tree(vgmeta_ret, fid))) {
log_error("VG import from config tree failed");
lvmcache_fmt(info)->ops->destroy_instance(fid);
goto out;
if (save_vg && (save_seqno != vg->seqno)) {
dm_list_iterate_items(devl, &pvs_scan) {
if (!devl->dev)
continue;
log_debug_lvmetad("Rescan VG %s removing %s from lvmetad to replace.",
vg->name, dev_name(devl->dev));
if (!lvmetad_pv_gone_by_dev(devl->dev)) {
/* FIXME: use an error path that disables lvmetad */
log_error("Failed to remove %s from lvmetad.", dev_name(devl->dev));
}
}
log_debug_lvmetad("Rescan VG %s updating lvmetad from seqno %u to seqno %u.",
vg->name, vg->seqno, save_seqno);
/*
* Update lvmetad with the newly read version of the VG.
* When the seqno is unchanged the cached VG can be left.
* If this vg_update fails the cached metadata in
* lvmetad will remain invalid.
*/
if (save_seqno != vg->seqno) {
dm_list_iterate_items(devl, &pvs_scan) {
if (!devl->dev)
continue;
log_debug_lvmetad("Rescan VG %s dropping to replace %s.", vg->name, dev_name(devl->dev));
if (!lvmetad_pv_gone_by_dev(devl->dev))
return_NULL;
}
log_debug_lvmetad("Rescan VG %s updating lvmetad from seqno %u to seqno %u.",
vg->name, vg->seqno, save_seqno);
/*
* If this vg_update fails the cached metadata in
* lvmetad will remain invalid.
*/
vg_ret->lvmetad_update_pending = 1;
if (!lvmetad_vg_update_finish(vg_ret))
log_error("Failed to update lvmetad with new VG meta");
save_vg->lvmetad_update_pending = 1;
if (!lvmetad_vg_update_finish(save_vg)) {
/* FIXME: use an error path that disables lvmetad */
log_error("Failed to update lvmetad with new VG meta");
}
dm_config_destroy(vgmeta_ret);
}
out:
if (vg_ret)
log_debug_lvmetad("Rescan VG %s done (seqno %u).", vg_ret->name, vg_ret->seqno);
return vg_ret;
if (!save_vg && fid)
fmt->ops->destroy_instance(fid);
if (save_meta)
dm_config_destroy(save_meta);
if (save_vg)
log_debug_lvmetad("Rescan VG %s done (new seqno %u).", save_vg->name, save_vg->seqno);
return save_vg;
}
int lvmetad_pvscan_single(struct cmd_context *cmd, struct device *dev,
@@ -2074,9 +2243,12 @@ int lvmetad_pvscan_single(struct cmd_context *cmd, struct device *dev,
struct label *label;
struct lvmcache_info *info;
struct _lvmetad_pvscan_baton baton;
const struct format_type *fmt;
/* Create a dummy instance. */
struct format_instance_ctx fic = { .type = 0 };
log_debug_lvmetad("Scan metadata from dev %s", dev_name(dev));
if (!lvmetad_used()) {
log_error("Cannot proceed since lvmetad is not active.");
return 0;
@@ -2087,23 +2259,31 @@ int lvmetad_pvscan_single(struct cmd_context *cmd, struct device *dev,
return 1;
}
if (!label_read(dev, &label, 0)) {
log_print_unless_silent("No PV label found on %s.", dev_name(dev));
if (!(info = lvmcache_info_from_pvid(dev->pvid, dev, 0))) {
log_print_unless_silent("No PV info found on %s for PVID %s.", dev_name(dev), dev->pvid);
if (!lvmetad_pv_gone_by_dev(dev))
goto_bad;
return 1;
}
info = (struct lvmcache_info *) label->info;
if (!(label = lvmcache_get_label(info))) {
log_print_unless_silent("No PV label found for %s.", dev_name(dev));
if (!lvmetad_pv_gone_by_dev(dev))
goto_bad;
return 1;
}
fmt = lvmcache_fmt(info);
baton.cmd = cmd;
baton.vg = NULL;
baton.fid = lvmcache_fmt(info)->ops->create_instance(lvmcache_fmt(info), &fic);
baton.fid = fmt->ops->create_instance(fmt, &fic);
if (!baton.fid)
goto_bad;
if (baton.fid->fmt->features & FMT_OBSOLETE) {
lvmcache_fmt(info)->ops->destroy_instance(baton.fid);
if (fmt->features & FMT_OBSOLETE) {
fmt->ops->destroy_instance(baton.fid);
log_warn("WARNING: Disabling lvmetad cache which does not support obsolete (lvm1) metadata.");
lvmetad_set_disabled(cmd, LVMETAD_DISABLE_REASON_LVM1);
_found_lvm1_metadata = 1;
@@ -2117,9 +2297,9 @@ int lvmetad_pvscan_single(struct cmd_context *cmd, struct device *dev,
lvmcache_foreach_mda(info, _lvmetad_pvscan_single, &baton);
if (!baton.vg)
lvmcache_fmt(info)->ops->destroy_instance(baton.fid);
fmt->ops->destroy_instance(baton.fid);
if (!lvmetad_pv_found(cmd, (const struct id *) &dev->pvid, dev, lvmcache_fmt(info),
if (!lvmetad_pv_found(cmd, (const struct id *) &dev->pvid, dev, fmt,
label->sector, baton.vg, found_vgnames, changed_vgnames)) {
release_vg(baton.vg);
goto_bad;
@@ -2185,6 +2365,13 @@ int lvmetad_pvscan_all_devs(struct cmd_context *cmd, int do_wait)
replacing_other_update = 1;
}
label_scan(cmd);
if (lvmcache_found_duplicate_pvs()) {
log_warn("WARNING: Scan found duplicate PVs.");
return 0;
}
log_verbose("Scanning all devices to update lvmetad.");
if (!(iter = dev_iter_create(cmd->lvmetad_filter, 1))) {
@@ -2555,6 +2742,8 @@ void lvmetad_validate_global_cache(struct cmd_context *cmd, int force)
*/
_lvmetad_get_pv_cache_list(cmd, &pvc_before);
log_debug_lvmetad("Rescan all devices to validate global cache.");
/*
* Update the local lvmetad cache so it correctly reflects any
* changes made on remote hosts. (It's possible that this command
@@ -2623,7 +2812,7 @@ void lvmetad_validate_global_cache(struct cmd_context *cmd, int force)
_update_changed_pvs_in_udev(cmd, &pvc_before, &pvc_after);
}
log_debug_lvmetad("Validating global lvmetad cache finished");
log_debug_lvmetad("Rescanned all devices");
}
int lvmetad_vg_is_foreign(struct cmd_context *cmd, const char *vgname, const char *vgid)

@@ -542,6 +542,7 @@ static int _process_config(struct cmd_context *cmd)
const struct dm_config_value *cv;
int64_t pv_min_kb;
int udev_disabled = 0;
int scan_size;
char sysfs_dir[PATH_MAX];
if (!_check_config(cmd))
@@ -625,6 +626,29 @@ static int _process_config(struct cmd_context *cmd)
cmd->default_settings.udev_sync = udev_disabled ? 0 :
find_config_tree_bool(cmd, activation_udev_sync_CFG, NULL);
#ifdef AIO_SUPPORT
cmd->use_aio = find_config_tree_bool(cmd, devices_scan_async_CFG, NULL);
#else
cmd->use_aio = 0;
if (find_config_tree_bool(cmd, devices_scan_async_CFG, NULL))
log_verbose("Ignoring scan_async, no async I/O support.");
#endif
scan_size = find_config_tree_int(cmd, devices_scan_size_CFG, NULL);
if (!scan_size || (scan_size < 0)) {
log_warn("WARNING: Ignoring invalid devices/scan_size %d, using default %u.",
scan_size, DEFAULT_SCAN_SIZE_KB);
scan_size = DEFAULT_SCAN_SIZE_KB;
}
if (cmd->use_aio && (scan_size % 4)) {
log_warn("WARNING: Ignoring invalid devices/scan_size %d with scan_async, using default %u.",
scan_size, DEFAULT_SCAN_SIZE_KB);
scan_size = DEFAULT_SCAN_SIZE_KB;
}
cmd->default_settings.scan_size_kb = scan_size;
/*
* Set udev_fallback lazily on first use since it requires
* checking DM driver version which is an extra ioctl!
@@ -685,9 +709,6 @@ static int _process_config(struct cmd_context *cmd)
if (find_config_tree_bool(cmd, report_two_word_unknown_device_CFG, NULL))
init_unknown_device_name("unknown device");
init_detect_internal_vg_cache_corruption
(find_config_tree_bool(cmd, global_detect_internal_vg_cache_corruption_CFG, NULL));
if (!_init_system_id(cmd))
return_0;
@@ -2001,7 +2022,6 @@ struct cmd_context *create_toolcontext(unsigned is_long_lived,
if (set_filters && !init_filters(cmd, 1))
goto_out;
cmd->default_settings.cache_vgmetadata = 1;
cmd->current_settings = cmd->default_settings;
cmd->initialized.config = 1;
@@ -2231,6 +2251,9 @@ void destroy_toolcontext(struct cmd_context *cmd)
!cmd->filter->dump(cmd->filter, 1))
stack;
if (cmd->ac)
dev_async_context_destroy(cmd->ac);
archive_exit(cmd);
backup_exit(cmd);
lvmcache_destroy(cmd, 0, 0);
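
The scan_size checks added to _process_config() above can be read as a small pure function. The following is only a sketch of that logic under stated assumptions: pick_scan_size_kb() is a hypothetical helper that does not exist in the patch, and it reports the fallback through its return value where the patch logs a warning.

```c
#include <assert.h>
#include <stddef.h>

#define DEFAULT_SCAN_SIZE_KB 128	/* same value as the defaults.h hunk in this diff */

/*
 * Hypothetical helper, not part of the patch.
 * Stores the scan size (KiB) to use in *out_kb.
 * Returns 1 if the configured value was accepted,
 * 0 if the default replaced it.
 */
static int pick_scan_size_kb(int configured_kb, int use_aio, int *out_kb)
{
	assert(out_kb != NULL);

	/* Zero and negative sizes are never valid. */
	if (configured_kb <= 0) {
		*out_kb = DEFAULT_SCAN_SIZE_KB;
		return 0;
	}

	/* With async reads, only multiples of 4 KiB are accepted. */
	if (use_aio && (configured_kb % 4)) {
		*out_kb = DEFAULT_SCAN_SIZE_KB;
		return 0;
	}

	*out_kb = configured_kb;
	return 1;
}
```

With these rules a setting of 130 is kept for synchronous scanning but replaced by 128 when scan_async is enabled.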

@@ -39,7 +39,7 @@ struct config_info {
int udev_rules;
int udev_sync;
int udev_fallback;
int cache_vgmetadata;
int scan_size_kb;
const char *msg_prefix;
const char *fmt_name;
uint64_t unit_factor;
@@ -164,6 +164,8 @@ struct cmd_context {
unsigned vg_notify:1;
unsigned lv_notify:1;
unsigned pv_notify:1;
unsigned use_aio:1;
unsigned pvscan_cache_single:1;
/*
* Filtering.
@@ -223,6 +225,7 @@ struct cmd_context {
const char *time_format;
unsigned rand_seed;
struct dm_list unused_duplicate_devs; /* save preferences between lvmcache instances */
struct dev_async_context *ac; /* for async i/o */
};
/*

@@ -494,7 +494,7 @@ int override_config_tree_from_profile(struct cmd_context *cmd,
* and function avoids parsing of mda into config tree which
* remains unmodified and should not be used.
*/
int config_file_read_fd(struct dm_config_tree *cft, struct device *dev,
int config_file_read_fd(struct dm_config_tree *cft, struct device *dev, char *buf_async,
off_t offset, size_t size, off_t offset2, size_t size2,
checksum_fn_t checksum_fn, uint32_t checksum,
int checksum_only, int no_dup_node_check)
@@ -517,7 +517,18 @@ int config_file_read_fd(struct dm_config_tree *cft, struct device *dev,
if (!(dev->flags & DEV_REGULAR) || size2)
use_mmap = 0;
if (use_mmap) {
if (buf_async) {
if (!(buf = dm_malloc(size + size2))) {
log_error("Failed to allocate circular buffer.");
return 0;
}
memcpy(buf, buf_async + offset, size);
if (size2)
memcpy(buf + size, buf_async + offset2, size2);
fb = buf;
} else if (use_mmap) {
mmap_offset = offset % lvm_getpagesize();
/* memory map the file */
fb = mmap((caddr_t) 0, size + mmap_offset, PROT_READ,
@@ -532,6 +543,7 @@ int config_file_read_fd(struct dm_config_tree *cft, struct device *dev,
log_error("Failed to allocate circular buffer.");
return 0;
}
if (!dev_read_circular(dev, (uint64_t) offset, size,
(uint64_t) offset2, size2, buf)) {
goto out;
@@ -601,7 +613,7 @@ int config_file_read(struct dm_config_tree *cft)
}
}
r = config_file_read_fd(cft, cf->dev, 0, (size_t) info.st_size, 0, 0,
r = config_file_read_fd(cft, cf->dev, NULL, 0, (size_t) info.st_size, 0, 0,
(checksum_fn_t) NULL, 0, 0, 0);
if (!cf->keep_open) {
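
When buf_async is supplied, config_file_read_fd() above builds one contiguous buffer from up to two regions of the data already read from the device, with the second region covering metadata that wrapped around in the circular area. A minimal sketch of that copy follows. join_wrapped() is a hypothetical name, and the bounds checks and trailing NUL byte are additions for the example that the hunk itself does not show.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch.
 * Copies [offset, offset + size) and then [offset2, offset2 + size2)
 * from an already-read buffer into one new contiguous buffer.
 * Returns NULL on allocation failure or if a region lies outside src.
 * The caller frees the result.
 */
static char *join_wrapped(const char *src, size_t src_len,
			  size_t offset, size_t size,
			  size_t offset2, size_t size2)
{
	char *out;

	assert(src != NULL);

	/* Bounds checks (an addition in this sketch). */
	if (offset > src_len || size > src_len - offset)
		return NULL;
	if (size2 && (offset2 > src_len || size2 > src_len - offset2))
		return NULL;

	/* One extra byte so the result is NUL-terminated (also an addition). */
	if (!(out = malloc(size + size2 + 1)))
		return NULL;

	memcpy(out, src + offset, size);
	if (size2)
		memcpy(out + size, src + offset2, size2);
	out[size + size2] = '\0';

	return out;
}
```

The first region is copied before the wrapped one, so the text comes out in logical order even though the wrapped part sits at a lower offset in the source buffer.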

@@ -239,7 +239,7 @@ config_source_t config_get_source_type(struct dm_config_tree *cft);
typedef uint32_t (*checksum_fn_t) (uint32_t initial, const uint8_t *buf, uint32_t size);
struct dm_config_tree *config_open(config_source_t source, const char *filename, int keep_open);
int config_file_read_fd(struct dm_config_tree *cft, struct device *dev,
int config_file_read_fd(struct dm_config_tree *cft, struct device *dev, char *buf_async,
off_t offset, size_t size, off_t offset2, size_t size2,
checksum_fn_t checksum_fn, uint32_t checksum,
int skip_parse, int no_dup_node_check);

@@ -457,6 +457,26 @@ cfg(devices_allow_changes_with_duplicate_pvs_CFG, "allow_changes_with_duplicate_
"Enabling this setting allows the VG to be used as usual even with\n"
"uncertain devices.\n")
cfg(devices_scan_async_CFG, "scan_async", devices_CFG_SECTION, CFG_DEFAULT_COMMENTED, CFG_TYPE_BOOL, DEFAULT_SCAN_ASYNC, vsn(2, 2, 173), NULL, 0, NULL,
"Use async I/O to read headers and metadata from disks in parallel.\n")
cfg(devices_scan_size_CFG, "scan_size", devices_CFG_SECTION, CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, DEFAULT_SCAN_SIZE_KB, vsn(2, 2, 173), NULL, 0, NULL,
"Number of KiB to read from each disk when scanning disks.\n"
"The initial scan size is intended to cover all the headers\n"
"and metadata that LVM places at the start of each disk so\n"
"that a single read operation can retrieve them all.\n"
"Any headers or metadata that lie beyond this size require\n"
"an additional disk read.\n")
cfg(devices_async_events_CFG, "async_events", devices_CFG_SECTION, CFG_DEFAULT_COMMENTED, CFG_TYPE_INT, DEFAULT_ASYNC_EVENTS, vsn(2, 2, 173), NULL, 0, NULL,
"Max number of concurrent async reads when scanning disks.\n"
"Up to this many disks can be read concurrently when scanning\n"
"disks with async I/O. If there are more disks than this,\n"
"they will be scanned serially with synchronous reads.\n"
"Increasing this number to match a larger number of disks may\n"
"improve performance, but will increase memory requirements.\n"
"This setting is limited by the system aio configuration.\n")
cfg_array(allocation_cling_tag_list_CFG, "cling_tag_list", allocation_CFG_SECTION, CFG_DEFAULT_UNDEFINED, CFG_TYPE_STRING, NULL, vsn(2, 2, 77), NULL, 0, NULL,
"Advise LVM which PVs to use when searching for new space.\n"
"When searching for free space to extend an LV, the 'cling' allocation\n"
@@ -868,11 +888,8 @@ cfg(global_abort_on_internal_errors_CFG, "abort_on_internal_errors", global_CFG_
"Treat any internal errors as fatal errors, aborting the process that\n"
"encountered the internal error. Please only enable for debugging.\n")
cfg(global_detect_internal_vg_cache_corruption_CFG, "detect_internal_vg_cache_corruption", global_CFG_SECTION, 0, CFG_TYPE_BOOL, DEFAULT_DETECT_INTERNAL_VG_CACHE_CORRUPTION, vsn(2, 2, 96), NULL, 0, NULL,
"Internal verification of VG structures.\n"
"Check if CRC matches when a parsed VG is used multiple times. This\n"
"is useful to catch unexpected changes to cached VG structures.\n"
"Please only enable for debugging.\n")
cfg(global_detect_internal_vg_cache_corruption_CFG, "detect_internal_vg_cache_corruption", global_CFG_SECTION, 0, CFG_TYPE_BOOL, 0, vsn(2, 2, 96), NULL, vsn(2, 2, 174), NULL,
"No longer used.\n")
cfg(global_metadata_read_only_CFG, "metadata_read_only", global_CFG_SECTION, 0, CFG_TYPE_BOOL, DEFAULT_METADATA_READ_ONLY, vsn(2, 2, 75), NULL, 0, NULL,
"No operations that change on-disk metadata are permitted.\n"

@@ -60,6 +60,10 @@
#define DEFAULT_LVDISPLAY_SHOWS_FULL_DEVICE_PATH 0
#define DEFAULT_UNKNOWN_DEVICE_NAME "[unknown]"
#define DEFAULT_SCAN_ASYNC 1
#define DEFAULT_SCAN_SIZE_KB 128
#define DEFAULT_ASYNC_EVENTS 100
#define DEFAULT_SANLOCK_LV_EXTEND_MB 256
#define DEFAULT_MIRRORLOG MIRROR_LOG_DISK
@@ -179,7 +183,6 @@
#define DEFAULT_LOGLEVEL 0
#define DEFAULT_INDENT 1
#define DEFAULT_ABORT_ON_INTERNAL_ERRORS 0
#define DEFAULT_DETECT_INTERNAL_VG_CACHE_CORRUPTION 0
#define DEFAULT_UNITS "r"
#define DEFAULT_SUFFIX 1
#define DEFAULT_HOSTTAGS 0

@@ -1081,6 +1081,8 @@ static void _full_scan(int dev_scan)
if (_cache.has_scanned && !dev_scan)
return;
log_debug_devs("Adding device paths to dev cache");
_insert_dirs(&_cache.dirs);
(void) dev_cache_index_devs();
@@ -1090,6 +1092,8 @@ static void _full_scan(int dev_scan)
_cache.has_scanned = 1;
init_full_scan_done(1);
log_debug_devs("Added %d device paths to dev cache", dm_hash_get_num_entries(_cache.names));
}
int dev_cache_has_scanned(void)

@@ -827,3 +827,172 @@ int dev_set(struct device *dev, uint64_t offset, size_t len, int value)
return (len == 0);
}
#ifdef AIO_SUPPORT
/* io_setup() wrapper */
struct dev_async_context *dev_async_context_setup(unsigned async_event_count)
{
struct dev_async_context *ac;
unsigned nr_events = DEFAULT_ASYNC_EVENTS;
int error;
if (async_event_count)
nr_events = async_event_count;
if (!(ac = malloc(sizeof(struct dev_async_context))))
return_0;
memset(ac, 0, sizeof(struct dev_async_context));
error = io_setup(nr_events, &ac->aio_ctx);
if (error < 0) {
log_warn("WARNING: async io setup error %d with %u events.", error, nr_events);
free(ac);
return_0;
}
return ac;
}
struct dev_async_io *dev_async_io_alloc(int buf_len)
{
struct dev_async_io *aio;
char *buf;
char **p_buf;
/*
* mem pool doesn't seem to work for this, probably because
* of the memalign that follows.
*/
if (!(aio = malloc(sizeof(struct dev_async_io))))
return_0;
memset(aio, 0, sizeof(struct dev_async_io));
buf = NULL;
p_buf = &buf;
if (posix_memalign((void *)p_buf, getpagesize(), buf_len)) {
free(aio);
return_NULL;
}
memset(buf, 0, buf_len);
aio->buf = buf;
aio->buf_len = buf_len;
return aio;
}
void dev_async_context_destroy(struct dev_async_context *ac)
{
io_destroy(ac->aio_ctx);
free(ac);
}
void dev_async_io_destroy(struct dev_async_io *aio)
{
if (aio->buf)
free(aio->buf);
free(aio);
}
/* io_submit() wrapper */
int dev_async_read_submit(struct dev_async_context *ac, struct dev_async_io *aio,
struct device *dev, uint32_t len, uint64_t offset, int *nospace)
{
struct iocb *iocb = &aio->iocb;
int error;
*nospace = 0;
if (len > aio->buf_len)
return_0;
aio->len = len;
iocb->data = aio;
iocb->aio_fildes = dev_fd(dev);
iocb->aio_lio_opcode = IO_CMD_PREAD;
iocb->u.c.buf = aio->buf;
iocb->u.c.nbytes = len;
iocb->u.c.offset = offset;
error = io_submit(ac->aio_ctx, 1, &iocb);
if (error == -EAGAIN)
*nospace = 1;
if (error < 0)
return 0;
return 1;
}
/* io_getevents() wrapper */
int dev_async_getevents(struct dev_async_context *ac, int wait_count, struct timespec *timeout)
{
int wait_nr;
int rv;
int i;
retry:
memset(&ac->events, 0, sizeof(ac->events));
if (wait_count >= MAX_GET_EVENTS)
wait_nr = MAX_GET_EVENTS;
else
wait_nr = wait_count;
rv = io_getevents(ac->aio_ctx, 1, wait_nr, (struct io_event *)&ac->events, timeout);
if (rv == -EINTR)
goto retry;
if (rv < 0)
return 0;
if (!rv)
return 1;
for (i = 0; i < rv; i++) {
struct iocb *iocb = ac->events[i].obj;
struct dev_async_io *aio = iocb->data;
aio->result = ac->events[i].res;
aio->done = 1;
}
return 1;
}
#else /* AIO_SUPPORT */
struct dev_async_context *dev_async_context_setup(unsigned async_event_count)
{
return NULL;
}
struct dev_async_io *dev_async_io_alloc(int buf_len)
{
return NULL;
}
void dev_async_context_destroy(struct dev_async_context *ac)
{
}
void dev_async_io_destroy(struct dev_async_io *aio)
{
}
int dev_async_read_submit(struct dev_async_context *ac, struct dev_async_io *aio,
struct device *dev, uint32_t len, uint64_t offset, int *nospace)
{
return 0;
}
int dev_async_getevents(struct dev_async_context *ac, int wait_count, struct timespec *timeout)
{
return 0;
}
#endif /* AIO_SUPPORT */
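
dev_async_io_alloc() above obtains its read buffer from posix_memalign() at page alignment and zeroes it. The sketch below shows the same allocation pattern in isolation, passing a void * directly instead of going through a char ** cast. alloc_page_aligned() is a hypothetical name, and using sysconf() rather than getpagesize() is a choice made for this example, not something the patch does.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for posix_memalign() under strict -std modes */
#endif

#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch.
 * Returns a zeroed buffer of len bytes aligned to the system page size,
 * or NULL on failure. The caller releases it with free().
 */
static char *alloc_page_aligned(size_t len)
{
	void *p = NULL;
	long page = sysconf(_SC_PAGESIZE);

	assert(len > 0);

	if (page <= 0)
		return NULL;

	/* posix_memalign() returns 0 on success and an error number otherwise. */
	if (posix_memalign(&p, (size_t) page, len))
		return NULL;

	memset(p, 0, len);
	return p;
}
```

Memory from posix_memalign() is released with plain free(), which is what dev_async_io_destroy() does for aio->buf.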

@@ -19,6 +19,7 @@
#include "uuid.h"
#include <fcntl.h>
#include <libaio.h>
#define DEV_ACCESSED_W 0x00000001 /* Device written to? */
#define DEV_REGULAR 0x00000002 /* Regular file? */
@@ -90,6 +91,28 @@ struct device_area {
uint64_t size; /* Bytes */
};
/*
* We'll collect the results of this many async reads
* in one system call. It shouldn't matter much what
* number is used here.
*/
#define MAX_GET_EVENTS 16
struct dev_async_context {
io_context_t aio_ctx;
struct io_event events[MAX_GET_EVENTS];
};
struct dev_async_io {
char *buf;
struct iocb iocb;
struct device *dev;
uint32_t buf_len; /* size of buf */
uint32_t len; /* size of submitted io */
int done;
int result;
};
/*
* Support for external device info.
*/
@@ -144,4 +167,12 @@ void dev_destroy_file(struct device *dev);
/* Return a valid device name from the alias list; NULL otherwise */
const char *dev_name_confirmed(struct device *dev, int quiet);
struct dev_async_context *dev_async_context_setup(unsigned async_event_count);
struct dev_async_io *dev_async_io_alloc(int buf_len);
void dev_async_context_destroy(struct dev_async_context *ac);
void dev_async_io_destroy(struct dev_async_io *aio);
int dev_async_read_submit(struct dev_async_context *ac, struct dev_async_io *aio,
struct device *dev, uint32_t len, uint64_t offset, int *nospace);
int dev_async_getevents(struct dev_async_context *ac, int wait_count, struct timespec *timeout);
#endif

@@ -180,9 +180,9 @@ out:
static struct volume_group *_format1_vg_read(struct format_instance *fid,
const char *vg_name,
struct metadata_area *mda __attribute__((unused)),
struct label_read_data *ld __attribute__((unused)),
struct cached_vg_fmtdata **vg_fmtdata __attribute__((unused)),
unsigned *use_previous_vg __attribute__((unused)),
int single_device __attribute__((unused)))
unsigned *use_previous_vg __attribute__((unused)))
{
struct volume_group *vg;
struct disk_list *dl;

@@ -55,6 +55,7 @@ static int _lvm1_write(struct label *label __attribute__((unused)), void *buf __
}
static int _lvm1_read(struct labeller *l, struct device *dev, void *buf,
struct label_read_data *ld,
struct label **label)
{
struct pv_disk *pvd = (struct pv_disk *) buf;

@@ -101,9 +101,9 @@ static int _check_usp(const char *vgname, struct user_subpool *usp, int sp_count
static struct volume_group *_pool_vg_read(struct format_instance *fid,
const char *vg_name,
struct metadata_area *mda __attribute__((unused)),
struct label_read_data *ld __attribute__((unused)),
struct cached_vg_fmtdata **vg_fmtdata __attribute__((unused)),
unsigned *use_previous_vg __attribute__((unused)),
int single_device __attribute__((unused)))
unsigned *use_previous_vg __attribute__((unused)))
{
struct volume_group *vg;
struct user_subpool *usp;

View File

@@ -56,6 +56,7 @@ static int _pool_write(struct label *label __attribute__((unused)), void *buf __
}
static int _pool_read(struct labeller *l, struct device *dev, void *buf,
struct label_read_data *ld,
struct label **label)
{
struct pool_list pl;

@@ -321,7 +321,7 @@ static void _display_archive(struct cmd_context *cmd, struct archive_file *af)
* retrieve the archive time and description.
*/
/* FIXME Use variation on _vg_read */
if (!(vg = text_vg_import_file(tf, af->path, &when, &desc))) {
if (!(vg = text_read_metadata_file(tf, af->path, &when, &desc))) {
log_error("Unable to read archive file.");
tf->fmt->ops->destroy_instance(tf);
return;

@@ -320,7 +320,7 @@ struct volume_group *backup_read_vg(struct cmd_context *cmd,
}
dm_list_iterate_items(mda, &tf->metadata_areas_in_use) {
if (!(vg = mda->ops->vg_read(tf, vg_name, mda, NULL, NULL, 0)))
if (!(vg = mda->ops->vg_read(tf, vg_name, mda, NULL, NULL, NULL)))
stack;
break;
}

@@ -190,7 +190,7 @@ static int _pv_analyze_mda_raw (const struct format_type * fmt,
if (!dev_open_readonly(area->dev))
return_0;
if (!(mdah = raw_read_mda_header(fmt, area)))
if (!(mdah = raw_read_mda_header(fmt, area, NULL)))
goto_out;
rlocn = mdah->raw_locns;
@@ -316,15 +316,26 @@ static void _xlate_mdah(struct mda_header *mdah)
}
}
static int _raw_read_mda_header(struct mda_header *mdah, struct device_area *dev_area)
static int _raw_read_mda_header(struct mda_header *mdah, struct device_area *dev_area,
struct label_read_data *ld)
{
if (!dev_open_readonly(dev_area->dev))
return_0;
if (!dev_read(dev_area->dev, dev_area->start, MDA_HEADER_SIZE, mdah)) {
if (!dev_close(dev_area->dev))
stack;
return_0;
if (!ld || (ld->buf_len < dev_area->start + MDA_HEADER_SIZE)) {
log_debug_metadata("Reading mda header sector from %s at %llu",
dev_name(dev_area->dev), (unsigned long long)dev_area->start);
if (!dev_read(dev_area->dev, dev_area->start, MDA_HEADER_SIZE, mdah)) {
if (!dev_close(dev_area->dev))
stack;
return_0;
}
} else {
log_debug_metadata("Copying mda header sector from %s buffer at %llu",
dev_name(dev_area->dev), (unsigned long long)dev_area->start);
memcpy(mdah, ld->buf + dev_area->start, MDA_HEADER_SIZE);
}
if (!dev_close(dev_area->dev))
@@ -366,7 +377,8 @@ static int _raw_read_mda_header(struct mda_header *mdah, struct device_area *dev
}
struct mda_header *raw_read_mda_header(const struct format_type *fmt,
struct device_area *dev_area)
struct device_area *dev_area,
struct label_read_data *ld)
{
struct mda_header *mdah;
@@ -375,7 +387,7 @@ struct mda_header *raw_read_mda_header(const struct format_type *fmt,
return NULL;
}
if (!_raw_read_mda_header(mdah, dev_area)) {
if (!_raw_read_mda_header(mdah, dev_area, ld)) {
dm_pool_free(fmt->cmd->mem, mdah);
return NULL;
}
@@ -402,8 +414,14 @@ static int _raw_write_mda_header(const struct format_type *fmt,
return 1;
}
static struct raw_locn *_find_vg_rlocn(struct device_area *dev_area,
/*
* FIXME: unify this with read_metadata_location() which is used
* in the label scanning path.
*/
static struct raw_locn *_read_metadata_location_vg(struct device_area *dev_area,
struct mda_header *mdah,
struct label_read_data *ld,
const char *vgname,
int *precommitted)
{
@@ -438,11 +456,20 @@ static struct raw_locn *_find_vg_rlocn(struct device_area *dev_area,
if (!*vgname)
return rlocn;
/* FIXME Loop through rlocns two-at-a-time. List null-terminated. */
/* FIXME Ignore if checksum incorrect!!! */
if (!dev_read(dev_area->dev, dev_area->start + rlocn->offset,
sizeof(vgnamebuf), vgnamebuf))
goto_bad;
/*
* Verify that the VG metadata pointed to by the rlocn
* begins with a valid vgname.
*/
if (!ld || (ld->buf_len < dev_area->start + rlocn->offset + NAME_LEN)) {
/* FIXME Loop through rlocns two-at-a-time. List null-terminated. */
/* FIXME Ignore if checksum incorrect!!! */
if (!dev_read(dev_area->dev, dev_area->start + rlocn->offset,
sizeof(vgnamebuf), vgnamebuf))
goto_bad;
} else {
memset(vgnamebuf, 0, sizeof(vgnamebuf));
memcpy(vgnamebuf, ld->buf + dev_area->start + rlocn->offset, NAME_LEN);
}
if (!strncmp(vgnamebuf, vgname, len = strlen(vgname)) &&
(isspace(vgnamebuf[len]) || vgnamebuf[len] == '{'))
@@ -488,10 +515,10 @@ static int _raw_holds_vgname(struct format_instance *fid,
if (!dev_open_readonly(dev_area->dev))
return_0;
if (!(mdah = raw_read_mda_header(fid->fmt, dev_area)))
if (!(mdah = raw_read_mda_header(fid->fmt, dev_area, NULL)))
return_0;
if (_find_vg_rlocn(dev_area, mdah, vgname, &noprecommit))
if (_read_metadata_location_vg(dev_area, mdah, NULL, vgname, &noprecommit))
r = 1;
if (!dev_close(dev_area->dev))
@@ -503,10 +530,10 @@ static int _raw_holds_vgname(struct format_instance *fid,
static struct volume_group *_vg_read_raw_area(struct format_instance *fid,
const char *vgname,
struct device_area *area,
struct label_read_data *ld,
struct cached_vg_fmtdata **vg_fmtdata,
unsigned *use_previous_vg,
int precommitted,
int single_device)
int precommitted)
{
struct volume_group *vg = NULL;
struct raw_locn *rlocn;
@@ -515,10 +542,10 @@ static struct volume_group *_vg_read_raw_area(struct format_instance *fid,
char *desc;
uint32_t wrap = 0;
if (!(mdah = raw_read_mda_header(fid->fmt, area)))
if (!(mdah = raw_read_mda_header(fid->fmt, area, ld)))
goto_out;
if (!(rlocn = _find_vg_rlocn(area, mdah, vgname, &precommitted))) {
if (!(rlocn = _read_metadata_location_vg(area, mdah, ld, vgname, &precommitted))) {
log_debug_metadata("VG %s not found on %s", vgname, dev_name(area->dev));
goto out;
}
@@ -532,25 +559,25 @@ static struct volume_group *_vg_read_raw_area(struct format_instance *fid,
goto out;
}
/* FIXME 64-bit */
if (!(vg = text_vg_import_fd(fid, NULL, vg_fmtdata, use_previous_vg, single_device, area->dev,
(off_t) (area->start + rlocn->offset),
(uint32_t) (rlocn->size - wrap),
(off_t) (area->start + MDA_HEADER_SIZE),
wrap, calc_crc, rlocn->checksum, &when,
&desc)) && (!use_previous_vg || !*use_previous_vg))
goto_out;
vg = text_read_metadata(fid, area->dev, NULL, ld, vg_fmtdata, use_previous_vg,
(off_t) (area->start + rlocn->offset),
(uint32_t) (rlocn->size - wrap),
(off_t) (area->start + MDA_HEADER_SIZE),
wrap,
calc_crc,
rlocn->checksum,
&when, &desc);
if (vg)
log_debug_metadata("Read %s %smetadata (%u) from %s at %" PRIu64 " size %"
PRIu64, vg->name, precommitted ? "pre-commit " : "",
vg->seqno, dev_name(area->dev),
area->start + rlocn->offset, rlocn->size);
else
log_debug_metadata("Skipped reading %smetadata from %s at %" PRIu64 " size %"
PRIu64 " with matching checksum.", precommitted ? "pre-commit " : "",
dev_name(area->dev),
area->start + rlocn->offset, rlocn->size);
if (!vg) {
/* FIXME: detect and handle errors, and distinguish from the optimization
that skips parsing the metadata which also returns NULL. */
}
log_debug_metadata("Found metadata on %s at %"PRIu64" size %"PRIu64" for VG %s",
dev_name(area->dev),
area->start + rlocn->offset,
rlocn->size,
vgname);
if (vg && precommitted)
vg->status |= PRECOMMITTED;
@@ -562,9 +589,9 @@ static struct volume_group *_vg_read_raw_area(struct format_instance *fid,
static struct volume_group *_vg_read_raw(struct format_instance *fid,
const char *vgname,
struct metadata_area *mda,
struct label_read_data *ld,
struct cached_vg_fmtdata **vg_fmtdata,
unsigned *use_previous_vg,
int single_device)
unsigned *use_previous_vg)
{
struct mda_context *mdac = (struct mda_context *) mda->metadata_locn;
struct volume_group *vg;
@@ -572,7 +599,7 @@ static struct volume_group *_vg_read_raw(struct format_instance *fid,
if (!dev_open_readonly(mdac->area.dev))
return_NULL;
vg = _vg_read_raw_area(fid, vgname, &mdac->area, vg_fmtdata, use_previous_vg, 0, single_device);
vg = _vg_read_raw_area(fid, vgname, &mdac->area, ld, vg_fmtdata, use_previous_vg, 0);
if (!dev_close(mdac->area.dev))
stack;
@@ -583,6 +610,7 @@ static struct volume_group *_vg_read_raw(struct format_instance *fid,
static struct volume_group *_vg_read_precommit_raw(struct format_instance *fid,
const char *vgname,
struct metadata_area *mda,
struct label_read_data *ld,
struct cached_vg_fmtdata **vg_fmtdata,
unsigned *use_previous_vg)
{
@@ -592,7 +620,7 @@ static struct volume_group *_vg_read_precommit_raw(struct format_instance *fid,
if (!dev_open_readonly(mdac->area.dev))
return_NULL;
vg = _vg_read_raw_area(fid, vgname, &mdac->area, vg_fmtdata, use_previous_vg, 1, 0);
vg = _vg_read_raw_area(fid, vgname, &mdac->area, ld, vg_fmtdata, use_previous_vg, 1);
if (!dev_close(mdac->area.dev))
stack;
@@ -630,10 +658,10 @@ static int _vg_write_raw(struct format_instance *fid, struct volume_group *vg,
if (!dev_open(mdac->area.dev))
return_0;
if (!(mdah = raw_read_mda_header(fid->fmt, &mdac->area)))
if (!(mdah = raw_read_mda_header(fid->fmt, &mdac->area, NULL)))
goto_out;
rlocn = _find_vg_rlocn(&mdac->area, mdah, old_vg_name ? : vg->name, &noprecommit);
rlocn = _read_metadata_location_vg(&mdac->area, mdah, NULL, old_vg_name ? : vg->name, &noprecommit);
mdac->rlocn.offset = _next_rlocn_offset(rlocn, mdah);
if (!fidtc->raw_metadata_buf &&
@@ -736,10 +764,10 @@ static int _vg_commit_raw_rlocn(struct format_instance *fid,
if (!found)
return 1;
if (!(mdah = raw_read_mda_header(fid->fmt, &mdac->area)))
if (!(mdah = raw_read_mda_header(fid->fmt, &mdac->area, NULL)))
goto_out;
if (!(rlocn = _find_vg_rlocn(&mdac->area, mdah, old_vg_name ? : vg->name, &noprecommit))) {
if (!(rlocn = _read_metadata_location_vg(&mdac->area, mdah, NULL, old_vg_name ? : vg->name, &noprecommit))) {
mdah->raw_locns[0].offset = 0;
mdah->raw_locns[0].size = 0;
mdah->raw_locns[0].checksum = 0;
@@ -846,10 +874,10 @@ static int _vg_remove_raw(struct format_instance *fid, struct volume_group *vg,
if (!dev_open(mdac->area.dev))
return_0;
if (!(mdah = raw_read_mda_header(fid->fmt, &mdac->area)))
if (!(mdah = raw_read_mda_header(fid->fmt, &mdac->area, NULL)))
goto_out;
if (!(rlocn = _find_vg_rlocn(&mdac->area, mdah, vg->name, &noprecommit))) {
if (!(rlocn = _read_metadata_location_vg(&mdac->area, mdah, NULL, vg->name, &noprecommit))) {
rlocn = &mdah->raw_locns[0];
mdah->raw_locns[1].offset = 0;
}
@@ -883,8 +911,10 @@ static struct volume_group *_vg_read_file_name(struct format_instance *fid,
time_t when;
char *desc;
if (!(vg = text_vg_import_file(fid, read_path, &when, &desc)))
return_NULL;
if (!(vg = text_read_metadata_file(fid, read_path, &when, &desc))) {
log_error("Failed to read VG %s from %s", vgname, read_path);
return NULL;
}
/*
* Currently you can only have a single volume group per
@@ -907,9 +937,9 @@ static struct volume_group *_vg_read_file_name(struct format_instance *fid,
static struct volume_group *_vg_read_file(struct format_instance *fid,
const char *vgname,
struct metadata_area *mda,
struct label_read_data *ld,
struct cached_vg_fmtdata **vg_fmtdata,
unsigned *use_previous_vg __attribute__((unused)),
int single_device __attribute__((unused)))
unsigned *use_previous_vg __attribute__((unused)))
{
struct text_context *tc = (struct text_context *) mda->metadata_locn;
@@ -919,6 +949,7 @@ static struct volume_group *_vg_read_file(struct format_instance *fid,
static struct volume_group *_vg_read_precommit_file(struct format_instance *fid,
const char *vgname,
struct metadata_area *mda,
struct label_read_data *ld,
struct cached_vg_fmtdata **vg_fmtdata,
unsigned *use_previous_vg __attribute__((unused)))
{
@@ -1092,6 +1123,8 @@ static int _vg_remove_file(struct format_instance *fid __attribute__((unused)),
return 1;
}
/* used for independent_metadata_areas */
static int _scan_file(const struct format_type *fmt, const char *vgname)
{
struct dirent *dirent;
@@ -1107,6 +1140,9 @@ static int _scan_file(const struct format_type *fmt, const char *vgname)
dir_list = &((struct mda_lists *) fmt->private)->dirs;
if (!dm_list_empty(dir_list))
log_debug_metadata("Scanning independent files for %s", vgname ? vgname : "VGs");
dm_list_iterate_items(dl, dir_list) {
if (!(d = opendir(dl->dir))) {
log_sys_error("opendir", dl->dir);
@@ -1139,10 +1175,14 @@ static int _scan_file(const struct format_type *fmt, const char *vgname)
stack;
break;
}
log_debug_metadata("Scanning independent file %s for VG %s", path, scanned_vgname);
if ((vg = _vg_read_file_name(fid, scanned_vgname,
path))) {
/* FIXME Store creation host in vg */
lvmcache_update_vg(vg, 0);
lvmcache_set_independent_location(vg->name);
release_vg(vg);
}
}
@@ -1154,8 +1194,9 @@ static int _scan_file(const struct format_type *fmt, const char *vgname)
return 1;
}
int vgname_from_mda(const struct format_type *fmt,
struct mda_header *mdah, struct device_area *dev_area,
int read_metadata_location(const struct format_type *fmt,
struct mda_header *mdah, struct label_read_data *ld,
struct device_area *dev_area,
struct lvmcache_vgsummary *vgsummary, uint64_t *mda_free_sectors)
{
struct raw_locn *rlocn;
@@ -1163,13 +1204,12 @@ int vgname_from_mda(const struct format_type *fmt,
unsigned int len = 0;
char buf[NAME_LEN + 1] __attribute__((aligned(8)));
uint64_t buffer_size, current_usage;
unsigned used_cached_metadata = 0;
if (mda_free_sectors)
*mda_free_sectors = ((dev_area->size - MDA_HEADER_SIZE) / 2) >> SECTOR_SHIFT;
if (!mdah) {
log_error(INTERNAL_ERROR "vgname_from_mda called with NULL pointer for mda_header");
log_error(INTERNAL_ERROR "read_metadata_location called with NULL pointer for mda_header");
return 0;
}
@@ -1185,10 +1225,16 @@ int vgname_from_mda(const struct format_type *fmt,
return 0;
}
/* Do quick check for a vgname */
if (!dev_read(dev_area->dev, dev_area->start + rlocn->offset,
NAME_LEN, buf))
return_0;
/*
* Verify that the VG metadata pointed to by the rlocn
* begins with a valid vgname.
*/
if (!ld || (ld->buf_len < dev_area->start + rlocn->offset + NAME_LEN)) {
if (!dev_read(dev_area->dev, dev_area->start + rlocn->offset, NAME_LEN, buf))
return_0;
} else {
memcpy(buf, ld->buf + dev_area->start + rlocn->offset, NAME_LEN);
}
while (buf[len] && !isspace(buf[len]) && buf[len] != '{' &&
len < (NAME_LEN - 1))
@@ -1214,30 +1260,25 @@ int vgname_from_mda(const struct format_type *fmt,
vgsummary->mda_checksum = rlocn->checksum;
vgsummary->mda_size = rlocn->size;
if (lvmcache_lookup_mda(vgsummary))
used_cached_metadata = 1;
/* FIXME 64-bit */
if (!text_vgsummary_import(fmt, dev_area->dev,
if (!text_read_metadata_summary(fmt, dev_area->dev, ld,
(off_t) (dev_area->start + rlocn->offset),
(uint32_t) (rlocn->size - wrap),
(off_t) (dev_area->start + MDA_HEADER_SIZE),
wrap, calc_crc, vgsummary->vgname ? 1 : 0,
vgsummary))
vgsummary)) {
/* FIXME: detect and handle errors */
return_0;
}
/* Ignore this entry if the characters aren't permissible */
if (!validate_name(vgsummary->vgname))
return_0;
log_debug_metadata("%s: %s metadata at %" PRIu64 " size %" PRIu64
" (in area at %" PRIu64 " size %" PRIu64
") for %s (" FMTVGID ")",
log_debug_metadata("Read metadata summary from %s at %"PRIu64" size %"PRIu64" for VG %s",
dev_name(dev_area->dev),
used_cached_metadata ? "Using cached" : "Found",
dev_area->start + rlocn->offset,
rlocn->size, dev_area->start, dev_area->size, vgsummary->vgname,
(char *)&vgsummary->vgid);
rlocn->size,
vgsummary->vgname);
if (mda_free_sectors) {
current_usage = (rlocn->size + SECTOR_SIZE - UINT64_C(1)) -
@@ -1253,6 +1294,8 @@ int vgname_from_mda(const struct format_type *fmt,
return 1;
}
/* used for independent_metadata_areas */
static int _scan_raw(const struct format_type *fmt, const char *vgname __attribute__((unused)))
{
struct raw_list *rl;
@@ -1264,27 +1307,34 @@ static int _scan_raw(const struct format_type *fmt, const char *vgname __attribu
raw_list = &((struct mda_lists *) fmt->private)->raws;
if (!dm_list_empty(raw_list))
log_debug_metadata("Scanning independent raw locations for %s", vgname ? vgname : "VGs");
fid.fmt = fmt;
dm_list_init(&fid.metadata_areas_in_use);
dm_list_init(&fid.metadata_areas_ignored);
dm_list_iterate_items(rl, raw_list) {
log_debug_metadata("Scanning independent dev %s", dev_name(rl->dev_area.dev));
/* FIXME We're reading mdah twice here... */
if (!dev_open_readonly(rl->dev_area.dev)) {
stack;
continue;
}
if (!(mdah = raw_read_mda_header(fmt, &rl->dev_area))) {
if (!(mdah = raw_read_mda_header(fmt, &rl->dev_area, NULL))) {
stack;
goto close_dev;
}
/* TODO: caching as in vgname_from_mda() (trigger this code?) */
if (vgname_from_mda(fmt, mdah, &rl->dev_area, &vgsummary, NULL)) {
vg = _vg_read_raw_area(&fid, vgsummary.vgname, &rl->dev_area, NULL, NULL, 0, 0);
if (vg)
/* TODO: caching as in read_metadata_location() (trigger this code?) */
if (read_metadata_location(fmt, mdah, NULL, &rl->dev_area, &vgsummary, NULL)) {
vg = _vg_read_raw_area(&fid, vgsummary.vgname, &rl->dev_area, NULL, NULL, NULL, 0);
if (vg) {
lvmcache_update_vg(vg, 0);
lvmcache_set_independent_location(vg->name);
}
}
close_dev:
if (!dev_close(rl->dev_area.dev))
@@ -1294,9 +1344,13 @@ static int _scan_raw(const struct format_type *fmt, const char *vgname __attribu
return 1;
}
/* used for independent_metadata_areas */
static int _text_scan(const struct format_type *fmt, const char *vgname)
{
return (_scan_file(fmt, vgname) & _scan_raw(fmt, vgname));
_scan_file(fmt, vgname);
_scan_raw(fmt, vgname);
return 1;
}
struct _write_single_mda_baton {
@@ -1748,15 +1802,18 @@ static struct metadata_area_ops _metadata_text_raw_ops = {
.mda_import_text = _mda_import_text_raw
};
/* used only for sending info to lvmetad */
static int _mda_export_text_raw(struct metadata_area *mda,
struct dm_config_tree *cft,
struct dm_config_node *parent)
{
struct mda_context *mdc = (struct mda_context *) mda->metadata_locn;
char mdah[MDA_HEADER_SIZE]; /* temporary */
if (!mdc || !_raw_read_mda_header((struct mda_header *)mdah, &mdc->area))
if (!mdc) {
log_error(INTERNAL_ERROR "mda_export_text_raw no mdc");
return 1; /* pretend the MDA does not exist */
}
return config_make_nodes(cft, parent, NULL,
"ignore = %" PRId64, (int64_t) mda_is_ignored(mda),
@@ -1766,6 +1823,8 @@ static int _mda_export_text_raw(struct metadata_area *mda,
NULL) ? 1 : 0;
}
/* used only for receiving info from lvmetad */
static int _mda_import_text_raw(struct lvmcache_info *info, const struct dm_config_node *cn)
{
struct device *device;
@@ -1995,22 +2054,6 @@ static int _create_vg_text_instance(struct format_instance *fid,
}
if (type & FMT_INSTANCE_MDAS) {
/*
* TODO in theory, this function should be never reached
* while in critical_section(), because lvmcache's
* cached_vg should be valid. However, this assumption
* sometimes fails (possibly due to inconsistent
* (precommit) metadata and/or missing devices), and
* calling lvmcache_label_scan inside the critical
* section may be fatal (i.e. deadlock).
*/
if (!critical_section())
/* Scan PVs in VG for any further MDAs */
/*
* FIXME Only scan PVs believed to be in the VG.
*/
lvmcache_label_scan(fid->fmt->cmd);
if (!(vginfo = lvmcache_vginfo_from_vgname(vg_name, vg_id)))
goto_out;
if (!lvmcache_fid_add_mdas_vg(vginfo, fid))
@@ -2480,7 +2523,7 @@ static int _get_config_disk_area(struct cmd_context *cmd,
return 0;
}
if (!(dev_area.dev = lvmcache_device_from_pvid(cmd, &id, NULL, NULL))) {
if (!(dev_area.dev = lvmcache_device_from_pvid(cmd, &id, NULL))) {
char buffer[64] __attribute__((aligned(8)));
if (!id_write_format(&id, buffer, sizeof(buffer)))
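
The quick vgname check in read_metadata_location() above walks the first bytes of the metadata text until it reaches a NUL, whitespace, or an opening brace. A minimal standalone sketch of that scan follows. The helper name leading_name_len() and the caller-supplied limit (standing in for NAME_LEN - 1) are assumptions for illustration and are not part of lvm2.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, not an lvm2 function: count the bytes at the
 * start of buf that could belong to a VG name. The scan stops at a NUL,
 * at whitespace, at '{', or after max_len bytes, whichever comes first.
 */
static size_t leading_name_len(const char *buf, size_t max_len)
{
	size_t len = 0;

	assert(buf);

	/* Test the bound before reading buf[len]. */
	while (len < max_len && buf[len] &&
	       !isspace((unsigned char) buf[len]) && buf[len] != '{')
		len++;

	return len;
}
```

The cast to unsigned char keeps isspace() well defined for bytes above 0x7f, which matters when the buffer holds arbitrary on-disk data rather than known text.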

View File

@@ -49,7 +49,6 @@ struct text_vg_version_ops {
int (*check_version) (const struct dm_config_tree * cf);
struct volume_group *(*read_vg) (struct format_instance * fid,
const struct dm_config_tree *cf,
unsigned use_cached_pvs,
unsigned allow_lvmetad_extensions);
void (*read_desc) (struct dm_pool * mem, const struct dm_config_tree *cf,
time_t *when, char **desc);
@@ -68,23 +67,24 @@ int read_segtype_lvflags(uint64_t *status, char *segtype_str);
int text_vg_export_file(struct volume_group *vg, const char *desc, FILE *fp);
size_t text_vg_export_raw(struct volume_group *vg, const char *desc, char **buf);
struct volume_group *text_vg_import_file(struct format_instance *fid,
struct volume_group *text_read_metadata_file(struct format_instance *fid,
const char *file,
time_t *when, char **desc);
struct volume_group *text_vg_import_fd(struct format_instance *fid,
struct volume_group *text_read_metadata(struct format_instance *fid,
struct device *dev,
const char *file,
struct label_read_data *ld,
struct cached_vg_fmtdata **vg_fmtdata,
unsigned *use_previous_vg,
int single_device,
struct device *dev,
off_t offset, uint32_t size,
off_t offset2, uint32_t size2,
checksum_fn_t checksum_fn,
uint32_t checksum,
time_t *when, char **desc);
int text_vgsummary_import(const struct format_type *fmt,
int text_read_metadata_summary(const struct format_type *fmt,
struct device *dev,
struct label_read_data *ld,
off_t offset, uint32_t size,
off_t offset2, uint32_t size2,
checksum_fn_t checksum_fn,

View File

@@ -35,8 +35,9 @@ static void _init_text_import(void)
/*
* Find out vgname on a given device.
*/
int text_vgsummary_import(const struct format_type *fmt,
int text_read_metadata_summary(const struct format_type *fmt,
struct device *dev,
struct label_read_data *ld,
off_t offset, uint32_t size,
off_t offset2, uint32_t size2,
checksum_fn_t checksum_fn,
@@ -45,20 +46,52 @@ int text_vgsummary_import(const struct format_type *fmt,
{
struct dm_config_tree *cft;
struct text_vg_version_ops **vsn;
char *buf = NULL;
int r = 0;
if (ld) {
if (ld->buf_len >= (offset + size))
buf = ld->buf;
else {
/*
* Needs data beyond the end of the ld buffer.
* Will do a new synchronous read to get the data.
* (scan_size could also be made larger.)
*/
log_debug_metadata("label scan buffer for %s too small %u for metadata offset %llu size %u",
dev_name(dev), ld->buf_len, (unsigned long long)offset, size);
buf = NULL;
}
}
_init_text_import();
if (!(cft = config_open(CONFIG_FILE_SPECIAL, NULL, 0)))
return_0;
if ((!dev && !config_file_read(cft)) ||
(dev && !config_file_read_fd(cft, dev, offset, size,
if (dev) {
if (buf)
log_debug_metadata("Copying metadata summary for %s at %llu size %d (+%d)",
dev_name(dev), (unsigned long long)offset,
size, size2);
else
log_debug_metadata("Reading metadata summary from %s at %llu size %d (+%d)",
dev_name(dev), (unsigned long long)offset,
size, size2);
if (!config_file_read_fd(cft, dev, buf, offset, size,
offset2, size2, checksum_fn,
vgsummary->mda_checksum,
checksum_only, 1))) {
log_error("Couldn't read volume group metadata.");
goto out;
checksum_only, 1)) {
/* FIXME: handle errors */
log_error("Couldn't read volume group metadata from %s.", dev_name(dev));
goto out;
}
} else {
if (!config_file_read(cft)) {
log_error("Couldn't read volume group metadata from file.");
goto out;
}
}
if (checksum_only) {
@@ -91,12 +124,12 @@ struct cached_vg_fmtdata {
size_t cached_mda_size;
};
struct volume_group *text_vg_import_fd(struct format_instance *fid,
struct volume_group *text_read_metadata(struct format_instance *fid,
struct device *dev,
const char *file,
struct label_read_data *ld,
struct cached_vg_fmtdata **vg_fmtdata,
unsigned *use_previous_vg,
int single_device,
struct device *dev,
off_t offset, uint32_t size,
off_t offset2, uint32_t size2,
checksum_fn_t checksum_fn,
@@ -106,8 +139,18 @@ struct volume_group *text_vg_import_fd(struct format_instance *fid,
struct volume_group *vg = NULL;
struct dm_config_tree *cft;
struct text_vg_version_ops **vsn;
char *buf = NULL;
int skip_parse;
/*
* This struct holds the checksum and size of the VG metadata
* that was read from a previous device. When we read the VG
* metadata from this device, we can skip parsing it into a
* cft (saving time) if the checksum of the metadata buffer
* we read from this device matches the size/checksum saved in
* the mda_header/rlocn struct on this device, and matches the
* size/checksum from the previous device.
*/
if (vg_fmtdata && !*vg_fmtdata &&
!(*vg_fmtdata = dm_pool_zalloc(fid->mem, sizeof(**vg_fmtdata)))) {
log_error("Failed to allocate VG fmtdata for text format.");
@@ -127,15 +170,49 @@ struct volume_group *text_vg_import_fd(struct format_instance *fid,
((*vg_fmtdata)->cached_mda_checksum == checksum) &&
((*vg_fmtdata)->cached_mda_size == (size + size2));
if ((!dev && !config_file_read(cft)) ||
(dev && !config_file_read_fd(cft, dev, offset, size,
if (ld) {
if (ld->buf_len >= (offset + size))
buf = ld->buf;
else {
/*
* Needs data beyond the end of the ld buffer.
* Will do a new synchronous read to get the data.
* (scan_size could also be made larger.)
*/
log_debug_metadata("label scan buffer for %s too small %u for metadata offset %llu size %u",
dev_name(dev), ld->buf_len, (unsigned long long)offset, size);
buf = NULL;
}
}
if (dev) {
if (buf)
log_debug_metadata("Copying metadata for %s at %llu size %d (+%d)",
dev_name(dev), (unsigned long long)offset,
size, size2);
else
log_debug_metadata("Reading metadata from %s at %llu size %d (+%d)",
dev_name(dev), (unsigned long long)offset,
size, size2);
if (!config_file_read_fd(cft, dev, buf, offset, size,
offset2, size2, checksum_fn, checksum,
skip_parse, 1)))
goto_out;
skip_parse, 1)) {
/* FIXME: handle errors */
log_error("Couldn't read volume group metadata from %s.", dev_name(dev));
goto out;
}
} else {
if (!config_file_read(cft)) {
log_error("Couldn't read volume group metadata from file.");
goto out;
}
}
if (skip_parse) {
if (use_previous_vg)
*use_previous_vg = 1;
log_debug_metadata("Skipped parsing metadata on %s", dev_name(dev));
goto out;
}
@@ -146,7 +223,7 @@ struct volume_group *text_vg_import_fd(struct format_instance *fid,
if (!(*vsn)->check_version(cft))
continue;
if (!(vg = (*vsn)->read_vg(fid, cft, single_device, 0)))
if (!(vg = (*vsn)->read_vg(fid, cft, 0)))
goto_out;
(*vsn)->read_desc(vg->vgmem, cft, when, desc);
@@ -166,17 +243,20 @@ struct volume_group *text_vg_import_fd(struct format_instance *fid,
return vg;
}
struct volume_group *text_vg_import_file(struct format_instance *fid,
struct volume_group *text_read_metadata_file(struct format_instance *fid,
const char *file,
time_t *when, char **desc)
{
return text_vg_import_fd(fid, file, NULL, NULL, 0, NULL, (off_t)0, 0, (off_t)0, 0, NULL, 0,
return text_read_metadata(fid, NULL, file, NULL, NULL, NULL,
(off_t)0, 0, (off_t)0, 0,
NULL,
0,
when, desc);
}
static struct volume_group *_import_vg_from_config_tree(const struct dm_config_tree *cft,
struct format_instance *fid,
unsigned allow_lvmetad_extensions)
unsigned for_lvmetad)
{
struct volume_group *vg = NULL;
struct text_vg_version_ops **vsn;
@@ -191,7 +271,7 @@ static struct volume_group *_import_vg_from_config_tree(const struct dm_config_t
* The only path to this point uses cached vgmetadata,
* so it can use cached PV state too.
*/
if (!(vg = (*vsn)->read_vg(fid, cft, 1, allow_lvmetad_extensions)))
if (!(vg = (*vsn)->read_vg(fid, cft, for_lvmetad)))
stack;
else if ((vg_missing = vg_missing_pv_count(vg))) {
log_verbose("There are %d physical volumes missing.",
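
The comment added to text_read_metadata() above describes skipping the parse when the checksum and size of the metadata on this device match those remembered from a previous device. A minimal sketch of just that comparison follows, assuming a hypothetical struct mda_fingerprint in place of struct cached_vg_fmtdata. The valid flag and both function names are illustrative and do not appear in the diff, and the sketch does not verify the checksum against the buffer contents, which the real code delegates to config_file_read_fd().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct cached_vg_fmtdata. The `valid`
 * flag is an assumption added here to mark "nothing remembered yet".
 */
struct mda_fingerprint {
	uint32_t checksum;
	size_t size;
	int valid;
};

/*
 * Return 1 when the metadata described by (checksum, size + size2)
 * matches what was remembered from a previous device, so the caller
 * may reuse the previously parsed VG instead of parsing again.
 * size2 is the wrapped part of a circular metadata area (0 if none).
 */
static int can_skip_parse(const struct mda_fingerprint *prev,
			  uint32_t checksum, size_t size, size_t size2)
{
	return prev && prev->valid &&
	       prev->checksum == checksum &&
	       prev->size == size + size2;
}

/* Remember the fingerprint of metadata that was actually parsed. */
static void remember_mda(struct mda_fingerprint *prev,
			 uint32_t checksum, size_t size, size_t size2)
{
	assert(prev);
	prev->checksum = checksum;
	prev->size = size + size2;
	prev->valid = 1;
}
```

Comparing the total of size and size2 rather than each part means two copies of the same metadata still match when only one of them wraps around the end of its metadata area.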

View File

@@ -32,9 +32,7 @@ typedef int (*section_fn) (struct format_instance * fid,
struct volume_group * vg, const struct dm_config_node * pvn,
const struct dm_config_node * vgn,
struct dm_hash_table * pv_hash,
struct dm_hash_table * lv_hash,
unsigned *scan_done_once,
unsigned report_missing_devices);
struct dm_hash_table * lv_hash);
#define _read_int32(root, path, result) \
dm_config_get_uint32(root, path, (uint32_t *) (result))
@@ -180,9 +178,7 @@ static int _read_pv(struct format_instance *fid,
struct volume_group *vg, const struct dm_config_node *pvn,
const struct dm_config_node *vgn __attribute__((unused)),
struct dm_hash_table *pv_hash,
struct dm_hash_table *lv_hash __attribute__((unused)),
unsigned *scan_done_once,
unsigned report_missing_devices)
struct dm_hash_table *lv_hash __attribute__((unused)))
{
struct dm_pool *mem = vg->vgmem;
struct physical_volume *pv;
@@ -220,16 +216,16 @@ static int _read_pv(struct format_instance *fid,
/*
* Convert the uuid into a device.
*/
if (!(pv->dev = lvmcache_device_from_pvid(fid->fmt->cmd, &pv->id, scan_done_once,
&pv->label_sector))) {
char buffer[64] __attribute__((aligned(8)));
if (fid->fmt->cmd && !fid->fmt->cmd->pvscan_cache_single) {
if (!(pv->dev = lvmcache_device_from_pvid(fid->fmt->cmd, &pv->id, &pv->label_sector))) {
char buffer[64] __attribute__((aligned(8)));
if (!id_write_format(&pv->id, buffer, sizeof(buffer)))
buffer[0] = '\0';
if (report_missing_devices)
if (!id_write_format(&pv->id, buffer, sizeof(buffer)))
buffer[0] = '\0';
log_error_once("Couldn't find device with uuid %s.", buffer);
else
log_very_verbose("Couldn't find device with uuid %s.", buffer);
}
} else {
log_debug_metadata("Skip metadata pvid to device lookup for lvmetad pvscan.");
}
if (!(pv->vg_name = dm_pool_strdup(mem, vg->name)))
@@ -574,9 +570,7 @@ static int _read_lvnames(struct format_instance *fid __attribute__((unused)),
struct volume_group *vg, const struct dm_config_node *lvn,
const struct dm_config_node *vgn __attribute__((unused)),
struct dm_hash_table *pv_hash __attribute__((unused)),
struct dm_hash_table *lv_hash,
unsigned *scan_done_once __attribute__((unused)),
unsigned report_missing_devices __attribute__((unused)))
struct dm_hash_table *lv_hash)
{
struct dm_pool *mem = vg->vgmem;
struct logical_volume *lv;
@@ -731,9 +725,7 @@ static int _read_historical_lvnames(struct format_instance *fid __attribute__((u
struct volume_group *vg, const struct dm_config_node *hlvn,
const struct dm_config_node *vgn __attribute__((unused)),
struct dm_hash_table *pv_hash __attribute__((unused)),
struct dm_hash_table *lv_hash __attribute__((unused)),
unsigned *scan_done_once __attribute__((unused)),
unsigned report_missing_devices __attribute__((unused)))
struct dm_hash_table *lv_hash __attribute__((unused)))
{
struct dm_pool *mem = vg->vgmem;
struct generic_logical_volume *glv;
@@ -802,9 +794,7 @@ static int _read_historical_lvnames_interconnections(struct format_instance *fid
struct volume_group *vg, const struct dm_config_node *hlvn,
const struct dm_config_node *vgn __attribute__((unused)),
struct dm_hash_table *pv_hash __attribute__((unused)),
struct dm_hash_table *lv_hash __attribute__((unused)),
unsigned *scan_done_once __attribute__((unused)),
unsigned report_missing_devices __attribute__((unused)))
struct dm_hash_table *lv_hash __attribute__((unused)))
{
struct dm_pool *mem = vg->vgmem;
const char *historical_lv_name, *origin_name = NULL;
@@ -914,9 +904,7 @@ static int _read_lvsegs(struct format_instance *fid,
struct volume_group *vg, const struct dm_config_node *lvn,
const struct dm_config_node *vgn __attribute__((unused)),
struct dm_hash_table *pv_hash,
struct dm_hash_table *lv_hash,
unsigned *scan_done_once __attribute__((unused)),
unsigned report_missing_devices __attribute__((unused)))
struct dm_hash_table *lv_hash)
{
struct logical_volume *lv;
@@ -977,12 +965,9 @@ static int _read_sections(struct format_instance *fid,
struct volume_group *vg, const struct dm_config_node *vgn,
struct dm_hash_table *pv_hash,
struct dm_hash_table *lv_hash,
int optional,
unsigned *scan_done_once)
int optional)
{
const struct dm_config_node *n;
/* Only report missing devices when doing a scan */
unsigned report_missing_devices = scan_done_once ? !*scan_done_once : 1;
if (!dm_config_get_section(vgn, section, &n)) {
if (!optional) {
@@ -994,8 +979,7 @@ static int _read_sections(struct format_instance *fid,
}
for (n = n->child; n; n = n->sib) {
if (!fn(fid, vg, n, vgn, pv_hash, lv_hash,
scan_done_once, report_missing_devices))
if (!fn(fid, vg, n, vgn, pv_hash, lv_hash))
return_0;
}
@@ -1004,15 +988,13 @@ static int _read_sections(struct format_instance *fid,
static struct volume_group *_read_vg(struct format_instance *fid,
const struct dm_config_tree *cft,
unsigned use_cached_pvs,
unsigned allow_lvmetad_extensions)
unsigned for_lvmetad)
{
const struct dm_config_node *vgn;
const struct dm_config_value *cv;
const char *str, *format_str, *system_id;
struct volume_group *vg;
struct dm_hash_table *pv_hash = NULL, *lv_hash = NULL;
unsigned scan_done_once = use_cached_pvs;
uint64_t vgstatus;
/* skip any top-level values */
@@ -1167,15 +1149,15 @@ static struct volume_group *_read_vg(struct format_instance *fid,
}
if (!_read_sections(fid, "physical_volumes", _read_pv, vg,
vgn, pv_hash, lv_hash, 0, &scan_done_once)) {
vgn, pv_hash, lv_hash, 0)) {
log_error("Couldn't find all physical volumes for volume "
"group %s.", vg->name);
goto bad;
}
if (allow_lvmetad_extensions)
if (for_lvmetad)
_read_sections(fid, "outdated_pvs", _read_pv, vg,
vgn, pv_hash, lv_hash, 1, &scan_done_once);
vgn, pv_hash, lv_hash, 1);
else if (dm_config_has_node(vgn, "outdated_pvs"))
log_error(INTERNAL_ERROR "Unexpected outdated_pvs section in metadata of VG %s.", vg->name);
@@ -1187,28 +1169,28 @@ static struct volume_group *_read_vg(struct format_instance *fid,
}
if (!_read_sections(fid, "logical_volumes", _read_lvnames, vg,
vgn, pv_hash, lv_hash, 1, NULL)) {
vgn, pv_hash, lv_hash, 1)) {
log_error("Couldn't read all logical volume names for volume "
"group %s.", vg->name);
goto bad;
}
if (!_read_sections(fid, "historical_logical_volumes", _read_historical_lvnames, vg,
vgn, pv_hash, lv_hash, 1, NULL)) {
vgn, pv_hash, lv_hash, 1)) {
log_error("Couldn't read all historical logical volumes for volume "
"group %s.", vg->name);
goto bad;
}
if (!_read_sections(fid, "logical_volumes", _read_lvsegs, vg,
vgn, pv_hash, lv_hash, 1, NULL)) {
vgn, pv_hash, lv_hash, 1)) {
log_error("Couldn't read all logical volumes for "
"volume group %s.", vg->name);
goto bad;
}
if (!_read_sections(fid, "historical_logical_volumes", _read_historical_lvnames_interconnections,
vg, vgn, pv_hash, lv_hash, 1, NULL)) {
vg, vgn, pv_hash, lv_hash, 1)) {
log_error("Couldn't read all removed logical volume interconnections "
"for volume group %s.", vg->name);
goto bad;

View File

@@ -81,7 +81,8 @@ struct mda_header {
} __attribute__ ((packed));
struct mda_header *raw_read_mda_header(const struct format_type *fmt,
struct device_area *dev_area);
struct device_area *dev_area,
struct label_read_data *ld);
struct mda_lists {
struct dm_list dirs;
@@ -103,7 +104,8 @@ struct mda_context {
#define LVM2_LABEL "LVM2 001"
#define MDA_SIZE_MIN (8 * (unsigned) lvm_getpagesize())
int vgname_from_mda(const struct format_type *fmt, struct mda_header *mdah,
int read_metadata_location(const struct format_type *fmt, struct mda_header *mdah,
struct label_read_data *ld,
struct device_area *dev_area, struct lvmcache_vgsummary *vgsummary,
uint64_t *mda_free_sectors);

View File

@@ -308,14 +308,15 @@ static int _text_initialise_label(struct labeller *l __attribute__((unused)),
return 1;
}
struct _update_mda_baton {
struct _mda_baton {
struct lvmcache_info *info;
struct label *label;
struct label_read_data *ld;
};
static int _update_mda(struct metadata_area *mda, void *baton)
static int _read_mda_header_and_metadata(struct metadata_area *mda, void *baton)
{
struct _update_mda_baton *p = baton;
struct _mda_baton *p = baton;
const struct format_type *fmt = p->label->labeller->fmt;
struct mda_context *mdac = (struct mda_context *) mda->metadata_locn;
struct mda_header *mdah;
@@ -334,7 +335,7 @@ static int _update_mda(struct metadata_area *mda, void *baton)
return 1;
}
if (!(mdah = raw_read_mda_header(fmt, &mdac->area))) {
if (!(mdah = raw_read_mda_header(fmt, &mdac->area, p->ld))) {
stack;
goto close_dev;
}
@@ -350,7 +351,7 @@ static int _update_mda(struct metadata_area *mda, void *baton)
return 1;
}
if (vgname_from_mda(fmt, mdah, &mdac->area, &vgsummary,
if (read_metadata_location(fmt, mdah, p->ld, &mdac->area, &vgsummary,
&mdac->free_sectors) &&
!lvmcache_update_vgname_and_id(p->info, &vgsummary)) {
if (!dev_close(mdac->area.dev))
@@ -365,22 +366,29 @@ close_dev:
return 1;
}
static int _text_read(struct labeller *l, struct device *dev, void *buf,
struct label **label)
/*
* When label_read_data *ld is set, it means that we have read the first
* ld->buf_len bytes of the device and already have that data, so we don't need
* to do any dev_read's (as long as the desired dev_read offset+size is less
* than ld->buf_len).
*/
static int _text_read(struct labeller *l, struct device *dev, void *label_buf,
struct label_read_data *ld, struct label **label)
{
struct label_header *lh = (struct label_header *) buf;
struct label_header *lh = (struct label_header *) label_buf;
struct pv_header *pvhdr;
struct pv_header_extension *pvhdr_ext;
struct lvmcache_info *info;
struct disk_locn *dlocn_xl;
uint64_t offset;
uint32_t ext_version;
struct _update_mda_baton baton;
struct _mda_baton baton;
/*
* PV header base
*/
pvhdr = (struct pv_header *) ((char *) buf + xlate32(lh->offset_xl));
pvhdr = (struct pv_header *) ((char *) label_buf + xlate32(lh->offset_xl));
if (!(info = lvmcache_add(l, (char *)pvhdr->pv_uuid, dev,
FMT_TEXT_ORPHAN_VG_NAME,
@@ -436,9 +444,9 @@ static int _text_read(struct labeller *l, struct device *dev, void *buf,
out:
baton.info = info;
baton.label = *label;
baton.ld = ld;
if (!lvmcache_foreach_mda(info, _update_mda, &baton))
return_0;
lvmcache_foreach_mda(info, _read_mda_header_and_metadata, &baton);
lvmcache_make_valid(info);
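
The comment on _text_read() above states the rule used throughout this change: data that falls inside the first ld->buf_len bytes is taken from the scan buffer, and anything beyond it needs a fresh synchronous read. A minimal sketch of that decision follows, assuming a hypothetical struct scan_buf in place of struct label_read_data and a callback in place of dev_read(). None of these names exist in lvm2, and the in-memory reader is only there to make the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct label_read_data (two fields only). */
struct scan_buf {
	const char *buf;	/* first buf_len bytes of the device */
	size_t buf_len;
};

/* Hypothetical callback standing in for dev_read(); returns 1 on success. */
typedef int (*read_fn)(uint64_t offset, size_t len, char *out, void *ctx);

/* Illustrative reader that treats ctx as an in-memory "device". */
static int mem_read(uint64_t offset, size_t len, char *out, void *ctx)
{
	memcpy(out, (const char *) ctx + offset, len);
	return 1;
}

/*
 * Copy len bytes at offset from the scan buffer when the buffer covers
 * the whole range; otherwise fall back to a synchronous read.
 * Returns 1 on success, 0 on failure. *from_buf reports the path taken.
 */
static int read_from_buf_or_dev(const struct scan_buf *sb, uint64_t offset,
				size_t len, char *out,
				read_fn fallback, void *ctx, int *from_buf)
{
	assert(out && from_buf);

	/* Written as two comparisons so offset + len cannot overflow. */
	if (sb && sb->buf && offset <= sb->buf_len &&
	    len <= sb->buf_len - offset) {
		memcpy(out, sb->buf + offset, len);
		*from_buf = 1;
		return 1;
	}

	*from_buf = 0;
	return fallback ? fallback(offset, len, out, ctx) : 0;
}
```

The diff itself compares ld->buf_len against offset + size directly; the subtraction form here is a choice made for the sketch, not a claim about how lvm2 does it.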

File diff suppressed because it is too large.

View File

@@ -18,6 +18,7 @@
#include "uuid.h"
#include "device.h"
#include "toolcontext.h"
#define LABEL_ID "LABELONE"
#define LABEL_SIZE SECTOR_SIZE /* Think very carefully before changing this */
@@ -28,6 +29,17 @@ struct labeller;
void allow_reads_with_lvmetad(void);
struct label_read_data {
struct dev_async_io *aio;
char *buf; /* points to aio->buf */
struct device *dev;
struct dm_list list;
int buf_len; /* same as aio->buf_len */
int result; /* same as aio->result */
int try_sync;
int process_done;
};
/* On disk - 32 bytes */
struct label_header {
int8_t id[8]; /* LABELONE */
@@ -63,7 +75,8 @@ struct label_ops {
* Read a label from a volume.
*/
int (*read) (struct labeller * l, struct device * dev,
void *buf, struct label ** label);
void *label_buf,
struct label_read_data *ld, struct label ** label);
/*
* Additional consistency checks for the paranoid.
@@ -99,11 +112,15 @@ int label_register_handler(struct labeller *handler);
struct labeller *label_get_handler(const char *name);
int label_remove(struct device *dev);
int label_read(struct device *dev, struct label **result,
uint64_t scan_sector);
int label_read(struct device *dev, struct label **label, uint64_t scan_sector);
int label_write(struct device *dev, struct label *label);
int label_verify(struct device *dev);
struct label *label_create(struct labeller *labeller);
void label_destroy(struct label *label);
int label_scan_force(struct cmd_context *cmd);
int label_scan(struct cmd_context *cmd);
int label_scan_devs(struct cmd_context *cmd, struct dm_list *devs);
struct label_read_data *get_label_read_data(struct cmd_context *cmd, struct device *dev);
#endif

View File

@@ -34,6 +34,7 @@
#include "lvmlockd.h"
#include "time.h"
#include "lvmnotify.h"
#include "label.h"
#include <math.h>
#include <sys/param.h>
@@ -782,6 +783,10 @@ bad:
* . pvremove_single()
* . find_pv_by_name()
* . get_pvs()
* . get_vgids()
* . get_vgnames()
* . lvmcache_get_vgids()
* . lvmcache_get_vgnames()
* . the vg->pvs_to_write list and pv_to_write struct
* . vg_reduce()
*/
@@ -1837,7 +1842,7 @@ struct physical_volume *pvcreate_vol(struct cmd_context *cmd, const char *pv_nam
}
if (pp->pva.idp) {
if ((dev = lvmcache_device_from_pvid(cmd, pp->pva.idp, NULL, NULL)) &&
if ((dev = lvmcache_device_from_pvid(cmd, pp->pva.idp, NULL)) &&
(dev != dev_cache_get(pv_name, cmd->full_filter))) {
if (!id_write_format((const struct id*)&pp->pva.idp->uuid,
buffer, sizeof(buffer)))
@@ -4259,7 +4264,6 @@ static struct volume_group *_vg_read(struct cmd_context *cmd,
struct dm_list *pvids;
struct pv_list *pvl;
struct dm_list all_pvs;
unsigned seqno = 0;
int reappeared = 0;
struct cached_vg_fmtdata *vg_fmtdata = NULL; /* Additional format-specific data about the vg */
unsigned use_previous_vg;
@@ -4276,7 +4280,7 @@ static struct volume_group *_vg_read(struct cmd_context *cmd,
}
if (lvmetad_used() && !use_precommitted) {
if ((correct_vg = lvmcache_get_vg(cmd, vgname, vgid, precommitted))) {
if ((correct_vg = lvmetad_vg_lookup(cmd, vgname, vgid))) {
dm_list_iterate_items(pvl, &correct_vg->pvs)
reappeared += _check_reappeared_pv(correct_vg, pvl->pv, *consistent);
if (reappeared && *consistent)
@@ -4307,36 +4311,27 @@ static struct volume_group *_vg_read(struct cmd_context *cmd,
}
/*
* If cached metadata was inconsistent and *consistent is set
* then repair it now. Otherwise just return it.
* Also return if use_precommitted is set due to the FIXME in
* the missing PV logic below.
* Rescan the devices that are associated with this vg in lvmcache.
* This repeats what was done by the command's initial label scan,
* but only the devices associated with this VG.
*
* The lvmcache info about these devs is from the initial label scan
* performed by the command before the vg lock was held. Now the VG
* lock is held, so we rescan all the info from the devs in case
* something changed between the initial scan and now that the lock
* is held.
*/
if ((correct_vg = lvmcache_get_vg(cmd, vgname, vgid, precommitted)) &&
(use_precommitted || !*consistent)) {
*consistent = 1;
return correct_vg;
} else {
if (correct_vg && correct_vg->seqno > seqno)
seqno = correct_vg->seqno;
release_vg(correct_vg);
correct_vg = NULL;
log_debug_metadata("Reading VG rereading labels for %s", vgname);
if (!lvmcache_label_rescan_vg(cmd, vgname, vgid)) {
/* The VG wasn't found, so force a full label scan. */
lvmcache_force_next_label_scan();
lvmcache_label_scan(cmd);
}
/* Find the vgname in the cache */
/* If it's not there we must do full scan to be completely sure */
if (!(fmt = lvmcache_fmt_from_vgname(cmd, vgname, vgid, 1))) {
lvmcache_label_scan(cmd);
if (!(fmt = lvmcache_fmt_from_vgname(cmd, vgname, vgid, 1))) {
/* Independent MDAs aren't supported under low memory */
if (!cmd->independent_metadata_areas && critical_section())
return_NULL;
lvmcache_force_next_label_scan();
lvmcache_label_scan(cmd);
if (!(fmt = lvmcache_fmt_from_vgname(cmd, vgname, vgid, 0)))
return_NULL;
}
if (!(fmt = lvmcache_fmt_from_vgname(cmd, vgname, vgid, 0))) {
log_debug_metadata("Cache did not find fmt for vgname %s", vgname);
return_NULL;
}
/* Now determine the correct vgname if none was supplied */
@@ -4354,6 +4349,25 @@ static struct volume_group *_vg_read(struct cmd_context *cmd,
if (use_precommitted && !(fmt->features & FMT_PRECOMMIT))
use_precommitted = 0;
/*
* A "format instance" is an abstraction for a VG location,
* i.e. where a VG's metadata exists on disk.
*
* An fic/fid pair (format_instance_ctx/format_instance) exists
* for each VG. The fic/fid is set up by create_instance() to
* describe the VG location. This happens before the VG metadata
* is assembled into the more familiar struct volume_group "vg".
*
* The fic/fid has one main purpose: to keep track of the metadata
* locations for a given VG. It does this by putting 'mda'
* structs on fid->metadata_areas_in_use, which specify where
* metadata is located on disk. It gets this information
* (metadata locations for a specific VG) from the command's
* initial label scan. The info is passed indirectly via
* lvmcache info/vginfo structs, which are created by the
* label scan and then copied into fic/fid by create_instance().
*/
/* create format instance with appropriate metadata area */
fic.type = FMT_INSTANCE_MDAS | FMT_INSTANCE_AUX_MDAS;
fic.context.vg_ref.vg_name = vgname;
@@ -4377,12 +4391,17 @@ static struct volume_group *_vg_read(struct cmd_context *cmd,
/* Ensure contents of all metadata areas match - else do recovery */
inconsistent_mda_count=0;
dm_list_iterate_items(mda, &fid->metadata_areas_in_use) {
struct device *mda_dev = mda_get_device(mda);
struct label_read_data *ld;
use_previous_vg = 0;
if ((use_precommitted &&
!(vg = mda->ops->vg_read_precommit(fid, vgname, mda, &vg_fmtdata, &use_previous_vg)) && !use_previous_vg) ||
(!use_precommitted &&
!(vg = mda->ops->vg_read(fid, vgname, mda, &vg_fmtdata, &use_previous_vg, 0)) && !use_previous_vg)) {
log_debug_metadata("Reading VG %s from %s", vgname, dev_name(mda_dev));
ld = get_label_read_data(cmd, mda_dev);
if ((use_precommitted && !(vg = mda->ops->vg_read_precommit(fid, vgname, mda, ld, &vg_fmtdata, &use_previous_vg)) && !use_previous_vg) ||
(!use_precommitted && !(vg = mda->ops->vg_read(fid, vgname, mda, ld, &vg_fmtdata, &use_previous_vg)) && !use_previous_vg)) {
inconsistent = 1;
vg_fmtdata = NULL;
continue;
@@ -4572,9 +4591,9 @@ static struct volume_group *_vg_read(struct cmd_context *cmd,
use_previous_vg = 0;
if ((use_precommitted &&
!(vg = mda->ops->vg_read_precommit(fid, vgname, mda, &vg_fmtdata, &use_previous_vg)) && !use_previous_vg) ||
!(vg = mda->ops->vg_read_precommit(fid, vgname, mda, NULL, &vg_fmtdata, &use_previous_vg)) && !use_previous_vg) ||
(!use_precommitted &&
!(vg = mda->ops->vg_read(fid, vgname, mda, &vg_fmtdata, &use_previous_vg, 0)) && !use_previous_vg)) {
!(vg = mda->ops->vg_read(fid, vgname, mda, NULL, &vg_fmtdata, &use_previous_vg)) && !use_previous_vg)) {
inconsistent = 1;
vg_fmtdata = NULL;
continue;
@@ -4967,21 +4986,10 @@ static struct volume_group *_vg_read_by_vgid(struct cmd_context *cmd,
unsigned precommitted)
{
const char *vgname;
struct dm_list *vgnames;
struct volume_group *vg;
struct dm_str_list *strl;
uint32_t warn_flags = WARN_PV_READ | WARN_INCONSISTENT;
int consistent = 0;
/* Is corresponding vgname already cached? */
if (lvmcache_vgid_is_cached(vgid)) {
if ((vg = _vg_read(cmd, NULL, vgid, warn_flags, &consistent, precommitted)) &&
id_equal(&vg->id, (const struct id *)vgid)) {
return vg;
}
release_vg(vg);
}
/*
* When using lvmlockd we should never reach this point.
* The VG is locked, then vg_read() is done, which gets
@@ -4994,36 +5002,28 @@ static struct volume_group *_vg_read_by_vgid(struct cmd_context *cmd,
/* Mustn't scan if memory locked: ensure cache gets pre-populated! */
if (critical_section())
return_NULL;
log_debug_metadata("Reading VG by vgid in critical section pre %d vgid %.8s", precommitted, vgid);
/* FIXME Need a genuine read by ID here - don't vg_read_internal by name! */
/* FIXME Disabled vgrenames while active for now because we aren't
* allowed to do a full scan here any more. */
if (!(vgname = lvmcache_vgname_from_vgid(cmd->mem, vgid))) {
log_debug_metadata("Reading VG by vgid %.8s no VG name found, retrying.", vgid);
lvmcache_destroy(cmd, 0, 0);
lvmcache_force_next_label_scan();
lvmcache_label_scan(cmd);
}
// The slow way - full scan required to cope with vgrename
lvmcache_force_next_label_scan();
lvmcache_label_scan(cmd);
if (!(vgnames = get_vgnames(cmd, 0))) {
log_error("vg_read_by_vgid: get_vgnames failed");
if (!(vgname = lvmcache_vgname_from_vgid(cmd->mem, vgid))) {
log_debug_metadata("Reading VG by vgid %.8s no VG name found.", vgid);
return NULL;
}
dm_list_iterate_items(strl, vgnames) {
vgname = strl->str;
if (!vgname)
continue; // FIXME Unnecessary?
consistent = 0;
if ((vg = _vg_read(cmd, vgname, vgid, warn_flags, &consistent, precommitted)) &&
id_equal(&vg->id, (const struct id *)vgid)) {
if (!consistent) {
release_vg(vg);
return NULL;
}
return vg;
}
release_vg(vg);
consistent = 0;
if ((vg = _vg_read(cmd, vgname, vgid, warn_flags, &consistent, precommitted))) {
/* Does it matter if consistent is 0 or 1? */
return vg;
}
log_debug_metadata("Reading VG by vgid %.8s not found.", vgid);
return NULL;
}
@@ -5039,7 +5039,7 @@ struct logical_volume *lv_from_lvid(struct cmd_context *cmd, const char *lvid_s,
log_very_verbose("Finding %svolume group for uuid %s", precommitted ? "precommitted " : "", lvid_s);
if (!(vg = _vg_read_by_vgid(cmd, (const char *)lvid->id[0].uuid, precommitted))) {
log_error("Volume group for uuid not found: %s", lvid_s);
log_error("Reading VG not found for LVID %s", lvid_s);
return NULL;
}
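
The _vg_read() hunk above first rescans only the devices already associated with the VG and forces a full label scan only when that targeted rescan fails, then looks the VG up in the cache. The control flow can be sketched as below. This is a toy model under stated assumptions: struct vg_cache, rescan_vg(), full_scan(), and refresh_and_find() are hypothetical names, and the "disk" is just an array of VG names.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of the cache: names of the VGs currently known. */
struct vg_cache {
	const char *names[8];
	int count;
	int full_scans; /* counts how often the expensive path ran */
};

static bool cache_has(const struct vg_cache *c, const char *vgname)
{
	for (int i = 0; i < c->count; i++)
		if (!strcmp(c->names[i], vgname))
			return true;
	return false;
}

/* Stand-in for a targeted rescan: succeeds only if the VG is already known. */
static bool rescan_vg(struct vg_cache *c, const char *vgname)
{
	return cache_has(c, vgname);
}

/* Stand-in for a full scan: repopulates the cache from the "disk" array. */
static void full_scan(struct vg_cache *c, const char **on_disk, int n)
{
	c->count = 0;
	for (int i = 0; i < n && i < 8; i++)
		c->names[c->count++] = on_disk[i];
	c->full_scans++;
}

/* Targeted rescan first, full scan only as a fallback, then a final lookup. */
static bool refresh_and_find(struct vg_cache *c, const char *vgname,
			     const char **on_disk, int n)
{
	assert(c && vgname);

	if (!rescan_vg(c, vgname))
		full_scan(c, on_disk, n);
	return cache_has(c, vgname);
}
```

The point of the ordering is cost: the common case touches only the VG's own devices, and the full scan runs only when the cache cannot account for the VG.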


@@ -25,6 +25,8 @@
#include "dev-cache.h"
#include "lvm-string.h"
#include "metadata-exported.h"
#include "lvm-logging.h"
#include "label.h"
//#define MAX_STRIPES 128U
//#define SECTOR_SHIFT 9L
@@ -79,12 +81,13 @@ struct metadata_area_ops {
struct volume_group *(*vg_read) (struct format_instance * fi,
const char *vg_name,
struct metadata_area * mda,
struct label_read_data *ld,
struct cached_vg_fmtdata **vg_fmtdata,
unsigned *use_previous_vg,
int single_device);
unsigned *use_previous_vg);
struct volume_group *(*vg_read_precommit) (struct format_instance * fi,
const char *vg_name,
struct metadata_area * mda,
struct label_read_data *ld,
struct cached_vg_fmtdata **vg_fmtdata,
unsigned *use_previous_vg);
/*


@@ -97,11 +97,6 @@ void release_vg(struct volume_group *vg)
if (!vg || (vg->fid && vg == vg->fid->fmt->orphan_vg))
return;
/* Check if there are any vginfo holders */
if (vg->vginfo &&
!lvmcache_vginfo_holders_dec_and_test_for_zero(vg->vginfo))
return;
release_vg(vg->vg_committed);
release_vg(vg->vg_precommitted);
if (vg->cft_precommitted)


@@ -54,8 +54,6 @@ static int _activation_checks = 0;
static char _sysfs_dir_path[PATH_MAX] = "";
static int _dev_disable_after_error_count = DEFAULT_DISABLE_AFTER_ERROR_COUNT;
static uint64_t _pv_min_size = (DEFAULT_PV_MIN_SIZE_KB * 1024L >> SECTOR_SHIFT);
static int _detect_internal_vg_cache_corruption =
DEFAULT_DETECT_INTERNAL_VG_CACHE_CORRUPTION;
static const char *_unknown_device_name = DEFAULT_UNKNOWN_DEVICE_NAME;
void init_verbose(int level)
@@ -198,11 +196,6 @@ void init_pv_min_size(uint64_t sectors)
_pv_min_size = sectors;
}
void init_detect_internal_vg_cache_corruption(int detect)
{
_detect_internal_vg_cache_corruption = detect;
}
void set_cmd_name(const char *cmd)
{
strncpy(_cmd_name, cmd, sizeof(_cmd_name) - 1);
@@ -387,11 +380,6 @@ uint64_t pv_min_size(void)
return _pv_min_size;
}
int detect_internal_vg_cache_corruption(void)
{
return _detect_internal_vg_cache_corruption;
}
const char *unknown_device_name(void)
{
return _unknown_device_name;


@@ -51,7 +51,6 @@ void init_udev_checking(int checking);
void init_dev_disable_after_error_count(int value);
void init_pv_min_size(uint64_t sectors);
void init_activation_checks(int checks);
void init_detect_internal_vg_cache_corruption(int detect);
void init_retry_deactivation(int retry);
void init_unknown_device_name(const char *name);
@@ -85,7 +84,6 @@ int udev_checking(void);
const char *sysfs_dir_path(void);
uint64_t pv_min_size(void);
int activation_checks(void);
int detect_internal_vg_cache_corruption(void);
int retry_deactivation(void);
const char *unknown_device_name(void);


@@ -45,6 +45,10 @@ include $(top_builddir)/make.tmpl
LDFLAGS += -L$(top_builddir)/lib -L$(top_builddir)/daemons/dmeventd
LIBS += $(LVMINTERNAL_LIBS) -ldevmapper
ifeq ("@AIO@", "yes")
LIBS += $(AIO_LIBS)
endif
.PHONY: install_dynamic install_static install_include install_pkgconfig
INSTALL_TYPE = install_dynamic


@@ -64,6 +64,7 @@ LDDEPS += @LDDEPS@
LIB_SUFFIX = @LIB_SUFFIX@
LVMINTERNAL_LIBS = -llvm-internal $(DMEVENT_LIBS) $(DAEMON_LIBS) $(SYSTEMD_LIBS) $(UDEV_LIBS) $(DL_LIBS) $(BLKID_LIBS)
DL_LIBS = @DL_LIBS@
AIO_LIBS = @AIO_LIBS@
RT_LIBS = @RT_LIBS@
M_LIBS = @M_LIBS@
PTHREAD_LIBS = @PTHREAD_LIBS@


@@ -31,6 +31,10 @@ endif
LVMLIBS = @LVM2APP_LIB@ -ldevmapper
endif
ifeq ("@AIO@", "yes")
LVMLIBS += $(AIO_LIBS)
endif
LVM_SCRIPTS = lvmdump.sh lvmconf.sh
DM_SCRIPTS =


@@ -109,6 +109,10 @@ ifeq ("@CMDLIB@", "yes")
INSTALL_LVM_TARGETS += $(INSTALL_CMDLIB_TARGETS)
endif
ifeq ("@AIO@", "yes")
LVMLIBS += $(AIO_LIBS)
endif
EXPORTED_HEADER = $(srcdir)/lvm2cmd.h
EXPORTED_FN_PREFIX = lvm2


@@ -43,7 +43,7 @@ xx(lastlog,
xx(lvchange,
"Change the attributes of logical volume(s)",
CACHE_VGMETADATA | PERMITTED_READ_ONLY)
PERMITTED_READ_ONLY)
xx(lvconvert,
"Change logical volume layout",
@@ -127,7 +127,7 @@ xx(pvdata,
xx(pvdisplay,
"Display various attributes of physical volume(s)",
CACHE_VGMETADATA | PERMITTED_READ_ONLY | ENABLE_ALL_DEVS | ENABLE_DUPLICATE_DEVS | LOCKD_VG_SH)
PERMITTED_READ_ONLY | ENABLE_ALL_DEVS | ENABLE_DUPLICATE_DEVS | LOCKD_VG_SH)
/* ALL_VGS_IS_DEFAULT is for polldaemon to find pvmoves in-progress using process_each_vg. */
@@ -145,7 +145,7 @@ xx(pvremove,
xx(pvs,
"Display information about physical volumes",
CACHE_VGMETADATA | PERMITTED_READ_ONLY | ALL_VGS_IS_DEFAULT | ENABLE_ALL_DEVS | ENABLE_DUPLICATE_DEVS | LOCKD_VG_SH)
PERMITTED_READ_ONLY | ALL_VGS_IS_DEFAULT | ENABLE_ALL_DEVS | ENABLE_DUPLICATE_DEVS | LOCKD_VG_SH)
xx(pvscan,
"List all physical volumes",
@@ -173,7 +173,7 @@ xx(vgcfgrestore,
xx(vgchange,
"Change volume group attributes",
CACHE_VGMETADATA | PERMITTED_READ_ONLY | ALL_VGS_IS_DEFAULT)
PERMITTED_READ_ONLY | ALL_VGS_IS_DEFAULT)
xx(vgck,
"Check the consistency of volume group(s)",


@@ -2280,7 +2280,6 @@ static int _get_current_settings(struct cmd_context *cmd)
cmd->current_settings.archive = arg_int_value(cmd, autobackup_ARG, cmd->current_settings.archive);
cmd->current_settings.backup = arg_int_value(cmd, autobackup_ARG, cmd->current_settings.backup);
cmd->current_settings.cache_vgmetadata = cmd->cname->flags & CACHE_VGMETADATA ? 1 : 0;
if (arg_is_set(cmd, readonly_ARG)) {
cmd->current_settings.activation = 0;
@@ -2796,7 +2795,7 @@ int lvm_run_command(struct cmd_context *cmd, int argc, char **argv)
cmd->position_argv = argv;
set_cmd_name(cmd->name);
if (arg_is_set(cmd, backgroundfork_ARG)) {
if (!become_daemon(cmd, 1)) {
/* parent - quit immediately */


@@ -300,8 +300,10 @@ static int _pvscan_autoactivate(struct cmd_context *cmd, struct pvscan_aa_params
static int _pvscan_cache(struct cmd_context *cmd, int argc, char **argv)
{
struct pvscan_aa_params pp = { 0 };
struct dm_list single_devs;
struct dm_list found_vgnames;
struct device *dev;
struct device_list *devl;
const char *pv_name;
const char *reason = NULL;
int32_t major = -1;
@@ -315,6 +317,7 @@ static int _pvscan_cache(struct cmd_context *cmd, int argc, char **argv)
int add_errors = 0;
int ret = ECMD_PROCESSED;
dm_list_init(&single_devs);
dm_list_init(&found_vgnames);
dm_list_init(&pp.changed_vgnames);
@@ -434,8 +437,10 @@ static int _pvscan_cache(struct cmd_context *cmd, int argc, char **argv)
* to drop any devices that have left.)
*/
if (argc || devno_args)
if (argc || devno_args) {
log_verbose("Scanning devices on command line.");
cmd->pvscan_cache_single = 1;
}
while (argc--) {
pv_name = *argv++;
@@ -453,8 +458,11 @@ static int _pvscan_cache(struct cmd_context *cmd, int argc, char **argv)
} else {
/* Add device path to lvmetad. */
log_debug("Scanning dev %s for lvmetad cache.", pv_name);
if (!lvmetad_pvscan_single(cmd, dev, &found_vgnames, &pp.changed_vgnames))
add_errors++;
if (!(devl = dm_pool_zalloc(cmd->mem, sizeof(*devl))))
return_0;
devl->dev = dev;
dm_list_add(&single_devs, &devl->list);
}
} else {
if (sscanf(pv_name, "%d:%d", &major, &minor) != 2) {
@@ -471,8 +479,11 @@ static int _pvscan_cache(struct cmd_context *cmd, int argc, char **argv)
} else {
/* Add major:minor to lvmetad. */
log_debug("Scanning dev %d:%d for lvmetad cache.", major, minor);
if (!lvmetad_pvscan_single(cmd, dev, &found_vgnames, &pp.changed_vgnames))
add_errors++;
if (!(devl = dm_pool_zalloc(cmd->mem, sizeof(*devl))))
return_0;
devl->dev = dev;
dm_list_add(&single_devs, &devl->list);
}
}
@@ -482,6 +493,15 @@ static int _pvscan_cache(struct cmd_context *cmd, int argc, char **argv)
}
}
if (!dm_list_empty(&single_devs)) {
label_scan_devs(cmd, &single_devs);
dm_list_iterate_items(devl, &single_devs) {
if (!lvmetad_pvscan_single(cmd, devl->dev, &found_vgnames, &pp.changed_vgnames))
add_errors++;
}
}
if (!devno_args)
goto activate;
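
The _pvscan_cache() change above stops processing each command-line device as it is parsed. Instead it collects the devices on a list, scans the whole list in one call, and only then processes each device. A minimal sketch of that collect, scan, process pattern follows. It uses a fixed array instead of dm_list and pool allocation, and batch_add(), batch_scan(), and batch_process() are hypothetical stand-ins, not lvm2 functions.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEVS 16

/* Hypothetical device record; "scanned" marks that its label was read. */
struct device { int major, minor; int scanned; };

struct dev_batch {
	struct device *devs[MAX_DEVS];
	size_t count;
};

/* Phase 1: remember the device; returns 0 if the batch is full. */
static int batch_add(struct dev_batch *b, struct device *dev)
{
	assert(b && dev);

	if (b->count >= MAX_DEVS)
		return 0;
	b->devs[b->count++] = dev;
	return 1;
}

/* Phase 2: one scan pass over the whole batch. */
static void batch_scan(struct dev_batch *b)
{
	for (size_t i = 0; i < b->count; i++)
		b->devs[i]->scanned = 1;
}

/* Phase 3: per-device processing; returns the number of failures. */
static int batch_process(const struct dev_batch *b)
{
	int errors = 0;

	for (size_t i = 0; i < b->count; i++)
		if (!b->devs[i]->scanned)
			errors++;
	return errors;
}
```

Separating the phases lets the scan step see every device at once, which is what makes it possible to issue the reads together rather than one device at a time.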


@@ -2216,14 +2216,10 @@ int process_each_vg(struct cmd_context *cmd,
}
/*
* First rescan for available devices, then force the next
* label scan to be done. get_vgnameids() will scan labels
* (when not using lvmetad).
* Scan all devices to populate lvmcache with initial
* list of PVs and VGs.
*/
if (cmd->cname->flags & REQUIRES_FULL_LABEL_SCAN) {
dev_cache_full_scan(cmd->full_filter);
lvmcache_force_next_label_scan();
}
lvmcache_label_scan(cmd);
/*
* A list of all VGs on the system is needed when:
@@ -3758,6 +3754,12 @@ int process_each_lv(struct cmd_context *cmd,
goto_out;
}
/*
* Scan all devices to populate lvmcache with initial
* list of PVs and VGs.
*/
lvmcache_label_scan(cmd);
/*
* A list of all VGs on the system is needed when:
* . processing all VGs on the system
@@ -4467,7 +4469,12 @@ int process_each_pv(struct cmd_context *cmd,
if (!trust_cache() && !orphans_locked) {
log_debug("Scanning for available devices");
lvmcache_destroy(cmd, 1, 0);
dev_cache_full_scan(cmd->full_filter);
/*
* Scan all devices to populate lvmcache with initial
* list of PVs and VGs.
*/
lvmcache_label_scan(cmd);
}
if (!get_vgnameids(cmd, &all_vgnameids, only_this_vgname, 1)) {
@@ -5481,6 +5488,8 @@ int pvcreate_each_device(struct cmd_context *cmd,
dev_cache_full_scan(cmd->full_filter);
lvmcache_label_scan(cmd);
/*
* Translate arg names into struct device's.
*/
@@ -5635,6 +5644,8 @@ int pvcreate_each_device(struct cmd_context *cmd,
goto out;
}
lvmcache_label_scan(cmd);
/*
* The device args began on the arg_devices list, then the first check
* loop moved those entries to arg_process as they were found. Devices


@@ -113,7 +113,6 @@ struct arg_value_group_list {
uint32_t prio;
};
#define CACHE_VGMETADATA 0x00000001
#define PERMITTED_READ_ONLY 0x00000002
/* Process all VGs if none specified on the command line. */
#define ALL_VGS_IS_DEFAULT 0x00000004


@@ -74,6 +74,8 @@ int vgcfgrestore(struct cmd_context *cmd, int argc, char **argv)
return ECMD_FAILED;
}
lvmcache_label_scan(cmd);
cmd->handles_unknown_segments = 1;
if (!(arg_is_set(cmd, file_ARG) ?