1
0
mirror of git://sourceware.org/git/lvm2.git synced 2025-01-03 05:18:29 +03:00
Commit Graph

285 Commits

Author SHA1 Message Date
Alasdair G Kergon
4002f5e206 format_text: Refactor and document metadata offset calculation. 2017-12-12 18:36:54 +00:00
Alasdair G Kergon
b76c6951aa format_text: Adjust metadata alignment calculation.
Use new ALIGN_ABSOLUTE macro when calculating the start location
of new metadata and adjust the end of buffer detection so that
there is no longer an imposed gap between old and new metadata.
2017-12-11 20:25:03 +00:00
Alasdair G Kergon
053d35de47 format_text: Use absolute alignment to calculate metadata usage
Currently both start and offset should always be divisible by alignment,
so this should have no effect, but a later patch will increase alignment
so these variables can no longer be optimised out.
2017-12-11 17:14:38 +00:00
Alasdair G Kergon
2db67a8ea0 format_text: Move metadata size checking into separate fn.
Move checks into _metadata_fits_into_buffer() and add macro for alignment.
2017-12-11 17:08:29 +00:00
Alasdair G Kergon
46393bfca0 format_text: Log additional circular buffer information. 2017-12-11 16:07:34 +00:00
Alasdair G Kergon
49d486319f format_text: Replace PRI with FMT. 2017-12-11 15:39:25 +00:00
Alasdair G Kergon
14b1e5270d format_text: Use explicit alignment in wrapping calc.
Expand out the metadata wrapping calculations to prepare
to support a larger alignment.

The current alignment is 512 bytes so
(mdac_area_start + rlocn->offset) % alignment is zero.
2017-12-08 01:18:46 +00:00
Alasdair G Kergon
d591d04103 device: Tag I/O for each mda on a device separately in log messages.
Mark the first metadata area on each text format PV as MDA_PRIMARY.
Pass this information down to the device layer so that when
there are two metadata areas on a block device, we can easily
distinguish two independent streams of I/O.
2017-12-07 03:48:11 +00:00
Alasdair G Kergon
e4805e4883 device: categorise block i/o
Introduce enum dev_io_reason to categorise block device I/O
in debug messages so it's obvious what it is for.

DEV_IO_SIGNATURES   /* Scanning device signatures */
DEV_IO_LABEL        /* LVM PV disk label */
DEV_IO_MDA_HEADER   /* Text format metadata area header */
DEV_IO_MDA_CONTENT  /* Text format metadata area content */
DEV_IO_FMT1         /* Original LVM1 metadata format */
DEV_IO_POOL         /* Pool metadata format */
DEV_IO_LV           /* Content written to an LV */
DEV_IO_LOG          /* Logging messages */
2017-12-04 23:45:26 +00:00
Alasdair G Kergon
52fd66210b metadata: Avoid accessing ignored metadata.
When an ignored metadata area gets flagged for use again, make sure the
code doesn't try to parse its old metadata.  Firstly by trying to detect
this situation and skipping the read (while still remembering the
position reached in the circular buffer), and secondly by clearing the
invalid live metadata location on disk as a precaution when subsequently
writing out the precommitted metadata.

Problems showed up when a metadata area in one VG got moved to
another VG in ignored state (still holding metadata for the original
VG) and then later got brought into use in the new VG - only the header
should be read in this case, not any of the metadata content.
2017-10-27 22:53:43 +01:00
Alasdair G Kergon
486ed10848 vgmerge: Fix intermediate metadata corruption
vgmerge suffers from a similar problem to the one fixed in commit
8146548d25 ("vgsplit: Fix intermediate
metadata corruption.")

When merging, splitting or renaming VGs, use a new PV status flag
PV_MOVED_VG to mark the PVs that hold metadata with the old VG name and
use this to provide PV-level granularity instead of incorrectly assuming
all PVs in the VG are the same.
2017-10-06 02:20:45 +01:00
Alasdair G Kergon
8146548d25 vgsplit: Fix intermediate metadata corruption.
Changing the VG of a PV uses the same on-disk mechanism as vgrename.
This relies on recognising both the old and new VG names.  Prior to this
patch the vgsplit code incorrectly provided the new VG name twice
instead of the old and new ones.  This lead the low-level mechanism not
to recognise the device as already belonging to a VG and so paying no
attention to the location of its existing metadata, sometimes partly
overwriting it and then later trying to read the corrupt metadata and
issuing a checksum error.
2017-09-22 18:34:34 +01:00
Peter Rajnoha
3c978f7bcc pvcreate: fix check for 2nd mda at end of disk fits if using pvcreate --restorefile
Fix code checking that the 2nd mda which is at the end of disk really
fits the available free space and avoid any DA and MDA interleaving when
we already have DA preallocated. This mainly applies when we're restoring
a PV from VG backup using pvcreate --restorefile where we may already have
some DA preallocated - this means the PV was in a VG before with already
allocated space from it (the LVs were created). Hence we need to avoid
stepping into DA - the MDA can never ever be inside in such case!

The code responsible for this calculation was already in
_text_pv_add_metadata_area fn, but it had a bug in the calculation where
we subtracted one more sector by mistake and then the code could still
incorrectly allocate the MDA inside existing DA. The patch also renames
the variable in the code so it doesn't confuse us in future.

Also, if the 2nd mda doesn't fit, don't silently continue with just 1
MDA (at the start of the disk). If 2nd mda was requested and we can't
create that due to unavailable space, error out correctly (the patch
also adds a test to shell/pvcreate-operation.sh for this case).
2017-08-15 13:40:25 +02:00
Zdenek Kabelac
48ce8c7a49 tidy: drop unneeded cast
Avoid casting to the same type.
2017-07-20 11:20:44 +02:00
Zdenek Kabelac
0bf836aa14 tidy: prefer not using else after return
clang-tidy: avoid using  'else' after return - give more readable code,
and also saves indention level.
2017-07-20 11:18:29 +02:00
Zdenek Kabelac
f7e62bc55c cleanup: drop extra compare
dm_free() already validates for NULL itself.
2017-07-17 12:32:18 +02:00
Alasdair G Kergon
5027c3c7ee format_text: Extend FIXME to reduce label scans
It's unnecessarily scanning all invalid labels even when nothing changed
instead of first just scanning the ones under the lock.
2017-07-13 17:05:49 +01:00
Zdenek Kabelac
419e8284c8 coverity: validate length of renaming path
Make sure path fits into buffer on stack.
2017-06-27 12:15:42 +02:00
David Teigland
01156de6f7 lvmcache: add optional dev arg to lvmcache_info_from_pvid
A number of places are working on a specific dev when they
call lvmcache_info_from_pvid() to look up an info struct
based on a pvid.  In those cases, pass the dev being used
to lvmcache_info_from_pvid().  When a dev is specified,
lvmcache_info_from_pvid() will verify that the cached
info it's using matches the dev being processed before
returning the info.  Calling code will not mistakenly
get info for the wrong dev when duplicate devs exist.

This confusion was happening when scanning labels when
duplicate devs existed.  label_read for the first dev
would add an info struct to lvmcache for that dev/pvid.
label_read for the second dev would see the pvid in
lvmcache from first dev, and mistakenly conclude that
the label_read from the second dev can be skipped
because it's already been done.  By verifying that the
dev for the cached pvid matches the dev being read,
this mismatch is avoided and the label is actually read
from the second duplicate.
2016-06-07 15:15:47 -05:00
Zdenek Kabelac
509b2e5247 debug: move misplaced log_debug
It should log action before taking it instead of only in error path.
2016-04-21 00:34:01 +02:00
David Teigland
5e9e43074a lvmetad: rework command connection setup and checking
The lvmetad connection is created within the
init_connections() path during command startup,
rather than via the old lvmetad_active() check.

The old lvmetad_active() checks are replaced
with lvmetad_used() which is a simple check that
tests if the command is using/connected to lvmetad.

The old lvmetad_set_active(cmd, 0) calls, which
stopped the command from using lvmetad (to revert to
disk scanning), are replaced with lvmetad_make_unused(cmd).
2016-04-19 14:00:02 -05:00
Zdenek Kabelac
a28c81cbae debug: unify some tracing messages
Introduce  FMTVGID - although it might be possibly better to ensure
vgid is always \0 ended string.

Unify some lvmcache reported messages.
2016-04-12 13:06:16 +02:00
David Teigland
147c9c01a2 rename function read_vgname to read_vgsummary
The name did not clearly represent what it does.
2016-04-11 13:07:48 -05:00
David Teigland
4de6caf5b5 redefine pvcreate structs
New pv_create_args struct contains all the specific
parameters for creating a PV, independent of the
command.
2016-02-25 09:14:10 -06:00
Peter Rajnoha
8ad93874d6 tests: fix tests checking pv_attr - there's a new bit now 2016-02-15 12:44:46 +01:00
Peter Rajnoha
9b9f1ae772 format: format_text: add pv_needs_rewrite to format_handler and implemention for format_text 2016-02-15 12:44:46 +01:00
Peter Rajnoha
d320d9c52b pv: format-text: store PV_EXT_USED flag if PV is used and unset it otherwise
When adding a PV to VG, set the PV_EXT_USED flag in PV header and
vice versa - if the PV is no longer in a VG, unset the flag.
2016-02-15 12:44:46 +01:00
Peter Rajnoha
a522af93b7 format: add FMT_PV_FLAGS to indicate format supports PV flags 2016-02-15 12:44:46 +01:00
Zdenek Kabelac
fcbef05aae doc: change fsf address
Hmm rpmlint suggest fsf is using a different address these days,
so lets keep it up-to-date
2016-01-21 12:11:37 +01:00
Alasdair G Kergon
01228b692b vgcfgrestore: Retain allocatable PV attribute.
pvchange -xn was getting lost.
All PVs were set to allocatable again after restore.

Moved setting ALLOCATABLE_PV outside pv_setup().
2016-01-14 00:46:45 +00:00
David Teigland
796461a912 vgrename: use process_each_vg
Use process_each_vg() to lock and read the old VG,
and then call the main vgrename code.

When real VG names are used (not a UUID in place of the
old name), the command still pre-locks the new name
(when strcmp wants it locked first), before calling
process_each_vg on the old name.

In the case where the old name is replaced with a UUID,
process_each_vg now translates that UUID into the real
VG name, which it locks and reads.  In this case, we
cannot do pre-locking to maintain lock ordering because
the old name is unknown.  So, in this case the strcmp
based lock ordering is suppressed and the old name is
always locked first.  This opens a remote chance for
lock ordering conflict between racing vgrenames between
two names where one or both commands use the UUID.
2015-12-14 14:26:47 -06:00
Zdenek Kabelac
c3b292a4a9 format-text: ensure no division by zero
Coverity likes here to be 100% sure no division by zero is possible.
Add check for alignment !=0 which is made on other code paths here.
2015-11-16 01:16:11 +01:00
Peter Rajnoha
ccfc09f79b metadata: format_text: also count with calculated mda size of 0
When checking minimum mda size, make sure the mda_size after alignment
and calculation is more than 0 - if there's no place for an MDA at the
end of the disk, the _text_pv_add_metadata_area does not try to add it
there and it returns (because we already have the MDA at the start of
the disk at least).
2015-10-30 12:02:34 +01:00
Peter Rajnoha
c2e88d1107 metadata: format_text: better check for metadata overlap
Actually, we don't need extra condition as introduced in commit
00348c0a63. We should fix the last
condition:

  (mdac->rlocn.size >= mdah->size)

...which should be:

  (MDA_HEADER_SIZE + (rlocn ? rlocn->size : 0) + mdac->rlocn.size >= mdah->size))

Where the "mdac" is new metadata, the "rlocn" is old metadata.

So the main problem with the previous condition was that it
didn't count in MDA_HEADER_SIZE properly (and possible existing
metadata - the "rlocn"). This could have caused the error state
where metadata in ring buffer overlap to not be hit.

Replace the new condition introduced in 00348c0a63
with the improved one for the condition that existed there
already but it was just incomplete.
2015-10-30 08:57:34 +01:00
Peter Rajnoha
00348c0a63 metadata: format_text: check VG metadata do not overlap themselves
We're already checking whether old and new meta do not overlap in
ring buffer (as we need to keep both old and new meta during vg_write
up until vg_commit).

We also need to check whether the new metadata do not overlap
themselves in case we don't have old metadata yet (...because
we're in vgcreate). This could happen if we're creating a VG so
that the very first metadata written are long enough that it wraps
themselves in metadata ring buffer.

Although we limited the minimum metadata area size better with the
previous commit ccb8da404d which
makes the initial VG metadata overlap in ring buffer to be less
probable, the risk of hitting this overlap condition is still there
if we still manage to generate big enough metadata somehow.

For example, users can provide many and/or long VG tags during vgcreate
so that the VG metadata is long enough to start to wrap in the ring
buffer again...
2015-10-29 16:46:41 +01:00
Peter Rajnoha
ccb8da404d metadata: format_text: check metadata area size is at least MDA_SIZE_MIN 2015-10-29 16:00:32 +01:00
Peter Rajnoha
b3c81d02c9 revert: 3d03e504cd: message about VG metadata size vs. PV mda size
The message needs refinement - it's not correct in all situations.
2015-10-29 11:10:48 +01:00
Peter Rajnoha
3d03e504cd metadata: format_text: provide more detailed error message when metadata too large for PV mda
Also, leave out the note about "circular buffer" which is
an internal imeplementation detail anyway and not quite
informational for users:

Before this patch:
$ vgcreate vg1 /dev/sda
  VG vg1 metadata too large for circular buffer
  Failed to write VG vg1.

With this patch applied:
$ vgcreate vg1 /dev/sda
  VG vg1 metadata too large: size of metadata to write is 691 bytes while PV metadata area size on /dev/sda is 512 bytes.
  Failed to write VG vg1.
2015-10-08 16:27:03 +02:00
Peter Rajnoha
fcfca57e2e format-text: label: fix missing dev assignment for struct label in _text_pv_write
When using lvm shell, some structures which are cached in memory may be
reused. This happens for the struct label (a part of lvmcache_info
structure) when lvmetad is used in which case the PV scan is not
done that would normally overwrite these label structures in memory
and making them up-to-date.

This is all consequence of the fact that struct lvmcache_info and
struct label are not always assigned in the same part of the code.
For example, if lvmetad *is not* used, parts of the struct label are
reassigned in label_read fn while struct lvmcache_info is created
elsewhere. No part of the code reused struct label (and its "dev"
field) before calling label_read fn. That's why the real bug is
hidden when using lvm shell without lvmetad.

However, with lvmetad and lvm shell, the situation is a bit different.
The label_read fn is not called if lvmetad *is* used, hence the
struct label may have ended up not initialized properly.

There was missing assignment for the dev field in struct label
in _text_pv_write fn which caused this problem to appear in
lvm shell with lvmetad, for example:

Before this patch:

lvm> pvcreate /dev/sda
  Physical volume "/dev/sda" successfully created
lvm> pvs /dev/sda
  PV             VG     Fmt  Attr PSize   PFree
  unknown device        lvm2 ---  128.00m 128.00m

With this patch applied:

lvm> pvcreate /dev/sda
  Physical volume "/dev/sda" successfully created
lvm> pvs /dev/sda
  PV         VG   Fmt  Attr PSize   PFree
  /dev/sda        lvm2 ---  128.00m 128.00m

Also, this problem had not appeared before changes introduced
by commits e1a63905d1 through
3a6f91d713 which, among other
things, added proper label field type reporting. Before, label
reporting was the same as using struct physical_volume which
has its own dev field assigned and so this problem was not exposed.
2015-09-15 18:07:32 +02:00
Zdenek Kabelac
a8fd88463e cleanup: trace error from lvmcache_update_vgname_and_id
Check result value from lvmcache_update_vgname_and_id().
2015-08-18 15:00:08 +02:00
Peter Rajnoha
3b6840e099 config: replace find_config_tree_node with find_config_tree_array where appropriate 2015-07-08 13:03:08 +02:00
Alasdair G Kergon
810ab095e6 macros: Wrap PRI with FMT.
Create a set of wrappers with embedded % such as
  #define FMTu64 "%" PRIu64
2015-07-06 15:09:17 +01:00
Zdenek Kabelac
05934d2538 format_text: properly validate PV size for restore
Use 64bit arithmentic for PV size calculation (Coverity).

Also remove sector shift for compared PV size, since all
values are already held in sectors.

This fixes validatio of PV size when restoring PV
from vg metadata backup file.
2015-05-08 15:12:35 +02:00
Alasdair G Kergon
cc26085b62 alloc: Respect cling_tag_list in contig alloc.
When performing initial allocation (so there is nothing yet to
cling to), use the list of tags in allocation/cling_tag_list to
partition the PVs.  We implement this by maintaining a list of
tags that have been "used up" as we proceed and ignoring further
devices that have a tag on the list.

https://bugzilla.redhat.com/983600
2015-04-11 01:55:24 +01:00
Alasdair G Kergon
a9d48bae2f cache: Set correct vgid when changing PV header.
pv_write is called both to write orphans and to rewrite PV headers
of PVs in VGs.  It needs to select the correct VG id so that the
internal cache state gets updated correctly.

It only affected commands that involved further steps after
the pv_write and was often masked because the metadata would
be re-read off disk and correct itself.

"Incorrect metadata area header checksum" warnings appeared.

Example:
  Create vg1 containing dev1, dev2 and dev3.
  Hide dev1 and dev2 from the system.
  Fix up vg1 with vgreduce --removemissing.
  Bring back dev1 and dev2.
  In a single operation reinstate dev1 and dev2 into vg1 (vgextend).
Done as separate operations (automatically fix-up dev1 and dev2 as orphans,
then vgextend) it worked, but done all in one go the internal cache got
corrupted and warnings about checksum errors appeared.
2015-04-09 21:13:55 +01:00
Alasdair G Kergon
a515a91fcc format_text: Fix precommitted segfault.
The code never mixes reads of committed and precommitted metadata,
so there's no need to attempt to set PRECOMMITTED when
*use_previous_vg is being set.
2015-03-19 11:14:47 +00:00
Alasdair G Kergon
6407d184d1 cache: Store metadata size and checksum.
Refactor the recent metadata-reading optimisation patches.

Remove the recently-added cache fields from struct labeller
and struct format_instance.

Instead, introduce struct lvmcache_vgsummary to wrap the VG information
that lvmcache holds and add the metadata size and checksum to it.

Allow this VG summary information to be looked up by metadata size +
checksum.  Adjust the debug log messages to make it clear when this
shortcut has been successful.

(This changes the optimisation slightly, and might be extendable
further.)

Add struct cached_vg_fmtdata to format-specific vg_read calls to
preserve state alongside the VG across separate calls and indicate
if the details supplied match, avoiding the need to read and
process the VG metadata again.
2015-03-18 23:43:02 +00:00
Zdenek Kabelac
a9b28a4f21 lib: reduce parsing in vgname_from_mda
Use similar logic as with text_vg_import_fd() and avoid repeated
parsing of same mda and its config tree for vgname_from_mda().

Remember last parsed vgname, vgid and creation_host in labeller
structure and if the  metadata have the same size and checksum,
return this stored info.

TODO: The reuse of labeller struct is not ideal, some lvmcache API for
this functionality would be nicer.
2015-03-06 13:53:13 +01:00
Zdenek Kabelac
60427d5d42 lib: return value
Drop label out: with goto and return NULL directly.
Add log_debug() for zero metadata offset.
2015-03-06 13:51:43 +01:00
Alasdair G Kergon
5e6e2d6b1b vgcreate: Permit non-power-of-2 extent sizes.
Relax validation to permit extent sizes > 128KB that are not powers of 2
with lvm2 format.  Existing code was already capable of handling this.
2014-10-14 18:12:15 +01:00