1
0
mirror of git://sourceware.org/git/lvm2.git synced 2025-01-03 05:18:29 +03:00
Commit Graph

675 Commits

Author SHA1 Message Date
Peter Rajnoha
e0b1415105 metadata: check for PV extension version before doing any checks on PV extension flags
PV header extension versions:
  0 - the original PV without any extensions
  1 - bootloader area support added
  2 - PV_EXT_USED flag support added

So do the associated checks related to PV_EXT_USED flag only if
PV header extension found is of version 2 and higher.
2016-02-15 12:44:46 +01:00
Peter Rajnoha
d97f1c89de metadata: _vg_read: check if PV_EXT_USED flag is set correctly for orphan PVs and do a repair if needed
If we know that the PV is orphan, meaning there's at least one MDA on
that PV which does not reference any VG and at the same time there's
PV_EXT_USED flag set, we're certainly in an inconsistent state and we
need to fix this.

For example, such situation can happen during vgremove/vgreduce if we
removed/reduced the VG, but we haven't written PV headers yet because
vgremove stopped abruptly for whatever reason just before writing new
PV headers with updated state, including PV extension flags (and so the
PV_EXT_USED flag).

However, in case the PV has no MDAs at all, we can't double-check
whether the PV_EXT_USED is correct or not - if that PV is marked
as used, it's either:
  - really used (but other disks with MDAs are missing)
  - or the error state as described above is hit

User needs to overwrite the PV header directly if it's really clear
the PV having no MDAs does not belong to any VG and at the same time
it's still marked as being in use (pvcreate -ff <dev_name> will fix this).

For example - /dev/sda here has 1 MDA, orphan and is incorrectly marked
with PV_EXT_USED flag:

$ pvs --binary -o+pv_in_use
  WARNING: Found inconsistent standalone Physical Volumes.
  WARNING: Repairing flag incorrectly marking Physical Volume /dev/sda as used.
  PV         VG     Fmt  Attr PSize   PFree   InUse
  /dev/sda          lvm2 ---  128.00m 128.00m     0
2016-02-15 12:44:46 +01:00
Peter Rajnoha
b6e3080fff pv: _pvcreate_write: do label removal and zeroing only if creating a new PV 2016-02-15 12:44:46 +01:00
Peter Rajnoha
73f1d444c8 pv: issue different message of different type when we're overwriting existing PV header instead of creating a new one
Scenario:

$ pvcreate /dev/sda
  Physical volume "/dev/sda" successfully created

We're adding the PV to a VG.

Before this patch:
$ vgcreate vg /dev/sda
  Physical volume "/dev/sda" successfully created
  Volume group "vg" successfully created

With this path applied:
$ vgcreate vg /dev/sda
  Volume group "vg" successfully created

...and verbose log containing: "Physical volume "/dev/sda" successfully written"
2016-02-15 12:44:46 +01:00
Peter Rajnoha
52999133a3 pv: check for the PV_EXT_USED flag and deny pvcreate/pvchange/pvremove/vgcreate on such PV (unless forced)
Make sure we won't use a PV that is already marked as used. Normally,
VG metadata would stop us from doing that, but we can run into a
situation where such metadata is missing because PVs with MDAs
are missing and the PVs left are the ones with 0 MDAs.

(/dev/sda in this example has 0 MDAs and it belongs to a VG,
but other PVs with MDA are missing)

$ pvs -o pv_name,pv_mda_count /dev/sda
  PV         #PMda
  /dev/sda       0

$ pvcreate /dev/sda
  PV '/dev/sda' is marked as belonging to a VG but its metadata is missing.
  Can't initialize PV '/dev/sda' without -ff.

$ pvchange -u /dev/sda
  PV '/dev/sda' is marked as belonging to a VG but its metadata is missing.
  Can't change PV '/dev/sda' without -ff.
  Physical volume /dev/sda not changed
  0 physical volumes changed / 1 physical volume not changed

$ pvremove /dev/sda
  PV '/dev/sda' is marked as belonging to a VG but its metadata is missing.
  (If you are certain you need pvremove, then confirm by using --force twice.)

$ vgcreate vg /dev/sda
  Physical volume '/dev/sda' is marked as belonging to a VG but its metadata is missing.
  Unable to add physical volume '/dev/sda' to volume group 'vg'.
2016-02-15 12:44:46 +01:00
Peter Rajnoha
10128c9bd6 metadata: schedule PV for header rewrite if adding a PV to VG or restoring VG
When adding PV to VG, we need to rewrite PV header as there's a flip
in PV_EXT_USED flag. The same applies if we're restoring VG from backup.
2016-02-15 12:44:46 +01:00
Peter Rajnoha
2950adc2ab metadata: add_pv_to_vg: add 'new_pv' arg to state if the PV is about to be created 2016-02-15 12:44:46 +01:00
Peter Rajnoha
4361543f3e refactor: rename struct pv_to_create --> struct pv_to_write
We'll use this struct in subsequent patches for PVs which should
be rewritten, not just created. So rename struct pv_to_create to
struct pv_to_write for clarity.
2016-02-15 12:44:45 +01:00
Peter Rajnoha
136fd8f2f6 conf: add metadata/check_pv_device_sizes 2016-01-22 14:16:00 +01:00
Peter Rajnoha
c0912af310 metadata: check PV dev size is not less than PV size 2016-01-22 14:16:00 +01:00
Zdenek Kabelac
fcbef05aae doc: change fsf address
Hmm rpmlint suggest fsf is using a different address these days,
so lets keep it up-to-date
2016-01-21 12:11:37 +01:00
Zdenek Kabelac
21028a7903 cleanup: reformat sentence about max sizes
The extent size must fits all blocks in 4294967295 sectors
(in 512b units) this is 1/2 KiB less then 2TiB.

So while previous statement 'suggested' 2TiB is still acceptable value,
make it clear it's not.

As now we support any multiples of 128KB as extent size -
values like 2047G will still 'flow-in' otherwise the largest power-of-2
supported value is 1TiB.

With 1TiB user needs 8388608 extents for 8EiB device.
(FYI such device is already unusable with todays glibc-2.22.90-27)

4GiB extent size is currently the smallest extent size which allows
a user to create 8EiB devices (with 2GiB  it's less then 8EiB).

TODO: lvm2 may possibly print amount of 'lost/unused space' on a PV,
since using such ridiculously sized extent size may result in huge
space being left unaccessible.
2016-01-20 13:44:47 +01:00
Alasdair G Kergon
01228b692b vgcfgrestore: Retain allocatable PV attribute.
pvchange -xn was getting lost.
All PVs were set to allocatable again after restore.

Moved setting ALLOCATABLE_PV outside pv_setup().
2016-01-14 00:46:45 +00:00
David Teigland
124b490fe6 lvmlockd: update VG lock version earlier
Have commands send lvmlockd the update message
in vg_write instead of vg_commit, so that it's
not done while LVs are suspended.  If the vg_write
is not committed, and the seqno sent to lvmlockd
is not used, then lvmlockd can detect this when
the next update uses the same seqno.
2015-12-15 16:14:49 -06:00
David Teigland
796461a912 vgrename: use process_each_vg
Use process_each_vg() to lock and read the old VG,
and then call the main vgrename code.

When real VG names are used (not a UUID in place of the
old name), the command still pre-locks the new name
(when strcmp wants it locked first), before calling
process_each_vg on the old name.

In the case where the old name is replaced with a UUID,
process_each_vg now translates that UUID into the real
VG name, which it locks and reads.  In this case, we
cannot do pre-locking to maintain lock ordering because
the old name is unknown.  So, in this case the strcmp
based lock ordering is suppressed and the old name is
always locked first.  This opens a remote chance for
lock ordering conflict between racing vgrenames between
two names where one or both commands use the UUID.
2015-12-14 14:26:47 -06:00
David Teigland
4aa9e99a10 Change messages from verbose to debug
These messages about outdated PVs should not
be verbose because they always appear, even
when there are no outdated PVs.
2015-12-11 15:28:46 -06:00
David Teigland
88cef47b18 vg_read: look up vgid from name
After recent changes to process_each, vg_read() is usually
given both the vgname and vgid for the intended VG.

However, in some cases vg_read() is given a vgid with
no vgname, or is given a vgname with no vgid.

When given a vgid with no vgname, vg_read() uses lvmcache
to look up the vgname using the vgid.  If the vgname is
not found, vg_read() fails.

When given a vgname with no vgid, vg_read() should also
use lvmcache to look up the vgid using the vgname.
If the vgid is not found, vg_read() fails.

If the lvmcache lookup finds multiple vgids for the
vgname, then the lookup fails, causing vg_read() to fail
because the intended VG is uncertain.

Usually, both vgname and vgid for the intended VG are passed
to vg_read(), which means the lvmcache translations
between vgname and vgid are not done.
2015-12-01 09:18:48 -06:00
David Teigland
05ac836798 system_id: refactor check for allowed system_id
Refactor the code that checks for an allowable system_id
so that it can be used from other places.
2015-11-30 11:46:55 -06:00
Zdenek Kabelac
66c7fa4a44 cleanup: rename lv_ondisk to lv_committed
Patch has no functional change.
2015-11-25 11:39:26 +01:00
Zdenek Kabelac
d9faf85987 cleanup: rename vg_ondisk to vg_committed
Unifying terminology.

Since all the metadata in-use are ALWAYS on disk - switch
to terminology  committed and precommitted.

Patch has no functional change inside.
2015-11-25 11:11:21 +01:00
Zdenek Kabelac
6d6c233768 cleanup: move towards using direct LV pointers
We do not won't to 'expose'  internals of VG struct.
ATM we use lists to keep all LVs - we may want to switch
to better struct for quicker 'search'.

Since we do not need 'lists' but always actual LV,
switch find_lv_in_vg_by_lvid() to return LV,
and replaces some use case of  find_lv_in_vg()
with 'better' working find_lv() which already
returns LV.
2015-11-23 23:42:59 +01:00
Zdenek Kabelac
d8049dd17a cleanup: add some test for NULL
Coverity here is a bit 'blind' here and cannot resolve which
code paths are actually able to hit this code path.
(It's using 'statistic' to resolve all possible paths,
and it's not scanning 'individual' code paths.)

This just cleans warns and add 'cheap' tests.
2015-11-17 19:01:25 +01:00
David Teigland
16780f6faa vg_read: skip repair and wipe for foreign and shared VGs
When reading a foreign VG we cannot write it, since
it belongs to another host.  When reading a shared VG
we cannot write it because we may not have an ex lock.
(Or we may be reading the shared VG while not using
lvmlockd in which case it's like reading a foreign VG.)

Add the same checks for wiping outdated PVs.  We may
read a foreign or shared VG, or see the PVs, while
another host is part way through writing a new version
of the VG to the PVs.  This might cause us to think
some of the PVs are outdated.  We do not want to
write another host's PVs, especially when we may
wrongly conclude they are outdated.
2015-11-03 13:42:21 -06:00
David Teigland
1a74171ca5 vg_read: sometimes ignore read errors
Running "vgremove -f VG & pvs" results in the pvs
command reporting that the VG is not found or is
inconsistent.  If the VG is gone or being removed,
the pvs command should just skip it and not print
errors about it.

"Not found" is because the pvs command created the
list of VGs to process, including VG, then vgremove
removed the VG, then the pvs command came to to read
the VG to process it and did not find it.

An "inconsistent" error could be reported if vgremove
had only partially completed removing VG when pvs did
vg_read on the VG to process it, causing pvs to find
the VG in a partially-removed state.

This fix adds a flag that pvs uses to ignore a VG
that can't be read or is inconsistent.
2015-10-23 10:12:34 -05:00
David Teigland
2ce8ee0214 vgcreate: initialize new PVs only in first vg_write
When a command does a sequence of
vg_write + vg_commit + vg_write + vg_commit,

initialization of non-PV devices happens during the
first vg_write, and does not need to be repeated by
the second vg_write.

When creating a lockd VG, this sequence occurs because
the VG is first created, then the lockd data is created,
then the lockd data is then written to the VG metadata.
2015-09-14 13:22:22 -05:00
Alasdair G Kergon
fb12308416 style: Standardise some error paths. 2015-09-05 23:56:30 +01:00
David Teigland
b4be988732 vgchange/lvchange: enforce the shared VG lock from lvmlockd
The vgchange/lvchange activation commands read the VG, and
don't write it, so they acquire a shared VG lock from lvmlockd.
When other commands fail to acquire a shared VG lock from
lvmlockd, a warning is printed and they continue without it.
(Without it, the VG metadata they display from lvmetad may
not be up to date.)

vgchange/lvchange -a shouldn't continue without the shared
lock for a couple reasons:

. Usually they will just continue on and fail to acquire the
  LV locks for activation, so continuing is pointless.

. More importantly, without the sh VG lock, the VG metadata
  used by the command may be stale, and the LV locks shown
  in the VG metadata may no longer be current.  In the
  case of sanlock, this would result in odd, unpredictable
  errors when lvmlockd doesn't find the expected lock on
  disk.  In the case of dlm, the invalid LV lock could be
  granted for the non-existing LV.

The solution is to not continue after the shared lock fails,
in the same way that a command fails if an exclusive lock fails.
2015-07-17 15:35:34 -05:00
David Teigland
96a883a454 metadata: change function name to _allow_extra_system_id
The previous name was misleading since this is not the
primary system_id check, only the "extra" check.
2015-07-14 14:43:16 -05:00
David Teigland
681f779a3c lockd: fix error message after a failing to get lock
There are two different failure conditions detected in
access_vg_lock_type() that should have different error
messages.  This adds another failure flag so the two
cases can be distinguished to avoid printing a misleading
error message.
2015-07-14 11:36:04 -05:00
David Teigland
0823511262 lockd: disable part of lock_args validation
There are at least a couple instances where
the lock_args check does not work correctly,
(listed in the comment), so disable the
NULL check for lock_args until those are
resolved.
2015-07-10 15:53:21 -05:00
David Teigland
841c3478fd metadata: vg_validate lock_args 2015-07-09 13:25:00 -05:00
Peter Rajnoha
3b6840e099 config: replace find_config_tree_node with find_config_tree_array where appropriate 2015-07-08 13:03:08 +02:00
David Teigland
fe70b03de2 Add lvmlockd 2015-07-02 15:42:26 -05:00
Alasdair G Kergon
4c629a5257 locking: Add missing error handling.
Add missing error logging and detection to unlock_vg and callers
of sync_local_dev_names etc.
2015-06-30 18:54:38 +01:00
Petr Rockai
632dde0cbc metadata: When outdated PVs are wiped, notify lvmetad about the fact. 2015-06-10 16:27:12 +02:00
Petr Rockai
611c8b6d29 metadata: Add pvs_outdated to struct volume_group.
This is a list of PVs that should have their MDAs wiped because they carry
outdated metadata (that used to belong to the VG they are attached to).
2015-05-20 19:46:14 +02:00
Petr Rockai
5435346052 metadata: Factor _wipe_outdated_pvs() PVs out of _vg_read(). 2015-05-20 19:46:13 +02:00
David Teigland
8e509b5dd5 toollib: avoid repeated lvmetad vg_lookup
In process_each_{vg,lv,pv} when no vgname args are given,
the first step is to get a list of all vgid/vgname on the
system.  This is exactly what lvmetad returns from a
vg_list request.  The current code is doing a vg_lookup
on each VG after the vg_list and populating lvmcache with
the info for each VG.  These preliminary vg_lookup's are
unnecessary, because they will be done again when the
processing functions call vg_read.  This patch eliminates
the initial round of vg_lookup's, which can roughly cut in
half the number of lvmetad requests and save a lot of extra work.
2015-05-08 11:44:55 -05:00
Zdenek Kabelac
2cea1c1bd9 pvcreate: fix test for wiping status
Commit ed420fb691 changed
paramet wiped to be a pointer, but missed to switch
to test pointer dereferenced value and instead always
checked 'pointer'.
2015-05-08 13:36:39 +02:00
Peter Rajnoha
8759f7d755 metadata: vg: add removed_lvs field to collect LVs which have been removed
Do not keep dangling LVs if they're removed from the vg->lvs list and
move them to vg->removed_lvs instead (this is actually similar to already
existing vg->removed_pvs list, just it's for LVs now).

Once we have this vg->removed_lvs list indexed so it's possible to
do lookups for LVs quickly, we can remove the LV_REMOVED flag as
that one won't be needed anymore - instead of checking the flag,
we can directly check the vg->removed_lvs list if the LV is present
there or not and to say if the LV is removed or not then. For now,
we don't have this index, but it may be implemented in the future.
2015-03-24 08:43:08 +01:00
Alasdair G Kergon
6407d184d1 cache: Store metadata size and checksum.
Refactor the recent metadata-reading optimisation patches.

Remove the recently-added cache fields from struct labeller
and struct format_instance.

Instead, introduce struct lvmcache_vgsummary to wrap the VG information
that lvmcache holds and add the metadata size and checksum to it.

Allow this VG summary information to be looked up by metadata size +
checksum.  Adjust the debug log messages to make it clear when this
shortcut has been successful.

(This changes the optimisation slightly, and might be extendable
further.)

Add struct cached_vg_fmtdata to format-specific vg_read calls to
preserve state alongside the VG across separate calls and indicate
if the details supplied match, avoiding the need to read and
process the VG metadata again.
2015-03-18 23:43:02 +00:00
Alasdair G Kergon
95fbbf4f40 metadata: Fix recent vg_validate message text. 2015-03-17 17:48:56 +00:00
Alasdair G Kergon
a854546234 metadata: Detect internal use of LVM_WRITE_LOCKED.
Generate internal error if LVM_WRITE_LOCKED ever appears
in struct volume_group: it's only used in external
metadata.
2015-03-09 18:56:24 +00:00
Alasdair G Kergon
faccdeda83 comments: Use full flag names. 2015-03-09 18:53:22 +00:00
Zdenek Kabelac
04101bc430 lib: drop unneeded vg_read call
Since we take a lock inside vg_lock_newname() and we do a full
detection of presence of  vgname inside all scanned labels,
there is no point to do this for second time to be sure
there is no such vg.

The only side-effect of such call would be a full validation of
some already exising VG metadata - but that's not the task for
vgcreate when create a new VG.

This call noticable reduces number of scans during 'vgcreate'.
2015-03-06 14:05:06 +01:00
Zdenek Kabelac
7e7411966a lib: avoid reparsing same metadata
When reading VG mda from multiple PVs - do all the validation only
when mda is seen for the first time and  when mda checksum and length
is same just return already existing VG pointer.

(i.e. using 300PVs for a VG would lead to create and destroy 300 config trees....)
2015-03-06 13:53:12 +01:00
David Teigland
1e65fdd9ba system_id: make new VGs read-only for old lvm versions
Previous versions of lvm will not obey the restrictions
imposed by the new system_id, and would allow such a VG
to be written.  So, a VG with a new system_id is further
changed to force previous lvm versions to treat it as
read-only.  This is done by removing the WRITE flag from
the metadata status line of these VGs, and putting a new
WRITE_LOCKED flag in the flags line of the metadata.

Versions of lvm that recognize WRITE_LOCKED, also obey the
new system_id.  For these lvm versions, WRITE_LOCKED is
identical to WRITE, and the rules associated with matching
system_id's are imposed.

A new VG lock_type field is also added that causes the same
WRITE/WRITE_LOCKED transformation when set.  A previous
version of lvm will also see a VG with lock_type as read-only.

Versions of lvm that recognize WRITE_LOCKED, must also obey
the lock_type setting.  Until the lock_type feature is added,
lvm will fail to read any VG with lock_type set and report an
error about an unsupported lock_type.  Once the lock_type
feature is added, lvm will allow VGs with lock_type to be
used according to the rules imposed by the lock_type.

When both system_id and lock_type settings are removed, a VG
is written with the old WRITE status flag, and without the
new WRITE_LOCKED flag.  This allows old versions of lvm to
use the VG as before.
2015-03-05 09:50:43 -06:00
David Teigland
c32efc7f7e system_id: apply consistent naming
In log messages refer to it as system ID (not System ID).

Do not put quotes around the system_id string when printing.

On the command line use systemid.

In code, metadata, and config files use system_id.

In lvmsystemid refer to the concept/entity as system_id.
2015-02-27 13:32:00 -06:00
David Teigland
dd6a202831 lvchange: deactivate is always possible in foreign vgs
The only realistic way for a host to have active LVs in a
foreign VG is if the host's system_id (or system_id_source)
is changed while LVs are active.

In this case, the active LVs produce an warning, and access
to the VG is implicitly allowed (without requiring --foreign.)
This allows the active LVs to be deactivated.

In this case, rescanning PVs for the VG offers no benefit.
It is not possible that rescanning would reveal an LV that
is active but wasn't previously in the VG metadata.
2015-02-25 14:58:49 -06:00
David Teigland
8668a9e81c systemid: silently ignore foreign vgs unless named
A foreign VG should be silently ignored by a reporting/display
command like 'vgs'.  If the reporting/display command specifies
a foreign VG by name on the command line, it should produce an
error message.

Scanning commands pvscan/vgscan/lvscan are always allowed to
read and update caches from all PVs, including those that belong
to foreign VGs.

Other non-report/display/scan commands always ignore a foreign
VG, or report an error if they attempt to use a foreign VG.

vgimport should always invalidate the lvmetad cache because
lvmetad likely holds a pre-vgexported copy of the VG.
(This is unrelated to using foreign VGs; the pre-vgexported
VG may have had no system_id at all.)
2015-02-25 10:53:52 -06:00