1
0
mirror of git://sourceware.org/git/lvm2.git synced 2024-12-22 17:35:59 +03:00
Commit Graph

2193 Commits

Author SHA1 Message Date
Peter Rajnoha
d97f1c89de metadata: _vg_read: check if PV_EXT_USED flag is set correctly for orphan PVs and do a repair if needed
If we know that the PV is orphan, meaning there's at least one MDA on
that PV which does not reference any VG and at the same time there's
PV_EXT_USED flag set, we're certainly in an inconsistent state and we
need to fix this.

For example, such situation can happen during vgremove/vgreduce if we
removed/reduced the VG, but we haven't written PV headers yet because
vgremove stopped abruptly for whatever reason just before writing new
PV headers with updated state, including PV extension flags (and so the
PV_EXT_USED flag).

However, in case the PV has no MDAs at all, we can't double-check
whether the PV_EXT_USED is correct or not - if that PV is marked
as used, it's either:
  - really used (but other disks with MDAs are missing)
  - or the error state as described above is hit

User needs to overwrite the PV header directly if it's really clear
the PV having no MDAs does not belong to any VG and at the same time
it's still marked as being in use (pvcreate -ff <dev_name> will fix this).

For example - /dev/sda here has 1 MDA, orphan and is incorrectly marked
with PV_EXT_USED flag:

$ pvs --binary -o+pv_in_use
  WARNING: Found inconsistent standalone Physical Volumes.
  WARNING: Repairing flag incorrectly marking Physical Volume /dev/sda as used.
  PV         VG     Fmt  Attr PSize   PFree   InUse
  /dev/sda          lvm2 ---  128.00m 128.00m     0
2016-02-15 12:44:46 +01:00
Peter Rajnoha
b6e3080fff pv: _pvcreate_write: do label removal and zeroing only if creating a new PV 2016-02-15 12:44:46 +01:00
Peter Rajnoha
73f1d444c8 pv: issue different message of different type when we're overwriting existing PV header instead of creating a new one
Scenario:

$ pvcreate /dev/sda
  Physical volume "/dev/sda" successfully created

We're adding the PV to a VG.

Before this patch:
$ vgcreate vg /dev/sda
  Physical volume "/dev/sda" successfully created
  Volume group "vg" successfully created

With this path applied:
$ vgcreate vg /dev/sda
  Volume group "vg" successfully created

...and verbose log containing: "Physical volume "/dev/sda" successfully written"
2016-02-15 12:44:46 +01:00
Peter Rajnoha
52999133a3 pv: check for the PV_EXT_USED flag and deny pvcreate/pvchange/pvremove/vgcreate on such PV (unless forced)
Make sure we won't use a PV that is already marked as used. Normally,
VG metadata would stop us from doing that, but we can run into a
situation where such metadata is missing because PVs with MDAs
are missing and the PVs left are the ones with 0 MDAs.

(/dev/sda in this example has 0 MDAs and it belongs to a VG,
but other PVs with MDA are missing)

$ pvs -o pv_name,pv_mda_count /dev/sda
  PV         #PMda
  /dev/sda       0

$ pvcreate /dev/sda
  PV '/dev/sda' is marked as belonging to a VG but its metadata is missing.
  Can't initialize PV '/dev/sda' without -ff.

$ pvchange -u /dev/sda
  PV '/dev/sda' is marked as belonging to a VG but its metadata is missing.
  Can't change PV '/dev/sda' without -ff.
  Physical volume /dev/sda not changed
  0 physical volumes changed / 1 physical volume not changed

$ pvremove /dev/sda
  PV '/dev/sda' is marked as belonging to a VG but its metadata is missing.
  (If you are certain you need pvremove, then confirm by using --force twice.)

$ vgcreate vg /dev/sda
  Physical volume '/dev/sda' is marked as belonging to a VG but its metadata is missing.
  Unable to add physical volume '/dev/sda' to volume group 'vg'.
2016-02-15 12:44:46 +01:00
Peter Rajnoha
10128c9bd6 metadata: schedule PV for header rewrite if adding a PV to VG or restoring VG
When adding PV to VG, we need to rewrite PV header as there's a flip
in PV_EXT_USED flag. The same applies if we're restoring VG from backup.
2016-02-15 12:44:46 +01:00
Peter Rajnoha
2950adc2ab metadata: add_pv_to_vg: add 'new_pv' arg to state if the PV is about to be created 2016-02-15 12:44:46 +01:00
Peter Rajnoha
4cbaaa5c98 pv: add is_used_pv fn 2016-02-15 12:44:46 +01:00
Peter Rajnoha
54b41db9a6 metadata: introduce PV_EXT_USED flag and bump PV_HEADER_EXTENSION_VSN 2016-02-15 12:44:46 +01:00
Peter Rajnoha
a522af93b7 format: add FMT_PV_FLAGS to indicate format supports PV flags 2016-02-15 12:44:46 +01:00
Peter Rajnoha
4361543f3e refactor: rename struct pv_to_create --> struct pv_to_write
We'll use this struct in subsequent patches for PVs which should
be rewritten, not just created. So rename struct pv_to_create to
struct pv_to_write for clarity.
2016-02-15 12:44:45 +01:00
Zdenek Kabelac
befe0078ad gcc: better code for older compiler
Address this gcc warning:

metadata/lv.c:243: warning: initialized field overwritten
metadata/lv.c:243: warning: (near initialization for 'status.seg_status')

Present with e.g.: gcc version 4.3.2 (Debian 4.3.2-1.1)
2016-02-12 10:17:39 +01:00
Zdenek Kabelac
cc23fdbd13 cleanup: update messages 2016-02-11 18:38:40 +01:00
Zdenek Kabelac
032cf8ade6 cleanup: relocate function to vg.c 2016-02-11 18:35:06 +01:00
Zdenek Kabelac
acf7815aca cleanup: stripes_extents
Simplify calculation of extents rounding needed for
segment size.

Segment size has to divisible by 'extent count' needed to contain
whole stripe. LVM currently does not support stripes across segment.

In case the stripe size is bigger then extent size,
require bigger rounding.
2016-02-11 18:35:06 +01:00
Zdenek Kabelac
3f916e8285 lvresize: check for given parameters
Check ac_ value as passed args.
Also drop reseting 'computed' values - since they get
assigned values later.
2016-02-11 18:35:06 +01:00
Peter Rajnoha
136fd8f2f6 conf: add metadata/check_pv_device_sizes 2016-01-22 14:16:00 +01:00
Peter Rajnoha
c0912af310 metadata: check PV dev size is not less than PV size 2016-01-22 14:16:00 +01:00
Zdenek Kabelac
fcbef05aae doc: change fsf address
Hmm rpmlint suggest fsf is using a different address these days,
so lets keep it up-to-date
2016-01-21 12:11:37 +01:00
Zdenek Kabelac
cc53a23d82 cleanup: join if/else 2016-01-21 12:11:37 +01:00
Zdenek Kabelac
21028a7903 cleanup: reformat sentence about max sizes
The extent size must fits all blocks in 4294967295 sectors
(in 512b units) this is 1/2 KiB less then 2TiB.

So while previous statement 'suggested' 2TiB is still acceptable value,
make it clear it's not.

As now we support any multiples of 128KB as extent size -
values like 2047G will still 'flow-in' otherwise the largest power-of-2
supported value is 1TiB.

With 1TiB user needs 8388608 extents for 8EiB device.
(FYI such device is already unusable with todays glibc-2.22.90-27)

4GiB extent size is currently the smallest extent size which allows
a user to create 8EiB devices (with 2GiB  it's less then 8EiB).

TODO: lvm2 may possibly print amount of 'lost/unused space' on a PV,
since using such ridiculously sized extent size may result in huge
space being left unaccessible.
2016-01-20 13:44:47 +01:00
Zdenek Kabelac
7b5a8f61a7 cleanup: drop extra cmd passed arg
Use vg->cmd when needed cmd struct.
2016-01-20 13:44:47 +01:00
Zdenek Kabelac
b64703401d cleanup: relocate size assign
Directly set thin-pool size when thin data LV size changes.
2016-01-20 13:44:47 +01:00
Zdenek Kabelac
ca878a3426 cleanup: adjust once 2016-01-20 13:44:47 +01:00
Zdenek Kabelac
178cbb580a cleanup: update check function
Use display_lvname().
Use lv_is_lockd_sanlock_lv().
Order  'error' checks ahead of 'ignore' ones.
2016-01-20 13:44:47 +01:00
Zdenek Kabelac
4b9ae55a8d cleanup: shuffle check of threshold
Check first threshold and then policy_amount.
2016-01-20 13:44:47 +01:00
Zdenek Kabelac
c99ca6f430 cleanup: use log_print
Using log_print for ignoring message instead of log_warn.
Add some missing '.'.
2016-01-20 13:44:47 +01:00
Peter Rajnoha
2a4ef78c4a report: fix off-by-one error when reporting LV segment's metadata device extent count
Commit a3f484f812 used "-1" two times by
mistake for the extent count when reporting seg_metadata_le_ranges.
2016-01-19 14:44:06 +01:00
Peter Rajnoha
1341f83554 report: add seg_le_ranges report field 2016-01-19 14:30:21 +01:00
Peter Rajnoha
fccb1bb276 report: make devices, metadata_devices, seg_pe_ranges and seg_metadata_le_ranges fields consistent
There are two basic groups of fields for LV segment device reporting:
  - related to LV segment's devices: devices and seg_pe_ranges
  - related to LV segment's metadata devices: metadata_devices and seg_metadata_le_ranges

The devices and metadata_devices report devices in this format:
    "device_name(extent_start)"

The seg_pe_ranges and seg_metadata_le_ranges report devices in
this format:
    "device_name:extent_start-extent_end"

This patch reverts partly what commit 7f74a99502
(v 2.02.140) introduced in this area - it added [] for
hidden devices to mark them for all four fields mentioned above.

We won't be marking hidden devices in devices and metadata_devices
fields.

The seg_metadata_le_ranges field will have hidden devices marked -
it's new enough that we don't need to care about compatibility much
yet.

The seg_pe_ranges is old enough that we shouldn't be changing this
one - so we're reverting to not marking hidden devices here.
Instead, there's going to be a new field "seg_le_ranges" which
is going to replace the seg_pe_ranges and it will mark hidden devices -
this is going to be introduced in a patch later.

So in the end we'll end up with:

   (LV segment's devices)
   devices field with "device_name(extent_start)" format, not marking hidden devices
   seg_pe_ranges field with "device_name:extent_start-extent_end" format, not marking hidden devices (deprecated, new seg_le_ranges should be used instead for standardized format)
   seg_le_ranges field with "device_name:extent_start-extent_end" format, marking hidden devices

   (LV segment's metadata devices)
   metadata_devices field with "device_name:extent_start-extent_end" format, not marking hidden devices
   seg_metadata_le_ranges field with "device_name:extent_start-extent_end" format, marking hidden devices

Also, both seg_le_ranges and seg_metadata_le_ranges will honour the
report/list_item_separator setting which can be used to configure
the delimiter used for list items.

So, to sum it up, we will recommend using the new seg_le_ranges and
seg_metadata_le_ranges fields because they display devices with
standard extent range format, they can mark hidden devices and they
honour the report/list_item_separator setting.

We'll be keeping devices,seg_pe_ranges and metadata_devices fields
for compatibility.
2016-01-19 14:30:20 +01:00
Peter Rajnoha
b160b73800 report: change _format_pvsegs to return list instead of plain string, change associated report fields from STR to STR_LIST
The associated devices,metadata_devices,seg_pe_ranges and
seg_metadata_le_ranges are reported as genuine string lists now.
This allows for using the items separately in -S|--select
(so searching for subsets etc.) and also it allows for
configuring the separator using report/list_item_separator
which may be useful in scripts (however, we'll enable this
only for seg_le_metadata_ranges and not for devices,seg_pe_ranges
and seg_metadata_devices for compatibility reasons - see following
patch).
2016-01-19 14:17:41 +01:00
Alasdair G Kergon
a3f484f812 report: Fix seg_pe_ranges LV sizes.
When reporting on LVs, take the end of the range from the size of the
underlying (hidden) LV rather than the logical size of the current
segment (that PVs use).
2016-01-18 22:04:43 +00:00
Peter Rajnoha
7559af2334 lvm2app: fix lvm2app to return either 0 or 1 for lvm_vg_is_{clustered,exported}
Fix lvm2app to return either 0 or 1 for lvm_vg_is_{clustered,exported},
including internal functions pvseg_is_allocated and vg_is_resizeable
which are not yet exposed in lvm2app but make them consistent with the
rest.
2016-01-15 14:42:18 +01:00
Peter Rajnoha
b82d5ee092 report: add kernel_discards report field to display thin pool discard used in kernel
Thin pool discard mode set in metadata can be different from the one
actually used if any device underneath does not support that mode. Add
kernel_discard report field to make it possible to see this difference.
2016-01-14 16:54:12 +01:00
Zdenek Kabelac
88400b599e lvmanip: fix last commit and drop else
In last commit when removing if() branch
this 'else' now has to be dropped.
2016-01-14 11:55:47 +01:00
Zdenek Kabelac
278c5509ee cleanup: order ac members 2016-01-14 11:34:05 +01:00
Zdenek Kabelac
c9a813bff8 cleanup: spaces 2016-01-14 11:34:05 +01:00
Zdenek Kabelac
7d2b7f2bd8 cleanup: replace log_warn 2016-01-14 11:34:05 +01:00
Zdenek Kabelac
2567d03e95 cleanup: explicit prohibition for virtual segs
Internal _alloc_init() is only called from allocate_extents(),
which already does prevent usage of virtual segments.

So mark as internal error early and do not process it any further.
2016-01-14 11:34:05 +01:00
Zdenek Kabelac
4310dfd4e1 cleanup: simplier formula 2016-01-14 11:34:05 +01:00
Zdenek Kabelac
ebcfd09ba9 cleanup: more readable check 2016-01-14 11:34:05 +01:00
Zdenek Kabelac
753a496348 snapshot: relocate alloc_snapshot_seg
Move alloc_snapshot_seg to snapshot_manip and make it local static.
2016-01-14 11:34:05 +01:00
Zdenek Kabelac
526297296f lvmanip: add lv_is_snapshot
Add new test for  lv_is_snapshot().
Also move few other bitchecks into same place as remaining bit tests.

TODO: drop lv_is_merging_origin() and keep using lv_is_merging().
2016-01-14 11:34:04 +01:00
Alasdair G Kergon
01228b692b vgcfgrestore: Retain allocatable PV attribute.
pvchange -xn was getting lost.
All PVs were set to allocatable again after restore.

Moved setting ALLOCATABLE_PV outside pv_setup().
2016-01-14 00:46:45 +00:00
Peter Rajnoha
63d59254d9 lv: fix check for NULL origin_lv in _do_lv_origin_dup, cleanup _do_lvconvert_lv_dup 2016-01-13 17:11:05 +01:00
Peter Rajnoha
1417ed304b cleanup: rename 'invisible devices' to 'hidden devices' 2016-01-13 15:22:46 +01:00
Peter Rajnoha
84a9f750fe cleanup: cleanup lv.h and put fns into categories for better readability 2016-01-13 12:17:00 +01:00
Peter Rajnoha
e168b5de75 conf: add report/mark_invisible_devices 2016-01-13 12:01:10 +01:00
Peter Rajnoha
7f74a99502 lv: use brackets for invisible devices when formatting device segments
Include brackets for the name if the dev is invisible.
This change applies to all callers of _format_pvsegs fn:
  - lvseg_devices (the "lvs -o devices")
  - lvseg_metadata_devices (the "lvs -o metadata_devices)
  - lvseg_seg_pe_ranges (the "lvs -o seg_pe_ranges")
  - lvseg_seg_metadata_le_ranges (the "lvs -o seg_metadata_le_ranges")
2016-01-13 11:20:04 +01:00
Peter Rajnoha
f1fe7af014 lv: add common lv_pool_lv fn for use in report and dup, use brackets for invisible devices
The common lv_pool_lv fn avoids code duplication and also
the reporting part now uses _lvname_disp and _uuid_disp to display
name and uuid respectively, including brackets for the name if the
dev is invisible.
2016-01-13 11:20:01 +01:00
Peter Rajnoha
42fcbc1fd4 lv: add common lv_metadata_lv fn for use in report and dup, use brackets for invisible devices
The common lv_metadata_lv fn avoids code duplication and also
the reporting part now uses _lvname_disp and _uuid_disp to display
name and uuid respectively, including brackets for the name if the
dev is invisible.
2016-01-13 11:19:58 +01:00
Peter Rajnoha
cdbf76b2f0 lv: add common lv_data_lv fn for use in report and dup, use brackets for invisible devices
The common lv_data_lv fn avoids code duplication and also
the reporting part now uses _lvname_disp and _uuid_disp to display
name and uuid respectively, including brackets for the name if the
dev is invisible.
2016-01-13 11:19:55 +01:00
Peter Rajnoha
d50cd9d8d7 lv: add common lv_mirror_log_lv for use in report and dup, use brackets for invisible devices
The common lv_mirror_log_lv fn avoids code duplication and also
the reporting part now uses _lvname_disp and _uuid_disp to display
name and uuid respectively, including brackets for the name if the
dev is invisible.
2016-01-13 11:19:51 +01:00
Peter Rajnoha
aae45a1f21 lv: add common lv_origin_lv fn for use in report and dup, use brackets for invisible devices
The common lv_origin_lv fn avoids code duplication and also
the reporting part now uses _lvname_disp and _uuid_disp to display
name and uuid respectively, including brackets for the name if the
dev is invisible.
2016-01-13 11:19:45 +01:00
Peter Rajnoha
1bd83814ce lv: add common lv_convert_lv fn for use in report and dup, use brackets for invisible devices
The common lv_convert_lv fn avoids code duplication and also
the reporting part now uses _lvname_disp and _uuid_disp to display
name and uuid respectively, including brackets for the name if the
dev is invisible.
2016-01-13 11:16:37 +01:00
David Teigland
124b490fe6 lvmlockd: update VG lock version earlier
Have commands send lvmlockd the update message
in vg_write instead of vg_commit, so that it's
not done while LVs are suspended.  If the vg_write
is not committed, and the seqno sent to lvmlockd
is not used, then lvmlockd can detect this when
the next update uses the same seqno.
2015-12-15 16:14:49 -06:00
David Teigland
796461a912 vgrename: use process_each_vg
Use process_each_vg() to lock and read the old VG,
and then call the main vgrename code.

When real VG names are used (not a UUID in place of the
old name), the command still pre-locks the new name
(when strcmp wants it locked first), before calling
process_each_vg on the old name.

In the case where the old name is replaced with a UUID,
process_each_vg now translates that UUID into the real
VG name, which it locks and reads.  In this case, we
cannot do pre-locking to maintain lock ordering because
the old name is unknown.  So, in this case the strcmp
based lock ordering is suppressed and the old name is
always locked first.  This opens a remote chance for
lock ordering conflict between racing vgrenames between
two names where one or both commands use the UUID.
2015-12-14 14:26:47 -06:00
David Teigland
4aa9e99a10 Change messages from verbose to debug
These messages about outdated PVs should not
be verbose because they always appear, even
when there are no outdated PVs.
2015-12-11 15:28:46 -06:00
Zdenek Kabelac
cd8e95d933 lvrename: always allow to rename pools
Since we mark cache-pool as 'hidden/private' while it is in-use,
we may still allow user to change it's name.

It should not cause any harm and user may prefer better naming
for a cache-pool in use.
2015-12-10 21:01:24 +01:00
Zdenek Kabelac
bf4b74c5eb cache: support stacked rename
Preserve skip_pool flag when running for_each_sub_lv() so
lvrename continues to work when thin-pool is using cached
data LV.
2015-12-10 21:01:24 +01:00
David Teigland
bdba4e7a93 lvrename: move the lvmlockd LV lock
The function it was in is used for various
internal renaming of hidden LVs where a lock
from lvmlockd does not apply.
2015-12-09 11:59:49 -06:00
David Teigland
88cef47b18 vg_read: look up vgid from name
After recent changes to process_each, vg_read() is usually
given both the vgname and vgid for the intended VG.

However, in some cases vg_read() is given a vgid with
no vgname, or is given a vgname with no vgid.

When given a vgid with no vgname, vg_read() uses lvmcache
to look up the vgname using the vgid.  If the vgname is
not found, vg_read() fails.

When given a vgname with no vgid, vg_read() should also
use lvmcache to look up the vgid using the vgname.
If the vgid is not found, vg_read() fails.

If the lvmcache lookup finds multiple vgids for the
vgname, then the lookup fails, causing vg_read() to fail
because the intended VG is uncertain.

Usually, both vgname and vgid for the intended VG are passed
to vg_read(), which means the lvmcache translations
between vgname and vgid are not done.
2015-12-01 09:18:48 -06:00
Zdenek Kabelac
6336ef98d4 lib: pass mem pool to check_transient_status
check_transient_status() may need to allocate some memory,
so pass in already existing mem pool.
2015-12-01 13:01:28 +01:00
David Teigland
05ac836798 system_id: refactor check for allowed system_id
Refactor the code that checks for an allowable system_id
so that it can be used from other places.
2015-11-30 11:46:55 -06:00
Zdenek Kabelac
66c7fa4a44 cleanup: rename lv_ondisk to lv_committed
Patch has no functional change.
2015-11-25 11:39:26 +01:00
Zdenek Kabelac
d9faf85987 cleanup: rename vg_ondisk to vg_committed
Unifying terminology.

Since all the metadata in-use are ALWAYS on disk - switch
to terminology  committed and precommitted.

Patch has no functional change inside.
2015-11-25 11:11:21 +01:00
Zdenek Kabelac
6d6c233768 cleanup: move towards using direct LV pointers
We do not won't to 'expose'  internals of VG struct.
ATM we use lists to keep all LVs - we may want to switch
to better struct for quicker 'search'.

Since we do not need 'lists' but always actual LV,
switch find_lv_in_vg_by_lvid() to return LV,
and replaces some use case of  find_lv_in_vg()
with 'better' working find_lv() which already
returns LV.
2015-11-23 23:42:59 +01:00
Zdenek Kabelac
e2b00b0a89 cleanup: use display_lvname in pmspare
Just switch to use display_lvname().
Also squeeze possibly failing strncpy into INTERNAL_ERROR
as lvname always should fit.
2015-11-18 22:17:26 +01:00
Zdenek Kabelac
d8049dd17a cleanup: add some test for NULL
Coverity here is a bit 'blind' here and cannot resolve which
code paths are actually able to hit this code path.
(It's using 'statistic' to resolve all possible paths,
and it's not scanning 'individual' code paths.)

This just cleans warns and add 'cheap' tests.
2015-11-17 19:01:25 +01:00
Zdenek Kabelac
cad3568def raid: drop unneeded NULL test
Skip testing target_pvs for NULL, we already
dereference it in many other places.
If check would ever be needed - it needs to be
in front of _raid_extract_images().
2015-11-17 19:01:25 +01:00
Alasdair G Kergon
970a428909 pvmove: Remove unused find_pvmove_lv_from_pvname. 2015-11-13 18:06:08 +00:00
Zdenek Kabelac
b2e13ac552 coverity: add few internal errors
Mark impossible paths with internal errors.
Also replace 'strcmp() with more readable seg_is...()
2015-11-13 11:18:27 +01:00
Zdenek Kabelac
1f2a42c7b7 cleanup: check LVs in one statement
Use a single statement to check all LVs we want to
deref via  get_only_segment_using_this_lv().
2015-11-13 11:17:06 +01:00
Zdenek Kabelac
70af08122e cleanup: missing check for PV2
Patch missed also check this pointer dereference.
2015-11-13 11:17:06 +01:00
Zdenek Kabelac
007be91e3d raid: ensure area_count is at least 2
Enusure we will not divide by 0.
2015-11-13 11:17:06 +01:00
Zdenek Kabelac
76b42901c0 cache: ensure there is no NULL str
Coverity is not smart enough to detect this case could never happen.
2015-11-09 17:04:10 +01:00
Zdenek Kabelac
e4c9b390ca cleanup: update comment 2015-11-09 12:21:17 +01:00
Zdenek Kabelac
57c2a1ae8c raid: mark intententional copy and paste
Coverity: add this extra comment, to let Coverity know this
slightly changed copy&paste code is intentional.
2015-11-09 10:22:52 +01:00
Zdenek Kabelac
9b9b5d0ea2 cleanup: use 64bit multiply for print 2015-11-09 10:22:52 +01:00
Zdenek Kabelac
5ba219e87a cleanup: use display_lvname 2015-11-09 10:22:51 +01:00
Zdenek Kabelac
04f76d9020 cleanup: use NAME_LEN
Let's have instant check for max name len when creating
subLV name.
2015-11-09 10:22:51 +01:00
Zdenek Kabelac
2e04eee192 cleanup: do not test alloca for NULL
alloca() never returns NULL.
In case stack is out-of-range the behaviour is undefined.
2015-11-09 10:22:51 +01:00
Zdenek Kabelac
07046e994f alloc: use own mem pool for alloc_handle
Keep alloc_handle's data in a single mempool and do not
spread them into vgmem pool.
2015-11-09 10:22:49 +01:00
Zdenek Kabelac
0c380c316c cleanup: relocate error capture
Capture internal error before allocation anything.
2015-11-09 10:21:09 +01:00
Zdenek Kabelac
856e11e11c lv_manip: do not deref NULL for debug message
Coverity: when 'pv2' would be passed as NULL, do not try to
deref it in debug message.
2015-11-09 10:19:20 +01:00
Alasdair G Kergon
8096f2224c mirror: Fix log size calc when more than 1 extent.
Currently the code creates the log separately after allocating space for
the data and as no data allocation is needed this second time,
total_extents ends up holding zero so use new_extents directly instead.
2015-11-05 23:40:47 +00:00
David Teigland
16780f6faa vg_read: skip repair and wipe for foreign and shared VGs
When reading a foreign VG we cannot write it, since
it belongs to another host.  When reading a shared VG
we cannot write it because we may not have an ex lock.
(Or we may be reading the shared VG while not using
lvmlockd in which case it's like reading a foreign VG.)

Add the same checks for wiping outdated PVs.  We may
read a foreign or shared VG, or see the PVs, while
another host is part way through writing a new version
of the VG to the PVs.  This might cause us to think
some of the PVs are outdated.  We do not want to
write another host's PVs, especially when we may
wrongly conclude they are outdated.
2015-11-03 13:42:21 -06:00
Zdenek Kabelac
175119fdcd cleanup: remove thin low_water_mark from metadata
This option could never have been printed in lvm2 metadata, so it could
be safely removed as it could have been set only as 0.

These configurable setting is supported via metadata profile.
2015-10-29 12:14:20 +01:00
Zdenek Kabelac
33a8a2febf cleanup: use same type 2015-10-29 12:14:20 +01:00
Marian Csontos
1af2ab10d0 cleanup: snapshots of snapshots message
No plans to support thick snapshost of snapshots.
2015-10-27 11:42:48 +01:00
Zdenek Kabelac
40eea582ae lv_manip: ensure it will fit bellow threshold
Use single code to evaluate if the percentage value has
crossed threshold.

Recalculate amount value to always fit bellow
threshold so there are not need any extra reiterations
to reach this state in case policy amount is too small.
2015-10-25 21:03:11 +01:00
Zdenek Kabelac
b780d329aa thin: fix percentage compare
Since plugin's percentage compare has been fixed,
it's now revealed wrong compare here.

The logic for threshold is - to allow to go as high
as given value e.g. 80% - so if pool is exactlu 80%
full it's still allowed to use it (dmeventd will not
resize it).
2015-10-25 21:01:54 +01:00
David Teigland
1a74171ca5 vg_read: sometimes ignore read errors
Running "vgremove -f VG & pvs" results in the pvs
command reporting that the VG is not found or is
inconsistent.  If the VG is gone or being removed,
the pvs command should just skip it and not print
errors about it.

"Not found" is because the pvs command created the
list of VGs to process, including VG, then vgremove
removed the VG, then the pvs command came to to read
the VG to process it and did not find it.

An "inconsistent" error could be reported if vgremove
had only partially completed removing VG when pvs did
vg_read on the VG to process it, causing pvs to find
the VG in a partially-removed state.

This fix adds a flag that pvs uses to ignore a VG
that can't be read or is inconsistent.
2015-10-23 10:12:34 -05:00
Alasdair G Kergon
51735f09f7 thin: Fix typo in policy threshold message. 2015-10-23 15:38:31 +01:00
Peter Rajnoha
8b965bd3d5 pvremove: make sure even invalid info is removed from lvmcache on pvremove
The lvmcache info might be resued, most notably in lvm shell.
We need to be sure that even lvmcache_info marked as invalid
is removed from the lvmcache so it does not confuse any subsequent
code/commands executed later on.

Problematic example with the lvm shell:

lvm> pvs
  PV         VG   Fmt  Attr PSize   PFree
  /dev/sda        lvm2 ---  128.00m 128.00m

Before this patch (/dev/sda still displayed in a way):
======================================================

lvm> pvremove /dev/sda
  Labels on physical volume "/dev/sda" successfully wiped

(without lvmetad)
lvm> pvs
  No physical volume label read from /dev/sda

(with lvmetad)
lvm> pvs
  PV         VG   Fmt  Attr PSize   PFree
  /dev/sda        lvm2 ---  128.00m 128.00m

With this patch applied:
========================

lvm> pvremove /dev/sda
  Labels on physical volume "/dev/sda" successfully wiped

(without lvmetad)
lvm> pvs

(with lvmetad)
lvm> pvs
2015-10-23 15:48:06 +02:00
Zdenek Kabelac
1a2d0a0c72 cleanup: indents 2015-10-22 22:46:10 +02:00
Zdenek Kabelac
19e272ba53 lib: better reporting of threshold
Simplify code reporting warning about incorrect thresholds.
2015-10-22 22:06:14 +02:00
Zdenek Kabelac
7c36d7c90c thin: enforce local activation when creation new thin
As we need to check how full thin-pool is - require thin-pool is
locally active.
2015-10-14 01:00:35 +02:00
Zdenek Kabelac
86b04ebd19 thin: enhance logging
Add debug message with more details about threshold overflow.
2015-10-13 14:38:52 +02:00
Zdenek Kabelac
c7b4359ff4 thin: check for overflown pool earlier
Check for pool early before we actually start to modify metadata.
This requires locally active thin-pool.
2015-10-13 14:37:07 +02:00
Peter Rajnoha
e04424e87e report: identify LV hodling sanlock locks as 'private,lockd,sanlock' within lv_role report field
Before this patch:
$ lvs -a -o name,layout,role test/lvmlock
  LV        Layout     Role
  [lvmlock] linear     public

With this patch applied:
$ lvs -a -o name,layout,role test/lvmlock
  LV        Layout     Role
  [lvmlock] linear     private,lockd,sanlock
2015-10-08 13:44:29 +02:00
Heinz Mauelshagen
b33d7586e7 raid_manip: fix wrong image size allocation on raid10 "lvconvert --replace ..." 2015-10-02 17:09:37 +02:00
Alasdair G Kergon
fb957ef322 raid: Add metadata dev information to reports.
Add metadata_devices and seg_metadata_le_ranges report fields.
Currently only defined for raid, but should probably be extended
to all other segment types that don't report all their device
usage in the 'devices' field.
2015-10-02 10:09:28 +01:00
Zdenek Kabelac
a139275eca alloc: fix update or area_len
Commit: 192d9ad977
changed logic for area_len formula - so it returns
different values.

Placing () to restore previous behaviour and make it
explicit.
2015-10-01 15:02:49 +02:00
Heinz Mauelshagen
eab099b221 segtypes: Use flags in raid segtype macros. 2015-09-24 20:43:18 +01:00
Heinz Mauelshagen
3036620b48 raid: Add a segtype flag for each raid type. 2015-09-24 20:17:57 +01:00
Heinz Mauelshagen
028715b0f0 raid: Detect whether or not kernel supports raid0. 2015-09-24 19:59:29 +01:00
Alasdair G Kergon
4a74e19f80 alloc: Move _calc_area_multiple. 2015-09-24 17:56:19 +01:00
Alasdair G Kergon
e773e71910 stripes: Introduce _round_to_stripe_boundary. 2015-09-24 17:50:53 +01:00
Alasdair G Kergon
39a97d86f0 segtypes: Add and use new segtype macros.
Includes fixing an inverted raid10 segtype check in _raid_add_target_line.
2015-09-24 14:59:07 +01:00
Alasdair G Kergon
41fe225b0d style: lv_manip.c changes 2015-09-24 13:43:58 +01:00
Heinz Mauelshagen
192d9ad977 style: Miscellaneous tidying up of metadata/lv* 2015-09-23 14:37:52 +01:00
Alasdair G Kergon
28aff5d240 segtypes: Make constants ULL. 2015-09-22 21:10:46 +01:00
Alasdair G Kergon
214e2cddf6 segtypes: Use SEG_TYPE_NAME_ string constants. 2015-09-22 19:04:12 +01:00
Heinz Mauelshagen
0ce150280e segtypes: Extend flags to 64 bits. 2015-09-22 18:03:33 +01:00
Peter Rajnoha
a54b4bba35 report: add lv_convert_lv_uuid field 2015-09-21 14:22:23 +02:00
Peter Rajnoha
0a01c5aa36 report: add lv_move_pv_uuid field 2015-09-21 14:22:03 +02:00
Peter Rajnoha
f01b7afa19 pv: add 'mem' arg for pv_uuid_dup and pv_name_dup 2015-09-21 14:21:42 +02:00
Peter Rajnoha
ffa7b37b28 report: add lv_mirror_log_uuid field 2015-09-21 14:21:39 +02:00
Peter Rajnoha
f61a394be4 report: add lv_data_lv_uuid field 2015-09-21 14:21:21 +02:00
Peter Rajnoha
c2ea5b3dee report: add lv_metadata_lv_uuid field 2015-09-21 14:20:58 +02:00
Peter Rajnoha
199697accf report: add lv_origin_uuid field 2015-09-21 14:20:36 +02:00
Peter Rajnoha
cb8f29d147 report: add lv_pool_lv_uuid field 2015-09-21 14:20:12 +02:00
Peter Rajnoha
0e3042f488 lv: add 'mem' arg for lv_uuid_dup 2015-09-21 12:25:31 +02:00
David Teigland
2ce8ee0214 vgcreate: initialize new PVs only in first vg_write
When a command does a sequence of
vg_write + vg_commit + vg_write + vg_commit,

initialization of non-PV devices happens during the
first vg_write, and does not need to be repeated by
the second vg_write.

When creating a lockd VG, this sequence occurs because
the VG is first created, then the lockd data is created,
then the lockd data is then written to the VG metadata.
2015-09-14 13:22:22 -05:00
Zdenek Kabelac
ffeeb5c1e7 thin: show message on error path
Add missing log_error and show proper reason for failure
when autoextend is set to 0.

Add missing log_error when checked LV is not locally active.
2015-09-14 20:18:54 +02:00
Zdenek Kabelac
729b035edd pool: validate pool_metadata has proper suffix 2015-09-11 21:52:27 +02:00
Zdenek Kabelac
5911fa1d91 cache: warn if caching causes troubles
Certain stacks of cached LVs may have unexpected consequences.
So add a warning function called when LV is cached to detect
such caces and WARN user about them - the best we could do ATM.
2015-09-10 17:27:30 +02:00
Zdenek Kabelac
e1edb5676e lib: when moving segtypes, move LV bits
When we insert layer we also move status flag-bits for certain LV types,
so internal volume_group structure remains consistent.
(Perhaps it's misuse of 'insert_layer' function and we should have
another similar function for this.)

Basically we aim to maintain the same state as after reading fresh
metadata out of volume group.

Currently we when i.e. cache  'raid' LV - this should transfer 'raidLV' flag
to  _corigin LV and cache is no longer a raid.

TODO: bits for stacked devices needs more exact rules.
2015-09-10 17:25:28 +02:00
Zdenek Kabelac
ffbf12504d cleanup: assign seg_name once 2015-09-07 17:44:08 +02:00
Alasdair G Kergon
fb12308416 style: Standardise some error paths. 2015-09-05 23:56:30 +01:00
Zdenek Kabelac
32d6ca9196 cleanup: show error message
Add error message on error path.
2015-09-03 23:34:36 +02:00
Zdenek Kabelac
6d9e7d48fb cleanup: add . 2015-08-21 15:35:45 +02:00
Zdenek Kabelac
e4b9ac46d7 thin: metadata size cannot be reduced
Until we implement offline metadata manipulation,
the size of metadata LV cannot be reduced.
2015-08-21 15:35:45 +02:00
Zdenek Kabelac
55a9262bdb cleanup: unused header files (Coverity) 2015-08-18 15:00:08 +02:00
Zdenek Kabelac
4b28383b1c cache: move detection code to cache_set_policy
Move code which runtime detects settings for cache_policy
out of config dir to cache seg handling code.

Also mark cache_mode as command profilable setting.
2015-08-17 15:52:06 +02:00
Zdenek Kabelac
79ea81b8a8 thin: restore transaction_id handling
Revert back to already existing behavior which has been slightly
modified by a900d150e4.

At the end however it seem to be equal to change TID right with first
metadata write.

Existing code missed handling for 'unused' thin-pool which would
require to also check empty message list for TID==0.

So with the fix we now again preserve 'active' thin-pool volume
when first thin volume is created - this property was lost and caused
problems in cluster, where the lock was hold, but volume was no longer
active on the node.

Another missing part was the proper support for already increased,
but unfinished TID change.

So going back here with existing logic -

TID is increased with first MDA update.

Code allows start with either same TID or (TID-1).

If there are messages, TID must be lower by 1 for sending,
otherwise messages were already posted.
2015-08-17 11:25:03 +02:00
Zdenek Kabelac
48ed8ac50c cleanup: indent 2015-08-12 14:33:16 +02:00
Zdenek Kabelac
533ac4d47d cache: add more validation 2015-08-12 14:33:14 +02:00
Zdenek Kabelac
f0c18fceb4 cache: api update
Change logic and naming of some internal API functions.

cache_set_mode() and cache_set_policy() both take segment.

cache mode is now correctly 'masked-in'.

If the passed segment is 'cache' segment - it will automatically
try to find 'defaults' according to profiles if the are NOT
specified on command line or they are NOT already set for cache-pool.

These defaults are never set for cache-pool.
2015-08-12 14:32:24 +02:00
Zdenek Kabelac
969ee25a74 toollib: get_cache_params
Enhance  get_cache_params() to read common cache args.
2015-08-12 14:11:18 +02:00
Zdenek Kabelac
8a74d1ec79 cache: detect smq policy presence
Add code to detect available cache features.
Support policy_mq & policy_smq  features which might be disabled.

Introduce global_cache_disabled_features_CFG.
2015-08-12 14:11:17 +02:00
David Teigland
53c08f0bba lvrename: fix lockd LV locking
lvrename should not be done if the LV is active on another host.
This check was mistakenly removed when the code was changed to
use LV uuids in locks rather than LV names.
2015-08-10 15:46:21 -05:00
David Teigland
b4be988732 vgchange/lvchange: enforce the shared VG lock from lvmlockd
The vgchange/lvchange activation commands read the VG, and
don't write it, so they acquire a shared VG lock from lvmlockd.
When other commands fail to acquire a shared VG lock from
lvmlockd, a warning is printed and they continue without it.
(Without it, the VG metadata they display from lvmetad may
not be up to date.)

vgchange/lvchange -a shouldn't continue without the shared
lock for a couple reasons:

. Usually they will just continue on and fail to acquire the
  LV locks for activation, so continuing is pointless.

. More importantly, without the sh VG lock, the VG metadata
  used by the command may be stale, and the LV locks shown
  in the VG metadata may no longer be current.  In the
  case of sanlock, this would result in odd, unpredictable
  errors when lvmlockd doesn't find the expected lock on
  disk.  In the case of dlm, the invalid LV lock could be
  granted for the non-existing LV.

The solution is to not continue after the shared lock fails,
in the same way that a command fails if an exclusive lock fails.
2015-07-17 15:35:34 -05:00
Alasdair G Kergon
b93b85378d alloc: Fix lvextend failure when varying stripes.
A segfault was reported when extending an LV with a smaller number of
stripes than originally used.  Under unusual circumstances, the cling
detection code could successfully find a match against the excess
stripe positions and think it had finished prematurely leading to an
allocation being pursued with a length of zero.

Rename ix_offset to num_positional_areas and move it to struct
alloc_state so that _is_condition() can obtain access to it.

In _is_condition(), areas_size can no longer be assumed to match the
number of positional slots being filled so check this newly-exposed
num_positional_areas directly instead.  If the slot is outside the
range we are trying to fill, just ignore the match for now.

(Also note that the code still only performs cling detection against
the first segment of the LV.)
2015-07-15 23:12:54 +01:00
Zdenek Kabelac
a7101e7bfb cleanup: drop duplicated seg test
Test is already in seg_is_pool() if branch.
and one minor indent fix.
2015-07-15 13:10:22 +02:00
Zdenek Kabelac
c2d4330f27 cache: enhance cache-pool validation
Capture cache-pool without cache policy name set.
2015-07-15 13:10:22 +02:00
Zdenek Kabelac
e9e35b011e cache: handle policy_name separately
Keep policy name separate from policy settings and avoid
to mangling and demangling this string from same config tree.
Ensure policy_name is always defined.
2015-07-15 13:10:22 +02:00
Zdenek Kabelac
86a4d47215 cache: move setting of cache policy
Set policy before saving 1st. metadata and avoid unnecessary reload.
Fixes problem when we stored cache-pool without cache-policy set.
2015-07-15 13:10:21 +02:00
Zdenek Kabelac
4a33d57143 thin: fix warning for overprovisioning
When lvm.conf is properly configure for auto resize of overprovisioned
thin-pool volume, avoid showing any warning (2.02.124).
2015-07-15 13:10:21 +02:00
David Teigland
96a883a454 metadata: change function name to _allow_extra_system_id
The previous name was misleading since this is not the
primary system_id check, only the "extra" check.
2015-07-14 14:43:16 -05:00
David Teigland
681f779a3c lockd: fix error message after a failing to get lock
There are two different failure conditions detected in
access_vg_lock_type() that should have different error
messages.  This adds another failure flag so the two
cases can be distinguished to avoid printing a misleading
error message.
2015-07-14 11:36:04 -05:00
David Teigland
0823511262 lockd: disable part of lock_args validation
There are at least a couple instances where
the lock_args check does not work correctly,
(listed in the comment), so disable the
NULL check for lock_args until those are
resolved.
2015-07-10 15:53:21 -05:00
David Teigland
3d2c4dc034 metadata: fix duplicated LV flag
LOCKD_SANLOCK_LV was using the WRITEMOSTLY flag instead of a new one.
2015-07-09 17:02:30 -05:00
David Teigland
841c3478fd metadata: vg_validate lock_args 2015-07-09 13:25:00 -05:00
Peter Rajnoha
3b6840e099 config: replace find_config_tree_node with find_config_tree_array where appropriate 2015-07-08 13:03:08 +02:00
Jonathan Brassow
4daea88516 clean-up: typos s/bellow/below/ 2015-07-06 10:15:11 -05:00
Alasdair G Kergon
810ab095e6 macros: Wrap PRI with FMT.
Create a set of wrappers with embedded % such as
  #define FMTu64 "%" PRIu64
2015-07-06 15:09:17 +01:00
Zdenek Kabelac
a900d150e4 thin: move pool messaging from resume to suspend
Existing messaging intarface for thin-pool has a few 'weak' points:

* Message were posted with each 'resume' operation, thus not allowing
activation of thin-pool with the existing state.

* Acceleration skipped suspend step has not worked in cluster,
since clvmd resumes only nodes which are suspended (have proper lock
state).

* Resume may fail and code is not really designed to 'fail' in this
phase (generic rule here is resume DOES NOT fail unless something serious
is wrong and lvm2 tool usually doesn't handle recovery path in this case.)

* Full thin-pool suspend happened, when taken a thin-volume snapshot.

With this patch the new method relocates message passing into suspend
state.

This has a few drawbacks with current API, but overal it performs
better and gives are more posibilities to deal with errors.

Patch introduces a new logic for 'origin-only' suspend of thin-pool and
this also relates to thin-volume when taking snapshot.

When suspend_origin_only operation is invoked on a pool with
queued messages then only those messages are posted to thin-pool and
actual suspend of thin pool and data and metadata volume is skipped.

This makes taking a snapshot of thin-volume lighter operation and
avoids blocking of other unrelated active thin volumes.

Also fail now happens in 'suspend' state where the 'Fail' is more expected
and it is better handled through error paths.

Activation of thin-pool is now not sending any message and leaves upto a tool
to decided later how to finish unfinished double-commit transaction.

Problem which needs some API improvements relates to the lvm2 tree
construction. For the suspend tree we do not add target table line
into the tree, but only a device is inserted into a tree.
Current mechanism to attach messages for thin-pool requires the libdm
to know about thin-pool target, so lvm2 currently takes assumption, node
is really a thin-pool and fills in the table line for this node (which
should be ensured by the PRELOAD phase, but it's a misuse of internal API)
we would possibly need to be able to attach message to 'any' node.

Other thing to notice - current messaging interface in thin-pool
target requires to suspend thin volume origin first and then send
a create message, but this could not have any 'nice' solution on lvm2
side and IMHO we should introduce something like 'create_after_resume'
message.

Patch also changes the moment, where lvm2 transaction id is increased.
Now it happens only after successful finish of kernel transaction id
change. This change was needed to handle properly activation of pool,
which is in the middle of unfinished transaction, and also this corrects
usage of thin-pool by external apps like Docker.
2015-07-03 16:13:14 +02:00
Zdenek Kabelac
622064f00f thin: check for overprovisioning 2015-07-03 16:13:14 +02:00
David Teigland
fe70b03de2 Add lvmlockd 2015-07-02 15:42:26 -05:00
Alasdair G Kergon
4c629a5257 locking: Add missing error handling.
Add missing error logging and detection to unlock_vg and callers
of sync_local_dev_names etc.
2015-06-30 18:54:38 +01:00
Peter Rajnoha
621398ebb7 lv: time: increase buffer to 4k in lv_time_dup 2015-06-29 15:24:00 +02:00
Peter Rajnoha
125cd06698 conf: make time format configurable
Make it possible to define format for time that is displayed.
The way the format is defined is equal to the way that is used
for strftime function, although not all formatting options as
used in strftime are available for LVM2 - the set is restricted
(e.g. we do not allow newline to be printed). The lvm.conf
comments contain the whole list that LVM2 accepts for time format
together with brief description (copied from strftime man page).

For example:
(defaults used - the format is the same as used before this patch)
$ lvs -o+time vg/lvol0 vg/lvol1
  LV    VG   Attr       LSize Time
  lvol0 vg   -wi-a----- 4.00m 2015-06-25 16:18:34 +0200
  lvol1 vg   -wi-a----- 4.00m 2015-06-29 09:17:11 +0200

(using 'time_format = "@%s"' in lvm.conf - number of seconds
since the Epoch)
$ lvs -o+time vg/lvol0 vg/lvol1
  LV    VG   Attr       LSize Time
  lvol0 vg   -wi-a----- 4.00m @1435241914
  lvol1 vg   -wi-a----- 4.00m @1435562231
2015-06-29 14:30:35 +02:00
Zdenek Kabelac
e217873ed6 snapshot: add synchronization point
Synchronize with udev logic before reusing device as snapshot.

This patch tries to fix the problem with udev, where we manage
to 'active' LV for clearing, then we deactivate such device and
active again as member of 'origin&snapshot' tree all in 1 step.

There needs to be a sync point where udev has time to remove all links,
otherwise we race with scans and we may end-up with mysterious 'free'
links in the system pointing to wrong dm names.

This patch tries to fix failing topology cluster tests..
2015-06-24 15:18:49 +02:00
Zdenek Kabelac
a3e0d830bd thin: support unaligned size of external origin and thin pool
With thin-pool kernel target module 1.13 it's now support usage of
external origin with sizes which are not 'alligned' with chunk size
of thin-pool.

Enable lvm2 support for this and also fix reporting of data_percent
usage for case sizes are not alligned.
2015-06-18 18:50:36 +02:00
Zdenek Kabelac
6f2a617c31 thin: drop limitation for extension of reduced thin volume
Drop check which has prevented resize of reduce thin volume with
external origin. User is supposed to use 'zeroing' to get 'clean'
chunks.
2015-06-18 18:48:59 +02:00
David Teigland
e043e03cd8 lv_refresh: move the bulk of the function into lib
So that it can be used from other lib code.
2015-06-16 13:38:40 -05:00
David Teigland
d5adec1056 Add the 's' activation mode
Just as 'e' means activation with an exclusive lock,
add an 's' to mean activation with a shared lock.

This allows the existing but implicit behavior of '-ay'
of clvm LVs to be specified explicitly.  For local VGs,
asy simply means ay, just like aey means ay.

For local VGs, ay == aey == asy

For clvm VGs,  ay == asy, aey == aey, asy == asy
2015-06-16 10:18:16 -05:00
Petr Rockai
632dde0cbc metadata: When outdated PVs are wiped, notify lvmetad about the fact. 2015-06-10 16:27:12 +02:00
Petr Rockai
c78b6f18d4 metadata: Reject lvmetad metadata extensions when reading from disk. 2015-06-10 16:25:57 +02:00
Petr Rockai
756d027da5 metadata: Explain the pvs_outdated field in struct volume_group. 2015-06-10 16:17:45 +02:00
Petr Rockai
611c8b6d29 metadata: Add pvs_outdated to struct volume_group.
This is a list of PVs that should have their MDAs wiped because they carry
outdated metadata (that used to belong to the VG they are attached to).
2015-05-20 19:46:14 +02:00
Petr Rockai
5435346052 metadata: Factor _wipe_outdated_pvs() PVs out of _vg_read(). 2015-05-20 19:46:13 +02:00
Alasdair G Kergon
0300730cc9 pre-release 2015-05-15 23:19:29 +01:00
Zdenek Kabelac
9e102ecbd9 mirror: use proper 64bit constants
ed2a08bf25 missed to use 64bit
constants.
2015-05-15 22:53:12 +02:00
David Teigland
8e509b5dd5 toollib: avoid repeated lvmetad vg_lookup
In process_each_{vg,lv,pv} when no vgname args are given,
the first step is to get a list of all vgid/vgname on the
system.  This is exactly what lvmetad returns from a
vg_list request.  The current code is doing a vg_lookup
on each VG after the vg_list and populating lvmcache with
the info for each VG.  These preliminary vg_lookup's are
unnecessary, because they will be done again when the
processing functions call vg_read.  This patch eliminates
the initial round of vg_lookup's, which can roughly cut in
half the number of lvmetad requests and save a lot of extra work.
2015-05-08 11:44:55 -05:00
Zdenek Kabelac
29c709f591 debug: tracing error path 2015-05-08 15:15:10 +02:00
Zdenek Kabelac
ed2a08bf25 cleanup: use 64bit ulongs
Use 64bit arithmetics for all numbers (Coverity).
2015-05-08 15:15:10 +02:00
Zdenek Kabelac
3c46428fcd cleanup: drop unneeded int test
Testing int  region_size > INT32_MAX is always false
so drop the test (Coverity).
2015-05-08 15:15:10 +02:00
Zdenek Kabelac
2cea1c1bd9 pvcreate: fix test for wiping status
Commit ed420fb691 changed
paramet wiped to be a pointer, but missed to switch
to test pointer dereferenced value and instead always
checked 'pointer'.
2015-05-08 13:36:39 +02:00
Zdenek Kabelac
88421c883e raid: reread status when 0 is reported
When kernel target reports sync status as 0%  it might as well mean
it's 100% in sync, just the target is in some race inconsistent
state - so reread once again and take a more optimistic value ;)

Patch tries to work around:
https://bugzilla.redhat.com/show_bug.cgi?id=1210637
2015-05-04 13:09:05 +02:00
Alasdair G Kergon
cc26085b62 alloc: Respect cling_tag_list in contig alloc.
When performing initial allocation (so there is nothing yet to
cling to), use the list of tags in allocation/cling_tag_list to
partition the PVs.  We implement this by maintaining a list of
tags that have been "used up" as we proceed and ignoring further
devices that have a tag on the list.

https://bugzilla.redhat.com/983600
2015-04-11 01:55:24 +01:00
Alasdair G Kergon
2872e8c289 alloc: Add A_PARTITION_BY_TAGS to avoid sharing.
Add A_PARTITION_BY_TAGS set when allocated areas should not share tags
with each other and allow _match_pv_tags to accept an alternative list
of tags.  (Not used yet.)
2015-04-10 21:57:52 +01:00
Alasdair G Kergon
f1e3e99169 alloc: Log PV tags when reserving areas. 2015-03-26 21:13:26 +00:00
Alasdair G Kergon
e8fa3354f0 alloc: Pass alloc_handle through to _reserve_area. 2015-03-26 20:32:59 +00:00
Alasdair G Kergon
f9d74ba3d1 alloc: Only report cling tag errors once. 2015-03-26 19:43:51 +00:00
Alasdair G Kergon
4b1219ee87 metadata: Move alloc_handle init/destroy fns. 2015-03-26 18:44:24 +00:00
Peter Rajnoha
8759f7d755 metadata: vg: add removed_lvs field to collect LVs which have been removed
Do not keep dangling LVs if they're removed from the vg->lvs list and
move them to vg->removed_lvs instead (this is actually similar to already
existing vg->removed_pvs list, just it's for LVs now).

Once we have this vg->removed_lvs list indexed so it's possible to
do lookups for LVs quickly, we can remove the LV_REMOVED flag as
that one won't be needed anymore - instead of checking the flag,
we can directly check the vg->removed_lvs list if the LV is present
there or not and to say if the LV is removed or not then. For now,
we don't have this index, but it may be implemented in the future.
2015-03-24 08:43:08 +01:00
Peter Rajnoha
c9f021de0b metadata: process_each_lv_in_vg: get the list of LVs to process first, then do the processing
This avoids a problem in which we're using selection on LV list - we
need to do the selection on initial state and not on any intermediary
state as we process LVs one by one - some of the relations among LVs
can be gone during this processing.

For example, processing one LV can cause the other LVs to lose the
relation to this LV and hence they're not selectable anymore with
the original selection criteria as it would be if we did selection
on inital state. A perfect example is with thin snapshots:

$ lvs -o lv_name,origin,layout,role vg
  LV    Origin Layout      Role
  lvol1        thin,sparse public,origin,thinorigin,multithinorigin
  lvol2 lvol1  thin,sparse public,snapshot,thinsnapshot
  lvol3 lvol1  thin,sparse public,snapshot,thinsnapshot
  pool         thin,pool   private

$ lvremove -ff -S 'lv_name=lvol1 || origin=lvol1'
  Logical volume "lvol1" successfully removed

The lvremove command above was supposed to remove lvol1 as well as
all its snapshots which have origin=lvol1. It failed to do so, because
once we removed the origin lvol1, the lvol2 and lvol3 which were
snapshots before are not snapshots anymore - the relations change
as we're processing these LVs one by one.

If we do the selection first and then execute any concrete actions on
these LVs (which is what this patch does), the behaviour is correct
then - the selection is done on the *initial state*:

$ lvremove -ff -S 'lv_name=lvol1 || origin=lvol1'
  Logical volume "lvol1" successfully removed
  Logical volume "lvol2" successfully removed
  Logical volume "lvol3" successfully removed

Similarly for all the other situations in which relations among
LVs are being changed by processing the LVs one by one.

This patch also introduces LV_REMOVED internal LV status flag
to mark removed LVs so they're not processed further when we
iterate over collected list of LVs to be processed.

Previously, when we iterated directly over vg->lvs list to
process the LVs, we relied on the fact that once the LV is removed,
it is also removed from the vg->lvs list we're iterating over.
But that was incorrect as we shouldn't remove LVs from the list
during one iteration while we're iterating over that exact list
(dm_list_iterate_items safe can handle only one removal at
one iteration anyway, so it can't be used here).
2015-03-24 08:43:07 +01:00
Alasdair G Kergon
6407d184d1 cache: Store metadata size and checksum.
Refactor the recent metadata-reading optimisation patches.

Remove the recently-added cache fields from struct labeller
and struct format_instance.

Instead, introduce struct lvmcache_vgsummary to wrap the VG information
that lvmcache holds and add the metadata size and checksum to it.

Allow this VG summary information to be looked up by metadata size +
checksum.  Adjust the debug log messages to make it clear when this
shortcut has been successful.

(This changes the optimisation slightly, and might be extendable
further.)

Add struct cached_vg_fmtdata to format-specific vg_read calls to
preserve state alongside the VG across separate calls and indicate
if the details supplied match, avoiding the need to read and
process the VG metadata again.
2015-03-18 23:43:02 +00:00
Alasdair G Kergon
95fbbf4f40 metadata: Fix recent vg_validate message text. 2015-03-17 17:48:56 +00:00
Alasdair G Kergon
a854546234 metadata: Detect internal use of LVM_WRITE_LOCKED.
Generate internal error if LVM_WRITE_LOCKED ever appears
in struct volume_group: it's only used in external
metadata.
2015-03-09 18:56:24 +00:00
Alasdair G Kergon
faccdeda83 comments: Use full flag names. 2015-03-09 18:53:22 +00:00
Zdenek Kabelac
04101bc430 lib: drop unneeded vg_read call
Since we take a lock inside vg_lock_newname() and we do a full
detection of presence of  vgname inside all scanned labels,
there is no point to do this for second time to be sure
there is no such vg.

The only side-effect of such call would be a full validation of
some already exising VG metadata - but that's not the task for
vgcreate when create a new VG.

This call noticable reduces number of scans during 'vgcreate'.
2015-03-06 14:05:06 +01:00
Zdenek Kabelac
7e7411966a lib: avoid reparsing same metadata
When reading VG mda from multiple PVs - do all the validation only
when mda is seen for the first time and  when mda checksum and length
is same just return already existing VG pointer.

(i.e. using 300PVs for a VG would lead to create and destroy 300 config trees....)
2015-03-06 13:53:12 +01:00
David Teigland
1e65fdd9ba system_id: make new VGs read-only for old lvm versions
Previous versions of lvm will not obey the restrictions
imposed by the new system_id, and would allow such a VG
to be written.  So, a VG with a new system_id is further
changed to force previous lvm versions to treat it as
read-only.  This is done by removing the WRITE flag from
the metadata status line of these VGs, and putting a new
WRITE_LOCKED flag in the flags line of the metadata.

Versions of lvm that recognize WRITE_LOCKED, also obey the
new system_id.  For these lvm versions, WRITE_LOCKED is
identical to WRITE, and the rules associated with matching
system_id's are imposed.

A new VG lock_type field is also added that causes the same
WRITE/WRITE_LOCKED transformation when set.  A previous
version of lvm will also see a VG with lock_type as read-only.

Versions of lvm that recognize WRITE_LOCKED, must also obey
the lock_type setting.  Until the lock_type feature is added,
lvm will fail to read any VG with lock_type set and report an
error about an unsupported lock_type.  Once the lock_type
feature is added, lvm will allow VGs with lock_type to be
used according to the rules imposed by the lock_type.

When both system_id and lock_type settings are removed, a VG
is written with the old WRITE status flag, and without the
new WRITE_LOCKED flag.  This allows old versions of lvm to
use the VG as before.
2015-03-05 09:50:43 -06:00
David Teigland
c6a57dc4f3 Revert "systemid: Add ACCESS_NEEDS_SYSTEM_ID VG flag."
This reverts commit bfbb5d269a.

This will be done differently.
2015-03-05 09:50:43 -06:00
Peter Rajnoha
190d591fbe report: fix seg_monitor field to display monitoring status for thick snapshots and mirrors
The seg_monitor did not display monitored status for thick snapshots
and mirrors (with mirror log *not* mirrored). The seg monitor did work
correctly even before for other segtypes - thins and raids.

Before (mirrors and snapshots, only mirrors with mirrored log properly displayed monitoring status):

[0] f21/~ # lvs -a -o lv_name,lv_layout,lv_role,seg_monitor vg
  LV                                     Layout     Role                             Monitor
  mirror                                 mirror     public
  [mirror_mimage_0]                      linear     private,mirror,image
  [mirror_mimage_1]                      linear     private,mirror,image
  [mirror_mlog]                          linear     private,mirror,log

  mirror_with_mirror_log                 mirror     public                           monitored
  [mirror_with_mirror_log_mimage_0]      linear     private,mirror,image
  [mirror_with_mirror_log_mimage_1]      linear     private,mirror,image
  [mirror_with_mirror_log_mlog]          mirror     private,mirror,log               monitored
  [mirror_with_mirror_log_mlog_mimage_0] linear     private,mirror,image
  [mirror_with_mirror_log_mlog_mimage_1] linear     private,mirror,image

  thick_origin                           linear     public,origin,thickorigin
  thick_snapshot                         linear     public,snapshot,thicksnapshot

With this patch applied (monitoring status displayed for all mirrors and snapshots):

[0] f21/~ # lvs -a -o lv_name,lv_layout,lv_role,seg_monitor vg
  LV                                     Layout     Role                             Monitor
  mirror                                 mirror     public                           monitored
  [mirror_mimage_0]                      linear     private,mirror,image
  [mirror_mimage_1]                      linear     private,mirror,image
  [mirror_mlog]                          linear     private,mirror,log

  mirror_with_mirror_log                 mirror     public                           monitored
  [mirror_with_mirror_log_mimage_0]      linear     private,mirror,image
  [mirror_with_mirror_log_mimage_1]      linear     private,mirror,image
  [mirror_with_mirror_log_mlog]          mirror     private,mirror,log               monitored
  [mirror_with_mirror_log_mlog_mimage_0] linear     private,mirror,image
  [mirror_with_mirror_log_mlog_mimage_1] linear     private,mirror,image

  thick_origin                           linear     public,origin,thickorigin
  thick_snapshot                         linear     public,snapshot,thicksnapshot    monitored
2015-03-05 14:05:34 +01:00
Alasdair G Kergon
bfbb5d269a systemid: Add ACCESS_NEEDS_SYSTEM_ID VG flag.
Set ACCESS_NEEDS_SYSTEM_ID VG status flag whenever there is
a non-lvm1 system_id set.  Prevents concurrent access from
older LVM2 versions.
Not set on VGs that bear a system_id only due to conversion
from lvm1 metadata.
2015-03-04 01:16:32 +00:00
Alasdair G Kergon
3562b5ab39 systemid: Init and merge lvm2 and lvm1 fields.
Use system_id field in preference to lvm1_system_id.
Initialise both for now.
2015-03-04 01:00:51 +00:00