1
0
mirror of git://sourceware.org/git/lvm2.git synced 2024-12-22 17:35:59 +03:00
Commit Graph

3368 Commits

Author SHA1 Message Date
Peter Rajnoha
8740ecfa64 WHATS_NEW: previous commit 2014-08-15 13:31:21 +02:00
Alasdair G Kergon
bf78e55ef3 pvcreate: Fix cache state with filters/sig wiping.
_pvcreate_check() has two missing requirements:
  After refreshing filters there must be a rescan.
    (Otherwise the persistent filter may remain empty.)
  After wiping a signature, the filters must be refreshed.
    (A device that was previously excluded by the filter due to
     its signature might now need to be included.)

If several devices are added at once, the repeated scanning isn't
strictly needed, but we can address that later as part of the command
processing restructuring (by grouping the devices).

Replace the new pvcreate code added by commit
54685c20fc "filters: fix regression caused
by commit e80884cd080cad7e10be4588e3493b9000649426"
with this change to _pvcreate_check().

The filter refresh problem dates back to commit
acb4b5e4de "Fix pvcreate device check."
2014-08-14 01:30:01 +01:00
Peter Rajnoha
9738a02d3d filter-mpath: fix primary device lookup failure for partition when processing mpath filter
If using persistent filter and we're refreshing filters (just like we
do for pvcreate now after commit 54685c20fc),
we can't rely on getting the primary device of the partition from the cache
as such device could be already filtered by persistent filter and we get
a device cache lookup failure for such device.

For example:

$ lvm dumpconfig --type diff
devices {
	obtain_device_list_from_udev=0
}

$lsblk /dev/sda
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda      8:0    0  128M  0 disk
`-sda1   8:1    0  127M  0 part

$cat /etc/lvm/cache/.cache | grep sda
		"/dev/sda1",

$pvcreate /dev/sda1
  dev_is_mpath: failed to get device for 8:1
  Physical volume "/dev/sda1" successfully created

The problematic part of the code called dev_cache_get_by_devt
to get the device for the device number supplied. Then the code
used dev_name(dev) to get the name which is then used in check
whether there's any mpath on top of this dev...

This patch uses sysfs to get the base name for the partition
instead, hence avoiding the device cache which is a correct
approach here.
2014-08-08 10:49:19 +02:00
Peter Rajnoha
c52c9a1e31 activation: if LV inactive and non-clustered, do not issue "Cannot deactivate" on -aln
The message "Cannot deactivate remotely exclusive device locally." makes
sense only for clustered LV. If the LV is non-clustered, then it's
always exclusive by definition and if it's already deactivated, this
message pops up inappropriately as those two conditions are met.

So issue the message only if the conditions are met AND we have clustered VG.
2014-08-07 16:44:09 +02:00
Peter Rajnoha
ea662ca060 pvmove: remove spurious "Skipping mirror LV" message on pvmove of clustered mirror
With cmirrord, we can do pvmove of clustered mirror. The code checking
suitability of LVs on the PV being moved issued a message if a mirror
LV was found and the VG was clustered. However, the actual pvmove did
work correctly.

The top-level mirror LV is actually skipped in the code since it's
always layered on top of internal LVs making up the mirror LV and for pvmove
we consider these internal devices only as they're actually layered on
top of concrete PVs then. But we don't need to issue any message here
about skipping the top-level mirror LV - it's misleading here.
2014-08-07 15:23:58 +02:00
Alasdair G Kergon
26885ea119 post-release 2014-08-05 02:12:20 +01:00
Alasdair G Kergon
9d4e1e51a9 pre-release 2014-08-05 02:07:35 +01:00
Petr Rockai
46262f163b WHATS_NEW: lvscan --cache SEGV fix 2014-08-04 17:05:08 +02:00
Peter Rajnoha
54685c20fc filters: fix regression caused by commit e80884cd08
Commit e80884cd08 tried to dump filters
for them to be reevaluated when creating a PV to avoid overwriting
any existing signature that may have been created after last
scan/filtering.

However, we need to call refresh_filters instead of
persistent_filter->dump since dump requires proper rescannig to fill
up the persistent filter again. However, this is true only for pvcreate
but not for vgcreate with PV creation where the scanning happens before
this PV creation and hence the next rescan (if not full scan), does not
fill the persistent filter.

Also, move refresh_filters so that it's called sooner and only for
pvcreate, vgcreate already calls lvmcache_label_scan(cmd, 2) which
then calls refresh_filters itself, so no need to reevaluate this again.

This caused the persistent filter (/etc/lvm/cache/.cache file) to be
wrong and contain only the PV just being processed with
vgcreate <vg_name> <pv_name_to_create>.

This regression caused other block devices to be filtered out in case
the vgcreate with PV creation was used and then the persistent filter
is used by any other LVM command afterwards.
2014-08-01 11:39:53 +02:00
Alasdair G Kergon
c7b9f0ab42 lvresize: Allow approximation with +%FREE.
Make lvresize -l+%FREE support approximate allocation.

Move existing "Reducing/Extending' message to verbose level
and change it to say 'up to' if approximate allocation is being used.

Replace it with a new message that gives the actual old and new size or
says 'unchanged'.
2014-08-01 00:35:43 +01:00
Peter Rajnoha
ef85997980 metadata: remove spurious "Physical volume <dev_name> not found"
This is addendum to commit 2e82a070f3
which fixed these spurious messages that appeared after commit
651d5093ed ("avoid pv_read in
find_pv_by_name").

There was one more "not found" message issued in case the device
could not be found in device cache (commit 2e82a07 fixed this only
for PV lookup itself). But if we "allow_unformatted" for
find_pv_by_name, we should not issue this message even in case
the device can't be found in dev cache as we just need to know
whether there's a PV or not for the code to decide on next steps
and we don't want to issue any messages if either device itself
is not found or PV is not found.

For example, when we were creating a new PV (and so allow_unformatted = 1)
and the device had a signature on it which caused it to be filtered
by device filter (e.g. MD signature if md filtering is enabled),
or it was part of some other subsystem (e.g. multipath), this message
was issued on find_pv_by_name call which was misleading.

Also, remove misleading "stack" call in case find_pv_by_name
returns NULL in pvcreate_check - any error state is reported
later by pvcreate_check code so no need to "stack" here.

There's one more and proper check to issue "not found" message if
the device can't be found in device cache within pvcreate_check fn
so this situation is still covered properly later in the code.

Before this patch (/dev/sda contains MD signature and is therefore filtered):

$ pvcreate /dev/sda
  Physical volume /dev/sda not found
WARNING: linux_raid_member signature detected on /dev/sda at offset 4096. Wipe it? [y/n]:

With this patch applied:

$ pvcreate /dev/sda
WARNING: linux_raid_member signature detected on /dev/sda at offset 4096. Wipe it? [y/n]:

Non-existent devices are still caught properly:

$ pvcreate /dev/sdx
  Device /dev/sdx not found (or ignored by filtering).
2014-07-31 10:03:30 +02:00
Alasdair G Kergon
7cff640d9a activation: Fix upgrades using uuid suffixes.
2.02.106 added suffixes to some LV uuids in the kernel.

If any of these LVs is activated with 2.02.105 or earlier,
and then a later version is used, the LVs appear invisible and
activation commands fail.

The code now has to check the kernel for both old and new uuids.
2014-07-30 21:55:11 +01:00
Alasdair G Kergon
321bed7137 post-release 2014-07-23 16:23:52 +01:00
Alasdair G Kergon
25fa725b05 pre-release 2014-07-23 16:05:22 +01:00
Petr Rockai
66686a5bc5 WHATS_NEW: lvscan --cache, dmeventd RAID + lvmetad 2014-07-22 22:48:41 +02:00
Zdenek Kabelac
894eda4707 thin and cache: unify pool common code
Fix get_pool_params to only read params.
Add poolmetadataspare option to get_pool_params.
Move all profile code into update_pool_params.
Move recalculate code into pool_manip.c
2014-07-22 22:41:38 +02:00
Alasdair G Kergon
513fd029a6 config: Adjust description of activation_mode. 2014-07-21 15:50:47 +01:00
Zdenek Kabelac
8f9f180139 lvconvert: improve merge validation
Easier check for conflicting options with --merge.
2014-07-17 16:16:00 +02:00
Zdenek Kabelac
7abad9ef88 lvconvert: improve splitsnapshot test
Easier check for conflicting options with --splitsnapshot.
2014-07-17 16:15:30 +02:00
Peter Rajnoha
17b92001ea WHATS_NEW: recent commits
Commits:
d169ff1e03
bccc2bef33
f76879ba44
2014-07-11 14:09:05 +02:00
Zdenek Kabelac
970989655f lvconvert: update for thin a cache
Major update of lvconvert code to handle cache and thin.
related targets.

Code tries to unify handling of cache and thin pools.
Better supports lvm2 syntax:

lvconvert --type cache --cachepool vg/pool vg/cache
lvconvert --type thin --thinpool vg/pool vg/extorg
lvconvert --type cache-pool vg/pool
lvconvert --type thin-pool vg/pool

as well as:

lvconvert --cache --cachepool vg/pool vg/cache
lvconvert --thin --thinpool vg/pool vg/extorg
lvconvert --cachepool vg/pool
lvconvert --thinpool vg/pool

While catching much more command line errors.
(Yet couple paths still needs more tests)

Detects as much cmdline errors prior opening VG.

Uses single lvconvert_name_params to convert LV names.

Detects as much incompatibilies in VG prior prompting.

Uses single prompt to confirm whole conversion.

TODO: still the code needs fixes...
2014-07-11 13:32:22 +02:00
Zdenek Kabelac
4db5d78cef display: show C only for cache and cachepool
Keep target type (attr6) as the cache data and metadata volume has.
(i.e. when will show 'raid' type if metadata is raid)
2014-07-11 12:50:44 +02:00
Zdenek Kabelac
56c5ad7b19 lvconvert: snapshot prompts to confirm conversion
Since the type passed LV is changed and content of data detroyed,
query user with prompt to confirm this operation.
Also add a proper wiping of header.

Using '--yes' will skip this prompt:

lvconvert -s --yes vg/lv vg/lvcow
2014-07-11 12:49:55 +02:00
Zdenek Kabelac
64828d877e lvconvert: fix return codes
Error codes in some function are directly used
as command result - thus return 0 is not error code,
but success - switch to proper ECMD_FAILED.
2014-07-11 12:49:02 +02:00
Zdenek Kabelac
f9d80c9d31 cache: add tool support
Introducing cache tool support.
2014-07-11 12:47:35 +02:00
Jonathan Brassow
be75076dfc activation: Add "degraded" activation mode
Currently, we have two modes of activation, an unnamed nominal mode
(which I will refer to as "complete") and "partial" mode.  The
"complete" mode requires that a volume group be 'complete' - that
is, no missing PVs.  If there are any missing PVs, no affected LVs
are allowed to activate - even RAID LVs which might be able to
tolerate a failure.  The "partial" mode allows anything to be
activated (or at least attempted).  If a non-redundant LV is
missing a portion of its addressable space due to a device failure,
it will be replaced with an error target.  RAID LVs will either
activate or fail to activate depending on how badly their
redundancy is compromised.

This patch adds a third option, "degraded" mode.  This mode can
be selected via the '--activationmode {complete|degraded|partial}'
option to lvchange/vgchange.  It can also be set in lvm.conf.
The "degraded" activation mode allows RAID LVs with a sufficient
level of redundancy to activate (e.g. a RAID5 LV with one device
failure, a RAID6 with two device failures, or RAID1 with n-1
failures).  RAID LVs with too many device failures are not allowed
to activate - nor are any non-redundant LVs that may have been
affected.  This patch also makes the "degraded" mode the default
activation mode.

The degraded activation mode does not yet work in a cluster.  A
new cluster lock flag (LCK_DEGRADED_MODE) will need to be created
to make that work.  Currently, there is limited space for this
extra flag and I am looking for possible solutions.  One possible
solution is to usurp LCK_CONVERT, as it is not used.  When the
locking_type is 3, the degraded mode flag simply gets dropped and
the old ("complete") behavior is exhibited.
2014-07-09 22:56:11 -05:00
Peter Rajnoha
bccc2bef33 report: add lv_active_{locally,remotely,exclusively} LV reporting fields
lv_active_{locally,remotely,exclusively} display the original
"lv_active" field in a more separate way so that we can create
selection criteria in a binary-based form (yes/no).
2014-07-09 15:57:05 +02:00
Peter Rajnoha
4b65d7ec72 WHATS_NEW: commits a473435..7021c8f1 2014-07-07 16:52:43 +02:00
Peter Rajnoha
6dc7b783c8 metadata: fix regression causing PVs not in VGs to be marked as allocatable
If the PV is not yet in a VG, it's not allocatable.
A regression introduced by commit 0283c439ec
(_pv_create) and later commit a7ca101517
(pv_read).
2014-07-07 14:07:21 +02:00
Alasdair G Kergon
137ed3081a report: Add lv_parent field.
Only defined for thin/cache/raid/mirror at this stage as it
relies on get_only_segment_using_this_lv().
2014-07-03 23:49:34 +01:00
Alasdair G Kergon
1e1c2769a7 vgsplit: Fix VG component of lvid.
Fix VG component of lvid in vgsplit and vgmerge
Update vg_validate() to detect the error.
Call lv_is_active() before moving LV into new VG, not after.
2014-07-03 19:06:04 +01:00
Alasdair G Kergon
64ce3a8066 report: Add lv_dm_path and lv_full_name fields. 2014-07-02 17:24:05 +01:00
Alasdair G Kergon
5bfa2ec21d report: Exclude hidden devices from lv_path field. 2014-07-02 14:57:00 +01:00
Zdenek Kabelac
7bdf4719e8 pool: delay conversion prompt
First validate as much params as possible before prompting user
about conversion to data and metadata LV.
2014-07-02 10:45:39 +02:00
Zdenek Kabelac
3af761ba16 thin: fix chunk_size conversion prompt skip
Use --force only enables prompting for dangerous operation.
User has to add --yes to skip this prompt.
2014-07-02 10:43:56 +02:00
Zdenek Kabelac
93a80018ae lvremove: remove thin volumes on damaged pools
Support remove of thin volumes With --force --force
when thin pools is damaged.

This way it's possible to remove thin pool with
unrepairable metadata without requiring to
manually edit lvm2 metadata.

lvremove -ff vg/pool

removes all thin volumes and pool even when
thin pool cannot be activated (to accept
removal of thin volumes in kernel metadata)
2014-07-02 10:37:52 +02:00
Zdenek Kabelac
0b872ce870 raid: don't skip prompt with force
Yes is meant to be used to skip all new prompts.
(--force just adds more prompts).
2014-07-02 10:36:32 +02:00
Alasdair G Kergon
c77197c688 make: Fix pofile and .d file generation.
Use builddir not srcdir with make pofile.

Append 'incfile:' lines to %.d files to handle newly-missing dependencies
without 'make clean' after a file is moved or deleted.
2014-07-02 00:48:50 +01:00
Zdenek Kabelac
667f93b7d9 uuid: add more private uuid sufixes
Use suffixes for easier detection of private volumes.

This commit makes older volume UUIDs incompatible and
it most probably needs machine reboot after upgrade.
2014-06-30 12:17:07 +02:00
Zdenek Kabelac
6da14a82c6 thin: do not create reserved LVs
When creating pool's metadata - create initial LV for clearing with some
generic name and after the volume is create & cleared - rename it to
reserved name '_tmeta/_cmeta'.

We should not expose  'reserved' names for public LVs.
2014-06-30 12:16:05 +02:00
Zdenek Kabelac
eadcea2dae thin: repaired LV uses _meta%d
Don't leave 'regular' LV with reserved suffix for a user.
After succefull repair use 'normal' (non-reserved) LV name
for backup of original metadata.
2014-06-30 12:15:13 +02:00
Jonathan Brassow
ed3c2537b8 raid: Allow repair to reuse PVs from same image that suffered a PV failure
When repairing RAID LVs that have multiple PVs per image, allow
replacement images to be reallocated from the PVs that have not
failed in the image if there is sufficient space.

This allows for scenarios where a 2-way RAID1 is spread across 4 PVs,
where each image lives on two PVs but doesn't use the entire space
on any of them.  If one PV fails and there is sufficient space on the
remaining PV in the image, the image can be reallocated on just the
remaining PV.
2014-06-25 22:26:06 -05:00
Jonathan Brassow
b35fb0b15a raid/misc: Allow creation of parallel areas by LV vs segment
I've changed build_parallel_areas_from_lv to take a new parameter
that allows the caller to build parallel areas by LV vs by segment.
Previously, the function created a list of parallel areas for each
segment in the given LV.  When it came time for allocation, the
parallel areas were honored on a segment basis.  This was problematic
for RAID because any new RAID image must avoid being placed on any
PVs used by other images in the RAID.  For example, if we have a
linear LV that has half its space on one PV and half on another, we
do not want an up-convert to use either of those PVs.  It should
especially not wind up with the following, where the first portion
of one LV is paired up with the second portion of the other:
------PV1-------  ------PV2-------
[ 2of2 image_1 ]  [ 1of2 image_1 ]
[ 1of2 image_0 ]  [ 2of2 image_0 ]
----------------  ----------------
Previously, it was possible for this to happen.  The change makes
it so that the returned parallel areas list contains one "super"
segment (seg_pvs) with a list of all the PVs from every actual
segment in the given LV and covering the entire logical extent range.

This change allows RAID conversions to function properly when there
are existing images that contain multiple segments that span more
than one PV.
2014-06-25 21:20:41 -05:00
Peter Rajnoha
e80884cd08 filters: always reevaluate filter before creating a PV
...to avoid using cached value (persistent filter) and therefore
not noticing any change made after last scan/filtering - the state
of the device may have changed, for example new signatures added.

$ lvm dumpconfig --type diff
allocation {
	use_blkid_wiping=0
}
devices {
	obtain_device_list_from_udev=0
}

$ cat /etc/lvm/cache/.cache | grep sda

$ vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "fedora" using metadata type lvm2

$ cat /etc/lvm/cache/.cache | grep sda
		"/dev/sda",

$ parted /dev/sda mklabel gpt
Information: You may need to update /etc/fstab.

$ parted /dev/sda print
Model: QEMU QEMU HARDDISK (scsi)
Disk /dev/sda: 134MB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start  End  Size  File system  Name  Flags

$ cat /etc/lvm/cache/.cache | grep sda
		"/dev/sda",

====

Before this patch:
$ pvcreate /dev/sda
  Physical volume "/dev/sda" successfully created

With this patch applied:
$ pvcreate /dev/sda
  Physical volume /dev/sda not found
  Device /dev/sda not found (or ignored by filtering).
2014-06-25 16:24:28 +02:00
Alasdair G Kergon
29ca0573ba post-release 2014-06-23 15:23:09 +01:00
Alasdair G Kergon
8d27f8e003 pre-release 2014-06-23 14:03:32 +01:00
Alasdair G Kergon
78533f72d3 locking: Introduce LCK_ACTIVATION.
Take a local file lock to prevent concurrent activation/deactivation of LVs.
Thin/cache types and an extension for cluster support are excluded for
now.

'lvchange -ay $lv' and 'lvchange -an $lv' should no longer cause trouble
if issued concurrently: the new lock should make sure they
activate/deactivate $lv one-after-the-other, instead of overlapping.

(If anyone wants to experiment with the cluster patch, please get in touch.)
2014-06-20 13:24:02 +01:00
Alasdair G Kergon
b33091cb11 pvmove: tidy 2014-06-19 13:40:47 +01:00
Zdenek Kabelac
aa3e413093 lvchange: better --refresh of raid and mirrors
Use lv_check_not_in_use() to detect openned device.
Plain info.open_count is not good enough for udev random
device openning.
2014-06-19 12:01:34 +02:00
Peter Rajnoha
63f5be0170 WHATS_NEW: commits 7dbbc05a69c4cb9756464720cad29e3c1ed971c3..b16f5633ab199dedfd25f08562f686a6fb4aba9d
Report selection support...
2014-06-18 10:48:53 +02:00
Jonathan Brassow
5ebff6cc9f pvmove: Enable all-or-nothing (atomic) pvmoves
pvmove can be used to move single LVs by name or multiple LVs that
lie within the specified PV range (e.g. /dev/sdb1:0-1000).  When
moving more than one LV, the portions of those LVs that are in the
range to be moved are added to a new temporary pvmove LV.  The LVs
then point to the range in the pvmove LV, rather than the PV
range.

Example 1:
	We have two LVs in this example.  After they were
	created, the first LV was grown, yeilding two segments
	in LV1.  So, there are two LVs with a total of three
	segments.

	Before pvmove:
	      ---------  ---------   ---------
	      | LV1s0 |  | LV2s0 |   | LV1s1 |
	      ---------  ---------   ---------
	         |           |           |
	   -------------------------------------
	PV | 000 - 255 | 256 - 511 | 512 - 767 |
	   -------------------------------------

	After pvmove inserts the temporary pvmove LV:
	          ---------   ---------   ---------
	          | LV1s0 |   | LV2s0 |   | LV1s1 |
	          ---------   ---------   ---------
	              |           |           |
	        -------------------------------------
	pvmove0 |   seg 0   |   seg 1   |   seg 2   |
	        -------------------------------------
	              |           |           |
	        -------------------------------------
	PV      | 000 - 255 | 256 - 511 | 512 - 767 |
	        -------------------------------------

	Each of the affected LV segments now point to a
	range of blocks in the pvmove LV, which purposefully
	corresponds to the segments moved from the original
	LVs into the temporary pvmove LV.

The current implementation goes on from here to mirror the temporary
pvmove LV by segment.  Further, as the pvmove LV is activated, only
one of its segments is actually mirrored (i.e. "moving") at a time.
The rest are either complete or not addressed yet.  If the pvmove
is aborted, those segments that are completed will remain on the
destination and those that are not yet addressed or in the process
of moving will stay on the source PV.  Thus, it is possible to have
a partially completed move - some LVs (or certain segments of LVs)
on the source PV and some on the destination.

Example 2:
	What 'example 1' might look if it was half-way
	through the move.
	             ---------   ---------   ---------
	             | LV1s0 |   | LV2s0 |   | LV1s1 |
	             ---------   ---------   ---------
	                 |           |           |
	           -------------------------------------
	pvmove0    |   seg 0   |   seg 1   |   seg 2   |
	           -------------------------------------
	                 |           |           |
	                 |     -------------------------
	source PV        |     | 256 - 511 | 512 - 767 |
	                 |     -------------------------
	                 |           ||
	           -------------------------
	dest PV    | 000 - 255 | 256 - 511 |
	           -------------------------

This update allows the user to specify that they would like the
pvmove mirror created "by LV" rather than "by segment".  That is,
the pvmove LV becomes an image in an encapsulating mirror along
with the allocated copy image.

Example 3:
	A pvmove that is performed "by LV" rather than "by segment".

	                   ---------   ---------
	                   | LV1s0 |   | LV2s0 |
	                   ---------   ---------
	                       |           |
	                 -------------------------
	        pvmove0  |  * LV-level mirror *  |
	                 -------------------------
                             /                \
	   pvmove_mimage0   /          pvmove_mimage1
	   -------------------------   -------------------------
	   |   seg 0   |   seg 1   |   |   seg 0   |   seg 1   |
	   -------------------------   -------------------------
	        |            |               |           |
	   -------------------------   -------------------------
	   | 000 - 255 | 256 - 511 |   | 000 - 255 | 256 - 511 |
	   -------------------------   -------------------------
	           source PV                    dest PV

The thing that differentiates a pvmove done in this way and a simple
"up-convert" from linear to mirror is the preservation of the
distinct segments.  A normal up-convert would simply allocate the
necessary space with no regard for segment boundaries.  The pvmove
operation must preserve the segments because they are the critical
boundary between the segments of the LVs being moved.  So, when the
pvmove copy image is allocated, all corresponding segments must be
allocated.  The code that merges ajoining segments that are part of
the same LV when the metadata is written must also be avoided in
this case.  This method of mirroring is unique enough to warrant its
own definitional macro, MIRROR_BY_SEGMENTED_LV.  This joins the two
existing macros: MIRROR_BY_SEG (for original pvmove) and MIRROR_BY_LV
(for user created mirrors).

The advantages of performing pvmove in this way is that all of the
LVs affected can be moved together.  It is an all-or-nothing approach
that leaves all LV segments on the source PV if the move is aborted.
Additionally, a mirror log can be used (in the future) to provide tracking
of progress; allowing the copy to continue where it left off in the event
there is a deactivation.
2014-06-17 22:59:36 -05:00
Zdenek Kabelac
494db11004 snapshot: %ORIGIN is relative to data size
Let's use the size of origin as the real base for percenta calculation,
and 'silenly' add needed metadata space for snapshot.

So now command   'lvcreate -s -l100%ORIGIN vg/lv' should always create a
snapshot to handle full device overwrite.
2014-06-17 13:41:01 +02:00
Jonathan Brassow
962a40b981 cache: Properly rename origin LV tree when adding "_corig"
When creating a cache LV with a RAID origin, we need to ensure that
the sub-LVs of that origin properly change their names to include
the "_corig" extention of the top-level LV.  We do this by first
performing a 'lv_rename_update' before making the call to
'insert_layer_for_lv'.
2014-06-16 18:15:39 -05:00
Peter Rajnoha
fea8abe56a systemd: use RemoveOnStop for dm-event.socket and lvm2-lvmetad.socket
Systemd version 214 introduced new "RemoveOnStop" option for socket
units to remove the socket/FIFO when the particular unit is stopped.

Also https://bugzilla.redhat.com/show_bug.cgi?id=802748.
2014-06-13 15:45:25 +02:00
Peter Rajnoha
8e0687ca69 profile: add thin-generic.profile
The thin-generic.profile contains settings for thin/thin pool volumes
suitable for generic environment/use containing default settings.
This allows users to change the global lvm.conf settings at will
and still keep the original settings for volumes that have this
thin profile assigned already.
2014-06-13 09:56:29 +02:00
Zdenek Kabelac
922f884abe report: avoid passing NULL label
Internal reporting function cannot handle NULL reporting value,
so ensure there is at least dummy label.

So move dummy_lable from tools/reporter.c and use it for all
report_object() calls in lib/report/report.c.
(Fixes RHBZ 1108394)

Simlify lvm_report_object initialization.
2014-06-12 11:55:58 +02:00
Zdenek Kabelac
2f260c9909 activation: retry cleanup deactivation
Enable 'retry' deactivation also in 'cleanup' phase.
It shouldn't be mostly needed - however udev now produces
more and more completelny non-synchronizable device opens,
so even for orphan devices we can't easily predict where
udevd opens devices.

So it's more preferable here to log error about device being open
and retry clean, but let the command proceed.
2014-06-10 10:51:24 +02:00
Jonathan Brassow
1f2aedb190 WHATS_NEW: For commit 9399b743 (prompt for VG cluster attr change)
Minor change, but put a comment in WHATS_NEW anyway.
2014-06-05 22:30:50 -05:00
Zdenek Kabelac
3f8048f28c vgextend: allow --yes to skip prompt 2014-05-23 23:35:40 +02:00
Peter Rajnoha
1c4fe47308 lvm_init: don't use name mangling for LVM
LVM has restricter character set that is allowed for VG-LV names
and the dm names constructed do not contain any blacklisted characters
that would require name mangling.

Also, when any other device-mapper device is scanned that could
possibly contain such blacklisted characters, we reference the
device by its major:minor instead of dm name (e.g. _device_is_usable fn).
2014-05-22 10:00:19 +02:00
Peter Rajnoha
23f9c45a1b profiles: remove default.profile and add {command,metadata}_profile_template.profile
The "default.profile" name was misleading. It's actually a helper
*template* that can be used for copying and further editing to create
a new profile.

Also, we have separate command and metadata profiles now so the templates
are separated as well - we can't mix profile settings from one group with
another - such profile is rejected by lvm tools.
2014-05-21 12:36:52 +02:00
Dongmao Zhang
e9db11f387 systemd: use umask 022 for generated systemd units by lvm2-activation-generator 2014-05-21 10:12:02 +02:00
Zdenek Kabelac
c70c100cce lvconvert: check ret code of mirror_remove_missing
When mirror_remove_missing() fails, stop repairing mirror.
2014-05-20 21:49:42 +02:00
Zdenek Kabelac
bbf4b2c1c9 thin: lvconvert warn before conversion
Warn user before converting volume to different type.

  WARNING: Converting vg/lvol0 logical volume to pool's meta/data volume.
  THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)

Since the content of volume is lost we have to query user to confirm
such operation.  If user is 100% sure, he may use '--yes' to avoid prompts.
2014-05-20 21:48:47 +02:00
Peter Rajnoha
9c937e7d54 dumpconfig: add --type profilable-command/profilable-metadata, --metadataprofile/--commandprofile
The dumpconfig now understands --commandprofile/--profile/--metadataprofile

The --commandprofile and --profile functionality is almost the same
with only one difference and that is that the --profile is just used
for dumping the content, it's not applied for the command itself
(while the --commandprofile profile is applied like it is done for
any other LVM command).

We also allow --metadataprofile for dumpconfig - dumpconfig *does not*
touch VG/LV and metadata in any way so it's OK to use it here (just for
dumping the content, checking the profile validity etc.).

The validity of the profile can be checked with:
      dumpconfig --commandprofile/--profile/--metadataprofile --validate

...depending on the profile type.

Also, mention --config in the dumpconfig help string so users know
that  dumpconfig handles this too (it did even before, but it was not
documented in the help string).
2014-05-20 16:27:07 +02:00
Peter Rajnoha
9e3e4d6994 config: differentiate command and metadata profiles and consolidate profile handling code
- When defining configuration source, the code now uses separate
  CONFIG_PROFILE_COMMAND and CONFIG_PROFILE_METADATA markers
  (before, it was just CONFIG_PROFILE that did not make the
  difference between the two). This helps when checking the
  configuration if it contains correct set of options which
  are all in either command-profilable or metadata-profilable
  group without mixing these groups together - so it's a firm
  distinction. The "command profile" can't contain
  "metadata profile" and vice versa! This is strictly checked
  and if the settings are mixed, such profile is rejected and
  it's not used. So in the end, the CONFIG_PROFILE_COMMAND
  set of options and CONFIG_PROFILE_METADATA are mutually exclusive
  sets.

- Marking configuration with one or the other marker will also
  determine the way these configuration sources are positioned
  in the configuration cascade which is now:

  CONFIG_STRING -> CONFIG_PROFILE_COMMAND -> CONFIG_PROFILE_METADATA -> CONFIG_FILE/CONFIG_MERGED_FILES

- Marking configuration with one or the other marker will also make
  it possible to issue a command context refresh (will be probably
  a part of a future patch) if needed for settings in global profile
  set. For settings in metadata profile set this is impossible since
  we can't refresh cmd context in the middle of reading VG/LV metadata
  and for each VG/LV separately because each VG/LV can have a different
  metadata profile assinged and it's not possible to change these
  settings at this level.

- When command profile is incorrect, it's rejected *and also* the
  command exits immediately - the profile *must* be correct for the
  command that was run with a profile to be executed. Before this
  patch, when the profile was found incorrect, there was just the
  warning message and the command continued without profile applied.
  But it's more correct to exit immediately in this case.

- When metadata profile is incorrect, we reject it during command
  runtime (as we know the profile name from metadata and not early
  from command line as it is in case of command profiles) and we
  *do continue* with the command as we're in the middle of operation.
  Also, the metadata profile is applied directly and on the fly on
  find_config_tree_* fn call and even if the metadata profile is
  found incorrect, we still need to return the non-profiled value
  as found in the other configuration provided or default value.
  To exit immediately even in this case, we'd need to refactor
  existing find_config_tree_* fns so they can return error. Currently,
  these fns return only config values (which end up with default
  values in the end if the config is not found).

- To check the profile validity before use to be sure it's correct,
  one can use :

    lvm dumpconfig --commandprofile/--metadataprofile ProfileName --validate

  (the --commandprofile/--metadataprofile for dumpconfig will come
   as part of the subsequent patch)

- This patch also adds a reference to --commandprofile and
  --metadataprofile in the cmd help string (which was missing before
  for the --profile for some commands). We do not mention --profile
  now as people should use --commandprofile or --metadataprofile
  directly. However, the --profile is still supported for backward
  compatibility and it's translated as:

    --profile == --metadataprofile for lvcreate, vgcreate, lvchange and vgchange
                 (as these commands are able to attach profile to metadata)

    --profile == --commandprofile for all the other commands
                (--metadataprofile is not allowed there as it makes no sense)

- This patch also contains some cleanups to make the code handling
  the profiles more readable...
2014-05-20 16:21:48 +02:00
Peter Rajnoha
24f32721a9 dumpconfig: fix dumpconfig --type diff used in lvm shell as second and later command
The dumpconfig reuses existing config_def_check results in case
the check is done during general lvm command context initialization
(when enabled by config/checks=1) so dumpconfig does not need to run
the same check again during its execution, hence saving some time.

However, we don't check for differences from defaults during general
lvm command initialization as it's useless at that time. It makes
sense only in case when such a check is directly requested (like in
the case of lvm dumpconfig --type diff). We need to take care that
the reused information was already produced with this "diff" checking
before and if not, we need to force the check so the check status also
gathers the new "diff" info now.

Also, do not do diff checking for any other dumpconfig command that
is run after dumpconfig --type diff.
2014-05-19 15:41:25 +02:00
Peter Rajnoha
9a324df3b3 config: fix incorrect profile initialization on cmd context refresh
When cmd refresh is called, we need to move any already loaded profiles
to profiles_to_load list which will cause their reload on subsequent
use. In addition to that, we need to take into account any change
in config/profile configuration setting on cmd context refresh
since this setting could be overriden with --config.

Also, when running commands in the shell, we need to remove the
global profile used from the configuration cascade so the profile
is not incorrectly reused next time when the --profile option is
not specified anymore for the next command in the shell.

This bug only affected profile specified by --profile cmd line
arg, not profiles referenced from LVM metadata.
2014-05-19 15:39:55 +02:00
Zdenek Kabelac
b73a786755 man: lvmcache
Migrate cache description into  man(7) entry
(like lvmthin).
2014-05-15 12:13:24 +02:00
Zdenek Kabelac
11bedf1baf display: print skipped prompt
Since decisions in the silent mode may not be always obvious,
print skipped prompt with answer 'n'.

Also document  '-qq' behaviour (single -q only shuts
logging, while -qq sets silent mode).
2014-05-15 12:11:35 +02:00
Peter Rajnoha
7c86131233 report: export DM_REPORT_FIELD_RESERVED_NAME_{HELP,HELP_ALT} and show help on '<lvm_command> -O help'
Share DM_REPORT_FIELD_RESERVED_NAME_{HELP,HELP_ALT} between libdm and
any libdm user to handle reserved field names, in this case the virtual
field name to show help instead of failing on unrecognized field.
The libdm user also needs to check the field name so it can fire
proper code in this case (cleanup, exit etc.).
2014-05-15 10:58:14 +02:00
Alasdair G Kergon
5684cfcb1c report: Add metadata_percent to lvs_cols. 2014-05-15 08:32:27 +01:00
Alasdair G Kergon
3b989e317f allocation: Fix alloc anywhere with parity.
Take account of parity areas with alloc anywhere in
_calc_required_extents.  Extents beyond area_count were treated
incorrectly as mirror logs.
2014-05-14 16:25:43 +01:00
Zdenek Kabelac
48a8cf28f7 cache: avoid expression overflow
Cast data_extents to 64bit so calculation is in 64b arithmetic.
2014-05-07 14:14:54 +02:00
Zdenek Kabelac
e585a6bbcf signals: better nesting support
Support upto 3 levels os nesting signal blocking.
As of today - code blocks signals immediatelly when it opens
VG in read-write mode - this however makes current prompt usage
then partially unusable since user may not 'break' command
during prompt (something most user would expect).

Until a better fix for prompting is implemented, put in support
for signal nesting - thus when prompt enables signal acceptance,
make it possible to really break command at this point.
2014-05-07 14:09:33 +02:00
Zdenek Kabelac
04b29a3587 locking: use sigaction signal handling
Use sigint_allow/restore function instead of duplicating code
and switch to use only sigactiction based signal handling.
2014-05-07 14:01:13 +02:00
Alasdair G Kergon
2eed136f0f signals: Move sigint handling out to lvm-signal. 2014-05-01 20:07:17 +01:00
Zdenek Kabelac
517b002648 display: check for dmeventd support
When quering for dmeventd monitoring status, check first
if lvm2 is configured to monitor to avoid unwanted start
of dmeventd process for answering monitoring status.
2014-04-30 10:26:26 +02:00
Alasdair G Kergon
b1f765d72a pvremove: Catch CTRL-c during prompts. 2014-04-29 08:16:28 +01:00
Zdenek Kabelac
d1aba7ccf6 lvscan: drop test for snapshosts
When showing  ACTIVE status for snapshot's origin,
avoid testing all its snapshot - it's not useful
to tell user origin is inactivate, while it's clearly
available and running - just one of its snapshot leg
is invalid...
2014-04-28 12:42:53 +02:00
Zdenek Kabelac
4c405a9b49 thin: move segment info display to correct code section
Relocate info from thin pool and thin volume segments
to proper code section for segments.
Add discards and thin count status info.

Info is shown with  'lvdisplay --maps' (like for other segments).
2014-04-28 12:41:25 +02:00
Zdenek Kabelac
71314a9905 thin: display info when -tpool is running
For percentage display we need -tpool - so check for layered
device presence here instead of plain pool device.
Also update 'info' - so when pool is 'available' we
display open count for -tpool device instead of mostly
irrelevant pool.
TODO: Maybe we should actually display this open info always?
(even when just -tpool is available, but pool is not)
2014-04-28 12:40:17 +02:00
Zdenek Kabelac
91a8e4a3d8 display: show monitoring status
When displaying segments  (lvdisplay --maps)
show monitoring status when supported by segment.
2014-04-28 12:39:03 +02:00
Zdenek Kabelac
e6168b8d70 display: use Virtual for virtual LV
Emphesize virtual extents for virtual LVs and for
those use  'Virtual extents' instead of 'Logical extents',
so it's immeditatelly visible, which extents do have
straighforward physical backend.
2014-04-28 12:37:50 +02:00
Jonathan Brassow
3c4234f825 vgsplit: Make RAID 4/5/6 fail cleanly when too few PV specified
While the 'raid1/10' segment types were being handled inadvertently
by '_move_mirrors()', the parity RAIDs were not being properly checked
to ensure that the user had specified all necessary PVs when moving
them.  Thus, internal errors were being triggered when only part of
a RAID LV was moved to the new VG.  I've added a new function,
'_move_raid()', which properly checks over any affected RAID LVs and
ensures that all the necessary PVs are being moved.
2014-04-25 16:24:50 -05:00
Jonathan Brassow
76687f4cac WHATS_NEW: Add message for commit 9ac858f
WHATS_NEW for commit:
9ac858f vgsplit: Make vgsplit work on mirrors with leg and log on same PV
2014-04-25 14:55:32 -05:00
Peter Rajnoha
8e8a47143f config: use devices/ignore_suspended_devices=0 by default
ignore_suspended_devices=0 is already used in lvm.conf we distribute,
but it was still "1" in the code (so it was used when lvm.conf value
was not defined). It should be "0" too.
2014-04-24 12:12:39 +02:00
Zdenek Kabelac
47a60369a0 unknown: fix mempool used for name allocation
Use cmd libmem mempool for name allocation, since mem mempool
is released after each clvmd command.
2014-04-18 16:38:47 +02:00
Alasdair G Kergon
b5f8f452ac tools: Add --readonly support.
Offer lock-free access to display virtual machine or clustered VG metadata
while it might be in use.
2014-04-18 02:46:34 +01:00
Alasdair G Kergon
17e304e0ac metadata: Fix unlock on VG recovery error path.
If lock conversion failed it tried to unlock VG that was no longer locked.
2014-04-18 02:27:16 +01:00
Alasdair G Kergon
177ece01a9 reports: Use X for unknown LV attr when no dm.
It's safer not to tell people an LV is inactive when we aren't sure.
2014-04-18 02:23:39 +01:00
Alasdair G Kergon
e8a3ba1865 pvscan: Use lvmetad_used().
Config variables that are processed during setup prior to calling into
particular tools must not be accessed directly afterwards in case the
values already got overridden.

_process_config() already used the tests I'm removing here to call
lvmetad_set_active() and set up lvmetad_used().
2014-04-18 02:13:46 +01:00
Peter Rajnoha
702180b30c configure: use configure's --enable-udev-systemd-background-jobs by default
This should be the preferred way of configuring lvm2 for udev/systemd
since otherwise one can end up with the processes run from udev (the
pvscan we run for lvmetad update on events) to be killed prematurely
and this can end up with LVM volumes not activated in the end.
2014-04-16 11:33:44 +02:00
Peter Rajnoha
a6763c64a7 lvmdump: add -s to gather system info and context (currently systemd-related only)
This is sort of info we always ask people to retrieve when
inspecting problems in systemd environment so let's have this
as part of lvmdump directly.

The -s option does not need to be bound to systemd only. We could
add support for initscripts or any other system-wide/service tracking
info that can help us with debugging problems.
2014-04-15 15:27:30 +02:00
Alasdair G Kergon
1bf4c3a1fa alloc: Introduce A_POSITIONAL_FILL.
Set A_POSITIONAL_FILL if the array of areas is being filled
positionally (with a slot corresponding to each 'leg') rather
than sequentially (with all suitable areas found, to be sorted
and selected from).
2014-04-15 01:13:47 +01:00
Zdenek Kabelac
eccc50d861 clvmd: use thread-safe ctime_r when debugging
Use thread friendly version of ctime
TODO:should be probably replaced with strftime()
2014-04-14 13:02:25 +02:00
Zdenek Kabelac
639983b6b7 clvmd: skip adding reply when finished
Prior adding new reply to the list, check
if the reply thread is not already finished.
In that case discard adding message
(which would otherwise be leaked).
2014-04-14 13:01:42 +02:00
Zdenek Kabelac
7236b92857 clvmd: improve mutex usage in request_timed_out
Use mutex to access localsock values, so check
num_replies when the thread is not yet finished.

Check for threadid prior the mutex taking
(though this check is probably not really needed)
2014-04-14 13:00:51 +02:00
Zdenek Kabelac
7075656034 clvmd: drop reply_mutex
Added complexity with extra reply mutex is not worth the troubles.
The only place which may slightly benefit from this mutex is timeout
and since this is rather error case - let's convert it to
localsock.mutex and keep it simple.
2014-04-14 12:59:07 +02:00
Zdenek Kabelac
6115c0d112 clvmd: set finished flag with mutex
Setting this variable needs to be protected with mutex.
2014-04-14 12:58:28 +02:00
Zdenek Kabelac
cc0096ebdd clvmd: move mutex init and detroy
Move the pthread mutex and condition creation and destroy
to correct place right after client memory is allocatedd
or is going to be released.

In the original place it's been in race with lvm thread
which could have still unlock mutex while it's been already
destroyed.
2014-04-14 12:57:39 +02:00
Zdenek Kabelac
91f4e09b48 clvmd: fix test mode race
When TEST_MODE flag is passed around the cluster,
it's been use in thread unprotected way, so it may have
influenced behaviour of other running parallel lvm commands
(activation/deactivation/suspend/resume).

Fix it by set/query function only under lvm mutex.
For hold_un/lock function calls check lock_flags bits directly.
2014-04-14 12:55:46 +02:00
Zdenek Kabelac
bd3e44643d memlock: ignore more libraries
Extend the list of ignored libraries. Since we do not
use those libraries during suspend, skip their locking.
2014-04-14 12:53:07 +02:00
Zdenek Kabelac
84ff3ae703 pvmove: remove locked flag from error pvmove0
When pvmove0 is finished, it replaces temporarily pvmove0
with error segment, however in this case, pvmove0 remains
unremovable in case pvmove --abort is interrupted in this
moment - since it's not a pvmove anymore and normal
lvremove can't be used to remove LOCKED lv.
2014-04-14 12:52:32 +02:00
Zdenek Kabelac
45f45c9932 polldaemon: ret invalid cmd for negative interval
Negative intervals are not supported.
2014-04-14 12:47:14 +02:00
Alasdair G Kergon
6320c3b905 post-release 2014-04-10 17:13:27 +01:00
Alasdair G Kergon
2043f8c729 pre-release 2014-04-10 16:28:28 +01:00
Peter Rajnoha
6c79556f4f pvcreate: fix ignored --dataalignment/dataalignment offset for pvcreate --restorefile
There were two bugs before when using pvcreate --restorefile together
with data alignment and its offset specified:

  - the --dataalignment was always ignored due to missing braces in the
    code when validating the divisibility of supplied --dataalignment
    argument with pe_start which we're just restoring:

	if (pp->rp.pe_start % pp->data_alignment)
		log_warn("WARNING: Ignoring data alignment %" PRIu64
			 " incompatible with --restorefile value (%"
			 PRIu64").", pp->data_alignment, pp->rp.pe_start);
        pp->data_alignment = 0

    The pp->data_alignment should be zeroed only if the pe_start is not
    divisible with data_alignment.

  - the check for compatibility of restored pe_start was incorrect too
    since it did not properly count with the dataalignmentoffset that
    could be supplied together with dataalignment

The proper formula is:

  X * dataalignment + dataalignmentoffset == pe_start

So it should be:

  if ((pp->rp.pe_start % pp->data_alignment) != pp->data_alignment_offset) {
	...ignore supplied dataalignment and dataalignment offset...
  }
2014-04-10 13:43:46 +02:00
Peter Rajnoha
10e10cfa4f lvmetad: fix lost bootloader area information
The refactoring made by 732859d21f
caused this. The former "ea" was not renamed to "ba" and we used
incorrect tree node name to search for the value.
2014-04-09 14:53:45 +02:00
Peter Rajnoha
a7c930b18d tools: don't require --major to be specified when using -My option on kernels > 2.4
Since kernel > 2.4 have dynamically assigned major numbers.

[0] raw/~ $ lvcreate -l1 -My --minor 10 vg
  Logical volume "lvol0" created

[0] raw/~ $ lvcreate -l1 -My --major 254 --minor 11 vg
  Ignoring supplied major number - kernel assigns major numbers dynamically. Using major number 253 instead.
  Logical volume "lvol1" created

[0] raw/~ $ lvs --profile out -o+major,minor
  lvol0 vg     -wima-----    4.00 253  10
  lvol1 vg     -wima-----    4.00 253  11
2014-04-04 13:29:39 +02:00
Alasdair G Kergon
cc72f340d7 thin: Support thin_check --clear-needs-check-flag.
Update thin provisioning tools to version 0.3.2 or later!
2014-04-04 02:22:40 +01:00
Alasdair G Kergon
12ddaa5f10 lib: Share lvm_even_rand for random numbers. 2014-04-04 01:26:19 +01:00
Alasdair G Kergon
6d2a26f6b6 man: Add lvmthin(7). 2014-04-04 01:14:25 +01:00
Zdenek Kabelac
84beba5d7f vg_validate: check size of lv_name + vg_name
Since the whole dm device name may not exceed 127 characters,
validate no LV names is bigger then this limit.
2014-03-31 12:05:32 +02:00
Zdenek Kabelac
6570d36ad5 lv_rename: resume fail is certainly error
Failing resume path means command has failed,
even when commit was ok.
2014-03-31 12:03:25 +02:00
Zdenek Kabelac
d3e4934a15 lvrename: fix name length validation
This test for LV name restriction check name of device is below 128
chars (which is enforced by dm target).
Thus it should not count with device name.
(Though the test for PATH_MAX size should be probably also added,
but this is runtime test, since theoretically devpath might differ in cluster)
2014-03-31 12:02:18 +02:00
Zdenek Kabelac
caa429da2e lvdiplay: prohibit use of -c and -m
It's unclear why we should prohibit use of -v output.
So reenable (like with other 'display' tools)
But -c -m is really unsupported - return invalid cmd.
2014-03-31 12:00:45 +02:00
Zdenek Kabelac
31d68b7124 man: sort options
Sort order for -C|--columns as with other options,
and use short capital name as the first (as with other options).
Also drop multiple reference for pvs/lvs/vgs, since now
the text for -C is really close to referrence of lvm anyway.
2014-03-30 23:43:31 +02:00
Zdenek Kabelac
7a9c569eb4 pvscan: return error when free parameter is given
Just like vgscan, pvscan (without --cache option) is not
accepting and 'free' args  (i.e.  pvscan /dev/sdx)
2014-03-30 23:42:57 +02:00
Zdenek Kabelac
844afa32a0 pvdisplay: fix option error
Properly show error for '-m'.
Also report unsupported  '-c & -s' (just like with vg/lvdisplay)
2014-03-30 23:41:52 +02:00
Zdenek Kabelac
cf29de5de0 vgimport/vgexport: return invalid cmd
When option parsing fails, return invalid cmd instead of fail.
2014-03-30 23:40:27 +02:00
Zdenek Kabelac
88a9705222 pvchange: populate lvmcache for --all
When running pvchange --all  learn about available VGs from lvmetad.
2014-03-28 10:41:58 +01:00
Peter Rajnoha
801d43445d man: add man page for lvm dumpconfig 2014-03-27 14:03:28 +01:00
Zdenek Kabelac
356fdda46d lv_manip: drop cmd pointer from for_each_sub_lv
Drop unused passed cmd pointer from function.

TODO:

We have two similar functions (though not identical)

lv_manip.c: for_each_sub_lv()
metadata.c: _lv_each_dependency()

They seem to not always match - we should probably convert
to use only a single function.
2014-03-27 13:10:13 +01:00
Zdenek Kabelac
22579b4451 lv_rename: validate renamed sublv
Use proper vgmem memory pool for allocation of LV name in the vg
and check if new renamed LV is a valid name.

TODO: validation should really use also VG name, othewise we are not
able to tell "vgname-lvname" will be valid.
2014-03-27 13:06:23 +01:00
Zdenek Kabelac
28e4bf0b6d lvrename: fix exit code
Return invalid cmd line (3) when error is detect on cmd line input.
Cmd failed (5) is used when command processing fails.
2014-03-27 13:04:55 +01:00
Zdenek Kabelac
80fe100afa pvchange: fix exit code regression
Commit 1a832398a7 moved
some code from _pvchange_single() to main pvchange() and
introduced exit code regression as return codes have not
been properly changed, thus pvchange command exited
with '0' exit code, even though it has reported error.
Also there is a missing vg unlock in error path.

Fix it by counting the total number of expected calls before
checking for pvname and also unlock and relase vg when
pv is not found.
2014-03-27 13:01:51 +01:00
Peter Rajnoha
ce78cb58eb lvmdump: add lvm dumpconfig --type diff/missing
For a quick overview of config when debugging and to quickly check
which values are different from defaults and which are not defined
in the config and for which defaults are used.
2014-03-25 13:06:59 +01:00
Zdenek Kabelac
0738f0ad72 pvresize: fail exit code for negative size
Pvresize with negative value retuns invalid cmd line exit code.
2014-03-25 11:52:03 +01:00
Zdenek Kabelac
68d13b2517 dev-cache: fix mem corruption on refresh context
When lvm2 command works with clvmd and uses locking in wrong way,
it may 'leak' certain file descriptors in opened (incorrect) state.

dev_cache_exit then destroys memory pool of cached devices, while
_open_devices list in dev-io.c was still referencing them if they
were still opened.

Patch properly calls _close() function to 'self-heal' from this
invalid state, but it will report internal error (so execution
with abort_on_internal_error causes immediate death). On the
normal 'execution', error is only reported, but memory state is
corrected, and linked list is not referencing devices from
released mempool.

For crash see: https://bugzilla.redhat.com/show_bug.cgi?id=1073886
2014-03-25 11:22:57 +01:00
Peter Rajnoha
d29fe919e6 WHATS_NEW: commit f12ee43 2014-03-25 11:00:44 +01:00
Peter Rajnoha
5dcec1734e dumpconfig: add dumpconfig --type diff to show differences from defaults 2014-03-24 15:35:54 +01:00
Zdenek Kabelac
936bfeb8de dev-swap: detect swap signature on devices smaller then 2MB
Smallest supported size for swap device is 40KB, however current
test skipped devices smaller then 4096 sectors (2MB).

Since page is in bytes, convert it to sectors before comparing
with device size (in sectors).
2014-03-22 20:36:14 +01:00
Zdenek Kabelac
93d77455ea WHATS_NEW update 2014-03-21 22:29:28 +01:00
Peter Rajnoha
2f279797f5 WHATS_NEW: commit 5eef269 2014-03-19 09:49:09 +01:00
Zdenek Kabelac
3d7eaf9226 pvscan: fix report of long pv names
pvscan --uuid was broken since it was using only 128 char buffers
without checking any write size, so any longer device path leads to
crash.

Also ansure format is properly aligned into columns with this option.
2014-03-19 00:58:01 +01:00
Zdenek Kabelac
506091be70 pv_vg_name: do not expose internal orphans to lvm2 users
Check for orphan VG name and return just empty string,
Use internally pv->vg_name if the real orphan name is needed.
2014-03-19 00:57:59 +01:00
Thomas Fehr
5b69bfb6f8 pvdisplay: fix man to refer to sectors, not KB
The PV size is displayed in sectors, not kilobytes
for 'pvdisplay -c'

Signed-off-by: Thomas Fehr <fehr@suse.de>
Acked-by: Hannes Reinecke <hare@suse.de>
2014-03-19 00:49:32 +01:00
Zdenek Kabelac
6a8d3d7811 clvmd: avoid resending local sync commands
Instead of sending repeatedly  LOCAL_SYNC commands to clvmds
like 'lvs', rememeber the last sent commmand, and if there was no other
clvmd command, drop this redundant SYNC call message.

The problem has started with commit:
56cab8cc03

This introduced correct synchronisation of name, when user requests to know
open_count (needs to wait for udev), however it is also executed for
read-only cases like 'lvs' command.

For now implement very simple solution, which is only monitoring
outgoing clvmd command, and when sequence of LOCAL sync names are
recognized, they are skipped automatically.

TODO:
Future solution might move this variable info 'cmd_context' and
use  'needs_sync' flag also i.e. in file locking code.
2014-03-19 00:47:58 +01:00
Zdenek Kabelac
a6b159e99c file_locking: use PATH_MAX for dir name
Using just 128 chars for locking dir may wail if longer
dir entry is used, swich to default linux max path.
2014-03-19 00:46:28 +01:00
Zdenek Kabelac
08018a5345 archiver: drop unneeded backup check
When the backup is disabled, avoid testing backup presence.
This only leads to errors being logged in debug trace and the missing
backup can't be fixed, since it's disabled.
2014-03-19 00:45:41 +01:00
Peter Rajnoha
25f5e2da8d dumpconfig: fix memleak when using --mergedconfig
Check whether lvm dumpconfig --mergedconfig is used only
with --type current (where we're merging current config and
the config supplied on command line). With other types
the config was merged, but it was thrown away since we're
generating other type of config anyway. This lead to a memleak.

Error out if --mergedconfig is used with anything else than
--type current (or without specifying --type in which case
the --type current is used by default).
2014-03-18 11:11:31 +01:00
Peter Rajnoha
21b3c983fd config: make global/lvdisplay_shows_full_device_path profilable 2014-03-18 09:49:53 +01:00
Peter Rajnoha
3a6bc7fc65 WHATS_NEW: previous commit 2014-03-18 09:28:52 +01:00
Peter Rajnoha
c1ce2cc86c config: make global/units and global/si_unit_consistency profilable 2014-03-17 16:07:29 +01:00
Zdenek Kabelac
d425e788e9 lvconvert: validate min chunk size for snapshot
Do not allow conversion of too small LV into a COW snapshot device.

Without this patch snapshot target is generating these kernel
messages before creation fails:

attempt to access beyond end of device
dm-9: rw=16, want=8, limit=2
attempt to access beyond end of device
...
device-mapper: table: 253:11: snapshot: Failed to read snapshot metadata
device-mapper: ioctl: error adding target to table
device-mapper: reload ioctl on  failed: Input/output error
2014-03-17 14:31:43 +01:00
Zdenek Kabelac
f3b9ee37e9 lvconvert: disallow usage of origin for snapshot
Usage of origin as a snapshot 'COW' volume is unsupported.

Without this test lvm2 is able to generate this ugly internal error message.

To test this:

lvcreate -L1 -n lv1 vg
lvcreate -L1 -n lv2 -s vg/lv1
lvcreate -L1 -n lv3 vg
lvconvert -s vg/lv3 vg/lv1

Internal error: LVs (5) != visible LVs (1) + snapshots (1) + internal LVs (0) in VG vg
2014-03-17 14:31:42 +01:00
Peter Rajnoha
455f23586f config: make report settings profilable
Users can create several profiles for how the tools report
the output very easily and then just use

  <lvm reporting command> --profile <report_profile_name>
2014-03-17 14:27:49 +01:00
Peter Rajnoha
f9070c196b conf: add existing report settings to lvm.conf
These settings exist for ages but they were not exposed in lvm.conf
and so majority of users didn't event know about them.
2014-03-17 14:24:21 +01:00
Peter Rajnoha
ada47c164a autoactivation: use VG read lock
The activation (including the refresh) should take the VG read lock
like the usual activation/refresh.
2014-03-14 12:10:52 +01:00
Peter Rajnoha
ca880a4f13 autoactivation: issue a VG refresh before autoactivation only if 'change' lvmetad flag is set
This prevents numerous VG refreshes on each "pvscan --cache -aay" call
if the VG is found complete. We need to issue the refresh only if the PV:
  - is new
  - was gone before and now it reappears (device "unplug/plug back" scenario)
  - the metadata has changed
2014-03-14 10:48:56 +01:00
Peter Rajnoha
031666ef66 man: add man page for lvm2-activation-generator 2014-03-13 13:01:06 +01:00
Peter Rajnoha
bf42119b22 config: accept empty values for global/thin_disabled_features
Let's do this the other way round - this makes more logic than commit b995f06.
So let's allow empty values for global/thin_disabled_features where
such an empty value now means "none of this features are disabled".
2014-03-12 15:53:20 +01:00
Zdenek Kabelac
6a0d97a65c lvm: change build_dm_uuid API
Pass directly 'lv' into this build routine,
so we can eventually add more private UUID suffixes.
2014-03-12 00:16:20 +01:00
Zdenek Kabelac
4d64e91efd thin: do not check of empty pool with messages
The empty pool is also the pool which has yet queued list of messages
and transaction_id == 1.

Problem is exposed when pool is created inactive.

lvcreate -L10 -T vg/pool -an
lvcreate -V10 -T vg/pool
2014-03-12 00:15:22 +01:00
Zdenek Kabelac
1850a6e454 thin: fix pool_has_message return for NULL params
When pool_has_message() is queried with NULL lv and 0 device_id
it should just return 'true' when there is any message queued.
So it needs to return negative value dm_list_empty().

Since there is no user for this code path in code currently,
this bug has not been triggered.
2014-03-12 00:13:21 +01:00
Zdenek Kabelac
460c19df62 clvmd: fix memleak on exit
This patch will releases allocated private resources from
startup. Needs previous dm_zalloc patch to ensure unset
private pointer is NULL.

TODO: check on real cluster.
2014-03-10 12:21:32 +01:00
Zdenek Kabelac
38ce06e448 clvmd: use dm_zalloc for socket allocation
Instead of doing individual settings for struct members,
ensure whole struct is in defined state.
2014-03-10 12:20:49 +01:00
Zdenek Kabelac
7a595d7388 makefiles: use BLKID/UDEV_CFLAGS properly
blkid.h needs BLKID_CFLAGS
Do not add UDEV_CFLAGS everywhere and use it only when needed.
2014-03-06 17:30:06 +01:00
Zdenek Kabelac
216c57eed7 readline: switch to new-style readline typedef
Based on patch:
https://www.redhat.com/archives/lvm-devel/2014-March/msg00015.html

The CPPFunction typedef (among others) have been deprecated in favour of
specific prototyped typedefs since readline 4.2 (circa 2001).
It's been working since because compatibility typedefs have been in
place until they where removed in the recent readline 6.3 release.
Switch to the new style to avoid build breakage.

But also add full backward compatibility with define.

Signed-off-by: Gustavo Zacarias <gustavo zacarias com ar>
2014-03-06 17:28:40 +01:00
Peter Rajnoha
42eb9b4526 WHATS_NEW: config handling changes 2014-03-06 13:03:51 +01:00
Peter Rajnoha
2c42f60890 udev: run pvscan --cache via systemd-run in udev if the PV label is detected lost
If the PV label is lost (e.g. by doing a dd on the device), call
"systemd-run pvscan --cache <major>:<minor>" in 69-dm-lvm-metad.rules
to inform lvmetad about this state.

The reason for this is that ENV{SYSTEMD_WANTS}="lvm2-pvscan@<major>:<minor>"
logic will not cause the pvscan to be fired in this case since this works
only on proper device addition/removal cycle - the lvm2-pvscan service's
ExecStop is called only on proper REMOVE event - the service is bound to
device existence. Hence we need pvscan call via systemd-run (that
instantiates a quick transient service just to call the command).

See also https://bugzilla.redhat.com/show_bug.cgi?id=1063813.
2014-03-05 14:30:58 +01:00
Zdenek Kabelac
d8513da9be lvmetad: fix memleak when pv changes it device
Test vgimportclone invokes mem leak of pvid which
would be otherwise lost when device_old_pvid
is removed from hash table.
2014-03-01 14:00:15 +01:00
Zdenek Kabelac
40e6176d25 snapshots: fix incorrect calculation of cow size
Code uses target driver version for better estimation of
max size of COW device for snapshot.

The bug can be tested with this script:
VG=vg1
lvremove -f $VG/origin
set -e
lvcreate -L 2143289344b -n origin $VG
lvcreate -n snap -c 8k -L 2304M -s $VG/origin
dd if=/dev/zero of=/dev/$VG/snap bs=1M count=2044 oflag=direct

The bug happens when these two conditions are met
* origin size is divisible by (chunk_size/16) - so that the last
  metadata area is filled completely
* the miscalculated snapshot metadata size is divisible by extent size -
  so that there is no padding to extent boundary which would otherwise
  save us

Signed-off-by:Mikulas Patocka <mpatocka@redhat.com>
2014-02-26 14:25:09 +01:00
Zhiqing Zhang
014ba37cb1 lvresize: fix stripe size validation
While stripe size is twice the physical extent size,
the original code will not reduce stripe size to maximum
(physical extent size).

Signed-off-by: Zhiqing Zhang <zhangzq.fnst@cn.fujitsu.com>
2014-02-26 13:25:50 +01:00
Zdenek Kabelac
2b044452e3 snapshot: zero cow header for read-only snapshot
When read-only snapshot was created, tool was skipping header
initialization of cow device.  If it happened device has been
already containing header from some previous snapshot, it's
been 'reused' for a newly created snapshot instead of being cleared.
2014-02-26 00:22:46 +01:00
Peter Rajnoha
558c932444 dumpconfig: comment out config lines without default values defined
To make "lvm dumpconfig --type default" output to be usable like any
other config, we need to comment out lines that have no default value
defined. Otherwise, we'd have the output with config options
with blank or zero values which is not the same as when the value
is not defined! And such configuration can't be feed into lvm again
without further edits. So let's fix this.

Currently this covers these configuration options exactly:

  devices/loopfiles
  devices/preferred_names
  devices/filter
  devices/global_filter
  devices/types
  allocation/cling_tag_list
  global/format_libraries
  global/segment_libraries
  activation/volume_list
  activation/auto_activation_volume_list
  activation/read_only_volume_list
  activation/mlock_filter
  metadata/dirs
  metadata/disk_areas
  metadata/disk_areas/<disk_area>
  metadata/disk_areas/<disk_area>/start_sector
  metadata/disk_areas/<disk_area>/size
  metadata/disk_areas/<disk_area>/id
  tags/<tag>
  tags/<tag>/host_list
2014-02-25 11:32:54 +01:00
Zdenek Kabelac
5097463fb3 mirror: detect attrs just once
Reorder detection of cmirrord. Now if cmirrord is not
running, target will not try to load kernel log module,
for communication with cmirrord.

Whole check for attrs now also happens just once.
2014-02-24 21:13:11 +01:00
Zdenek Kabelac
95fe823eba raid: use feature attributes for raid10
Test raid10 availability as a target feature (instead of doing
it in all the places where raid10 should be checked).

TODO: activation needs runtime validation - so metadata with raid10
are skipped from activation in user-friendly way in lvm2.
2014-02-24 21:10:13 +01:00
Zdenek Kabelac
8c878438f5 metadata: move vg parsing to vg_write
Parsing vg structure during  supend/commit/resume may require a lot of
memory - so move this into vg_write.

FIXME: there are now multiple cache layers which our doing some thing
multiple times at different levels. Moreover there is now different
caching path with and without lvmetad - this should be unified
and both path should use same mechanism.
2014-02-24 21:08:53 +01:00
Peter Rajnoha
84d02542bd WHATS_NEW: for commit b391ae88e5 2014-02-21 11:36:55 +01:00
Zdenek Kabelac
c71a3bcbc0 activation: lv_activation_skip remove always same arg.
Remove 'skip' argument passed into the function.
We always used '0' - as this is the only supported
option (-K) and there is no complementary option.

Also add some testing for behaviour of skipping.
2014-02-19 11:33:39 +01:00
Peter Rajnoha
417e52c13a udev: create /dev/disk/by-id/lvm-pv-uuid-<PV_UUID> symlink for a PV
We already have /dev/disk/by-id/dm-uuid-... (which encompasses the
VG UUID and LV UUID in case of LVs since the mapping's UUID is
VG+LV UUID together) and /dev/disk/by-id/dm-name-... (which encompasses
the VG and LV name in case of LVs).

This patch addds /dev/disk/by-id/lvm-pv-uuid-<PV_UUID> that completes
this scheme and makes navigation a bit easier using PV UUIDs since
one can navigate using PV UUIDs only and there's no need to do extra
PV UUID <--> kernel name matching (the PV UUID is stable across reboots).
This may come in handy in various scripts.

Since we already have the PV UUID stored in udev database (as a result
of blkid call - returned in ID_FS_UUID blkid's variable), this operation
is very cheap indeed, just creating the extra one symlink.
2014-02-18 11:37:20 +01:00
Jonathan Brassow
6a00a7e33d RAID: Allow implicit stripe (and parity) when creating RAID LVs
There are typically 2 functions for the more advanced segment types that
deal with parameters in lvcreate.c: _get_*_params() and _check_*_params().
(Not all segment types name their functions according to this scheme.)
The former function is responsible for reading parameters before the VG
has been read.  The latter is for sanity checking and possibly setting
parameters after the VG has been read.

This patch adds a _check_raid_parameters() function that will determine
if the user has specified 'stripe' or 'mirror' parameters.  If not, the
proper number is computed from the list of PVs the user has supplied or
the number that are available in the VG.  Now that _check_raid_parameters()
is available, we move the check for proper number of stripes from
_get_* to _check_*.

This gives the user the ability to create RAID LVs as follows:
# 5-device RAID5, 4-data, 1-parity (i.e. implicit '-i 4')
~> lvcreate --type raid5 -L 100G -n lv vg /dev/sd[abcde]1

# 5-device RAID6, 3-data, 2-parity (i.e. implicit '-i 3')
~> lvcreate --type raid6 -L 100G -n lv vg /dev/sd[abcde]1

# If 5 PVs in VG, 4-data, 1-parity RAID5
~> lvcreate --type raid5 -L 100G -n lv vg

Considerations:
This patch only affects RAID.  It might also be useful to apply this to
the 'stripe' segment type.  LVM RAID may include RAID0 at some point in
the future and the implicit stripes would apply there.  It would be odd
to have RAID0 be able to auto-determine the stripe count while 'stripe'
could not.

The only draw-back of this patch that I can see is that there might be
less error checking.  Rather than informing the user that they forgot
to supply an argument (e.g. '-i'), the value would be computed and it
may differ from what the user actually wanted.  I don't see this as a
problem, because the user can check the device count after creation
and remove the LV if they have made an error.
2014-02-17 20:18:23 -06:00
Zdenek Kabelac
25cea92338 thin: fix merge of old snaphost
Fix merging of old snapshot into thinvolume origin.
Add also internal error for the error case when
merging requests activation of merged LV.
2014-02-17 22:25:53 +01:00
Peter Rajnoha
f33a224ef0 scripts: use --ignoreskippedcluster in lvm2-monitor initscript/systemd unit
When clustered VG is available in the system but we don't have
clustering set up for whatever reason, the lvm2-monitor scripts should
not fail completely just because these clustered VGs are skipped during
vgs/vgchange calls in lvm2-monitor initscript/systemd unit.
2014-02-17 16:29:49 +01:00
Zdenek Kabelac
f0f4248333 activation: drop test r/w vg state for activing LV
VG status read/write is meant to influence only VG metadata.
It's not related to the read/write status of the LV itself.
2014-02-15 11:34:54 +01:00
Peter Rajnoha
4210219a8b systemd: Use --ignoreskippedcluster in generated activation systemd units
When the activation units are generated if use_lvmetad=0 (no
autoactivation), use --ignoreskippedcluster option for vgchange calls
since the cluster with cLVM is set up by separate units.

This avoids a situation in which the generated activation units are
improperly in failed state just because of the vgchange return value
when clustered VGs are encountered while the activation of non-clustered
VGs does proceed normally.
2014-02-14 11:59:12 +01:00
Jonathan Brassow
94d8779ae2 Update WHATS_NEW for approximate allocation check-in 2014-02-13 21:12:28 -06:00
Jonathan Brassow
5bfe4aaf95 cache[pool]: Man page updates for lvs, lvcreate, lvconvert 2014-02-12 10:29:07 -06:00
Zdenek Kabelac
38e457c478 raid: drop invalid modication of active parameter
lv_active_change will enforce proper activation.
Modification of activation was wrong and lead to misuse of
autoactivation. Fix allows to use proper local exclusive activation,
while the removed code turned this into just exclusive
activation (losing required local property).
2014-02-11 18:48:38 +01:00
Zdenek Kabelac
8a21dcebac raid: use unsigned 64b constant for shift 2014-02-11 18:47:15 +01:00
Peter Rajnoha
1a5062c9a7 cleanup: clarify man pages about lvchange/vgchange -aay, use -aay in lvm2-cluster-activation script 2014-02-11 13:48:04 +01:00
Peter Rajnoha
a092cd33be systemd: rename lvm2-cluster-activation and lvm2-clvmd services to follow existing naming
lvm2_cluster_activation_red_hat.service.in -> lvm2_cluster_activation_systemd_red_hat.service.in
lvm2_clvmd_red_hat.service.in -> lvm2_clvmd_red_hat.service.in

Edit lvm2-cluster-activation reference on cmirror - take new
lvm2-cmirrord.service, it was just cmirrord(.service) before
as the old initscript was used in compatibility mode.

Also, use WantedBy=multi-user.target instead of sysinit.target
in lvm2-cluster-activation.service.
2014-02-11 10:12:39 +01:00
Ondrej Kozina
fd41dd8f9c Add systemd native service for clvmd and cluster activation
The commit splits original clvmd service in two new native services
for systemd enabled systems while original init scripts remain unaltered.

New systemd native services:

  1) clvmd daemon itself (lvm2_clvmd_red_hat.service.in)
  2) (de)activation of clustered VGs (lvm2_cluster_activation_red_hat.service.in)

There're several reasons to split it. First, there's no support for conditional
stop in systemd and AFAIK they don't plan to support it. In other words:
if the deactivation fails for some reason, systemd doesn't care and will simply
kill all remaining processes in original cgroup (by default). Killing the
remaining procs can be suppressed however it doesn't solve the following problem:

You can't repeat the stop command of a failed service. The repeated stop command
is simply not propagated to the service in a failed state. You would have to start
and then try to stop the service again. Unfortunately, this can't be done while
the daemon is still running (and we need the daemon to stay active until all
clustered VGs are deactivated properly).

In a separated setup we need only to restart the failed activation service and
that's fine.
2014-02-10 17:13:49 +01:00
Peter Rajnoha
ffa623c53d systemd: cleanup for lvmetad systemd unit
No need to fork lvmetad when running under systemd.
Also, the "lvmetad -R" support has been removed in lvm2 v2.02.98
so remove the ExecReload line that called it on "systemctl reload".
2014-02-10 16:20:45 +01:00
Peter Rajnoha
ed166a3b1d wiping: wipe DM_snapshot_cow signature without prompt in newly created LVs
The libblkid can detect DM_snapshot_cow signature and when creating
new LVs with blkid wiping used (allocation/use_blkid_wiping=1 lvm.conf
setting and --wipe y used at the same time - which it is by default).

Do not issue any prompts about this signature when new LV is created
and just wipe it right away without asking questions. Still keep the
log in verbose mode though.
2014-02-10 13:28:13 +01:00
Zdenek Kabelac
ef6c5795a0 raid: add temporary activation for raid metadata clear
Use LV_TEMPORARY when activating devices for clearing
raid metadata.
2014-02-04 14:51:05 +01:00
Alasdair G Kergon
83358d4c03 tools: Add internal tags command. 2014-01-30 13:09:15 +00:00
Zdenek Kabelac
155405b0e1 thin: validate external origin size
Avoid use of external origin with size unaligned/incompatible with
thin pool chunk size, since the last chunk is not correctly provisioned
when it is overwritten.
2014-01-29 14:58:13 +01:00
Zdenek Kabelac
e9d9852c55 thin: more validation of thin name
Avoid starting conversion of the LV to the thin pool and thin volume
at the same time.  Since this is mostly a user mistake, do not try
to just convert to one of those type, since we cannot assume if the
user wanted LV to become thin volume or thin pool.

Before the fix tool reported pretty strange internal error:
Internal error: Referenced LV lvol1_tdata not listed in VG mvg.

Fixed output:
lvconvert --thinpool lvol0 -T mvg/lvol0
Can't use same LV mvg/lvol0 for thin pool and thin volume.
2014-01-28 13:21:39 +01:00
Zdenek Kabelac
7786443530 devices: support zvol
Support partitions on ZFS zvol.
Requested via https://bugzilla.redhat.com/show_bug.cgi?id=913597

Author: hakimian@aha.com
2014-01-28 10:33:29 +01:00
Zdenek Kabelac
6b73d21ba9 locking: avoid dropping locks
When lvm2 command forks, it calls reset_locking(),
which as an unwanted side effect unlinked lock file from filesystem.

Patch changes the behavior to just close locked file descriptor
in children - so the lock is being still properly hold in the parent.
2014-01-27 12:13:29 +01:00
Zdenek Kabelac
f18ee04fab lvmetad: respect LVM_LVMETAD_PIDFILE settings in lvm
Test LVM_LVMETAD_PIDFILE for pid for lvm command.
Fix WHATS_NEW envvar name usage
Fix init order in prepare_lvmetad to respect set vars
and avoid clash with system settings.
Update test to really test the 'is running' message.
2014-01-24 15:59:38 +01:00
Zdenek Kabelac
731c298e12 thin: use LV_TEMPORARY for metadata initialization
This flag need to be specified when we create thin pool - to avoid
scanning device with watch rules.
2014-01-24 12:30:28 +01:00
Zdenek Kabelac
f8b20fb8e8 thin: fix feature compare function
Comparing for available feature missed the code path, when
maj is already bigger.

The bug would be only hit in the case, thin pool target would have
increased major version.
2014-01-23 14:22:31 +01:00
Zdenek Kabelac
902b343e0e thin: validate resize of thin LV with ext. origin
When thin volume is using external origin, current thin target
is not able to supply 'extended' size with empty pages.

lvm2 detects version and disables extension of LV past the external
origin size in this case.

Thin LV could be however still reduced and extended freely bellow
this size.
2014-01-23 14:20:34 +01:00
Zdenek Kabelac
2dae78b722 thin: rename function
Rename pool_can_resize_metadata() to more reusable
thin_pool_feature_supported() which could be queried
for mutiple different features.
2014-01-23 14:19:17 +01:00
Peter Rajnoha
2b9d25133e wiping: issue error if libblkid detects signature and fails to return offset/length
We need both offset and length when trying to wipe detected signatures.
The libblkid can fail so it's good to have an error message issued for
this state instead of being silent (libblkid does not issue any error
messages here). We just issued "stack" here before but that was not
quite useful if some error occurs...
2014-01-22 16:29:52 +01:00
Alasdair G Kergon
d2956f0a59 autoconf: Update config.guess/sub to 2014-01-01. 2014-01-21 22:00:15 +00:00
Zdenek Kabelac
22d9daff12 thin: online metadata resize requires 1.10
Version 1.10 starts to look promissing, let's enable
online resize when this thin-pool kernel target is present.
2014-01-21 13:50:17 +01:00
Alasdair G Kergon
899a079c8f post-release 2014-01-20 19:41:30 +00:00
Alasdair G Kergon
aa21e79991 pre-release 2014-01-20 19:22:56 +00:00
Peter Rajnoha
91b26b63b4 thin: fix thin LV flagging for udev to skip scanning
Only flag thin LV for no scanning in udev if this LV is about
to be wiped. This happens only in case the thin LV's pool was not
created with zeroing of the new blocks enabled.
2014-01-20 12:38:21 +01:00
Zdenek Kabelac
79768b2e9c tests: update testing for xfs 2014-01-20 12:02:33 +01:00
Alasdair G Kergon
3813cd7a3c format1: Mark obsolete and do not use with lvmetad.
DO NOT USE LVMETAD IF YOU HAVE ANY LVM1-FORMATTED PVS.

You may continue to use it without lvmetad, but do please schedule
an upgrade to the lvm2 format (with 'vgconvert').

Sending the original LVM1 formatted metadata to lvmetad is breaking
assumptions made by the code, so I am marking the format as obsolete for
now and no longer sending it to lvmetad.

This means that if you are using lvmetad, lvm1 volumes will usually
appear invisible - though not always: it depends on exactly what
sequence of commands you run!

The current situation is not satisfactory.

We'll either fix lvmetad and reenable this or we'll fix the code to
issue appropriate warning messages when lvm1 PVs are encountered
to avoid accidents.

(The latest unfixed problem is that lvmetad assumes metadata sequence
numbers exist and always increase - but the lvm1 format does not define
or store any sequence number, confusing both the daemon and client
when default values get passed to-and-fro.)
2014-01-14 03:27:45 +00:00
Alasdair G Kergon
4c2b4c37e7 lvmcache: Invalidate cached VG if PV is orphaned.
If a PV in an existing VG becomes orphaned (with 'pvcreate -ff', for
example) the VG struct cached against its vginfo must be invalidated.
This is because the struct device it references no longer contains
the PV label so becomes incorrect.

This triggers the error:
  Internal error: PV $dev unexpectedly not in cache.
when the PV from the cached VG metadata is subsequently looked up
in the cache.

Bug introduced in 2.02.87 by commit 7ad0d47c3c
("Cache and share generated VG structs").

Before:

lvm> pvs
  PV         VG   Fmt  Attr PSize  PFree
  /dev/loop3 vg12 lvm2 a--  28.00m 28.00m
  /dev/loop4 vg12 lvm2 a--  28.00m 28.00m
lvm> pvcreate -ff /dev/loop3
Really INITIALIZE physical volume "/dev/loop3" of volume group "vg12" [y/n]? y
  WARNING: Forcing physical volume creation on /dev/loop3 of volume group "vg12"
  Physical volume "/dev/loop3" successfully created
lvm> pvs
  Internal error: PV /dev/loop3 unexpectedly not in cache.
  PV         VG   Fmt  Attr PSize  PFree
  /dev/loop3 vg12 lvm2 a--  28.00m 28.00m
  /dev/loop3      lvm2 a--  32.00m 32.00m
  /dev/loop4 vg12 lvm2 a--  28.00m 28.00m

After:
lvm> pvs
  PV         VG   Fmt  Attr PSize  PFree
  /dev/loop3 vg12 lvm2 a--  28.00m 28.00m
  /dev/loop4 vg12 lvm2 a--  28.00m 28.00m
lvm> pvcreate -ff /dev/loop3
Really INITIALIZE physical volume "/dev/loop3" of volume group "vg12" [y/n]? y
  WARNING: Forcing physical volume creation on /dev/loop3 of volume group "vg12"
  Physical volume "/dev/loop3" successfully created
lvm> pvs
  PV             VG   Fmt  Attr PSize  PFree
  /dev/loop3          lvm2 a--  32.00m 32.00m
  /dev/loop4     vg12 lvm2 a--  28.00m 28.00m
  unknown device vg12 lvm2 a-m  28.00m 28.00m
2014-01-14 02:57:03 +00:00
Peter Rajnoha
359291b41c systemd: use only major:minor for pvscan in lvm2-pvscan@.service
When using filters for the pvscan --cache (the global_filter),
there's a difference between:

  pvscan --cache -aay /dev/block/<major>:<minor>

and

  pvscan --cache -aay <major>:<minor> (or --major <major> --minor <minor>)

In the first case, we need to be sure to have an exact matching line
in the filter for the device to be used, no aliases are considered
So for example even if we have accept rule for "/dev/sda" present,
this won't apply for "/dev/block/8:0" even though it's the same device!
This is because we're comparing the path used on command line directly
with the path written in the rule.

For the second one, any alias mentioned in the filter will apply
as we're comparing the major and minor pair, not looking at actual
device names - so any alias mentioned in the rules will suffice for
the filtering rule to apply.

For the global_filter to be properly used, we need to call the
second one in the lvm2-pvscan@.service - nobody is able to tell
what value of major:minor the kernel assignes next time, hence
this bug makes the use of global_filter quite unusable!
2013-12-18 12:23:59 +01:00
Ville Skyttä
3776832499 man: syntax and spelling fixes. 2013-12-18 09:03:42 +01:00
Zdenek Kabelac
c3d82d717c Revert "tree_action: destroy devices from failing activation"
This reverts commit 24639be558.

Ok - seems we could be here a bit too active - and we
may remove devices which are unsuable for reasons we are not
aware of - thus taking down whole device could be way to big hammer.

So we still need some solution to recover from failing preload
and activation - but it needs more tunning.
2013-12-17 15:21:28 +01:00
Zdenek Kabelac
24639be558 tree_action: destroy devices from failing activation
When activation fails - we may leak large tree of partially loaded
devices in the dm table (i.e. failure in snapshot activation)

The best we can do here is try to deactivate whole device and
remove as much inactive table entries as we can.
2013-12-17 14:08:54 +01:00
Zdenek Kabelac
94137b72ed lv_dependency: scan also snapshots and extorigins
When LV is scanned for its dependencies - scan also origin's snapshots,
and thin external origins.

So if any PV from snapshot or external origin device is missing - lvm2 will
avoid trying to activate such device.
2013-12-17 14:08:54 +01:00
Peter Rajnoha
32080c4ff7 device: add physical block size info and make sure VG extent size >= PV's phys. block size 2013-12-12 15:02:36 +01:00
Zdenek Kabelac
6c0e44d5a2 dev-cache: skip double stat() call on each _insert
When the device is inserted in dev_name_confirmed() stat() is
called twice as _insert() has it's own stat() call.

Extend _insert() parameter with struct stat* - which could be used
if it has been just obtained.  When NULL is passed code is
doing its own stat() call as before.
2013-12-12 13:42:28 +01:00
Zdenek Kabelac
10a13dc03c thin: enable build of thin provisioning by default
Use internal type by default for thin provisioning.
If user is not interested in thin provisiong and doesn't
have thin provisining supporting tools installed,
configure will just print warning at the end of configure
process about limited support.
2013-12-12 13:31:23 +01:00
Zdenek Kabelac
fe21b02fab cleanup: improve tag processing
Boolean algebra changes for process_each_lv_in_vg().

1st.
Drop process_lv variable since it's not needed.

2nd.
process_lv was always initilized to 0 - so the condition was always true.
It the condition (!tags_supplied && !lvargs_supplied) evaluates as "true",
process_all is already set to 1, so skip vg tags evaluation.

3rd.
Move check for matching lv name in the front of lv tags check
since this check can't be skipped for lvargs_matched counter.
If this filter evaluates to true, skip lv tags evaluation.
2013-12-12 13:29:20 +01:00
Zdenek Kabelac
b53e9ba66a cleanup: simplify logging code
Condense code for logging.
2013-12-12 13:27:59 +01:00
Peter Rajnoha
877418b1a2 WHATS_NEW: commit 4c267c7286
See also https://bugzilla.redhat.com/show_bug.cgi?id=1026860
for more info, comment 76 for summary.
2013-12-11 16:50:52 +01:00
Zdenek Kabelac
e3b7fa4b8d thin: thin metadata resize unsupported with 1.9
Thin kernel target 1.9 still does not support online resize of
thin pool metadata properly - so disable it with expectation
for much higher version - and reenable after fixing kernel.
2013-12-10 11:17:37 +01:00
Zdenek Kabelac
cf857ecabd cleanup: shorter raid initialization
Avoid multiple queries for raid library for each initialized raid
seg type and use shorter code.
2013-12-10 11:16:51 +01:00
Zdenek Kabelac
9f0e27a18c cleanup: tiny speedup of lib_dir checking
Instead of repeated lookup of global/library_dir - remember first
search in command context - saves couple lines in debug output...
2013-12-10 11:15:48 +01:00
Alasdair G Kergon
16eab3ec08 config: shorten new sig wiping option string
Rename wipe_signatures_on_new_logical_volumes_when_zeroing  to
wipe_signatures_when_zeroing_new_lvs.
2013-12-09 09:35:47 +00:00
Peter Rajnoha
481edce41f compile/link: use RELRO/PIE compiler/linker options for executables 2013-12-05 14:03:10 +01:00
Zdenek Kabelac
7baf411c97 dev-cache: return success when ignoring dirs
When there is no failure from the insert operation itself,
just dirs and links are skipped - return success.
2013-12-05 11:17:37 +01:00
Zdenek Kabelac
598c82fc07 vgchange: move detection of remote exlusivness
Since activation takes only read-lock, there could be
multiple activation running in parallel.

So instead of checking before taking any real lock,
let the locking resolve the problem and just
detect if the reason for failure has been remote
exlusive activation.

It should be also faster, since each activation does
not need to do explicit lock query.
2013-12-04 17:09:51 +01:00
Zdenek Kabelac
91496e0eda thin: enable thin snapshot merge 2013-12-04 14:30:26 +01:00
Zdenek Kabelac
572983d793 thin: read table line with thin device id
Add functions to parse thin table line to
obtain thin device id.
2013-12-04 14:30:25 +01:00
Zdenek Kabelac
cf7f451238 merge: test only for meging origin
It's enough to check only for merging origin.
This will be also used for thin volume merges which
do not need to be origins.
2013-12-04 14:30:25 +01:00
Zdenek Kabelac
5a6794a2ce lv_remove_single: add silent arg
Support silence for removal message.
2013-12-04 14:30:25 +01:00
Zdenek Kabelac
84b3852ee5 snapshots: use lv_check_not_in_use
Switch from a simple 'open_count' test on opened snaphost
to a more 'skilled'  lv_check_not_in_use().
2013-12-04 14:30:24 +01:00
Zdenek Kabelac
778de22d51 refresh: print error message with failing lv name
If there is  suspend/resume error, print error message
with lv name.
Drop goto.
2013-12-04 14:30:24 +01:00
Peter Rajnoha
a65ab773b4 daemons: use PIE and RELRO compiler/linker options
The PIE and RELRO compiler/linker options can be used to produce a code
some techniques applied that makes the code more immune to some attacks:

  - PIE (Position Independent Executable). It can make use of the ASLR
    (Address Space Layout Randomization) provided by kernel to avoid
    static locations for .text regions of executables (this is the 'pie'
    compiler and linker option)

  - RELRO (Relocation Read-Only). This prevents overwrite attacks of
    the GOT (Global Offset Table) and PLT (Procedure Lookup Table)
    used for relocations by making it read-only after all relocations
    are resolved (these are the 'relro' and 'now' linker options) -
    hence all symbols are resolved at the very start so there's no
    need for those tables to be writeable later.

These compiler/linker options are now used by default for daemons
if the compiler/linker supports it.
2013-12-04 13:30:08 +01:00
Zdenek Kabelac
fc37d4fb0d make: support per-object defines
In the case we have a dir with multiple objects and for
an individual object file we need special define -
allow to define it without adding extra rules.

To ensure dmeventd.o compilation will use EXTRA_FLAGS:
CFLAGS_dmeventd.o += $(EXTRA_FLAGS)

Then it's better to use:
dmeventd.o: CFLAGS += $(EXTRA_FLAGS)
2013-12-04 13:30:08 +01:00
Peter Rajnoha
6c6bcc00e4 configure: check compiler/linker support for RELRO and PIE options
Also, add AC_TRY_LDFLAGS m4 macro to help with checking ld flags.
2013-12-04 13:30:08 +01:00
Alasdair G Kergon
7b65363bf7 lvconvert: Implement --splitsnapshot. 2013-12-04 02:09:37 +00:00
Alasdair G Kergon
ff769ecfe7 lvconvert: Fix reload after snapshot conversion.
At the end of lvconvert --snapshot with an active origin, the origin
gets reloaded.

Commit 57c0f72b1d ("lvconvert: use
_reload_lv on more places") accidentally replaced this with a snapshot
LV reload (which does nothing because only the origin is active).
2013-12-04 02:04:29 +00:00
Peter Rajnoha
6232cac86c vgdisplay: select only active volumes groups if -A option is used
Where "active" means "at least one LV is active in the volume group".
2013-12-03 14:43:00 +01:00
Alasdair G Kergon
84394c0219 lvmetad: extend socket/pid file handling
Make it easier to run a live lvmetad in debugging mode and
to avoid conflicts if multiple test instances need to be run
alongside a live one.

No longer require -s when -f is used: use built-in default.
Add -p to lvmetad to specify the pid file.
No longer disable pidfile if -f used to run in foreground.
If specified socket file appears to be genuine but stale, remove it
before use.
On error, only remove lvmetad socket file if created by the same
process.  (Previous code removes socket even while a running instance
is using it!)
2013-11-29 20:56:29 +00:00
Peter Rajnoha
75628f341a configure: enable blkid_wiping by default if the blkid library is present 2013-11-29 15:27:56 +01:00
Alasdair G Kergon
b3074560eb lvmetad: Add newline to missing socket error mesg 2013-11-28 19:16:25 +00:00
Zdenek Kabelac
01c438a96c format-text: ensure aligment is not 0
Make sure this path of code is not used for alignment == 0,
to prevent division by 0.
2013-11-28 12:42:39 +01:00
Peter Rajnoha
ab2f858af7 conf: add allocation/use_blkid_wiping
Add allocation/use_blkid_wiping setting to lvm.conf to select between
LVM2 native code to detect signatures to wipe or blkid library code.
2013-11-27 15:49:14 +01:00
Peter Rajnoha
9bfc0be493 configure: add --enable-blkid_wiping 2013-11-27 15:48:16 +01:00
Peter Rajnoha
169b4c1586 lvcreate: recognize --wipesignatures arg
Recognize the new --wipesignatures arg in lvcreate that is supposed
to wipe known signatures if found on newly created LV.
2013-11-27 15:48:15 +01:00
Peter Rajnoha
5b7e543cae conf: add allocation/wipe_signatures_on_new_logical_volumes_when_zeroing
This setting controls whether signature wiping on newly created logical
volumes will follow the state of zeroing (-Z/--zero option).
2013-11-27 15:48:06 +01:00
Peter Rajnoha
d6e67b8503 WHATS_NEW: commit 729b104 2013-11-27 08:33:02 +01:00
Peter Rajnoha
8d5cff5b9b lv/vgchange: do not try to connect to lvmetad if socket absent and --sysinit -aay used
If using lv/vgchange --sysinit -aay and lvmetad is enabled, we'd like to
avoid the direct activation and rely on autoactivation instead so
it fits system initialization scripts.

But if we're calling lv/vgchange --sysinit -aay too early when even
lvmetad service is not started yet, we just need to do the direct
activation instead without printing any error messages (while
trying to connect to lvmetad and not finding its socket).

This patch adds two helper functions - "lvmetad_socket_present" and
"lvmetad_used" which can be used to check for this condition properly
and avoid these lvmetad connections when the socket is not present
(and hence lvmetad is not yet running).
2013-11-26 14:51:23 +01:00
Zdenek Kabelac
4a061a35c7 snapshot: use lv_check_not_in_use
Instead of plain open_count check, try to use 'smarter'
lv_check_not_in_use() function.
2013-11-22 20:58:11 +01:00
Zdenek Kabelac
6d196410fc snapshot: revert and move check to lvconvert
Revert 4777eb6872 which put
target_present check into init_snapshot_merge(). However
this function is also used when parsing metadata. So we would
get this present test performed even when target is not really
needed. So move this target_present test directly into lvconvert.
2013-11-22 20:57:30 +01:00
Zdenek Kabelac
3d3b8bfd1c pv_write: check for lvmcache_add_mda failure
Add missing test of failing lvmcache_add_mda() call.
2013-11-22 20:55:09 +01:00
Zdenek Kabelac
a50a297f6e report: detect dev_get_size failure
Since dev_get_size() may fail, detect this failure.
2013-11-22 20:54:16 +01:00
Zdenek Kabelac
08d6d81cc2 filters: drop extra slash from sysfs path
Sysfs filter was using '/sys//class/block' with double '//' inside.
Remove this extra '/'.
Also simplify code around and use loop to try those paths.
2013-11-22 20:53:31 +01:00
Alasdair G Kergon
4c1f281b43 toollib: Avoid undefined ignore_vg parameter.
Fix process_each_segment_in_pv to always set ret before calling ignore_vg().
2013-11-22 18:11:04 +00:00
Tony Asleson
7c8abb29df liblvm/python API: Additions & fixes
Signed-off-by: Tony Asleson <tasleson@redhat.com>
2013-11-19 14:54:18 -06:00
Zdenek Kabelac
c697f501c7 Makefile: fix build in non src dir
Fix 'make install' from builddir != srcdir.
2013-11-19 10:55:11 +01:00
Zdenek Kabelac
37b7c67079 thin: report thin device id
Support   lvs -o+thin_id  to report thin device id.
This device_id is connection between kernel and lvm2 user space thin
metadata.
2013-11-15 12:38:37 +01:00
Zdenek Kabelac
36d7c639d1 report: fix dereference of string as uint64_t
Fix buggy usage of "" (empty string) as a numerical string
value used for sorting.

On intel 64b platform this was typically resolve
as 0xffffff0000000000 - which is already 'close' to
UINT64_MAX which is used for _minusone64.

On other platforms it might have been giving
different numbers depends on aligment of strings.

Use proper &_minusone64 for sorting value when the reported
value is NUM.

Note: each numerical value needs to be thought about if it needs
default value &_zero64 or &_minusone64 since for cases, were
value of zero is valid, sorting should not be mixing entries
together.
2013-11-15 11:05:02 +01:00
Zdenek Kabelac
49ccf44f45 report: add wrappers to set values and percents
Add wrapper function for dm_report_field_set_value() which returns void
and return 1, so the code could be shorter.

Add wrapper function for percent display _field_set_percent().
2013-11-15 11:05:00 +01:00
Alasdair G Kergon
322e7b3060 post-release 2013-11-13 14:11:11 +00:00
Alasdair G Kergon
77a1efeb8e release 2.02.104
87 files changed, 1207 insertions(+), 294 deletions(-)
2013-11-13 14:02:34 +00:00
Alasdair G Kergon
527db4645f gcc: replace #ifdef linux with __linux__ 2013-11-13 13:56:29 +00:00
Peter Rajnoha
d8085edf65 pvscan: retry VG refresh before autoactivation if it fails
There's a tiny race when suspending the device which is part
of the refresh because when suspend ioctl is performed, the
dm kernel driver executes (do_suspend and dm_suspend kernel fn):

  step 1: a check whether the dev is already suspended and
          if yes it returns success immediately as there's
          nothing to do
  step 2: it grabs the suspend lock
  step 3: another check whether the dev is already suspended
          and if found suspended, it exits with -EINVAL now

The race can occur in between step 1 and step 2. To prevent
premature autoactivation failure, we're using a simple retry
logic here before we fail completely. For a complete solution,
we need to fix the locking so there's no possibility for suspend
calls to interleave each other to cause this kind of race.

This is just a workaround. Remove it and replace it with proper
locking once we have that in!
2013-11-12 11:09:45 +01:00
Jonathan Brassow
7de533ad12 mirror: Handle failures in tmp mirror used when up-converting.
Failures in the temporary mirror used when up-converting cause dmeventd
to issue 'lvconvert --repair' on the sub-LV, <lv_name>_mimagetmp_?.  The
'lvconvert' command refuses to deal with this sub-LV outright - it
expects to be given the name of the top-level LV.  So, just like we do
with mirrored logs, we strip-off the portion of the name that is not
the top-level LV and issue the command on the top-level LV instead.
2013-11-08 09:52:00 -06:00
Zdenek Kabelac
9f6209b878 activation: improve activation
This patch fixes mostly cluster behavior but also updates
non-cluster reaction where calls like   'lvchange -aln'
lead to incorrect errors for some segment types.

Fix the implicit activation rules where some segment types could
be activated only in exclusive mode in cluster.
lvm2 command was not preserver 'local' property and incorrectly
converted local activations in to plain exclusive, so the local
activation could have activate volumes exclusively, but remotely.
2013-11-01 13:03:50 +01:00
Zdenek Kabelac
c3e674ad30 activation: _lv_activate is ok when filtered.
If the volume_list filters out volume from activation,
it is still success result for this function.
Change the error message back to verbose level.

Detect if the volume is active localy before zeroing,
so we report error a bit later for cases, where volume
could not be activated because it doesn't pass through volume
list  (but user still could create volume when he disables
zeroing)
2013-11-01 13:02:36 +01:00
Zdenek Kabelac
1bde9f68ce locking: activate_lv_excl return correct error code
Correct return code of activate_lv_excl().

Function is not supposed to return activation state of
activated volume, but return code of the operation.
Since i.e. when activation filter is allowing to activate
volume on current system, it is still success even though
no volume is activated.
2013-11-01 13:02:13 +01:00
Peter Rajnoha
f070e3543a udev: properly trigger LVM scan for MD partitions
MD can directly create partition devices without a need to run
an extra kpartx or partprobe call. We need to react to this event in
a different way as for bare MD devices - we need to handle the ADD event
for KERNEL=="md[0-9]*p[0-9]*" kernel name and trigger the LVM scanning
to update lvmetad to trigger autoactivation and so on...

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1023250
2013-10-30 14:09:11 +01:00
Peter Rajnoha
db05a0cf6f WHATS_NEW: commit 9d06212
Other changes in previous commits 9d06212 and f1a42aa are changes
in the code that was not yet released as part of upcoming v104.
2013-10-29 13:49:55 +01:00
Peter Rajnoha
f3a6f7073b WHATS_NEW: commit 4c0db84 2013-10-29 11:04:32 +01:00
Zdenek Kabelac
8e1f2e733e gcc: fix comparing floating point warning
Since we enabled some more gcc warnings - let's adapt for
it and check for double equals with DBL_EPSILON.

Current close_enough() is far from perfect
for more details see i.e. here:
http://randomascii.wordpress.com/2012/01/11/tricks-with-the-floating-point-format/
but fairly enough for lvm2 use-case.
2013-10-25 10:43:32 +02:00
Alasdair G Kergon
c9c23d4148 build: Use additional gcc warning flags. 2013-10-24 17:10:24 +01:00
Jonathan Brassow
d5896f0afd Mirror: Fix hangs and lock-ups caused by attempting label reads of mirrors
There is a problem with the way mirrors have been designed to handle
failures that is resulting in stuck LVM processes and hung I/O.  When
mirrors encounter a write failure, they block I/O and notify userspace
to reconfigure the mirror to remove failed devices.  This process is
open to a couple races:
1) Any LVM process other than the one that is meant to deal with the
mirror failure can attempt to read the mirror, fail, and block other
LVM commands (including the repair command) from proceeding due to
holding a lock on the volume group.
2) If there are multiple mirrors that suffer a failure in the same
volume group, a repair can block while attempting to read the LVM
label from one mirror while trying to repair the other.

Mitigation of these races has been attempted by disallowing label reading
of mirrors that are either suspended or are indicated as blocking by
the kernel.  While this has closed the window of opportunity for hitting
the above problems considerably, it hasn't closed it completely.  This is
because it is still possible to start an LVM command, read the status of
the mirror as healthy, and then perform the read for the label at the
moment after a the failure is discovered by the kernel.

I can see two solutions to this problem:
1) Allow users to configure whether mirrors can be candidates for LVM
labels (i.e. whether PVs can be created on mirror LVs).  If the user
chooses to allow label scanning of mirror LVs, it will be at the expense
of a possible hang in I/O or LVM processes.
2) Instrument a way to allow asynchronous label reading - allowing
blocked label reads to be ignored while continuing to process the LVM
command.  This would action would allow LVM commands to continue even
though they would have otherwise blocked trying to read a mirror.  They
can then release their lock and allow a repair command to commence.  In
the event of #2 above, the repair command already in progress can continue
and repair the failed mirror.

This patch brings solution #1.  If solution #2 is developed later on, the
configuration option created in #1 can be negated - allowing mirrors to
be scanned for labels by default once again.
2013-10-22 19:14:33 -05:00
Peter Rajnoha
039bdad732 activation: flag temporary LVs internally
Add LV_TEMPORARY flag for LVs with limited existence during command
execution. Such LVs are temporary in way that they need to be activated,
some action done and then removed immediately. Such LVs are just like
any normal LV - the only difference is that they are removed during
LVM command execution. This is also the case for LVs representing
future pool metadata spare LVs which we need to initialize by using
the usual LV before they are declared as pool metadata spare.

We can optimize some other parts like udev to do a better job if
it knows that the LV is temporary and any processing on it is just
useless.

This flag is orthogonal to LV_NOSCAN flag introduced recently
as LV_NOSCAN flag is primarily used to mark an LV for the scanning
to be avoided before the zeroing of the device happens. The LV_TEMPORARY
flag makes a difference between a full-fledged LV visible in the system
and the LV just used as a temporary overlay for some action that needs to
be done on underlying PVs.

For example: lvcreate --thinpool POOL --zero n -L 1G vg

- first, the usual LV is created to do a clean up for pool metadata
  spare. The LV is activated, zeroed, deactivated.

- between "activated" and "zeroed" stage, the LV_NOSCAN flag is used
  to avoid any scanning in udev

- betwen "zeroed" and "deactivated" stage, we need to avoid the WATCH
  udev rule, but since the LV is just a usual LV, we can't make a
  difference. The LV_TEMPORARY internal LV flag helps here. If we
  create the LV with this flag, the DM_UDEV_DISABLE_DISK_RULES
  and DM_UDEV_DISABLE_OTHER_RULES flag are set (just like as it is
  with "invisible" and non-top-level LVs) - udev is directed to
  skip WATCH rule use.

- if the LV_TEMPORARY flag was not used, there would normally be
  a WATCH event generated once the LV is closed after "zeroed"
  stage. This will make problems with immediated deactivation that
  follows.
2013-10-23 14:09:37 +02:00
Peter Rajnoha
9883bffb04 WHATS_NEW: typo 2013-10-22 16:37:02 +02:00
Peter Rajnoha
b109bfc1ef blkdeactivate: fix endless loop if device(s) given and unable to umount/deactivate
The blkdeactivate script iterates over the list of devices if they're
given as an argument and it tries to umount/deactivate them one by one.

This iteration failed to proceed if any of the umount/deactivation
was unsuccessful - there was a missing "shift" call to move to the
next argument (device) for processing. As a result of this, the same
device was tried again and again, causing an endless loop, never
proceeding to the next device given.
2013-10-22 16:24:39 +02:00
Peter Rajnoha
3fee661028 udev+systemd: refine lvm2-pvscan@.service to better track device existence
When using ENV{SYSTEMD_WANTS}=lvm2-pvscan@... to instantiate a service
for lvmetad scan when the new PV appears in the system, the service
is started and executed. However, to track device removal, we need
to bind it (the "BindsTo" systemd directive) to a certain .device
systemd unit.

In default systemd setup, the device is tracked by it's name and
sysfs path (there's normally a sysfs path .device systemd unit for
a device and then the device name .device unit as an alias for it).
Neither of these two is useful for lvmetad update as we need to bind
it to device's <major>:<minor> pair.

The /dev/block/<major>:<minor> is the essential symlink under /dev
that exists for each block device (created by default udev rules
provided by udev directly). So let's use this as an alias for
the device's .device unit as well by means of "ENV{SYSTEMD_ALIAS}"
declaration within udev rules which systemd understands (this will
create a new alias "dev-block-<major>:<minor>.device".

Then we can easily bind the "dev-block-<major>:<minor>" device
systemd unit with instantiated lvm2-pvscan@<major>:<minor>.service.
So once the device is removed from the systemd, the
lvm-pvscan@<major>:<minor>.service executes it's ExecStop action
(which in turn notifies lvmetad about the device being gone).

This completes the udev-systemd-lvmetad interaction then.
2013-10-22 14:22:40 +02:00
Peter Rajnoha
0a48137d39 pvscan: use major:minor as short form of --major and --minor arg for pvscan --cache
Before, pvscan recognized either:
  pvscan --cache --major <major> --minor <minor>
or
  pvscan --cache <DevicePath>

When the device is gone and we need to notify lvmetad about device
removal, only --major/--minor works as we can't translate DevicePath
into major/minor pair anymore. The device does not exist in the system
and we don't keep DevicePath index in lvmetad cache to make the
translation internally into original major/minor pair. It would be
useless to keep this index just for this one exact case.

There's nothing bad about using "--major <major> --minor <minor>",
but it makes our life a bit harder when trying to make an
interconnection with systemd units, mainly with instantiated services
where only one and only one arg can be passed (which is encoded in the
service name).

This patch tries to make this easier by adding support for recognizing
the "<major>:<minor>" as a shortcut for the longer form
"--major <major> --minor <minor>". The rule here is simple: if the argument
starts with "/", it's a DevicePath, otherwise it's a <major>:<minor> pair.
2013-10-22 13:52:18 +02:00
Mike Snitzer
65456a4a29 vgimportclone: remove 2>/dev/null from three lvm commands
There is no point eating stderr for these commands.  In fact the
redirect causes confusion and hurts dubugging.

Also reword an error message if the pvs command fails so as not be
certain that a device is not a PV.  Coupled with removing the stderr
redirect this will improve the user experience in the face of errors.
2013-10-21 18:04:14 -04:00
Peter Rajnoha
546db1c4be udev+systemd: make pvscan --cache -aay run as systemd background job from udev
The new lvm2-pvscan@.service is responsible for on-demand execution
of "pvscan --cache --activate ay" which causes lvmetad to be
updated and LVM activation done if the VG is complete.

Also, use udev-systemd mechanism to instantiate the job as the
lvm2-pvscan@$devnode.service on each newly appeared PV in the system.
This prevents the background job to be killed (that would happen
if it was directly forked from udev rule - this behaviour is seen
in recent versions of udev with the help of systemd that can track
detached processes - the detached process would still be in the same
cgroup).

To enable this official udev-systemd protocol for instantiating
background jobs, use new --enable-udev-systemd-background-jobs
configure switch (it's disabled by default). This option is highly
recommended wherever systemd is used!
2013-10-18 11:38:49 +02:00
Zdenek Kabelac
1b7631101b thin: fix lvconvert for active pool.
Prohibit conversion of pool device with active thin volumes.
Properly restore active states only for active thin pool volume.
Use new LV_NOSCAN when converting volume into thin pool's metadata.
2013-10-16 10:53:01 +02:00
Peter Rajnoha
48df36b8c5 activation: check for open count with a timeout before removal/deactivation of an LV
This patch reinstates the lv_info call to check for open count of
the LV we're removing/deactivating - this was changed with commit 125712b
some time ago and we relied on the ioctl retry logic deeper in the libdm
while calling the exact 'remove' ioctl.

However, there are still some situations in which it's still required to
check for open count before we do any 'remove' actions - this mainly
applies to LVs which consist of several sub LVs, like it is for
virtual snapshot devices.

The commit 1146691 fixed the issue with ordering of actions during
virtual snapshot removal while the snapshot is still open. But
the check for the open status of the snapshot is still prone to
marking the snapshot as in use with an immediate exit even though
this could be a temporary asynchronous open only, most notably
because of udev and its WATCH udev rule with accompanying scans
for the event which is asynchronous. The situation where this crops
up most often is when we're closing the LV that was open for read-write
and then calling lvremove immediately.

This patch reinstates the original lv_info call for the open status
of the LV in the lv_check_not_in_use fn that gets called before
we do any LV removal/deactivation. In addition to original logic,
this patch adds its own retry loop with a delay (25x0.2 seconds)
besides the existing ioctl retry loop.
2013-10-15 12:44:42 +02:00
Jonathan Brassow
f58b26b633 RAID: Report RAID images split with tracking as out-of-sync ("I").
Split image should have an out-of-sync attr ('I') - always.  Even if
the RAID LV has not been written to since the LV was split off, it is
still not part of the group that makes up the RAID and is therefore
"out-of-sync".
2013-10-14 10:48:44 -05:00
Zdenek Kabelac
851bba258c snapshot: rework parsing of snapshot metadata
Add better parsing code for snapshot metadata, which describe
properly errors found for snapshot segment.
2013-10-14 00:26:58 +02:00
Zdenek Kabelac
1146691afc snapshot: deactivate virtual snapshot first
Since the virtual snapshot has no reason to stay alive once we
detach related snapshot - deactivate whole thing in front of
snapshot removal - otherwice the code would get tricky for
support in cluster.

The correct full solution would require to have transactions
for libdm operations.

Also enable to the check for snapshot being opened prior
the origin deactivation, otherwise we could easily end
with the origin being deactivate, but snapshot still kept
active, desynchronizing locking state in cluster.
2013-10-14 00:25:15 +02:00
Zdenek Kabelac
ac961087b0 snapshot: disable merging for virtual snaps
Merging into virtual origin is not supposed to work.
2013-10-12 00:15:55 +02:00
Zdenek Kabelac
81504ba70c snapshot: move virtsnap code from tool to lib
Move code for removal dependency from tool's remove.c
into lib's manipulation code.

Same code then works with lvm2app.
2013-10-12 00:14:52 +02:00
Peter Rajnoha
304159c99a cleanup: WHATS_NEW + compiler warning about discarding const 2013-10-10 09:09:16 +02:00
Alasdair G Kergon
7bed6d1263 filters: Add NVM Express (nvme). 2013-10-09 20:08:07 +01:00
Peter Rajnoha
1b91847beb WHATS_NEW: commit 0decd75 2013-10-09 15:59:19 +02:00
Peter Rajnoha
863be9d9c6 WHATS_NEW: commit d888a05 and 808a5d9 2013-10-09 12:11:12 +02:00
Peter Rajnoha
2f5ddfbade udev: add support for "NOSCAN" flag
Recognize DM_SUBSYSTEM_UDEV_FLAG0 which for LVM is the "LVM_NOSCAN"
flag that causes the scanning to be skipped (mainly blkid) and
also directs all the foreign rules to be skipped as well.

Important thing here is that the "watch" udev rules is still set
as well as the /dev/disk/by-id content created (which does not
require any scanning to be done). Also, the flag is dropped on
any subsequent event and scanning done...
2013-10-08 13:43:14 +02:00
Peter Rajnoha
ce7489ed22 activation: add support for flagging an LV to skip udev scanning during activation
A common scenario is during new LV creation when we need to wipe the
newly created LV and avoid any udev scanning before this stage otherwise
it could cause the device (the LV) to be claimed by some other subsystem
for which there were stale metadata within LV data.

This patch adds possibility to mark the LV we're just about to wipe with
a flag that gets passed to udev via DM_COOKIE as a subsystem specific
flag - DM_SUBSYSTEM_UDEV_FLAG0 (in this case the subsystem is "LVM")
so LVM udev rules will take care of handling that.
2013-10-08 13:43:14 +02:00
Zdenek Kabelac
92bafade60 thin: fix lvconvert in external origin conversion
Patch 562ad293fd introduced code regression
when LV was converted to a thin LV with external origin and at the same time,
conversion of LV to a thin pool has been requested.
(RHBZ: #997704)

data_lv needs to be assigned after test for external conversion find pool.
2013-10-08 13:41:06 +02:00
Zdenek Kabelac
30746f31dd vgrename: run fullscan
For vgrename run full scan so the command is able to properly
detect name collision.
2013-10-08 13:39:11 +02:00
Alasdair G Kergon
4806f38d70 lvchange: improve discards when pool active error
Existing message deemed misleading:
  Cannot change discards state for active pool volume

https://bugzilla.redhat.com/show_bug.cgi?id=994315
2013-10-07 23:50:09 +01:00
Alasdair G Kergon
761b524519 post-release 2013-10-04 14:41:32 +01:00
Alasdair G Kergon
04d9a52684 release 2.02.103
52 files changed, 598 insertions(+), 264 deletions(-)
2013-10-04 14:32:23 +01:00
Peter Rajnoha
a7ff7aee4f WHATS_NEW: renamed thin_pool_chunk_size_calculation -> policy 2013-10-04 12:36:32 +02:00
Alasdair G Kergon
baf95bbff7 cmdline: Add --ignoreskippedcluster.
Accept --ignoreskippedcluster with pvs, vgs, lvs, pvdisplay, vgdisplay,
lvdisplay, vgchange and lvchange to avoid the 'Skipping clustered
VG' errors when requesting information about a clustered VG
without using clustered locking and still exit with success.

The messages can still be seen with -v.
2013-10-01 21:20:10 +01:00
Peter Rajnoha
e4c7236c07 udev: fix 3min udev timeout so that it is applied for all LVM volumes
The timeout should be set before any volume skipping.
2013-09-27 15:37:16 +02:00
Jonathan Brassow
acdc731e83 RAID: Fix _sufficient_pes_free calculation for RAID
lib/metadata/lv_manip.c:_sufficient_pes_free() was calculating the
required space for RAID allocations incorrectly due to double
accounting.  This resulted in failure to allocate when available
space was tight.

When RAID data and metadata areas are allocated together, the total
amount is stored in ah->new_extents and ah->alloc_and_split_meta is
set.  '_sufficient_pes_free' was adding the necessary metadata extents
to ah->new_extents without ever checking ah->alloc_and_split_meta.
This often led to double accounting of the metadata extents.  This
patch checks 'ah->alloc_and_split_meta' to perform proper calculations
for RAID.

This error is only present in the function that checks for the needed
space, not in the functions that do the actual allocation.
2013-09-26 11:30:07 -05:00
Jonathan Brassow
d6516d2f79 WHATS_NEW: description for previous commit
commit 098896fb29 failed to include
description of what was fixed.

"Conversion from linear to mirror or RAID1 now honors
 mirror_segtype_default."
2013-09-25 22:35:52 -05:00
Peter Rajnoha
dd796d6a94 profile: add thin-performance.profile
Define a "performance" profile for thin pools which is exactly:
  - allocation/thin_pool_zero = 0
  - thin_pool_chunk_size_calculation = "performance"
2013-09-25 16:07:35 +02:00
Peter Rajnoha
8bf425005c conf: add allocation/thin_pool_chunk_size_calculation
Add allocation/thin_pool_chunk_size_calculation lvm.conf
option to select a method for calculating thin pool chunk
sizes and define two possible values - "default" and "performance".
2013-09-25 16:06:38 +02:00
Jonathan Brassow
5ded7314ae RAID: Fix broken allocation policies for parity RAID types
A previous commit (b6bfddcd0a) which
was designed to prevent segfaults during lvextend when trying to
extend striped logical volumes forgot to include calculations for
RAID4/5/6 parity devices.  This was causing the 'contiguous' and
'cling_by_tags' allocation policies to fail for RAID 4/5/6.

The solution is to remember that while we can compare
	ah->area_count == prev_lvseg->area_count
for non-RAID, we should compare
	(ah->area_count + ah->parity_count) == prev_lvseg->area_count
for a general solution.
2013-09-24 21:32:10 -05:00
Peter Rajnoha
6553f86818 lvmconf: use_lvmetad=0 on --enable-cluster, reset to default on --disable-cluster
lvmetad is not yet supported in clustered environment so
disable it automatically if using lvmconf --enable-cluster
and reset it to default value if using lvmconf --disable-cluster.

Also, add a few comments in lvm.conf about locking_type vs. use_lvmetad
if setting it for clustered environment.
2013-09-24 14:03:42 +02:00
Peter Rajnoha
f050278a35 tools: don't install separate command symlink for lvm devtypes 2013-09-24 09:35:20 +02:00
Alasdair G Kergon
11dc6a03c4 lvs: Add seg_size_pe field.
Requested
https://www.redhat.com/archives/linux-lvm/2013-July/msg00112.html
2013-09-23 21:50:14 +01:00
Alasdair G Kergon
7233e584ad pvmove: Accept PE ranges as start+length. 2013-09-23 19:50:34 +01:00
Alasdair G Kergon
bbcc120e5a pvmove: clean exit on failed pvmove restart
At present, before the pvmove command can be used to restart pvmove
polling, the LVs concerned need to be activated e.g. with lvchange -ay.
2013-09-23 19:46:28 +01:00
Alasdair G Kergon
229e0752f1 post-release 2013-09-23 15:55:11 +01:00
Alasdair G Kergon
c8057aec36 release 2.02.102
18 files changed, 137 insertions(+), 203 deletions(-)
2013-09-23 15:43:37 +01:00
Christine Caulfield
431eda63cc clvmd: Fix node up/down handing in corosync module
The corosync cluster interface for clvmd did not correctly
deal with node up/down events so that when a node was removed
from the cluster clvmd would prevent remote operations
from happening, as it thought the node was up but not
running clvmd.

This patch fixes that code by simplifying the case to node
being  up or down - which was the original intention
and is supported by pacemaker and CPG in the higher layers.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
2013-09-23 13:23:00 +01:00
Zdenek Kabelac
ebf66ac316 Makefile: add missing deps
Add missing deps for device-mapper build of scripts dir.
Cleanup multiple SUBDIR lines together.
2013-09-23 12:13:51 +02:00
Zdenek Kabelac
3b604e5c8e lvinfo: allow to use lv_info with NULL info
When NULL info struct is passed in - function is usable
as a quick query for  lv_is_active_locally() - with a bonus
we may query for layered device.

So it could be seen as a more efficient lv_is_active_locally().
2013-09-23 12:13:06 +02:00
Alasdair G Kergon
bd75844024 release 2.02.101
112 files changed, 4131 insertions(+), 1312 deletions(-)
2013-09-20 13:56:29 +01:00
Alasdair G Kergon
6e912d949b tools: Avoid overflow in _get_int_arg.
Use strtoull instead of strtol so that argument size is not cut
to 31 bytes on machines with 32-bit long.

(Mikulas)
2013-09-18 01:16:48 +01:00
Alasdair G Kergon
a3a5f58c21 reporting: Add devtypes command.
Add internal devtypes reporting command to display built-in recognised
block device types.  (The output does not include any additional
types added by a configuration file.)

> lvm devtypes -o help
  Device Types Fields
  -------------------
    devtype_all            - All fields in this section.
    devtype_name           - Name of Device Type exactly as it appears in /proc/devices.
    devtype_max_partitions - Maximum number of partitions. (How many device minor numbers get reserved for each device.)
    devtype_description    - Description of Device Type.

> lvm devtypes
  DevType       MaxParts Description
  aoe                 16 ATA over Ethernet
  ataraid             16 ATA Raid
  bcache               1 bcache block device cache
  blkext               1 Extended device partitions
...
2013-09-18 01:09:15 +01:00
Jonathan Brassow
d1bcb21e02 WHATS_NEW: Better description for commit 82228ac
More correct description of changes made to disallow thin+mirror.
2013-09-16 15:37:48 -05:00
Alasdair G Kergon
36c5bb40a2 Makefiles: Fix CC variable override.
The CC override in commit f42b2d4bbf
caused the built-in value to be used instead of the configured value
when it wasn't being overridden.

The behaviour is explained here:
	http://stackoverflow.com/questions/18007326/how-to-change-default-values-of-variables-like-cc-in-makefile
2013-09-16 19:57:14 +01:00
Alasdair G Kergon
97ba18f4cb filters: Add bcache.
N.B. Using bcache devices as PVs is still experimental.
Problems should be reported to the appropriate mailing lists.
2013-09-16 16:56:55 +01:00
Peter Rajnoha
9742c5192e systemd: run lvm2-activation-net.service after lvm2-activation.service
The lvm2-activation-net.service was ordered only with respect to iscsi
and fcoe service before. In addition to that, we also need ordering
with respect to lvm2-activation.service to prevent parallel vgchange -aay
runs which may cause some problems during activation.
See also https://bugs.gentoo.org/show_bug.cgi?id=480066.

With this patch, the ordering is firmly set to:
lvm2-activation-early.service -> lvm2-activation.service -> lvm2-activation-net.service

Thanks to Alexander Tsoy for the original patch (modified a bit here):
https://www.redhat.com/archives/lvm-devel/2013-September/msg00049.html
2013-09-16 11:47:09 +02:00
Peter Rajnoha
e2268aeb92 WHATS_NEW: some missing lines 2013-09-12 14:16:54 +02:00
Zdenek Kabelac
2a6abcb80a tests: singlenode updates
Add more 'realistic' simulation of dlm locking.
Previous version was not capable to maintain multiple locks.
Current version doesn't handle multiqueues for locks,
so the ordering is different.
2013-09-12 10:40:39 +02:00
Jonathan Brassow
82228acfc9 Mirror/Thin: Disallow thinpools on mirror logical volumes
The same corner cases that exist for snapshots on mirrors exist for
any logical volume layered on top of mirror.  (One example is when
a mirror image fails and a non-repair LVM command is the first to
detect it via label reading.  In this case, the LVM command will hang
and prevent the necessary LVM repair command from running.)  When
a better alternative exists, it makes no sense to allow a new target
to stack on mirrors as a new feature.  Since, RAID is now capable of
running EX in a cluster and thin is not active-active aware, it makes
sense to pair these two rather than mirror+thinpool.

As further background, here are some additional comments that I made
when addressing a bug related to mirror+thinpool:
(https://bugzilla.redhat.com/show_bug.cgi?id=919604#c9)
I am going to disallow thin* on top of mirror logical volumes.
Users will have to use the "raid1" segment type if they want this.

This bug has come down to a choice between:
1) Disallowing thin-LVs from being used as PVs.
2) Disallowing thinpools on top of mirrors.

The problem is that the code in dev_manager.c:device_is_usable() is unable
to tell whether there is a mirror device lower in the stack from the device
being checked.  Pretty much anything layered on top of a mirror will suffer
from this problem.  (Snapshots are a good example of this; and option #1
above has been chosen to deal with them.  This can also be seen in
dev_manager.c:device_is_usable().)  When a mirror failure occurs, the
kernel blocks all I/O to it.  If there is an LVM command that comes along
to do the repair (or a different operation that requires label reading), it
would normally avoid the mirror when it sees that it is blocked.  However,
if there is a snapshot or a thin-LV that is on a mirror, the above code
will not detect the mirror underneath and will issue label reading I/O.
This causes the command to hang.

Choosing #1 would mean that thin-LVs could never be used as PVs - even if
they are stacked on something other than mirrors.

Choosing #2 means that thinpools can never be placed on mirrors.  This is
probably better than we think, since it is preferred that people use the
"raid1" segment type in the first place.  However, RAID* cannot currently
be used in a cluster volume group - even in EX-only mode.  Thus, a complete
solution for option #2 must include the ability to activate RAID logical
volumes (and perform RAID operations) in a cluster volume group.  I've
already begun working on this.
2013-09-11 15:58:44 -05:00
Peter Rajnoha
1f4bc637b4 WHATS_NEW: one more for commit 8d1d835 2013-09-11 13:16:36 +02:00
Peter Rajnoha
fe227d14a5 WHATS_NEW: for commit 8d1d8350 and 72a9d4f 2013-09-11 13:01:01 +02:00
Jonathan Brassow
2691f1d764 RAID: Make RAID single-machine-exclusive capable in a cluster
Creation, deletion, [de]activation, repair, conversion, scrubbing
and changing operations are all now available for RAID LVs in a
cluster - provided that they are activated exclusively.

The code has been changed to ensure that no LV or sub-LV activation
is attempted cluster-wide.  This includes the often overlooked
operations of activating metadata areas for the brief time it takes
to clear them.  Additionally, some 'resume_lv' operations were
replaced with 'activate_lv_excl_local' when sub-LVs were promoted
to top-level LVs for removal, clearing or extraction.  This was
necessary because it forces the appropriate renaming actions the
occur via resume in the single-machine case, but won't happen in
a cluster due to the necessity of acquiring a lock first.

The *raid* tests have been updated to allow testing in a cluster.
For the most part, this meant creating devices with '-aey' if they
were to be converted to RAID.  (RAID requires the converting LV to
be EX because it is a condition of activation for the RAID LV in
a cluster.)
2013-09-10 16:33:22 -05:00
Zdenek Kabelac
f5832d8c49 deactivate: drop readahead calc in deactivation
Skip readahead when device will be deactivated.
2013-09-07 09:13:20 +02:00
Zdenek Kabelac
0670bfeb59 thin: validation catch multiseg thin pool/volumes
Multisegment thin pools and volumes are not supported.
Catch such error code path early.
2013-09-07 03:32:07 +02:00
Zdenek Kabelac
655296609e thin: fix monitoring of thin pool volume
Properly skip unmonitoring of thin pool volume in deactivation code
path. Code makes sure if there is just any thin pool user
it stays monitored with all its resources.
2013-09-07 03:31:04 +02:00
Zdenek Kabelac
4c001a7854 thin: fix resize of stacked thin pool volume
When the pool is created from non-linear target the more complex rules
have to be used and stacking needs to properly decode args for _tdata
LV. Also proper allocation policies are being used according to those
set in lvm2 metadata for data and metadata LVs.

Also properly check for active pool and extra code to active it
temporarily.

With this fix it's now possible to use:

lvcreate -L20 -m2 -n pool vg  --alloc anywhere
lvcreate -L10 -m2 -n poolm vg --alloc anywhere
lvconvert --thinpool vg/pool --poolmetadata vg/poolm

lvresize -L+10 vg/pool
2013-09-07 03:24:48 +02:00
Alasdair G Kergon
96880102a3 logging: Write Completed message before resetting. 2013-09-06 01:47:41 +01:00
Jonathan Brassow
cc66dedc0e pvmove: Skip pvmove of RAID, thin, snapshot, origin, and mirror LVs in cluster
pvmove of the above types should only have been enabled in single machine
mode.
2013-09-03 13:17:01 -05:00
Peter Rajnoha
3b51f298bb reinstate: commit 82d83a01ce
It now works as supposed. The source of the problem is fixed
by previous commit d2d6a9da52e04f28e1916bcea3f9fda356b6df29.
2013-09-03 16:49:21 +02:00
Peter Rajnoha
008c33a21b tools: add -b/--background for pvscan --cache -aay
Udev daemon has recently introduced a limit on the number of udev
processes (there was no limit before). This causes a problem
when calling pvscan --cache -aay in lvmetad udev rules which
is supposed to activate the volumes. This activation is itself
synced with udev and so it waits for the activation to complete
before the pvscan finishes. The event processing can't continue
until this pvscan call is finished.

But if we're at the limit with the udev process count, we can't
instatiate any more udev processes, all such events are queued
and so we can't process the lvm activation event for which the
pvscan is waiting.

Then we're in a deadlock since the udev process with the
pvscan --cache -aay call waits for the lvm activation udev
processing to complete, but that will never happen as there's
this limit hit with the number of udev processes.

The process with pvscan --cache -aay actually times out eventually
(3min or 30sec, depends on the version of udev).

This patch makes it possible to run the pvscan --cache -aay
in the background so the udev processing can continue and hence
we can avoid the deadlock mentioned above.
2013-09-03 16:49:21 +02:00
Peter Rajnoha
44c1a02c18 revert: commit 82d83a01ce
The commit 82d83a01ce
"autoactivation: refresh existing VG before autoactivation"
causes problems (dangling udev_sync cookies, slow processing
of the pvscan --cache --major --minor call from udev rules)
when the autoactivation handler is run in parallel on
several PVs that belong to the same VG. Revert this patch
until the exact source of the problem is found and then
properly fixed and handled.
2013-09-02 13:53:27 +02:00
Alasdair G Kergon
78647da1c6 toolcontext: Only reopen stdin if readable.
Don't fail when running lvm commands under versions of nohup that set
up stdin as O_WRONLY!
2013-08-28 23:55:14 +01:00
Alasdair G Kergon
c0f987949b activation: Fix segfault with inactive pvmove LV.
Set flag to avoid recursion back through an inactive pvmove LV when
populating deptree.
2013-08-28 22:56:23 +01:00
Peter Rajnoha
0acd7173d1 systemd: lvm2-activation-generator: remove default dir if args not specified and require all args to be given
Remove default "/tmp" as destination directory if no args
specified for lvm2-activation-generator. Require all the
args to be specified directly for proper functionality.
2013-08-28 16:06:51 +02:00
Jonathan Brassow
2ef48b91ed pvmove: Allow moving snapshot/origin. Disallow converting and merging LVs
The patch allows the user to also pvmove snapshots and origin logical
volumes.  This means pvmove should be able to move all segment types.
I have, however, disallowed moving converting or merging logical volumes.
2013-08-26 16:36:30 -05:00
Jonathan Brassow
caa77b33f2 pvmove: Fix inability to specify LV name when moving RAID, mirror, or thin LV
Top-level LVs (like RAID, mirror or thin) are ignored when determining which
portions of an LV to pvmove.  If the user specified the name of an LV to
move and it was one of the above types, it would be skipped.  The code would
never move on to check whether its sub-LVs needed moving because their names
did not match what the user specified.

The solution is to check whether a sub-LVs is part of the LV whose name was
specified by the user - not just if there was a name match.
2013-08-26 14:12:31 -05:00
Peter Rajnoha
d34ab5e0d3 WHATS_NEW: for 4d3b5724e0 2013-08-26 15:52:15 +02:00
Zdenek Kabelac
6b416f837f thin: support lvchange for data and metadata
Support lvchange operation on stacked thin pool data and metadata
volumes.
2013-08-26 14:55:22 +02:00
Jonathan Brassow
c59167ec13 pvmove: Add support for RAID, mirror, and thin
This patch allows pvmove to operate on RAID, mirror and thin LVs.
The key component is the ability to avoid moving a RAID or mirror
sub-LV onto a PV that already has another RAID sub-LV on it.
(e.g. Avoid placing both images of a RAID1 LV on the same PV.)

Top-level LVs are processed to determine which PVs to avoid for
the sake of redundancy, while bottom-level LVs are processed
to determine which segments/extents to move.

This approach does have some drawbacks.  By eliminating whole PVs
from the allocation list, we might miss the opportunity to perform
pvmove in some senarios.  For example, if we have 3 devices and
a linear uses half of the first, a RAID1 uses half of the first and
half of the second, and a linear uses half of the third (FIGURE 1);
we should be able to pvmove the first device (FIGURE 2).
	FIGURE 1:
        [ linear ] [ -RAID- ] [ linear ]
        [ -RAID- ] [        ] [        ]

	FIGURE 2:
        [  moved ] [ -RAID- ] [ linear ]
        [  moved ] [ linear ] [ -RAID- ]
However, the approach we are using would eliminate the second
device from consideration and would leave us with too little space
for allocation.  In these situations, the user does have the ability
to specify LVs and move them one at a time.
2013-08-23 08:57:16 -05:00
Peter Rajnoha
99fe3b88d2 systemd: lvm2-activation-generator: report only error otherwise be silent
Do not print success status for lvm2-activation-generator:

  "LVM: Activation generator successfully completed."
  "LVM: Logical Volume autoactivation enabled." (if use_lvmetad=1)

Though this information is quite useful during boot, it may
be confusing for users if it happens anytime later and it
actually happens if systemd reloads. This is usually on package
update to update the systemd state and load any new units that are
newly installed in the system. The systemd reload is global and
so any existing generators are rerun at that moment too.
2013-08-22 08:27:51 +02:00
Peter Rajnoha
c8daa15270 filter-mpath: remove superfluous error message about mpath major not equal to dm major
This is a regression caused by commit 3bd9048854.
The error message added with that commit "mpath major %d is not dm major %d" is
superfluous.

When scanning for mpath components, we're looking for a parent device.
But this parent device is not necessarily an mpath device (so the dm device)
if it exists - it can be any other device layered on top (e.g. an MD RAID device).
2013-08-21 14:07:01 +02:00
Jonathan Brassow
f0be9ac904 cmirrord: Prevent secondary checkpoints from corrupting bitmaps
The bug addressed by this patch manifested itself during testing
by showing a mirror that never became 'in-sync' after creation.
The bug is isolated to distributions that do not have support
for openAIS checkpointing (i.e. > RHEL6, > F16).

When a node joins a group that is managing a mirror log, the other
machines in the group send it a checkpoint representing the current
state of the bitmap.  More than one machine can send a checkpoint,
but only the initial one should be imported.  Once the bitmap state
has been imported from the initial checkpoint, operations (such
as resync, mark, and clear operations) can begin.  When subsequent
checkpoints are allowed to be imported, it has the effect of erasing
all the log operations between the initial checkpoint and the ones
that follow.

When cmirrord was updated to handle the absence of openAIS
checkpointing (commit 62e38da133),
the new import_checkpoint() function failed to honor the 'no_read'
parameter.  This parameter was designed to avoid reading all but
the initial checkpoint.  Honoring this parameter has solved the
issue of corrupting bitmap data with secondary checkpoints.
2013-08-20 13:21:09 -05:00
Peter Rajnoha
cac49725c9 udev: fix lvmetad rules to not ignore loop device configuration
If loop device is first configured on systems where /dev/loop-control
is used to dynamically create the loop device itself, there's an
ADD+CHANGE even generated. But next time the existing /dev/loop[0-9]*
is reused, there's only a CHANGE event since the device representing
it is already present in kernel (so no ADD event in this case).

We can't ignore this CHANGE event for loop devices! This is a regression
caused by 756bcabbfe. We already had
a similar problem with MD devices which was fixed by
2ac217d408 (but that one was
only an intra-release fix).
2013-08-16 15:45:00 +02:00
Michael Stapelberg
8cbbe851a8 systemd: use LVM_PATH instead of hardcoded value in activation generator 2013-08-15 09:59:19 +02:00