1
0
mirror of git://sourceware.org/git/lvm2.git synced 2024-12-31 21:18:26 +03:00
Commit Graph

3328 Commits

Author SHA1 Message Date
Alasdair G Kergon
177ece01a9 reports: Use X for unknown LV attr when no dm.
It's safer not to tell people an LV is inactive when we aren't sure.
2014-04-18 02:23:39 +01:00
Alasdair G Kergon
e8a3ba1865 pvscan: Use lvmetad_used().
Config variables that are processed during setup prior to calling into
particular tools must not be accessed directly afterwards in case the
values already got overridden.

_process_config() already used the tests I'm removing here to call
lvmetad_set_active() and set up lvmetad_used().
2014-04-18 02:13:46 +01:00
Peter Rajnoha
702180b30c configure: use configure's --enable-udev-systemd-background-jobs by default
This should be the preferred way of configuring lvm2 for udev/systemd
since otherwise one can end up with the processes run from udev (the
pvscan we run for lvmetad update on events) to be killed prematurely
and this can end up with LVM volumes not activated in the end.
2014-04-16 11:33:44 +02:00
Peter Rajnoha
a6763c64a7 lvmdump: add -s to gather system info and context (currently systemd-related only)
This is sort of info we always ask people to retrieve when
inspecting problems in systemd environment so let's have this
as part of lvmdump directly.

The -s option does not need to be bound to systemd only. We could
add support for initscripts or any other system-wide/service tracking
info that can help us with debugging problems.
2014-04-15 15:27:30 +02:00
Alasdair G Kergon
1bf4c3a1fa alloc: Introduce A_POSITIONAL_FILL.
Set A_POSITIONAL_FILL if the array of areas is being filled
positionally (with a slot corresponding to each 'leg') rather
than sequentially (with all suitable areas found, to be sorted
and selected from).
2014-04-15 01:13:47 +01:00
Zdenek Kabelac
eccc50d861 clvmd: use thread-safe ctime_r when debugging
Use thread friendly version of ctime
TODO:should be probably replaced with strftime()
2014-04-14 13:02:25 +02:00
Zdenek Kabelac
639983b6b7 clvmd: skip adding reply when finished
Prior adding new reply to the list, check
if the reply thread is not already finished.
In that case discard adding message
(which would otherwise be leaked).
2014-04-14 13:01:42 +02:00
Zdenek Kabelac
7236b92857 clvmd: improve mutex usage in request_timed_out
Use mutex to access localsock values, so check
num_replies when the thread is not yet finished.

Check for threadid prior the mutex taking
(though this check is probably not really needed)
2014-04-14 13:00:51 +02:00
Zdenek Kabelac
7075656034 clvmd: drop reply_mutex
Added complexity with extra reply mutex is not worth the troubles.
The only place which may slightly benefit from this mutex is timeout
and since this is rather error case - let's convert it to
localsock.mutex and keep it simple.
2014-04-14 12:59:07 +02:00
Zdenek Kabelac
6115c0d112 clvmd: set finished flag with mutex
Setting this variable needs to be protected with mutex.
2014-04-14 12:58:28 +02:00
Zdenek Kabelac
cc0096ebdd clvmd: move mutex init and detroy
Move the pthread mutex and condition creation and destroy
to correct place right after client memory is allocatedd
or is going to be released.

In the original place it's been in race with lvm thread
which could have still unlock mutex while it's been already
destroyed.
2014-04-14 12:57:39 +02:00
Zdenek Kabelac
91f4e09b48 clvmd: fix test mode race
When TEST_MODE flag is passed around the cluster,
it's been use in thread unprotected way, so it may have
influenced behaviour of other running parallel lvm commands
(activation/deactivation/suspend/resume).

Fix it by set/query function only under lvm mutex.
For hold_un/lock function calls check lock_flags bits directly.
2014-04-14 12:55:46 +02:00
Zdenek Kabelac
bd3e44643d memlock: ignore more libraries
Extend the list of ignored libraries. Since we do not
use those libraries during suspend, skip their locking.
2014-04-14 12:53:07 +02:00
Zdenek Kabelac
84ff3ae703 pvmove: remove locked flag from error pvmove0
When pvmove0 is finished, it replaces temporarily pvmove0
with error segment, however in this case, pvmove0 remains
unremovable in case pvmove --abort is interrupted in this
moment - since it's not a pvmove anymore and normal
lvremove can't be used to remove LOCKED lv.
2014-04-14 12:52:32 +02:00
Zdenek Kabelac
45f45c9932 polldaemon: ret invalid cmd for negative interval
Negative intervals are not supported.
2014-04-14 12:47:14 +02:00
Alasdair G Kergon
6320c3b905 post-release 2014-04-10 17:13:27 +01:00
Alasdair G Kergon
2043f8c729 pre-release 2014-04-10 16:28:28 +01:00
Peter Rajnoha
6c79556f4f pvcreate: fix ignored --dataalignment/dataalignment offset for pvcreate --restorefile
There were two bugs before when using pvcreate --restorefile together
with data alignment and its offset specified:

  - the --dataalignment was always ignored due to missing braces in the
    code when validating the divisibility of supplied --dataalignment
    argument with pe_start which we're just restoring:

	if (pp->rp.pe_start % pp->data_alignment)
		log_warn("WARNING: Ignoring data alignment %" PRIu64
			 " incompatible with --restorefile value (%"
			 PRIu64").", pp->data_alignment, pp->rp.pe_start);
        pp->data_alignment = 0

    The pp->data_alignment should be zeroed only if the pe_start is not
    divisible with data_alignment.

  - the check for compatibility of restored pe_start was incorrect too
    since it did not properly count with the dataalignmentoffset that
    could be supplied together with dataalignment

The proper formula is:

  X * dataalignment + dataalignmentoffset == pe_start

So it should be:

  if ((pp->rp.pe_start % pp->data_alignment) != pp->data_alignment_offset) {
	...ignore supplied dataalignment and dataalignment offset...
  }
2014-04-10 13:43:46 +02:00
Peter Rajnoha
10e10cfa4f lvmetad: fix lost bootloader area information
The refactoring made by 732859d21f
caused this. The former "ea" was not renamed to "ba" and we used
incorrect tree node name to search for the value.
2014-04-09 14:53:45 +02:00
Peter Rajnoha
a7c930b18d tools: don't require --major to be specified when using -My option on kernels > 2.4
Since kernel > 2.4 have dynamically assigned major numbers.

[0] raw/~ $ lvcreate -l1 -My --minor 10 vg
  Logical volume "lvol0" created

[0] raw/~ $ lvcreate -l1 -My --major 254 --minor 11 vg
  Ignoring supplied major number - kernel assigns major numbers dynamically. Using major number 253 instead.
  Logical volume "lvol1" created

[0] raw/~ $ lvs --profile out -o+major,minor
  lvol0 vg     -wima-----    4.00 253  10
  lvol1 vg     -wima-----    4.00 253  11
2014-04-04 13:29:39 +02:00
Alasdair G Kergon
cc72f340d7 thin: Support thin_check --clear-needs-check-flag.
Update thin provisioning tools to version 0.3.2 or later!
2014-04-04 02:22:40 +01:00
Alasdair G Kergon
12ddaa5f10 lib: Share lvm_even_rand for random numbers. 2014-04-04 01:26:19 +01:00
Alasdair G Kergon
6d2a26f6b6 man: Add lvmthin(7). 2014-04-04 01:14:25 +01:00
Zdenek Kabelac
84beba5d7f vg_validate: check size of lv_name + vg_name
Since the whole dm device name may not exceed 127 characters,
validate no LV names is bigger then this limit.
2014-03-31 12:05:32 +02:00
Zdenek Kabelac
6570d36ad5 lv_rename: resume fail is certainly error
Failing resume path means command has failed,
even when commit was ok.
2014-03-31 12:03:25 +02:00
Zdenek Kabelac
d3e4934a15 lvrename: fix name length validation
This test for LV name restriction check name of device is below 128
chars (which is enforced by dm target).
Thus it should not count with device name.
(Though the test for PATH_MAX size should be probably also added,
but this is runtime test, since theoretically devpath might differ in cluster)
2014-03-31 12:02:18 +02:00
Zdenek Kabelac
caa429da2e lvdiplay: prohibit use of -c and -m
It's unclear why we should prohibit use of -v output.
So reenable (like with other 'display' tools)
But -c -m is really unsupported - return invalid cmd.
2014-03-31 12:00:45 +02:00
Zdenek Kabelac
31d68b7124 man: sort options
Sort order for -C|--columns as with other options,
and use short capital name as the first (as with other options).
Also drop multiple reference for pvs/lvs/vgs, since now
the text for -C is really close to referrence of lvm anyway.
2014-03-30 23:43:31 +02:00
Zdenek Kabelac
7a9c569eb4 pvscan: return error when free parameter is given
Just like vgscan, pvscan (without --cache option) is not
accepting and 'free' args  (i.e.  pvscan /dev/sdx)
2014-03-30 23:42:57 +02:00
Zdenek Kabelac
844afa32a0 pvdisplay: fix option error
Properly show error for '-m'.
Also report unsupported  '-c & -s' (just like with vg/lvdisplay)
2014-03-30 23:41:52 +02:00
Zdenek Kabelac
cf29de5de0 vgimport/vgexport: return invalid cmd
When option parsing fails, return invalid cmd instead of fail.
2014-03-30 23:40:27 +02:00
Zdenek Kabelac
88a9705222 pvchange: populate lvmcache for --all
When running pvchange --all  learn about available VGs from lvmetad.
2014-03-28 10:41:58 +01:00
Peter Rajnoha
801d43445d man: add man page for lvm dumpconfig 2014-03-27 14:03:28 +01:00
Zdenek Kabelac
356fdda46d lv_manip: drop cmd pointer from for_each_sub_lv
Drop unused passed cmd pointer from function.

TODO:

We have two similar functions (though not identical)

lv_manip.c: for_each_sub_lv()
metadata.c: _lv_each_dependency()

They seem to not always match - we should probably convert
to use only a single function.
2014-03-27 13:10:13 +01:00
Zdenek Kabelac
22579b4451 lv_rename: validate renamed sublv
Use proper vgmem memory pool for allocation of LV name in the vg
and check if new renamed LV is a valid name.

TODO: validation should really use also VG name, othewise we are not
able to tell "vgname-lvname" will be valid.
2014-03-27 13:06:23 +01:00
Zdenek Kabelac
28e4bf0b6d lvrename: fix exit code
Return invalid cmd line (3) when error is detect on cmd line input.
Cmd failed (5) is used when command processing fails.
2014-03-27 13:04:55 +01:00
Zdenek Kabelac
80fe100afa pvchange: fix exit code regression
Commit 1a832398a7 moved
some code from _pvchange_single() to main pvchange() and
introduced exit code regression as return codes have not
been properly changed, thus pvchange command exited
with '0' exit code, even though it has reported error.
Also there is a missing vg unlock in error path.

Fix it by counting the total number of expected calls before
checking for pvname and also unlock and relase vg when
pv is not found.
2014-03-27 13:01:51 +01:00
Peter Rajnoha
ce78cb58eb lvmdump: add lvm dumpconfig --type diff/missing
For a quick overview of config when debugging and to quickly check
which values are different from defaults and which are not defined
in the config and for which defaults are used.
2014-03-25 13:06:59 +01:00
Zdenek Kabelac
0738f0ad72 pvresize: fail exit code for negative size
Pvresize with negative value retuns invalid cmd line exit code.
2014-03-25 11:52:03 +01:00
Zdenek Kabelac
68d13b2517 dev-cache: fix mem corruption on refresh context
When lvm2 command works with clvmd and uses locking in wrong way,
it may 'leak' certain file descriptors in opened (incorrect) state.

dev_cache_exit then destroys memory pool of cached devices, while
_open_devices list in dev-io.c was still referencing them if they
were still opened.

Patch properly calls _close() function to 'self-heal' from this
invalid state, but it will report internal error (so execution
with abort_on_internal_error causes immediate death). On the
normal 'execution', error is only reported, but memory state is
corrected, and linked list is not referencing devices from
released mempool.

For crash see: https://bugzilla.redhat.com/show_bug.cgi?id=1073886
2014-03-25 11:22:57 +01:00
Peter Rajnoha
d29fe919e6 WHATS_NEW: commit f12ee43 2014-03-25 11:00:44 +01:00
Peter Rajnoha
5dcec1734e dumpconfig: add dumpconfig --type diff to show differences from defaults 2014-03-24 15:35:54 +01:00
Zdenek Kabelac
936bfeb8de dev-swap: detect swap signature on devices smaller then 2MB
Smallest supported size for swap device is 40KB, however current
test skipped devices smaller then 4096 sectors (2MB).

Since page is in bytes, convert it to sectors before comparing
with device size (in sectors).
2014-03-22 20:36:14 +01:00
Zdenek Kabelac
93d77455ea WHATS_NEW update 2014-03-21 22:29:28 +01:00
Peter Rajnoha
2f279797f5 WHATS_NEW: commit 5eef269 2014-03-19 09:49:09 +01:00
Zdenek Kabelac
3d7eaf9226 pvscan: fix report of long pv names
pvscan --uuid was broken since it was using only 128 char buffers
without checking any write size, so any longer device path leads to
crash.

Also ansure format is properly aligned into columns with this option.
2014-03-19 00:58:01 +01:00
Zdenek Kabelac
506091be70 pv_vg_name: do not expose internal orphans to lvm2 users
Check for orphan VG name and return just empty string,
Use internally pv->vg_name if the real orphan name is needed.
2014-03-19 00:57:59 +01:00
Thomas Fehr
5b69bfb6f8 pvdisplay: fix man to refer to sectors, not KB
The PV size is displayed in sectors, not kilobytes
for 'pvdisplay -c'

Signed-off-by: Thomas Fehr <fehr@suse.de>
Acked-by: Hannes Reinecke <hare@suse.de>
2014-03-19 00:49:32 +01:00
Zdenek Kabelac
6a8d3d7811 clvmd: avoid resending local sync commands
Instead of sending repeatedly  LOCAL_SYNC commands to clvmds
like 'lvs', rememeber the last sent commmand, and if there was no other
clvmd command, drop this redundant SYNC call message.

The problem has started with commit:
56cab8cc03

This introduced correct synchronisation of name, when user requests to know
open_count (needs to wait for udev), however it is also executed for
read-only cases like 'lvs' command.

For now implement very simple solution, which is only monitoring
outgoing clvmd command, and when sequence of LOCAL sync names are
recognized, they are skipped automatically.

TODO:
Future solution might move this variable info 'cmd_context' and
use  'needs_sync' flag also i.e. in file locking code.
2014-03-19 00:47:58 +01:00
Zdenek Kabelac
a6b159e99c file_locking: use PATH_MAX for dir name
Using just 128 chars for locking dir may wail if longer
dir entry is used, swich to default linux max path.
2014-03-19 00:46:28 +01:00
Zdenek Kabelac
08018a5345 archiver: drop unneeded backup check
When the backup is disabled, avoid testing backup presence.
This only leads to errors being logged in debug trace and the missing
backup can't be fixed, since it's disabled.
2014-03-19 00:45:41 +01:00
Peter Rajnoha
25f5e2da8d dumpconfig: fix memleak when using --mergedconfig
Check whether lvm dumpconfig --mergedconfig is used only
with --type current (where we're merging current config and
the config supplied on command line). With other types
the config was merged, but it was thrown away since we're
generating other type of config anyway. This lead to a memleak.

Error out if --mergedconfig is used with anything else than
--type current (or without specifying --type in which case
the --type current is used by default).
2014-03-18 11:11:31 +01:00
Peter Rajnoha
21b3c983fd config: make global/lvdisplay_shows_full_device_path profilable 2014-03-18 09:49:53 +01:00
Peter Rajnoha
3a6bc7fc65 WHATS_NEW: previous commit 2014-03-18 09:28:52 +01:00
Peter Rajnoha
c1ce2cc86c config: make global/units and global/si_unit_consistency profilable 2014-03-17 16:07:29 +01:00
Zdenek Kabelac
d425e788e9 lvconvert: validate min chunk size for snapshot
Do not allow conversion of too small LV into a COW snapshot device.

Without this patch snapshot target is generating these kernel
messages before creation fails:

attempt to access beyond end of device
dm-9: rw=16, want=8, limit=2
attempt to access beyond end of device
...
device-mapper: table: 253:11: snapshot: Failed to read snapshot metadata
device-mapper: ioctl: error adding target to table
device-mapper: reload ioctl on  failed: Input/output error
2014-03-17 14:31:43 +01:00
Zdenek Kabelac
f3b9ee37e9 lvconvert: disallow usage of origin for snapshot
Usage of origin as a snapshot 'COW' volume is unsupported.

Without this test lvm2 is able to generate this ugly internal error message.

To test this:

lvcreate -L1 -n lv1 vg
lvcreate -L1 -n lv2 -s vg/lv1
lvcreate -L1 -n lv3 vg
lvconvert -s vg/lv3 vg/lv1

Internal error: LVs (5) != visible LVs (1) + snapshots (1) + internal LVs (0) in VG vg
2014-03-17 14:31:42 +01:00
Peter Rajnoha
455f23586f config: make report settings profilable
Users can create several profiles for how the tools report
the output very easily and then just use

  <lvm reporting command> --profile <report_profile_name>
2014-03-17 14:27:49 +01:00
Peter Rajnoha
f9070c196b conf: add existing report settings to lvm.conf
These settings exist for ages but they were not exposed in lvm.conf
and so majority of users didn't event know about them.
2014-03-17 14:24:21 +01:00
Peter Rajnoha
ada47c164a autoactivation: use VG read lock
The activation (including the refresh) should take the VG read lock
like the usual activation/refresh.
2014-03-14 12:10:52 +01:00
Peter Rajnoha
ca880a4f13 autoactivation: issue a VG refresh before autoactivation only if 'change' lvmetad flag is set
This prevents numerous VG refreshes on each "pvscan --cache -aay" call
if the VG is found complete. We need to issue the refresh only if the PV:
  - is new
  - was gone before and now it reappears (device "unplug/plug back" scenario)
  - the metadata has changed
2014-03-14 10:48:56 +01:00
Peter Rajnoha
031666ef66 man: add man page for lvm2-activation-generator 2014-03-13 13:01:06 +01:00
Peter Rajnoha
bf42119b22 config: accept empty values for global/thin_disabled_features
Let's do this the other way round - this makes more logic than commit b995f06.
So let's allow empty values for global/thin_disabled_features where
such an empty value now means "none of this features are disabled".
2014-03-12 15:53:20 +01:00
Zdenek Kabelac
6a0d97a65c lvm: change build_dm_uuid API
Pass directly 'lv' into this build routine,
so we can eventually add more private UUID suffixes.
2014-03-12 00:16:20 +01:00
Zdenek Kabelac
4d64e91efd thin: do not check of empty pool with messages
The empty pool is also the pool which has yet queued list of messages
and transaction_id == 1.

Problem is exposed when pool is created inactive.

lvcreate -L10 -T vg/pool -an
lvcreate -V10 -T vg/pool
2014-03-12 00:15:22 +01:00
Zdenek Kabelac
1850a6e454 thin: fix pool_has_message return for NULL params
When pool_has_message() is queried with NULL lv and 0 device_id
it should just return 'true' when there is any message queued.
So it needs to return negative value dm_list_empty().

Since there is no user for this code path in code currently,
this bug has not been triggered.
2014-03-12 00:13:21 +01:00
Zdenek Kabelac
460c19df62 clvmd: fix memleak on exit
This patch will releases allocated private resources from
startup. Needs previous dm_zalloc patch to ensure unset
private pointer is NULL.

TODO: check on real cluster.
2014-03-10 12:21:32 +01:00
Zdenek Kabelac
38ce06e448 clvmd: use dm_zalloc for socket allocation
Instead of doing individual settings for struct members,
ensure whole struct is in defined state.
2014-03-10 12:20:49 +01:00
Zdenek Kabelac
7a595d7388 makefiles: use BLKID/UDEV_CFLAGS properly
blkid.h needs BLKID_CFLAGS
Do not add UDEV_CFLAGS everywhere and use it only when needed.
2014-03-06 17:30:06 +01:00
Zdenek Kabelac
216c57eed7 readline: switch to new-style readline typedef
Based on patch:
https://www.redhat.com/archives/lvm-devel/2014-March/msg00015.html

The CPPFunction typedef (among others) have been deprecated in favour of
specific prototyped typedefs since readline 4.2 (circa 2001).
It's been working since because compatibility typedefs have been in
place until they where removed in the recent readline 6.3 release.
Switch to the new style to avoid build breakage.

But also add full backward compatibility with define.

Signed-off-by: Gustavo Zacarias <gustavo zacarias com ar>
2014-03-06 17:28:40 +01:00
Peter Rajnoha
42eb9b4526 WHATS_NEW: config handling changes 2014-03-06 13:03:51 +01:00
Peter Rajnoha
2c42f60890 udev: run pvscan --cache via systemd-run in udev if the PV label is detected lost
If the PV label is lost (e.g. by doing a dd on the device), call
"systemd-run pvscan --cache <major>:<minor>" in 69-dm-lvm-metad.rules
to inform lvmetad about this state.

The reason for this is that ENV{SYSTEMD_WANTS}="lvm2-pvscan@<major>:<minor>"
logic will not cause the pvscan to be fired in this case since this works
only on proper device addition/removal cycle - the lvm2-pvscan service's
ExecStop is called only on proper REMOVE event - the service is bound to
device existence. Hence we need pvscan call via systemd-run (that
instantiates a quick transient service just to call the command).

See also https://bugzilla.redhat.com/show_bug.cgi?id=1063813.
2014-03-05 14:30:58 +01:00
Zdenek Kabelac
d8513da9be lvmetad: fix memleak when pv changes it device
Test vgimportclone invokes mem leak of pvid which
would be otherwise lost when device_old_pvid
is removed from hash table.
2014-03-01 14:00:15 +01:00
Zdenek Kabelac
40e6176d25 snapshots: fix incorrect calculation of cow size
Code uses target driver version for better estimation of
max size of COW device for snapshot.

The bug can be tested with this script:
VG=vg1
lvremove -f $VG/origin
set -e
lvcreate -L 2143289344b -n origin $VG
lvcreate -n snap -c 8k -L 2304M -s $VG/origin
dd if=/dev/zero of=/dev/$VG/snap bs=1M count=2044 oflag=direct

The bug happens when these two conditions are met
* origin size is divisible by (chunk_size/16) - so that the last
  metadata area is filled completely
* the miscalculated snapshot metadata size is divisible by extent size -
  so that there is no padding to extent boundary which would otherwise
  save us

Signed-off-by:Mikulas Patocka <mpatocka@redhat.com>
2014-02-26 14:25:09 +01:00
Zhiqing Zhang
014ba37cb1 lvresize: fix stripe size validation
While stripe size is twice the physical extent size,
the original code will not reduce stripe size to maximum
(physical extent size).

Signed-off-by: Zhiqing Zhang <zhangzq.fnst@cn.fujitsu.com>
2014-02-26 13:25:50 +01:00
Zdenek Kabelac
2b044452e3 snapshot: zero cow header for read-only snapshot
When read-only snapshot was created, tool was skipping header
initialization of cow device.  If it happened device has been
already containing header from some previous snapshot, it's
been 'reused' for a newly created snapshot instead of being cleared.
2014-02-26 00:22:46 +01:00
Peter Rajnoha
558c932444 dumpconfig: comment out config lines without default values defined
To make "lvm dumpconfig --type default" output to be usable like any
other config, we need to comment out lines that have no default value
defined. Otherwise, we'd have the output with config options
with blank or zero values which is not the same as when the value
is not defined! And such configuration can't be feed into lvm again
without further edits. So let's fix this.

Currently this covers these configuration options exactly:

  devices/loopfiles
  devices/preferred_names
  devices/filter
  devices/global_filter
  devices/types
  allocation/cling_tag_list
  global/format_libraries
  global/segment_libraries
  activation/volume_list
  activation/auto_activation_volume_list
  activation/read_only_volume_list
  activation/mlock_filter
  metadata/dirs
  metadata/disk_areas
  metadata/disk_areas/<disk_area>
  metadata/disk_areas/<disk_area>/start_sector
  metadata/disk_areas/<disk_area>/size
  metadata/disk_areas/<disk_area>/id
  tags/<tag>
  tags/<tag>/host_list
2014-02-25 11:32:54 +01:00
Zdenek Kabelac
5097463fb3 mirror: detect attrs just once
Reorder detection of cmirrord. Now if cmirrord is not
running, target will not try to load kernel log module,
for communication with cmirrord.

Whole check for attrs now also happens just once.
2014-02-24 21:13:11 +01:00
Zdenek Kabelac
95fe823eba raid: use feature attributes for raid10
Test raid10 availability as a target feature (instead of doing
it in all the places where raid10 should be checked).

TODO: activation needs runtime validation - so metadata with raid10
are skipped from activation in user-friendly way in lvm2.
2014-02-24 21:10:13 +01:00
Zdenek Kabelac
8c878438f5 metadata: move vg parsing to vg_write
Parsing vg structure during  supend/commit/resume may require a lot of
memory - so move this into vg_write.

FIXME: there are now multiple cache layers which our doing some thing
multiple times at different levels. Moreover there is now different
caching path with and without lvmetad - this should be unified
and both path should use same mechanism.
2014-02-24 21:08:53 +01:00
Peter Rajnoha
84d02542bd WHATS_NEW: for commit b391ae88e5 2014-02-21 11:36:55 +01:00
Zdenek Kabelac
c71a3bcbc0 activation: lv_activation_skip remove always same arg.
Remove 'skip' argument passed into the function.
We always used '0' - as this is the only supported
option (-K) and there is no complementary option.

Also add some testing for behaviour of skipping.
2014-02-19 11:33:39 +01:00
Peter Rajnoha
417e52c13a udev: create /dev/disk/by-id/lvm-pv-uuid-<PV_UUID> symlink for a PV
We already have /dev/disk/by-id/dm-uuid-... (which encompasses the
VG UUID and LV UUID in case of LVs since the mapping's UUID is
VG+LV UUID together) and /dev/disk/by-id/dm-name-... (which encompasses
the VG and LV name in case of LVs).

This patch addds /dev/disk/by-id/lvm-pv-uuid-<PV_UUID> that completes
this scheme and makes navigation a bit easier using PV UUIDs since
one can navigate using PV UUIDs only and there's no need to do extra
PV UUID <--> kernel name matching (the PV UUID is stable across reboots).
This may come in handy in various scripts.

Since we already have the PV UUID stored in udev database (as a result
of blkid call - returned in ID_FS_UUID blkid's variable), this operation
is very cheap indeed, just creating the extra one symlink.
2014-02-18 11:37:20 +01:00
Jonathan Brassow
6a00a7e33d RAID: Allow implicit stripe (and parity) when creating RAID LVs
There are typically 2 functions for the more advanced segment types that
deal with parameters in lvcreate.c: _get_*_params() and _check_*_params().
(Not all segment types name their functions according to this scheme.)
The former function is responsible for reading parameters before the VG
has been read.  The latter is for sanity checking and possibly setting
parameters after the VG has been read.

This patch adds a _check_raid_parameters() function that will determine
if the user has specified 'stripe' or 'mirror' parameters.  If not, the
proper number is computed from the list of PVs the user has supplied or
the number that are available in the VG.  Now that _check_raid_parameters()
is available, we move the check for proper number of stripes from
_get_* to _check_*.

This gives the user the ability to create RAID LVs as follows:
# 5-device RAID5, 4-data, 1-parity (i.e. implicit '-i 4')
~> lvcreate --type raid5 -L 100G -n lv vg /dev/sd[abcde]1

# 5-device RAID6, 3-data, 2-parity (i.e. implicit '-i 3')
~> lvcreate --type raid6 -L 100G -n lv vg /dev/sd[abcde]1

# If 5 PVs in VG, 4-data, 1-parity RAID5
~> lvcreate --type raid5 -L 100G -n lv vg

Considerations:
This patch only affects RAID.  It might also be useful to apply this to
the 'stripe' segment type.  LVM RAID may include RAID0 at some point in
the future and the implicit stripes would apply there.  It would be odd
to have RAID0 be able to auto-determine the stripe count while 'stripe'
could not.

The only draw-back of this patch that I can see is that there might be
less error checking.  Rather than informing the user that they forgot
to supply an argument (e.g. '-i'), the value would be computed and it
may differ from what the user actually wanted.  I don't see this as a
problem, because the user can check the device count after creation
and remove the LV if they have made an error.
2014-02-17 20:18:23 -06:00
Zdenek Kabelac
25cea92338 thin: fix merge of old snaphost
Fix merging of old snapshot into thinvolume origin.
Add also internal error for the error case when
merging requests activation of merged LV.
2014-02-17 22:25:53 +01:00
Peter Rajnoha
f33a224ef0 scripts: use --ignoreskippedcluster in lvm2-monitor initscript/systemd unit
When clustered VG is available in the system but we don't have
clustering set up for whatever reason, the lvm2-monitor scripts should
not fail completely just because these clustered VGs are skipped during
vgs/vgchange calls in lvm2-monitor initscript/systemd unit.
2014-02-17 16:29:49 +01:00
Zdenek Kabelac
f0f4248333 activation: drop test r/w vg state for activing LV
VG status read/write is meant to influence only VG metadata.
It's not related to the read/write status of the LV itself.
2014-02-15 11:34:54 +01:00
Peter Rajnoha
4210219a8b systemd: Use --ignoreskippedcluster in generated activation systemd units
When the activation units are generated if use_lvmetad=0 (no
autoactivation), use --ignoreskippedcluster option for vgchange calls
since the cluster with cLVM is set up by separate units.

This avoids a situation in which the generated activation units are
improperly in failed state just because of the vgchange return value
when clustered VGs are encountered while the activation of non-clustered
VGs does proceed normally.
2014-02-14 11:59:12 +01:00
Jonathan Brassow
94d8779ae2 Update WHATS_NEW for approximate allocation check-in 2014-02-13 21:12:28 -06:00
Jonathan Brassow
5bfe4aaf95 cache[pool]: Man page updates for lvs, lvcreate, lvconvert 2014-02-12 10:29:07 -06:00
Zdenek Kabelac
38e457c478 raid: drop invalid modication of active parameter
lv_active_change will enforce proper activation.
Modification of activation was wrong and lead to misuse of
autoactivation. Fix allows to use proper local exclusive activation,
while the removed code turned this into just exclusive
activation (losing required local property).
2014-02-11 18:48:38 +01:00
Zdenek Kabelac
8a21dcebac raid: use unsigned 64b constant for shift 2014-02-11 18:47:15 +01:00
Peter Rajnoha
1a5062c9a7 cleanup: clarify man pages about lvchange/vgchange -aay, use -aay in lvm2-cluster-activation script 2014-02-11 13:48:04 +01:00
Peter Rajnoha
a092cd33be systemd: rename lvm2-cluster-activation and lvm2-clvmd services to follow existing naming
lvm2_cluster_activation_red_hat.service.in -> lvm2_cluster_activation_systemd_red_hat.service.in
lvm2_clvmd_red_hat.service.in -> lvm2_clvmd_red_hat.service.in

Edit lvm2-cluster-activation reference on cmirror - take new
lvm2-cmirrord.service, it was just cmirrord(.service) before
as the old initscript was used in compatibility mode.

Also, use WantedBy=multi-user.target instead of sysinit.target
in lvm2-cluster-activation.service.
2014-02-11 10:12:39 +01:00
Ondrej Kozina
fd41dd8f9c Add systemd native service for clvmd and cluster activation
The commit splits original clvmd service in two new native services
for systemd enabled systems while original init scripts remain unaltered.

New systemd native services:

  1) clvmd daemon itself (lvm2_clvmd_red_hat.service.in)
  2) (de)activation of clustered VGs (lvm2_cluster_activation_red_hat.service.in)

There're several reasons to split it. First, there's no support for conditional
stop in systemd and AFAIK they don't plan to support it. In other words:
if the deactivation fails for some reason, systemd doesn't care and will simply
kill all remaining processes in original cgroup (by default). Killing the
remaining procs can be suppressed however it doesn't solve the following problem:

You can't repeat the stop command of a failed service. The repeated stop command
is simply not propagated to the service in a failed state. You would have to start
and then try to stop the service again. Unfortunately, this can't be done while
the daemon is still running (and we need the daemon to stay active until all
clustered VGs are deactivated properly).

In a separated setup we need only to restart the failed activation service and
that's fine.
2014-02-10 17:13:49 +01:00
Peter Rajnoha
ffa623c53d systemd: cleanup for lvmetad systemd unit
No need to fork lvmetad when running under systemd.
Also, the "lvmetad -R" support has been removed in lvm2 v2.02.98
so remove the ExecReload line that called it on "systemctl reload".
2014-02-10 16:20:45 +01:00
Peter Rajnoha
ed166a3b1d wiping: wipe DM_snapshot_cow signature without prompt in newly created LVs
The libblkid can detect DM_snapshot_cow signature and when creating
new LVs with blkid wiping used (allocation/use_blkid_wiping=1 lvm.conf
setting and --wipe y used at the same time - which it is by default).

Do not issue any prompts about this signature when new LV is created
and just wipe it right away without asking questions. Still keep the
log in verbose mode though.
2014-02-10 13:28:13 +01:00
Zdenek Kabelac
ef6c5795a0 raid: add temporary activation for raid metadata clear
Use LV_TEMPORARY when activating devices for clearing
raid metadata.
2014-02-04 14:51:05 +01:00
Alasdair G Kergon
83358d4c03 tools: Add internal tags command. 2014-01-30 13:09:15 +00:00
Zdenek Kabelac
155405b0e1 thin: validate external origin size
Avoid use of external origin with size unaligned/incompatible with
thin pool chunk size, since the last chunk is not correctly provisioned
when it is overwritten.
2014-01-29 14:58:13 +01:00
Zdenek Kabelac
e9d9852c55 thin: more validation of thin name
Avoid starting conversion of the LV to the thin pool and thin volume
at the same time.  Since this is mostly a user mistake, do not try
to just convert to one of those type, since we cannot assume if the
user wanted LV to become thin volume or thin pool.

Before the fix tool reported pretty strange internal error:
Internal error: Referenced LV lvol1_tdata not listed in VG mvg.

Fixed output:
lvconvert --thinpool lvol0 -T mvg/lvol0
Can't use same LV mvg/lvol0 for thin pool and thin volume.
2014-01-28 13:21:39 +01:00
Zdenek Kabelac
7786443530 devices: support zvol
Support partitions on ZFS zvol.
Requested via https://bugzilla.redhat.com/show_bug.cgi?id=913597

Author: hakimian@aha.com
2014-01-28 10:33:29 +01:00
Zdenek Kabelac
6b73d21ba9 locking: avoid dropping locks
When lvm2 command forks, it calls reset_locking(),
which as an unwanted side effect unlinked lock file from filesystem.

Patch changes the behavior to just close locked file descriptor
in children - so the lock is being still properly hold in the parent.
2014-01-27 12:13:29 +01:00
Zdenek Kabelac
f18ee04fab lvmetad: respect LVM_LVMETAD_PIDFILE settings in lvm
Test LVM_LVMETAD_PIDFILE for pid for lvm command.
Fix WHATS_NEW envvar name usage
Fix init order in prepare_lvmetad to respect set vars
and avoid clash with system settings.
Update test to really test the 'is running' message.
2014-01-24 15:59:38 +01:00
Zdenek Kabelac
731c298e12 thin: use LV_TEMPORARY for metadata initialization
This flag need to be specified when we create thin pool - to avoid
scanning device with watch rules.
2014-01-24 12:30:28 +01:00
Zdenek Kabelac
f8b20fb8e8 thin: fix feature compare function
Comparing for available feature missed the code path, when
maj is already bigger.

The bug would be only hit in the case, thin pool target would have
increased major version.
2014-01-23 14:22:31 +01:00
Zdenek Kabelac
902b343e0e thin: validate resize of thin LV with ext. origin
When thin volume is using external origin, current thin target
is not able to supply 'extended' size with empty pages.

lvm2 detects version and disables extension of LV past the external
origin size in this case.

Thin LV could be however still reduced and extended freely bellow
this size.
2014-01-23 14:20:34 +01:00
Zdenek Kabelac
2dae78b722 thin: rename function
Rename pool_can_resize_metadata() to more reusable
thin_pool_feature_supported() which could be queried
for mutiple different features.
2014-01-23 14:19:17 +01:00
Peter Rajnoha
2b9d25133e wiping: issue error if libblkid detects signature and fails to return offset/length
We need both offset and length when trying to wipe detected signatures.
The libblkid can fail so it's good to have an error message issued for
this state instead of being silent (libblkid does not issue any error
messages here). We just issued "stack" here before but that was not
quite useful if some error occurs...
2014-01-22 16:29:52 +01:00
Alasdair G Kergon
d2956f0a59 autoconf: Update config.guess/sub to 2014-01-01. 2014-01-21 22:00:15 +00:00
Zdenek Kabelac
22d9daff12 thin: online metadata resize requires 1.10
Version 1.10 starts to look promissing, let's enable
online resize when this thin-pool kernel target is present.
2014-01-21 13:50:17 +01:00
Alasdair G Kergon
899a079c8f post-release 2014-01-20 19:41:30 +00:00
Alasdair G Kergon
aa21e79991 pre-release 2014-01-20 19:22:56 +00:00
Peter Rajnoha
91b26b63b4 thin: fix thin LV flagging for udev to skip scanning
Only flag thin LV for no scanning in udev if this LV is about
to be wiped. This happens only in case the thin LV's pool was not
created with zeroing of the new blocks enabled.
2014-01-20 12:38:21 +01:00
Zdenek Kabelac
79768b2e9c tests: update testing for xfs 2014-01-20 12:02:33 +01:00
Alasdair G Kergon
3813cd7a3c format1: Mark obsolete and do not use with lvmetad.
DO NOT USE LVMETAD IF YOU HAVE ANY LVM1-FORMATTED PVS.

You may continue to use it without lvmetad, but do please schedule
an upgrade to the lvm2 format (with 'vgconvert').

Sending the original LVM1 formatted metadata to lvmetad is breaking
assumptions made by the code, so I am marking the format as obsolete for
now and no longer sending it to lvmetad.

This means that if you are using lvmetad, lvm1 volumes will usually
appear invisible - though not always: it depends on exactly what
sequence of commands you run!

The current situation is not satisfactory.

We'll either fix lvmetad and reenable this or we'll fix the code to
issue appropriate warning messages when lvm1 PVs are encountered
to avoid accidents.

(The latest unfixed problem is that lvmetad assumes metadata sequence
numbers exist and always increase - but the lvm1 format does not define
or store any sequence number, confusing both the daemon and client
when default values get passed to-and-fro.)
2014-01-14 03:27:45 +00:00
Alasdair G Kergon
4c2b4c37e7 lvmcache: Invalidate cached VG if PV is orphaned.
If a PV in an existing VG becomes orphaned (with 'pvcreate -ff', for
example) the VG struct cached against its vginfo must be invalidated.
This is because the struct device it references no longer contains
the PV label so becomes incorrect.

This triggers the error:
  Internal error: PV $dev unexpectedly not in cache.
when the PV from the cached VG metadata is subsequently looked up
in the cache.

Bug introduced in 2.02.87 by commit 7ad0d47c3c
("Cache and share generated VG structs").

Before:

lvm> pvs
  PV         VG   Fmt  Attr PSize  PFree
  /dev/loop3 vg12 lvm2 a--  28.00m 28.00m
  /dev/loop4 vg12 lvm2 a--  28.00m 28.00m
lvm> pvcreate -ff /dev/loop3
Really INITIALIZE physical volume "/dev/loop3" of volume group "vg12" [y/n]? y
  WARNING: Forcing physical volume creation on /dev/loop3 of volume group "vg12"
  Physical volume "/dev/loop3" successfully created
lvm> pvs
  Internal error: PV /dev/loop3 unexpectedly not in cache.
  PV         VG   Fmt  Attr PSize  PFree
  /dev/loop3 vg12 lvm2 a--  28.00m 28.00m
  /dev/loop3      lvm2 a--  32.00m 32.00m
  /dev/loop4 vg12 lvm2 a--  28.00m 28.00m

After:
lvm> pvs
  PV         VG   Fmt  Attr PSize  PFree
  /dev/loop3 vg12 lvm2 a--  28.00m 28.00m
  /dev/loop4 vg12 lvm2 a--  28.00m 28.00m
lvm> pvcreate -ff /dev/loop3
Really INITIALIZE physical volume "/dev/loop3" of volume group "vg12" [y/n]? y
  WARNING: Forcing physical volume creation on /dev/loop3 of volume group "vg12"
  Physical volume "/dev/loop3" successfully created
lvm> pvs
  PV             VG   Fmt  Attr PSize  PFree
  /dev/loop3          lvm2 a--  32.00m 32.00m
  /dev/loop4     vg12 lvm2 a--  28.00m 28.00m
  unknown device vg12 lvm2 a-m  28.00m 28.00m
2014-01-14 02:57:03 +00:00
Peter Rajnoha
359291b41c systemd: use only major:minor for pvscan in lvm2-pvscan@.service
When using filters for the pvscan --cache (the global_filter),
there's a difference between:

  pvscan --cache -aay /dev/block/<major>:<minor>

and

  pvscan --cache -aay <major>:<minor> (or --major <major> --minor <minor>)

In the first case, we need to be sure to have an exact matching line
in the filter for the device to be used, no aliases are considered
So for example even if we have accept rule for "/dev/sda" present,
this won't apply for "/dev/block/8:0" even though it's the same device!
This is because we're comparing the path used on command line directly
with the path written in the rule.

For the second one, any alias mentioned in the filter will apply
as we're comparing the major and minor pair, not looking at actual
device names - so any alias mentioned in the rules will suffice for
the filtering rule to apply.

For the global_filter to be properly used, we need to call the
second one in the lvm2-pvscan@.service - nobody is able to tell
what value of major:minor the kernel assignes next time, hence
this bug makes the use of global_filter quite unusable!
2013-12-18 12:23:59 +01:00
Ville Skyttä
3776832499 man: syntax and spelling fixes. 2013-12-18 09:03:42 +01:00
Zdenek Kabelac
c3d82d717c Revert "tree_action: destroy devices from failing activation"
This reverts commit 24639be558.

Ok - seems we could be here a bit too active - and we
may remove devices which are unsuable for reasons we are not
aware of - thus taking down whole device could be way to big hammer.

So we still need some solution to recover from failing preload
and activation - but it needs more tunning.
2013-12-17 15:21:28 +01:00
Zdenek Kabelac
24639be558 tree_action: destroy devices from failing activation
When activation fails - we may leak large tree of partially loaded
devices in the dm table (i.e. failure in snapshot activation)

The best we can do here is try to deactivate whole device and
remove as much inactive table entries as we can.
2013-12-17 14:08:54 +01:00
Zdenek Kabelac
94137b72ed lv_dependency: scan also snapshots and extorigins
When LV is scanned for its dependencies - scan also origin's snapshots,
and thin external origins.

So if any PV from snapshot or external origin device is missing - lvm2 will
avoid trying to activate such device.
2013-12-17 14:08:54 +01:00
Peter Rajnoha
32080c4ff7 device: add physical block size info and make sure VG extent size >= PV's phys. block size 2013-12-12 15:02:36 +01:00
Zdenek Kabelac
6c0e44d5a2 dev-cache: skip double stat() call on each _insert
When the device is inserted in dev_name_confirmed() stat() is
called twice as _insert() has it's own stat() call.

Extend _insert() parameter with struct stat* - which could be used
if it has been just obtained.  When NULL is passed code is
doing its own stat() call as before.
2013-12-12 13:42:28 +01:00
Zdenek Kabelac
10a13dc03c thin: enable build of thin provisioning by default
Use internal type by default for thin provisioning.
If user is not interested in thin provisiong and doesn't
have thin provisining supporting tools installed,
configure will just print warning at the end of configure
process about limited support.
2013-12-12 13:31:23 +01:00
Zdenek Kabelac
fe21b02fab cleanup: improve tag processing
Boolean algebra changes for process_each_lv_in_vg().

1st.
Drop process_lv variable since it's not needed.

2nd.
process_lv was always initilized to 0 - so the condition was always true.
It the condition (!tags_supplied && !lvargs_supplied) evaluates as "true",
process_all is already set to 1, so skip vg tags evaluation.

3rd.
Move check for matching lv name in the front of lv tags check
since this check can't be skipped for lvargs_matched counter.
If this filter evaluates to true, skip lv tags evaluation.
2013-12-12 13:29:20 +01:00
Zdenek Kabelac
b53e9ba66a cleanup: simplify logging code
Condense code for logging.
2013-12-12 13:27:59 +01:00
Peter Rajnoha
877418b1a2 WHATS_NEW: commit 4c267c7286
See also https://bugzilla.redhat.com/show_bug.cgi?id=1026860
for more info, comment 76 for summary.
2013-12-11 16:50:52 +01:00
Zdenek Kabelac
e3b7fa4b8d thin: thin metadata resize unsupported with 1.9
Thin kernel target 1.9 still does not support online resize of
thin pool metadata properly - so disable it with expectation
for much higher version - and reenable after fixing kernel.
2013-12-10 11:17:37 +01:00
Zdenek Kabelac
cf857ecabd cleanup: shorter raid initialization
Avoid multiple queries for raid library for each initialized raid
seg type and use shorter code.
2013-12-10 11:16:51 +01:00
Zdenek Kabelac
9f0e27a18c cleanup: tiny speedup of lib_dir checking
Instead of repeated lookup of global/library_dir - remember first
search in command context - saves couple lines in debug output...
2013-12-10 11:15:48 +01:00
Alasdair G Kergon
16eab3ec08 config: shorten new sig wiping option string
Rename wipe_signatures_on_new_logical_volumes_when_zeroing  to
wipe_signatures_when_zeroing_new_lvs.
2013-12-09 09:35:47 +00:00
Peter Rajnoha
481edce41f compile/link: use RELRO/PIE compiler/linker options for executables 2013-12-05 14:03:10 +01:00
Zdenek Kabelac
7baf411c97 dev-cache: return success when ignoring dirs
When there is no failure from the insert operation itself,
just dirs and links are skipped - return success.
2013-12-05 11:17:37 +01:00
Zdenek Kabelac
598c82fc07 vgchange: move detection of remote exlusivness
Since activation takes only read-lock, there could be
multiple activation running in parallel.

So instead of checking before taking any real lock,
let the locking resolve the problem and just
detect if the reason for failure has been remote
exlusive activation.

It should be also faster, since each activation does
not need to do explicit lock query.
2013-12-04 17:09:51 +01:00
Zdenek Kabelac
91496e0eda thin: enable thin snapshot merge 2013-12-04 14:30:26 +01:00
Zdenek Kabelac
572983d793 thin: read table line with thin device id
Add functions to parse thin table line to
obtain thin device id.
2013-12-04 14:30:25 +01:00
Zdenek Kabelac
cf7f451238 merge: test only for meging origin
It's enough to check only for merging origin.
This will be also used for thin volume merges which
do not need to be origins.
2013-12-04 14:30:25 +01:00
Zdenek Kabelac
5a6794a2ce lv_remove_single: add silent arg
Support silence for removal message.
2013-12-04 14:30:25 +01:00
Zdenek Kabelac
84b3852ee5 snapshots: use lv_check_not_in_use
Switch from a simple 'open_count' test on opened snaphost
to a more 'skilled'  lv_check_not_in_use().
2013-12-04 14:30:24 +01:00
Zdenek Kabelac
778de22d51 refresh: print error message with failing lv name
If there is  suspend/resume error, print error message
with lv name.
Drop goto.
2013-12-04 14:30:24 +01:00
Peter Rajnoha
a65ab773b4 daemons: use PIE and RELRO compiler/linker options
The PIE and RELRO compiler/linker options can be used to produce a code
some techniques applied that makes the code more immune to some attacks:

  - PIE (Position Independent Executable). It can make use of the ASLR
    (Address Space Layout Randomization) provided by kernel to avoid
    static locations for .text regions of executables (this is the 'pie'
    compiler and linker option)

  - RELRO (Relocation Read-Only). This prevents overwrite attacks of
    the GOT (Global Offset Table) and PLT (Procedure Lookup Table)
    used for relocations by making it read-only after all relocations
    are resolved (these are the 'relro' and 'now' linker options) -
    hence all symbols are resolved at the very start so there's no
    need for those tables to be writeable later.

These compiler/linker options are now used by default for daemons
if the compiler/linker supports it.
2013-12-04 13:30:08 +01:00
Zdenek Kabelac
fc37d4fb0d make: support per-object defines
In the case we have a dir with multiple objects and for
an individual object file we need special define -
allow to define it without adding extra rules.

To ensure dmeventd.o compilation will use EXTRA_FLAGS:
CFLAGS_dmeventd.o += $(EXTRA_FLAGS)

Then it's better to use:
dmeventd.o: CFLAGS += $(EXTRA_FLAGS)
2013-12-04 13:30:08 +01:00
Peter Rajnoha
6c6bcc00e4 configure: check compiler/linker support for RELRO and PIE options
Also, add AC_TRY_LDFLAGS m4 macro to help with checking ld flags.
2013-12-04 13:30:08 +01:00
Alasdair G Kergon
7b65363bf7 lvconvert: Implement --splitsnapshot. 2013-12-04 02:09:37 +00:00
Alasdair G Kergon
ff769ecfe7 lvconvert: Fix reload after snapshot conversion.
At the end of lvconvert --snapshot with an active origin, the origin
gets reloaded.

Commit 57c0f72b1d ("lvconvert: use
_reload_lv on more places") accidentally replaced this with a snapshot
LV reload (which does nothing because only the origin is active).
2013-12-04 02:04:29 +00:00
Peter Rajnoha
6232cac86c vgdisplay: select only active volumes groups if -A option is used
Where "active" means "at least one LV is active in the volume group".
2013-12-03 14:43:00 +01:00
Alasdair G Kergon
84394c0219 lvmetad: extend socket/pid file handling
Make it easier to run a live lvmetad in debugging mode and
to avoid conflicts if multiple test instances need to be run
alongside a live one.

No longer require -s when -f is used: use built-in default.
Add -p to lvmetad to specify the pid file.
No longer disable pidfile if -f used to run in foreground.
If specified socket file appears to be genuine but stale, remove it
before use.
On error, only remove lvmetad socket file if created by the same
process.  (Previous code removes socket even while a running instance
is using it!)
2013-11-29 20:56:29 +00:00
Peter Rajnoha
75628f341a configure: enable blkid_wiping by default if the blkid library is present 2013-11-29 15:27:56 +01:00
Alasdair G Kergon
b3074560eb lvmetad: Add newline to missing socket error mesg 2013-11-28 19:16:25 +00:00
Zdenek Kabelac
01c438a96c format-text: ensure aligment is not 0
Make sure this path of code is not used for alignment == 0,
to prevent division by 0.
2013-11-28 12:42:39 +01:00
Peter Rajnoha
ab2f858af7 conf: add allocation/use_blkid_wiping
Add allocation/use_blkid_wiping setting to lvm.conf to select between
LVM2 native code to detect signatures to wipe or blkid library code.
2013-11-27 15:49:14 +01:00
Peter Rajnoha
9bfc0be493 configure: add --enable-blkid_wiping 2013-11-27 15:48:16 +01:00
Peter Rajnoha
169b4c1586 lvcreate: recognize --wipesignatures arg
Recognize the new --wipesignatures arg in lvcreate that is supposed
to wipe known signatures if found on newly created LV.
2013-11-27 15:48:15 +01:00
Peter Rajnoha
5b7e543cae conf: add allocation/wipe_signatures_on_new_logical_volumes_when_zeroing
This setting controls whether signature wiping on newly created logical
volumes will follow the state of zeroing (-Z/--zero option).
2013-11-27 15:48:06 +01:00
Peter Rajnoha
d6e67b8503 WHATS_NEW: commit 729b104 2013-11-27 08:33:02 +01:00
Peter Rajnoha
8d5cff5b9b lv/vgchange: do not try to connect to lvmetad if socket absent and --sysinit -aay used
If using lv/vgchange --sysinit -aay and lvmetad is enabled, we'd like to
avoid the direct activation and rely on autoactivation instead so
it fits system initialization scripts.

But if we're calling lv/vgchange --sysinit -aay too early when even
lvmetad service is not started yet, we just need to do the direct
activation instead without printing any error messages (while
trying to connect to lvmetad and not finding its socket).

This patch adds two helper functions - "lvmetad_socket_present" and
"lvmetad_used" which can be used to check for this condition properly
and avoid these lvmetad connections when the socket is not present
(and hence lvmetad is not yet running).
2013-11-26 14:51:23 +01:00
Zdenek Kabelac
4a061a35c7 snapshot: use lv_check_not_in_use
Instead of plain open_count check, try to use 'smarter'
lv_check_not_in_use() function.
2013-11-22 20:58:11 +01:00
Zdenek Kabelac
6d196410fc snapshot: revert and move check to lvconvert
Revert 4777eb6872 which put
target_present check into init_snapshot_merge(). However
this function is also used when parsing metadata. So we would
get this present test performed even when target is not really
needed. So move this target_present test directly into lvconvert.
2013-11-22 20:57:30 +01:00
Zdenek Kabelac
3d3b8bfd1c pv_write: check for lvmcache_add_mda failure
Add missing test of failing lvmcache_add_mda() call.
2013-11-22 20:55:09 +01:00
Zdenek Kabelac
a50a297f6e report: detect dev_get_size failure
Since dev_get_size() may fail, detect this failure.
2013-11-22 20:54:16 +01:00
Zdenek Kabelac
08d6d81cc2 filters: drop extra slash from sysfs path
Sysfs filter was using '/sys//class/block' with double '//' inside.
Remove this extra '/'.
Also simplify code around and use loop to try those paths.
2013-11-22 20:53:31 +01:00
Alasdair G Kergon
4c1f281b43 toollib: Avoid undefined ignore_vg parameter.
Fix process_each_segment_in_pv to always set ret before calling ignore_vg().
2013-11-22 18:11:04 +00:00
Tony Asleson
7c8abb29df liblvm/python API: Additions & fixes
Signed-off-by: Tony Asleson <tasleson@redhat.com>
2013-11-19 14:54:18 -06:00
Zdenek Kabelac
c697f501c7 Makefile: fix build in non src dir
Fix 'make install' from builddir != srcdir.
2013-11-19 10:55:11 +01:00
Zdenek Kabelac
37b7c67079 thin: report thin device id
Support   lvs -o+thin_id  to report thin device id.
This device_id is connection between kernel and lvm2 user space thin
metadata.
2013-11-15 12:38:37 +01:00
Zdenek Kabelac
36d7c639d1 report: fix dereference of string as uint64_t
Fix buggy usage of "" (empty string) as a numerical string
value used for sorting.

On intel 64b platform this was typically resolve
as 0xffffff0000000000 - which is already 'close' to
UINT64_MAX which is used for _minusone64.

On other platforms it might have been giving
different numbers depends on aligment of strings.

Use proper &_minusone64 for sorting value when the reported
value is NUM.

Note: each numerical value needs to be thought about if it needs
default value &_zero64 or &_minusone64 since for cases, were
value of zero is valid, sorting should not be mixing entries
together.
2013-11-15 11:05:02 +01:00
Zdenek Kabelac
49ccf44f45 report: add wrappers to set values and percents
Add wrapper function for dm_report_field_set_value() which returns void
and return 1, so the code could be shorter.

Add wrapper function for percent display _field_set_percent().
2013-11-15 11:05:00 +01:00
Alasdair G Kergon
322e7b3060 post-release 2013-11-13 14:11:11 +00:00
Alasdair G Kergon
77a1efeb8e release 2.02.104
87 files changed, 1207 insertions(+), 294 deletions(-)
2013-11-13 14:02:34 +00:00
Alasdair G Kergon
527db4645f gcc: replace #ifdef linux with __linux__ 2013-11-13 13:56:29 +00:00
Peter Rajnoha
d8085edf65 pvscan: retry VG refresh before autoactivation if it fails
There's a tiny race when suspending the device which is part
of the refresh because when suspend ioctl is performed, the
dm kernel driver executes (do_suspend and dm_suspend kernel fn):

  step 1: a check whether the dev is already suspended and
          if yes it returns success immediately as there's
          nothing to do
  step 2: it grabs the suspend lock
  step 3: another check whether the dev is already suspended
          and if found suspended, it exits with -EINVAL now

The race can occur in between step 1 and step 2. To prevent
premature autoactivation failure, we're using a simple retry
logic here before we fail completely. For a complete solution,
we need to fix the locking so there's no possibility for suspend
calls to interleave each other to cause this kind of race.

This is just a workaround. Remove it and replace it with proper
locking once we have that in!
2013-11-12 11:09:45 +01:00
Jonathan Brassow
7de533ad12 mirror: Handle failures in tmp mirror used when up-converting.
Failures in the temporary mirror used when up-converting cause dmeventd
to issue 'lvconvert --repair' on the sub-LV, <lv_name>_mimagetmp_?.  The
'lvconvert' command refuses to deal with this sub-LV outright - it
expects to be given the name of the top-level LV.  So, just like we do
with mirrored logs, we strip-off the portion of the name that is not
the top-level LV and issue the command on the top-level LV instead.
2013-11-08 09:52:00 -06:00
Zdenek Kabelac
9f6209b878 activation: improve activation
This patch fixes mostly cluster behavior but also updates
non-cluster reaction where calls like   'lvchange -aln'
lead to incorrect errors for some segment types.

Fix the implicit activation rules where some segment types could
be activated only in exclusive mode in cluster.
lvm2 command was not preserver 'local' property and incorrectly
converted local activations in to plain exclusive, so the local
activation could have activate volumes exclusively, but remotely.
2013-11-01 13:03:50 +01:00
Zdenek Kabelac
c3e674ad30 activation: _lv_activate is ok when filtered.
If the volume_list filters out volume from activation,
it is still success result for this function.
Change the error message back to verbose level.

Detect if the volume is active localy before zeroing,
so we report error a bit later for cases, where volume
could not be activated because it doesn't pass through volume
list  (but user still could create volume when he disables
zeroing)
2013-11-01 13:02:36 +01:00
Zdenek Kabelac
1bde9f68ce locking: activate_lv_excl return correct error code
Correct return code of activate_lv_excl().

Function is not supposed to return activation state of
activated volume, but return code of the operation.
Since i.e. when activation filter is allowing to activate
volume on current system, it is still success even though
no volume is activated.
2013-11-01 13:02:13 +01:00
Peter Rajnoha
f070e3543a udev: properly trigger LVM scan for MD partitions
MD can directly create partition devices without a need to run
an extra kpartx or partprobe call. We need to react to this event in
a different way as for bare MD devices - we need to handle the ADD event
for KERNEL=="md[0-9]*p[0-9]*" kernel name and trigger the LVM scanning
to update lvmetad to trigger autoactivation and so on...

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1023250
2013-10-30 14:09:11 +01:00
Peter Rajnoha
db05a0cf6f WHATS_NEW: commit 9d06212
Other changes in previous commits 9d06212 and f1a42aa are changes
in the code that was not yet released as part of upcoming v104.
2013-10-29 13:49:55 +01:00
Peter Rajnoha
f3a6f7073b WHATS_NEW: commit 4c0db84 2013-10-29 11:04:32 +01:00
Zdenek Kabelac
8e1f2e733e gcc: fix comparing floating point warning
Since we enabled some more gcc warnings - let's adapt for
it and check for double equals with DBL_EPSILON.

Current close_enough() is far from perfect
for more details see i.e. here:
http://randomascii.wordpress.com/2012/01/11/tricks-with-the-floating-point-format/
but fairly enough for lvm2 use-case.
2013-10-25 10:43:32 +02:00
Alasdair G Kergon
c9c23d4148 build: Use additional gcc warning flags. 2013-10-24 17:10:24 +01:00
Jonathan Brassow
d5896f0afd Mirror: Fix hangs and lock-ups caused by attempting label reads of mirrors
There is a problem with the way mirrors have been designed to handle
failures that is resulting in stuck LVM processes and hung I/O.  When
mirrors encounter a write failure, they block I/O and notify userspace
to reconfigure the mirror to remove failed devices.  This process is
open to a couple races:
1) Any LVM process other than the one that is meant to deal with the
mirror failure can attempt to read the mirror, fail, and block other
LVM commands (including the repair command) from proceeding due to
holding a lock on the volume group.
2) If there are multiple mirrors that suffer a failure in the same
volume group, a repair can block while attempting to read the LVM
label from one mirror while trying to repair the other.

Mitigation of these races has been attempted by disallowing label reading
of mirrors that are either suspended or are indicated as blocking by
the kernel.  While this has closed the window of opportunity for hitting
the above problems considerably, it hasn't closed it completely.  This is
because it is still possible to start an LVM command, read the status of
the mirror as healthy, and then perform the read for the label at the
moment after a the failure is discovered by the kernel.

I can see two solutions to this problem:
1) Allow users to configure whether mirrors can be candidates for LVM
labels (i.e. whether PVs can be created on mirror LVs).  If the user
chooses to allow label scanning of mirror LVs, it will be at the expense
of a possible hang in I/O or LVM processes.
2) Instrument a way to allow asynchronous label reading - allowing
blocked label reads to be ignored while continuing to process the LVM
command.  This would action would allow LVM commands to continue even
though they would have otherwise blocked trying to read a mirror.  They
can then release their lock and allow a repair command to commence.  In
the event of #2 above, the repair command already in progress can continue
and repair the failed mirror.

This patch brings solution #1.  If solution #2 is developed later on, the
configuration option created in #1 can be negated - allowing mirrors to
be scanned for labels by default once again.
2013-10-22 19:14:33 -05:00
Peter Rajnoha
039bdad732 activation: flag temporary LVs internally
Add LV_TEMPORARY flag for LVs with limited existence during command
execution. Such LVs are temporary in way that they need to be activated,
some action done and then removed immediately. Such LVs are just like
any normal LV - the only difference is that they are removed during
LVM command execution. This is also the case for LVs representing
future pool metadata spare LVs which we need to initialize by using
the usual LV before they are declared as pool metadata spare.

We can optimize some other parts like udev to do a better job if
it knows that the LV is temporary and any processing on it is just
useless.

This flag is orthogonal to LV_NOSCAN flag introduced recently
as LV_NOSCAN flag is primarily used to mark an LV for the scanning
to be avoided before the zeroing of the device happens. The LV_TEMPORARY
flag makes a difference between a full-fledged LV visible in the system
and the LV just used as a temporary overlay for some action that needs to
be done on underlying PVs.

For example: lvcreate --thinpool POOL --zero n -L 1G vg

- first, the usual LV is created to do a clean up for pool metadata
  spare. The LV is activated, zeroed, deactivated.

- between "activated" and "zeroed" stage, the LV_NOSCAN flag is used
  to avoid any scanning in udev

- betwen "zeroed" and "deactivated" stage, we need to avoid the WATCH
  udev rule, but since the LV is just a usual LV, we can't make a
  difference. The LV_TEMPORARY internal LV flag helps here. If we
  create the LV with this flag, the DM_UDEV_DISABLE_DISK_RULES
  and DM_UDEV_DISABLE_OTHER_RULES flag are set (just like as it is
  with "invisible" and non-top-level LVs) - udev is directed to
  skip WATCH rule use.

- if the LV_TEMPORARY flag was not used, there would normally be
  a WATCH event generated once the LV is closed after "zeroed"
  stage. This will make problems with immediated deactivation that
  follows.
2013-10-23 14:09:37 +02:00
Peter Rajnoha
9883bffb04 WHATS_NEW: typo 2013-10-22 16:37:02 +02:00
Peter Rajnoha
b109bfc1ef blkdeactivate: fix endless loop if device(s) given and unable to umount/deactivate
The blkdeactivate script iterates over the list of devices if they're
given as an argument and it tries to umount/deactivate them one by one.

This iteration failed to proceed if any of the umount/deactivation
was unsuccessful - there was a missing "shift" call to move to the
next argument (device) for processing. As a result of this, the same
device was tried again and again, causing an endless loop, never
proceeding to the next device given.
2013-10-22 16:24:39 +02:00
Peter Rajnoha
3fee661028 udev+systemd: refine lvm2-pvscan@.service to better track device existence
When using ENV{SYSTEMD_WANTS}=lvm2-pvscan@... to instantiate a service
for lvmetad scan when the new PV appears in the system, the service
is started and executed. However, to track device removal, we need
to bind it (the "BindsTo" systemd directive) to a certain .device
systemd unit.

In default systemd setup, the device is tracked by it's name and
sysfs path (there's normally a sysfs path .device systemd unit for
a device and then the device name .device unit as an alias for it).
Neither of these two is useful for lvmetad update as we need to bind
it to device's <major>:<minor> pair.

The /dev/block/<major>:<minor> is the essential symlink under /dev
that exists for each block device (created by default udev rules
provided by udev directly). So let's use this as an alias for
the device's .device unit as well by means of "ENV{SYSTEMD_ALIAS}"
declaration within udev rules which systemd understands (this will
create a new alias "dev-block-<major>:<minor>.device".

Then we can easily bind the "dev-block-<major>:<minor>" device
systemd unit with instantiated lvm2-pvscan@<major>:<minor>.service.
So once the device is removed from the systemd, the
lvm-pvscan@<major>:<minor>.service executes it's ExecStop action
(which in turn notifies lvmetad about the device being gone).

This completes the udev-systemd-lvmetad interaction then.
2013-10-22 14:22:40 +02:00
Peter Rajnoha
0a48137d39 pvscan: use major:minor as short form of --major and --minor arg for pvscan --cache
Before, pvscan recognized either:
  pvscan --cache --major <major> --minor <minor>
or
  pvscan --cache <DevicePath>

When the device is gone and we need to notify lvmetad about device
removal, only --major/--minor works as we can't translate DevicePath
into major/minor pair anymore. The device does not exist in the system
and we don't keep DevicePath index in lvmetad cache to make the
translation internally into original major/minor pair. It would be
useless to keep this index just for this one exact case.

There's nothing bad about using "--major <major> --minor <minor>",
but it makes our life a bit harder when trying to make an
interconnection with systemd units, mainly with instantiated services
where only one and only one arg can be passed (which is encoded in the
service name).

This patch tries to make this easier by adding support for recognizing
the "<major>:<minor>" as a shortcut for the longer form
"--major <major> --minor <minor>". The rule here is simple: if the argument
starts with "/", it's a DevicePath, otherwise it's a <major>:<minor> pair.
2013-10-22 13:52:18 +02:00
Mike Snitzer
65456a4a29 vgimportclone: remove 2>/dev/null from three lvm commands
There is no point eating stderr for these commands.  In fact the
redirect causes confusion and hurts dubugging.

Also reword an error message if the pvs command fails so as not be
certain that a device is not a PV.  Coupled with removing the stderr
redirect this will improve the user experience in the face of errors.
2013-10-21 18:04:14 -04:00
Peter Rajnoha
546db1c4be udev+systemd: make pvscan --cache -aay run as systemd background job from udev
The new lvm2-pvscan@.service is responsible for on-demand execution
of "pvscan --cache --activate ay" which causes lvmetad to be
updated and LVM activation done if the VG is complete.

Also, use udev-systemd mechanism to instantiate the job as the
lvm2-pvscan@$devnode.service on each newly appeared PV in the system.
This prevents the background job to be killed (that would happen
if it was directly forked from udev rule - this behaviour is seen
in recent versions of udev with the help of systemd that can track
detached processes - the detached process would still be in the same
cgroup).

To enable this official udev-systemd protocol for instantiating
background jobs, use new --enable-udev-systemd-background-jobs
configure switch (it's disabled by default). This option is highly
recommended wherever systemd is used!
2013-10-18 11:38:49 +02:00
Zdenek Kabelac
1b7631101b thin: fix lvconvert for active pool.
Prohibit conversion of pool device with active thin volumes.
Properly restore active states only for active thin pool volume.
Use new LV_NOSCAN when converting volume into thin pool's metadata.
2013-10-16 10:53:01 +02:00
Peter Rajnoha
48df36b8c5 activation: check for open count with a timeout before removal/deactivation of an LV
This patch reinstates the lv_info call to check for open count of
the LV we're removing/deactivating - this was changed with commit 125712b
some time ago and we relied on the ioctl retry logic deeper in the libdm
while calling the exact 'remove' ioctl.

However, there are still some situations in which it's still required to
check for open count before we do any 'remove' actions - this mainly
applies to LVs which consist of several sub LVs, like it is for
virtual snapshot devices.

The commit 1146691 fixed the issue with ordering of actions during
virtual snapshot removal while the snapshot is still open. But
the check for the open status of the snapshot is still prone to
marking the snapshot as in use with an immediate exit even though
this could be a temporary asynchronous open only, most notably
because of udev and its WATCH udev rule with accompanying scans
for the event which is asynchronous. The situation where this crops
up most often is when we're closing the LV that was open for read-write
and then calling lvremove immediately.

This patch reinstates the original lv_info call for the open status
of the LV in the lv_check_not_in_use fn that gets called before
we do any LV removal/deactivation. In addition to original logic,
this patch adds its own retry loop with a delay (25x0.2 seconds)
besides the existing ioctl retry loop.
2013-10-15 12:44:42 +02:00
Jonathan Brassow
f58b26b633 RAID: Report RAID images split with tracking as out-of-sync ("I").
Split image should have an out-of-sync attr ('I') - always.  Even if
the RAID LV has not been written to since the LV was split off, it is
still not part of the group that makes up the RAID and is therefore
"out-of-sync".
2013-10-14 10:48:44 -05:00
Zdenek Kabelac
851bba258c snapshot: rework parsing of snapshot metadata
Add better parsing code for snapshot metadata, which describe
properly errors found for snapshot segment.
2013-10-14 00:26:58 +02:00
Zdenek Kabelac
1146691afc snapshot: deactivate virtual snapshot first
Since the virtual snapshot has no reason to stay alive once we
detach related snapshot - deactivate whole thing in front of
snapshot removal - otherwice the code would get tricky for
support in cluster.

The correct full solution would require to have transactions
for libdm operations.

Also enable to the check for snapshot being opened prior
the origin deactivation, otherwise we could easily end
with the origin being deactivate, but snapshot still kept
active, desynchronizing locking state in cluster.
2013-10-14 00:25:15 +02:00
Zdenek Kabelac
ac961087b0 snapshot: disable merging for virtual snaps
Merging into virtual origin is not supposed to work.
2013-10-12 00:15:55 +02:00
Zdenek Kabelac
81504ba70c snapshot: move virtsnap code from tool to lib
Move code for removal dependency from tool's remove.c
into lib's manipulation code.

Same code then works with lvm2app.
2013-10-12 00:14:52 +02:00
Peter Rajnoha
304159c99a cleanup: WHATS_NEW + compiler warning about discarding const 2013-10-10 09:09:16 +02:00
Alasdair G Kergon
7bed6d1263 filters: Add NVM Express (nvme). 2013-10-09 20:08:07 +01:00
Peter Rajnoha
1b91847beb WHATS_NEW: commit 0decd75 2013-10-09 15:59:19 +02:00
Peter Rajnoha
863be9d9c6 WHATS_NEW: commit d888a05 and 808a5d9 2013-10-09 12:11:12 +02:00
Peter Rajnoha
2f5ddfbade udev: add support for "NOSCAN" flag
Recognize DM_SUBSYSTEM_UDEV_FLAG0 which for LVM is the "LVM_NOSCAN"
flag that causes the scanning to be skipped (mainly blkid) and
also directs all the foreign rules to be skipped as well.

Important thing here is that the "watch" udev rules is still set
as well as the /dev/disk/by-id content created (which does not
require any scanning to be done). Also, the flag is dropped on
any subsequent event and scanning done...
2013-10-08 13:43:14 +02:00
Peter Rajnoha
ce7489ed22 activation: add support for flagging an LV to skip udev scanning during activation
A common scenario is during new LV creation when we need to wipe the
newly created LV and avoid any udev scanning before this stage otherwise
it could cause the device (the LV) to be claimed by some other subsystem
for which there were stale metadata within LV data.

This patch adds possibility to mark the LV we're just about to wipe with
a flag that gets passed to udev via DM_COOKIE as a subsystem specific
flag - DM_SUBSYSTEM_UDEV_FLAG0 (in this case the subsystem is "LVM")
so LVM udev rules will take care of handling that.
2013-10-08 13:43:14 +02:00
Zdenek Kabelac
92bafade60 thin: fix lvconvert in external origin conversion
Patch 562ad293fd introduced code regression
when LV was converted to a thin LV with external origin and at the same time,
conversion of LV to a thin pool has been requested.
(RHBZ: #997704)

data_lv needs to be assigned after test for external conversion find pool.
2013-10-08 13:41:06 +02:00
Zdenek Kabelac
30746f31dd vgrename: run fullscan
For vgrename run full scan so the command is able to properly
detect name collision.
2013-10-08 13:39:11 +02:00
Alasdair G Kergon
4806f38d70 lvchange: improve discards when pool active error
Existing message deemed misleading:
  Cannot change discards state for active pool volume

https://bugzilla.redhat.com/show_bug.cgi?id=994315
2013-10-07 23:50:09 +01:00
Alasdair G Kergon
761b524519 post-release 2013-10-04 14:41:32 +01:00
Alasdair G Kergon
04d9a52684 release 2.02.103
52 files changed, 598 insertions(+), 264 deletions(-)
2013-10-04 14:32:23 +01:00
Peter Rajnoha
a7ff7aee4f WHATS_NEW: renamed thin_pool_chunk_size_calculation -> policy 2013-10-04 12:36:32 +02:00
Alasdair G Kergon
baf95bbff7 cmdline: Add --ignoreskippedcluster.
Accept --ignoreskippedcluster with pvs, vgs, lvs, pvdisplay, vgdisplay,
lvdisplay, vgchange and lvchange to avoid the 'Skipping clustered
VG' errors when requesting information about a clustered VG
without using clustered locking and still exit with success.

The messages can still be seen with -v.
2013-10-01 21:20:10 +01:00
Peter Rajnoha
e4c7236c07 udev: fix 3min udev timeout so that it is applied for all LVM volumes
The timeout should be set before any volume skipping.
2013-09-27 15:37:16 +02:00
Jonathan Brassow
acdc731e83 RAID: Fix _sufficient_pes_free calculation for RAID
lib/metadata/lv_manip.c:_sufficient_pes_free() was calculating the
required space for RAID allocations incorrectly due to double
accounting.  This resulted in failure to allocate when available
space was tight.

When RAID data and metadata areas are allocated together, the total
amount is stored in ah->new_extents and ah->alloc_and_split_meta is
set.  '_sufficient_pes_free' was adding the necessary metadata extents
to ah->new_extents without ever checking ah->alloc_and_split_meta.
This often led to double accounting of the metadata extents.  This
patch checks 'ah->alloc_and_split_meta' to perform proper calculations
for RAID.

This error is only present in the function that checks for the needed
space, not in the functions that do the actual allocation.
2013-09-26 11:30:07 -05:00
Jonathan Brassow
d6516d2f79 WHATS_NEW: description for previous commit
commit 098896fb29 failed to include
description of what was fixed.

"Conversion from linear to mirror or RAID1 now honors
 mirror_segtype_default."
2013-09-25 22:35:52 -05:00
Peter Rajnoha
dd796d6a94 profile: add thin-performance.profile
Define a "performance" profile for thin pools which is exactly:
  - allocation/thin_pool_zero = 0
  - thin_pool_chunk_size_calculation = "performance"
2013-09-25 16:07:35 +02:00
Peter Rajnoha
8bf425005c conf: add allocation/thin_pool_chunk_size_calculation
Add allocation/thin_pool_chunk_size_calculation lvm.conf
option to select a method for calculating thin pool chunk
sizes and define two possible values - "default" and "performance".
2013-09-25 16:06:38 +02:00
Jonathan Brassow
5ded7314ae RAID: Fix broken allocation policies for parity RAID types
A previous commit (b6bfddcd0a) which
was designed to prevent segfaults during lvextend when trying to
extend striped logical volumes forgot to include calculations for
RAID4/5/6 parity devices.  This was causing the 'contiguous' and
'cling_by_tags' allocation policies to fail for RAID 4/5/6.

The solution is to remember that while we can compare
	ah->area_count == prev_lvseg->area_count
for non-RAID, we should compare
	(ah->area_count + ah->parity_count) == prev_lvseg->area_count
for a general solution.
2013-09-24 21:32:10 -05:00
Peter Rajnoha
6553f86818 lvmconf: use_lvmetad=0 on --enable-cluster, reset to default on --disable-cluster
lvmetad is not yet supported in clustered environment so
disable it automatically if using lvmconf --enable-cluster
and reset it to default value if using lvmconf --disable-cluster.

Also, add a few comments in lvm.conf about locking_type vs. use_lvmetad
if setting it for clustered environment.
2013-09-24 14:03:42 +02:00
Peter Rajnoha
f050278a35 tools: don't install separate command symlink for lvm devtypes 2013-09-24 09:35:20 +02:00
Alasdair G Kergon
11dc6a03c4 lvs: Add seg_size_pe field.
Requested
https://www.redhat.com/archives/linux-lvm/2013-July/msg00112.html
2013-09-23 21:50:14 +01:00
Alasdair G Kergon
7233e584ad pvmove: Accept PE ranges as start+length. 2013-09-23 19:50:34 +01:00
Alasdair G Kergon
bbcc120e5a pvmove: clean exit on failed pvmove restart
At present, before the pvmove command can be used to restart pvmove
polling, the LVs concerned need to be activated e.g. with lvchange -ay.
2013-09-23 19:46:28 +01:00
Alasdair G Kergon
229e0752f1 post-release 2013-09-23 15:55:11 +01:00
Alasdair G Kergon
c8057aec36 release 2.02.102
18 files changed, 137 insertions(+), 203 deletions(-)
2013-09-23 15:43:37 +01:00
Christine Caulfield
431eda63cc clvmd: Fix node up/down handing in corosync module
The corosync cluster interface for clvmd did not correctly
deal with node up/down events so that when a node was removed
from the cluster clvmd would prevent remote operations
from happening, as it thought the node was up but not
running clvmd.

This patch fixes that code by simplifying the case to node
being  up or down - which was the original intention
and is supported by pacemaker and CPG in the higher layers.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
2013-09-23 13:23:00 +01:00
Zdenek Kabelac
ebf66ac316 Makefile: add missing deps
Add missing deps for device-mapper build of scripts dir.
Cleanup multiple SUBDIR lines together.
2013-09-23 12:13:51 +02:00
Zdenek Kabelac
3b604e5c8e lvinfo: allow to use lv_info with NULL info
When NULL info struct is passed in - function is usable
as a quick query for  lv_is_active_locally() - with a bonus
we may query for layered device.

So it could be seen as a more efficient lv_is_active_locally().
2013-09-23 12:13:06 +02:00
Alasdair G Kergon
bd75844024 release 2.02.101
112 files changed, 4131 insertions(+), 1312 deletions(-)
2013-09-20 13:56:29 +01:00
Alasdair G Kergon
6e912d949b tools: Avoid overflow in _get_int_arg.
Use strtoull instead of strtol so that argument size is not cut
to 31 bytes on machines with 32-bit long.

(Mikulas)
2013-09-18 01:16:48 +01:00
Alasdair G Kergon
a3a5f58c21 reporting: Add devtypes command.
Add internal devtypes reporting command to display built-in recognised
block device types.  (The output does not include any additional
types added by a configuration file.)

> lvm devtypes -o help
  Device Types Fields
  -------------------
    devtype_all            - All fields in this section.
    devtype_name           - Name of Device Type exactly as it appears in /proc/devices.
    devtype_max_partitions - Maximum number of partitions. (How many device minor numbers get reserved for each device.)
    devtype_description    - Description of Device Type.

> lvm devtypes
  DevType       MaxParts Description
  aoe                 16 ATA over Ethernet
  ataraid             16 ATA Raid
  bcache               1 bcache block device cache
  blkext               1 Extended device partitions
...
2013-09-18 01:09:15 +01:00
Jonathan Brassow
d1bcb21e02 WHATS_NEW: Better description for commit 82228ac
More correct description of changes made to disallow thin+mirror.
2013-09-16 15:37:48 -05:00
Alasdair G Kergon
36c5bb40a2 Makefiles: Fix CC variable override.
The CC override in commit f42b2d4bbf
caused the built-in value to be used instead of the configured value
when it wasn't being overridden.

The behaviour is explained here:
	http://stackoverflow.com/questions/18007326/how-to-change-default-values-of-variables-like-cc-in-makefile
2013-09-16 19:57:14 +01:00
Alasdair G Kergon
97ba18f4cb filters: Add bcache.
N.B. Using bcache devices as PVs is still experimental.
Problems should be reported to the appropriate mailing lists.
2013-09-16 16:56:55 +01:00
Peter Rajnoha
9742c5192e systemd: run lvm2-activation-net.service after lvm2-activation.service
The lvm2-activation-net.service was ordered only with respect to iscsi
and fcoe service before. In addition to that, we also need ordering
with respect to lvm2-activation.service to prevent parallel vgchange -aay
runs which may cause some problems during activation.
See also https://bugs.gentoo.org/show_bug.cgi?id=480066.

With this patch, the ordering is firmly set to:
lvm2-activation-early.service -> lvm2-activation.service -> lvm2-activation-net.service

Thanks to Alexander Tsoy for the original patch (modified a bit here):
https://www.redhat.com/archives/lvm-devel/2013-September/msg00049.html
2013-09-16 11:47:09 +02:00
Peter Rajnoha
e2268aeb92 WHATS_NEW: some missing lines 2013-09-12 14:16:54 +02:00
Zdenek Kabelac
2a6abcb80a tests: singlenode updates
Add more 'realistic' simulation of dlm locking.
Previous version was not capable to maintain multiple locks.
Current version doesn't handle multiqueues for locks,
so the ordering is different.
2013-09-12 10:40:39 +02:00
Jonathan Brassow
82228acfc9 Mirror/Thin: Disallow thinpools on mirror logical volumes
The same corner cases that exist for snapshots on mirrors exist for
any logical volume layered on top of mirror.  (One example is when
a mirror image fails and a non-repair LVM command is the first to
detect it via label reading.  In this case, the LVM command will hang
and prevent the necessary LVM repair command from running.)  When
a better alternative exists, it makes no sense to allow a new target
to stack on mirrors as a new feature.  Since, RAID is now capable of
running EX in a cluster and thin is not active-active aware, it makes
sense to pair these two rather than mirror+thinpool.

As further background, here are some additional comments that I made
when addressing a bug related to mirror+thinpool:
(https://bugzilla.redhat.com/show_bug.cgi?id=919604#c9)
I am going to disallow thin* on top of mirror logical volumes.
Users will have to use the "raid1" segment type if they want this.

This bug has come down to a choice between:
1) Disallowing thin-LVs from being used as PVs.
2) Disallowing thinpools on top of mirrors.

The problem is that the code in dev_manager.c:device_is_usable() is unable
to tell whether there is a mirror device lower in the stack from the device
being checked.  Pretty much anything layered on top of a mirror will suffer
from this problem.  (Snapshots are a good example of this; and option #1
above has been chosen to deal with them.  This can also be seen in
dev_manager.c:device_is_usable().)  When a mirror failure occurs, the
kernel blocks all I/O to it.  If there is an LVM command that comes along
to do the repair (or a different operation that requires label reading), it
would normally avoid the mirror when it sees that it is blocked.  However,
if there is a snapshot or a thin-LV that is on a mirror, the above code
will not detect the mirror underneath and will issue label reading I/O.
This causes the command to hang.

Choosing #1 would mean that thin-LVs could never be used as PVs - even if
they are stacked on something other than mirrors.

Choosing #2 means that thinpools can never be placed on mirrors.  This is
probably better than we think, since it is preferred that people use the
"raid1" segment type in the first place.  However, RAID* cannot currently
be used in a cluster volume group - even in EX-only mode.  Thus, a complete
solution for option #2 must include the ability to activate RAID logical
volumes (and perform RAID operations) in a cluster volume group.  I've
already begun working on this.
2013-09-11 15:58:44 -05:00
Peter Rajnoha
1f4bc637b4 WHATS_NEW: one more for commit 8d1d835 2013-09-11 13:16:36 +02:00
Peter Rajnoha
fe227d14a5 WHATS_NEW: for commit 8d1d8350 and 72a9d4f 2013-09-11 13:01:01 +02:00
Jonathan Brassow
2691f1d764 RAID: Make RAID single-machine-exclusive capable in a cluster
Creation, deletion, [de]activation, repair, conversion, scrubbing
and changing operations are all now available for RAID LVs in a
cluster - provided that they are activated exclusively.

The code has been changed to ensure that no LV or sub-LV activation
is attempted cluster-wide.  This includes the often overlooked
operations of activating metadata areas for the brief time it takes
to clear them.  Additionally, some 'resume_lv' operations were
replaced with 'activate_lv_excl_local' when sub-LVs were promoted
to top-level LVs for removal, clearing or extraction.  This was
necessary because it forces the appropriate renaming actions the
occur via resume in the single-machine case, but won't happen in
a cluster due to the necessity of acquiring a lock first.

The *raid* tests have been updated to allow testing in a cluster.
For the most part, this meant creating devices with '-aey' if they
were to be converted to RAID.  (RAID requires the converting LV to
be EX because it is a condition of activation for the RAID LV in
a cluster.)
2013-09-10 16:33:22 -05:00
Zdenek Kabelac
f5832d8c49 deactivate: drop readahead calc in deactivation
Skip readahead when device will be deactivated.
2013-09-07 09:13:20 +02:00
Zdenek Kabelac
0670bfeb59 thin: validation catch multiseg thin pool/volumes
Multisegment thin pools and volumes are not supported.
Catch such error code path early.
2013-09-07 03:32:07 +02:00
Zdenek Kabelac
655296609e thin: fix monitoring of thin pool volume
Properly skip unmonitoring of thin pool volume in deactivation code
path. Code makes sure if there is just any thin pool user
it stays monitored with all its resources.
2013-09-07 03:31:04 +02:00
Zdenek Kabelac
4c001a7854 thin: fix resize of stacked thin pool volume
When the pool is created from non-linear target the more complex rules
have to be used and stacking needs to properly decode args for _tdata
LV. Also proper allocation policies are being used according to those
set in lvm2 metadata for data and metadata LVs.

Also properly check for active pool and extra code to active it
temporarily.

With this fix it's now possible to use:

lvcreate -L20 -m2 -n pool vg  --alloc anywhere
lvcreate -L10 -m2 -n poolm vg --alloc anywhere
lvconvert --thinpool vg/pool --poolmetadata vg/poolm

lvresize -L+10 vg/pool
2013-09-07 03:24:48 +02:00
Alasdair G Kergon
96880102a3 logging: Write Completed message before resetting. 2013-09-06 01:47:41 +01:00
Jonathan Brassow
cc66dedc0e pvmove: Skip pvmove of RAID, thin, snapshot, origin, and mirror LVs in cluster
pvmove of the above types should only have been enabled in single machine
mode.
2013-09-03 13:17:01 -05:00
Peter Rajnoha
3b51f298bb reinstate: commit 82d83a01ce
It now works as supposed. The source of the problem is fixed
by previous commit d2d6a9da52e04f28e1916bcea3f9fda356b6df29.
2013-09-03 16:49:21 +02:00
Peter Rajnoha
008c33a21b tools: add -b/--background for pvscan --cache -aay
Udev daemon has recently introduced a limit on the number of udev
processes (there was no limit before). This causes a problem
when calling pvscan --cache -aay in lvmetad udev rules which
is supposed to activate the volumes. This activation is itself
synced with udev and so it waits for the activation to complete
before the pvscan finishes. The event processing can't continue
until this pvscan call is finished.

But if we're at the limit with the udev process count, we can't
instatiate any more udev processes, all such events are queued
and so we can't process the lvm activation event for which the
pvscan is waiting.

Then we're in a deadlock since the udev process with the
pvscan --cache -aay call waits for the lvm activation udev
processing to complete, but that will never happen as there's
this limit hit with the number of udev processes.

The process with pvscan --cache -aay actually times out eventually
(3min or 30sec, depends on the version of udev).

This patch makes it possible to run the pvscan --cache -aay
in the background so the udev processing can continue and hence
we can avoid the deadlock mentioned above.
2013-09-03 16:49:21 +02:00
Peter Rajnoha
44c1a02c18 revert: commit 82d83a01ce
The commit 82d83a01ce
"autoactivation: refresh existing VG before autoactivation"
causes problems (dangling udev_sync cookies, slow processing
of the pvscan --cache --major --minor call from udev rules)
when the autoactivation handler is run in parallel on
several PVs that belong to the same VG. Revert this patch
until the exact source of the problem is found and then
properly fixed and handled.
2013-09-02 13:53:27 +02:00
Alasdair G Kergon
78647da1c6 toolcontext: Only reopen stdin if readable.
Don't fail when running lvm commands under versions of nohup that set
up stdin as O_WRONLY!
2013-08-28 23:55:14 +01:00
Alasdair G Kergon
c0f987949b activation: Fix segfault with inactive pvmove LV.
Set flag to avoid recursion back through an inactive pvmove LV when
populating deptree.
2013-08-28 22:56:23 +01:00
Peter Rajnoha
0acd7173d1 systemd: lvm2-activation-generator: remove default dir if args not specified and require all args to be given
Remove default "/tmp" as destination directory if no args
specified for lvm2-activation-generator. Require all the
args to be specified directly for proper functionality.
2013-08-28 16:06:51 +02:00
Jonathan Brassow
2ef48b91ed pvmove: Allow moving snapshot/origin. Disallow converting and merging LVs
The patch allows the user to also pvmove snapshots and origin logical
volumes.  This means pvmove should be able to move all segment types.
I have, however, disallowed moving converting or merging logical volumes.
2013-08-26 16:36:30 -05:00
Jonathan Brassow
caa77b33f2 pvmove: Fix inability to specify LV name when moving RAID, mirror, or thin LV
Top-level LVs (like RAID, mirror or thin) are ignored when determining which
portions of an LV to pvmove.  If the user specified the name of an LV to
move and it was one of the above types, it would be skipped.  The code would
never move on to check whether its sub-LVs needed moving because their names
did not match what the user specified.

The solution is to check whether a sub-LVs is part of the LV whose name was
specified by the user - not just if there was a name match.
2013-08-26 14:12:31 -05:00
Peter Rajnoha
d34ab5e0d3 WHATS_NEW: for 4d3b5724e0 2013-08-26 15:52:15 +02:00
Zdenek Kabelac
6b416f837f thin: support lvchange for data and metadata
Support lvchange operation on stacked thin pool data and metadata
volumes.
2013-08-26 14:55:22 +02:00
Jonathan Brassow
c59167ec13 pvmove: Add support for RAID, mirror, and thin
This patch allows pvmove to operate on RAID, mirror and thin LVs.
The key component is the ability to avoid moving a RAID or mirror
sub-LV onto a PV that already has another RAID sub-LV on it.
(e.g. Avoid placing both images of a RAID1 LV on the same PV.)

Top-level LVs are processed to determine which PVs to avoid for
the sake of redundancy, while bottom-level LVs are processed
to determine which segments/extents to move.

This approach does have some drawbacks.  By eliminating whole PVs
from the allocation list, we might miss the opportunity to perform
pvmove in some senarios.  For example, if we have 3 devices and
a linear uses half of the first, a RAID1 uses half of the first and
half of the second, and a linear uses half of the third (FIGURE 1);
we should be able to pvmove the first device (FIGURE 2).
	FIGURE 1:
        [ linear ] [ -RAID- ] [ linear ]
        [ -RAID- ] [        ] [        ]

	FIGURE 2:
        [  moved ] [ -RAID- ] [ linear ]
        [  moved ] [ linear ] [ -RAID- ]
However, the approach we are using would eliminate the second
device from consideration and would leave us with too little space
for allocation.  In these situations, the user does have the ability
to specify LVs and move them one at a time.
2013-08-23 08:57:16 -05:00
Peter Rajnoha
99fe3b88d2 systemd: lvm2-activation-generator: report only error otherwise be silent
Do not print success status for lvm2-activation-generator:

  "LVM: Activation generator successfully completed."
  "LVM: Logical Volume autoactivation enabled." (if use_lvmetad=1)

Though this information is quite useful during boot, it may
be confusing for users if it happens anytime later and it
actually happens if systemd reloads. This is usually on package
update to update the systemd state and load any new units that are
newly installed in the system. The systemd reload is global and
so any existing generators are rerun at that moment too.
2013-08-22 08:27:51 +02:00
Peter Rajnoha
c8daa15270 filter-mpath: remove superfluous error message about mpath major not equal to dm major
This is a regression caused by commit 3bd9048854.
The error message added with that commit "mpath major %d is not dm major %d" is
superfluous.

When scanning for mpath components, we're looking for a parent device.
But this parent device is not necessarily an mpath device (so the dm device)
if it exists - it can be any other device layered on top (e.g. an MD RAID device).
2013-08-21 14:07:01 +02:00
Jonathan Brassow
f0be9ac904 cmirrord: Prevent secondary checkpoints from corrupting bitmaps
The bug addressed by this patch manifested itself during testing
by showing a mirror that never became 'in-sync' after creation.
The bug is isolated to distributions that do not have support
for openAIS checkpointing (i.e. > RHEL6, > F16).

When a node joins a group that is managing a mirror log, the other
machines in the group send it a checkpoint representing the current
state of the bitmap.  More than one machine can send a checkpoint,
but only the initial one should be imported.  Once the bitmap state
has been imported from the initial checkpoint, operations (such
as resync, mark, and clear operations) can begin.  When subsequent
checkpoints are allowed to be imported, it has the effect of erasing
all the log operations between the initial checkpoint and the ones
that follow.

When cmirrord was updated to handle the absence of openAIS
checkpointing (commit 62e38da133),
the new import_checkpoint() function failed to honor the 'no_read'
parameter.  This parameter was designed to avoid reading all but
the initial checkpoint.  Honoring this parameter has solved the
issue of corrupting bitmap data with secondary checkpoints.
2013-08-20 13:21:09 -05:00
Peter Rajnoha
cac49725c9 udev: fix lvmetad rules to not ignore loop device configuration
If loop device is first configured on systems where /dev/loop-control
is used to dynamically create the loop device itself, there's an
ADD+CHANGE even generated. But next time the existing /dev/loop[0-9]*
is reused, there's only a CHANGE event since the device representing
it is already present in kernel (so no ADD event in this case).

We can't ignore this CHANGE event for loop devices! This is a regression
caused by 756bcabbfe. We already had
a similar problem with MD devices which was fixed by
2ac217d408 (but that one was
only an intra-release fix).
2013-08-16 15:45:00 +02:00
Michael Stapelberg
8cbbe851a8 systemd: use LVM_PATH instead of hardcoded value in activation generator 2013-08-15 09:59:19 +02:00
Peter Rajnoha
82d83a01ce autoactivation: refresh existing VG before autoactivation
When autoactivating a VG, there could be an existing VG with exactly
the same PV UUIDs. The PVs could be reappeared after previous
loss/disconnect (for example disconnecting and reconnecting iscsi).

Since there's no "autodeactivation" yet, the mappings for the LVs
from the VG were left in the system even if the device was disconnected.
These mappings also hold the major:minor of the underlying device.
So if the device reappears, it is assigned a different major:minor
pair (...and kernel name). We need to cope with this during
autoactivation so any existing mappings are corrected for any changes.
The VG refresh does that (the vgchange --refresh functionality) -
call this before VG autoactivation.

(If the VG does not exist yet, the VG refresh is NOP)
2013-08-14 14:04:58 +02:00
Peter Rajnoha
fcbb34bdcc WHATS_NEW: for 0da72743ca 2013-08-14 10:18:02 +02:00
Alasdair G Kergon
80bcdb93ff filters: check for mpath before opening devs
Split out the partitioned device filter that needs to open the device
and move the multipath filter in front of it.

When a device is multipathed, sending I/O to the underlying paths may
cause problems, the most obvious being I/O errors visible to lvm if a
path is down.

Revert the incorrect <backtrace> messages added when a device doesn't
pass a filter.

Log each filter initialisation to show sequence.

Avoid duplicate 'Using $device' debug messages.
2013-08-13 23:26:58 +01:00
Alasdair G Kergon
1a1d3a10ff vgchange: require confirmation with -c and no VGs
Too many people have been running 'vgchange -cy' by mistake
so add a confirmation prompt.  Use --yes to bypass this.
2013-08-13 18:20:11 +01:00
Peter Rajnoha
fd7cac15bc WHATS_NEW: be more precise 2013-08-13 18:25:54 +02:00
Peter Rajnoha
e166c00ac6 WHATS_NEW: one more for a85439 2013-08-13 18:16:05 +02:00
Peter Rajnoha
268b370e24 blkdeactivate: add support for bind mounts
Recent version of util-linux/umount (v2.23+) provides
umount --all-targets that can unmount all the mount targets of
the same device (the bind mounts). Use this if available when
calling the umount blkdeactivate.

Otherwise, for older versions of util-linux, use findmnt
(that is also a part of the util-linux) to iterate over all
mount targets of the same device - this is the manual way.
2013-08-13 17:51:40 +02:00
Peter Rajnoha
a854398764 blkdeactivate: change the way blkdeactivate reports status
The blkdeactivate now suppresses error messages from external
tools that are called. Instead, only a summary message "done"
or "skipped" is issued by blkdeactivate as any error in calling
the external tool (e.g. unmounting or deactivating a device) causes
the device to be skipped and the blkdeactivate continues with the
next device in the tree.

Add new -e/--errors switch to display any error messages from
external tools.

Also, suppress any output given by the external tools and add
new -v/--verbose switch to display it including the verbose
output of the tools called (this will enable error reporting
as well).

Also add blkdeactivate -vv for even more debug (the script's debug).
2013-08-13 17:51:23 +02:00
Alasdair G Kergon
32148369d1 post-release 2013-08-13 11:54:48 +01:00
Alasdair G Kergon
297907899c release 2.02.100
84 files changed, 1540 insertions(+), 442 deletions(-)

Mostly bug fixes this time.

Also note:
  md raid replaces dm mirroring as the default implementation.
  Can call out to thin_repair to fix thin metadata.
  Improved clvmd error detection/debugging information.
2013-08-13 11:29:21 +01:00
Jonathan Brassow
abc89422af Mirror: Fix inability to remove VG's cluster flag if it contains a mirror
According to bug 995193, if a volume group
	1) contains a mirror
	2) is clustered
	3) 'locking_type' = 0 is used
then it is not possible to remove the 'c'luster flag from the VG.  This
is due to the way _lv_is_active behaves.

We shouldn't allow the cluster flag to be flipped unless the mirrors in
the cluster are not active.  This is because different kernel modules
are used depending on whether a mirror is cluster or not.  When we
attempt to see if the mirror is active, we first check locally.  If it
is not, then we attempt to check for remotely active instances if the VG
is clustered.  Since the no_lock locking type is LCK_CLUSTERED, but does
not implement 'query_resource', remote_lock_held will always return an
error in this case.  An error from remove_lock_held is treated as though
the lock _is_ held (i.e. the LV is active remotely).  This blocks the
cluster flag from changing.

The solution is to implement 'query_resource' for the no_lock type.  It
will report a message and return 1.  This will allow _lv_is_active to
function properly.  The LV would be considered not active remotely and
the VG can change its flag.
2013-08-12 13:56:47 -05:00
Alasdair G Kergon
28760275e6 logging: tidy log_sys_error when string empty 2013-08-12 18:40:41 +01:00
Jonathan Brassow
cba228f856 WHATSNEW: typo 2013-08-09 17:17:53 -05:00
Jonathan Brassow
8615234c0f RAID: Fix bug making lvchange unable to change recovery rate for RAID
1) Since the min|maxrecoveryrate args are size_kb_ARGs and they
   are recorded (and sent to the kernel) in terms of kB/sec/disk,
   we must back out the factor multiple done by size_kb_arg.  This
   is already performed by 'lvcreate' for these arguments.
2) Allow all RAID types, not just RAID1, to change these values.
3) Add min|maxrecoveryrate_ARG to the list of 'update_partial_unsafe'
   commands so that lvchange will not complain about needing at
   least one of a certain set of arguments and failing.
4) Add tests that check that these values can be set via lvchange
   and lvcreate and that 'lvs' reports back the proper results.
2013-08-09 17:09:47 -05:00
Zdenek Kabelac
e583ff3d2c thin: thin pool can't be external origin
Avoid trying to convert thin-pool to external origin.
2013-08-09 23:04:30 +02:00
Peter Rajnoha
2f61478436 workaround: gcc v4.8 on 32 bit param. passing bug when -02 opimization used
gcc -O2 v4.8 on 32 bit architecture is causing a bug in parameter
passing. It does not happen with -01 nor -O0.

The problematic part of the code was strlen use in config.c in
the config_def_check fn and the call for _config_def_check_tree in it:

<snip>
  rplen = strlen(rp);
  if (!_config_def_check_tree(handle, vp, vp + strlen(vp), rp, rp + rplen, CFG_PATH_MAX_LEN - rplen, cn, cmd->cft_def_hash)) ...
</snip>

If compiled with -O0 (correct):

Breakpoint 1, config_def_check (cmd=0x819b050, handle=0x81a04f8) at config/config.c:775
(gdb) p	vp
$1 = 0x8189ee0 <_cfg_path> "config"
(gdb) p	strlen(vp)
$2 = 6
(gdb)
_config_def_check_tree (handle=0x81a04f8, vp=0x8189ee0 <_cfg_path>
"config", pvp=0x8189ee6 <_cfg_path+6> "", rp=0xbfffe1e8 "config",
prp=0xbfffe1ee "", buf_size=58, root=0x81a2568, ht=0x81a65
48) at config/config.c:680
(gdb) p	vp
$4 = 0x8189ee0 <_cfg_path> "config"
(gdb) p	pvp
$5 = 0x8189ee6 <_cfg_path+6> ""

If compiled with -O2 (incorrect):

Breakpoint 1, config_def_check (cmd=cmd@entry=0x8183050, handle=0x81884f8) at config/config.c:775
(gdb) p	vp
$1 = 0x8172fc0 <_cfg_path> "config"
(gdb) p strlen(vp)
$2 = 6
(gdb) p	vp + strlen(vp)
$3 = 0x8172fc6 <_cfg_path+6> ""
(gdb)
_config_def_check_tree (handle=handle@entry=0x81884f8, pvp=0x8172fc7
<_cfg_path+7> "host_list", rp=rp@entry=0xbffff190 "config",
prp=prp@entry=0xbffff196 "", buf_size=buf_size@entry=58, ht=0x
818e548, root=0x818a568, vp=0x8172fc0 <_cfg_path> "config") at
config/config.c:674
(gdb) p	pvp
$4 = 0x8172fc7 <_cfg_path+7> "host_list"

The difference is in passing the "pvp" arg for _config_def_check_tree.
While in the correct case, the value of _cfg_path+6 is passed
(the result of vp + strlen(vp) - see the snippet of the code above),
in the incorrect case, this value is increased by 1 to _cfg_path+7,
hence totally malforming the string that is being processed.

This ends up with incorrect validation check and incorrect warning
messages are issued like:

 "Configuration setting "config/checks" has invalid type. Found integer, expected section."

To workaround this issue, remove the "static" qualifier from the
"static char _cfg_path[CFG_PATH_MAX_LEN]". This causes the optimalizer
to be less aggressive (also shuffling the arg list for
_config_def_check_tree call helps).
2013-08-09 13:24:50 +02:00
Peter Rajnoha
8d3347f70b WHATS_NEW: entry for 19baf84290 2013-08-08 10:04:53 +02:00
Jonathan Brassow
68c2d352ec WHATS_NEW: update WHATS_NEW for previous commit 2013-08-07 17:51:21 -05:00
Jonathan Brassow
b15278c3dc Mirror/RAID1: When up|down-converting default to segtype of current LV
If there is no RAID support in the kernel but the default mirror
segtype is "raid1", converting legacy mirrors can be problematic.
For example, changing the log type or converting a mirror to a linear
LV does not require the RAID modules to be present.  However, because
lp->segtype is set to be RAID1 by the configuration file, the command
fails.

We should only be setting lp->segtype when converting mirrors if it is
going to change (e.g. to linear or between mirror types).
2013-08-07 16:01:45 -05:00
Jonathan Brassow
7e1083c985 RAID: Make "raid1" the default mirror segment type 2013-08-06 14:13:55 -05:00
Zdenek Kabelac
003f08c164 clogd: fix descriptor leak when daemonzing 2013-08-06 16:21:51 +02:00
Zdenek Kabelac
7b1315411f clmvd: fix decriptor leak on restart
Do not leave descriptor used for dup2() openned.
2013-08-06 16:20:36 +02:00
Zdenek Kabelac
f6dd5a294b exec: pipe open
Function replaces popen() system and avoids shell execution
and argument parsing (no surprices).
2013-08-06 16:18:43 +02:00
Peter Rajnoha
61e7dc833c WHATS_NEW: previous commit 2013-08-06 14:03:43 +02:00
Peter Rajnoha
e195b5227e thin: apply VG profile if creating a new thin pool
When creating a new thin pool and there's no profile requested
via "lvcreate --profile ...", inherit any VG profile if it's attached.

Currently this applies to these settings:
  allocation/thin_pool_chunk_size
  allocation/thin_pool_discards
  allocation/thin_pool_zero
2013-08-06 11:42:40 +02:00
Peter Rajnoha
1cdd563b6c WHATS_NEW: move line to WHATS_NEW_DM 2013-08-06 11:42:01 +02:00
Jonathan Brassow
5ca54c4f0b dmeventd: Fix memory leak
When creating a timeout thread for snapshots, the thread is not
tracked and thus never joined.  This means that the exit status
of the timeout thread is held indefinitely.  Saves a bit of
memory to set PTHREAD_CREATE_DETACHED when creating this thread.

I've also added pthread_attr_init|destroy to setup the creation
pthread_attr_t.

Reported-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
2013-07-31 15:23:13 -05:00
Zdenek Kabelac
de0cba0e2d thin: initial --repair support for pools
Initial basic support for repair.
It currently takes pool metadata spare volume, which
is used for recovery.  New spare is created if the volume
is successfuly repaired.

After the operation the previous _tmeta volume is moved
into  _tmeta%d volume and if everything is ok, this volume
could be removed.
New _tmeta needs to be pvmoved to proper place and also
converted to i.e. mirror if it should be mirrored.

Later version will try to automate some steps here.
2013-07-31 15:32:36 +02:00
Zdenek Kabelac
22fc80982a thin: add thin_repair and thin_dump options
Add new configure lvm.conf options for binaries thin_repair
and thin_dump.

Those are part of device-mapper-persistent-data package
and will be used for recovery of thin_pool.
2013-07-31 15:30:47 +02:00
Zdenek Kabelac
ea605d1ec7 thin: metadata resize needs 1.9 version
Version 1.8 is not yet fully usable for metadata resize.
2013-07-31 15:29:27 +02:00
Alasdair G Kergon
b6bfddcd0a alloc: fix lvextend when stripe number varies
The PREFERRED allocation mechanism requires the number of areas in the
previous LV segment to match the number in the new segment being
allocated.  If they do not match, the code may crash.
  E.g. https://bugzilla.redhat.com/989347

Introduce A_AREA_COUNT_MATCHES and when not set avoid referring
to the previous segment with the contiguous and cling policies.
2013-07-29 19:35:45 +01:00
Peter Rajnoha
ecc9f74988 filters: fix segfault on incorrect global_filter
When using a global_filter and if this filter is incorrectly
specified, we ended up with a segfault:

  raw/~ $ pvs
    Invalid filter pattern "r|/dev/sda".
  Segmentation fault (core dumped)

In the example above a closing '|' character is missing at the end
of the regex. The segfault itself was caused by trying to destroy
the same filter twice in _init_filters fn within the error path
(the "bad" goto target):

bad:
        if (f3)
                f3->destroy(f3);
        if (f4)
                f4->destroy(f4);

Where f3 is the composite filter (sysfs + regex + type + md + mpath filter)
and f4 is the persistent filter which encompasses this composite filter
within persistent filter's 'real' field in 'struct pfilter'.

So in the end, we need to destroy the persistent filter only as
this will also destroy any 'real' filter attached to it.
2013-07-26 13:04:53 +02:00
Alasdair G Kergon
06dce7d539 post-release 2013-07-25 00:38:53 +01:00
Alasdair G Kergon
76e617b158 release 2.02.99
363 files changed, 19863 insertions(+), 9055 deletions(-)

This is a very large release - so expect bugs!

Please treat this release like a release candidate.
Changes to the external interfaces since 2.02.98 are not yet frozen.

Updated releases will follow quickly (days not weeks) as any problems
are handled.
2013-07-24 23:59:03 +01:00
Alasdair G Kergon
d13e87b9ef cleanup: comments and a message 2013-07-24 22:10:37 +01:00
Zdenek Kabelac
5597dc3652 thin: not zeroing for non-zeroed thin pool snaps
Do not zero initial 4KB of thin snapshot volume for thin pool with
disabled zeroing.
2013-07-24 01:15:31 +02:00
Peter Rajnoha
31de670318 lvconvert: add more checks for lvconvert --type
The --type mirror requires -m/--mirrrors:

  lvconvert --type mirror vg/lvol0
    --type mirror requires -m/--mirrors
    Run `lvconvert --help' for more information.

The --type raid* is allowed (the checks already existed):

  lvconvert --type raid10 vg/lvol0
    Converting the segment type for vg/lvol0 from linear to raid10 is not yet supported.

The --type snapshot is a synonym to -s/--snapshot:

  lvconvert -s vg/lvol0 vg/lvol1
    Logical volume lvol1 converted to snapshot.

  lvconvert --type snapshot vg/lvol0 vg/lvol1
    Logical volume lvol1 converted to snapshot.

All the other segment types are not supported, e.g.:

  lvconvert --type zero vg/lvol0
    Conversion using --type zero is not supported.
    Run `lvconvert --help' for more information.
2013-07-23 17:13:54 +02:00
Peter Rajnoha
9b1834f075 man: pvs -o ba_start,ba_size -> pv_ba_start,pv_ba_size 2013-07-23 14:45:30 +02:00
Peter Rajnoha
ea333a894e systemd: lvm2-activation-generator: use LOG_DEBUG/ERR severity for kmsg 2013-07-22 14:04:12 +02:00
Alasdair G Kergon
ccc29f17b6 cmdline: support ARG_GROUPABLE in merge_synonym 2013-07-19 20:37:43 +01:00
Alasdair G Kergon
90a09559ed commandline: add prefix aliases for raid options
Accept --raidwritemostly as well as --writemostly etc.
2013-07-19 19:24:54 +01:00
Jonathan Brassow
4eea660191 RAID: Fix segfault when reporting raid_syncaction field on older kernel
The status printed for dm-raid targets on older kernels does not include
the syncaction field.  This is handled by dev_manager_raid_status() just
fine by populating the raid status structure with NULL for that field.
However, lv_raid_sync_action() does not properly handle that field being
NULL.  So, check for it and return 0 if it is NULL.
2013-07-19 10:01:48 -05:00
Alasdair G Kergon
da79fe4c1d reporting: tidy recent new fields
Add underscores and prefixes to recently-added fields.
(Might add more alias functionality in future.)
2013-07-19 01:30:02 +01:00
Alasdair G Kergon
357df34133 display: fix units for sizes <1k 2013-07-18 17:55:58 +01:00
Zdenek Kabelac
460d0254eb thin: add pool metadata spare lv support
Add support for pool's metadata spare volume.
2013-07-18 18:22:43 +02:00
Zdenek Kabelac
08df7ba844 thin: improve pool creation activation order
Pool creation involves clearing of metadata device
which triggers udev watch rule we cannot udev synchronize with
in current code.

This metadata devices needs to be activated localy,
so in cluster mode deactivation and reactivation
is always needed.

However for non-clustered mode we may reload table
via suspend/resume path which avoids collision with
udev watch rule which was occasionaly triggering
retry deactivation loop.

Code has been also split into 2 separate code paths
for thin pools and thin volumes which improved readability
of the code as well.

Deactivation has been moved out of extend_pool() and
decision is now in _lv_create_an_lv() which knows
the change mode.
2013-07-18 18:22:43 +02:00
Zdenek Kabelac
4e724f5f52 thin: for thin volumes properly list modules
thin volume needs   thin-pool and  thin kernel modules so print them
both for   lvs -o+modules
2013-07-18 18:22:43 +02:00
Zdenek Kabelac
7afa9cebcb thin: fix error path in creation path
Remove some calls to revert_new_lv when no LV has been created/commited so far.
When the pool update failed - then only revert is needed.
2013-07-18 18:22:43 +02:00
Zdenek Kabelac
7b4b97b731 snapshot: local activation for clear COW device
To clear snapshot cow device in cluster enforce local
activation here.
2013-07-18 18:22:43 +02:00
Peter Rajnoha
8606bf316a systemd: generator: add lvm2-activation-net.service
The new lvm2-activation-net.service activates LVM volumes
after network-attached devices are set up (iSCSI and FCoE)
if lvmetad is disabled and hence the autoactivation is not
used.
2013-07-17 16:54:52 +02:00
Zdenek Kabelac
57be501aa3 dev_manager: lower memory usage
Created dlid for test is not needed afterward, so lower a memory
usage of this call is repeatedly used for building some large tree.

TODO: create function to use given buffer on stack as much cheaper.
2013-07-15 15:59:20 +02:00
Zdenek Kabelac
0443c42e3b thin: add sub volumes as whole volumes
Do not use origin_only when add log_lv and metadata as a subvolume.
The stacked volume needs to access whole volume in this case.
2013-07-15 15:58:07 +02:00
Zdenek Kabelac
97d36d5750 thin: check and use layered origin lv
Code needs to check if the layer origin device is suspended,
It's valid to create  thinvolume snapshot of thinvolume which is also
used as an old-style snapshot. In this case we need to check -real
is suspended.

When adding origin_only - add only layer thin volume.
(in case it's also old-snapshot add only -real device)
2013-07-15 15:51:39 +02:00
Zdenek Kabelac
925701d9f3 thin: write back when command successfully finished
Remove backup() call from update_pool_lv() as it's been there
duplicated and preperly order backup() call after lvresize,
so there is just one such call.
2013-07-15 15:48:32 +02:00
Zdenek Kabelac
42881c8877 thin: send messages to active pool
If the thin pool is known to be active, messages can be passed
to the pool even when the created thin volume is not going to be
activated.

So we do not need to stack large list of message and validate
and catch creation errors earlier in this case.

Replace the test for valid activation combination with simpler list of
deactivation combinations.
2013-07-15 15:47:25 +02:00
Peter Rajnoha
9a44ba94a5 WHATS_NEW: support for LV activation skip flag 2013-07-12 21:37:58 +02:00
Peter Rajnoha
953a438e93 dumpconfig: add --type profilable
The --type profilable shows all config settings that
are customizable by profiles:

  raw/~ $ lvm dumpconfig --type profilable
  allocation {
	  thin_pool_zero=1
	  thin_pool_discards="passdown"
	  thin_pool_chunk_size=64
  }
  activation {
	  thin_pool_autoextend_threshold=100
	  thin_pool_autoextend_percent=20
  }
2013-07-09 10:00:47 +02:00
Peter Rajnoha
5ed7d0cf1d dumpconfig: add --mergedconfig option
Normally, the lvm dumpconfig processes only the configuration tree
that is at the top of the cascade. Considering the cascade is:

  CONFIG_STRING -> CONFIG_PROFILE -> CONFIG_MERGED_FILES/CONFIG_FILE

...then:

  (dumpconfig of lvm.conf only)
  raw/~ $ lvm dumpconfig allocation
  allocation {
	  maximise_cling=1
	  mirror_logs_require_separate_pvs=0
	  thin_pool_metadata_require_separate_pvs=0
	  thin_pool_chunk_size=64
  }

  (dumpconfig of selected profile configuration only)
  raw/~ $ lvm dumpconfig --profile test allocation
  allocation {
	  thin_pool_chunk_size=8
	  thin_pool_discards="passdown"
	  thin_pool_zero=1
  }

  (dumpconfig of given --config configuration only)
  raw/~ $ lvm dumpconfig --config 'allocation{thin_pool_chunk_size=16}' allocation
  allocation {
	  thin_pool_chunk_size=16
  }

The --mergedconfig option causes the configuration cascade to be
merged before processing it with dumpconfig:

  (dumpconfig of merged selected profile and lvm.conf)
  raw/~ $ lvm dumpconfig --profile test allocation --mergedconfig
  allocation {
	  maximise_cling=1
	  thin_pool_zero=1
	  thin_pool_discards="passdown"
	  mirror_logs_require_separate_pvs=0
	  thin_pool_metadata_require_separate_pvs=0
	  thin_pool_chunk_size=8
  }

  (dumpconfig merged given --config and selected profile and lvm.conf)
  raw/~ $ lvm dumpconfig --profile test --config 'allocation{thin_pool_chunk_size=16}' allocation --mergedconfig
  allocation {
	  maximise_cling=1
	  thin_pool_zero=1
	  thin_pool_discards="passdown"
	  mirror_logs_require_separate_pvs=0
	  thin_pool_metadata_require_separate_pvs=0
	  thin_pool_chunk_size=16
  }

Hence with the --mergedconfig, we are able to see the
configuration that is actually used when processing any
LVM command while using any combination of --config/--profile
options together with lvm.conf file.
2013-07-08 16:05:56 +02:00
Zdenek Kabelac
985251c8f3 locking: unlock memory on error path
Unlock memory and unblock signals on error path.
2013-07-08 14:02:49 +02:00
Alasdair G Kergon
7c6526aae2 lvresize: separate validation from action
Start separating the validation from the action in the basic lvresize
code moved to the library.
Remove incorrect use of command line error codes from lvresize library
functions.  Move errors.h to tools directory to reinforce this,
exporting public versions of the error codes in lvm2cmd.h for dmeventd
plugins to use.
2013-07-06 03:28:21 +01:00
Peter Rajnoha
0a5b68e87b man: document profile config and related options
Document following items:
  configuration cascade (man lvm.conf)
  --profile ProfileName (man lvm)
  --detachprofile (man vgchange/lvchange)
  -o vg_profile/lv_profile (man vgs/lvs)

Also document --config a bit so we can see where it fits in the
configuration cascade - will be documented more in following commit...
2013-07-03 16:49:26 +02:00
Zdenek Kabelac
6f335ffa35 sigint: improve logic on for sigint reaction
Fix and improve handling on sigint.

Always check for signal presence *before* calling of command,
so it will not call the command when break was hit.

If the command has been finished succesfully there is
no problem to mark the command ok and not report interrupt at all.

Fix cuple related stack; reports and assignments.
2013-07-03 14:46:42 +02:00
Mike Snitzer
fe09d84668 lvconvert: Rename _swap_lv to _swap_lv_identifiers and move to allow an additional user 2013-07-02 17:02:25 -04:00
Mike Snitzer
f9e0adcce5 snapshot: Rename snapshot segment returning methods from find_*_cow to find_*_snapshot
find_cow -> find_snapshot, find_merging_cow -> find_merging_snapshot.
Will thin snapshot code to reuse these methods without confusion.
2013-07-02 16:26:03 -04:00
Tony Asleson
79106f7394 liblvm/python-lvm New additions
Signed-off-by: Tony Asleson <tasleson@redhat.com>
2013-07-02 14:24:34 -05:00
Peter Rajnoha
fd94dd3b7e WHATS_NEW: config/profile_dir 2013-07-02 15:57:46 +02:00
Peter Rajnoha
7ce0c6777b WHATS_NEW: configuration profile support 2013-07-02 15:51:21 +02:00
Zdenek Kabelac
e30028004b archiver: do not archive vg more then once
Do not keep multiple archives for the executed command.
Reuse the ALLOCATABLE_PV from pv status for
ARCHIVED_VG vg status. Mark VG with the bit with the
first archivation.
2013-07-01 23:09:26 +02:00
Jonathan Brassow
8215846aa5 Clean-up: WHATS_NEW
Choosing between two entries and forgot to remove one.
2013-06-19 19:55:34 -05:00
Jonathan Brassow
a6d13308ec RAID/MIRROR: Honor mirror_segtype_default when upconverting linear LVs
If the user would upconvert a linear LV to a mirror without specifying
the segment type ("--type mirror" vs "--type raid1"), the "mirror"
segment type would be chosen without consulting the 'default_mirror_segtype'
setting in lvm.conf.  This is now used as the basis for determining
which should be used if left unspecified.
2013-06-19 17:50:10 -05:00
Zdenek Kabelac
155841c349 lvmetad: fix compare function
Check for enough space in preallocated buffer.
Fixes problem, when lvm code started to suddenly allocate
too big memory chunks.

TODO: lvmetad protocol should announce needed size ahead,
so if metadata have 1MB we are not reallocating memory...
2013-06-18 22:12:51 +02:00
Zdenek Kabelac
2562968864 vgcfgrestore: fix crash on restore of wrong vgname
When vgname has not existed in metadata, it has crashed on double free
in format_instance destroy() -  since VG was created, used FID and was
released - which also released FID, so further use was accessing bad
memory.

Fix it for this code path before release_vg() so FID will exists
when _vg_read_file_name() returns NULL.
2013-06-18 22:11:21 +02:00
Alasdair G Kergon
c2dc21d89f text: miscellaneous comments & message tweaks 2013-06-15 01:28:54 +01:00
Peter Rajnoha
dba53681a5 man: refine lvm.conf and man page documentation for autoactivation feature 2013-06-14 10:02:56 +02:00
Zdenek Kabelac
fe22089edf thin: vgsplit support for thins
Support vgsplit for VGs with thin pools and thin volumes.
In case the thin data and thin metadata volumes are moved to a new VG,
move there also all related thin volumes and check that external origins
are also present in this new VG.
2013-06-13 14:51:00 +02:00
Peter Rajnoha
966d4f36d7 filter-mpath: detect partitions of mpath components
We use mpath filtering (enabled by devices/multipath_component_detection=1
lvm.conf setting) to avoid a situation in which we could end up with
duplicate PVs found. We need to filter out the mpath components and
use only the top-level multipath mapping instead for PV scans.

However, if the there are partitions on multipath components, we need
to filter out these partitions. This patch fixes it so those
partitions found on multipath components are filtered as well.

For example, let's consider following configuration:
The sda and sdb are mpath components, sda1 and sdb1 the partitions
on these components, mpath-test the mpath mapping and mpath-test1
the partition mapping - created automatically by kpartx right
after mpath-test creation. The PV resides on top.

       (LVM PV)
          |
      mpath-test1
          |
      mpath-test
          |
sda1 ---------- sdb1
   \ |        |/
    sda      sdb

E.g. for sda1 and sdb1, the code will detect this and it skips
the partition that belongs to the multipath component:
  <snippet from the log>
    #filters/filter-mpath.c:156         /dev/sda1: Device is a partition, using primary device /dev/sda for mpath component detection
    130 #ioctl/libdm-iface.c:1724         dm status   (253:2) OF[16384](*1)
    131 #filters/filter-mpath.c:196         /dev/sda1: Skipping mpath component device
  </snippet from the log>

Othewise, we'd see the same PV label on sda1/sdb1 and mpath-test1
at the same time ending up with "Duplicate PV found...".
2013-06-12 13:13:38 +02:00
Zdenek Kabelac
87aca628d6 thin: lvresize supports pool metadata resize
Add support for lvresize of thin pool metadata device.

lvresize --poolmetadatasize +20   vgname/thinpool_lv

or

lvresize -L +20 vgname/thinpool_lv_tmeta

Where the second one allows all the args for resize (striping...)
and the first option resizes accoding to the last metadata lv segment.
2013-06-11 14:05:20 +02:00
Zdenek Kabelac
72c3ae253e thin: add helper functions
Add find_pool_lv() and pool_can_resize_metadata().
2013-06-11 14:03:30 +02:00
Zdenek Kabelac
55a3859632 thin: detect online metadata resize support 2013-06-11 14:03:28 +02:00
Zdenek Kabelac
01ef97fcbb thin: report 'e' metadata type with higher priority
Giving volume type information about being 'metadata' type of volume
has higher priority then i.e.  'mirror' or 'thin' flag - for those
type we have 'target attr' (7th. field).
2013-06-11 14:03:08 +02:00
Zdenek Kabelac
c0290489c3 thin: report o as volume type for external origin
Reuse 'o' attr for lvs report also for external origin.
2013-06-11 14:02:41 +02:00
Zdenek Kabelac
7151ede767 thin: report t for thin pool and volume
Do not mark internal device _tdata and _tmeta as having target type 't'.
They have the target type on their own (i.e. mirror, raid).
2013-06-11 13:58:16 +02:00
Zdenek Kabelac
272f5ae208 snapshots: check for active state
Fix testing if the snapshot could be resized and use lv_is_active()
to get correct answer in cluster.
2013-06-11 13:57:18 +02:00
Zdenek Kabelac
f05c5a97c3 filters: dump filter returns error code
Add int return value from dump() function.
Report stack for error case.
Update composable filter.
2013-06-03 08:42:25 +02:00
Zdenek Kabelac
5467a3b2b7 filters: update composable filter
Last commit made dump filter only partially composable.
Add remaining functionality and also support composable wipe,
which is needed, when i.e. vgscan needs to remove cache.

(in release fix)
2013-06-02 22:46:06 +02:00
Petr Rockai
1f73e992ef lvmetad: no use of persistent filter with lvmetad 2013-06-02 00:49:55 +02:00
Petr Rockai
e7878da921 filters: toplevel filter not persistent
Add a generic dump operation to filters and make the composite filter call
through to its components. Previously, when global filter was set, the code
would treat the toplevel composite filter's private area as if it belonged a
persistent filter, trying to write nonsense into a non-sensical file.
Also deal with NULL cmd->filter gracefully.
2013-06-02 00:48:58 +02:00
Petr Rockai
05bf4b8cc3 vgimportclone: override global_filter in lvm.conf
The global filter in system's lvm.conf may conflict with the custom filter we
set up in vgimportclone (they can easily fail to intersect). Since we explicitly
avoid talking to lvmetad in vgimportclone, it is safe and reasonable to do so.
2013-06-02 00:47:17 +02:00
Zdenek Kabelac
3ced1bf694 lvresize: check for max snapshot size
As for lvcreate, lvresize also doesn't need to grow bigger then needed.
2013-05-30 17:35:23 +02:00
Zdenek Kabelac
bd3ece0128 lvcreate: reduce too large cow
Detect maximum usable size of snapshot COW device,
and do not waste more space for such LV then needed.
2013-05-30 17:35:14 +02:00
Zdenek Kabelac
eb7e206a73 snapshot: add cow_max_extents
Add more precise calculation of the maximum usable snapshot size.
Using only percentage fails for small size of snapshot and extents.
2013-05-30 17:30:15 +02:00
Zdenek Kabelac
59962d8d3e snapshot: require 3 chunks for creation
There is no point in creation of 2chunks snapshot,
since the snapshot is invalidated immeditelly with the first write
as there is no free chunk for COW blocks
(2 chunks are used by the snap header and the 1st. metadata chunk).

Enhance error message about the lowest usable size.
2013-05-30 17:28:03 +02:00
Zdenek Kabelac
56779c32c5 snapshot: fix resize of 100% full cow
When the COW area is using all the available space (100%) it can be still
a valid snapshot which may need a resize. So support it.
2013-05-30 17:26:20 +02:00
Zdenek Kabelac
99f0483580 args: do not accept >=16EiB sizes
Instead of seeing wierd overflows inside the lvm code,
giving false error messages, kill the user experiment in the begining.

Who needs to use more then 16EiB with lvm2 and 64bit anyway...
2013-05-30 17:23:51 +02:00
Zdenek Kabelac
2f1a571c97 fid: fix reset of PV fid
Avoid hitting memory corruption (double free) in code path,
where PV FID has been already destroyed and the released pointer
was left in PV structure and could have been tried to be released
from there 2nd. time with final context destruction.
2013-05-30 16:52:39 +02:00
Peter Rajnoha
be25f7ac83 WHATS_NEW: ea_start,ea_size -> ba_start,ba_size 2013-05-28 12:43:26 +02:00
Peter Rajnoha
732859d21f refactor: rename embedding area -> bootloader area 2013-05-28 12:37:22 +02:00
Zdenek Kabelac
9966842810 snapshot: skip monitor for large cows
If snapshot cow device is already big enough to
cover whole origin, do not monitor it.
2013-05-27 10:35:43 +02:00
Zdenek Kabelac
77952151af snapshot: add lv_is_cow_covering_origin
Add function to check is size of cow is already big enough
to cover whole origin.
2013-05-27 10:34:53 +02:00
Zdenek Kabelac
06e8ff29ff snapshot: use dm_get_status_snapshot()
Replace code with libdm call to dm_get_status_snapshot().
2013-05-27 10:32:02 +02:00
Zdenek Kabelac
2ada982e73 vgchange: check for mounted fs
Check for mounted fs also for vgchange command, not just lvchange.

NOTE: Code is using lv_info() just like lvs_in_vg_opened().
It should be probably converted into  lv_is_active_locally().
2013-05-20 16:47:33 +02:00
Jonathan Brassow
06ac797f42 Clean-up: Replace 'lv_is_active' with more correct/specific variants
There are places where 'lv_is_active' was being used where it was
more correct to use 'lv_is_active_locally'.  For example, when checking
for the existance of a kernel instance before asking for its status.
Most of the time these would work correctly.  (RAID is only allowed on
non-clustered VGs at the moment, which means that 'lv_is_active' and
'lv_is_active_locally' would give the same result.)  However, it is
more correct to use the proper variant and it helps with future
scenarios where targets might be allowed exclusively (or clustered) in
a cluster VG.
2013-05-16 10:36:56 -05:00
Peter Rajnoha
b3b551a93e WHATS_NEW: bad day 2013-05-16 11:02:38 +02:00
Peter Rajnoha
cb0d817fb5 WHATS_NEW: for commit 4f6c2951d6 2013-05-16 08:38:27 +02:00
Alasdair G Kergon
f12d88f840 activation: fix lv_is_active regressions
Try to fix commit bf2741376d.

lv_is_active is not the same as lv_info(cmd, org, 0, &info, 0, 0).

Introduce and use lv_is_active_locally.
2013-05-15 02:13:31 +01:00
Alasdair G Kergon
2fbe1e6e00 rephrasing: miscellaneous changes
Miscellaneous changes to messages, man pages, comments and WHATS_NEW.
2013-05-15 01:50:42 +01:00
Alasdair G Kergon
2e4a66a761 make: fix exported symbols regex for non-GNU sed
Remove a couple of incorrect backslashes from expressions used to
generate lists of exported symbols so it works with busybox sed.
[John Spencer]
2013-05-14 19:29:26 +01:00
Alasdair G Kergon
c6cf2ed7fd commands: accept --yes globally
Accept --yes on all commands, even ones that don't today have prompts,
so that test scripts that don't care about interactive prompts no
longer need to deal with them.

But continue to mention --yes only in the command prototypes that
actually use it.
2013-05-14 18:45:37 +01:00
Mike Snitzer
8ad7865b42 Fix alignment of PV data area if detected alignment less than 1 MB
This fixes a long standing regression since LVM2 2.02.74 (commit 4efb1d9c,
"Update heuristic used for default and detected data alignment.")

The default PE alignment could be used (via MAX()) even if it was
determined that the device's MD stripe width, or minimal_io_size or
optimal_io_size were not factors of the default PE alignment (either 64K
or the newer default of 1MB, etc).  This bug would manifest if the
default PE alignment was larger than the overriding hint that the
device provided (e.g. default of 1MB vs optimal_io_size of 768K).

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2013-05-13 15:56:47 -04:00
Zdenek Kabelac
55fe07ad98 mm: fix leak in fail path
If the dm_realloc would fail, the already allocate _maps_buffer
memory would have been lost (overwritten with NULL).
Fix this by using temporary line buffer.

Also add a minor cleanup to set end of buffer to '\0',
only when we really know the file size fits the preallocated buffer.
2013-05-13 13:13:20 +02:00
Peter Rajnoha
4407133113 toolcontext: check dm version lazily for udev_fallback setting
Setting the cmd->default_settings.udev_fallback also requires DM
driver version check. However, this caused useless mapper/control
access with ioctl if not needed actually. For example if we're not
using activation code, we don't need to know the udev_fallback as
there's no node and symlink processing.

For example, this premature mapper/control access caused problems
when using lvm2app even when no activation happens - there are
situations in which we don't need to use mapper/control, but still
need some of the lvm2app functionality. This is also the case for
lvm2-activation systemd generator which just needs to look at the
lvm2 configuration, but it shouldn't touch mapper/control.
2013-05-13 11:53:53 +02:00
Zdenek Kabelac
8d004b5127 report: show active state of LV
For non clustered VG - show  "active"/""

For clustered VG its more complex:

"local exclusive"
"remote exclusive"
"locally"
"remotely"
2013-04-25 17:33:24 +02:00
Zdenek Kabelac
8b18ab76d2 report: show dmeventd monitoring status
Add new lvs segment field 'Monitor' showing 3 states:

"monitored" - LV is monitored by dmeventd.

"not monitored" - LV is currently not being monitored by dmeventd

"" (empty) - LV does not support monitoring, or dmeventd support
             is not compiled in.
2013-04-25 17:33:24 +02:00
Zdenek Kabelac
3f7de58e96 man: lvextend --use-policies
Add missing man info.
2013-04-25 17:33:24 +02:00
Zdenek Kabelac
f84f12a6a3 snapshot: rework cluster creation and removal
Support for exclusive activation of snapshots revealed some problems.

When snapshot is created, COW LV is activated first (for clearing) and
then it's transformed into snapshot's COW LV, but it has left the lock
for such LV active in cluster and this lock could not have been removed
from dlm, unless snapshot has been removed within same dlm session.

If the user tried to remove snapshot after rebooting node, the lock was
missing, and COW LV could not have been detached.

Patch modifes the approach in this way:

Always deactivate COW LV for clustered vg  after clearing (so it's
activated again via imlicit snapshot activation rule when snapshot is activated).

When snapshot is removed, activate COW LV as independend LV, so the lock
will exist for such LV, but only when the snapshot is active.

Also add test case for testing snapshot removal after cluster reboot.
2013-04-25 17:33:24 +02:00
Zdenek Kabelac
d51b7e5404 clvmd: avoid pretesting of dev availability
Patch fixes hidden problem with lvm metadata caching.

When the pretest was made, only the commited data have been cached back
since the call lv_info_by_lvid() triggers mda read operation.
However call of lv_suspend_if_active() also reads precommited metadata.
The problem is visible in this sequence of calls:

vg_write(), suspend_lv(), vg_commit(), resume_lv()

which may end with leaving outdated mda in lvm cache, since vg_write()
drops cached metadata and vg_commit() only transforms precommited
to commited metadata, but in the case of pretesting we have
no precommited mda available so the cache will continue to use
old metadata. This happens, when suspend LV is inactive.
2013-04-25 17:33:22 +02:00
Zdenek Kabelac
45eeb70b02 config: merge timestamps
Merging multiple config files together needs to know newest (highest)
timestamp of merged files. Persistent cache file is being used
only in case, the config file is older then .cache file.
2013-04-23 12:31:16 +02:00
Zdenek Kabelac
1951798d72 vgread: fix fid transfer for lvm1 and pool format
Assign fid as the last step before returning VG.
Make the format reader for 'lvm1' and 'pool' equal to 'lvm2' format reader.

It has caused memory corruption to lvmetad as it later calls
destroy_instance() to allocated fid. This patch should fix problems
with crashing test lvmetad-lvm1.sh.
2013-04-21 23:13:57 +02:00
Zdenek Kabelac
a2b76a6f02 thin: fix resource leak in err path
If the devices list could not have been obtained, FILE* was leaked.
2013-04-21 23:10:30 +02:00
Zdenek Kabelac
17a6915054 thin: explicitly avoid pvmove operation
So far we do not support pvmove for thin volumes
and thin pools.
2013-04-21 23:09:11 +02:00
Zdenek Kabelac
f787b575b5 lvmetad: fix error paths
Also add missing goto out on error.
Error path missed return NULL leading to double free of enc_value.
2013-04-21 23:04:53 +02:00
Zdenek Kabelac
c9d8d22224 clmvd: fix responce status
Failing status code is expected to be 0.
Also do not return '*response' as pointer which has been already free().
2013-04-21 22:54:42 +02:00
Jonathan Brassow
2e0740f7ef RAID: Add writemostly/writebehind support for RAID1
'lvchange' is used to alter a RAID 1 logical volume's write-mostly and
write-behind characteristics.  The '--writemostly' parameter takes a
PV as an argument with an optional trailing character to specify whether
to set ('y'), unset ('n'), or toggle ('t') the value.  If no trailing
character is given, it will set the flag.
Synopsis:
        lvchange [--writemostly <PV>:{t|y|n}] [--writebehind <count>] vg/lv
Example:
        lvchange --writemostly /dev/sdb1:y --writebehind 512 vg/raid1_lv

The last character in the 'lv_attr' field is used to show whether a device
has the WriteMostly flag set.  It is signified with a 'w'.  If the device
has failed, the 'p'artial flag has priority.

Example ("nosync" raid1 with mismatch_cnt and writemostly):
[~]# lvs -a --segment vg
  LV                VG   Attr      #Str Type   SSize
  raid1             vg   Rwi---r-m    2 raid1  500.00m
  [raid1_rimage_0]  vg   Iwi---r--    1 linear 500.00m
  [raid1_rimage_1]  vg   Iwi---r-w    1 linear 500.00m
  [raid1_rmeta_0]   vg   ewi---r--    1 linear   4.00m
  [raid1_rmeta_1]   vg   ewi---r--    1 linear   4.00m

Example (raid1 with mismatch_cnt, writemostly - but failed drive):
[~]# lvs -a --segment vg
  LV                VG   Attr      #Str Type   SSize
  raid1             vg   rwi---r-p    2 raid1  500.00m
  [raid1_rimage_0]  vg   Iwi---r--    1 linear 500.00m
  [raid1_rimage_1]  vg   Iwi---r-p    1 linear 500.00m
  [raid1_rmeta_0]   vg   ewi---r--    1 linear   4.00m
  [raid1_rmeta_1]   vg   ewi---r-p    1 linear   4.00m

A new reportable field has been added for writebehind as well.  If
write-behind has not been set or the LV is not RAID1, the field will
be blank.
Example (writebehind is set):
[~]# lvs -a -o name,attr,writebehind vg
  LV            Attr      WBehind
  lv            rwi-a-r--     512
  [lv_rimage_0] iwi-aor-w
  [lv_rimage_1] iwi-aor--
  [lv_rmeta_0]  ewi-aor--
  [lv_rmeta_1]  ewi-aor--

Example (writebehind is not set):
[~]# lvs -a -o name,attr,writebehind vg
  LV            Attr      WBehind
  lv            rwi-a-r--
  [lv_rimage_0] iwi-aor-w
  [lv_rimage_1] iwi-aor--
  [lv_rmeta_0]  ewi-aor--
  [lv_rmeta_1]  ewi-aor--
2013-04-15 13:59:46 -05:00
Zdenek Kabelac
a81a2406f1 tools: add common lv_change_activate
Move common code for changing activation state from
vgchange and lvchange to one function.

Fix the order of checks - so we always implicitelly
activate snapshots and thin volumes in exclusive mode,
and we do not allow local deactivation for them.
2013-04-12 11:30:07 +02:00
Jonathan Brassow
719e908bc0 WHATS_NEW: Add WHATS_NEW entry for previous commit. 2013-04-11 16:03:24 -05:00
Jonathan Brassow
ff64e3500f RAID: Add scrubbing support for RAID LVs
New options to 'lvchange' allow users to scrub their RAID LVs.
Synopsis:
	lvchange --syncaction {check|repair} vg/raid_lv

RAID scrubbing is the process of reading all the data and parity blocks in
an array and checking to see whether they are coherent.  'lvchange' can
now initaite the two scrubbing operations: "check" and "repair".  "check"
will go over the array and recored the number of discrepancies but not
repair them.  "repair" will correct the discrepancies as it finds them.

'lvchange --syncaction repair vg/raid_lv' is not to be confused with
'lvconvert --repair vg/raid_lv'.  The former initiates a background
synchronization operation on the array, while the latter is designed to
repair/replace failed devices in a mirror or RAID logical volume.

Additional reporting has been added for 'lvs' to support the new
operations.  Two new printable fields (which are not printed by
default) have been added: "syncaction" and "mismatches".  These
can be accessed using the '-o' option to 'lvs', like:
	lvs -o +syncaction,mismatches vg/lv
"syncaction" will print the current synchronization operation that the
RAID volume is performing.  It can be one of the following:
        - idle:   All sync operations complete (doing nothing)
        - resync: Initializing an array or recovering after a machine failure
        - recover: Replacing a device in the array
        - check: Looking for array inconsistencies
        - repair: Looking for and repairing inconsistencies
The "mismatches" field with print the number of descrepancies found during
a check or repair operation.

The 'Cpy%Sync' field already available to 'lvs' will print the progress
of any of the above syncactions, including check and repair.

Finally, the lv_attr field has changed to accomadate the scrubbing operations
as well.  The role of the 'p'artial character in the lv_attr report field
as expanded.  "Partial" is really an indicator for the health of a
logical volume and it makes sense to extend this include other health
indicators as well, specifically:
        'm'ismatches:  Indicates that there are discrepancies in a RAID
                       LV.  This character is shown after a scrubbing
                       operation has detected that portions of the RAID
                       are not coherent.
        'r'efresh   :  Indicates that a device in a RAID array has suffered
                       a failure and the kernel regards it as failed -
                       even though LVM can read the device label and
                       considers the device to be ok.  The LV should be
                       'r'efreshed to notify the kernel that the device is
                       now available, or the device should be 'r'eplaced
                       if it is suspected of failing.
2013-04-11 15:33:59 -05:00
Jonathan Brassow
95d28735ea WHATS_NEW: Include entry for RAID status func improvements 2013-04-08 15:17:12 -05:00
Zdenek Kabelac
c22e925ce4 man: lvceate document external origin snapshot
Document added support for external origin.
2013-04-05 14:15:03 +02:00
Zdenek Kabelac
ddafa0115e man: updates for lvconvert and lvcreate
Cleanup and improvement on man pages.
2013-04-05 14:14:20 +02:00
Peter Rajnoha
32ae07cef1 pv_write: clean up non-orphan format1 PV write
...to not pollute the common and format-independent code in the
abstraction layer above.

The format1 pv_write has common code for writing metadata and
PV header by calling the "write_disks" fn and when rewriting
the header itself only (e.g. just for the purpose of changing
the PV UUID) during the pvchange operation, we had to tweak
this functionality for the format1 case and we had to assign
the PV the orphan state temporarily.

This patch removes the need for this format1 tweak and it calls
the write_disks with appropriate flag indicating whether this is
a PV write call or a VG write call, allowing for metatada update
for the latter one.

Also, a side effect of the former tweak was that it effectively
invalidated the cache (even for the non-format1 PVs) as we
assigned it the orphan state temporarily just for the format1
PV write to pass.

Also, that tweak made it difficult to directly detect whether
a PV was part of a VG or not because the state was incorrect.

Also, it's not necessary to backup and restore some PV fields
when doing a PV write:

  orig_pe_size = pv_pe_size(pv);
  orig_pe_start = pv_pe_start(pv);
  orig_pe_count = pv_pe_count(pv);
  ...
  pv_write(pv)
  ...
  pv->pe_size = orig_pe_size;
  pv->pe_start = orig_pe_start;
  pv->pe_count = orig_pe_count;

...this is already done by the layer below itself (the _format1_pv_write fn).

So let's have this cleaned up so we don't need to be bothered
about any 'format1 special case for pv_write' anymore.
2013-03-25 15:08:26 +01:00
Peter Rajnoha
784867d5bd WHATS_NEW: vgextend and PV with 0 MDAs 2013-03-19 15:41:34 +01:00
Zdenek Kabelac
b36a776a7f thin: move update_pool_params
Now we may recongnize preset arguments, move
the code for updating thin pool related values
into /lib portion of the code.
2013-03-13 15:13:54 +01:00
Alasdair G Kergon
cbfb5a98b5 filters: power2 devs get precedence if PVIDs match
Give precedence to EMC "power2" devices with duplicate PVIDs like
we already do with "emcpower" devices.
2013-03-11 20:10:49 +00:00
Peter Rajnoha
03b5c51730 WHATS_NEW: add lines for config validation support 2013-03-06 11:00:30 +01:00
Peter Rajnoha
b3776468fa WHATS_NEW: add lines for embedding area support 2013-02-26 15:50:43 +01:00
Zdenek Kabelac
b73de73151 thin: lvconvert support for external origin
Add basic support for converting LV into an external origin volume.

Syntax:

lvconvert --thinpool vg/pool  --originname renamed_origin -T origin

It will convert volume  'origin' into a thin volume, which will
use 'renamed_origin' as an external read-only origin.
All read/write into origin will go via 'pool'.

renamed_origin volume is read-only volume, that could be activated
only in read-only mode, and cannot be modified.
2013-02-23 10:38:20 +01:00
Zdenek Kabelac
d023b2d12f lvremove: easier removal of dependent lvs
Add function to remove lvs which are depending on removed lv
prior the lv is removed.

User is asked for confirmation.
2013-02-23 10:31:05 +01:00
Zdenek Kabelac
3679bb1cd9 activation: simplify activation code
Reorder activation code to look similar for preload tree and
activation tree.

Its also give much better suppport for device stacking,
since now we also support activation of snapshot which might
be then used for other devices.
2013-02-23 10:30:03 +01:00
Zdenek Kabelac
0631d233d8 activation: add _add_layer_target_to_dtree
Add function for creation of simple linear mapping over layer device.
2013-02-23 10:29:08 +01:00
Zdenek Kabelac
78b23f3595 activation: extend _cached_info
Add layer string to support check of layered devices.
2013-02-23 10:28:01 +01:00