mirror of git://sourceware.org/git/lvm2.git synced 2024-12-21 13:34:40 +03:00
Commit Graph

2824 Commits

Author SHA1 Message Date
Jonathan Brassow
c59167ec13 pvmove: Add support for RAID, mirror, and thin
This patch allows pvmove to operate on RAID, mirror and thin LVs.
The key component is the ability to avoid moving a RAID or mirror
sub-LV onto a PV that already has another RAID sub-LV on it.
(e.g. Avoid placing both images of a RAID1 LV on the same PV.)

Top-level LVs are processed to determine which PVs to avoid for
the sake of redundancy, while bottom-level LVs are processed
to determine which segments/extents to move.

This approach does have some drawbacks.  By eliminating whole PVs
from the allocation list, we might miss the opportunity to perform
pvmove in some scenarios.  For example, if we have 3 devices and
a linear uses half of the first, a RAID1 uses half of the first and
half of the second, and a linear uses half of the third (FIGURE 1);
we should be able to pvmove the first device (FIGURE 2).
	FIGURE 1:
        [ linear ] [ -RAID- ] [ linear ]
        [ -RAID- ] [        ] [        ]

	FIGURE 2:
        [  moved ] [ -RAID- ] [ linear ]
        [  moved ] [ linear ] [ -RAID- ]
However, the approach we are using would eliminate the second
device from consideration and would leave us with too little space
for allocation.  In these situations, the user does have the ability
to specify LVs and move them one at a time.
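
A minimal sketch of the PV-avoidance idea described above, written in Python rather than lvm2's C. The data layout (a dict mapping each sub-LV name to the set of PVs it occupies) and the names `pvs_to_avoid` and `allocatable_pvs` are assumptions made for illustration; they are not lvm2 structures or functions.

```python
def pvs_to_avoid(sub_lv_pvs, moving_sub_lv):
    """Return PVs that already hold a sibling sub-LV of the same RAID/mirror LV.

    sub_lv_pvs: hypothetical mapping of sub-LV name -> set of PV names it uses.
    moving_sub_lv: the sub-LV whose extents pvmove is about to relocate.
    """
    avoid = set()
    for name, pvs in sub_lv_pvs.items():
        if name != moving_sub_lv:
            avoid |= pvs  # whole PVs are excluded, not just the used extents
    return avoid


def allocatable_pvs(all_pvs, source_pv, sub_lv_pvs, moving_sub_lv):
    """PVs left for allocation once the source PV and sibling PVs are removed."""
    excluded = pvs_to_avoid(sub_lv_pvs, moving_sub_lv) | {source_pv}
    return [pv for pv in all_pvs if pv not in excluded]
```

With the layout of FIGURE 1 (one RAID1 image on the first device, the other on the second), moving the image off the first device leaves only the third device in the list, which is the drawback the message describes.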
2013-08-23 08:57:16 -05:00
Peter Rajnoha
99fe3b88d2 systemd: lvm2-activation-generator: report only error otherwise be silent
Do not print success status for lvm2-activation-generator:

  "LVM: Activation generator successfully completed."
  "LVM: Logical Volume autoactivation enabled." (if use_lvmetad=1)

Though this information is quite useful during boot, it may
be confusing for users if it happens anytime later and it
actually happens if systemd reloads. This is usually on package
update to update the systemd state and load any new units that are
newly installed in the system. The systemd reload is global and
so any existing generators are rerun at that moment too.
2013-08-22 08:27:51 +02:00
Peter Rajnoha
c8daa15270 filter-mpath: remove superfluous error message about mpath major not equal to dm major
This is a regression caused by commit 3bd9048854.
The error message added with that commit "mpath major %d is not dm major %d" is
superfluous.

When scanning for mpath components, we're looking for a parent device.
But this parent device is not necessarily an mpath device (so the dm device)
if it exists - it can be any other device layered on top (e.g. an MD RAID device).
2013-08-21 14:07:01 +02:00
Jonathan Brassow
f0be9ac904 cmirrord: Prevent secondary checkpoints from corrupting bitmaps
The bug addressed by this patch manifested itself during testing
by showing a mirror that never became 'in-sync' after creation.
The bug is isolated to distributions that do not have support
for openAIS checkpointing (i.e. > RHEL6, > F16).

When a node joins a group that is managing a mirror log, the other
machines in the group send it a checkpoint representing the current
state of the bitmap.  More than one machine can send a checkpoint,
but only the initial one should be imported.  Once the bitmap state
has been imported from the initial checkpoint, operations (such
as resync, mark, and clear operations) can begin.  When subsequent
checkpoints are allowed to be imported, it has the effect of erasing
all the log operations between the initial checkpoint and the ones
that follow.

When cmirrord was updated to handle the absence of openAIS
checkpointing (commit 62e38da133),
the new import_checkpoint() function failed to honor the 'no_read'
parameter.  This parameter was designed to avoid reading all but
the initial checkpoint.  Honoring this parameter has solved the
issue of corrupting bitmap data with secondary checkpoints.
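
The effect of honoring 'no_read' can be sketched as follows. This is a toy Python model, not the cmirrord code: the `MirrorLog` class, its fields, and the simplified `import_checkpoint` signature are assumptions for illustration only.

```python
class MirrorLog:
    """Toy stand-in for a cmirrord log; not the real data structure."""

    def __init__(self, nbits):
        self.bitmap = [0] * nbits
        self.checkpoint_imported = False

    def import_checkpoint(self, bitmap, no_read):
        """Import checkpoint data unless no_read says to skip it."""
        if no_read:
            return False  # secondary checkpoint: leave the bitmap alone
        self.bitmap = list(bitmap)
        self.checkpoint_imported = True
        return True

    def receive_checkpoint(self, bitmap):
        # Only the initial checkpoint is read; later ones are ignored.
        return self.import_checkpoint(bitmap, no_read=self.checkpoint_imported)

    def mark(self, region):
        # A log operation that happens after the initial checkpoint.
        self.bitmap[region] = 1
```

In this model, a mark operation made after the first checkpoint survives the arrival of a second checkpoint, because the second one is never read.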
2013-08-20 13:21:09 -05:00
Peter Rajnoha
cac49725c9 udev: fix lvmetad rules to not ignore loop device configuration
If loop device is first configured on systems where /dev/loop-control
is used to dynamically create the loop device itself, there's an
ADD+CHANGE event generated. But next time the existing /dev/loop[0-9]*
is reused, there's only a CHANGE event since the device representing
it is already present in kernel (so no ADD event in this case).

We can't ignore this CHANGE event for loop devices! This is a regression
caused by 756bcabbfe. We already had
a similar problem with MD devices which was fixed by
2ac217d408 (but that one was
only an intra-release fix).
2013-08-16 15:45:00 +02:00
Michael Stapelberg
8cbbe851a8 systemd: use LVM_PATH instead of hardcoded value in activation generator 2013-08-15 09:59:19 +02:00
Peter Rajnoha
82d83a01ce autoactivation: refresh existing VG before autoactivation
When autoactivating a VG, there could be an existing VG with exactly
the same PV UUIDs. The PVs could have reappeared after a previous
loss/disconnect (for example disconnecting and reconnecting iscsi).

Since there's no "autodeactivation" yet, the mappings for the LVs
from the VG were left in the system even if the device was disconnected.
These mappings also hold the major:minor of the underlying device.
So if the device reappears, it is assigned a different major:minor
pair (...and kernel name). We need to cope with this during
autoactivation so any existing mappings are corrected for any changes.
The VG refresh does that (the vgchange --refresh functionality) -
call this before VG autoactivation.

(If the VG does not exist yet, the VG refresh is NOP)
2013-08-14 14:04:58 +02:00
Peter Rajnoha
fcbb34bdcc WHATS_NEW: for 0da72743ca 2013-08-14 10:18:02 +02:00
Alasdair G Kergon
80bcdb93ff filters: check for mpath before opening devs
Split out the partitioned device filter that needs to open the device
and move the multipath filter in front of it.

When a device is multipathed, sending I/O to the underlying paths may
cause problems, the most obvious being I/O errors visible to lvm if a
path is down.

Revert the incorrect <backtrace> messages added when a device doesn't
pass a filter.

Log each filter initialisation to show sequence.

Avoid duplicate 'Using $device' debug messages.
2013-08-13 23:26:58 +01:00
Alasdair G Kergon
1a1d3a10ff vgchange: require confirmation with -c and no VGs
Too many people have been running 'vgchange -cy' by mistake
so add a confirmation prompt.  Use --yes to bypass this.
2013-08-13 18:20:11 +01:00
Peter Rajnoha
fd7cac15bc WHATS_NEW: be more precise 2013-08-13 18:25:54 +02:00
Peter Rajnoha
e166c00ac6 WHATS_NEW: one more for a85439 2013-08-13 18:16:05 +02:00
Peter Rajnoha
268b370e24 blkdeactivate: add support for bind mounts
Recent versions of util-linux/umount (v2.23+) provide
umount --all-targets, which can unmount all the mount targets of
the same device (the bind mounts). Use this if available when
calling umount from blkdeactivate.

Otherwise, for older versions of util-linux, use findmnt
(that is also a part of the util-linux) to iterate over all
mount targets of the same device - this is the manual way.
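
A rough sketch of the choice between the two unmount strategies, in Python rather than the blkdeactivate shell script. The function name `unmount_commands`, the version tuple, and the `mount_targets` list (assumed to have been collected beforehand, for example with findmnt) are all hypothetical.

```python
def unmount_commands(device, umount_version, mount_targets):
    """Return the umount command lines needed to release every mount of device.

    umount_version: util-linux version as a tuple, e.g. (2, 23).
    mount_targets: mount points of the device, assumed already discovered.
    """
    if umount_version >= (2, 23):
        # One call handles all targets, including bind mounts.
        return [["umount", "--all-targets", device]]
    # Older util-linux: unmount each target one by one (the manual way).
    return [["umount", target] for target in mount_targets]
```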
2013-08-13 17:51:40 +02:00
Peter Rajnoha
a854398764 blkdeactivate: change the way blkdeactivate reports status
The blkdeactivate now suppresses error messages from external
tools that are called. Instead, only a summary message "done"
or "skipped" is issued by blkdeactivate as any error in calling
the external tool (e.g. unmounting or deactivating a device) causes
the device to be skipped and the blkdeactivate continues with the
next device in the tree.

Add new -e/--errors switch to display any error messages from
external tools.

Also, suppress any output given by the external tools and add
new -v/--verbose switch to display it including the verbose
output of the tools called (this will enable error reporting
as well).

Also add blkdeactivate -vv for even more debug (the script's debug).
2013-08-13 17:51:23 +02:00
Alasdair G Kergon
32148369d1 post-release 2013-08-13 11:54:48 +01:00
Alasdair G Kergon
297907899c release 2.02.100
84 files changed, 1540 insertions(+), 442 deletions(-)

Mostly bug fixes this time.

Also note:
  md raid replaces dm mirroring as the default implementation.
  Can call out to thin_repair to fix thin metadata.
  Improved clvmd error detection/debugging information.
2013-08-13 11:29:21 +01:00
Jonathan Brassow
abc89422af Mirror: Fix inability to remove VG's cluster flag if it contains a mirror
According to bug 995193, if a volume group
	1) contains a mirror
	2) is clustered
	3) 'locking_type' = 0 is used
then it is not possible to remove the 'c'luster flag from the VG.  This
is due to the way _lv_is_active behaves.

We shouldn't allow the cluster flag to be flipped unless the mirrors in
the cluster are not active.  This is because different kernel modules
are used depending on whether a mirror is clustered or not.  When we
attempt to see if the mirror is active, we first check locally.  If it
is not, then we attempt to check for remotely active instances if the VG
is clustered.  Since the no_lock locking type is LCK_CLUSTERED, but does
not implement 'query_resource', remote_lock_held will always return an
error in this case.  An error from remote_lock_held is treated as though
the lock _is_ held (i.e. the LV is active remotely).  This blocks the
cluster flag from changing.

The solution is to implement 'query_resource' for the no_lock type.  It
will report a message and return 1.  This will allow _lv_is_active to
function properly.  The LV would be considered not active remotely and
the VG can change its flag.
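
The control flow described above can be modelled roughly as follows. This is a simplified Python sketch, not the lvm2 code: the class names are invented, and the return conventions (a pair of "query succeeded" and "lock held") are an assumption chosen to keep the example short.

```python
def remote_lock_held(locking, lv_name):
    """Return (ok, held). ok is False when the locking type cannot answer."""
    query = getattr(locking, "query_resource", None)
    if query is None:
        return False, None
    return True, query(lv_name)


def lv_is_active(locking, lv_name, active_locally, vg_is_clustered):
    if active_locally:
        return True
    if not vg_is_clustered:
        return False
    ok, held = remote_lock_held(locking, lv_name)
    if not ok:
        return True  # an error is treated as though the lock is held
    return held


class NoLockWithoutQuery:
    """no_lock locking before the fix: no query_resource at all."""


class NoLockWithQuery:
    """no_lock locking after the fix: answers 'not held remotely'."""

    def query_resource(self, lv_name):
        return False
```

Without `query_resource`, a clustered mirror that is inactive locally is still reported as active, which is what blocked the flag change.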
2013-08-12 13:56:47 -05:00
Alasdair G Kergon
28760275e6 logging: tidy log_sys_error when string empty 2013-08-12 18:40:41 +01:00
Jonathan Brassow
cba228f856 WHATSNEW: typo 2013-08-09 17:17:53 -05:00
Jonathan Brassow
8615234c0f RAID: Fix bug making lvchange unable to change recovery rate for RAID
1) Since the min|maxrecoveryrate args are size_kb_ARGs and they
   are recorded (and sent to the kernel) in terms of kB/sec/disk,
   we must back out the factor multiple done by size_kb_arg.  This
   is already performed by 'lvcreate' for these arguments.
2) Allow all RAID types, not just RAID1, to change these values.
3) Add min|maxrecoveryrate_ARG to the list of 'update_partial_unsafe'
   commands so that lvchange will not complain about needing at
   least one of a certain set of arguments and failing.
4) Add tests that check that these values can be set via lvchange
   and lvcreate and that 'lvs' reports back the proper results.
2013-08-09 17:09:47 -05:00
Zdenek Kabelac
e583ff3d2c thin: thin pool can't be external origin
Avoid trying to convert thin-pool to external origin.
2013-08-09 23:04:30 +02:00
Peter Rajnoha
2f61478436 workaround: gcc v4.8 on 32 bit param. passing bug when -O2 optimization used
gcc -O2 v4.8 on 32 bit architecture is causing a bug in parameter
passing. It does not happen with -O1 nor -O0.

The problematic part of the code was strlen use in config.c in
the config_def_check fn and the call for _config_def_check_tree in it:

<snip>
  rplen = strlen(rp);
  if (!_config_def_check_tree(handle, vp, vp + strlen(vp), rp, rp + rplen, CFG_PATH_MAX_LEN - rplen, cn, cmd->cft_def_hash)) ...
</snip>

If compiled with -O0 (correct):

Breakpoint 1, config_def_check (cmd=0x819b050, handle=0x81a04f8) at config/config.c:775
(gdb) p	vp
$1 = 0x8189ee0 <_cfg_path> "config"
(gdb) p	strlen(vp)
$2 = 6
(gdb)
_config_def_check_tree (handle=0x81a04f8, vp=0x8189ee0 <_cfg_path>
"config", pvp=0x8189ee6 <_cfg_path+6> "", rp=0xbfffe1e8 "config",
prp=0xbfffe1ee "", buf_size=58, root=0x81a2568, ht=0x81a6548)
at config/config.c:680
(gdb) p	vp
$4 = 0x8189ee0 <_cfg_path> "config"
(gdb) p	pvp
$5 = 0x8189ee6 <_cfg_path+6> ""

If compiled with -O2 (incorrect):

Breakpoint 1, config_def_check (cmd=cmd@entry=0x8183050, handle=0x81884f8) at config/config.c:775
(gdb) p	vp
$1 = 0x8172fc0 <_cfg_path> "config"
(gdb) p strlen(vp)
$2 = 6
(gdb) p	vp + strlen(vp)
$3 = 0x8172fc6 <_cfg_path+6> ""
(gdb)
_config_def_check_tree (handle=handle@entry=0x81884f8, pvp=0x8172fc7
<_cfg_path+7> "host_list", rp=rp@entry=0xbffff190 "config",
prp=prp@entry=0xbffff196 "", buf_size=buf_size@entry=58, ht=0x818e548,
root=0x818a568, vp=0x8172fc0 <_cfg_path> "config") at
config/config.c:674
(gdb) p	pvp
$4 = 0x8172fc7 <_cfg_path+7> "host_list"

The difference is in passing the "pvp" arg for _config_def_check_tree.
While in the correct case, the value of _cfg_path+6 is passed
(the result of vp + strlen(vp) - see the snippet of the code above),
in the incorrect case, this value is increased by 1 to _cfg_path+7,
hence totally malforming the string that is being processed.

This ends up with incorrect validation check and incorrect warning
messages are issued like:

 "Configuration setting "config/checks" has invalid type. Found integer, expected section."

To work around this issue, remove the "static" qualifier from the
"static char _cfg_path[CFG_PATH_MAX_LEN]". This causes the optimizer
to be less aggressive (also shuffling the arg list for
_config_def_check_tree call helps).
2013-08-09 13:24:50 +02:00
Peter Rajnoha
8d3347f70b WHATS_NEW: entry for 19baf84290 2013-08-08 10:04:53 +02:00
Jonathan Brassow
68c2d352ec WHATS_NEW: update WHATS_NEW for previous commit 2013-08-07 17:51:21 -05:00
Jonathan Brassow
b15278c3dc Mirror/RAID1: When up|down-converting default to segtype of current LV
If there is no RAID support in the kernel but the default mirror
segtype is "raid1", converting legacy mirrors can be problematic.
For example, changing the log type or converting a mirror to a linear
LV does not require the RAID modules to be present.  However, because
lp->segtype is set to be RAID1 by the configuration file, the command
fails.

We should only be setting lp->segtype when converting mirrors if it is
going to change (e.g. to linear or between mirror types).
2013-08-07 16:01:45 -05:00
Jonathan Brassow
7e1083c985 RAID: Make "raid1" the default mirror segment type 2013-08-06 14:13:55 -05:00
Zdenek Kabelac
003f08c164 clogd: fix descriptor leak when daemonizing 2013-08-06 16:21:51 +02:00
Zdenek Kabelac
7b1315411f clvmd: fix descriptor leak on restart
Do not leave descriptor used for dup2() opened.
2013-08-06 16:20:36 +02:00
Zdenek Kabelac
f6dd5a294b exec: pipe open
Function replaces popen() system and avoids shell execution
and argument parsing (no surprises).
2013-08-06 16:18:43 +02:00
Peter Rajnoha
61e7dc833c WHATS_NEW: previous commit 2013-08-06 14:03:43 +02:00
Peter Rajnoha
e195b5227e thin: apply VG profile if creating a new thin pool
When creating a new thin pool and there's no profile requested
via "lvcreate --profile ...", inherit any VG profile if it's attached.

Currently this applies to these settings:
  allocation/thin_pool_chunk_size
  allocation/thin_pool_discards
  allocation/thin_pool_zero
2013-08-06 11:42:40 +02:00
Peter Rajnoha
1cdd563b6c WHATS_NEW: move line to WHATS_NEW_DM 2013-08-06 11:42:01 +02:00
Jonathan Brassow
5ca54c4f0b dmeventd: Fix memory leak
When creating a timeout thread for snapshots, the thread is not
tracked and thus never joined.  This means that the exit status
of the timeout thread is held indefinitely.  Saves a bit of
memory to set PTHREAD_CREATE_DETACHED when creating this thread.

I've also added pthread_attr_init|destroy to setup the creation
pthread_attr_t.

Reported-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
2013-07-31 15:23:13 -05:00
Zdenek Kabelac
de0cba0e2d thin: initial --repair support for pools
Initial basic support for repair.
It currently takes the pool metadata spare volume, which
is used for recovery.  A new spare is created if the volume
is successfully repaired.

After the operation the previous _tmeta volume is moved
into a _tmeta%d volume and, if everything is ok, this volume
can be removed.
The new _tmeta needs to be pvmoved to its proper place and also
converted, e.g. to a mirror, if it should be mirrored.

Later version will try to automate some steps here.
2013-07-31 15:32:36 +02:00
Zdenek Kabelac
22fc80982a thin: add thin_repair and thin_dump options
Add new configure lvm.conf options for binaries thin_repair
and thin_dump.

Those are part of device-mapper-persistent-data package
and will be used for recovery of thin_pool.
2013-07-31 15:30:47 +02:00
Zdenek Kabelac
ea605d1ec7 thin: metadata resize needs 1.9 version
Version 1.8 is not yet fully usable for metadata resize.
2013-07-31 15:29:27 +02:00
Alasdair G Kergon
b6bfddcd0a alloc: fix lvextend when stripe number varies
The PREFERRED allocation mechanism requires the number of areas in the
previous LV segment to match the number in the new segment being
allocated.  If they do not match, the code may crash.
  E.g. https://bugzilla.redhat.com/989347

Introduce A_AREA_COUNT_MATCHES and when not set avoid referring
to the previous segment with the contiguous and cling policies.
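
The guard can be pictured with a short Python sketch. Only the flag name A_AREA_COUNT_MATCHES comes from the commit message; the flag value, the function names, and the list-of-areas representation are assumptions for illustration.

```python
A_AREA_COUNT_MATCHES = 0x01  # flag value chosen for illustration only


def alloc_flags(prev_area_count, new_area_count):
    """Set A_AREA_COUNT_MATCHES only when both segments have the same width."""
    flags = 0
    if prev_area_count == new_area_count:
        flags |= A_AREA_COUNT_MATCHES
    return flags


def preferred_areas(prev_areas, new_area_count):
    """Return the previous segment's areas to prefer, or [] when counts differ."""
    flags = alloc_flags(len(prev_areas), new_area_count)
    if not flags & A_AREA_COUNT_MATCHES:
        return []  # do not refer to prev_areas when the counts are mismatched
    return list(prev_areas)
```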
2013-07-29 19:35:45 +01:00
Peter Rajnoha
ecc9f74988 filters: fix segfault on incorrect global_filter
When using a global_filter and if this filter is incorrectly
specified, we ended up with a segfault:

  raw/~ $ pvs
    Invalid filter pattern "r|/dev/sda".
  Segmentation fault (core dumped)

In the example above a closing '|' character is missing at the end
of the regex. The segfault itself was caused by trying to destroy
the same filter twice in _init_filters fn within the error path
(the "bad" goto target):

bad:
        if (f3)
                f3->destroy(f3);
        if (f4)
                f4->destroy(f4);

Where f3 is the composite filter (sysfs + regex + type + md + mpath filter)
and f4 is the persistent filter which encompasses this composite filter
within persistent filter's 'real' field in 'struct pfilter'.

So in the end, we need to destroy the persistent filter only as
this will also destroy any 'real' filter attached to it.
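
The ownership rule can be illustrated with a small Python model. The classes below are stand-ins, not the lvm2 filter structures, and the `elif` branch (destroying f3 directly when no persistent filter was ever created) is an assumption about the error path rather than something stated in the message.

```python
class Filter:
    def __init__(self, name):
        self.name = name
        self.destroyed = False

    def destroy(self):
        if self.destroyed:
            # Stands in for the crash caused by freeing the same filter twice.
            raise RuntimeError(f"double destroy of {self.name}")
        self.destroyed = True


class PersistentFilter(Filter):
    def __init__(self, real):
        super().__init__("persistent")
        self.real = real  # the persistent filter owns the wrapped filter

    def destroy(self):
        self.real.destroy()
        super().destroy()


def cleanup(f3, f4):
    """Error-path cleanup: destroy only the outermost owner."""
    if f4:
        f4.destroy()  # also destroys f4.real, which is f3
    elif f3:
        f3.destroy()  # assumed case: f3 was never wrapped by a persistent filter
```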
2013-07-26 13:04:53 +02:00
Alasdair G Kergon
06dce7d539 post-release 2013-07-25 00:38:53 +01:00
Alasdair G Kergon
76e617b158 release 2.02.99
363 files changed, 19863 insertions(+), 9055 deletions(-)

This is a very large release - so expect bugs!

Please treat this release like a release candidate.
Changes to the external interfaces since 2.02.98 are not yet frozen.

Updated releases will follow quickly (days not weeks) as any problems
are handled.
2013-07-24 23:59:03 +01:00
Alasdair G Kergon
d13e87b9ef cleanup: comments and a message 2013-07-24 22:10:37 +01:00
Zdenek Kabelac
5597dc3652 thin: not zeroing for non-zeroed thin pool snaps
Do not zero initial 4KB of thin snapshot volume for thin pool with
disabled zeroing.
2013-07-24 01:15:31 +02:00
Peter Rajnoha
31de670318 lvconvert: add more checks for lvconvert --type
The --type mirror requires -m/--mirrors:

  lvconvert --type mirror vg/lvol0
    --type mirror requires -m/--mirrors
    Run `lvconvert --help' for more information.

The --type raid* is allowed (the checks already existed):

  lvconvert --type raid10 vg/lvol0
    Converting the segment type for vg/lvol0 from linear to raid10 is not yet supported.

The --type snapshot is a synonym to -s/--snapshot:

  lvconvert -s vg/lvol0 vg/lvol1
    Logical volume lvol1 converted to snapshot.

  lvconvert --type snapshot vg/lvol0 vg/lvol1
    Logical volume lvol1 converted to snapshot.

All the other segment types are not supported, e.g.:

  lvconvert --type zero vg/lvol0
    Conversion using --type zero is not supported.
    Run `lvconvert --help' for more information.
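
The checks listed above amount to a small decision table, sketched here in Python. The function name `check_lvconvert_type` and its parameters are hypothetical; only the two error strings are taken from the examples in this message.

```python
def check_lvconvert_type(segtype, mirrors_given=False):
    """Return an error string, or None when the --type value is acceptable."""
    if segtype == "mirror":
        if not mirrors_given:
            return "--type mirror requires -m/--mirrors"
        return None
    if segtype.startswith("raid") or segtype == "snapshot":
        # raid* goes on to its own existing checks; snapshot behaves like -s.
        return None
    return f"Conversion using --type {segtype} is not supported."
```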
2013-07-23 17:13:54 +02:00
Peter Rajnoha
9b1834f075 man: pvs -o ba_start,ba_size -> pv_ba_start,pv_ba_size 2013-07-23 14:45:30 +02:00
Peter Rajnoha
ea333a894e systemd: lvm2-activation-generator: use LOG_DEBUG/ERR severity for kmsg 2013-07-22 14:04:12 +02:00
Alasdair G Kergon
ccc29f17b6 cmdline: support ARG_GROUPABLE in merge_synonym 2013-07-19 20:37:43 +01:00
Alasdair G Kergon
90a09559ed commandline: add prefix aliases for raid options
Accept --raidwritemostly as well as --writemostly etc.
2013-07-19 19:24:54 +01:00
Jonathan Brassow
4eea660191 RAID: Fix segfault when reporting raid_syncaction field on older kernel
The status printed for dm-raid targets on older kernels does not include
the syncaction field.  This is handled by dev_manager_raid_status() just
fine by populating the raid status structure with NULL for that field.
However, lv_raid_sync_action() does not properly handle that field being
NULL.  So, check for it and return 0 if it is NULL.
2013-07-19 10:01:48 -05:00
Alasdair G Kergon
da79fe4c1d reporting: tidy recent new fields
Add underscores and prefixes to recently-added fields.
(Might add more alias functionality in future.)
2013-07-19 01:30:02 +01:00
Alasdair G Kergon
357df34133 display: fix units for sizes <1k 2013-07-18 17:55:58 +01:00
Zdenek Kabelac
460d0254eb thin: add pool metadata spare lv support
Add support for pool's metadata spare volume.
2013-07-18 18:22:43 +02:00
Zdenek Kabelac
08df7ba844 thin: improve pool creation activation order
Pool creation involves clearing of metadata device
which triggers udev watch rule we cannot udev synchronize with
in current code.

This metadata device needs to be activated locally,
so in cluster mode deactivation and reactivation
is always needed.

However for non-clustered mode we may reload table
via suspend/resume path which avoids collision with
udev watch rule which was occasionally triggering
retry deactivation loop.

Code has been also split into 2 separate code paths
for thin pools and thin volumes which improved readability
of the code as well.

Deactivation has been moved out of extend_pool() and
decision is now in _lv_create_an_lv() which knows
the change mode.
2013-07-18 18:22:43 +02:00
Zdenek Kabelac
4e724f5f52 thin: for thin volumes properly list modules
thin volume needs   thin-pool and  thin kernel modules so print them
both for   lvs -o+modules
2013-07-18 18:22:43 +02:00
Zdenek Kabelac
7afa9cebcb thin: fix error path in creation path
Remove some calls to revert_new_lv when no LV has been created/committed so far.
When the pool update failed - then only revert is needed.
2013-07-18 18:22:43 +02:00
Zdenek Kabelac
7b4b97b731 snapshot: local activation for clear COW device
To clear snapshot cow device in cluster enforce local
activation here.
2013-07-18 18:22:43 +02:00
Peter Rajnoha
8606bf316a systemd: generator: add lvm2-activation-net.service
The new lvm2-activation-net.service activates LVM volumes
after network-attached devices are set up (iSCSI and FCoE)
if lvmetad is disabled and hence the autoactivation is not
used.
2013-07-17 16:54:52 +02:00
Zdenek Kabelac
57be501aa3 dev_manager: lower memory usage
The dlid created for the test is not needed afterwards, so lower the memory
usage, since this call is used repeatedly when building a large tree.

TODO: create function to use given buffer on stack as much cheaper.
2013-07-15 15:59:20 +02:00
Zdenek Kabelac
0443c42e3b thin: add sub volumes as whole volumes
Do not use origin_only when add log_lv and metadata as a subvolume.
The stacked volume needs to access whole volume in this case.
2013-07-15 15:58:07 +02:00
Zdenek Kabelac
97d36d5750 thin: check and use layered origin lv
The code needs to check whether the layered origin device is suspended.
It is valid to create a thin volume snapshot of a thin volume which is also
used as an old-style snapshot. In this case we need to check whether -real
is suspended.

When adding origin_only - add only layer thin volume.
(in case it's also old-snapshot add only -real device)
2013-07-15 15:51:39 +02:00
Zdenek Kabelac
925701d9f3 thin: write back when command successfully finished
Remove the backup() call from update_pool_lv(), as it was duplicated
there, and properly order the backup() call after lvresize,
so there is just one such call.
2013-07-15 15:48:32 +02:00
Zdenek Kabelac
42881c8877 thin: send messages to active pool
If the thin pool is known to be active, messages can be passed
to the pool even when the created thin volume is not going to be
activated.

So we do not need to stack a large list of messages, and we can validate
and catch creation errors earlier in this case.

Replace the test for valid activation combination with simpler list of
deactivation combinations.
2013-07-15 15:47:25 +02:00
Peter Rajnoha
9a44ba94a5 WHATS_NEW: support for LV activation skip flag 2013-07-12 21:37:58 +02:00
Peter Rajnoha
953a438e93 dumpconfig: add --type profilable
The --type profilable shows all config settings that
are customizable by profiles:

  raw/~ $ lvm dumpconfig --type profilable
  allocation {
	  thin_pool_zero=1
	  thin_pool_discards="passdown"
	  thin_pool_chunk_size=64
  }
  activation {
	  thin_pool_autoextend_threshold=100
	  thin_pool_autoextend_percent=20
  }
2013-07-09 10:00:47 +02:00
Peter Rajnoha
5ed7d0cf1d dumpconfig: add --mergedconfig option
Normally, the lvm dumpconfig processes only the configuration tree
that is at the top of the cascade. Considering the cascade is:

  CONFIG_STRING -> CONFIG_PROFILE -> CONFIG_MERGED_FILES/CONFIG_FILE

...then:

  (dumpconfig of lvm.conf only)
  raw/~ $ lvm dumpconfig allocation
  allocation {
	  maximise_cling=1
	  mirror_logs_require_separate_pvs=0
	  thin_pool_metadata_require_separate_pvs=0
	  thin_pool_chunk_size=64
  }

  (dumpconfig of selected profile configuration only)
  raw/~ $ lvm dumpconfig --profile test allocation
  allocation {
	  thin_pool_chunk_size=8
	  thin_pool_discards="passdown"
	  thin_pool_zero=1
  }

  (dumpconfig of given --config configuration only)
  raw/~ $ lvm dumpconfig --config 'allocation{thin_pool_chunk_size=16}' allocation
  allocation {
	  thin_pool_chunk_size=16
  }

The --mergedconfig option causes the configuration cascade to be
merged before processing it with dumpconfig:

  (dumpconfig of merged selected profile and lvm.conf)
  raw/~ $ lvm dumpconfig --profile test allocation --mergedconfig
  allocation {
	  maximise_cling=1
	  thin_pool_zero=1
	  thin_pool_discards="passdown"
	  mirror_logs_require_separate_pvs=0
	  thin_pool_metadata_require_separate_pvs=0
	  thin_pool_chunk_size=8
  }

  (dumpconfig merged given --config and selected profile and lvm.conf)
  raw/~ $ lvm dumpconfig --profile test --config 'allocation{thin_pool_chunk_size=16}' allocation --mergedconfig
  allocation {
	  maximise_cling=1
	  thin_pool_zero=1
	  thin_pool_discards="passdown"
	  mirror_logs_require_separate_pvs=0
	  thin_pool_metadata_require_separate_pvs=0
	  thin_pool_chunk_size=16
  }

Hence with the --mergedconfig, we are able to see the
configuration that is actually used when processing any
LVM command while using any combination of --config/--profile
options together with lvm.conf file.
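
The merge order can be mimicked with plain dictionaries. This Python sketch is not how lvm2 stores its configuration trees; `merge_cascade` is a hypothetical helper, and the three layers simply reuse the values shown in the examples above.

```python
def merge_cascade(*layers):
    """Merge config layers given from highest to lowest priority."""
    merged = {}
    for layer in reversed(layers):  # apply the lowest priority first
        for section, settings in layer.items():
            merged.setdefault(section, {}).update(settings)
    return merged


# Values taken from the dumpconfig examples in this message.
lvm_conf = {"allocation": {
    "maximise_cling": 1,
    "mirror_logs_require_separate_pvs": 0,
    "thin_pool_metadata_require_separate_pvs": 0,
    "thin_pool_chunk_size": 64,
}}
profile = {"allocation": {
    "thin_pool_chunk_size": 8,
    "thin_pool_discards": "passdown",
    "thin_pool_zero": 1,
}}
config_string = {"allocation": {"thin_pool_chunk_size": 16}}

# CONFIG_STRING -> CONFIG_PROFILE -> CONFIG_FILE, highest priority first.
merged = merge_cascade(config_string, profile, lvm_conf)
```

Here `merged["allocation"]["thin_pool_chunk_size"]` comes from the --config string, while settings that only lvm.conf defines, such as maximise_cling, are carried through unchanged.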
2013-07-08 16:05:56 +02:00
Zdenek Kabelac
985251c8f3 locking: unlock memory on error path
Unlock memory and unblock signals on error path.
2013-07-08 14:02:49 +02:00
Alasdair G Kergon
7c6526aae2 lvresize: separate validation from action
Start separating the validation from the action in the basic lvresize
code moved to the library.
Remove incorrect use of command line error codes from lvresize library
functions.  Move errors.h to tools directory to reinforce this,
exporting public versions of the error codes in lvm2cmd.h for dmeventd
plugins to use.
2013-07-06 03:28:21 +01:00
Peter Rajnoha
0a5b68e87b man: document profile config and related options
Document following items:
  configuration cascade (man lvm.conf)
  --profile ProfileName (man lvm)
  --detachprofile (man vgchange/lvchange)
  -o vg_profile/lv_profile (man vgs/lvs)

Also document --config a bit so we can see where it fits in the
configuration cascade - will be documented more in following commit...
2013-07-03 16:49:26 +02:00
Zdenek Kabelac
6f335ffa35 sigint: improve logic for sigint reaction
Fix and improve handling on sigint.

Always check for signal presence *before* calling of command,
so it will not call the command when break was hit.

If the command has finished successfully, there is
no problem in marking the command ok and not reporting the interrupt at all.

Fix a couple of related stack; reports and assignments.
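
The ordering of the check can be sketched in Python as below. `process_each` and its `interrupted` callback are hypothetical; the real code works with lvm2's signal handling, which is not reproduced here.

```python
def process_each(items, command, interrupted):
    """Run command on each item, checking for an interrupt before each call.

    interrupted: callable returning True once a break has been requested.
    Returns (results, was_interrupted).
    """
    results = []
    for item in items:
        if interrupted():
            return results, True  # break seen: do not start the command
        results.append(command(item))
    return results, False  # everything finished: no interrupt is reported
```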
2013-07-03 14:46:42 +02:00
Mike Snitzer
fe09d84668 lvconvert: Rename _swap_lv to _swap_lv_identifiers and move to allow an additional user 2013-07-02 17:02:25 -04:00
Mike Snitzer
f9e0adcce5 snapshot: Rename snapshot segment returning methods from find_*_cow to find_*_snapshot
find_cow -> find_snapshot, find_merging_cow -> find_merging_snapshot.
This will allow thin snapshot code to reuse these methods without confusion.
2013-07-02 16:26:03 -04:00
Tony Asleson
79106f7394 liblvm/python-lvm New additions
Signed-off-by: Tony Asleson <tasleson@redhat.com>
2013-07-02 14:24:34 -05:00
Peter Rajnoha
fd94dd3b7e WHATS_NEW: config/profile_dir 2013-07-02 15:57:46 +02:00
Peter Rajnoha
7ce0c6777b WHATS_NEW: configuration profile support 2013-07-02 15:51:21 +02:00
Zdenek Kabelac
e30028004b archiver: do not archive vg more than once
Do not keep multiple archives for the executed command.
Reuse the ALLOCATABLE_PV bit from the pv status for the
ARCHIVED_VG vg status. Mark the VG with this bit on the
first archive.
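
A toy Python model of the once-per-command guard follows. Only the name ARCHIVED_VG comes from the message; the bit value, the `VG` class, and the `archive` function with its `store` list are assumptions for illustration.

```python
ARCHIVED_VG = 0x01  # illustrative bit value; the commit reuses an existing status bit


class VG:
    def __init__(self, name):
        self.name = name
        self.status = 0


def archive(vg, store):
    """Archive vg once per command; later calls are no-ops."""
    if vg.status & ARCHIVED_VG:
        return False  # already archived during this command
    store.append(vg.name)
    vg.status |= ARCHIVED_VG
    return True
```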
2013-07-01 23:09:26 +02:00
Jonathan Brassow
8215846aa5 Clean-up: WHATS_NEW
Choosing between two entries and forgot to remove one.
2013-06-19 19:55:34 -05:00
Jonathan Brassow
a6d13308ec RAID/MIRROR: Honor mirror_segtype_default when upconverting linear LVs
If the user would upconvert a linear LV to a mirror without specifying
the segment type ("--type mirror" vs "--type raid1"), the "mirror"
segment type would be chosen without consulting the 'mirror_segtype_default'
setting in lvm.conf.  This is now used as the basis for determining
which should be used if left unspecified.
2013-06-19 17:50:10 -05:00
Zdenek Kabelac
155841c349 lvmetad: fix compare function
Check for enough space in preallocated buffer.
Fixes problem, when lvm code started to suddenly allocate
too big memory chunks.

TODO: lvmetad protocol should announce needed size ahead,
so if metadata have 1MB we are not reallocating memory...
2013-06-18 22:12:51 +02:00
Zdenek Kabelac
2562968864 vgcfgrestore: fix crash on restore of wrong vgname
When the vgname did not exist in the metadata, the command crashed on a double free
in format_instance destroy(): the VG was created, used the FID and was
released, which also released the FID, so further use accessed bad
memory.

Fix it for this code path before release_vg() so the FID will exist
when _vg_read_file_name() returns NULL.
2013-06-18 22:11:21 +02:00
Alasdair G Kergon
c2dc21d89f text: miscellaneous comments & message tweaks 2013-06-15 01:28:54 +01:00
Peter Rajnoha
dba53681a5 man: refine lvm.conf and man page documentation for autoactivation feature 2013-06-14 10:02:56 +02:00
Zdenek Kabelac
fe22089edf thin: vgsplit support for thins
Support vgsplit for VGs with thin pools and thin volumes.
In case the thin data and thin metadata volumes are moved to a new VG,
move there also all related thin volumes and check that external origins
are also present in this new VG.
2013-06-13 14:51:00 +02:00
Peter Rajnoha
966d4f36d7 filter-mpath: detect partitions of mpath components
We use mpath filtering (enabled by devices/multipath_component_detection=1
lvm.conf setting) to avoid a situation in which we could end up with
duplicate PVs found. We need to filter out the mpath components and
use only the top-level multipath mapping instead for PV scans.

However, if there are partitions on multipath components, we need
to filter out these partitions. This patch fixes it so those
partitions found on multipath components are filtered as well.

For example, let's consider following configuration:
The sda and sdb are mpath components, sda1 and sdb1 the partitions
on these components, mpath-test the mpath mapping and mpath-test1
the partition mapping - created automatically by kpartx right
after mpath-test creation. The PV resides on top.

       (LVM PV)
          |
      mpath-test1
          |
      mpath-test
          |
sda1 ---------- sdb1
   \ |        |/
    sda      sdb

E.g. for sda1 and sdb1, the code will detect this and it skips
the partition that belongs to the multipath component:
  <snippet from the log>
    #filters/filter-mpath.c:156         /dev/sda1: Device is a partition, using primary device /dev/sda for mpath component detection
    130 #ioctl/libdm-iface.c:1724         dm status   (253:2) OF[16384](*1)
    131 #filters/filter-mpath.c:196         /dev/sda1: Skipping mpath component device
  </snippet from the log>

Otherwise, we'd see the same PV label on sda1/sdb1 and mpath-test1
at the same time ending up with "Duplicate PV found...".
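
The partition handling described above can be sketched roughly as follows.
This is only an illustration in Python, not the code in
filters/filter-mpath.c: the function names are hypothetical, the set of
mpath components is hard-coded from the example, and the partition is
mapped to its primary device by name only (an assumption made for the
sketch; the real filter asks the kernel).

```python
import re

# Hypothetical stand-in for the kernel lookup: devices that are
# components of a multipath mapping in the example above.
MPATH_COMPONENTS = {"sda", "sdb"}


def primary_device(name):
    """Return the whole-disk name for an sdX-style partition name.

    Assumption: partitions are named <disk><number> (sda1 -> sda).
    Any other name is returned unchanged.
    """
    match = re.fullmatch(r"(sd[a-z]+)\d+", name)
    return match.group(1) if match else name


def passes_mpath_filter(name, components=MPATH_COMPONENTS):
    """Return False for mpath components and for partitions on them."""
    return primary_device(name) not in components


for dev in ("sda", "sda1", "sdb1", "mpath-test1"):
    print(dev, "accepted" if passes_mpath_filter(dev) else "skipped")
```

With the example layout, only mpath-test1 is accepted, so the PV label is
seen once.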
2013-06-12 13:13:38 +02:00
Zdenek Kabelac
87aca628d6 thin: lvresize supports pool metadata resize
Add support for lvresize of thin pool metadata device.

lvresize --poolmetadatasize +20   vgname/thinpool_lv

or

lvresize -L +20 vgname/thinpool_lv_tmeta

Where the second one allows all the args for resize (striping...)
and the first option resizes according to the last metadata LV segment.
2013-06-11 14:05:20 +02:00
Zdenek Kabelac
72c3ae253e thin: add helper functions
Add find_pool_lv() and pool_can_resize_metadata().
2013-06-11 14:03:30 +02:00
Zdenek Kabelac
55a3859632 thin: detect online metadata resize support 2013-06-11 14:03:28 +02:00
Zdenek Kabelac
01ef97fcbb thin: report 'e' metadata type with higher priority
Reporting that a volume is of the 'metadata' type has higher priority
than e.g. the 'mirror' or 'thin' flag; for those types we have
the 'target attr' (7th field).
2013-06-11 14:03:08 +02:00
Zdenek Kabelac
c0290489c3 thin: report o as volume type for external origin
Reuse 'o' attr for lvs report also for external origin.
2013-06-11 14:02:41 +02:00
Zdenek Kabelac
7151ede767 thin: report t for thin pool and volume
Do not mark internal device _tdata and _tmeta as having target type 't'.
They have a target type of their own (e.g. mirror, raid).
2013-06-11 13:58:16 +02:00
Zdenek Kabelac
272f5ae208 snapshots: check for active state
Fix the test for whether the snapshot can be resized and use lv_is_active()
to get the correct answer in a cluster.
2013-06-11 13:57:18 +02:00
Zdenek Kabelac
f05c5a97c3 filters: dump filter returns error code
Add int return value from dump() function.
Report stack for error case.
Update composable filter.
2013-06-03 08:42:25 +02:00
Zdenek Kabelac
5467a3b2b7 filters: update composable filter
Last commit made dump filter only partially composable.
Add remaining functionality and also support composable wipe,
which is needed when e.g. vgscan needs to remove the cache.

(in release fix)
2013-06-02 22:46:06 +02:00
Petr Rockai
1f73e992ef lvmetad: no use of persistent filter with lvmetad 2013-06-02 00:49:55 +02:00
Petr Rockai
e7878da921 filters: toplevel filter not persistent
Add a generic dump operation to filters and make the composite filter call
through to its components. Previously, when global filter was set, the code
would treat the toplevel composite filter's private area as if it belonged to a
persistent filter, trying to write nonsense into a non-sensical file.
Also deal with NULL cmd->filter gracefully.
2013-06-02 00:48:58 +02:00
Petr Rockai
05bf4b8cc3 vgimportclone: override global_filter in lvm.conf
The global filter in system's lvm.conf may conflict with the custom filter we
set up in vgimportclone (they can easily fail to intersect). Since we explicitly
avoid talking to lvmetad in vgimportclone, it is safe and reasonable to do so.
2013-06-02 00:47:17 +02:00
Zdenek Kabelac
3ced1bf694 lvresize: check for max snapshot size
As with lvcreate, lvresize also doesn't need to grow a snapshot bigger than needed.
2013-05-30 17:35:23 +02:00
Zdenek Kabelac
bd3ece0128 lvcreate: reduce too large cow
Detect the maximum usable size of a snapshot COW device,
and do not waste more space on such an LV than needed.
2013-05-30 17:35:14 +02:00
Zdenek Kabelac
eb7e206a73 snapshot: add cow_max_extents
Add more precise calculation of the maximum usable snapshot size.
Using only percentage fails for small size of snapshot and extents.
2013-05-30 17:30:15 +02:00
Zdenek Kabelac
59962d8d3e snapshot: require 3 chunks for creation
There is no point in creating a 2-chunk snapshot,
since the snapshot is invalidated immediately by the first write
as there is no free chunk for COW blocks
(2 chunks are used by the snap header and the 1st metadata chunk).

Enhance error message about the lowest usable size.
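
The minimum-size rule can be illustrated with a small sketch. The names
below are hypothetical and the error text is not the message lvm2 prints;
the only fact taken from the commit is that two chunks are consumed by the
header and the first metadata chunk, so at least a third is needed.

```python
# header chunk + first metadata chunk + at least one chunk for COW data
SNAPSHOT_MIN_CHUNKS = 3


def min_snapshot_sectors(chunk_size_sectors):
    """Smallest usable snapshot size for a given chunk size."""
    return SNAPSHOT_MIN_CHUNKS * chunk_size_sectors


def check_snapshot_size(size_sectors, chunk_size_sectors):
    """Raise ValueError when the snapshot could not hold any COW block."""
    minimum = min_snapshot_sectors(chunk_size_sectors)
    if size_sectors < minimum:
        raise ValueError(
            "snapshot needs at least %d sectors (%d chunks)"
            % (minimum, SNAPSHOT_MIN_CHUNKS)
        )
    return size_sectors
```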
2013-05-30 17:28:03 +02:00
Zdenek Kabelac
56779c32c5 snapshot: fix resize of 100% full cow
When the COW area is using all the available space (100%) it can be still
a valid snapshot which may need a resize. So support it.
2013-05-30 17:26:20 +02:00
Zdenek Kabelac
99f0483580 args: do not accept >=16EiB sizes
Instead of seeing weird overflows inside the lvm code,
giving false error messages, kill the user experiment in the beginning.

Who needs to use more than 16EiB with lvm2 and 64bit anyway...
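
A minimal sketch of such an up-front check, assuming sizes are given as an
integer followed by a single binary unit letter. The parser, its names and
its error text are hypothetical and much simpler than lvm2's argument
handling; the point is only that 16EiB equals 2^64 bytes, which no longer
fits a 64-bit byte count.

```python
UNIT_BYTES = {
    "k": 2**10, "m": 2**20, "g": 2**30,
    "t": 2**40, "p": 2**50, "e": 2**60,
}
LIMIT_BYTES = 2**64  # 16EiB


def parse_size_bytes(text):
    """Parse '<integer><unit>' and reject anything >= 16EiB."""
    number, unit = text[:-1], text[-1:].lower()
    if unit not in UNIT_BYTES:
        raise ValueError("unknown size unit in %r" % text)
    size = int(number) * UNIT_BYTES[unit]
    if size >= LIMIT_BYTES:
        raise ValueError("size %r is too big (>= 16EiB)" % text)
    return size
```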
2013-05-30 17:23:51 +02:00
Zdenek Kabelac
2f1a571c97 fid: fix reset of PV fid
Avoid memory corruption (double free) in the code path
where the PV FID has already been destroyed but the released pointer
was left in the PV structure, from where it could be released
a second time during final context destruction.
2013-05-30 16:52:39 +02:00
Peter Rajnoha
be25f7ac83 WHATS_NEW: ea_start,ea_size -> ba_start,ba_size 2013-05-28 12:43:26 +02:00
Peter Rajnoha
732859d21f refactor: rename embedding area -> bootloader area 2013-05-28 12:37:22 +02:00
Zdenek Kabelac
9966842810 snapshot: skip monitor for large cows
If snapshot cow device is already big enough to
cover whole origin, do not monitor it.
2013-05-27 10:35:43 +02:00
Zdenek Kabelac
77952151af snapshot: add lv_is_cow_covering_origin
Add a function to check whether the size of the cow is already big enough
to cover the whole origin.
2013-05-27 10:34:53 +02:00
Zdenek Kabelac
06e8ff29ff snapshot: use dm_get_status_snapshot()
Replace code with libdm call to dm_get_status_snapshot().
2013-05-27 10:32:02 +02:00
Zdenek Kabelac
2ada982e73 vgchange: check for mounted fs
Check for mounted fs also for vgchange command, not just lvchange.

NOTE: Code is using lv_info() just like lvs_in_vg_opened().
It should probably be converted into lv_is_active_locally().
2013-05-20 16:47:33 +02:00
Jonathan Brassow
06ac797f42 Clean-up: Replace 'lv_is_active' with more correct/specific variants
There are places where 'lv_is_active' was being used where it was
more correct to use 'lv_is_active_locally'.  For example, when checking
for the existence of a kernel instance before asking for its status.
Most of the time these would work correctly.  (RAID is only allowed on
non-clustered VGs at the moment, which means that 'lv_is_active' and
'lv_is_active_locally' would give the same result.)  However, it is
more correct to use the proper variant and it helps with future
scenarios where targets might be allowed exclusively (or clustered) in
a cluster VG.
2013-05-16 10:36:56 -05:00
Peter Rajnoha
b3b551a93e WHATS_NEW: bad day 2013-05-16 11:02:38 +02:00
Peter Rajnoha
cb0d817fb5 WHATS_NEW: for commit 4f6c2951d6 2013-05-16 08:38:27 +02:00
Alasdair G Kergon
f12d88f840 activation: fix lv_is_active regressions
Try to fix commit bf2741376d.

lv_is_active is not the same as lv_info(cmd, org, 0, &info, 0, 0).

Introduce and use lv_is_active_locally.
2013-05-15 02:13:31 +01:00
Alasdair G Kergon
2fbe1e6e00 rephrasing: miscellaneous changes
Miscellaneous changes to messages, man pages, comments and WHATS_NEW.
2013-05-15 01:50:42 +01:00
Alasdair G Kergon
2e4a66a761 make: fix exported symbols regex for non-GNU sed
Remove a couple of incorrect backslashes from expressions used to
generate lists of exported symbols so it works with busybox sed.
[John Spencer]
2013-05-14 19:29:26 +01:00
Alasdair G Kergon
c6cf2ed7fd commands: accept --yes globally
Accept --yes on all commands, even ones that don't today have prompts,
so that test scripts that don't care about interactive prompts no
longer need to deal with them.

But continue to mention --yes only in the command prototypes that
actually use it.
2013-05-14 18:45:37 +01:00
Mike Snitzer
8ad7865b42 Fix alignment of PV data area if detected alignment less than 1 MB
This fixes a long standing regression since LVM2 2.02.74 (commit 4efb1d9c,
"Update heuristic used for default and detected data alignment.")

The default PE alignment could be used (via MAX()) even if it was
determined that the device's MD stripe width, or minimal_io_size or
optimal_io_size were not factors of the default PE alignment (either 64K
or the newer default of 1MB, etc).  This bug would manifest if the
default PE alignment was larger than the overriding hint that the
device provided (e.g. default of 1MB vs optimal_io_size of 768K).
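
The difference can be sketched as follows. This is an illustration inferred
from the description above, not the actual patch: the function names are
hypothetical and all values are in KiB.

```python
def buggy_pe_align(default_kib, hint_kib):
    """Old behaviour as described: the larger value always wins."""
    return max(default_kib, hint_kib)


def fixed_pe_align(default_kib, hint_kib):
    """Keep the default only if the device hint is a factor of it."""
    if hint_kib == 0:
        # the device provided no hint
        return default_kib
    if default_kib % hint_kib == 0:
        # the default is also aligned to the hint
        return default_kib
    return hint_kib


# default of 1MB vs optimal_io_size of 768K
print(buggy_pe_align(1024, 768), fixed_pe_align(1024, 768))
```

For the example from the message the old logic picks 1024K, which is not a
multiple of 768K, while the corrected logic picks 768K.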

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2013-05-13 15:56:47 -04:00
Zdenek Kabelac
55fe07ad98 mm: fix leak in fail path
If the dm_realloc would fail, the already allocated _maps_buffer
memory would have been lost (overwritten with NULL).
Fix this by using temporary line buffer.

Also add a minor cleanup to set end of buffer to '\0',
only when we really know the file size fits the preallocated buffer.
2013-05-13 13:13:20 +02:00
Peter Rajnoha
4407133113 toolcontext: check dm version lazily for udev_fallback setting
Setting the cmd->default_settings.udev_fallback also requires a DM
driver version check. However, this caused a needless mapper/control
access with ioctl even when it was not actually required. For example if we're not
using activation code, we don't need to know the udev_fallback as
there's no node and symlink processing.

For example, this premature mapper/control access caused problems
when using lvm2app even when no activation happens - there are
situations in which we don't need to use mapper/control, but still
need some of the lvm2app functionality. This is also the case for
lvm2-activation systemd generator which just needs to look at the
lvm2 configuration, but it shouldn't touch mapper/control.
2013-05-13 11:53:53 +02:00
Zdenek Kabelac
8d004b5127 report: show active state of LV
For non clustered VG - show  "active"/""

For clustered VG it's more complex:

"local exclusive"
"remote exclusive"
"locally"
"remotely"
2013-04-25 17:33:24 +02:00
Zdenek Kabelac
8b18ab76d2 report: show dmeventd monitoring status
Add new lvs segment field 'Monitor' showing 3 states:

"monitored" - LV is monitored by dmeventd.

"not monitored" - LV is currently not being monitored by dmeventd

"" (empty) - LV does not support monitoring, or dmeventd support
             is not compiled in.
2013-04-25 17:33:24 +02:00
Zdenek Kabelac
3f7de58e96 man: lvextend --use-policies
Add missing man info.
2013-04-25 17:33:24 +02:00
Zdenek Kabelac
f84f12a6a3 snapshot: rework cluster creation and removal
Support for exclusive activation of snapshots revealed some problems.

When snapshot is created, COW LV is activated first (for clearing) and
then it's transformed into snapshot's COW LV, but it has left the lock
for such LV active in cluster and this lock could not have been removed
from dlm, unless snapshot has been removed within same dlm session.

If the user tried to remove snapshot after rebooting node, the lock was
missing, and COW LV could not have been detached.

The patch modifies the approach in this way:

Always deactivate the COW LV for a clustered VG after clearing (so it's
activated again via the implicit snapshot activation rule when the snapshot is activated).

When the snapshot is removed, activate the COW LV as an independent LV, so the lock
will exist for such an LV, but only when the snapshot is active.

Also add test case for testing snapshot removal after cluster reboot.
2013-04-25 17:33:24 +02:00
Zdenek Kabelac
d51b7e5404 clvmd: avoid pretesting of dev availability
Patch fixes hidden problem with lvm metadata caching.

When the pretest was made, only the committed data was cached back
since the call to lv_info_by_lvid() triggers an mda read operation.
However, the call to lv_suspend_if_active() also reads precommitted metadata.
The problem is visible in this sequence of calls:

vg_write(), suspend_lv(), vg_commit(), resume_lv()

which may end up leaving outdated mda in the lvm cache, since vg_write()
drops cached metadata and vg_commit() only transforms precommitted
to committed metadata, but in the case of pretesting we have
no precommitted mda available so the cache will continue to use
old metadata. This happens when the LV being suspended is inactive.
2013-04-25 17:33:22 +02:00
Zdenek Kabelac
45eeb70b02 config: merge timestamps
Merging multiple config files together needs to know the newest (highest)
timestamp of the merged files. The persistent cache file is used
only if the config file is older than the .cache file.
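
As a rough illustration of the rule (hypothetical function names,
timestamps as plain integers), the merged configuration takes the newest
timestamp of its parts, and the cache is trusted only when that timestamp
is older than the cache file's:

```python
def merged_timestamp(config_timestamps):
    """The merged config is as new as its newest component."""
    return max(config_timestamps)


def can_use_persistent_cache(config_timestamps, cache_timestamp):
    """Use the .cache file only if every merged config file is older."""
    return merged_timestamp(config_timestamps) < cache_timestamp
```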
2013-04-23 12:31:16 +02:00
Zdenek Kabelac
1951798d72 vgread: fix fid transfer for lvm1 and pool format
Assign fid as the last step before returning VG.
Make the format reader for 'lvm1' and 'pool' equal to 'lvm2' format reader.

It caused memory corruption in lvmetad as it later calls
destroy_instance() on the allocated fid. This patch should fix problems
with crashing test lvmetad-lvm1.sh.
2013-04-21 23:13:57 +02:00
Zdenek Kabelac
a2b76a6f02 thin: fix resource leak in err path
If the devices list could not have been obtained, FILE* was leaked.
2013-04-21 23:10:30 +02:00
Zdenek Kabelac
17a6915054 thin: explicitly avoid pvmove operation
So far we do not support pvmove for thin volumes
and thin pools.
2013-04-21 23:09:11 +02:00
Zdenek Kabelac
f787b575b5 lvmetad: fix error paths
Also add missing goto out on error.
Error path missed return NULL leading to double free of enc_value.
2013-04-21 23:04:53 +02:00
Zdenek Kabelac
c9d8d22224 clvmd: fix response status
Failing status code is expected to be 0.
Also do not return '*response' as a pointer which has already been freed.
2013-04-21 22:54:42 +02:00
Jonathan Brassow
2e0740f7ef RAID: Add writemostly/writebehind support for RAID1
'lvchange' is used to alter a RAID 1 logical volume's write-mostly and
write-behind characteristics.  The '--writemostly' parameter takes a
PV as an argument with an optional trailing character to specify whether
to set ('y'), unset ('n'), or toggle ('t') the value.  If no trailing
character is given, it will set the flag.
Synopsis:
        lvchange [--writemostly <PV>:{t|y|n}] [--writebehind <count>] vg/lv
Example:
        lvchange --writemostly /dev/sdb1:y --writebehind 512 vg/raid1_lv
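
The argument format can be sketched like this. The helper names are
hypothetical and this is not the parser used by lvchange; it only mirrors
the rule stated above, where a trailing ':y', ':n' or ':t' selects the
action and a missing suffix means "set".

```python
def parse_writemostly(arg):
    """Split '<PV>[:{t|y|n}]' into (pv, action); default action is 'y'."""
    if len(arg) > 2 and arg[-2] == ":" and arg[-1] in ("t", "y", "n"):
        return arg[:-2], arg[-1]
    return arg, "y"


def apply_writemostly(current, action):
    """Return the new write-mostly flag for set, unset or toggle."""
    if action == "y":
        return True
    if action == "n":
        return False
    return not current  # 't' toggles the existing value


pv, action = parse_writemostly("/dev/sdb1:y")
print(pv, action, apply_writemostly(False, action))
```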

The last character in the 'lv_attr' field is used to show whether a device
has the WriteMostly flag set.  It is signified with a 'w'.  If the device
has failed, the 'p'artial flag has priority.

Example ("nosync" raid1 with mismatch_cnt and writemostly):
[~]# lvs -a --segment vg
  LV                VG   Attr      #Str Type   SSize
  raid1             vg   Rwi---r-m    2 raid1  500.00m
  [raid1_rimage_0]  vg   Iwi---r--    1 linear 500.00m
  [raid1_rimage_1]  vg   Iwi---r-w    1 linear 500.00m
  [raid1_rmeta_0]   vg   ewi---r--    1 linear   4.00m
  [raid1_rmeta_1]   vg   ewi---r--    1 linear   4.00m

Example (raid1 with mismatch_cnt, writemostly - but failed drive):
[~]# lvs -a --segment vg
  LV                VG   Attr      #Str Type   SSize
  raid1             vg   rwi---r-p    2 raid1  500.00m
  [raid1_rimage_0]  vg   Iwi---r--    1 linear 500.00m
  [raid1_rimage_1]  vg   Iwi---r-p    1 linear 500.00m
  [raid1_rmeta_0]   vg   ewi---r--    1 linear   4.00m
  [raid1_rmeta_1]   vg   ewi---r-p    1 linear   4.00m

A new reportable field has been added for writebehind as well.  If
write-behind has not been set or the LV is not RAID1, the field will
be blank.
Example (writebehind is set):
[~]# lvs -a -o name,attr,writebehind vg
  LV            Attr      WBehind
  lv            rwi-a-r--     512
  [lv_rimage_0] iwi-aor-w
  [lv_rimage_1] iwi-aor--
  [lv_rmeta_0]  ewi-aor--
  [lv_rmeta_1]  ewi-aor--

Example (writebehind is not set):
[~]# lvs -a -o name,attr,writebehind vg
  LV            Attr      WBehind
  lv            rwi-a-r--
  [lv_rimage_0] iwi-aor-w
  [lv_rimage_1] iwi-aor--
  [lv_rmeta_0]  ewi-aor--
  [lv_rmeta_1]  ewi-aor--
2013-04-15 13:59:46 -05:00
Zdenek Kabelac
a81a2406f1 tools: add common lv_change_activate
Move common code for changing activation state from
vgchange and lvchange to one function.

Fix the order of checks - so we always implicitly
activate snapshots and thin volumes in exclusive mode,
and we do not allow local deactivation for them.
2013-04-12 11:30:07 +02:00
Jonathan Brassow
719e908bc0 WHATS_NEW: Add WHATS_NEW entry for previous commit. 2013-04-11 16:03:24 -05:00
Jonathan Brassow
ff64e3500f RAID: Add scrubbing support for RAID LVs
New options to 'lvchange' allow users to scrub their RAID LVs.
Synopsis:
	lvchange --syncaction {check|repair} vg/raid_lv

RAID scrubbing is the process of reading all the data and parity blocks in
an array and checking to see whether they are coherent.  'lvchange' can
now initiate the two scrubbing operations: "check" and "repair".  "check"
will go over the array and record the number of discrepancies but not
repair them.  "repair" will correct the discrepancies as it finds them.

'lvchange --syncaction repair vg/raid_lv' is not to be confused with
'lvconvert --repair vg/raid_lv'.  The former initiates a background
synchronization operation on the array, while the latter is designed to
repair/replace failed devices in a mirror or RAID logical volume.

Additional reporting has been added for 'lvs' to support the new
operations.  Two new printable fields (which are not printed by
default) have been added: "syncaction" and "mismatches".  These
can be accessed using the '-o' option to 'lvs', like:
	lvs -o +syncaction,mismatches vg/lv
"syncaction" will print the current synchronization operation that the
RAID volume is performing.  It can be one of the following:
        - idle:   All sync operations complete (doing nothing)
        - resync: Initializing an array or recovering after a machine failure
        - recover: Replacing a device in the array
        - check: Looking for array inconsistencies
        - repair: Looking for and repairing inconsistencies
The "mismatches" field will print the number of discrepancies found during
a check or repair operation.

The 'Cpy%Sync' field already available to 'lvs' will print the progress
of any of the above syncactions, including check and repair.

Finally, the lv_attr field has changed to accommodate the scrubbing operations
as well.  The role of the 'p'artial character in the lv_attr report field
has expanded.  "Partial" is really an indicator for the health of a
logical volume and it makes sense to extend this to include other health
indicators as well, specifically:
        'm'ismatches:  Indicates that there are discrepancies in a RAID
                       LV.  This character is shown after a scrubbing
                       operation has detected that portions of the RAID
                       are not coherent.
        'r'efresh   :  Indicates that a device in a RAID array has suffered
                       a failure and the kernel regards it as failed -
                       even though LVM can read the device label and
                       considers the device to be ok.  The LV should be
                       'r'efreshed to notify the kernel that the device is
                       now available, or the device should be 'r'eplaced
                       if it is suspected of failing.
2013-04-11 15:33:59 -05:00
Jonathan Brassow
95d28735ea WHATS_NEW: Include entry for RAID status func improvements 2013-04-08 15:17:12 -05:00
Zdenek Kabelac
c22e925ce4 man: lvcreate document external origin snapshot
Document added support for external origin.
2013-04-05 14:15:03 +02:00
Zdenek Kabelac
ddafa0115e man: updates for lvconvert and lvcreate
Cleanup and improvement on man pages.
2013-04-05 14:14:20 +02:00
Peter Rajnoha
32ae07cef1 pv_write: clean up non-orphan format1 PV write
...to not pollute the common and format-independent code in the
abstraction layer above.

The format1 pv_write has common code for writing metadata and
PV header by calling the "write_disks" fn and when rewriting
the header itself only (e.g. just for the purpose of changing
the PV UUID) during the pvchange operation, we had to tweak
this functionality for the format1 case and we had to assign
the PV the orphan state temporarily.

This patch removes the need for this format1 tweak and it calls
the write_disks with appropriate flag indicating whether this is
a PV write call or a VG write call, allowing for metadata update
for the latter one.

Also, a side effect of the former tweak was that it effectively
invalidated the cache (even for the non-format1 PVs) as we
assigned it the orphan state temporarily just for the format1
PV write to pass.

Also, that tweak made it difficult to directly detect whether
a PV was part of a VG or not because the state was incorrect.

Also, it's not necessary to backup and restore some PV fields
when doing a PV write:

  orig_pe_size = pv_pe_size(pv);
  orig_pe_start = pv_pe_start(pv);
  orig_pe_count = pv_pe_count(pv);
  ...
  pv_write(pv)
  ...
  pv->pe_size = orig_pe_size;
  pv->pe_start = orig_pe_start;
  pv->pe_count = orig_pe_count;

...this is already done by the layer below itself (the _format1_pv_write fn).

So let's have this cleaned up so we don't need to be bothered
about any 'format1 special case for pv_write' anymore.
2013-03-25 15:08:26 +01:00
Peter Rajnoha
784867d5bd WHATS_NEW: vgextend and PV with 0 MDAs 2013-03-19 15:41:34 +01:00
Zdenek Kabelac
b36a776a7f thin: move update_pool_params
Now we may recognize preset arguments, move
the code for updating thin pool related values
into /lib portion of the code.
2013-03-13 15:13:54 +01:00
Alasdair G Kergon
cbfb5a98b5 filters: power2 devs get precedence if PVIDs match
Give precedence to EMC "power2" devices with duplicate PVIDs like
we already do with "emcpower" devices.
2013-03-11 20:10:49 +00:00
Peter Rajnoha
03b5c51730 WHATS_NEW: add lines for config validation support 2013-03-06 11:00:30 +01:00
Peter Rajnoha
b3776468fa WHATS_NEW: add lines for embedding area support 2013-02-26 15:50:43 +01:00
Zdenek Kabelac
b73de73151 thin: lvconvert support for external origin
Add basic support for converting LV into an external origin volume.

Syntax:

lvconvert --thinpool vg/pool  --originname renamed_origin -T origin

It will convert volume  'origin' into a thin volume, which will
use 'renamed_origin' as an external read-only origin.
All read/write into origin will go via 'pool'.

renamed_origin volume is read-only volume, that could be activated
only in read-only mode, and cannot be modified.
2013-02-23 10:38:20 +01:00
Zdenek Kabelac
d023b2d12f lvremove: easier removal of dependent lvs
Add function to remove lvs which are depending on removed lv
prior the lv is removed.

User is asked for confirmation.
2013-02-23 10:31:05 +01:00
Zdenek Kabelac
3679bb1cd9 activation: simplify activation code
Reorder activation code to look similar for preload tree and
activation tree.

It also gives much better support for device stacking,
since now we also support activation of snapshot which might
be then used for other devices.
2013-02-23 10:30:03 +01:00
Zdenek Kabelac
0631d233d8 activation: add _add_layer_target_to_dtree
Add function for creation of simple linear mapping over layer device.
2013-02-23 10:29:08 +01:00
Zdenek Kabelac
78b23f3595 activation: extend _cached_info
Add layer string to support check of layered devices.
2013-02-23 10:28:01 +01:00
Jonathan Brassow
bbc6378b73 RAID: Make 'lvchange --refresh' restore transiently failed RAID PVs
A new function (dm_tree_node_force_identical_table_reload) was added to
avoid the suppression of identical table reloads.  This allows RAID LVs
to reload the on-disk superblock information that contains which devices
have failed and the bitmaps.  If the failed device has returned, this has
the effect of restoring the device and initiating recovery.  Without this
patch, the user had to completely deactivate their RAID LV and re-activate
it in order to restore the failed device.  Now they simply need to
suspend and resume (which is done by 'lvchange --refresh').

The identical table suppression is only avoided if the LV is not PARTIAL
(i.e. all of its devices can be seen and read by LVM) and the kernel
status of the array contains failed devices.  In other words, the function
will only be called in the case where we may have success in restoring
a failed device in the array.
2013-02-21 11:31:36 -06:00
Jonathan Brassow
3ab46449f4 vgimport: Allow '--force' to import VGs with missing PVs.
When there are missing PVs in a volume group, most operations that alter
the LVM metadata are disallowed.  It turns out that 'vgimport' is one of
those disallowed operations.  This is bad because it creates a circular
dependency.  'vgimport' will complain that the VG is inconsistent and that
'vgreduce --removemissing' must be run.  However, 'vgreduce' cannot be run
because the VG has not been imported.  Therefore, 'vgimport' must be one of
the operations allowed to change the metadata when PVs are missing.  The
'--force' option is the way to make 'vgimport' happen in spite of the
missing PVs.
2013-02-20 16:37:41 -06:00
Peter Rajnoha
303e86adc8 pvcreate: fix alignment to incorporate alignment offset if PV has 0 MDAs
If zero metadata copies are used, there's no further recalculation of
PV alignment that happens when adding metadata areas to the PV and
which actually calculates the alignment correctly as a matter of fact.
So fix this for "PV without MDA" case as well.

Before this patch:
[1] raw/~ # pvcreate --dataalignment 8m --dataalignmentoffset 4m
--metadatacopies 1 /dev/sda
  Physical volume "/dev/sda" successfully created
[1] raw/~ # pvs -o pv_name,pe_start
  PV         1st PE
  /dev/sda    12.00m
[1] raw/~ # pvcreate --dataalignment 8m --dataalignmentoffset 4m
--metadatacopies 0 /dev/sda
  Physical volume "/dev/sda" successfully created
[1] raw/~ # pvs -o pv_name,pe_start
  PV         1st PE
  /dev/sda     8.00m

After this patch:
[1] raw/~ # pvcreate --dataalignment 8m --dataalignmentoffset 4m
--metadatacopies 1 /dev/sda
  Physical volume "/dev/sda" successfully created
[1] raw/~ # pvs -o pv_name,pe_start
  PV         1st PE
  /dev/sda    12.00m
[1] raw/~ # pvcreate --dataalignment 8m --dataalignmentoffset 4m
--metadatacopies 0 /dev/sda
  Physical volume "/dev/sda" successfully created
[1] raw/~ # pvs -o pv_name,pe_start
  PV         1st PE
  /dev/sda    12.00m

Also, remove a superfluous condition "pv->pe_start < pv->pe_align" in:
  if (pe_start == PV_PE_START_CALC && pv->pe_start < pv->pe_align)
    pv->pe_start = pv->pe_align ...
This part of the condition is not reachable as with the PV_PE_START_CALC,
we always have pv->pe_start set to 0 from the PV struct initialisation
(...the pv->pe_start value is just being calculated).
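
The numbers in the listings above can be reproduced with a small worked
example. It is a sketch under the assumption that, when pe_start is being
calculated from scratch, the first PE lands at the data alignment plus the
alignment offset; the function name and the apply_offset switch are
hypothetical and only model the before/after behaviour for a PV with
0 MDAs.

```python
def first_pe_mib(align_mib, offset_mib, apply_offset=True):
    """Start of the first PE in MiB for a freshly created PV."""
    start = align_mib
    if apply_offset:
        start += offset_mib
    return start


# --dataalignment 8m --dataalignmentoffset 4m --metadatacopies 0
before = first_pe_mib(8, 4, apply_offset=False)  # offset was not applied
after = first_pe_mib(8, 4)                       # offset is applied
print("before: %.2fm, after: %.2fm" % (before, after))
```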
2013-02-21 14:51:19 +01:00
Jonathan Brassow
bd0ee420b5 RAID: Allow remove/replace of sub-LVs composed of error segments.
When a device fails, we may wish to replace those segments with an
error segment.  (Like when a 'vgreduce --removemissing' removes a
failed device that happens to be a RAID image/meta.)  We are then left
with images that we will eventually want to remove or replace.

This patch allows us to pull out these virtual "error" sub-LVs.  This
allows a user to 'lvconvert -m -1 vg/lv' to extract the bad sub-LVs.
Sub-LVs with error segments are considered for extraction before other
possible devices so that good devices are not accidentally removed.

This patch also adds the ability to replace RAID images that contain error
segments.  The user will still be unable to run 'lvconvert --replace'
because there is no way to address the 'error' segment (i.e. no PV
that it is associated with).  However, 'lvconvert --repair' can be
used to replace the image's error segment with a new PV.  This is also
the most appropriate way to do it, since the LV will continue to be
reported as 'partial'.
2013-02-20 14:58:56 -06:00