1
0
mirror of git://sourceware.org/git/lvm2.git synced 2024-10-28 20:25:52 +03:00
Commit Graph

2721 Commits

Author SHA1 Message Date
Jonathan Brassow
f0be9ac904 cmirrord: Prevent secondary checkpoints from corrupting bitmaps
The bug addressed by this patch manifested itself during testing
by showing a mirror that never became 'in-sync' after creation.
The bug is isolated to distributions that do not have support
for openAIS checkpointing (i.e. > RHEL6, > F16).

When a node joins a group that is managing a mirror log, the other
machines in the group send it a checkpoint representing the current
state of the bitmap.  More than one machine can send a checkpoint,
but only the initial one should be imported.  Once the bitmap state
has been imported from the initial checkpoint, operations (such
as resync, mark, and clear operations) can begin.  When subsequent
checkpoints are allowed to be imported, it has the effect of erasing
all the log operations between the initial checkpoint and the ones
that follow.

When cmirrord was updated to handle the absence of openAIS
checkpointing (commit 62e38da133),
the new import_checkpoint() function failed to honor the 'no_read'
parameter.  This parameter was designed to avoid reading all but
the initial checkpoint.  Honoring this parameter has solved the
issue of corrupting bitmap data with secondary checkpoints.
2013-08-20 13:21:09 -05:00
Peter Rajnoha
cac49725c9 udev: fix lvmetad rules to not ignore loop device configuration
If loop device is first configured on systems where /dev/loop-control
is used to dynamically create the loop device itself, there's an
ADD+CHANGE even generated. But next time the existing /dev/loop[0-9]*
is reused, there's only a CHANGE event since the device representing
it is already present in kernel (so no ADD event in this case).

We can't ignore this CHANGE event for loop devices! This is a regression
caused by 756bcabbfe. We already had
a similar problem with MD devices which was fixed by
2ac217d408 (but that one was
only an intra-release fix).
2013-08-16 15:45:00 +02:00
Michael Stapelberg
8cbbe851a8 systemd: use LVM_PATH instead of hardcoded value in activation generator 2013-08-15 09:59:19 +02:00
Peter Rajnoha
82d83a01ce autoactivation: refresh existing VG before autoactivation
When autoactivating a VG, there could be an existing VG with exactly
the same PV UUIDs. The PVs could be reappeared after previous
loss/disconnect (for example disconnecting and reconnecting iscsi).

Since there's no "autodeactivation" yet, the mappings for the LVs
from the VG were left in the system even if the device was disconnected.
These mappings also hold the major:minor of the underlying device.
So if the device reappears, it is assigned a different major:minor
pair (...and kernel name). We need to cope with this during
autoactivation so any existing mappings are corrected for any changes.
The VG refresh does that (the vgchange --refresh functionality) -
call this before VG autoactivation.

(If the VG does not exist yet, the VG refresh is NOP)
2013-08-14 14:04:58 +02:00
Peter Rajnoha
fcbb34bdcc WHATS_NEW: for 0da72743ca 2013-08-14 10:18:02 +02:00
Alasdair G Kergon
80bcdb93ff filters: check for mpath before opening devs
Split out the partitioned device filter that needs to open the device
and move the multipath filter in front of it.

When a device is multipathed, sending I/O to the underlying paths may
cause problems, the most obvious being I/O errors visible to lvm if a
path is down.

Revert the incorrect <backtrace> messages added when a device doesn't
pass a filter.

Log each filter initialisation to show sequence.

Avoid duplicate 'Using $device' debug messages.
2013-08-13 23:26:58 +01:00
Alasdair G Kergon
1a1d3a10ff vgchange: require confirmation with -c and no VGs
Too many people have been running 'vgchange -cy' by mistake
so add a confirmation prompt.  Use --yes to bypass this.
2013-08-13 18:20:11 +01:00
Peter Rajnoha
fd7cac15bc WHATS_NEW: be more precise 2013-08-13 18:25:54 +02:00
Peter Rajnoha
e166c00ac6 WHATS_NEW: one more for a85439 2013-08-13 18:16:05 +02:00
Peter Rajnoha
268b370e24 blkdeactivate: add support for bind mounts
Recent version of util-linux/umount (v2.23+) provides
umount --all-targets that can unmount all the mount targets of
the same device (the bind mounts). Use this if available when
calling the umount blkdeactivate.

Otherwise, for older versions of util-linux, use findmnt
(that is also a part of the util-linux) to iterate over all
mount targets of the same device - this is the manual way.
2013-08-13 17:51:40 +02:00
Peter Rajnoha
a854398764 blkdeactivate: change the way blkdeactivate reports status
The blkdeactivate now suppresses error messages from external
tools that are called. Instead, only a summary message "done"
or "skipped" is issued by blkdeactivate as any error in calling
the external tool (e.g. unmounting or deactivating a device) causes
the device to be skipped and the blkdeactivate continues with the
next device in the tree.

Add new -e/--errors switch to display any error messages from
external tools.

Also, suppress any output given by the external tools and add
new -v/--verbose switch to display it including the verbose
output of the tools called (this will enable error reporting
as well).

Also add blkdeactivate -vv for even more debug (the script's debug).
2013-08-13 17:51:23 +02:00
Alasdair G Kergon
32148369d1 post-release 2013-08-13 11:54:48 +01:00
Alasdair G Kergon
297907899c release 2.02.100
84 files changed, 1540 insertions(+), 442 deletions(-)

Mostly bug fixes this time.

Also note:
  md raid replaces dm mirroring as the default implementation.
  Can call out to thin_repair to fix thin metadata.
  Improved clvmd error detection/debugging information.
2013-08-13 11:29:21 +01:00
Jonathan Brassow
abc89422af Mirror: Fix inability to remove VG's cluster flag if it contains a mirror
According to bug 995193, if a volume group
	1) contains a mirror
	2) is clustered
	3) 'locking_type' = 0 is used
then it is not possible to remove the 'c'luster flag from the VG.  This
is due to the way _lv_is_active behaves.

We shouldn't allow the cluster flag to be flipped unless the mirrors in
the cluster are not active.  This is because different kernel modules
are used depending on whether a mirror is cluster or not.  When we
attempt to see if the mirror is active, we first check locally.  If it
is not, then we attempt to check for remotely active instances if the VG
is clustered.  Since the no_lock locking type is LCK_CLUSTERED, but does
not implement 'query_resource', remote_lock_held will always return an
error in this case.  An error from remove_lock_held is treated as though
the lock _is_ held (i.e. the LV is active remotely).  This blocks the
cluster flag from changing.

The solution is to implement 'query_resource' for the no_lock type.  It
will report a message and return 1.  This will allow _lv_is_active to
function properly.  The LV would be considered not active remotely and
the VG can change its flag.
2013-08-12 13:56:47 -05:00
Alasdair G Kergon
28760275e6 logging: tidy log_sys_error when string empty 2013-08-12 18:40:41 +01:00
Jonathan Brassow
cba228f856 WHATSNEW: typo 2013-08-09 17:17:53 -05:00
Jonathan Brassow
8615234c0f RAID: Fix bug making lvchange unable to change recovery rate for RAID
1) Since the min|maxrecoveryrate args are size_kb_ARGs and they
   are recorded (and sent to the kernel) in terms of kB/sec/disk,
   we must back out the factor multiple done by size_kb_arg.  This
   is already performed by 'lvcreate' for these arguments.
2) Allow all RAID types, not just RAID1, to change these values.
3) Add min|maxrecoveryrate_ARG to the list of 'update_partial_unsafe'
   commands so that lvchange will not complain about needing at
   least one of a certain set of arguments and failing.
4) Add tests that check that these values can be set via lvchange
   and lvcreate and that 'lvs' reports back the proper results.
2013-08-09 17:09:47 -05:00
Zdenek Kabelac
e583ff3d2c thin: thin pool can't be external origin
Avoid trying to convert thin-pool to external origin.
2013-08-09 23:04:30 +02:00
Peter Rajnoha
2f61478436 workaround: gcc v4.8 on 32 bit param. passing bug when -02 opimization used
gcc -O2 v4.8 on 32 bit architecture is causing a bug in parameter
passing. It does not happen with -01 nor -O0.

The problematic part of the code was strlen use in config.c in
the config_def_check fn and the call for _config_def_check_tree in it:

<snip>
  rplen = strlen(rp);
  if (!_config_def_check_tree(handle, vp, vp + strlen(vp), rp, rp + rplen, CFG_PATH_MAX_LEN - rplen, cn, cmd->cft_def_hash)) ...
</snip>

If compiled with -O0 (correct):

Breakpoint 1, config_def_check (cmd=0x819b050, handle=0x81a04f8) at config/config.c:775
(gdb) p	vp
$1 = 0x8189ee0 <_cfg_path> "config"
(gdb) p	strlen(vp)
$2 = 6
(gdb)
_config_def_check_tree (handle=0x81a04f8, vp=0x8189ee0 <_cfg_path>
"config", pvp=0x8189ee6 <_cfg_path+6> "", rp=0xbfffe1e8 "config",
prp=0xbfffe1ee "", buf_size=58, root=0x81a2568, ht=0x81a65
48) at config/config.c:680
(gdb) p	vp
$4 = 0x8189ee0 <_cfg_path> "config"
(gdb) p	pvp
$5 = 0x8189ee6 <_cfg_path+6> ""

If compiled with -O2 (incorrect):

Breakpoint 1, config_def_check (cmd=cmd@entry=0x8183050, handle=0x81884f8) at config/config.c:775
(gdb) p	vp
$1 = 0x8172fc0 <_cfg_path> "config"
(gdb) p strlen(vp)
$2 = 6
(gdb) p	vp + strlen(vp)
$3 = 0x8172fc6 <_cfg_path+6> ""
(gdb)
_config_def_check_tree (handle=handle@entry=0x81884f8, pvp=0x8172fc7
<_cfg_path+7> "host_list", rp=rp@entry=0xbffff190 "config",
prp=prp@entry=0xbffff196 "", buf_size=buf_size@entry=58, ht=0x
818e548, root=0x818a568, vp=0x8172fc0 <_cfg_path> "config") at
config/config.c:674
(gdb) p	pvp
$4 = 0x8172fc7 <_cfg_path+7> "host_list"

The difference is in passing the "pvp" arg for _config_def_check_tree.
While in the correct case, the value of _cfg_path+6 is passed
(the result of vp + strlen(vp) - see the snippet of the code above),
in the incorrect case, this value is increased by 1 to _cfg_path+7,
hence totally malforming the string that is being processed.

This ends up with incorrect validation check and incorrect warning
messages are issued like:

 "Configuration setting "config/checks" has invalid type. Found integer, expected section."

To workaround this issue, remove the "static" qualifier from the
"static char _cfg_path[CFG_PATH_MAX_LEN]". This causes the optimalizer
to be less aggressive (also shuffling the arg list for
_config_def_check_tree call helps).
2013-08-09 13:24:50 +02:00
Peter Rajnoha
8d3347f70b WHATS_NEW: entry for 19baf84290 2013-08-08 10:04:53 +02:00
Jonathan Brassow
68c2d352ec WHATS_NEW: update WHATS_NEW for previous commit 2013-08-07 17:51:21 -05:00
Jonathan Brassow
b15278c3dc Mirror/RAID1: When up|down-converting default to segtype of current LV
If there is no RAID support in the kernel but the default mirror
segtype is "raid1", converting legacy mirrors can be problematic.
For example, changing the log type or converting a mirror to a linear
LV does not require the RAID modules to be present.  However, because
lp->segtype is set to be RAID1 by the configuration file, the command
fails.

We should only be setting lp->segtype when converting mirrors if it is
going to change (e.g. to linear or between mirror types).
2013-08-07 16:01:45 -05:00
Jonathan Brassow
7e1083c985 RAID: Make "raid1" the default mirror segment type 2013-08-06 14:13:55 -05:00
Zdenek Kabelac
003f08c164 clogd: fix descriptor leak when daemonzing 2013-08-06 16:21:51 +02:00
Zdenek Kabelac
7b1315411f clmvd: fix decriptor leak on restart
Do not leave descriptor used for dup2() openned.
2013-08-06 16:20:36 +02:00
Zdenek Kabelac
f6dd5a294b exec: pipe open
Function replaces popen() system and avoids shell execution
and argument parsing (no surprices).
2013-08-06 16:18:43 +02:00
Peter Rajnoha
61e7dc833c WHATS_NEW: previous commit 2013-08-06 14:03:43 +02:00
Peter Rajnoha
e195b5227e thin: apply VG profile if creating a new thin pool
When creating a new thin pool and there's no profile requested
via "lvcreate --profile ...", inherit any VG profile if it's attached.

Currently this applies to these settings:
  allocation/thin_pool_chunk_size
  allocation/thin_pool_discards
  allocation/thin_pool_zero
2013-08-06 11:42:40 +02:00
Peter Rajnoha
1cdd563b6c WHATS_NEW: move line to WHATS_NEW_DM 2013-08-06 11:42:01 +02:00
Jonathan Brassow
5ca54c4f0b dmeventd: Fix memory leak
When creating a timeout thread for snapshots, the thread is not
tracked and thus never joined.  This means that the exit status
of the timeout thread is held indefinitely.  Saves a bit of
memory to set PTHREAD_CREATE_DETACHED when creating this thread.

I've also added pthread_attr_init|destroy to setup the creation
pthread_attr_t.

Reported-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
2013-07-31 15:23:13 -05:00
Zdenek Kabelac
de0cba0e2d thin: initial --repair support for pools
Initial basic support for repair.
It currently takes pool metadata spare volume, which
is used for recovery.  New spare is created if the volume
is successfuly repaired.

After the operation the previous _tmeta volume is moved
into  _tmeta%d volume and if everything is ok, this volume
could be removed.
New _tmeta needs to be pvmoved to proper place and also
converted to i.e. mirror if it should be mirrored.

Later version will try to automate some steps here.
2013-07-31 15:32:36 +02:00
Zdenek Kabelac
22fc80982a thin: add thin_repair and thin_dump options
Add new configure lvm.conf options for binaries thin_repair
and thin_dump.

Those are part of device-mapper-persistent-data package
and will be used for recovery of thin_pool.
2013-07-31 15:30:47 +02:00
Zdenek Kabelac
ea605d1ec7 thin: metadata resize needs 1.9 version
Version 1.8 is not yet fully usable for metadata resize.
2013-07-31 15:29:27 +02:00
Alasdair G Kergon
b6bfddcd0a alloc: fix lvextend when stripe number varies
The PREFERRED allocation mechanism requires the number of areas in the
previous LV segment to match the number in the new segment being
allocated.  If they do not match, the code may crash.
  E.g. https://bugzilla.redhat.com/989347

Introduce A_AREA_COUNT_MATCHES and when not set avoid referring
to the previous segment with the contiguous and cling policies.
2013-07-29 19:35:45 +01:00
Peter Rajnoha
ecc9f74988 filters: fix segfault on incorrect global_filter
When using a global_filter and if this filter is incorrectly
specified, we ended up with a segfault:

  raw/~ $ pvs
    Invalid filter pattern "r|/dev/sda".
  Segmentation fault (core dumped)

In the example above a closing '|' character is missing at the end
of the regex. The segfault itself was caused by trying to destroy
the same filter twice in _init_filters fn within the error path
(the "bad" goto target):

bad:
        if (f3)
                f3->destroy(f3);
        if (f4)
                f4->destroy(f4);

Where f3 is the composite filter (sysfs + regex + type + md + mpath filter)
and f4 is the persistent filter which encompasses this composite filter
within persistent filter's 'real' field in 'struct pfilter'.

So in the end, we need to destroy the persistent filter only as
this will also destroy any 'real' filter attached to it.
2013-07-26 13:04:53 +02:00
Alasdair G Kergon
06dce7d539 post-release 2013-07-25 00:38:53 +01:00
Alasdair G Kergon
76e617b158 release 2.02.99
363 files changed, 19863 insertions(+), 9055 deletions(-)

This is a very large release - so expect bugs!

Please treat this release like a release candidate.
Changes to the external interfaces since 2.02.98 are not yet frozen.

Updated releases will follow quickly (days not weeks) as any problems
are handled.
2013-07-24 23:59:03 +01:00
Alasdair G Kergon
d13e87b9ef cleanup: comments and a message 2013-07-24 22:10:37 +01:00
Zdenek Kabelac
5597dc3652 thin: not zeroing for non-zeroed thin pool snaps
Do not zero initial 4KB of thin snapshot volume for thin pool with
disabled zeroing.
2013-07-24 01:15:31 +02:00
Peter Rajnoha
31de670318 lvconvert: add more checks for lvconvert --type
The --type mirror requires -m/--mirrrors:

  lvconvert --type mirror vg/lvol0
    --type mirror requires -m/--mirrors
    Run `lvconvert --help' for more information.

The --type raid* is allowed (the checks already existed):

  lvconvert --type raid10 vg/lvol0
    Converting the segment type for vg/lvol0 from linear to raid10 is not yet supported.

The --type snapshot is a synonym to -s/--snapshot:

  lvconvert -s vg/lvol0 vg/lvol1
    Logical volume lvol1 converted to snapshot.

  lvconvert --type snapshot vg/lvol0 vg/lvol1
    Logical volume lvol1 converted to snapshot.

All the other segment types are not supported, e.g.:

  lvconvert --type zero vg/lvol0
    Conversion using --type zero is not supported.
    Run `lvconvert --help' for more information.
2013-07-23 17:13:54 +02:00
Peter Rajnoha
9b1834f075 man: pvs -o ba_start,ba_size -> pv_ba_start,pv_ba_size 2013-07-23 14:45:30 +02:00
Peter Rajnoha
ea333a894e systemd: lvm2-activation-generator: use LOG_DEBUG/ERR severity for kmsg 2013-07-22 14:04:12 +02:00
Alasdair G Kergon
ccc29f17b6 cmdline: support ARG_GROUPABLE in merge_synonym 2013-07-19 20:37:43 +01:00
Alasdair G Kergon
90a09559ed commandline: add prefix aliases for raid options
Accept --raidwritemostly as well as --writemostly etc.
2013-07-19 19:24:54 +01:00
Jonathan Brassow
4eea660191 RAID: Fix segfault when reporting raid_syncaction field on older kernel
The status printed for dm-raid targets on older kernels does not include
the syncaction field.  This is handled by dev_manager_raid_status() just
fine by populating the raid status structure with NULL for that field.
However, lv_raid_sync_action() does not properly handle that field being
NULL.  So, check for it and return 0 if it is NULL.
2013-07-19 10:01:48 -05:00
Alasdair G Kergon
da79fe4c1d reporting: tidy recent new fields
Add underscores and prefixes to recently-added fields.
(Might add more alias functionality in future.)
2013-07-19 01:30:02 +01:00
Alasdair G Kergon
357df34133 display: fix units for sizes <1k 2013-07-18 17:55:58 +01:00
Zdenek Kabelac
460d0254eb thin: add pool metadata spare lv support
Add support for pool's metadata spare volume.
2013-07-18 18:22:43 +02:00
Zdenek Kabelac
08df7ba844 thin: improve pool creation activation order
Pool creation involves clearing of metadata device
which triggers udev watch rule we cannot udev synchronize with
in current code.

This metadata devices needs to be activated localy,
so in cluster mode deactivation and reactivation
is always needed.

However for non-clustered mode we may reload table
via suspend/resume path which avoids collision with
udev watch rule which was occasionaly triggering
retry deactivation loop.

Code has been also split into 2 separate code paths
for thin pools and thin volumes which improved readability
of the code as well.

Deactivation has been moved out of extend_pool() and
decision is now in _lv_create_an_lv() which knows
the change mode.
2013-07-18 18:22:43 +02:00
Zdenek Kabelac
4e724f5f52 thin: for thin volumes properly list modules
thin volume needs   thin-pool and  thin kernel modules so print them
both for   lvs -o+modules
2013-07-18 18:22:43 +02:00