1
0
mirror of git://sourceware.org/git/lvm2.git synced 2025-01-18 10:04:20 +03:00

114 Commits

Author SHA1 Message Date
Jonathan Brassow
5ebff6cc9f pvmove: Enable all-or-nothing (atomic) pvmoves
pvmove can be used to move single LVs by name or multiple LVs that
lie within the specified PV range (e.g. /dev/sdb1:0-1000).  When
moving more than one LV, the portions of those LVs that are in the
range to be moved are added to a new temporary pvmove LV.  The LVs
then point to the range in the pvmove LV, rather than the PV
range.

Example 1:
	We have two LVs in this example.  After they were
	created, the first LV was grown, yeilding two segments
	in LV1.  So, there are two LVs with a total of three
	segments.

	Before pvmove:
	      ---------  ---------   ---------
	      | LV1s0 |  | LV2s0 |   | LV1s1 |
	      ---------  ---------   ---------
	         |           |           |
	   -------------------------------------
	PV | 000 - 255 | 256 - 511 | 512 - 767 |
	   -------------------------------------

	After pvmove inserts the temporary pvmove LV:
	          ---------   ---------   ---------
	          | LV1s0 |   | LV2s0 |   | LV1s1 |
	          ---------   ---------   ---------
	              |           |           |
	        -------------------------------------
	pvmove0 |   seg 0   |   seg 1   |   seg 2   |
	        -------------------------------------
	              |           |           |
	        -------------------------------------
	PV      | 000 - 255 | 256 - 511 | 512 - 767 |
	        -------------------------------------

	Each of the affected LV segments now point to a
	range of blocks in the pvmove LV, which purposefully
	corresponds to the segments moved from the original
	LVs into the temporary pvmove LV.

The current implementation goes on from here to mirror the temporary
pvmove LV by segment.  Further, as the pvmove LV is activated, only
one of its segments is actually mirrored (i.e. "moving") at a time.
The rest are either complete or not addressed yet.  If the pvmove
is aborted, those segments that are completed will remain on the
destination and those that are not yet addressed or in the process
of moving will stay on the source PV.  Thus, it is possible to have
a partially completed move - some LVs (or certain segments of LVs)
on the source PV and some on the destination.

Example 2:
	What 'example 1' might look if it was half-way
	through the move.
	             ---------   ---------   ---------
	             | LV1s0 |   | LV2s0 |   | LV1s1 |
	             ---------   ---------   ---------
	                 |           |           |
	           -------------------------------------
	pvmove0    |   seg 0   |   seg 1   |   seg 2   |
	           -------------------------------------
	                 |           |           |
	                 |     -------------------------
	source PV        |     | 256 - 511 | 512 - 767 |
	                 |     -------------------------
	                 |           ||
	           -------------------------
	dest PV    | 000 - 255 | 256 - 511 |
	           -------------------------

This update allows the user to specify that they would like the
pvmove mirror created "by LV" rather than "by segment".  That is,
the pvmove LV becomes an image in an encapsulating mirror along
with the allocated copy image.

Example 3:
	A pvmove that is performed "by LV" rather than "by segment".

	                   ---------   ---------
	                   | LV1s0 |   | LV2s0 |
	                   ---------   ---------
	                       |           |
	                 -------------------------
	        pvmove0  |  * LV-level mirror *  |
	                 -------------------------
                             /                \
	   pvmove_mimage0   /          pvmove_mimage1
	   -------------------------   -------------------------
	   |   seg 0   |   seg 1   |   |   seg 0   |   seg 1   |
	   -------------------------   -------------------------
	        |            |               |           |
	   -------------------------   -------------------------
	   | 000 - 255 | 256 - 511 |   | 000 - 255 | 256 - 511 |
	   -------------------------   -------------------------
	           source PV                    dest PV

The thing that differentiates a pvmove done in this way and a simple
"up-convert" from linear to mirror is the preservation of the
distinct segments.  A normal up-convert would simply allocate the
necessary space with no regard for segment boundaries.  The pvmove
operation must preserve the segments because they are the critical
boundary between the segments of the LVs being moved.  So, when the
pvmove copy image is allocated, all corresponding segments must be
allocated.  The code that merges ajoining segments that are part of
the same LV when the metadata is written must also be avoided in
this case.  This method of mirroring is unique enough to warrant its
own definitional macro, MIRROR_BY_SEGMENTED_LV.  This joins the two
existing macros: MIRROR_BY_SEG (for original pvmove) and MIRROR_BY_LV
(for user created mirrors).

The advantages of performing pvmove in this way is that all of the
LVs affected can be moved together.  It is an all-or-nothing approach
that leaves all LV segments on the source PV if the move is aborted.
Additionally, a mirror log can be used (in the future) to provide tracking
of progress; allowing the copy to continue where it left off in the event
there is a deactivation.
2014-06-17 22:59:36 -05:00
Jonathan Brassow
ee89ac7b88 pvmove: Disallow pvmove of cache LVs
Skip over LVs that have a cache LV in their tree of LV dependencies
when performing a pvmove.

This means that users cannot move a cache pool or a cache LV's origin -
even when that cache LV is used as part of another LV (e.g. a thin pool).

The new test (pvmove-cache-segtypes.sh) currently builds up various LV
stacks that incorporate cache LVs.  pvmove tests are then performed to
ensure that cache related LVs are /not/ moved.  Once pvmove is enabled
for cache, those tests will switch to ensuring that the LVs /are/
moved.
2014-02-24 12:25:18 -06:00
Alasdair G Kergon
2e82a070f3 pvcreate: Avoid spurious 'not found' messages.
Replacement of pv_read by find_pv_by_name in commit
651d5093edde3e0ebee9d75be1c9834efc152d91 caused spurious
error messages when running pvcreate or vgextend against an
unformatted device.

Physical volume /dev/loop4 not found
Physical volume "/dev/loop4" successfully created

Physical volume /dev/loop4 not found
Physical volume /dev/loop4 not found
Physical volume "/dev/loop4" successfully created
Volume group "vg1" successfully extended
2013-11-29 21:45:37 +00:00
Jonathan Brassow
cc66dedc0e pvmove: Skip pvmove of RAID, thin, snapshot, origin, and mirror LVs in cluster
pvmove of the above types should only have been enabled in single machine
mode.
2013-09-03 13:17:01 -05:00
Jonathan Brassow
2ef48b91ed pvmove: Allow moving snapshot/origin. Disallow converting and merging LVs
The patch allows the user to also pvmove snapshots and origin logical
volumes.  This means pvmove should be able to move all segment types.
I have, however, disallowed moving converting or merging logical volumes.
2013-08-26 16:36:30 -05:00
Jonathan Brassow
caa77b33f2 pvmove: Fix inability to specify LV name when moving RAID, mirror, or thin LV
Top-level LVs (like RAID, mirror or thin) are ignored when determining which
portions of an LV to pvmove.  If the user specified the name of an LV to
move and it was one of the above types, it would be skipped.  The code would
never move on to check whether its sub-LVs needed moving because their names
did not match what the user specified.

The solution is to check whether a sub-LVs is part of the LV whose name was
specified by the user - not just if there was a name match.
2013-08-26 14:12:31 -05:00
Jonathan Brassow
448ff0119f pvmove: Ability to move thin volumes
The previous commit was missing the code to allow moving thin
volumes.
2013-08-23 09:13:14 -05:00
Jonathan Brassow
c59167ec13 pvmove: Add support for RAID, mirror, and thin
This patch allows pvmove to operate on RAID, mirror and thin LVs.
The key component is the ability to avoid moving a RAID or mirror
sub-LV onto a PV that already has another RAID sub-LV on it.
(e.g. Avoid placing both images of a RAID1 LV on the same PV.)

Top-level LVs are processed to determine which PVs to avoid for
the sake of redundancy, while bottom-level LVs are processed
to determine which segments/extents to move.

This approach does have some drawbacks.  By eliminating whole PVs
from the allocation list, we might miss the opportunity to perform
pvmove in some senarios.  For example, if we have 3 devices and
a linear uses half of the first, a RAID1 uses half of the first and
half of the second, and a linear uses half of the third (FIGURE 1);
we should be able to pvmove the first device (FIGURE 2).
	FIGURE 1:
        [ linear ] [ -RAID- ] [ linear ]
        [ -RAID- ] [        ] [        ]

	FIGURE 2:
        [  moved ] [ -RAID- ] [ linear ]
        [  moved ] [ linear ] [ -RAID- ]
However, the approach we are using would eliminate the second
device from consideration and would leave us with too little space
for allocation.  In these situations, the user does have the ability
to specify LVs and move them one at a time.
2013-08-23 08:57:16 -05:00
Zdenek Kabelac
b90450b8a0 cleanup: introduce return_ECMD_FAILED macro
Use shortening macro for common code sequence
stack;
return ECMD_FAILED;
2013-07-01 23:10:33 +02:00
Zdenek Kabelac
3ba3bc0d66 cleanup: drop backtrace
After log_error/log_warn there is no point to show <backtrace>
in debug log trace from the next code line.
2013-05-27 10:28:32 +02:00
Zdenek Kabelac
17a6915054 thin: explicitly avoid pvmove operation
So far we do not support pvmove for thin volumes
and thin pools.
2013-04-21 23:09:11 +02:00
Zdenek Kabelac
9e24d563c6 raid: read segment only for known LV
Avoid reading first_seg() on unknown LV and find it only when needed.
2013-04-21 23:07:00 +02:00
Peter Rajnoha
59878d0129 metadata: add 'allow_orphan' arg to find_pv_by_name fn
Before, the find_pv_by_name call always failed if the PV found was orphan.
However, we might use this function even for a PV that is not part of any VG.
This patch adds 'allow_orphan' arg to find_pv_by_name fn that allows that.
2013-03-19 14:57:31 +01:00
Jonathan Brassow
3835755259 pvmove/RAID: Disallow pvmove on RAID LVs until properly handled
Attempting pvmove on RAID LVs replaces the kernel RAID target with
a temporary pvmove target, ultimately destroying the RAID LV.  pvmove
must be prevented on RAID LVs for now.

Use 'lvconvert --replace old_pv vg/lv new_pv' if you want to move
an image of the RAID LV.
2012-12-04 17:47:47 -06:00
Alasdair G Kergon
438e0050df config: add silent mode
Accept -q as the short form of --quiet.
Suppress non-essential standard output if -q is given twice.
Treat log/silent in lvm.conf as equivalent to -qq.
Review all log_print messages and change some to
log_print_unless_silent.

When silent, the following commands still produce output:
dumpconfig, lvdisplay, lvmdiskscan, lvs, pvck, pvdisplay,
pvs, version, vgcfgrestore -l, vgdisplay, vgs.
[Needs checking.]

Non-essential messages are shifted from log level 4 to log level 5
for syslog and lvm2_log_fn purposes.
2012-08-25 20:35:48 +01:00
Milan Broz
be36c0ec49 Fail early if cmirror is not detected and pvmove requires it. 2012-03-27 12:01:22 +00:00
Milan Broz
3d5d5196d0 Also skip pvmove for remotely active LVs. 2012-03-27 11:43:32 +00:00
Milan Broz
ddb31b62e5 Keep exclusive activation in pvmove if LV is already active.
Pvmove should never try to downgrade exclusive lock
for LVs.

This allows pvmove to work again for exclusive activated LVs.
2012-03-26 20:33:40 +00:00
Milan Broz
dcd90bc501 Do not allow pvmove if some affected LVs are activated
locally or on more nodes while others are activated exclusively.

Current pvmove code can either use local mirror (for exclusive
activation) or cmirror (for clustered LVs).

Because the whole intenal pvmove LV is just segmented LV containing
segments of several top-level LVs, code cannot properly handle
situation if some segment need to be activated exclusively.

Previously, it wrongly activated exclusive LV on all nodes
(locing code allowed it) but now this is no lnger possible.

If there is exclusively activated LV, pvmove is only
possible if all affected LVs are aslo activated exclusively.

(Note that in non-exclusive mode pvmove still activates LVs
on other nodes during move.)

# lvchange -aly vg_test/lv1
# lvchange -aey vg_test/lv2
# pvmove -i 1 /dev/sdc
   Error locking on node bar-01: Device or resource busy
   Error locking on node bar-03: Volume is busy on another node
...
   Failed to activate lv2
2012-03-26 20:32:58 +00:00
Milan Broz
3366541aab Use new flag PVMOVE_EXCLUSIVE in update_metatada call.
There is no real functional change in this patch except it
avoids checking cluster cmirror module twice.

(Flag used in following patch.)
2012-03-26 20:31:01 +00:00
Alasdair Kergon
bba1e4d11f Fix error message when pvmove LV activation fails with name already in use. 2012-03-13 20:21:26 +00:00
Zdenek Kabelac
fbf6b89a84 Using enum types for enums
alloc_policy_t, dm_string_mangling_t, percent_range_t, sign_t
2012-02-28 14:24:57 +00:00
Alasdair Kergon
7a5b7def16 reinstate !first_time check
(recovery from first_time failure would need different code)
2011-12-08 18:06:33 +00:00
Zdenek Kabelac
a4c1c0d26f Remove test for first_time with FIXME
Workaround for the current code with big FIXME,
since proper solution for pvmove needs to be developed.

Commiting this only for the purpose to get cluster testing covered.
2011-10-11 08:51:02 +00:00
Alasdair Kergon
10d0d9c7c4 Introduce revert_lv for better pvmove cleanup.
(One further fix needed to remove the stray pvmove LVs left behind.)
2011-09-27 22:43:40 +00:00
Alasdair Kergon
74e72bd75d Replace incomplete pvmove activation failure recovery code with a message.
As it stands, the recovery code can make things worse sometimes so it's
better to insist on a proper 'pvmove --abort' cleanup.
2011-09-27 17:29:33 +00:00
Alasdair Kergon
1c26860d82 Abort if _finish_pvmove suspend_lvs fails instead of cleaning up incompletely.
Change suspend_lvs to call vg_revert internally.
Change vg_revert to void and remove superfluous calls after failed vg_commit.
2011-09-27 17:09:42 +00:00
Alasdair Kergon
a944480b9b FIXMEs to note problems with some error paths. (Perhaps safer to disable
them until they can be fixed completely?)
2011-09-21 16:36:39 +00:00
Petr Rockai
e59e2f7c3c Move the core of the lib/config/config.c functionality into libdevmapper,
leaving behind the LVM-specific parts of the code (convenience wrappers that
handle `struct device` and `struct cmd_context`, basically). A number of
functions have been renamed (in addition to getting a dm_ prefix) -- namely,
all of the config interface now has a dm_config_ prefix.
2011-08-30 14:55:15 +00:00
Zdenek Kabelac
077a6755ff Replace free_vg with release_vg
Move the free_vg() to  vg.c  and replace free_vg  with release_vg
and make the _free_vg internal.

Patch is needed for sharing VG in vginfo cache so the release_vg function name
is a better fit here.
2011-08-10 20:25:29 +00:00
Alasdair Kergon
df390f1799 Major pvmove fix to issue ioctls in the correct order when multiple LVs
are affected by the move.  (Currently it's possible for I/O to become
trapped between suspended devices amongst other problems.

The current fix was selected so as to minimise the testing surface.  I
hope eventually to replace it with a cleaner one that extends the
deptree code.

Some lvconvert scenarios still suffer from related problems.
2011-06-11 00:03:06 +00:00
Zdenek Kabelac
f77736cab5 Remove double braces
Clang gives notice about possible confusion as commonly double bracces are
used when some assignment is done inside them.
2011-03-29 20:19:03 +00:00
Peter Rajnoha
84f48499a3 Add new free_pv_fid fn and use it throughout to free all attached fids.
Since format instances will use own memory pool, it's necessary to properly
deallocate it. For now, only fid is deallocated. The PV structure itself
still uses cmd mempool mostly, but anytime we'd like to add a mempool
in the struct physical_volume, we can just rename this fn to free_pv and
add the code (like we have free_vg fn for VGs).
2011-03-11 14:56:56 +00:00
Alasdair Kergon
2b82bd79f5 Rename vg_release to free_vg. 2010-12-08 20:50:48 +00:00
Peter Rajnoha
bad35c6554 Add escape sequence for ':' and '@' found in device names used as PVs. 2010-09-23 12:02:33 +00:00
Milan Broz
cf704d22b6 Fix pvmove --abort to work even for empty pvmove LV
If pvmove crashed and metadata contains pvmove LV
but without miorrored segments, pvmove --abort
will not repair the situation (and finish wth success!).

Fix it by allowing metadata update if aborting
(thus removing pvmove LV) even if no moved LVs detected.

(Tested on real metadata provided by an lvm user:-)
2010-08-23 11:34:10 +00:00
Alasdair Kergon
08f1ddea6c Use __attribute__ consistently throughout. 2010-07-09 15:34:40 +00:00
Petr Rockai
d345bf2cd3 Account for mirror transient status when doing lvconvert --repair. 2010-05-24 15:32:20 +00:00
Alasdair Kergon
1485ce69c4 Permit mimage LVs to be striped in lvcreate and lvresize. 2010-04-09 01:00:10 +00:00
Mike Snitzer
0ade9a8b37 Prepare for _get_lvconvert_vg() reuse as part of a larger lvconvert.c
refactoring.

Document the need to cleanup the "name" args passed around polldaemon,
lvconvert and pvmove.  It is quite a mess.

Annotate the unused nature of the existing poll_fns->get_copy_vg
methods' 'uuid' arg.
2010-02-05 22:40:49 +00:00
Milan Broz
a1b40be081 Fix pvmove abort when temporary mirror fails to be cluster-aware.
When activation of pvmove mirror fails on cluster, some nodes
still possibly succeeded in activation.

 - Explicitly deactivate that mirror to be sure
 - properly pair suspend/resume calls to not cause memory lock problems in clvmd

Code cannot simply call _finish_pvmove on cluster in this situation, because
changed LVs are suspended twice (causing memory inbalance) and also temporary
mirror is activated when it is not expected (and we know that it failed already).

Patch prepares special function which remove temporary mirror references from
metadata and then resumes changed LVs.
2010-01-27 13:29:11 +00:00
Milan Broz
e01bdd2fab Add some missing vg_revrts calls when pvmove aborts. 2010-01-26 08:01:18 +00:00
Mike Snitzer
56b3d20462 Add missing 'stack;' for all activate_lv and deactivate_lv callers.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2010-01-05 21:08:34 +00:00
Mike Snitzer
df13cf08d5 Add missing 'stack;' for all suspend_lv and resume_lv callers.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2010-01-05 21:07:31 +00:00
Peter Rajnoha
7a3263f2a0 Give better message for pvmove when all data on source PV is skipped. 2009-12-04 14:03:18 +00:00
Milan Broz
29f011314d Fix pvmove test mode to not fail and do not poll.
Test mode should not fail nor try to poll non-existent devices.
2009-12-03 19:22:24 +00:00
Alasdair Kergon
a8fb89adaf Tidy some uses of arg_count and introduce arg_is_set. 2009-11-03 15:50:42 +00:00
Alasdair Kergon
4b12fa1377 Make poll_mirror_progress a function that can be replaced. 2009-09-30 18:15:06 +00:00
Dave Wysochanski
0f66b7086b Refactor a couple log_error() statements in prep for larger log_error() patch.
Part of twoerner's log_error() patches.

Signed-off-by: Thomas Woerner <twoerner@redhat.com>
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2009-07-08 18:15:51 +00:00
Dave Wysochanski
4c611a220a Fix vg_read() error paths to properly release upon vg_read_error().
Fix vg_read() error paths to properly release upon vg_read_error().
Note that in the iterator paths (process_each_*()), we release
inside the iterator so no individual cleanup is needed.  However there
are a number of other places we missed the cleanup.  Proper cleanup
when vg_read_error() is true should be calling vg_release(vg), since
there should be no locks held if we get an error (except in certain
special cases, which IMO we should work to remove from the code).

Unfortunately the testsuite is unable to detect these types of memory
leaks.  Most of them can be easily seen if you try an operation
(e.g. lvcreate) with a volume group that does not exist.  Error
message looks like this:
  Volume group "vg2" not found
  You have a memory leak (not released memory pool):
   [0x1975eb8]
  You have a memory leak (not released memory pool):
   [0x1975eb8]


Author: Dave Wysochanski <dwysocha@redhat.com>
2009-07-07 01:18:35 +00:00