mirror of git://sourceware.org/git/lvm2.git synced 2024-12-25 01:34:38 +03:00
6164 Commits

Author SHA1 Message Date
Jonathan Earl Brassow
f7235e7cb4 Revert initial solution to bug 733114 - I/O error message during splitmirror
The original commit comments can be located via this git commit ID:
	7d8e615c0b

There were three possible solutions to the original problem proposed in the
initial check-in.  The one chosen was as follows:
    2) Do like _remove_mirror_images does and suspend the original, then suspend
    the sub-lv (the error target), then resume the sub-lv, and finally resume the
    original LV.  This seems like extra pointless operations to me, but it doesn't
    produce the error message (although, I'm not sure why) and it allows us to
    leave the visible flag in place.
It turns out that the cluster also views the extra suspend/resume operations as
pointless and ignores them.  So, this solution doesn't work in a cluster.
Further, I've noticed that in addition to the remote cluster nodes still getting
I/O errors from scanning the error target, they also have different LVM and
DM views of the same LV.  IOW, while the LVM level (gotten from the LVM metadata)
sees the correct name for the newly split LV, device-mapper still maintains the
old names.

Because the original fix failed to completely fix the problem (or work around it)
and because a better solution must be found to address the additional cluster
issue of device renaming, I am reverting the above-mentioned commit.
2011-10-06 14:49:16 +00:00
Jonathan Earl Brassow
857339758f This patch fixes issues with improper udev flags on sub-LVs.
The current code does not always assign proper udev flags to sub-LVs (e.g.
mirror images and log LVs).  This shows up especially during a splitmirror
operation in which an image is split off from a mirror to form a new LV.

A mirror with a disk log is actually composed of 4 different LVs: the 2
mirror images, the log, and the top-level LV that "glues" them all together.
When a 2-way mirror is split into two linear LVs, two of those LVs must be
removed.  The segments of the image which is not split off to form the new
LV are transferred to the top-level LV.  This is done so that the original
LV can maintain its major/minor, UUID, and name.  The sub-lv from which the
segments were transferred gets an error segment as a transitory process
before it is eventually removed.  (Note that if the error target was not put
in place, a resume_lv would result in two LVs pointing to the same segment!
If the machine crashes before the eventual removal of the sub-LV, the result
would be a residual LV with the same mapping as the original (now linear) LV.)
So, the two LVs that need to be removed are now the log device and the sub-LV
with the error segment.  If udev_flags are not properly set, a resume will
cause the error LV to come up and be scanned by udev.  This causes I/O errors.
Additionally, when udev scans sub-LVs (or former sub-LVs), it can cause races
when we are trying to remove those LVs.  This is especially bad during failure
conditions.

When the mirror is suspended, the top-level LV is suspended along with its
sub-LVs.  The changes (now 2 linear devices and the yet-to-be-removed log
and error LV) are committed.  When the resume takes place on the original
LV, there are no longer links to the other sub-lvs through the LVM metadata.
The links are implicitly handled by querying the kernel for a list of
dependencies.  This is done in the '_add_dev' function (which is recursively
called for each dependency found) - called through the following chain:
	_add_dev
	dm_tree_add_dev_with_udev_flags
	<*** DM / LVM divide ***>
	_add_dev_to_dtree
	_add_lv_to_dtree
	_create_partial_dtree
	_tree_action
	dev_manager_activate
	_lv_activate_lv
	_lv_resume
	lv_resume_if_active
When udev flags are calculated by '_get_udev_flags', it is done by referencing
the 'logical_volume' structure.  Those flags are then passed down into
'dm_tree_add_dev_with_udev_flags', which in turn passes them to '_add_dev'.
Unfortunately, when '_add_dev' is finding the dependencies, it has no way to
calculate their proper udev_flags.  This is because it is below the DM/LVM
divide - it doesn't have access to the logical_volume structure.  In fact,
'_add_dev' simply reuses the udev_flags given for the initial device!  This
virtually guarantees the udev_flags are wrong for all the dependencies unless
they are reset by some other mechanism.  The current code provides no such
mechanism.  Even if '_add_new_lv_to_dtree' were called on the sub-devices -
which it isn't - entries already in the tree are simply passed over, failing
to reset any udev_flags.  The solution must retain its implicit nature of
discovering dependencies and be able to go back over the dependencies found
to properly set the udev_flags.

My solution simply calls a new function before leaving '_add_new_lv_to_dtree'
that iterates over the dtree nodes to properly reset the udev_flags of any
children.  It is important that this function occur after the '_add_dev' has
done its job of querying the kernel for a list of dependencies.  It is this
list of children that we use to look up their respective LVs and properly
calculate the udev_flags.

This solution has worked for single machine, cluster, and cluster w/ exclusive
activation.
2011-10-06 14:45:40 +00:00
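The udev flag problem described in commit 857339758f can be illustrated with a toy model. This is a sketch only, written in Python rather than the C used by LVM2: `add_dev`, `reset_child_flags`, `flags_for`, the `HIDE_FROM_UDEV` bit, and the LV names are all hypothetical stand-ins for '_add_dev', the new fix-up function, '_get_udev_flags', and the real flag values. It shows dependencies inheriting the flags of the initial device, and a second pass that looks up each child's LV and recalculates its flags.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical flag bit; the real udev flag values are defined by libdevmapper.
HIDE_FROM_UDEV = 0x1


@dataclass
class LV:
    name: str
    visible: bool


@dataclass
class Node:
    name: str
    udev_flags: int = 0
    children: List["Node"] = field(default_factory=list)


def flags_for(lv: LV) -> int:
    """Stand-in for the LVM-side flag calculation: hide non-visible LVs from udev."""
    return 0 if lv.visible else HIDE_FROM_UDEV


def add_dev(name: str, udev_flags: int, kernel_deps: Dict[str, List[str]]) -> Node:
    """Stand-in for the DM-side walk: every dependency reuses the caller's flags."""
    node = Node(name, udev_flags)
    for dep in kernel_deps.get(name, []):
        node.children.append(add_dev(dep, udev_flags, kernel_deps))
    return node


def reset_child_flags(node: Node, lvs: Dict[str, LV]) -> None:
    """Fix-up pass: look up the LV behind each child and recalculate its flags."""
    for child in node.children:
        lv = lvs.get(child.name)
        if lv is not None:
            child.udev_flags = flags_for(lv)
        reset_child_flags(child, lvs)


# Hypothetical state after a split: the kernel still reports the old sub-LVs.
kernel_deps = {"lv": ["lv_mimage_1", "lv_mlog"]}
lvs = {
    "lv": LV("lv", visible=True),
    "lv_mimage_1": LV("lv_mimage_1", visible=False),  # now an error segment
    "lv_mlog": LV("lv_mlog", visible=False),
}
```

Building the tree with `add_dev("lv", flags_for(lvs["lv"]), kernel_deps)` leaves both children with the top-level LV's flags; calling `reset_child_flags` afterwards is what marks them as hidden. The ordering matters for the same reason given in the commit message: the fix-up can only run once the dependency list exists.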
Jonathan Earl Brassow
29044ecb22 Fix vgsplit when there are mirrors that have mirrored logs.
The problem as reported by "ben <benscott@nwlink.com>" on lvm-devel:

vgsplit fails with mirrored mirror log

#lvs --all -o lv_name,lv_attr,devices
LV                       Attr   Devices
MyMirror                 mwi--
[MyMirror_mimage_0]      Iwi--- /dev/sdq(0)
[MyMirror_mimage_1]      Iwi--- /dev/sdo(0)
[MyMirror_mimage_2]      Iwi--- /dev/sdi(0)
[MyMirror_mlog]          mwi---
[MyMirror_mlog_mimage_0] Iwi--- /dev/sds(0)
[MyMirror_mlog_mimage_1] Iwi--- /dev/sde(0)

#vgsplit -v "TestA" "TestB" "/dev/sdq" "/dev/sdo" "/dev/sdi" "/dev/sds"
"/dev/sde"
  Checking for volume group "TestA"
  Checking for new volume group "TestB"
  Archiving volume group "TestA" metadata (seqno 213).
Can't split mirror MyMirror between two Volume Groups

AFTER FIX:

[root@bp-01 ~]# lvs -a -o name,vg_name,devices vg new
  Volume group "new" not found
  Skipping volume group new
  LV                 VG   Devices
  lv                 vg   lv_mimage_0(0),lv_mimage_1(0)
  [lv_mimage_0]      vg   /dev/sdb1(0)
  [lv_mimage_1]      vg   /dev/sdc1(0)
  [lv_mlog]          vg   lv_mlog_mimage_0(0),lv_mlog_mimage_1(0)
  [lv_mlog_mimage_0] vg   /dev/sdh1(0)
  [lv_mlog_mimage_1] vg   /dev/sdi1(0)
[root@bp-01 ~]# vgsplit vg new /dev/sd[bchi]1
  New volume group "new" successfully split from "vg"
[root@bp-01 ~]# lvs -a -o name,vg_name,devices vg new
  LV                 VG   Devices
  lv                 new  lv_mimage_0(0),lv_mimage_1(0)
  [lv_mimage_0]      new  /dev/sdb1(0)
  [lv_mimage_1]      new  /dev/sdc1(0)
  [lv_mlog]          new  lv_mlog_mimage_0(0),lv_mlog_mimage_1(0)
  [lv_mlog_mimage_0] new  /dev/sdh1(0)
  [lv_mlog_mimage_1] new  /dev/sdi1(0)
2011-10-06 14:17:45 +00:00
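One way to read the vgsplit failure in commit 29044ecb22 is that the PVs under a mirrored log have to be counted along with the mirror images when deciding whether an LV fits entirely inside the set of PVs being moved. The following Python sketch is a toy model under that assumption, not vgsplit's actual code; `pvs_of`, `classify`, and the `segments` dictionary are hypothetical, and the device names are taken from the bug report above.

```python
from typing import Dict, List, Set


def pvs_of(lv: str, segments: Dict[str, List[str]]) -> Set[str]:
    """Collect every PV under an LV, descending through sub-LVs (images, log, log images)."""
    found: Set[str] = set()
    for dev in segments[lv]:
        if dev in segments:          # a sub-LV: recurse into it
            found |= pvs_of(dev, segments)
        else:                        # a PV
            found.add(dev)
    return found


def classify(lv: str, segments: Dict[str, List[str]], moved_pvs: Set[str]) -> str:
    """Return 'move', 'stay', or 'split' for an LV given the PVs being moved."""
    pvs = pvs_of(lv, segments)
    if pvs <= moved_pvs:
        return "move"
    if pvs.isdisjoint(moved_pvs):
        return "stay"
    return "split"


# Layout from the report: a 3-way mirror whose log is itself a 2-way mirror.
segments = {
    "MyMirror": ["MyMirror_mimage_0", "MyMirror_mimage_1",
                 "MyMirror_mimage_2", "MyMirror_mlog"],
    "MyMirror_mimage_0": ["/dev/sdq"],
    "MyMirror_mimage_1": ["/dev/sdo"],
    "MyMirror_mimage_2": ["/dev/sdi"],
    "MyMirror_mlog": ["MyMirror_mlog_mimage_0", "MyMirror_mlog_mimage_1"],
    "MyMirror_mlog_mimage_0": ["/dev/sds"],
    "MyMirror_mlog_mimage_1": ["/dev/sde"],
}
```

With all five PVs from the reported command line in `moved_pvs`, `classify("MyMirror", ...)` yields "move"; dropping one of the log PVs yields "split", which is the case where refusing the operation is correct.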
Zdenek Kabelac
b47df54d48 Add more validation to config parser
Do not leave it for vgvalidate().
2011-10-06 11:06:36 +00:00
Zdenek Kabelac
9f9b3e1e28 Move defines to header
Make limits for thin data_block_size and device_id part of public API.

FIXME: possibly read them from some kernel header file in the future?
But we may need to support different values for different versions?
2011-10-06 11:05:56 +00:00
Alasdair Kergon
43dce243ab Clarify multi-name device filter pattern matching explanation in lvm.conf.5. 2011-10-04 20:49:24 +00:00
Alasdair Kergon
122476adc6 Clarify multi-name device filter pattern matching explanation in lvm.conf.5. 2011-10-04 20:45:36 +00:00
Zdenek Kabelac
1fef12cd31 Name changes
Fix typo: zeroeing -> zeroing.
Rename low_water_mark -> low_water_mark_size so it is more obvious that it is
a sector-related variable.
2011-10-04 16:22:38 +00:00
Zdenek Kabelac
f53fb47dd9 Use capital letters 2011-10-04 12:39:59 +00:00
Zdenek Kabelac
b79f082537 Missed rename pool->thin_pool
Fix compilation
2011-10-03 19:10:52 +00:00
Zdenek Kabelac
b12ec9b372 Add code to activate thin target
Code to zero pool metadata lv when pool is created.
Add code to create thin target via message sending.

(Revert is missing)
2011-10-03 18:43:39 +00:00
Zdenek Kabelac
4121fbb8a4 Add simple function for lookup of some free device_id
Initial simple implementation for finding some free device_id.
2011-10-03 18:39:17 +00:00
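The lookup added in commit 4121fbb8a4 is not shown in this log. As a rough illustration of what a "simple" free device_id search could look like, here is a Python sketch that returns the lowest id not already in use. The function name, the starting id of 0, and the 24-bit upper limit are assumptions made for the example, not details confirmed by the commit.

```python
from typing import Iterable, Optional

# Assumed upper limit for a thin device_id (24-bit); treat as illustrative.
MAX_DEVICE_ID = (1 << 24) - 1


def find_free_device_id(used_ids: Iterable[int],
                        max_id: int = MAX_DEVICE_ID) -> Optional[int]:
    """Return the lowest id in [0, max_id] not present in used_ids, or None if full."""
    used = set(used_ids)
    for candidate in range(max_id + 1):
        if candidate not in used:
            return candidate
    return None
```

The scan stops after at most `len(used) + 1` candidates, so it stays cheap while a pool holds few thin volumes. A pool with many volumes would probably want a smarter structure.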
Zdenek Kabelac
f683063d42 Add lvm functions for sending messages.
Functions are currently only needed for thin provisioning.
2011-10-03 18:37:47 +00:00
Zdenek Kabelac
c6d777289b Add initial code to check transaction_id
Fix typo in transaction_id.
Add this as node property, so it could be easily checked on resume.

Code is not yet finished.
2011-10-03 18:34:52 +00:00
Zdenek Kabelac
d75053e74f Display transaction_id for thin_pool 2011-10-03 18:31:03 +00:00
Zdenek Kabelac
27596fa624 Move priority check in front
Just a minor code move - make a test for priority before
more complex uuid checks.
2011-10-03 18:29:48 +00:00
Zdenek Kabelac
f8b4957694 Update error path tracing for _resume_node
dm_task_create & dm_task_set_name produce their own log_error
Add missing stacks for dm_task_set_cookie, dm_task_run,
dm_task_get_info.
2011-10-03 18:28:25 +00:00
Zdenek Kabelac
2daddac019 Transaction_id is property of thin_pool
Remove Transaction_id from thin target.
Store device_id for thin target.
2011-10-03 18:26:07 +00:00
Zdenek Kabelac
a6d73dc760 Add preload support for thin and thin_pool 2011-10-03 18:24:47 +00:00
Zdenek Kabelac
56fc7c0053 Fix bad error message for thinp validation 2011-09-29 09:03:36 +00:00
Zdenek Kabelac
4ffb50268a Let the utils to prepare PVs 2011-09-29 08:58:27 +00:00
Zdenek Kabelac
3473d2f219 Typo in debug message 2011-09-29 08:57:21 +00:00
Zdenek Kabelac
4a59dda8aa Add experimental code for activation of thinp targets
No dm messages yet - just a base functionality in the steps of other targets.
For now usable only for debugging and tracing.
2011-09-29 08:56:38 +00:00
Zdenek Kabelac
222bbab442 Add supporting function for thinp
New dm_tree_node_add_thin_pool_target() and  dm_tree_node_add_thin_target()
This API is highly experimental and unstable for now.
2011-09-29 08:53:48 +00:00
Zdenek Kabelac
3589d75998 Just add warning about potential problem extending dm_segtypes
Since raid target is now using dm_segtypes also for search purpose.
2011-09-29 08:50:54 +00:00
Jonathan Earl Brassow
3016856a4c New handy gdb debugging function, "dm_list_size"
Example:
(gdb) dm_list_size &split_images
1 list items
2011-09-28 16:32:22 +00:00
Alasdair Kergon
e0948b5825 Introduce revert_lv for better pvmove cleanup.
(One further fix needed to remove the stray pvmove LVs left behind.)
2011-09-27 22:43:40 +00:00
Alasdair Kergon
691157a71e Replace incomplete pvmove activation failure recovery code with a message.
As it stands, the recovery code can make things worse sometimes so it's
better to insist on a proper 'pvmove --abort' cleanup.
2011-09-27 17:29:33 +00:00
Alasdair Kergon
e63febe5ec Abort if _finish_pvmove suspend_lvs fails instead of cleaning up incompletely.
Change suspend_lvs to call vg_revert internally.
Change vg_revert to void and remove superfluous calls after failed vg_commit.
2011-09-27 17:09:42 +00:00
Alasdair Kergon
88c3d4b61a better -m0 error message, but there's an internal logic error to fix instead 2011-09-27 12:37:07 +00:00
Alasdair Kergon
fcdcef33c3 typo 2011-09-27 12:34:14 +00:00
Alasdair Kergon
ead841e0a3 correct thin_pool width 2011-09-27 12:33:36 +00:00
Zdenek Kabelac
26f43649f9 Show some Thin related info in lvdisplay 2011-09-26 13:11:02 +00:00
Peter Rajnoha
d1f949465f Add log_error even for general device in use when we can't do the sysfs checks. 2011-09-26 10:17:51 +00:00
Zdenek Kabelac
57f1027a03 Use execvp for clvmd restart
Since execve passed only NULL as environ, we had lost all environment vars on
restart - thus actually running a 'different' clvmd than the one at start.

Preserving environ allows clvmd to be restarted with the same settings
(i.e. LD_LIBRARY_PATH)

Add test for second restart.
2011-09-26 07:51:23 +00:00
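The difference described in commit 57f1027a03 can be reproduced with Python's `os.execve` and `os.execvp`, which wrap the same system-level calls. In this sketch, `run_child` and the `LVM_DEMO_SETTING` variable are hypothetical names used only for the demonstration. The child either receives an empty environment through `execve` or inherits the parent's through `execvp`, and reports what it sees.

```python
import os
import sys


def run_child(preserve_env: bool) -> str:
    """Fork, exec a fresh interpreter, and return the child's view of LVM_DEMO_SETTING."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: route stdout into the pipe, then replace the process image.
        os.close(read_fd)
        os.dup2(write_fd, 1)
        argv = [sys.executable, "-c",
                "import os; print(os.environ.get('LVM_DEMO_SETTING', '<unset>'))"]
        try:
            if preserve_env:
                os.execvp(argv[0], argv)      # inherits the current environment
            else:
                os.execve(argv[0], argv, {})  # explicit, empty environment
        finally:
            os._exit(127)                     # only reached if exec failed
    # Parent: read whatever the child printed, then reap it.
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        output = pipe.read().strip()
    os.waitpid(pid, 0)
    return output
```

After setting `os.environ["LVM_DEMO_SETTING"] = "from-parent"`, `run_child(True)` returns that value, while `run_child(False)` does not see it. A daemon that depends on variables such as LD_LIBRARY_PATH would be affected in the same way when restarted with an empty environment.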
Zdenek Kabelac
7f7e0704f6 Remove test for NULL
Since it's internal function and we always check for NULL value
before call - this is safe.

Just in case, add the nonnull attribute so an analyzer can better
catch errors.
2011-09-25 19:45:40 +00:00
Zdenek Kabelac
ed3d5e9409 Add missing log_error messages 2011-09-25 19:43:43 +00:00
Zdenek Kabelac
f0633627b4 Add backtrace when allocation fails for _type 2011-09-25 19:42:45 +00:00
Zdenek Kabelac
1dd5dfed81 Replace test for NULL of root->child with test for NULL l
It's a 100% equivalent test - since it always happens for the first iteration.
But the check for 'l' is understandable to analyzers - since an analyzer
is not smart enough to deduce the connection with root->child == NULL.
2011-09-25 19:41:27 +00:00
Zdenek Kabelac
2d2d9ac875 Simpler attribute format
No need to repeat whole declaration for static function.
2011-09-25 19:40:29 +00:00
Zdenek Kabelac
3416af3f5d Check for failing filename strdup 2011-09-25 19:39:38 +00:00
Zdenek Kabelac
4da6e11c5a Use NULL for pointers 2011-09-25 19:38:59 +00:00
Zdenek Kabelac
bd085674b2 Restart CLVMD with same cluster manager
Add named cluster_ops to easily learn the name of the active cluster manager,
so we are able to restart singlenode manager in testing.

Add simple test for clvmd -S  (restart) and -R (refresh)
(though it needs some extensions).
2011-09-25 19:37:00 +00:00
Zdenek Kabelac
71ee4b8d25 Fix log_error() usage
Cosmetic - skip <backtrace> when error has been just printed in raid segtype.
Add missing log_error if allocation would fail for unknown segtype.
2011-09-24 21:19:30 +00:00
Zdenek Kabelac
1fd620a436 Allow overwrite for VERIFY_UDEV
When running tests it might be useful to have an override option when
testing on real /dev  and some broken system (i.e. Debian and its rules).

So one can use:

LVM_TEST_DEVDIR=/dev LVM_VERIFY_UDEV=1 make check
2011-09-24 21:15:13 +00:00
Zdenek Kabelac
5764014aa3 Avoid sending garbage to terminal in verbose mode.
When read in drain returned <0 value, terminal content has been trashed.
Remove unneeded  memset() and use whole buffer.
Free  readbuf before exit (valgrind).
2011-09-24 21:12:35 +00:00
Zdenek Kabelac
cc12990b2f Improvements
Simplify RUN_BASE

Put .tests-stamp deps only for check target and fix its cleanup.
Fix abs_top_srcdir.
vgimportclone needs  srcdir.
Clean  api subdir.
2011-09-24 21:10:19 +00:00
Zdenek Kabelac
a6791e34ba Fix install_ocf
When builddir is different from srcdir, the install_ocf: target was not able
to find files for installation.
2011-09-24 21:05:03 +00:00
Zdenek Kabelac
8deff7018a Drop cleanup of .exported_symbols_generated in DISTCLEAN_TARGETS
Makefile cosmetics - since .exported_symbols_generated is already handled in the
cleandir: target via make.tmpl, there is no need to set it in DISTCLEAN_TARGETS.
2011-09-24 21:00:52 +00:00
Zdenek Kabelac
48b9cbab24 Use Makefile for daemons/common library.
Next iteration for better fit of lvmetad compilation.

Move build of libdaemon.a into common subdir Makefile.
libdaemon.a is device-mapper target.

Build and install lvmetad as lvm2 target.
2011-09-24 20:57:49 +00:00