1
0
mirror of git://sourceware.org/git/lvm2.git synced 2025-01-07 21:18:59 +03:00
Commit Graph

590 Commits

Author SHA1 Message Date
Jonathan Earl Brassow
c954b73149 cmirrord now returns log name to kernel in CTR so it can be registered
Version 2 of the userspace log protocol accepts return information during the
DM_ULOG_CTR exchange.  The return information contains the name of the log
device that is being used (if there is one).  The kernel can then register the
device via 'dm_get_device'.  Amoung other things, this allows for userspace to
assemble a correct dependency tree of devices - critical for LVM handling of
suspend/resume calls.

Also, update dm-log-userspace.h to match the kernel header associated with
this protocol change.  (Includes a version inc.)
2011-10-14 14:18:49 +00:00
Jonathan Earl Brassow
681ceb16d8 Update stale libdm/misc/dm-log-userspace.h
The upstream kernel version that this file mirrors has changed, here is the
commit message:

commit 86a54a4802df10d23ccd655e2083e812fe990243
Author: Jonathan Brassow <jbrassow@redhat.com>
Date:   Thu Jan 13 19:59:52 2011 +0000

    dm log userspace: add version number to comms

    This patch adds a 'version' field to the 'dm_ulog_request'
    structure.

    The 'version' field is taken from a portion of the unused
    'padding' field in the 'dm_ulog_request' structure.  This
    was done to avoid changing the size of the structure and
    possibly disrupting backwards compatibility.

    The version number will help notify user-space daemons
    when a change has been made to the kernel/userspace
    log API.

    Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2011-10-14 14:04:05 +00:00
Zdenek Kabelac
54e43fd3de Use pool for dm_tree allocation
Using the same pool allocation strategy as we use for vg,
so dm_tree structure is part of the pool itself.
2011-10-14 13:34:19 +00:00
Jonathan Earl Brassow
857339758f This patch fixes issues with improper udev flags on sub-LVs.
The current code does not always assign proper udev flags to sub-LVs (e.g.
mirror images and log LVs).  This shows up especially during a splitmirror
operation in which an image is split off from a mirror to form a new LV.

A mirror with a disk log is actually composed of 4 different LVs: the 2
mirror images, the log, and the top-level LV that "glues" them all together.
When a 2-way mirror is split into two linear LVs, two of those LVs must be
removed.  The segments of the image which is not split off to form the new
LV are transferred to the top-level LV.  This is done so that the original
LV can maintain its major/minor, UUID, and name.  The sub-lv from which the
segments were transferred gets an error segment as a transitory process
before it is eventually removed.  (Note that if the error target was not put
in place, a resume_lv would result in two LVs pointing to the same segment!
If the machine crashes before the eventual removal of the sub-LV, the result
would be a residual LV with the same mapping as the original (now linear) LV.)
So, the two LVs that need to be removed are now the log device and the sub-LV
with the error segment.  If udev_flags are not properly set, a resume will
cause the error LV to come up and be scanned by udev.  This causes I/O errors.
Additionally, when udev scans sub-LVs (or former sub-LVs), it can cause races
when we are trying to remove those LVs.  This is especially bad during failure
conditions.

When the mirror is suspended, the top-level along with its sub-LVs are
suspended.  The changes (now 2 linear devices and the yet-to-be-removed log
and error LV) are committed.  When the resume takes place on the original
LV, there are no longer links to the other sub-lvs through the LVM metadata.
The links are implicitly handled by querying the kernel for a list of
dependencies.  This is done in the '_add_dev' function (which is recursively
called for each dependency found) - called through the following chain:
	_add_dev
	dm_tree_add_dev_with_udev_flags
	<*** DM / LVM divide ***>
	_add_dev_to_dtree
	_add_lv_to_dtree
	_create_partial_dtree
	_tree_action
	dev_manager_activate
	_lv_activate_lv
	_lv_resume
	lv_resume_if_active
When udev flags are calculated by '_get_udev_flags', it is done by referencing
the 'logical_volume' structure.  Those flags are then passed down into
'dm_tree_add_dev_with_udev_flags', which in turn passes them to '_add_dev'.
Unfortunately, when '_add_dev' is finding the dependencies, it has no way to
calculate their proper udev_flags.  This is because it is below the DM/LVM
divide - it doesn't have access to the logical_volume structure.  In fact,
'_add_dev' simply reuses the udev_flags given for the initial device!  This
virtually guarentees the udev_flags are wrong for all the dependencies unless
they are reset by some other mechanism.  The current code provides no such
mechanism.  Even if '_add_new_lv_to_dtree' were called on the sub-devices -
which it isn't - entries already in the tree are simply passed over, failing
to reset any udev_flags.  The solution must retain its implicit nature of
discovering dependencies and be able to go back over the dependencies found
to properly set the udev_flags.

My solution simply calls a new function before leaving '_add_new_lv_to_dtree'
that iterates over the dtree nodes to properly reset the udev_flags of any
children.  It is important that this function occur after the '_add_dev' has
done its job of querying the kernel for a list of dependencies.  It is this
list of children that we use to look up their respective LVs and properly
calculate the udev_flags.

This solution has worked for single machine, cluster, and cluster w/ exclusive
activation.
2011-10-06 14:45:40 +00:00
Zdenek Kabelac
9f9b3e1e28 Move defines to header
Make limits for thin data_block_size and device_id part of public API.

FIXME: read them possible from some kernel header file in the future ?
But we may need to support different values for different versions ?
2011-10-06 11:05:56 +00:00
Zdenek Kabelac
1fef12cd31 Name changes
typo zeroeing->zeroing
add size low_water_mark->low_water_mark_size so it's more obvious its sector
related variable.
2011-10-04 16:22:38 +00:00
Zdenek Kabelac
c6d777289b Add intial code to check transaction_id
Fix typy in transaction_id.
Add this as node property, so it could be easily checked on resume.

Code is not yet finished.
2011-10-03 18:34:52 +00:00
Zdenek Kabelac
27596fa624 Move priority check in front
Just a minor code mode - make a test for priority before
more complex uuid checks.
2011-10-03 18:29:48 +00:00
Zdenek Kabelac
f8b4957694 Update error path tracing for _resume_node
dm_task_create & dm_task_set_name produces it's own log_error
Add missing stacks for dm_task_set_cookie, dm_task_run,
dm_task_get_info.
2011-10-03 18:28:25 +00:00
Zdenek Kabelac
2daddac019 Transaction_id is property of thin_pool
Remove Transaction_id from thin target.
Store device_id for thin target.
2011-10-03 18:26:07 +00:00
Zdenek Kabelac
222bbab442 Add supporting function for thinp
New dm_tree_node_add_thin_pool_target() and  dm_tree_node_add_thin_target()
This API is highly experimental and unstable for now.
2011-09-29 08:53:48 +00:00
Zdenek Kabelac
3589d75998 Just add warning about potential problem exteding dm_segtypes
Since raid target is using now dm_segtypes also for search purpose.
2011-09-29 08:50:54 +00:00
Alasdair Kergon
e0948b5825 Introduce revert_lv for better pvmove cleanup.
(One further fix needed to remove the stray pvmove LVs left behind.)
2011-09-27 22:43:40 +00:00
Peter Rajnoha
d1f949465f Add log_error even for general device in use when we can't do the sysfs checks. 2011-09-26 10:17:51 +00:00
Zdenek Kabelac
7f7e0704f6 Remove test for NULL
Since it's internal function and we always check for NULL value
before call - this is safe.

Just for case add nonnull attribute so analyzer might better
catch error.
2011-09-25 19:45:40 +00:00
Zdenek Kabelac
ed3d5e9409 Add missing log_error messages 2011-09-25 19:43:43 +00:00
Zdenek Kabelac
f0633627b4 Add backtrace when allocation fails for _type 2011-09-25 19:42:45 +00:00
Zdenek Kabelac
1dd5dfed81 Replace test for NULL of root->child with test for NULL l
It's 100% equivalent test - since it always happen for the first iteration.
But the check for 'l' is understandable with analyzers - since analyzer
is not smart enough to deduce connection between  root->child == NULL.
2011-09-25 19:41:27 +00:00
Zdenek Kabelac
2d2d9ac875 Simplier attribute format
No need to repeat whole declaration for static function.
2011-09-25 19:40:29 +00:00
Zdenek Kabelac
3416af3f5d Chheck for failing filename strdup 2011-09-25 19:39:38 +00:00
Zdenek Kabelac
4da6e11c5a Use NULL for pointers 2011-09-25 19:38:59 +00:00
Peter Rajnoha
5d604a99f6 readlink does not append a null byte to the output string! 2011-09-24 11:47:53 +00:00
Alasdair Kergon
4e8d5bc726 explain why we may now retry 2011-09-23 17:16:28 +00:00
Peter Rajnoha
2729ca70bc Initialize 'retryable' variable. 2011-09-22 17:59:58 +00:00
Peter Rajnoha
3157f8e8fa Add dm_tree_retry_remove to use retry logic for device removal in a dm_tree. 2011-09-22 17:36:50 +00:00
Peter Rajnoha
638409a573 Replace open_count check with holders/mounted_fs check on lvremove path.
Before, we used to display "Can't remove open logical volume" which was
generic. There 3 possibilities of how a device could be opened:
  - used by another device
  - having a filesystem on that device which is mounted
  - opened directly by an application

With the help of sysfs info, we can distinguish the first two situations.
The third one will be subject to "remove retry" logic - if it's opened
quickly (e.g. a parallel scan from within a udev rule run), this will
finish quickly and we can remove it once it has finished. If it's a
legitimate application that keeps the device opened, we'll do our best
to remove the device, but we will fail finally after a few retries.
2011-09-22 17:33:50 +00:00
Peter Rajnoha
def057522f Add dm_device_has_holders fn to to check use of the device by another device.
Add dm_device_has_mounted_fs fn to check mounted filesystem on a device.

This requires sysfs directory to be correctly set via dm_set_sysfs_dir
(/sys by default). If sysfs dir is not used or it's set incorrectly,
dm_device_has_{holders,mounted_fs} will return 0!
2011-09-22 17:23:35 +00:00
Peter Rajnoha
1f92ab6e8a Add dm_set_sysfs_dir to libdevmapper to set sysfs location.
Add dm_sysfs_dir to libdevmapper to retrieve sysfs location thas is set.
2011-09-22 17:17:07 +00:00
Peter Rajnoha
6b1629455a Add dm_task_retry_remove fn to use retry logic for device removal.
This call ensures that the dm device removal is retried several
times before failing.
2011-09-22 17:09:48 +00:00
Zdenek Kabelac
52301e7c5d Fix memory overwrite
Transfer of build_dm_uuid() function into libdm made uuid_prefix as parameter,
thus sizeof() was replaced with strlen() and room for '\0' missed.

As it's only fix in current version - no whatsnew.
2011-09-14 16:07:07 +00:00
Peter Rajnoha
40a234183c Retry DM_DEVICE_REMOVE ioctl if device is busy.
This is a workaround for long-lasting problem with using the WATCH udev
rule. When trying to remove a DM device, this one can still be opened
while processing the event in parallel (generated based on the WATCH
udev rule).

Let's use this until we have a proper solution.
2011-09-13 15:13:41 +00:00
Zdenek Kabelac
83ca5e6d5c Remove unused passed parameters 2011-09-07 08:37:48 +00:00
Alasdair Kergon
5c216d6eb2 Move cascade inside libdm etc.
Makes dumpconfig whole-section output wrong in a different way from before,
but we should be able to merge cft_cmdline properly into cmd->cft now and
remove cascade.
2011-09-02 01:32:08 +00:00
Alasdair Kergon
c147bbddb3 Comments, FIXMEs, name changes. 2011-09-01 21:04:14 +00:00
Alasdair Kergon
eced331945 Add comments & remove always-included header. 2011-09-01 17:58:27 +00:00
Zdenek Kabelac
d91c270d1c Use const casting when it's needed
Keep the lookup operation const and use const casting at the dm_ function level.
2011-09-01 14:02:05 +00:00
Zdenek Kabelac
fa784f6198 Mark unreleased memory pools as internal error 2011-09-01 10:19:01 +00:00
Petr Rockai
118fa896b7 Replace const usage of dm_config_find_node with more appropriate value-lookup
functionality. A number of bugs (copied and pasted all over the code) should
disappear:

- most string lookup based on dm_config_find_node would segfault when
  encountering a non-zero integer (the intention there was to print an
  error message instead)
- check for required sections in metadata would have been satisfied by
  values as well (i.e. not sections)
- encountering a section in place of expected flag value would have
  segfaulted (due to assumed but unchecked cn->v != NULL)
2011-08-31 15:19:19 +00:00
Petr Rockai
dbe351860e Fix warnings and constness handling in lvmetad-core (adjusting the
dm_config_find_node to give non-const node pointer, since that better reflects
the contract of that function).
2011-08-31 12:39:58 +00:00
Petr Rockai
d60c24dda8 Move the core of the lib/config/config.c functionality into libdevmapper,
leaving behind the LVM-specific parts of the code (convenience wrappers that
handle `struct device` and `struct cmd_context`, basically). A number of
functions have been renamed (in addition to getting a dm_ prefix) -- namely,
all of the config interface now has a dm_config_ prefix.
2011-08-30 14:55:15 +00:00
Alasdair Kergon
37086f40a1 spaces->tabs 2011-08-19 17:02:48 +00:00
Alasdair Kergon
946f96cda5 revert incomplete inconsistent log message change for now 2011-08-19 16:49:00 +00:00
Alasdair Kergon
6ab3b611c7 restrict dm_tree_node_add_null_area 2011-08-19 16:26:02 +00:00
Jonathan Earl Brassow
b87604e649 Add ability to merge back a RAID1 image that has been split w/ --trackchanges
Argument layout is very similar to the merge command for snapshots.
2011-08-18 19:43:08 +00:00
Jonathan Earl Brassow
4fad401cd2 Add support for m-way to n-way up-convert in RAID1 (no linear to n-way yet)
This patch adds the ability to upconvert a raid1 array - say from 2-way to
3-way.  It does not yet support upconverting linear to n-way.

The 'raid' device-mapper target allows for individual components (images) of
an array to be specified for rebuild.  This mechanism is used when adding
new images to the array so that the new images can be resync'ed while the
rest of the images in the array can remain 'in-sync'.  (There is no
mirror-on-mirror layering required.)
2011-08-18 19:41:21 +00:00
Jonathan Earl Brassow
cc00073da7 Add the ability to split an image from the mirror and track changes.
~> lvconvert --splitmirrors 1 --trackchanges vg/lv
The '--trackchanges' option allows a user the ability to use an image of
a RAID1 array for the purposes of temporary read-only access.  The image
can be merged back into the array at a later time and only the blocks that
have changed in the array since the split will be resync'ed.  This
operation can be thought of as a partial split.  The image is never completely
extracted from the array, in that the array reserves the position the device
occupied and tracks the differences between the array and the split image via
a bitmap.  The image itself is rendered read-only and the name (<LV>_rimage_*)
cannot be changed.  The user can complete the split (permanently splitting the
image from the array) by re-issuing the 'lvconvert' command without the
'--trackchanges' argument and specifying the '--name' argument.
	~> lvconvert --splitmirrors 1 --name my_split vg/lv
Merging the tracked image back into the array is done with the '--merge'
option (included in a follow-on patch).
	~> lvconvert --merge vg/lv_rimage_<n>

The internal mechanics of this are relatively simple.  The 'raid' device-
mapper target allows for the specification of an empty slot in an array
via '- -'.  This is what will be used if a partial activation of an array
is ever required.  (It would also be possible to use 'error' targets in
place of the '- -'.)  If a RAID image is found to be both read-only and
visible, then it is considered separate from the array and '- -' is used
to hold it's position in the array.  So, all that needs to be done to
temporarily split an image from the array /and/ cause the kernel target's
bitmap to track (aka "mark") changes made is to make the specified image
visible and read-only.  To merge the device back into the array, the image
needs to be returned to the read/write state of the top-level LV and made
invisible.
2011-08-18 19:38:26 +00:00
Zdenek Kabelac
787fa3705d Fix memleak of geometry buffer
Looks like this function is not used too often - thus leak was discovered
by static analyzis (Coverity).
2011-08-11 20:49:33 +00:00
Jonathan Earl Brassow
470f8b1266 Add some log_error msg's and fix potential segfault
Thanks to kabi for spotting these - especially the possibility for
segfault if a loop runs all the way through without finding a match.
2011-08-11 19:17:10 +00:00
Zdenek Kabelac
64d62e1ded Add memory pool locking functions
Adding debuging functionality to lock and unlock memory pool.

2 ways to debug code:
crc - is default checksum/hash of the locked pool.
      It gets slower when the pool is larger - so the check is only
      made when VG is finaly released and it has been used more then
      once.Thus the result is rather informative.

mprotect - quite fast all the time - but requires more memory and
           currently it is using posix_memalign() - this could be
	   later modified to use dm_malloc() and align internally.
           Tool segfaults when locked memory is modified and core
	   could be examined for faulty code section (backtrace).

Only fast memory pools could use mprotect for now -
so such debug builds cannot be combined with DEBUG_POOL.
2011-08-11 17:29:04 +00:00
Alasdair Kergon
bd7f41d83c Remove support for the original dm ioctl interface version 1.
Leave the basic support for multiple versions in case we have a new version
in future.
2011-08-09 17:56:47 +00:00