1
0
mirror of git://sourceware.org/git/lvm2.git synced 2024-12-22 17:35:59 +03:00
Commit Graph

1152 Commits

Author SHA1 Message Date
Zdenek Kabelac
38796c3d47 Fix bad error message for thinp validation 2011-09-29 09:03:36 +00:00
Zdenek Kabelac
aebf2d5cdc Add experimental code for activation of thinp targets
No dm messages yes - just a base functionality in the steps of other targets.
For now usable only for debugging and tracing.
2011-09-29 08:56:38 +00:00
Alasdair Kergon
1c26860d82 Abort if _finish_pvmove suspend_lvs fails instead of cleaning up incompletely.
Change suspend_lvs to call vg_revert internally.
Change vg_revert to void and remove superfluous calls after failed vg_commit.
2011-09-27 17:09:42 +00:00
Jonathan Earl Brassow
efa3621a59 Add 'Volume Type' lv_attr characters for RAID and RAID_IMAGE.
RAID_META is already handled.
2011-09-23 15:17:54 +00:00
Peter Rajnoha
125712bea0 Replace open_count check with holders/mounted_fs check on lvremove path.
Before, we used to display "Can't remove open logical volume" which was
generic. There 3 possibilities of how a device could be opened:
  - used by another device
  - having a filesystem on that device which is mounted
  - opened directly by an application

With the help of sysfs info, we can distinguish the first two situations.
The third one will be subject to "remove retry" logic - if it's opened
quickly (e.g. a parallel scan from within a udev rule run), this will
finish quickly and we can remove it once it has finished. If it's a
legitimate application that keeps the device opened, we'll do our best
to remove the device, but we will fail finally after a few retries.
2011-09-22 17:33:50 +00:00
Jonathan Earl Brassow
40c85cf1d7 When up-converting a RAID1 array, we need to allocate new larger arrays for
seg->areas and seg->meta_areas.  We also need to copy the memory from the
old arrays to the newly allocated arrays.  The amount of memory to copy was
determined by seg->area_count.  However, seg->area_count was being set to the
higher value after copying the 'seg->areas' information, but before copying
the 'seg->meta_areas' information.  This means we were copying more memory
than necessary for 'seg->meta_areas' - something that could lead to a segfault.
2011-09-22 15:33:21 +00:00
Jonathan Earl Brassow
4026cb6fd1 fix compiler warning.
Compiler says variable may be used uninitialized.  It can't be, but we
initialize the variable to NULL anyway.  Also, remove the double initialization
of another variable.
2011-09-19 14:28:23 +00:00
Jonathan Earl Brassow
eb607100ef Fix Bug 738832 - core to disk log conversion fails with internal error
This bug showed up when trying to add a log to a mirror whose images are on
multiple devices.  This is an intra-release regression and no WHATS_NEW
entry will be added.  The error was introduce in the following commit:
	2d8a2f35c7

The solution is to recognise in _alloc_init that if there are no mirrors
or stripes specified, then 'new_extents' should be zero.
2011-09-16 18:39:03 +00:00
Jonathan Earl Brassow
a514067448 After suspend/resume following a splitmirror op, call sync_local_dev_names
to settle udev before calling deactivate_lv.

This is an intra-release regression (no WHATS_NEW entry required).  It is
part of the fix for the current WHATS_NEW entry:
  Work around resume_lv causing error LV scanning during splitmirror operation.
2011-09-16 16:41:37 +00:00
Zdenek Kabelac
a6d50bef2f Remove thin volumes before thin pools
When user wants to remove thin pool - check if there are no thin volumes using it.
If so - query before removal (or -ff for no question) and remove them first.
2011-09-16 12:12:51 +00:00
Zdenek Kabelac
4a0c6df8df Reset LV status when unlinking LV from VG
When LV is unlinked, we want to catch problem in vg_validate,
that LV has changed.

i.e. catch LV has been removed and is no long thin_pool while still
being referenced by some thin volume.
2011-09-16 11:59:22 +00:00
Zdenek Kabelac
94147f3f29 Trim spaces on EOL 2011-09-16 11:53:14 +00:00
Petr Rockai
fd7d4adc57 Fix the divisibility check in the allocator for the mirror+stripe case (require
divisibility by stripe count alone, not by (mirror*stripe)).
2011-09-16 09:59:42 +00:00
Milan Broz
c81a322337 Activate virtual snapshot origin exclusively (only on local node in cluster). 2011-09-14 14:20:16 +00:00
Zdenek Kabelac
e24be2abe4 Add suggest parentheses around '&&'
Follow gcc suggestion.
2011-09-14 10:03:15 +00:00
Zdenek Kabelac
886d005616 LVM_WRITE and LVM_READ are 64bit constants
Revert John patch, which fixed only 1 place where ~LVM_WRITE was in use and
convert ommited LVM_READ/WRITE flags to 64bit constants as well.
(Since both 'status' flags for LV and VG are 64bit.)
2011-09-14 09:57:35 +00:00
Zdenek Kabelac
3e25de05a9 Add missing underscores to local static functions 2011-09-14 09:54:21 +00:00
Jonathan Earl Brassow
462579d54e Additional fixes for lv_mirror_count.
Changing lv_mirror_count to only count the AREA_LVs made the function
stop working for PVMOVE mirrors.  A conditional has been added to fix
that problem.  Additionally, when counting the images in a mirror stack,
we don't need to subtract 1 from the count we get back from the
lv_mirror_count call on the temporary mirror layer.  (This is because we
are no falsely counting the top layer of the temporary mirror.)
2011-09-14 04:10:26 +00:00
Jonathan Earl Brassow
9cb27929e9 Fix for bug 734252 - problem up converting striped mirror after image failure
lv_mirror_count was not able to handle mirrors of stripes properly.  When a
failed device is removed, the MIRRORED status flag is removed from the LV
conditionally based on the results of lv_mirror_count.  However, lv_mirror_count
trusted the MIRRORED flag - thinking any such LV must be mirrored.  It would
happily assign first_seg(lv)->area_count as the number of mirrors, but when
a mirrored striped LV was reduced to a simple striped LV area_count would be
the number of /stripes/ not the number of /mirrors/.  A result higher than 1
would be returned from lv_mirror_count, the MIRRORED flag would not be cleared,
and the LV would fail to be up-converted properly in lvconvert_mirrors_aux
because of it.
2011-09-14 02:45:36 +00:00
Jonathan Earl Brassow
46f0efbfce Fix bug 733400 - Mirror down conversion when specifying the secondary leg is broke
The operation of deactivating the residual error target LV after removing a
mirror layer can cause a "device in-use" conflict with udev.  Giving udev a
poke before calling deactivate_lv eliminates the conflict.  The stick used
to poke udev is 'sync_local_dev_names'.
2011-09-13 21:13:33 +00:00
Jonathan Earl Brassow
c94c47abd7 Fix for bug 737200 - Can't create mirrored-log mirror on a VG with small extents
Kernel requires a mirror to be at least 1 region large.  So,
if our mirror log is itself a mirror, it must be at least
1 region large.  This restriction may not be necessary for
non-mirrored logs, but we apply the rule anyway.

(The other option is to make the region size of the log
mirror smaller than the mirror it is acting as a log for,
but that really complicates things.  It's much easier to
keep the region_size the same for both.)
2011-09-13 18:42:57 +00:00
Jonathan Earl Brassow
f5e43f061a Better fix for bug 737125 - unable to create mirror on 1K extent size VG
WHATS_NEW entry:
Fix log size calculation when only a log is being added to a mirror.

The original fix pass the mirror LV to allocate_extents (rather than
passing NULL) so that _alloc_init could correctly determine the necessary
size of the mirror log.  In the previous check-in, I noted:
    In order to get a decent value computed, we need to pass in the 'lv' argument
    to allocate_extents.  This would normally imply a desire for cling/contiguous
    allocation to the given LV, but since we are not allocating any parallel
    extents and only log extents, it works fine.
However, passing in the LV did have unintended consequences on the placement of
the log.  The better solution is to pass in the number of extext that are in
the mirror LV instead of the LV itself.  This will not cause the allocator to
reserve that number of extents, because 'stripes' and 'mirrors' are specified
as 0.  Thus, 'extents' is used to calculate the size of the log, but won't
affect how much is allocated.
2011-09-13 18:11:38 +00:00
Jonathan Earl Brassow
0c89ef513a Changing RAID status flags to 64-bit broke some binary flag operations.
LVM_WRITE is a 32-bit flag.  Now that RAID[_IMAGE|_META] are 64-bit,
and'ing a RAID LV's status against LVM_WRITE can reset the higher order
flags.

A similar thing will affect thinp flags if not careful.
2011-09-13 16:33:21 +00:00
Jonathan Earl Brassow
cc9dc919e6 Fix for bug 737125 - unable to create mirror on 1K extent size VG
_alloc_init calculates the number of necessary log extents via
'mirror_log_extents'.  'mirror_log_extents' takes 3 arguments: region_size,
pe_size, and size of the mirror LV.  Unfortunately, _alloc_init is guessing at
the mirror size by using 'ah->new_extents / ah->area_multiple' - the number of
extents that the mirror images have.  However, this is /always/ wrong when
allocating the log separately.  Further, the log is always allocated separately
unless we are up-converting the mirror at the same time.  It was by luck alone
that a default value of '1' reflects what we want in most cases.

In order to get a decent value computed, we need to pass in the 'lv' argument
to allocate_extents.  This would normally imply a desire for cling/contiguous
allocation to the given LV, but since we are not allocating any parallel
extents and only log extents, it works fine.
2011-09-13 14:37:48 +00:00
Jonathan Earl Brassow
6d0aa801a0 Fix for bug 733114.
When an image is split from a 2-way mirror, the original mirror is converted to
a linear device.  To do this, the top "layer" must be removed.  The segments
are transferred from the sub-lv to the top-level LV and the link is severed.
The former sub-lv - having its segments transferred - now contains a temporary
error target.

When the original LV is resumed, the old sub-lv that now contains an error
segment is activated and scanned.  This is what causes the I/O error messages.
There are three ways to fix this problem:

1) Do not set the sub-lv which contains the error target as "visible" before
suspending the original LV.  This way, when the original is resumed, the sub-lv
device node is not created and it is not scanned - avoiding the error messages.
 The problem with this approach is that if the machine crashes after the
resume, it leaves the *hidden* LV in place and the user has a more difficult
time noticing that it needs to be cleaned up.  Thus, this type of processing is
frowned upon.

2) Do like _remove_mirror_images does and suspend the original, then suspend
the sub-lv (the error target), then resume the sub-lv, and finally resume the
original LV.  This seems like extra pointless operations to me, but it does not
produce the error message (although, I'm not sure why) and it allows us to
leave the visible flag in place.

3) Flag the sub-lv (error target) with a "do not scan" flag.  This seems like
the cleanest approach, but I have been unable to find the method for doing
this.  LVs get tagged in such a way by _get_udev_flags, but in this case the
resume of the original LV also resumes the error target LV without running it
through _get_udev_flags (likely because they are no longer linked).  Could
there be something wrong in resume_lv?

Option #2 was chosen to fix this bug, but it seems like more of a workaround
for now.
2011-09-13 13:59:19 +00:00
Alasdair Kergon
5081181b5d Append z to lv_attr if new blocks will be zeroed. 2011-09-09 01:15:18 +00:00
Alasdair Kergon
dbb48de507 Add a new 'thin_pool' output field to 'lvs.
A gentle reminder that anyone relying on the output of reporting commands
like lvs in scripts must use -o to guarantee they get the fields they expect.

The default sequence of fields can change from release to release.
Equally, the 'attr' fields can have new values introduced and/or characters
appended to them.
2011-09-09 00:54:49 +00:00
Alasdair Kergon
52e3f9dd5e Add 7th lv_attr char to show the related kernel target.
Add thin volume types to lv_attr.
2011-09-08 20:55:39 +00:00
Alasdair Kergon
ef78ebf35a lvcreate/remove thin_pool and thin volumes (--driverloaded n only) 2011-09-08 16:41:18 +00:00
Alasdair Kergon
1abaaab1bc Terminate pv_attr field correctly. (2.02.86) 2011-09-07 13:42:00 +00:00
Zdenek Kabelac
f32b76a193 Minor change for pv_create api
Switch int to unsigned type.
2011-09-07 08:34:21 +00:00
Alasdair Kergon
bb6f9b10db pool attach fns & more field renaming 2011-09-06 22:43:56 +00:00
Alasdair Kergon
b88362ff95 add thin_manip.c like the other manip files
move basic lv_is_* to macros
data_lv -> pool_lv - we decided to call it 'pool' everywhere now
2011-09-06 19:25:42 +00:00
Alasdair Kergon
2ef5b7cca6 Start using 64-bit status flags - most of the code already handles them.
tdata -> tpool
remove commented out definitions from metadata.h
formatting clean-ups
2011-09-06 18:49:31 +00:00
Alasdair Kergon
dd44cccefe else 2011-09-06 15:39:46 +00:00
Alasdair Kergon
9ac61d2ba2 lvcreate parsing for thin provisioning.
The rest is incomplete so this isn't usable yet.
2011-09-06 00:26:42 +00:00
Jonathan Earl Brassow
da23255cc9 Fix for bug 732142: Unsafe table load during mirror image split
There was a bad sequence:
*) Make changes to LV layout to split images (e.g. 4-way -> 2-way/2-way)
1) vg_write, suspend_lv(original_mirror), vg_commit
2) activate_lv(newly_split_lv)
3) resume_lv(original_mirror)

Step #2 is not allowed.  However, without it, the resume of the original
mirror will also resume its former sub-LVs - making it impossible to
activate the newly split LV due to the changes in layering, pointers, and
names that had already been made.  Additionally, the resume or the original
brings the sub-lv's online with names that differ from the metadata on disk -
also a no-no.  Thus, the split must be done in stages such that the active LVs
always reflect what is in the committed LVM metadata.

First, alter the original mirror by releasing the images.  The images are made
visible and independent as an intermediate stage.  (This way, we can have
consistency between LVM metadata and active LVs.)  The second stage collects
the recently split LVs, deactivates them, forms them into a mirror if necessary,
and then activates them.  It is a bit of a circuitous method, but it is the only
way to split a mirror from a mirror and obey these general rules:
1) Never [de]activate sub-lvs when the top-level LV is suspended
2) Avoid having active LVs that differ from the description in the LVM metadata

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
2011-09-01 19:22:11 +00:00
Zdenek Kabelac
3caa77f831 Use size_t return type
Since these function returns buffer size - use size_t type for them.
2011-09-01 10:25:22 +00:00
Petr Rockai
e59e2f7c3c Move the core of the lib/config/config.c functionality into libdevmapper,
leaving behind the LVM-specific parts of the code (convenience wrappers that
handle `struct device` and `struct cmd_context`, basically). A number of
functions have been renamed (in addition to getting a dm_ prefix) -- namely,
all of the config interface now has a dm_config_ prefix.
2011-08-30 14:55:15 +00:00
Alasdair Kergon
11bfaa1df8 same for segtype_is_thin 2011-08-26 18:17:05 +00:00
Alasdair Kergon
6fbf1c6b56 seg_is_thin includes both thin_pool and thin_volume 2011-08-26 18:15:14 +00:00
Alasdair Kergon
42914557d5 thin - hide unimplemented dso fn; remove duplicate origin_lv field; add
some lvcreate struct parms
2011-08-26 17:40:53 +00:00
Zdenek Kabelac
e82bd6249b Initial code for read/write of thin metadata lv segments 2011-08-26 13:37:47 +00:00
Zdenek Kabelac
9d32170d5c Add registration of thin_pool segment
Register thin and thin_pool segment via multiple_segtypes.
2011-08-25 10:00:09 +00:00
Alasdair Kergon
f9b92564a7 Fix raid shared lib segtype registration (2.02.87). 2011-08-24 13:41:46 +00:00
Zdenek Kabelac
3ba4a19510 Initial code layout for thin provisioning target
Only registers init_thin_segtype

Option --with-thin=internal needed for compilation.
For now useful only for developememt!
2011-08-24 08:27:49 +00:00
Alasdair Kergon
c31d14d786 Remove incorrect error message added in 2.02.87. 2011-08-19 22:55:07 +00:00
Alasdair Kergon
1d64dcfbf7 clarify comment 2011-08-19 19:35:50 +00:00
Alasdair Kergon
ba7df3de88 avoid multi-line calc with incorrect intermediate var contents 2011-08-19 16:41:26 +00:00
Alasdair Kergon
3250b38583 _ for static fns 2011-08-19 15:59:15 +00:00
Jonathan Earl Brassow
a2facf4ad4 Add ability to merge back a RAID1 image that has been split w/ --trackchanges
Argument layout is very similar to the merge command for snapshots.
2011-08-18 19:43:08 +00:00
Jonathan Earl Brassow
f439e65b64 Add support for m-way to n-way up-convert in RAID1 (no linear to n-way yet)
This patch adds the ability to upconvert a raid1 array - say from 2-way to
3-way.  It does not yet support upconverting linear to n-way.

The 'raid' device-mapper target allows for individual components (images) of
an array to be specified for rebuild.  This mechanism is used when adding
new images to the array so that the new images can be resync'ed while the
rest of the images in the array can remain 'in-sync'.  (There is no
mirror-on-mirror layering required.)
2011-08-18 19:41:21 +00:00
Jonathan Earl Brassow
6d04311efa Add the ability to split an image from the mirror and track changes.
~> lvconvert --splitmirrors 1 --trackchanges vg/lv
The '--trackchanges' option allows a user the ability to use an image of
a RAID1 array for the purposes of temporary read-only access.  The image
can be merged back into the array at a later time and only the blocks that
have changed in the array since the split will be resync'ed.  This
operation can be thought of as a partial split.  The image is never completely
extracted from the array, in that the array reserves the position the device
occupied and tracks the differences between the array and the split image via
a bitmap.  The image itself is rendered read-only and the name (<LV>_rimage_*)
cannot be changed.  The user can complete the split (permanently splitting the
image from the array) by re-issuing the 'lvconvert' command without the
'--trackchanges' argument and specifying the '--name' argument.
	~> lvconvert --splitmirrors 1 --name my_split vg/lv
Merging the tracked image back into the array is done with the '--merge'
option (included in a follow-on patch).
	~> lvconvert --merge vg/lv_rimage_<n>

The internal mechanics of this are relatively simple.  The 'raid' device-
mapper target allows for the specification of an empty slot in an array
via '- -'.  This is what will be used if a partial activation of an array
is ever required.  (It would also be possible to use 'error' targets in
place of the '- -'.)  If a RAID image is found to be both read-only and
visible, then it is considered separate from the array and '- -' is used
to hold it's position in the array.  So, all that needs to be done to
temporarily split an image from the array /and/ cause the kernel target's
bitmap to track (aka "mark") changes made is to make the specified image
visible and read-only.  To merge the device back into the array, the image
needs to be returned to the read/write state of the top-level LV and made
invisible.
2011-08-18 19:38:26 +00:00
Jonathan Earl Brassow
a324baf6a1 Add --splitmirrors support for RAID1 (1 image only)
Users already have the ability to split an image from an LV of "mirror"
segtype.  This patch extends that ability to LVs of "raid1" segtype.

This patch only allows a single image to be split off, however.  (The
"mirror" segtype allows an arbitrary number of images to be split off.
e.g.  4-way => 3-way/linear, 2-way/2-way, linear,3-way)
2011-08-18 19:34:18 +00:00
Jonathan Earl Brassow
63d32fb6a6 When down-converting RAID1, don't activate sub-lvs between suspend/resume
of top-level LV.

We can't activate sub-lv's that are being removed from a RAID1 LV while it
is suspended.  However, this is what was being used to have them show-up
so we could remove them.  'sync_local_dev_names' is a sufficient and
proper replacement and can be done after the top-level LV is resumed.
2011-08-18 19:31:33 +00:00
Jonathan Earl Brassow
4903b85d23 Compiler warning fixes, better error messaging, and cosmetic changes.
1) add new function 'raid_remove_top_layer' which will be useful
to other conversion functions later (also cleans up code)
2) Add error messages if raid_[extract|add]_images fails
3) Add function prototypes to prevent compiler warnings when
compiling with '--with-raid=shared'
2011-08-13 04:28:34 +00:00
Jonathan Earl Brassow
a22515c87f Various code clean-ups (s/malloc/zalloc/, new msgs, etc)
Fix a couple more issues that kabi found.
- Add some error messages in failure cases
- s/malloc/zalloc/
- use vg->vgmem for lv names instead of vg->cmd->mem
2011-08-11 21:32:18 +00:00
Jonathan Earl Brassow
2100c90dd7 Add missing checks for function return codes.
Some functions were being called without having their return values checked.
2011-08-11 19:38:00 +00:00
Jonathan Earl Brassow
b2fa9b43dc Add some log_error msg's and fix potential segfault
Thanks to kabi for spotting these - especially the possibility for
segfault if a loop runs all the way through without finding a match.
2011-08-11 19:17:10 +00:00
Jonathan Earl Brassow
4aebd52c4c Add ability to down-convert RAID1 arrays.
Also, add some simple RAID tests to testsuite.
2011-08-11 18:24:40 +00:00
Zdenek Kabelac
031c986ea8 Lock memory for shared VG
Use debug pool locking functionality. So the command could check,
whether the memory in the pool has not been modified.

For lv_postoder() instead of unlocking and locking for every changed
struct status member do it once when entering and leaving function.
(mprotect would trap each such memory access).
Currently lv_postoder() does not modify other part of vg structure
then status flags of each LV with flags that are reverted back to
its original state after function exit.
2011-08-11 17:34:30 +00:00
Zdenek Kabelac
bb115a7a6c Cache and share generated VG structs
Extend vginfo cache with cached VG structure. So if the same metadata
are use, skip mda decoding in the case, the same data are in use.
This helps for operations like activation of all LVs in one VG,
where same data were decoded giving the same output result.

Patch adds 1-to-1 connection between volume_group and lvmcache_vginfo.
2011-08-11 17:24:23 +00:00
Peter Rajnoha
47d7f00e16 Fix possible format instance memory leaks and premature releases in _vg_read. 2011-08-11 16:31:40 +00:00
Jonathan Earl Brassow
66d9675559 Fix renaming of RAID logical volumes.
The function 'for_each_sub_lv', which rename uses, was not handling the
RAID metadata areas.  Thus, the metadata LVs were not being renamed.
2011-08-11 03:29:51 +00:00
Zdenek Kabelac
530b00a652 Just add new lines between header comment 2011-08-10 20:26:41 +00:00
Zdenek Kabelac
077a6755ff Replace free_vg with release_vg
Move the free_vg() to  vg.c  and replace free_vg  with release_vg
and make the _free_vg internal.

Patch is needed for sharing VG in vginfo cache so the release_vg function name
is a better fit here.
2011-08-10 20:25:29 +00:00
Zdenek Kabelac
789f9c55e5 Remove INCONSISTENT_VG flag
As this flag could not have been set by the current code - removing it.

Note: because of the wrong code logic this call:

lvmcache_update_vg(correct_vg, correct_vg->status & PRECOMMITTED &
			   (inconsistent ? INCONSISTENT_VG : 0));

had always passed '0' - now after flag removal it's passing
PRECOMMITTED flag in - this present functinal change in this patch.

To match the original functionality - 0 had to be always passed.
More testing is needed here.
2011-08-10 20:17:33 +00:00
Jonathan Earl Brassow
e01bcc6884 Fix compiler warning.
Compiler complaining that meta_lv could be used uninitialized.  (Not true
because it is protected by 'clear_metadata'.)  I switched to using 'lv->vg',
as it makes no difference to vg_[write|commit].
2011-08-10 16:44:17 +00:00
Peter Rajnoha
0127a9a525 Remove unused 'origin' variable in lv_remove_single function. 2011-08-05 09:21:13 +00:00
Zdenek Kabelac
425862fb95 Remove unused inconsistent_seqno
Last usage was removed in Petr's commit related to VG mda repair fix
where relaxed check starts to ignore inconsistencies coming from
PVs that are marked MISSING - thus removing unused variable.
2011-08-04 15:18:10 +00:00
Jonathan Earl Brassow
cac52ca4ce Add basic RAID segment type(s) support.
Implementation described in doc/lvm2-raid.txt.

Basic support includes:
- ability to create RAID 1/4/5/6 arrays
- ability to delete RAID arrays
- ability to display RAID arrays
Notable missing features (not included in this patch):
- ability to clean-up/repair failures
- ability to convert RAID segment types
- ability to monitor RAID segment types
2011-08-02 22:07:20 +00:00
Jonathan Earl Brassow
7411a44871 Remove and unneeded parameter from build_parallel_areas_from_lv() 2011-07-19 16:37:42 +00:00
Jonathan Earl Brassow
aa6599e687 Fix potential null ptr deref in 'origin_from_cow'
return NULL rather than segfaulting if lv->snapshot is not set
2011-07-19 16:23:52 +00:00
Alasdair Kergon
ee840ff14c Move snapshot deactivation logic into lib/activate, fixing the
teardown sequence.  (Previously the snapshot was deactivated while its
origin was active and before its removal was committed to disk, so
restarting after a crash at the point would leave corruption.)
2011-07-08 12:48:41 +00:00
Alasdair Kergon
0f2a4ca2b5 When suspending, automatically preload newly-visible existing LVs
Let's find out if this makes things better or worse overall...
2011-06-30 18:25:18 +00:00
Alasdair Kergon
1d7649f36b Reinstate correct permissions when creating mirrors. 2011-06-29 17:05:53 +00:00
Alasdair Kergon
e189a84f57 Append 'm' attribute to pv_attr for missing PVs. 2011-06-29 14:56:33 +00:00
Alasdair Kergon
140615dafb remove unused var after recent patch 2011-06-24 23:39:09 +00:00
Jonathan Earl Brassow
9e0edb7ee5 Fix to preserve exclusive activation of mirror while up-converting.
When an LVM mirror is up-converted (an additional image added), it creates
a temporary mirror stack.  The lower-level mirror in the stack that is
created was not being activated exclusively - violating the exclusive nature
of the original mirror.  We now check for exclusive activation of a mirror
before converting it, and if found, we ensure that the temporary mirror
is also exclusively activated.
2011-06-23 14:00:58 +00:00
Milan Broz
6adbb95b82 Fail allocation if number of extents not divisible by area count
Allocation should fail early if this condition is not met.

Quick fix for https://bugzilla.redhat.com/show_bug.cgi?id=707779
2011-06-23 10:53:24 +00:00
Jonathan Earl Brassow
9e277b9e2c Fix issue preventing cluster mirror creation.
Mirrors used to be created by first creating a linear device and then adding
the other images plus the log.  Now mirrors are created by creating all the
images in one go and then adding the log separately.  The new way ran into
the condition that cluster mirrors cannot change the log type (in the case
of creation, from core -> disk) while the mirror is not active.  (It isn't
active because it is in the process of being created.)  The reason this
condition is in place is because a remote node may have the mirror active, and
we don't want to alter the log underneath it.

What we really needed was a way of checking if the mirror was active remotely
but not locally, and in that case do not allow a change of the log.  I've added
this check, and cluster mirrors can now be created again.
2011-06-22 21:31:21 +00:00
Zdenek Kabelac
bebe60b70c Code move of vg_mark_partial() up in stack
It's useful to keep the partial flag cached - so just move the call
for vg_mark_partil_lvs() into import_vg_from_config_tree() so it gets
evaluated before it goes through the lvmcache.

This patch should not present any functional change.

Note: It is rather temporal solution - proper place is probably inside the
'read' call back - but needs some more discussion.
For now using this minor hack.
2011-06-17 14:39:10 +00:00
Zdenek Kabelac
93a98c2672 Remove unused internal flag ACTIVATE_EXCL from the code 2011-06-17 14:30:58 +00:00
Zdenek Kabelac
f50a76379a Remove test for status flag
As the ACTIVATE_EXCL could be set only in clvmd code - there is no
use for this test in lv_add_mirrors() function only called from
tools context.

FIXME: Add cluster test case for this.
2011-06-17 14:27:34 +00:00
Zdenek Kabelac
f3d8974dc9 Add couple FIXMEs around suspicious code 2011-06-17 14:24:18 +00:00
Zdenek Kabelac
81beded3af Add lv_activate_opts structure
To avoid modification of 'read-only' volume group structure
add a new structure to pass local data around the code for LV
activation.

As origin_only is one such flag - replace this parameter with new
struct lv_activate_opts.

More parameters might eventually become part of lv_activate_opts.
2011-06-17 14:14:19 +00:00
Petr Rockai
6d25c0d26f Fix RHBZ 651590 (failure to lock LV results in failure to repair mirror after
transient error), stemming from the following sequence of events:

1) devices fail IO, triggering repair
2) dmeventd starts fixing up the mirror
3) during the downconversion, a new metadata version is written

--> the devices come back online here

4) the mirror device suspend/resume is called to update DM tables
5) during the suspend/resume cycle, *pre*-commit metadata is read;
   however, since the failed devices are now back online, we get back
   inconsistent set of precommit metadata and the whole operation fails

The patch relaxes the check that fails in step 5 above, namely by ignoring
inconsistencies coming from PVs that are marked MISSING.
2011-06-15 17:45:02 +00:00
Alasdair Kergon
7df72b3c88 Fix last snapshot removal to avoid table reload while a device is suspended. 2011-06-13 22:28:04 +00:00
Alasdair Kergon
df390f1799 Major pvmove fix to issue ioctls in the correct order when multiple LVs
are affected by the move.  (Currently it's possible for I/O to become
trapped between suspended devices amongst other problems.

The current fix was selected so as to minimise the testing surface.  I
hope eventually to replace it with a cleaner one that extends the
deptree code.

Some lvconvert scenarios still suffer from related problems.
2011-06-11 00:03:06 +00:00
Milan Broz
4fb39ae074 Validate mirror segments size
Currently some operation with striped mirrors lead
to corrupted metadata, this patch just add detection of such
situation.

Example:
# lvcreate -i2 -l10 -n lvs vg_test
# lvconvert -m1 vg_test/lvs

# lvreduce -f -l1 vg_test/lvs
  Reducing logical volume lvs to 4.00 MiB
  Segment extent reduction 9not divisible by #stripes 2
  Logical volume lvs successfully resized

# lvremove vg_test/lvs
  Segment extent reduction 1not divisible by #stripes 2
  LV segment lvs:0-4294967295 is incorrectly listed as being used by LV lvs_mimage_0
  Internal error: LV segments corrupted in lvs_mimage_0.
2011-06-09 19:36:16 +00:00
Alasdair Kergon
bb056af3c9 missing space in mesg 2011-06-06 12:08:42 +00:00
Alasdair Kergon
3cac20f850 Defer writing PV labels to vg_write.
Store label_sector only in struct physical_volume.
2011-06-01 19:29:31 +00:00
Alasdair Kergon
453cdee51c Permit --available with lvcreate so non-snapshot LVs need not be activated. 2011-06-01 19:21:03 +00:00
Petr Rockai
833a287337 Make vg_mark_partial_lvs also clear existing PARTIAL_LV flags, so it can be
issued repeatedly on the same VG, keeping the PARTIAL_LV flags up to date.
2011-05-07 13:32:05 +00:00
Alasdair Kergon
5510b4e7d7 test update without WHATS_NEW to check it gives warning now 2011-04-29 19:06:17 +00:00
Alasdair Kergon
9cda028a96 clean up critical section patch 2011-04-28 20:29:59 +00:00
Zdenek Kabelac
b680d5bf7b Fix use of released vgname and vgid
Avoid using of already released memory when duplicated MDA is found.

As get_pv_from_vg_by_id() may call lvmcache_label_scan() use the local copy
of the vgname and vgid on the stack as vginfo may dissapear and code was
then accessing garbage in memory.

i.e.  pvs  /dev/loop0
(when /dev/loop0 and /dev/loop1 has same MDA content)

Invalid read of size 1
   at 0x523C986: dm_hash_lookup (hash.c:325)
   by 0x440C8C: vginfo_from_vgname (lvmcache.c:399)
   by 0x4605C0: _create_vg_text_instance (format-text.c:1882)
   by 0x46140D: _text_create_text_instance (format-text.c:2243)
   by 0x47EB49: _vg_read (metadata.c:2887)
   by 0x47FBD8: vg_read_internal (metadata.c:3231)
   by 0x477594: get_pv_from_vg_by_id (metadata.c:344)
   by 0x45F07A: _get_pv_if_in_vg (format-text.c:1400)
   by 0x45F0B9: _populate_pv_fields (format-text.c:1414)
   by 0x45F40F: _text_pv_read (format-text.c:1493)
   by 0x480431: _pv_read (metadata.c:3500)
   by 0x4802B2: pv_read (metadata.c:3462)
 Address 0x652ab80 is 0 bytes inside a block of size 4 free'd
   at 0x4C2756E: free (vg_replace_malloc.c:366)
   by 0x442277: _free_vginfo (lvmcache.c:963)
   by 0x44235E: _drop_vginfo (lvmcache.c:992)
   by 0x442B23: _lvmcache_update_vgname (lvmcache.c:1165)
   by 0x443449: lvmcache_update_vgname_and_id (lvmcache.c:1358)
   by 0x443C07: lvmcache_add (lvmcache.c:1492)
   by 0x46588C: _text_read (text_label.c:271)
   by 0x466A65: label_read (label.c:289)
   by 0x4413FC: lvmcache_label_scan (lvmcache.c:635)
   by 0x4605AD: _create_vg_text_instance (format-text.c:1881)
   by 0x46140D: _text_create_text_instance (format-text.c:2243)
   by 0x47EB49: _vg_read (metadata.c:2887)

Add testing script
2011-04-21 13:13:40 +00:00
Mike Snitzer
ffcb1b9c2c Improve the discard documentation. Also improve discard code in
pv_manip.c to properly account for case when pe_start=0 and the first
physical extent is to be released (currently skip the first extent to
avoid discarding the PV label).
2011-04-13 18:26:39 +00:00
Mike Snitzer
727373c176 Use uint32_t rather than uint64_t. 2011-04-12 22:04:04 +00:00
Mike Snitzer
fdc8670327 Add "devices/issue_discards" to lvm.conf.
Issue discards on lvremove if enabled and both storage and kernel have support.
2011-04-12 21:59:01 +00:00
Zdenek Kabelac
96077265c4 Replace dm_snprintf with strncpy
My previous patch fixed incorrect error check for dm_snprintf.
However in this particular case - dm_snprintf has been used differently -
just like strncpy + setting last char with '\0' - so the code had to return
error - because the buffer was to short for whole string.

Patch replaces it with real strncpy.
Also test for alloca() failure is removed - as the program behaviour
is rather undefined in this case - it never returns NULL.
2011-04-12 14:13:17 +00:00
Petr Rockai
db22d9b978 This patchset refactors some reporting code and completes the remaining
lvseg properties for lvm2app, 'devices' and 'seg_pe_ranges'.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Reviewed-by: Petr Rockai <prockai@redhat.com>
2011-04-12 12:24:29 +00:00
Zdenek Kabelac
c67d2b4dd4 Fix incorrect tests for dm_snprintf() failure
As the memory is preallocated based on arg size in these cases,
the error would be quite hard to trigger here anyway.
2011-04-09 19:05:23 +00:00
Zdenek Kabelac
a1eba521e3 Fix some unmatching sign comparation gcc warnings
Simple replacement for unsigned type - usually in for() loops.
2011-04-08 14:40:18 +00:00
Jonathan Earl Brassow
532e6c8ae3 Thanks to Zdenek Kabelac (kabi) for pointing out that I was using
dm_pool_free incorrectly.  This check-in fixes that incorrect usage.

I've also added a WHATS_NEW line to reflect the changes I made to allow
lv_extend to operate on 0 length intrinsically layered LVs (i.e mirrors
and RAID).  I forgot that in the last commit.
2011-04-07 21:49:29 +00:00
Jonathan Earl Brassow
fe93c99ad9 This patch adds the ability to extend 0 length layered LVs. This
allows us to allocate all images of a mirror (or RAID array) at one
time during create.

The current mirror implementation still requires a separate allocation
for the log, however.
2011-04-06 21:32:20 +00:00
Peter Rajnoha
29684f590c Cleanup fid finalization code in free_vg and allow exactly the same fid to be set again for a PV/VG.
Actually, we can call vg_set_fid(vg, NULL) instead of calling
destroy_instance for all PV structs and a VG struct - it's the same
code we already have in the vg_set_fid.

Also, allow exactly the same fid to be set again for the same PV/VG
Before, this could end up with the fid destroyed because we destroyed
existing fid first and then we used the new one and we didn't care
whether existing one == new one by chance.
2011-04-01 14:54:20 +00:00
Zdenek Kabelac
3d04380691 Use created hash tables for quick check of LV, PV.
Instead of searching linear list of all LVs, PVs - use created hash tables
also for quick mapping between LV.

(Note - for small number of PVs or LVs the overhead of the hash is bigger).

TODO: Use hash tables in volume_group structure directly.
2011-03-30 13:35:51 +00:00
Zdenek Kabelac
1bedd3a97b Use id_equal instead of strncmp()
More consistent and easier to read.
2011-03-29 21:57:56 +00:00
Zdenek Kabelac
f77736cab5 Remove double braces
Clang gives notice about possible confusion as commonly double bracces are
used when some assignment is done inside them.
2011-03-29 20:19:03 +00:00
Jonathan Earl Brassow
60c10a45ce s/MIRROR_NOTSYNCED/LV_NOTSYNCED/ - Flag will may refer to more than just mirrors 2011-03-29 12:51:57 +00:00
Jonathan Earl Brassow
be226be635 Fix unhandled condition in _move_lv_segments
If _move_lv_segments is passed a 'lv_from' that does not yet
have any segments, it will screw things up because the code
that does the segment copy assumes there is at least one
segment.  See copy code here:
        lv_to->segments = lv_from->segments;
        lv_to->segments.n->p = &lv_to->segments;
        lv_to->segments.p->n = &lv_to->segments;

If 'segments' is an empty list, the first statement copies over
the values, but the next two reset those values to point to the
other LV's list structure.  'lv_to' now appears to have one
segment, but it is really an ill-set pointer.
2011-03-25 22:02:27 +00:00
Petr Rockai
5ef2808bc7 In some cases, we could end up with a mirrored LV without a MIRRORED flag. In
other cases, the code could wind up removing wrong number of mirrors. In yet
other cases, we could remove the right number of mirrors, but fail to respect
the removal preferences (i.e. keep an image that was requested to be removed
while removing an image that was requested to be kept). Under some
circumstances, remove_mirror_images could also get stuck in an infinite loop.

This patch should fix all of the above undesirable behaviours.

Signed-off-by: Petr Rockai <prockai@redhat.com>
Reviewed-by: Jonathan Brassow <jbrassow@redhat.com>
2011-03-24 12:28:02 +00:00
Zdenek Kabelac
b8ccce3500 Add missing \0 for grown debug object
Attach \0 for proper char* display - otherwise somewhat random message could
be displayed in debug more and read of unpredictable read of uninitilized
memory values could happen.
2011-03-14 17:00:57 +00:00
Zdenek Kabelac
844b75f4d6 Fix allocation of system_id
As code uses strncpy(system_id, NAME_LEN) and doesn't set '\0'
Fix it by always allocating NAME_LEN + 1 buffer size and with zalloc
we always get '\0' as the last byte.

This bug may trigger some unexpected behavior of the string operation
code - depends on the pool allocator.

FIXME: refactor this code to alloc_vg.
2011-03-13 23:05:48 +00:00
Peter Rajnoha
ff4479414c Use format instance mempool where possible and adequate. 2011-03-11 15:10:16 +00:00
Peter Rajnoha
e8d4946ec7 Various cleanups for fid mem and ref_count changes.
Missing free_vg on error_path in lvmcache_get_vg fn. Call destroy_instance
only if the fid is not part of the vg in backup_read_vg fn (otherwise it's
part of the VG we're returning and we definitely don't want to destroy it!).
2011-03-11 15:08:31 +00:00
Peter Rajnoha
2feb2a66fd Call destroy_instance for any PVs found in VG structure during vg_free call.
This is necessary for proper format instance ref_count support. We iterate
over vg->pvs and vg->removed_pvs list and the ref_count is decremented and
then it is destroyed if not referenced anymore.
2011-03-11 15:06:13 +00:00
Peter Rajnoha
84f48499a3 Add new free_pv_fid fn and use it throughout to free all attached fids.
Since format instances will use own memory pool, it's necessary to properly
deallocate it. For now, only fid is deallocated. The PV structure itself
still uses cmd mempool mostly, but anytime we'd like to add a mempool
in the struct physical_volume, we can just rename this fn to free_pv and
add the code (like we have free_vg fn for VGs).
2011-03-11 14:56:56 +00:00
Peter Rajnoha
1307ddf4cf Use only vg_set_fid and new pv_set_fid fn to assign the format instance.
This is essential for proper format instance ref_count support. We must
use these functions to set the fid everywhere from now on, even the NULL
value!
2011-03-11 14:50:13 +00:00
Peter Rajnoha
a1bec4e685 Add mem and ref_count fields to struct format_instance for own mempool use.
Format instances can be created anytime on demand and it contains
metadata area information mostly (at least for now, but in the future,
we may store more things here to update/edit in a PV/VG). In case we
have lots of metadata areas, memory consumption will rise. Using cmd
context mempool is not quite optimal here because it is destroyed too
late. So let's use a separate mempool for format instances.

Reference counting is used because fids could be shared, e.g. each PV
has either a PV-based fid or VG-based fid. If it's VG-based, each PV has
a shared fid with the VG - a reference to VG's fid.
2011-03-11 14:38:38 +00:00
Peter Rajnoha
56f5b12eed Use new alloc_fid fn for common format instance initialisation. 2011-03-11 14:30:27 +00:00
Zdenek Kabelac
a6f38f9d6a Missed merge fix in vg_validate patch 2011-03-10 22:39:36 +00:00
Zdenek Kabelac
442dbf9ad8 Refactor code for _lv_postoder
Add _lv_postorder_vg() - for calling _lv_postorder() for every LV from VG.
We use this in 2 places -  vg_mark_partial_lvs() and vg_validate()
so make it as a one function.

Benefit here is - to use only one cleanup code and avoid
potentially duplicate scans of same LVs.
2011-03-10 14:40:32 +00:00
Zdenek Kabelac
4ee2b4965f Use hash tables for validating names
Accelerate validation loop by using lvname, lvid, pvid hash tables.
Also merge pvl loop into one cycle now - no need to scan the list twice.
List scan is stopped when dm_hash_insert fails.

The error message with loop_counter1 is no longer provided - however
the message has been misleading anyway.
2011-03-10 13:11:59 +00:00
Zdenek Kabelac
3019419e95 Refactor vg allocation code
Create new function alloc_vg() to allocate VG structure.

It takes pool_name (for easier debugging).
and also take vg_name to futher simplify code.

Move remainder of _build_vg_from_pds  to _pool_vg_read
and use vg memory pool for import functions.
(it's been using smem -> fid mempool -> cmd mempool)
(FIXME: remove mempool parameter for import functions and use vg).

Move remainder of the _build_vg to _format1_vg_read
2011-03-10 12:43:29 +00:00
Alasdair Kergon
2f25c320fb Use empty string instead of /dev// for LV path when there's no VG.
Don't allocate unused VG mempool in _pvsegs_sub_single.
2011-03-09 12:44:42 +00:00
Zdenek Kabelac
55f6627427 Fix reading of released memory
lvseg_segtype_dup used memory pool vg memory pool for strind duplication.
However this one gets released before reporting happens so the command like:

pvs -o segtype

prints data from already released memory pool. Thanks to the fact there
is not much allocation happing after the VG is released, the memory
stays unmodified and correct result is printed.

Fix adds support for mempool passed parameter (like other similar
query commands) and uses dm_report memory pool for string duplication.
2011-03-05 12:14:00 +00:00
Milan Broz
be3510b204 PE size overflows, on most architectures it is catch by "PE cannot be 0"
but s390x unfortunately return something usable.

Always use unit64 in inital parameter check.
2011-03-02 20:00:09 +00:00
Zdenek Kabelac
36653e8903 Add fall through comments
Add comments to switch case construct.
2011-02-28 19:53:03 +00:00
Peter Rajnoha
3b97e8d643 Allow non-orphan PVs with two metadata areas to be resized.
We allow writing non-orphan PVs only for resize now. The "orphan PV" assert
in pv_write fn uses the "allow_non_orphan" parameter to control this assert.
However, we should find a more elaborate solution so we can remove this
restriction altogether (pv_write together with vg_write is not atomic, we
need to find a safe mechanism so there's an easy revert possible in case of
an error).
2011-02-28 13:19:02 +00:00
Alasdair Kergon
1a52fa6858 Fix check for log-only allocation in new alloc normal loop. 2011-02-27 01:16:52 +00:00
Alasdair Kergon
92ffcda183 Various changes to the allocation algorithms: Expect some fallout.
There is a lot to test.

Two new config settings added that are intended to make the code behave
closely to the way it did before - worth a try if you find problems.
2011-02-27 00:38:31 +00:00
Peter Rajnoha
4a304dc1d8 Allow only orphan PVs to be resized even with two metadata areas. 2011-02-25 14:08:54 +00:00
Peter Rajnoha
f74bd57ec9 Revert the patch for vgconvert to work with recent changes in metadata area handling.
This should work now with the help of the patch from previous commit.
2011-02-25 14:02:53 +00:00
Peter Rajnoha
38b0564cab Read PV metadata information from cache if pv_setup called with pv->fid == vg->fid.
If the PV is already part of the VG (so the pv->fid == vg->fid), it makes no
sense to attach the mdas information from PV to a VG. Instead, we read new
PV metadata information from cache and attach it to the VG fid.
2011-02-25 13:59:47 +00:00
Peter Rajnoha
c901a92aa5 %ld -> PRIu64 2011-02-21 13:09:27 +00:00
Peter Rajnoha
9c0035c129 Fix metadata balance code to work with recent changes in metadata handling
interface (with the changes in format_instance).
2011-02-21 12:33:16 +00:00
Peter Rajnoha
51aed1992f Add old_uuid field to struct physical_volume so we can still reference a PV
with its old UUID when we're changig it (the cache as well as metadata area
index has the old uuid that we need to use to access the information!)
2011-02-21 12:31:28 +00:00
Peter Rajnoha
6bdc80743e Fix vgconvert code to work with changes in metadata area handling and changes
in format_instance. Add new 'vg_convert' function.
2011-02-21 12:29:21 +00:00
Peter Rajnoha
cb2396730a Change pvresize code to work with new metadata handling interface and allow
resizing a PV with two metadata areas.
2011-02-21 12:27:26 +00:00
Peter Rajnoha
17ad2b1115 Change pv_write code to work with the changes in metadata handling interface
and changes in format_instance.
2011-02-21 12:26:27 +00:00
Peter Rajnoha
94d91fdda1 Change the code throughout to use new pv_initialise and modified pv_setup fn.
Change pv_create code to work with these changes together with using new
pv_add_metadata_area fn to add metadata areas for a PV being created.
2011-02-21 12:24:15 +00:00
Peter Rajnoha
617b900d85 Separate new pv_initialise function out of the original pv_setup code.
pv_initiliase initialises a new PV
pv_setup sets up an existing PV with a VG
2011-02-21 12:20:18 +00:00
Peter Rajnoha
981895a860 Add new pv_remove_metadata_area interface function. 2011-02-21 12:17:54 +00:00
Peter Rajnoha
8d5d20a526 Add new pv_add_metadata_area interface function. 2011-02-21 12:17:26 +00:00
Peter Rajnoha
305816232d Remove useless mdas parameter for pv_read (from now on, we store mdas in a
format instance)
2011-02-21 12:15:59 +00:00
Peter Rajnoha
6e0b348d34 Add format instance support for pv_read code. 2011-02-21 12:13:40 +00:00
Peter Rajnoha
56280d0d3a Initialise a new PV-based format instance for a PV that is being created. 2011-02-21 12:12:32 +00:00
Peter Rajnoha
f8b78ec613 Add vg_set_fid function to change VG format instance.
This function also sets a reference to a new VG format instance for all PVs
that are part of the VG so the PV-VG interconnection is consistent after the
change.
2011-02-21 12:10:58 +00:00
Peter Rajnoha
c0c21864c6 Change the code throughout for recent changes in format_instance handling. 2011-02-21 12:07:03 +00:00
Peter Rajnoha
88129db5e1 Change create_instance to create PV-based as well as VG-based format instances.
Add supporting functions to work with the format instance and metadata area
structures stored within the format instance. Add support for simple indexing
of metadata areas using PV id and mda order (for on-disk PV only for now, we
can extend the indexing even for other mdas if needed - we only need to define
a proper key for the index).
2011-02-21 12:05:49 +00:00
Peter Rajnoha
716c4ebe52 Change and generalise struct format_instance for PV and VG use. 2011-02-21 12:01:22 +00:00
Zdenek Kabelac
aec2115410 Const fixing
Fixing some const warnings - with API change in:

int vg_extend(struct volume_group *vg, int pv_count, const char *const *pv_names,

Change is needed - as lvm2api expects const behaviour here.
So vg_extend() is doing local strdup for unescaping.

skip_dev_dir return const char* from const char* vg_name.

Rest of the patch is cleanup of related warnings.

Also using dm_report_filed_string() API change to simplify
casting in _string_disp and _lvname_disp.
2011-02-18 14:47:28 +00:00
Zdenek Kabelac
b1bcff7424 Critical section
New strategy for memory locking to decrease the number of call to
to un/lock memory when processing critical lvm functions.

Introducing functions for critical section.

Inside the critical section - memory is always locked.
When leaving the critical section, the memory stays locked
until memlock_unlock() is called - this happens with
sync_local_dev_names() and sync_dev_names() function call.

memlock_reset() is needed to reset locking numbers after fork
(polldaemon).

The patch itself is mostly rename:

memlock_inc  -> critical_section_inc
memlock_dec  -> critical_section_dec
memlock      -> critical_section

Daemons (clmvd, dmevent) are using memlock_daemon_inc&dec
(mlockall()) thus they will never release or relock memory they've
already locked memory.

Macros sync_local_dev_names() and sync_dev_names() are functions.
It's better for debugging - and also we do not need to add memlock.h
to locking.h header (for memlock_unlock() prototyp).
2011-02-18 14:16:11 +00:00
Zdenek Kabelac
794e94fe16 Replace PV_MIN_SIZE with function pv_min_size()
Add configurable option to define minimal size of
of block device usable as a PV.

pv_min_size() is added to lvm-globals and it's being
initialized through _process_config.

Macro PV_MIN_SIZE is unused and removed.

New define DEFAULT_PV_MIN_SIZE_KB is added to lvm-global
and unlike PV_MIN_SIZE it uses KB units.

Should help users with various slow devices attached to the system,
which cannot be easily filtered out (like FDD on /dev/sdX):
https://bugzilla.redhat.com/show_bug.cgi?id=644578
2011-02-18 14:11:22 +00:00
Petr Rockai
21849a8587 Fix an lv_postorder bug where it failed to clear temporary flags, making it
impossible to use twice with the same LV(s). Discovered by Milan.
2011-02-14 19:27:05 +00:00
Jonathan Earl Brassow
27ff8813da Allow snapshots in a cluster as long as they are exclusively
activated.

In order to achieve this, we need to be able to query whether
the origin is active exclusively (a condition of being able to
add an exclusive snapshot).

Once we are able to query the exclusive activation of an LV, we
can safely create/activate the snapshot.

A change to 'hold_lock' was also made so that a request to aquire
a WRITE lock did not replace an EX lock, which is already a form
of write lock.
2011-02-04 20:30:17 +00:00
Mike Snitzer
3e3591904b Improve lvcreate "insufficient extents" errors to "insufficient free space". 2011-01-28 02:58:00 +00:00
Alasdair Kergon
cef065f63f Fix lvchange --test to exit cleanly. 2011-01-24 14:19:05 +00:00
Alasdair Kergon
a8de276520 Replace fs_unlock by sync_local_dev_names to notify local clvmd. (2.02.80)
Introduce sync_local_dev_names and CLVMD_CMD_SYNC_NAMES to issue fs_unlock.
2011-01-12 20:42:50 +00:00
Jonathan Earl Brassow
6a095ca99f s/log_verbose/log_error/ - Increase log level on error message. 2011-01-11 17:21:01 +00:00
Jonathan Earl Brassow
025e69a15a Add disk to mirrored log type conversion. 2011-01-11 17:05:08 +00:00
Zdenek Kabelac
937a21f0d2 Speedup consequent activation calls
Stop calling fs_unlock() from lv_de/activate().
Start using internal lvm fs cookie for dm_tree.
Stop directly calling dm_udev_wait() and
dm_tree_set/get_cookie() from activate code -
it's now called through fs_unlock() function.

Add lvm_do_fs_unlock()

Call fs_unlock() when unlocking vg where implicit unlock solves the
problem also for cluster - thus no extra command for clustering
environment is required - only lvm_do_fs_unlock() function is added
to call lvm's fs_unlock() while holding lvm_lock mutex in clvmd.

Add fs_unlock() also to set_lv() so the command waits until devices
are ready for regular open (i.e. wiping its begining).

Move fs_unlock() prototype to activation.h to keep fs.h private
in lib/activate dir and not expose other functions from this header.
2011-01-10 14:02:30 +00:00
Zdenek Kabelac
6feecf76d4 Change import_vg_from_buffer to use config_tree
Change function import_vg_from_buffer() to import_vg_from_config_tree().
Instead of creating config tree inside the function allow config tree to
be passed as parameter - usable later for caching.
2011-01-10 13:13:42 +00:00
Zdenek Kabelac
2ae2ca89bf Add backtraces for backup and backup_remove fail paths 2010-12-22 15:36:41 +00:00
Zdenek Kabelac
b7149bbe45 Add missing test for reallocation error. 2010-12-20 14:38:22 +00:00
Zdenek Kabelac
9b30dfb967 Use const char * for name and old_name in vg
Switch to use const char pointers to avoid changes of these structure
members and having better control over, were these members could
be modified.
2010-12-20 13:40:46 +00:00
Zdenek Kabelac
9d9de35dca Remove const usage from destroy callbacks
As const segment_type or const format_type are never released
use their non-const version and remove const downcast from dm_free calls.
This change fixes many gcc warnings we were getting from them.
2010-12-20 13:32:49 +00:00
Zdenek Kabelac
ba96eb24fa Some const cleanups
Minor const warning fixes and internal API updates.
2010-12-20 13:19:13 +00:00
Zdenek Kabelac
760d1fac55 Add more strict const pointers around config tree
To have better control were the config tree could be modified use more
const pointers and very carefully downcast them back to non-const
(for config tree merge).
2010-12-20 13:12:55 +00:00
Petr Rockai
ebfe96cad5 Add further consistency checking to vg_validate, ensuring that all segment
areas point to LVs or PVs that are listed in the respective VG.
2010-12-14 17:51:09 +00:00
Petr Rockai
75b2f3507a Add a validation step for pvmoveN internal LVs to vg_validate. 2010-12-14 17:07:35 +00:00
Alasdair Kergon
acb037657c Fix scanning of VGs without in-PV mdas.
Set cmd->independent_metadata_areas if metadata/dirs or disk_areas in use.
- Identify and record this state.

Don't skip full scan when independent mdas are present even if memlock is set.
- Clusters and OOM aren't supported, so no problem doing the proper scans.

Avoid revalidating the label cache immediately after scanning.
- A simple optimisation.

Support scanning for a single VG in independent mdas.
- Not used by the fix but I left it in anyway as later patches might use it.
2010-12-10 22:39:52 +00:00
Alasdair Kergon
2b82bd79f5 Rename vg_release to free_vg. 2010-12-08 20:50:48 +00:00
Zdenek Kabelac
54fca7b1ca Remove reset of vg->vgmem pointer as it is access of already release memory
This reset of vgmem pointer causes access of already released memory.
(_vg_make_handle allocates vg from vgmem pool itself - which is a bit tricky)

Interestingly this memory fault was missed by our test suite.
2010-12-08 10:45:37 +00:00
Zdenek Kabelac
166597d998 Add backtraces for errors
Add stack;  backtraces when error is reported from dev_set() or
dev_close_immediate().
2010-12-01 12:56:39 +00:00
Petr Rockai
8191fe4f4a Refactor the percent (mirror sync, snapshot usage) handling code to use
fixed-point values instead of a combination of a float value and an enum.
2010-11-30 11:53:31 +00:00
Petr Rockai
97e8048e05 Avoid the automatic MISSING_PV recovery path in commands with special
MISSING_PV handling (cmd->handles_missing_pvs is set).
2010-11-30 11:15:54 +00:00
Alasdair Kergon
1415afcdba Fix memory leak when VG allocation policy in metadata is invalid.
Ignore unrecognised allocation policy found in metadata instead of aborting.
Fix another missing vg_release() in _vg_read_by_vgid.
2010-11-29 18:35:37 +00:00
Zdenek Kabelac
201222ebad Reset vg pointer after release
Set vg to NULL after releasing it as the following memlock() test may
lead to goto for the second call of vg_release() with the already
released vg pointer.
2010-11-29 11:08:14 +00:00
Alasdair Kergon
728074ac83 Suppress 'No PV label' message when removing several PVs without mdas. 2010-11-23 01:55:53 +00:00
Petr Rockai
c1abd569f2 Add the macro and specific 'get' functions for lvsegs.
Signed-off-by: Dave Wysochanski <wysochanski@pobox.com>
Reviewed-by: Petr Rockai <prockai@redhat.com>
2010-11-17 20:08:14 +00:00
Alasdair Kergon
f8452d8cfd Support repetition of --addtag and --deltag arguments.
Add infrastructure for specific cmdline arguments to be repeated in groups.
Split the_args cmdline arguments and values into arg_props and arg_values.
2010-11-11 17:29:05 +00:00
Zdenek Kabelac
64dff85ce4 Preserve const for char pointer
Keep char pointers 'const'  (introduced with cling commit).
2010-11-11 12:32:33 +00:00
Alasdair Kergon
eb82bd0525 Extend cling allocation policy to recognise PV tags (cling_by_tags).
Add allocation/cling_tag_list to lvm.conf.
2010-11-09 12:34:40 +00:00
Peter Rajnoha
f7e3a19f75 Clarify error messages when activation fails due to activation filter use. 2010-11-05 18:18:11 +00:00
Alasdair Kergon
2aa06d73ca pre-release 2010-10-25 13:54:29 +00:00
Zdenek Kabelac
91e56ffb29 Fix constness warning for _vg_read_by_vgid() uuid usage 2010-10-25 13:35:13 +00:00
Alasdair Kergon
eacd3a0916 fix header #defines 2010-10-25 12:01:59 +00:00
Alasdair Kergon
b83af51668 Add global/metadata_read_only to use unrepaired metadata in read-only cmds. 2010-10-25 11:20:54 +00:00
Dave Wysochanski
d53d92f2e1 Add lv_read_ahead and lv_kernel_read_ahead 'get' functions. 2010-10-21 14:49:31 +00:00
Dave Wysochanski
f1fc310730 Refactor and add code for (lv) 'lv_origin' get function. 2010-10-21 14:49:20 +00:00
Dave Wysochanski
6103254393 Refactor and add code for (lv) 'lv_name' get function. 2010-10-21 14:49:10 +00:00
Jonathan Earl Brassow
2c33c8b80c Fix for bug 637936: killing both redundant logs causes deadlock
Problem:
When both legs of a mirrored log fail, neither the log nor the parent
mirror can proceed.  The repair code must be careful to replace the
log with an error target before operating on the parent - otherwise,
the parent can get stuck trying to suspend because it can't push through
any writes.  The steps to replace the log device with an error target
were incomplete and resulted in the replacement not happening at all!

The code originally had all the necessary logic to complete the
replacement task, but was pulled out in a effort to clean-up that
section of code, while fixing another bug:
<offending commit msg>
In addition, I added following three changes.

- Removed tmp_orphan_lvs handling procedure
  It seems that _delete_lv() can handle detached_log_lv properly
  without adding mirror legs in mirrored log to tmp_orphan_lvs.
  Therefore, I removed the procedure.

- Removed vg_write()/vg_commit()
  Metadata is saved by vg_write()/vg_commit() just after detached_log_lv
  is handled. Therefore, I removed vg_write()/vg_commit().
</offending commit msg>

http://sources.redhat.com/cgi-bin/cvsweb.cgi/LVM2/lib/metadata/mirror.c?cvsroot=lvm2&f=h#rev1.130

I've reverted the "clean-up" changes associated with that fix, but not what
that commit was actually fixing.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Reviewed-by: Petr Rockai <prockai@redhat.com>
2010-10-14 20:03:12 +00:00
Mike Snitzer
9443b5d4cd Convey need for snapshot-merge target in lvconvert error message and man
page.

Add ->target_name to segtype_handler to allow a more specific target
name to be returned based on the state of the segment.

Result of trying to merge a snapshot using a kernel that doesn't have
the snapshot-merge target:

Before:
# lvconvert --merge vg/snap
  Can't expand LV lv: snapshot target support missing from kernel?
  Failed to suspend origin lv

After:
# lvconvert --merge vg/snap
  Can't process LV lv: snapshot-merge target support missing from kernel?
  Failed to suspend origin lv
  Unable to merge LV "snap" into it's origin.
2010-10-13 21:26:37 +00:00
Petr Rockai
042312952c Give correct error message when creating a too-small snapshot (BZ 587063) 2010-10-13 13:52:53 +00:00
Zdenek Kabelac
7c9fd3ea84 Don't use floor() in _bitset_with_random_bits
Use _even_rand() function instead of floor() in _bitset_with_random_bits().
floor() function is missing in dietlibc (on architectures other than x86).
Moreover using floor() to clip rand results does not assure even result
distribution. _even_rand() uses integer arithmetic only and is designed to
return evenly distributed results.

> Looks OK to me. It took a while to decipher what is the exact meaning of
> the loop in _even_rand (to a non-pseudorandomness-expert) but I am
> fairly comfortable with it now. If I understand this correctly, it
> rejects numbers that come from an "incomplete" slice of the RAND_MAX
> space (considering the number space [0, RAND_MAX] is divided into some
> "max"-sized slices and at most a single smaller slice, between [n*max,
> RAND_MAX] for suitable n -- numbers from this last slice are discarded
> because they could distort the distribution in favour of smaller
> numbers).

Signed-off-by: Przemyslaw Iskra <sparky <at> pld-linux.org>
Reviewed-by: Petr Rockai <prockai <at> redhat.com>
2010-10-13 12:18:53 +00:00
Dave Wysochanski
f70468ce0b Fix lv_modules_dup segfault. 2010-10-12 17:09:23 +00:00
Petr Rockai
98351ffbd5 Make lvconvert respect --yes/--force in the inactive log conversion
prompt. Fixes BZs 642055, 621281. Patch by Taka.

Signed-off-by: Takahiro Yasui <tyasui@redhat.com>
Reviewed-by: Petr Rockai <prockai@redhat.com>
2010-10-12 16:41:17 +00:00