mirror of git://sourceware.org/git/lvm2.git synced 2025-02-08 09:57:55 +03:00

1056 Commits

Author SHA1 Message Date
Zdenek Kabelac
d6ecde875a Thin pool allocation simplified
Support allocation of metadata from the same PV, if the VG
is built from only one PV.

Because thinp is not a mirror, we do not require 2 PVs
for basic thin usage; the user loses only performance.
2011-11-04 22:45:52 +00:00
Zdenek Kabelac
134641d62d Thin add thin_pool_metadata_require_separate_pvs
Allow setting a different policy for pools than for mirrors.
2011-11-04 22:44:21 +00:00
Zdenek Kabelac
d577391181 Thin supports poolmetadatasize setting
Add option to set pool metadatasize.
For passing size parameter reuse region_size.
2011-11-04 22:43:10 +00:00
Alasdair Kergon
b316ae6fd1 Add missing lvrename mirrored log recursion in for_each_sub_lv. 2011-11-04 01:31:23 +00:00
Zdenek Kabelac
7ca380cf6d Thin keep pool device in the same state
Leave the optimization to be done differently and preserve
the availability state of the pool device.
2011-11-03 15:58:20 +00:00
Zdenek Kabelac
c4c8553390 Thin no device is created - so nothing to revert here 2011-11-03 15:46:51 +00:00
Zdenek Kabelac
da4dca3881 Thin removing unused detach_pool_messages 2011-11-03 14:57:04 +00:00
Zdenek Kabelac
19af11c2eb Thin using update_pool_lv
Replace detach_pool_messages with update_pool_lv.
Move creation code from two 'if' conditions into one.
Ensure creation has finished all previous message operations.
2011-11-03 14:56:20 +00:00
Zdenek Kabelac
6fbce7358c Thin generic update_pool_lv function
Function to trigger pool message passing via resume,
or resize of the pool itself, independently of other thins.
2011-11-03 14:53:58 +00:00
Zdenek Kabelac
264ca9ba32 Thin uses _tdata instead of _tpool for data LV
Switch to different suffix and keep -tpool reserved for overlay device name.
2011-11-03 14:38:36 +00:00
Zdenek Kabelac
0e7603ecbb Thin code cleanup
Use iterate_items for list processing.
2011-11-03 14:36:40 +00:00
Zdenek Kabelac
d39f58ac2e Thin fix compile warns
Test for dm_snprintf < 0.
Add header for moved backup.
2011-10-30 22:52:08 +00:00
Zdenek Kabelac
cdf67ba0d0 Thin creation without activation
All thins are created with the next activation and VG is updated
without messages. Only some basic commands work.
(i.e. lvcreate -an  -V10 -T mvg/pool)
Some combinations may still confuse this system.

This functionality for snapshots is going to be interesting.
2011-10-30 22:07:38 +00:00
Zdenek Kabelac
4486b3b6a5 Cleanup unsuccessfully created thin LV
If something fails during creation of a thin LV, remove that LV,
deactivating it first if activation was already attempted
(e.g. the thin kernel driver fails for some reason).
2011-10-30 22:02:18 +00:00
Zdenek Kabelac
8ffa1880be Make detach_pool_message visible for tools
Move there also vg_write and vg_commit.
2011-10-30 22:01:39 +00:00
Zdenek Kabelac
eadb017622 Thin cleanups
Fix/cleanup several error messages.
Remove test for seg_is_thin which could never be true there.
Replace (1<<24) with predefined constant.
2011-10-30 22:00:57 +00:00
Zdenek Kabelac
f924413d23 Thin support for stripe
Support stripe options to create thin data pool LV.

TODO: combine chunk size and stripe size.
2011-10-28 20:32:54 +00:00
Zdenek Kabelac
23177b8aad Thin pool resize support for data LV
Support for extension of pool data LV.

TODO: figure out thin volume for suspend/resume in cluster.
2011-10-28 20:31:01 +00:00
Zdenek Kabelac
22e6256d00 Thin support for lvrename
Rename pool's metadata lv _tmeta together with pool and _tdata.
2011-10-28 20:29:32 +00:00
Zdenek Kabelac
98b5d31edd Thin pool activation change
To ensure we properly handle LV cluster locking, explicitly do
not allow changing the availability of a thin pool that is in use
by some thin LV.

As soon as the thin volume is created the only way to activate pool
is via implicit dependency.

Ignore thinpool open count for lv/vgchange operations.
2011-10-28 20:28:00 +00:00
Zdenek Kabelac
82a46ed62c Improve lv_extend stack reporting
and some code cleanup with setting return value.
2011-10-28 20:23:24 +00:00
Zdenek Kabelac
cde3a58d4e Thin error messages cleanup and some indent 2011-10-28 20:19:26 +00:00
Zdenek Kabelac
1d5d5e41f7 Remove thin code from mirror/raid lv_extend 2011-10-28 20:18:32 +00:00
Zdenek Kabelac
20e6a4528a Extend virtual segment instead of adding new one
Before adding a new virtual segment to LV, check first whether
the last segment isn't already of the same type. In this case
extend last segment instead of creating the new one.

Thin volumes should always have only 1 virtual segment, but this
also helps the virtual snapshot and error segtypes.
2011-10-28 20:17:55 +00:00
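
The check described in commit 20e6a4528a can be modelled roughly as follows. This is a minimal Python sketch of the idea only, not the LVM2 C code: the `Segment` class, the `add_virtual_segment` helper, and the segment type names are hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    segtype: str  # illustrative type name, e.g. "thin" or "error"
    start: int    # first logical extent covered by the segment
    length: int   # number of extents in the segment


def add_virtual_segment(segments, segtype, extents):
    """Add `extents` of `segtype`, growing the last segment when types match."""
    if segments and segments[-1].segtype == segtype:
        # Same type as the last segment: extend it instead of adding one.
        segments[-1].length += extents
        return segments[-1]
    # Different type (or empty LV): append a new segment after the last one.
    start = segments[-1].start + segments[-1].length if segments else 0
    seg = Segment(segtype, start, extents)
    segments.append(seg)
    return seg


lv = []
add_virtual_segment(lv, "thin", 10)
add_virtual_segment(lv, "thin", 5)    # extends the existing segment
add_virtual_segment(lv, "error", 3)   # different type, so a new segment
```

With this model, two consecutive additions of the same type leave the LV with a single segment, which matches the stated goal that a thin volume keeps only 1 virtual segment.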
Zdenek Kabelac
7aa87822f7 Add last_seg
Implement a function to return the last segment in a LV.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2011-10-28 20:12:54 +00:00
Jonathan Earl Brassow
0fd3c2aa2b Disallow 'mirrored' log for cluster mirrors.
Git commit ID 0864378250956c310cb81608978d091fcdcc97d8 was meant to disallow
'mirrored' logs for cluster mirrors.  However, when add_mirror_log is used
to create the log (as is now the case when using 'lvcreate' or converting only
the log) the check is bypassed.

This patch adds the check to add_mirror_log.
2011-10-25 13:17:04 +00:00
Zdenek Kabelac
80cdb5240e Don't print char type[8] as a plain string
pvck prints 'extra' character from the label since there is no '\0'
after the struct label entry and just uint64_t follows directly.
Avoid this by limiting the output to 8 chars.

https://www.redhat.com/archives/lvm-devel/2011-January/msg00109.html

Signed-off-by: Paul Bolle <pebolle tiscali nl>
2011-10-24 10:24:39 +00:00
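
The problem fixed in commit 80cdb5240e can be reproduced in miniature. The sketch below is a Python model, assuming a hypothetical layout of an 8-byte `type` field followed directly by a 64-bit integer; the field value and the helper names are illustrative, and the actual patch may bound the output differently (in C, a precision such as `%.8s` is one common way).

```python
import struct

# Hypothetical layout: char type[8] followed directly by a uint64_t.
# The type field is completely full, so it has no terminating NUL byte.
raw = struct.pack("<8sQ", b"LVM2 001", 0x41)


def read_c_string(buf):
    """Model an unbounded C string read: stop only at the first NUL byte."""
    return buf.split(b"\0", 1)[0]


def read_bounded(buf, width=8):
    """Model a bounded read: at most `width` bytes, stopping early at NUL."""
    return buf[:width].split(b"\0", 1)[0]


unbounded = read_c_string(raw)  # runs into the integer that follows
bounded = read_bounded(raw)     # stays inside the 8-byte field
```

Here the unbounded read picks up the low byte of the following integer as an extra character, while the bounded read returns only the 8 bytes of the field.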
Zdenek Kabelac
f5aee12510 Always use vg memory pool for allocated lv segment
Remove the mem pool parameter from alloc_lv_segment(),
since we should always allocate LV segments from the vg mempool.
2011-10-23 16:02:01 +00:00
Zdenek Kabelac
c71178b1ab Remove old thin code from _lv_insert_empty_sublvs
Since thin is not able to use _lv_insert_empty_sublvs,
remove its appearance from this function.

Start to use extend_pool() function for desired functionality
and modify lv_extend() for this.
2011-10-22 16:48:59 +00:00
Zdenek Kabelac
493ea6330b Remove extra empty check
dm_list_splice handles empty list itself, no need to duplicate code.
2011-10-22 16:46:34 +00:00
Zdenek Kabelac
1fcd3fc4b5 Recoded way to insert thin pool into vg
Code in _lv_insert_empty_sublvs  was not able to provide proper
initialization order for thin pool LV.

New function extend_pool() first adds a metadata segment to the pool LV,
which is still visible. This LV is activated and cleared.

Then a new meta LV is created and the metadata segments are moved there.
Now the preallocated pool data segment is attached to the pool LV
and layer _tpool is created. Finally, the segment is marked as thin_pool.
2011-10-22 16:44:23 +00:00
Zdenek Kabelac
b7bbad9964 Make move_lv_segment non-static
This function could be useful for other _manip source files.

Use dm_list manipulation functions for the provided functionality,
which makes the code more readable and avoids touching list
internal details here.
2011-10-22 16:42:10 +00:00
Zdenek Kabelac
cd99a84e2b Store transaction_id with created thin lv
So we know the creation history and this should be useful with vgcfgrestore.
2011-10-21 11:38:35 +00:00
Zdenek Kabelac
9b55b4e08e Remove double-hack for setting metadata size
Drop the second lv_extend and set 128MB directly in the first hack place.
2011-10-21 09:55:50 +00:00
Zdenek Kabelac
5a6a8f4897 Thin pool now supports chunk size as well
Use chunksize option to specify data_block_size for thin pool target.
Drop low_water_mark to zero.
2011-10-21 09:55:07 +00:00
Zdenek Kabelac
692d718d9d Ensure right activation order
A couple of FIXMEs were put into the code for parts which may be
improved later, since we might be able to add 'lazy' device creation later.
For now require exclusive activation.
2011-10-20 10:35:14 +00:00
Zdenek Kabelac
79835da163 Add _BLOCK_ to define
Use DM_THIN_MIN_DATA_BLOCK_SIZE and
DM_THIN_MAX_DATA_BLOCK_SIZE to make it more obvious what
these defines are for in the thin API.
2011-10-20 10:28:41 +00:00
Zdenek Kabelac
0f220e5858 Update error message
Drop INTERNAL_ERROR from public API functions.
Improve some messages.
2011-10-19 16:42:14 +00:00
Zdenek Kabelac
a374aed61c Simple validation of messages in mda
Check that we do not combine multiple messages for the same LV target,
and switch to using 'delete_id' to make clear what this device_id
is being used for.
2011-10-19 16:39:09 +00:00
Zdenek Kabelac
866b21532a Drop messages referencing deleted LV
lvremove may remove problematic LV for thin target.
2011-10-19 16:37:30 +00:00
Zdenek Kabelac
fc260f5bcb Just indent changes
Some tabs & spaces.
2011-10-19 16:36:39 +00:00
Zdenek Kabelac
d1be6a0c37 Remove test for thin_pool
Since both functions are called during mda read - we don't have full LV info
at this moment.
2011-10-19 16:32:34 +00:00
Zdenek Kabelac
f8690cf8d5 Message support for thin provisioning
lvm part of messaging.

Each message is now stored in its own thin pool section:

message1 {
	create = lv
}

Messages are queued to thin pool dm target when this target
is going to be resumed or used through some dependency.

Currently 'delete' messages are only queued and then processed
with the next thin pool resume operation (i.e. create_thin).

WARNING - thin provisioning support is developmental code.
2011-10-17 14:17:09 +00:00
Jonathan Earl Brassow
2fd1acc4dd Use a more correct macro for 'seg_is_linear'
It is better to check 'seg->area_count == 1' than '!seg->stripe_size'.
2011-10-14 14:21:32 +00:00
Zdenek Kabelac
3701873dc9 Check for refresh_filter failure
Detect whether the filters were refreshed properly.

(May need a few more fixes?)

Filter refresh may fail because it may be out of free file descriptors
when clvmd gets overloaded.
2011-10-11 09:09:00 +00:00
Jonathan Earl Brassow
2c80ace622 Add the ability to convert LVs of "mirror" segtype to "raid1" segtype.
Example:
~> lvconvert --type raid1 vg/mirror_lv

Steps to convert "mirror" to "raid1":
1) Allocate a RAID metadata LV for each mirror image from the same PVs
   on which they are located.
2) Clear the metadata LVs.  This involves writing LVM metadata, so we don't
   change any aspects of the mirror LV before this so that the user can easily
   remove LVs from the failed convert attempt while retaining the original
   mirror.
3) Remove the mirror log, if it exists.
4) Add metadata LVs to mirror LV
5) Rename mirror sub-lvs (s/mimage/rimage/)
6) Change flags and segtype from mirror to raid1
2011-10-07 14:56:01 +00:00
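
Step 5 of commit 2c80ace622 (the `s/mimage/rimage/` rename) can be illustrated with a small Python sketch. This only models the name substitution; it is not the LVM2 implementation, and the `raid_name` helper, the example LV names, and the assumed `_mimage_N` suffix pattern are illustrative.

```python
import re


def raid_name(mirror_sublv_name):
    """Map a mirror image sub-LV name to the corresponding RAID image name.

    Assumes image sub-LVs end in `_mimage_<index>`; other names are
    returned unchanged.
    """
    return re.sub(r"_mimage_(\d+)$", r"_rimage_\1", mirror_sublv_name)


names = ["mirror_lv_mimage_0", "mirror_lv_mimage_1", "mirror_lv_mlog"]
renamed = [raid_name(n) for n in names]
```

Anchoring the pattern at the end of the name keeps an LV whose own name happens to contain "mimage" from being rewritten in the wrong place.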
Jonathan Earl Brassow
50a48b38f5 Add the ability to convert linear LVs to RAID1
Example:
~> lvconvert --type raid1 -m 1 vg/lv

The following steps are performed to convert linear to RAID1:
1) Allocate a metadata device from the same PV as the linear device
   to provide the metadata/data LV pair required for all RAID components.
2) Allocate the required number of metadata/data LV pairs for the
   remaining additional images.
3) Clear the metadata LVs.  This performs a LVM metadata update.
4) Create the top-level RAID LV and add the component devices.

We want to make any failure easy to unwind.  This is why we don't create the
top-level LV and add the components until the last step.  Should anything
happen before that, the user could simply remove the unnecessary images.  Also,
we want to ensure that the metadata LVs are cleared before forming the array to
prevent stale information from polluting the new array.

A new macro 'seg_is_linear' was added to allow us to distinguish linear LVs
from striped LVs.
2011-10-07 14:52:26 +00:00
Jonathan Earl Brassow
76ab264200 Allow 'nosync' extension of mirrors.
This patch allows a mirror to be extended without an initial resync of the
extended portion.  It complements the existing '--nosync' option to lvcreate.
This action can be done implicitly if the mirror was created with the '--nosync'
option, or explicitly if the '--nosync' option is used when extending the device.

Here are the operational criteria:
1) A mirror created with '--nosync' should extend with 'nosync' implicitly
[EXAMPLE]# lvs vg; lvextend -L +5G vg/lv ; lvs vg
  LV   VG   Attr     LSize Pool Origin Snap%  Move Log     Copy%  Convert
  lv   vg   Mwi-a-m- 5.00g                         lv_mlog 100.00
  Extending 2 mirror images.
  Extending logical volume lv to 10.00 GiB
  Logical volume lv successfully resized
  LV   VG   Attr     LSize  Pool Origin Snap%  Move Log     Copy%  Convert
  lv   vg   Mwi-a-m- 10.00g                         lv_mlog 100.00

2) The 'M' attribute ('M' signifies a mirror created with '--nosync', while 'm'
signifies a mirror created w/o '--nosync') must be preserved when extending a
mirror created with '--nosync'.  See #1 for example of 'M' attribute.

3) A mirror created without '--nosync' should extend with 'nosync' only when
'--nosync' is explicitly used when extending.
[EXAMPLE]# lvs vg; lvextend -L +5G vg/lv; lvs vg
  LV   VG   Attr     LSize  Pool Origin Snap%  Move Log     Copy%  Convert
  lv   vg   mwi-a-m- 20.00m                         lv_mlog 100.00
  Extending 2 mirror images.
  Extending logical volume lv to 5.02 GiB
  Logical volume lv successfully resized
  LV   VG   Attr     LSize Pool Origin Snap%  Move Log     Copy%  Convert
  lv   vg   mwi-a-m- 5.02g                         lv_mlog   0.39
vs.
[EXAMPLE]# lvs vg; lvextend -L +5G vg/lv --nosync; lvs vg
  LV   VG   Attr     LSize  Pool Origin Snap%  Move Log     Copy%  Convert
  lv   vg   mwi-a-m- 20.00m                         lv_mlog 100.00
  Extending 2 mirror images.
  Extending logical volume lv to 5.02 GiB
  Logical volume lv successfully resized
  LV   VG   Attr     LSize Pool Origin Snap%  Move Log     Copy%  Convert
  lv   vg   Mwi-a-m- 5.02g                         lv_mlog 100.00

4) The 'm' attribute must change to 'M' when a mirror created without
'--nosync' is extended with the '--nosync' option.  (See #3 examples above.)

5) An inactive mirror's sync percent cannot be determined definitively, so it
must not be allowed to skip resync.  Instead, the extend should ask the user if
they want to extend while performing a resync.
[EXAMPLE]# lvchange -an vg/lv
[EXAMPLE]# lvextend -L +5G vg/lv
  Extending 2 mirror images.
  Extending logical volume lv to 10.00 GiB
  vg/lv is not active.  Unable to get sync percent.
Do full resync of extended portion of vg/lv?  [y/n]: y
  Logical volume lv successfully resized

6) A mirror that is performing recovery (as opposed to an initial sync) - like
after a failure - is not allowed to extend with either an implicit or
explicit nosync option.  [You can simulate this with a 'corelog' mirror because
when it is reactivated, it must be recovered every time.]
[EXAMPLE]# lvcreate -m1 -L 5G -n lv vg --nosync --corelog
  WARNING: New mirror won't be synchronised. Don't read what you didn't write!
  Logical volume "lv" created
[EXAMPLE]# lvs vg
  LV   VG   Attr     LSize Pool Origin Snap%  Move Log Copy%  Convert
  lv   vg   Mwi-a-m- 5.00g                             100.00
[EXAMPLE]# lvchange -an vg/lv; lvchange -ay vg/lv; lvs vg
  LV   VG   Attr     LSize Pool Origin Snap%  Move Log Copy%  Convert
  lv   vg   Mwi-a-m- 5.00g                               0.08
[EXAMPLE]# lvextend -L +5G vg/lv
  Extending 2 mirror images.
  Extending logical volume lv to 10.00 GiB
  vg/lv cannot be extended while it is recovering.

7) If 'no' is selected in #5 or if the condition in #6 is hit, it should not
result in the mirror being resized or the 'm/M' attribute being changed.


NOTE:  A mirror created with '--nosync' behaves differently than one created
without it when performing an extension.  The former cannot be extended when
the mirror is recovering (unless inactive), while the latter can.  This is
a reasonable thing to do since recovery of a mirror doesn't take long (at
least in the case of an on-disk log) and it would cause far more time in
degraded mode if the extension w/o '--nosync' was allowed.  It might be
reasonable to add the ability to force the operation in the future.  This
should /not/ force a nosync extension, but rather force a sync'ed extension.
IOW, the user would be saying, "Yes, yes... I know recovery won't take long
and that I'll be adding significantly to the time spent in degraded mode, but
I need the extra space right now!".
2011-10-06 15:32:26 +00:00
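
The operational criteria of commit 76ab264200 can be summarised as a small decision function. The following Python sketch is a reading of rules 1 to 7 above, not the lvextend code: `Action`, `plan_extend`, and its parameters are hypothetical names, "recovering" is approximated as an active mirror whose sync percent is below 100, and leaving the attribute unchanged after a confirmed resync is an assumption.

```python
from enum import Enum


class Action(Enum):
    EXTEND_SYNC = "extend and resync the new portion"
    EXTEND_NOSYNC = "extend without resync"
    REFUSE = "leave the mirror unchanged"


def plan_extend(created_nosync, nosync_flag, active,
                sync_percent=None, confirm_resync=False):
    """Return (action, has_M_attribute) for a mirror extension request."""
    wants_nosync = created_nosync or nosync_flag      # rules 1 and 3
    if not wants_nosync:
        return Action.EXTEND_SYNC, created_nosync
    if not active:
        # Rule 5: sync percent is unknown, so never skip the resync;
        # extend only if the user agrees to a full resync.
        if confirm_resync:
            return Action.EXTEND_SYNC, created_nosync  # assumption: attr kept
        return Action.REFUSE, created_nosync           # rule 7
    if sync_percent is None or sync_percent < 100.0:
        # Rule 6: a recovering mirror may not be extended with nosync.
        return Action.REFUSE, created_nosync           # rule 7
    return Action.EXTEND_NOSYNC, True                  # rules 2 and 4
```

For example, `plan_extend(False, True, True, 100.0)` models the second case of rule 3: a fully synced 'm' mirror extended with '--nosync' skips the resync and ends up with the 'M' attribute.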
Jonathan Earl Brassow
1986e51928 Fix splitmirror in cluster having different DM/LVM views of storage.
This patch also does some clean-up of the splitmirrors code.

I've attempted to clean up the splitmirrors code to make it easier to
understand with fewer operations.  I've tried to reduce the number of
metadata operations without compromising the intermediate stages which
are necessary for easy clean-up in the event of failure.

These changes now correctly handle cluster situations - including exclusive
cluster mirrors.  Whereas before, a splitmirror operation would result in
remote nodes having LVM commands report the newly split LV with a proper
name while DM commands would report the old (pre-split) names of the device.
IOW, there was a kernel/userspace mismatch.
2011-10-06 14:55:39 +00:00
Jonathan Earl Brassow
f7235e7cb4 Revert initial solution to bug 733114 - I/O error message during splitmirror
The original commit comments can be located via this git commit ID:
	7d8e615c0b30fc2ef300c90378a51f01c328128c

There were three possible solutions to the original problem proposed in the
initial check-in.  The one chosen was as follows:
    2) Do like _remove_mirror_images does and suspend the original, then suspend
    the sub-lv (the error target), then resume the sub-lv, and finally resume the
    original LV.  This seems like extra pointless operations to me, but it doesn't
    produce the error message (although, I'm not sure why) and it allows us to
    leave the visible flag in place.
Turns out, the cluster also views the extra suspend/resume operations as
pointless and ignores them.  So, this solution doesn't work in a cluster.
Further, I've noticed that in addition to the remote cluster nodes still getting
I/O errors from scanning the error target, they also have different LVM and
DM views of the same LV.  IOW, while the LVM level (gotten from the LVM metadata)
sees the correct name for the newly split LV, device-mapper still maintains the
old names.

Because the original fix failed to completely fix the problem (or work around it)
and because a better solution must be found to address the additional cluster
issue of device renaming, I am reverting the above mentioned commit.
2011-10-06 14:49:16 +00:00