mirror of git://sourceware.org/git/lvm2.git synced 2025-01-18 10:04:20 +03:00

318 Commits

Author SHA1 Message Date
Heinz Mauelshagen
b84bf3e8cd raid: adjust to misordered raid table line output
This commit supersedes reverted 1e4462dbfbd2bbe3590936df24b3ccd83110b158
to avoid changes to liblvm and the libdm API completely.

The libdevmapper interface compares the existing table line retrieved from
the kernel with the newly created table line to decide whether it can suppress a reload.
Any difference between input and output of the table line is taken to be a
change, thus causing a table reload.

The dm-raid target misordered the raid parameters (e.g. 'raid10_copies')
from dm-raid target version 1.9.0 up to (excluding) 1.11.0.  This causes
runtime failures (limited to raid10, as far as tests show) and needs to be reversed to allow
e.g. old lvm2 userspace to run properly.

Check for the aforementioned version range in libdm and adjust creation of the table line
to the respective (mis)ordered sequence inside the range and the correct order outside it
(as described for the raid target in the kernel's Documentation/device-mapper/dm-raid.txt).
2017-03-23 01:20:00 +01:00
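The version-range test described in this commit can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical and are not taken from libdm, and the packing scheme assumes minor and patch numbers stay below 256.

```c
#include <stdbool.h>

/* Hypothetical helper: pack a version triple into one comparable number.
 * Assumes minor and patch are each below 256. */
static unsigned ex_pack_version(unsigned major, unsigned minor, unsigned patch)
{
    return (major << 16) | (minor << 8) | patch;
}

/* Returns true when the dm-raid target version lies in the half-open
 * range [1.9.0, 1.11.0), i.e. where the commit message says the kernel
 * emits raid parameters in the misordered sequence. */
bool ex_raid_params_misordered(unsigned major, unsigned minor, unsigned patch)
{
    unsigned v = ex_pack_version(major, minor, patch);

    return v >= ex_pack_version(1, 9, 0) && v < ex_pack_version(1, 11, 0);
}
```

A caller would pick the misordered parameter sequence when this returns true and the documented order otherwise, so that the generated table line matches what the kernel reports back.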
Heinz Mauelshagen
1bf90dac77 Revert "raid: adjust to misordered raid table line output"
This reverts commit 1e4462dbfbd2bbe3590936df24b3ccd83110b158
in favour of an enhanced solution avoiding changes in liblvm
completely by checking the target versions in libdm and emitting
the respective parameter lines.
2017-03-23 01:19:41 +01:00
Heinz Mauelshagen
1e4462dbfb raid: adjust to misordered raid table line output
The libdevmapper interface compares the existing table line retrieved from
the kernel with the newly created table line to decide whether it can suppress a reload.
Any difference between input and output of the table line is taken to be a
change, thus causing a table reload.

The dm-raid target misordered the raid parameters (e.g. 'raid10_copies')
from dm-raid target version 1.9.0 up to (excluding) 1.11.0.  This causes
runtime failures (limited to raid10, as far as tests show) and needs to be reversed to allow
e.g. old lvm2 userspace to run properly.

Check for the aforementioned version range and adjust creation of the table line
to the respective (mis)ordered sequence inside the range and the correct order outside it
(as described for the raid target in the kernel's Documentation/device-mapper/dm-raid.txt).
2017-03-21 18:17:42 +01:00
Zdenek Kabelac
bb20fac4ab libdm: maintain binary interface for new FEATURE flag
Older library versions did not detect unknown 'feature' bits
and could let a target start without a needed option.

The new versioned symbol now checks for supported feature bits.
The _Base version keeps accepting only previously known features and
masks/ignores unknown bits.

NB: if an older binary passed in 'random' bits, it will not get
metadata2 by chance. A newly linked binary gets the new validation function.
The library user is required not to pass 'trash' for unsupported bits,
as such calls will be rejected.
2017-03-10 19:33:01 +01:00
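The two behaviours described above (the base symbol masking unknown bits, the new symbol rejecting them) might look roughly like this. The EX_ names and bit values are made up for illustration and are not copied from libdevmapper.h; the symbol-versioning mechanism itself is not shown.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values only; not taken from libdevmapper.h. */
#define EX_FEATURE_WRITEBACK    (UINT64_C(1) << 0)
#define EX_FEATURE_WRITETHROUGH (UINT64_C(1) << 1)
#define EX_FEATURE_PASSTHROUGH  (UINT64_C(1) << 2)
#define EX_FEATURE_METADATA2    (UINT64_C(1) << 3)

/* Bits known to the old (base) interface and to the new one. */
#define EX_KNOWN_BASE (EX_FEATURE_WRITEBACK | EX_FEATURE_WRITETHROUGH | \
                       EX_FEATURE_PASSTHROUGH)
#define EX_KNOWN_NEW  (EX_KNOWN_BASE | EX_FEATURE_METADATA2)

/* Base behaviour: silently drop anything the old interface did not know,
 * so stray bits from an old binary can never enable metadata2. */
uint64_t ex_base_filter_flags(uint64_t flags)
{
    return flags & EX_KNOWN_BASE;
}

/* New behaviour: any bit outside the supported set makes the call invalid. */
bool ex_new_flags_valid(uint64_t flags)
{
    return (flags & ~EX_KNOWN_NEW) == 0;
}
```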
Zdenek Kabelac
ddd5a76801 libdm: support cache metadata2 feature flag
Dm cache target version 1.10 introduces new cache metadata format
(upstream kernel >=4.11).

The new format is enabled by passing the new target feature flag metadata2.
The interface side on libdm uses DM_CACHE_FEATURE_METADATA2.
This feature bit is now also recognized on status
and set in 'feature_flags' field of dm_status_cache structure.

The code also adds a check for the 'highest' supported feature flag bit,
so it properly rejects any 'unknown' feature bit set by an application.
2017-03-10 19:33:01 +01:00
Zdenek Kabelac
bf79fb1a33 libdm: better code to enforce writethrough
Better code to enforce writethrough caching for cleaner policy.
Only check for cleaner when DM_CACHE_FEATURE_PASSTHROUGH or
DM_CACHE_FEATURE_WRITEBACK is set.
2017-03-10 19:33:01 +01:00
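A minimal sketch of the check order described in this commit, assuming the policy is identified by the string "cleaner" and using made-up EX_ flag values rather than the real DM_CACHE_FEATURE_* definitions:

```c
#include <stdint.h>
#include <string.h>

/* Illustrative bit values only; not taken from libdevmapper.h. */
#define EX_FEATURE_WRITEBACK    (UINT64_C(1) << 0)
#define EX_FEATURE_WRITETHROUGH (UINT64_C(1) << 1)
#define EX_FEATURE_PASSTHROUGH  (UINT64_C(1) << 2)

/* Returns the flags to use: for the cleaner policy, passthrough or
 * writeback is replaced with writethrough. The policy name is only
 * compared when one of those two mode bits is set. */
uint64_t ex_enforce_writethrough(const char *policy_name, uint64_t flags)
{
    const uint64_t mode_bits = EX_FEATURE_WRITEBACK |
                               EX_FEATURE_WRITETHROUGH |
                               EX_FEATURE_PASSTHROUGH;

    if ((flags & (EX_FEATURE_PASSTHROUGH | EX_FEATURE_WRITEBACK)) &&
        policy_name && strcmp(policy_name, "cleaner") == 0)
        flags = (flags & ~mode_bits) | EX_FEATURE_WRITETHROUGH;

    return flags;
}
```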
Heinz Mauelshagen
fb42874a4f lvconvert: libdm RAID API compatibility versioning; remove new function
Commit 80a6de616a19 versioned the dm_tree_node_add_raid_target_with_params()
and dm_tree_node_add_raid_target() APIs for compatibility reasons.

There's no user of the latter function, remove it.

Related: rhbz834579
Related: rhbz1191935
Related: rhbz1191978
2017-03-01 18:58:48 +01:00
Heinz Mauelshagen
80a6de616a lvconvert: libdm RAID API compatibility versioning
Commit 27384c52cf6a lowered the maximum number of devices
back to 64 for compatibility.

Because more members have been added to the API in
'struct dm_tree_node_raid_params *', we have to version
the public libdm RAID API to not break any existing users.

Changes:

- keep the previous 'struct dm_tree_node_raid_params' and
  dm_tree_node_add_raid_target_with_params()/dm_tree_node_add_raid_target()
  in order to expose the already released public RAID API

- introduce 'struct dm_tree_node_raid_params_v2' and additional functions
  dm_tree_node_add_raid_target_with_params_v2()/dm_tree_node_add_raid_target_v2()
  to be used by the new lvm2 lib reshape extensions

With this new API, the bitfields for rebuild/writemostly legs in
'struct dm_tree_node_raid_params_v2' can be raised to 256 bits
again (253 legs maximum supported in MD kernel).

Mind that we can limit the maximum usable number via the
DEFAULT_RAID{1}_MAX_IMAGES definition in defaults.h.

Related: rhbz834579
Related: rhbz1191935
Related: rhbz1191978
2017-02-28 22:34:00 +01:00
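To illustrate the idea of a 256-bit per-leg bitfield, here is a small sketch. The struct layout and function names are hypothetical and do not reproduce 'struct dm_tree_node_raid_params_v2'; the sketch only shows how leg numbers could map onto four 64-bit words.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_RAID_BITMAP_BITS  256
#define EX_RAID_BITMAP_WORDS (EX_RAID_BITMAP_BITS / 64)

/* Hypothetical container for one per-leg bitmap (e.g. rebuild flags). */
struct ex_leg_bitmap {
    uint64_t words[EX_RAID_BITMAP_WORDS];
};

/* Mark a leg; returns false if the leg index does not fit the bitmap. */
bool ex_leg_set(struct ex_leg_bitmap *bm, unsigned leg)
{
    if (leg >= EX_RAID_BITMAP_BITS)
        return false;
    bm->words[leg / 64] |= UINT64_C(1) << (leg % 64);
    return true;
}

/* Query a leg; out-of-range indexes are reported as not set. */
bool ex_leg_test(const struct ex_leg_bitmap *bm, unsigned leg)
{
    if (leg >= EX_RAID_BITMAP_BITS)
        return false;
    return (bm->words[leg / 64] >> (leg % 64)) & 1;
}
```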
Heinz Mauelshagen
e2354ea344 lvconvert: add infrastructure for RaidLV reshaping support
In order to support striped raid5/6/10 LV reshaping (change
of LV type, stripesize or number of legs), this patch
introduces infrastructure prerequisites to be used
by raid_manip.c extensions in followup patches.

This base is needed for allocation of out-of-place
reshape space required by the MD raid personalities to
avoid writing over data in-place when reading off the
current RAID layout or number of legs and writing out
the new layout or to a different number of legs
(i.e. restripe)

Changes:
- add member reshape_len to 'struct lv_segment' to store
  out-of-place reshape length per component rimage
- add member data_copies to struct lv_segment
  to support more than 2 raid10 data copies
- make alloc_lv_segment() aware of both reshape_len and data_copies
- adjust all alloc_lv_segment() callers to the new API
- add functions to retrieve the current data offset (needed for
  out-of-place reshaping space allocation) and the devices count
  from the kernel
- make libdm deptree code aware of reshape_len
- add LV flags for disk add/remove reshaping
- support import/export of the new 'struct lv_segment' members
- enhance lv_extend/_lv_reduce to cope with reshape_len
- add seg_is_*/segtype_is_* macros related to reshaping
- add target version check for reshaping
- grow rebuilds/writemostly bitmaps to 246 bit to support kernel maximal
- enhance libdm deptree code to support data_offset (out-of-place reshaping)
  and delta_disk (legs add/remove reshaping) target arguments

Related: rhbz834579
Related: rhbz1191935
Related: rhbz1191978
2017-02-24 05:20:58 +01:00
Zdenek Kabelac
c908a8b131 libdm: avoid resume if preloaded device is smaller
When we preload a device with a smaller size, we avoid its resume,
so a later suspend/resume of the full device tree may process all
existing in-flight bios.

Also update comment and avoid using confusing opposite meaning.
2017-02-10 20:29:11 +01:00
Heinz Mauelshagen
a4bbaa3b89 lvconvert: add segtypes raid6_{ls,rs,la,ra}_6 and conversions to/from it
Add:
- support for segment types raid6_{ls,rs,la,ra}_6
  (striped raid with dedicated last Q-Syndrome SubLVs)
- conversion support from raid5_{ls,rs,la,ra} to/from raid6_{ls,rs,la,ra}_6
- setting convenient segtypes on conversions from/to raid4/5/6
- related tests to lvconvert-raid-takeover.sh factoring
  out _lvcreate,_lvconvert functions

Related: rhbz1366296
2017-02-05 00:56:27 +01:00
Heinz Mauelshagen
3673ce48e0 lvconvert: add segtype raid6_n_6 and conversions to/from it
Add:
- support for segment type raid6_n_6 (striped raid with dedicated last parity/Q-Syndrome SubLVs)
- conversion support from striped/raid0/raid0_meta/raid4 to/from raid6_n_6
- related tests to lvconvert-raid-takeover.sh

Related: rhbz1366296
2017-02-04 01:42:21 +01:00
Heinz Mauelshagen
60ddd05f16 lvconvert: add segtype raid5_n and conversions to/from it
Add:
- support for segment type raid5_n (striped raid with dedicated last parity SubLVs)
- conversion support from striped/raid0/raid0_meta/raid4 to/from raid5_n
- related tests to lvconvert-raid-takeover.sh

Related: rhbz1366296
2017-02-03 20:40:26 +01:00
Zdenek Kabelac
954c59779d libdm: drop callback on revert path
The system is likely in some very inconsistent state.
Do not try to make it even more problematic with trying
to invoke tools like thin_check via callback.
2016-12-18 19:29:08 +01:00
Zdenek Kabelac
a156fc9a54 libdm: cleaner debug message
2016-09-13 09:24:38 +02:00
Alasdair G Kergon
d8c2677ab9 raid0: Add raid0_meta segment type.
2016-07-01 22:20:54 +01:00
Alasdair G Kergon
bf8d00985a raid0: Add raid0 segment type.
This remains experimental and quite restrictive so should only be used
for testing at this stage.  (E.g. lvreduce is not supported.)
2016-05-23 16:46:38 +01:00
Zdenek Kabelac
e2ceb90095 debug: update message in libdm
When dm_tree_find_node_by_uuid() fails to find passed uuid,
report in log_debug the complete original uuid,
not the one stripped of LVM- prefix.

TODO: inspect manipulation with LVM- prefix here.
2016-04-18 12:32:56 +02:00
M.H. Tsai
f91622741f dm: fix thin-pool target params order
Wrong thin-pool feature flag ordering in the dm table leads to
unnecessary table reloads.

Fix it by placing feature flags in the order they are returned from the
kernel so the current 'table line diff' code will not see a difference.
2016-02-11 18:32:24 +01:00
Zdenek Kabelac
fcbef05aae doc: change fsf address
Hmm, rpmlint suggests the FSF is using a different address these days,
so let's keep it up-to-date.
2016-01-21 12:11:37 +01:00
Zdenek Kabelac
d582be43d4 libdm: const raid params and error for unsupported type
Accept const struct with raid params (No API change).
Also add extra error message when raid type is unsupported.
2015-11-26 09:27:04 +01:00
Zdenek Kabelac
6ca5447e0c libdm: enhance thin-pool preload
When preloading thin-pool device node for already
existing/running thin-pool do not resume such thin-pool.

This allows to properly schedule commit point for metadata,
when thin-pool data or metadata volume is resized.
2015-11-23 23:34:46 +01:00
Zdenek Kabelac
ddbf0075b1 libdm: drop extra space from cache target line
An extra space between the 'cache' target and the metadata device caused
the string comparison to report a difference, thus always causing
a table reload even when unneeded.
2015-11-23 23:33:37 +01:00
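The failure mode in this commit follows from the table line comparison being a plain string comparison. A minimal sketch, with a hypothetical function name and made-up table line contents:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the 'table line diff' decision: a reload is
 * needed whenever the line read back from the kernel and the newly
 * generated line differ in any character, whitespace included. */
bool ex_table_needs_reload(const char *kernel_line, const char *new_line)
{
    return strcmp(kernel_line, new_line) != 0;
}
```

With such a comparison, two lines that differ only by one doubled space, for example "cache 253:1 253:2" versus "cache  253:1 253:2" (device numbers invented for the example), count as different and would trigger a reload.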
Zdenek Kabelac
9ef820a2a5 libdm: dm_tree_node_size_changed recognizes reduction
Add more functionality to the size_changed function.
While the existing API only returned 0 for unchanged
and !0 for changed, the improved API also detects
whether the size has only grown or whether there was
a size reduction.

The function works on the whole dm-tree:
no change in size returns 0,
size extension only returns 1,
and any size reduction returns -1.

This result can be used for better evaluation
whether we need to flush before suspend.
2015-10-25 21:05:15 +01:00
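The three-way result described above can be sketched over a flat list of nodes. The struct and function below are hypothetical and do not reflect how dm_tree_node_size_changed() walks the real dm-tree; they only show the 0 / 1 / -1 aggregation rule.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-node record: size before and after the change. */
struct ex_node_sizes {
    uint64_t old_size;
    uint64_t new_size;
};

/* Returns 0 if no node changed size, 1 if sizes only grew,
 * and -1 as soon as any node shrank. */
int ex_tree_size_changed(const struct ex_node_sizes *nodes, size_t count)
{
    int result = 0;

    for (size_t i = 0; i < count; i++) {
        if (nodes[i].new_size < nodes[i].old_size)
            return -1;          /* a reduction dominates everything else */
        if (nodes[i].new_size > nodes[i].old_size)
            result = 1;         /* remember that something grew */
    }

    return result;
}
```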
Zdenek Kabelac
5695c6aca6 libdm: enforce writethrough mode for cleaner
With "cleaner" policy always use 'writethrough' mode.
2015-10-13 14:35:48 +02:00
Alasdair G Kergon
0173c260d8 libdm: Move status fns from deptree to targets.
libdm-deptree is only for functions working with dm tree nodes.
2015-09-28 20:28:31 +01:00
Heinz Mauelshagen
1945a0f504 libdm: fix bogus macro causing false parameter count
2015-09-24 14:22:52 +02:00
Heinz Mauelshagen
4e60e62444 raid: Fix raid target write_behind parameter.
Now uses correct "max_write_behind" instead of "writebehind".
(Includes some tidying up.)
2015-09-23 15:53:27 +01:00
Heinz Mauelshagen
96a6210198 libdm: Improve raid segment parameter handling.
2015-09-23 15:25:46 +01:00
Zdenek Kabelac
e0d915a873 libdm: parse Overflow string from snapshot status
This is likely to be a new 'info' provided by kernel
snapshot target.
For now just parse this string.
2015-09-18 17:45:45 +02:00
Zdenek Kabelac
c356991fa8 libdm: no validate for pool without messages
Avoid validation of free space in pool, when no messages are passed.

Patch a3c7e326c3e9950fe74e433c406d6e1b5a53bf25 added a new check for
pool overload - but this check should not be made if there are
no messages and transaction_id is still within 'bounds' (bigger by 1).
2015-09-14 20:18:54 +02:00
Zdenek Kabelac
a3c7e326c3 libdm: relocate parsing of thin-pool status
Use single routine for parsing status.

Internally we do not need to allocate pool memory for
passed struct.
2015-09-03 23:34:36 +02:00
Zdenek Kabelac
79ea81b8a8 thin: restore transaction_id handling
Revert back to already existing behavior which has been slightly
modified by a900d150e4658a5d72c39acdd4fefd069b8f00b8.

In the end, however, it seems to be equivalent to changing the TID right with the first
metadata write.

Existing code missed handling for 'unused' thin-pool which would
require to also check empty message list for TID==0.

So with the fix we now again preserve 'active' thin-pool volume
when first thin volume is created - this property was lost and caused
problems in a cluster, where the lock was held but the volume was no longer
active on the node.

Another missing part was the proper support for already increased,
but unfinished TID change.

So going back here with existing logic -

TID is increased with first MDA update.

Code allows start with either same TID or (TID-1).

If there are messages, TID must be lower by 1 for sending,
otherwise messages were already posted.
2015-08-17 11:25:03 +02:00
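One possible reading of the TID rules listed at the end of this commit message, written as a decision function. The enum, the function name and the interpretation itself are assumptions made for illustration; this is not the libdm code.

```c
#include <stdbool.h>
#include <stdint.h>

enum ex_tid_action {
    EX_TID_MISMATCH,        /* kernel TID is neither TID nor TID-1 */
    EX_TID_NOTHING_TO_SEND, /* start is allowed, no messages to post */
    EX_TID_SEND_MESSAGES    /* kernel is one behind: post the messages */
};

/* kernel_tid:   transaction id reported by the running thin-pool
 * metadata_tid: transaction id stored in the (already updated) metadata */
enum ex_tid_action ex_tid_check(uint64_t kernel_tid, uint64_t metadata_tid,
                                bool have_messages)
{
    /* Same TID: any queued messages were already posted. */
    if (kernel_tid == metadata_tid)
        return EX_TID_NOTHING_TO_SEND;

    /* Kernel is exactly one behind: messages, if any, still need sending. */
    if (metadata_tid > 0 && kernel_tid == metadata_tid - 1)
        return have_messages ? EX_TID_SEND_MESSAGES : EX_TID_NOTHING_TO_SEND;

    return EX_TID_MISMATCH;
}
```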
Zdenek Kabelac
48ed8ac50c cleanup: indent
2015-08-12 14:33:16 +02:00
Zdenek Kabelac
79e9bde0ea libdm: rename to data_block_size
Use common name for pool device - as we use data_block_size
for thin pool metadata, use same name for cache_pool.

This change does not affect API.
2015-08-12 14:33:15 +02:00
Zdenek Kabelac
08f047eb51 libdm: cache target arg validation
Add some arg validation for dm_tree_node_add_cache_target().
2015-08-12 14:33:15 +02:00
Alasdair G Kergon
810ab095e6 macros: Wrap PRI with FMT.
Create a set of wrappers with embedded % such as
  #define FMTu64 "%" PRIu64
2015-07-06 15:09:17 +01:00
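A short usage sketch for a wrapper of this kind. The FMTu64 definition is the one quoted in the commit message; the helper function and its message text are made up for the example.

```c
#include <inttypes.h>
#include <stdio.h>

/* Wrapper quoted in the commit message: the % is embedded in the macro. */
#define FMTu64 "%" PRIu64

/* Hypothetical helper: format a sector count into buf.
 * Returns the snprintf() result (length the full string needs). */
int ex_format_size(char *buf, size_t len, uint64_t sectors)
{
    /* Adjacent string literals are concatenated by the compiler, so the
     * macro drops into the format string without an explicit "%". */
    return snprintf(buf, len, "size " FMTu64 " sectors", sectors);
}
```

Compared with writing "%" PRIu64 at every call site, the wrapper keeps format strings shorter and avoids forgetting the leading percent sign.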
Zdenek Kabelac
a900d150e4 thin: move pool messaging from resume to suspend
The existing messaging interface for thin-pool has a few 'weak' points:

* Messages were posted with each 'resume' operation, thus not allowing
activation of thin-pool with the existing state.

* The acceleration that skipped the suspend step did not work in a cluster,
since clvmd resumes only nodes which are suspended (have proper lock
state).

* Resume may fail and code is not really designed to 'fail' in this
phase (generic rule here is resume DOES NOT fail unless something serious
is wrong and lvm2 tool usually doesn't handle recovery path in this case.)

* A full thin-pool suspend happened when taking a thin-volume snapshot.

With this patch the new method relocates message passing into suspend
state.

This has a few drawbacks with the current API, but overall it performs
better and gives more possibilities to deal with errors.

Patch introduces a new logic for 'origin-only' suspend of thin-pool and
this also relates to thin-volume when taking snapshot.

When suspend_origin_only operation is invoked on a pool with
queued messages then only those messages are posted to thin-pool and
actual suspend of thin pool and data and metadata volume is skipped.

This makes taking a snapshot of thin-volume lighter operation and
avoids blocking of other unrelated active thin volumes.

Failure now also happens in the 'suspend' state, where a failure is more expected
and is better handled through error paths.

Activation of thin-pool now does not send any message and leaves it up to a tool
to decide later how to finish an unfinished double-commit transaction.

Problem which needs some API improvements relates to the lvm2 tree
construction. For the suspend tree we do not add target table line
into the tree, but only a device is inserted into a tree.
The current mechanism to attach messages for thin-pool requires libdm
to know about the thin-pool target, so lvm2 currently assumes the node
is really a thin-pool and fills in the table line for this node (which
should be ensured by the PRELOAD phase, but it's a misuse of internal API).
We would possibly need to be able to attach a message to 'any' node.

Other thing to notice - current messaging interface in thin-pool
target requires to suspend thin volume origin first and then send
a create message, but this could not have any 'nice' solution on lvm2
side and IMHO we should introduce something like 'create_after_resume'
message.

Patch also changes the moment, where lvm2 transaction id is increased.
Now it happens only after successful finish of kernel transaction id
change. This change was needed to handle properly activation of pool,
which is in the middle of unfinished transaction, and also this corrects
usage of thin-pool by external apps like Docker.
2015-07-03 16:13:14 +02:00
Zdenek Kabelac
5bef18f2eb libdm: support for posting messages in suspend
Add support for sending message in suspend tree for thin-pools.
When this operation is requested whole subtree suspend is then skipped.

This is experimental support for new lvm2 code for sending message
in suspend phase where 'thin-pool origin-only suspend' will send
messages instead of really suspending thin-pool tree.

When suspending thin volume origin-only - only thin volume is suspended,
then messages are posted and thin-pool suspend is skipped.
2015-07-03 16:13:14 +02:00
Zdenek Kabelac
21c0b1134f libdm: enhance tracing messages
Use new _node_name() and print name major:minor for thin-pool device.
2015-07-01 13:44:28 +02:00
Zdenek Kabelac
04ae5007e3 libdm: add helper function to print _node_name
_node_name() prepares the device name and its (major:minor) in a dm_tree
internal buffer for easy use in debug messages.

To avoid any allocation a small buffer in struct dm_tree is preallocated
to store this message.
2015-07-01 13:41:40 +02:00
Zdenek Kabelac
69132f55ea libdm: add dm_tree_node_set_thin_pool_read_only
Support thin-pool tree node with activation in read-only mode.
(Native kernel API).
2015-06-18 15:15:39 +02:00
Zdenek Kabelac
9a06ae7b35 libdm: better debug message
Print reason for failing ioctl if thin pool message fails.
2015-06-15 14:48:04 +02:00
Zdenek Kabelac
5232fd13f3 cleanup: cast minor to dev_t
Let the arithmetic run with a single dev_t type (Coverity).
2015-05-08 15:15:10 +02:00
Zdenek Kabelac
2908ab3eed thin: errorwhenfull support
Support error_if_no_space feature for thin pools.
Report more info about thinpool status:
(out_of_data (D), metadata_read_only (M), failed  (F) also as health
attribute.)
2015-01-14 14:52:05 +01:00
Zdenek Kabelac
20b22cd023 libdm: still better API
Do not use 'any' policy name as a value in config tree - so we stick
with 'policy_settings' and extra 'policy_name' for libdm params.

Update lvm2 API as well.

Example of supported metadata:

 policy = "mq"
 policy_settings {
      migration_threshold = 2048
      sequential_threshold = 512
      random_threshold = 4
      read_promote_adjustment = 10
 }
2014-11-11 00:54:03 +01:00
Zdenek Kabelac
f12e3da639 cleanup: gcc warnings
2014-11-10 22:05:49 +01:00
Zdenek Kabelac
824019531c libdm: tuning cache API
Support new PASSTHROUGH 'feature' flag.

Add dm_config_node to pass in policy args.

Really use origin_uuid instead of using extra call
to pass seg_areas.

Switch to 64bit feature flag bit set so there is
enough space in future for new bits...
2014-11-10 22:05:48 +01:00
Zdenek Kabelac
89233544e0 libdm: allow to activate any pool with tid == 0
When transaction_id is set to 0 for thin-pool, libdm avoids validation
of thin-pool, unless there are real messages to be sent to thin-pool.
This relaxes the strict policy which always required knowing
the transaction_id of the kernel target up front.

It now allows activating a thin-pool with any transaction_id
(when transaction_id is passed in).

It is now up to the application to validate the transaction_id of the live
thin-pool volume against the transaction_id within its own metadata.
2014-11-04 15:28:00 +01:00
Zdenek Kabelac
8f518cf197 libdm: add check transaction_id after message
Add extra safety detection for thin pool transaction id
and query pool status after confirmed message.

In case there is a mismatch, immediately abort further
processing.
2014-08-26 14:12:20 +02:00