1
0
mirror of git://sourceware.org/git/lvm2.git synced 2024-12-21 13:34:40 +03:00
Commit Graph

13444 Commits

Author SHA1 Message Date
Bryn M. Reeves
c90e9392e4 libdm: add dm_stats_bind_from_fd()
dmsetup already has a version of this function, and dmfilemapd will
need it too: move it to libdevmapper to avoid copying it around.
2016-12-18 20:47:17 +00:00
Bryn M. Reeves
009b711834 libdm: clear region table in dm_stats_list()
Call _stats_regions_destroy() from dm_stats_list() if dms->regions
is non-NULL. This avoids leaking any pool allocations and ensures
the handle is in a known state: if an error occurs during the list,
dms->regions will be NULL and the handle will appear empty.
2016-12-18 20:44:31 +00:00
Zdenek Kabelac
8d6ac1c3ba tests: using cached LV for external origin 2016-12-18 19:38:51 +01:00
Zdenek Kabelac
8c17233af5 debug: add debug message showing new lv
Make trace easier to follow knowing which LV was added to dtree.
2016-12-18 19:38:51 +01:00
Zdenek Kabelac
034931f68d activate: further _info API refinement
Another cleanup of internal _info() API simplifying code.
Also make sure 'error' on _info() call is properly passed upward
(return 0 is error path).
2016-12-18 19:38:51 +01:00
Zdenek Kabelac
79121416df thin: add comment with future extension
It could be actually better to use even cache origin in
read-only mode so there could no be some 'acidental'
change being done on this volume.

This however need further tools enhancment - where we would need
to handle whole subtree on 'lvchange -pr/-prw'.
2016-12-18 19:38:51 +01:00
Zdenek Kabelac
75f2388093 backup: show warning once per command
When command calls backup() more then once (which is actually not
wanted) this warning message is shown repeatedly:

"WARNING: This metadata update is NOT backed up."

Instead now print message just once and less confuse user.
2016-12-18 19:38:30 +01:00
Zdenek Kabelac
5bb6266046 lvconvert: support cache to external origin conversion
Add this functionality to lvconvert:

'lvconvert --thin cachedLV --thinpool vg/poll'

Converts cachedLV to external origin (which will be read-only).
New thin volume is created in thinpool LV and it's using external
origin as source for unprovisioned chunks.
This conversion happens  online (while volume is in use).
Thin LV remains fully writable.
Cached external origin no longer could be written so cache will be used
ONLY for read operations. For this limitation we require cache mode
to be writethrough (as writeback cannot write to read-only volumes).

When  thinLV is later removed  cached external origin is again
fully usable, just note, LV remain in 'read-only' mode.
When read-write is needed,  'lvchange -prw' has to be used.

Single external origin could be user by multiple thinLV in
multiple differen thin pool.
2016-12-18 19:35:27 +01:00
Zdenek Kabelac
69434c2eca cache: improve activation with -real
When cache volume may be converted from normal to -real layer LV
we need to improve logic for call cache_check.

With this patch, we register call for cache_check only when metadata LV
is not yet present in active table slot (should match initial table
load).
This avoids unwanted checking when cache would become layer device
online.
2016-12-18 19:30:50 +01:00
Zdenek Kabelac
954c59779d libdm: drop callback on revert path
The system is likely in some very inconsisten state.
Do not try to make it even more problematic with trying
to invoke tools like thin_check via callback.
2016-12-18 19:29:08 +01:00
Zdenek Kabelac
29b0e42be3 lv: fix lock holder for external origin
External origin could be reloaded via more locks.
It's actually even more complex then thin-pool,
as it may be active on more nodes for linear LVs
(and maybe even more types).

External origin is always read-only thus unmodifiable
device so there should not be a problem accesing it
through multiple nodes.

Also for thin-pool check first presence of active thin-pool.

FIXME:
It's not easy to detect on which nodes this device is active
Thus manipulation with such device may require checking every
node and it active state and refresh.

But since such setup is quite complex to prepare and use,
hopefully there are not user trying to 'explore' this usage yet.
2016-12-18 19:25:25 +01:00
Zdenek Kabelac
a24eae6e82 cache: prepare status checking for layer
To be ready to show status of cache volume, call the status
with layer.  Layer is automatically detected in this case when
cache volume is used in 'layered' form (needs -real suffix).
2016-12-18 19:23:13 +01:00
Zdenek Kabelac
bf157ed833 cache: improve wait for cache clear
Avoid printing misleading message about single dirty block.
Instead properly detect condition where the 'cleaner' policy
needs to be installed without 'overloading' dirty variable.

Also print warning if we would be clearing read-only volume.
(it really shouldn't happen).
2016-12-18 19:22:11 +01:00
Zdenek Kabelac
36f609e513 validation: check external property is matching
Detect if number of external_count is matching
referencing devices for  external_origin LV.
2016-12-18 19:17:59 +01:00
Zdenek Kabelac
7db46c4a39 thin: reload external origin with last thin
External origin could be activated as stand-alone device.
When the last thin LV is removed, external origin is no longer
the external origin and it's layer property was dropped.

Ensure dm table is correct by reloading external origin
(when it's active).
2016-12-18 19:13:34 +01:00
Zdenek Kabelac
c71fefad8d lvs: show status for layer
When LV is external origin, show info for LV but
status for -layer.  So we expose more info to a user
as otherwise active external origin is only linear
mapping of -real layer.

We do the same for i.e. old snaphost origin.
2016-12-18 19:12:12 +01:00
Zdenek Kabelac
bdfc96cb08 raid: fix activation of tracked image
Activation of raid has brough up also splitted image with tracing
(without taking lock for this).

So when raid is now activate - such image is not put into
table (with _rmeta).  When user needs such device, just active it.
2016-12-18 19:10:38 +01:00
Bryn M. Reeves
a15f0d181c dmstats: don't declare _start_timestamp if HAVE_SYS_TIMERFD_H
The _start_timestamp is not used by the TIMERFD clock.
2016-12-18 14:08:11 +00:00
Bryn M. Reeves
3e53adf7c0 dmstats: fix TIMERFD _timer_running() test 2016-12-18 14:07:25 +00:00
Bryn M. Reeves
5a4750d76c dmstats: fix interval number reporting with --count=0
When --count=0 interval numbers are miscalculated:

Interval     #18446744069414584325     time delta:    999920887ns
Interval     #18446744069414584325   current err:       -79113ns
End interval #18446744069414584325  duration:    999920887ns

This is because the command line argument is cast through the
uint32_t type, and fixed to UINT32_MAX:

  _count = ((uint32_t)_int_args[COUNT_ARG]) ? : UINT32_MAX;

We also need to handle --count=0 specially when calculating the
interval number: since intervals count from #1, this must account
for the implicit "minus one" when converting from zero to the
UINT64_MAX value used (which is too large to store in _int_args).
2016-12-18 13:03:45 +00:00
Bryn M. Reeves
5635cd3b03 dmstats: separate TIMERFD and useleep() exit conditions
The time management code mixes tests of the _timer_fd value with
code that should be timer agnostic: this causes problems for users
of the usleep() timer, since it cannot properly detect the start
of a new interval:

Beginning first interval
Interval     #18446744069414584348     time delta:   1000000000ns
Interval     #18446744069414584348   current err:            0ns
End interval #18446744069414584348  duration:   1000000000ns
Adjusted sample interval duration:   1000000000ns
[...]
Beginning first interval
Interval     #18446744069414584349     time delta:   1000000000ns
Interval     #18446744069414584349   current err:            0ns
End interval #18446744069414584349  duration:   1000000000ns
Adjusted sample interval duration:   1000000000ns

Separate these out, by defining a _timer_running() call that each
timer implements, and only define _timer_fd if we are compiling
with TIMERFD enabled.
2016-12-18 13:03:44 +00:00
Bryn M. Reeves
886b4f755d dmstats: use better interval estimate for usleep() timer
Although the usleep() interval timer is not used if the Linux
TIMERFD interface is available it should still provide reasonably
good timing.

Instead of trying to estimate the error from the duration of the
last sleep, peg it to the start time of the program, and use the
value of  ((start_time - now) % interval) to correct the current
interval duration.

This always pulls us back into sync at the end of each interval,
rather than relying on trying to incrementally adjust the time
duration at each interval start.

This greatly reduces drift when the usleep() clock is used.
2016-12-18 13:03:44 +00:00
Bryn M. Reeves
68ec42ebaf dmstats: improve tool help output and option coverage 2016-12-18 11:51:13 +00:00
Bryn M. Reeves
4f9d901c71 man: fix 'dmstats create' formatting in dmstats.8.in 2016-12-18 10:23:12 +00:00
Bryn M. Reeves
14be8c4fad man: fix 'dmstats list' option formatting in dmstats.8.in 2016-12-18 10:12:56 +00:00
Bryn M. Reeves
25dd3988c3 man: fix 'dmstats <command>' formatting in dmstats.8.in 2016-12-18 10:12:45 +00:00
Bryn M. Reeves
35791689ba libdm: use destination size as limit in dm_bit_copy()
The dm_bit_copy() macro uses the source (bs1) bitset size as the
limit for memcpy:

    memcpy((bs1) + 1, (bs2) + 1, ((*(bs1) / DM_BITS_PER_INT) + 1)..)

This is safe if the destination bitset is smaller than the source,
or if the two bitsets are of the same size.

With a destination that is larger (e.g. when resizing a bitmap to
add more capacity), the memcpy will overrun the source bitset and
set garbage bits in the destination.

There are nine uses of the macro currently (8 in libdm/regex, and
1 in daemons/cmirrord): in each case the two bitsets are always of
equal size so the behaviour is unchanged.

Fix the macro to use bs2's size to simplify resizing bitsets and
avoid the need for another copy macro.
2016-12-14 11:28:11 +00:00
Zdenek Kabelac
0f98d5c2e6 cleanup: use exiting function
Reuse existing code and some indent change.
2016-12-14 11:41:42 +01:00
Zdenek Kabelac
fecd043cca raid: split preserves local exlusive activation 2016-12-14 11:40:01 +01:00
Zdenek Kabelac
77e09c3fb4 raid: activation with list
Commit 0690392040 revealed a problem
in raid metadata manipulation.

We do two operations in one table reload:
- raid leg/image extraction
- rename remaining raid legs

This should be made in separate steps. Otherwise we do an
uncorrectable table change on error path (leaving tables
for admin and dmsetup).

As a hotfix - restore the previous logic and use a single
new function _lv_update_and_reload_list which activates exclusively
extracted LVs on the list before resuming suspended raid LV.
This restore 'rename' functionality upon resume.

Also still preserve the 'origin_only' logic - although we know
it's not working properly for cluster and LV stacking.

Further fixes are needed.
2016-12-14 11:37:02 +01:00
Zdenek Kabelac
4a05f83278 configure: just move new macro to right file
aclocal is regenerated while acinclude is permanent.
Move new macro to permanent file.
2016-12-13 22:49:59 +01:00
Bryn M. Reeves
f4401fe351 libdm: ensure first extent is always counted
If FIEMAP returns a single extent after the first call, no extent
boundary is detected and the first extent is not counted by the
normal mechanism.

In this case, increment nr_extents at the same time the extent is
added to the region table, before returning.
2016-12-13 21:41:31 +00:00
Zdenek Kabelac
fce7449d73 cleanup: remove wrapping function
backup is not 'tested' for success and also it should
actually happen just when command is finished.
We do not target to make backups with each inter-step
metadata change.
2016-12-13 22:07:52 +01:00
Zdenek Kabelac
c7da16e5f1 cleanup: log message updates 2016-12-13 22:07:52 +01:00
Zdenek Kabelac
a8f5e1f274 cleanup: more lv_is_ usage 2016-12-13 22:07:52 +01:00
Zdenek Kabelac
47b96c3537 cleanup: allocate NAME_LEN size for lv name 2016-12-13 22:07:52 +01:00
Zdenek Kabelac
d0fe3ec0c5 raid: avoid manipulation of segment status
RAID is LV property

TODO: only 2 flags are seg->status: PVMOVE & MERGING
At least the second one should be soon elimanted as again
we merge LV not a segment.
2016-12-13 22:07:52 +01:00
Zdenek Kabelac
d1e398c474 segtype: check for seg type instead of status
RAID is LV property - which has single segment of raid type.
2016-12-13 22:07:52 +01:00
Zdenek Kabelac
0690392040 raid: improve table reload sequence
This is another place for 'common' use pattern or
reload and activation of deleted devices.
(Moving the exclusive activation to _deactivate_and_remove_lvs()).

TODO: looks like halve of raid function is reloading
just 'origin' - and the other full LV.
2016-12-13 22:07:52 +01:00
Bryn M. Reeves
7dff632c11 libdm: add min_num_bits to dm_bitset_parse_list()
It's useful to be able to specify a minimum number of bits for a
new bitmap parsed from a list, for e.g. to allow for expansing a
group without needing to copy/reallocate the bitmap.

Add a backwards compatible symbol for programs linked against old
versions of the library.
2016-12-13 21:02:18 +00:00
Bryn M. Reeves
e8d966bc31 libdm: use dm_bit_get_last() in _stats_group_tag_fill()
Instead of iterating over all bits, use dm_bit_get_last() to find
the last set bit in the group bitmap.
2016-12-13 21:02:18 +00:00
Bryn M. Reeves
5d1d65e735 libdm: add dm_bit_get_last()/dm_bit_get_prev()
It is sometimes convenient to iterate over the set bits in a dm
bitset in reverse order (from the highest set bit toward zero), or
to quickly find the last set bit.

Add dm_bit_get_last() and dm_bit_get_prev(), mirroring the existing
dm_bit_get_first() and dm_bit_get_next().

dm_bit_get_prev() uses __builtin_clz when available to efficiently
test the bitset in reverse.
2016-12-13 21:01:58 +00:00
Bryn M. Reeves
107bc13db3 util: add clz() and use __builtin_clz() if available
Add a macro for the clz (count leading zeros) operation.

Use the GCC __builtin_clz() for clz() if it is available and fall
back to a shift based implementation on systems that do not set
HAVE___BUILTIN_CLZ.
2016-12-13 20:41:29 +00:00
Bryn M. Reeves
83a9cd5258 configure: check for __builtin_clz() 2016-12-13 20:38:40 +00:00
Bryn M. Reeves
930b0b4c9e libdm: fix start of file detection in _stats_map_extents() 2016-12-13 20:25:47 +00:00
Bryn M. Reeves
eb65572217 libdm: break up _stats_get_extents_for_file()
Split out the loop that iterates over each batch of FIEMAP
extent data from the function that sets up and calls the ioctl
to reduce nesting and simplify local variable use:

  _stats_get_extents_for_file()
  ->  _stats_map_extents()

The _stats_map_extents() function is responsible for detecting
eof and extent boundaries and adding whole, allocated extents
to the file extent table for region creation.
2016-12-13 20:25:45 +00:00
Bryn M. Reeves
ea9af3e290 libdm: fix dm_stats_foreach_group() macro 2016-12-13 20:01:00 +00:00
Bryn M. Reeves
8e33972828 libdm: check for non-existent region_id values in groups
Check that all region_id values specified in a group bitmap are
actually present: although this should not normally happen when
using the dmstats tool, it is possible as a result of manual
changes (or bugs) for a group descriptor to contain one or more
group_id values that do not exist.

Check for this situation when reading group descriptors, warn
the user the user, and clear these bits in the bitmap when
formatting it for output.
2016-12-13 15:37:48 +00:00
Bryn M. Reeves
99b6d82e2d libdm: fix segfault with invalid group descriptor
If a region has a a DMS_GROUP tag in aux_data where the first
region_id in the bitmap is not the same as the containing region,
dmstats will segfault:

  # '2' is never a valid group bitset list for region_id == 0
  # dmsetup message vg_hex/root 0 "@stats_set_aux 0 DMS_GROUP=img:2#"

  # dmsetup message vg_hex/root 0 "@stats_list"
  0: 45383680+16384 16384 dmstats DMS_GROUP=img:2#
  1: 46071808+32768 32768 dmstats -
  2: 47382528+16384 16384 dmstats -

  # dmstats list
  Segmentation fault (core dumped)

The crash will occur in some arbitrary dm_stats_get_* property
method - this happens while processing the 1st region_id in the
bitset, because the region is marked as grouped, but there is
no group bitmap present at dms->groups[2]->regions.

Fix this by detecting a mismatch between the expected region_id
and dm_bit_get_first() for the parsed bitset during
_parse_aux_data_group().
2016-12-13 14:37:41 +00:00
Bryn M. Reeves
138e4336fd libdm: fix region overlap tests 2016-12-13 09:09:29 +00:00