mirror of git://sourceware.org/git/lvm2.git synced 2025-01-06 17:18:29 +03:00
333 Commits

Author SHA1 Message Date
Heinz Mauelshagen
e2354ea344 lvconvert: add infrastructure for RaidLV reshaping support
In order to support striped raid5/6/10 LV reshaping (change
of LV type, stripesize or number of legs), this patch
introduces infrastructure prerequisites to be used
by raid_manip.c extensions in followup patches.

This base is needed for allocation of out-of-place
reshape space required by the MD raid personalities to
avoid writing over data in-place when reading off the
current RAID layout or number of legs and writing out
the new layout or to a different number of legs
(i.e. restripe)

Changes:
- add member reshape_len to 'struct lv_segment' to store
  out-of-place reshape length per component rimage
- add member data_copies to struct lv_segment
  to support more than 2 raid10 data copies
- make alloc_lv_segment() aware of both reshape_len and data_copies
- adjust all alloc_lv_segment() callers to the new API
- add functions to retrieve the current data offset (needed for
  out-of-place reshaping space allocation) and the devices count
  from the kernel
- make libdm deptree code aware of reshape_len
- add LV flags for disk add/remove reshaping
- support import/export of the new 'struct lv_segment' members
- enhance lv_extend/_lv_reduce to cope with reshape_len
- add seg_is_*/segtype_is_* macros related to reshaping
- add target version check for reshaping
- grow rebuilds/writemostly bitmaps to 246 bit to support kernel maximal
- enhance libdm deptree code to support data_offset (out-of-place reshaping)
  and delta_disk (legs add/remove reshaping) target arguments

Related: rhbz834579
Related: rhbz1191935
Related: rhbz1191978
2017-02-24 05:20:58 +01:00
Bryn M. Reeves
e0d19feb85 libdm: add dm_stats_update_regions_from_fd()
Add a call to update the regions corresponding to a file mapped
group of regions. The regions to be updated must be grouped, to
allow us to correctly identify extents that have been deallocated
since the map was created.

Tables are built of the file extents, and the extents currently
mapped to dmstats regions: if a region no longer has a matching
file extent, it is deleted, and new regions are created for any
file extents without a matching region.

The FIEMAP call returns extents that are currently in-memory (or
journaled) and awaiting allocation in the file system. These have
the FIEMAP_EXTENT_UNKNOWN | FIEMAP_EXTENT_DELALLOC flag bits set
in the fe_flags field - these extents are skipped until they
have a known disk location.

Since it is possible for the 0th extent of the file to have been
deallocated this must also handle the possible deletion and
re-creation of the group leader: if no other region allocation
is taking place the group identifier will not change.
2017-01-25 16:15:21 +00:00
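The reconciliation described in this message (delete regions with no matching file extent, create regions for extents with no matching region) can be sketched as a simple two-way comparison. This is an illustrative sketch only: struct span, span_in() and plan_update() are hypothetical names, not libdm functions, and matching on exact start and length is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical (start, length) pair used for both regions and extents. */
struct span {
	uint64_t start;
	uint64_t len;
};

/* Return 1 if 's' has an exact match in 'set', 0 otherwise. */
static int span_in(const struct span *set, size_t n, struct span s)
{
	for (size_t i = 0; i < n; i++)
		if (set[i].start == s.start && set[i].len == s.len)
			return 1;
	return 0;
}

/* Count the regions that no longer match any extent (to delete) and
 * the extents that have no region yet (to create). */
static void plan_update(const struct span *regions, size_t nr_regions,
			const struct span *extents, size_t nr_extents,
			size_t *to_delete, size_t *to_create)
{
	*to_delete = 0;
	*to_create = 0;

	for (size_t i = 0; i < nr_regions; i++)
		if (!span_in(extents, nr_extents, regions[i]))
			(*to_delete)++;

	for (size_t i = 0; i < nr_extents; i++)
		if (!span_in(regions, nr_regions, extents[i]))
			(*to_create)++;
}
```

A real implementation would act on each mismatch rather than count it, and would skip extents that do not yet have a known disk location before building the extent table.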
Bryn M. Reeves
ca427a711a libdm: fix stats comment formatting in libdevmapper.h 2017-01-24 09:29:31 +00:00
Bryn M. Reeves
c90e9392e4 libdm: add dm_stats_bind_from_fd()
dmsetup already has a version of this function, and dmfilemapd will
need it too: move it to libdevmapper to avoid copying it around.
2016-12-18 20:47:17 +00:00
Bryn M. Reeves
35791689ba libdm: use destination size as limit in dm_bit_copy()
The dm_bit_copy() macro uses the destination (bs1) bitset size as the
limit for memcpy:

    memcpy((bs1) + 1, (bs2) + 1, ((*(bs1) / DM_BITS_PER_INT) + 1)..)

This is safe if the destination bitset is smaller than the source,
or if the two bitsets are of the same size.

With a destination that is larger (e.g. when resizing a bitmap to
add more capacity), the memcpy will overrun the source bitset and
set garbage bits in the destination.

There are nine uses of the macro currently (8 in libdm/regex, and
1 in daemons/cmirrord): in each case the two bitsets are always of
equal size so the behaviour is unchanged.

Fix the macro to use bs2's size to simplify resizing bitsets and
avoid the need for another copy macro.
2016-12-14 11:28:11 +00:00
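A minimal sketch of the fixed behaviour, assuming the layout implied by the quoted memcpy (word 0 holds the bit count, data words follow). The names bitset_words() and bitset_copy_from() are hypothetical and this is not the libdm macro itself; the caller is assumed to provide a destination at least as large as the source.

```c
#include <string.h>

#define BITS_PER_INT (sizeof(int) * 8)

/* Number of data words used by a bitset whose word 0 is its bit count
 * (same arithmetic as the memcpy length quoted in the message). */
static size_t bitset_words(const unsigned int *bs)
{
	return (bs[0] / BITS_PER_INT) + 1;
}

/* Copy the data words of 'src' into 'dst', bounded by the size of the
 * source, so a larger destination never reads past the end of 'src'. */
static void bitset_copy_from(unsigned int *dst, const unsigned int *src)
{
	memcpy(dst + 1, src + 1, bitset_words(src) * sizeof(unsigned int));
}
```

With an 8-bit source and a 70-bit destination, only one data word is copied and the remaining destination words keep their previous contents.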
Bryn M. Reeves
7dff632c11 libdm: add min_num_bits to dm_bitset_parse_list()
It's useful to be able to specify a minimum number of bits for a
new bitmap parsed from a list, e.g. to allow for expanding a
group without needing to copy/reallocate the bitmap.

Add a backwards compatible symbol for programs linked against old
versions of the library.
2016-12-13 21:02:18 +00:00
Bryn M. Reeves
5d1d65e735 libdm: add dm_bit_get_last()/dm_bit_get_prev()
It is sometimes convenient to iterate over the set bits in a dm
bitset in reverse order (from the highest set bit toward zero), or
to quickly find the last set bit.

Add dm_bit_get_last() and dm_bit_get_prev(), mirroring the existing
dm_bit_get_first() and dm_bit_get_next().

dm_bit_get_prev() uses __builtin_clz when available to efficiently
test the bitset in reverse.
2016-12-13 21:01:58 +00:00
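Reverse iteration over set bits, as described in this message, might look roughly like the following. This is a sketch over a plain array of 32-bit words; word_last_bit(), bits_get_last() and bits_get_prev() are hypothetical names and do not reproduce the libdm implementation.

```c
#include <stdint.h>

/* Index (0..31) of the highest set bit in a non-zero 32-bit word. */
static int word_last_bit(uint32_t w)
{
#if defined(__GNUC__)
	return 31 - __builtin_clz(w);
#else
	int bit = 31;

	while (!(w & (UINT32_C(1) << bit)))
		bit--;
	return bit;
#endif
}

/* Highest set bit in words[0..nwords-1], or -1 if no bit is set. */
static int bits_get_last(const uint32_t *words, int nwords)
{
	for (int i = nwords - 1; i >= 0; i--)
		if (words[i])
			return i * 32 + word_last_bit(words[i]);
	return -1;
}

/* Highest set bit strictly below 'pos', or -1 if there is none. */
static int bits_get_prev(const uint32_t *words, int pos)
{
	int i, off;
	uint32_t w;

	if (pos <= 0)
		return -1;

	i = (pos - 1) / 32;
	off = (pos - 1) % 32;
	/* Mask off the bits at or above 'pos' in the first word examined. */
	w = words[i];
	if (off < 31)
		w &= (UINT32_C(1) << (off + 1)) - 1;

	for (;;) {
		if (w)
			return i * 32 + word_last_bit(w);
		if (--i < 0)
			return -1;
		w = words[i];
	}
}
```

A reverse walk starts from bits_get_last() and repeatedly calls bits_get_prev() with the previous result until it returns -1.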
Bryn M. Reeves
ea9af3e290 libdm: fix dm_stats_foreach_group() macro 2016-12-13 20:01:00 +00:00
Bryn M. Reeves
2d1dbb9edd libdm: fix performance of failed filemap cleanup
While cleaning up the table of already created regions during a
failed dm_stats_create_regions_from_fd(), list the handle once,
and call _stats_delete_region() directly. This avoids sending a
@stats_list message for each region deleted, reducing runtime
from 6s to 0.7s when cleaning up ~250 out of ~10000 regions:

  # time dmstats create --filemap b.img
  device-mapper: message ioctl on (253:0) failed: Cannot allocate memory
  Failed to create region 246 of 309 at 9388032.
  Could not create regions from file /root/b.img
  << pauses here >>
  Command failed

  real	0m6.267s
  user	0m3.770s
  sys	0m2.487s

  # time dmstats create --filemap b.img
  device-mapper: message ioctl on (253:0) failed: Cannot allocate memory
  Failed to create region 246 of 309 at 9388032.
  Could not create regions from file /root/b.img
  Command failed

  real	0m0.716s
  user	0m0.034s
  sys	0m0.581s

Testing the error path requires region creation to start to
fail part way through the operation (in order to have regions
to clean up): the simplest way is to ensure the system is
close to the kernel limit of 1/4 RAM or 1/2 vmalloc space
consumed by dmstats data.
2016-12-10 11:59:16 +00:00
Zdenek Kabelac
74923c213f cleanup: add doc for raid status states
Show possible values for raid fields user may get ATM.
2016-11-23 17:55:03 +01:00
Peter Rajnoha
7563e69cf1 libdm: add dm_config_parse_without_dup_node_check
Introduce a function for parsing a config tree without checking
for duplicate nodes.
2016-09-21 18:15:18 +02:00
Peter Rajnoha
db0e34535c libdm: add some comments about DM_UDEV_DISABLE_LIBRARY_FALLBACK flag 2016-08-23 15:58:03 +02:00
Peter Rajnoha
7d1125e5b7 libdm: report: add dm_report_group_output_and_pop_all
The dm_report_group_output_and_pop_all calls dm_report_output and
dm_report_group_pop for all the items that are currently in report
group. This is just a shortcut that makes it easier to output and
pop group's content so the group handle can be reused again without
a need to initialize and configure it again.

The functionality of dm_report_group_output_and_pop_all is the
same as dm_report_destroy but without destroying the report group
handle.
2016-08-09 18:24:45 +02:00
Peter Rajnoha
9c21139284 libdm: report: add dm_report_destroy_rows
Calling dm_report_destroy_rows makes it possible to destroy any report
content we have but at the same time it doesn't destroy the report
handle itself, thus it's possible to reuse that handle again for new
report content.

Functionally, this is the same as calling dm_report_output with the
report handle but omitting the output itself. This functionality may
be useful if we, for whatever reason, need to discard the report
content and start a fresh new one but with the same report configuration
and initialization and thus we can just reuse the existing handle.
2016-08-09 18:24:45 +02:00
Bryn M. Reeves
252952ff33 libdm: document use of dm_free() with histogram bounds 2016-07-18 18:48:34 +01:00
Bryn M. Reeves
e104825916 libdm: add dm_stats_create_regions_from_fd()
Add a call to create dmstats regions that correspond to the extents
present in a file descriptor open on a file in a local file system.
The file must reside on a file system type that correctly supports
physical extent location data in the FIEMAP ioctl.

Regions are optionally placed into a group with a user-defined alias.

File systems that do not support physical offsets in FIEMAP (btrfs
currently) are detected via fstatfs() - although attempting to map
a --filemap group on btrfs will fail anyway with the generic error
"Not on a device-mapper device" this is confusing; the file system
mount is on a device-mapper device, but btrfs' volume layer masks
this in the returned st_dev field since the returned logical file
extents may span multiple physical devices.
2016-07-08 14:34:41 +01:00
Bryn M. Reeves
5e06b33c51 libdm: enclose dm_stats_walk_do/while() body in do..while
The call to dm_stats_walk_start() before the do statement makes
dm_stats_walk_do() behave inconsistently depending on context;
wrap them in an additional do { } while (0) so that the macro
always expands to a valid statement.
2016-07-08 11:14:22 +01:00
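The effect of the extra do { } while (0) can be illustrated with a pair of stand-alone macros in the same spirit. Everything here (struct walker, walk_start(), walk_end(), walk_do(), walk_while()) is hypothetical and only mimics the shape of the dm_stats_walk_do()/dm_stats_walk_while() pair.

```c
/* Hypothetical cursor standing in for the stats handle. */
struct walker {
	int pos;
	int end;
};

static void walk_start(struct walker *w)
{
	w->pos = 0;
}

/* Advance the cursor; return non-zero once the walk is finished. */
static int walk_end(struct walker *w)
{
	return ++w->pos >= w->end;
}

/* The outer do { ... } while (0) makes the pair expand to exactly one
 * statement, so walk_start() stays attached to the loop it prepares. */
#define walk_do(w) \
do { \
	walk_start((w)); \
	do

#define walk_while(w) \
	while (!walk_end((w))); \
} while (0)

/* Use the pair as the unbraced body of an if statement. */
static int count_visits(struct walker *w, int enabled)
{
	int visits = 0;

	if (enabled)
		walk_do(w) {
			visits++;
		} walk_while(w);

	return visits;
}
```

Without the outer wrapper, only the walk_start() call would be governed by the if and the loop would run unconditionally; with it, a disabled walk leaves the cursor untouched.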
Bryn M. Reeves
f1dd0258f1 libdm: ensure flags constants have ULL suffix
The walk flags used by libdm-stats use the upper portion of a 64b
value: use the ULL suffix to ensure the compiler knows the expected
size.
2016-07-05 20:21:49 +01:00
Bryn M. Reeves
fef4832a85 libdm: clarify library's use of aux_data
Make it clear in libdevmapper.h, and in function argument names, that
libdm-stats uses the aux_data field internally and that any values set
for user_data are appended to the library values before being stored
with a region, and similarly, that internal data fields will be stripped
prior to returning any previously stored user_data.
2016-07-05 19:53:17 +01:00
Bryn M. Reeves
cda1622fef libdm: allow deleting regions with dm_stats_delete_group()
Add a flag to dm_stats_delete_group() to allow optional deletion
of all regions belonging to the group being removed.
2016-07-05 19:53:16 +01:00
Bryn M. Reeves
f1f2df7bc0 libdm: add stats group and region iterators and properties
Add support to dm_stats_walk*() to walk over the set of
available groups using the cursor embedded in the dm_stats
handle, and to obtain the type of the object at the current
stats cursor location. A set of flags is introduced to
control which objects are visited:

    DM_STATS_WALK_AREA
    DM_STATS_WALK_REGION
    DM_STATS_WALK_GROUP
    DM_STATS_WALK_ALL

A final flag suppresses visits to regions that contain only a
single area - since the aggregate of such a region is identical
to the area it contains this allows these duplicates to be
filtered out:

    DM_STATS_WALK_SKIP_SINGLE_AREA

If flags are not initialised before beginning a walk the default
set matches the behaviour of previous versions of the library.

Also accept group identifiers as immediate arguments to the
counter, metric, and property functions by adding control
flags to the region and area identifiers passed in.

Region and area properties are mapped to their equivalents for
the group (for example: group size is reported as the sum of
all regions contained in the group). Counter and metric values
are aggregated for the region or group.
2016-07-05 19:53:16 +01:00
Bryn M. Reeves
2cb9794da2 libdm: add statistics groups
Add a grouping facility to the libdm-stats library that allows the
user to bind several regions together as a group. Groups may be
used to aggregate data from several regions for reporting, or to
select and sort among large sets of regions.

A textual descriptor ("group tag") is associated with each group
and is stored in the first group member's aux_data field. The
tag contains the group member list and an optional alias for the
group, allowing the user to assign meaningful names to groups of
regions.

These descriptors are parsed in @stats_list message responses and
populate the resulting region and area tables with the group
structure.

Groups with overlapping regions are permitted but since this will
result in some events being counted more than once a warning is
printed in this case.

Nested and overlapping groups are not currently supported and
attempting to create these configurations results in error.
2016-07-05 19:53:16 +01:00
Bryn M. Reeves
82e5766062 libdm: add enum based counter and metric calls
Add a new enum based interface for accessing counter and metric
values that uses a single function for each:

uint64_t dm_stats_get_counter(const struct dm_stats *dms,
                              dm_stats_counter_t counter,
                              uint64_t region_id, uint64_t area_id);

int dm_stats_get_metric(const struct dm_stats *dms, int metric,
                        uint64_t region_id, uint64_t area_id,
                        double *value);

This simplifies the implementation of value aggregation for
groups of regions. The named function interface now calls the
enum interface internally so that all new functionality is
available regardless of the method used to retrieve values.
2016-07-05 19:53:16 +01:00
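The design described here, one enum-keyed accessor with the named functions layered on top, can be sketched as follows. The types and names (counter_t, struct area_counters, get_counter(), get_reads(), sum_counter()) are hypothetical simplifications and not the libdm definitions.

```c
#include <stdint.h>

/* Hypothetical counter selector; the real enum has many more members. */
typedef enum {
	CTR_READS,
	CTR_WRITES,
	CTR_NR
} counter_t;

/* One set of counters per area, indexed by counter_t. */
struct area_counters {
	uint64_t v[CTR_NR];
};

/* The single enum-based accessor. */
static uint64_t get_counter(const struct area_counters *areas,
			    counter_t counter, uint64_t area_id)
{
	return areas[area_id].v[counter];
}

/* A named accessor becomes a thin wrapper around the enum interface. */
static uint64_t get_reads(const struct area_counters *areas, uint64_t area_id)
{
	return get_counter(areas, CTR_READS, area_id);
}

/* Aggregation needs only one loop, whichever counter is requested. */
static uint64_t sum_counter(const struct area_counters *areas,
			    uint64_t nr_areas, counter_t counter)
{
	uint64_t sum = 0;

	for (uint64_t i = 0; i < nr_areas; i++)
		sum += get_counter(areas, counter, i);
	return sum;
}
```

Because every named accessor funnels through the same function, aggregation logic added there is available to both styles of caller.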
Bryn M. Reeves
81fad9e853 libdm: add dm_bitset_parse_list()
Add a function to parse a list of integer values and ranges into
a dm_bitset representation. Individual values signify that that bit
is set in the resulting mask and ranges are given as a pair of
start and end values, M-N, such that M and N are the first and
last members of the range (inclusive).

The implementation is based on the kernel's __bitmap_parselist()
that is used for cpumasks and other set configuration passed in
string form from user space.
2016-07-05 19:53:16 +01:00
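A list such as "1,3-5" could be parsed along these lines. This is a simplified sketch that fills a single 64-bit mask instead of a dm_bitset; parse_bit_list() is a hypothetical name, and the strict error handling (no whitespace, no trailing comma) is an assumption rather than the library's behaviour.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a list of values and inclusive M-N ranges into a 64-bit mask.
 * Returns 0 on success and -1 on malformed input. */
static int parse_bit_list(const char *s, uint64_t *mask)
{
	*mask = 0;

	while (*s) {
		char *end;
		unsigned long first, last;

		if (!isdigit((unsigned char) *s))
			return -1;
		first = last = strtoul(s, &end, 10);

		if (*end == '-') {
			/* Range: parse the end value. */
			s = end + 1;
			if (!isdigit((unsigned char) *s))
				return -1;
			last = strtoul(s, &end, 10);
		}

		if (first > last || last > 63)
			return -1;

		for (unsigned long b = first; b <= last; b++)
			*mask |= UINT64_C(1) << b;

		if (*end == ',') {
			end++;
			if (!*end)
				return -1;	/* trailing comma */
		} else if (*end)
			return -1;	/* unexpected character */

		s = end;
	}

	return 0;
}
```

For the input "1,3-5" the resulting mask has bits 1, 3, 4 and 5 set.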
Peter Rajnoha
2078b842fb libdm: report: add dm_report_set_selection
Since we can do repeated dm_report_output calls now, we would also like
to be able to set selection for each of these outputs.
2016-06-20 11:33:43 +02:00
Peter Rajnoha
f2facdc1d0 libdm: report: add DM_REPORT_OUTPUT_MULTIPLE_TIMES report flag to keep report data even after output is done
The DM_REPORT_OUTPUT_MULTIPLE_TIMES instructs reporting code to
keep rows even after dm_report_output call - the rows are not
destroyed in this case which makes it possible to call dm_report_output
multiple times.
2016-06-20 11:33:43 +02:00
Peter Rajnoha
094fce3776 libdm: report: implement DM_REPORT_GROUP_JSON for JSON report output
This patch introduces DM_REPORT_GROUP_JSON report group type. When using
this group type and when pushing a report to such a group, these flags
are automatically unset:

   DM_REPORT_OUTPUT_ALIGNED
   DM_REPORT_OUTPUT_HEADINGS
   DM_REPORT_OUTPUT_COLUMNS_AS_ROWS

...and this flag is set:

   DM_REPORT_OUTPUT_BUFFERED

The whole group is encapsulated in { } for the outermost JSON object
and then each report is reported on output as array of objects where
each object is the row from report:

  {
     "report_name1": [
         {"field1": "value", "field2": "value", ...},
         {"field1": "value", "field2": "value", ...}
         ...
     ],
     "report_name2": [
         {"field1": "value", "field2": "value", ...},
         {"field1": "value", "field2": "value", ...}
         ...
     ]
     ...
  }
2016-06-20 10:42:26 +02:00
Peter Rajnoha
230b7ff0f6 libdm: report: implement DM_REPORT_GROUP_BASIC for extended report output
This patch introduces DM_REPORT_GROUP_BASIC report group type. This
type has exactly the classical output format as we know from before
the introduction of report groups. However, in addition to that, it allows
putting several reports into a group - this is the very basic grouping
scheme that doesn't change the output format itself:

  Report: report1_name
  Header1  Header2 ...
  value    value   ...
  value    value   ...
  ...      ...     ...

  Report: report2_name
  Header1  Header2 ...
  value    value   ...
  value    value   ...
  ...      ...     ...
2016-06-20 10:42:26 +02:00
Peter Rajnoha
9c8f912ea7 libdm: report: introduce dm_report_group
This patch introduces DM report group (represented by dm_report_group
structure) that is used to group several reports to make a whole. As a
whole, all the reports in the group follow the same settings and/or
formatting used on output and it controls that the output is properly
ordered (e.g. the output from different reports is not interleaved
which would break readability and/or syntax of target output format
used for the whole group).

To support this feature, there are 4 new functions:
  - dm_report_group_create
  - dm_report_group_push
  - dm_report_group_pop
  - dm_report_group_destroy

From the naming used (dm_report_group_push/pop), it's clear the reports
are pushed onto a stack. The rule then is that only the report on top
of the stack can be reported (that means calling dm_report_output).
This way we make sure that the output is not interleaved and provides
determinism and control over the output.

Different formats may allow or disallow some of the existing report
flags controlling output itself (DM_REPORT_OUTPUT_*) to be set or not so
once the report is pushed to a group, the grouping code makes sure that
all the reports have compatible flags set and then these flags are
restored once each report is popped from the report group stack.

We also allow pushing/popping a non-report item, in which case such an
item creates a structure (e.g. to put several reports together with any
opening and/or closing lines needed on output, which act as extra
formatting structure besides the formatting of the reports themselves).

The dm_report_group_push function accepts an argument to pass any
format-specific data needed (e.g. handle, name, structures passed
along while working with reports...).

We can call dm_report_output directly anytime we need (with the only
restriction that we can call dm_report_output only for the report that
is currently on top of the group's stack). Or we don't need to call
dm_report_output explicitly in which case all the reports in a stack are
reported on output automatically once we call dm_report_group_destroy.
2016-06-20 09:26:51 +02:00
Alasdair G Kergon
16019b518e libdm: Add dm_udev_wait_immediate.
dm_udev_wait() waits inside the library.
dm_udev_wait_immediate allows the caller to do other things if the
cookie isn't yet ready to be decremented.
2016-04-28 00:54:27 +01:00
Alasdair G Kergon
a5d53aec83 libdm: Raid status region units are sectors 2016-03-24 17:42:36 +00:00
Zdenek Kabelac
29d1533a49 libdm: parse more info from cache status
Parse Fail/Error/need_check/ro status info from cache.
2016-03-10 18:38:53 +01:00
Zdenek Kabelac
0fb3669d49 libdm: thin status update
Fix parsing of 'Fail' status (using capital letter) for thin-pool.
Add also parsing of 'Error' state for thin-pool.
Add needs_check test for thin-pool.

Detect Fail state for thin.
2016-02-18 16:45:42 +01:00
Zdenek Kabelac
fcbef05aae doc: change fsf address
Hmm, rpmlint suggests fsf is using a different address these days,
so let's keep it up-to-date
2016-01-21 12:11:37 +01:00
Zdenek Kabelac
45781161f4 libdm: add some doc for mirror status
Comment content of struct for mirror status.
2015-12-04 22:10:30 +01:00
Zdenek Kabelac
fa87979004 libdm: introduce dm_get_status_mirror
Add missing function to parse mirror status.
2015-12-01 13:00:43 +01:00
Zdenek Kabelac
d582be43d4 libdm: const raid params and error for unsupported type
Accept const struct with raid params (No API change).
Also add extra error message when raid type is unsupported.
2015-11-26 09:27:04 +01:00
David Teigland
931fede81b hash: change name of new lookup function 2015-11-17 11:59:44 -06:00
David Teigland
485d2ca945 lvmetad: different style for hash functions
In lookup, return a count of entries with the
same key rather than the value from a second
entry with the same key.

Using some slightly different names.
2015-11-17 10:27:16 -06:00
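The lookup contract described here (return the first matching value together with a count of entries sharing the key) can be sketched with a flat table. A linear scan stands in for the hash table, and struct entry and lookup_with_count() are hypothetical names, not the libdm hash API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical key/value entry; duplicate keys are allowed. */
struct entry {
	const char *key;
	const char *val;
};

/* Return the value of the first entry matching 'key' (NULL if none)
 * and store the number of entries with that key in '*count'. */
static const char *lookup_with_count(const struct entry *tab, size_t n,
				     const char *key, int *count)
{
	const char *first = NULL;

	*count = 0;
	for (size_t i = 0; i < n; i++) {
		if (strcmp(tab[i].key, key))
			continue;
		if (!first)
			first = tab[i].val;
		(*count)++;
	}

	return first;
}
```

A caller can treat a count greater than one as an ambiguous name and ask the user for a unique identifier instead.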
David Teigland
920a281994 hash: add comment about multiple values 2015-11-16 11:02:25 -06:00
David Teigland
d9295410e9 lvmetad: change the new hash to take data len
If the data len is passed into the hash table
and saved there, then the hash table internals
do not need to assume that the data value is
a string at any point.
2015-11-13 16:54:22 -06:00
David Teigland
46193f4a59 lvmetad: handle duplicate VG names
New hash table functions are added that allow for
multiple entries with the same key.  Use of the
vgname_to_vgid hash table is converted to these
new functions since there are multiple entries
in vgname_to_vgid that have the same key (vgname).

When multiple VGs with the same name exist, commands
that reference only a VG name will fail saying the
VG could not be found (that error message could be
improved.)  Any command that works with the select
option can access one of the VGs with -S vg_uuid=X.
vgrename is a special case that allows the first VG
name arg to be replaced by a uuid, which also works.

(The existing hash table implementation is not well
suited for handling this case, but it works ok with
the new extensions.  Changing lvmetad to use its own
custom hash tables may be preferable at some point.)
2015-11-13 14:56:35 -06:00
Zdenek Kabelac
9ef820a2a5 libdm: dm_tree_node_size_changed recognizes reduction
Add more functionality to the size_changed function.
While the existing API only returned 0 for
unchanged and non-zero for changed,
the improved API also detects whether the
size has only grown or whether there was a
size reduction.

The function works on the whole dm-tree, so it returns:
0 if no size has changed,
1 if there were only size extensions,
and -1 if there was any size reduction.

This result can be used for better evaluation
whether we need to flush before suspend.
2015-10-25 21:05:15 +01:00
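The three-way result can be sketched over a flat list of nodes. This is an illustration only: struct node_size and tree_size_changed() are hypothetical names, and the real function walks a dm-tree rather than an array.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-node record of the size before and after a change. */
struct node_size {
	uint64_t old_size;
	uint64_t new_size;
};

/* Return 0 if no node changed size, 1 if sizes only grew,
 * and -1 as soon as any node is found to have shrunk. */
static int tree_size_changed(const struct node_size *nodes, size_t n)
{
	int r = 0;

	for (size_t i = 0; i < n; i++) {
		if (nodes[i].new_size < nodes[i].old_size)
			return -1;
		if (nodes[i].new_size > nodes[i].old_size)
			r = 1;
	}

	return r;
}
```

A reduction anywhere in the set takes precedence over extensions elsewhere, which matches the stated use of the result for deciding whether a flush is needed before suspend.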
Zdenek Kabelac
09a62cca0c libdm: add dm_hold_control_dev
Support holding the control device open.
Useful for daemons so the control device is not frequently reopened.
2015-10-22 22:27:31 +02:00
Peter Rajnoha
c3bfe07f2a config: add report/compact_output_cols to control which columns to compact in report output
The new report/compact_output_cols setting has exactly the same effect
as report/compact_output setting. The difference is that with the new
setting it's possible to define which cols should be compacted exactly
in contrast to all cols in case of report/compact_output.

In case both compact_output and compact_output_cols are enabled/set,
the compact_output prevails.

For example:

$ lvmconfig --type full report/compact_output report/compact_output_cols
compact_output=0
compact_output_cols=""

$ lvs vg
  LV    VG   Attr       LSize Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lvol0 vg   -wi-a----- 4.00m

---

$ lvmconfig --type full report/compact_output report/compact_output_cols
compact_output=0
compact_output_cols="data_percent,metadata_percent,pool_lv,move_pv,origin"

$ lvs vg
  LV    VG   Attr       LSize Log Cpy%Sync Convert
  lvol0 vg   -wi-a----- 4.00m

---

$ lvmconfig --type full report/compact_output report/compact_output_cols
compact_output=1
compact_output_cols="data_percent,metadata_percent,pool_lv,move_pv,origin"

$ lvs vg
  LV    VG   Attr       LSize
  lvol0 vg   -wi-a----- 4.00m
2015-10-16 17:05:54 +02:00
Peter Rajnoha
508f0f5a21 libdm: add dm_report_compact_given_fields
dm_report_compact_given_fields is the same as dm_report_compact_fields,
but it processes only given fields, not all the fields in the report
like dm_report_compact_fields does.
2015-10-16 17:05:54 +02:00
Zdenek Kabelac
e0d915a873 libdm: parse Overflow string from snapshot status
This is likely to be a new 'info' provided by kernel
snapshot target.
For now just parse this string.
2015-09-18 17:45:45 +02:00
Bryn M. Reeves
f09e4f7b10 libdm: allow formatting histogram strings with no whitespace
Allow dm_histogram_to_string() to format histogram strings with
no whitespace by passing a width value less than zero.
2015-09-03 22:04:10 +01:00
Bryn M. Reeves
a0cf3d47f1 libdm: add latency histogram support
Add support for creating, parsing, and reporting dm-stats latency
histograms on kernels that support precise_timestamps.

Histograms are specified as a series of time values that give the
boundaries of the bins into which I/O counts accumulate (with
implicit lower and upper bounds on the first and last bins).

A new type, struct dm_histogram, is introduced to represent
histogram values and bin boundaries.

The boundary values may be given as either a string of values (with
optional unit suffixes) or as a zero terminated array of uint64_t
values expressing boundary times in nanoseconds.

A new bounds argument is added to dm_stats_create_region() which
accepts a pointer to a struct dm_histogram initialised with bounds
values.

Histogram data associated with a region is parsed during a call to
dm_stats_populate() and used to build a table of histogram values
that are pointed to from the containing area's counter set. The
histogram for a specified area may then be obtained and interrogated
for values and properties.

This relies on kernel support to provide the boundary values in
a @stats_list response: this will be present in 4.3 and 4.2-stable. A
check for a minimum driver version of 4.33.0 is implemented to ensure
that this is present (4.32.0 has the necessary precise_timestamps and
histogram features but is unable to report these via @stats_list).

Access methods are provided to retrieve histogram values and bounds
as well as simple string representations of the counts and bin
boundaries.  Methods are also available to return the total count
for a histogram and the relative value (as a dm_percent_t) of a
specified bin.
2015-09-02 20:48:59 +01:00
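How a zero-terminated array of boundaries maps latencies to bins can be sketched as below. bounds_count() and histogram_bin() are hypothetical helpers, not part of the library, and treating each boundary as the inclusive lower edge of the next bin is an assumption made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the boundaries in a zero-terminated array of nanosecond values. */
static size_t bounds_count(const uint64_t *bounds)
{
	size_t n = 0;

	while (bounds[n])
		n++;
	return n;
}

/* Return the bin index for a latency. With N boundaries there are N + 1
 * bins: bin 0 lies below the first boundary and bin N at or above the
 * last one (the implicit lower and upper bounds). */
static size_t histogram_bin(const uint64_t *bounds, uint64_t latency_ns)
{
	size_t i = 0;

	while (bounds[i] && latency_ns >= bounds[i])
		i++;
	return i;
}
```

With boundaries at 1ms and 2ms, a 0.5ms I/O falls in bin 0, a 1ms I/O in bin 1, and a 5ms I/O in bin 2.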
Bryn M. Reeves
567189cc76 libdm: add per region precise timestamps property methods 2015-08-24 20:03:21 +01:00