1
0
mirror of git://sourceware.org/git/lvm2.git synced 2024-12-22 17:35:59 +03:00
Commit Graph

520 Commits

Author SHA1 Message Date
David Teigland
57bb46c5e7 filter: use bcache for filter reads
Filters are still applied before any device reading or
the label scan, but any filter checks that want to read
the device are skipped and the device is flagged.

After bcache is populated, but before lvm looks for
devices (i.e. before label scan), the filters are
reapplied to the devices that were flagged above.
The filters will then find the data they need in
bcache.
2018-05-10 16:03:19 -05:00
Joe Thornber
ae50374811 bcache: Add sync io engine
Something to fall back to when testing.
2018-05-10 14:29:26 +01:00
Joe Thornber
67b80e2d9d bcache: knock out err param.
Dave used this for debugging.  Not needed in general.
2018-05-10 13:26:08 +01:00
Joe Thornber
1c5c99afce bcache-utils: bcache_set_bytes() 2018-05-09 11:05:29 +01:00
Joe Thornber
dfc320f5b8 bcache-utils: rewrite
They take care to avoid redundant reads now.
2018-05-03 11:36:29 +01:00
Joe Thornber
2688aafefb bcache: rename bcache_write_zeroes() -> bcache_zero_bytes()
Now matches the other util functions:

bcache_{prefetch,read,write,zero}_bytes()
2018-05-03 10:21:14 +01:00
Joe Thornber
8b755f1e04 bcache: rewrite bcache_write_zeros()
It now uses GF_ZERO to avoid reading blocks that are going to be
completely zeroed.
2018-05-03 10:14:56 +01:00
Joe Thornber
dc30d4b2f2 bcache: switch off_t -> uint64_t
We always want it to be 64bit
2018-05-03 09:37:43 +01:00
Joe Thornber
efad84ebc2 bcache: Move the utils to a separate file.
This makes it clearer that they don't access the cache internals.
2018-05-03 09:34:41 +01:00
Joe Thornber
b3c41bce3d bcache: add bcache_block_sectors() query fn 2018-05-03 09:33:55 +01:00
Joe Thornber
65912ce44d bcache: add a comment 2018-05-03 09:21:10 +01:00
Joe Thornber
90d0ff6636 bcache: reorder includes in .c file too 2018-05-02 19:45:06 +01:00
Joe Thornber
8fd300f7df device/bcache: reorder includes 2018-05-02 18:59:43 +01:00
David Teigland
24e7745d7a devices: ignore lvm1 and pool devices 2018-05-01 15:18:47 -05:00
Joe Thornber
bfc61a9543 bcache: squash some warnings on rhel6 2018-05-01 13:21:53 +01:00
Joe Thornber
f564e78d98 bcache: rewrite bcache_{write,zero}_bytes
These are utility functions so should only use the public interface.

Also write_bytes was flushing, which will kill performance.
2018-05-01 12:07:33 +01:00
Joe Thornber
e890c37704 [bcache] Some work on bcache_invalidate()
bcache_invalidate() now returns a bool to indicate success.  If fails
if the block is currently held, or the block is dirty and writeback
fails.

Added a bunch of unit tests for the invalidate functions.

Fixed some bugs to do with invalidating errored blocks.
2018-04-27 10:56:13 +01:00
Joe Thornber
1c97fda425 [bcache] get all unit tests passing again 2018-04-26 13:13:27 +01:00
Zdenek Kabelac
fcdac700f9 gcc: remove duplicate typedef 2018-04-23 22:42:18 +02:00
David Teigland
c0973e70a5 dev_cache: clean up scan
Pull out all of the twisted logic and simply call dev_cache_scan
at the start of the command prior to label scan.
2018-04-20 11:22:48 -05:00
David Teigland
6d05859862 bcache: let caller see an error 2018-04-20 11:22:48 -05:00
David Teigland
570c6239ee bcache: fix error handling
The error handling code wasn't working, but it
appears that just removing it is what we need.
The doesn't really need any different behavior
related to bcache blocks on an io error, it just
wants to know if there was an error.
2018-04-20 11:22:47 -05:00
David Teigland
4331182964 bcache: add some error messages for debugging 2018-04-20 11:22:47 -05:00
David Teigland
e49b114f7e bcache: use wrappers for bcache read write in lvm
Using a wrapper makes it easier to disable bcache if needed.
2018-04-20 11:22:47 -05:00
David Teigland
8065492046 bcache: do all writes through bcache 2018-04-20 11:22:47 -05:00
David Teigland
8b26a007b1 misc bcache fixes from ejt 2018-04-20 11:22:47 -05:00
David Teigland
6c67c7557c scan: use separate fd for bcache
Create a new dev->bcache_fd that the scanning code owns
and is in charge of opening/closing.  This prevents other
parts of lvm code (which do various open/close) from
interfering with the bcache fd.  A number of dev_open
and dev_close are removed from the reading path since
the read path now uses the bcache.

With that in place, open(O_EXCL) for pvcreate/pvremove
can then be fixed.  That wouldn't work previously because
of other open fds.
2018-04-20 11:22:46 -05:00
David Teigland
a7cb76ae94 scan: use bcache for label scan and vg read
New label_scan function populates bcache for each device
on the system.

The two read paths are updated to get data from bcache.

The bcache is not yet used for writing.  bcache blocks
for a device are invalidated when the device is written.
2018-04-20 11:19:24 -05:00
David Teigland
93fc937429 [device/bcache] bcache_read_bytes should put blocks 2018-04-20 11:12:50 -05:00
David Teigland
7be54bd687 [device/bcache] fix min() function 2018-04-20 11:12:50 -05:00
David Teigland
d9e6298edb [device/bcache] fix missing max_io fn in bcache async engine 2018-04-20 11:12:50 -05:00
Joe Thornber
dc8034f5eb [device/bcache] more work on bcache 2018-04-20 11:12:50 -05:00
Joe Thornber
6a57ed17a2 [device/bcache] add bcache_prefetch_bytes() and bcache_read_bytes()
Not tested yet.
2018-04-20 11:12:50 -05:00
Joe Thornber
467adfa082 [device/bcache] More tests and some bug fixes 2018-04-20 11:12:50 -05:00
Joe Thornber
19647d1cd4 [device/bcache] fix bug in _alloc_block 2018-04-20 11:12:50 -05:00
Joe Thornber
1563b93691 [device/bcache] Add bcache_max_prefetches()
Ignore prefetches if max io is in flight.
2018-04-20 11:12:50 -05:00
Joe Thornber
c4c4acfd42 [device/bcache] Add a couple of invalidate methods 2018-04-20 11:12:50 -05:00
Joe Thornber
0f0eb04edb [device/bcache] some more work on bcache 2018-04-20 11:12:50 -05:00
Joe Thornber
46867a45d2 [device/bcache] stub a unit test 2018-04-20 11:12:50 -05:00
Joe Thornber
da7e13ef88 [lib/device/bcache] Tweaks after Kabi's review 2018-04-20 11:10:45 -05:00
Joe Thornber
acb42ec465 [device/bcache] Initial code drop.
Compiles.  Not written tests yet.
2018-04-20 11:10:45 -05:00
Joe Thornber
00f1b208a1 [io paths] Unpick agk's aio stuff 2018-04-20 11:03:58 -05:00
Zdenek Kabelac
e878c3fc32 cleanup: correct casting 2018-04-20 12:17:01 +02:00
Zdenek Kabelac
f2d0eefa77 coverity: make use of defined variable
Since we declare 'r', let's use the value for something.
2018-03-17 23:33:58 +01:00
Zdenek Kabelac
285413b502 cleanup: missing dots and indent 2018-03-15 11:01:04 +01:00
Zdenek Kabelac
70ad633638 devcache: add reason and always log_error
With these read errors it's useful to know the reason.
Also avoid to log error just once so we know exactly
how many times we did failing read.

On the other hand reduce repeated log_error() on code 'backtrace'
path and change severity of message to just log_debug() so the
actual read error is printed once for one read.
2018-03-15 10:50:28 +01:00
Zdenek Kabelac
b6e7a0b490 cleanup: more usage of dm_strncpy
Use existing wrapper function arournd  strncpy + buf[] = 0;
2018-03-06 15:40:34 +01:00
Zdenek Kabelac
6b48868cf0 io: keep 64b arithmetic
Widen to 64b arithmetic from start.
2018-02-28 21:05:18 +01:00
Alasdair G Kergon
d6cabbbc53 device: Fix basic async I/O error handling 2018-02-08 20:19:21 +00:00
Alasdair G Kergon
3e29c80122 device: Queue any aio beyond defined limits. 2018-02-08 20:15:37 +00:00
Alasdair G Kergon
db41fe6c5d lvmcache: Use asynchronous I/O when scanning devices. 2018-02-08 20:15:29 +00:00
Alasdair G Kergon
8c7bbcfb0f device: Basic config and setup to support async I/O. 2018-02-08 20:15:14 +00:00
Alasdair G Kergon
7a9af3cd0e device: Add flag to indicate that a code path can support AIO
Until the whole source supports AIO, library code can check for
AIO_SUPPORTED_CODE_PATH to determine whether or not it is OK
to use AIO.
2018-02-06 01:11:00 +00:00
Alasdair G Kergon
e869a52cc4 callbacks: Miscellaneous fixes for recent changes 2018-02-06 01:09:39 +00:00
Zdenek Kabelac
a1cfef9f26 dev_io: fix writes for unaligned buffers
Actually the removed code is necessary - since not all writes are
getting alligned buffer - older compilers seems to be not able
to create 4K aligned buffers on stack - this the aligning code still
need to be present for write path.
2018-01-23 13:36:12 +01:00
Zdenek Kabelac
6e9148e7ab debug: drop DEBUG_MEM path
Memory is not allocated so no DEBUG_MEM part is needed.
2018-01-23 11:45:18 +01:00
Alasdair G Kergon
9194610f42 device: Add ioflags parameter to transfer additional state.
Flags are set on the initial I/O and passed to any callbacks that
may in turn issue further I/O using the inherited flags.
2018-01-21 21:10:23 +00:00
Alasdair G Kergon
c26458339e device: Move buffer allocation nearer to the I/O.
Don't allocate memory until it's needed - later we'll add
some of the I/O to an internal queue instead of issuing it
immediately.
2018-01-16 01:12:08 +00:00
Alasdair G Kergon
081902b4c1 device: Merge _dev_read and dev_read_callback. 2018-01-16 00:41:42 +00:00
Alasdair G Kergon
b825987b2f device: Rearrange _aligned_io(). 2018-01-15 20:10:54 +00:00
Alasdair G Kergon
c90582344d device: Add reason to devbuf. 2018-01-15 19:38:18 +00:00
Alasdair G Kergon
1f01eaa612 device: Store offset to data instead of pointer.
We want to save the relative offset before we've allocated the
buffer's memory.
2018-01-15 19:32:59 +00:00
Alasdair G Kergon
61d3296f2a device: Reorder device.h before change. 2018-01-15 19:24:01 +00:00
Alasdair G Kergon
6210c1ec28 device: Mark read-only device buffers const. 2018-01-10 19:57:10 +00:00
Alasdair G Kergon
c350f96c09 device: Eliminate unnecessary buffer from dev_read. 2018-01-10 18:48:01 +00:00
Alasdair G Kergon
366493a1d1 device: Suppress repeated reads of the same data.
If the data being requested is present in last_[extra_]devbuf,
return that directly instead of reading it from disk again.

Typical LVM2 access patterns request data within two adjacent 4k blocks
so we eliminate some read() system calls by always reading at least 8k.
2018-01-10 15:52:03 +00:00
Alasdair G Kergon
dcb2a5a611 device: Remove some data copying between buffers.
Callers that read larger amounts of data now get a pointer to read-only
data directly without copying it through an intermediate buffer.  This
data is owned by the device layer so the callers no longer free it.
2018-01-10 15:48:03 +00:00
Alasdair G Kergon
4d568b709c device: Free cached device bufs when metadata invalid or dev closed. 2018-01-10 15:48:03 +00:00
Alasdair G Kergon
bd0967a4b1 device: Keep the last data buffer read off each device.
If there's a second metadata area on device, we record that separately.

Note that the memory requirements aren't restricted yet.
2018-01-10 15:48:03 +00:00
Alasdair G Kergon
ea96381534 libdm: Introduce dm_malloc_aligned 2018-01-10 15:48:03 +00:00
Alasdair G Kergon
f4675af4cf format_text: Use vgsummary callbacks 2018-01-09 03:14:30 +00:00
Alasdair G Kergon
5e7d3ad749 device: Introduce dev_read_callback
If it obtains the data, it passes it into the supplied callback function
and returns 1.  Otherwise the callback receives failed = 1.

Updated config_file_read_fd to use this and similarly return the data
via a callback fn of its own.
2018-01-06 02:40:12 +00:00
Alasdair G Kergon
946f07af3e metadata: Use a consistent format for callback fn parameters 2018-01-05 14:24:56 +00:00
Alasdair G Kergon
22b6c482ec config: Split config buffer processing into new fn.
Wrap its parameters into struct process_config_file_params allocated
from a mempool now passed into the config_file_read* fns.
2018-01-02 21:10:46 +00:00
Alasdair G Kergon
17649d4ac8 device: Move dev_read memory allocation into device layer.
Rename dev_read() to dev_read_buf() - the function that reads data
into a supplied buffer.

Introduce a new dev_read() that allocates the buffer it returns and
switch the important users over to this.  No caller may change the
returned data.  (For now, callers are responsible for freeing it after
use, but later the device layer will take full ownership.)

dev_read_buf() should only be used for tiny buffers or unimportant code
(such as the old disk formats).
2017-12-19 01:31:50 +00:00
Alasdair G Kergon
5f45cb90a7 format_text: Transfer circular buf alloc to device layer.
Instead of the caller passing dev_read_circular() a buffer to fill with
data, the device layer itself now allocates it.
2017-12-15 22:34:26 +00:00
Alasdair G Kergon
beee9940a5 format_text: Separate out code paths for buffer wraparound
The creation of wrapped around metadata - where the start of metadata is
written up to the end of the buffer and the remainder follows back at
the start of the buffer - is now restricted to cases where writing the
metadata in one piece wouldn't fit.  This shouldn't happen in 'normal'
usage so let's begin treating the code for this as a special case that
can be ignored when optimising 'normal' cases.
2017-12-15 21:12:19 +00:00
Alasdair G Kergon
e932c5da50 device: Fix an unpaired device close.
dev_open_flags contains an unpaired dev_close_immediate so increment
open_count before calling it.
2017-12-12 17:56:58 +00:00
Alasdair G Kergon
c5ef76bf27 device: Internal error if writing 0 bytes to dev. 2017-12-12 12:57:25 +00:00
Alasdair G Kergon
d591d04103 device: Tag I/O for each mda on a device separately in log messages.
Mark the first metadata area on each text format PV as MDA_PRIMARY.
Pass this information down to the device layer so that when
there are two metadata areas on a block device, we can easily
distinguish two independent streams of I/O.
2017-12-07 03:48:11 +00:00
Alasdair G Kergon
7195df5aca device: Skip read-modify-write if replacing whole block. 2017-12-05 01:00:38 +00:00
Alasdair G Kergon
e4805e4883 device: categorise block i/o
Introduce enum dev_io_reason to categorise block device I/O
in debug messages so it's obvious what it is for.

DEV_IO_SIGNATURES   /* Scanning device signatures */
DEV_IO_LABEL        /* LVM PV disk label */
DEV_IO_MDA_HEADER   /* Text format metadata area header */
DEV_IO_MDA_CONTENT  /* Text format metadata area content */
DEV_IO_FMT1         /* Original LVM1 metadata format */
DEV_IO_POOL         /* Pool metadata format */
DEV_IO_LV           /* Content written to an LV */
DEV_IO_LOG          /* Logging messages */
2017-12-04 23:45:26 +00:00
Alasdair G Kergon
115e66e9be device: log debug when I/O bounce buffer used 2017-11-16 19:16:10 +00:00
Alasdair G Kergon
02e9876665 log: Add io debug class 2017-11-15 01:02:15 +00:00
Alasdair G Kergon
6bf0f04ae2 log: Improve various device-related messages
- Use 'lvmcache' consistently instead of 'metadata cache'
- Always use 5 characters for source line number
- Remember to convert uuids into printable form
- Use <no name> rather than (null) when VG has no name.
2017-11-13 19:45:33 +00:00
Zdenek Kabelac
e9206fb93d devcache: track more udev errors
Add a bit more details for failing udev function.
2017-10-30 13:16:50 +01:00
Alasdair G Kergon
f1cc5b12fd tidy: Add missing underscores to statics. 2017-10-18 15:58:13 +01:00
Alasdair G Kergon
146745ad88 device: Separate errors for dev not found and filtered.
Replaced the confusing device error message "not found (or ignored by
filtering)" by either "not found" or "excluded by a filter".
(Later we should be able to say which filter.)

Left the the liblvm code paths alone.
2017-10-17 02:12:41 +01:00
Zdenek Kabelac
48ce8c7a49 tidy: drop unneeded cast
Avoid casting to the same type.
2017-07-20 11:20:44 +02:00
Zdenek Kabelac
0bf836aa14 tidy: prefer not using else after return
clang-tidy: avoid using  'else' after return - give more readable code,
and also saves indention level.
2017-07-20 11:18:29 +02:00
Zdenek Kabelac
0d0a3397c2 cleanup: add braces in macro 2017-07-20 11:18:29 +02:00
Zdenek Kabelac
767a5e1281 dev-cache: avoid hashing same data again
Before hashing device again with path, check if it's not already hashed.

TODO: maybe bigger chunk of executed code might be actually skipped.
2017-07-17 12:33:17 +02:00
Alasdair G Kergon
6a20b22151 devices: Recognise Veritas Dynamic Multipathing
VxDMP doesn't interact very well with udev so always set
  devices/obtain_device_list_from_udev = 0
in lvm.conf on these systems.
2017-01-10 22:23:23 +00:00
Zdenek Kabelac
1d58074d9f debug: more stacktrace corrections
Continue previous patch dropping some unneeded stack traces
after printed log_error/warn messages.
2016-11-25 14:58:28 +01:00
Peter Rajnoha
c8a14a29cd dev-type: check for DEVLINKS udev db variable existence if udev_device_get_is_initialized fn is not present
Older udev versions (udev < v165), don't have the official
udev_device_get_is_initialized function available to query for
device initialization state in udev database. Also, devices don't
have USEC_INITIALIZED udev db variable set - this is bound to the
udev_device_get_is_initialized fn functionality.

In this case, check for "DEVLINKS" variable instead - all block devices
have at least one symlink set for the node (the "/dev/block/<major:minor>".
This symlink is set by default basic udev rules provided by udev directly.
We'll use this as an alternative for the check that initial udev
processing for a device has already finished.
2016-09-06 13:21:29 +02:00
Peter Rajnoha
d7b282c601 dev-type: use more appropriate messages in udev_dev_is_mpath_component and use 10s timeout 2016-09-05 14:37:13 +02:00
Peter Rajnoha
5d323c37f3 refactor: move and rename _dev_is_mpath_component in lvmetad.c to udev_dev_is_mpath_component in dev-type.c 2016-09-05 12:55:25 +02:00
Zdenek Kabelac
d37a26b680 devices: handle partscan loop devices
Treat loop device created with 'losetup -P' as regular
partitioned device - so if it has partition table,
prevent its usage in commands like 'pvcreate'.

Before 'pvcreate /dev/loop0' could have erased and formated as PV,
after this patch, device is filtered out and cannot be used.
2016-06-01 17:37:47 +02:00
Alasdair G Kergon
b5314c2a6a device: Retry open without O_NOATIME if it fails. 2016-05-12 01:05:52 +01:00
Zdenek Kabelac
447daa9179 coverity: use wider type for whole expression
Coverity likes when the types are same through the whole expression.
And since dev_t is 64b - widen int type early.
2016-04-22 01:12:34 +02:00
Zdenek Kabelac
cbf99be43a coverity: fix memory access
Commit  52e0d0db44 introduced regression
as code may access   buf[0 - 1].

Reorder code to first remove '\n' and then check buffer size for
empty.
2016-04-22 01:11:57 +02:00
Zdenek Kabelac
556eba1835 cleanup: use kdev_t header in lvm tree
Reuse libdm header in lvm so we have single definition
of MAJOR/MINOR/MKDEV macros in use.
2016-04-22 00:23:28 +02:00
Zdenek Kabelac
0f7975cb35 cleanup: avoid declaring var in the middle of code
Easier to read code.
2016-04-12 11:47:51 +02:00
Zdenek Kabelac
fe65a86cbc devcache: do not insert devices without device node
When not obtaining device from udev, we are doing deep devdir scan,
and at the same time we try to insert everything what /sys/dev/block
knows about. However in case  lvm2 is configured to use nonstardard
devdir this way it will see (and scan) devices from a real system.

lvm2 test suite is using its own test devdir with its
own device nodes. To avoid touching real /dev  devices, validate
the device node exist in give dir and do not insert such device
into a cache.

With obtain list from udev this patch has no effect
(the normal user path).
2016-04-11 10:33:14 +02:00
Zdenek Kabelac
07a60b59f7 devcache: index devices also without udev
We have _insert_dirs() for  udev and non-udev compilation.
Compiling without udev missed to call dev_cache_index_devs().
Move the call after _insert_dirs() call so both compilation
gets it.
2016-04-11 10:32:19 +02:00
Peter Rajnoha
42f04a0f77 dev-cache: skip VGID/LVID indexing if /sys/dev/block is not present
/sys/dev/block is available since kernel version 2.2.26 (~ 2008):
https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-dev

The VGID/LVID indexing code relies on this feature so skip indexing
if it's not available to avoid error messages about inability to open
/sys/dev/block directory.

We're not going to provide fallback code to read the /sys/block/
instead in this case as that's not that efficient - it needs extra
reads for getting major:minor and reading partitions would also
pose further reads and that's not worth it.
2016-04-01 17:09:15 +02:00
Peter Rajnoha
15d1824fac dev-cache: iterate devices in sysfs for VGID/LVID index if obtain_device_list_from_udev=0
If obtain_device_list_from_udev=0, LVM can make use of persistent .cache
file. This cache file contains only devices which underwent filters in
previous LVM command run. But we need to iterate over all block devices
to create the VGID/LVID index completely for the device mismatch check
to be complete as well.

This patch iterates over block devices found in sysfs to generate the
VGID/LVID index in dev cache if obtain_device_list_from_udev=0
(if obtain_device_list_from_udev=1, we always read complete list of
block devices from udev and we ignore .cache file so we don't need
to look in sysfs for the complete list).
2016-04-01 14:49:39 +02:00
Peter Rajnoha
7ed5a65ee5 dev-cache: also add dev name for device found in sysfs only
For the case when we print device name associated with struct device
that was not found in /dev, but in sysfs, for example when printing
devices where LV device mismatch is found.
2016-04-01 14:48:56 +02:00
Peter Rajnoha
91d32f9d1b refactor: dev-cache: move code adding sysfs-only device into separate fn 2016-04-01 11:47:06 +02:00
Peter Rajnoha
0e774d5ae7 refactor: dev-cache: use btree instead of hash table for sysfs-only devices
major:minor btree is more convenient and more suitable than dev name
hash table here.
2016-04-01 11:42:25 +02:00
Peter Rajnoha
9a086a6607 dev-cache: fix check for already indexed dev in _index_dev_by_vgid_and_lvid 2016-03-30 15:57:57 +02:00
Peter Rajnoha
8b258a005b dev-cache: dev_cache_index_devs fn is available unconditionally
The new dev_cache_index_devs fn was under ifdef UDEV_SYNC_SUPPORT by mistake,
move it out of this ifdef.
2016-03-30 13:06:20 +02:00
Peter Rajnoha
52e0d0db44 dev-cache: remove spurious error msg if no value found in /sys/dev/block/<major>:<minor>/dm/uuid during dev scan
It's correct to have a DM device that has no DM UUID assigned
so no need to issue error message in this case. Also, if the
device doesn't have DM UUID, it's also clear it's not an LVM LV
(...when looking for VGID/LVID while creating VGID/LVID indices
in dev cache).

For example:

$ dmsetup create test --table "0 1 linear /dev/sda 0"
And there's no PV in the system.

Before this patch (spurious error message issued):
$ pvs
  _get_sysfs_value: /sys/dev/block/253:2/dm/uuid: no value

With this patch applied (no spurious error message):
$ pvs
2016-03-30 11:30:09 +02:00
Peter Rajnoha
8c27c52749 dev-cache: also index VGIDs and LVIDs if using persistent .cache file
If we're using persistent .cache file, we're reading this file instead
of traversing the /dev content. Fix missing indexing by VGID and LVID
here - hook this into persistent_filter_load where we populate device
cache from persistent .cache file instead of scanning /dev.

For example, inducing situation in which we warn about different device
actually used than what LVM thinks should be used based on metadata:

$ lsblk -s /dev/vg/lvol0
NAME     MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
vg-lvol0 253:4    0  124M  0 lvm
`-loop1    7:1    0  128M  0 loop

$ lvmconfig --type diff

global {
	use_lvmetad=0
}
devices {
	obtain_device_list_from_udev=0
}

(obtain_device_list_from_udev=0 also means the persistent .cache file is used)

Before this patch - pvs is fine as it does the dev scan, but lvs relies
on persistent .cache file and it misses the VGID/LVID indices to check
and warn about incorrect devices used:

$ pvs
  Found duplicate PV B9gXTHkIdEIiMVwcOoT2LX3Ywh4YIHgR: using /dev/loop0 not /dev/loop1
  Using duplicate PV /dev/loop0 without holders, ignoring /dev/loop1
  WARNING: Device mismatch detected for vg/lvol0 which is accessing /dev/loop1 instead of /dev/loop0.
  PV          VG Fmt  Attr PSize   PFree
  /dev/loop0 vg lvm2 a--  124.00m    0

$ lvs
  Found duplicate PV B9gXTHkIdEIiMVwcOoT2LX3Ywh4YIHgR: using /dev/loop0 not /dev/loop1
  Using duplicate PV /dev/loop0 without holders, ignoring /dev/loop1
  LV    VG Attr       LSize
  lvol0 vg -wi-a----- 124.00m

With this patch applied - both pvs and lvs is fine - the indices are
always created correctly (lvs just an example here, other LVM commands
that rely on persistent .cache file are fixed with this patch too):

$ pvs
  Found duplicate PV B9gXTHkIdEIiMVwcOoT2LX3Ywh4YIHgR: using /dev/loop0 not /dev/loop1
  Using duplicate PV /dev/loop0 without holders, ignoring /dev/loop1
  WARNING: Device mismatch detected for vg/lvol0 which is accessing /dev/loop1 instead of /dev/loop0.
  PV          VG Fmt  Attr PSize   PFree
  /dev/loop0 vg lvm2 a--  124.00m    0

$ lvs
  Found duplicate PV B9gXTHkIdEIiMVwcOoT2LX3Ywh4YIHgR: using /dev/loop0 not /dev/loop1
  Using duplicate PV /dev/loop0 without holders, ignoring /dev/loop1
  WARNING: Device mismatch detected for vg/lvol0 which is accessing /dev/loop1 instead of /dev/loop0.
  LV    VG Attr       LSize
  lvol0 vg -wi-a----- 124.00m
2016-03-30 11:00:01 +02:00
Peter Rajnoha
91bb202ded dev-cache: handle situation where device is referenced in sysfs, but the node is not yet in dev dir
It's possible that while a device is already referenced in sysfs, the node
is not yet in /dev directory.

This may happen in some rare cases right after LVs get created - we sync
with udev (or alternatively we create /dev content ourselves) while VG
lock is held. However, dev scan is done without VG lock so devices may
already be in sysfs, but /dev may not be updated yet if we call LVM command
right after LV creation (so the fact that fs_unlock is done within VG
lock is not usable here much). This is not a problem with devtmpfs as
there's at least kernel name for device in /dev as soon as the sysfs
item exists, but we still support environments without devtmpfs or
where different directory for dev nodes is used (e.g. our test suite).

This patch covers these situations by tracking such devices in
_cache.sysfs_only_names helper hash for the vgid/lvid check to work still.

This also resolves commit 6129d2e64d
which was then reverted by commit 109b7e2095
due to performance issues it may have brought (...and it didn't resolve
the problem fully anyway).
2016-03-30 10:56:46 +02:00
Peter Rajnoha
94f78e0183 coverity: fix some issues reported by coverity for recent code 2016-03-22 16:03:55 +01:00
Peter Rajnoha
ed002ed22a dev: also count with suffixes in UUID for LVs when constructing VGID and LVID index
UUID for LV is either "LVM-<vg_uuid><lv_uuid>" or "LVM-<vg_uuid><lv_uuid>-<suffix>".
The code before just checked the length of the UUID based on the first
template, not the variant with suffix - so LVs with this suffix were not
processed properly.

For example a thin pool LV (as an example of an LV that contains
sub LVs where UUIDs have suffixes):

[0] fedora/~ # lsblk -s /dev/vg/lvol1
NAME              MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
vg-lvol1          253:8    0    4M  0 lvm
`-vg-pool-tpool   253:6    0  116M  0 lvm
  |-vg-pool_tmeta 253:2    0    4M  0 lvm
  | `-sda           8:0    0  128M  0 disk
  `-vg-pool_tdata 253:3    0  116M  0 lvm
    `-sda           8:0    0  128M  0 disk

Before this patch (spurious warning message about device mismatch):

[0] fedora/~ # pvs
  WARNING: Device mismatch detected for vg/lvol1 which is accessing /dev/mapper/vg-pool-tpool instead of (null).
  PV         VG     Fmt  Attr PSize   PFree
  /dev/sda   vg     lvm2 a--  124.00m    0

With this patch applied (no spurious warning message about device mismatch):

[0] fedora/~ # pvs
  PV         VG     Fmt  Attr PSize   PFree
  /dev/sda   vg     lvm2 a--  124.00m    0
2016-03-22 10:52:24 +01:00
Peter Rajnoha
2a47f0957f dev: also check for blank sysfs value containing only '\n' 2016-03-22 09:29:24 +01:00
Peter Rajnoha
8fad9b9e5d dev: be safer when reading sysfs properties
Check if the value we read from sysfs is not blank and replace the '\n'
at the end only when needed ('\n' should usually be there for sysfs values,
but better check this).
2016-03-21 15:50:32 +01:00
Peter Rajnoha
03b0a78640 dev: detect mismatch between devices used and devices assumed for an LV
It's possible for an LVM LV to use a device during activation which
then differs from device which LVM assumes based on metadata later on.

For example, such device mismatch can occur if LVM doesn't have
complete view of devices during activation or if filters are
misbehaving or they're incorrectly set during activation.

This patch adds code that can detect this mismatch by creating
VG UUID and LV UUID index while scanning devices for device cache.

The VG UUID index maps VG UUID to a device list. Each device in the
list has a device layered above as a holder which is an LVM LV device
and for which we know the VG UUID (and similarly for LV UUID index).

We can acquire VG and LV UUID by reading /sys/block/<dm_dev_name>/dm/uuid.
So these indices represent the actual state of PV device use in
the system by LVs and then we compare that to what LVM assumes
based on metadata.

For example:

[0] fedora/~ # lsblk /dev/sdq /dev/sdr /dev/sds /dev/sdt
NAME         MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
sdq           65:0    0  104M  0 disk
|-vg-lvol0   253:2    0  200M  0 lvm
`-mpath_dev1 253:3    0  104M  0 mpath
sdr           65:16   0  104M  0 disk
`-mpath_dev1 253:3    0  104M  0 mpath
sds           65:32   0  104M  0 disk
|-vg-lvol0   253:2    0  200M  0 lvm
`-mpath_dev2 253:4    0  104M  0 mpath
sdt           65:48   0  104M  0 disk
`-mpath_dev2 253:4    0  104M  0 mpath

In this case the vg-lvol0 is mapped onto sdq and sds becauset this is
what was available and seen during activation. Then later on, sdr and
sdt appeared and mpath devices were created out of sdq+sdr (mpath_dev1)
and sds+sdt (mpath_dev2). Now, LVM assumes (correctly) that mpath_dev1
and mpath_dev2 are the PVs that should be used, not the mpath
components (sdq/sdr, sds/sdt).

[0] fedora/~ # pvs
  Found duplicate PV xSUix1GJ2SK82ACFuKzFLAQi8xMfFxnO: using /dev/mapper/mpath_dev1 not /dev/sdq
  Using duplicate PV /dev/mapper/mpath_dev1 from subsystem DM, replacing /dev/sdq
  Found duplicate PV MvHyMVabtSqr33AbkUrobq1LjP8oiTRm: using /dev/mapper/mpath_dev2 not /dev/sds
  Using duplicate PV /dev/mapper/mpath_dev2 from subsystem DM, ignoring /dev/sds
  WARNING: Device mismatch detected for vg/lvol0 which is accessing /dev/sdq, /dev/sds instead of /dev/mapper/mpath_dev1, /dev/mapper/mpath_dev2.
  PV                     VG     Fmt  Attr PSize   PFree
  /dev/mapper/mpath_dev1 vg     lvm2 a--  100.00m      0
  /dev/mapper/mpath_dev2 vg     lvm2 a--  100.00m      0
2016-03-21 11:40:40 +01:00
Peter Rajnoha
65d9f742f8 device: add DEV_OPEN_FAILURE flag
DEV_OPEN_FAILURE flag is set if the most recent "open" for a device
failed and it's unset if any subsequent "open" succeeds.
2016-03-21 11:06:05 +01:00
Alasdair G Kergon
2159a1429d pre-release 2016-03-11 00:19:16 +00:00
Zdenek Kabelac
5aade9c402 topology: handle reported sizes smaller then sector
Recent kernel (4.4) start to report values smaller then sector size
(but in reporting size for SSD which support data zeroing on discard).

For now log warning and assume it really means 1 sector.

Addressing RHBZ:
https://bugzilla.redhat.com/show_bug.cgi?id=1313377
2016-03-10 18:38:54 +01:00
David Teigland
57cd94b9e3 pvs: replace 'unknown device' with [unknown]
A config setting can restore the old string.
2016-03-01 11:12:03 -06:00
Zdenek Kabelac
9267e0c5e7 cleanup: use braces around macro params 2016-02-23 21:40:17 +01:00
Zdenek Kabelac
c0b836e316 gcc: logical-op warning go away
Don't be too much inventive and shutdown gcc6 warning:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69602
2016-02-23 14:41:24 +01:00
Peter Rajnoha
ec43f55445 filters: partitioned: fix partition table filter with external_device_info_source="udev" and blkid<2.20
Non-dm devices have ID_PART_TABLE_TYPE variable exported in
udev db from blkid scan for *both* whole devices and partitions.
We used ID_PART_ENTRY_DISK in addition to decide whether this
is the whole device or partition and then we filtered out only
whole devices where the partition table really is.

However, ID_PART_ENTRY_DISK was added in blkid 2.20 so we need
to use a different set of variables to decide on whole devices
and partitions on systems where older blkid is still used.

Now, we use ID_PART_TABLE_TYPE to detect that there's something
related to partitioning with this device and we use DEVTYPE variable
instead to decide between whole device (DEVTYPE="disk") and partition
(DEVTYPE="partition").

For dm devices it's simpler, we have ID_PART_TABLE_TYPE variable\
set in udev db for whole devices. It's not set for partitions,
hence we don't need more variable in addition to make the decision
on whole device vs. partition (dm devices do not have regular
partitions, hence DEVTYPE can't be used anyway, it's always set
to "disk" for whole disks and partitions).
2016-02-02 13:28:11 +01:00
Peter Rajnoha
d090d6574e device: also cache device size
Add "size" and "size_seqno" to struct device to cache device's size
and also to control its lifetime - the cached value is valid as long
as the global _dev_size_seqno is equal to the device's size_seqno,
otherwise we need to get the size again and cache the new value.

This patch also adds new dev_size_seqno_inc() fn for the appropriate
parts of the code to increment current global value of _dev_size_seqno
and hence to cause all currently cached values for device sizes to
be invalidated.

The device size is now cached because we're planning to reuse this
information for further checks and we want to avoid checking it more
than necessary to save resources.
2016-01-22 14:13:34 +01:00
Zdenek Kabelac
fcbef05aae doc: change fsf address
Hmm rpmlint suggest fsf is using a different address these days,
so lets keep it up-to-date
2016-01-21 12:11:37 +01:00
David Teigland
92e1422707 process_each_pv: do full scan earlier to find new devices
Before commit c1f246fedf,
_get_all_devices() did a full device scan before
get_vgnameids() was called.  The full scan in
_get_all_devices() is from calling dev_iter_create(f, 1).
The '1' arg forces a full scan.

By doing a full scan in _get_all_devices(), new devices
were added to dev-cache before get_vgnameids() began
scanning labels.  So, labels would be read from new devices.
(e.g. by the first 'pvs' command after the new device appeared.)

After that commit, _get_all_devices() was called
after get_vgnameids() was finished scanning labels.
So, new devices would be missed while scanning labels.
When _get_all_devices() saw the new devices (after
labels were scanned), those devices were added to
the .cache file.  This meant that the second 'pvs'
command would see the devices because they would be
in .cache.

Now, the full device scan is factored out of
_get_all_devices() and called by itself at the
start of the command so that new devices will
be known before get_vgnameids() scans labels.
2015-12-14 10:02:29 -06:00
Zdenek Kabelac
dccbc3b621 cleanup: simplify dev_cache_exit
Just set whole _cache struct into unitialized state just
like with lib init start usage.
Lists are initialized with dev_cache_init().
2015-11-16 01:16:11 +01:00
Zdenek Kabelac
5a4676fea9 cleanup: add _free on error path
Just like with failing allocation above also _free(dev).

TODO: rework this to always use mempool and drop unneeded
comlexity we have in this function.
2015-11-16 01:16:11 +01:00
Peter Rajnoha
b8779e706e configure: check for udev_device_get_is_initialized is available
The udev_device_get_is_initialized is available since libudev version
165. Older versions are still used somewhere (e.g. RHEL6). So better
check for this fn and use it only if it's available.
2015-11-11 15:15:50 +01:00
Peter Rajnoha
f82e0210b7 dev-ext: issue error if external_device_info_source=udev and udev db record incomplete
Udev db records are marked as not initialized (incomplete) on timeout.
Issue an error message whenever LVM finds such records so users are
aware that something's going wrong with udev db.

This is important in case we use devices/external_device_info_source="udev"
where udev database records are used to do various filtering decisions.

For example:

udev log of timed out worker:

Nov 11 13:02:25 raw.virt systemd-udevd[607]: seq 1997 '/devices/virtual/block/dm-2' is taking a long time
Nov 11 13:04:25 raw.virt systemd-udevd[607]: seq 1997 '/devices/virtual/block/dm-2' killed
Nov 11 13:04:25 raw.virt systemd-udevd[607]: worker [11221] terminated by signal 9 (Killed)
Nov 11 13:04:25 raw.virt systemd-udevd[607]: worker [11221] failed while handling '/devices/virtual/block/dm-2'
...

LVM also issues error message visibly if incomplete udev db record is found,
devices/external_device_info_source="udev" is set:

$ pvs
  Udev database has incomplete information about device /dev/dm-2.
  Failed to get external handle for device /dev/dm-2 [udev].
  ...
2015-11-11 13:14:07 +01:00
Zdenek Kabelac
e262d5e596 cleanup: keep using enum typedef
Using enum instead of unsigned.
2015-11-09 10:22:52 +01:00
Zdenek Kabelac
fa1d730847 dev-type: fix TOCTOU order
Doing 'stat' checking first and later opening is racy.
And since we do not really care about any 'status' info
here and we read 'sysfs' here - just drop whole 'stat()'
call and directly handle error from failing 'fopen()'.
2015-11-09 10:19:20 +01:00
Alasdair G Kergon
65ec00ce20 device: Tidy DASD CDL format detection code. 2015-10-27 15:27:52 +00:00
Lidong Zhong
729f489009 pvcreate: don't support unpartitioned DASD devices with CDL formatted
The former patch(dab3ebce4c) is a little bit strict. For example, it is
OK to create PV on unpartitioned DASD devices with LDL formatted. So
after lvm version containing the patch, LVs created on those devices
could not be found.

Signed-off-by: Lidong Zhong <lzhong@suse.com>
2015-10-27 11:42:47 +01:00
Peter Rajnoha
5ac81657e5 wiping: make libblkid detect all copies of the same signature if use_blkid_wiping=1
Some signatures are spread around the disk in several copies, mainly for
backup. Make libblkid to detect these extra copies - there was missing
"blkid_probe_step_back" fn call after successful wipe of previous signature
copy.

An example with FAT table which has copies:

$ mkfs.vfat /dev/sda1

Before this patch:

$ pvcreate /dev/sda1
WARNING: vfat signature detected on /dev/sda1 at offset 54. Wipe it? [y/n]: y
  Wiping vfat signature on /dev/sda1.
  Physical volume "/dev/sda1" successfully created

With this patch applied:

$ pvcreate /dev/sda1
WARNING: vfat signature detected on /dev/sda1 at offset 54. Wipe it? [y/n]: y
  Wiping vfat signature on /dev/sda1.
WARNING: vfat signature detected on /dev/sda1 at offset 0. Wipe it? [y/n]: y
  Wiping vfat signature on /dev/sda1.
WARNING: vfat signature detected on /dev/sda1 at offset 510. Wipe it? [y/n]: y
  Wiping vfat signature on /dev/sda1.
  Physical volume "/dev/sda1" successfully created
2015-10-13 12:22:09 +02:00
Peter Rajnoha
2081071bee wiping: warn if use_blkid_wiping=1 is set and LVM not compiled with blkid_wiping support 2015-09-22 11:11:26 +02:00
Peter Rajnoha
55c13f3de4 dev-cache: fix use of uninitialized device status if reading outdated .cache record
As part of fix that came with cf700151eb,
I forgot to add the check whether the result of stat was successful or
not. This bug caused uninitialized buffer to be used for entries
from .cache file which are no longer valid.

This bug may have caused these uninitialized values to be used further,
for example (see the unreal (2567,590944) representing major:minor
pair):

$ pvs
  /dev/abc: stat failed: No such file or directory
  Path /dev/abc no longer valid for device(2567,590944)
  PV               VG   Fmt  Attr PSize   PFree
  /dev/mapper/test      lvm2 ---  104.00m 104.00m
  /dev/vda2        rhel lvm2 a--    9.51g      0
2015-09-04 18:00:29 +02:00
Peter Rajnoha
d1d00fdeec dev-cache: append (major:minor) to debug messages about adding device or its alias to cache
device/dev-cache.c:350         /dev/sda: Added to device cache (8:0)
device/dev-cache.c:346         /dev/disk/by-id/lvm-pv-uuid-5nPovF-EWp4-vBwd-ylCJ-9Y0B-yzHQ-ek1li2: Aliased to /dev/sda in device cache (8:0)
...
2015-09-03 14:36:15 +02:00
Alasdair G Kergon
623b46a17d device: Don't try to close config file on failure.
$file: open failed: Permission denied
Failed to load config file $file
Attempt to close device '$file' which is not open.
2015-08-17 12:57:01 +01:00
Peter Rajnoha
cf700151eb cache: fix regression causing some PVs to bypass filters
This is a regression introduced by commit
6c0e44d5a2 which changed
the way dev_cache_get fn works - before this patch, when a
device was not found, it fired a full rescan to correct the
cache. However, the change coming with that commit missed
this full_rescan call, causing the lvmcache to still contain
info about PVs which should be filtered now.

Such situation may have happened by coincidence of using
old persistent cache (/etc/lvm/cache/.cache) which does not
reflect the actual state anymore, a device name/symlink which
now points to a device which should be filtered and a fact we
keep info about usable DM devices in .cache no matter what
the filter setting is.

This bug could be hidden though by changes introduced in
commit f1a000a477 as it
calls full_rescan earlier before this problem is hit.
But we need to fix this anyway for the dev_cache_get
to be correct if we happen to use the same code path
again somewhere sometime.

For example, simple reproducer was (before commit
1a000a477558e157532d5f2cd2f9c9139d4f87c):

- /dev/sda contains a PV header with UUID y5PzRD-RBAv-7sBx-V3SP-vDmy-DeSq-GUh65M

- lvm.conf: filter = [ "r|.*|" ]

- rm -f .cache (to start with clean state)

- dmsetup create test --table "0 8388608 linear /dev/sda 0" (8388608 is
  just the size of the /dev/sda device I use in the reproducer)

- pvs (this will create .cache file which contains
  "/dev/disk/by-id/lvm-pv-uuid-y5PzRD-RBAv-7sBx-V3SP-vDmy-DeSq-GUh65M"
  as well as "/dev/mapper/test" and the target node "/dev/dm-1" - all the
  usable DM mappings (and their symlinks) get into the .cache file even
  though the filter "is set to "ignore all" - we do this - so far it's OK)

- dmsetup remove test (so we end up with /dev/disk/by-id/lvm-pv-uuid-...
  pointing to the /dev/sda now since it's the underlying device
  containing the actual PV header)

- now calling "pvs" with such .cache file and we get:
$ pvs
  PV                                                                 VG  Fmt  Attr PSize PFree
  /dev/disk/by-id/lvm-pv-uuid-y5PzRD-RBAv-7sBx-V3SP-vDmy-DeSq-GUh65M vg  lvm2 a--  4.00g    0

Even though we have set filter = [ "r|.*|" ] in the lvm.conf file!
2015-07-29 10:19:12 +02:00
Peter Rajnoha
c3fddb0fbb wiping: add "Wiping skipped." for the message context to be complete 2015-07-21 11:00:43 +02:00
Peter Rajnoha
697fb353dc wiping: log_warn instead of log_error if blkid wipe ignored for a signature
Comply with the rules we have for log_error and log_warn...

$ pvcreate /dev/sda1
  Failed to get offset of the xfs_external_log signature on /dev/sda1.
  1 existing signature left on the device.
  Aborting pvcreate on /dev/sda1.

$ pvcreate /dev/sda1 --force
  WARNING: Failed to get offset of the xfs_external_log signature on /dev/sda1.
  Physical volume "/dev/sda1" successfully created
2015-07-21 10:34:04 +02:00
Peter Rajnoha
2a7c2539c6 wiping: ignore errors during detection if use_blkid_wiping=1 and --force is used
libblkid may return the list of signatures found, but it may not
provide offset and size for each signature detected. This may
happen in case signatures are mixed up or there are more, possibly
overlapping, signatures found.

Make lvm commands pass if such situation happens and we're using
--force (or any stronger force method).

For example:

$ pvcreate /dev/sda1
  Failed to get offset of the xfs_external_log signature on /dev/sda1.
  1 existing signature left on the device.
  Aborting pvcreate on /dev/sda1.

$ pvcreate --force /dev/sda1
  Failed to get offset of the xfs_external_log signature on /dev/sda1.
  Physical volume "/dev/sda1" successfully created
2015-07-21 09:54:20 +02:00
Peter Rajnoha
d947a815e8 config: make a difference between "not found" and "is empty" in log msg for devices/preferred_names
Replace misleading "not found" in the log message when
devices/preferred_names is set to empty array:

Really not found:
device/dev-cache.c:689   devices/preferred_names not found in config: using built-in preferences

Found, but empty:
config/config.c:1431   Setting devices/preferred_names to preferred_names = [ ]
device/dev-cache.c:689   devices/preferred_names is empty: using built-in preferences
2015-07-15 16:14:51 +02:00
Peter Rajnoha
3b6840e099 config: replace find_config_tree_node with find_config_tree_array where appropriate 2015-07-08 13:03:08 +02:00
Zdenek Kabelac
5232fd13f3 cleanup: cast minor to dev_t
Let the arithmetic run with a single dev_t type (Coverity).
2015-05-08 15:15:10 +02:00