IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
- Use 'lvmcache' consistently instead of 'metadata cache'
- Always use 5 characters for source line number
- Remember to convert uuids into printable form
- Use <no name> rather than (null) when VG has no name.
Replaced the confusing device error message "not found (or ignored by
filtering)" by either "not found" or "excluded by a filter".
(Later we should be able to say which filter.)
Left the the liblvm code paths alone.
When not obtaining device from udev, we are doing deep devdir scan,
and at the same time we try to insert everything what /sys/dev/block
knows about. However in case lvm2 is configured to use nonstardard
devdir this way it will see (and scan) devices from a real system.
lvm2 test suite is using its own test devdir with its
own device nodes. To avoid touching real /dev devices, validate
the device node exist in give dir and do not insert such device
into a cache.
With obtain list from udev this patch has no effect
(the normal user path).
We have _insert_dirs() for udev and non-udev compilation.
Compiling without udev missed to call dev_cache_index_devs().
Move the call after _insert_dirs() call so both compilation
gets it.
/sys/dev/block is available since kernel version 2.2.26 (~ 2008):
https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-dev
The VGID/LVID indexing code relies on this feature so skip indexing
if it's not available to avoid error messages about inability to open
/sys/dev/block directory.
We're not going to provide fallback code to read the /sys/block/
instead in this case as that's not that efficient - it needs extra
reads for getting major:minor and reading partitions would also
pose further reads and that's not worth it.
If obtain_device_list_from_udev=0, LVM can make use of persistent .cache
file. This cache file contains only devices which underwent filters in
previous LVM command run. But we need to iterate over all block devices
to create the VGID/LVID index completely for the device mismatch check
to be complete as well.
This patch iterates over block devices found in sysfs to generate the
VGID/LVID index in dev cache if obtain_device_list_from_udev=0
(if obtain_device_list_from_udev=1, we always read complete list of
block devices from udev and we ignore .cache file so we don't need
to look in sysfs for the complete list).
For the case when we print device name associated with struct device
that was not found in /dev, but in sysfs, for example when printing
devices where LV device mismatch is found.
It's correct to have a DM device that has no DM UUID assigned
so no need to issue error message in this case. Also, if the
device doesn't have DM UUID, it's also clear it's not an LVM LV
(...when looking for VGID/LVID while creating VGID/LVID indices
in dev cache).
For example:
$ dmsetup create test --table "0 1 linear /dev/sda 0"
And there's no PV in the system.
Before this patch (spurious error message issued):
$ pvs
_get_sysfs_value: /sys/dev/block/253:2/dm/uuid: no value
With this patch applied (no spurious error message):
$ pvs
If we're using persistent .cache file, we're reading this file instead
of traversing the /dev content. Fix missing indexing by VGID and LVID
here - hook this into persistent_filter_load where we populate device
cache from persistent .cache file instead of scanning /dev.
For example, inducing situation in which we warn about different device
actually used than what LVM thinks should be used based on metadata:
$ lsblk -s /dev/vg/lvol0
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vg-lvol0 253:4 0 124M 0 lvm
`-loop1 7:1 0 128M 0 loop
$ lvmconfig --type diff
global {
use_lvmetad=0
}
devices {
obtain_device_list_from_udev=0
}
(obtain_device_list_from_udev=0 also means the persistent .cache file is used)
Before this patch - pvs is fine as it does the dev scan, but lvs relies
on persistent .cache file and it misses the VGID/LVID indices to check
and warn about incorrect devices used:
$ pvs
Found duplicate PV B9gXTHkIdEIiMVwcOoT2LX3Ywh4YIHgR: using /dev/loop0 not /dev/loop1
Using duplicate PV /dev/loop0 without holders, ignoring /dev/loop1
WARNING: Device mismatch detected for vg/lvol0 which is accessing /dev/loop1 instead of /dev/loop0.
PV VG Fmt Attr PSize PFree
/dev/loop0 vg lvm2 a-- 124.00m 0
$ lvs
Found duplicate PV B9gXTHkIdEIiMVwcOoT2LX3Ywh4YIHgR: using /dev/loop0 not /dev/loop1
Using duplicate PV /dev/loop0 without holders, ignoring /dev/loop1
LV VG Attr LSize
lvol0 vg -wi-a----- 124.00m
With this patch applied - both pvs and lvs is fine - the indices are
always created correctly (lvs just an example here, other LVM commands
that rely on persistent .cache file are fixed with this patch too):
$ pvs
Found duplicate PV B9gXTHkIdEIiMVwcOoT2LX3Ywh4YIHgR: using /dev/loop0 not /dev/loop1
Using duplicate PV /dev/loop0 without holders, ignoring /dev/loop1
WARNING: Device mismatch detected for vg/lvol0 which is accessing /dev/loop1 instead of /dev/loop0.
PV VG Fmt Attr PSize PFree
/dev/loop0 vg lvm2 a-- 124.00m 0
$ lvs
Found duplicate PV B9gXTHkIdEIiMVwcOoT2LX3Ywh4YIHgR: using /dev/loop0 not /dev/loop1
Using duplicate PV /dev/loop0 without holders, ignoring /dev/loop1
WARNING: Device mismatch detected for vg/lvol0 which is accessing /dev/loop1 instead of /dev/loop0.
LV VG Attr LSize
lvol0 vg -wi-a----- 124.00m
It's possible that while a device is already referenced in sysfs, the node
is not yet in /dev directory.
This may happen in some rare cases right after LVs get created - we sync
with udev (or alternatively we create /dev content ourselves) while VG
lock is held. However, dev scan is done without VG lock so devices may
already be in sysfs, but /dev may not be updated yet if we call LVM command
right after LV creation (so the fact that fs_unlock is done within VG
lock is not usable here much). This is not a problem with devtmpfs as
there's at least kernel name for device in /dev as soon as the sysfs
item exists, but we still support environments without devtmpfs or
where different directory for dev nodes is used (e.g. our test suite).
This patch covers these situations by tracking such devices in
_cache.sysfs_only_names helper hash for the vgid/lvid check to work still.
This also resolves commit 6129d2e64d
which was then reverted by commit 109b7e2095
due to performance issues it may have brought (...and it didn't resolve
the problem fully anyway).
UUID for LV is either "LVM-<vg_uuid><lv_uuid>" or "LVM-<vg_uuid><lv_uuid>-<suffix>".
The code before just checked the length of the UUID based on the first
template, not the variant with suffix - so LVs with this suffix were not
processed properly.
For example a thin pool LV (as an example of an LV that contains
sub LVs where UUIDs have suffixes):
[0] fedora/~ # lsblk -s /dev/vg/lvol1
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vg-lvol1 253:8 0 4M 0 lvm
`-vg-pool-tpool 253:6 0 116M 0 lvm
|-vg-pool_tmeta 253:2 0 4M 0 lvm
| `-sda 8:0 0 128M 0 disk
`-vg-pool_tdata 253:3 0 116M 0 lvm
`-sda 8:0 0 128M 0 disk
Before this patch (spurious warning message about device mismatch):
[0] fedora/~ # pvs
WARNING: Device mismatch detected for vg/lvol1 which is accessing /dev/mapper/vg-pool-tpool instead of (null).
PV VG Fmt Attr PSize PFree
/dev/sda vg lvm2 a-- 124.00m 0
With this patch applied (no spurious warning message about device mismatch):
[0] fedora/~ # pvs
PV VG Fmt Attr PSize PFree
/dev/sda vg lvm2 a-- 124.00m 0
Check if the value we read from sysfs is not blank and replace the '\n'
at the end only when needed ('\n' should usually be there for sysfs values,
but better check this).
It's possible for an LVM LV to use a device during activation which
then differs from device which LVM assumes based on metadata later on.
For example, such device mismatch can occur if LVM doesn't have
complete view of devices during activation or if filters are
misbehaving or they're incorrectly set during activation.
This patch adds code that can detect this mismatch by creating
VG UUID and LV UUID index while scanning devices for device cache.
The VG UUID index maps VG UUID to a device list. Each device in the
list has a device layered above as a holder which is an LVM LV device
and for which we know the VG UUID (and similarly for LV UUID index).
We can acquire VG and LV UUID by reading /sys/block/<dm_dev_name>/dm/uuid.
So these indices represent the actual state of PV device use in
the system by LVs and then we compare that to what LVM assumes
based on metadata.
For example:
[0] fedora/~ # lsblk /dev/sdq /dev/sdr /dev/sds /dev/sdt
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdq 65:0 0 104M 0 disk
|-vg-lvol0 253:2 0 200M 0 lvm
`-mpath_dev1 253:3 0 104M 0 mpath
sdr 65:16 0 104M 0 disk
`-mpath_dev1 253:3 0 104M 0 mpath
sds 65:32 0 104M 0 disk
|-vg-lvol0 253:2 0 200M 0 lvm
`-mpath_dev2 253:4 0 104M 0 mpath
sdt 65:48 0 104M 0 disk
`-mpath_dev2 253:4 0 104M 0 mpath
In this case the vg-lvol0 is mapped onto sdq and sds becauset this is
what was available and seen during activation. Then later on, sdr and
sdt appeared and mpath devices were created out of sdq+sdr (mpath_dev1)
and sds+sdt (mpath_dev2). Now, LVM assumes (correctly) that mpath_dev1
and mpath_dev2 are the PVs that should be used, not the mpath
components (sdq/sdr, sds/sdt).
[0] fedora/~ # pvs
Found duplicate PV xSUix1GJ2SK82ACFuKzFLAQi8xMfFxnO: using /dev/mapper/mpath_dev1 not /dev/sdq
Using duplicate PV /dev/mapper/mpath_dev1 from subsystem DM, replacing /dev/sdq
Found duplicate PV MvHyMVabtSqr33AbkUrobq1LjP8oiTRm: using /dev/mapper/mpath_dev2 not /dev/sds
Using duplicate PV /dev/mapper/mpath_dev2 from subsystem DM, ignoring /dev/sds
WARNING: Device mismatch detected for vg/lvol0 which is accessing /dev/sdq, /dev/sds instead of /dev/mapper/mpath_dev1, /dev/mapper/mpath_dev2.
PV VG Fmt Attr PSize PFree
/dev/mapper/mpath_dev1 vg lvm2 a-- 100.00m 0
/dev/mapper/mpath_dev2 vg lvm2 a-- 100.00m 0
Before commit c1f246fedf,
_get_all_devices() did a full device scan before
get_vgnameids() was called. The full scan in
_get_all_devices() is from calling dev_iter_create(f, 1).
The '1' arg forces a full scan.
By doing a full scan in _get_all_devices(), new devices
were added to dev-cache before get_vgnameids() began
scanning labels. So, labels would be read from new devices.
(e.g. by the first 'pvs' command after the new device appeared.)
After that commit, _get_all_devices() was called
after get_vgnameids() was finished scanning labels.
So, new devices would be missed while scanning labels.
When _get_all_devices() saw the new devices (after
labels were scanned), those devices were added to
the .cache file. This meant that the second 'pvs'
command would see the devices because they would be
in .cache.
Now, the full device scan is factored out of
_get_all_devices() and called by itself at the
start of the command so that new devices will
be known before get_vgnameids() scans labels.
As part of fix that came with cf700151eb,
I forgot to add the check whether the result of stat was successful or
not. This bug caused uninitialized buffer to be used for entries
from .cache file which are no longer valid.
This bug may have caused these uninitialized values to be used further,
for example (see the unreal (2567,590944) representing major:minor
pair):
$ pvs
/dev/abc: stat failed: No such file or directory
Path /dev/abc no longer valid for device(2567,590944)
PV VG Fmt Attr PSize PFree
/dev/mapper/test lvm2 --- 104.00m 104.00m
/dev/vda2 rhel lvm2 a-- 9.51g 0
This is a regression introduced by commit
6c0e44d5a2 which changed
the way dev_cache_get fn works - before this patch, when a
device was not found, it fired a full rescan to correct the
cache. However, the change coming with that commit missed
this full_rescan call, causing the lvmcache to still contain
info about PVs which should be filtered now.
Such situation may have happened by coincidence of using
old persistent cache (/etc/lvm/cache/.cache) which does not
reflect the actual state anymore, a device name/symlink which
now points to a device which should be filtered and a fact we
keep info about usable DM devices in .cache no matter what
the filter setting is.
This bug could be hidden though by changes introduced in
commit f1a000a477 as it
calls full_rescan earlier before this problem is hit.
But we need to fix this anyway for the dev_cache_get
to be correct if we happen to use the same code path
again somewhere sometime.
For example, simple reproducer was (before commit
1a000a477558e157532d5f2cd2f9c9139d4f87c):
- /dev/sda contains a PV header with UUID y5PzRD-RBAv-7sBx-V3SP-vDmy-DeSq-GUh65M
- lvm.conf: filter = [ "r|.*|" ]
- rm -f .cache (to start with clean state)
- dmsetup create test --table "0 8388608 linear /dev/sda 0" (8388608 is
just the size of the /dev/sda device I use in the reproducer)
- pvs (this will create .cache file which contains
"/dev/disk/by-id/lvm-pv-uuid-y5PzRD-RBAv-7sBx-V3SP-vDmy-DeSq-GUh65M"
as well as "/dev/mapper/test" and the target node "/dev/dm-1" - all the
usable DM mappings (and their symlinks) get into the .cache file even
though the filter "is set to "ignore all" - we do this - so far it's OK)
- dmsetup remove test (so we end up with /dev/disk/by-id/lvm-pv-uuid-...
pointing to the /dev/sda now since it's the underlying device
containing the actual PV header)
- now calling "pvs" with such .cache file and we get:
$ pvs
PV VG Fmt Attr PSize PFree
/dev/disk/by-id/lvm-pv-uuid-y5PzRD-RBAv-7sBx-V3SP-vDmy-DeSq-GUh65M vg lvm2 a-- 4.00g 0
Even though we have set filter = [ "r|.*|" ] in the lvm.conf file!
Replace misleading "not found" in the log message when
devices/preferred_names is set to empty array:
Really not found:
device/dev-cache.c:689 devices/preferred_names not found in config: using built-in preferences
Found, but empty:
config/config.c:1431 Setting devices/preferred_names to preferred_names = [ ]
device/dev-cache.c:689 devices/preferred_names is empty: using built-in preferences
When pvscan --cache --major --minor command is issued from
udev REMOVE event, it basically resulted into a whole device
scan since the device was missing. So avoid such scan
and first check via /sysfs (when available) if such device actually
exists.
The code in dev_iter_create assumes that if a filter can be wiped, doing so will
always trigger a call to _full_scan. This is not true for composite filters
though, since they can always be wiped in principle, but there is no way to know
that a component filter inside will exist that actually triggers the scan.
The list of strings is used quite frequently and we'd like to reuse
this simple structure for report selection support too. Make it part
of libdevmapper for general reuse throughout the code.
This also simplifies the LVM code a bit since we don't need to
include and manage lvm-types.h anymore (the string list was the
only structure defined there).
When lvm2 command works with clvmd and uses locking in wrong way,
it may 'leak' certain file descriptors in opened (incorrect) state.
dev_cache_exit then destroys memory pool of cached devices, while
_open_devices list in dev-io.c was still referencing them if they
were still opened.
Patch properly calls _close() function to 'self-heal' from this
invalid state, but it will report internal error (so execution
with abort_on_internal_error causes immediate death). On the
normal 'execution', error is only reported, but memory state is
corrected, and linked list is not referencing devices from
released mempool.
For crash see: https://bugzilla.redhat.com/show_bug.cgi?id=1073886
When the device is inserted in dev_name_confirmed() stat() is
called twice as _insert() has it's own stat() call.
Extend _insert() parameter with struct stat* - which could be used
if it has been just obtained. When NULL is passed code is
doing its own stat() call as before.
Split out the partitioned device filter that needs to open the device
and move the multipath filter in front of it.
When a device is multipathed, sending I/O to the underlying paths may
cause problems, the most obvious being I/O errors visible to lvm if a
path is down.
Revert the incorrect <backtrace> messages added when a device doesn't
pass a filter.
Log each filter initialisation to show sequence.
Avoid duplicate 'Using $device' debug messages.
Changes:
- move device type registration out of "type filter" (filter.c)
to a separate and new dev-type.[ch] for common use throughout the code
- the structure for keeping the major numbers detected for available
device types and available partitioning available is stored in
"dev_types" structure now
- move common partitioning detection code to dev-type.[ch] as well
together with other device-related functions bound to dev_types
(see dev-type.h for the interface)
The dev-type interface contains all common functions used to detect
subsystems/device types, signature/superblock recognition code,
type-specific device properties and other common device properties
(bound to dev_types), including partitioning support.
- add dev_types instance to cmd context as cmd->dev_types for common use
- use cmd->dev_types throughout as a central point for providing
information about device types
For example, the old call and reference:
find_config_tree_str(cmd, "devices/dir", DEFAULT_DEV_DIR)
...now becomes:
find_config_tree_str(cmd, devices_dir_CFG)
So we're referring to the named configuration ID instead
of passing the configuration path and the default value
is taken from central config definition in config_settings.h
automatically.
Use log_warn to print non-fatal warning messages.
Use of log_error would confuse checker for testing
whether proper error has been reported for some real error.
Libudev does not provide transactions when querying udev database - once we
get the list of block devices (devices/obtain_device_list_from_udev=1) and
we iterate over the list to get more detailed information about device node
and symlink names used etc., the device could be removed just in between we
get the list and put a query for more info. In this case, libudev returns
NULL value as the device does not exist anymore.
Recently, we've added a warning message to reveal such situations. However,
this could be misleading if the device is not related to the LVM action
we're just processing - the non-related block device could be removed in
parallel and this is not an error but a possible and normal operation.
(N.B. This "missing info" should not happen when devices are related to
the LVM action we're just processing since all such processing should be
synchronized with udev and the udev db must always be in consistent state
after the sync point. But we can't filter this situation out from others,
non-related devices, so we have to lower the message verbosity here for a
general solution.)
Avoid using NULL pointers from udev. It seems like some older versions of udev
were improperly returning NULL in some case, so do not silently break here,
and give at least a warning to the user.
When both path have identical prefix i.e. /dev/disk/by-id
skip 2 x lstat() for /dev /dev/disk /dev/disk/by-id
and directly lstat() only different part of the path.
Reduces amount of lstat calls on system with lots of devices.
leaving behind the LVM-specific parts of the code (convenience wrappers that
handle `struct device` and `struct cmd_context`, basically). A number of
functions have been renamed (in addition to getting a dm_ prefix) -- namely,
all of the config interface now has a dm_config_ prefix.
Also, add a new 'obtain_device_list_from_udev' setting to lvm.conf with which
we can turn this feature on or off if needed.
If set, the cache of block device nodes with all associated symlinks
will be constructed out of the existing udev database content.
This avoids using and opening any inapplicable non-block devices or
subdirectories found in the device directory. This setting is applied
to udev-managed device directory only, other directories will be scanned
fully. LVM2 needs to be compiled with udev support for this setting to
take effect. N.B. Any device node or symlink not managed by udev in
udev directory will be ignored with this setting on.
Change API interface to accept even completely const array patterns.
This should present no change for libdm users and allows to pass
pattern arrays without cast to const char **.
To have better control were the config tree could be modified use more
const pointers and very carefully downcast them back to non-const
(for config tree merge).
When we are stacking LV over device, which has for some reason
increased read_ahead (e.g. MD RAID), the read_ahead hint
for libdevmapper is wrong (it is zero).
If the calculated read_ahead hint is zero, patch uses read_ahead of underlying device
(if first segment is PV) when setting DM_READ_AHEAD_MINIMUM_FLAG.
Because we are using dev-cache, it also store this value to cache for future use
(if several LVs are over one PV, BLKRAGET is called only once for underlying device.)
This should fix all the reamining problems with readahead mismatch reported
for DM over MD configurations (and similar cases).
event-driven model. Without changes to the way the cache gets updated, the
option is currently unreliable without a global lock to prevent any lvm2
commands from running concurrently.
Additional verbosity level -vvvv includes line numbers and backtraces.
Verbose messages now go to stderr not stdout.
Close any stray file descriptors before starting.
Refine partitionable checks for certain device types.
Allow devices/types to override built-ins.
Clear many compiler warnings (i386) & associated bugs - hopefully without
introducing too many new bugs:-) (Same exercise required for other archs.)
Default compilation has optimisation - or else use ./configure --enable-debug
Lots of changes/very little testing so far => there'll be bugs!
Use 'vgcreate -M text' to create a volume group with its metadata stored
in text files. Text format metadata changes should be reasonably atomic,
with a (basic) automatic recovery mechanism if the system crashes while a
change is in progress.
Add a metadata section to lvm.conf to specify multiple directories if
you want (recommended) to keep multiple copies of the metadata (eg on
different filesystems).
e.g. metadata {
dirs = ["/etc/lvm/metadata1","/usr/local/lvm/metadata2"]
}
Plenty of refinements still in the pipeline.
This should be a rare occurrence so the aim is to recover if it's
straightforward to do so, otherwise just to abort the operation.
If people *knowingly* change device names, they should always run vgscan
afterwards.
A few bytes of memory gets leaked inside a pool each time an alias
has to be discarded - it's not worth restructuring the code to reuse it.
More of LVM2 needs updating to pass device objects (or uuids) about
instead of pathnames so that resolution of pathname->object only happens
once per operation.
dev_cache_get() should now always return the *current* device at the path given
dev_name_confirmed() replaces dev_name() whenever it's important to
know that name for the device is still current (ie when opening it).
If the cache doesn't know a current name, the function fails.
dev_open() guarantees that the file descriptor returned is for the dev_t
of the device structure it was passed.
o When opening device, return error if its cached name is incorrect (eg if
it's changed since the cache was generated). This prevents use until
the cache is rebuilt (eg with vgscan). Doesn't catch every case.
o updated vgcfgrestore args
o change _check_for_open_devices only to check devices present in the hash
table instead of using dev_iter which triggers a full scan even when only
displaying command line help
o Changed disk-rep to use these
o if NDEBUG is not defined the dev_cache will check for open devices on
teardown.
I was hoping this would speed things up. But I'm still getting:
reti:/home/joe/sistina/LVM2/tools# time ./lvm vgchange -a n
Volume group vg0 successfully changed
real 0m5.751s
user 0m0.060s
sys 0m0.070s
even though I have only 1 device with the vg on it passing the filters.
devices {
# first match is final, eg. /dev/ide/cdrom
# get's rejected due to the first pattern
filter=["r/cdrom/", # don't touch the music !
"a/hd[a-d][0-9]+/",
"a/ide/",
"a/sd/",
"a/md/",
"a|loop/[0-9]+|", # accept devfs style loop back
"r/loop/", # and reject old style
"a/dasd/",
"a/dac960/",
"a/nbd/",
"a/ida/",
"a/cciss/",
"a/ubd/",
"r/.*/"] # reject all others
}
Alasdair this is ready to roll into the tools now.