There is a window of time where the following can occur.
1. An API request to the lvm shell is in progress: we have written a
command to the lvm shell and that thread is blocked waiting for a
response.
2. A signal arrives to the daemon which causes us to exit. The signal
handling code path goes directly to the lvm shell and writes
"exit\n". This causes the lvm shell to simply exit.
3. The thread that was waiting for a response gets an EIO as the child
process has exited. This bubbles up a failure.
This is addressed by placing a lock in the lvm shell to prevent
concurrent access to the shell. We also gather additional debug data
when we get an error in the lvm shell read path. This should help if
the lvm shell exits/crashes on its own.
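A minimal sketch of the locking idea, assuming the exit path runs on a
different thread from the blocked request. ShellProxySketch and the
'send' callable are hypothetical stand-ins, not the lvmdbusd classes.

```python
import threading


class ShellProxySketch:
    """Hypothetical stand-in for the lvm shell wrapper."""

    def __init__(self, send):
        # 'send' is any callable that writes a command and returns the reply.
        self._send = send
        self._lock = threading.Lock()

    def call(self, command):
        # An API request holds the lock for the full write/read round trip.
        with self._lock:
            return self._send(command)

    def exit_shell(self):
        # The exit path takes the same lock, so "exit" cannot be written
        # while another thread is still waiting for its response.
        with self._lock:
            return self._send("exit\n")
```

With this shape, step 2 above has to wait until step 1 has received its
response before the shell is told to exit.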
Check the output of pkg-config --libs libdlm_lt and test whether it
contains the word 'pthread'. If so, it is likely a buggy result from an
incorrect config file, so use -ldlm_lt directly in this case.
Convert lvmlockd to use configure _LIBS and _CFLAGS for
discovered libraries.
TODO: ATM we ignore discovered libdlm and use libdlm_lt instead.
Also libseagate_ilm is a hard-to-find unicorn for testing.
Convert naming SYSTEMD_CFLAGS/LIB -> LIBSYSTEMD_CFLAGS/LIBS
to better fit library check for libsystemd.
Build lvmlockd with SD_NOTIFY when we have defined LIBSYSTEMD_LIBS.
Keep the conversion 64bit, as on the x32 arch time_t is a 64bit value
and we may lose precision (y2038).
TODO: likely use a universal string for time printing as in log/log.c
_set_time_prefix()
We can't assume that strerror_r returns char* just because _GNU_SOURCE is
defined. We already call the appropriate autoconf test, so let's use its
result (STRERROR_R_CHAR_P).
Note that in configure, _GNU_SOURCE is always set, but we add a defined
guard just in case for futureproofing.
Bug: https://bugs.gentoo.org/869404
Previously we utilized udev until we got a dbus notification from lvm
command line tools. This however misses the case where something outside
of lvm clears the signatures on a block device and we fail to refresh the
state of the daemon. Change the behavior so we always monitor udev events,
but ignore those udev events that pertain to lvm members.
Note: --udev command line option no longer does anything and simply
outputs a message that it's no longer used.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1967171
If lvmlockd in a cluster is killed accidentally or for any other reason,
the lock resources will become orphaned in the VG lockspace. When the
cluster manager tries to restart this daemon, the LVs will probably
become inactive because of the resource scheduling policy and thus the
lock resource will be omitted during the adoption process. This patch
tries to purge the lock resources left in the previous lockspace, so
the following actions can work again.
Previously when the __del__ method ran on LVMShellProxy we would blindly
call terminate(). This was a race condition as the underlying process
may or may not be present. When the process is still present the SIGTERM will
end up being seen by lvmdbusd too. Re-work the code so that we
first try to wait for the child process to exit and only then if it hasn't
exited will we send it a SIGTERM. We also ensure that when this is
executed we will briefly ignore a SIGTERM that arrives for the daemon.
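A minimal sketch of the wait-then-terminate ordering described above,
using only the standard subprocess module. stop_child and its timeout
are assumptions for illustration; the brief SIGTERM masking in the
daemon is not shown.

```python
import subprocess
import sys


def stop_child(proc, timeout=2.0):
    """Wait for 'proc' to exit; send SIGTERM only if it is still running."""
    try:
        # Give the child a chance to exit on its own first.
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.terminate()  # SIGTERM on POSIX
        return proc.wait()


if __name__ == "__main__":
    child = subprocess.Popen([sys.executable, "-c", "pass"])
    print(stop_child(child))
```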
When checking to see if the PV is missing we incorrectly checked that the
path_create was equal to PV creation. However, there are cases where we
are doing a lookup where the path_create == None. In this case, we would
fail to set lvm_id == None which caused a problem as we had more than 1
PV that was missing. When this occurred, the second lookup matched the
first missing PV that was added to the object manager. This resulted in
the following:
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/lvmdbusd/utils.py", line 667, in _run
    self.rc = self.f(*self.args)
  File "/usr/lib/python3.9/site-packages/lvmdbusd/fetch.py", line 25, in _main_thread_load
    (changes, remove) = load_pvs(
  File "/usr/lib/python3.9/site-packages/lvmdbusd/pv.py", line 46, in load_pvs
    return common(
  File "/usr/lib/python3.9/site-packages/lvmdbusd/loader.py", line 55, in common
    del existing_paths[dbus_object.dbus_object_path()]
Because we expect to find the object in existing_paths if we found it in
the lookup.
resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2085078
Latest upstream build of lvm results in the following error when
trying to use lvmshell.
"Argument --reportformat cannot be used in interactive mode.,
Error during parsing of command line."
When lvm is compiled with editline, if the file descriptors don't look like
a tty, then no "lvm> " prompt is done. Having lvm output the shell prompt
when consuming JSON on a report file descriptor is very useful in
determining if lvm command is complete.
Historically we have seen a few different errors which occur when we call
fullreport: a failing exit code, and JSON which is missing one or more keys.
Instruct lvm to dump the debug to a file during fullreport calls when we
fork & exec lvm. If we encounter an error, output the debug data.
The reason this isn't being done when lvmshell is used is because we
don't have an easy way to test the error paths.
This change is complicated by the following:
1. We don't know if fullreport was good until we evaluate all the JSON.
This is done a bit after we have called into lvm and returned.
2. We don't want to orphan the debug file used by lvm if the daemon is
killed. Thus we try to minimize the window where the debug file hasn't
already been unlinked. A RFE to pass an open FD to lvm for this
purpose is outstanding.
The temp. file is:
-rw------. 1 root root /tmp/lvmdbusd.lvm.debug.XXXXXXXX.log
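A minimal sketch of how such a file could be handled, assuming
tempfile.mkstemp is used to get the 0600 mode and random suffix. The
helper names are hypothetical and the default temp directory is used
rather than a hard-coded /tmp.

```python
import os
import tempfile


def make_debug_file():
    """Create a private debug log and return (fd, path)."""
    # mkstemp creates the file with mode 0600 and a unique random name.
    return tempfile.mkstemp(prefix="lvmdbusd.lvm.debug.", suffix=".log")


def read_and_unlink(fd, path):
    """Unlink first to shrink the orphan window, then read via the open fd."""
    os.unlink(path)
    with os.fdopen(fd, "r") as f:
        return f.read()
```

Unlinking before reading relies on POSIX semantics, where the data stays
reachable through the already open file descriptor.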
Introduce an exception which is used for known existing issues with lvm.
This is used to distinguish between errors in lvm itself and errors in
lvmdbusd.
In the case of lvm bugs, when we simply retry the operation we will log
very little. Otherwise, we will dump a full traceback for investigation
when we do the retry.
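A minimal sketch of that split. The exception name, the retry helper
and the attempt count are assumptions for illustration and may differ
from what lvmdbusd does.

```python
import logging
import traceback

log = logging.getLogger("sketch")


class LvmBug(RuntimeError):
    """Raised for a known, existing lvm issue (name is an assumption)."""


def run_with_retry(operation, attempts=3):
    """Retry 'operation', logging little for LvmBug and a lot otherwise."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except LvmBug as e:
            # Known lvm problem: a single terse line is enough.
            log.debug("known lvm issue, retry %d/%d: %s", attempt, attempts, e)
        except Exception:
            # Anything else may be our own bug: keep the full traceback.
            log.error("unexpected failure, retry %d/%d:\n%s",
                      attempt, attempts, traceback.format_exc())
    raise RuntimeError("operation failed after %d attempts" % attempts)
```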
Instead of lumping all the exceptions, break them out to handle the dbus
exceptions separately, to reduce the amount of debug information that ends
up in the journal that has questionable value.
Lvm occasionally fails to return all the requested JSON keys in the output of
"fullreport". This happens very rarely. When it does, the daemon reports
the following informational exception:
MThreadRunner: exception
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/lvmdbusd/utils.py", line 667, in _run
    self.rc = self.f(*self.args)
  File "/usr/lib/python3.9/site-packages/lvmdbusd/fetch.py", line 40, in _main_thread_load
    (lv_changes, remove) = load_lvs(
  File "/usr/lib/python3.9/site-packages/lvmdbusd/lv.py", line 143, in load_lvs
    return common(
  File "/usr/lib/python3.9/site-packages/lvmdbusd/loader.py", line 37, in common
    objects = retrieve(search_keys, cache_refresh=False)
  File "/usr/lib/python3.9/site-packages/lvmdbusd/lv.py", line 95, in lvs_state_retrieve
    l['vdo_operating_mode'],
KeyError: 'vdo_operating_mode'
The daemon retries the operation, which usually works and the daemon continues.
However, simply reporting this informational stack trace is causing CI and other
automated tests to fail as they expect no tracebacks in the log output.
Remove the reporting of this code path unless it persists and causes the daemon
to give up and exit.
Ref: https://bugzilla.redhat.com/show_bug.cgi?id=2120267
When the daemon isn't started with --debug we will keep a circular
buffer of the past N number of debug messages which we will output
when we encounter an issue.
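A minimal sketch of such a circular buffer, assuming a bounded
collections.deque. FlightRecorderSketch and its default size are
hypothetical.

```python
import threading
from collections import deque


class FlightRecorderSketch:
    """Keeps only the most recent 'size' messages."""

    def __init__(self, size=16):
        self._buf = deque(maxlen=size)  # oldest entries are dropped
        self._lock = threading.Lock()

    def add(self, message):
        with self._lock:
            self._buf.append(message)

    def dump(self):
        # Copy under the lock so a concurrent add() cannot disturb iteration.
        with self._lock:
            return list(self._buf)
```

For example, with size=3 and five messages added, dump() returns only
the last three.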
Rather than trying to bubble up return codes that get us to exit cleanly
it's better to just raise an exception to bail. In some cases functions
don't have return codes, so they cannot be checked.
We were incorrectly only setting this if --udev wasn't present on the
command line. In all cases when we see a manager.ExternalEvent we want
to set this.
Depending on when an error occurs, there may not be any information
available for lastlog. In this case try to grab an error message from the
original response.
Introduce a new lock for the flight recorder, so that we can dump it when
a command is blocked waiting for lvm to complete. Also, in all paths we
add the metadata to the flight recorder before the command is done, so we
have it when a command hangs and we dump the flight recorder. Add the
missing bits after the command has finished.
Cleaned up the output too.
in vgcreate for shared sanlock vg, if sanlock_write_resource
returns an unexpected error, then make init_vg_sanlock fail
which will cause the vgcreate to fail.
When storage is lost under a sanlock VG, and kill_vg/drop_vg
are used, sanlock_rem_lockspace() may return an error, but
the cleanup steps should still be performed. Without the
cleanup, gl_lsname_sanlock was not cleared. This caused
future lock requests to fail with ENOLS, but the NO_GL_LS
flag was not set due to gl_lsname_sanlock being set.
This caused lockd_global(sh) in lvm commands to fail when
they could succeed.
Fix from guozhonghua216
'.ID_FS_TYPE_NEW' is a custom property added by an LVM UDev rule
which is now being removed and 'ID_FS_TYPE' has the same value.
Signed-off-by: Vojtech Trefny <vtrefny@redhat.com>
Analyzer here was rather confused about the possibility of losing previously
assigned device pointers - fixed by passing zero-initialized memory
before the first assignment.
Mask for strncpy() Coverity report warning would
actually need to copy buffer from 'tmp_name' instead of 'str'.
But replace it directly with a single 'strncpy()' again for better readability,
and just mask out the warning reported for this strncpy instance
(so we do not need to put a comment for every call of strcpy_name_len).
- Use a new function for all instances of copying
a null-terminated string into a fixed size struct
field that is not null-terminated.
- use memcpy when copying between struct fields of
the same size
In testing where we inject large amounts of additional output in stderr
we can occasionally get truncated stdout from lvm. Catch and dump
the json for debug before we re-raise the exception. As this doesn't
happen without the error injecting wrapper around lvm, the error seems to
be with the wrapper.
Signed-off-by: Tony Asleson <tasleson@redhat.com>
When exec'ing lvm, it's possible to get large amounts of both stdout
and stderr depending on the state of lvm and the size of the lvm
configuration. If we allow any of the buffers to fill we can end
up deadlocking the process. Ensure we are handling stdout & stderr
during lvm execution.
Ref. https://bugzilla.redhat.com/show_bug.cgi?id=1966636
Signed-off-by: Tony Asleson <tasleson@redhat.com>
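A minimal sketch of draining both pipes, assuming Popen.communicate()
from the standard library. This illustrates the deadlock avoidance
described above and is not necessarily how lvmdbusd reads the
descriptors; run_collect is a hypothetical helper.

```python
import subprocess
import sys


def run_collect(argv):
    """Run argv and return (exit_code, stdout, stderr)."""
    proc = subprocess.Popen(
        argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    # communicate() services both pipes until EOF, so neither pipe buffer
    # can fill up and block the child while we wait on the other one.
    out, err = proc.communicate()
    return proc.returncode, out, err


if __name__ == "__main__":
    code = "import sys; sys.stderr.write('e' * 200000); sys.stdout.write('o' * 200000)"
    rc, out, err = run_collect([sys.executable, "-c", code])
    print(rc, len(out), len(err))
```

Reading stdout to completion before touching stderr (or the reverse) is
the pattern that can hang once the unread pipe fills.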
When we are walking the new lvm state comparing it to the old state we can
run into an issue where we remove a VG that is no longer present from the
object manager, but is still needed by LVs that are left to be processed.
When we try to process existing LVs to see if their state needs to be
updated, or if they need to be removed, we need to be able to reference the
VG that was associated with it. However, if it's been removed from the
object manager we fail to find it which results in:
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/lvmdbusd/utils.py", line 666, in _run
    self.rc = self.f(*self.args)
  File "/usr/lib/python3.6/site-packages/lvmdbusd/fetch.py", line 36, in _main_thread_load
    cache_refresh=False)[1]
  File "/usr/lib/python3.6/site-packages/lvmdbusd/lv.py", line 146, in load_lvs
    lv_name, object_path, refresh, emit_signal, cache_refresh)
  File "/usr/lib/python3.6/site-packages/lvmdbusd/loader.py", line 68, in common
    num_changes += dbus_object.refresh(object_state=o)
  File "/usr/lib/python3.6/site-packages/lvmdbusd/automatedproperties.py", line 160, in refresh
    search = self.lvm_id
  File "/usr/lib/python3.6/site-packages/lvmdbusd/lv.py", line 483, in lvm_id
    return self.state.lvm_id
  File "/usr/lib/python3.6/site-packages/lvmdbusd/lv.py", line 173, in lvm_id
    return "%s/%s" % (self.vg_name_lookup(), self.Name)
  File "/usr/lib/python3.6/site-packages/lvmdbusd/lv.py", line 169, in vg_name_lookup
    return cfg.om.get_object_by_path(self.Vg).Name
Instead of removing objects from the object manager immediately, we will
keep them in a list and remove them once we have processed all of the state.
Ref:
https://bugzilla.redhat.com/show_bug.cgi?id=1968752
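A minimal sketch of the deferred removal described above. The dict
standing in for the object manager and the function name are
hypothetical; only the ordering (collect, process, then delete) is the
point.

```python
def refresh(om, current_paths, lv_to_vg):
    """Drop objects missing from current_paths, but only after every LV
    has been processed.

    om: dict mapping object path -> name (stand-in for the object manager)
    lv_to_vg: dict mapping LV path -> VG path
    """
    # Collect what has to go, without deleting anything yet.
    to_remove = [path for path in om if path not in current_paths]

    # LVs that are going away can still resolve their VG here, because
    # nothing has been deleted from 'om' so far.
    lvm_ids = {
        lv: "%s/%s" % (om[vg], om[lv])
        for lv, vg in lv_to_vg.items()
        if lv in om
    }

    # All state has been processed, now the removals are safe.
    for path in to_remove:
        del om[path]
    return lvm_ids
```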
As SUSE build tool reports the warning:
lvmlockd-core.c: In function 'client_thread_main':
lvmlockd-core.c:4959:37: warning: '%d' directive output may be truncated writing between 1 and 10 bytes into a region of size 6 [-Wformat-truncation=]
snprintf(buf, sizeof(buf), "path[%d]", i);
^~
lvmlockd-core.c:4959:31: note: directive argument in the range [0, 2147483647]
snprintf(buf, sizeof(buf), "path[%d]", i);
^~~~~~~~~~
To dismiss the compilation warning, enlarge the array "buf" to 17
bytes to support the max signed integer: string format 6 bytes + signed
integer 10 bytes + terminal char "\0".
Reported-by: Heming Zhao <heming.zhao@suse.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
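The buffer size arithmetic above can be checked quickly. This is only a
worked example of the 6 + 10 + 1 sum, assuming a 32-bit signed int and
the non-negative range the compiler reported.

```python
INT_MAX = 2**31 - 1  # largest value of a 32-bit signed int

# Longest string the format can produce for an index in [0, INT_MAX].
longest = "path[%d]" % INT_MAX

# One extra byte for the terminating NUL that snprintf writes.
needed = len(longest) + 1

print(longest, needed)
```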
To allow the IDM locking scheme to be used by users, this patch hooks up
the IDM wrapper; it also introduces a new locking type "idm", which can be
used for the global lock with the option '-g idm'.
To support the IDM locking type, the main change in the data structure is
to add a PV path array. The PV list is transferred from the lvm commands;
when the lvmlockd core layer receives a message, it extracts the entries
with the keyword "path[idx]". Finally, the PV list is passed to the IDM
lock manager as the target drives for sending IDM SCSI commands.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Alongside the existing locking schemes of DLM and sanlock, this patch
introduces a new locking scheme: In-Drive-Mutex (IDM).
With IDM support in the drive, the locks are resident in the drive,
thus the locking lease is maintained in a central place: the drive
firmware. We can consider this a typical client-server model:
every host (or node) in the server cluster sends requests for
leasing a mutex to the drive firmware, and the drive firmware works as
an arbitrator that grants the mutex to a requester and can reject other
applicants if the mutex has already been acquired. To satisfy the LVM
activation for different modes, IDM supports two locking modes:
exclusive and shareable.
Every IDM is identified with two IDs: one is the host ID and the other is
the resource ID. The resource ID is a unique identifier for the
resource it protects. In the integration with lvmlockd, the resource
ID combines the VG's UUID and the LV's UUID; for the global lock,
the bytes in the resource ID are all zeros, and for the VG lock, the
LV's UUID is set to zero. Every host can generate a random UUID and
use it as the host ID for the SCSI command; this ID is used to identify
the owner of the mutex.
To make it easy to send IDM commands to the drive, as with other locking
schemes (e.g. sanlock), a daemon program named IDM lock manager is
created, so the detailed IDM SCSI commands are encapsulated in the
daemon, and lvmlockd uses the wrapper APIs to communicate with the
daemon program.
This patch introduces the IDM locking wrapper layer; it forwards the
locking requests from lvmlockd to the IDM lock manager, and returns the
results of the drives' responses.
One thing that should be mentioned is the IDM's LVB. IDM supports an LVB
of at most 7 bytes when stored in the drive; the most significant byte of
the 8 bytes is reserved for control bits. For this reason, the patch maps
the timestamp in microsecond units to its cached LVB; essentially, if any
timestamp was updated by other nodes, the local LVB is invalid. When the
timestamp is stored into the drive's LVB, it's possible to cause a
time-going-backwards issue, introduced by the time precision or missing
synchronization across multiple nodes. So the IDM wrapper fixes up the
timestamp by incrementing the latest value by 1 and writing it back into
the drive.
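A minimal sketch of the timestamp fix-up described above. The function
name is hypothetical, and masking the value to the low 7 bytes is an
assumption drawn from the reserved top byte, not a statement about the
wrapper's code.

```python
# Low 7 bytes carry the value; the top byte is reserved for control bits.
LVB_MASK = (1 << 56) - 1


def next_lvb_timestamp(now, latest):
    """Return a timestamp strictly greater than the latest one seen.

    now:    local clock reading
    latest: newest timestamp already stored in the drive's LVB
    """
    # If the local clock is not ahead, bump the stored value by 1 so the
    # timestamp never appears to go backwards.
    candidate = now if now > latest else latest + 1
    return candidate & LVB_MASK
```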
Currently the LVB is used to track VG changes, and its purpose is to
notify lvmetad of cache invalidation when it detects that any metadata
has been altered; but since lvmetad is no longer used for caching
metadata, the LVB doesn't really do anything. It's possible that the LVB
functionality could be useful again in the future, so let's enable it
for IDM in the first place.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
If the 'act' has already been processed by add_client_result()
it could possibly have been released - so avoid accessing 'act->'
afterward and go to the next item directly.
Enhance handling of interruptions of polling process and lvmpoll daemon.
Daemon should now react much faster to interrupts (i.e. shutdown
sequence) and avoid taking a lengthy sleep waiting on pvmove signaling.
with fork and exec to avoid use of shell.
largely copied from lib/misc/lvm-exec.c
require lvmlockctl_kill_command to be full path
use lvm config instead of lvmconfig to avoid need for LVM_DIR
The LVM devices file lists devices that lvm can use. The default
file is /etc/lvm/devices/system.devices, and the lvmdevices(8)
command is used to add or remove device entries. If the file
does not exist, or if lvm.conf includes use_devicesfile=0, then
lvm will not use a devices file. When the devices file is in use,
the regex filter is not used, and the filter settings in lvm.conf
or on the command line are ignored.
LVM records devices in the devices file using hardware-specific
IDs, such as the WWID, and attempts to use subsystem-specific
IDs for virtual device types. These device IDs are also written
in the VG metadata. When no hardware or virtual ID is available,
lvm falls back to using the unstable device name as the device ID.
When devnames are used, lvm performs extra scanning to find
devices if their devname changes, e.g. after reboot.
When proper device IDs are used, an lvm command will not look
at devices outside the devices file, but when devnames are used
as a fallback, lvm will scan devices outside the devices file
to locate PVs on renamed devices. A config setting
search_for_devnames can be used to control the scanning for
renamed devname entries.
Related to the devices file, the new command option
--devices <devnames> allows a list of devices to be specified for
the command to use, overriding the devices file. The listed
devices act as a sort of devices file in terms of limiting which
devices lvm will see and use. Devices that are not listed will
appear to be missing to the lvm command.
Multiple devices files can be kept in /etc/lvm/devices, which
allows lvm to be used with different sets of devices, e.g.
system devices do not need to be exposed to a specific application,
and the application can use lvm on its own set of devices that are
not exposed to the system. The option --devicesfile <filename> is
used to select the devices file to use with the command. Without
the option set, the default system devices file is used.
Setting --devicesfile "" causes lvm to not use a devices file.
An existing, empty devices file means lvm will see no devices.
The new command vgimportdevices adds PVs from a VG to the devices
file and updates the VG metadata to include the device IDs.
vgimportdevices -a will import all VGs into the system devices file.
LVM commands run by dmeventd do not use a devices file by default,
and will look at all devices on the system. A devices file can
be created for dmeventd (/etc/lvm/devices/dmeventd.devices). If
this file exists, lvm commands run by dmeventd will use it.
Internal implementation:
- device_ids_read - read the devices file
. add struct dev_use (du) to cmd->use_devices for each devices file entry
- dev_cache_scan - get /dev entries
. add struct device (dev) to dev_cache for each device on the system
- device_ids_match - match devices file entries to /dev entries
. match each du on cmd->use_devices to a dev in dev_cache, using device ID
. on match, set du->dev, dev->id, dev->flags MATCHED_USE_ID
- label_scan - read lvm headers and metadata from devices
. filters are applied, those that do not need data from the device
. filter-deviceid skips devs without MATCHED_USE_ID, i.e.
skips /dev entries that are not listed in the devices file
. read lvm label from dev
. filters are applied, those that use data from the device
. read lvm metadata from dev
. add info/vginfo structs for PVs/VGs (info is "lvmcache")
- device_ids_find_renamed_devs - handle devices with unstable devname ID
where devname changed
. this step only needed when devs do not have proper device IDs,
and their dev names change, e.g. after reboot sdb becomes sdc.
. detect incorrect match because PVID in the devices file entry
does not match the PVID found when the device was read above
. undo incorrect match between du and dev above
. search system devices for new location of PVID
. update devices file with new devnames for PVIDs on renamed devices
. label_scan the renamed devs
- continue with command processing
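The matching and renamed-device steps above can be sketched roughly as
follows. This is a heavily simplified illustration: the dict fields
(idname, id, pvid, name) and the function are hypothetical, and the
PVID is assumed to be known already, whereas lvm only learns it during
label_scan.

```python
def match_entries(use_devices, dev_cache):
    """Match devices file entries to system devices.

    use_devices: list of dicts with 'idname' and 'pvid' (file entries)
    dev_cache:   list of dicts with 'name', 'id' and 'pvid' (system devs)
    Returns a list of (pvid, device name) pairs for matched entries.
    """
    by_id = {dev["id"]: dev for dev in dev_cache}
    by_pvid = {dev["pvid"]: dev for dev in dev_cache}
    matched = []
    for du in use_devices:
        dev = by_id.get(du["idname"])
        if dev is None or dev["pvid"] != du["pvid"]:
            # Missing or incorrect match (e.g. sdb became sdc): undo it
            # and search the system devices for the PVID instead.
            dev = by_pvid.get(du["pvid"])
            if dev is None:
                continue  # PV is not present on the system
            du["idname"] = dev["id"]  # record the new devname
        matched.append((du["pvid"], dev["name"]))
    return matched
```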
cmd context has a 'threaded' value that used to be set
by clvmd - and allowed proper memory locking management.
Reuse the same bit for dmeventd.
Since dmeventd is using 300KiB stack per thread,
we will ignore any user settings for allocation/reserved_stack
until some better solution is found.
This avoids crashing of dmeventd when the user changes this value
and because in most cases lvm2 should work ok with 64K stack
size, this change should not cause any problems.
Switch remaining zero sized struct to flexible arrays to be C99
compliant.
These simple rules should apply:
- The incomplete array type must be the last element within the structure.
- There cannot be an array of structures that contain a flexible array member.
- Structures that contain a flexible array member cannot be used as a member of another structure.
- The structure must contain at least one named member in addition to the flexible array member.
Although some of the code pieces should still be improved.
The lock adopt feature was disabled since it had used
lvmetad as a source of info. This replaces the lvmetad
info with a local file and enables the adopt feature again
(enabled with lvmlockd --adopt 1).
dmeventd is 'scanning' statuses in a loop (most usually in 10sec
intervals) - and meanwhile it sleeps within:
pthread_cond_timedwait()
However this function call sometimes tends to wake up a short amount of
time sooner - and our code still believes the 'right time' has not yet
arrived and basically for a moment 'busy-looped' on calling this
function - so for systems with 'clock_gettime()' present we obtain
the time and go 10ms into the future second - this avoids unneeded
repeated invocation of our time scheduling loop.
TODO: monitoring during 1 hour 'time-change'...
When _daemon_read()/_client_read() fails during the read,
ensure memory allocated within the function is also released here
(so the caller does not need to care). Also improve code readability a bit
and use more similar code for the same functionality.
Since we fixed linking of the proper version of 'libdevmapper' by
linking the lvm2 plugin correctly, we already have the correct function
available, linked with the internal lvm library.
So drop the unneeded include of the parsing function.