Add a global timeout value that threads use to stop waiting on
whatever they are blocked on. The values previously varied from 2 to 5
seconds, which is far longer than needed. A value of 0.5 seconds shows no
CPU load when the service is running and idle.
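A minimal sketch of the shared-timeout idea, assuming a worker thread that drains a queue. The constant `LOOP_TIMEOUT` and the function names are illustrative, not taken from the daemon:

```python
import queue
import threading
import time

# Hypothetical shared timeout (seconds). Every blocking wait uses it, so
# threads notice a shutdown request promptly without busy-looping.
LOOP_TIMEOUT = 0.5


def worker(requests, stop, handled):
    """Drain `requests` until `stop` is set, waking at most every LOOP_TIMEOUT."""
    while not stop.is_set():
        try:
            item = requests.get(timeout=LOOP_TIMEOUT)
        except queue.Empty:
            continue  # nothing queued; loop around and re-check the stop flag
        handled.append(item)


def run_demo():
    requests, stop, handled = queue.Queue(), threading.Event(), []
    thread = threading.Thread(target=worker, args=(requests, stop, handled))
    thread.start()
    requests.put("refresh")
    while not handled:          # wait until the worker picked up the request
        time.sleep(0.01)
    stop.set()                  # worker exits within roughly LOOP_TIMEOUT
    thread.join(timeout=4 * LOOP_TIMEOUT)
    return handled, thread.is_alive()
```

The worker sleeps inside `get()` rather than polling, which is why a short timeout does not translate into measurable CPU load while idle.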
Syscall 186 is specific to 64-bit x86. Because the syscall number differs
from architecture to architecture, and between word sizes of the same
architecture, we only retrieve the thread ID using built-in Python
support, and only when that support is available.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2166931
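One way to do the thread ID lookup described above without hard-coding a syscall number is sketched below. It relies on `threading.get_native_id()`, which exists on Python 3.8+ on supported platforms; the helper names are illustrative:

```python
import threading


def native_thread_id():
    """Return the kernel thread ID, or None when the interpreter lacks support."""
    get_id = getattr(threading, "get_native_id", None)
    return get_id() if get_id is not None else None


def thread_label():
    """Build a label for log lines, including the native ID when we have one."""
    name = threading.current_thread().name
    tid = native_thread_id()
    return f"{name}[{tid}]" if tid is not None else name
```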
When the daemon is starting we do an initial fetch of lvm state. If we
happened to get some type of failure with lvm during this time we would
exit. During error injection testing this happened enough that
the unit tests were unable to finish. Add retries to ensure we can get
started during error injection testing.
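The retry behaviour can be sketched as follows. This is an assumption-laden illustration: `fetch_with_retries`, its parameters, and the retry count are hypothetical and not the daemon's actual code:

```python
import time


def fetch_with_retries(fetch, attempts=5, delay=0.1):
    """Call `fetch` until it succeeds or `attempts` tries have failed."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return fetch()
        except Exception as error:  # broad on purpose: any lvm failure retries
            last_error = error
            time.sleep(delay)
    raise last_error  # every attempt failed; surface the final error
```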
When we sort the LVs, we can stumble on a missing key. Protect against
this as well.
Seen in error injection testing:
Traceback (most recent call last):
  File "/home/tasleson/projects/lvm2/daemons/lvmdbusd/fetch.py", line 198, in update_thread
    num_changes = load(*_load_args(queued_requests))
  File "/home/tasleson/projects/lvm2/daemons/lvmdbusd/fetch.py", line 83, in load
    rc = MThreadRunner(_main_thread_load, refresh, emit_signal).done()
  File "/home/tasleson/projects/lvm2/daemons/lvmdbusd/utils.py", line 726, in done
    raise self.exception
  File "/home/tasleson/projects/lvm2/daemons/lvmdbusd/utils.py", line 732, in _run
    self.rc = self.f(*self.args)
  File "/home/tasleson/projects/lvm2/daemons/lvmdbusd/fetch.py", line 40, in _main_thread_load
    (lv_changes, remove) = load_lvs(
  File "/home/tasleson/projects/lvm2/daemons/lvmdbusd/lv.py", line 148, in load_lvs
    return common(
  File "/home/tasleson/projects/lvm2/daemons/lvmdbusd/loader.py", line 37, in common
    objects = retrieve(search_keys, cache_refresh=False)
  File "/home/tasleson/projects/lvm2/daemons/lvmdbusd/lv.py", line 72, in lvs_state_retrieve
    lvs = sorted(cfg.db.fetch_lvs(selection), key=get_key)
  File "/home/tasleson/projects/lvm2/daemons/lvmdbusd/lv.py", line 35, in get_key
    pool = i['pool_lv']
KeyError: 'pool_lv'
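A sort key that tolerates the missing key could look like the sketch below. The ordering rule (records without a pool first, then by name) and the sample records are assumptions made for illustration; only the `pool_lv` key comes from the traceback:

```python
def get_key(record):
    """Sort key that does not raise when 'pool_lv' is absent."""
    pool = record.get("pool_lv", "")
    # Records without a pool sort first; ties are broken by LV name.
    return (bool(pool), record.get("lv_name", ""))


lvs = [
    {"lv_name": "thin1", "pool_lv": "pool0"},
    {"lv_name": "broken"},               # 'pool_lv' missing, as in the traceback
    {"lv_name": "pool0", "pool_lv": ""},
]
ordered = [lv["lv_name"] for lv in sorted(lvs, key=get_key)]
```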
There is a window of time where the following can occur:
1. An API request to the lvm shell is in progress: we have written a
   command to the lvm shell and the thread is blocked waiting for a
   response.
2. A signal arrives at the daemon which causes us to exit. The signal
   handling code path goes directly to the lvm shell and writes
   "exit\n". This causes the lvm shell to simply exit.
3. The thread that was waiting for a response gets an EIO as the child
   process has exited. This bubbles up as a failure.
This is addressed by placing a lock in the lvm shell to prevent
concurrent access to the shell. We also gather additional debug data
when we get an error in the lvm shell read path. This should help if
the lvm shell exits/crashes on its own.
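The locking described above can be sketched with a stand-in for the child process. `FakeShell` and `ShellProxy` are hypothetical classes written for this illustration; they only model the ordering problem, not the real shell I/O:

```python
import threading
import time


class FakeShell:
    """Stand-in for the lvm shell child process (hypothetical)."""

    def __init__(self):
        self.alive = True
        self.busy = threading.Event()

    def run(self, cmd):
        self.busy.set()           # a command is now in flight
        time.sleep(0.05)          # pretend the command takes a while
        if not self.alive:
            raise OSError(5, "Input/output error")  # EIO: child went away
        return f"ok: {cmd}"

    def exit(self):
        self.alive = False


class ShellProxy:
    def __init__(self):
        self._shell = FakeShell()
        self._lock = threading.RLock()

    def call(self, cmd):
        with self._lock:          # one user of the shell at a time
            return self._shell.run(cmd)

    def exit_shell(self):
        with self._lock:          # waits for any in-flight command first
            self._shell.exit()


def demo():
    proxy = ShellProxy()
    results = []
    thread = threading.Thread(target=lambda: results.append(proxy.call("lvs")))
    thread.start()
    proxy._shell.busy.wait()      # make sure the command has started
    proxy.exit_shell()            # blocks until the command has finished
    thread.join()
    return results
```

Without the lock, `exit_shell()` would flip `alive` while the command was still running and the caller would see the EIO from step 3.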
Previously we relied on udev events only until we received a dbus
notification from the lvm command line tools. However, this misses the
case where something outside of lvm clears the signatures on a block
device, and we then fail to refresh the state of the daemon. Change the
behavior so we always monitor udev events, but ignore those udev events
that pertain to lvm members.
Note: the --udev command line option no longer does anything and simply
outputs a message that it's no longer used.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1967171
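A rough sketch of the event filter, assuming each udev event is available as a dictionary of udev properties and that lvm members are recognised by `ID_FS_TYPE` being `LVM2_member`. The function name and the dictionary representation are illustrative:

```python
LVM_MEMBER = "LVM2_member"


def should_refresh(event):
    """Return True when a udev event should trigger a state refresh."""
    if event.get("SUBSYSTEM") != "block":
        return False
    # Events for devices that still carry an lvm signature are ignored.
    # A device whose signature was wiped no longer reports LVM2_member,
    # so its event passes the filter and triggers a refresh.
    return event.get("ID_FS_TYPE") != LVM_MEMBER
```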
Previously, when the __del__ method ran on LVMShellProxy, we would blindly
call terminate(). This was a race condition, as the underlying process
may or may not still be present. When the process is still present, the
SIGTERM ends up being seen by lvmdbusd too. Rework the code so that we
first wait for the child process to exit, and only send it a SIGTERM if it
hasn't exited. We also ensure that while this is executing we briefly
ignore a SIGTERM that arrives for the daemon.
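The wait-then-terminate sequence might look like the following sketch. It assumes a POSIX system and that it runs in the main thread (Python only allows `signal.signal()` there); `shutdown_child` and its timeout are illustrative names and values:

```python
import signal
import subprocess
import sys


def shutdown_child(proc, wait_seconds=1.0):
    """Wait for `proc` to exit; send SIGTERM only if it is still running."""
    try:
        return proc.wait(timeout=wait_seconds)
    except subprocess.TimeoutExpired:
        pass
    # Briefly ignore SIGTERM in this process while we terminate the child.
    previous = signal.signal(signal.SIGTERM, signal.SIG_IGN)
    try:
        proc.terminate()
        return proc.wait()
    finally:
        if previous is not None:  # None means a handler not set from Python
            signal.signal(signal.SIGTERM, previous)


if __name__ == "__main__":
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    print(shutdown_child(child, wait_seconds=0.1))
```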
When checking whether the PV is missing, we incorrectly checked that
path_create was equal to PV creation. However, there are cases where we
do a lookup with path_create == None. In that case we would fail to set
lvm_id to None, which caused a problem when more than one PV was missing:
the second lookup matched the first missing PV that had been added to the
object manager. This resulted in the following:
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/lvmdbusd/utils.py", line 667, in _run
    self.rc = self.f(*self.args)
  File "/usr/lib/python3.9/site-packages/lvmdbusd/fetch.py", line 25, in _main_thread_load
    (changes, remove) = load_pvs(
  File "/usr/lib/python3.9/site-packages/lvmdbusd/pv.py", line 46, in load_pvs
    return common(
  File "/usr/lib/python3.9/site-packages/lvmdbusd/loader.py", line 55, in common
    del existing_paths[dbus_object.dbus_object_path()]
The deletion fails because we expect to find the object in existing_paths
if we found it in the lookup.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2085078
The latest upstream build of lvm produces the following error when
trying to use lvmshell:
"Argument --reportformat cannot be used in interactive mode.,
Error during parsing of command line."
When lvm is compiled with editline, no "lvm> " prompt is printed if the
file descriptors don't look like a tty. Having lvm output the shell prompt
when consuming JSON on a report file descriptor is very useful in
determining whether an lvm command is complete.
Historically we have seen a few different errors when we call
fullreport: a failing exit code, and JSON that is missing one or more keys.
Instruct lvm to dump its debug output to a file during fullreport calls
when we fork & exec lvm. If we encounter an error, output the debug data.
This isn't done when lvmshell is used because we don't have an easy way
to test the error paths.
This change is complicated by the following:
1. We don't know if fullreport was good until we evaluate all the JSON.
   This is done a bit after we have called into lvm and returned.
2. We don't want to orphan the debug file used by lvm if the daemon is
   killed. Thus we try to minimize the window where the debug file hasn't
   already been unlinked. An RFE to pass an open FD to lvm for this
   purpose is outstanding.
The temporary file is:
-rw------. 1 root root /tmp/lvmdbusd.lvm.debug.XXXXXXXX.log
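The handling of the debug file can be sketched as below, assuming a POSIX system where an open file stays readable after it is unlinked. `run_with_debug_file` and the `make_cmd` callback are hypothetical; how lvm is told to log to the path is deliberately left out, and the demo uses a small Python child in its place:

```python
import os
import subprocess
import sys
import tempfile


def run_with_debug_file(make_cmd):
    """Run a command that writes debug output to a private temp file.

    `make_cmd(path)` builds the argv for the child. Returns
    (exit_code, debug_text) so the caller can decide later, once the JSON
    has been evaluated, whether the debug text is worth logging.
    """
    fd, path = tempfile.mkstemp(prefix="lvmdbusd.lvm.debug.", suffix=".log")
    with os.fdopen(fd) as debug:
        try:
            rc = subprocess.call(make_cmd(path))
        finally:
            # Unlink as soon as the child is done. Our open descriptor keeps
            # the data readable, so killing the daemon later orphans nothing.
            os.unlink(path)
        return rc, debug.read()


def demo_cmd(path):
    # Stand-in child: writes one debug line to `path` and exits with 5.
    script = "import sys; open(sys.argv[1], 'w').write('debug line'); sys.exit(5)"
    return [sys.executable, "-c", script, path]
```

The file is still visible under its name while the child runs, which is the window the text above tries to keep small.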
Introduce an exception which is used for known existing issues with lvm.
It distinguishes errors in lvm itself from errors in lvmdbusd.
In the case of lvm bugs, we log very little when we simply retry the
operation. Otherwise, we dump a full traceback for investigation
when we do the retry.
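A minimal sketch of that split, using the standard logging module. The exception name `LvmBug`, the helper `retry_operation`, and the attempt count are assumptions made for illustration:

```python
import logging
import traceback

log = logging.getLogger("lvmdbusd.sketch")


class LvmBug(RuntimeError):
    """Hypothetical marker for failures attributed to a known lvm issue."""


def retry_operation(operation, attempts=3):
    """Retry `operation`, logging tersely for LvmBug and fully otherwise."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except LvmBug as bug:
            # Known lvm problem: one short line is enough.
            log.info("known lvm issue, retrying (%d/%d): %s", attempt, attempts, bug)
        except Exception:
            # Possibly our own bug: keep the full traceback for investigation.
            log.error("unexpected failure, retrying (%d/%d)\n%s",
                      attempt, attempts, traceback.format_exc())
    raise RuntimeError(f"giving up after {attempts} attempts")
```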
Instead of lumping all the exceptions together, break them out so the dbus
exceptions are handled separately. This reduces the amount of debug
information of questionable value that ends up in the journal.
Lvm occasionally fails to return all the requested JSON keys in the output
of "fullreport". This happens very rarely. When it does, the daemon reports
the resulting informational exception:
MThreadRunner: exception
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/lvmdbusd/utils.py", line 667, in _run
    self.rc = self.f(*self.args)
  File "/usr/lib/python3.9/site-packages/lvmdbusd/fetch.py", line 40, in _main_thread_load
    (lv_changes, remove) = load_lvs(
  File "/usr/lib/python3.9/site-packages/lvmdbusd/lv.py", line 143, in load_lvs
    return common(
  File "/usr/lib/python3.9/site-packages/lvmdbusd/loader.py", line 37, in common
    objects = retrieve(search_keys, cache_refresh=False)
  File "/usr/lib/python3.9/site-packages/lvmdbusd/lv.py", line 95, in lvs_state_retrieve
    l['vdo_operating_mode'],
KeyError: 'vdo_operating_mode'
The daemon retries the operation, which usually works, and the daemon
continues. However, simply reporting this informational stack trace causes
CI and other automated tests to fail, as they expect no tracebacks in the
log output. Stop reporting this traceback unless the failure persists and
causes the daemon to give up and exit.
Ref: https://bugzilla.redhat.com/show_bug.cgi?id=2120267