1
0
mirror of git://sourceware.org/git/lvm2.git synced 2025-01-07 21:18:59 +03:00
Commit Graph

19 Commits

Author SHA1 Message Date
Tony Asleson
66e79aab36 lvmdbusd: Add a retries during initial load
When the daemon is starting we do an initial fetch of lvm state.  If we
happened to get some type of failure with lvm during this time we would
exit.  During error injection testing this happened enough that
the unit tests were unable to finish.  Add retries to ensure we can get
started during error injection testing.
2023-03-10 12:51:53 -06:00
Tony Asleson
ed9072dad8 lvmdbusd: Correct undefined var 2022-09-16 10:49:37 -05:00
Tony Asleson
85fcbfd9d7 lvmdbusd: Instruct lvm to output debug to file for fullreport
Historically we have seen a few different errors which occur when we call
fullreport.  Failing exit code and JSON which is missing one or more keys.
Instruct lvm to dump the debug to a file during fullreport calls when we
fork & exec lvm. If we encounter an error, ouput the debug data.
The reason this isn't being done when lvmshell is used is because we
don't have an easy way to test the error paths.

This change is complicated by the following:

1. We don't know if fullreport was good until we evaluate all the JSON.
   This is done a bit after we have called into lvm and returned.
2. We don't want to orphan the debug file used by lvm if the daemon is
   killed. Thus we try to minimize the window where the debug file hasn't
   already been unlinked.  A RFE to pass an open FD to lvm for this
   purpose is outstanding.

The temp. file is:
-rw------. 1 root root /tmp/lvmdbusd.lvm.debug.XXXXXXXX.log
2022-09-16 10:49:37 -05:00
Tony Asleson
d42bdb07de lvmdbusd: Re-work error handling
Introduce an exception which is used for known existing issues with lvm.
This is used to distinguish between errors between lvm itself and lvmdbusd.
In the case of lvm bugs, when we simply retry the operation we will log
very little.  Otherwise, we will dump a full traceback for investigation
when we do the retry.
2022-09-16 10:49:37 -05:00
Tony Asleson
e5c41b94b8 lvmdbusd: refactor and correct fetch thread logic
Simplify the fetch thread and correct the logic used for selecting the
options which are used when we batch update a state refresh.
2022-09-16 10:49:37 -05:00
Tony Asleson
e6e874922e lvmdbusd: Handle SIGINT quietly
Change how we exit on SIGINT so that we don't output needless debug.
2022-09-16 10:49:37 -05:00
Tony Asleson
0296e56073 lvmdbusd: Don't report recoverable error
Lvm occasionally fails to return all the request JSON keys in the output of
"fullreport".  This happens very rarely.  When it does the daemon was reporting
the resulting informational exception:

MThreadRunner: exception
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/lvmdbusd/utils.py", line 667, in _run
    self.rc = self.f(*self.args)
  File "/usr/lib/python3.9/site-packages/lvmdbusd/fetch.py", line 40, in _main_thread_load
    (lv_changes, remove) = load_lvs(
  File "/usr/lib/python3.9/site-packages/lvmdbusd/lv.py", line 143, in load_lvs
    return common(
  File "/usr/lib/python3.9/site-packages/lvmdbusd/loader.py", line 37, in common
    objects = retrieve(search_keys, cache_refresh=False)
  File "/usr/lib/python3.9/site-packages/lvmdbusd/lv.py", line 95, in lvs_state_retrieve
    l['vdo_operating_mode'],
KeyError: 'vdo_operating_mode'

The daemon retries the operation, which usually works and the daemon continues.
However, simply reporting this informational stack trace is causing CI and other
automated tests to fail as they expect no tracebacks in the log output.

Remove the reporting of this code path unless it persists and causes the daemon
to give up and exit.

Ref: https://bugzilla.redhat.com/show_bug.cgi?id=2120267
2022-09-16 10:49:37 -05:00
Tony Asleson
b0c7220dbb lvmdbusd: Add debug circular buffer
When the daemon isn't started with --debug we will keep a circular
buffer of the past N number of debug messages which we will output
when we encounter an issue.
2022-09-16 10:49:37 -05:00
Tony Asleson
3d8882db83 lvmdbusd: fix hangs on SIGINT
Rather than trying to bubble up return codes that get us to exit cleanly
it's better to just raise an exception to bail.  In some cases functions
don't have return codes, so they cannot be checked.
2022-09-16 10:49:37 -05:00
Tony Asleson
6b9cc7432e lvmdbusd: Remove exclusionary language 2022-09-16 10:49:36 -05:00
Tony Asleson
f70d97b916 lvmdbusd: Defer dbus object removal
When we are walking the new lvm state comparing it to the old state we can
run into an issue where we remove a VG that is no longer present from the
object manager, but is still needed by LVs that are left to be processed.
When we try to process existing LVs to see if their state needs to be
updated, or if they need to be removed, we need to be able to reference the
VG that was associated with it.  However, if it's been removed from the
object manager we fail to find it which results in:

Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/lvmdbusd/utils.py", line 666, in _run
  self.rc = self.f(*self.args)
File "/usr/lib/python3.6/site-packages/lvmdbusd/fetch.py", line 36, in _main_thread_load
  cache_refresh=False)[1]
File "/usr/lib/python3.6/site-packages/lvmdbusd/lv.py", line 146, in load_lvs
  lv_name, object_path, refresh, emit_signal, cache_refresh)
File "/usr/lib/python3.6/site-packages/lvmdbusd/loader.py", line 68, in common
  num_changes += dbus_object.refresh(object_state=o)
File "/usr/lib/python3.6/site-packages/lvmdbusd/automatedproperties.py", line 160, in refresh
  search = self.lvm_id
File "/usr/lib/python3.6/site-packages/lvmdbusd/lv.py", line 483, in lvm_id
  return self.state.lvm_id
File "/usr/lib/python3.6/site-packages/lvmdbusd/lv.py", line 173, in lvm_id
  return "%s/%s" % (self.vg_name_lookup(), self.Name)
File "/usr/lib/python3.6/site-packages/lvmdbusd/lv.py", line 169, in vg_name_lookup
  return cfg.om.get_object_by_path(self.Vg).Name

Instead of removing objects from the object manager immediately, we will
keep them in a list and remove them once we have processed all of the state.

Ref:
https://bugzilla.redhat.com/show_bug.cgi?id=1968752
2021-06-16 12:19:02 -05:00
Tony Asleson
4dcb36aba4 lvmdbusd: Fix model inconsistency when LV loses interface
When a LV loses an interface it ends up getting removed and recreated.
This happens after the VGs have been processed and updated.  Thus when
this happens we need to re-check the VGs.
2019-10-30 10:38:40 -05:00
Tony Asleson
ab1f1a306b lvmdbusd: Exit daemon when unable to retrieve state
In some cases we get stuck where we are unable to retrieve the current
state of lvm as we are encountering an error.  When the error is
persistent we will log and exit the daemon instead of consuming vast
amounts of resources.

Signed-off-by: Tony Asleson <tasleson@redhat.com>
2018-12-20 10:27:30 -06:00
Tony Asleson
60e3dbd6d5 lvmdbusd: Give threads names
This will allow easier debug.
2017-09-27 07:45:00 -05:00
Tony Asleson
61420309ee lvmdbusd: Prevent stall when update thread gets exception
If during the process of fetching current lvm state we experience an
exception we fail to call set_result on the queued_requests we were
processing.  When this happens those threads block forever which causes
the service to stall infinitely.  Only clear the queued_requests after
we have called set_result.
2017-06-02 12:39:04 -05:00
Tony Asleson
72a943f720 lvmdbusd: Long running operations use own thread
Anything that ends up using polld will be done in a unique thread
so that we don't hold off other operations from running concurrently.
2016-11-17 11:35:16 -06:00
Tony Asleson
fa444906bb lvmdbusd: Use one thread to fetch state updates
In preparation to have more than one thread issuing commands to lvm
at the same time we need to serialize updates to the dbus state and
retrieving the global lvm state.  To achieve this we have one thread
handling this with a thread safe queue taking and coalescing requests.
2016-11-17 11:35:16 -06:00
Tony Asleson
38dd79307a lvmdbusd: Execute load in main thread
We will fetch the lvm state in non-main thread and only process the new
data with the main thread to prevent hanging the main thread event loop.

ref. https://bugs.freedesktop.org/show_bug.cgi?id=98521
2016-11-02 16:38:49 -05:00
Alasdair G Kergon
5987562cf9 lvmdbus: Add new daemon. 2016-02-17 23:53:35 +00:00