IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Instead of lumping all the exceptions, break them out to handle the dbus
exceptions separately, to reduce the amount of debug information that ends
up in the journal that has questionable value.
Lvm occasionally fails to return all the request JSON keys in the output of
"fullreport". This happens very rarely. When it does the daemon was reporting
the resulting informational exception:
MThreadRunner: exception
Traceback (most recent call last):
File "/usr/lib/python3.9/site-packages/lvmdbusd/utils.py", line 667, in _run
self.rc = self.f(*self.args)
File "/usr/lib/python3.9/site-packages/lvmdbusd/fetch.py", line 40, in _main_thread_load
(lv_changes, remove) = load_lvs(
File "/usr/lib/python3.9/site-packages/lvmdbusd/lv.py", line 143, in load_lvs
return common(
File "/usr/lib/python3.9/site-packages/lvmdbusd/loader.py", line 37, in common
objects = retrieve(search_keys, cache_refresh=False)
File "/usr/lib/python3.9/site-packages/lvmdbusd/lv.py", line 95, in lvs_state_retrieve
l['vdo_operating_mode'],
KeyError: 'vdo_operating_mode'
The daemon retries the operation, which usually works and the daemon continues.
However, simply reporting this informational stack trace is causing CI and other
automated tests to fail as they expect no tracebacks in the log output.
Remove the reporting of this code path unless it persists and causes the daemon
to give up and exit.
Ref: https://bugzilla.redhat.com/show_bug.cgi?id=2120267
When the daemon isn't started with --debug we will keep a circular
buffer of the past N number of debug messages which we will output
when we encounter an issue.
Rather than trying to bubble up return codes that get us to exit cleanly
it's better to just raise an exception to bail. In some cases functions
don't have return codes, so they cannot be checked.
We were incorrectly only setting this if --udev wasn't present on the
command line. In all cases when we see a manager.ExternalEvent we want
to set this.
Depending on when an occurs, it maynot have any information available for
lastlog. In this case try to grab an error message from the original
response.
Introduce a new lock for the flight recorder, so that we can dump it when
a command is block waiting for lvm to complete. Also in all paths we will
addthe metadata to the flight recorder before it's done, so we will have
it when a command hangs and we dump the flight recorder. Add the missing
bits after the command has finished.
Cleaned up the output too.
'.ID_FS_TYPE_NEW' is a custom property added by an LVM UDev rule
which is now being removed and 'ID_FS_TYPE' has the same value.
Signed-off-by: Vojtech Trefny <vtrefny@redhat.com>
In testing where we inject large amounts of additional output in stderr
we can occassionally get truncated stdout from lvm. Catching and dumping
the json for debug before we re-raise the exception. As this doesn't
happen without the error injecting wrapper around lvm, the error seems to
be with the wrapper.
Signed-off-by: Tony Asleson <tasleson@redhat.com>
When exec'ing lvm, it's possible to get large amounts of both stdout
and stderr depending on the state of lvm and the size of the lvm
configuration. If we allow any of the buffers to fill we can end
up deadlocking the process. Ensure we are handling stdout & stderr
during lvm execution.
Ref. https://bugzilla.redhat.com/show_bug.cgi?id=1966636
Signed-off-by: Tony Asleson <tasleson@redhat.com>
When we are walking the new lvm state comparing it to the old state we can
run into an issue where we remove a VG that is no longer present from the
object manager, but is still needed by LVs that are left to be processed.
When we try to process existing LVs to see if their state needs to be
updated, or if they need to be removed, we need to be able to reference the
VG that was associated with it. However, if it's been removed from the
object manager we fail to find it which results in:
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/lvmdbusd/utils.py", line 666, in _run
self.rc = self.f(*self.args)
File "/usr/lib/python3.6/site-packages/lvmdbusd/fetch.py", line 36, in _main_thread_load
cache_refresh=False)[1]
File "/usr/lib/python3.6/site-packages/lvmdbusd/lv.py", line 146, in load_lvs
lv_name, object_path, refresh, emit_signal, cache_refresh)
File "/usr/lib/python3.6/site-packages/lvmdbusd/loader.py", line 68, in common
num_changes += dbus_object.refresh(object_state=o)
File "/usr/lib/python3.6/site-packages/lvmdbusd/automatedproperties.py", line 160, in refresh
search = self.lvm_id
File "/usr/lib/python3.6/site-packages/lvmdbusd/lv.py", line 483, in lvm_id
return self.state.lvm_id
File "/usr/lib/python3.6/site-packages/lvmdbusd/lv.py", line 173, in lvm_id
return "%s/%s" % (self.vg_name_lookup(), self.Name)
File "/usr/lib/python3.6/site-packages/lvmdbusd/lv.py", line 169, in vg_name_lookup
return cfg.om.get_object_by_path(self.Vg).Name
Instead of removing objects from the object manager immediately, we will
keep them in a list and remove them once we have processed all of the state.
Ref:
https://bugzilla.redhat.com/show_bug.cgi?id=1968752
When a LV loses an interface it ends up getting removed and recreated.
This happens after the VGs have been processed and updated. Thus when
this happens we need to re-check the VGs.
VDO pool LVs are represented by a new dbus interface VgVdo. Currently
the interface only has additional VDO properties, but when the
ability to support additional LV creation is added we can add a method
to the interface.
With Python 3.8 converting these directly to string using str()
no longer works, we need to convert these to integer first.
On Python 3.8:
>>> str(dbus.Int64(1))
'dbus.Int64(1)'
On Python 3.7 (and older):
>>> str(dbus.UInt64(1))
'1'
This is probably related to removing __str__ function from method
from int (dbus.UInt is subclass of int) which happened in 3.8, see
https://docs.python.org/3.8/whatsnew/3.8.html
Signed-off-by: Vojtech Trefny <vtrefny@redhat.com>
Lvm can at times have duplicate names. When this happens the daemon will
internally use vg_name:vg_uuid as the name for lookups, but display just
the vg_name externally. If an API user uses the Manager.LookUpByLvmId and
queries the vg name they will only get returned one result as the API
can only accommodate returning 1. The one returned is the first instance
found when sorting the volume groups by UUID.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1583510
When we have two logical volumes which switch their names at the
same time we are left with incorrect lookups. Anytime we find
an entry by doing a lookup by UUID or by name we will ensure
that the lookups are indeed correct.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1642176
When a VG is exported, the 'fullreport' returns an exit code of 5, but
otherwise returns the data we are wanting.
Signed-off-by: Tony Asleson <tasleson@redhat.com>
In some cases we get stuck where we are unable to retrieve the current
state of lvm as we are encountering an error. When the error is
persistent we will log and exit the daemon instead of consuming vast
amounts of resources.
Signed-off-by: Tony Asleson <tasleson@redhat.com>
If we don't know the meaning we will return the key with default text
instead of raising an exception and taking the daemon out in the
process.
Resolves: rhbz1657950
When we get bug reports we may not get the entire log, so lets
dump the fight recorder from newest to oldest as the one we
are interested in was likely to be the last command run.
lvmdbusd executable script must use python3 interpreter detected by
configure script, as site-packages directory used for library is only
used by that interpreter.
If you send a SIGUSR1 (10) to the daemon it will dump all the
threads current stacks to stdout. This will be useful when the
daemon is apparently hung and not processing requests.
eg.
$ sudo kill -10 <daemon pid>
Make sure that any and all code that executes in the main thread is
wrapped with a try/except block to ensure that at the very least
we log when things are going wrong.
In some cases we are seeing where there are no VGs, but the data returned from
lvm shows that the PVs have the following for the VG:
"vg_name":"[unknown]", "vg_uuid":""
The code was only checking for the exitence of the VG name and we called into
the function get_object_path_by_uuid_lvm_id which requires both the VG name and
the LV name to exist (asserts this) which results in the following stack trace:
Traceback (most recent call last):
File "/home/tasleson/lvm2/daemons/lvmdbusd/utils.py", line 563, in runner
obj._run()
File "/home/tasleson/lvm2/daemons/lvmdbusd/utils.py", line 584, in _run
self.rc = self.f(*self.args)
File "/home/tasleson/lvm2/daemons/lvmdbusd/fetch.py", line 26, in
_main_thread_load
cache_refresh=False)[1]
File "/home/tasleson/lvm2/daemons/lvmdbusd/pv.py", line 48, in load_pvs
emit_signal, cache_refresh)
File "/home/tasleson/lvm2/daemons/lvmdbusd/loader.py", line 37, in common
objects = retrieve(search_keys, cache_refresh=False)
File "/home/tasleson/lvm2/daemons/lvmdbusd/pv.py", line 40, in
pvs_state_retrieve
p["pv_attr"], p["pv_tags"], p["vg_name"], p["vg_uuid"]))
File "/home/tasleson/lvm2/daemons/lvmdbusd/pv.py", line 84, in __init__
vg_uuid, vg_name, vg_obj_path_generate)
File "/home/tasleson/lvm2/daemons/lvmdbusd/objectmanager.py", line 318,
in get_object_path_by_uuid_lvm_id
assert uuid
AssertionError
When executing in the main thread, if we encounter an exception we
will bypass the notify_all call on the condition and the calling thread
never wakes up.
@staticmethod
def runner(obj):
# noinspection PyProtectedMember
Exception thrown here
----> obj._run()
So the following code doesn't run, which causes calling thread to hang
with obj.cond:
obj.function_complete = True
obj.cond.notify_all()
Additionally for some unknown reason the stderr is lost.
Best guess is it's something to do with scheduling a python function
into the GLib.idle_add. That made finding issue quite difficult.
If during the process of fetching current lvm state we experience an
exception we fail to call set_result on the queued_requests we were
processing. When this happens those threads block forever which causes
the service to stall infinitely. Only clear the queued_requests after
we have called set_result.
We were not adding background tasks to flight recorder. Add the meta
data to the flight recorder when we start the command and update the meta
data when the command is finished. Locking was added to meta data to
prevent concurrent update and returning string representation as these can
happen in two different threads.
vgreduce previously allowed --all and --removemissing together even though
it only actual did the remove missing. The lvm dbus daemon was passing
--all anytime there was no entries in pv_object_paths. This change supplies
--all if and only if we are not removing missing and the pv_object_paths
is empty.
Vgreduce has and continues to enforce the invalid combination of supplying a
device list when you specify --all or --removemissing so we do not need
to check for that invalid combination explicitly in the lvm dbus service as
it's already covered.
Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1455471
When a user does a Manager.PvCreate they can specify the block device using a
device path that may be different than what lvm reports is the device path. For
example a user could use:
/dev/disk/by-id/wwn-0x5002538500000000 instead of /dev/sdc
In this case the pvcreate will succeed, but when we query lvm we don't find the
newly created PV. We fail because it's device path is returned as /dev/sdc. This
change re-uses an internal lookup which can accommodate this and correctly find
the newly created PV.
Corrects https://bugzilla.redhat.com/show_bug.cgi?id=1445654
Adding qualifier makes the only unqualified log_debug occurence
consistent with other uses in the same file.
Other possible ways to fix this:
- using `from .utils import log_debug`
- moving the line below `from . import utils` line
Udev events can come in like a flood when something changes. It really
doesn't do us any good to refresh the state of the service numerous times
when 1 would suffice. We had something like this before, but it was
removed when we added the refresh thread. However, we have since learned
that we need to sequence events in the correct order and block dbus
operations if we believe the state has been affected, thus udev events are
being processed on the main work queue. This change limits spurious
work items from getting to the queue.
If we always disable the sending of notify dbus events then in the case
where all the users are lvm dbus users we will be in udev handling mode
until at least 1 external lvm command occurs. Instead we will not disable
notify dbus until after we get at least 1 external event. This makes the
service get into the correct mode of operation faster.