Introduce a new lock for the flight recorder, so that we can dump it while
a command is blocked waiting for lvm to complete. Also, in all paths we now
add the metadata to the flight recorder before the command is done, so we
have it when a command hangs and we dump the flight recorder. Add the missing
bits after the command has finished.
Cleaned up the output too.
When exec'ing lvm, it's possible to get large amounts of both stdout
and stderr depending on the state of lvm and the size of the lvm
configuration. If we allow any of the buffers to fill we can end
up deadlocking the process. Ensure we are handling stdout & stderr
during lvm execution.
Ref. https://bugzilla.redhat.com/show_bug.cgi?id=1966636
Signed-off-by: Tony Asleson <tasleson@redhat.com>
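The deadlock described above can be avoided by reading both pipes at the
same time. The following is only a minimal sketch, not the code from this
commit: the helper name run_and_drain is hypothetical, and one reader thread
per pipe is an assumed approach (subprocess.Popen.communicate() or a
select/poll loop would serve the same purpose).

```python
import subprocess
import sys
import threading


def run_and_drain(argv):
    """Run argv, reading stdout and stderr concurrently so neither pipe fills.

    Hypothetical helper for illustration; not taken from the lvm sources.
    """
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, text=True)
    chunks = {"out": [], "err": []}

    def drain(stream, key):
        # Read until EOF; each pipe has its own reader, so a full stderr
        # buffer can never stall the reader of stdout (or the reverse).
        for line in stream:
            chunks[key].append(line)
        stream.close()

    readers = [
        threading.Thread(target=drain, args=(proc.stdout, "out")),
        threading.Thread(target=drain, args=(proc.stderr, "err")),
    ]
    for t in readers:
        t.start()
    for t in readers:
        t.join()
    rc = proc.wait()
    return rc, "".join(chunks["out"]), "".join(chunks["err"])
```

Reading only stdout to EOF and then stderr is the pattern that hangs: once
the child fills the stderr pipe it blocks, and stdout never reaches EOF.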
We were not adding background tasks to the flight recorder. Add the
metadata to the flight recorder when we start the command and update the
metadata when the command is finished. Locking was added to the metadata to
prevent an update and the return of its string representation from running
concurrently, as these can happen in two different threads.
This code is no longer needed because the background task has been
removed. We will add it back if we change the design and end up utilizing
multiple worker threads.
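A minimal sketch of metadata guarded by a lock, so that the update and the
string representation cannot interleave. The class name CmdMeta and its
fields are assumptions made for illustration; they are not taken from the
actual source.

```python
import threading
import time


class CmdMeta:
    """Hypothetical per-command record stored in a flight recorder."""

    def __init__(self, cmd):
        self._lock = threading.Lock()
        self.cmd = list(cmd)
        self.started = time.time()
        self.ended = None
        self.ec = None

    def completed(self, ec):
        # Called by the thread that ran the command.
        with self._lock:
            self.ended = time.time()
            self.ec = ec

    def __str__(self):
        # May be called from another thread while dumping the recorder.
        with self._lock:
            name = " ".join(self.cmd)
            if self.ended is None:
                return "%s: still running" % name
            return "%s: exit code %d, %.3f s" % (
                name, self.ec, self.ended - self.started)
```

Because the record is added when the command starts, a dump taken while the
command hangs still shows it, reported as still running.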
There is no reason to create another background task when the task that
created it is going to block waiting for it to finish. Instead we will
just execute the logic in the worker thread that is servicing the worker
queue.
Instead of creating a thread to handle the case where a client
is calling job.Wait, we will utilize a timer. This significantly
reduces the number of threads that get created and destroyed while
the service is running.
When a client is waiting on a job, any other clients will hang
when trying to do anything with the service. This is caused by
the wait code, which put the thread that handles
incoming dbus requests to sleep until either the timeout expired or
the job operation completed.
This change creates a thread for the wait request, so that the thread
processing incoming requests can continue to run.
The following operations would hang if lvm was compiled with
'enable-notify-dbus' and the client specified -1 for the timeout:
* LV snapshot merge
* VG move
* LV move
This happened because these three dbus methods are implemented differently
from the rest. Most dbus method calls are executed by gathering the information
needed to fulfill the request, placing that information on a thread safe queue
and returning. The results are returned to the client later with callbacks.
With this approach we can process an arbitrary number of commands without any
of them blocking other dbus commands. However, the 3 dbus methods listed
above did not utilize this functionality because they were implemented with a
separate thread that handles the fork & exec of lvm. This was done because these
operations can be very slow to complete. However, because of this, the lvm
command that we were waiting on was trying to call back into the dbus service to
notify it that something changed. Because the code was blocking the thread
that handles the incoming dbus activity, the lvm command blocked. We were stuck
until the client timed out the connection, which then caused the service to
unblock and continue. If the client did not have a timeout, we would have been
hung indefinitely.
The fix is to always utilize the worker queue on all dbus methods. We need to
ensure that lvm is tested with 'enable-notify-dbus' enabled and disabled.
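The queue-and-callback pattern described above can be sketched as follows.
This is an illustration only: work_q, submit, worker and the callback
parameters are hypothetical names, and a real service would invoke the dbus
reply and error handlers where this sketch calls on_ok and on_err.

```python
import queue
import threading

# Thread safe queue shared by the request handlers and the worker.
work_q = queue.Queue()


def submit(func, args, on_ok, on_err):
    """Queue the request and return at once; the caller never blocks."""
    work_q.put((func, args, on_ok, on_err))


def worker():
    """Service the queue until a None sentinel is received."""
    while True:
        item = work_q.get()
        if item is None:
            break
        func, args, on_ok, on_err = item
        try:
            on_ok(func(*args))
        except Exception as e:
            # Report the failure through the callback instead of letting
            # the worker die and leave the client waiting.
            on_err(e)
```

Since the request handler only enqueues work, it stays free to accept the
notification that lvm sends back while the slow operation is still running.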
It appears that the output of lvconvert --merge can vary. The code
was blowing up as it was trying to parse a line of stdout to retrieve the
% complete, but the line did not have the needed format and an exception
was thrown. The uncaught exception caused the background thread to exit
without updating the job object, which caused the client to hang forever
waiting. Added a default exception handler to prevent unhandled exceptions
from causing hangs and removed the parameter skip_first_line as it's no longer
needed. The code now checks whether the line can be parsed before doing so.
Signed-off-by: Tony Asleson <tasleson@redhat.com>
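A minimal sketch of the two safeguards described above: check that a line
can be parsed before parsing it, and catch anything unexpected so the job
is always updated. The function names and the sample line format are
assumptions for illustration; the exact lvconvert output is not given here.

```python
import re

# Assumed format: a number followed by '%' somewhere in the line.
_PERCENT = re.compile(r"(\d+(?:\.\d+)?)%")


def parse_percent(line):
    """Return the percentage found in line, or None if it has none."""
    m = _PERCENT.search(line)
    return float(m.group(1)) if m else None


def run_guarded(func, on_failure):
    """Default handler: run func and report any exception via on_failure.

    This keeps a background thread from exiting silently without
    updating the job that a client is waiting on.
    """
    try:
        return func()
    except Exception as e:
        on_failure(e)
        return None
```

With the check in place, lines that carry no percentage are skipped rather
than raising, so there is no need to special-case the first line of output.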