mirror of git://sourceware.org/git/lvm2.git synced 2025-01-26 14:04:15 +03:00

498 Commits

Author SHA1 Message Date
Zdenek Kabelac
bb5f81624b cov: dmeventd plugin fix memleak
Fix a memory leak when the policy command fails too frequently and
the plugin decides to skip it.
2020-10-25 00:56:11 +02:00
Zdenek Kabelac
511cd6adb7 lvmcmdlib: lvm2_init_threaded
The cmd context has a 'threaded' value that used to be set
by clvmd and allowed proper memory locking management.
Reuse the same bit for dmeventd.

Since dmeventd is using a 300KiB stack per thread,
we will ignore any user settings for allocation/reserved_stack
until some better solution is found.
This avoids crashing of dmeventd when the user changes this value,
and because in most cases lvm2 should work ok with a 64K stack
size, this change should not cause any problems.
2020-10-20 22:49:18 +02:00
Zdenek Kabelac
8e778995de cleanup: matching declaration order
Cosmetic
2020-10-16 16:02:06 +02:00
Zdenek Kabelac
5d93892d4a dmeventd: enhance time waiting loop
dmeventd 'scans' statuses in a loop (most usually at 10-second
intervals) and meanwhile sleeps within:
pthread_cond_timedwait()

However, this call sometimes wakes up a short amount of time early.
Our code then still believed the 'right time' had not yet arrived and
basically 'busy-looped' for a moment on calling this function. So on
systems with 'clock_gettime()' present we obtain the time and move
10ms into the future, which avoids unneeded repeated invocation of
our time scheduling loop.

TODO: monitoring during 1 hour 'time-change'...
2020-04-08 15:22:54 +02:00
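The early-wakeup problem described in 5d93892d4a can be illustrated with a minimal sketch. This is not the dmeventd code (which is C and uses pthread_cond_timedwait()); it is one reading of "go 10ms to the future", assuming the fix amounts to treating a deadline as reached once it lies within a small grace margin of the current time. The names `GRACE_NS`, `is_due` and `wait_until` are hypothetical.

```python
import time

GRACE_NS = 10_000_000  # 10 ms, the margin mentioned in the commit message


def is_due(deadline_ns, now_ns, grace_ns=GRACE_NS):
    """Treat a deadline as reached if it lies within grace_ns of now."""
    return now_ns + grace_ns >= deadline_ns


def wait_until(deadline_ns, sleep=time.sleep, clock=time.monotonic_ns):
    """Sleep until the deadline is due; return how many sleeps were needed.

    sleep and clock are injectable so the loop can be exercised with a
    simulated timer that wakes up slightly early.
    """
    sleeps = 0
    while not is_due(deadline_ns, clock()):
        remaining_ns = max(deadline_ns - clock(), 0)
        sleep(remaining_ns / 1e9)
        sleeps += 1
    return sleeps
```

With the grace margin, a wake-up that arrives a couple of milliseconds early is accepted as on time, so the loop does not spin through several near-zero sleeps before the deadline finally passes.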
Heinz Mauelshagen
8d3e01ff4f dmeventd: avoid bail out preventing repair in raid plugin but keep message
Followup patch mentioned in previous commit 0585754593d7c010d83274c3a25dd6c3e8c8b4a8.

Problem:
  even though dead raid component devices are detected, the
  raid plugin is bailing out thus preventing a repair attempt.

Rationale:
  in case of component device errors, the MD resynchronization
  thread runs in parallel with the thrown event being processed
  by the raid plugin.  The plugin retrieves the raid device status
  but that still reflects insync regions as 0 (when it should
  already be total regions) because the MD thread didn't update it yet.

Solution:
  Remove the insync regions check but keep the informational message
  "waiting for resynchronization" and let lvconvert carry out its
  pre-repair checks and optionally carry out a repair attempt.

Related: https://bugzilla.redhat.com/show_bug.cgi?id=1751887
Related: https://bugzilla.redhat.com/show_bug.cgi?id=1560739
Related: https://bugzilla.redhat.com/show_bug.cgi?id=1468590
Related: https://bugzilla.redhat.com/show_bug.cgi?id=1654860
Related: https://bugzilla.redhat.com/show_bug.cgi?id=1729303
Related: https://bugzilla.redhat.com/show_bug.cgi?id=1741016
2019-09-20 17:52:02 +02:00
Heinz Mauelshagen
0585754593 Revert "dmeventd: avoid bail out preventing repair in raid plugin"
This reverts commit 9e438b4bc6b9240b63fc79acfef3c77c01a848d8.

The reverted patch also removed the warning which we realized we need
to keep as valuable process information (see related bugzilla below).

In a followup patch, we'll keep the message and avoid bailing out, thus
always allowing lvconvert to try repairing if the 'allocate' fault policy is set.

Related: https://bugzilla.redhat.com/show_bug.cgi?id=1751887
2019-09-20 17:48:48 +02:00
Heinz Mauelshagen
9e438b4bc6 dmeventd: avoid bail out preventing repair in raid plugin
Problem:
even though dead raid component devices are detected, the
raid plugin is bailing out thus preventing a repair attempt.

Rationale:
in case of component device errors, the MD resynchronization
thread runs in parallel with the thrown event being processed
by the raid plugin.  The plugin retrieves the raid device status
but that still reflects insync regions as 0 (when it should
already be total regions) because the MD thread didn't update it yet.

Solution:
Remove the insync regions check and let lvconvert carry out its
pre-repair checks and optionally carry out a repair attempt.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1560739
Related:  https://bugzilla.redhat.com/show_bug.cgi?id=1468590
Related:  https://bugzilla.redhat.com/show_bug.cgi?id=1654860
Related:  https://bugzilla.redhat.com/show_bug.cgi?id=1729303
Related:  https://bugzilla.redhat.com/show_bug.cgi?id=1741016
2019-08-16 18:08:22 +02:00
Zdenek Kabelac
3b42cdad0c cov: unlock lvm2 mutex on error path
Add a missing unlock call on the theoretical error path where
we would be missing our configured command.
2019-06-25 17:31:39 +02:00
Zdenek Kabelac
a8d59404f7 dmeventd: lvm2 plugin uses envvar registry
The thin plugin started to use a configurable setting to allow usage
of external scripts. However, to read this value it needed to
execute an internal command, as dmeventd itself has no access to lvm.conf
and the API for dmeventd plugins has been kept stable.

The call of the command itself was normally not 'a big issue' until users
started to use a higher number of monitored LVs and execution of the command
got stuck because another monitored resource had already started to execute
some other lvm2 command and became blocked waiting on the VG lock.

This scenario revealed the necessity to somehow avoid calling an lvm2 command
during resource registration, but this requires bigger changes. So
meanwhile this patch tries to minimize the possibility of hitting this race
by obtaining any configurable setting just once. Such a patch is small
and covers the majority of the problem, yet a better solution needs to be
introduced, likely with a bigger rework of dmeventd.

TODO: Avoid blocking registration of resource with execution of lvm2
commands since those can get stuck waiting on mutexes.
2018-09-05 14:39:14 +02:00
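The "obtain any configurable setting just once" idea from a8d59404f7 can be sketched as a small cache. This is an illustration only, not the plugin's registry: the class `SettingCache` and the `fetch` callback are hypothetical, and `fetch` stands in for the expensive step (running an internal lvm2 command) that the commit wants to avoid repeating.

```python
class SettingCache:
    """Look each setting up at most once and reuse the stored value."""

    def __init__(self, fetch):
        self._fetch = fetch   # expensive lookup, e.g. running a command
        self._values = {}

    def get(self, name):
        if name not in self._values:
            self._values[name] = self._fetch(name)
        return self._values[name]
```

Only the first registration pays for the lookup, which narrows the window in which a registration can block on a command waiting for a VG lock. It does not remove that window, which matches the TODO in the commit message.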
Zdenek Kabelac
f7645995da dmeventd: rebase to stable branch
Some minimal set of changes to make vdo plugin compilable in stable branch:

Use older headers.
Implement simple vdo status parser to only resolve use-percentage.
2018-07-31 14:55:03 +02:00
Zdenek Kabelac
4ed9b07380 dmeventd: base vdo plugin
Introduce VDO plugin for monitoring VDO devices.

This plugin can also be used by other users, as the plugin checks
for the UUID prefix 'LVM-' and runs lvm actions only on those
devices.

Non 'LVM-' devices are only monitored, and warnings are logged
when usage reaches the 80% threshold.
2018-07-31 14:53:27 +02:00
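The split between LVM and non-LVM devices in 4ed9b07380 can be summarised in a short sketch. It assumes the warning applies to every monitored device once usage reaches 80%, which the message does not state explicitly for 'LVM-' devices; the function name and the returned keys are hypothetical.

```python
WARN_PERCENT = 80     # threshold named in the commit message
LVM_UUID_PREFIX = "LVM-"


def plan_vdo_event(dm_uuid, used_percent):
    """Decide what a VDO usage event may trigger for one device."""
    return {
        # assumption: the warning applies to any monitored device
        "warn": used_percent >= WARN_PERCENT,
        # lvm actions are reserved for devices created by LVM
        "run_lvm_action": dm_uuid.startswith(LVM_UUID_PREFIX),
    }
```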
Joe Thornber
1ddbbb67e0 build: fix typo in dmeventd/plugins/Makefile.in
2018-04-30 15:31:57 +01:00
Joe Thornber
2bc896f2a3 build: remove --with-{snapshots,mirrors,raid,thin,cache} options from ./configure
It now behaves as if they were all set to 'internal'.
2018-04-30 10:11:23 +01:00
Zdenek Kabelac
9be086fbee thin: pass environment to scripts
When the dmeventd thin plugin forks a configurable script, switch to using
execvp to pass the whole environment present in dmeventd, so all configured
paths present at dmeventd startup are visible to the script.

This was likely not a problem for a common user environment,
however in the test suite case variables like LVM_SYSTEM_DIR were
not actually used from the test itself but rather from
a system-present lvm.conf, and this may have caused strange
behavior of a testing script.
2018-03-06 15:35:04 +01:00
Zdenek Kabelac
d90a647802 activation: separate reporting of error and monitoring status
Avoid using the same return code for reporting 2 different things:
strictly report the error code by return value and add a new
parameter for reporting monitoring status.

This makes it easier to recognize which error we got from dm_event
and to continue only with ENOENT.
2018-02-12 22:14:59 +01:00
Zdenek Kabelac
f41935909f dmeventd: add check for result code
Check result from pthread_kill.
2018-01-17 14:44:33 +01:00
Zdenek Kabelac
76954884c7 cleanup: drop unused define
2017-12-04 15:38:50 +01:00
Zdenek Kabelac
0c9e3e8df2 coverity: add some initializers
Coverity cannot do a deeper analysis, so let's just make the reports
go away and initialize them to 0.
2017-11-07 21:26:11 +01:00
Zdenek Kabelac
9940c2f754 dmeventd: schedule exit on break
When dmeventd receives SIGTERM/INT/HUP/QUIT it validates whether exit is possible.
If any device was still monitored, such an exit request used to
be ignored/refused. This 'usually' worked reasonably well, however if there
was a very short time period between the last device being unmonitored and signal
reception, such an EXIT could be ignored, as dmeventd had
not yet got into the idle state even though commands like 'vgchange -an' had already
finished.

This patch changes the logic towards scheduling EXIT at the nearest
point when there is no monitored device.

EXIT is never forgotten.

NOTE: if there is only a single monitored device and someone sends
SIGTERM and later someone uses e.g. 'lvchange --refresh', then after
unmonitoring dmeventd will exit and a new instance needs to be
started.
2017-10-05 10:19:21 +02:00
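The change in 9940c2f754 from "refuse the exit request" to "remember it" can be modelled as a tiny state holder. This is a sketch of the described behaviour, not the daemon's implementation; the class and method names are hypothetical.

```python
class ExitScheduler:
    """Remember an exit request until no device is monitored any more."""

    def __init__(self):
        self.monitored = set()
        self.exit_requested = False

    def register(self, device):
        self.monitored.add(device)

    def unregister(self, device):
        self.monitored.discard(device)

    def on_signal(self):
        # The request is recorded instead of being dropped when busy.
        self.exit_requested = True

    def should_exit(self):
        return self.exit_requested and not self.monitored
```

Because the request is sticky, the situation from the NOTE follows directly: once the last device is unmonitored after a signal, `should_exit()` becomes true even if the signal arrived much earlier.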
Zdenek Kabelac
2232e82d25 makefiles: fixing linking
Avoid adding -g more than once for debug builds.
Avoid enabling DEBUG_MEM when we build multithreaded tools.
Link executables with -fPIE -pie and --export-dynamic LDFLAGS.
Introduce PROGS_FLAGS to add an option to pass flags for external libs.
Link the lvm2 internal library only when really used.
Link DAEMON_LIBS with daemons.
Pass VALGRIND_CFLAGS internally.
Set shell failure mode in a couple of places.
2017-08-01 11:53:30 +02:00
Zdenek Kabelac
0bf836aa14 tidy: prefer not using else after return
clang-tidy: avoid using 'else' after return - it gives more readable code
and also saves an indentation level.
2017-07-20 11:18:29 +02:00
Zdenek Kabelac
feed61f3fa libdm: use rounded float for percent print
Use the newly added dm_percent_to_round_float to enhance printing
of percentage values.
2017-06-24 17:44:42 +02:00
Zdenek Kabelac
0016b79c8b dmeventd: improve more raid status reporting
When we want to report primary leg failure, check for an initial 'a',
since otherwise 'Aa idle' is normally visible.

Also reset the array of bit flags marking dead devices once
the plugin detects the raid is in sync.
2017-06-24 00:06:12 +02:00
Zdenek Kabelac
653bdedb83 raid: plugin does not to use --config
Ignoring of suspended devices is already ensured by:

lvm2_disable_dmeventd_monitoring() -> init_run_by_dmeventd() ->
init_ignore_suspended_devices().

In fact plugins should never use --config because it has
some unpleasant technical issues.
2017-06-23 23:32:40 +02:00
Jonathan Brassow
4c0e908b0a RAID (lvconvert/dmeventd): Cleanly handle primary failure during 'recover' op
Add the checks necessary to distinguish the state of a RAID when the primary
source for syncing fails during the "recover" process.

It has been possible to hit this condition before (like when converting from
2-way RAID1 to 3-way and having the first two devices die during the "recover"
process).  However, this condition is now more likely since we treat linear ->
RAID1 conversions as "recover" now - so it is especially important we cleanly
handle this condition.
2017-06-14 08:39:50 -05:00
Zdenek Kabelac
455a4de090 dmeventd: restore multiple warnings
With recent updates for thin pool monitoring in version 169
we lost the multiple WARNINGs printed in syslog when the
pool crossed 80%, 85%, 90%, 95%, 100%.

Restore this logic, as we want to keep the user informed more
than just once when the 80% boundary is passed.
2017-05-10 15:39:36 +02:00
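The restored behaviour in 455a4de090 (one warning per boundary rather than a single warning at 80%) can be sketched as a comparison between the previous and the current usage. The function name is hypothetical, and whether a boundary counts as crossed at exactly that value or only above it is an assumption here.

```python
WARN_LEVELS = (80, 85, 90, 95, 100)  # boundaries listed in the commit message


def crossed_levels(previous_percent, current_percent):
    """Return the boundaries passed while moving from previous to current."""
    return [
        level
        for level in WARN_LEVELS
        if previous_percent < level <= current_percent
    ]
```

For example, a jump from 78% to 86% reports both the 80% and the 85% boundary, while staying at 86% reports nothing further.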
Heinz Mauelshagen
a37bb7b2a5 dmeventd: adjust mirror/raid DSOs to new repair design
Previous commit 506d88a2ec8c introduced disabling lvmetad on repairs.

Avoid calling lvscan and use of any --config options altogether
in the mirror and raid DSOs.

Related: rhbz1380521
2017-03-16 21:05:05 +01:00
Heinz Mauelshagen
e5b6f2685a dmeventd: reintroduce fix mirror DSO to work with lvmetad
Commit 07ded8059cbd assumed that the mirror is blocked which is not the case.

It is accessible, degraded and in need of repair because some of its legs
(partially) failed.  Any auto-repair via dmeventd fails though because
of lvmetad not providing proper data about the failed PV(s).  That's why
this workaround got introduced in commit 76f6951c3e8f until we get to
the lvmetad interaction core issue.

Mind that any mirror auto-repair failure is caused by such lvmetad interaction
problems, which are not yet solved, so disabling lvmetad works as a workaround, as
elaborated on in the related bz.

Reintroducing the interim solution.

Resolves: rhbz1380521
2017-03-16 14:19:06 +01:00
Zdenek Kabelac
07ded8059c mirror: revert 76f6951c3e8f0933df9730a42e9c46f273d1da24
Effectively revert the whole of 76f6951c3e8f0933df9730a42e9c46f273d1da24.
We need to figure out some other solution.

At this moment, usage of --config with 'repair' of a blocked mirror
is a 'freezing' combination.
2017-03-16 01:17:57 +01:00
Zdenek Kabelac
115fd205de mirror: avoid scanning
While the mirror is blocked we can't try to scan the device.
Regression introduced by previous commit
76f6951c3e8f0933df9730a42e9c46f273d1da24.
2017-03-16 01:02:10 +01:00
Heinz Mauelshagen
76f6951c3e dmeventd: (workaround) fix mirror DSO to work with lvmetad
Automatic dmeventd repair of mirrors with active lvmetad configured
(mirror_image_fault_policy = "allocate") fails because the lvscan
run before the repair in the mirror DSO does not update the
lvmetad cache properly thus "lvconvert --repair ..." fails.

Need to scan the mirror LV before and after the repair
to have proper cache content after the repair finished.
The cache can't be relied on or the repair will fail.

Resolves: rhbz1380521
2017-03-09 20:41:07 +01:00
Zdenek Kabelac
05dd566a52 dmeventd: unify error handling
Always make sure the 'status' is released on the 'error' path (the thin plugin missed this).
Make the code look the same across all plugins.
2017-02-14 00:03:34 +01:00
Zdenek Kabelac
aa0c735e2c dmeventd: limit thin_command usage
Require a usable command string to begin with '/'.
So 'thin_command = "/some/path/command"' is the only supported alternative
to the internal 'lvm' command.
2017-02-13 09:43:53 +01:00
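Taken together with bc7a1d70d4 and 2e0605d6db further down this log, the rules for thin_command can be sketched as a small classifier: a string starting with 'lvm ' is an internal command handled without that prefix, a string starting with '/' is an external command split on spaces into argv, and anything else is rejected. This is an illustration of those stated rules, not the plugin's parser; the function name and the example script path are hypothetical.

```python
def classify_thin_command(command):
    """Return ("internal", rest) or ("external", argv); reject anything else."""
    if command.startswith("lvm "):
        # internal commands are processed without the 'lvm ' prefix
        return ("internal", command[len("lvm "):])
    if command.startswith("/"):
        # external commands are split by spaces into argv strings
        return ("external", command.split())
    raise ValueError("thin_command must start with 'lvm ' or '/'")
```

A usage sketch: `classify_thin_command("/usr/local/sbin/pool_action --verbose")` yields an argv list suitable for an exec-style call, while a relative name such as `pool_action` raises `ValueError`.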
Zdenek Kabelac
0844b20f98 coverity: remove unneeded header files
2017-02-11 21:17:27 +01:00
Zdenek Kabelac
2a9eda1229 mem: add extra mem pages for pthread stack
Some archs can use even 64K pages and then lvm2 runs into trouble if
the stack is 'too small' to fit extra page capturing stack overwrite.

So when lvm2 limits stack - add extra mem page - be it 4K or 64K.

Relates to ppc64le bug: https://bugzilla.redhat.com/1387279
2017-02-11 18:23:15 +01:00
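The arithmetic behind 2a9eda1229 is simple but easy to get wrong when the page size is hard-coded. A minimal sketch, assuming the fix adds exactly one page of the running system's size to the requested stack limit (the function name is hypothetical; the real code is C):

```python
import mmap


def stack_size_with_guard(requested_bytes, page_size=mmap.PAGESIZE):
    """Add one extra memory page, whatever the platform's page size is."""
    return requested_bytes + page_size
```

With a 64KiB stack request this gives 68KiB on a system with 4K pages and 128KiB on one with 64K pages, so the extra page always fits regardless of architecture.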
Zdenek Kabelac
836eb122ce dmeventd_thind: set LVM_RUN_BY_DMEVENTD
Set the LVM_RUN_BY_DMEVENTD envvar to expose that the command is running from
the dmeventd environment.
2017-01-23 14:55:47 +01:00
Zdenek Kabelac
2e0605d6db dmeventd_thin: internal command without lvm prefix
Internal command processing needs to go without 'lvm ' prefix.
2017-01-21 17:42:19 +01:00
Zdenek Kabelac
85dab3963f dmeventd_thin: enable support for external command
With this commit we start to support configurable action
from thin-pool monitoring via  'dmeventd/thin_command'
2017-01-21 00:01:05 +01:00
Zdenek Kabelac
8c4f3633ac dmeventd_thin: new logic for calling commands
For more advanced support we need to ensure better logic for calling
an external, much more advanced script for maintenance of the thin-pool.

So this new code ensures:

When thin-pool data or metadata usage is bigger than 50%,
the action is called with each 5% increment.
This is independent of autoextend_threshold.
The action always happens when the thin-pool is over the threshold
(so no action when it's exactly e.g. 60%).
The only exception is a 100% full thin-pool, which invokes the 'last'
action.

Since thin-pool occupancy may also change downward, the code needs
to also handle a possible reduction of thin-pool occupancy.
So when usage drops from 90% to 50%, the thin-pool will start to call
the action again when it passes the 55% threshold.

This gives external commands a lot of options, e.g. to call 'fstrim'
before an actual resize is needed.
2017-01-20 23:58:56 +01:00
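The stepping rules in 8c4f3633ac can be traced with a small model. This is a sketch built only from the message text, not from the plugin source: the class name, the constants `CHECK_MINIMUM` and `CHECK_STEP`, and the exact re-arming rule after usage falls are assumptions chosen to reproduce the examples given (no action at exactly 60%, 'last' at 100%, re-arm at 55% after a drop from 90% to 50%).

```python
CHECK_MINIMUM = 50  # below or at this usage nothing is called (assumed name)
CHECK_STEP = 5      # action granularity in percent (assumed name)


class ThinPoolActionTracker:
    """Decide per status update whether the configured action should run."""

    def __init__(self):
        self.next_check = CHECK_MINIMUM

    def update(self, used_percent):
        if used_percent >= 100:
            self.next_check = 100
            return "last"
        # first 5% boundary strictly above the current usage
        step_above = int(used_percent // CHECK_STEP) * CHECK_STEP + CHECK_STEP
        if used_percent > self.next_check:
            self.next_check = step_above
            return "action"
        # usage stayed or fell: lower the boundary so a later rise is noticed
        self.next_check = max(CHECK_MINIMUM, min(self.next_check, step_above))
        return None
```

Feeding the sequence 51, 56, 60, 61, 90, 50, 55, 56 returns action, action, nothing, action, action, nothing, nothing, action, which matches the behaviour the message describes.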
Zdenek Kabelac
8b95551ade dmeventd_thin: drop umounting on error path
Default internal logic will stop trying to do any 'rescue' action
when executed command fails.
This will be now fully in hands of external script if such
behaviour is needed.
2017-01-20 23:58:56 +01:00
Zdenek Kabelac
43e3268ada dmeventd_thin: rework failure handling
Instead of stopping monitoring after a couple of failing retries,
keep monitoring forever, just with larger delays between command
retries (ATM up to ~42 minutes).

So syslog is not spammed too often, yet commands have a chance to
be retried and succeed eventually...
2017-01-20 23:56:39 +01:00
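A growing retry delay of the kind described in 43e3268ada can be sketched as a capped doubling sequence. The commit message gives only the upper bound of roughly 42 minutes; the 10-second base, the doubling factor and the 2560-second cap below are assumptions, picked because 2560 s is about 42.7 minutes. The function names are hypothetical.

```python
from itertools import islice


def retry_delays(base_seconds=10, max_seconds=2560):
    """Yield delays that double after each failure up to a fixed cap."""
    delay = base_seconds
    while True:
        yield delay
        delay = min(delay * 2, max_seconds)


def first_delays(count):
    """Convenience helper: the first `count` delays as a list."""
    return list(islice(retry_delays(), count))
```

Monitoring never stops in this scheme; the generator simply keeps returning the capped delay, so the command is still retried while syslog is written to less and less often.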
Zdenek Kabelac
46c23dfb87 dmeventd_thin: SIGCHLD handler
To improve reaction time when a child has finished,
let's handle SIGCHLD in a particular thread.
Let's hope the kernel will route SIGCHLD to the matching thread.
2017-01-20 23:55:51 +01:00
Zdenek Kabelac
bc7a1d70d4 dmeventd_thin: init command
When dmeventd configured command does not start with 'lvm ' prefix,
it's going to be an 'external' command.
In this case we split command by spaces to argv strings.
2017-01-20 23:55:50 +01:00
Zdenek Kabelac
14746a6c00 dmeventd_thin: add wait_pid
Add support for handling command exit.
2017-01-20 23:55:50 +01:00
Zdenek Kabelac
2e935c0967 dmeventd_thin: add run_command
Implement forking of executable command.
When the command is forked, dmeventd may continue to monitor the device.
2017-01-20 23:55:50 +01:00
Zdenek Kabelac
e5bef50827 dmeventd_thin: better warning logic
When fullness passes WARN_THRESHOLD, print a warning;
when it drops below and crosses it again, we should print the
warning again, but always only once.
2017-01-20 23:55:50 +01:00
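The "warn once per crossing" rule in e5bef50827 is a latch that resets when usage falls back under the threshold. A minimal sketch follows; the value 80 for the threshold is an assumption taken from the boundaries mentioned elsewhere in this log, and the class name is hypothetical.

```python
WARN_THRESHOLD = 80  # assumed value; the message only names the constant


class WarnOnce:
    """Emit a warning once per upward crossing of the threshold."""

    def __init__(self, threshold=WARN_THRESHOLD):
        self.threshold = threshold
        self.warned = False

    def update(self, used_percent):
        """Return True exactly when a warning should be printed."""
        if used_percent >= self.threshold:
            if self.warned:
                return False
            self.warned = True
            return True
        # dropped below the threshold: re-arm for the next crossing
        self.warned = False
        return False
```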
Zdenek Kabelac
0d945ddbad dmeventd_thin: switch to struct percent
Later we can use stored percent values to pass them
to executed commands.
2017-01-20 23:55:50 +01:00
Zdenek Kabelac
eca964b554 dmeventd_thin: handling of internal command
2017-01-20 23:55:50 +01:00
Zdenek Kabelac
dd19b56985 thin: refresh status when error processing fails
When the thin-pool processes an event and 'lvextend --use-policies' fails,
rather capture up-to-date info, as the fullness percentage may
have jumped noticeably. This way we can use 'more' correct numbers
when checking for thresholds.
2016-12-22 23:37:07 +01:00
Zdenek Kabelac
f9c6c115d3 cleanup: easier code for raid plugin
Set bits only when they were not yet assigned.
2016-12-09 15:15:02 +01:00