IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
This patch will releases allocated private resources from
startup. Needs previous dm_zalloc patch to ensure unset
private pointer is NULL.
TODO: check on real cluster.
cmirrord polls for messages on the kernel and cluster interfaces.
Sometimes it is possible for messages to be received on the cluster
interface and be waiting for processing while the node is in the
process of leaving the cluster group. When this happens, the
messages received on the cluster interface are attempted to be
dispatched, but an error is returned because the connection is no
longer valid. It is a harmless situation. So, if we get the
specific error (CS_ERR_BAD_HANDLE) and we know that we have left
the group, then simply don't print the message.
In cases where PV appears on a new device without disappearing from an old one
first, the device->pvid pointers could become ambiguous. This could cause the
ambiguous PV to be lost from the cache when a different PV comes up on one of
the ambiguous devices.
The DM_EVENT_GET_PARAMETERS requests the parameters under which
the running dmeventd is run and the it sends them to caller.
The parameters sent:
- the pid of the running dmeventd
- foreground state
- exec_method (currently either "direct" or "systemd")
The exact message sent back:
pid=<pid> daemon=<no/yes> exec_method=<direct/systemd>
Trying to restart dmeventd as a reload action is causing problems
under systemd environment. The systemd loses track of new dmeventd
this way. See also https://bugzilla.redhat.com/show_bug.cgi?id=1060134
for more info.
We need to call dmeventd -R directly instead of "systemctl reload dm-event.service"
that was used before (the reload is aimed at configuration reload anyway,
not stateful restart of the daemon - we did this before just because
there's no ExecRestart in systemd and there's only ExecStart and
ExecStop with which we'd lose the state).
Also, use ExecStart="dmeventd -f" to run dmeventd in foreground
(and let's rely on systemd to daemonize it) and change the
service type from "forking" to "simple".
The PIE and RELRO compiler/linker options can be used to produce a code
some techniques applied that makes the code more immune to some attacks:
- PIE (Position Independent Executable). It can make use of the ASLR
(Address Space Layout Randomization) provided by kernel to avoid
static locations for .text regions of executables (this is the 'pie'
compiler and linker option)
- RELRO (Relocation Read-Only). This prevents overwrite attacks of
the GOT (Global Offset Table) and PLT (Procedure Lookup Table)
used for relocations by making it read-only after all relocations
are resolved (these are the 'relro' and 'now' linker options) -
hence all symbols are resolved at the very start so there's no
need for those tables to be writeable later.
These compiler/linker options are now used by default for daemons
if the compiler/linker supports it.
Make it easier to run a live lvmetad in debugging mode and
to avoid conflicts if multiple test instances need to be run
alongside a live one.
No longer require -s when -f is used: use built-in default.
Add -p to lvmetad to specify the pid file.
No longer disable pidfile if -f used to run in foreground.
If specified socket file appears to be genuine but stale, remove it
before use.
On error, only remove lvmetad socket file if created by the same
process. (Previous code removes socket even while a running instance
is using it!)
If using lv/vgchange --sysinit -aay and lvmetad is enabled, we'd like to
avoid the direct activation and rely on autoactivation instead so
it fits system initialization scripts.
But if we're calling lv/vgchange --sysinit -aay too early when even
lvmetad service is not started yet, we just need to do the direct
activation instead without printing any error messages (while
trying to connect to lvmetad and not finding its socket).
This patch adds two helper functions - "lvmetad_socket_present" and
"lvmetad_used" which can be used to check for this condition properly
and avoid these lvmetad connections when the socket is not present
(and hence lvmetad is not yet running).
Failures in the temporary mirror used when up-converting cause dmeventd
to issue 'lvconvert --repair' on the sub-LV, <lv_name>_mimagetmp_?. The
'lvconvert' command refuses to deal with this sub-LV outright - it
expects to be given the name of the top-level LV. So, just like we do
with mirrored logs, we strip-off the portion of the name that is not
the top-level LV and issue the command on the top-level LV instead.
This fixes a bug in commit 19baf842 where verify_message
was rejecting the CLVMD_FLAG_REMOTE flag. It was missed
since the patch was ported from an lvm version where that
flag does not exist.
Add LV_TEMPORARY flag for LVs with limited existence during command
execution. Such LVs are temporary in way that they need to be activated,
some action done and then removed immediately. Such LVs are just like
any normal LV - the only difference is that they are removed during
LVM command execution. This is also the case for LVs representing
future pool metadata spare LVs which we need to initialize by using
the usual LV before they are declared as pool metadata spare.
We can optimize some other parts like udev to do a better job if
it knows that the LV is temporary and any processing on it is just
useless.
This flag is orthogonal to LV_NOSCAN flag introduced recently
as LV_NOSCAN flag is primarily used to mark an LV for the scanning
to be avoided before the zeroing of the device happens. The LV_TEMPORARY
flag makes a difference between a full-fledged LV visible in the system
and the LV just used as a temporary overlay for some action that needs to
be done on underlying PVs.
For example: lvcreate --thinpool POOL --zero n -L 1G vg
- first, the usual LV is created to do a clean up for pool metadata
spare. The LV is activated, zeroed, deactivated.
- between "activated" and "zeroed" stage, the LV_NOSCAN flag is used
to avoid any scanning in udev
- betwen "zeroed" and "deactivated" stage, we need to avoid the WATCH
udev rule, but since the LV is just a usual LV, we can't make a
difference. The LV_TEMPORARY internal LV flag helps here. If we
create the LV with this flag, the DM_UDEV_DISABLE_DISK_RULES
and DM_UDEV_DISABLE_OTHER_RULES flag are set (just like as it is
with "invisible" and non-top-level LVs) - udev is directed to
skip WATCH rule use.
- if the LV_TEMPORARY flag was not used, there would normally be
a WATCH event generated once the LV is closed after "zeroed"
stage. This will make problems with immediated deactivation that
follows.
A common scenario is during new LV creation when we need to wipe the
newly created LV and avoid any udev scanning before this stage otherwise
it could cause the device (the LV) to be claimed by some other subsystem
for which there were stale metadata within LV data.
This patch adds possibility to mark the LV we're just about to wipe with
a flag that gets passed to udev via DM_COOKIE as a subsystem specific
flag - DM_SUBSYSTEM_UDEV_FLAG0 (in this case the subsystem is "LVM")
so LVM udev rules will take care of handling that.
The corosync cluster interface for clvmd did not correctly
deal with node up/down events so that when a node was removed
from the cluster clvmd would prevent remote operations
from happening, as it thought the node was up but not
running clvmd.
This patch fixes that code by simplifying the case to node
being up or down - which was the original intention
and is supported by pacemaker and CPG in the higher layers.
Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Add more 'realistic' simulation of dlm locking.
Previous version was not capable to maintain multiple locks.
Current version doesn't handle multiqueues for locks,
so the ordering is different.
The bug addressed by this patch manifested itself during testing
by showing a mirror that never became 'in-sync' after creation.
The bug is isolated to distributions that do not have support
for openAIS checkpointing (i.e. > RHEL6, > F16).
When a node joins a group that is managing a mirror log, the other
machines in the group send it a checkpoint representing the current
state of the bitmap. More than one machine can send a checkpoint,
but only the initial one should be imported. Once the bitmap state
has been imported from the initial checkpoint, operations (such
as resync, mark, and clear operations) can begin. When subsequent
checkpoints are allowed to be imported, it has the effect of erasing
all the log operations between the initial checkpoint and the ones
that follow.
When cmirrord was updated to handle the absence of openAIS
checkpointing (commit 62e38da133),
the new import_checkpoint() function failed to honor the 'no_read'
parameter. This parameter was designed to avoid reading all but
the initial checkpoint. Honoring this parameter has solved the
issue of corrupting bitmap data with secondary checkpoints.
- null_fd resource leak on error path in _reopen_fd_null fn
- dead code in verify_message in clvmd code
- dead code in _init_filter_components in toolcontext code
- null dereference in dm_prepare_selinux_context on error path if
setfscreatecon fails while resetting SELinux context
Check that fields in clvm_header are valid when
local or remote messages are received. If not,
log an error, dump the message data and ignore
the message.
When creating a timeout thread for snapshots, the thread is not
tracked and thus never joined. This means that the exit status
of the timeout thread is held indefinitely. Saves a bit of
memory to set PTHREAD_CREATE_DETACHED when creating this thread.
I've also added pthread_attr_init|destroy to setup the creation
pthread_attr_t.
Reported-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Start separating the validation from the action in the basic lvresize
code moved to the library.
Remove incorrect use of command line error codes from lvresize library
functions. Move errors.h to tools directory to reinforce this,
exporting public versions of the error codes in lvm2cmd.h for dmeventd
plugins to use.
Just to make it more clear and also not to confuse
config_valid with check against config definition
(and its 'valid' flag within the config defintion tree).
Instead of calling syslog() from signal event handler,
run all logging code in the main loop.
Also it needs to take the lock and check for list
only when really needed.
Previously, we have relied on UUIDs alone, and on lvmcache to make getting a
"new copy" of VG metadata fast. If the code which triggers the activation has
the correct VG metadata at hand (the version which is currently on disk), it can
now hand it to the activation code directly.
Switch to use libdm dm_get_status_snapshot() function for
reading status info.
This fixes bug, where the code was using 32bit integers,
while the snapshot target is able to return 64bit sizes.
However this also means, someone is using >1TB snapshot
cow devices, which is actually very bad idea anyway, since the
perfomance and memory usage in this case is very bad.
Patch fixes hidden problem with lvm metadata caching.
When the pretest was made, only the commited data have been cached back
since the call lv_info_by_lvid() triggers mda read operation.
However call of lv_suspend_if_active() also reads precommited metadata.
The problem is visible in this sequence of calls:
vg_write(), suspend_lv(), vg_commit(), resume_lv()
which may end with leaving outdated mda in lvm cache, since vg_write()
drops cached metadata and vg_commit() only transforms precommited
to commited metadata, but in the case of pretesting we have
no precommited mda available so the cache will continue to use
old metadata. This happens, when suspend LV is inactive.
Sharing char* with field has a problem in error path,
when we allocate event, but fail to allocate timeout string.
Instead of creating complicated error paths to resolve
it individually stop using unions, and let the resource
to be released in a simple _free_message().
For example, the old call and reference:
find_config_tree_str(cmd, "devices/dir", DEFAULT_DEV_DIR)
...now becomes:
find_config_tree_str(cmd, devices_dir_CFG)
So we're referring to the named configuration ID instead
of passing the configuration path and the default value
is taken from central config definition in config_settings.h
automatically.
On glibc, those are erroneously (namespace pollution) pulled in via
other headers. this doesn't work with conformant libcs (musl libc in
this case), we simply need to include all needed headers.
Signed-Off-By: John Spencer <maillist-lvm@barfooze.de>
For reseting locale environment into significantly less memory
consuming version 'C' - use LC_ALL instead of LANG since it has
higher priority in locale settings.
Otherwise we may observe whole locale-archive which might be
over 100MB on i.e. Fedora systems locked in memory with
some daemons.
The idea is to avoid a period when an existing VG is not mapped to either the
old or the new name. (Note that the brief "blackout" was present even if the
name did not actually change.) We instead allow a brief overlap of a VG existing
under both names, i.e. a query for a VG might succeed but before a lock is
acquired the VG disappears.
This reverts commit ed23da95b6.
Hash table device_to_pvid seems to contain references to
already deleted pvids and so revert to the older
behaviour using allocated memory.
All operations on shared hash tables need to be protected by mutexes. Moreover,
lookup and subsequent key removal need to happen atomically, to avoid races (and
possible double free-ing) between multiple threads trying to manipulate the same
VG.
If we fail to get memory for mutex, hash the mutex
or fail somewhere along pthread function calls
return allocated resources back and unlock vg_lock_map mutex.
Since we are doing just dump and function doesn't report
any error, explicitely ignore return values from
dm_config_write_node and dm_asprintf.
Same applies for the logging function.
clvmd -d option parsing was not working properly.
clvmd -d 2 (with space) has been ignored because of
'::' used in getopt string, and as failsafe it's been used '1'.
Later this debug_arg has been ignored and debug_opt was used
instead which happend to have value '1'.
Submitted-by: Robert Milasan <rmilasan at suse.com>
Reported-by: Robert Milasan <rmilasan at suse.com>
Try to bring the lvmetad usage text and man page closer to the code.
There seem to be 3 useful ways to use -d with lvmetad at the moment:
-d all
-d wire
-d debug
(They can also be comma-separated like -d wire,debug.)
Prior to the last release, -d, -dd and -ddd were supported.
Fail if an unrecognised debug arg is supplied on the command line.
Change -V to report the same version as the lvm binary: previously it
just reported version 0.
Reset counter after thin pool resize failure.
If the pool goes above threshold, support unmounting
of all thin volumes if the lvextend fails to avoid
overfilling of the pool.
- move common dm_config_tree manipulation functions from lvmetad-core to
daemon-shared
- add config-tree-based request manipulation APIs to daemon-client
- factor out _v (va_list) variants of most variadic functions in libdaemon
If we were defining a section (which is a node without a value) and
the value was created automatically on dm_config_create_node call,
we were wasting resources as the next step after creating the config
node itself was assigning NULL for the node's value.
The dm_config_node_create + dm_config_create_value sequence should be
used instead for settings and dm_config_node_create alone for sections.
The majority of the code already used the correct sequence. Though
with dm_config_node_create fn creating the value as well, the pool
memory was being trashed this way.
This patch removes the node value initialization on dm_config_create_node
fn call and keeps it for the direct dm_config_create_value fn call.
- logging is not controlled by "levels" but by "types"; types are
independent of each other... implementation of the usual "log level"
user-level semantics can be simply done on top; the immediate
application is enabling/disabling wire traffic logging independently
of other debug data, since the former is rather bulky and can easily
obscure almost everything else
- all logs go to "outlets", of which we currently have 2: syslog and
stderr; which "types" go to which "outlets" is entirely configurable
There were several hard-coded values for run directory around the code.
Also, some tools are DM specific only, others are LVM specific and there
was no distinction made here before. With this patch applied, we have
this cleaned up a bit (subsystem in brackets, defaults in parentheses):
[common] configurable PID_DIR (/var/run)
lvm [lvm] configurable RUN_DIR (/var/run/lvm)
configurable locking dir (/var/lock/lvm)
clvmd [lvm] configurable pid file (PID_DIR/clvmd.pid)
socket (RUN_DIR/clvmd.sock)
lvmetad [lvm] configurable pid file (PID_DIR/lvmetad.pid)
socket (RUN_DIR/lvmetad.socket)
dm [dm] configurable DM_RUN_DIR (/var/run)
cmirrord [dm] configurable pid file (PID_DIR/cmirrord.pid)
dmeventd [dm] configurable pid file (PID_DIR/dmeventd.pid)
server fifo (DM_RUN_DIR/dmeventd-server)
client fifo (DM_RUN_DIR/dmeventd-client)
The changes briefly:
- added configure --with-default-pid-dir
- added configure --with-default-dm-run-dir
- added configure --with-lvmetad-pidfile
- by default, using one common pid directory for everything
(only lvmetad was not following this before)