IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
The sysinit.target is ordered even before sockets.target and there
may be situations in which the lvmetad is needed early, for example
in rescue.target to activate some LVs on which mountpoints reside.
Also, like in the case of rescue.target, the sockets.target is not
pulled in (unless some other service pulls it in explicitly).
See also: https://bugzilla.redhat.com/show_bug.cgi?id=1087586#c26
for the summary of the problem.
This is sort of info we always ask people to retrieve when
inspecting problems in systemd environment so let's have this
as part of lvmdump directly.
The -s option does not need to be bound to systemd only. We could
add support for initscripts or any other system-wide/service tracking
info that can help us with debugging problems.
For a quick overview of config when debugging and to quickly check
which values are different from defaults and which are not defined
in the config and for which defaults are used.
When clustered VG is available in the system but we don't have
clustering set up for whatever reason, the lvm2-monitor scripts should
not fail completely just because these clustered VGs are skipped during
vgs/vgchange calls in lvm2-monitor initscript/systemd unit.
Do not use default dependencies that systemd adds to the units
so we have better control of when the service is started/stopped
and we don't end up with unexpected behaviour.
When the activation units are generated if use_lvmetad=0 (no
autoactivation), use --ignoreskippedcluster option for vgchange calls
since the cluster with cLVM is set up by separate units.
This avoids a situation in which the generated activation units are
improperly in failed state just because of the vgchange return value
when clustered VGs are encountered while the activation of non-clustered
VGs does proceed normally.
lvm2_cluster_activation_red_hat.service.in -> lvm2_cluster_activation_systemd_red_hat.service.in
lvm2_clvmd_red_hat.service.in -> lvm2_clvmd_red_hat.service.in
Edit lvm2-cluster-activation reference on cmirror - take new
lvm2-cmirrord.service, it was just cmirrord(.service) before
as the old initscript was used in compatibility mode.
Also, use WantedBy=multi-user.target instead of sysinit.target
in lvm2-cluster-activation.service.
The commit splits original clvmd service in two new native services
for systemd enabled systems while original init scripts remain unaltered.
New systemd native services:
1) clvmd daemon itself (lvm2_clvmd_red_hat.service.in)
2) (de)activation of clustered VGs (lvm2_cluster_activation_red_hat.service.in)
There're several reasons to split it. First, there's no support for conditional
stop in systemd and AFAIK they don't plan to support it. In other words:
if the deactivation fails for some reason, systemd doesn't care and will simply
kill all remaining processes in original cgroup (by default). Killing the
remaining procs can be suppressed however it doesn't solve the following problem:
You can't repeat the stop command of a failed service. The repeated stop command
is simply not propagated to the service in a failed state. You would have to start
and then try to stop the service again. Unfortunately, this can't be done while
the daemon is still running (and we need the daemon to stay active until all
clustered VGs are deactivated properly).
In a separated setup we need only to restart the failed activation service and
that's fine.
No need to fork lvmetad when running under systemd.
Also, the "lvmetad -R" support has been removed in lvm2 v2.02.98
so remove the ExecReload line that called it on "systemctl reload".
Trying to restart dmeventd as a reload action is causing problems
under systemd environment. The systemd loses track of new dmeventd
this way. See also https://bugzilla.redhat.com/show_bug.cgi?id=1060134
for more info.
We need to call dmeventd -R directly instead of "systemctl reload dm-event.service"
that was used before (the reload is aimed at configuration reload anyway,
not stateful restart of the daemon - we did this before just because
there's no ExecRestart in systemd and there's only ExecStart and
ExecStop with which we'd lose the state).
Also, use ExecStart="dmeventd -f" to run dmeventd in foreground
(and let's rely on systemd to daemonize it) and change the
service type from "forking" to "simple".
Since support for xfs_check is going to be obsoleted,
replace its usage with xfs_repair -n tool.
However this tool needs further intrumentation, since for really small
xfs devices (having just 1 allocation group) it needs to pass in
flag: "-o force_geometry". As we run the tool with '-n', it should
be safe to pass this flag always.
FIXME: figure way without always passing this flag.
When using filters for the pvscan --cache (the global_filter),
there's a difference between:
pvscan --cache -aay /dev/block/<major>:<minor>
and
pvscan --cache -aay <major>:<minor> (or --major <major> --minor <minor>)
In the first case, we need to be sure to have an exact matching line
in the filter for the device to be used, no aliases are considered
So for example even if we have accept rule for "/dev/sda" present,
this won't apply for "/dev/block/8:0" even though it's the same device!
This is because we're comparing the path used on command line directly
with the path written in the rule.
For the second one, any alias mentioned in the filter will apply
as we're comparing the major and minor pair, not looking at actual
device names - so any alias mentioned in the rules will suffice for
the filtering rule to apply.
For the global_filter to be properly used, we need to call the
second one in the lvm2-pvscan@.service - nobody is able to tell
what value of major:minor the kernel assignes next time, hence
this bug makes the use of global_filter quite unusable!
The blkdeactivate script iterates over the list of devices if they're
given as an argument and it tries to umount/deactivate them one by one.
This iteration failed to proceed if any of the umount/deactivation
was unsuccessful - there was a missing "shift" call to move to the
next argument (device) for processing. As a result of this, the same
device was tried again and again, causing an endless loop, never
proceeding to the next device given.
When using ENV{SYSTEMD_WANTS}=lvm2-pvscan@... to instantiate a service
for lvmetad scan when the new PV appears in the system, the service
is started and executed. However, to track device removal, we need
to bind it (the "BindsTo" systemd directive) to a certain .device
systemd unit.
In default systemd setup, the device is tracked by it's name and
sysfs path (there's normally a sysfs path .device systemd unit for
a device and then the device name .device unit as an alias for it).
Neither of these two is useful for lvmetad update as we need to bind
it to device's <major>:<minor> pair.
The /dev/block/<major>:<minor> is the essential symlink under /dev
that exists for each block device (created by default udev rules
provided by udev directly). So let's use this as an alias for
the device's .device unit as well by means of "ENV{SYSTEMD_ALIAS}"
declaration within udev rules which systemd understands (this will
create a new alias "dev-block-<major>:<minor>.device".
Then we can easily bind the "dev-block-<major>:<minor>" device
systemd unit with instantiated lvm2-pvscan@<major>:<minor>.service.
So once the device is removed from the systemd, the
lvm-pvscan@<major>:<minor>.service executes it's ExecStop action
(which in turn notifies lvmetad about the device being gone).
This completes the udev-systemd-lvmetad interaction then.
There is no point eating stderr for these commands. In fact the
redirect causes confusion and hurts dubugging.
Also reword an error message if the pvs command fails so as not be
certain that a device is not a PV. Coupled with removing the stderr
redirect this will improve the user experience in the face of errors.
The new lvm2-pvscan@.service is responsible for on-demand execution
of "pvscan --cache --activate ay" which causes lvmetad to be
updated and LVM activation done if the VG is complete.
Also, use udev-systemd mechanism to instantiate the job as the
lvm2-pvscan@$devnode.service on each newly appeared PV in the system.
This prevents the background job to be killed (that would happen
if it was directly forked from udev rule - this behaviour is seen
in recent versions of udev with the help of systemd that can track
detached processes - the detached process would still be in the same
cgroup).
To enable this official udev-systemd protocol for instantiating
background jobs, use new --enable-udev-systemd-background-jobs
configure switch (it's disabled by default). This option is highly
recommended wherever systemd is used!
lvmetad is not yet supported in clustered environment so
disable it automatically if using lvmconf --enable-cluster
and reset it to default value if using lvmconf --disable-cluster.
Also, add a few comments in lvm.conf about locking_type vs. use_lvmetad
if setting it for clustered environment.
The lvm2-activation-net.service was ordered only with respect to iscsi
and fcoe service before. In addition to that, we also need ordering
with respect to lvm2-activation.service to prevent parallel vgchange -aay
runs which may cause some problems during activation.
See also https://bugs.gentoo.org/show_bug.cgi?id=480066.
With this patch, the ordering is firmly set to:
lvm2-activation-early.service -> lvm2-activation.service -> lvm2-activation-net.service
Thanks to Alexander Tsoy for the original patch (modified a bit here):
https://www.redhat.com/archives/lvm-devel/2013-September/msg00049.html
Remove default "/tmp" as destination directory if no args
specified for lvm2-activation-generator. Require all the
args to be specified directly for proper functionality.
Do not print success status for lvm2-activation-generator:
"LVM: Activation generator successfully completed."
"LVM: Logical Volume autoactivation enabled." (if use_lvmetad=1)
Though this information is quite useful during boot, it may
be confusing for users if it happens anytime later and it
actually happens if systemd reloads. This is usually on package
update to update the systemd state and load any new units that are
newly installed in the system. The systemd reload is global and
so any existing generators are rerun at that moment too.
Recent version of util-linux/umount (v2.23+) provides
umount --all-targets that can unmount all the mount targets of
the same device (the bind mounts). Use this if available when
calling the umount blkdeactivate.
Otherwise, for older versions of util-linux, use findmnt
(that is also a part of the util-linux) to iterate over all
mount targets of the same device - this is the manual way.
The blkdeactivate now suppresses error messages from external
tools that are called. Instead, only a summary message "done"
or "skipped" is issued by blkdeactivate as any error in calling
the external tool (e.g. unmounting or deactivating a device) causes
the device to be skipped and the blkdeactivate continues with the
next device in the tree.
Add new -e/--errors switch to display any error messages from
external tools.
Also, suppress any output given by the external tools and add
new -v/--verbose switch to display it including the verbose
output of the tools called (this will enable error reporting
as well).
Also add blkdeactivate -vv for even more debug (the script's debug).
In case lvmetad is not used, we need to wait for udev to complete
after net-attached storage is initialized (after iscsi/fcoe service).
N.B. This also requires the storage to be attached synchronously
in the kernel itself.
The new lvm2-activation-net.service activates LVM volumes
after network-attached devices are set up (iSCSI and FCoE)
if lvmetad is disabled and hence the autoactivation is not
used.
When the init scripts are run from within systemd, the systemd
needs to know the pidfile for it to work correctly when the
daemon itself is killed. Otherwise, systemd keeps these services
in "active" and "exited state" at the same time
(it assumes RemainAfterExit=yes without the pidfile reference in
chkconfig header).
See also https://bugzilla.redhat.com/show_bug.cgi?id=971819#c5.
The global filter in system's lvm.conf may conflict with the custom filter we
set up in vgimportclone (they can easily fail to intersect). Since we explicitly
avoid talking to lvmetad in vgimportclone, it is safe and reasonable to do so.
'lvchange' is used to alter a RAID 1 logical volume's write-mostly and
write-behind characteristics. The '--writemostly' parameter takes a
PV as an argument with an optional trailing character to specify whether
to set ('y'), unset ('n'), or toggle ('t') the value. If no trailing
character is given, it will set the flag.
Synopsis:
lvchange [--writemostly <PV>:{t|y|n}] [--writebehind <count>] vg/lv
Example:
lvchange --writemostly /dev/sdb1:y --writebehind 512 vg/raid1_lv
The last character in the 'lv_attr' field is used to show whether a device
has the WriteMostly flag set. It is signified with a 'w'. If the device
has failed, the 'p'artial flag has priority.
Example ("nosync" raid1 with mismatch_cnt and writemostly):
[~]# lvs -a --segment vg
LV VG Attr #Str Type SSize
raid1 vg Rwi---r-m 2 raid1 500.00m
[raid1_rimage_0] vg Iwi---r-- 1 linear 500.00m
[raid1_rimage_1] vg Iwi---r-w 1 linear 500.00m
[raid1_rmeta_0] vg ewi---r-- 1 linear 4.00m
[raid1_rmeta_1] vg ewi---r-- 1 linear 4.00m
Example (raid1 with mismatch_cnt, writemostly - but failed drive):
[~]# lvs -a --segment vg
LV VG Attr #Str Type SSize
raid1 vg rwi---r-p 2 raid1 500.00m
[raid1_rimage_0] vg Iwi---r-- 1 linear 500.00m
[raid1_rimage_1] vg Iwi---r-p 1 linear 500.00m
[raid1_rmeta_0] vg ewi---r-- 1 linear 4.00m
[raid1_rmeta_1] vg ewi---r-p 1 linear 4.00m
A new reportable field has been added for writebehind as well. If
write-behind has not been set or the LV is not RAID1, the field will
be blank.
Example (writebehind is set):
[~]# lvs -a -o name,attr,writebehind vg
LV Attr WBehind
lv rwi-a-r-- 512
[lv_rimage_0] iwi-aor-w
[lv_rimage_1] iwi-aor--
[lv_rmeta_0] ewi-aor--
[lv_rmeta_1] ewi-aor--
Example (writebehind is not set):
[~]# lvs -a -o name,attr,writebehind vg
LV Attr WBehind
lv rwi-a-r--
[lv_rimage_0] iwi-aor-w
[lv_rimage_1] iwi-aor--
[lv_rmeta_0] ewi-aor--
[lv_rmeta_1] ewi-aor--
On glibc, those are erroneously (namespace pollution) pulled in via
other headers. this doesn't work with conformant libcs (musl libc in
this case), we simply need to include all needed headers.
Signed-Off-By: John Spencer <maillist-lvm@barfooze.de>
If there was a nested mountpoint inside an existing mount path,
blkdeactivate could fail to unmount such a mountpoint as it
needs to deactivate the deepest path first and continue upwards.
For example the simplest reproducer:
[root@rhel6-a ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 4G 0 disk
|-vg-lvol0 (dm-2) 253:2 0 32M 0 lvm /mnt/a
`-vg-lvol1 (dm-3) 253:3 0 32M 0 lvm /mnt/a/b
Before this patch:
[root@rhel6-a ~]# blkdeactivate -u
Deactivating block devices:
UMOUNT: unmounting vg-lvol0 (dm-2) mounted on /mnt/a
umount: /mnt/a: device is busy.
(In some cases useful info about processes that use
the device is found by lsof(8) or fuser(1))
UMOUNT: unmounting vg-lvol1 (dm-3) mounted on /mnt/a/b
LVM: deactivating Logical Volume vg/lvol1
(deactivation of vg/lvol0 is skipped as /mnt/a that is on lvol0
can't be unmounted - it still has /mnt/a/b as nested mountpoint!)
With this patch applied:
[root@rhel6-a ~]# blkdeactivate -u
Deactivating block devices:
UMOUNT: unmounting vg-lvol1 (dm-3) mounted on /mnt/a/b
UMOUNT: unmounting vg-lvol0 (dm-2) mounted on /mnt/a
LVM: deactivating Logical Volume vg/lvol0
LVM: deactivating Logical Volume vg/lvol1
===
Also, this patch contains a fix for processing mangled mount paths:
[root@rhel6-a ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 4G 0 disk
`-vg-lvol0 (dm-2) 253:2 0 32M 0 lvm /mnt/x y z
[root@rhel6-a ~]# lsblk -r
vg-lvol0 253:2 0 32M 0 lvm /mnt/x\x20y\x20z
(the mount path is mangled with \xNN that is visible in raw
lsblk output only and which is used in blkdeactive as well)
Before this patch:
[root@rhel6-a ~]# blkdeactivate -u
Deactivating block devices:
umount: /mnt/x\x20y\x20z: not found
After this patch applied:
[root@rhel6-a ~]# blkdeactivate -u
Deactivating block devices:
UMOUNT: unmounting vg-lvol0 (dm-2) mounted on /mnt/x\x20y\x20z
LVM: deactivating Logical Volume vg/lvol0
For reseting locale environment into significantly less memory
consuming version 'C' - use LC_ALL instead of LANG since it has
higher priority in locale settings.
Otherwise we may observe whole locale-archive which might be
over 100MB on i.e. Fedora systems locked in memory with
some daemons.
Fix previous commit 360c569ce8.
Remove only fedora-storage-init/fedora-storage-init-late.service, but
not lvm2-activation.service.
fedora-storage-init.service fedora-storage-init-late.service
Don't use lvmetad in lvm2-monitor.service ExecStop to avoid a systemd issue.
- a systemd design issue while processing dependencies
with socket-based activation that ends up with a hang
- https://bugzilla.redhat.com/show_bug.cgi?id=843587
(also tracker bug https://bugzilla.redhat.com/show_bug.cgi?id=871527)
- not using lvmetad in this case is just a workaround, once the bug
above is resolved, we should enable the lvmetad in that specific case
Remove dependency on fedora-storage-init.service in lvm2 systemd units.
- fedora-storage-init.service and fedora-storage-init-late.service is
going to be separated into respective units that belong to each block
device subsystem:
- mpath + mdraid activated via udev solely
- dmraid with its own dmraid-activation.service unit
- lvm2 with the lvm2-activation-generator to generate the
activation units runtime if lvmetad disabled
(global/use_lvmetad=0 set in lvm.conf) and activation done
via udev+lvmetad if lvmetad enabled (global/use_lvmetad=1 set
in lvm.conf)
Depend on lvm2-lvmetad.socket in lvm2-monitor.service systemd unit.
- as lvm2-monitor uses lvmetad if lvmetad is enabled
blkdeactivate - utility to deactivate block devices
Traverses the tree of block devices and tries to deactivate them.
Currently, it supports device-mapper-based devices together with LVM.
See man/blkdeactivate.8 for more info.
It is targeted for use during shutdown to properly deactivate the
whole block device stack - systemd and init scripts are provided as
well. However, it might be used directly on command line too.
Please, see the commentary at the top of the blkdeactivate script
for dependencies and versions of other utilities required.
lvm2-activation-early.service (generated by activation generator) should
be ordered before cryptsetup.target.
lvm2-monitor.service should be ordered after lvm2-activation.service,
if used. The lvm2-activation.service will replace fedora-storage-init.service
and fedora-storage-init-late.service in the end, but let's have it
prepared now.
The ExecStartPost with pvscan --cache in lvm2-lvmetad.service
is not needed now as this is called transparently within the
first LVM command that queries lvmetad.
The "fedora-wait-storage.service" that the "lvm2-activation.service"
had as a dependency (which was fedora-specific solution anyway)
is obsolete now as this unit called "modprobe scsi_wait_scan"
which is not used anymore.
The "fedora-wait-storage.service" had "systemd-udev-settle" as
its dependency, so let's depend on this one directly now,
bypassing the out-dated "fedora-wait-storage.service".
The lvm2 activation generator generates systemd units conditionally
based on the global/use_lvmetad lvm.conf setting.
If use_lvmetad=0, the lvm2-activation-early.service and lvm2-activation.service
units will be generated. These units are responsible for direct volume activation
by calling "vgchange -aay --sysinit" (this is actually the original on-boot
activation as it was used before). If use_lvmetad=1, no units will be generated
as we're relying on autoactivation.
Important thing to note is that the lvm2-activation units normally bring
in the udev-settle ("storage-wait") service that waits for udev to settle
(with block devices). We don't need this if lvmetad is used in conjunction
with autoactivation feature... but systemd units can't be enabled or disabled
(or dependencies added/removed) dynamically based on external configuration.
Therefore, we need the unit generator which adds support for such situations:
the units as a whole either exist or not based on the external configuration.
Monitoring is handled using "vgchange --monitor" call. Ensure that lvmetad is up
and running at the time of this call to prevent any fallback to direct scan
within the vgchange. The same applies for shutdown sequence but the other way
round - switch monitoring off and lvmetad afterwards.
The clmvd init script called "vgchange -aly" before to activate
all VGs in cluster environment. This activated all VGs, no matter
if it was clustered or not.
Auto activation for clustered VGs is not supported yet so the behaviour
of -aay is still the same as before for clustered VGs. However, for
non-clustered VGs, we need to check with the activation/auto_activation_volume_list
whether the VG/LV should be activated on boot or not.
There were several hard-coded values for run directory around the code.
Also, some tools are DM specific only, others are LVM specific and there
was no distinction made here before. With this patch applied, we have
this cleaned up a bit (subsystem in brackets, defaults in parentheses):
[common] configurable PID_DIR (/var/run)
lvm [lvm] configurable RUN_DIR (/var/run/lvm)
configurable locking dir (/var/lock/lvm)
clvmd [lvm] configurable pid file (PID_DIR/clvmd.pid)
socket (RUN_DIR/clvmd.sock)
lvmetad [lvm] configurable pid file (PID_DIR/lvmetad.pid)
socket (RUN_DIR/lvmetad.socket)
dm [dm] configurable DM_RUN_DIR (/var/run)
cmirrord [dm] configurable pid file (PID_DIR/cmirrord.pid)
dmeventd [dm] configurable pid file (PID_DIR/dmeventd.pid)
server fifo (DM_RUN_DIR/dmeventd-server)
client fifo (DM_RUN_DIR/dmeventd-client)
The changes briefly:
- added configure --with-default-pid-dir
- added configure --with-default-dm-run-dir
- added configure --with-lvmetad-pidfile
- by default, using one common pid directory for everything
(only lvmetad was not following this before)
Fix propagation of -e option - pass it via internal shell variable.
Fix parsing of /proc/mounts files (don't check for substrings).
as reported by O.Mangold with suggested patch:
https://www.redhat.com/archives/linux-lvm/2012-February/msg00030.html
Properly pass arguments with spaces ("$@")
Add validation for YES and EXTOFF variable content.
LISTEN_PID and LISTEN_FDS environment variables are defined only during systemd
"start" action. But we still need to know whether we're activated during
"reload" action as well - we use the reload action to call "dmeventd -R"/"lvmetad -R"
for statefull daemon restart. We can't use normal "restart" as that is simply
composed of "stop" and "start" and we would lose any state the daemon has.
/etc/tmpfiles.d directory holds configuration files for temporary/volatile
files and directories that should be automatically managed. For example,
if we have some parts of the fs hierarchy on tmpfs, we'd like to recreate
some files or directories on every boot so they're always prepared for use.
Systemd can read such configuration files. For now, the lock and run directory
are the ones that are most probably placed on tmpfs. If this is the case, we
can install the configuration by 'make install_tmpfiles_configuration'.