mirror of git://sourceware.org/git/lvm2.git synced 2025-01-06 17:18:29 +03:00
lvm2/tools/toollib.c

/*
 * Copyright (C) 2001-2004 Sistina Software, Inc. All rights reserved.
 * Copyright (C) 2004-2017 Red Hat, Inc. All rights reserved.
 *
 * This file is part of LVM2.
 *
 * This copyrighted material is made available to anyone wishing to use,
 * modify, copy, or redistribute it subject to the terms and conditions
 * of the GNU Lesser General Public License v.2.1.
 *
 * You should have received a copy of the GNU Lesser General Public License
 * along with this program; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 */

#include "tools.h"
#include "lib/format_text/format-text.h"
#include "lib/label/hints.h"
#include "lib/device/device_id.h"
#include "lib/device/online.h"
#include <sys/stat.h>
#include <signal.h>
#include <sys/wait.h>
#include <sys/utsname.h>

#define report_log_ret_code(ret_code) report_current_object_cmdlog(REPORT_OBJECT_CMDLOG_NAME, \
					((ret_code) == ECMD_PROCESSED) ? REPORT_OBJECT_CMDLOG_SUCCESS \
								   : REPORT_OBJECT_CMDLOG_FAILURE, (ret_code))

const char *command_name(struct cmd_context *cmd)
{
	return cmd->command->name;
}

static void _sigchld_handler(int sig __attribute__((unused)))
{
	while (wait4(-1, NULL, WNOHANG | WUNTRACED, NULL) > 0) ;
}

/*
 * returns:
 * -1 if the fork failed
 *  0 if the parent
 *  1 if the child
 */
int become_daemon(struct cmd_context *cmd, int skip_lvm)
{
	static const char devnull[] = "/dev/null";
	int null_fd;
	pid_t pid;
	struct sigaction act = {
		.sa_handler = _sigchld_handler,
		.sa_flags = SA_NOCLDSTOP,
	};

	log_verbose("Forking background process from command: %s", cmd->cmd_line);

	if (sigaction(SIGCHLD, &act, NULL))
		log_warn("WARNING: Failed to set SIGCHLD action.");

	if (!skip_lvm)
		if (!sync_local_dev_names(cmd)) { /* Flush ops and reset dm cookie */
			log_error("Failed to sync local devices before forking.");
			return -1;
		}

	if ((pid = fork()) == -1) {
		log_error("fork failed: %s", strerror(errno));
		return -1;
	}

	/* Parent */
	if (pid > 0)
		return 0;

	/* Child */
	if (setsid() == -1)
		log_error("Background process failed to setsid: %s",
			  strerror(errno));

/* Set this to avoid discarding output from background process */
// #define DEBUG_CHILD

#ifndef DEBUG_CHILD
	if ((null_fd = open(devnull, O_RDWR)) == -1) {
		log_sys_error("open", devnull);
		_exit(ECMD_FAILED);
	}

	if ((dup2(null_fd, STDIN_FILENO) < 0)  || /* reopen stdin */
	    (dup2(null_fd, STDOUT_FILENO) < 0) || /* reopen stdout */
	    (dup2(null_fd, STDERR_FILENO) < 0)) { /* reopen stderr */
		log_sys_error("dup2", "redirect");
		(void) close(null_fd);
		_exit(ECMD_FAILED);
	}

	if (null_fd > STDERR_FILENO)
		(void) close(null_fd);

	init_verbose(VERBOSE_BASE_LEVEL);
#endif	/* DEBUG_CHILD */

	strncpy(*cmd->argv, "(lvm2)", strlen(*cmd->argv));

	if (!skip_lvm) {
		reset_locking();
		lvmcache_destroy(cmd, 1, 1);
		if (!lvmcache_init(cmd))
			/* FIXME Clean up properly here */
			_exit(ECMD_FAILED);
	}

	/* coverity[leaked_handle] null_fd does not leak here */
	return 1;
}

/*
 * Strip dev_dir if present
 */
const char *skip_dev_dir(struct cmd_context *cmd, const char *vg_name,
			 unsigned *dev_dir_found)
{
	size_t devdir_len = strlen(cmd->dev_dir);
	const char *dmdir = dm_dir() + devdir_len;
	size_t dmdir_len = strlen(dmdir), vglv_sz;
	char *vgname = NULL, *lvname, *layer, *vglv;

	/* FIXME Do this properly */
	if (*vg_name == '/')
		while (vg_name[1] == '/')
			vg_name++;

	if (strncmp(vg_name, cmd->dev_dir, devdir_len)) {
		if (dev_dir_found)
			*dev_dir_found = 0;
	} else {
		if (dev_dir_found)
			*dev_dir_found = 1;

		vg_name += devdir_len;
		while (*vg_name == '/')
			vg_name++;

		/* Reformat string if /dev/mapper found */
		if (!strncmp(vg_name, dmdir, dmdir_len) && vg_name[dmdir_len] == '/') {
			vg_name += dmdir_len + 1;
			while (*vg_name == '/')
				vg_name++;

			if (!dm_split_lvm_name(cmd->mem, vg_name, &vgname, &lvname, &layer) ||
			    *layer) {
				log_error("skip_dev_dir: Couldn't split up device name %s.",
					  vg_name);
				return vg_name;
			}
			vglv_sz = strlen(vgname) + strlen(lvname) + 2;
			if (!(vglv = dm_pool_alloc(cmd->mem, vglv_sz)) ||
			    dm_snprintf(vglv, vglv_sz, "%s%s%s", vgname,
					*lvname ? "/" : "",
					lvname) < 0) {
				log_error("vg/lv string alloc failed.");
				return vg_name;
			}
			return vglv;
		}
	}

	return vg_name;
}

static int _printed_clustered_vg_advice = 0;

/*
 * Three possible results:
 * a) return 0, skip 0: take the VG, and cmd will end in success
 * b) return 0, skip 1: skip the VG, and cmd will end in success
 * c) return 1, skip *: skip the VG, and cmd will end in failure
 *
 * Case b is the special case, and includes the following:
 * . The VG is inconsistent, and the command allows for inconsistent VGs.
 * . The VG is clustered, the host cannot access clustered VG's,
 *   and the command option has been used to ignore clustered vgs.
 *
 * Case c covers the other errors returned when reading the VG.
 *   If *skip is 1, it's OK for the caller to read the list of PVs in the VG.
 */
static int _ignore_vg(struct cmd_context *cmd,
		      uint32_t error_flags, struct volume_group *error_vg,
		      const char *vg_name, struct dm_list *arg_vgnames,
		      uint32_t read_flags, int *skip, int *notfound)
{
	uint32_t read_error = error_flags;

	*skip = 0;
	*notfound = 0;

	if ((read_error & FAILED_NOTFOUND) && (read_flags & READ_OK_NOTFOUND)) {
		*notfound = 1;
		return 0;
	}

	if (read_error & FAILED_CLUSTERED) {
In this case, the outdated PV needs have its outdated metadata removed and the PV used flag needs to be cleared. This repair will be done by the subsequent repair command. It is also done if vgremove is run on the VG. MISSING PVs ----------- When a device is missing, most commands will refuse to modify the VG. This is the simple case. More complicated is when a command is allowed to modify the VG while it is missing a device. When a VG is written while a device is missing for one of it's PVs, the VG metadata is written to disk with the MISSING flag on the PV with the missing device. When the VG is next used, it is treated as if the PV with the MISSING flag still has a missing device, even if that device has reappeared. If all LVs that were using a PV with the MISSING flag are removed or repaired so that the MISSING PV is no longer used, then the next time the VG metadata is written, the MISSING flag will be dropped. Alternative methods of clearing the MISSING flag are: vgreduce --removemissing will remove PVs with missing devices, or PVs with the MISSING flag where the device has reappeared. vgextend --restoremissing will clear the MISSING flag on PVs where the device has reappeared, allowing the VG to be used normally. This must be done with caution since the reappeared device may have old data that is inconsistent with data on other PVs. Bad mda repair -------------- The new command: vgck --updatemetadata VG first uses vg_write to repair old metadata, and other basic issues mentioned above (old metadata, outdated PVs, pv_header flags, MISSING_PV flags). It will also go further and repair bad metadata: . text metadata that has a bad checksum . text metadata that is not parsable . corrupt mda_header checksum and version fields (To keep a clean diff, #if 0 is added around functions that are replaced by new code. These commented functions are removed by the following commit.)
		if (arg_vgnames && str_list_match_item(arg_vgnames, vg_name)) {
			log_error("Cannot access clustered VG %s.", vg_name);
			if (!_printed_clustered_vg_advice) {
				_printed_clustered_vg_advice = 1;
				log_error("See lvmlockd(8) for changing a clvm/clustered VG to a shared VG.");
			}
			return 1;
		} else {
			log_warn("Skipping clustered VG %s.", vg_name);
			if (!_printed_clustered_vg_advice) {
				_printed_clustered_vg_advice = 1;
				log_error("See lvmlockd(8) for changing a clvm/clustered VG to a shared VG.");
			}
			*skip = 1;
			return 0;
		}
	}
exported vg handling

The exported VG checking/enforcement was scattered and
inconsistent. This centralizes it and makes it consistent,
following the existing approach for foreign and shared
VGs/PVs, which are very similar to exported VGs/PVs.

The access policy that now applies to foreign/shared/exported
VGs/PVs is that if a foreign/shared/exported VG/PV is named
on the command line (i.e. explicitly requested by the user),
and the command is not permitted to operate on it because it
is foreign/shared/exported, then an access error is reported
and the command exits with an error.

But, if the command is processing all VGs/PVs, and happens to
come across a foreign/shared/exported VG/PV (that is not
explicitly named on the command line), then the command
silently skips it and does not produce an error.

A command using tags or --select handles inaccessible VGs/PVs
the same way as a command processing all VGs/PVs, and will
not report/return errors if these inaccessible VGs/PVs exist.

The new policy fixes the exit codes on a somewhat random set
of commands that previously exited with an error if they were
looking at all VGs/PVs and an exported VG existed on the system.

There should be no change to which commands are
allowed/disallowed on exported VGs/PVs.

Certain LV commands (lvs/lvdisplay/lvscan) would previously
not display LVs from an exported VG (for unknown reasons).
This has not changed.

The lvm fullreport command would previously report info about
an exported VG but not about the LVs in it. This has changed
to include all info from the exported VG.
	if (read_error & FAILED_EXPORTED) {
		if (arg_vgnames && str_list_match_item(arg_vgnames, vg_name)) {
			log_error("Volume group %s is exported", vg_name);
			return 1;
		} else {
			read_error &= ~FAILED_EXPORTED; /* Check for other errors */
			log_verbose("Skipping exported volume group %s", vg_name);
			*skip = 1;
		}
	}

	/*
	 * Commands that operate on "all vgs" shouldn't be bothered by
	 * skipping a foreign VG, and the command shouldn't fail when
	 * one is skipped. But, if the command explicitly asked to
	 * operate on a foreign VG and it's skipped, then the command
	 * would expect to fail.
	 */
	if (read_error & FAILED_SYSTEMID) {
		if (arg_vgnames && str_list_match_item(arg_vgnames, vg_name)) {
			log_error("Cannot access VG %s with system ID %s with %slocal system ID%s%s.",
				  vg_name,
				  error_vg ? error_vg->system_id : "unknown ",
				  cmd->system_id ? "" : "unknown ",
				  cmd->system_id ? " " : "",
				  cmd->system_id ? cmd->system_id : "");
			return 1;
		} else {
			read_error &= ~FAILED_SYSTEMID; /* Check for other errors */
			log_verbose("Skipping foreign volume group %s", vg_name);
			*skip = 1;
		}
	}
	/*
	 * Accessing a lockd VG when lvmlockd is not used is similar
	 * to accessing a foreign VG.
	 * This is also the point where a command fails if it failed
	 * to acquire the necessary lock from lvmlockd.
	 * The two cases are distinguished by FAILED_LOCK_TYPE (the
	 * VG lock_type requires lvmlockd), and FAILED_LOCK_MODE (the
	 * command failed to acquire the necessary lock.)
	 */
	if (read_error & (FAILED_LOCK_TYPE | FAILED_LOCK_MODE)) {
In this case, the outdated PV needs have its outdated metadata removed and the PV used flag needs to be cleared. This repair will be done by the subsequent repair command. It is also done if vgremove is run on the VG. MISSING PVs ----------- When a device is missing, most commands will refuse to modify the VG. This is the simple case. More complicated is when a command is allowed to modify the VG while it is missing a device. When a VG is written while a device is missing for one of it's PVs, the VG metadata is written to disk with the MISSING flag on the PV with the missing device. When the VG is next used, it is treated as if the PV with the MISSING flag still has a missing device, even if that device has reappeared. If all LVs that were using a PV with the MISSING flag are removed or repaired so that the MISSING PV is no longer used, then the next time the VG metadata is written, the MISSING flag will be dropped. Alternative methods of clearing the MISSING flag are: vgreduce --removemissing will remove PVs with missing devices, or PVs with the MISSING flag where the device has reappeared. vgextend --restoremissing will clear the MISSING flag on PVs where the device has reappeared, allowing the VG to be used normally. This must be done with caution since the reappeared device may have old data that is inconsistent with data on other PVs. Bad mda repair -------------- The new command: vgck --updatemetadata VG first uses vg_write to repair old metadata, and other basic issues mentioned above (old metadata, outdated PVs, pv_header flags, MISSING_PV flags). It will also go further and repair bad metadata: . text metadata that has a bad checksum . text metadata that is not parsable . corrupt mda_header checksum and version fields (To keep a clean diff, #if 0 is added around functions that are replaced by new code. These commented functions are removed by the following commit.)
2019-05-24 20:04:37 +03:00
if (arg_vgnames && str_list_match_item(arg_vgnames, vg_name)) {
if (read_error & FAILED_LOCK_TYPE)
log_error("Cannot access VG %s with lock type %s that requires lvmlockd.",
vg_name,
error_vg ? error_vg->lock_type : "unknown");
/* For FAILED_LOCK_MODE, the error is printed in vg_read. */
return 1;
} else {
read_error &= ~FAILED_LOCK_TYPE; /* Check for other errors */
read_error &= ~FAILED_LOCK_MODE;
log_verbose("Skipping volume group %s", vg_name);
*skip = 1;
}
}
if (read_error != SUCCESS) {
*skip = 0;
if (is_orphan_vg(vg_name))
log_error("Cannot process standalone physical volumes");
else
log_error("Cannot process volume group %s", vg_name);
return 1;
}
return 0;
}
/*
 * This function updates the "selected" arg only if the last item processed
 * is selected, which implements the rule "the whole structure is selected
 * if at least one of its items is selected".
 */
static void _update_selection_result(struct processing_handle *handle, int *selected)
{
if (!handle || !handle->selection_handle)
return;
if (handle->selection_handle->selected)
*selected = 1;
}
static void _set_final_selection_result(struct processing_handle *handle, int selected)
{
if (!handle || !handle->selection_handle)
return;
handle->selection_handle->selected = selected;
}
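/*
 * Illustrative sketch only, not part of lvm2: the "whole structure is
 * selected if at least one of its items is selected" rule used by the two
 * helpers above, reduced to a standalone function. struct demo_item and
 * demo_whole_selected() are hypothetical names introduced for this example.
 */

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a processed item; not an lvm2 type. */
struct demo_item {
	int selected;
};

/*
 * Return 1 if at least one of the n items is selected, otherwise 0.
 * The flag is only ever raised, never cleared, so one selected item
 * is enough to select the whole structure.
 */
int demo_whole_selected(const struct demo_item *items, size_t n)
{
	int whole_selected = 0;
	size_t i;

	assert(items || !n);

	for (i = 0; i < n; i++)
		if (items[i].selected)
			whole_selected = 1;

	return whole_selected;
}
```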
/*
* Metadata iteration functions
*/
int process_each_segment_in_pv(struct cmd_context *cmd,
struct volume_group *vg,
struct physical_volume *pv,
struct processing_handle *handle,
process_single_pvseg_fn_t process_single_pvseg)
{
struct pv_segment *pvseg;
int whole_selected = 0;
int ret_max = ECMD_PROCESSED;
int ret;
struct pv_segment _free_pv_segment = { .pv = pv };
if (dm_list_empty(&pv->segments)) {
ret = process_single_pvseg(cmd, NULL, &_free_pv_segment, handle);
if (ret != ECMD_PROCESSED)
stack;
if (ret > ret_max)
ret_max = ret;
} else {
dm_list_iterate_items(pvseg, &pv->segments) {
if (sigint_caught())
return_ECMD_FAILED;
ret = process_single_pvseg(cmd, vg, pvseg, handle);
_update_selection_result(handle, &whole_selected);
if (ret != ECMD_PROCESSED)
stack;
if (ret > ret_max)
ret_max = ret;
}
}
/* the PV is selected if at least one PV segment is selected */
_set_final_selection_result(handle, whole_selected);
return ret_max;
}
int process_each_segment_in_lv(struct cmd_context *cmd,
struct logical_volume *lv,
struct processing_handle *handle,
process_single_seg_fn_t process_single_seg)
{
struct lv_segment *seg;
int whole_selected = 0;
int ret_max = ECMD_PROCESSED;
int ret;
dm_list_iterate_items(seg, &lv->segments) {
if (sigint_caught())
return_ECMD_FAILED;
ret = process_single_seg(cmd, seg, handle);
_update_selection_result(handle, &whole_selected);
if (ret != ECMD_PROCESSED)
stack;
if (ret > ret_max)
ret_max = ret;
}
/* the LV is selected if at least one LV segment is selected */
_set_final_selection_result(handle, whole_selected);
return ret_max;
}
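/*
 * Illustrative sketch only, not part of lvm2: both iteration functions
 * above keep the highest return code seen across all items, so a single
 * failure is reported without stopping the loop. The enum values and all
 * demo_* names below are assumptions made for this example; they are not
 * the real ECMD_* definitions.
 */

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes; higher means more severe. */
enum {
	DEMO_PROCESSED = 1,
	DEMO_FAILED = 5
};

typedef int (*demo_item_fn)(int item);

/* Sample callback: treats negative items as failures. */
int demo_fail_on_negative(int item)
{
	return item < 0 ? DEMO_FAILED : DEMO_PROCESSED;
}

/*
 * Call fn on every item and return the highest code any call produced.
 * Processing continues after a failure, as in the loops above.
 */
int demo_process_each(const int *items, size_t n, demo_item_fn fn)
{
	int ret_max = DEMO_PROCESSED;
	int ret;
	size_t i;

	assert(fn && (items || !n));

	for (i = 0; i < n; i++) {
		ret = fn(items[i]);
		if (ret > ret_max)
			ret_max = ret;
	}

	return ret_max;
}
```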
static const char *_extract_vgname(struct cmd_context *cmd, const char *lv_name,
const char **after)
{
const char *vg_name = lv_name;
char *st, *pos;
/* Strip dev_dir (optional) */
if (!(vg_name = skip_dev_dir(cmd, vg_name, NULL)))
return_0;
/* Require exactly one set of consecutive slashes */
if ((st = pos = strchr(vg_name, '/')))
while (*st == '/')
st++;
if (!st || strchr(st, '/')) {
log_error("\"%s\": Invalid path for Logical Volume.",
lv_name);
return 0;
}
if (!(vg_name = dm_pool_strndup(cmd->mem, vg_name, pos - vg_name))) {
log_error("Allocation of vg_name failed.");
return 0;
}
if (after)
*after = st;
return vg_name;
}
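/*
 * Illustrative sketch only, not part of lvm2: the "exactly one set of
 * consecutive slashes" rule from _extract_vgname(), without the dev_dir
 * stripping and with a caller-supplied buffer instead of pool allocation.
 * demo_split_vg_lv() is a hypothetical helper introduced for this example.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split "vg/lv" (or "vg//lv") into a VG name and the LV part.
 * Copies the VG name into vg_buf and returns a pointer to the LV part
 * inside path. Returns NULL if path does not contain exactly one run
 * of slashes, or if the VG name does not fit in vg_buf.
 */
const char *demo_split_vg_lv(const char *path, char *vg_buf, size_t vg_buf_len)
{
	const char *pos, *st;
	size_t len;

	assert(path && vg_buf);

	if (!(pos = strchr(path, '/')))
		return NULL;

	/* Skip the whole run of consecutive slashes. */
	for (st = pos; *st == '/'; st++)
		;

	/* A second, separate slash means there are too many components. */
	if (strchr(st, '/'))
		return NULL;

	len = (size_t) (pos - path);
	if (len >= vg_buf_len)
		return NULL;

	memcpy(vg_buf, path, len);
	vg_buf[len] = '\0';

	return st;
}
```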
/*
* Extract default volume group name from environment
*/
static const char *_default_vgname(struct cmd_context *cmd)
{
const char *vg_path;
/* Take default VG from environment? */
vg_path = getenv("LVM_VG_NAME");
if (!vg_path)
return 0;
vg_path = skip_dev_dir(cmd, vg_path, NULL);
if (strchr(vg_path, '/')) {
log_error("\"%s\": Invalid environment var LVM_VG_NAME set for Volume Group.",
vg_path);
return 0;
}
return dm_pool_strdup(cmd->mem, vg_path);
}
/*
* Determine volume group name from a logical volume name
*/
const char *extract_vgname(struct cmd_context *cmd, const char *lv_name)
{
const char *vg_name = lv_name;
/* Path supplied? */
if (vg_name && strchr(vg_name, '/')) {
if (!(vg_name = _extract_vgname(cmd, lv_name, NULL)))
return_NULL;
return vg_name;
}
if (!(vg_name = _default_vgname(cmd))) {
if (lv_name)
log_error("Path required for Logical Volume \"%s\".",
lv_name);
return NULL;
}
return vg_name;
}
const char _pe_size_may_not_be_negative_msg[] = "Physical extent size may not be negative.";
int vgcreate_params_set_defaults(struct cmd_context *cmd,
struct vgcreate_params *vp_def,
struct volume_group *vg)
{
int64_t extent_size;
/* Only vgsplit sets vg */
if (vg) {
vp_def->vg_name = NULL;
vp_def->extent_size = vg->extent_size;
vp_def->max_pv = vg->max_pv;
vp_def->max_lv = vg->max_lv;
vp_def->alloc = vg->alloc;
vp_def->vgmetadatacopies = vg->mda_copies;
vp_def->system_id = vg->system_id; /* No need to clone this */
} else {
vp_def->vg_name = NULL;
extent_size = find_config_tree_int64(cmd,
allocation_physical_extent_size_CFG, NULL) * 2;
if (extent_size < 0) {
log_error(_pe_size_may_not_be_negative_msg);
return 0;
}
vp_def->extent_size = (uint32_t) extent_size;
vp_def->max_pv = DEFAULT_MAX_PV;
vp_def->max_lv = DEFAULT_MAX_LV;
vp_def->alloc = DEFAULT_ALLOC_POLICY;
vp_def->vgmetadatacopies = DEFAULT_VGMETADATACOPIES;
vp_def->system_id = cmd->system_id;
}
return 1;
}
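/*
 * Illustrative sketch only, not part of lvm2: the "* 2" applied to the
 * configured extent size above is consistent with converting KiB to
 * 512-byte sectors. That the configuration value is in KiB is an
 * assumption here, not something this file states. Unlike the function
 * above, this hypothetical helper also rejects values that would not fit
 * in 32 bits.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 1 KiB equals 2 sectors of 512 bytes each. */
#define DEMO_SECTORS_PER_KIB 2

/*
 * Convert a size in KiB to 512-byte sectors.
 * Returns 1 and stores the result in *sectors on success.
 * Returns 0 and leaves *sectors untouched if kib is negative or the
 * result would not fit in a uint32_t.
 */
int demo_kib_to_sectors(int64_t kib, uint32_t *sectors)
{
	assert(sectors);

	if (kib < 0 || kib > (int64_t) UINT32_MAX / DEMO_SECTORS_PER_KIB)
		return 0;

	*sectors = (uint32_t) (kib * DEMO_SECTORS_PER_KIB);

	return 1;
}
```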
/*
* Set members of struct vgcreate_params from cmdline arguments.
* Do preliminary validation with arg_*() interface.
* Further, more generic validation is done in validate_vgcreate_params().
* This function is to remain in tools directory.
*/
int vgcreate_params_set_from_args(struct cmd_context *cmd,
struct vgcreate_params *vp_new,
struct vgcreate_params *vp_def)
{
const char *system_id_arg_str;
const char *lock_type = NULL;
int use_lvmlockd;
lock_type_t lock_type_num;
if (arg_is_set(cmd, clustered_ARG)) {
log_error("The clustered option is deprecated, see --shared.");
return 0;
}
vp_new->vg_name = skip_dev_dir(cmd, vp_def->vg_name, NULL);
vp_new->max_lv = arg_uint_value(cmd, maxlogicalvolumes_ARG,
vp_def->max_lv);
vp_new->max_pv = arg_uint_value(cmd, maxphysicalvolumes_ARG,
vp_def->max_pv);
vp_new->alloc = (alloc_policy_t) arg_uint_value(cmd, alloc_ARG, vp_def->alloc);
/* Units of 512-byte sectors */
vp_new->extent_size =
arg_uint_value(cmd, physicalextentsize_ARG, vp_def->extent_size);
if (arg_sign_value(cmd, physicalextentsize_ARG, SIGN_NONE) == SIGN_MINUS) {
log_error(_pe_size_may_not_be_negative_msg);
return 0;
}
if (arg_uint64_value(cmd, physicalextentsize_ARG, 0) > MAX_EXTENT_SIZE) {
log_error("Physical extent size must be smaller than %s.",
display_size(cmd, (uint64_t) MAX_EXTENT_SIZE));
return 0;
}
if (arg_sign_value(cmd, maxlogicalvolumes_ARG, SIGN_NONE) == SIGN_MINUS) {
log_error("Max Logical Volumes may not be negative.");
return 0;
}
if (arg_sign_value(cmd, maxphysicalvolumes_ARG, SIGN_NONE) == SIGN_MINUS) {
log_error("Max Physical Volumes may not be negative.");
return 0;
}
if (arg_is_set(cmd, vgmetadatacopies_ARG))
vp_new->vgmetadatacopies = arg_int_value(cmd, vgmetadatacopies_ARG,
DEFAULT_VGMETADATACOPIES);
else
vp_new->vgmetadatacopies = find_config_tree_int(cmd, metadata_vgmetadatacopies_CFG, NULL);
if (!(system_id_arg_str = arg_str_value(cmd, systemid_ARG, NULL))) {
vp_new->system_id = vp_def->system_id;
} else {
if (!(vp_new->system_id = system_id_from_string(cmd, system_id_arg_str)))
return_0;
/* FIXME Take local/extra_system_ids into account */
if (vp_new->system_id && cmd->system_id &&
strcmp(vp_new->system_id, cmd->system_id)) {
if (*vp_new->system_id)
log_warn("VG with system ID %s might become inaccessible as local system ID is %s",
vp_new->system_id, cmd->system_id);
else
log_warn("WARNING: A VG without a system ID allows unsafe access from other hosts.");
}
}
if ((system_id_arg_str = arg_str_value(cmd, systemid_ARG, NULL))) {
vp_new->system_id = system_id_from_string(cmd, system_id_arg_str);
} else {
vp_new->system_id = vp_def->system_id;
}
if (system_id_arg_str) {
if (!vp_new->system_id || !vp_new->system_id[0])
log_warn("WARNING: A VG without a system ID allows unsafe access from other hosts.");
if (vp_new->system_id && cmd->system_id &&
strcmp(vp_new->system_id, cmd->system_id)) {
log_warn("VG with system ID %s might become inaccessible as local system ID is %s",
vp_new->system_id, cmd->system_id);
}
}
/*
* Locking: what kind of locking should be used for the
* new VG, and is it compatible with current lvm.conf settings.
*
* The end result is to set vp_new->lock_type to:
* none | clvm | dlm | sanlock | idm.
*
* If 'vgcreate --lock-type <arg>' is set, the answer is given
* directly by <arg> which is one of none|clvm|dlm|sanlock|idm.
*
* 'vgcreate --clustered y' is the way to create clvm VGs.
*
* 'vgcreate --shared' is the way to create lockd VGs.
* lock_type of sanlock, dlm or idm is selected based on
* which lock manager is running.
*
*
* 1. Using neither clvmd nor lvmlockd.
* ------------------------------------------------
* lvm.conf:
* global/use_lvmlockd = 0
* global/locking_type = 1
*
* - no locking is enabled
* - clvmd is not used
* - lvmlockd is not used
* - VGs with CLUSTERED set are ignored (requires clvmd)
* - VGs with lockd type are ignored (requires lvmlockd)
* - vgcreate can create new VGs with lock_type none
* - 'vgcreate --clustered y' fails
* - 'vgcreate --shared' fails
* - 'vgcreate' (neither option) creates a local VG
*
* 2. Using clvmd.
* ------------------------------------------------
* lvm.conf:
* global/use_lvmlockd = 0
* global/locking_type = 3
*
* - locking through clvmd is enabled (traditional clvm config)
* - clvmd is used
* - lvmlockd is not used
* - VGs with CLUSTERED set can be used
* - VGs with lockd type are ignored (requires lvmlockd)
* - vgcreate can create new VGs with CLUSTERED status flag
* - 'vgcreate --clustered y' works
* - 'vgcreate --shared' fails
* - 'vgcreate' (neither option) creates a clvm VG
*
* 3. Using lvmlockd.
* ------------------------------------------------
* lvm.conf:
* global/use_lvmlockd = 1
* global/locking_type = 1
*
* - locking through lvmlockd is enabled
* - clvmd is not used
* - lvmlockd is used
* - VGs with CLUSTERED set are ignored (requires clvmd)
* - VGs with lockd type can be used
* - vgcreate can create new VGs with lock_type sanlock, dlm or idm
* - 'vgcreate --clustered y' fails
* - 'vgcreate --shared' works
* - 'vgcreate' (neither option) creates a local VG
*/
use_lvmlockd = find_config_tree_bool(cmd, global_use_lvmlockd_CFG, NULL);
if (arg_is_set(cmd, locktype_ARG)) {
lock_type = arg_str_value(cmd, locktype_ARG, "");
if (arg_is_set(cmd, shared_ARG) && !is_lockd_type(lock_type)) {
log_error("The --shared option requires lock type sanlock, dlm or idm.");
return 0;
}
} else if (arg_is_set(cmd, shared_ARG)) {
int found_multiple = 0;
if (use_lvmlockd) {
if (!(lock_type = lockd_running_lock_type(cmd, &found_multiple))) {
if (found_multiple)
log_error("Found multiple lock managers, select one with --lock-type.");
else
log_error("Failed to detect a running lock manager to select lock type.");
return 0;
}
} else {
log_error("Using a shared lock type requires lvmlockd (lvm.conf use_lvmlockd.)");
return 0;
}
} else {
lock_type = "none";
}
/*
* Check that the lock_type is recognized, and is being
* used with the correct lvm.conf settings.
*/
lock_type_num = get_lock_type_from_string(lock_type);
switch (lock_type_num) {
case LOCK_TYPE_INVALID:
case LOCK_TYPE_CLVM:
log_error("lock_type %s is invalid", lock_type);
return 0;
case LOCK_TYPE_SANLOCK:
case LOCK_TYPE_DLM:
case LOCK_TYPE_IDM:
if (!use_lvmlockd) {
log_error("Using a shared lock type requires lvmlockd.");
return 0;
}
break;
case LOCK_TYPE_NONE:
break;
}
/*
* The vg is not owned by one host/system_id.
* Locking coordinates access from multiple hosts.
*/
if (lock_type_num == LOCK_TYPE_DLM || lock_type_num == LOCK_TYPE_SANLOCK)
vp_new->system_id = NULL;
vp_new->lock_type = lock_type;
log_debug("Setting lock_type to %s", vp_new->lock_type);
return 1;
}
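/*
 * Illustrative sketch only, not part of toollib.c: the lock type selection
 * above reduced to a pure function. All sketch_* names are hypothetical.
 * The "running" argument stands in for the result of
 * lockd_running_lock_type(), and the string-to-type mapping is an assumed
 * simplification of get_lock_type_from_string().
 */

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the LOCK_TYPE_* values used above. */
enum sketch_lock_type {
	SKETCH_LOCK_INVALID = -1,
	SKETCH_LOCK_NONE,
	SKETCH_LOCK_CLVM,
	SKETCH_LOCK_DLM,
	SKETCH_LOCK_SANLOCK,
	SKETCH_LOCK_IDM
};

/* Assumed mapping from a lock type string to a type number. */
static enum sketch_lock_type sketch_lock_type_from_string(const char *s)
{
	if (!s || !strcmp(s, "none"))
		return SKETCH_LOCK_NONE;
	if (!strcmp(s, "clvm"))
		return SKETCH_LOCK_CLVM;
	if (!strcmp(s, "dlm"))
		return SKETCH_LOCK_DLM;
	if (!strcmp(s, "sanlock"))
		return SKETCH_LOCK_SANLOCK;
	if (!strcmp(s, "idm"))
		return SKETCH_LOCK_IDM;
	return SKETCH_LOCK_INVALID;
}

/* Lock types that are handled through lvmlockd. */
static int sketch_is_lockd_type(const char *s)
{
	enum sketch_lock_type t = sketch_lock_type_from_string(s);

	return t == SKETCH_LOCK_DLM || t == SKETCH_LOCK_SANLOCK ||
	       t == SKETCH_LOCK_IDM;
}

/*
 * Returns the lock type string a new VG would get, or NULL where the
 * code above logs an error and returns 0.
 *   locktype_arg: explicit lock type option value, NULL if not given
 *   shared:       non-zero if --shared was given
 *   running:      detected running lock manager, NULL if none or ambiguous
 */
static const char *sketch_resolve_lock_type(int use_lvmlockd,
					    const char *locktype_arg,
					    int shared, const char *running)
{
	const char *lock_type;

	if (locktype_arg) {
		lock_type = locktype_arg;
		if (shared && !sketch_is_lockd_type(lock_type))
			return NULL;
	} else if (shared) {
		if (!use_lvmlockd || !running)
			return NULL;
		lock_type = running;
	} else
		lock_type = "none";

	switch (sketch_lock_type_from_string(lock_type)) {
	case SKETCH_LOCK_INVALID:
	case SKETCH_LOCK_CLVM:
		return NULL;	/* unrecognized, or clvm is not accepted here */
	case SKETCH_LOCK_DLM:
	case SKETCH_LOCK_SANLOCK:
	case SKETCH_LOCK_IDM:
		if (!use_lvmlockd)
			return NULL;	/* shared lock types need lvmlockd */
		break;
	case SKETCH_LOCK_NONE:
		break;
	}

	return lock_type;
}
```

/*
 * Usage: sketch_resolve_lock_type(1, NULL, 1, "sanlock") yields "sanlock",
 * while the same call with use_lvmlockd == 0 yields NULL.
 */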
/* Shared code for changing activation state for vgchange/lvchange */
int lv_change_activate(struct cmd_context *cmd, struct logical_volume *lv,
activation_change_t activate)
{
int r = 1;
int integrity_recalculate;
struct logical_volume *snapshot_lv;
if (lv_is_cache_pool(lv)) {
if (is_change_activating(activate)) {
log_verbose("Skipping activation of cache pool %s.",
display_lvname(lv));
return 1;
}
if (!dm_list_empty(&lv->segs_using_this_lv)) {
log_verbose("Skipping deactivation of used cache pool %s.",
display_lvname(lv));
return 1;
}
/*
* Only deactivation of an unused cache pool is allowed to proceed.
* This is useful only for recovering from failed zeroing of the metadata LV.
*/
}
if (lv_is_merging_origin(lv)) {
/*
* For a merging origin, its snapshot must be inactive.
* If the snapshot is still active and cannot be deactivated,
* activation or deactivation of the origin fails.
*
* When the origin is being deactivated and the merging snapshot is
* thin, deactivation of the origin still proceeds, but an error is
* reported because the thin snapshot remains active.
*
* The user can retry by deactivating the origin again,
* since the origin is the only visible LV.
*/
snapshot_lv = find_snapshot(lv)->lv;
if (lv_is_thin_type(snapshot_lv) && !deactivate_lv(cmd, snapshot_lv)) {
if (is_change_activating(activate)) {
log_error("Refusing to activate merging volume %s while "
"snapshot volume %s is still active.",
display_lvname(lv), display_lvname(snapshot_lv));
return 0;
}
log_error("Cannot fully deactivate merging origin volume %s while "
"snapshot volume %s is still active.",
display_lvname(lv), display_lvname(snapshot_lv));
r = 0; /* and continue to deactivate origin... */
}
}
if (is_change_activating(activate) &&
lvmcache_has_duplicate_devs() &&
vg_has_duplicate_pvs(lv->vg) &&
!find_config_tree_bool(cmd, devices_allow_changes_with_duplicate_pvs_CFG, NULL)) {
log_error("Cannot activate LVs in VG %s while PVs appear on duplicate devices.",
lv->vg->name);
return 0;
}
if ((integrity_recalculate = lv_has_integrity_recalculate_metadata(lv))) {
/* Don't want pvscan to write VG while running from systemd service. */
if (!strcmp(cmd->name, "pvscan")) {
log_error("Cannot activate uninitialized integrity LV %s from pvscan.",
display_lvname(lv));
return 0;
}
if (vg_is_shared(lv->vg)) {
uint32_t lockd_state = 0;
if (!lockd_vg(cmd, lv->vg->name, "ex", 0, &lockd_state)) {
log_error("Cannot activate uninitialized integrity LV %s without lock.",
display_lvname(lv));
return 0;
}
}
}
if (!lv_active_change(cmd, lv, activate))
return_0;
/* Write VG metadata to clear the integrity recalculate flag. */
if (integrity_recalculate && lv_is_active(lv)) {
log_print_unless_silent("Updating VG to complete initialization of integrity LV %s.",
display_lvname(lv));
lv_clear_integrity_recalculate_metadata(lv);
}
/*
* When LVs are deactivated, then autoactivation of the VG is
* "re-armed" by removing the vg online file. So, after deactivation
* of LVs, if PVs are disconnected and reconnected again, event
* activation will trigger autoactivation again. This secondary
* autoactivation is somewhat different from, and not as important as
* the initial autoactivation during system startup. The secondary
* autoactivation will happen to a VG on a running system and may be
* mixing with user commands, so the end result is unpredictable.
*
* It's possible that we might want a config setting for users to
* disable secondary autoactivations. Once a system is up, the
* user may want to take charge of activation changes to the VG
* and not have the system autoactivation interfere.
*/
if (!is_change_activating(activate) && cmd->event_activation)
online_vg_file_remove(lv->vg->name);
set_lv_notify(lv->vg->cmd);
return r;
}
int lv_refresh(struct cmd_context *cmd, struct logical_volume *lv)
{
struct logical_volume *snapshot_lv;
if (lv_is_merging_origin(lv)) {
snapshot_lv = find_snapshot(lv)->lv;
if (lv_is_thin_type(snapshot_lv) && !deactivate_lv(cmd, snapshot_lv))
log_print_unless_silent("Delaying merge for origin volume %s since "
"snapshot volume %s is still active.",
display_lvname(lv), display_lvname(snapshot_lv));
}
if (!lv_refresh_suspend_resume(lv))
return_0;
/*
* check if snapshot merge should be polled
* - unfortunately: even though the dev_manager will clear
* the lv's merge attributes if a merge is not possible;
* it is clearing a different instance of the lv (as
* retrieved with lv_from_lvid)
* - fortunately: polldaemon will immediately shutdown if the
* origin doesn't have a status with a snapshot percentage
*/
if (background_polling() && lv_is_merging_origin(lv) && lv_is_active(lv))
lv_spawn_background_polling(cmd, lv);
return 1;
}
int vg_refresh_visible(struct cmd_context *cmd, struct volume_group *vg)
{
struct lv_list *lvl;
int r = 1;
sigint_allow();
dm_list_iterate_items(lvl, &vg->lvs) {
if (sigint_caught()) {
r = 0;
stack;
break;
}
if (lv_is_visible(lvl->lv) && !lv_refresh(cmd, lvl->lv)) {
r = 0;
stack;
}
}
sigint_restore();
return r;
}
void lv_spawn_background_polling(struct cmd_context *cmd,
struct logical_volume *lv)
{
const char *pvname;
const struct logical_volume *lv_mirr = NULL;
if (lv_is_pvmove(lv))
lv_mirr = lv;
else if (lv_is_locked(lv))
lv_mirr = find_pvmove_lv_in_lv(lv);
if (lv_mirr &&
(pvname = get_pvmove_pvname_from_lv_mirr(lv_mirr))) {
log_verbose("Spawning background pvmove process for %s.",
pvname);
pvmove_poll(cmd, pvname, lv_mirr->lvid.s, lv_mirr->vg->name, lv_mirr->name, 1);
}
if (lv_is_converting(lv) || lv_is_merging(lv)) {
log_verbose("Spawning background lvconvert process for %s.",
lv->name);
lvconvert_poll(cmd, lv, 1);
}
}
int get_activation_monitoring_mode(struct cmd_context *cmd,
int *monitoring_mode)
{
*monitoring_mode = DEFAULT_DMEVENTD_MONITOR;
if (arg_is_set(cmd, monitor_ARG) &&
(arg_is_set(cmd, ignoremonitoring_ARG) ||
arg_is_set(cmd, sysinit_ARG))) {
log_error("--ignoremonitoring or --sysinit option not allowed with --monitor option.");
return 0;
}
if (arg_is_set(cmd, monitor_ARG))
*monitoring_mode = arg_int_value(cmd, monitor_ARG,
DEFAULT_DMEVENTD_MONITOR);
else if (is_static() || arg_is_set(cmd, ignoremonitoring_ARG) ||
arg_is_set(cmd, sysinit_ARG) ||
!find_config_tree_bool(cmd, activation_monitoring_CFG, NULL))
*monitoring_mode = DMEVENTD_MONITOR_IGNORE;
return 1;
}
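/*
 * Illustrative sketch only, not part of toollib.c: the precedence rules of
 * get_activation_monitoring_mode() as a pure function over plain flags.
 * The sketch_* names are hypothetical, and the two constant values are
 * assumptions standing in for DEFAULT_DMEVENTD_MONITOR and
 * DMEVENTD_MONITOR_IGNORE.
 */

```c
/* Assumed values; the real constants are defined elsewhere in LVM. */
#define SKETCH_MONITOR_DEFAULT 1
#define SKETCH_MONITOR_IGNORE (-1)

/* Hypothetical flattened view of the command line and configuration. */
struct sketch_monitor_opts {
	int monitor_set;	/* --monitor was given */
	int monitor_value;	/* value passed to --monitor */
	int ignoremonitoring;	/* --ignoremonitoring was given */
	int sysinit;		/* --sysinit was given */
	int is_static;		/* statically linked binary */
	int config_monitoring;	/* activation/monitoring from lvm.conf */
};

/* Returns 1 and stores the mode, or 0 for a conflicting option set. */
static int sketch_monitoring_mode(const struct sketch_monitor_opts *o,
				  int *mode)
{
	*mode = SKETCH_MONITOR_DEFAULT;

	/* --monitor cannot be combined with the "ignore" style options. */
	if (o->monitor_set && (o->ignoremonitoring || o->sysinit))
		return 0;

	if (o->monitor_set)
		*mode = o->monitor_value;	/* explicit request wins */
	else if (o->is_static || o->ignoremonitoring || o->sysinit ||
		 !o->config_monitoring)
		*mode = SKETCH_MONITOR_IGNORE;

	return 1;
}
```

/*
 * Design note: an explicit --monitor value overrides the lvm.conf setting,
 * so it still applies when activation/monitoring is disabled.
 */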
/*
* Read pool options from cmdline
*/
int get_pool_params(struct cmd_context *cmd,
const struct segment_type *segtype,
uint64_t *pool_metadata_size,
int *pool_metadata_spare,
uint32_t *chunk_size,
thin_discards_t *discards,
thin_zero_t *zero_new_blocks)
{
if (segtype_is_thin_pool(segtype) || segtype_is_thin(segtype)) {
if (arg_is_set(cmd, zero_ARG)) {
*zero_new_blocks = arg_int_value(cmd, zero_ARG, 0) ? THIN_ZERO_YES : THIN_ZERO_NO;
log_very_verbose("%s pool zeroing.",
(*zero_new_blocks == THIN_ZERO_YES) ? "Enabling" : "Disabling");
} else
*zero_new_blocks = THIN_ZERO_UNSELECTED;
if (arg_is_set(cmd, discards_ARG)) {
*discards = (thin_discards_t) arg_uint_value(cmd, discards_ARG, 0);
log_very_verbose("Setting pool discards to %s.",
get_pool_discards_name(*discards));
} else
*discards = THIN_DISCARDS_UNSELECTED;
}
if (arg_from_list_is_negative(cmd, "may not be negative",
chunksize_ARG,
pooldatasize_ARG,
poolmetadatasize_ARG,
-1))
return_0;
if (arg_from_list_is_zero(cmd, "may not be zero",
chunksize_ARG,
pooldatasize_ARG,
poolmetadatasize_ARG,
-1))
return_0;
if (arg_is_set(cmd, chunksize_ARG)) {
*chunk_size = arg_uint_value(cmd, chunksize_ARG, 0);
if (!validate_pool_chunk_size(cmd, segtype, *chunk_size))
return_0;
log_very_verbose("Setting pool chunk size to %s.",
display_size(cmd, *chunk_size));
} else
*chunk_size = 0;
if (arg_is_set(cmd, poolmetadatasize_ARG)) {
if (arg_is_set(cmd, poolmetadata_ARG)) {
log_error("Please specify either metadata logical volume or its size.");
return 0;
}
*pool_metadata_size = arg_uint64_value(cmd, poolmetadatasize_ARG,
UINT64_C(0));
} else
*pool_metadata_size = 0;
/* TODO: default in lvm.conf and metadata profile ? */
*pool_metadata_spare = arg_int_value(cmd, poolmetadataspare_ARG,
DEFAULT_POOL_METADATA_SPARE);
return 1;
}
/*
* Generic stripe parameter checks.
*/
static int _validate_stripe_params(struct cmd_context *cmd, const struct segment_type *segtype,
uint32_t *stripes, uint32_t *stripe_size)
{
if (*stripes < 1 || *stripes > MAX_STRIPES) {
log_error("Number of stripes (%d) must be between %d and %d.",
*stripes, 1, MAX_STRIPES);
return 0;
}
if (!segtype_supports_stripe_size(segtype)) {
if (*stripe_size) {
log_print_unless_silent("Ignoring stripesize argument for %s devices.",
segtype->name);
*stripe_size = 0;
}
} else if (*stripes == 1) {
if (*stripe_size) {
log_print_unless_silent("Ignoring stripesize argument with single stripe.");
*stripe_size = 0;
}
} else {
if (!*stripe_size) {
*stripe_size = find_config_tree_int(cmd, metadata_stripesize_CFG, NULL) * 2;
log_print_unless_silent("Using default stripesize %s.",
display_size(cmd, (uint64_t) *stripe_size));
}
if (*stripe_size > STRIPE_SIZE_LIMIT * 2) {
log_error("Stripe size cannot be larger than %s.",
display_size(cmd, (uint64_t) STRIPE_SIZE_LIMIT));
return 0;
} else if (*stripe_size < STRIPE_SIZE_MIN || !is_power_of_2(*stripe_size)) {
log_error("Invalid stripe size %s.",
display_size(cmd, (uint64_t) *stripe_size));
return 0;
}
}
return 1;
}
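/*
 * Illustrative sketch only, not part of toollib.c: the stripe size range
 * and power-of-2 checks from _validate_stripe_params(), with sizes in
 * 512-byte sectors. The sketch_* names are hypothetical. The limit is
 * written as UINT32_MAX / 4 + 1 = 2^30, as the comment on stripe size
 * limits in this file describes; the minimum of 8 sectors is an assumption
 * (a 4 KiB page), since the real STRIPE_SIZE_MIN is defined elsewhere.
 */

```c
#include <stdint.h>

#define SKETCH_STRIPE_SIZE_MIN 8u	/* assumed: 4 KiB in sectors */
#define SKETCH_STRIPE_SIZE_LIMIT ((UINT32_MAX >> 2) + 1)	/* 2^30 */

/* A power of 2 has exactly one bit set. */
static int sketch_is_power_of_2(uint32_t n)
{
	return n && !(n & (n - 1));
}

/* Returns 1 if the stripe size (in sectors) would be accepted. */
static int sketch_stripe_size_ok(uint32_t sectors)
{
	/* Doubling the limit gives 2^31, which still fits in 32 bits. */
	if (sectors > SKETCH_STRIPE_SIZE_LIMIT * 2)
		return 0;

	if (sectors < SKETCH_STRIPE_SIZE_MIN || !sketch_is_power_of_2(sectors))
		return 0;

	return 1;
}
```

/*
 * Usage: sketch_stripe_size_ok(128) accepts a 64 KiB stripe, while
 * sketch_stripe_size_ok(100) rejects a size that is not a power of 2.
 */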
/*
* The stripe size is limited by the size of a uint32_t, but since the
* value given by the user is doubled, and the final result must be a
* power of 2, we must divide UINT_MAX by four and add 1 (to round it
* up to the power of 2)
*/
int get_stripe_params(struct cmd_context *cmd, const struct segment_type *segtype,
uint32_t *stripes, uint32_t *stripe_size,
unsigned *stripes_supplied, unsigned *stripe_size_supplied)
{
/* stripes_long_ARG takes precedence (for lvconvert) */
/* FIXME Cope with relative +/- changes for lvconvert. */
if (arg_is_set(cmd, stripes_long_ARG)) {
*stripes = arg_uint_value(cmd, stripes_long_ARG, 0);
*stripes_supplied = 1;
} else if (arg_is_set(cmd, stripes_ARG)) {
*stripes = arg_uint_value(cmd, stripes_ARG, 0);
*stripes_supplied = 1;
} else {
/*
* FIXME add segtype parameter for min_stripes and remove logic for this
* from all other places
*/
if (segtype_is_any_raid6(segtype))
*stripes = 3;
else if (segtype_is_striped_raid(segtype))
*stripes = 2;
else
*stripes = 1;
*stripes_supplied = 0;
}
if ((*stripe_size = arg_uint_value(cmd, stripesize_ARG, 0))) {
if (arg_sign_value(cmd, stripesize_ARG, SIGN_NONE) == SIGN_MINUS) {
log_error("Negative stripesize is invalid.");
return 0;
}
}
*stripe_size_supplied = arg_is_set(cmd, stripesize_ARG);
return _validate_stripe_params(cmd, segtype, stripes, stripe_size);
}
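/*
 * Illustrative sketch only, not part of toollib.c: how get_stripe_params()
 * picks the stripe count. The long-form option wins over the short form,
 * and without either the default depends on the segment type. The
 * sketch_* names and the flag-based struct are hypothetical.
 */

```c
#include <stdint.h>

/* Hypothetical inputs; the real code reads these from cmd and segtype. */
struct sketch_stripe_args {
	int stripes_long_set;	/* stripes_long_ARG was given */
	uint32_t stripes_long;
	int stripes_set;	/* stripes_ARG was given */
	uint32_t stripes;
	int is_raid6;		/* stands in for segtype_is_any_raid6() */
	int is_striped_raid;	/* stands in for segtype_is_striped_raid() */
};

/* Returns the stripe count and reports whether the user supplied it. */
static uint32_t sketch_pick_stripes(const struct sketch_stripe_args *a,
				    unsigned *supplied)
{
	*supplied = 1;
	if (a->stripes_long_set)
		return a->stripes_long;
	if (a->stripes_set)
		return a->stripes;

	/* Nothing on the command line: fall back to per-type defaults. */
	*supplied = 0;
	if (a->is_raid6)
		return 3;
	if (a->is_striped_raid)
		return 2;
	return 1;
}
```

/*
 * Design note: raid6 is tested before the generic striped-raid case, so it
 * gets the larger default even if both flags are set.
 */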
static int _validate_cachepool_params(const char *policy_name, cache_mode_t cache_mode)
{
/*
* FIXME: it might be nice if cmd def rules could check option values,
* then a rule could do this.
*/
if ((cache_mode == CACHE_MODE_WRITEBACK) && policy_name && !strcmp(policy_name, "cleaner")) {
log_error("Cache mode \"writeback\" is not compatible with cache policy \"cleaner\".");
return 0;
}
return 1;
}
int get_cache_params(struct cmd_context *cmd,
uint32_t *chunk_size,
cache_metadata_format_t *cache_metadata_format,
cache_mode_t *cache_mode,
const char **name,
struct dm_config_tree **settings)
{
const char *str;
struct arg_value_group_list *group;
struct dm_config_tree *result = NULL, *prev = NULL, *current = NULL;
struct dm_config_node *cn;
int ok = 0;
if (arg_is_set(cmd, chunksize_ARG)) {
*chunk_size = arg_uint_value(cmd, chunksize_ARG, 0);
if (!validate_cache_chunk_size(cmd, *chunk_size))
return_0;
log_very_verbose("Setting pool chunk size to %s.",
display_size(cmd, *chunk_size));
}
*cache_metadata_format = (cache_metadata_format_t)
arg_uint_value(cmd, cachemetadataformat_ARG, CACHE_METADATA_FORMAT_UNSELECTED);
*cache_mode = (cache_mode_t) arg_uint_value(cmd, cachemode_ARG, CACHE_MODE_UNSELECTED);
*name = arg_str_value(cmd, cachepolicy_ARG, NULL);
if (!_validate_cachepool_params(*name, *cache_mode))
goto_out;
dm_list_iterate_items(group, &cmd->arg_value_groups) {
if (!grouped_arg_is_set(group->arg_values, cachesettings_ARG))
continue;
if (!(current = dm_config_create()))
goto_out;
if (prev)
current->cascade = prev;
prev = current;
if (!(str = grouped_arg_str_value(group->arg_values,
cachesettings_ARG,
NULL)))
goto_out;
if (!dm_config_parse_without_dup_node_check(current, str, str + strlen(str)))
goto_out;
}
if (current) {
if (!(result = dm_config_flatten(current)))
goto_out;
if (result->root) {
if (!(cn = dm_config_create_node(result, "policy_settings")))
goto_out;
cn->child = result->root;
result->root = cn;
}
}
ok = 1;
out:
if (!ok && result) {
dm_config_destroy(result);
result = NULL;
}
while (prev) {
current = prev->cascade;
dm_config_destroy(prev);
prev = current;
}
*settings = result;
return ok;
}
static int _get_one_writecache_setting(struct cmd_context *cmd, struct writecache_settings *settings,
char *key, char *val, uint32_t *block_size_sectors)
{
/* special case: block_size is not a setting but is set with the --cachesettings option */
if (!strncmp(key, "block_size", strlen("block_size"))) {
uint32_t block_size = 0;
if (sscanf(val, "%u", &block_size) != 1)
goto_bad;
if (block_size == 512)
*block_size_sectors = 1;
else if (block_size == 4096)
*block_size_sectors = 8;
else
goto_bad;
return 1;
}
if (!strncmp(key, "high_watermark", strlen("high_watermark"))) {
if (sscanf(val, "%llu", (unsigned long long *)&settings->high_watermark) != 1)
goto_bad;
if (settings->high_watermark > 100)
goto_bad;
settings->high_watermark_set = 1;
return 1;
}
if (!strncmp(key, "low_watermark", strlen("low_watermark"))) {
if (sscanf(val, "%llu", (unsigned long long *)&settings->low_watermark) != 1)
goto_bad;
if (settings->low_watermark > 100)
goto_bad;
settings->low_watermark_set = 1;
return 1;
}
if (!strncmp(key, "writeback_jobs", strlen("writeback_jobs"))) {
if (sscanf(val, "%llu", (unsigned long long *)&settings->writeback_jobs) != 1)
goto_bad;
settings->writeback_jobs_set = 1;
return 1;
}
if (!strncmp(key, "autocommit_blocks", strlen("autocommit_blocks"))) {
if (sscanf(val, "%llu", (unsigned long long *)&settings->autocommit_blocks) != 1)
goto_bad;
settings->autocommit_blocks_set = 1;
return 1;
}
if (!strncmp(key, "autocommit_time", strlen("autocommit_time"))) {
if (sscanf(val, "%llu", (unsigned long long *)&settings->autocommit_time) != 1)
goto_bad;
settings->autocommit_time_set = 1;
return 1;
}
if (!strncmp(key, "fua", strlen("fua"))) {
if (settings->nofua_set) {
log_error("Setting fua and nofua cannot both be set.");
return 0;
}
if (sscanf(val, "%u", &settings->fua) != 1)
goto_bad;
settings->fua_set = 1;
return 1;
}
if (!strncmp(key, "nofua", strlen("nofua"))) {
if (settings->fua_set) {
log_error("Setting fua and nofua cannot both be set.");
return 0;
}
if (sscanf(val, "%u", &settings->nofua) != 1)
goto_bad;
settings->nofua_set = 1;
return 1;
}
if (!strncmp(key, "cleaner", strlen("cleaner"))) {
if (sscanf(val, "%u", &settings->cleaner) != 1)
goto_bad;
settings->cleaner_set = 1;
return 1;
}
if (!strncmp(key, "max_age", strlen("max_age"))) {
if (sscanf(val, "%u", &settings->max_age) != 1)
goto_bad;
settings->max_age_set = 1;
return 1;
}
if (settings->new_key) {
log_error("Setting %s is not recognized. Only one unrecognized setting is allowed.", key);
return 0;
}
log_warn("Unrecognized writecache setting \"%s\" may cause activation failure.", key);
if (yes_no_prompt("Use unrecognized writecache setting? [y/n]: ") == 'n') {
log_error("Aborting writecache conversion.");
return 0;
}
log_warn("Using unrecognized writecache setting: %s = %s.", key, val);
settings->new_key = dm_pool_strdup(cmd->mem, key);
settings->new_val = dm_pool_strdup(cmd->mem, val);
return 1;
bad:
log_error("Invalid setting: %s", key);
return 0;
}
int get_writecache_settings(struct cmd_context *cmd, struct writecache_settings *settings,
uint32_t *block_size_sectors)
{
struct arg_value_group_list *group;
const char *str;
char key[64];
char val[64];
int num;
int pos;
/*
* "grouped" means that multiple --cachesettings options can be used.
* Each option is also allowed to contain multiple key = val pairs.
*/
dm_list_iterate_items(group, &cmd->arg_value_groups) {
if (!grouped_arg_is_set(group->arg_values, cachesettings_ARG))
continue;
if (!(str = grouped_arg_str_value(group->arg_values, cachesettings_ARG, NULL)))
break;
pos = 0;
while (pos < strlen(str)) {
/* scan for "key1=val1 key2 = val2 key3= val3" */
memset(key, 0, sizeof(key));
memset(val, 0, sizeof(val));
if (sscanf(str + pos, " %63[^=]=%63s %n", key, val, &num) != 2) {
log_error("Invalid setting at: %s", str+pos);
return 0;
}
pos += num;
if (!_get_one_writecache_setting(cmd, settings, key, val, block_size_sectors))
return_0;
}
}
if (settings->high_watermark_set && settings->low_watermark_set &&
(settings->high_watermark <= settings->low_watermark)) {
log_error("High watermark must be greater than low watermark.");
return 0;
}
return 1;
}
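/*
 * Illustrative sketch only, not part of toollib.c: the sscanf() loop from
 * get_writecache_settings() in isolation, using the same format string.
 * The sketch_* names are hypothetical. One deliberate difference: "%[^=]"
 * keeps any blanks before '=', so this sketch trims the key, whereas the
 * code above compares key prefixes with strncmp() instead.
 */

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

struct sketch_kv {
	char key[64];
	char val[64];
};

/*
 * Splits "key1=val1 key2 = val2" into pairs.
 * Returns the number of pairs stored in out[], or -1 on a malformed
 * string or when more than max pairs are present.
 */
static int sketch_parse_settings(const char *str, struct sketch_kv *out,
				 int max)
{
	size_t len = strlen(str);
	size_t pos = 0;
	size_t klen;
	int count = 0;
	int num;

	while (pos < len) {
		if (count == max)
			return -1;

		memset(&out[count], 0, sizeof(out[count]));
		num = 0;

		/* %n records how many characters this pair consumed. */
		if (sscanf(str + pos, " %63[^=]=%63s %n",
			   out[count].key, out[count].val, &num) != 2)
			return -1;

		/* Drop blanks between the key and '='. */
		klen = strlen(out[count].key);
		while (klen && isspace((unsigned char) out[count].key[klen - 1]))
			out[count].key[--klen] = '\0';

		pos += (size_t) num;
		count++;
	}

	return count;
}
```

/*
 * Usage: parsing "high_watermark=60 low_watermark = 40" stores two pairs;
 * a string with no '=' such as "novalue" is rejected with -1.
 */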
/* FIXME move to lib */
static int _pv_change_tag(struct physical_volume *pv, const char *tag, int addtag)
{
if (addtag) {
if (!str_list_add(pv->fmt->cmd->mem, &pv->tags, tag)) {
log_error("Failed to add tag %s to physical volume %s.",
tag, pv_dev_name(pv));
return 0;
}
} else
str_list_del(&pv->tags, tag);
return 1;
}
/* Set exactly one of VG, LV or PV */
int change_tag(struct cmd_context *cmd, struct volume_group *vg,
struct logical_volume *lv, struct physical_volume *pv, int arg)
{
const char *tag;
struct arg_value_group_list *current_group;
dm_list_iterate_items(current_group, &cmd->arg_value_groups) {
if (!grouped_arg_is_set(current_group->arg_values, arg))
continue;
if (!(tag = grouped_arg_str_value(current_group->arg_values, arg, NULL))) {
log_error("Failed to get tag.");
return 0;
}
if (vg && !vg_change_tag(vg, tag, arg == addtag_ARG))
return_0;
else if (lv && !lv_change_tag(lv, tag, arg == addtag_ARG))
return_0;
else if (pv && !_pv_change_tag(pv, tag, arg == addtag_ARG))
return_0;
}
return 1;
}
/*
* FIXME: replace process_each_label() with process_each_vg() which is
* based on performing vg_read(), which provides a correct representation
* of VGs/PVs, that is not provided by lvmcache_label_scan().
*/
int process_each_label(struct cmd_context *cmd, int argc, char **argv,
struct processing_handle *handle,
process_single_label_fn_t process_single_label)
{
log_report_t saved_log_report_state = log_get_report_state();
struct label *label;
struct dev_iter *iter;
struct device *dev;
struct lvmcache_info *info;
struct dm_list process_duplicates;
struct device_list *devl;
int ret_max = ECMD_PROCESSED;
int ret;
int opt = 0;
dm_list_init(&process_duplicates);
log_set_report_object_type(LOG_REPORT_OBJECT_TYPE_LABEL);
lvmcache_label_scan(cmd);
if (argc) {
for (; opt < argc; opt++) {
if (sigint_caught()) {
log_error("Interrupted.");
ret_max = ECMD_FAILED;
goto out;
}
if (!(dev = dev_cache_get(cmd, argv[opt], cmd->filter))) {
log_error("Failed to find device "
"\"%s\".", argv[opt]);
ret_max = ECMD_FAILED;
continue;
}
if (!(label = lvmcache_get_dev_label(dev))) {
if (!lvmcache_dev_is_unused_duplicate(dev)) {
log_error("No physical volume label read from %s.", argv[opt]);
ret_max = ECMD_FAILED;
} else {
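if (!(devl = malloc(sizeof(*devl)))) {
/* Report failure as an ECMD_* code and restore the log report state via "out". */
log_error("device_list alloc failed.");
ret_max = ECMD_FAILED;
goto out;
}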
devl->dev = dev;
dm_list_add(&process_duplicates, &devl->list);
}
continue;
}
log_set_report_object_name_and_id(dev_name(dev), NULL);
ret = process_single_label(cmd, label, handle);
report_log_ret_code(ret);
if (ret > ret_max)
ret_max = ret;
log_set_report_object_name_and_id(NULL, NULL);
}
dm_list_iterate_items(devl, &process_duplicates) {
if (sigint_caught()) {
log_error("Interrupted.");
ret_max = ECMD_FAILED;
goto out;
}
/*
* remove the existing dev for this pvid from lvmcache
* so that the duplicate dev can replace it.
*/
if ((info = lvmcache_info_from_pvid(devl->dev->pvid, NULL, 0)))
lvmcache_del(info);
/*
* add info to lvmcache from the duplicate dev.
*/
label_scan_dev(cmd, devl->dev);
/*
* the info/label should now be found because
* label_scan_dev() should have added it.
*/
if (!(label = lvmcache_get_dev_label(devl->dev)))
continue;
log_set_report_object_name_and_id(dev_name(devl->dev), NULL);
ret = process_single_label(cmd, label, handle);
report_log_ret_code(ret);
if (ret > ret_max)
ret_max = ret;
log_set_report_object_name_and_id(NULL, NULL);
}
goto out;
}
if (!(iter = dev_iter_create(cmd->filter, 1))) {
log_error("dev_iter creation failed.");
ret_max = ECMD_FAILED;
goto out;
}
while ((dev = dev_iter_get(cmd, iter))) {
if (sigint_caught()) {
log_error("Interrupted.");
ret_max = ECMD_FAILED;
break;
}
if (!(label = lvmcache_get_dev_label(dev)))
continue;
log_set_report_object_name_and_id(dev_name(label->dev), NULL);
ret = process_single_label(cmd, label, handle);
report_log_ret_code(ret);
if (ret > ret_max)
ret_max = ret;
log_set_report_object_name_and_id(NULL, NULL);
}
dev_iter_destroy(iter);
out:
log_restore_report_state(saved_log_report_state);
return ret_max;
}
/*
* Parse persistent major minor parameters.
*
* --persistent is unspecified => state is deduced
* from presence of options --minor or --major.
*
* -Mn => --minor or --major not allowed.
*
* -My => --minor is required (and also --major on <=2.4)
*/
int get_and_validate_major_minor(const struct cmd_context *cmd,
const struct format_type *fmt,
int32_t *major, int32_t *minor)
{
if (arg_count(cmd, minor_ARG) > 1) {
log_error("Option --minor may not be repeated.");
return 0;
}
if (arg_count(cmd, major_ARG) > 1) {
log_error("Option -j|--major may not be repeated.");
return 0;
}
/* Check with default 'y' */
if (!arg_int_value(cmd, persistent_ARG, 1)) { /* -Mn */
if (arg_is_set(cmd, minor_ARG) || arg_is_set(cmd, major_ARG)) {
log_error("Options --major and --minor are incompatible with -Mn.");
return 0;
}
*major = *minor = -1;
return 1;
}
/* -1 cannot be entered as an argument for --major, --minor */
*major = arg_int_value(cmd, major_ARG, -1);
*minor = arg_int_value(cmd, minor_ARG, -1);
if (arg_is_set(cmd, persistent_ARG)) { /* -My */
if (*minor == -1) {
log_error("Please specify minor number with --minor when using -My.");
return 0;
}
}
if (!strncmp(cmd->kernel_vsn, "2.4.", 4)) {
/* Major is required for 2.4 */
if (arg_is_set(cmd, persistent_ARG) && *major < 0) {
log_error("Please specify major number with --major when using -My.");
return 0;
}
} else {
if (*major != -1) {
log_warn("WARNING: Ignoring supplied major number %d - "
"kernel assigns major numbers dynamically. "
"Using major number %d instead.",
*major, cmd->dev_types->device_mapper_major);
}
/* Stay with dynamic major:minor if minor is not specified. */
*major = (*minor == -1) ? -1 : cmd->dev_types->device_mapper_major;
}
if ((*minor != -1) && !validate_major_minor(cmd, fmt, *major, *minor))
return_0;
return 1;
}
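/*
 * Illustrative sketch, not part of LVM2: the option rules applied by
 * get_and_validate_major_minor() above on kernels other than 2.4,
 * restated as a standalone helper. The name sketch_persistent_minor_ok()
 * and its plain-int parameters are assumptions made for this example;
 * the real function reads these values from cmd with arg_int_value()
 * and arg_is_set(), and additionally range-checks the minor number with
 * validate_major_minor(), which this sketch does not model.
 *
 * persistent_set: nonzero if -M|--persistent was given
 * persistent_yes: value of -M (1 for y, 0 for n); ignored when unset
 * major, minor:   supplied numbers, or -1 when the option was not given
 *
 * Returns 1 if the combination is accepted, 0 if it is rejected.
 */
static inline int sketch_persistent_minor_ok(int persistent_set, int persistent_yes,
					     int major, int minor)
{
	/* -Mn: neither --major nor --minor may be supplied. */
	if (persistent_set && !persistent_yes)
		return (major == -1 && minor == -1);

	/* -My: --minor is required. */
	if (persistent_set && minor == -1)
		return 0;

	/* A supplied major is accepted here; the real code warns and ignores it. */
	return 1;
}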
/*
* Validate lvname parameter
*
* If lvname contains a vgname ("vg/lv"), the vgname is extracted from it.
* If a vgname is also passed in, the two names must be the same.
*/
int validate_lvname_param(struct cmd_context *cmd, const char **vg_name,
const char **lv_name)
{
const char *vgname;
const char *lvname;
if (!lv_name || !*lv_name)
return 1; /* NULL lvname is ok */
/* If contains VG name, extract it. */
if (strchr(*lv_name, (int) '/')) {
if (!(vgname = _extract_vgname(cmd, *lv_name, &lvname)))
return_0;
if (!*vg_name)
*vg_name = vgname;
else if (strcmp(vgname, *vg_name)) {
log_error("Please use a single volume group name "
"(\"%s\" or \"%s\").", vgname, *vg_name);
return 0;
}
*lv_name = lvname;
}
if (!validate_name(*lv_name)) {
log_error("Logical volume name \"%s\" is invalid.",
*lv_name);
return 0;
}
return 1;
}
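/*
 * Illustrative sketch, not part of LVM2: the "vg/lv" handling performed
 * by validate_lvname_param() above, reduced to plain string operations.
 * sketch_split_vg_lv() and its buffer-based interface are assumptions
 * made for this example. The real code delegates the split to
 * _extract_vgname(), which also copes with device directory prefixes,
 * and then checks the LV name with validate_name(); neither is modelled
 * here.
 *
 * Returns 1 on success and 0 if the VG part is empty, does not fit in
 * vg_out, or differs from given_vg.
 */
#include <string.h>

static inline int sketch_split_vg_lv(const char *arg, const char *given_vg,
				     char *vg_out, size_t vg_out_len,
				     const char **lv_out)
{
	const char *slash = strchr(arg, '/');
	size_t vg_len;

	if (vg_out_len)
		vg_out[0] = '\0';

	/* Plain LV name with no VG part: nothing to extract or compare. */
	if (!slash) {
		*lv_out = arg;
		return 1;
	}

	vg_len = (size_t) (slash - arg);
	if (!vg_len || vg_len >= vg_out_len)
		return 0;

	memcpy(vg_out, arg, vg_len);
	vg_out[vg_len] = '\0';

	/* Two different VG names were supplied. */
	if (given_vg && strcmp(vg_out, given_vg))
		return 0;

	*lv_out = slash + 1;
	return 1;
}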
/*
* Validate lvname parameter
* This name must follow restriction rules on prefixes and suffixes.
*
* If lvname contains a vgname ("vg/lv"), the vgname is extracted from it.
* If a vgname is also passed in, the two names must be the same.
*/
int validate_restricted_lvname_param(struct cmd_context *cmd, const char **vg_name,
const char **lv_name)
{
if (!validate_lvname_param(cmd, vg_name, lv_name))
return_0;
if (lv_name && *lv_name && !apply_lvname_restrictions(*lv_name))
return_0;
return 1;
}
/*
* Extract list of VG names and list of tags from command line arguments.
*/
static int _get_arg_vgnames(struct cmd_context *cmd,
int argc, char **argv,
const char *one_vgname,
struct dm_list *use_vgnames,
struct dm_list *arg_vgnames,
struct dm_list *arg_tags)
{
int opt = 0;
int ret_max = ECMD_PROCESSED;
const char *vg_name;
if (one_vgname) {
if (!str_list_add(cmd->mem, arg_vgnames,
dm_pool_strdup(cmd->mem, one_vgname))) {
log_error("strlist allocation failed.");
return ECMD_FAILED;
}
return ret_max;
}
if (use_vgnames && !dm_list_empty(use_vgnames)) {
dm_list_splice(arg_vgnames, use_vgnames);
return ret_max;
}
for (; opt < argc; opt++) {
vg_name = argv[opt];
if (*vg_name == '@') {
if (!validate_tag(vg_name + 1)) {
log_error("Skipping invalid tag: %s", vg_name);
if (ret_max < EINVALID_CMD_LINE)
ret_max = EINVALID_CMD_LINE;
continue;
}
if (!str_list_add(cmd->mem, arg_tags,
dm_pool_strdup(cmd->mem, vg_name + 1))) {
log_error("strlist allocation failed.");
return ECMD_FAILED;
}
continue;
}
vg_name = skip_dev_dir(cmd, vg_name, NULL);
if (strchr(vg_name, '/')) {
log_error("Invalid volume group name %s.", vg_name);
if (ret_max < EINVALID_CMD_LINE)
ret_max = EINVALID_CMD_LINE;
continue;
}
if (!str_list_add(cmd->mem, arg_vgnames,
dm_pool_strdup(cmd->mem, vg_name))) {
log_error("strlist allocation failed.");
return ECMD_FAILED;
}
}
return ret_max;
}
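/*
 * Illustrative sketch, not part of LVM2: how the loop in
 * _get_arg_vgnames() above sorts each positional argument into a tag,
 * a VG name, or an invalid entry. The enum and sketch_classify_vg_arg()
 * are assumptions made for this example. The empty-tag test merely
 * stands in for validate_tag(), whose full character rules are not
 * reproduced, and the skip_dev_dir() step that strips a device
 * directory prefix before the '/' check is omitted.
 */
#include <string.h>

enum sketch_vg_arg_kind {
	SKETCH_ARG_TAG,		/* "@tag": added to arg_tags */
	SKETCH_ARG_VGNAME,	/* plain name: added to arg_vgnames */
	SKETCH_ARG_INVALID	/* skipped, command line reported as invalid */
};

static inline enum sketch_vg_arg_kind sketch_classify_vg_arg(const char *arg)
{
	if (*arg == '@')
		return arg[1] ? SKETCH_ARG_TAG : SKETCH_ARG_INVALID;

	/* A VG name may not contain a slash. */
	if (strchr(arg, '/'))
		return SKETCH_ARG_INVALID;

	return SKETCH_ARG_VGNAME;
}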
struct processing_handle *init_processing_handle(struct cmd_context *cmd, struct processing_handle *parent_handle)
{
struct processing_handle *handle;
if (!(handle = dm_pool_zalloc(cmd->mem, sizeof(struct processing_handle)))) {
log_error("_init_processing_handle: failed to allocate memory for processing handle");
return NULL;
}
handle->parent = parent_handle;
/*
* For any reporting tool, the internal_report_for_select is reset to 0
* automatically because the internal reporting/selection is simply not
* needed - the reporting/selection is already a part of the code path
* used there.
*
* *The internal report for select is only needed for non-reporting tools!*
*/
handle->internal_report_for_select = arg_is_set(cmd, select_ARG);
handle->include_historical_lvs = cmd->include_historical_lvs;
if (!parent_handle && !cmd->cmd_report.report_group) {
if (!report_format_init(cmd)) {
dm_pool_free(cmd->mem, handle);
return NULL;
}
} else
cmd->cmd_report.saved_log_report_state = log_get_report_state();
log_set_report_context(LOG_REPORT_CONTEXT_PROCESSING);
return handle;
}
int init_selection_handle(struct cmd_context *cmd, struct processing_handle *handle,
report_type_t initial_report_type)
{
struct selection_handle *sh;
const char *selection;
if (!(sh = dm_pool_zalloc(cmd->mem, sizeof(struct selection_handle)))) {
log_error("_init_selection_handle: failed to allocate memory for selection handle");
return 0;
}
if (!report_get_single_selection(cmd, initial_report_type, &selection))
return_0;
sh->report_type = initial_report_type;
if (!(sh->selection_rh = report_init_for_selection(cmd, &sh->report_type, selection))) {
dm_pool_free(cmd->mem, sh);
return_0;
}
handle->selection_handle = sh;
return 1;
}
void destroy_processing_handle(struct cmd_context *cmd, struct processing_handle *handle)
{
if (handle) {
if (handle->selection_handle && handle->selection_handle->selection_rh)
dm_report_free(handle->selection_handle->selection_rh);
log_restore_report_state(cmd->cmd_report.saved_log_report_state);
if (!cmd->is_interactive) {
if (!dm_report_group_destroy(cmd->cmd_report.report_group))
stack;
cmd->cmd_report.report_group = NULL;
if (cmd->cmd_report.log_rh) {
dm_report_free(cmd->cmd_report.log_rh);
cmd->cmd_report.log_rh = NULL;
}
}
/*
* TODO: think about better alternatives:
* handle mempool, dm_alloc for handle memory...
*/
memset(handle, 0, sizeof(*handle));
}
}
int select_match_vg(struct cmd_context *cmd, struct processing_handle *handle,
struct volume_group *vg)
{
int r;
if (!handle->internal_report_for_select)
return 1;
handle->selection_handle->orig_report_type = VGS;
if (!(r = report_for_selection(cmd, handle, NULL, vg, NULL)))
log_error("Selection failed for VG %s.", vg->name);
handle->selection_handle->orig_report_type = 0;
return r;
}
int select_match_lv(struct cmd_context *cmd, struct processing_handle *handle,
struct volume_group *vg, struct logical_volume *lv)
{
int r;
if (!handle->internal_report_for_select)
return 1;
handle->selection_handle->orig_report_type = LVS;
if (!(r = report_for_selection(cmd, handle, NULL, vg, lv)))
log_error("Selection failed for LV %s.", lv->name);
handle->selection_handle->orig_report_type = 0;
return r;
}
int select_match_pv(struct cmd_context *cmd, struct processing_handle *handle,
struct volume_group *vg, struct physical_volume *pv)
{
int r;
if (!handle->internal_report_for_select)
return 1;
handle->selection_handle->orig_report_type = PVS;
if (!(r = report_for_selection(cmd, handle, pv, vg, NULL)))
log_error("Selection failed for PV %s.", dev_name(pv->dev));
handle->selection_handle->orig_report_type = 0;
return r;
}
static int _select_matches(struct processing_handle *handle)
{
if (!handle->internal_report_for_select)
return 1;
return handle->selection_handle->selected;
}
static int _process_vgnameid_list(struct cmd_context *cmd, uint32_t read_flags,
struct dm_list *vgnameids_to_process,
struct dm_list *arg_vgnames,
struct dm_list *arg_tags,
struct processing_handle *handle,
process_single_vg_fn_t process_single_vg)
{
log_report_t saved_log_report_state = log_get_report_state();
char uuid[64] __attribute__((aligned(8)));
struct volume_group *vg;
struct volume_group *error_vg = NULL;
struct vgnameid_list *vgnl;
const char *vg_name;
const char *vg_uuid;
uint32_t lockd_state = 0;
improve reading and repairing vg metadata The fact that vg repair is implemented as a part of vg read has led to a messy and complicated implementation of vg_read, and limited and uncontrolled repair capability. This splits read and repair apart. Summary ------- - take all kinds of various repairs out of vg_read - vg_read no longer writes anything - vg_read now simply reads and returns vg metadata - vg_read ignores bad or old copies of metadata - vg_read proceeds with a single good copy of metadata - improve error checks and handling when reading - keep track of bad (corrupt) copies of metadata in lvmcache - keep track of old (seqno) copies of metadata in lvmcache - keep track of outdated PVs in lvmcache - vg_write will do basic repairs - new command vgck --updatemetdata will do all repairs Details ------- - In scan, do not delete dev from lvmcache if reading/processing fails; the dev is still present, and removing it makes it look like the dev is not there. Records are now kept about the problems with each PV so they be fixed/repaired in the appropriate places. - In scan, record a bad mda on failure, and delete the mda from mda in use list so it will not be used by vg_read or vg_write, only by repair. - In scan, succeed if any good mda on a device is found, instead of failing if any is bad. The bad/old copies of metadata should not interfere with normal usage while good copies can be used. - In scan, add a record of old mdas in lvmcache for later, do not repair them while reading, and do not let them prevent us from finding and using a good copy of metadata from elsewhere. One result is that "inconsistent metadata" is no longer a read error, but instead a record in lvmcache that can be addressed separate from the read. - Treat a dev with no good mdas like a dev with no mdas, which is an existing case we already handle. 
- Don't use a fake vg "handle" for returning an error from vg_read, or the vg_read_error function for getting that error number; just return null if the vg cannot be read or used, and an error_flags arg with flags set for the specific kind of error (which can be used later for determining the kind of repair.) - Saving an original copy of the vg metadata, for purposes of reverting a write, is now done explicitly in vg_read instead of being hidden in the vg_make_handle function. - When a vg is not accessible due to "access restrictions" but is otherwise fine, return the vg through the new error_vg arg so that process_each_pv can skip the PVs in the VG while processing. (This is a temporary accomodation for the way process_each_pv tracks which devs have been looked at, and can be dropped later when process_each_pv implementation dev tracking is changed.) - vg_read does not try to fix or recover a vg, but now just reads the metadata, checks access restrictions and returns it. (Checking access restrictions might be better done outside of vg_read, but this is a later improvement.) - _vg_read now simply makes one attempt to read metadata from each mda, and uses the most recent copy to return to the caller in the form of a 'vg' struct. (bad mdas were excluded during the scan and are not retried) (old mdas were not excluded during scan and are retried here) - vg_read uses _vg_read to get the latest copy of metadata from mdas, and then makes various checks against it to produce warnings, and to check if VG access is allowed (access restrictions include: writable, foreign, shared, clustered, missing pvs). - Things that were previously silently/automatically written by vg_read that are now done by vg_write, based on the records made in lvmcache during the scan and read: . clearing the missing flag . updating old copies of metadata . clearing outdated pvs . updating pv header flags - Bad/corrupt metadata are now repaired; they were not before. 
Test changes
------------

- A read command no longer writes the VG to repair it, so add a write
  command to do a repair.
  (inconsistent-metadata, unlost-pv)

- When a missing PV is removed from a VG, and then the device is
  enabled again, vgck --updatemetadata is needed to clear the
  outdated PV before it can be used again, where it wasn't before.
  (lvconvert-repair-policy, lvconvert-repair-raid, lvconvert-repair,
  mirror-vgreduce-removemissing, pv-ext-flags, unlost-pv)

Reading bad/old metadata
------------------------

- "bad metadata": the mda_header or metadata text has invalid fields
  or can't be parsed by lvm. This is a form of corruption that would
  not be caused by known failure scenarios. A checksum error is
  typically included among the errors reported.

- "old metadata": a valid copy of the metadata that has a smaller seqno
  than other copies of the metadata. This can happen if the device
  failed, or io failed, or lvm failed while committing new metadata to
  all the metadata areas. Old metadata on a PV that has been removed
  from the VG is the "outdated" case below.

When a VG has some PVs with bad/old metadata, lvm can simply ignore
the bad/old copies, and use a good copy. This is why there are
multiple copies of the metadata -- so it's available even when some
of the copies cannot be used. The bad/old copies do not have to be
repaired before the VG can be used (the repair can happen later.)

A PV with no good copies of the metadata simply falls back to being
treated like a PV with no mdas; a common and harmless configuration.

When bad/old metadata exists, lvm warns the user about it, and
suggests repairing it using a new metadata repair command. Bad
metadata in particular is something that users will want to
investigate and repair themselves, since it should not happen and
may indicate some other problem that needs to be fixed.

PVs with bad/old metadata are not the same as missing devices.
Missing devices will block various kinds of VG modification or
activation, but bad/old metadata will not.

Previously, lvm would attempt to repair bad/old metadata whenever
it was read. This was unnecessary since lvm does not require every
copy of the metadata to be used. It would also hide potential
problems that should be investigated by the user. It was also
dangerous in cases where the VG was on shared storage. The user
is now allowed to investigate potential problems and decide how
and when to repair them.

Repairing bad/old metadata
--------------------------

When label scan sees bad metadata in an mda, that mda is removed
from the lvmcache info->mdas list. This means that vg_read will
skip it, and not attempt to read/process it again. If it was
the only in-use mda on a PV, that PV is treated like a PV with
no mdas. It also means that vg_write will skip the bad mda,
and not attempt to write new metadata to it. The only way to
repair bad metadata is with the metadata repair command.

When label scan sees old metadata in an mda, that mda is kept
in the lvmcache info->mdas list. This means that vg_read will
read/process it again, and likely see the same mismatch with
the other copies of the metadata. Like the label_scan, the
vg_read will simply ignore the old copy of the metadata and
use the latest copy. If the command is modifying the vg
(e.g. lvcreate), then vg_write, which writes new metadata to
every mda on info->mdas, will write the new metadata to the
mda that had the old version. If successful, this will resolve
the old metadata problem (without needing to run a metadata
repair command.)

Outdated PVs
------------

An outdated PV is a PV that has an old copy of VG metadata
that shows it is a member of the VG, but the latest copy of
the VG metadata does not include this PV. This happens if
the PV is disconnected, vgreduce --removemissing is run to
remove the PV from the VG, then the PV is reconnected.
In this case, the outdated PV needs to have its outdated metadata
removed and the PV used flag needs to be cleared. This repair
will be done by the subsequent repair command. It is also done
if vgremove is run on the VG.

MISSING PVs
-----------

When a device is missing, most commands will refuse to modify
the VG. This is the simple case. More complicated is when
a command is allowed to modify the VG while it is missing a
device.

When a VG is written while a device is missing for one of its PVs,
the VG metadata is written to disk with the MISSING flag on the PV
with the missing device. When the VG is next used, it is treated
as if the PV with the MISSING flag still has a missing device, even
if that device has reappeared.

If all LVs that were using a PV with the MISSING flag are removed
or repaired so that the MISSING PV is no longer used, then the
next time the VG metadata is written, the MISSING flag will be
dropped.

Alternative methods of clearing the MISSING flag are:

vgreduce --removemissing will remove PVs with missing devices,
or PVs with the MISSING flag where the device has reappeared.

vgextend --restoremissing will clear the MISSING flag on PVs
where the device has reappeared, allowing the VG to be used
normally. This must be done with caution since the reappeared
device may have old data that is inconsistent with data on other PVs.

Bad mda repair
--------------

The new command:
vgck --updatemetadata VG

first uses vg_write to repair old metadata, and other basic
issues mentioned above (old metadata, outdated PVs, pv_header
flags, MISSING_PV flags). It will also go further and repair
bad metadata:

. text metadata that has a bad checksum
. text metadata that is not parsable
. corrupt mda_header checksum and version fields

(To keep a clean diff, #if 0 is added around functions that
are replaced by new code. These commented functions are
removed by the following commit.)
2019-05-24 20:04:37 +03:00
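The commit message above describes how a read ignores bad copies of the metadata and uses the copy with the highest seqno, leaving older copies to be rewritten later. The following is only a minimal sketch of that selection rule, not the actual lvm2 implementation: `struct mda_copy` and `pick_latest_copy()` are hypothetical names invented for illustration and do not exist in the lvm2 source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one copy of VG metadata in an mda. */
struct mda_copy {
	const char *dev;	/* device holding this copy */
	uint32_t seqno;		/* metadata sequence number */
	int bad;		/* nonzero if the copy failed checksum or parsing */
};

/*
 * Return the index of the newest good copy, or -1 if no copy is usable.
 * *old_count is set to the number of good copies whose seqno is smaller
 * than the chosen one ("old metadata" in the commit message's terms).
 * Bad copies are skipped entirely and are never counted as old.
 */
int pick_latest_copy(const struct mda_copy *copies, size_t n, size_t *old_count)
{
	int best = -1;
	size_t i;

	assert(old_count != NULL);
	*old_count = 0;

	/* First pass: find the good copy with the largest seqno. */
	for (i = 0; i < n; i++) {
		if (copies[i].bad)
			continue;
		if (best < 0 || copies[i].seqno > copies[best].seqno)
			best = (int) i;
	}

	if (best < 0)
		return -1;

	/* Second pass: count good copies that lag behind the chosen one. */
	for (i = 0; i < n; i++) {
		if (!copies[i].bad && copies[i].seqno < copies[best].seqno)
			(*old_count)++;
	}

	return best;
}
```

In this sketch a bad copy is ignored even if its seqno field looks newer, since a copy that fails validation cannot be trusted. A caller would read from the returned copy and treat a nonzero `*old_count` as a hint that a later write should refresh the lagging copies.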
	uint32_t error_flags = 0;
	int whole_selected = 0;
	int ret_max = ECMD_PROCESSED;
	int ret;
	int skip;
	int notfound;
	int process_all = 0;
	int do_report_ret_code = 1;

	log_set_report_object_type(LOG_REPORT_OBJECT_TYPE_VG);

	/*
	 * If no VG names or tags were supplied, then process all VGs.
	 */
	if (dm_list_empty(arg_vgnames) && dm_list_empty(arg_tags))
		process_all = 1;

	/*
	 * FIXME If one_vgname, only proceed if exactly one VG matches tags or selection.
	 */
	dm_list_iterate_items(vgnl, vgnameids_to_process) {
		vg_name = vgnl->vg_name;
		vg_uuid = vgnl->vgid;
		skip = 0;
		notfound = 0;

		uuid[0] = '\0';
		if (is_orphan_vg(vg_name)) {
			log_set_report_object_type(LOG_REPORT_OBJECT_TYPE_ORPHAN);
			log_set_report_object_name_and_id(vg_name + sizeof(VG_ORPHANS), uuid);
		} else {
			if (vg_uuid && !id_write_format((const struct id*)vg_uuid, uuid, sizeof(uuid)))
				stack;
			log_set_report_object_name_and_id(vg_name, uuid);
		}

		if (sigint_caught()) {
			ret_max = ECMD_FAILED;
			goto_out;
		}

		log_very_verbose("Processing VG %s %s", vg_name, uuid);

		if (!lockd_vg(cmd, vg_name, NULL, 0, &lockd_state)) {
			stack;
			ret_max = ECMD_FAILED;
			report_log_ret_code(ret_max);
			continue;
		}

		vg = vg_read(cmd, vg_name, vg_uuid, read_flags, lockd_state, &error_flags, &error_vg);
		if (_ignore_vg(cmd, error_flags, error_vg, vg_name, arg_vgnames, read_flags, &skip, &notfound)) {
			stack;
			ret_max = ECMD_FAILED;
			report_log_ret_code(ret_max);
			if (error_vg)
				unlock_and_release_vg(cmd, error_vg, vg_name);
			goto endvg;
		}
if (error_vg)
unlock_and_release_vg(cmd, error_vg, vg_name);
if (skip || notfound)
goto endvg;
/* Process this VG? */
if ((process_all ||
(!dm_list_empty(arg_vgnames) && str_list_match_item(arg_vgnames, vg_name)) ||
(!dm_list_empty(arg_tags) && str_list_match_list(arg_tags, &vg->tags, NULL))) &&
select_match_vg(cmd, handle, vg) && _select_matches(handle)) {
log_very_verbose("Running command for VG %s %s", vg_name, vg_uuid ? uuid : "");
ret = process_single_vg(cmd, vg_name, vg, handle);
_update_selection_result(handle, &whole_selected);
if (ret != ECMD_PROCESSED)
stack;
report_log_ret_code(ret);
if (ret > ret_max)
ret_max = ret;
}
unlock_vg(cmd, vg, vg_name);
endvg:
release_vg(vg);
if (!lockd_vg(cmd, vg_name, "un", 0, &lockd_state))
stack;
log_set_report_object_name_and_id(NULL, NULL);
}
/* the VG is selected if at least one LV is selected */
_set_final_selection_result(handle, whole_selected);
do_report_ret_code = 0;
out:
if (do_report_ret_code)
report_log_ret_code(ret_max);
log_restore_report_state(saved_log_report_state);
return ret_max;
}
/*
* Check if a command line VG name is ambiguous, i.e. there are multiple VGs on
* the system that have the given name. If *one* VG with the given name is
* local and the rest are foreign, then use the local VG (removing foreign VGs
* with the same name from the vgnameids_on_system list). If multiple VGs with
* the given name are local, we don't know which VG is intended, so remove the
* ambiguous name from the list of args.
*/
static int _resolve_duplicate_vgnames(struct cmd_context *cmd,
struct dm_list *arg_vgnames,
struct dm_list *vgnameids_on_system)
{
struct dm_str_list *sl, *sl2;
struct vgnameid_list *vgnl, *vgnl2;
char uuid[64] __attribute__((aligned(8)));
int found;
int ret = ECMD_PROCESSED;
dm_list_iterate_items_safe(sl, sl2, arg_vgnames) {
found = 0;
dm_list_iterate_items(vgnl, vgnameids_on_system) {
if (strcmp(sl->str, vgnl->vg_name))
continue;
found++;
}
if (found < 2)
continue;
/*
* More than one VG matches the given name.
* If only one is local, use that one.
*/
found = 0;
dm_list_iterate_items_safe(vgnl, vgnl2, vgnameids_on_system) {
if (strcmp(sl->str, vgnl->vg_name))
continue;
/*
* label scan has already populated lvmcache vginfo with
* this information.
*/
if (lvmcache_vg_is_foreign(cmd, vgnl->vg_name, vgnl->vgid)) {
if (!id_write_format((const struct id*)vgnl->vgid, uuid, sizeof(uuid)))
stack;
dm_list_del(&vgnl->list);
} else {
found++;
}
}
if (found < 2)
continue;
/*
* More than one VG with this name is local so the intended VG
* is unknown.
*/
log_error("Multiple VGs found with the same name: skipping %s", sl->str);
log_error("Use --select vg_uuid=<uuid> in place of the VG name.");
dm_list_del(&sl->list);
ret = ECMD_FAILED;
}
return ret;
}
/*
* For each arg_vgname, move the corresponding entry from
* vgnameids_on_system to vgnameids_to_process. If an
* item in arg_vgnames doesn't exist in vgnameids_on_system,
* then add a new entry for it to vgnameids_to_process.
*/
static void _choose_vgs_to_process(struct cmd_context *cmd,
struct dm_list *arg_vgnames,
struct dm_list *vgnameids_on_system,
struct dm_list *vgnameids_to_process)
{
char uuid[64] __attribute__((aligned(8)));
struct dm_str_list *sl, *sl2;
struct vgnameid_list *vgnl, *vgnl2;
struct id id;
int arg_is_uuid = 0;
int found;
dm_list_iterate_items_safe(sl, sl2, arg_vgnames) {
found = 0;
dm_list_iterate_items_safe(vgnl, vgnl2, vgnameids_on_system) {
if (strcmp(sl->str, vgnl->vg_name))
continue;
dm_list_del(&vgnl->list);
dm_list_add(vgnameids_to_process, &vgnl->list);
found = 1;
break;
}
/*
* If the VG name arg looks like a UUID, then check if it
* matches the UUID of a VG. (--select should generally
* be used to select a VG by uuid instead.)
*/
if (!found && (cmd->cname->flags & ALLOW_UUID_AS_NAME))
arg_is_uuid = id_read_format_try(&id, sl->str);
if (!found && arg_is_uuid) {
dm_list_iterate_items_safe(vgnl, vgnl2, vgnameids_on_system) {
if (!(id_write_format((const struct id*)vgnl->vgid, uuid, sizeof(uuid))))
continue;
if (strcmp(sl->str, uuid))
continue;
log_print("Processing VG %s because of matching UUID %s",
vgnl->vg_name, uuid);
dm_list_del(&vgnl->list);
dm_list_add(vgnameids_to_process, &vgnl->list);
/* Make the arg_vgnames entry use the actual VG name. */
sl->str = dm_pool_strdup(cmd->mem, vgnl->vg_name);
found = 1;
break;
}
}
/*
* If the name arg was not found in the list of all VGs, then
* it probably doesn't exist, but we want the "VG not found"
* failure to be handled by the existing vg_read() code for
* that error. So, create an entry with just the VG name so
* that the processing loop will attempt to process it and use
* the vg_read() error path.
*/
if (!found) {
log_verbose("VG name on command line not found in list of VGs: %s", sl->str);
if (!(vgnl = dm_pool_alloc(cmd->mem, sizeof(*vgnl))))
continue;
vgnl->vgid = NULL;
if (!(vgnl->vg_name = dm_pool_strdup(cmd->mem, sl->str)))
continue;
dm_list_add(vgnameids_to_process, &vgnl->list);
}
}
}
/*
* Call process_single_vg() for each VG selected by the command line arguments.
* If one_vgname is set, process only that VG and ignore argc/argv (which should be 0/NULL).
* If one_vgname is not set, get VG names to process from argc/argv.
*/
int process_each_vg(struct cmd_context *cmd,
int argc, char **argv,
const char *one_vgname,
struct dm_list *use_vgnames,
uint32_t read_flags,
int include_internal,
struct processing_handle *handle,
process_single_vg_fn_t process_single_vg)
{
log_report_t saved_log_report_state = log_get_report_state();
int handle_supplied = handle != NULL;
struct dm_list arg_tags; /* str_list */
struct dm_list arg_vgnames; /* str_list */
struct dm_list vgnameids_on_system; /* vgnameid_list */
struct dm_list vgnameids_to_process; /* vgnameid_list */
int enable_all_vgs = (cmd->cname->flags & ALL_VGS_IS_DEFAULT);
int process_all_vgs_on_system = 0;
int ret_max = ECMD_PROCESSED;
int ret;
log_set_report_object_type(LOG_REPORT_OBJECT_TYPE_VG);
log_debug("Processing each VG");
/* Disable error in vg_read so we can print it from ignore_vg. */
cmd->vg_read_print_access_error = 0;
dm_list_init(&arg_tags);
dm_list_init(&arg_vgnames);
dm_list_init(&vgnameids_on_system);
dm_list_init(&vgnameids_to_process);
/*
* Find any VGs or tags explicitly provided on the command line.
*/
if ((ret = _get_arg_vgnames(cmd, argc, argv, one_vgname, use_vgnames, &arg_vgnames, &arg_tags)) != ECMD_PROCESSED) {
ret_max = ret;
goto_out;
}
/*
* Process all VGs on the system when:
* . tags are specified and all VGs need to be read to
* look for matching tags.
* . no VG names are specified and the command defaults
* to processing all VGs when none are specified.
*/
if ((dm_list_empty(&arg_vgnames) && enable_all_vgs) || !dm_list_empty(&arg_tags))
process_all_vgs_on_system = 1;
/*
* Needed for a current listing of the global VG namespace.
*/
if (process_all_vgs_on_system && !lock_global(cmd, "sh")) {
ret_max = ECMD_FAILED;
goto_out;
}
/*
* Scan all devices to populate lvmcache with initial
* list of PVs and VGs.
*/
if (!(read_flags & PROCESS_SKIP_SCAN))
lvmcache_label_scan(cmd);
/*
* A list of all VGs on the system is needed when:
* . processing all VGs on the system
* . a VG name is specified which may refer to one
* of multiple VGs on the system with that name.
*/
log_very_verbose("Obtaining the complete list of VGs to process");
if (!lvmcache_get_vgnameids(cmd, &vgnameids_on_system, NULL, include_internal)) {
ret_max = ECMD_FAILED;
goto_out;
}
if (!dm_list_empty(&arg_vgnames)) {
/* This may remove entries from arg_vgnames or vgnameids_on_system. */
ret = _resolve_duplicate_vgnames(cmd, &arg_vgnames, &vgnameids_on_system);
if (ret > ret_max)
ret_max = ret;
if (dm_list_empty(&arg_vgnames) && dm_list_empty(&arg_tags)) {
ret_max = ECMD_FAILED;
goto out;
}
}
if (dm_list_empty(&arg_vgnames) && dm_list_empty(&vgnameids_on_system)) {
/* FIXME Should be log_print, but suppressed for reporting cmds */
log_verbose("No volume groups found.");
ret_max = ECMD_PROCESSED;
goto out;
}
if (dm_list_empty(&arg_vgnames))
read_flags |= READ_OK_NOTFOUND;
/*
* When processing all VGs, vgnameids_on_system simply becomes
* vgnameids_to_process.
* When processing only specified VGs, then for each item in
* arg_vgnames, move the corresponding entry from
* vgnameids_on_system to vgnameids_to_process.
*/
if (process_all_vgs_on_system)
dm_list_splice(&vgnameids_to_process, &vgnameids_on_system);
else
_choose_vgs_to_process(cmd, &arg_vgnames, &vgnameids_on_system, &vgnameids_to_process);
if (!handle && !(handle = init_processing_handle(cmd, NULL))) {
ret_max = ECMD_FAILED;
goto_out;
}
if (handle->internal_report_for_select && !handle->selection_handle &&
!init_selection_handle(cmd, handle, VGS)) {
ret_max = ECMD_FAILED;
goto_out;
}
ret = _process_vgnameid_list(cmd, read_flags, &vgnameids_to_process,
&arg_vgnames, &arg_tags, handle, process_single_vg);
if (ret > ret_max)
ret_max = ret;
out:
if (!handle_supplied)
destroy_processing_handle(cmd, handle);
log_restore_report_state(saved_log_report_state);
return ret_max;
}
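/*
 * Return the first item in sll whose string is exactly prefix
 * immediately followed by str, or NULL if no item matches.
 */
static struct dm_str_list *_str_list_match_item_with_prefix(const struct dm_list *sll, const char *prefix, const char *str)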
{
struct dm_str_list *sl;
size_t prefix_len = strlen(prefix);
dm_list_iterate_items(sl, sll) {
if (!strncmp(prefix, sl->str, prefix_len) &&
!strcmp(sl->str + prefix_len, str))
return sl;
}
return NULL;
}
/*
* Dummy LV, segment type and segment to represent all historical LVs.
*/
static struct logical_volume _historical_lv = {
.name = "",
.major = -1,
.minor = -1,
.snapshot_segs = DM_LIST_HEAD_INIT(_historical_lv.snapshot_segs),
.segments = DM_LIST_HEAD_INIT(_historical_lv.segments),
.tags = DM_LIST_HEAD_INIT(_historical_lv.tags),
.segs_using_this_lv = DM_LIST_HEAD_INIT(_historical_lv.segs_using_this_lv),
.indirect_glvs = DM_LIST_HEAD_INIT(_historical_lv.indirect_glvs),
.hostname = "",
};
static struct segment_type _historical_segment_type = {
.name = "historical",
.flags = SEG_VIRTUAL | SEG_CANNOT_BE_ZEROED,
};
static struct lv_segment _historical_lv_segment = {
.lv = &_historical_lv,
.segtype = &_historical_segment_type,
.len = 0,
.tags = DM_LIST_HEAD_INIT(_historical_lv_segment.tags),
.origin_list = DM_LIST_HEAD_INIT(_historical_lv_segment.origin_list),
};
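/*
 * Check which of the count options in opts are set on the command line.
 * The number of options that are set and not set is stored in
 * *match_count and *unmatch_count when those pointers are non-NULL.
 * Returns 1 if at least one option is set, otherwise 0.
 */
int opt_in_list_is_set(struct cmd_context *cmd, int *opts, int count,
int *match_count, int *unmatch_count)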
{
int match = 0;
int unmatch = 0;
int i;
for (i = 0; i < count; i++) {
if (arg_is_set(cmd, opts[i]))
match++;
else
unmatch++;
}
if (match_count)
*match_count = match;
if (unmatch_count)
*unmatch_count = unmatch;
return match ? 1 : 0;
}
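/*
 * Write the long names of the count options in opts into buf,
 * each followed by a space. Output that does not fit in len
 * bytes is dropped; buf is always NUL-terminated when len > 0.
 */
void opt_array_to_str(struct cmd_context *cmd, int *opts, int count,
char *buf, int len)
{
int pos = 0;
int ret;
int i;
if (len <= 0)
return;
/* Leave a valid empty string in buf even when count is 0. */
buf[0] = '\0';
for (i = 0; i < count; i++) {
ret = snprintf(buf + pos, len - pos, "%s ", arg_long_option_name(opts[i]));
/* Stop on an output error or when the name did not fit. */
if (ret < 0 || ret >= len - pos)
break;
pos += ret;
}
buf[len - 1] = '\0';
}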
static void _lvp_bits_to_str(uint64_t bits, char *buf, int len)
{
struct lv_prop *prop;
int lvp_enum;
int pos = 0;
int ret;
for (lvp_enum = 0; lvp_enum < LVP_COUNT; lvp_enum++) {
if (!(prop = get_lv_prop(lvp_enum)))
continue;
if (lvp_bit_is_set(bits, lvp_enum)) {
ret = snprintf(buf + pos, len - pos, "%s ", prop->name);
if (ret < 0 || ret >= len - pos)
break;
pos += ret;
}
}
buf[len - 1] = '\0';
}
/*
 * Write the names of all LV types set in bits into buf as a
 * space-separated list. Output is truncated if buf is too small.
 */
static void _lvt_bits_to_str(uint64_t bits, char *buf, int len)
{
struct lv_type *type;
int lvt_enum;
int pos = 0;
int ret;
for (lvt_enum = 0; lvt_enum < LVT_COUNT; lvt_enum++) {
if (!(type = get_lv_type(lvt_enum)))
continue;
if (lvt_bit_is_set(bits, lvt_enum)) {
ret = snprintf(buf + pos, len - pos, "%s ", type->name);
if (ret < 0 || ret >= len - pos)
break;
pos += ret;
}
}
buf[len - 1] = '\0';
}
/*
* This is the lv_prop function pointer used for lv_is_foo() #defines.
* Alternatively, lv_is_foo() could all be turned into functions.
*/
static int _lv_is_prop(struct cmd_context *cmd, struct logical_volume *lv, int lvp_enum)
{
switch (lvp_enum) {
case is_locked_LVP:
return lv_is_locked(lv);
case is_partial_LVP:
return lv_is_partial(lv);
case is_virtual_LVP:
return lv_is_virtual(lv);
case is_merging_LVP:
return lv_is_merging(lv);
case is_merging_origin_LVP:
return lv_is_merging_origin(lv);
case is_converting_LVP:
return lv_is_converting(lv);
case is_external_origin_LVP:
return lv_is_external_origin(lv);
case is_virtual_origin_LVP:
return lv_is_virtual_origin(lv);
case is_not_synced_LVP:
return lv_is_not_synced(lv);
case is_pending_delete_LVP:
return lv_is_pending_delete(lv);
case is_error_when_full_LVP:
return lv_is_error_when_full(lv);
case is_pvmove_LVP:
return lv_is_pvmove(lv);
case is_removed_LVP:
return lv_is_removed(lv);
case is_vg_writable_LVP:
return (lv->vg->status & LVM_WRITE) ? 1 : 0;
case is_thinpool_data_LVP:
return lv_is_thin_pool_data(lv);
case is_thinpool_metadata_LVP:
return lv_is_thin_pool_metadata(lv);
case is_cachepool_data_LVP:
return lv_is_cache_pool_data(lv);
case is_cachepool_metadata_LVP:
return lv_is_cache_pool_metadata(lv);
case is_mirror_image_LVP:
return lv_is_mirror_image(lv);
case is_mirror_log_LVP:
return lv_is_mirror_log(lv);
case is_raid_image_LVP:
return lv_is_raid_image(lv);
case is_raid_metadata_LVP:
return lv_is_raid_metadata(lv);
case is_origin_LVP: /* use lv_is_thick_origin */
return lv_is_origin(lv);
case is_thick_origin_LVP:
return lv_is_thick_origin(lv);
case is_thick_snapshot_LVP:
return lv_is_thick_snapshot(lv);
case is_thin_origin_LVP:
return lv_is_thin_origin(lv, NULL);
case is_thin_snapshot_LVP:
return lv_is_thin_snapshot(lv);
case is_cache_origin_LVP:
return lv_is_cache_origin(lv);
case is_merging_cow_LVP:
return lv_is_merging_cow(lv);
case is_cow_covering_origin_LVP:
return lv_is_cow_covering_origin(lv);
case is_visible_LVP:
return lv_is_visible(lv);
case is_historical_LVP:
return lv_is_historical(lv);
case is_raid_with_tracking_LVP:
return lv_is_raid_with_tracking(lv);
case is_raid_with_integrity_LVP:
return lv_raid_has_integrity(lv);
default:
log_error(INTERNAL_ERROR "unknown lv property value lvp_enum %d", lvp_enum);
}
return 0;
}
/*
* Check if an LV matches a given LV type enum.
*/
static int _lv_is_type(struct cmd_context *cmd, struct logical_volume *lv, int lvt_enum)
{
struct lv_segment *seg = first_seg(lv);
switch (lvt_enum) {
case striped_LVT:
return seg_is_striped(seg) && !lv_is_cow(lv);
case linear_LVT:
return seg_is_linear(seg) && !lv_is_cow(lv);
case snapshot_LVT:
return lv_is_cow(lv);
case thin_LVT:
return lv_is_thin_volume(lv);
case thinpool_LVT:
return lv_is_thin_pool(lv);
case cache_LVT:
return lv_is_cache(lv);
case cachepool_LVT:
return lv_is_cache_pool(lv);
case vdo_LVT:
return lv_is_vdo(lv);
case vdopool_LVT:
return lv_is_vdo_pool(lv);
case vdopooldata_LVT:
return lv_is_vdo_pool_data(lv);
case mirror_LVT:
return lv_is_mirror(lv);
case raid_LVT:
return lv_is_raid(lv);
case raid0_LVT:
return seg_is_any_raid0(seg);
case raid1_LVT:
return seg_is_raid1(seg);
case raid4_LVT:
return seg_is_raid4(seg);
case raid5_LVT:
return seg_is_any_raid5(seg);
	case raid6_LVT:
		return seg_is_any_raid6(seg);
	case raid10_LVT:
		return seg_is_raid10(seg);
	case writecache_LVT:
		return seg_is_writecache(seg);
	case integrity_LVT:
		return seg_is_integrity(seg);
	case error_LVT:
		return !strcmp(seg->segtype->name, SEG_TYPE_NAME_ERROR);
	case zero_LVT:
		return !strcmp(seg->segtype->name, SEG_TYPE_NAME_ZERO);
	default:
		log_error(INTERNAL_ERROR "unknown lv type value lvt_enum %d", lvt_enum);
	}

	return 0;
}

int get_lvt_enum(struct logical_volume *lv)
{
	struct lv_segment *seg = first_seg(lv);

	/*
	 * The order in which these are checked matters: a snapshot (COW) LV
	 * has a linear seg type, so it must be identified before the linear
	 * check.
	 */
	if (lv_is_cow(lv))
		return snapshot_LVT;
	if (seg_is_linear(seg))
		return linear_LVT;
	if (seg_is_striped(seg))
		return striped_LVT;
	if (lv_is_thin_volume(lv))
		return thin_LVT;
	if (lv_is_thin_pool(lv))
		return thinpool_LVT;
	if (lv_is_cache(lv))
		return cache_LVT;
	if (lv_is_cache_pool(lv))
		return cachepool_LVT;
	if (lv_is_vdo(lv))
		return vdo_LVT;
	if (lv_is_vdo_pool(lv))
		return vdopool_LVT;
	if (lv_is_vdo_pool_data(lv))
		return vdopooldata_LVT;
	if (lv_is_mirror(lv))
		return mirror_LVT;
	if (lv_is_raid(lv))
		return raid_LVT;
	if (seg_is_any_raid0(seg))
		return raid0_LVT;
	if (seg_is_raid1(seg))
		return raid1_LVT;
	if (seg_is_raid4(seg))
		return raid4_LVT;
	if (seg_is_any_raid5(seg))
		return raid5_LVT;
	if (seg_is_any_raid6(seg))
		return raid6_LVT;
	if (seg_is_raid10(seg))
		return raid10_LVT;
	if (seg_is_writecache(seg))
		return writecache_LVT;
	if (seg_is_integrity(seg))
		return integrity_LVT;
	if (!strcmp(seg->segtype->name, SEG_TYPE_NAME_ERROR))
		return error_LVT;
	if (!strcmp(seg->segtype->name, SEG_TYPE_NAME_ZERO))
		return zero_LVT;

	return 0;
}

/*
 * Call lv_is_<type> for each <type>_LVT bit set in lvt_bits.
 * Return 1 if lv matches at least one of the specified lv types, else 0.
 * If non-NULL, match_bits and unmatch_bits are set to the type bits
 * that did and did not match.
 */
static int _lv_types_match(struct cmd_context *cmd, struct logical_volume *lv, uint64_t lvt_bits,
			   uint64_t *match_bits, uint64_t *unmatch_bits)
{
	struct lv_type *type;
	int lvt_enum;
	int found_a_match = 0;
	int match;

	if (match_bits)
		*match_bits = 0;
	if (unmatch_bits)
		*unmatch_bits = 0;

	for (lvt_enum = 1; lvt_enum < LVT_COUNT; lvt_enum++) {
		if (!lvt_bit_is_set(lvt_bits, lvt_enum))
			continue;

		if (!(type = get_lv_type(lvt_enum)))
			continue;

		/*
		 * All types are currently handled by _lv_is_type()
		 * because the lv_is_<type>() checks are #defines
		 * and are not exposed in tools.h.
		 */
		if (!type->fn)
			match = _lv_is_type(cmd, lv, lvt_enum);
		else
			match = type->fn(cmd, lv);

		if (match)
			found_a_match = 1;

		if (match_bits && match)
			*match_bits |= lvt_enum_to_bit(lvt_enum);

		if (unmatch_bits && !match)
			*unmatch_bits |= lvt_enum_to_bit(lvt_enum);
	}

	return found_a_match;
}

/*
 * Call lv_is_<prop> for each <prop>_LVP bit set in lvp_bits.
 * Return 1 if lv matches all of the specified lv properties, else 0.
 * If non-NULL, match_bits and unmatch_bits are set to the property bits
 * that did and did not match.
 */
static int _lv_props_match(struct cmd_context *cmd, struct logical_volume *lv, uint64_t lvp_bits,
			   uint64_t *match_bits, uint64_t *unmatch_bits)
{
	struct lv_prop *prop;
	int lvp_enum;
	int found_a_mismatch = 0;
	int match;

	if (match_bits)
		*match_bits = 0;
	if (unmatch_bits)
		*unmatch_bits = 0;

	for (lvp_enum = 1; lvp_enum < LVP_COUNT; lvp_enum++) {
		if (!lvp_bit_is_set(lvp_bits, lvp_enum))
			continue;

		if (!(prop = get_lv_prop(lvp_enum)))
			continue;

		if (!prop->fn)
			match = _lv_is_prop(cmd, lv, lvp_enum);
		else
			match = prop->fn(cmd, lv);

		if (!match)
			found_a_mismatch = 1;

		if (match_bits && match)
			*match_bits |= lvp_enum_to_bit(lvp_enum);

		if (unmatch_bits && !match)
			*unmatch_bits |= lvp_enum_to_bit(lvp_enum);
	}

	return !found_a_mismatch;
}
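
/*
 * Illustration only, not part of lvm2: a minimal, self-contained sketch of
 * the two matching policies used by the functions above. The type check
 * succeeds when ANY requested bit matches, while the property check
 * succeeds only when ALL requested bits match. Every name below
 * (example_*, EX_*) is a hypothetical stand-in, and the "enum value N maps
 * to bit N" layout is an assumption of this sketch; the real code relies on
 * lvt_bit_is_set(), lvt_enum_to_bit() and the per-type/per-property check
 * functions. Note the edge case: with no bits requested, the "any" policy
 * yields 0 and the "all" policy yields 1.
 */
#include <stdint.h>

enum { EX_NONE = 0, EX_LINEAR, EX_STRIPED, EX_RAID1, EX_COUNT };

/* Assumed bit layout for this sketch: enum value e maps to bit e. */
static uint64_t example_enum_to_bit(int e)
{
	return UINT64_C(1) << e;
}

/* Hypothetical predicate: the object "is e" when bit e is set in actual. */
static int example_has(uint64_t actual, int e)
{
	return (actual & example_enum_to_bit(e)) ? 1 : 0;
}

/*
 * Walk every requested bit and record which ones match.
 * With require_all == 0, return 1 if at least one requested bit matched.
 * With require_all != 0, return 1 if no requested bit failed to match.
 */
static int example_bits_match(uint64_t actual, uint64_t requested, int require_all,
			      uint64_t *match_bits, uint64_t *unmatch_bits)
{
	int e, match;
	int found_a_match = 0;
	int found_a_mismatch = 0;

	if (match_bits)
		*match_bits = 0;
	if (unmatch_bits)
		*unmatch_bits = 0;

	for (e = 1; e < EX_COUNT; e++) {
		if (!(requested & example_enum_to_bit(e)))
			continue;

		match = example_has(actual, e);

		if (match)
			found_a_match = 1;
		else
			found_a_mismatch = 1;

		if (match_bits && match)
			*match_bits |= example_enum_to_bit(e);

		if (unmatch_bits && !match)
			*unmatch_bits |= example_enum_to_bit(e);
	}

	return require_all ? !found_a_mismatch : found_a_match;
}
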
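
/*
 * Check that lv is one of the LV types accepted by the command definition
 * for required positional arg number pos (1-based). Return 1 if no type
 * restriction applies (pos is 0 or no lvt_bits are set) or if lv matches.
 */
static int _check_lv_types(struct cmd_context *cmd, struct logical_volume *lv, int pos)
{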
	int ret;

	if (!pos)
		return 1;

	if (!cmd->command->required_pos_args[pos-1].def.lvt_bits)
		return 1;

	if (!val_bit_is_set(cmd->command->required_pos_args[pos-1].def.val_bits, lv_VAL)) {
		log_error(INTERNAL_ERROR "Command %d:%s arg position %d does not permit an LV (%llx)",
			  cmd->command->command_index, cmd->command->command_id,
			  pos, (unsigned long long)cmd->command->required_pos_args[pos-1].def.val_bits);
		return 0;
	}

	ret = _lv_types_match(cmd, lv, cmd->command->required_pos_args[pos-1].def.lvt_bits, NULL, NULL);
	if (!ret) {
		int lvt_enum = get_lvt_enum(lv);
		struct lv_type *type = get_lv_type(lvt_enum);
		if (!type) {
			log_warn("Command on LV %s does not accept LV type unknown (%d).",
				 display_lvname(lv), lvt_enum);
		} else {
			log_warn("Command on LV %s does not accept LV type %s.",
				 display_lvname(lv), type->name);
		}
	}

	return ret;
}

/* Check if LV passes each rule specified in command definition. */
static int _check_lv_rules(struct cmd_context *cmd, struct logical_volume *lv)
{
	char buf[64];
	struct cmd_rule *rule;
	struct lv_type *lvtype = NULL;
	uint64_t lv_props_match_bits = 0, lv_props_unmatch_bits = 0;
	uint64_t lv_types_match_bits = 0, lv_types_unmatch_bits = 0;
	int opts_match_count = 0, opts_unmatch_count = 0;
	int lvt_enum;
	int ret = 1;
	int i;

	lvt_enum = get_lvt_enum(lv);
if (lvt_enum)
lvtype = get_lv_type(lvt_enum);
for (i = 0; i < cmd->command->rule_count; i++) {
rule = &cmd->command->rules[i];
/*
* RULE: <conditions> INVALID|REQUIRE <checks>
*
* If all the conditions apply to the command+LV, then
* the checks are performed. If all conditions are zero
* (!opts_count, !lvt_bits, !lvp_bits), then the check
* is always performed.
*
* Conditions:
*
* 1. options (opts): if any of the specified options are set,
* then the checks may apply.
*
* 2. LV types (lvt_bits): if any of the specified LV types
* match the LV, then the checks may apply.
*
* 3. LV properties (lvp_bits): if all of the specified
* LV properties match the LV, then the checks may apply.
*
* If conditions 1, 2, 3 all pass, then the checks apply.
*
* Checks:
*
* 1. options (check_opts):
* INVALID: if any of the specified options are set,
* then the command fails.
* REQUIRE: if any of the specified options are not set,
* then the command fails.
*
* 2. LV types (check_lvt_bits):
* INVALID: if any of the specified LV types match the LV,
* then the command fails.
* REQUIRE: if none of the specified LV types match the LV,
* then the command fails.
*
* 3. LV properties (check_lvp_bits):
* INVALID: if any of the specified LV properties match
* the LV, then the command fails.
* REQUIRE: if any of the specified LV properties do not match
* the LV, then the command fails.
*/
if (rule->opts_count && !opt_in_list_is_set(cmd, rule->opts, rule->opts_count, NULL, NULL))
continue;
/* If LV matches one type in lvt_bits, this returns 1. */
if (rule->lvt_bits && !_lv_types_match(cmd, lv, rule->lvt_bits, NULL, NULL))
continue;
/* If LV matches all properties in lvp_bits, this returns 1. */
if (rule->lvp_bits && !_lv_props_match(cmd, lv, rule->lvp_bits, NULL, NULL))
continue;
/*
* Check the options, LV types, LV properties.
*/
if (rule->check_opts)
opt_in_list_is_set(cmd, rule->check_opts, rule->check_opts_count,
&opts_match_count, &opts_unmatch_count);
if (rule->check_lvt_bits)
_lv_types_match(cmd, lv, rule->check_lvt_bits,
&lv_types_match_bits, &lv_types_unmatch_bits);
if (rule->check_lvp_bits)
_lv_props_match(cmd, lv, rule->check_lvp_bits,
&lv_props_match_bits, &lv_props_unmatch_bits);
/*
* Evaluate if the check results pass based on the rule.
* The options are checked again here because the previous
* option validation (during command matching) does not cover
* cases where the option is combined with conditions of LV types
* or properties.
*/
/* Fail if any invalid options are set. */
if (rule->check_opts && (rule->rule == RULE_INVALID) && opts_match_count) {
memset(buf, 0, sizeof(buf));
opt_array_to_str(cmd, rule->check_opts, rule->check_opts_count, buf, sizeof(buf));
log_warn("Command on LV %s has invalid use of option %s.",
display_lvname(lv), buf);
ret = 0;
}
/* Fail if any required options are not set. */
if (rule->check_opts && (rule->rule == RULE_REQUIRE) && opts_unmatch_count) {
memset(buf, 0, sizeof(buf));
opt_array_to_str(cmd, rule->check_opts, rule->check_opts_count, buf, sizeof(buf));
log_warn("Command on LV %s requires option %s.",
display_lvname(lv), buf);
ret = 0;
}
/* Fail if the LV matches any of the invalid LV types. */
if (rule->check_lvt_bits && (rule->rule == RULE_INVALID) && lv_types_match_bits) {
if (rule->opts_count)
log_warn("Command on LV %s uses options invalid with LV type %s.",
display_lvname(lv), lvtype ? lvtype->name : "unknown");
else
log_warn("Command on LV %s with invalid LV type %s.",
display_lvname(lv), lvtype ? lvtype->name : "unknown");
ret = 0;
}
/* Fail if the LV does not match any of the required LV types. */
if (rule->check_lvt_bits && (rule->rule == RULE_REQUIRE) && !lv_types_match_bits) {
memset(buf, 0, sizeof(buf));
_lvt_bits_to_str(rule->check_lvt_bits, buf, sizeof(buf));
if (rule->opts_count)
log_warn("Command on LV %s uses options that require LV types %s.",
display_lvname(lv), buf);
else
log_warn("Command on LV %s does not accept LV type %s. Required LV types are %s.",
display_lvname(lv), lvtype ? lvtype->name : "unknown", buf);
ret = 0;
}
/* Fail if the LV matches any of the invalid LV properties. */
if (rule->check_lvp_bits && (rule->rule == RULE_INVALID) && lv_props_match_bits) {
memset(buf, 0, sizeof(buf));
_lvp_bits_to_str(lv_props_match_bits, buf, sizeof(buf));
if (rule->opts_count)
log_warn("Command on LV %s uses options that are invalid with LV properties: %s.",
display_lvname(lv), buf);
else
log_warn("Command on LV %s is invalid on LV with properties: %s.",
display_lvname(lv), buf);
ret = 0;
}
/* Fail if the LV does not match all of the required LV properties. */
if (rule->check_lvp_bits && (rule->rule == RULE_REQUIRE) && lv_props_unmatch_bits) {
memset(buf, 0, sizeof(buf));
_lvp_bits_to_str(lv_props_unmatch_bits, buf, sizeof(buf));
if (rule->opts_count)
log_warn("Command on LV %s uses options that require LV properties: %s.",
display_lvname(lv), buf);
else
log_warn("Command on LV %s requires LV with properties: %s.",
display_lvname(lv), buf);
ret = 0;
}
}
return ret;
}
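/*
 * Illustrative sketch only, not part of toollib.c: the RULE semantics
 * described in the comment above, reduced to LV types and LV properties
 * held in plain bitmasks. Option conditions and option checks are left
 * out for brevity. Every name here (sketch_rule, sketch_rule_passes,
 * the SKETCH_* constants) is hypothetical and assumes one bit per LV
 * type and one bit per LV property.
 */

```c
#include <stdint.h>

/* Hypothetical LV type bits and LV property bits. */
#define SKETCH_T_LINEAR   (UINT64_C(1) << 0)
#define SKETCH_T_THINPOOL (UINT64_C(1) << 1)
#define SKETCH_T_RAID     (UINT64_C(1) << 2)
#define SKETCH_P_ACTIVE   (UINT64_C(1) << 0)
#define SKETCH_P_MERGING  (UINT64_C(1) << 1)

enum sketch_rule_kind { SKETCH_RULE_INVALID, SKETCH_RULE_REQUIRE };

struct sketch_rule {
	uint64_t cond_types;   /* condition: LV has any of these types (0 = unconditional) */
	uint64_t cond_props;   /* condition: LV has all of these properties (0 = unconditional) */
	enum sketch_rule_kind kind;
	uint64_t check_types;  /* types to check once the conditions hold */
	uint64_t check_props;  /* properties to check once the conditions hold */
};

/* Return 1 if the LV passes the rule, 0 if the rule rejects it. */
static int sketch_rule_passes(const struct sketch_rule *r,
			      uint64_t lv_type_bit, uint64_t lv_props)
{
	uint64_t types_match, props_match, props_unmatch;

	/* A rule whose conditions do not all hold is skipped, so the LV passes. */
	if (r->cond_types && !(r->cond_types & lv_type_bit))
		return 1;
	if (r->cond_props && (r->cond_props & lv_props) != r->cond_props)
		return 1;

	types_match = r->check_types & lv_type_bit;
	props_match = r->check_props & lv_props;
	props_unmatch = r->check_props & ~lv_props;

	/* INVALID: any matching type or property rejects the LV. */
	if (r->kind == SKETCH_RULE_INVALID)
		return !(types_match || props_match);

	/* REQUIRE: at least one type must match, and every property must match. */
	if (r->check_types && !types_match)
		return 0;
	if (props_unmatch)
		return 0;
	return 1;
}
```

/*
 * In this sketch, a rule with cond_types = SKETCH_T_THINPOOL and an
 * INVALID check on SKETCH_P_MERGING rejects a merging thin pool but
 * is skipped entirely for a linear LV.
 */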
/*
* Return which arg position the given LV is at,
* where 1 represents the first position arg.
* When the first position arg is repeatable,
* return 1 for all.
*
* Return 0 when the command has no required
* position args. (optional position args are
* not considered.)
*/
static int _find_lv_arg_position(struct cmd_context *cmd, struct logical_volume *lv)
{
const char *sep, *lvname;
int i;
if (cmd->command->rp_count == 0)
return 0;
if (cmd->command->rp_count == 1)
return 1;
for (i = 0; i < cmd->position_argc; i++) {
if (i == cmd->command->rp_count)
break;
if (!val_bit_is_set(cmd->command->required_pos_args[i].def.val_bits, lv_VAL))
continue;
if ((sep = strstr(cmd->position_argv[i], "/")))
lvname = sep + 1;
else
lvname = cmd->position_argv[i];
if (!strcmp(lvname, lv->name))
return i + 1;
}
/*
* If the last position arg is an LV and this
* arg is beyond that position, then the last
* LV position arg is repeatable, so return
* that position.
*/
if (i == cmd->command->rp_count) {
int last_pos = cmd->command->rp_count;
if (val_bit_is_set(cmd->command->required_pos_args[last_pos-1].def.val_bits, lv_VAL))
return last_pos;
}
return 0;
}
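/*
 * Illustrative sketch only, not part of toollib.c: the position lookup
 * above, restated over plain string arrays. It assumes every required
 * position accepts an LV, so the val_bits test is dropped. The names
 * sketch_lv_name and sketch_find_position are hypothetical.
 */

```c
#include <string.h>

/* Return the LV part of "vg/lv", or the whole string when there is no '/'. */
static const char *sketch_lv_name(const char *arg)
{
	const char *sep = strchr(arg, '/');

	return sep ? sep + 1 : arg;
}

/*
 * Return the 1-based position of lv_name among the required positional
 * args, or 0 when the command has no required positional args or the
 * LV cannot be placed.
 */
static int sketch_find_position(const char *const *argv, int argc,
				int required_count, const char *lv_name)
{
	int i;

	if (required_count == 0)
		return 0;
	if (required_count == 1)
		return 1;

	for (i = 0; i < argc && i < required_count; i++) {
		if (!strcmp(sketch_lv_name(argv[i]), lv_name))
			return i + 1;
	}

	/*
	 * All required positions were scanned without a match, so the
	 * LV is treated as a repeat of the last required position.
	 */
	if (i == required_count)
		return required_count;

	return 0;
}
```

/*
 * With two required positions and the args "vg/lv1 lv2 vg/lv3", this
 * sketch places lv1 at position 1 and both lv2 and lv3 at position 2.
 */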
int process_each_lv_in_vg(struct cmd_context *cmd, struct volume_group *vg,
struct dm_list *arg_lvnames, const struct dm_list *tags_in,
int stop_on_error,
struct processing_handle *handle,
check_single_lv_fn_t check_single_lv,
process_single_lv_fn_t process_single_lv)
{
log_report_t saved_log_report_state = log_get_report_state();
char lv_uuid[64] __attribute__((aligned(8)));
char vg_uuid[64] __attribute__((aligned(8)));
int ret_max = ECMD_PROCESSED;
int ret = 0;
int whole_selected = 0;
int handle_supplied = handle != NULL;
unsigned process_lv;
unsigned process_all = 0;
unsigned tags_supplied = 0;
unsigned lvargs_supplied = 0;
	int lv_is_named_arg;
	int lv_arg_pos;
	struct lv_list *lvl;
	struct dm_str_list *sl;
metadata: process_each_lv_in_vg: get the list of LVs to process first, then do the processing

This avoids a problem in which we're using selection on LV list - we need to do the selection on initial state and not on any intermediary state as we process LVs one by one - some of the relations among LVs can be gone during this processing.

For example, processing one LV can cause the other LVs to lose the relation to this LV and hence they're not selectable anymore with the original selection criteria as they would be if we did selection on initial state. A perfect example is with thin snapshots:

    $ lvs -o lv_name,origin,layout,role vg
      LV    Origin Layout      Role
      lvol1        thin,sparse public,origin,thinorigin,multithinorigin
      lvol2 lvol1  thin,sparse public,snapshot,thinsnapshot
      lvol3 lvol1  thin,sparse public,snapshot,thinsnapshot
      pool         thin,pool   private

    $ lvremove -ff -S 'lv_name=lvol1 || origin=lvol1'
      Logical volume "lvol1" successfully removed

The lvremove command above was supposed to remove lvol1 as well as all its snapshots which have origin=lvol1. It failed to do so, because once we removed the origin lvol1, the lvol2 and lvol3 which were snapshots before are not snapshots anymore - the relations change as we're processing these LVs one by one.

If we do the selection first and then execute any concrete actions on these LVs (which is what this patch does), the behaviour is correct then - the selection is done on the *initial state*:

    $ lvremove -ff -S 'lv_name=lvol1 || origin=lvol1'
      Logical volume "lvol1" successfully removed
      Logical volume "lvol2" successfully removed
      Logical volume "lvol3" successfully removed

Similarly for all the other situations in which relations among LVs are being changed by processing the LVs one by one.

This patch also introduces LV_REMOVED internal LV status flag to mark removed LVs so they're not processed further when we iterate over collected list of LVs to be processed.

Previously, when we iterated directly over vg->lvs list to process the LVs, we relied on the fact that once the LV is removed, it is also removed from the vg->lvs list we're iterating over. But that was incorrect as we shouldn't remove LVs from the list during one iteration while we're iterating over that exact list (dm_list_iterate_items_safe can handle only one removal at one iteration anyway, so it can't be used here).
2015-03-16 19:10:21 +03:00
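The difference between selecting while processing and selecting on the initial state can be shown with a minimal, self-contained sketch. Everything in it is hypothetical and exists only for illustration: `struct toy_lv`, `toy_selected()`, `toy_remove()` and the two `toy_remove_*` functions are not lvm2 types or APIs, and the selection criterion is hard-coded to the `lv_name=lvol1 || origin=lvol1` example from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LVS 8

/* Hypothetical, simplified stand-in for an LV; not the lvm2 structure. */
struct toy_lv {
	const char *name;
	const char *origin;	/* NULL when the LV is not a snapshot */
	int removed;
};

/* Hard-coded criterion from the example: lv_name=lvol1 || origin=lvol1 */
static int toy_selected(const struct toy_lv *lv)
{
	return !strcmp(lv->name, "lvol1") ||
	       (lv->origin && !strcmp(lv->origin, "lvol1"));
}

/* Removing an origin detaches its snapshots, so later selection results change. */
static void toy_remove(struct toy_lv *lvs, size_t count, struct toy_lv *lv)
{
	size_t i;

	lv->removed = 1;
	for (i = 0; i < count; i++)
		if (lvs[i].origin && !strcmp(lvs[i].origin, lv->name))
			lvs[i].origin = NULL;
}

/* Select and act in a single pass: selection sees intermediate state. */
static size_t toy_remove_one_pass(struct toy_lv *lvs, size_t count)
{
	size_t i, removed = 0;

	for (i = 0; i < count; i++)
		if (!lvs[i].removed && toy_selected(&lvs[i])) {
			toy_remove(lvs, count, &lvs[i]);
			removed++;
		}

	return removed;
}

/* Select on the initial state first, then act on the collected list. */
static size_t toy_remove_two_phase(struct toy_lv *lvs, size_t count)
{
	struct toy_lv *final[MAX_LVS];
	size_t i, n = 0, removed = 0;

	assert(count <= MAX_LVS);

	for (i = 0; i < count; i++)
		if (toy_selected(&lvs[i]))
			final[n++] = &lvs[i];

	for (i = 0; i < n; i++) {
		/* Skip entries already removed; similar in spirit to a removed-flag check. */
		if (final[i]->removed)
			continue;
		toy_remove(lvs, count, final[i]);
		removed++;
	}

	return removed;
}
```

With the four LVs from the `lvs` listing above (lvol1, its two snapshots, and the pool), the single-pass variant removes only lvol1, while the two-phase variant removes lvol1, lvol2 and lvol3 and leaves the pool alone.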
	struct dm_list final_lvs;
	struct lv_list *final_lvl;
	struct dm_list found_arg_lvnames;
	struct glv_list *glvl, *tglvl;
	int do_report_ret_code = 1;
	log_set_report_object_type(LOG_REPORT_OBJECT_TYPE_LV);

	vg_uuid[0] = '\0';
	if (!id_write_format(&vg->id, vg_uuid, sizeof(vg_uuid)))
		stack;
	dm_list_init(&final_lvs);
	dm_list_init(&found_arg_lvnames);

	if (tags_in && !dm_list_empty(tags_in))
		tags_supplied = 1;

	if (arg_lvnames && !dm_list_empty(arg_lvnames))
		lvargs_supplied = 1;

	if (!handle && !(handle = init_processing_handle(cmd, NULL))) {
		ret_max = ECMD_FAILED;
		goto_out;
	}

	if (handle->internal_report_for_select && !handle->selection_handle &&
	    !init_selection_handle(cmd, handle, LVS)) {
		ret_max = ECMD_FAILED;
		goto_out;
	}

	/* Process all LVs in this VG if no restrictions given
	 * or if VG tags match. */
	if ((!tags_supplied && !lvargs_supplied) ||
	    (tags_supplied && str_list_match_list(tags_in, &vg->tags, NULL)))
		process_all = 1;

	log_set_report_object_group_and_group_id(vg->name, vg_uuid);

	dm_list_iterate_items(lvl, &vg->lvs) {
		lv_uuid[0] = '\0';
		if (!id_write_format(&lvl->lv->lvid.id[1], lv_uuid, sizeof(lv_uuid)))
			stack;

		log_set_report_object_name_and_id(lvl->lv->name, lv_uuid);

		if (sigint_caught()) {
			ret_max = ECMD_FAILED;
			goto_out;
		}

		if (lv_is_snapshot(lvl->lv))
			continue;

		/* Skip availability change for non-virt snaps when processing all LVs */
		/* FIXME: pass process_all to process_single_lv() */
		if (process_all && arg_is_set(cmd, activate_ARG) &&
		    lv_is_cow(lvl->lv) && !lv_is_virtual_origin(origin_from_cow(lvl->lv)))
			continue;

		if (lv_is_virtual_origin(lvl->lv) && !arg_is_set(cmd, all_ARG)) {
			if (lvargs_supplied &&
			    str_list_match_item(arg_lvnames, lvl->lv->name))
				log_print_unless_silent("Ignoring virtual origin logical volume %s.",
							display_lvname(lvl->lv));
			continue;
		}

		/*
		 * Only let hidden LVs through if --all was used or the LVs
		 * were specifically named on the command line.
		 */
		if (!lvargs_supplied && !lv_is_visible(lvl->lv) && !arg_is_set(cmd, all_ARG) &&
		    (!cmd->process_component_lvs || !lv_is_component(lvl->lv)))
			continue;

		/*
		 * Only let sanlock LV through if --all was used or if
		 * it is named on the command line.
		 */
		if (lv_is_lockd_sanlock_lv(lvl->lv)) {
			if (arg_is_set(cmd, all_ARG) ||
			    (lvargs_supplied && str_list_match_item(arg_lvnames, lvl->lv->name))) {
				log_very_verbose("Processing lockd_sanlock_lv %s/%s.", vg->name, lvl->lv->name);
			} else {
				continue;
			}
		}

		/*
		 * process the LV if one of the following:
		 * - process_all is set
		 * - LV name matches a supplied LV name
		 * - LV tag matches a supplied LV tag
		 * - LV matches the selection
		 */

		process_lv = process_all;

		if (lvargs_supplied && str_list_match_item(arg_lvnames, lvl->lv->name)) {
			/* Remove LV from list of unprocessed LV names */
			str_list_del(arg_lvnames, lvl->lv->name);
			if (!str_list_add(cmd->mem, &found_arg_lvnames, lvl->lv->name)) {
				log_error("strlist allocation failed.");
				ret_max = ECMD_FAILED;
				goto out;
			}
			process_lv = 1;
		}

		if (!process_lv && tags_supplied && str_list_match_list(tags_in, &lvl->lv->tags, NULL))
			process_lv = 1;

		process_lv = process_lv && select_match_lv(cmd, handle, vg, lvl->lv) && _select_matches(handle);

		if (!process_lv)
			continue;

		log_very_verbose("Adding %s to the list of LVs to be processed.", display_lvname(lvl->lv));

		if (!(final_lvl = dm_pool_zalloc(cmd->mem, sizeof(struct lv_list)))) {
			log_error("Failed to allocate final LV list item.");
			ret_max = ECMD_FAILED;
			goto out;
		}
		final_lvl->lv = lvl->lv;
		if (lv_is_thin_pool(lvl->lv)) {
			/* Add to the front of the list */
			dm_list_add_h(&final_lvs, &final_lvl->list);
		} else
			dm_list_add(&final_lvs, &final_lvl->list);
	}
	log_set_report_object_name_and_id(NULL, NULL);
/*
* If a PV is stacked on an LV, then the LV is kept open
* in bcache, and needs to be closed so the open fd doesn't
* interfere with processing the LV.
*/
dm_list_iterate_items(lvl, &final_lvs)
label_scan_invalidate_lv(cmd, lvl->lv);
dm_list_iterate_items(lvl, &final_lvs) {
lv_uuid[0] = '\0';
if (!id_write_format(&lvl->lv->lvid.id[1], lv_uuid, sizeof(lv_uuid)))
stack;
log_set_report_object_name_and_id(lvl->lv->name, lv_uuid);
if (sigint_caught()) {
ret_max = ECMD_FAILED;
goto_out;
}
/*
* FIXME: Once we have index over vg->removed_lvs, check directly
* LV presence there and remove LV_REMOVED flag/lv_is_removed fn
* as they won't be needed anymore.
*/
if (lv_is_removed(lvl->lv))
continue;
lv_is_named_arg = str_list_match_item(&found_arg_lvnames, lvl->lv->name);
lv_arg_pos = _find_lv_arg_position(cmd, lvl->lv);
/*
* The command definition may include restrictions on the
* types and properties of LVs that can be processed.
*/
if (!_check_lv_types(cmd, lvl->lv, lv_arg_pos)) {
/* FIXME: include this result in report log? */
if (lv_is_named_arg) {
log_error("Command not permitted on LV %s.", display_lvname(lvl->lv));
ret_max = ECMD_FAILED;
}
continue;
}
if (!_check_lv_rules(cmd, lvl->lv)) {
/* FIXME: include this result in report log? */
if (lv_is_named_arg) {
log_error("Command not permitted on LV %s.", display_lvname(lvl->lv));
ret_max = ECMD_FAILED;
}
continue;
}
if (check_single_lv && !check_single_lv(cmd, lvl->lv, handle, lv_is_named_arg)) {
if (lv_is_named_arg)
ret_max = ECMD_FAILED;
continue;
}
log_very_verbose("Processing LV %s in VG %s.", lvl->lv->name, vg->name);
ret = process_single_lv(cmd, lvl->lv, handle);
if (handle_supplied)
_update_selection_result(handle, &whole_selected);
if (ret != ECMD_PROCESSED)
stack;
report_log_ret_code(ret);
if (ret > ret_max)
ret_max = ret;
if (stop_on_error && ret != ECMD_PROCESSED) {
do_report_ret_code = 0;
goto_out;
}
}
log_set_report_object_name_and_id(NULL, NULL);
if (handle->include_historical_lvs && !tags_supplied) {
if (dm_list_empty(&_historical_lv.segments))
dm_list_add(&_historical_lv.segments, &_historical_lv_segment.list);
_historical_lv.vg = vg;
dm_list_iterate_items_safe(glvl, tglvl, &vg->historical_lvs) {
lv_uuid[0] = '\0';
if (!id_write_format(&glvl->glv->historical->lvid.id[1], lv_uuid, sizeof(lv_uuid)))
stack;
log_set_report_object_name_and_id(glvl->glv->historical->name, lv_uuid);
if (sigint_caught()) {
ret_max = ECMD_FAILED;
goto_out;
}
process_lv = process_all;
if (lvargs_supplied &&
(sl = _str_list_match_item_with_prefix(arg_lvnames, HISTORICAL_LV_PREFIX, glvl->glv->historical->name))) {
str_list_del(arg_lvnames, glvl->glv->historical->name);
dm_list_del(&sl->list);
process_lv = 1;
}
process_lv = process_lv && select_match_lv(cmd, handle, vg, lvl->lv) && _select_matches(handle);
if (!process_lv)
continue;
_historical_lv.this_glv = glvl->glv;
_historical_lv.name = glvl->glv->historical->name;
log_very_verbose("Processing historical LV %s in VG %s.", glvl->glv->historical->name, vg->name);
ret = process_single_lv(cmd, &_historical_lv, handle);
if (handle_supplied)
_update_selection_result(handle, &whole_selected);
if (ret != ECMD_PROCESSED)
stack;
report_log_ret_code(ret);
if (ret > ret_max)
ret_max = ret;
if (stop_on_error && ret != ECMD_PROCESSED) {
do_report_ret_code = 0;
goto_out;
}
}
log_set_report_object_name_and_id(NULL, NULL);
}
if (vg->needs_write_and_commit && (ret_max == ECMD_PROCESSED) &&
(!vg_write(vg) || !vg_commit(vg)))
ret_max = ECMD_FAILED;
if (lvargs_supplied) {
/*
* FIXME: lvm supports removal of LV with all its dependencies
* this leads to miscalculation that depends on the order of args.
*/
dm_list_iterate_items(sl, arg_lvnames) {
log_set_report_object_name_and_id(sl->str, NULL);
log_error("Failed to find logical volume \"%s/%s\"",
vg->name, sl->str);
if (ret_max < ECMD_FAILED)
ret_max = ECMD_FAILED;
report_log_ret_code(ret_max);
}
}
do_report_ret_code = 0;
out:
if (do_report_ret_code)
report_log_ret_code(ret_max);
log_set_report_object_name_and_id(NULL, NULL);
log_set_report_object_group_and_group_id(NULL, NULL);
if (!handle_supplied)
destroy_processing_handle(cmd, handle);
else
_set_final_selection_result(handle, whole_selected);
log_restore_report_state(saved_log_report_state);
return ret_max;
}
/*
* If arg is tag, add it to arg_tags
* else the arg is either vgname or vgname/lvname:
* - add the vgname of each arg to arg_vgnames
* - if arg has no lvname, add just vgname to arg_lvnames,
* it represents all lvs in the vg
* - if arg has lvname, add vgname/lvname to arg_lvnames
*/
static int _get_arg_lvnames(struct cmd_context *cmd,
int argc, char **argv,
const char *one_vgname, const char *one_lvname,
struct dm_list *arg_vgnames,
struct dm_list *arg_lvnames,
struct dm_list *arg_tags)
{
int opt = 0;
int ret_max = ECMD_PROCESSED;
char *vglv;
size_t vglv_sz;
const char *vgname;
const char *lv_name;
const char *tmp_lv_name;
const char *vgname_def;
unsigned dev_dir_found;
if (one_vgname) {
if (!str_list_add(cmd->mem, arg_vgnames,
dm_pool_strdup(cmd->mem, one_vgname))) {
log_error("strlist allocation failed.");
return ECMD_FAILED;
}
if (!one_lvname) {
if (!str_list_add(cmd->mem, arg_lvnames,
dm_pool_strdup(cmd->mem, one_vgname))) {
log_error("strlist allocation failed.");
return ECMD_FAILED;
}
} else {
vglv_sz = strlen(one_vgname) + strlen(one_lvname) + 2;
if (!(vglv = dm_pool_alloc(cmd->mem, vglv_sz)) ||
dm_snprintf(vglv, vglv_sz, "%s/%s", one_vgname, one_lvname) < 0) {
log_error("vg/lv string alloc failed.");
return ECMD_FAILED;
}
if (!str_list_add(cmd->mem, arg_lvnames, vglv)) {
log_error("strlist allocation failed.");
return ECMD_FAILED;
}
}
return ret_max;
}
for (; opt < argc; opt++) {
lv_name = argv[opt];
dev_dir_found = 0;
/* Do we have a tag or vgname or lvname? */
vgname = lv_name;
if (*vgname == '@') {
if (!validate_tag(vgname + 1)) {
log_error("Skipping invalid tag %s.", vgname);
continue;
}
if (!str_list_add(cmd->mem, arg_tags,
dm_pool_strdup(cmd->mem, vgname + 1))) {
log_error("strlist allocation failed.");
return ECMD_FAILED;
}
continue;
}
/* FIXME Jumbled parsing */
vgname = skip_dev_dir(cmd, vgname, &dev_dir_found);
if (*vgname == '/') {
log_error("\"%s\": Invalid path for Logical Volume.",
argv[opt]);
if (ret_max < ECMD_FAILED)
ret_max = ECMD_FAILED;
continue;
}
lv_name = vgname;
if ((tmp_lv_name = strchr(vgname, '/'))) {
/* Must be an LV */
lv_name = tmp_lv_name;
while (*lv_name == '/')
lv_name++;
if (!(vgname = extract_vgname(cmd, vgname))) {
if (ret_max < ECMD_FAILED) {
stack;
ret_max = ECMD_FAILED;
}
continue;
}
} else if (!dev_dir_found &&
(vgname_def = _default_vgname(cmd)))
vgname = vgname_def;
else
lv_name = NULL;
if (!str_list_add(cmd->mem, arg_vgnames,
dm_pool_strdup(cmd->mem, vgname))) {
log_error("strlist allocation failed.");
return ECMD_FAILED;
}
if (!lv_name) {
if (!str_list_add(cmd->mem, arg_lvnames,
dm_pool_strdup(cmd->mem, vgname))) {
log_error("strlist allocation failed.");
return ECMD_FAILED;
}
} else {
vglv_sz = strlen(vgname) + strlen(lv_name) + 2;
if (!(vglv = dm_pool_alloc(cmd->mem, vglv_sz)) ||
dm_snprintf(vglv, vglv_sz, "%s/%s", vgname, lv_name) < 0) {
log_error("vg/lv string alloc failed.");
return ECMD_FAILED;
}
if (!str_list_add(cmd->mem, arg_lvnames, vglv)) {
log_error("strlist allocation failed.");
return ECMD_FAILED;
}
}
}
return ret_max;
}
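/*
 * Illustrative sketch only, not part of LVM2: a simplified, standalone
 * version of the per-argument classification that _get_arg_lvnames()
 * performs above ('@tag', 'vgname' or 'vgname/lvname').
 *
 * Assumptions: the names example_arg_kind and example_classify_arg are
 * hypothetical.  Unlike the real code, this sketch does not strip the
 * device directory prefix, does not consult the default VG name, does
 * not validate tag or VG names, and treats a trailing '/' as invalid.
 */
#include <string.h>

enum example_arg_kind {
	EXAMPLE_ARG_TAG,	/* "@tag" */
	EXAMPLE_ARG_VG,		/* "vgname", meaning all LVs in the VG */
	EXAMPLE_ARG_VG_LV,	/* "vgname/lvname" */
	EXAMPLE_ARG_INVALID
};

/*
 * Classify one positional argument.  For "vgname/lvname", *lv_name is
 * set to point at the LV part inside arg; otherwise it is set to NULL.
 */
__attribute__((unused))
static enum example_arg_kind example_classify_arg(const char *arg,
						  const char **lv_name)
{
	const char *slash;

	*lv_name = NULL;

	/* A leading '@' marks a tag; an empty tag is rejected here. */
	if (*arg == '@')
		return arg[1] ? EXAMPLE_ARG_TAG : EXAMPLE_ARG_INVALID;

	/* A leading '/' is not a valid path for an LV in this sketch. */
	if (*arg == '/')
		return EXAMPLE_ARG_INVALID;

	/* No '/' at all: the argument names a whole VG. */
	if (!(slash = strchr(arg, '/')))
		return EXAMPLE_ARG_VG;

	/* Skip repeated '/' between the VG and LV parts, e.g. "vg//lv". */
	while (*slash == '/')
		slash++;

	if (!*slash)
		return EXAMPLE_ARG_INVALID;

	*lv_name = slash;

	return EXAMPLE_ARG_VG_LV;
}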
/*
* Finding vgname/lvname to process.
*
* When the position arg is a single name without any '/'
* it is treated as an LV name (leaving the VG unknown).
* Other option values, or env var, must be searched for a VG name.
* If one of the option values contains a vgname/lvname value,
* then the VG name is extracted and used for the LV position arg.
* Or, if the env var has the VG name, that is used.
*
* Other option values that are searched for a VG name are:
* --thinpool, --cachepool, --poolmetadata.
*
* command vg/lv1
* . add vg to arg_vgnames
* . add vg/lv1 to arg_lvnames
*
* command lv1
* . error: no vg name (unless LVM_VG_NAME)
*
* command --option=vg/lv1 vg/lv2
* . verify both vg names match
* . add vg to arg_vgnames
* . add vg/lv2 to arg_lvnames
*
* command --option=lv1 lv2
* . error: no vg name (unless LVM_VG_NAME)
*
* command --option=vg/lv1 lv2
* . add vg to arg_vgnames
* . add vg/lv2 to arg_lvnames
*
* command --option=lv1 vg/lv2
* . add vg to arg_vgnames
* . add vg/lv2 to arg_lvnames
*/
static int _get_arg_lvnames_using_options(struct cmd_context *cmd,
int argc, char **argv,
struct dm_list *arg_vgnames,
struct dm_list *arg_lvnames,
struct dm_list *arg_tags)
{
/* Array with args which may provide vgname */
static const unsigned _opts_with_vgname[] = {
cachepool_ARG, poolmetadata_ARG, thinpool_ARG
};
unsigned i;
const char *pos_name = NULL;
const char *arg_name = NULL;
const char *pos_vgname = NULL;
const char *opt_vgname = NULL;
const char *pos_lvname = NULL;
const char *use_vgname = NULL;
char *vglv;
size_t vglv_sz;
if (argc != 1) {
log_error("One LV position arg is required.");
return ECMD_FAILED;
}
if (!(pos_name = dm_pool_strdup(cmd->mem, argv[0]))) {
log_error("string alloc failed.");
return ECMD_FAILED;
}
if (*pos_name == '@') {
if (!validate_tag(pos_name + 1)) {
log_error("Skipping invalid tag %s.", pos_name);
return ECMD_FAILED;
}
if (!str_list_add(cmd->mem, arg_tags,
dm_pool_strdup(cmd->mem, pos_name + 1))) {
log_error("strlist allocation failed.");
return ECMD_FAILED;
}
return ECMD_PROCESSED;
}
if (strchr(pos_name, '/')) {
/*
* This splits pos_name 'x/y' into pos_vgname 'x' and pos_lvname 'y'
* It skips repeated '/', e.g. x//y
* It also checks and fails for extra '/', e.g. x/y/z
*/
if (!(pos_vgname = _extract_vgname(cmd, pos_name, &pos_lvname)))
return_0;
use_vgname = pos_vgname;
} else
pos_lvname = pos_name;
/* Go through the list of options which can provide vgname */
for (i = 0; i < DM_ARRAY_SIZE(_opts_with_vgname); ++i) {
if ((arg_name = arg_str_value(cmd, _opts_with_vgname[i], NULL)) &&
strchr(arg_name, '/')) {
/* Combined VG/LV */
/* Don't care about opt lvname, only extract vgname. */
if (!(opt_vgname = _extract_vgname(cmd, arg_name, NULL)))
return_0;
/* Compare with already known vgname */
if (use_vgname) {
if (strcmp(use_vgname, opt_vgname)) {
log_error("VG name mismatch from %s arg (%s) and option arg (%s).",
pos_vgname ? "position" : "option",
use_vgname, opt_vgname);
return ECMD_FAILED;
}
} else
use_vgname = opt_vgname;
}
}
/* VG not specified as position nor as optional arg, so check for default VG */
if (!use_vgname && !(use_vgname = _default_vgname(cmd))) {
log_error("Cannot find VG name for LV %s.", pos_lvname);
return ECMD_FAILED;
}
if (!str_list_add(cmd->mem, arg_vgnames, dm_pool_strdup(cmd->mem, use_vgname))) {
log_error("strlist allocation failed.");
return ECMD_FAILED;
}
vglv_sz = strlen(use_vgname) + strlen(pos_lvname) + 2;
if (!(vglv = dm_pool_alloc(cmd->mem, vglv_sz)) ||
dm_snprintf(vglv, vglv_sz, "%s/%s", use_vgname, pos_lvname) < 0) {
log_error("vg/lv string alloc failed.");
return ECMD_FAILED;
}
if (!str_list_add(cmd->mem, arg_lvnames, vglv)) {
log_error("strlist allocation failed.");
return ECMD_FAILED;
}
return ECMD_PROCESSED;
}
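/*
 * Illustrative sketch only, not part of LVM2: the VG name precedence
 * applied by _get_arg_lvnames_using_options() above, reduced to a
 * single option value.
 *
 * Assumptions: the name example_resolve_vgname is hypothetical, and the
 * default_vgname parameter stands in for whatever _default_vgname()
 * would return.  The real function checks several options in turn and
 * logs an error on failure; this sketch only returns NULL.
 */
#include <string.h>

/*
 * pos_vgname:     VG name taken from the position arg, or NULL
 * opt_vgname:     VG name taken from an option value, or NULL
 * default_vgname: fallback VG name, or NULL
 *
 * Returns the VG name to use, or NULL if the position and option VG
 * names disagree or no VG name is available at all.
 */
__attribute__((unused))
static const char *example_resolve_vgname(const char *pos_vgname,
					  const char *opt_vgname,
					  const char *default_vgname)
{
	/* Both given: they must name the same VG. */
	if (pos_vgname && opt_vgname && strcmp(pos_vgname, opt_vgname))
		return NULL;

	if (pos_vgname)
		return pos_vgname;

	if (opt_vgname)
		return opt_vgname;

	/* Neither given: fall back to the default, which may be NULL. */
	return default_vgname;
}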
static int _process_lv_vgnameid_list(struct cmd_context *cmd, uint32_t read_flags,
struct dm_list *vgnameids_to_process,
struct dm_list *arg_vgnames,
struct dm_list *arg_lvnames,
struct dm_list *arg_tags,
struct processing_handle *handle,
check_single_lv_fn_t check_single_lv,
process_single_lv_fn_t process_single_lv)
{
log_report_t saved_log_report_state = log_get_report_state();
char uuid[64] __attribute__((aligned(8)));
struct volume_group *vg;
2019-05-24 20:04:37 +03:00
	struct volume_group *error_vg = NULL;
	struct vgnameid_list *vgnl;
	struct dm_str_list *sl;
	struct dm_list *tags_arg;
	struct dm_list lvnames;
	uint32_t lockd_state = 0;
	uint32_t error_flags = 0;
	const char *vg_name;
	const char *vg_uuid;
	const char *vgn;
	const char *lvn;
	int ret_max = ECMD_PROCESSED;
	int ret;
	int skip;
	int notfound;
	int do_report_ret_code = 1;

	log_set_report_object_type(LOG_REPORT_OBJECT_TYPE_VG);

	dm_list_iterate_items(vgnl, vgnameids_to_process) {
		vg_name = vgnl->vg_name;
		vg_uuid = vgnl->vgid;
		skip = 0;
		notfound = 0;

		uuid[0] = '\0';
		if (vg_uuid && !id_write_format((const struct id*)vg_uuid, uuid, sizeof(uuid)))
			stack;

		log_set_report_object_name_and_id(vg_name, uuid);

		if (sigint_caught()) {
			ret_max = ECMD_FAILED;
			goto_out;
		}

		/*
		 * arg_lvnames contains some elements that are just "vgname"
		 * which means process all lvs in the vg.  Other elements
		 * are "vgname/lvname" which means process only the select
		 * lvs in the vg.
		 */
		tags_arg = arg_tags;
		dm_list_init(&lvnames);	/* LVs to be processed in this VG */

		dm_list_iterate_items(sl, arg_lvnames) {
			vgn = sl->str;
			lvn = strchr(vgn, '/');

			if (!lvn && !strcmp(vgn, vg_name)) {
				/* Process all LVs in this VG */
				tags_arg = NULL;
				dm_list_init(&lvnames);
				break;
			}

			if (lvn && !strncmp(vgn, vg_name, strlen(vg_name)) &&
			    strlen(vg_name) == (size_t) (lvn - vgn)) {
				if (!str_list_add(cmd->mem, &lvnames,
						  dm_pool_strdup(cmd->mem, lvn + 1))) {
					log_error("strlist allocation failed.");
					ret_max = ECMD_FAILED;
					goto out;
				}
			}
		}

		log_very_verbose("Processing VG %s %s", vg_name, vg_uuid ? uuid : "");
		if (!lockd_vg(cmd, vg_name, NULL, 0, &lockd_state)) {
			ret_max = ECMD_FAILED;
			report_log_ret_code(ret_max);
			continue;
		}
		vg = vg_read(cmd, vg_name, vg_uuid, read_flags, lockd_state, &error_flags, &error_vg);
		if (_ignore_vg(cmd, error_flags, error_vg, vg_name, arg_vgnames, read_flags, &skip, &notfound)) {
			stack;
			ret_max = ECMD_FAILED;
			report_log_ret_code(ret_max);
			if (error_vg)
				unlock_and_release_vg(cmd, error_vg, vg_name);
			goto endvg;
		}
		if (error_vg)
			unlock_and_release_vg(cmd, error_vg, vg_name);
		if (skip || notfound)
			goto endvg;

		ret = process_each_lv_in_vg(cmd, vg, &lvnames, tags_arg, 0,
					    handle, check_single_lv, process_single_lv);
		if (ret != ECMD_PROCESSED)
			stack;
		report_log_ret_code(ret);
		if (ret > ret_max)
			ret_max = ret;

		unlock_vg(cmd, vg, vg_name);
endvg:
		release_vg(vg);
		if (!lockd_vg(cmd, vg_name, "un", 0, &lockd_state))
			stack;
		log_set_report_object_name_and_id(NULL, NULL);
	}
	do_report_ret_code = 0;
out:
	if (do_report_ret_code)
		report_log_ret_code(ret_max);
	log_restore_report_state(saved_log_report_state);
	return ret_max;
}

/*
 * Call process_single_lv() for each LV selected by the command line arguments.
 */
int process_each_lv(struct cmd_context *cmd,
		    int argc, char **argv,
		    const char *one_vgname, const char *one_lvname,
		    uint32_t read_flags,
		    struct processing_handle *handle,
		    check_single_lv_fn_t check_single_lv,
		    process_single_lv_fn_t process_single_lv)
{
	log_report_t saved_log_report_state = log_get_report_state();
	int handle_supplied = handle != NULL;
	struct dm_list arg_tags;		/* str_list */
	struct dm_list arg_vgnames;		/* str_list */
	struct dm_list arg_lvnames;		/* str_list */
	struct dm_list vgnameids_on_system;	/* vgnameid_list */
	struct dm_list vgnameids_to_process;	/* vgnameid_list */
	int enable_all_vgs = (cmd->cname->flags & ALL_VGS_IS_DEFAULT);
	int process_all_vgs_on_system = 0;
	int ret_max = ECMD_PROCESSED;
	int ret;

	log_set_report_object_type(LOG_REPORT_OBJECT_TYPE_LV);

	/* Disable error in vg_read so we can print it from ignore_vg. */
	cmd->vg_read_print_access_error = 0;

	dm_list_init(&arg_tags);
	dm_list_init(&arg_vgnames);
	dm_list_init(&arg_lvnames);
	dm_list_init(&vgnameids_on_system);
	dm_list_init(&vgnameids_to_process);

	/*
	 * Find any LVs, VGs or tags explicitly provided on the command line.
	 */
	if (cmd->cname->flags & GET_VGNAME_FROM_OPTIONS)
		ret = _get_arg_lvnames_using_options(cmd, argc, argv, &arg_vgnames, &arg_lvnames, &arg_tags);
	else
		ret = _get_arg_lvnames(cmd, argc, argv, one_vgname, one_lvname, &arg_vgnames, &arg_lvnames, &arg_tags);

	if (ret != ECMD_PROCESSED) {
		ret_max = ret;
		goto_out;
	}

	if (!handle && !(handle = init_processing_handle(cmd, NULL))) {
		ret_max = ECMD_FAILED;
		goto_out;
	}

	if (handle->internal_report_for_select && !handle->selection_handle &&
	    !init_selection_handle(cmd, handle, LVS)) {
		ret_max = ECMD_FAILED;
		goto_out;
	}

	/*
	 * Process all VGs on the system when:
	 * . tags are specified and all VGs need to be read to
	 *   look for matching tags.
	 * . no VG names are specified and the command defaults
	 *   to processing all VGs when none are specified.
	 * . no VG names are specified and the select option needs
	 *   resolving.
	 */
	if (!dm_list_empty(&arg_tags))
		process_all_vgs_on_system = 1;
	else if (dm_list_empty(&arg_vgnames) && enable_all_vgs)
		process_all_vgs_on_system = 1;
	else if (dm_list_empty(&arg_vgnames) && handle->internal_report_for_select)
		process_all_vgs_on_system = 1;

	/*
	 * Needed for a current listing of the global VG namespace.
	 */
	if (process_all_vgs_on_system && !lock_global(cmd, "sh")) {
		ret_max = ECMD_FAILED;
		goto_out;
	}

	/*
	 * Scan all devices to populate lvmcache with initial
	 * list of PVs and VGs.
	 */
	lvmcache_label_scan(cmd);

	/*
	 * A list of all VGs on the system is needed when:
	 * . processing all VGs on the system
	 * . A VG name is specified which may refer to one
	 *   of multiple VGs on the system with that name.
	 */
	log_very_verbose("Obtaining the complete list of VGs before processing their LVs");

	if (!lvmcache_get_vgnameids(cmd, &vgnameids_on_system, NULL, 0)) {
		ret_max = ECMD_FAILED;
		goto_out;
	}

	if (!dm_list_empty(&arg_vgnames)) {
		/* This may remove entries from arg_vgnames or vgnameids_on_system. */
		ret = _resolve_duplicate_vgnames(cmd, &arg_vgnames, &vgnameids_on_system);
		if (ret > ret_max)
			ret_max = ret;
		if (dm_list_empty(&arg_vgnames) && dm_list_empty(&arg_tags)) {
			ret_max = ECMD_FAILED;
			goto_out;
		}
	}

	if (dm_list_empty(&arg_vgnames) && dm_list_empty(&vgnameids_on_system)) {
		/* FIXME Should be log_print, but suppressed for reporting cmds */
		log_verbose("No volume groups found.");
		ret_max = ECMD_PROCESSED;
		goto out;
	}

	if (dm_list_empty(&arg_vgnames))
		read_flags |= READ_OK_NOTFOUND;

	/*
	 * When processing all VGs, vgnameids_on_system simply becomes
	 * vgnameids_to_process.
	 * When processing only specified VGs, then for each item in
	 * arg_vgnames, move the corresponding entry from
	 * vgnameids_on_system to vgnameids_to_process.
	 */
	if (process_all_vgs_on_system)
		dm_list_splice(&vgnameids_to_process, &vgnameids_on_system);
	else
		_choose_vgs_to_process(cmd, &arg_vgnames, &vgnameids_on_system, &vgnameids_to_process);

	ret = _process_lv_vgnameid_list(cmd, read_flags, &vgnameids_to_process, &arg_vgnames, &arg_lvnames,
					&arg_tags, handle, check_single_lv, process_single_lv);

	if (ret > ret_max)
		ret_max = ret;
out:
	if (!handle_supplied)
		destroy_processing_handle(cmd, handle);

	log_restore_report_state(saved_log_report_state);
	return ret_max;
}

static int _get_arg_pvnames(struct cmd_context *cmd,
			    int argc, char **argv,
			    struct dm_list *arg_pvnames,
			    struct dm_list *arg_tags)
{
	int opt = 0;
	char *at_sign, *tagname;
	char *arg_name;
	int ret_max = ECMD_PROCESSED;

	for (; opt < argc; opt++) {
		arg_name = argv[opt];

		dm_unescape_colons_and_at_signs(arg_name, NULL, &at_sign);
		if (at_sign && (at_sign == arg_name)) {
			tagname = at_sign + 1;

			if (!validate_tag(tagname)) {
				log_error("Skipping invalid tag %s.", tagname);
				if (ret_max < EINVALID_CMD_LINE)
					ret_max = EINVALID_CMD_LINE;
				continue;
			}
			if (!str_list_add(cmd->mem, arg_tags,
					  dm_pool_strdup(cmd->mem, tagname))) {
				log_error("strlist allocation failed.");
				return ECMD_FAILED;
			}
			continue;
		}

		if (!str_list_add(cmd->mem, arg_pvnames,
				  dm_pool_strdup(cmd->mem, arg_name))) {
			log_error("strlist allocation failed.");
			return ECMD_FAILED;
		}
	}

	return ret_max;
}

static int _get_arg_devices(struct cmd_context *cmd,
			    struct dm_list *arg_pvnames,
			    struct dm_list *arg_devices)
{
	struct dm_str_list *sl;
	struct device_id_list *dil;
	int ret_max = ECMD_PROCESSED;

	dm_list_iterate_items(sl, arg_pvnames) {
		if (!(dil = dm_pool_zalloc(cmd->mem, sizeof(*dil)))) {
			log_error("device_id_list alloc failed.");
			return ECMD_FAILED;
		}

		if (!(dil->dev = dev_cache_get(cmd, sl->str, cmd->filter))) {
			log_error("Cannot use %s: %s", sl->str, devname_error_reason(sl->str));
			ret_max = EINIT_FAILED;
		} else {
			memcpy(dil->pvid, dil->dev->pvid, ID_LEN);
			dm_list_add(arg_devices, &dil->list);
		}
	}

	return ret_max;
}

static int _device_list_remove(struct dm_list *devices, struct device *dev)
{
	struct device_id_list *dil;

	dm_list_iterate_items(dil, devices) {
		if (dil->dev == dev) {
			dm_list_del(&dil->list);
			return 1;
		}
	}

	return 0;
}

static struct device_id_list *_device_list_find_dev(struct dm_list *devices, struct device *dev)
{
	struct device_id_list *dil;

	dm_list_iterate_items(dil, devices) {
		if (dil->dev == dev)
			return dil;
	}

	return NULL;
}

/* Process devices that are not PVs. */
static int _process_other_devices(struct cmd_context *cmd,
				  struct processing_handle *handle,
				  process_single_pv_fn_t process_single_pv)
{
	struct dev_iter *iter;
	struct physical_volume pv_dummy;
	struct physical_volume *pv;
	struct device *dev;
	int failed = 0;
	int ret;

	log_debug("Processing devices that are not PVs");

	/*
	 * We want devices here that passed filters during
	 * label_scan but were found to not be PVs.
	 *
	 * No filtering used in iter, DEV_SCAN_FOUND_NOLABEL
	 * was set by label_scan which did filtering.
	 */
	if (!(iter = dev_iter_create(NULL, 0)))
		return_0;

	while ((dev = dev_iter_get(cmd, iter))) {
		if (sigint_caught()) {
			failed = 1;
			break;
		}

		if (!(dev->flags & DEV_SCAN_FOUND_NOLABEL))
			continue;

		/*
		 * Pretend that each device is a PV with dummy values.
		 * FIXME Formalise this extension or find an alternative.
		 */
		memset(&pv_dummy, 0, sizeof(pv_dummy));
		dm_list_init(&pv_dummy.tags);
		dm_list_init(&pv_dummy.segments);
		pv_dummy.dev = dev;
		pv = &pv_dummy;

		log_very_verbose("Processing device %s.", dev_name(dev));

		ret = process_single_pv(cmd, NULL, pv, handle);
		if (ret != ECMD_PROCESSED)
			failed = 1;
	}

	dev_iter_destroy(iter);

	return failed ? 0 : 1;
}

static int _process_duplicate_pvs(struct cmd_context *cmd,
				  struct dm_list *arg_devices,
				  int process_other_devices,
				  struct processing_handle *handle,
				  process_single_pv_fn_t process_single_pv)
{
	struct device_id_list *dil;
	struct device_list *devl;
	struct dm_list unused_duplicate_devs;
	struct lvmcache_info *info;
	const char *vgname;
	const char *vgid;
	int failed = 0;
	int ret;

	struct physical_volume dummy_pv = {
		.pe_size = 1,
		.tags = DM_LIST_HEAD_INIT(dummy_pv.tags),
		.segments = DM_LIST_HEAD_INIT(dummy_pv.segments),
	};

	struct format_instance dummy_fid = {
		.metadata_areas_in_use = DM_LIST_HEAD_INIT(dummy_fid.metadata_areas_in_use),
		.metadata_areas_ignored = DM_LIST_HEAD_INIT(dummy_fid.metadata_areas_ignored),
	};

	struct volume_group dummy_vg = {
		.cmd = cmd,
		.vgmem = cmd->mem,
		.extent_size = 1,
		.fid = &dummy_fid,
		.name = "",
		.system_id = (char *) "",
		.pvs = DM_LIST_HEAD_INIT(dummy_vg.pvs),
		.lvs = DM_LIST_HEAD_INIT(dummy_vg.lvs),
		.historical_lvs = DM_LIST_HEAD_INIT(dummy_vg.historical_lvs),
		.tags = DM_LIST_HEAD_INIT(dummy_vg.tags),
	};

	dm_list_init(&unused_duplicate_devs);

	if (!lvmcache_get_unused_duplicates(cmd, &unused_duplicate_devs))
		return_0;

	dm_list_iterate_items(devl, &unused_duplicate_devs) {
		/* Duplicates are displayed if -a is used or the dev is named as an arg. */

		if ((dil = _device_list_find_dev(arg_devices, devl->dev)))
			_device_list_remove(arg_devices, devl->dev);

		if (!process_other_devices && !dil)
			continue;

		if (!(cmd->cname->flags & ENABLE_DUPLICATE_DEVS))
			continue;

		/*
		 * Use the cached VG from the preferred device for the PV,
		 * the vg is only used to display the VG name.
		 *
		 * This VG from lvmcache was not read from the duplicate
		 * dev being processed here, but from the preferred dev
		 * in lvmcache.
		 *
		 * When a duplicate PV is displayed, the reporting fields
		 * that come from the VG metadata are not shown, because
		 * the dev is not a part of the VG, the dev for the
		 * preferred PV is (also the VG metadata in lvmcache is
		 * not from the duplicate dev, but from the preferred dev).
		 */

		log_very_verbose("Processing duplicate device %s.", dev_name(devl->dev));

		/*
		 * Don't pass dev to lvmcache_info_from_pvid because we are
		 * looking for the chosen/preferred dev for this pvid.
		 */
		if (!(info = lvmcache_info_from_pvid(devl->dev->pvid, NULL, 0))) {
			log_error(INTERNAL_ERROR "No info for pvid");
			return 0;
		}

		vgname = lvmcache_vgname_from_info(info);
		vgid = vgname ? lvmcache_vgid_from_vgname(cmd, vgname) : NULL;

		dummy_pv.dev = devl->dev;
		dummy_pv.fmt = lvmcache_fmt_from_info(info);
		dummy_vg.name = vgname ?: "";

		if (vgid)
			memcpy(&dummy_vg.id, vgid, ID_LEN);
		else
			memset(&dummy_vg.id, 0, sizeof(dummy_vg.id));

		ret = process_single_pv(cmd, &dummy_vg, &dummy_pv, handle);
		if (ret != ECMD_PROCESSED)
			failed = 1;

		if (sigint_caught())
			return_0;
	}

	return failed ? 0 : 1;
}

static int _process_pvs_in_vg(struct cmd_context *cmd,
			      struct volume_group *vg,
			      struct dm_list *arg_devices,
			      struct dm_list *arg_tags,
			      int process_all_pvs,
			      int skip,
			      uint32_t error_flags,
			      struct processing_handle *handle,
			      process_single_pv_fn_t process_single_pv)
{
	log_report_t saved_log_report_state = log_get_report_state();
	char pv_uuid[64] __attribute__((aligned(8)));
	char vg_uuid[64] __attribute__((aligned(8)));
	int handle_supplied = handle != NULL;
	struct physical_volume *pv;
	struct pv_list *pvl;
	struct device_id_list *dil;
	const char *pv_name;
	int process_pv;
	int do_report_ret_code = 1;
	int ret_max = ECMD_PROCESSED;
	int ret = 0;

	log_set_report_object_type(LOG_REPORT_OBJECT_TYPE_PV);

	vg_uuid[0] = '\0';
	if (!id_write_format(&vg->id, vg_uuid, sizeof(vg_uuid)))
		stack;

	if (!handle && (!(handle = init_processing_handle(cmd, NULL)))) {
		ret_max = ECMD_FAILED;
		goto_out;
	}

	if (handle->internal_report_for_select && !handle->selection_handle &&
	    !init_selection_handle(cmd, handle, PVS)) {
		ret_max = ECMD_FAILED;
		goto_out;
	}

	if (!is_orphan_vg(vg->name))
		log_set_report_object_group_and_group_id(vg->name, vg_uuid);

	dm_list_iterate_items(pvl, &vg->pvs) {
		pv = pvl->pv;
		pv_name = pv_dev_name(pv);
		pv_uuid[0] = '\0';
		if (!id_write_format(&pv->id, pv_uuid, sizeof(pv_uuid)))
			stack;

		log_set_report_object_name_and_id(pv_name, pv_uuid);

		if (sigint_caught()) {
			ret_max = ECMD_FAILED;
			goto_out;
		}

		process_pv = process_all_pvs;
		dil = NULL;

		/* Remove each arg_devices entry as it is processed. */
		if (arg_devices && !dm_list_empty(arg_devices)) {
			if ((dil = _device_list_find_dev(arg_devices, pv->dev)))
				_device_list_remove(arg_devices, dil->dev);
		}
		if (!process_pv && dil)
			process_pv = 1;

		if (!process_pv && !dm_list_empty(arg_tags) &&
		    str_list_match_list(arg_tags, &pv->tags, NULL))
			process_pv = 1;

		process_pv = process_pv && select_match_pv(cmd, handle, vg, pv) && _select_matches(handle);
		/*
		 * The command has asked to process a specific PV
		 * named on the command line, but the VG containing
		 * that PV cannot be accessed. In this case report
		 * and return an error. If the inaccessible PV is
		 * not explicitly named on the command line, it is
		 * silently skipped.
		 */
		if (process_pv && skip && dil && error_flags) {
			if (error_flags & FAILED_EXPORTED)
				log_error("Cannot use PV %s in exported VG %s.", pv_name, vg->name);
			if (error_flags & FAILED_SYSTEMID)
				log_error("Cannot use PV %s in foreign VG %s.", pv_name, vg->name);
			if (error_flags & (FAILED_LOCK_TYPE | FAILED_LOCK_MODE))
				log_error("Cannot use PV %s in shared VG %s.", pv_name, vg->name);
			ret_max = ECMD_FAILED;
		}

		if (process_pv) {
			if (skip)
				log_verbose("Skipping PV %s in VG %s.", pv_name, vg->name);
			else
				log_very_verbose("Processing PV %s in VG %s.", pv_name, vg->name);

			if (!skip) {
				ret = process_single_pv(cmd, vg, pv, handle);
				if (ret != ECMD_PROCESSED)
					stack;
				report_log_ret_code(ret);
				if (ret > ret_max)
					ret_max = ret;
			}
		}

		/*
		 * When processing only specific PVs, we can quit once they've all been found.
		 */
		if (!process_all_pvs && dm_list_empty(arg_tags) &&
		    (!arg_devices || dm_list_empty(arg_devices)))
			break;
		log_set_report_object_name_and_id(NULL, NULL);
	}

	do_report_ret_code = 0;
out:
	if (do_report_ret_code)
		report_log_ret_code(ret_max);
	log_set_report_object_name_and_id(NULL, NULL);
	log_set_report_object_group_and_group_id(NULL, NULL);
	if (!handle_supplied)
		destroy_processing_handle(cmd, handle);
	log_restore_report_state(saved_log_report_state);
	return ret_max;
}

/*
 * Iterate through all PVs in each listed VG. Process a PV if
 * its dev or tag matches arg_devices or arg_tags. If both
 * arg_devices and arg_tags are empty, then process all PVs.
 * No PV should be processed more than once.
 *
 * Each PV is removed from arg_devices when it is processed.
 * Any names remaining in arg_devices were not found, and
 * should produce an error.
 */
static int _process_pvs_in_vgs(struct cmd_context *cmd, uint32_t read_flags,
			       struct dm_list *all_vgnameids,
			       struct dm_list *arg_devices,
			       struct dm_list *arg_tags,
			       int process_all_pvs,
			       struct processing_handle *handle,
			       process_single_pv_fn_t process_single_pv)
{
	log_report_t saved_log_report_state = log_get_report_state();
	char uuid[64] __attribute__((aligned(8)));
	struct volume_group *vg;
	struct volume_group *error_vg;
	struct vgnameid_list *vgnl;
	const char *vg_name;
	const char *vg_uuid;
	uint32_t lockd_state = 0;
	uint32_t error_flags = 0;
	int ret_max = ECMD_PROCESSED;
	int ret;
	int skip;
	int notfound;
	int do_report_ret_code = 1;
	log_set_report_object_type(LOG_REPORT_OBJECT_TYPE_VG);

	dm_list_iterate_items(vgnl, all_vgnameids) {
		vg_name = vgnl->vg_name;
		vg_uuid = vgnl->vgid;
		skip = 0;
		notfound = 0;
		uuid[0] = '\0';
		if (is_orphan_vg(vg_name)) {
			log_set_report_object_type(LOG_REPORT_OBJECT_TYPE_ORPHAN);
			log_set_report_object_name_and_id(vg_name + sizeof(VG_ORPHANS), uuid);
		} else {
			if (vg_uuid && !id_write_format((const struct id*)vg_uuid, uuid, sizeof(uuid)))
				stack;
			log_set_report_object_name_and_id(vg_name, uuid);
		}

		if (sigint_caught()) {
			ret_max = ECMD_FAILED;
			goto_out;
		}
		if (!lockd_vg(cmd, vg_name, NULL, 0, &lockd_state)) {
			ret_max = ECMD_FAILED;
			report_log_ret_code(ret_max);
			continue;
		}

		log_debug("Processing PVs in VG %s", vg_name);
		error_flags = 0;
In this case, the outdated PV needs have its outdated metadata removed and the PV used flag needs to be cleared. This repair will be done by the subsequent repair command. It is also done if vgremove is run on the VG. MISSING PVs ----------- When a device is missing, most commands will refuse to modify the VG. This is the simple case. More complicated is when a command is allowed to modify the VG while it is missing a device. When a VG is written while a device is missing for one of it's PVs, the VG metadata is written to disk with the MISSING flag on the PV with the missing device. When the VG is next used, it is treated as if the PV with the MISSING flag still has a missing device, even if that device has reappeared. If all LVs that were using a PV with the MISSING flag are removed or repaired so that the MISSING PV is no longer used, then the next time the VG metadata is written, the MISSING flag will be dropped. Alternative methods of clearing the MISSING flag are: vgreduce --removemissing will remove PVs with missing devices, or PVs with the MISSING flag where the device has reappeared. vgextend --restoremissing will clear the MISSING flag on PVs where the device has reappeared, allowing the VG to be used normally. This must be done with caution since the reappeared device may have old data that is inconsistent with data on other PVs. Bad mda repair -------------- The new command: vgck --updatemetadata VG first uses vg_write to repair old metadata, and other basic issues mentioned above (old metadata, outdated PVs, pv_header flags, MISSING_PV flags). It will also go further and repair bad metadata: . text metadata that has a bad checksum . text metadata that is not parsable . corrupt mda_header checksum and version fields (To keep a clean diff, #if 0 is added around functions that are replaced by new code. These commented functions are removed by the following commit.)
2019-05-24 20:04:37 +03:00
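The read behaviour described in the commit message above (skip bad copies, use the copy with the highest seqno, remember which good copies are old) can be sketched in isolation. This is a minimal illustration only, not the lvm2 implementation: `struct mda_copy`, its fields, and `pick_latest()` are hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified record of one metadata copy (not an lvm2 type). */
struct mda_copy {
	const char *dev_name; /* device holding this copy */
	int bad;              /* nonzero if the scan found the copy corrupt */
	unsigned seqno;       /* metadata sequence number, meaningful if !bad */
	int is_old;           /* set by pick_latest() when seqno is behind */
};

/*
 * Return the index of the good copy with the highest seqno,
 * or -1 if no good copy exists. Good copies with a smaller
 * seqno are flagged as old; nothing is written or repaired.
 */
static int pick_latest(struct mda_copy *copies, size_t count)
{
	int best = -1;
	size_t i;

	assert(copies != NULL || count == 0);

	/* First pass: find the newest good copy, ignoring bad ones. */
	for (i = 0; i < count; i++) {
		if (copies[i].bad)
			continue;
		if (best < 0 || copies[i].seqno > copies[best].seqno)
			best = (int) i;
	}

	/* Second pass: record which good copies are behind the newest. */
	for (i = 0; best >= 0 && i < count; i++)
		copies[i].is_old = !copies[i].bad &&
				   copies[i].seqno < copies[best].seqno;

	return best;
}
```

The sketch deliberately separates "choose a copy" from "repair", mirroring the split the commit describes: the caller gets a usable copy plus a record of old copies that a later write could bring up to date.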
vg = vg_read(cmd, vg_name, vg_uuid, read_flags, lockd_state, &error_flags, &error_vg);
if (_ignore_vg(cmd, error_flags, error_vg, vg_name, NULL, read_flags, &skip, &notfound)) {
stack;
ret_max = ECMD_FAILED;
report_log_ret_code(ret_max);
2015-03-05 23:00:44 +03:00
if (!skip)
goto endvg;
exported vg handling

The exported VG checking/enforcement was scattered and
inconsistent. This centralizes it and makes it consistent,
following the existing approach for foreign and shared
VGs/PVs, which are very similar to exported VGs/PVs.

The access policy that now applies to foreign/shared/exported
VGs/PVs, is that if a foreign/shared/exported VG/PV is named
on the command line (i.e. explicitly requested by the user),
and the command is not permitted to operate on it because it
is foreign/shared/exported, then an access error is reported
and the command exits with an error. But, if the command is
processing all VGs/PVs, and happens to come across a
foreign/shared/exported VG/PV (that is not explicitly named on
the command line), then the command silently skips it and does
not produce an error.

A command using tags or --select handles inaccessible VGs/PVs
the same way as a command processing all VGs/PVs, and will
not report/return errors if these inaccessible VGs/PVs exist.

The new policy fixes the exit codes on a somewhat random set of
commands that previously exited with an error if they were
looking at all VGs/PVs and an exported VG existed on the system.

There should be no change to which commands are allowed/disallowed
on exported VGs/PVs.

Certain LV commands (lvs/lvdisplay/lvscan) would previously not
display LVs from an exported VG (for unknown reasons). This has
not changed. The lvm fullreport command would previously report
info about an exported VG but not about the LVs in it. This
has changed to include all info from the exported VG.
2019-06-21 21:37:11 +03:00
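The access policy in the commit message above reduces to a two-input decision: whether the VG is accessible, and whether the user named it explicitly. The following is a minimal sketch of that decision only; the enums and `decide_vg_action()` are hypothetical names for illustration and are not part of lvm2, whose real check takes more inputs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of a VG's accessibility. */
enum vg_access { VG_ACCESSIBLE, VG_FOREIGN, VG_SHARED, VG_EXPORTED };

/* Hypothetical outcome for one VG during command processing. */
enum vg_action { VG_PROCESS, VG_SKIP_SILENT, VG_FAIL };

/*
 * named_on_cmdline is true only when the user explicitly requested
 * this VG; processing all VGs, tags, or --select counts as false.
 */
static enum vg_action decide_vg_action(enum vg_access access,
				       bool named_on_cmdline)
{
	assert(access >= VG_ACCESSIBLE && access <= VG_EXPORTED);

	if (access == VG_ACCESSIBLE)
		return VG_PROCESS;

	/* Inaccessible: report an error only if explicitly requested. */
	return named_on_cmdline ? VG_FAIL : VG_SKIP_SILENT;
}
```

Treating foreign, shared, and exported VGs through one code path is the point of the sketch: the outcome depends on how the VG was selected, not on which of the three restrictions applies.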
/* Drop through to eliminate unpermitted PVs from the devices list */
}
if (notfound)
goto endvg;
/*
* Don't call "continue" when skip is set, because we need to remove
* error_vg->pvs entries from devices list.
*/
ret = _process_pvs_in_vg(cmd, vg ? vg : error_vg, arg_devices, arg_tags,
process_all_pvs, skip, error_flags,
handle, process_single_pv);
if (ret != ECMD_PROCESSED)
stack;
report_log_ret_code(ret);
2019-05-24 20:04:37 +03:00
		if (ret > ret_max)
			ret_max = ret;

		if (!skip && vg)
			unlock_vg(cmd, vg, vg->name);
endvg:
		if (error_vg)
			unlock_and_release_vg(cmd, error_vg, vg_name);
		release_vg(vg);
		if (!lockd_vg(cmd, vg_name, "un", 0, &lockd_state))
			stack;

		/* Quit early when possible. */
		if (!process_all_pvs && dm_list_empty(arg_tags) && dm_list_empty(arg_devices)) {
			do_report_ret_code = 0;
			goto out;
		}

		log_set_report_object_name_and_id(NULL, NULL);
	}

	do_report_ret_code = 0;
out:
	if (do_report_ret_code)
		report_log_ret_code(ret_max);
	log_restore_report_state(saved_log_report_state);
	return ret_max;
}

int process_each_pv(struct cmd_context *cmd,
		    int argc, char **argv, const char *only_this_vgname,
		    int all_is_set, uint32_t read_flags,
		    struct processing_handle *handle,
		    process_single_pv_fn_t process_single_pv)
{
	log_report_t saved_log_report_state = log_get_report_state();
	struct dm_list arg_tags;	/* str_list */
	struct dm_list arg_pvnames;	/* str_list */
	struct dm_list arg_devices;	/* device_id_list */
	struct dm_list all_vgnameids;	/* vgnameid_list */
	struct device_id_list *dil;
	int process_all_pvs;
	int process_other_devices;
	int ret_max = ECMD_PROCESSED;
	int ret;

	log_set_report_object_type(LOG_REPORT_OBJECT_TYPE_PV);
	log_debug("Processing each PV");

	/*
	 * When processing a specific VG name, warn if it's inconsistent and
	 * print an error if it's not found.  Otherwise we're processing all
	 * VGs, in which case the command doesn't care if the VG is inconsistent
	 * or not found; it just wants to skip that VG.  (It may be not found
	 * if it was removed between creating the list of all VGs and then
	 * processing each VG.)
	 */
	if (only_this_vgname)
		read_flags |= READ_WARN_INCONSISTENT;
	else
		read_flags |= READ_OK_NOTFOUND;

	/* Disable error in vg_read so we can print it from ignore_vg. */
	cmd->vg_read_print_access_error = 0;

	dm_list_init(&arg_tags);
	dm_list_init(&arg_pvnames);
	dm_list_init(&arg_devices);
	dm_list_init(&all_vgnameids);

	/*
	 * Create two lists from argv:
	 * arg_pvnames: pvs explicitly named in argv
	 * arg_tags: tags explicitly named in argv
	 *
	 * Then convert arg_pvnames, which are free-form, user-specified,
	 * names/paths into arg_devices which can be used to match below.
	 */
	if ((ret = _get_arg_pvnames(cmd, argc, argv, &arg_pvnames, &arg_tags)) != ECMD_PROCESSED) {
		ret_max = ret;
		goto_out;
	}

	if ((cmd->cname->flags & DISALLOW_TAG_ARGS) && !dm_list_empty(&arg_tags)) {
		log_error("Tags cannot be used with this command.");
		ret_max = ECMD_FAILED;
		goto out;	/* restore the saved log report state before returning */
	}

	process_all_pvs = dm_list_empty(&arg_pvnames) && dm_list_empty(&arg_tags);

	process_other_devices = process_all_pvs && (cmd->cname->flags & ENABLE_ALL_DEVS) && all_is_set;

	/* Needed for a current listing of the global VG namespace. */
	if (!only_this_vgname && !lock_global(cmd, "sh")) {
		ret_max = ECMD_FAILED;
		goto_out;
	}

	if (!(read_flags & PROCESS_SKIP_SCAN))
		lvmcache_label_scan(cmd);

	if (!lvmcache_get_vgnameids(cmd, &all_vgnameids, only_this_vgname, 1)) {
		/* ret still holds ECMD_PROCESSED here, so report the failure explicitly. */
		ret_max = ECMD_FAILED;
		goto_out;
	}

	if ((ret = _get_arg_devices(cmd, &arg_pvnames, &arg_devices)) != ECMD_PROCESSED) {
		/* _get_arg_devices reports EINIT_FAILED for any PV names not found. */
		ret_max = ret;
		if (ret_max == ECMD_FAILED)
			goto_out;
		ret_max = ECMD_FAILED; /* For now, all failures are reported as ECMD_FAILED. */
	}

	ret = _process_pvs_in_vgs(cmd, read_flags, &all_vgnameids,
				  &arg_devices, &arg_tags, process_all_pvs,
				  handle, process_single_pv);
	if (ret != ECMD_PROCESSED)
		stack;
	if (ret > ret_max)
		ret_max = ret;

	/*
	 * Process the list of unused duplicate devs to display duplicate PVs
	 * in two cases: 1. pvs -a (which has traditionally included duplicate
	 * PVs in addition to the expected non-PV devices), 2. pvs <devname>
	 * (duplicate dev is named on the command line.)
	 */
	if (process_other_devices || !dm_list_empty(&arg_devices)) {
		if (!_process_duplicate_pvs(cmd, &arg_devices, process_other_devices, handle, process_single_pv))
			ret_max = ECMD_FAILED;
	}

	dm_list_iterate_items(dil, &arg_devices) {
		log_error("Failed to find physical volume \"%s\".", dev_name(dil->dev));
		ret_max = ECMD_FAILED;
	}

	/*
	 * pvs -a and pvdisplay -a want to show devices that are not PVs.
	 */
	if (process_other_devices) {
		if (!_process_other_devices(cmd, handle, process_single_pv))
			ret_max = ECMD_FAILED;
	}
out:
	log_restore_report_state(saved_log_report_state);
	return ret_max;
}

int process_each_pv_in_vg(struct cmd_context *cmd, struct volume_group *vg,
			  struct processing_handle *handle,
			  process_single_pv_fn_t process_single_pv)
{
	log_report_t saved_log_report_state = log_get_report_state();
	char pv_uuid[64] __attribute__((aligned(8)));
	char vg_uuid[64] __attribute__((aligned(8)));
	int whole_selected = 0;
	int ret_max = ECMD_PROCESSED;
	int ret;
	int do_report_ret_code = 1;
	struct pv_list *pvl;

	log_set_report_object_type(LOG_REPORT_OBJECT_TYPE_PV);

	vg_uuid[0] = '\0';
	if (!id_write_format(&vg->id, vg_uuid, sizeof(vg_uuid)))
		stack;

	if (!is_orphan_vg(vg->name))
		log_set_report_object_group_and_group_id(vg->name, vg_uuid);

	dm_list_iterate_items(pvl, &vg->pvs) {
		pv_uuid[0] = '\0';
		if (!id_write_format(&pvl->pv->id, pv_uuid, sizeof(pv_uuid)))
			stack;

		log_set_report_object_name_and_id(pv_dev_name(pvl->pv), pv_uuid);

		if (sigint_caught()) {
			ret_max = ECMD_FAILED;
			goto_out;
		}

		ret = process_single_pv(cmd, vg, pvl->pv, handle);
		_update_selection_result(handle, &whole_selected);
		if (ret != ECMD_PROCESSED)
			stack;

		report_log_ret_code(ret);

		if (ret > ret_max)
			ret_max = ret;

		log_set_report_object_name_and_id(NULL, NULL);
	}

	_set_final_selection_result(handle, whole_selected);
	do_report_ret_code = 0;
out:
	if (do_report_ret_code)
		report_log_ret_code(ret_max);
	log_restore_report_state(saved_log_report_state);
	return ret_max;
}

int lvremove_single(struct cmd_context *cmd, struct logical_volume *lv,
		    struct processing_handle *handle __attribute__((unused)))
{
	/*
	 * A single --force is equivalent to a single --yes.
	 * Even multiple --yes are equivalent to a single --force.
	 * When -ff is required, it cannot be replaced with -f -y.
	 *
	 * The GNU "?:" shorthand below uses the --force count when it is
	 * non-zero and otherwise falls back to the --yes setting.
	 */
	force_t force = (force_t) arg_count(cmd, force_ARG)
		? : (arg_is_set(cmd, yes_ARG) ? DONT_PROMPT : PROMPT);

	if (!lv_remove_with_dependencies(cmd, lv, force, 0))
		return_ECMD_FAILED;

	return ECMD_PROCESSED;
}

int pvcreate_params_from_args(struct cmd_context *cmd, struct pvcreate_params *pp)
{
	pp->yes = arg_count(cmd, yes_ARG);
	pp->force = (force_t) arg_count(cmd, force_ARG);

	if (arg_int_value(cmd, labelsector_ARG, 0) >= LABEL_SCAN_SECTORS) {
		log_error("labelsector must be less than %lu.",
			  LABEL_SCAN_SECTORS);
		return 0;
	}

	pp->pva.label_sector = arg_int64_value(cmd, labelsector_ARG,
					       DEFAULT_LABELSECTOR);

	if (arg_is_set(cmd, metadataignore_ARG))
		pp->pva.metadataignore = arg_int_value(cmd, metadataignore_ARG,
						       DEFAULT_PVMETADATAIGNORE);
	else
		pp->pva.metadataignore = find_config_tree_bool(cmd, metadata_pvmetadataignore_CFG, NULL);

	if (arg_is_set(cmd, pvmetadatacopies_ARG) &&
	    !arg_int_value(cmd, pvmetadatacopies_ARG, -1) &&
	    pp->pva.metadataignore) {
		log_error("metadataignore only applies to metadatacopies > 0.");
		return 0;
	}

	pp->zero = arg_int_value(cmd, zero_ARG, 1);

	if (arg_sign_value(cmd, dataalignment_ARG, SIGN_NONE) == SIGN_MINUS) {
		log_error("Physical volume data alignment may not be negative.");
		return 0;
	}
	pp->pva.data_alignment = arg_uint64_value(cmd, dataalignment_ARG, UINT64_C(0));

	if (pp->pva.data_alignment > UINT32_MAX) {
		log_error("Physical volume data alignment is too big.");
		return 0;
	}

	if (arg_sign_value(cmd, dataalignmentoffset_ARG, SIGN_NONE) == SIGN_MINUS) {
		log_error("Physical volume data alignment offset may not be negative.");
		return 0;
	}
	pp->pva.data_alignment_offset = arg_uint64_value(cmd, dataalignmentoffset_ARG, UINT64_C(0));

	if (pp->pva.data_alignment_offset > UINT32_MAX) {
		log_error("Physical volume data alignment offset is too big.");
		return 0;
	}

	/*
	 * If an alignment or offset was requested and pe_start is already
	 * fixed (not left to be calculated), the request is only kept when
	 * pe_start modulo the alignment (or pe_start itself when no alignment
	 * is set) equals the requested offset; otherwise it is dropped.
	 */
	if ((pp->pva.data_alignment + pp->pva.data_alignment_offset) &&
	    (pp->pva.pe_start != PV_PE_START_CALC)) {
		if ((pp->pva.data_alignment ? pp->pva.pe_start % pp->pva.data_alignment : pp->pva.pe_start) != pp->pva.data_alignment_offset) {
			log_warn("WARNING: Ignoring data alignment %s"
				 " incompatible with restored pe_start value %s.",
				 display_size(cmd, pp->pva.data_alignment + pp->pva.data_alignment_offset),
				 display_size(cmd, pp->pva.pe_start));
			pp->pva.data_alignment = 0;
			pp->pva.data_alignment_offset = 0;
		}
	}
if (arg_sign_value(cmd, metadatasize_ARG, SIGN_NONE) == SIGN_MINUS) {
log_error("Metadata size may not be negative.");
return 0;
}
if (arg_sign_value(cmd, bootloaderareasize_ARG, SIGN_NONE) == SIGN_MINUS) {
log_error("Bootloader area size may not be negative.");
return 0;
}
pp->pva.pvmetadatasize = arg_uint64_value(cmd, metadatasize_ARG, UINT64_C(0));
if (!pp->pva.pvmetadatasize) {
pp->pva.pvmetadatasize = find_config_tree_int(cmd, metadata_pvmetadatasize_CFG, NULL);
if (!pp->pva.pvmetadatasize)
pp->pva.pvmetadatasize = get_default_pvmetadatasize_sectors();
}
pp->pva.pvmetadatacopies = arg_int_value(cmd, pvmetadatacopies_ARG, -1);
if (pp->pva.pvmetadatacopies < 0)
pp->pva.pvmetadatacopies = find_config_tree_int(cmd, metadata_pvmetadatacopies_CFG, NULL);
pp->pva.ba_size = arg_uint64_value(cmd, bootloaderareasize_ARG, pp->pva.ba_size);
return 1;
}
enum {
PROMPT_PVCREATE_PV_IN_VG = 1,
PROMPT_PVREMOVE_PV_IN_VG = 2,
PROMPT_PVCREATE_DEV_SIZE = 4,
};
enum {
PROMPT_ANSWER_NO = 1,
PROMPT_ANSWER_YES = 2
};
/*
* When a prompt entry is created, save any strings or info
* in this struct that are needed for the prompt messages.
* The VG/PV structs will not be available when the prompt
* is run.
*/
struct pvcreate_prompt {
struct dm_list list;
uint32_t type;
uint64_t size;
uint64_t new_size;
const char *pv_name;
const char *vg_name;
struct device *dev;
int answer;
unsigned abort_command : 1;
unsigned vg_name_unknown : 1;
};
struct pvcreate_device {
struct dm_list list;
const char *name;
struct device *dev;
char pvid[ID_LEN + 1];
const char *vg_name;
int wiped;
unsigned is_not_pv : 1; /* device is not a PV */
unsigned is_orphan_pv : 1; /* device is an orphan PV */
unsigned is_vg_pv : 1; /* device is a PV used in a VG */
unsigned is_used_unknown_pv : 1; /* device is a PV used in an unknown VG */
};
/*
* If a PV is in a VG, and pvcreate or pvremove is run on it:
*
* pvcreate|pvremove -f : fails
* pvcreate|pvremove -y : fails
* pvcreate|pvremove -f -y : fails
* pvcreate|pvremove -ff : get y/n prompt
* pvcreate|pvremove -ff -y : succeeds
*
* FIXME: there are a lot of various phrasings used depending on the
* command and specific case. Find some similar way to phrase these.
*/
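/*
* A sketch of how the table above maps onto the checks below, assuming
* the usual convention that -ff sets pp->force to DONT_PROMPT_OVERRIDE
* and -y sets pp->yes (the option parsing itself is not shown here):
*
*   pp->force != DONT_PROMPT_OVERRIDE                -> answer no (fails)
*   pp->force == DONT_PROMPT_OVERRIDE, pp->yes       -> answer yes (succeeds)
*   pp->force == DONT_PROMPT_OVERRIDE, !pp->yes, ask -> y/n prompt
*/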
static void _check_pvcreate_prompt(struct cmd_context *cmd,
struct pvcreate_params *pp,
struct pvcreate_prompt *prompt,
int ask)
{
const char *vgname = prompt->vg_name ? prompt->vg_name : "<unknown>";
const char *pvname = prompt->pv_name;
int answer_yes = 0;
int answer_no = 0;
/* The VG name can be unknown when the PV is used but metadata is not available */
if (prompt->type & PROMPT_PVCREATE_PV_IN_VG) {
if (pp->force != DONT_PROMPT_OVERRIDE) {
answer_no = 1;
if (prompt->vg_name_unknown) {
log_error("PV %s is used by a VG but its metadata is missing.", pvname);
log_error("Can't initialize PV '%s' without -ff.", pvname);
} else if (!strcmp(command_name(cmd), "pvcreate")) {
log_error("Can't initialize physical volume \"%s\" of volume group \"%s\" without -ff", pvname, vgname);
} else {
log_error("Physical volume '%s' is already in volume group '%s'", pvname, vgname);
log_error("Unable to add physical volume '%s' to volume group '%s'", pvname, vgname);
}
} else if (pp->yes) {
answer_yes = 1;
} else if (ask) {
if (yes_no_prompt("Really INITIALIZE physical volume \"%s\" of volume group \"%s\" [y/n]? ", pvname, vgname) == 'n') {
answer_no = 1;
} else {
answer_yes = 1;
log_warn("WARNING: Forcing physical volume creation on %s of volume group \"%s\"", pvname, vgname);
}
}
}
if (prompt->type & PROMPT_PVCREATE_DEV_SIZE) {
if (pp->yes) {
log_warn("WARNING: Faking size of PV %s. Don't write outside real device.", pvname);
answer_yes = 1;
} else if (ask) {
if (prompt->new_size != prompt->size) {
if (yes_no_prompt("WARNING: %s: device size %s does not match requested size %s. Proceed? [y/n]: ", pvname,
display_size(cmd, prompt->size),
display_size(cmd, prompt->new_size)) == 'n') {
answer_no = 1;
} else {
answer_yes = 1;
log_warn("WARNING: Faking size of PV %s. Don't write outside real device.", pvname);
}
}
}
}
if (prompt->type & PROMPT_PVREMOVE_PV_IN_VG) {
if (pp->force != DONT_PROMPT_OVERRIDE) {
answer_no = 1;
if (prompt->vg_name_unknown)
log_error("PV %s is used by a VG but its metadata is missing.", pvname);
else
log_error("PV %s is used by VG %s so please use vgreduce first.", pvname, vgname);
log_error("(If you are certain you need pvremove, then confirm by using --force twice.)");
} else if (pp->yes) {
log_warn("WARNING: PV %s is used by VG %s.", pvname, vgname);
answer_yes = 1;
} else if (ask) {
log_warn("WARNING: PV %s is used by VG %s.", pvname, vgname);
if (yes_no_prompt("Really WIPE LABELS from physical volume \"%s\" of volume group \"%s\" [y/n]? ", pvname, vgname) == 'n')
answer_no = 1;
else
answer_yes = 1;
}
}
if (answer_yes && answer_no) {
log_warn("WARNING: prompt answer yes is overridden by prompt answer no.");
answer_yes = 0;
}
/*
* Having no answer is valid when not asking the user.
* The caller uses this to check whether all the prompts
* can be answered automatically without asking the user.
*/
if (!ask && !answer_yes && !answer_no)
return;
if (answer_no)
prompt->answer = PROMPT_ANSWER_NO;
else if (answer_yes)
prompt->answer = PROMPT_ANSWER_YES;
/*
* Mostly historical messages. Other messages above could be moved
* here to separate the answer logic from the messages.
*/
if ((prompt->type & (PROMPT_PVCREATE_DEV_SIZE | PROMPT_PVCREATE_PV_IN_VG)) &&
(prompt->answer == PROMPT_ANSWER_NO))
log_error("%s: physical volume not initialized.", pvname);
if ((prompt->type & PROMPT_PVREMOVE_PV_IN_VG) &&
(prompt->answer == PROMPT_ANSWER_NO))
log_error("%s: physical volume label not removed.", pvname);
if ((prompt->type & PROMPT_PVREMOVE_PV_IN_VG) &&
(prompt->answer == PROMPT_ANSWER_YES) &&
(pp->force == DONT_PROMPT_OVERRIDE))
log_warn("WARNING: Wiping physical volume label from %s of volume group \"%s\".", pvname, vgname);
}
static struct pvcreate_device *_pvcreate_list_find_dev(struct dm_list *devices, struct device *dev)
{
struct pvcreate_device *pd;
dm_list_iterate_items(pd, devices) {
if (pd->dev == dev)
return pd;
}
return NULL;
}
static struct pvcreate_device *_pvcreate_list_find_name(struct dm_list *devices, const char *name)
{
struct pvcreate_device *pd;
dm_list_iterate_items(pd, devices) {
if (!strcmp(pd->name, name))
return pd;
}
return NULL;
}
static int _pvcreate_check_used(struct cmd_context *cmd,
struct pvcreate_params *pp,
struct pvcreate_device *pd)
{
struct pvcreate_prompt *prompt;
uint64_t size = 0;
uint64_t new_size = 0;
int need_size_prompt = 0;
int need_vg_prompt = 0;
struct lvmcache_info *info;
const char *vgname;
log_debug("Checking %s for pvcreate %.32s.",
dev_name(pd->dev), pd->dev->pvid[0] ? pd->dev->pvid : "");
if (!pd->dev->pvid[0]) {
log_debug("Check pvcreate arg %s no PVID found", dev_name(pd->dev));
pd->is_not_pv = 1;
return 1;
}
/*
* Don't allow using a device with duplicates.
*/
if (lvmcache_pvid_in_unused_duplicates(pd->dev->pvid)) {
log_error("Cannot use device %s with duplicates.", dev_name(pd->dev));
dm_list_move(&pp->arg_fail, &pd->list);
return 0;
}
if (!(info = lvmcache_info_from_pvid(pd->dev->pvid, pd->dev, 0))) {
log_error("Failed to read lvm info for %s PVID %s.", dev_name(pd->dev), pd->dev->pvid);
dm_list_move(&pp->arg_fail, &pd->list);
return 0;
}
vgname = lvmcache_vgname_from_info(info);
/*
* What kind of device is this: an orphan PV, an uninitialized/unused
* device, a PV used in a VG.
*/
if (vgname && !is_orphan_vg(vgname)) {
/* Device is a PV used in a VG. */
log_debug("Check pvcreate arg %s found vg %s.", dev_name(pd->dev), vgname);
pd->is_vg_pv = 1;
pd->vg_name = dm_pool_strdup(cmd->mem, vgname);
} else if (!vgname || is_orphan_vg(vgname)) {
uint32_t ext_flags = lvmcache_ext_flags(info);
if (ext_flags & PV_EXT_USED) {
/* Device is used in an unknown VG. */
log_debug("Check pvcreate arg %s found EXT_USED flag.", dev_name(pd->dev));
pd->is_used_unknown_pv = 1;
} else {
/* Device is an orphan PV. */
log_debug("Check pvcreate arg %s is orphan.", dev_name(pd->dev));
pd->is_orphan_pv = 1;
}
pp->orphan_vg_name = FMT_TEXT_ORPHAN_VG_NAME;
}
if (arg_is_set(cmd, setphysicalvolumesize_ARG)) {
new_size = arg_uint64_value(cmd, setphysicalvolumesize_ARG, UINT64_C(0));
if (!dev_get_size(pd->dev, &size)) {
log_error("Can't get device size of %s.", dev_name(pd->dev));
dm_list_move(&pp->arg_fail, &pd->list);
return 0;
}
if (new_size != size)
need_size_prompt = 1;
}
/*
* pvcreate is being run on this device, and it's not a PV,
* or is an orphan PV. Neither case requires a prompt.
* Or, pvcreate is being run on this device, but the device
* is already a PV in a VG. A prompt or force option is required
* to use it.
*/
if (pd->is_orphan_pv || pd->is_not_pv)
need_vg_prompt = 0;
else
need_vg_prompt = 1;
if (!need_size_prompt && !need_vg_prompt)
return 1;
if (!(prompt = dm_pool_zalloc(cmd->mem, sizeof(*prompt)))) {
dm_list_move(&pp->arg_fail, &pd->list);
return_0;
}
prompt->dev = pd->dev;
prompt->pv_name = dm_pool_strdup(cmd->mem, dev_name(pd->dev));
prompt->size = size;
prompt->new_size = new_size;
if (pd->is_used_unknown_pv)
prompt->vg_name_unknown = 1;
else if (need_vg_prompt)
prompt->vg_name = dm_pool_strdup(cmd->mem, vgname);
if (need_size_prompt)
prompt->type |= PROMPT_PVCREATE_DEV_SIZE;
if (need_vg_prompt)
prompt->type |= PROMPT_PVCREATE_PV_IN_VG;
dm_list_add(&pp->prompts, &prompt->list);
return 1;
}
static int _pvremove_check_used(struct cmd_context *cmd,
struct pvcreate_params *pp,
struct pvcreate_device *pd)
{
struct pvcreate_prompt *prompt;
struct lvmcache_info *info;
const char *vgname = NULL;
log_debug("Checking %s for pvremove %.32s.",
dev_name(pd->dev), pd->dev->pvid[0] ? pd->dev->pvid : "");
/*
* Is there a pv here already?
* If not, this is an error unless you used -f.
*/
if (!pd->dev->pvid[0]) {
log_debug("Check pvremove arg %s no PVID found", dev_name(pd->dev));
if (pp->force)
return 1;
pd->is_not_pv = 1;
}
if (!(info = lvmcache_info_from_pvid(pd->dev->pvid, pd->dev, 0))) {
if (pp->force)
return 1;
log_error("No PV found on device %s.", dev_name(pd->dev));
dm_list_move(&pp->arg_fail, &pd->list);
return 0;
}
if (info)
vgname = lvmcache_vgname_from_info(info);
/*
* What kind of device is this: an orphan PV, an uninitialized/unused
* device, a PV used in a VG.
*/
if (pd->is_not_pv) {
/* Device is not a PV. */
log_debug("Check pvremove arg %s device is not a PV.", dev_name(pd->dev));
} else if (vgname && !is_orphan_vg(vgname)) {
/* Device is a PV used in a VG. */
log_debug("Check pvremove arg %s found vg %s.", dev_name(pd->dev), vgname);
pd->is_vg_pv = 1;
pd->vg_name = dm_pool_strdup(cmd->mem, vgname);
} else if (info && (!vgname || is_orphan_vg(vgname))) {
uint32_t ext_flags = lvmcache_ext_flags(info);
if (ext_flags & PV_EXT_USED) {
/* Device is used in an unknown VG. */
log_debug("Check pvremove arg %s found EXT_USED flag.", dev_name(pd->dev));
pd->is_used_unknown_pv = 1;
} else {
/* Device is an orphan PV. */
log_debug("Check pvremove arg %s is orphan.", dev_name(pd->dev));
pd->is_orphan_pv = 1;
}
pp->orphan_vg_name = FMT_TEXT_ORPHAN_VG_NAME;
}
if (pd->is_not_pv) {
log_error("No PV found on device %s.", dev_name(pd->dev));
dm_list_move(&pp->arg_fail, &pd->list);
return 0;
}
/*
* pvremove is being run on this device, and it is an orphan PV,
* which does not require a prompt. (The case where the device
* is not a PV has already been rejected above.)
*/
if (pd->is_orphan_pv)
return 1;
/*
* pvremove is being run on this device, but the device is in a VG.
* A prompt or force option is required to use it.
*/
if (!(prompt = dm_pool_zalloc(cmd->mem, sizeof(*prompt)))) {
dm_list_move(&pp->arg_fail, &pd->list);
return_0;
}
prompt->dev = pd->dev;
prompt->pv_name = dm_pool_strdup(cmd->mem, dev_name(pd->dev));
if (pd->is_used_unknown_pv)
prompt->vg_name_unknown = 1;
else
prompt->vg_name = dm_pool_strdup(cmd->mem, vgname);
prompt->type |= PROMPT_PVREMOVE_PV_IN_VG;
dm_list_add(&pp->prompts, &prompt->list);
return 1;
}
static int _confirm_check_used(struct cmd_context *cmd,
struct pvcreate_params *pp,
struct pvcreate_device *pd)
{
struct lvmcache_info *info = NULL;
const char *vgname = NULL;
int is_not_pv = 0;
log_debug("Checking %s to confirm %.32s.",
dev_name(pd->dev), pd->dev->pvid[0] ? pd->dev->pvid : "");
if (!pd->dev->pvid[0]) {
log_debug("Check confirm arg %s no PVID found", dev_name(pd->dev));
is_not_pv = 1;
}
if (!(info = lvmcache_info_from_pvid(pd->dev->pvid, pd->dev, 0))) {
log_debug("Check confirm arg %s no info.", dev_name(pd->dev));
is_not_pv = 1;
}
if (info)
vgname = lvmcache_vgname_from_info(info);
/*
* What kind of device is this: an orphan PV, an uninitialized/unused
* device, a PV used in a VG.
*/
if (vgname && !is_orphan_vg(vgname)) {
/* Device is a PV used in a VG. */
if (pd->is_orphan_pv || pd->is_not_pv || pd->is_used_unknown_pv) {
/* In first check it was an orphan or unused. */
goto fail;
}
if (pd->is_vg_pv && pd->vg_name && strcmp(pd->vg_name, vgname)) {
/* In first check it was in a different VG. */
goto fail;
}
} else if (info && (!vgname || is_orphan_vg(vgname))) {
uint32_t ext_flags = lvmcache_ext_flags(info);
/* Device is an orphan PV. */
if (pd->is_not_pv) {
/* In first check it was not a PV. */
goto fail;
}
if (pd->is_vg_pv) {
/* In first check it was in a VG. */
goto fail;
}
if ((ext_flags & PV_EXT_USED) && !pd->is_used_unknown_pv) {
/* In first check it was different. */
goto fail;
}
if (!(ext_flags & PV_EXT_USED) && pd->is_used_unknown_pv) {
/* In first check it was different. */
goto fail;
}
} else if (is_not_pv) {
/* Device is not a PV. */
if (pd->is_orphan_pv || pd->is_used_unknown_pv) {
/* In first check it was an orphan PV. */
goto fail;
}
if (pd->is_vg_pv) {
/* In first check it was in a VG. */
goto fail;
}
}
return 1;
fail:
log_error("Cannot use device %s: it changed during prompt.", dev_name(pd->dev));
dm_list_move(&pp->arg_fail, &pd->list);
return 1;
}
/*
* This can be used by pvcreate, vgcreate and vgextend to create PVs. The
* callers need to set up the pvcreate_params structure based on command
* line args. This includes the pv_names field which specifies the devices to
* create PVs on.
*
* This function returns 0 (failed) if the caller requires all specified
* devices to be created, and any of those devices are not found, or any of
* them cannot be created.
*
* This function returns 1 (success) if the caller requires all specified
* devices to be created, and all are created, or if the caller does not
* require all specified devices to be created and one or more were created.
*
* Process of opening, scanning and filtering:
*
* - label scan and filter all devs
* . open ro
* . standard label scan at the start of command
* . done prior to this function
*
* - label scan and filter dev args
* . label_scan_devs(&scan_devs) in this function
* . open ro
* . uses full md component check
* . typically the first scan and filter of pvcreate devs
*
* - close and reopen dev args
* . open rw and excl
* . done by label_scan_devs_excl
*
* - repeat label scan and filter dev args
* . using reopened rw excl fd
* . since something could have used dev
* in the small window between close and reopen
*
* - wipe and write new headers
* . using reopened rw excl fd
*/
int pvcreate_each_device(struct cmd_context *cmd,
struct processing_handle *handle,
struct pvcreate_params *pp)
{
struct pvcreate_device *pd, *pd2;
struct pvcreate_prompt *prompt, *prompt2;
struct physical_volume *pv;
struct volume_group *orphan_vg;
struct dm_list remove_duplicates;
struct dm_list arg_sort;
struct dm_list scan_devs;
struct dm_list rescan_devs;
struct pv_list *pvl;
struct pv_list *vgpvl;
struct device_list *devl;
char pvid[ID_LEN + 1] __attribute__((aligned(8))) = { 0 };
const char *pv_name;
unsigned int physical_block_size, logical_block_size;
unsigned int prev_pbs = 0, prev_lbs = 0;
int must_use_all = (cmd->cname->flags & MUST_USE_ALL_ARGS);
int unlocked_for_prompts = 0;
int found;
unsigned i;
set_pv_notify(cmd);
dm_list_init(&remove_duplicates);
dm_list_init(&arg_sort);
dm_list_init(&scan_devs);
dm_list_init(&rescan_devs);
handle->custom_handle = pp;
/*
* Create a list entry for each name arg.
*/
for (i = 0; i < pp->pv_count; i++) {
dm_unescape_colons_and_at_signs(pp->pv_names[i], NULL, NULL);
pv_name = pp->pv_names[i];
if (!(pd = dm_pool_zalloc(cmd->mem, sizeof(*pd)))) {
log_error("alloc failed.");
return 0;
}
if (!(pd->name = dm_pool_strdup(cmd->mem, pv_name))) {
log_error("strdup failed.");
return 0;
}
dm_list_add(&pp->arg_devices, &pd->list);
}
/*
* Translate arg names into struct device's.
*
* lvmcache_label_scan has already been run by the caller.
* It has likely found and filtered pvremove args, but often
* not pvcreate args, since pvcreate args are not typically PVs
* yet (but may be.)
*
* We call label_scan_devs on the args, using the full
* md filter (the previous scan likely did not use the
* full md filter - we really only need to check the
* command args to ensure they are not md components.)
*/
dm_list_iterate_items_safe(pd, pd2, &pp->arg_devices) {
struct device *dev;
/* No filter used here */
if (!(dev = dev_cache_get(cmd, pd->name, NULL))) {
log_error("No device found for %s.", pd->name);
dm_list_del(&pd->list);
dm_list_add(&pp->arg_fail, &pd->list);
continue;
}
if (!(devl = dm_pool_zalloc(cmd->mem, sizeof(*devl))))
goto bad;
devl->dev = dev;
pd->dev = dev;
dm_list_add(&scan_devs, &devl->list);
}
if (dm_list_empty(&pp->arg_devices))
goto_bad;
/*
* Clear the filtering results from lvmcache_label_scan because we are
* going to rerun the filters and don't want to get the results saved
* by the prior filtering. The filtering in label scan will use full
* md filter.
*
* We allow pvcreate to look outside devices file here to find
* the target device, in case the user has not added the device
* being pvcreated to the devices file.
*/
dm_list_iterate_items(devl, &scan_devs)
cmd->filter->wipe(cmd, cmd->filter, devl->dev, NULL);
cmd->use_full_md_check = 1;
if (cmd->enable_devices_file && !pp->is_remove)
cmd->filter_deviceid_skip = 1;
log_debug("Scanning and filtering device args (%u).", dm_list_size(&scan_devs));
label_scan_devs(cmd, cmd->filter, &scan_devs);
/*
* Check if the filtering done by label scan excluded any devices.
*/
dm_list_iterate_items_safe(pd, pd2, &pp->arg_devices) {
if (!cmd->filter->passes_filter(cmd, cmd->filter, pd->dev, NULL)) {
log_error("Cannot use %s: %s", pd->name, devname_error_reason(pd->name));
dm_list_del(&pd->list);
dm_list_add(&pp->arg_fail, &pd->list);
}
}
cmd->filter_deviceid_skip = 0;
/*
* Can the command continue if some specified devices were not found?
*/
if (must_use_all && !dm_list_empty(&pp->arg_fail)) {
log_error("Command requires all devices to be found.");
return 0;
}
/*
* Check for consistent block sizes.
*/
if (pp->check_consistent_block_size) {
dm_list_iterate_items(pd, &pp->arg_devices) {
logical_block_size = 0;
physical_block_size = 0;
if (!dev_get_direct_block_sizes(pd->dev, &physical_block_size, &logical_block_size)) {
log_warn("WARNING: Unknown block size for device %s.", dev_name(pd->dev));
continue;
}
if (!logical_block_size) {
log_warn("WARNING: Unknown logical_block_size for device %s.", dev_name(pd->dev));
continue;
}
if (!prev_lbs) {
prev_lbs = logical_block_size;
prev_pbs = physical_block_size;
continue;
}
if (prev_lbs == logical_block_size) {
/* Require lbs to match; only warn about mismatched pbs. */
if (!cmd->allow_mixed_block_sizes && prev_pbs && physical_block_size &&
(prev_pbs != physical_block_size))
log_warn("WARNING: Devices have inconsistent physical block sizes (%u and %u).",
prev_pbs, physical_block_size);
continue;
}
if (!cmd->allow_mixed_block_sizes) {
log_error("Devices have inconsistent logical block sizes (%u and %u).",
prev_lbs, logical_block_size);
log_print("See lvm.conf allow_mixed_block_sizes.");
return 0;
}
}
}
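/*
 * Illustration of the check above (hypothetical sizes, not taken from any
 * real device): every device is compared with the first device whose
 * logical block size is known.
 *   lbs 512/pbs 512, then lbs 512/pbs 4096: lbs match, so only the pbs
 *   warning is logged (and only when allow_mixed_block_sizes is off).
 *   lbs 512, then lbs 4096: error and return 0, unless
 *   allow_mixed_block_sizes is set, in which case nothing is logged.
 */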
/* check_used moves pd entries into the arg_fail list if pvcreate/pvremove is disallowed */
dm_list_iterate_items_safe(pd, pd2, &pp->arg_devices) {
if (pp->is_remove)
_pvremove_check_used(cmd, pp, pd);
else
_pvcreate_check_used(cmd, pp, pd);
}
/*
* If the user specified a uuid for the new PV, check
* if a PV on another dev is already using that uuid.
*/
if (!pp->is_remove && pp->uuid_str) {
struct device *dev;
if ((dev = lvmcache_device_from_pv_id(cmd, &pp->pva.id, NULL))) {
dm_list_iterate_items_safe(pd, pd2, &pp->arg_devices) {
if (pd->dev != dev) {
log_error("UUID %s already in use on \"%s\".", pp->uuid_str, dev_name(dev));
dm_list_move(&pp->arg_fail, &pd->list);
}
}
}
}
/*
* Special case: pvremove -ff is allowed to clear a duplicate device in
* the unchosen duplicates list. We save them here and erase them below.
*/
if (pp->is_remove && (pp->force == DONT_PROMPT_OVERRIDE) &&
!dm_list_empty(&pp->arg_devices) && lvmcache_has_duplicate_devs()) {
dm_list_iterate_items_safe(pd, pd2, &pp->arg_devices) {
if (lvmcache_dev_is_unused_duplicate(pd->dev)) {
log_debug("Check pvremove arg %s device is a duplicate.", dev_name(pd->dev));
dm_list_move(&remove_duplicates, &pd->list);
}
}
}
/*
* Any devices not moved to arg_fail can be processed.
*/
dm_list_splice(&pp->arg_process, &pp->arg_devices);
/*
* Can the command continue if some specified devices cannot be used?
*/
if (!dm_list_empty(&pp->arg_fail) && must_use_all)
goto_bad;
/*
* The command cannot continue if there are no devices to process.
*/
if (dm_list_empty(&pp->arg_process) && dm_list_empty(&remove_duplicates)) {
log_debug("No devices to process.");
goto bad;
}
/*
* Clear any prompts that have answers without asking the user.
*/
dm_list_iterate_items_safe(prompt, prompt2, &pp->prompts) {
_check_pvcreate_prompt(cmd, pp, prompt, 0);
switch (prompt->answer) {
case PROMPT_ANSWER_YES:
/* The PV can be used, leave it on arg_process. */
dm_list_del(&prompt->list);
break;
case PROMPT_ANSWER_NO:
/* The PV cannot be used, remove it from arg_process. */
if ((pd = _pvcreate_list_find_dev(&pp->arg_process, prompt->dev)))
dm_list_move(&pp->arg_fail, &pd->list);
dm_list_del(&prompt->list);
break;
}
}
if (!dm_list_empty(&pp->arg_fail) && must_use_all)
goto_bad;
/*
* If no remaining prompts need a user response, then keep the global
* lock held and go directly to the create steps.
*/
if (dm_list_empty(&pp->prompts))
goto do_command;
/*
* Prompts require asking the user and may take some time, during
* which we don't want to block other commands. So, release the lock
* while we wait. After a response from the user, reacquire the lock,
* verify that the PVs were not used during the wait, then do the
* create steps.
*/
lockf_global(cmd, "un");
unlocked_for_prompts = 1;
/*
* Process prompts that require asking the user. The global lock is
* not held, so there's no harm in waiting for a user to respond.
*/
dm_list_iterate_items_safe(prompt, prompt2, &pp->prompts) {
_check_pvcreate_prompt(cmd, pp, prompt, 1);
switch (prompt->answer) {
case PROMPT_ANSWER_YES:
/* The PV can be used, leave it on arg_process. */
dm_list_del(&prompt->list);
break;
case PROMPT_ANSWER_NO:
/* The PV cannot be used, remove it from arg_process. */
if ((pd = _pvcreate_list_find_dev(&pp->arg_process, prompt->dev)))
dm_list_move(&pp->arg_fail, &pd->list);
dm_list_del(&prompt->list);
break;
}
if (!dm_list_empty(&pp->arg_fail) && must_use_all)
goto_out;
if (sigint_caught())
goto_out;
if (prompt->abort_command)
goto_out;
}
/*
* Reacquire the lock that was released above before waiting, then
* check again that the devices can still be used. If the second check
* finds them changed, or can't find them any more, then they aren't
* used. Use a non-blocking request when reacquiring to avoid
* potential deadlock since this is not the normal locking sequence.
*/
if (!lockf_global_nonblock(cmd, "ex")) {
log_error("Failed to reacquire global lock after prompt.");
goto_out;
}
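/*
 * Sketch of the sequence used when prompts need a user response
 * (a summary of the surrounding code, not additional logic):
 *   lockf_global(cmd, "un");            release before waiting for the user
 *   _check_pvcreate_prompt(..., 1);     ask each remaining prompt
 *   lockf_global_nonblock(cmd, "ex");   reacquire without blocking
 *   _confirm_check_used(cmd, pp, pd);   recheck each dev after relocking
 */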
do_command:
dm_list_iterate_items(pd, &pp->arg_process) {
if (!(devl = dm_pool_zalloc(cmd->mem, sizeof(*devl))))
goto bad;
devl->dev = pd->dev;
dm_list_add(&rescan_devs, &devl->list);
}
/*
* We want label_scan excl to repeat the filter check in case something
* changed to filter out a dev before we were able to open it exclusively.
*/
dm_list_iterate_items(devl, &rescan_devs)
cmd->filter->wipe(cmd, cmd->filter, devl->dev, NULL);
if (cmd->enable_devices_file && !pp->is_remove)
cmd->filter_deviceid_skip = 1;
log_debug("Rescanning and filtering device args with exclusive open");
if (!label_scan_devs_excl(cmd, cmd->filter, &rescan_devs)) {
log_debug("Failed to rescan devs excl");
goto bad;
}
dm_list_iterate_items_safe(pd, pd2, &pp->arg_process) {
if (!cmd->filter->passes_filter(cmd, cmd->filter, pd->dev, NULL)) {
log_error("Cannot use %s: %s", pd->name, devname_error_reason(pd->name));
dm_list_del(&pd->list);
dm_list_add(&pp->arg_fail, &pd->list);
}
}
cmd->filter_deviceid_skip = 0;
if (dm_list_empty(&pp->arg_process) && dm_list_empty(&remove_duplicates)) {
log_debug("No devices to process.");
goto bad;
}
if (!dm_list_empty(&pp->arg_fail) && must_use_all)
goto_bad;
/*
* If the global lock was unlocked to wait for prompts, then
* devs could have changed while unlocked, so confirm that
* the devs are unchanged since check_used.
* Changed pd entries are moved to arg_fail.
*/
if (unlocked_for_prompts) {
dm_list_iterate_items_safe(pd, pd2, &pp->arg_process)
_confirm_check_used(cmd, pp, pd);
if (!dm_list_empty(&pp->arg_fail) && must_use_all)
goto_bad;
}
if (dm_list_empty(&pp->arg_process)) {
log_debug("No devices to process.");
goto bad;
}
/*
* Reorder arg_process entries to match the original order of args.
*/
dm_list_splice(&arg_sort, &pp->arg_process);
for (i = 0; i < pp->pv_count; i++) {
if ((pd = _pvcreate_list_find_name(&arg_sort, pp->pv_names[i])))
dm_list_move(&pp->arg_process, &pd->list);
}
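/*
 * Example with hypothetical device names: if the command line was
 * "sdc sda sdb" and the checks above left arg_process ordered
 * sda, sdb, sdc, the loop walks pv_names[] in command-line order and
 * moves each match back, giving sdc, sda, sdb. Names whose entries were
 * moved to arg_fail are not found on arg_sort and are skipped.
 */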
if (pp->is_remove)
dm_list_splice(&pp->arg_remove, &pp->arg_process);
else
dm_list_splice(&pp->arg_create, &pp->arg_process);
/*
* Wipe signatures on devices being created.
*/
dm_list_iterate_items_safe(pd, pd2, &pp->arg_create) {
log_verbose("Wiping signatures on new PV %s.", pd->name);
if (!wipe_known_signatures(cmd, pd->dev, pd->name, TYPE_LVM1_MEMBER | TYPE_LVM2_MEMBER,
0, pp->yes, pp->force, &pd->wiped)) {
dm_list_move(&pp->arg_fail, &pd->list);
}
if (sigint_caught())
goto_bad;
}
if (!dm_list_empty(&pp->arg_fail) && must_use_all)
goto_bad;
/*
* Find existing orphan PVs that vgcreate or vgextend want to use.
* "preserve_existing" means that the command wants to use existing PVs
* and not recreate a new PV on top of an existing PV.
*/
if (pp->preserve_existing && pp->orphan_vg_name) {
log_debug("Using existing orphan PVs in %s.", pp->orphan_vg_name);
2019-05-24 20:04:37 +03:00
if (!(orphan_vg = vg_read_orphans(cmd, pp->orphan_vg_name))) {
log_error("Cannot read orphans VG %s.", pp->orphan_vg_name);
goto bad;
}
dm_list_iterate_items_safe(pd, pd2, &pp->arg_create) {
if (!pd->is_orphan_pv)
continue;
if (!(pvl = dm_pool_alloc(cmd->mem, sizeof(*pvl)))) {
log_error("alloc pvl failed.");
dm_list_move(&pp->arg_fail, &pd->list);
continue;
}
found = 0;
dm_list_iterate_items(vgpvl, &orphan_vg->pvs) {
if (vgpvl->pv->dev == pd->dev) {
found = 1;
break;
}
}
if (found) {
log_debug("Using existing orphan PV %s.", pv_dev_name(vgpvl->pv));
pvl->pv = vgpvl->pv;
dm_list_add(&pp->pvs, &pvl->list);
/* allow deviceidtype_ARG/deviceid_ARG ? */
memcpy(pvid, &pvl->pv->id.uuid, ID_LEN);
device_id_add(cmd, pd->dev, pvid, NULL, NULL);
} else {
log_error("Failed to find PV %s", pd->name);
dm_list_move(&pp->arg_fail, &pd->list);
}
}
}
/*
* Create PVs on devices. Either create a new PV on top of an existing
* one (e.g. for pvcreate), or create a new PV on a device that is not
* a PV.
*/
dm_list_iterate_items_safe(pd, pd2, &pp->arg_create) {
/* Using existing orphan PVs is covered above. */
if (pp->preserve_existing && pd->is_orphan_pv)
continue;
if (!dm_list_empty(&pp->arg_fail) && must_use_all)
break;
if (!(pvl = dm_pool_alloc(cmd->mem, sizeof(*pvl)))) {
log_error("alloc pvl failed.");
dm_list_move(&pp->arg_fail, &pd->list);
continue;
}
pv_name = pd->name;
log_debug("Creating a new PV on %s.", pv_name);
if (!(pv = pv_create(cmd, pd->dev, &pp->pva))) {
log_error("Failed to setup physical volume \"%s\".", pv_name);
dm_list_move(&pp->arg_fail, &pd->list);
continue;
}
/* allow deviceidtype_ARG/deviceid_ARG ? */
memcpy(pvid, &pv->id.uuid, ID_LEN);
device_id_add(cmd, pd->dev, pvid, NULL, NULL);
log_verbose("Set up physical volume for \"%s\" with %" PRIu64
" available sectors.", pv_name, pv_size(pv));
if (!label_remove(pv->dev)) {
log_error("Failed to wipe existing label on %s.", pv_name);
dm_list_move(&pp->arg_fail, &pd->list);
continue;
}
if (pp->zero) {
log_verbose("Zeroing start of device %s.", pv_name);
if (!dev_write_zeros(pv->dev, 0, 2048)) {
log_error("%s not wiped: aborting.", pv_name);
dm_list_move(&pp->arg_fail, &pd->list);
continue;
}
}
log_verbose("Writing physical volume data to disk \"%s\".", pv_name);
if (!pv_write(cmd, pv, 0)) {
log_error("Failed to write physical volume \"%s\".", pv_name);
dm_list_move(&pp->arg_fail, &pd->list);
continue;
}
log_print_unless_silent("Physical volume \"%s\" successfully created.",
pv_name);
pvl->pv = pv;
dm_list_add(&pp->pvs, &pvl->list);
}
/*
* Remove PVs from devices for pvremove.
*/
dm_list_iterate_items_safe(pd, pd2, &pp->arg_remove) {
if (!label_remove(pd->dev)) {
log_error("Failed to wipe existing label(s) on %s.", pd->name);
dm_list_move(&pp->arg_fail, &pd->list);
continue;
}
device_id_pvremove(cmd, pd->dev);
log_print_unless_silent("Labels on physical volume \"%s\" successfully wiped.",
pd->name);
}
/*
* Special case: pvremove duplicate PVs (also see above).
*/
dm_list_iterate_items_safe(pd, pd2, &remove_duplicates) {
if (!label_remove(pd->dev)) {
log_error("Failed to wipe existing label(s) on %s.", pd->name);
dm_list_move(&pp->arg_fail, &pd->list);
continue;
}
lvmcache_del_dev_from_duplicates(pd->dev);
device_id_pvremove(cmd, pd->dev);
log_print_unless_silent("Labels on physical volume \"%s\" successfully wiped.",
pd->name);
}
	/* TODO: when vgcreate uses only existing PVs this doesn't change and can be skipped */
	if (!device_ids_write(cmd))
		stack;
	/*
	 * Don't keep devs open excl in bcache because the excl will prevent
	 * using that dev elsewhere.
	 */
	dm_list_iterate_items(devl, &rescan_devs)
		label_scan_invalidate(devl->dev);
	dm_list_iterate_items(pd, &pp->arg_fail)
		log_debug("%s: command failed for %s.",
			  cmd->command->name, pd->name);
	if (!dm_list_empty(&pp->arg_fail))
		goto_out;
	return 1;
bad:
out:
	return 0;
}