IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
more descriptive message if locking fails instead of
"Locking type -1 initialisation failed."
Use read-only locking instead of misleading ignorelocking option
in message.
All this seems to do is provide a memory leak so remove it.
The only caller of _alloc_pv() later explicitly sets
pv->vg_name = fmt->orphan_vg_name so clearly this allocation
should be removed. I also saw no where in the code where
strncpy was used to assign pv->vg_name - only direct assignments
and strdup's.
merge completes. This narrows the scope of this "hack" (which still
needs a proper fix within the deptree).
This stops dmeventd from trying to access snapshot devices that were
already removed.
For mirror repair (and similar tasks) it can happen that full
device rescan is issued from clvmd.
Because code can be in the middle of repair (calling suspend)
clvmd should never try to scan suspended devices
(otherwise it causes deadlock).
Also code must not change ignore_suspended_device flag when
doing refresh_filters (called from lvmcache scan code).
'const'. Be consistent with its use (and dev_manager_snapshot_percent()).
Pass 'lv' from dev_manager_snapshot_percent() to _percent() to
_percent_run(). _percent_run() always dereferenced 'lv' (when
initializing segh) even though it may have been NULL (as was the case
until now for dev_manager_snapshot_percent()).
If a "snapshot-origin" LV (snapshot-merge whose merge was deferred
becuase it was open) was passed to _percent_run() it would always return
100%.
Update _percent_run() to NOT return PERCENT_100 et. al. if
->target_percent() wasn't ever called and supplied 'lv' is a merging
origin. A default return of 100% does not work for snapshot-merge.
Also tweak a related lvconvert log_error() to include "Aborting merge."
Eliminate 'merging_snapshot' from 'struct logical_volume' and just use
'snapshot' for origin lv's reference to the merging snapshot; also set
MERGING in the origin lv's status.
If either the origin or snapshot that is to be merged is open the merge
will not start; only the merge metadata will be written. The merge will
start on the next activation of the origin (or via lvchange --refresh)
IFF both the origin and snapshot are closed.
Merge on activate is particularly important if we want to merge over a
mounted filesystem that cannot be unmounted (until next boot) --- for
example root.
snapshots are suspended, new origin is created, snapshots are resumed, new
origin is resumed. So it allocates memory while suspended.
To fix it, move vg_commit after suspend_lv, so that the suspend code will
treat it as precommitted vg and will preload new origin prior to suspend.
NOTE: agk doesn't like this "hack"; need to revisit and fix
This is useful for when the snapshot is still active and merging hasn't
started yet; it shows a merge is pending. Once merging starts the
merging snapshot will be hidden but can still be displayed with 'lvs -a'
Report snapshot origin with merging snapshot as 'O' instead of 'o':
Before merge starts this shows that a merge is pending. While merging
the snapshot will be hidden, 'O' enables a user to see that there is a
snapshot merging.
"snapshot-merge" target based on whether the LV is a merging snapshot.
When activating a snapshot-merge target do not attempt to monitor the
LV for events; the polldaemon will monitor the snapshot as it is
merged.
Allow "snapshot-merge" target's usage to be parsed via standard
"snapshot" methods.
NOTE: follow on fixes to the _percent_run change are still needed
Introduces new libdevmapper function dm_tree_node_add_snapshot_merge_target
Verifies that the kernel (dm-snapshot) provides the 'snapshot-merge'
target.
Activate origin LV as snapshot-merge target. Using snapshot-origin
target would be pointless because the origin contains volatile data
while a merge is in progress.
Because snapshot-merge target is activated in place of the
snapshot-origin target it must be resumed after all other snapshots
(just like snapshot-origin does) --- otherwise small window for data
corruption would exist.
Ideally the merging snapshot would not be activated at all but if it is
to be activated (because snapshot was already active) it _must_ be done
after the snapshot-merge. This insures that DM's snapshot-merge target
will perform exception handover in the proper order (new->resume before
old->resume). DM's snapshot-merge does support handover if the reverse
sequence is used (old->resume before new->resume) but DM will fail to
resume the old snapshot; leaving it suspended.
To insure the proper activation sequence dm_tree_activate_children() was
updated to accommodate an additional 'activation_priority' level. All
regular snapshots are 0, snapshot-merge is 1, and merging snapshot is 2.
Make 'merging_snapshot' pointer that points from the origin to the
segment that represents the merging snapshot.
Import/export 'merging_store' metadata.
Do not allow creating snapshots while another snapshot is merging.
Snapshot created in this state would certainly contain invalid data.
NOTE: patches at the end of this series will remove 'merging_snapshot'
and will introduce helpful wrappers and cleanups.
This spurious 'break' has been here since this code was first committed
in June 2005 and stopped the algorithm behaving as described in the
comment above it and rendered the variable 'already_found_one' useless.
1. Found bug in 'redundant log' implementation that caused
problems when converting a linear that spanned multiple
devices to a mirror (wasn't checking for NULL value of
provided parameter in _alloc_parallel_area)
2. Testsuite was failing to perform tests when 'not' modifier
was used. This allowed a couple issues to slip through.
Added a 'not_sh' modifier that negates tests performed by
functions defined in the shell source file.
3. Was initializing a variable to far down, which cause
previously set value to be overridden. (This was the
result of the collision of the "redundant log" and
lvconvert fix patches.)
Upon successful fork(), _become_daemon() must assert that the locks that
are currently held belong to the parent, not the child. All of the
child's internal state saying 'this process holds a lock' has to be
reset.
A proper lvmcache_locking_reset() should follow later.
It is pretty much the same as reducing the number of
mirror legs, but we just don't delete them afterwards.
The following command line interface is enforced:
prompt> lvconvert --splitmirror <n> -n <name> <VG>/<LV>
where 'n' is the number of images to split off, and
where 'name' is the name of the newly split off logical volume.
If more than one leg is split off, a new mirror will be the
result. The newly split off mirror will have a 'core' log.
Example:
[root@bp-01 LVM2]# !lvs
lvs -a -o name,copy_percent,devices
LV Copy% Devices
lv 100.00 lv_mimage_0(0),lv_mimage_1(0),lv_mimage_2(0),lv_mimage_3(0)
[lv_mimage_0] /dev/sdb1(0)
[lv_mimage_1] /dev/sdc1(0)
[lv_mimage_2] /dev/sdd1(0)
[lv_mimage_3] /dev/sde1(0)
[lv_mlog] /dev/sdi1(0)
[root@bp-01 LVM2]# lvconvert --splitmirrors 2 --name split vg/lv /dev/sd[ce]1
Logical volume lv converted.
[root@bp-01 LVM2]# !lvs
lvs -a -o name,copy_percent,devices
LV Copy% Devices
lv 100.00 lv_mimage_0(0),lv_mimage_2(0)
[lv_mimage_0] /dev/sdb1(0)
[lv_mimage_2] /dev/sdd1(0)
[lv_mlog] /dev/sdi1(0)
split 100.00 split_mimage_0(0),split_mimage_1(0)
[split_mimage_0] /dev/sde1(0)
[split_mimage_1] /dev/sdc1(0)
It can be seen that '--splitmirror <n>' is exactly the same
as '--mirrors -<n>' (note the minus sign), except there is the
additional notion to keep the image being detached from the
mirror instead of just throwing it away.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Version >= 1.8.0 of the DM snapshot target appends metadata sectors used
to a snapshot's status. This patch allows LVM2 to accurately determine
if the snapshot store is empty. Knowing when a snapshot store is empty
is important in the context of snapshot-merge (means merge is complete).
Also update LVM2 to be aware of the possibility for "Merge failed" in
the snapshot-merge target's status.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
the background polldaemon is allowed to start. It can be used
standalone or in conjunction with --refresh or --available y.
Control over when the background polldaemon starts will be particularly
important for snapshot-merge of a root filesystem.
Dracut will be updated to activate all LVs with: --poll n
The lvm2-monitor initscript will start polling with: --poll y
NOTE: Because we currently have no way of knowing if a background
polldaemon is active for a given LV the following limitations exist and
have been deemed acceptable:
1) it is not possible to stop an active polldaemon; so the lvm2-monitor
initscript doesn't stop running polldaemon(s)
2) redundant polldaemon instances will be started for all specified LVs
if vgchange or lvchange are repeatedly used with '--poll y'
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
This patch tries to correctly track changes in lvmcache related to commit/revert.
For vg_commit: if there is cached precommitted metadata, after successfull commit
these metadata must be tracked as committed.
For vg_revert: remote nodes must drop precommitted metadata and its flag in lvmcache.
(N.B. Patch do not touch LV locks here in any way.)
All this machinery is needed to properly solve remote node cache invalidaton which
cause several problems recently observed.
Lock mode is int masked by LCK_TYPE_MASK, always.
Patch also remove uneccessary masking lock flag on sender side,
if masking is needed, it is don on client side already.
- Add drop_precommitted flag to force drop precommitted metadata
- add lvmcache_commit_metadata() which upgrades precommitted metadata in cache
No functional change in this patch - just preparation for following change.
And decode flags in humar readable form in client.
And clean some trailing whitespaces.
No functional change in this patch (only debugging messages changed).
The use_precommitted flag indicates, that we want to use precommitted metadata
(used in suspend call to preload table with precommitted data).
But if there are no such data, committed metadata are read but the cache
still contains that precommitted flag.
(The problem is that later possible drop_metadata call will not invalidate
device in cache.)
The wrong precommitted state is stored in on remote nodes during normal
suspend/resume cycle _without_ vg_write/commit.
Use the PRECOMMITTED status flag here instead (which is always set if using
precommited metadata here).
If renaming snapshot with virtual origin, the origin is renamed too.
But the code must resume LVs in reverse order to properly
pair memlock (in cluster locking).
(The resume of snapshot resumes origin too and later resume
is ignored otherwise.)
When PV device reappears with old metadata, it is
always updated to new version byt atutomatic metadata
repair.
Remove missing flag if device is empty.
If device contains allocated extents, issue warning that
user must remove volumes and re-add this PV before
manipulating with this volume.
This partially solves bug 547842 when one PV (log) is failed,
dmeventd removes that device and later this device reappears and
is wrongly added into VG marked missing.
The memlock_inc() fix is wrong, memlock count is not
propagated to long living process (clvmd) and just
it underflow there.
Also suspend is needed to pre-load precommited metadata
on other nodes (remapping to error taget in this case).
With explicit suspend we generate lock request and code
can update memlock count.
(Infinitely "locked" memory caused that fs_unlock() was not
called properly and on cluster nodes remains
old links in /dev/mapper for not active devices.)
(N.B. failing of suspend call here is not handled as fatal
error - the LV is going to be removed later anyway.)
The new recovery code first tries to repair LV and then removes failed PV
from VG. It means that during operation there can be VG with PV missing,
and vg_read code handles it like not consistent VG.
We already allows returning "inconsistent" commited metadata,
for mirror repair we need this for precommited too.
(The suspend call prepares precommited metadata to inactive table on
other cluster nodes.)
"Inconsistent" here means - correct metadata, just with some metadata areas
not found (obviously on missing or failed PVs).
Patch should not cause any problems, only real change is
removing LCK_LOCAL bit from lock type flag, it is never used there.
(LCK_LOCAL is part arg[1] bits anyway.)
If there is problem deactivate LV and
_init_mirror_log is called with remove_on_failure = 1,
remove the newly created log LV from metadata.
(This can happen if there is active device with the same name
but different UUID.)
The main reason for this "workaround" patch is to
- do not keep _mlog volume in metadata, so user can repeat the action
- print better error message describing the real problem
# lvcreate -m 2 -n lv1 -l 1 --nosync vg_bar
WARNING: New mirror won't be synchronised. Don't read what you didn't write!
/dev/vg_bar/lv1_mlog: not found: device not cleared
Aborting. Failed to wipe mirror log.
Error locking on node bar-01: Input/output error
Unable to deactivate mirror log LV. Manual intervention required.
Failed to create mirror log.
# lvcreate -m 2 -n lv1 -l 1 --nosync vg_bar
WARNING: New mirror won't be synchronised. Don't read what you didn't write!
Aborting. Unable to deactivate mirror log.
Failed to initialise mirror log.
At this point they probably do not matter but going forward they
may - depends on future patches for replicator, etc. I think
these probably got missed because they were 'flags' so I changed
the name to 'status' to be consistent. So the on-disk
things 'flags' and the in structure 'status' (bits).
NOTE: WHATS_NEW already has entry for this in current release.
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
pvmove suspends all moved LVs + pvmoveX mirrored LV itself.
This suspends even underlying pvmoveX and following explicit
suspend call is just noop.
But in resume the pvmoveX volume is no longer underlying
device for moved LVs, so it performs full resume with memlock
decrease.
Code must call memlock_inc() if suspend is requested, volume
is already suspended and error is not requested.
These are no longer used by anyone. The dm_list defines are all in
libdevmapper.h and libdm/datastruct/list.c contains any function definitions.
There is some code in "old-tests" that still use this but this code is not
being maintained.
Thanks to Zdenek for spotting this.
The physical_volume, volume_group, logical_volume and lv_segment
structures' 'status' member is now uint64_t.
The alignment of these structures was also audited to remove holes. The
movement of some members in 'volume_group' and 'lv_segment' eliminates
holes. The 'physical_volume' structure still has one 4-byte hole after
'pe_size'; the other structures no longer have any holes. Each
structures' size has not changed.
The sysfs filter initialise hash of available devices using
scan of /sys/block. We need to refresh even this hash
when performing full scan otherwise the newly appeared
device could be rejected, because there is no entry
in sysfs filter.
This easily could happen when attaching new device
to cluster node. (Only force refresh of context
in clvmd -R works here now).
Unfortunately consequences of this are much worse,
missing device part on that node is replaced with missing segment
(even when no partial arg is selected) and this directly
lead to data corruption.
See https://bugzilla.redhat.com/show_bug.cgi?id=538515
Simply fix it by refreshing device filters in lvmcache
before performing the full device scan.
(This affects only cluster locking because only cluster
locking module set LCK_PRE_MEMLOCK.)
With currect code you get
# vgchange -a n
Internal error: _memlock_count has dropped below 0.
when using cluster locking.
It is caused by _unlock_memory calls here
if ((flags & (LCK_SCOPE_MASK | LCK_TYPE_MASK)) == LCK_LV_RESUME)
memlock_dec();
Unfortunately it is also (wrongly) called in immediate unlock
(when LCK_HOLD is not set) from lock_vol
(LCK_UNLOCK is misinterpreted as LCK_LV_RESUME).
Avoid this by comparing original flags and provide memlock
code type of operation (suspend/resume).
indented metadata lines.
Macro outnl() is using exported out_newline() instead of direct
call f->fn(), that required the visibility of the internal
struct formatter.
Rename fill_default_pvcreate_params to pvcreate_params_set_defaults.
Rename pvcreate_validate_restore_params to pvcreate_restore_params_validate.
Rename pvcreate_validate_params to pvcreate_params_validate.
Similar to other vg_set_* functions, we create a vg_set_clustered() function
which does a few checks and sets a flag. This is where we check for
any limitations of clusters.
The DRBD uses underlying device so code should prefer top
device if duplicate is found.
Patch also introduce
dev_subsystem_part_major and dev_subsytem_name
functions to easily handle all these replication susbystems
and not hardcode md_major call.
See https://bugzilla.redhat.com/show_bug.cgi?id=530881
for full problem description.
- we have these levels when the udev rules are processed:
10-dm.rules --> [11-dm-<subsystem>.rules] --> [12-dm-permissions.rules] -->
13-dm-disk.rules --> [...all the other foreign rules...] --> 95-dm-notify.rules
- each level can be disabled now by
DM_UDEV_DISABLE_{DM, SUBSYSTEM, DISK, OTHER}_RULES_FLAG
- add DM_UDEV_DISABLE_DM_RULES_FLAG to disable 10-dm.rules
- add DM_UDEV_DISABLE_OTHER_RULES_FLAG to disable all the other (non-dm) rules.
We cutoff these rules by using the 'last_rule', so this one should really be
used with great care and in well-founded situations. We use this for lvm's
hidden and layer devices now.
- add a parameter for add_dev_node, rm_dev_node and rename_dev_node so it's
possible to switch on/off udev checks
- use DM_UDEV_DISABLE_DM_RULES_FLAG and DM_UDEV_DISABLE_SUBSYSTEM_RULES_FLAG
if there's no cookie set and we have resume, remove and rename ioctl.
This could happen when someone uses the libdevmapper that is compiled with
udev_sync but the software does not make use of it. This way we can switch
off the rules and fallback to libdevmapper node creation so there's no
udev/libdevmapper race.
[root@xxxx-01 ~]# lvconvert -m 1 --corelog VG/cmirror
Unable to convert the log of inactive cluster mirror cmirror
I've tried to clean-up the message a little more, so the name
of the mirror stands out more while preserving the sense that
it's not a problem with the specific device, but the fact that
it is inactive that is causing the problem.
New msg:
Unable to convert the log of an inactive cluster mirror, cmirror
Split pvcreate_validate_params into recovery and non-recovery parameters.
This is necessary so we can call the non-recovery validate function from
vgextend / vgcreate. Note in the pvcreate tool case, we must call the
recovery validation function first (see treatment of pe_start and --zero),
and that we add a call to fill_default_pvcreate_params before the validation
functions.
We need defaults for pvcreate_params at a higher level - this will
allow us to use a common function from the tools to take defaults,
then fill in any non-defaults from the commandline.
Future patches will refactor vgcreate/vgextend to call this function
if one or more pvcreate parameters are given on the commandline.
Another refactoring for implicit pvcreate support. We need to get
the pvcreate parameters somehow to the vg_extend routine. Options
seemed to be:
1. Attach the parameters to struct volume_group. I personally
did not like this idea in most cases, though one could make an
agrument why it might be ok at least for some of the parameters
(e.g. metadatacopies).
2. Pass them in to the extend routine. This second route seemed
to be the best approach given the constraints.
Future patches will parse the command line and fill in the actual
values for the pvcreate_single call.
Should be no functional change.
Should be no functional change. If this parameter is set to NULL, just fail
the extend if the device is not already a PV. If non-NULL, try pvcreate_single
before failing. Note that pvcreate_single() handles the log_error in case
of failure so we just return 0 if pvcreate_single() fails.
lv_deactivate now returns always success, because tree deactivation
functions (see dm_tree_deactivate_children) always returns success.
Because code should return failure in lv_deactivate at least,
fix it by checking for device existence after real deactivation call.
(After discussion this was prefered solution to dm tree function rewrite
which affects snapshots and mirrors.)
Add configure --enable-units-compat to set si_unit_consistency off by default.
Use standard output units for 'PE Size' and 'Stripe size' in pv/lvdisplay.
Clean up VG_RESIZEABLE flag by creating vg_is_resizeable().
Update comment - we no longer have ALLOW_RESIZEABLE.
Also use vg_is_exported() in one place missed by earlier patch.
Should be no functional change.
This patch is all just cleanup and no other patch depends on it.
Replace explicit dereference and check with vg_is_exported().
Update a few copyrights and remove unnecessary whitespace.
Should be no functional change.
Of the vgs field vg_attr, a few of the most likely to be used attributes
are clustered, exported, and partial. This patch adds the following 3
functions:
uint64_t lvm_vg_is_clustered(const vg_t vg)
uint64_t lvm_vg_is_exported(const vg_t vg)
uint64_t lvm_vg_is_partial(const vg_t vg)
Now that we've split vg_remove_single into two routines, in the first routine
that only manipulates memory, we move the PVs from the vg->pvs list to the
vg->removed_pvs list. Then later, we iterate through this list to write the
removed PVs to disk, which removes them from the volume group and places them
into the internal ORPHAN VG.
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Author: Dave Wysochanski <dwysocha@redhat.com>
Split vg_remove_single into vg_remove_check (mandatory checks before
vgremove) and vg_remove (do actual remove by committing to disk).
In liblvm, we'd like to provide an consistent API that allows multiple
changes in memory, then let lvm_vg_write() control the commit to disk. In
some cases (for example, lvresize calls fsadm) this may not be possible.
However, since we are using an object model and dividing things into small
operations, the most logical model seems to be the lvm_vg_write model, and
handling the special cases as they arrive. So as best as possible
we move towards this end.
A possible optimization would be to consolidate vg_remove (committing)
code with vgreduce code. A second possible optimization is making vgreduce
of the last device equivalent to vgremove. Today, lvm_vg_reduce fails if
vgreduce is called with the last device, but from an object model perspective
we could view this as equivalent to vgremove and allow it. My gut feel is
we do not want to do this though.
Author: Dave Wysochanski <dwysocha@redhat.com>
Later patches should consolidate the vgremove / vgreduce functions but for
now let's clarify what vg_remove actually does by changing the name.
Author: Dave Wysochanski <dwysocha@redhat.com>
Add a new constraint that vgname locks must be obtained in
alphabetical order. At this point, we have test coverage for
the 3 commands affected - vgsplit, vgmerge, and vgrename.
Tests have been updated to cover these commands.
Going forward any command or library call that must obtain
more than one vgname lock must do so in alphabetical order.
Future patches will update lvm2app to enforce this ordering.
Author: Dave Wysochanski <dwysocha@redhat.com>
# pvcreate -u udwxr7-BoKY-EeKM-r033-xK6o-4og7-F13sGi /dev/sdc
uuid udwxr7BoKYEeKMr033xK6o4og7F13sGi|��� already in use on "/dev/sdb1"
is now
# pvcreate -u udwxr7-BoKY-EeKM-r033-xK6o-4og7-F13sGi /dev/sdc
uuid udwxr7-BoKY-EeKM-r033-xK6o-4og7-F13sGi already in use on "/dev/sdb1"