1
0
mirror of git://sourceware.org/git/lvm2.git synced 2024-12-21 13:34:40 +03:00
Commit Graph

1370 Commits

Author SHA1 Message Date
Zdenek Kabelac
ea8acabe26 Export function out_text_with_comment() and add outfc() macro that checks
for error.
2010-01-07 14:45:28 +00:00
Zdenek Kabelac
1e13fa7a6a Add macros outsize() for out_size() and outhint() for out_hint() that check
for errors in a similar way as outf() for out_text().
2010-01-07 14:40:46 +00:00
Zdenek Kabelac
c75550f5ba Use offsetof() macro and avoid defining dummy static union for FIELD() macro.
Makes it compilable by clang compiler.
2010-01-07 14:37:11 +00:00
Milan Broz
03984e05a3 Rename mirror_device_fault_policy to mirror_image_fault policy 2010-01-06 13:27:06 +00:00
Milan Broz
bf8c8a6d61 Remove empty "repaired" devices if empty in lvconvert.
The logic was that lvconvert repair volumes, marking
PV as MISSING and following vgreduce --removemissing
removes these missing devices.

Previously dmeventd mirror DSO removed all LV and PV
from VG by simply relying on
vgreduce --removemissing --force.

Now, there are two subsequent calls:
lvconvert --repair --use-policies
vgreduce --removemissing

So the VG is locked twice, opening space for all races
between other running lvm processes. If the PV reappears
with old metadata on it (so the winner performs autorepair,
if locking VG for update) the situation is even worse.


Patch simply adds removemissing PV functionality into
lvconcert BUT ONLY if running with --repair and --use-policies
and removing only these empty missing PVs which are
involved in repair.
(This combination is expected to run only from dmeventd.)
2010-01-06 13:26:21 +00:00
Milan Broz
5d196aa430 Use fixed buffer to prevent stack overflow in persistent filter dump. 2010-01-06 13:25:36 +00:00
Mike Snitzer
255fc32087 update WHATS_NEW and WHATS_NEW_DM with previous commits' changes 2010-01-05 21:32:59 +00:00
Mike Snitzer
5b7f6ad698 Use snapshot metadata usage to determine if snapshot is empty
Version >= 1.8.0 of the DM snapshot target appends metadata sectors used
to a snapshot's status.  This patch allows LVM2 to accurately determine
if the snapshot store is empty.  Knowing when a snapshot store is empty
is important in the context of snapshot-merge (means merge is complete).

Also update LVM2 to be aware of the possibility for "Merge failed" in
the snapshot-merge target's status.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2010-01-05 21:14:04 +00:00
Mike Snitzer
7a8fa6aaab Add a [--poll {y|n}] flag to vgchange and lvchange to control whether
the background polldaemon is allowed to start.  It can be used
standalone or in conjunction with --refresh or --available y.

Control over when the background polldaemon starts will be particularly
important for snapshot-merge of a root filesystem.

Dracut will be updated to activate all LVs with: --poll n

The lvm2-monitor initscript will start polling with: --poll y

NOTE: Because we currently have no way of knowing if a background
polldaemon is active for a given LV the following limitations exist and
have been deemed acceptable:
1) it is not possible to stop an active polldaemon; so the lvm2-monitor
   initscript doesn't stop running polldaemon(s)
2) redundant polldaemon instances will be started for all specified LVs
   if vgchange or lvchange are repeatedly used with '--poll y'

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2010-01-05 20:56:51 +00:00
Milan Broz
0e06c92fdf Propagate commit and revert metadata event to other nodes in cluster.
This patch tries to correctly track changes in lvmcache related to commit/revert.

For vg_commit: if there is cached precommitted metadata, after successfull commit
these metadata must be tracked as committed.

For vg_revert: remote nodes must drop precommitted metadata and its flag in lvmcache.

(N.B. Patch do not touch LV locks here in any way.)

All this machinery is needed to properly solve remote node cache invalidaton which
cause several problems recently observed.
2010-01-05 16:09:33 +00:00
Milan Broz
c9118a1d20 Proper mask lock mode for vg lock.
Lock mode is int masked by LCK_TYPE_MASK, always.

Patch also remove uneccessary masking lock flag on sender side,
if masking is needed, it is don on client side already.
2010-01-05 16:07:56 +00:00
Milan Broz
d37be0b865 Add possibility to handle precommitted metadata in lvmcache.
- Add drop_precommitted flag to force drop precommitted metadata
 - add lvmcache_commit_metadata() which upgrades precommitted metadata in cache

No functional change in this patch - just preparation for following change.
2010-01-05 16:06:42 +00:00
Milan Broz
ac85c2e75b Move processing of VG locks to separate function (similar to LV locks).
And print some debugging info.

No functional change in this patch.
2010-01-05 16:05:12 +00:00
Milan Broz
d7f44761ab Properly decode flags even for VG locks.
And decode flags in humar readable form in client.
And clean some trailing whitespaces.

No functional change in this patch (only debugging messages changed).
2010-01-05 16:03:37 +00:00
Milan Broz
4b1687fb74 Do not set precommitted flag in cache when precommitted metadata does not exist.
The use_precommitted flag indicates, that we want to use precommitted metadata
(used in suspend call to preload table with precommitted data).

But if there are no such data, committed metadata are read but the cache
still contains that precommitted flag.

(The problem is that later possible drop_metadata call will not invalidate
device in cache.)

The wrong precommitted state is stored in on remote nodes during normal
suspend/resume cycle _without_ vg_write/commit.

Use the PRECOMMITTED status flag here instead (which is always set if using
precommited metadata here).
2010-01-05 16:01:22 +00:00
Milan Broz
60494fe74b Resume volumes in reverse order to preserve memlock pairing.
If renaming snapshot with virtual origin, the origin is renamed too.
But the code must resume LVs in reverse order to properly
pair memlock (in cluster locking).

(The resume of snapshot resumes origin too and later resume
is ignored otherwise.)
2010-01-05 15:58:11 +00:00
Milan Broz
cfe30f1df3 Drop metadata cache after device was autorepaired and removed from VG.
All long running processes must reload metadata when some
device becomes orphan after repair.
2009-12-18 12:45:41 +00:00
Milan Broz
aa02928ff7 Remove missing flag if PV reappeared and is empty.
When PV device reappears with old metadata, it is
always updated to new version byt atutomatic metadata
repair.

Remove missing flag if device is empty.

If device contains allocated extents, issue warning that
user must remove volumes and re-add this PV before
manipulating with this volume.

This partially solves bug 547842 when one PV (log) is failed,
dmeventd removes that device and later this device reappears and
is wrongly added into VG marked missing.
2009-12-18 12:44:20 +00:00
Zdenek Kabelac
735308699c Destroy allocated mempool in _vg_read_orphans() error path. 2009-12-11 13:14:44 +00:00
Zdenek Kabelac
685be1dc7a Fix unlocking vg in some pvresize and toollib error paths. 2009-12-11 13:11:56 +00:00
Milan Broz
34de60e4d4 Call explicitly suspend for temporary mirror layer.
The memlock_inc() fix is wrong, memlock count is not
propagated to long living process (clvmd) and just
it underflow there.
Also suspend is needed to pre-load precommited metadata
on other nodes (remapping to error taget in this case).

With explicit suspend we generate lock request and code
can update memlock count.

(Infinitely "locked" memory caused that fs_unlock() was not
called properly and on cluster nodes remains
old links in /dev/mapper for not active devices.)

(N.B. failing of suspend call here is not handled as fatal
error - the LV is going to be removed later anyway.)
2009-12-09 19:53:39 +00:00
Milan Broz
0fa0e6addf Allow manipulation with precommited metadata even when a PV is missing.
The new recovery code first tries to repair LV and then removes failed PV
from VG. It means that during operation there can be VG with PV missing,
and vg_read code handles it like not consistent VG.

We already allows returning "inconsistent" commited metadata,
for mirror repair we need this for precommited too.
(The suspend call prepares precommited metadata to inactive table on
other cluster nodes.)

"Inconsistent" here means - correct metadata, just with some metadata areas
not found (obviously on missing or failed PVs).
2009-12-09 19:29:04 +00:00
Milan Broz
27132718d4 Add memlock information to do_lock_lv debug output.
This helps a lot to detect that something strange happened.
2009-12-09 19:01:27 +00:00
Milan Broz
85fabd8116 Never ever use distributed lock for LV in non-clustered VG.
The LV locks make sense only for clustered LVs.

Properly check cluster flag and never issue cluster lock here.

There are several places in code, where it is already checked, this
patch add this check to all needed calls.

In previous code the lock behaviour was inconsistent,
for example, the pre/post callback can take lock even for local volume,
but deactivate call do not released this lock and it remains held forever.

The local LV lock request now just let run the underlying activation code
on local node, the same process like in local locking.

(Again, this is important for new mirror repair calls, here for local
mirrors but with cluster locking enabled.)
2009-12-09 19:00:16 +00:00
Milan Broz
4b3efd3537 Allow implicit lock conversion for pre/post callbacks.
This is unnoticed regression from commit 31672ff60e

The pre/post callback need to convert lock always, local node
is going to modify metadata in this case, it it fails conversion,
the call is ignored.

Also it fixes bug when the lock is not yet held, we cannot set LKF_CONVERT
in this case, it will fail because this lock do not exist.

Note that the automatic conversion is still disabled in activate
call, so the original fix (reactivation of exlusive LV) should
be still in place.
2009-12-09 18:55:53 +00:00
Milan Broz
4499aa2ea5 Allow implicit "convert" to the same lock mode.
(Code already not fail if unlocking not locked resource.)

This is needed in pre/post lock_lv call, where we can
request the same lock on local node becuase of suspend call.
2009-12-09 18:45:12 +00:00
Milan Broz
cc31b2bd3f Get rid of magic masks in cluster locking code - clvmd part.
- do_command and lock_vg expect flags (no change here)

Bug fixes:
- lock_vg should check for NONBLOCK on lock_cmd, flags have this bit masked-out

- do_pre/post_command expect do not mask flag at all, this causes that
the code inside is never run! (see following patches, these functions
expect plain command without flags)
2009-12-09 18:42:02 +00:00
Milan Broz
f72a06ccf7 Remove newly created log volume if initial deactivation fails.
If there is problem deactivate LV and
_init_mirror_log is called with remove_on_failure = 1,
remove the newly created log LV from metadata.

(This can happen if there is active device with the same name
but different UUID.)

The main reason for this "workaround" patch is to
 - do not keep _mlog volume in metadata, so user can repeat the action
 - print better error message describing the real problem

# lvcreate -m 2 -n lv1 -l 1 --nosync vg_bar
  WARNING: New mirror won't be synchronised. Don't read what you didn't write!
  /dev/vg_bar/lv1_mlog: not found: device not cleared
  Aborting. Failed to wipe mirror log.
  Error locking on node bar-01: Input/output error
  Unable to deactivate mirror log LV. Manual intervention required.
  Failed to create mirror log.

# lvcreate -m 2 -n lv1 -l 1 --nosync vg_bar
  WARNING: New mirror won't be synchronised. Don't read what you didn't write!
  Aborting. Unable to deactivate mirror log.
  Failed to initialise mirror log.
2009-12-09 18:09:52 +00:00
Peter Rajnoha
05d08428b3 WHATS_NEW for previous commit. 2009-12-04 14:26:22 +00:00
Milan Broz
63ae0d1464 Fix memory lock imbalance in lv_suspend if already suspended.
pvmove suspends all moved LVs + pvmoveX mirrored LV itself.

This suspends even underlying pvmoveX and following explicit
suspend call is just noop.

But in resume the pvmoveX volume is no longer underlying
device for moved LVs, so it performs full resume with memlock
decrease.

Code must call memlock_inc() if suspend is requested, volume
is already suspended and error is not requested.
2009-12-03 19:23:40 +00:00
Milan Broz
29f011314d Fix pvmove test mode to not fail and do not poll.
Test mode should not fail nor try to poll non-existent devices.
2009-12-03 19:22:24 +00:00
Milan Broz
b917086464 Print error if VG already exist.
This test have to be moved because of new vg read error handling.
2009-12-03 19:20:48 +00:00
Milan Broz
fec4de9563 Fix tools to report error when stopped by user.
(And do not produce internal error message.)
2009-12-03 19:18:33 +00:00
Dave Wysochanski
e4e8cf3b59 Add tests to check for readahead value in lvcreate. 2009-12-03 01:48:05 +00:00
Milan Broz
0548bcc2dc Fix memory leak in lv_info_by_lvid
The lv_from_lvid calls internally vg_read(),
we must release vg structure afterwards.

Code is called only from clvmd.
2009-12-01 19:10:23 +00:00
Milan Broz
5800aa5c07 Do not allow creating mirrors of more than 8 images.
This is kernel limitation in all kernel versions,
so better detect this early.
2009-11-27 14:35:38 +00:00
Milan Broz
16e033b91a Use locking_type 3 (compiled in cluster locking) in lvmconf. 2009-11-27 14:32:16 +00:00
Dave Wysochanski
ccb601a3cb Remove unnecessary / duplicate dm_list macros and functions.
These are no longer used by anyone.  The dm_list defines are all in
libdevmapper.h and libdm/datastruct/list.c contains any function definitions.
There is some code in "old-tests" that still use this but this code is not
being maintained.

Thanks to Zdenek for spotting this.
2009-11-25 20:44:07 +00:00
Alasdair Kergon
32780caae3 Log failure type and recognise type 'F' (flush) in dmeventd mirror plugin. 2009-11-25 15:59:07 +00:00
Mike Snitzer
a2552d4f59 Switch status from 32-bit to 64-bit
The physical_volume, volume_group, logical_volume and lv_segment
structures' 'status' member is now uint64_t.

The alignment of these structures was also audited to remove holes.  The
movement of some members in 'volume_group' and 'lv_segment' eliminates
holes.  The 'physical_volume' structure still has one 4-byte hole after
'pe_size'; the other structures no longer have any holes.  Each
structures' size has not changed.
2009-11-24 22:55:55 +00:00
Alasdair Kergon
b1bee9cd52 Post-release.
Fingers crossed this one's more successful that the last one!
2009-11-24 19:04:23 +00:00
Alasdair Kergon
13b665e481 . 2009-11-24 18:54:23 +00:00
Alasdair Kergon
2b2c5617d6 pre-release 2009-11-24 18:26:08 +00:00
Milan Broz
fed0e904f2 Add missing vg_release to pvs and pvdisplay to fix memory leak. 2009-11-24 17:07:09 +00:00
Milan Broz
0025670dc9 Do not try to unlock VG which is not locked.
If the vg_read() returned error, no lock was taken,
so always call vg_release().

Otherwise this can happen because of missing FAILED_*:

# vgchange -a y x --ignorelockingfailure
  Volume group "x" not found
  Internal error: Attempt to unlock unlocked VG x
2009-11-24 16:13:02 +00:00
Milan Broz
cd501dd440 Move persistent filter dump to more appropriate place.
After context_refresh is cache empty, the cache flush
does nothing.

Call it after lvmcache full rescan if running from
log lived process.
2009-11-24 16:11:37 +00:00
Milan Broz
e1ab01e3ad Refresh device filters before full device rescan in lvmcache.
The sysfs filter initialise hash of available devices using
scan of /sys/block. We need to refresh even this hash
when performing full scan otherwise the newly appeared
device could be rejected, because there is no entry
in sysfs filter.

This easily could happen when attaching new device
to cluster node. (Only force refresh of context
in clvmd -R works here now).

Unfortunately consequences of this are much worse,
missing device part on that node is replaced with missing segment
(even when no partial arg is selected) and this directly
lead to data corruption.

See https://bugzilla.redhat.com/show_bug.cgi?id=538515

Simply fix it by refreshing device filters in lvmcache
before performing the full device scan.
2009-11-24 16:10:25 +00:00
Milan Broz
155c608cd3 Return error status if vgchange fails to activate some volume.
(on one node a storage connection failed):

# vgchange -a y vg_bar ; echo $?
  Error locking on node bar-02: Refusing activation of partial LV lv1. Use --partial to override.
    1 logical volume(s) in volume group "vg_bar" now active
    0

So activation fails on one node, error is correctly printed but
status code is wrong.

This patch fixes the top level (vgchange) to return proper code
(and print # of activated LVs).

(lvchange returns error properly here.)
2009-11-24 16:08:49 +00:00
Milan Broz
6b8304ab43 Fix memory lock imbalance in locking code.
(This affects only cluster locking because only cluster
locking module set LCK_PRE_MEMLOCK.)

With currect code you get
# vgchange -a n
  Internal error: _memlock_count has dropped below 0.
when using cluster locking.

It is caused by _unlock_memory calls here

  if ((flags & (LCK_SCOPE_MASK | LCK_TYPE_MASK)) == LCK_LV_RESUME)
     memlock_dec();

Unfortunately it is also (wrongly) called in immediate unlock
(when LCK_HOLD is not set) from lock_vol
(LCK_UNLOCK is misinterpreted as LCK_LV_RESUME).

Avoid this by comparing original flags and provide memlock
code type of operation (suspend/resume).
2009-11-23 10:55:14 +00:00
Milan Broz
a4893bc377 Revert vg_read_internal change, clvmd cannot use vg_read now. (2.02.55) 2009-11-23 10:44:50 +00:00