IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
teardown sequence. (Previously the snapshot was deactivated while its
origin was active and before its removal was committed to disk, so
restarting after a crash at the point would leave corruption.)
When an LVM mirror is up-converted (an additional image added), it creates
a temporary mirror stack. The lower-level mirror in the stack that is
created was not being activated exclusively - violating the exclusive nature
of the original mirror. We now check for exclusive activation of a mirror
before converting it, and if found, we ensure that the temporary mirror
is also exclusively activated.
Mirrors used to be created by first creating a linear device and then adding
the other images plus the log. Now mirrors are created by creating all the
images in one go and then adding the log separately. The new way ran into
the condition that cluster mirrors cannot change the log type (in the case
of creation, from core -> disk) while the mirror is not active. (It isn't
active because it is in the process of being created.) The reason this
condition is in place is because a remote node may have the mirror active, and
we don't want to alter the log underneath it.
What we really needed was a way of checking if the mirror was active remotely
but not locally, and in that case do not allow a change of the log. I've added
this check, and cluster mirrors can now be created again.
It's useful to keep the partial flag cached - so just move the call
for vg_mark_partil_lvs() into import_vg_from_config_tree() so it gets
evaluated before it goes through the lvmcache.
This patch should not present any functional change.
Note: It is rather temporal solution - proper place is probably inside the
'read' call back - but needs some more discussion.
For now using this minor hack.
As the ACTIVATE_EXCL could be set only in clvmd code - there is no
use for this test in lv_add_mirrors() function only called from
tools context.
FIXME: Add cluster test case for this.
To avoid modification of 'read-only' volume group structure
add a new structure to pass local data around the code for LV
activation.
As origin_only is one such flag - replace this parameter with new
struct lv_activate_opts.
More parameters might eventually become part of lv_activate_opts.
transient error), stemming from the following sequence of events:
1) devices fail IO, triggering repair
2) dmeventd starts fixing up the mirror
3) during the downconversion, a new metadata version is written
--> the devices come back online here
4) the mirror device suspend/resume is called to update DM tables
5) during the suspend/resume cycle, *pre*-commit metadata is read;
however, since the failed devices are now back online, we get back
inconsistent set of precommit metadata and the whole operation fails
The patch relaxes the check that fails in step 5 above, namely by ignoring
inconsistencies coming from PVs that are marked MISSING.
are affected by the move. (Currently it's possible for I/O to become
trapped between suspended devices amongst other problems.
The current fix was selected so as to minimise the testing surface. I
hope eventually to replace it with a cleaner one that extends the
deptree code.
Some lvconvert scenarios still suffer from related problems.
Currently some operation with striped mirrors lead
to corrupted metadata, this patch just add detection of such
situation.
Example:
# lvcreate -i2 -l10 -n lvs vg_test
# lvconvert -m1 vg_test/lvs
# lvreduce -f -l1 vg_test/lvs
Reducing logical volume lvs to 4.00 MiB
Segment extent reduction 9not divisible by #stripes 2
Logical volume lvs successfully resized
# lvremove vg_test/lvs
Segment extent reduction 1not divisible by #stripes 2
LV segment lvs:0-4294967295 is incorrectly listed as being used by LV lvs_mimage_0
Internal error: LV segments corrupted in lvs_mimage_0.
Avoid using of already released memory when duplicated MDA is found.
As get_pv_from_vg_by_id() may call lvmcache_label_scan() use the local copy
of the vgname and vgid on the stack as vginfo may dissapear and code was
then accessing garbage in memory.
i.e. pvs /dev/loop0
(when /dev/loop0 and /dev/loop1 has same MDA content)
Invalid read of size 1
at 0x523C986: dm_hash_lookup (hash.c:325)
by 0x440C8C: vginfo_from_vgname (lvmcache.c:399)
by 0x4605C0: _create_vg_text_instance (format-text.c:1882)
by 0x46140D: _text_create_text_instance (format-text.c:2243)
by 0x47EB49: _vg_read (metadata.c:2887)
by 0x47FBD8: vg_read_internal (metadata.c:3231)
by 0x477594: get_pv_from_vg_by_id (metadata.c:344)
by 0x45F07A: _get_pv_if_in_vg (format-text.c:1400)
by 0x45F0B9: _populate_pv_fields (format-text.c:1414)
by 0x45F40F: _text_pv_read (format-text.c:1493)
by 0x480431: _pv_read (metadata.c:3500)
by 0x4802B2: pv_read (metadata.c:3462)
Address 0x652ab80 is 0 bytes inside a block of size 4 free'd
at 0x4C2756E: free (vg_replace_malloc.c:366)
by 0x442277: _free_vginfo (lvmcache.c:963)
by 0x44235E: _drop_vginfo (lvmcache.c:992)
by 0x442B23: _lvmcache_update_vgname (lvmcache.c:1165)
by 0x443449: lvmcache_update_vgname_and_id (lvmcache.c:1358)
by 0x443C07: lvmcache_add (lvmcache.c:1492)
by 0x46588C: _text_read (text_label.c:271)
by 0x466A65: label_read (label.c:289)
by 0x4413FC: lvmcache_label_scan (lvmcache.c:635)
by 0x4605AD: _create_vg_text_instance (format-text.c:1881)
by 0x46140D: _text_create_text_instance (format-text.c:2243)
by 0x47EB49: _vg_read (metadata.c:2887)
Add testing script
pv_manip.c to properly account for case when pe_start=0 and the first
physical extent is to be released (currently skip the first extent to
avoid discarding the PV label).
My previous patch fixed incorrect error check for dm_snprintf.
However in this particular case - dm_snprintf has been used differently -
just like strncpy + setting last char with '\0' - so the code had to return
error - because the buffer was to short for whole string.
Patch replaces it with real strncpy.
Also test for alloca() failure is removed - as the program behaviour
is rather undefined in this case - it never returns NULL.
lvseg properties for lvm2app, 'devices' and 'seg_pe_ranges'.
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Reviewed-by: Petr Rockai <prockai@redhat.com>
dm_pool_free incorrectly. This check-in fixes that incorrect usage.
I've also added a WHATS_NEW line to reflect the changes I made to allow
lv_extend to operate on 0 length intrinsically layered LVs (i.e mirrors
and RAID). I forgot that in the last commit.
allows us to allocate all images of a mirror (or RAID array) at one
time during create.
The current mirror implementation still requires a separate allocation
for the log, however.
Actually, we can call vg_set_fid(vg, NULL) instead of calling
destroy_instance for all PV structs and a VG struct - it's the same
code we already have in the vg_set_fid.
Also, allow exactly the same fid to be set again for the same PV/VG
Before, this could end up with the fid destroyed because we destroyed
existing fid first and then we used the new one and we didn't care
whether existing one == new one by chance.
Instead of searching linear list of all LVs, PVs - use created hash tables
also for quick mapping between LV.
(Note - for small number of PVs or LVs the overhead of the hash is bigger).
TODO: Use hash tables in volume_group structure directly.
If _move_lv_segments is passed a 'lv_from' that does not yet
have any segments, it will screw things up because the code
that does the segment copy assumes there is at least one
segment. See copy code here:
lv_to->segments = lv_from->segments;
lv_to->segments.n->p = &lv_to->segments;
lv_to->segments.p->n = &lv_to->segments;
If 'segments' is an empty list, the first statement copies over
the values, but the next two reset those values to point to the
other LV's list structure. 'lv_to' now appears to have one
segment, but it is really an ill-set pointer.
other cases, the code could wind up removing wrong number of mirrors. In yet
other cases, we could remove the right number of mirrors, but fail to respect
the removal preferences (i.e. keep an image that was requested to be removed
while removing an image that was requested to be kept). Under some
circumstances, remove_mirror_images could also get stuck in an infinite loop.
This patch should fix all of the above undesirable behaviours.
Signed-off-by: Petr Rockai <prockai@redhat.com>
Reviewed-by: Jonathan Brassow <jbrassow@redhat.com>
Attach \0 for proper char* display - otherwise somewhat random message could
be displayed in debug more and read of unpredictable read of uninitilized
memory values could happen.
As code uses strncpy(system_id, NAME_LEN) and doesn't set '\0'
Fix it by always allocating NAME_LEN + 1 buffer size and with zalloc
we always get '\0' as the last byte.
This bug may trigger some unexpected behavior of the string operation
code - depends on the pool allocator.
FIXME: refactor this code to alloc_vg.
Missing free_vg on error_path in lvmcache_get_vg fn. Call destroy_instance
only if the fid is not part of the vg in backup_read_vg fn (otherwise it's
part of the VG we're returning and we definitely don't want to destroy it!).
This is necessary for proper format instance ref_count support. We iterate
over vg->pvs and vg->removed_pvs list and the ref_count is decremented and
then it is destroyed if not referenced anymore.
Since format instances will use own memory pool, it's necessary to properly
deallocate it. For now, only fid is deallocated. The PV structure itself
still uses cmd mempool mostly, but anytime we'd like to add a mempool
in the struct physical_volume, we can just rename this fn to free_pv and
add the code (like we have free_vg fn for VGs).
This is essential for proper format instance ref_count support. We must
use these functions to set the fid everywhere from now on, even the NULL
value!
Format instances can be created anytime on demand and it contains
metadata area information mostly (at least for now, but in the future,
we may store more things here to update/edit in a PV/VG). In case we
have lots of metadata areas, memory consumption will rise. Using cmd
context mempool is not quite optimal here because it is destroyed too
late. So let's use a separate mempool for format instances.
Reference counting is used because fids could be shared, e.g. each PV
has either a PV-based fid or VG-based fid. If it's VG-based, each PV has
a shared fid with the VG - a reference to VG's fid.