IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
When checking minimum mda size, make sure the mda_size after alignment
and calculation is more than 0 - if there's no place for an MDA at the
end of the disk, the _text_pv_add_metadata_area does not try to add it
there and it returns (because we already have the MDA at the start of
the disk at least).
Actually, we don't need extra condition as introduced in commit
00348c0a63. We should fix the last
condition:
(mdac->rlocn.size >= mdah->size)
...which should be:
(MDA_HEADER_SIZE + (rlocn ? rlocn->size : 0) + mdac->rlocn.size >= mdah->size))
Where the "mdac" is new metadata, the "rlocn" is old metadata.
So the main problem with the previous condition was that it
didn't count in MDA_HEADER_SIZE properly (and possible existing
metadata - the "rlocn"). This could have caused the error state
where metadata in ring buffer overlap to not be hit.
Replace the new condition introduced in 00348c0a63
with the improved one for the condition that existed there
already but it was just incomplete.
We're already checking whether old and new meta do not overlap in
ring buffer (as we need to keep both old and new meta during vg_write
up until vg_commit).
We also need to check whether the new metadata do not overlap
themselves in case we don't have old metadata yet (...because
we're in vgcreate). This could happen if we're creating a VG so
that the very first metadata written are long enough that it wraps
themselves in metadata ring buffer.
Although we limited the minimum metadata area size better with the
previous commit ccb8da404d which
makes the initial VG metadata overlap in ring buffer to be less
probable, the risk of hitting this overlap condition is still there
if we still manage to generate big enough metadata somehow.
For example, users can provide many and/or long VG tags during vgcreate
so that the VG metadata is long enough to start to wrap in the ring
buffer again...
Also, leave out the note about "circular buffer" which is
an internal imeplementation detail anyway and not quite
informational for users:
Before this patch:
$ vgcreate vg1 /dev/sda
VG vg1 metadata too large for circular buffer
Failed to write VG vg1.
With this patch applied:
$ vgcreate vg1 /dev/sda
VG vg1 metadata too large: size of metadata to write is 691 bytes while PV metadata area size on /dev/sda is 512 bytes.
Failed to write VG vg1.
When using lvm shell, some structures which are cached in memory may be
reused. This happens for the struct label (a part of lvmcache_info
structure) when lvmetad is used in which case the PV scan is not
done that would normally overwrite these label structures in memory
and making them up-to-date.
This is all consequence of the fact that struct lvmcache_info and
struct label are not always assigned in the same part of the code.
For example, if lvmetad *is not* used, parts of the struct label are
reassigned in label_read fn while struct lvmcache_info is created
elsewhere. No part of the code reused struct label (and its "dev"
field) before calling label_read fn. That's why the real bug is
hidden when using lvm shell without lvmetad.
However, with lvmetad and lvm shell, the situation is a bit different.
The label_read fn is not called if lvmetad *is* used, hence the
struct label may have ended up not initialized properly.
There was missing assignment for the dev field in struct label
in _text_pv_write fn which caused this problem to appear in
lvm shell with lvmetad, for example:
Before this patch:
lvm> pvcreate /dev/sda
Physical volume "/dev/sda" successfully created
lvm> pvs /dev/sda
PV VG Fmt Attr PSize PFree
unknown device lvm2 --- 128.00m 128.00m
With this patch applied:
lvm> pvcreate /dev/sda
Physical volume "/dev/sda" successfully created
lvm> pvs /dev/sda
PV VG Fmt Attr PSize PFree
/dev/sda lvm2 --- 128.00m 128.00m
Also, this problem had not appeared before changes introduced
by commits e1a63905d1 through
3a6f91d713 which, among other
things, added proper label field type reporting. Before, label
reporting was the same as using struct physical_volume which
has its own dev field assigned and so this problem was not exposed.
This reverts commit 70db1d523d.
Since we use 'strncpy' even for case where it exactly matches
the buffer size and \0 is not expected to be added there.
vgsummary information contains provisional VG information
that is obtained without holding the VG lock. This info
can be used to lock the VG, and then read it with vg_read().
After the VG is read properly, the vgsummary info should
be verified.
Add the VG lock_type to the vgsummary. It needs to be
known before the VG can be locked and read.
Use 64bit arithmentic for PV size calculation (Coverity).
Also remove sector shift for compared PV size, since all
values are already held in sectors.
This fixes validatio of PV size when restoring PV
from vg metadata backup file.
When performing initial allocation (so there is nothing yet to
cling to), use the list of tags in allocation/cling_tag_list to
partition the PVs. We implement this by maintaining a list of
tags that have been "used up" as we proceed and ignoring further
devices that have a tag on the list.
https://bugzilla.redhat.com/983600
pv_write is called both to write orphans and to rewrite PV headers
of PVs in VGs. It needs to select the correct VG id so that the
internal cache state gets updated correctly.
It only affected commands that involved further steps after
the pv_write and was often masked because the metadata would
be re-read off disk and correct itself.
"Incorrect metadata area header checksum" warnings appeared.
Example:
Create vg1 containing dev1, dev2 and dev3.
Hide dev1 and dev2 from the system.
Fix up vg1 with vgreduce --removemissing.
Bring back dev1 and dev2.
In a single operation reinstate dev1 and dev2 into vg1 (vgextend).
Done as separate operations (automatically fix-up dev1 and dev2 as orphans,
then vgextend) it worked, but done all in one go the internal cache got
corrupted and warnings about checksum errors appeared.
This avoids a problem in which we're using selection on LV list - we
need to do the selection on initial state and not on any intermediary
state as we process LVs one by one - some of the relations among LVs
can be gone during this processing.
For example, processing one LV can cause the other LVs to lose the
relation to this LV and hence they're not selectable anymore with
the original selection criteria as it would be if we did selection
on inital state. A perfect example is with thin snapshots:
$ lvs -o lv_name,origin,layout,role vg
LV Origin Layout Role
lvol1 thin,sparse public,origin,thinorigin,multithinorigin
lvol2 lvol1 thin,sparse public,snapshot,thinsnapshot
lvol3 lvol1 thin,sparse public,snapshot,thinsnapshot
pool thin,pool private
$ lvremove -ff -S 'lv_name=lvol1 || origin=lvol1'
Logical volume "lvol1" successfully removed
The lvremove command above was supposed to remove lvol1 as well as
all its snapshots which have origin=lvol1. It failed to do so, because
once we removed the origin lvol1, the lvol2 and lvol3 which were
snapshots before are not snapshots anymore - the relations change
as we're processing these LVs one by one.
If we do the selection first and then execute any concrete actions on
these LVs (which is what this patch does), the behaviour is correct
then - the selection is done on the *initial state*:
$ lvremove -ff -S 'lv_name=lvol1 || origin=lvol1'
Logical volume "lvol1" successfully removed
Logical volume "lvol2" successfully removed
Logical volume "lvol3" successfully removed
Similarly for all the other situations in which relations among
LVs are being changed by processing the LVs one by one.
This patch also introduces LV_REMOVED internal LV status flag
to mark removed LVs so they're not processed further when we
iterate over collected list of LVs to be processed.
Previously, when we iterated directly over vg->lvs list to
process the LVs, we relied on the fact that once the LV is removed,
it is also removed from the vg->lvs list we're iterating over.
But that was incorrect as we shouldn't remove LVs from the list
during one iteration while we're iterating over that exact list
(dm_list_iterate_items safe can handle only one removal at
one iteration anyway, so it can't be used here).
The code never mixes reads of committed and precommitted metadata,
so there's no need to attempt to set PRECOMMITTED when
*use_previous_vg is being set.
Refactor the recent metadata-reading optimisation patches.
Remove the recently-added cache fields from struct labeller
and struct format_instance.
Instead, introduce struct lvmcache_vgsummary to wrap the VG information
that lvmcache holds and add the metadata size and checksum to it.
Allow this VG summary information to be looked up by metadata size +
checksum. Adjust the debug log messages to make it clear when this
shortcut has been successful.
(This changes the optimisation slightly, and might be extendable
further.)
Add struct cached_vg_fmtdata to format-specific vg_read calls to
preserve state alongside the VG across separate calls and indicate
if the details supplied match, avoiding the need to read and
process the VG metadata again.
Detect an lvm1 system id by looking at the WRITE_LOCKED flag.
Don't copy this lvm1 system id into vg->system_id so that the
restrictions associated with the new system id are not applied
to the old VG with the inherited lvm1 system id.
Use similar logic as with text_vg_import_fd() and avoid repeated
parsing of same mda and its config tree for vgname_from_mda().
Remember last parsed vgname, vgid and creation_host in labeller
structure and if the metadata have the same size and checksum,
return this stored info.
TODO: The reuse of labeller struct is not ideal, some lvmcache API for
this functionality would be nicer.
When reading VG mda from multiple PVs - do all the validation only
when mda is seen for the first time and when mda checksum and length
is same just return already existing VG pointer.
(i.e. using 300PVs for a VG would lead to create and destroy 300 config trees....)
Previous versions of lvm will not obey the restrictions
imposed by the new system_id, and would allow such a VG
to be written. So, a VG with a new system_id is further
changed to force previous lvm versions to treat it as
read-only. This is done by removing the WRITE flag from
the metadata status line of these VGs, and putting a new
WRITE_LOCKED flag in the flags line of the metadata.
Versions of lvm that recognize WRITE_LOCKED, also obey the
new system_id. For these lvm versions, WRITE_LOCKED is
identical to WRITE, and the rules associated with matching
system_id's are imposed.
A new VG lock_type field is also added that causes the same
WRITE/WRITE_LOCKED transformation when set. A previous
version of lvm will also see a VG with lock_type as read-only.
Versions of lvm that recognize WRITE_LOCKED, must also obey
the lock_type setting. Until the lock_type feature is added,
lvm will fail to read any VG with lock_type set and report an
error about an unsupported lock_type. Once the lock_type
feature is added, lvm will allow VGs with lock_type to be
used according to the rules imposed by the lock_type.
When both system_id and lock_type settings are removed, a VG
is written with the old WRITE status flag, and without the
new WRITE_LOCKED flag. This allows old versions of lvm to
use the VG as before.
Set ACCESS_NEEDS_SYSTEM_ID VG status flag whenever there is
a non-lvm1 system_id set. Prevents concurrent access from
older LVM2 versions.
Not set on VGs that bear a system_id only due to conversion
from lvm1 metadata.
format_text processes both lvm2 on-disk metadata and metadata read
from other sources such as backup files. Add original_fmt field
to retain the format type of the original metadata.
Before this patch, /etc/lvm/archives would contain backups of
lvm1 metadata with format = "lvm2" unless the source was lvm1 on-disk
metadata.
When checking whether the system ID permits access to a VG, check for
each permitted situation first, and only then issue the appropriate
error message. Always issue a message for now. (We'll try to
suppress some of those later when the VG concerned wasn't explicitly
requested.)
Add more messages to try to ensure every return code is checked and
every error path (and only an error path) contains a log_error().
Add self-correction to vgchange -c to deal with situations where
the cluster state and system ID state are out-of-sync (e.g. if
old tools were used).
Support error_if_no_space feature for thin pools.
Report more info about thinpool status:
(out_of_data (D), metadata_read_only (M), failed (F) also as health
attribute.)