mirror of git://sourceware.org/git/lvm2.git synced 2024-12-22 17:35:59 +03:00

1191 Commits

Author SHA1 Message Date
Dave Wysochanski
b1ef78d000 Add supporting functions vg_name_dup, vg_fmt_dup, vg_system_id_dup.
Add supporting functions for vg_name, vg_fmt, vg_system_id.
Append "_dup" to end of supporting functions to make clear the strings
are dup'd and to avoid namespace conflict with vg_name.
2010-09-30 14:08:33 +00:00
Dave Wysochanski
c508945ca9 Add pv_tags_dup, vg_tags_dup, lv_tags_dup functions that call tags_format_and_copy. 2010-09-30 14:08:19 +00:00
Dave Wysochanski
f15033c0e1 Add tags_format_and_copy() common function and call from _tags_disp.
Add a common function to allocate memory and format a string of
tags.
Call tags_format_and_copy() from _tags_disp().
2010-09-30 14:08:07 +00:00
Dave Wysochanski
254d672dcc Add pv_uuid_dup, vg_uuid_dup, and lv_uuid_dup, and call id_format_and_copy.
Add supporting functions for pv_uuid, vg_uuid, and lv_uuid.
Call new function id_format_and_copy.  Use 'const' where appropriate.
Add "_dup" suffix to indicate memory is being allocated.
Call {pv|vg|lv}_uuid_dup from lvm2app uuid functions.
2010-09-30 14:07:47 +00:00
Dave Wysochanski
4bbadbe1cf Simplify logic to create 'attr' strings.
This patch addresses code review request to simplify creation of 'attr'
strings.  The simplification is done in this separate patch to more
easily review and ensure the simplification is done without error.
2010-09-30 14:07:19 +00:00
Dave Wysochanski
14663348d0 Add {pv|vg|lv}_attr_dup() functions and refactor 'disp' functions.
Move the creating of the 'attr' strings into a common function so
they can be called from the 'disp' functions as well as the new
'get' property functions.
Add "_dup" suffix to indicate memory is allocated.
Refactor pvstatus_disp to take pv argument and call pv_attr_dup().
2010-09-30 13:52:55 +00:00
Dave Wysochanski
e32e2eb011 Add lib/metadata/vg.[ch] and lib/metadata/lv.[ch].
These got missed when git cvsexportcommit was used.
2010-09-30 13:16:55 +00:00
Dave Wysochanski
b88b638d6e Add lib/metadata/pv.[ch] new files.
Apparently git cvsexportcommit does not properly add new files
from a git commit.
2010-09-30 13:15:42 +00:00
Dave Wysochanski
b171907fc5 Refactor metadata.[ch] into lv.[ch] for lv functions.
This patch is similar to the other patches for pv and vg
functionality, and separates lv functionality into separate
files, concentrating on reporting fields and simple functions.
2010-09-30 13:05:45 +00:00
Dave Wysochanski
f42b708eae Refactor metadata.[ch] into pv.[ch] for pv functions.
The metadata.[ch] files are very large.  This patch makes a first
attempt at separating out pv functions and data, particularly
related to the reporting fields calculations.

More code could be moved here, but for now I'm stopping at the reporting
'get' / 'set' functions.
2010-09-30 13:05:20 +00:00
Dave Wysochanski
81f0124a58 Refactor metadata.[ch] into vg.[ch] for vg functions.
The metadata.[ch] files are very large.  This patch makes a first
attempt at separating out vg functions and data, particularly
related to the reporting fields calculations.
2010-09-30 13:04:55 +00:00
Peter Rajnoha
bad35c6554 Add escape sequence for ':' and '@' found in device names used as PVs. 2010-09-23 12:02:33 +00:00
Milan Broz
c7af31dbd7 Fix return type qualifier to avoid compiler warning.
introduced in commit b16b4d92a7
"Improve various log messages."

fixes a lot of
../include/metadata.h:148: warning: type qualifiers ignored on function return type
2010-08-26 12:08:19 +00:00
Mike Snitzer
4efb1d9cbb Update heuristic used for default and detected data alignment.
Add "devices/default_data_alignment" to lvm.conf to control the internal
default that LVM2 uses: 0==64k, 1==1MB, 2==2MB, etc.

If --dataalignment (or lvm.conf's "devices/data_alignment") is specified
then it is always used to align the start of the data area.  This means
the md_chunk_alignment and data_alignment_detection are disabled if set.

(Same now applies to pvcreate --dataalignmentoffset, the specified value
will be used instead of the result from data_alignment_offset_detection)

set_pe_align() still looks to use the determined default alignment
(based on lvm.conf's default_data_alignment) if the default is a
multiple of the MD or topology detected values.
2010-08-20 20:59:05 +00:00
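The "devices/default_data_alignment" encoding quoted in commit 4efb1d9cbb (0==64k, 1==1MB, 2==2MB, etc.) reads as a size in megabytes, with 0 reserved for the old 64k default. A minimal sketch of that mapping follows. The function name default_alignment_kb is hypothetical and this is not the lvm2 implementation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not lvm2 code): translate the lvm.conf
 * "devices/default_data_alignment" setting into kilobytes, following
 * the mapping in the commit message: 0 == 64k, 1 == 1MB, 2 == 2MB, ...
 */
static uint32_t default_alignment_kb(uint32_t setting)
{
	/* Guard against overflow when converting megabytes to kilobytes. */
	assert(setting <= UINT32_MAX / 1024u);

	return setting ? setting * 1024u : 64u;
}
```

Under this reading, a setting of 1 gives the 1MB default introduced by commit b123a82d73, and 0 restores the earlier 64k behaviour.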
Dave Wysochanski
69d67dc2ca Add vg_mda_size and vg_mda_free functions.
Add supporting functions to get vg_mda_size and vg_mda_free fields.
Should be no functional change.
2010-08-20 12:43:49 +00:00
Milan Broz
586b56b18c Fix wrong use of LCK_WRITE
In all top vg read functions only LCK_VG_READ/WRITE can be used.
All other vg lock definitions are low-level backend machinery.

Moreover, LCK_WRITE cannot be tested through bitmask.
This patch fixes these mistakes.

For _recover_vg() we do not need lock_flags; it can only be one of the
two above, and we are always upgrading to the LCK_VG_WRITE lock there.
(N.B. that code is racy)

There is no functional change in code (despite wrong masking
it produces correct bits:-)
2010-08-19 23:26:31 +00:00
Milan Broz
727f7bfa49 Detect LUKS signature in pvcreate
One shiny day we should use libblkid here. But now using LUKS is
very common together with LVM and pvcreate destroys LUKS completely.

So for user's convenience, try to detect LUKS signature and allow abort.
2010-08-19 23:08:18 +00:00
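Commit 727f7bfa49 does not show how the LUKS signature is recognised. As an illustration only: a LUKS header starts with the six magic bytes "LUKS" 0xBA 0xBE, so a check on the first bytes read from a device could look like the sketch below. The helper name has_luks_magic is hypothetical and is not taken from lvm2.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The six magic bytes that open a LUKS header: "LUKS" 0xBA 0xBE. */
static const unsigned char luks_magic[6] = { 'L', 'U', 'K', 'S', 0xBA, 0xBE };

/*
 * Hypothetical helper (not lvm2 code): return 1 if the bytes read from
 * the start of a device begin with the LUKS magic, 0 otherwise.
 */
static int has_luks_magic(const unsigned char *buf, size_t len)
{
	assert(buf || !len);

	/* Too short to hold the magic at all. */
	if (len < sizeof(luks_magic))
		return 0;

	return !memcmp(buf, luks_magic, sizeof(luks_magic));
}
```

A caller in the spirit of the commit would prompt the user and allow an abort when this returns 1, rather than overwriting the header.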
Milan Broz
2d5e2b52ca Change the pvcreate swap/md logic
pvcreate detects MD and swap signature.

The logic hidden there is not only undocumented but also
user unfriendly. Whoever invented this logic should run pvcreate
on their own critical MD device to see why;-)

This patch
 - creates one function instead of duplicating code
 - asks whether the user wants to overwrite the signature
 - allows aborting (!)
 (Please note that writing an LVM signature without wiping the old
 one is wrong: it confuses blkid, MD will not work anyway, and
 swap and LUKS are broken too.)
2010-08-19 23:03:34 +00:00
Alasdair Kergon
22149572e8 Use 'SINGLENODE' instead of 'dead' in clvmd singlenode messages.
Ignore snapshots when performing mirror recovery beneath an origin.
Pass LCK_ORIGIN_ONLY flag around cluster.
Add suspend_lv_origin and resume_lv_origin using LCK_ORIGIN_ONLY.
2010-08-17 19:25:05 +00:00
Alasdair Kergon
2d6fcbf67d Allow internal suspend and resume of origin without its snapshots. 2010-08-17 16:25:32 +00:00
Jonathan Earl Brassow
d0191bf9f4 Fix for bug 612291: dm devices of split off mirror images are not removed
DM devices were not handled properly on nodes in a cluster that were not
where the splitmirrors command was issued.  This was happening because
suspend_lv/resume_lv were being used in a place where activate_lv should
have been used.

When the suspend/resume are issued on (effectively) new LVs, their
'resource' (UUID) is not located in the lv_hash.  Thus, both operations
turn into no-ops.  You can see this from the output of clvmd from one
of the remote nodes:
<snip>
do_suspend_lv, lock not already held
<snip>
do_resume_lv, lock not already held

'activate_lv' enjoins the other nodes in the cluster to process the lock
and activate the new LV.  clvmd output from remote node as follows:
do_lock_lv: resource 'zMseY7CBuO3Ty09vXlplPAHzD0Y0CovjrTdv0R1VcwggMwPdYhutHErRcwm5Nd2S', cmd = 0x19 LCK_LV_ACTIVATE (READ|LV|NONBLOCK), flags = 0x84 (DMEVENTD_MONITOR ), memlock = 1
sync_lock: 'zMseY7CBuO3Ty09vXlplPAHzD0Y0CovjrTdv0R1VcwggMwPdYhutHErRcwm5Nd2S' mode:1 flags=1
sync_lock: returning lkid 27b0001

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Reviewed-by: Petr Rockai <prockai@redhat.com>
2010-08-16 18:02:14 +00:00
Mike Snitzer
b123a82d73 Change default alignment of pe_start to 1MB.
The new standard in the storage industry is to default alignment of data
areas to 1MB.  fdisk, parted, and mdadm have all been updated to this
default.

Update LVM to align the PV's data area start (pe_start) to 1MB.  This
provides a more useful default than the previous default of 64K (which
generally ended up being a 192K pe_start once the first metadata area
was created).

Before this patch:
# pvs -o name,vg_mda_size,pe_start
  PV         VMdaSize  1st PE
  /dev/sdd     188.00k 192.00k

After this patch:
# pvs -o name,vg_mda_size,pe_start
  PV         VMdaSize  1st PE
  /dev/sdd    1020.00k   1.00m

The heuristic for setting the default alignment for LVM data areas is:
- If the default value (1MB) is a multiple of the detected alignment
  then just use the default.
- Otherwise, use the detected value.

In practice this means we'll almost always use 1MB -- that is unless:
- the alignment was explicitly specified with --dataalignment
- or MD's full stripe width, or the {minimum,optimal}_io_size exceeds
  1MB
- or the specified/detected value is not a power-of-2
2010-08-12 04:11:48 +00:00
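The alignment heuristic in commit b123a82d73, combined with the rule from commit 4efb1d9cbb that an explicit --dataalignment always wins, can be sketched as below. This is an illustration under stated assumptions, not the real set_pe_align(): the function name choose_pe_align_kb is hypothetical, all sizes are in kilobytes, and 0 stands for "not specified" or "nothing detected".

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not lvm2 code): pick the pe_start alignment in KB.
 *   default_kb  - internal default (1024 for the 1MB default)
 *   explicit_kb - value from --dataalignment, 0 if not given
 *   detected_kb - MD / topology detected value, 0 if nothing detected
 */
static uint32_t choose_pe_align_kb(uint32_t default_kb, uint32_t explicit_kb,
				   uint32_t detected_kb)
{
	assert(default_kb > 0);

	/* An explicitly specified alignment is always used. */
	if (explicit_kb)
		return explicit_kb;

	/* Use the default if it is a multiple of the detected alignment. */
	if (!detected_kb || default_kb % detected_kb == 0)
		return default_kb;

	/* Otherwise, use the detected value. */
	return detected_kb;
}
```

With a 1MB default, a detected 64k chunk keeps the default, while a detected 2MB stripe width or a 768k value that is not a power of 2 replaces it, which matches the three exceptions listed in the commit message.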
Jonathan Earl Brassow
8d2d4f1fa0 Fix for bug 619221 - log device splitting regression
An incorrect fix on July 13, 2010 for an annoyance has caused a regression.
The offending check-in was part of the 2.02.71 release of LVM.  That
check-in caused any PVs specified on the command line to be ignored when
performing a mirror split.

This patch reverses the aforementioned check-in (solving the regressions)
and posits a new solution to the list reversal problem.  The original
problem was that we would always take the lowest mimage LVs from a mirror
when performing a split, but what we really want is to take the highest
mimage LVs.  This patch accomplishes that by working through the list in
reverse order - choosing the higher numbered mimages first.  (This also
reduces the amount of processing necessary.)

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Reviewed-by: Takahiro Yasui <takahiro.yasui@hds.com>
2010-08-06 15:38:32 +00:00
Jonathan Earl Brassow
cbd41292a4 Taka's fix for handling failure of all mirrored log devices and
all but one mirror leg.

<patch header>
To handle a double failure of a mirrored log, Jon's two patches were
committed; however, the lvconvert command still can't handle an error
when a mirror leg and the mirrored log fail at the same time.

  [Patch]: Handle both devices of a mirrored log failing (bug 607347)
  posted: https://www.redhat.com/archives/lvm-devel/2010-July/msg00009.html
  commit: https://www.redhat.com/archives/lvm-devel/2010-July/msg00027.html

  [Patch]: Handle both devices of a mirrored log failing (bug 607347) -
           additional fix
  posted: https://www.redhat.com/archives/lvm-devel/2010-July/msg00093.html
  commit: https://www.redhat.com/archives/lvm-devel/2010-July/msg00101.html

In the second patch, the target type of the mirrored log is replaced with
an error target when remove_log is set to 1, but this procedure should
also be used in other cases, such as when the number of mirror legs is 1.
This patch relocates the procedure to the main path.

In addition, I made the following three changes.

- Removed tmp_orphan_lvs handling procedure
  It seems that _delete_lv() can handle detached_log_lv properly
  without adding mirror legs in mirrored log to tmp_orphan_lvs.
  Therefore, I removed the procedure.

- Removed vg_write()/vg_commit()
  Metadata is saved by vg_write()/vg_commit() just after detached_log_lv
  is handled. Therefore, I removed vg_write()/vg_commit().

- With Jon's second patch, we think that we don't have to call
  remove_mirror_log() in _lv_update_mirrored_log() because it will be
  handled by remove_mirror_images() in _lvconvert_mirrors_repaire().
</patch header>

Signed-off-by: Takahiro Yasui <takahiro.yasui@hds.com>
Reviewed-by: Petr Rockai <prockai@redhat.com>
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
2010-08-02 21:07:40 +00:00
Jonathan Earl Brassow
efaaf3146d Disallow mirrored logs in cluster mirrors.
The cluster log daemon (cmirrord) is not multi-threaded and
can handle only one request at a time.  When a log is stacked
on top of a mirror (which itself contains a 'core' log), it
creates a situation that cannot be solved without threading.

When the top level mirror issues a "resume", the log daemon
attempts to read from the log device to retrieve the log
state.  However, the log is a mirror which, before issuing
the read, attempts to determine the 'sync' status of the
region of the mirror which is to be read.  This sync status
request cannot be completed by the daemon because it is
blocked on a read I/O to the very mirror requesting the
sync status.
2010-08-02 19:03:45 +00:00
Dave Wysochanski
936541ec56 Remove irrelevant comments relating to vg_mda_copies. 2010-07-30 16:47:27 +00:00
Jonathan Earl Brassow
405c4a45d8 It's not enough to check for the kernel module in the case of cluster
mirrors, we must also check that the log daemon (cmirrord) is running.
The log module can be auto-loaded, but the daemon cannot be
"auto-started".  Failing to check for the daemon produces cryptic
messages that customers have a hard time deciphering.  (The system
messages do report that the log daemon is not running, but people
don't seem to find this message easily.)

Here are examples of what is printed when the module is available,
but the log daemon has not been started.

[root@bp-01 LVM2]# lvcreate -m1 -l1 -n lv vg
  Shared cluster mirrors are not available.

[root@bp-01 LVM2]# lvcreate -m1 -l1 -n lv vg -v
    Setting logging type to disk
    Finding volume group "vg"
    Archiving volume group "vg" metadata (seqno 3).
    Creating logical volume lv
    Executing: /sbin/modprobe dm-log-userspace
    Cluster mirror log daemon is not running
  Shared cluster mirrors are not available.
    Creating volume group backup "/etc/lvm/backup/vg" (seqno 4).
2010-07-21 13:40:21 +00:00
Jonathan Earl Brassow
60f425d1b3 Fix for bug 614164: No check for existing name when splitting mirror
The user could use the same name as an existing LV when specifying a
name for an LV split off from a mirror.  This causes all sorts of
issues.
2010-07-13 22:24:39 +00:00
Jonathan Earl Brassow
c42b084793 Fix for bugs: 612248 & 612291 Split mirror issues
The main problem with these bugs was that the newly split
off LV was not being suspended properly.  This meant that
the memlock count was not being balanced, the DM devices
were not being renamed, and some DM devices which should
have been removed were not.

I've also renamed some of the variables and added comments
to make things clearer as to what is going on.  (I can break
this patch in two if it means easier review.)
2010-07-13 21:48:16 +00:00
Jonathan Earl Brassow
a93fb6299f Failed to test for the case where a log was requested to be removed
even though there was no log.  A simple run through the in-tree test
suite would have caught this.  :(

-               if (lv_is_mirrored(detached_log_lv) &&
+               if (detached_log_lv && lv_is_mirrored(detached_log_lv) &&

Also, made some cosmetic changes suggested by kabi after my last check-in
(e.g. s/return 0/return_0/ and adding an error message).
2010-07-09 17:57:51 +00:00
Dave Wysochanski
f77fb62b2a Add log_error when strdup fails in {vg|lv}_change_tag().
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-07-09 16:57:44 +00:00
Alasdair Kergon
08f1ddea6c Use __attribute__ consistently throughout. 2010-07-09 15:34:40 +00:00
Alasdair Kergon
80e569104b Remove superfluous fn prototypes. 2010-07-09 15:21:10 +00:00
Jonathan Earl Brassow
aa5734f2a3 Finish fix for bug 607347: failing both redundant mirror log legs...
A previous check-in added logic to handle the case where both images
of a mirrored log failed.  It solved the problem by simply removing
the log entirely - leaving the parent mirror with a 'core' log.  This
worked for most cases.  However, if there was a small delay between
the failures of the two mirrored log devices, the mirror would hang,
LVM would hang, and no additional LVM commands could be issued.

When the first leg of the log fails, it signals the need for repair.
Before 'lvconvert --repair' is run by dmeventd, the second leg fails.
'lvconvert' would see both devices as failed and try to remove the
log entirely.  When it came time to suspend the parent mirror to
update the configuration, the suspend would hang because it couldn't
get any I/O through the mirrored log, which was plugged waiting for
corrective action.  The solution is to replace the log with an error
target to clear any pending writes before removing it.  This allows
the parent mirror to suspend and make the proper changes.
2010-07-09 15:08:12 +00:00
Dave Wysochanski
a5fb2bbff3 Pass metadataignore to pv_create, pv_setup, _mda_setup, and add_mda.
Pass metadataignore through PV creation / setup paths.
As a result of this cleanup, we can remove the unnecessary setting
of mda_ignore bits inside pvcreate_single(), after call to pv_create.
For now, just set metadataignore to '0' in some places.  This is
equivalent to the prior functionality, although the 0 is given
by the caller not hardcoded in _mda_setup() call.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-07-08 18:24:29 +00:00
Dave Wysochanski
dce204cec5 Init mda->list in mda_copy.
This patch should be no functional change as all callers initialize
mda->list.
2010-07-08 17:41:46 +00:00
Dave Wysochanski
7041b476ac Add warning to vgextend and pvchange if metadataignore given on cmdline.
Warn the user then change the value of vg_mda_copies.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-07-07 18:59:45 +00:00
Alasdair Kergon
7f7af46862 Adjust auto-metadata repair and caching logic to try to cope with empty mdas.
- If a PV contained empty mdas, the auto-recovery code was not kicking in.
- The 'inconsistent' state was getting lost when metadata was cached so
  recovery didn't kick in.  But leave the behaviour alone when using
  precommitted metadata because of a warning in a confusing FIXME.

In my testing, pvs and vgs didn't repair inconsistent metadata like they
used to do.  (How many other tools fail similarly now?)

And there should be no need to cache inconsistent metadata because it is
supposed to get repaired under the protection of a write lock immediately it is
discovered.

This code is in need of a redesign based on first principles.
I still see bugs in this code and this commit is risky.
2010-07-07 02:53:16 +00:00
Alasdair Kergon
6c8655ce9b fix code in 2nd mda unignore loop to match 1st loop 2010-07-06 20:09:38 +00:00
Alasdair Kergon
68f4e0c734 s/flags/mda/ 2010-07-06 17:29:50 +00:00
Alasdair Kergon
0db1bbc3c3 shorten mesg 2010-07-06 17:27:32 +00:00
Alasdair Kergon
643f234119 fix jumbled args in 'Adjusting' message 2010-07-06 17:26:08 +00:00
Alasdair Kergon
d911ec67a9 Randomly select which mdas to use or ignore.
Add some missing standard configure.in checks.
2010-07-05 22:23:15 +00:00
Alasdair Kergon
db3c1ac1c8 Add printf format attributes to yes_no_prompt & dm_{sn,as}printf and fix a calle 2010-07-02 21:16:50 +00:00
Alasdair Kergon
12eadbabdd improve vgmetadatacopies unmanaged message 2010-06-30 20:03:52 +00:00
Dave Wysochanski
3b9d1b1a96 Check for missing_pv in vg_remove loop.
If a pv is missing, we should just skip it rather than checking the
device size and failing the vgremove.
2010-06-30 19:55:43 +00:00
Alasdair Kergon
d8886386bd more mda ignore cleanups 2010-06-30 19:28:35 +00:00
Dave Wysochanski
40b4d1c3ae Refactor vg_remove_check to place pv removal into separate function. 2010-06-30 18:03:52 +00:00
Alasdair Kergon
23177eda88 more metadataignore message/code cleanup 2010-06-30 17:13:05 +00:00
Alasdair Kergon
efe75fd705 revert that 2010-06-30 14:54:29 +00:00
Alasdair Kergon
a6c4427188 suppress useless compiler warning 2010-06-30 14:52:29 +00:00
Dave Wysochanski
ef7b409966 Only attempt to guarantee 1 mda ignored if there's at least one mda in the vg. 2010-06-30 14:48:07 +00:00
Alasdair Kergon
67b91d0848 Only attempt to guarantee 1 mda ignored if there's at least one mda in the vg. 2010-06-30 14:27:40 +00:00
Alasdair Kergon
647c64c796 Improve various log messages. 2010-06-30 13:51:11 +00:00
Dave Wysochanski
a5bf70018b Add --metadataignore to pvcreate.
Allow metadataignore flag to be passed in to pvcreate.
Ideally, more refactoring of the mda allocation / initialization
is warranted, but for now, we just add another parameter to 'add_mda'
to take an existing mda ignored flag.  We need to do this or pv_write
loses the state of the mda 'ignored' flag before copying and writing
to disk.
2010-06-30 12:17:24 +00:00
Dave Wysochanski
6af5155529 Improve logging for setting --vgmetadatacopies.
Example of logging:
metadata/metadata.c:1127     Setting mda_copies = 3 on vg vgtest
metadata/pv_manip.c:296         /dev/loop2 0:      0     25: NULL(0:0)
metadata/pv_manip.c:296         /dev/loop3 0:      0     25: NULL(0:0)
metadata/pv_manip.c:296         /dev/loop4 0:      0     25: NULL(0:0)
metadata/metadata.c:1072     Adjusting ignored mdas on vg vgtest, vg_mda_used_count=5, vg_mda_copies=3
metadata/metadata.c:1015     Setting ignore flag for 2 mdas on vg vgtest
metadata/metadata.c:4151     Setting mda ignored flag for metadata_locn /dev/loop2.
metadata/metadata.c:4151     Setting mda ignored flag for metadata_locn /dev/loop3.
2010-06-29 22:41:28 +00:00
Dave Wysochanski
d37dd5b2d3 Improve logging for metadata ignore by printing device name.
Print device name when setting or clearing metadata ignore bit.
Example:
label/label.c:160       /dev/loop2: lvm2 label detected
cache/lvmcache.c:1136         lvmcache: /dev/loop2: now in VG #orphans_lvm2 (#orphans_lvm2)
metadata/metadata.c:4142     Setting mda ignored flag for metadata_locn /dev/loop2.
format_text/text_label.c:318     Skipping mda with ignored flag on device /dev/loop2 at offset 4096
2010-06-29 22:37:32 +00:00
Dave Wysochanski
710c9373bf Add some log_verbose debug statements related to metadataignore.
Logging isn't ideal, especially for mda_set_ignore.  Ideally we'd
like to display the device name and offset in this case but this
requires a bit more work and a per-format 'mda_description' function
pointer definition (we don't have access to mda_context in
metadata.c).
2010-06-29 22:25:58 +00:00
Dave Wysochanski
a375ced300 Move code into pv_change_metadataignore library function.
In preparation to call this from both pvcreate as well as pvchange,
move the guts of metadataignore into a library function.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-06-29 21:32:44 +00:00
Dave Wysochanski
a9d8bf269a Allow 'all' and 'unmanaged' values for --vgmetadatacopies.
Allowing an 'all' and 'unmanaged' value is more intuitive, and
provides a simple way for users to get back to original LVM behavior
of metadata written to all PVs in the volume group.

If the user requests "--vgmetadatacopies unmanaged", this instructs
LVM not to manage the ignore bits to achieve a specific number of
metadata copies in the volume group.  The user is free to use
"pvchange --metadataignore" to control the mdas on a per-PV basis.
If the user requests "--vgmetadatacopies all", this instructs LVM
to do 2 things: 1) clear all ignore bits, and 2) set the "unmanaged"
policy going forward.

Internally, we use the special MAX_UINT32 value to indicate 'all'.
This 'just' works since it's the largest value possible for the
field and so all 'ignore' bits on all mdas in the VG will get
cleared inside _vg_metadata_balance().  However, after we've
called the _vg_metadata_balance function, we check for the special
'all' value, and if set, we write the "unmanaged" value into the
metadata.  As such, the 'all' value is never written to disk.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-06-28 20:40:01 +00:00
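The handling of the special 'all' value described in commit a9d8bf269a can be sketched as follows. The macro and function names are hypothetical, and the choice of 0 for "unmanaged" is an assumption based on the default described in commit 88d7dc1af8. The sketch only covers the final step, deciding which value is written to the metadata after balancing has run.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the two special --vgmetadatacopies values. */
#define COPIES_UNMANAGED 0u         /* assumed: 0 means "do not manage" */
#define COPIES_ALL       UINT32_MAX /* largest value: clears every ignore bit */

static_assert(COPIES_ALL != COPIES_UNMANAGED, "sentinel values must differ");

/*
 * Hypothetical helper (not lvm2 code): the value to store on disk once
 * the balancing step has run.  'all' is never written; it becomes
 * "unmanaged" going forward.
 */
static uint32_t copies_value_to_store(uint32_t requested)
{
	return requested == COPIES_ALL ? COPIES_UNMANAGED : requested;
}
```

Because 'all' is the largest possible target, the balancing step can never reach it and so un-ignores every mda. Only afterwards is the stored value swapped for "unmanaged".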
Dave Wysochanski
a09a8efb66 Update check in vg_split_mdas to account for ignored mdas list.
The check in vg_split_mdas will trigger an error if the 'from' vg
list is empty.  However, this might be ok in some instances now
that we have ignored mdas.  Relax this check so an error is triggered
only in the case where there's truly no more mdas in the 'from'
vg.

One example of where this makes a difference is with vgreduce.
If we try to vgreduce a PV with un-ignored mdas, this should trigger
the balancing function to un-ignore mdas on another PV in the VG.
However, we don't get to vg_write() before we fail because this
list size check fails, and we see an error message indicating:
"Cannot remove final metadata area ..."

Another example is with vgsplit into a new VG, where the PVs
being moved contain all ignored mdas.  We must move the mdas on
fid->metadata_areas_ignored from 'vg_from' to 'vg_to'.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-06-28 20:38:56 +00:00
Dave Wysochanski
f61cd7b249 Ensure fid mda lists are populated correctly during vgextend.
The vgextend path calls add_pv_to_vg().  Inside add_pv_to_vg(),
we must ensure we pass the correct mdas list into pv_setup(), as
copies of mdas are placed on the vg->fid list.  If we don't place
the mdas on the correct vg->fid list, the various counts may be
incorrect and the metadata balance algorithm will not work when
called from vg_write() path.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-06-28 20:38:39 +00:00
Dave Wysochanski
1b54343328 Implement _vg_adjust_ignored_mdas and call from vg_write() path.
Compare the value of the newly added vg_mda_copies field
(--vgmetadatacopies parameter) with the current count of
in-use mdas, and ignore or unignore mdas as necessary to
get to the target count.  Also, as a safety check before
returning, ensure we have at least one mda enabled.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-06-28 20:37:54 +00:00
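The adjustment described in commit 1b54343328 can be sketched over a plain array of ignore bits. Everything here is illustrative: struct mda_flag and adjust_ignored are hypothetical names, a target of 0 is assumed to mean "unmanaged", and mdas are picked in array order, whereas a later commit (d911ec67a9) selects them randomly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a metadata area: only the ignore bit matters. */
struct mda_flag {
	bool ignored;
};

/*
 * Hypothetical helper (not lvm2 code): ignore or un-ignore mdas until
 * 'target' of them are in use.  Returns the resulting in-use count.
 */
static size_t adjust_ignored(struct mda_flag *mdas, size_t n, uint32_t target)
{
	size_t used = 0, i;

	assert(mdas || !n);

	for (i = 0; i < n; i++)
		if (!mdas[i].ignored)
			used++;

	if (target) { /* assumed: 0 means the count is not managed */
		/* Too many in use: set ignore bits until the target is met. */
		for (i = 0; i < n && used > target; i++)
			if (!mdas[i].ignored) {
				mdas[i].ignored = true;
				used--;
			}
		/* Too few in use: clear ignore bits until the target is met. */
		for (i = 0; i < n && used < target; i++)
			if (mdas[i].ignored) {
				mdas[i].ignored = false;
				used++;
			}
	}

	/* Safety check: a VG that has mdas must keep at least one enabled. */
	if (n && !used) {
		mdas[0].ignored = false;
		used = 1;
	}

	return used;
}
```

Starting from five in-use mdas with a target of 3, the sketch sets the ignore flag on two of them, the same arithmetic as in the "Adjusting ignored mdas" log example of commit 6af5155529.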
Dave Wysochanski
821f0cc5ea Add vg get/set methods for VG metadata copies.
This patch adds the get and partially implemented set function.
The 'set' function should probably ignore or un-ignore metadata areas
based on new values.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-06-28 20:36:56 +00:00
Dave Wysochanski
88d7dc1af8 Add mda_copies to VG structures and initialization.
Add a field to struct volume_group to later implement metadata
balancing:
- mda_copies: target # of non-ignored mdas in the VG; default 0 (do
not control pv 'ignore mdas' bit).

This patch just adds the parameter to the structures with the default
values but does not modify any commands.  Should be no functional change.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-06-28 20:36:37 +00:00
Dave Wysochanski
0f2f8a5c3a Before committing each mda, arrange mdas so ignored mdas get committed first.
Arrange mdas so mdas that are to be ignored come first.  This is an
optimization that ensures consistency on disk for the longest period of time.
This was noted by agk in review of the v4 patchset of pvchange-based mda
balance.

Note the following example for an explanation of the background:
Assume the initial state on disk is as follows:
PV0 (v1, non-ignored)
PV1 (v1, non-ignored)
PV2 (v1, non-ignored)
PV3 (v1, non-ignored)

If we did not sort the list, we would have a commit sequence something like
this:
PV0 (v2, non-ignored)
PV1 (v2, ignored)
PV2 (v2, ignored)
PV3 (v2, non-ignored)

After the commit of PV0's mdas, we'd have an on-disk state like this:
PV0 (v2, non-ignored)
PV1 (v1, non-ignored)
PV2 (v1, non-ignored)
PV3 (v1, non-ignored)

This is an inconsistent state of the disk. If the machine fails, the next
time it is brought back up, the auto-correct mechanism in vg_read would
update the metadata on PV1-PV3.  However, if possible we try to avoid
inconsistent on-disk states.  Clearly, because we did not sort, we have
a greater chance of on-disk inconsistency - from the time the commit of
PV0 is complete until the time PV3 is complete.

We could improve the amount of time the on-disk state is consistent by simply
sorting the commit order as follows:
PV1 (v2, ignored)
PV2 (v2, ignored)
PV0 (v2, non-ignored)
PV3 (v2, non-ignored)

Thus, after the first PV is committed (in this case PV1), on-disk we would
have:
PV0 (v1, non-ignored)
PV1 (v2, ignored)
PV2 (v1, non-ignored)
PV3 (v1, non-ignored)

This is clearly a consistent state.  PV1 will be read but the mda will be
ignored.  All other PVs contain v1 metadata, and no auto-correct will be
required.  In fact, if we commit all PVs with ignored mdas first, we'll
only have an inconsistent state when we start writing non-ignored PVs,
and thus the chance of an inconsistent state on disk is much lower
with the sorted method.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-06-28 20:35:49 +00:00
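The commit ordering argued for in commit 0f2f8a5c3a amounts to a stable partition: ignored mdas first, in-use mdas after, each group keeping its original order. The sketch below shows this on an array. The names struct commit_mda and order_for_commit are hypothetical, and the real code works on lvm2's own list type rather than arrays.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an mda: which PV it is on, and its ignore bit. */
struct commit_mda {
	int pv;
	bool ignored;
};

/*
 * Hypothetical helper (not lvm2 code): copy 'in' to 'out' so that
 * ignored mdas are committed first.  Relative order within each group
 * is preserved.  'out' must have room for n entries.
 */
static void order_for_commit(const struct commit_mda *in,
			     struct commit_mda *out, size_t n)
{
	size_t i, k = 0;

	assert((in && out) || !n);

	for (i = 0; i < n; i++)
		if (in[i].ignored)
			out[k++] = in[i];

	for (i = 0; i < n; i++)
		if (!in[i].ignored)
			out[k++] = in[i];
}
```

For the example in the commit message (PV1 and PV2 ignored), this yields the order PV1, PV2, PV0, PV3, so the on-disk state only becomes inconsistent once the first non-ignored mda is written.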
Dave Wysochanski
77e0ed4be7 Refactor vg_commit() to add _vg_commit_mdas().
Factor out calling mda->ops->vg_commit() for each mda.
No functional change.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-06-28 20:35:33 +00:00
Dave Wysochanski
69d1732334 Update _vg_read and _text_create_text_instance to use fid_add_mda[s].
When we are constructing the vg, we may need to adjust the list of
metadata_areas if there are ignored mdas.  At label read time, we
do not read the metadata of ignored mdas, and as a result, they do
not get placed on vg->fid->metadata_areas inside _text_create_text_instance
since lvmcache does not have these areas attached to vginfo->infos.
However, when we're checking the pvids inside _vg_read, after having
read another metadata area from another PV, we do have the opportunity
to update the metadata_area and metadata_areas_ignored lists based
on the read metadata_area.  We need accurate mda lists for the reporting
functions that count the ignored mdas, as well as general correctness
of mda balancing.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-06-28 20:35:17 +00:00
Dave Wysochanski
bb723d7897 Use mdas_empty_or_ignored() in place of checks for empty mda list.
With the addition of ignored mdas, we replace all checks for an empty
mda list with a new function to look for either an empty mda list or
ignored mdas.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-06-28 20:34:58 +00:00
Dave Wysochanski
f9c307cd07 Add mdas_empty_or_ignored() helper function.
Add a helper function to consolidate checking for an empty mdas list
or ignored mdas.  Ignored mdas should behave almost identically to
an empty mda list - the metadata areas should not be read or written
to.  This function will make it easier to implement metadata balancing
and easier to track pvs with an empty mda list or ignored mdas.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-06-28 20:34:40 +00:00
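A helper of the kind commit f9c307cd07 describes could look like the sketch below. It is written against a plain array rather than lvm2's list type, the names struct mda_entry and list_empty_or_ignored are hypothetical, and it assumes that "ignored mdas" means at least one mda in the list has its ignore bit set, which the commit message does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a metadata area: only the ignore bit matters. */
struct mda_entry {
	bool ignored;
};

/*
 * Hypothetical helper (not lvm2 code): true if the list is empty or,
 * by assumption, if any mda on it is ignored.
 */
static bool list_empty_or_ignored(const struct mda_entry *mdas, size_t n)
{
	size_t i;

	assert(mdas || !n);

	if (!n)
		return true;

	for (i = 0; i < n; i++)
		if (mdas[i].ignored)
			return true;

	return false;
}
```

Callers that previously tested only for an empty list can then treat both cases the same way: the metadata areas are neither read nor written.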
Dave Wysochanski
cdbe475fe3 Define new functions and vgs/pvs fields related to mda ignore.
Define a new pvs field, pv_mda_used_count, and a new vgs field,
vg_mda_used_count to match the existing pv_mda_count and vg_mda_count.
These new fields count the number of mdas that have the 'ignored' bit
clear (they are in use on the PV / VG).  Also define various supporting
functions to implement the counting as well as setting the ignored
flag and determining if an mda is ignored.  These high level functions
call into the lower level location independent mda ignore functions
defined by earlier patches.

Note that counting ignored mdas in a vg requires traversing both lists
and checking for the ignored bit on the mda.  The count of 'ignored'
mdas then is defined by having the bit set, not by which list the mda
is on.  The list does determine whether LVM actually does read/write to
the mda, though we must count the bits in order to return accurate numbers
for the various counts.  Also, pv_mda_set_ignored must search both vg
lists for ignored mda.  If the state changes and needs to be committed
to disk, the ignored mda will be on the non-ignored list.

Note also in pv_mda_set_ignored(), we must properly manage the mda lists.
If we change the ignored state of an mda, we must change any mdas on
vg->fid->metadata_areas that correspond to this pv.  Also, we may
need to allocate a copy of the mda, as is done when fid->metadata_areas
is populated from _vg_read(), if we are un-ignoring an ignored mda.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-06-28 20:33:44 +00:00
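The counting rule in commit cdbe475fe3 (an mda counts as ignored because its bit is set, not because of the list it sits on) can be sketched with two arrays standing in for the two fid lists. All names here are hypothetical and the real code iterates lvm2 lists rather than arrays.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a metadata area: only the ignore bit matters. */
struct mda_bit {
	bool ignored;
};

/* Count the mdas on one list whose ignore bit is clear. */
static size_t count_used(const struct mda_bit *list, size_t n)
{
	size_t i, used = 0;

	assert(list || !n);

	for (i = 0; i < n; i++)
		if (!list[i].ignored)
			used++;

	return used;
}

/*
 * Hypothetical helper (not lvm2 code): in-use count for a VG.  Both
 * lists are traversed because an mda whose bit has just been set may
 * still sit on the in-use list until the change is committed.
 */
static size_t vg_used_count(const struct mda_bit *in_use, size_t n_in_use,
			    const struct mda_bit *ignored, size_t n_ignored)
{
	return count_used(in_use, n_in_use) + count_used(ignored, n_ignored);
}
```

With one freshly ignored mda still on the in-use list, the used count drops by one even though no list membership has changed yet.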
Dave Wysochanski
9ccac021a7 Add metadata_areas_ignored list and functions to manage ignored mdas.
Add a second mda list, metadata_areas_ignored to fid, and a couple
functions, fid_add_mda() and fid_add_mdas() to help manage the lists.

These functions are needed to properly count the ignored mdas and
manage the lists attached to the 'fid' and ultimately the 'vg'.

Ensure metadata_areas_ignored is initialized in other formats, even
if the list is never used.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-06-28 20:33:22 +00:00
Dave Wysochanski
f55a20eb36 Rename fid->metadata_areas to fid->metadata_areas_in_use.
Rename the metadata_areas list to an 'in_use' list to prepare for
future 'ignored' list.
2010-06-28 20:32:44 +00:00
Dave Wysochanski
ef4fa155a5 Add mda location specific mda_copy constructor.
Because of the way mdas are handled internally, where a PV in a VG
has mdas on both info->mdas and vg->fid->metadata_areas list, we
need a location independent copy constructor for struct
metadata_area.  Break up the existing format-text specific copy
constructor into a format independent piece and a format dependent
piece.

This function is necessary to properly implement pv_set_mda_ignored().

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Reviewed-by: Alasdair G Kergon <agk@redhat.com>
2010-06-28 20:31:59 +00:00
Dave Wysochanski
29f24d4634 Add mda_locns_match() internal library function for mapping pv/device to VG mda.
A metadata_area is defined independent of the location.  One downside
is that there is no obvious mapping from a pv to an mda.  For a PV in
a VG, we need a way to start with a PV and end up with an MDA, if we
are to manage mdas starting with a device/pv.  This function provides
us a way to go down the list of PVs on a VG, and identify which ones
match a particular PV.

I'm not entirely happy with this approach, but it does fit into the
existing structures in a reasonable way.

An alternative solution might be to refactor the VG - PV interface such
that mdas are a list tied to a PV.  However, this seemed a bit tricky since
a PV does not come into existence until after the list of mdas is
constructed (see _vg_read() - we create a 'fid' and attach mdas to it,
then we go through them and attach pvs).

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Reviewed-by: Alasdair G Kergon <agk@redhat.com>
2010-06-28 20:31:38 +00:00
Dave Wysochanski
322c5868b3 Add location independent flag and functions to ignore mdas.
First we add a 'flags' field to the location independent
metadata_area structure, and a MDA_IGNORE flag.  The
mda_is_ignored and mda_set_ignored functions are added to
manage the flag.  Adding the flag and functions gives a
library interface to ignore metadata areas independent of
the underlying location (disk, file, etc).  The location
specific read/write functions must then handle the specifics
of what this flag means to the location.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Reviewed-by: Alasdair G Kergon <agk@redhat.com>
2010-06-28 20:30:14 +00:00
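A minimal sketch of the flag interface described in 322c5868b3 above. Only the names MDA_IGNORE, mda_is_ignored and mda_set_ignored come from the commit message; the bit value, the reduced struct, and the exact signatures are assumptions.

```c
#include <stdint.h>

#define MDA_IGNORE 0x00000001U  /* hypothetical bit value */

/* Reduced stand-in for the location independent metadata_area structure. */
struct metadata_area {
    uint32_t flags;
};

/* Return 1 if the ignore bit is set, 0 otherwise. */
static unsigned mda_is_ignored(const struct metadata_area *mda)
{
    return (mda->flags & MDA_IGNORE) ? 1 : 0;
}

/* Set or clear the ignore bit without touching other flags. */
static void mda_set_ignored(struct metadata_area *mda, unsigned ignored)
{
    if (ignored)
        mda->flags |= MDA_IGNORE;
    else
        mda->flags &= ~MDA_IGNORE;
}
```

What the bit means for a given location (disk, file, etc.) is left to the location specific read/write functions, as the commit message notes.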
Jonathan Earl Brassow
68c31a2a36 Fix for bz608048 from Taka...
The same region size is used for both mirror volume and mirrored
log volume, but when the physical extent size is bigger than region size,
the size of mirror leg for mirrored log is smaller than the region size
and lvcreate command fails.

This patch adjusts a region size of mirrored log to a smaller value of
region size or physical extent size.

[This patch ensures that the region_size of the mirrored log does not
exceed the size of the mirrored log itself, which would violate the
kernel constraint: (region_size <= ti->len).]

Signed-off-by: Takahiro Yasui <takahiro.yasui@hds.com>
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
2010-06-28 14:19:41 +00:00
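The adjustment described in 68c31a2a36 above amounts to taking a minimum. A small sketch, assuming both values are expressed in the same unit (sectors); the function name is hypothetical.

```c
#include <stdint.h>

/* Use the smaller of the mirror's region size and the physical extent size
 * for the mirrored log, so that region_size never exceeds the log's length. */
static uint32_t adjusted_log_region_size(uint32_t region_size,
                                         uint32_t extent_size)
{
    return (region_size < extent_size) ? region_size : extent_size;
}
```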
Jonathan Earl Brassow
42f7fd0590 The function that runs to compress a stacked mirror after
converting from 2-way to 3-way mirror (collapse_mirrored_lv)
was calling '_remove_mirror_images' with the 'remove_log'
parameter set.  When the code was put in to fix 599898 to
honor log parameters during conversion, this argument was
suddenly being honored.  Thus, when someone would convert from
a 2-way to 3-way mirror, the log would get removed.

'collapse_mirrored_lv' should not be calling '_remove_mirror_images'
with 'remove_log' set.
2010-06-23 13:57:26 +00:00
Milan Broz
f9e177d281 Fix "allocated" warning typo. 2010-06-22 21:10:53 +00:00
Jonathan Earl Brassow
a7d355a28c Mirrors can be layered - as in the case of a converting 2-way
to 3-way mirror.  When conversion operations are performed on
these types of mirrors, log options can be confused/ignored.

In the case of a converting 3-way mirror, we have a top-level
2-way corelog mirror whose legs are 1) a 2-way disk-log mirror
and 2) a linear device.  If we wish to convert this 3-way mirror
to a 2-way mirror, the linear device is removed and the extra
top layer is eliminated.  If we also wished to convert the disk
log to a core log in the same step, ambiguity creeps in.  It is
somewhat obvious what the user wants - a 2-way mirror with a
corelog.  However, looking at the top level mirror before
compression, it seems that the mirror already has a core log.
This is why the operation seemed to fail.

This patch simply re-evaluates what mirrored_seg points to after
a compression and then considers the log argument.

This is a fix for bug 599898.
2010-06-21 16:12:33 +00:00
Petr Rockai
d345bf2cd3 Account for mirror transient status when doing lvconvert --repair. 2010-05-24 15:32:20 +00:00
Zdenek Kabelac
4ef2bf27a7 Update Copyright date for recently modified files 2010-05-24 09:04:27 +00:00
Zdenek Kabelac
65928349e7 Replicator: add read and release VGs for rsites
Add functions to read and release remote VGs for replicator sites
in activation context.
2010-05-21 14:07:16 +00:00
Zdenek Kabelac
f6d7e637c3 Add toolcontext.h header file. 2010-05-21 13:34:09 +00:00
Zdenek Kabelac
6222635b38 Replicator: add find_replicator_vgs
Adding find_replicator_vgs() function to find all needed
VGs for replicator-dev LV.

This function is later called before taking lock_vol().
2010-05-21 12:55:25 +00:00
Zdenek Kabelac
12569ccb03 Replicator: add sorted cmd_vg list
Introduce struct cmd_vg to store information about needed
volume group name, vgid, flags and the pointer to opened VG.

Keep VGs list in alphabetical order for locking order.

Introduce functions:
cmd_vg_add() add new cmd_vg entry.
cmd_vg_lookup() search cmd_vgs for vg_name.
cmd_vg_read() open VGs in cmd_vgs list.
cmd_vg_release() close VGs in reversed order.
2010-05-21 12:52:01 +00:00
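One way to keep the list from 12569ccb03 above in alphabetical order is to insert each new entry at its sorted position. This is only a sketch: the struct is a reduced stand-in for struct cmd_vg, and cmd_vg_add_sorted() is a hypothetical name, not the signature of cmd_vg_add().

```c
#include <stdlib.h>
#include <string.h>

/* Reduced stand-in for struct cmd_vg; only the name and link are modelled. */
struct cmd_vg_sketch {
    const char *vg_name;
    struct cmd_vg_sketch *next;
};

/* Insert vg_name so the list stays sorted.  Returns the new entry, the
 * existing entry if the name is already listed, or NULL if malloc fails. */
static struct cmd_vg_sketch *cmd_vg_add_sorted(struct cmd_vg_sketch **head,
                                               const char *vg_name)
{
    struct cmd_vg_sketch **pos = head;
    struct cmd_vg_sketch *cvl;

    while (*pos) {
        int cmp = strcmp((*pos)->vg_name, vg_name);

        if (cmp == 0)
            return *pos;   /* already listed */
        if (cmp > 0)
            break;         /* insert before the first larger name */
        pos = &(*pos)->next;
    }

    if (!(cvl = malloc(sizeof(*cvl))))
        return NULL;

    cvl->vg_name = vg_name;
    cvl->next = *pos;
    *pos = cvl;

    return cvl;
}
```

Walking the list front to back then opens the VGs in a fixed order, and walking it in reverse gives the release order the commit describes.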
Zdenek Kabelac
0a02d30ea4 Replicator: extend volume_group with list of VGs and flag
Add pointer to linked list of opened VGs. List temporarily keeps
the information about needed or locked and opened VGs for replicator target.

Also add cmd_missing_vgs flag information for quick check and
also for possible continuous process_each_lv() usage where we need
to detect whether failure has been caused by missing VG or
some other reason.
2010-05-21 12:47:46 +00:00
Zdenek Kabelac
e86e45f7ea Replicator: extend _lv_each_dependency() with dependencies for Replicator devices 2010-05-21 12:45:18 +00:00
Zdenek Kabelac
651cae3c5c Replicator: check replicator segment
Check for possible problems within replicator structures.
Used also by vg_validate.
2010-05-21 12:43:02 +00:00
Zdenek Kabelac
1207106fbc Replicator: new files for Replicator target 2010-05-21 12:40:05 +00:00
Zdenek Kabelac
8fea97b7e7 Replicator: base lvm2 support
Adding configure.in support for Replicators.
Adding basic lib lvm support for Replicators.
Adding flags REPLICATOR and REPLICATOR_LOG.
Adding segments SEG_REPLICATOR and SEG_REPLICATOR_DEV.
Adding basic methods for handling replicator metadata.
2010-05-21 12:36:30 +00:00
Dave Wysochanski
dd2a0e940d Add find_vgname_from_{pvname|pvid} functions.
Some commands start with a pvname, but we'd like to force users to
start with a vg handle to obtain a pv handle.  Our best option seems
to be providing a way to look up the vgname from the pvname, and then
require them to use vg_read/vg_open.

In addition to the pvname lookup function, this patch also provides a
lookup by pvid.  The lookup by pvid can be used in conjunction with
lvmcache_get_pvids to process all pvs in the system.

The pvid find function first calls lvmcache_vgname_from_pvid, which may
cause the label to be read if it is not in the cache.  If the vgname
returned is an orphan, we then check to see if there are metadata areas,
and if not, we scan every PV on the system by calling scan_vgs_for_pvs().
In most cases we should not need to do this, and by using the info->mdas
count, we avoid calling pv_read() as prior code did.  So this patch is a
bit cleaner and should allow us to refactor more of the pv code.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-05-19 11:52:37 +00:00
Alasdair Kergon
1d837442bf Add is_global_vg and split out from is_orphan_vg. 2010-05-19 02:36:33 +00:00
Alasdair Kergon
34220fe292 Validate orphan and VG_GLOBAL lock order too. 2010-05-19 02:08:50 +00:00
Alasdair Kergon
fa305e2ec6 Accept orphan VG names as parameters to lock_vol() and related functions. 2010-05-19 01:16:40 +00:00
Jonathan Earl Brassow
a932c2b61f Disallow toggling the cluster attribute of a volume group if there
are active mirrors or snapshots.

We don't have the mechanisms in place to change the device-mapper
tables for those targets that have behavioral differences between
cluster and single machine instances.  Allowing users to change
the attribute but not changing the target's behavior can lead to
data corruption.

The following bugs are fixed/avoided by this patch:
235123 - vgchange -c [ny] do not change target types when necessary
289331 - RFE: switching from cluster domain to local domain needs to deactivate volume somehow
289541 - when changing from local to cluster, volumes can not appear to be deactivated
2010-05-14 15:19:42 +00:00
Jonathan Earl Brassow
56a5925aed Fix comment from last commit. Additionally, there is no need
to put a comment into the WHATS_NEW file if it is a regression
that was created and fixed inside the same release window.
2010-04-27 15:26:58 +00:00
Jonathan Earl Brassow
d7c9d72390 Patch to fix bug 586021 and maintain historical behavior of
being able to remove more images from a mirror than the
number of PVs directly specified for removal.

The effort to fix bug 581611 corrected a bug that was unnoticed
at the time.  The loop in _remove_mirror_images that looks over
the specified PVs was allowing devices that were previously
counted and moved to the end of the list to be double-counted.
This resulted in the number of devices needed for removal always
being satisfied - even if the user did not specify enough PVs
for removal to satisfy the request.  When 581611 was fixed, this
double-counting no longer took place and the result was to remove
only the minimum of the number of PVs specified or the number
that was asked to be removed.

By simply always setting 'new_area_count' (as used to be done
only in the else statement), we return to the previous behavior.
Indeed, this is exactly what the double-counting was allowing
to happen before the fix of 581611.
2010-04-27 14:57:49 +00:00
Mike Snitzer
60267bdce8 Disallow the direct removal of a merging snapshot.
Allow lv_remove_with_dependencies() to know the top-level LV that was
requested to be removed (otherwise it recurses and we lose context).

A merging snapshot cannot be removed directly but the associated origin
can be.  Disallow removal of a merging snapshot unless the associated
origin is also being removed.
2010-04-23 19:27:10 +00:00
Mike Snitzer
1f661c5dd8 When removing a snapshot avoid preloading the origin if the
snapshot-merge target is not active.
2010-04-23 02:57:39 +00:00
Jonathan Earl Brassow
66f79d05eb Disallow the primary mirror image from being removed when the
mirror is not in-sync.  This restriction is not extended to
repair operations (i.e. it will not limit what 'lvconvert --repair'
can do).
2010-04-21 13:55:08 +00:00
Alasdair Kergon
ee90b8197f Move function up file 2010-04-20 12:14:28 +00:00
Peter Rajnoha
1e696b0c15 Do not reset position in metadata ring buffer on vgrename and vgcfgrestore.
We should write metadata into next position in the ring buffer while calling
vgrename and vgcfgrestore. At this code level (_vg_write_raw), we were not able
to determine if this is a rename or not. If yes, then accompanying VG structure
passed here has a new name set, not the old one.

When looking for a location where to put metadata next, we were given a NULL
value because of failed VG name comparison (in _find_vg_rlocn) between the
name in existing metadata and metadata we're just about to write.

This resets the position in the ring buffer, overwriting any existing metadata
(and also incorrectly updates the cache to "orphan" afterwards).

This patch just adds old_name item in struct volume_group that we can check and use
if necessary and detect renames at lower layers as well.

The same applies for vgcfgrestore, but here we're using a special value of
old_name, an empty string, to disable the check with existing metadata totally.
2010-04-14 13:09:16 +00:00
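The name check described in 1e696b0c15 above can be pictured as follows. This is a sketch under stated assumptions: the function name is hypothetical, and treating a NULL old_name as "no rename in progress" is an assumption; only the empty-string convention for vgcfgrestore comes from the commit message.

```c
#include <string.h>

/* Decide whether on-disk metadata named disk_name belongs to the VG that
 * is about to be written.
 *   old_name == NULL : assumed to mean no rename; compare the current name.
 *   old_name == ""   : skip the check entirely (the vgcfgrestore case).
 *   otherwise        : a rename is in progress; compare the old name. */
static int vg_name_matches_disk(const char *disk_name, const char *vg_name,
                                const char *old_name)
{
    if (old_name && !*old_name)
        return 1;

    return !strcmp(disk_name, old_name ? old_name : vg_name);
}
```

When the check succeeds, the writer can continue from the current position in the ring buffer instead of starting over.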
Dave Wysochanski
af46c894d0 Add pv->vg to solidify link between a pv and a vg.
lvm2app needs a link back to the vg in order to use the vg handle for
memory allocations as well as other things.  This patch adds the field
to struct physical_volume, and sets pv->vg when reading a vg from disk or
extending a vg by using the helper function previously added,
add_pvl_to_vgs().  Moves and renames are handled with separate code
inside move_pv() and vgmerge().  Add pv->vg check to vg_validate().

A NULL value in pv->vg signifies membership in the orphan VG.
Note though in the case of pv_read() on a device with metadatacopies == 0,
more devices may need to be read for an authoritative answer.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-04-13 17:26:36 +00:00
Dave Wysochanski
11647ad01c Use del_pvl_from_vgs() in vgreduce paths.
Somehow these got missed in earlier patches.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-04-13 17:26:20 +00:00
Dave Wysochanski
0adfbfd5ea Call add_pvl_to_vgs() and del_pvl_from_vgs() from more places.
Now that we have library functions to add/delete a pv from the vg->pvs
list, call them from everywhere.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-04-13 17:26:03 +00:00
Dave Wysochanski
8cfd64de78 Add del_pvl_from_vgs() and move prototypes into metadata-exported.h
Add a delete function to manage the vg->pvs list.

NOTE: It may be possible to do further cleanup to these add/del functions
by passing a 'pv' as input instead of 'pv_list'.  The pv_list is used for
functions which do allocations (lvcreate) while other places in the code
just manage a list of 'pv' (e.g. import functions, vgextend, etc).

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-04-13 17:25:44 +00:00
Alasdair Kergon
1485ce69c4 Permit mimage LVs to be striped in lvcreate and lvresize. 2010-04-09 01:00:10 +00:00
Dave Wysochanski
fddc256a02 Check for duplicate paths (pvids) on the commandline of vgcreate.
A user specifying duplicate paths on the cmdline of vgcreate will
get a message similar to the following:
vgcreate vgtest2 /dev/loop3 /dev/loop5
  Found duplicate PV jk1lXsKzwyOKlXq6bhaFFKMQQ06oPgu8: using /dev/loop5 not /dev/loop3
  Found duplicate PV jk1lXsKzwyOKlXq6bhaFFKMQQ06oPgu8: using /dev/loop3 not /dev/loop5
  Internal error: Duplicate PV id jk1lXs-Kzwy-OKlX-q6bh-aFFK-MQQ0-6oPgu8 detected for /dev/loop3 in vgtest2.

This is caught by vg_validate(), but it would be good to find
this condition earlier in the vgcreate code.  add_pv_to_vg()
currently checks by pvname, but does not look for duplicate pvids.
This patch adds the check for duplicate pvids and results in new
error output as follows:
vgcreate vgtest2 /dev/loop3 /dev/loop5
  Found duplicate PV jk1lXsKzwyOKlXq6bhaFFKMQQ06oPgu8: using /dev/loop5 not /dev/loop3
  Found duplicate PV jk1lXsKzwyOKlXq6bhaFFKMQQ06oPgu8: using /dev/loop3 not /dev/loop5
  Physical volume '/dev/loop5 (jk1lXs-Kzwy-OKlX-q6bh-aFFK-MQQ0-6oPgu8)' listed more than once.
  Unable to add physical volume '/dev/loop5' to volume group 'vgtest2'.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-04-08 15:18:35 +00:00
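A sketch of the kind of pvid comparison fddc256a02 above adds. The structs and the function name are hypothetical; the 32-character id length matches the unformatted PV id shown in the messages above.

```c
#include <string.h>

#define ID_LEN 32  /* length of an unformatted PV id */

struct id {
    char uuid[ID_LEN];
};

/* Reduced stand-in for a physical volume; only the id is modelled. */
struct pv_sketch {
    struct id id;
};

/* Return 1 if any of the first 'count' PVs already carries this id. */
static int pvid_already_listed(const struct pv_sketch *pvs, unsigned count,
                               const struct id *id)
{
    unsigned i;

    for (i = 0; i < count; i++)
        if (!memcmp(pvs[i].id.uuid, id->uuid, ID_LEN))
            return 1;

    return 0;
}
```

Comparing ids rather than names catches the case above, where /dev/loop3 and /dev/loop5 are different paths to the same PV.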
Alasdair Kergon
4d0e07a799 missing ?: 2010-04-08 00:56:26 +00:00
Alasdair Kergon
b3302a0c3c suppress bogus compiler warning 2010-04-08 00:52:41 +00:00
Alasdair Kergon
aab7a3978b Fix pvmove allocation to take existing parallel stripes into account.
When moving parts of striped LVs, pvmove wouldn't care about leaving you with
two stripes on the same disk.  Now --alloc anywhere is needed for that.
(Tried and gave up on two alternative approaches before the one committed here.)
2010-04-08 00:28:57 +00:00
Dave Wysochanski
9e82787da2 Add add_pvl_to_vgs() - helper function to add a pv to a vg list.
Small refactor of main places in the code where a pv is added to a
vg into a small function which adds the pv to the list and updates
the vg counts.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-04-06 14:04:54 +00:00
Dave Wysochanski
53ad3cad14 Add pv to vg->pvs after check for maximum value of vg->extent_count.
In add_pv_to_vg(), we should only add the pv to vg->pvs after all
internal checks have passed.  The check for vg->extent_count exceeding
maximum was after we added the pv to the list, so this function could
return a state of vg->pvs that did not reflect other parameters such
as vg->pv_count.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-04-06 14:03:43 +00:00
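The ordering fix in 53ad3cad14 above follows a "validate first, then mutate" pattern. A minimal sketch, assuming the limit is UINT32_MAX and using a hypothetical reduced struct in which listed_pvs stands in for the length of vg->pvs:

```c
#include <stdint.h>

/* Reduced stand-in for a volume group. */
struct vg_sketch {
    uint32_t pv_count;
    uint32_t extent_count;
    uint32_t listed_pvs;  /* stands in for the length of vg->pvs */
};

/* Return 1 on success, 0 if adding pe_count extents would exceed the limit.
 * Nothing in *vg is modified unless every check has passed. */
static int add_pv_checked(struct vg_sketch *vg, uint32_t pe_count)
{
    if (pe_count > UINT32_MAX - vg->extent_count)
        return 0;

    vg->listed_pvs++;
    vg->pv_count++;
    vg->extent_count += pe_count;

    return 1;
}
```

On failure the list length and the counters still agree, which is the property the commit restores.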
Alasdair Kergon
d27c8b5660 remove compiler warning 2010-04-02 01:35:34 +00:00
Alasdair Kergon
6ec52a01ee A few more log_error to log_warn changes for mirrors. 2010-04-01 14:54:37 +00:00
Alasdair Kergon
abb9fb8370 Try to fix tracking of whether or not log extents need allocating. 2010-04-01 13:58:13 +00:00
Alasdair Kergon
0c67893ce9 Avoid endless loop if lv->segments list is corrupted 2010-04-01 13:08:06 +00:00
Alasdair Kergon
e7159c828b initialise log_allocated to 0 2010-04-01 12:29:07 +00:00
Alasdair Kergon
d723636d52 Limit number of error messages when checking LV segments. 2010-04-01 12:14:20 +00:00
Alasdair Kergon
a1192f17ba Improve vg_validate to detect some loops in lists. 2010-04-01 11:45:36 +00:00
Alasdair Kergon
0640232acd Improve vg_validate to detect some loops in lists. 2010-04-01 11:43:24 +00:00
Alasdair Kergon
258db3ad8e Change most remaining log_error WARNING messages to log_warn. 2010-04-01 10:34:09 +00:00
Alasdair Kergon
bce2869d92 Attempt to fix non-ALLOC_ANYWHERE allocation code after recent changes broke
the preference given to the PVs with the largest free areas.
2010-03-31 20:26:04 +00:00
Milan Broz
6733116a19 Ensure all segment memory is allocated from the vg private mempool.
Physical segments were still allocated from global
command context mempool.

This leads to very high memory usage when
activating large VG (vgchange).
(Memory usage was about 2G when >3000LVs).

Fix it by properly using vg->vgmem private pool,
so all the memory is released early.

New memory pool parameter is needed here for pv_split_segment
function.

Also fix the same problem in some minor allocations
(vg description, lv segment split).
2010-03-31 17:23:18 +00:00
Milan Broz
0423887528 Do not traverse PV segment list twice.
In addition to previous patch, we really do not need
to search for segment which was just allocated in
split request.

Make pv_split_segment function return newly allocated
(split) segment also.

(So after this patch, there is only one user
of slow find_peg_by_pe).
2010-03-31 17:22:26 +00:00
Milan Broz
80b96a8974 Optimise PV segments search.
The function find_peg_by_pe is incredibly inefficient
for PVs with many segments.

In shiny future there should be binary (or interval) tree
instead of sorted linked list (volunteers?).

Anyway, for now, we can use dirty trick here to optimise this case:

 - Allocations are usually applied from the beginning
 of PV (we have no allocation policy which allocates areas
 "backwards")

 - The only user of find_peg_by_pe is pv_split_segment()
 call. In *most* cases it needs to split *last* PV segment.

So if we search sorted pv segment list backwards, we
hit the requested segment immediately.

This patch applies this tiny change.
(and saves >30% of processing time when >3000LVs segments are on one PV!)

To discourage using this inefficient function from other code,
it is moved to pv_manip.c and used static for now:-)
2010-03-31 17:21:40 +00:00
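The backwards search from 80b96a8974 above might look roughly like this. The struct is a reduced, hypothetical stand-in with explicit prev/next pointers; the real code uses the library's own list type.

```c
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for a PV segment on a sorted, doubly linked list. */
struct pv_segment_sketch {
    uint32_t pe;   /* first physical extent of the segment */
    uint32_t len;  /* length in extents */
    struct pv_segment_sketch *prev, *next;
};

/* Walk from the tail towards the head and return the segment containing
 * extent 'pe', or NULL if no segment covers it. */
static struct pv_segment_sketch *
find_peg_by_pe_backwards(struct pv_segment_sketch *tail, uint32_t pe)
{
    struct pv_segment_sketch *peg;

    for (peg = tail; peg; peg = peg->prev)
        if (pe >= peg->pe && pe - peg->pe < peg->len)
            return peg;

    return NULL;
}
```

When the wanted segment is usually the last one, the loop ends on its first iteration; the worst case is still linear in the number of segments.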
Mikulas Patocka
655849fb14 A missing space in the error message.
Add missing parentheses to an error message

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2010-03-31 12:06:30 +00:00
Alasdair Kergon
1dee5eb625 Fix --alloc contiguous policy only to allocate one set of parallel areas. 2010-03-29 17:59:46 +00:00
Jonathan Earl Brassow
7a369d3704 Add ability to create mirrored logs for mirror LVs.
This check-in enables the 'mirrored' log type.  It can be specified
by using the '--mirrorlog' option as follows:
#> lvcreate -m1 --mirrorlog mirrored -L 5G -n lv vg

I've also included a couple updates to the testsuite.  These updates
include tests for the new log type, and some fixes to some of the
*lvconvert* tests.
2010-03-26 22:15:43 +00:00
Alasdair Kergon
2abbc07f3c Allow ALLOC_ANYWHERE to split contiguous areas. 2010-03-25 21:19:26 +00:00
Alasdair Kergon
a7ca334681 Add some assertions to allocation code. 2010-03-25 18:16:54 +00:00
Alasdair Kergon
f4cea344b1 improve a few comments in last check-in 2010-03-25 02:40:09 +00:00
Alasdair Kergon
8d6722c8ad Introduce pv_area_used into allocation algorithm and add debug messages.
This is the next preparatory step towards better --alloc anywhere
support and is not intended to break anything that currently works so
please report any problems - segfaults, bogus data in the new debug
messages, or if the code now chooses bizarre allocation layouts.
2010-03-25 02:31:48 +00:00
Mike Snitzer
a6bc975a24 Improve activation monitoring option processing
. Add "monitoring" option to "activation" section of lvm.conf
. Have clvmd consult the lvm.conf "activation/monitoring" too.
. Introduce toollib.c:get_activation_monitoring_mode().
. Error out when both --monitor and --ignoremonitoring are provided.
. Add --monitor and --ignoremonitoring support to lvcreate.  Update
  lvcreate man page accordingly.
. Clarify that '--monitor' controls the start and stop of monitoring in
  the {vg,lv}change man pages.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2010-03-23 22:30:18 +00:00
Alasdair Kergon
36f9d53b60 Allow dynamic extension of array of areas selected as allocation candidates. 2010-03-23 15:07:55 +00:00
Dave Wysochanski
15fdc8d3ee Avoid scanning all pvs in the system if operating on a device with mdas.
When we pv_read() a device that has an orphan vgname, we might need to scan
the system to be sure this is true.  However, if the PV has mdas, there's
no way possible for it to have an orphan vgname unless it is a true orphan.
Some areas of the code were optimized to take advantage of this fact, while
others were not (we would still do the expensive scan if a device had mdas
but had an orphan VG).

This patch unifies the code so that every place we are operating on such
a PV, we skip the expensive scan if there are mdas.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Acked-by: Petr Rockai <prockai@redhat.com>
Acked-by: Alasdair G Kergon <agk@redhat.com>
2010-03-18 17:29:12 +00:00
Milan Broz
acb4b5e4de Fix pvcreate device check.
If a user tries to vgcreate or vgextend a non-existent VG,
these messages appear:

# vgcreate xxx /dev/xxx
  Internal error: Volume Group xxx was not unlocked
  Device /dev/xxx not found (or ignored by filtering).
  Unable to add physical volume '/dev/xxx' to volume group 'xxx'.
  Internal error: Attempt to unlock unlocked VG xxx.

(the same with existing VG and non-existing PV & vgextend)
# vgextend vg_test /dev/xxx
...

It is caused because code tries to "refresh" cache if
md filter is switched on using cache destroy.

But we can change filters and rescan even without this
machinery now, just use refresh_filters
(and reset md filter afterwards).

(Patch also discovers cache alias bug in vgsplit test,
fix it by using better filter line.)
2010-03-17 14:44:18 +00:00
Alasdair Kergon
b1f9a2f5d1 Only do one full device scan during each read of text format metadata. 2010-03-16 17:30:00 +00:00
Alasdair Kergon
38220f9fe9 Remove unnecessary full_scan parameter from get_vgids and get_vgnames calls. 2010-03-16 16:57:03 +00:00
Alasdair Kergon
cccae7e633 Look up missing PVs by uuid not dev_name in _pvs_single to avoid invalid stat.
Make find_pv_in_vg_by_uuid() return same type as related functions.
2010-03-16 15:30:48 +00:00
Alasdair Kergon
770dc81b8e Introduce is_missing_pv(). 2010-03-16 14:37:38 +00:00
Mike Snitzer
c485fe183e Handle a misaligned device that reports a -1 alignment_offset.
The kernel's blk_stack_limits() function may flag a device as
'misaligned'.  If it does the alignment_offset will be -1.

Update set_pe_align_offset() to accommodate this corner case.
2010-03-02 21:56:14 +00:00
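One plausible way to accommodate the corner case in c485fe183e above is to treat a negative value as "no offset". This is an assumption about the handling, not a copy of set_pe_align_offset(); the helper name is hypothetical.

```c
/* The kernel reports -1 when it has flagged the device as misaligned.
 * Map any negative value to 0 so it is never used as a real offset. */
static unsigned long sanitize_alignment_offset(long reported)
{
    return (reported < 0) ? 0UL : (unsigned long) reported;
}
```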
Alasdair Kergon
16d9293bd7 Extend core allocation code in preparation for mirrored log areas. 2010-03-01 20:00:20 +00:00
Dave Wysochanski
3c23ff0f2e Add dm_pool_strdup to allocate memory and copy a tag in {lv|vg}_change_tag()
We need to allocate memory for the tag and copy the tag value before we
add it to the list of tags.  We could put this inside lvm2app since the
tools keep their memory around until vg_write/vg_commit is called, but
we put it inside the internal library to minimize code in lvm2app.
We need to copy the tag passed in by the caller to ensure the lifetime of
the memory until the {vg|lv} handle is released.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-02-24 18:15:57 +00:00
Dave Wysochanski
cd69ee7453 Refactor lvchange_tag() to call lv_change_tag() library function.
Similar refactoring to vgchange - pull out common parts and put into
library function for reuse.  Should be no functional change.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-02-24 18:15:49 +00:00
Dave Wysochanski
e17bcc7432 Refactor _vgchange_tag() to vg_change_tag() library function.
Pull out common code to be called from tools as well as lvm2app.
Leave archive() at tool level so we can use from vgcreate
as well as vgchange.  Should be no functional change.
- add stack macro in vgchange

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-02-24 18:15:05 +00:00
Mike Snitzer
4bdebfd151 Do not reload origin again in lv_remove_single() if it had a merging
snapshot.  vg_remove_snapshot() will have already performed the required
reload.
2010-02-17 23:36:45 +00:00
Mike Snitzer
a5ec3e3827 Refactor snapshot-merge deptree and device removal to support info-by-uuid
Add a merging snapshot to the deptree, using the "error" target, rather
than avoid adding it entirely.  This allows proper cleanup of the -cow
device without having to rename the -cow to use the origin's name as a
prefix.

Move the preloading of the origin LV, after a merge, from
lv_remove_single() to vg_remove_snapshot().  Having vg_remove_snapshot()
preload the origin allows the -cow device to be released so that it can
be removed via deactivate_lv().  lv_remove_single()'s deactivate_lv()
reliably removes the -cow device because the associated snapshot LV,
that is to be removed when a snapshot-merge completes, is always added
to the deptree (and kernel -- via "error" target).

Now when the snapshot LV is removed both the -cow and -real devices
get removed using uuid rather than device name.  This paves the way
for us to switch over to info-by-uuid queries.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2010-02-17 22:59:46 +00:00
Dave Wysochanski
629efc6a89 Export lvm_pv_get_size(), lvm_pv_get_free(), lvm_pv_get_dev_size in lvm2app.
We add these exports to show the pv_size and pv_free and dev_size
fields.
Fixes rhbz561423.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-02-14 03:21:37 +00:00
Dave Wysochanski
ed3329eb45 Fix off by 512 sizes for lvm2app.
Internally we store sizes in sectors, but lvm2app exports sizes
in bytes.  We could get fancier and allow units configuration but
this fix should do for now.

Fixes rhbz561422.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2010-02-14 03:21:06 +00:00
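The conversion behind ed3329eb45 above is a multiplication by 512. A minimal sketch; the macro and function names are illustrative, not necessarily those used in lvm2app.

```c
#include <stdint.h>

#define SECTOR_SHIFT 9  /* 2^9 = 512 bytes per sector */

/* Convert an internal size in 512-byte sectors to bytes for export. */
static uint64_t sectors_to_bytes(uint64_t sectors)
{
    return sectors << SECTOR_SHIFT;
}
```

For example, a size of 2048 sectors corresponds to 1048576 bytes (1 MiB).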
Peter Rajnoha
04fa77c3be This is related to liblvm and its lvm_list_vg_names() and lvm_list_vg_uuids() functions
where we should not expose internal VG names/uuids (the ones with "#" prefix) through the
interface. Otherwise, we could end up with library users opening internal VGs which will
initiate locking mechanism that won't be cleaned up properly.

"#orphans_{lvm1, lvm2, pool}" names are treated in a special way, they are truncated first
to "orphans" and this is used as a part of the lock name then (e.g. while calling lvm_vg_open()).
When library user calls lvm_vg_close(), the original name "orphans_{lvm1, lvm2, pool}"
is used directly and therefore no unlock occurs.

We should exclude internal VG names and uuids in the lists provided by lvmcache:
lvmcache_get_vgids() and lvmcache_get_vgnames().
2010-02-03 14:08:39 +00:00
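The filtering described in 04fa77c3be above can be sketched as a prefix test. Both helper names are hypothetical, and plain arrays stand in for the string lists that lvmcache_get_vgnames() and lvmcache_get_vgids() return.

```c
#include <stddef.h>

/* Internal VG names (such as the orphan VGs) start with '#'. */
static int is_internal_vg_name(const char *name)
{
    return name && name[0] == '#';
}

/* Copy pointers to the public names into 'out', which must have room for
 * 'count' entries.  Returns the number of names kept. */
static size_t filter_public_vg_names(const char **names, size_t count,
                                     const char **out)
{
    size_t i, kept = 0;

    for (i = 0; i < count; i++)
        if (!is_internal_vg_name(names[i]))
            out[kept++] = names[i];

    return kept;
}
```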
Dave Wysochanski
a7ca101517 Call _alloc_pv() inside _pv_read() and clean up error paths.
We should be consistent with pv constructors so call _alloc_pv()
here as we do from pv_create().
2010-01-21 21:09:23 +00:00
Dave Wysochanski
1d749d01fb Remove useless memory allocation for pv->vg_name in _alloc_pv().
All this seems to do is provide a memory leak so remove it.
The only caller of _alloc_pv() later explicitly sets
pv->vg_name = fmt->orphan_vg_name so clearly this allocation
should be removed.  I also saw no where in the code where
strncpy was used to assign pv->vg_name - only direct assignments
and strdup's.
2010-01-21 21:04:44 +00:00
Dave Wysochanski
2b1446c7d6 Correct 'void *' usage in pvcreate_single.
Remove needless cast.
2010-01-21 21:04:20 +00:00
Mike Snitzer
dfcb905db0 Preload the origin prior to suspend IFF snapshot(s) still exist after a
merge completes.  This narrows the scope of this "hack" (which still
needs a proper fix within the deptree).

This stops dmeventd from trying to access snapshot devices that were
already removed.
2010-01-20 21:53:10 +00:00
Mike Snitzer
e47a591d76 Improve target type compatibility checking in _percent_run().
Add 'target_status_compatible' method to 'struct segtype_handler'.
2010-01-15 16:35:26 +00:00
Alasdair Kergon
8dc351e8d4 Note some problems still to be addressed. 2010-01-14 14:39:57 +00:00
Zdenek Kabelac
fc28b13c7d Cleanup const compiler warning 2010-01-14 10:17:12 +00:00
Zdenek Kabelac
4269e36315 Move initialization of the 'cmd' member of the struct alloc_handle
before the first potential return.
2010-01-14 10:09:42 +00:00
Zdenek Kabelac
5f31bc7926 lvol%d is generated for NULL name in lv_create_empty().
So just avoid code duplication.
2010-01-14 10:08:03 +00:00
Mike Snitzer
c52678ee9b Rename segment and lv status flag from SNAPSHOT_MERGE to MERGING.
Eliminate 'merging_snapshot' from 'struct logical_volume' and just use
'snapshot' for origin lv's reference to the merging snapshot; also set
MERGING in the origin lv's status.
2010-01-13 01:56:18 +00:00
Mike Snitzer
c79b425135 Add snapshot merge wrappers to abstract the associations and flags used
to represent merging origin and snapshot volumes.
2010-01-13 01:55:43 +00:00
Mike Snitzer
28c3f0354a When turning merging origin into non-merging origin, there is a bad sequence:
snapshots are suspended, new origin is created, snapshots are resumed, new
origin is resumed.  So it allocates memory while suspended.

To fix it, move vg_commit after suspend_lv, so that the suspend code will
treat it as precommitted vg and will preload new origin prior to suspend.

NOTE: agk doesn't like this "hack"; need to revisit and fix
2010-01-13 01:52:58 +00:00
Mike Snitzer
3a8d01b6e1 Reload origin if merging has stopped. 2010-01-13 01:51:45 +00:00
Mike Snitzer
68e8f5a4a2 Add 'SNAPSHOT_MERGE' lv_segment 'status' flag.
Make 'merging_snapshot' pointer that points from the origin to the
segment that represents the merging snapshot.

Import/export 'merging_store' metadata.

Do not allow creating snapshots while another snapshot is merging.
Snapshot created in this state would certainly contain invalid data.

NOTE: patches at the end of this series will remove 'merging_snapshot'
and will introduce helpful wrappers and cleanups.
2010-01-13 01:35:49 +00:00
Alasdair Kergon
109e6334b0 Fix allocation code not to stop at the first area of a PV that fits.
This spurious 'break' has been here since this code was first committed
in June 2005 and stopped the algorithm behaving as described in the
comment above it and rendered the variable 'already_found_one' useless.
2010-01-12 20:53:20 +00:00
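Why a stray 'break' matters in 109e6334b0 can be shown with a minimal sketch.
The 'struct sketch_area' type and 'sketch_pick_area' function are hypothetical
and are not the LVM2 allocator; the selection rule (prefer the smallest area
that still fits) is an assumption chosen only to show why the scan has to
continue past the first match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical free area on a PV: start extent and length in extents. */
struct sketch_area {
	uint32_t start;
	uint32_t count;
};

/*
 * Return the index of the smallest area that holds 'needed' extents,
 * or -1 if none fits.  There is deliberately no 'break' after the first
 * match, so later and better candidates are still considered.
 */
int sketch_pick_area(const struct sketch_area *areas, size_t n, uint32_t needed)
{
	int best = -1;
	size_t i;

	assert(areas != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (areas[i].count < needed)
			continue;	/* too small */
		if (best < 0 || areas[i].count < areas[(size_t) best].count)
			best = (int) i;	/* keep scanning; do not break here */
	}

	return best;
}
```

With a 'break' after the first assignment to 'best', the function would always
return the first area that fits, whatever follows it in the list.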
Alasdair Kergon
f3ac7d1b82 Revert so-called "redundant" log until after next release. 2010-01-12 14:00:51 +00:00
Jonathan Earl Brassow
673421ffc2 Testsuite updates and fixes for recently added features.
1. Found bug in 'redundant log' implementation that caused
   problems when converting a linear that spanned multiple
   devices to a mirror (wasn't checking for NULL value of
   provided parameter in _alloc_parallel_area)

2. Testsuite was failing to perform tests when 'not' modifier
   was used.  This allowed a couple issues to slip through.
   Added a 'not_sh' modifier that negates tests performed by
   functions defined in the shell source file.

3. Was initializing a variable too far down, which caused a
   previously set value to be overridden.  (This was the
   result of the collision of the "redundant log" and
   lvconvert fix patches.)
2010-01-11 21:20:19 +00:00
Mike Snitzer
b422bb2187 remove unused variable 'i' that was recently introduced in lv_add_segment 2010-01-10 20:44:09 +00:00
Jonathan Earl Brassow
23f4aabd69 update comment 2010-01-08 23:06:36 +00:00
Jonathan Earl Brassow
77dd1c0e5f Add the new mirror log type "redundant". The options are now:
--mirrorlog core: in-memory log
--mirrorlog disk: persistent log
--mirrorlog redundant: redundant persistent log

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
2010-01-08 22:32:35 +00:00
Jonathan Earl Brassow
72e0743621 This patch adds the capability to split off mirror legs.
It is pretty much the same as reducing the number of
mirror legs, but we just don't delete them afterwards.

The following command line interface is enforced:
  prompt> lvconvert --splitmirrors <n> -n <name> <VG>/<LV>
where 'n' is the number of images to split off, and
where 'name' is the name of the newly split off logical volume.

If more than one leg is split off, a new mirror will be the
result.  The newly split off mirror will have a 'core' log.
Example:
[root@bp-01 LVM2]# !lvs
lvs -a -o name,copy_percent,devices
  LV            Copy%  Devices
  lv            100.00 lv_mimage_0(0),lv_mimage_1(0),lv_mimage_2(0),lv_mimage_3(0)
  [lv_mimage_0]        /dev/sdb1(0)
  [lv_mimage_1]        /dev/sdc1(0)
  [lv_mimage_2]        /dev/sdd1(0)
  [lv_mimage_3]        /dev/sde1(0)
  [lv_mlog]            /dev/sdi1(0)
[root@bp-01 LVM2]# lvconvert --splitmirrors 2 --name split vg/lv /dev/sd[ce]1
  Logical volume lv converted.
[root@bp-01 LVM2]# !lvs
lvs -a -o name,copy_percent,devices
  LV               Copy%  Devices
  lv               100.00 lv_mimage_0(0),lv_mimage_2(0)
  [lv_mimage_0]           /dev/sdb1(0)
  [lv_mimage_2]           /dev/sdd1(0)
  [lv_mlog]               /dev/sdi1(0)
  split            100.00 split_mimage_0(0),split_mimage_1(0)
  [split_mimage_0]        /dev/sde1(0)
  [split_mimage_1]        /dev/sdc1(0)

It can be seen that '--splitmirrors <n>' is exactly the same
as '--mirrors -<n>' (note the minus sign), except there is the
additional notion to keep the image being detached from the
mirror instead of just throwing it away.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
2010-01-08 22:00:31 +00:00
Zdenek Kabelac
56bb994166 orig_status preserves 64bit status. 2010-01-08 10:50:11 +00:00
Zdenek Kabelac
f760f97a1f Just add '.' at the end of error message. 2010-01-07 14:29:53 +00:00
Milan Broz
03984e05a3 Rename mirror_device_fault_policy to mirror_image_fault_policy 2010-01-06 13:27:06 +00:00
Mike Snitzer
df13cf08d5 Add missing 'stack;' for all suspend_lv and resume_lv callers.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2010-01-05 21:07:31 +00:00
Milan Broz
0e06c92fdf Propagate commit and revert metadata event to other nodes in cluster.
This patch tries to correctly track changes in lvmcache related to commit/revert.

For vg_commit: if there is cached precommitted metadata, after a successful commit
this metadata must be tracked as committed.

For vg_revert: remote nodes must drop precommitted metadata and its flag in lvmcache.

(N.B. The patch does not touch LV locks here in any way.)

All this machinery is needed to properly solve remote node cache invalidation, which
has caused several recently observed problems.
2010-01-05 16:09:33 +00:00
Milan Broz
4b1687fb74 Do not set precommitted flag in cache when precommitted metadata does not exist.
The use_precommitted flag indicates that we want to use precommitted metadata
(used in suspend call to preload table with precommitted data).

But if there are no such data, committed metadata are read but the cache
still contains that precommitted flag.

(The problem is that a possible later drop_metadata call will not invalidate
the device in the cache.)

The wrong precommitted state is stored on remote nodes during a normal
suspend/resume cycle _without_ vg_write/commit.

Use the PRECOMMITTED status flag here instead (which is always set if using
precommitted metadata here).
2010-01-05 16:01:22 +00:00
Milan Broz
60494fe74b Resume volumes in reverse order to preserve memlock pairing.
If renaming snapshot with virtual origin, the origin is renamed too.
But the code must resume LVs in reverse order to properly
pair memlock (in cluster locking).

(The resume of snapshot resumes origin too and later resume
is ignored otherwise.)
2010-01-05 15:58:11 +00:00
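The pairing problem in 60494fe74b can be modelled with a small sketch. All of
it is an assumption made for illustration: the 'sketch_' names, the single
'memlock_count' counter, and the rule that resuming a snapshot also resumes
its origin without touching the counter. It is not the LVM2 locking code.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_lv {
	struct sketch_lv *origin;	/* NULL unless this LV is a snapshot */
	int suspended;
};

int memlock_count;	/* stand-in for the memlock reference count */

void sketch_suspend(struct sketch_lv *lv)
{
	assert(lv != NULL);
	lv->suspended = 1;
	memlock_count++;
}

void sketch_resume(struct sketch_lv *lv)
{
	if (!lv->suspended)
		return;	/* already resumed: request ignored, no decrement */
	if (lv->origin)
		lv->origin->suspended = 0;	/* origin resumed as a side effect */
	lv->suspended = 0;
	memlock_count--;
}

/* Resume in the reverse of the order used for suspend. */
void sketch_resume_reverse(struct sketch_lv **lvs, size_t n)
{
	while (n > 0)
		sketch_resume(lvs[--n]);
}
```

If the snapshot is suspended first and the origin second, resuming in reverse
order brings the counter back to zero. Resuming in the same order leaves it at
one in this model, because the explicit resume of the origin is ignored.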
Milan Broz
cfe30f1df3 Drop metadata cache after device was autorepaired and removed from VG.
All long running processes must reload metadata when some
device becomes orphan after repair.
2009-12-18 12:45:41 +00:00
Milan Broz
aa02928ff7 Remove missing flag if PV reappeared and is empty.
When a PV device reappears with old metadata, it is
always updated to the new version by automatic metadata
repair.

Remove the missing flag if the device is empty.

If the device contains allocated extents, issue a warning that
the user must remove the volumes and re-add this PV before
manipulating this volume.

This partially solves bug 547842, where one PV (log) has failed,
dmeventd removes that device, and later this device reappears and
is wrongly added into the VG marked missing.
2009-12-18 12:44:20 +00:00
Petr Rockai
fbcb06145b Revert another unintended change that snuck in. 2009-12-17 15:59:53 +00:00
Petr Rockai
dff5da2d64 Fix removal of multiple devices from a mirror (+ regression test). 2009-12-17 15:38:29 +00:00
Petr Rockai
207542b40e Revert unintended change that slipped in with last checkin. 2009-12-16 19:26:20 +00:00
Petr Rockai
550cae2340 #define an INTERNAL_ERROR macro and use it throughout LVM. 2009-12-16 19:22:11 +00:00
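One way such a macro can work is as a string-literal prefix that the compiler
joins to the message. The sketch below assumes that form; the exact definition
used by 550cae2340 is not shown in this log, and 'sketch_format_error' is a
hypothetical stand-in for a log_error()-style call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed form: a plain string literal used as a message prefix. */
#define INTERNAL_ERROR "Internal error: "

/* Format an internal error message about the LV called 'name'. */
int sketch_format_error(char *buf, size_t len, const char *name)
{
	assert(buf != NULL && name != NULL);

	/* Adjacent string literals are concatenated at compile time. */
	return snprintf(buf, len, INTERNAL_ERROR "LV %s has no segments.", name);
}
```

Keeping the prefix in one macro makes every internal error easy to grep for and
keeps the wording identical across call sites.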
Zdenek Kabelac
735308699c Destroy allocated mempool in _vg_read_orphans() error path. 2009-12-11 13:14:44 +00:00
Milan Broz
34de60e4d4 Call explicitly suspend for temporary mirror layer.
The memlock_inc() fix is wrong: the memlock count is not
propagated to the long-living process (clvmd) and it just
underflows there.
Also, suspend is needed to preload precommitted metadata
on other nodes (remapping to the error target in this case).

With an explicit suspend we generate a lock request and the code
can update the memlock count.

(Infinitely "locked" memory meant that fs_unlock() was not
called properly, so old links in /dev/mapper for inactive devices
remained on cluster nodes.)

(N.B. failure of the suspend call here is not handled as a fatal
error - the LV is going to be removed later anyway.)
2009-12-09 19:53:39 +00:00
Milan Broz
adee669441 Use more descriptive variable name for temporary layer lv. 2009-12-09 19:43:39 +00:00
Milan Broz
0fa0e6addf Allow manipulation of precommitted metadata even when a PV is missing.
The new recovery code first tries to repair the LV and then removes the failed PV
from the VG. This means that during the operation there can be a VG with a PV missing,
and the vg_read code handles it as an inconsistent VG.

We already allow returning "inconsistent" committed metadata;
for mirror repair we need this for precommitted metadata too.
(The suspend call prepares precommitted metadata for the inactive table on
other cluster nodes.)

"Inconsistent" here means - correct metadata, just with some metadata areas
not found (obviously on missing or failed PVs).
2009-12-09 19:29:04 +00:00
Milan Broz
f72a06ccf7 Remove newly created log volume if initial deactivation fails.
If there is a problem deactivating the LV and
_init_mirror_log is called with remove_on_failure = 1,
remove the newly created log LV from metadata.

(This can happen if there is active device with the same name
but different UUID.)

The main reasons for this "workaround" patch are to
 - avoid keeping the _mlog volume in metadata, so the user can repeat the action
 - print a better error message describing the real problem

# lvcreate -m 2 -n lv1 -l 1 --nosync vg_bar
  WARNING: New mirror won't be synchronised. Don't read what you didn't write!
  /dev/vg_bar/lv1_mlog: not found: device not cleared
  Aborting. Failed to wipe mirror log.
  Error locking on node bar-01: Input/output error
  Unable to deactivate mirror log LV. Manual intervention required.
  Failed to create mirror log.

# lvcreate -m 2 -n lv1 -l 1 --nosync vg_bar
  WARNING: New mirror won't be synchronised. Don't read what you didn't write!
  Aborting. Unable to deactivate mirror log.
  Failed to initialise mirror log.
2009-12-09 18:09:52 +00:00
Dave Wysochanski
59baeb838c Update a few more uint64_t's related to the 64-bit status change.
At this point they probably do not matter but going forward they
may - depends on future patches for replicator, etc.  I think
these probably got missed because they were 'flags' so I changed
the name to 'status' to be consistent.  So the on-disk
fields are 'flags' and the in-structure fields are 'status' (bits).
NOTE: WHATS_NEW already has entry for this in current release.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
2009-12-04 17:48:32 +00:00
Milan Broz
fec4de9563 Fix tools to report error when stopped by user.
(And do not produce internal error message.)
2009-12-03 19:18:33 +00:00
Dave Wysochanski
c053fb62bc Fix setting of readahead in lvcreate.
The default comes from the configuration settings, with possible
commandline override.
2009-12-03 01:47:33 +00:00
Mike Snitzer
a2552d4f59 Switch status from 32-bit to 64-bit
The physical_volume, volume_group, logical_volume and lv_segment
structures' 'status' member is now uint64_t.

The alignment of these structures was also audited to remove holes.  The
movement of some members in 'volume_group' and 'lv_segment' eliminates
holes.  The 'physical_volume' structure still has one 4-byte hole after
'pe_size'; the other structures no longer have any holes.  The size of
each structure has not changed.
2009-11-24 22:55:55 +00:00
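The two points in a2552d4f59, a wider 'status' and the removal of alignment
holes, can be illustrated with hypothetical structures. Nothing below is taken
from the LVM2 headers; the member names and the flag value are assumptions, and
the amount of padding depends on the ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A flag above bit 31 only fits once 'status' is 64 bits wide. */
#define SKETCH_HIGH_FLAG (UINT64_C(1) << 32)

/* Member order that typically leaves padding before 'status'. */
struct sketch_with_hole {
	uint32_t count;
	uint64_t status;
	uint32_t size;
};

/* Same members, widest first, so common ABIs need no padding. */
struct sketch_no_hole {
	uint64_t status;
	uint32_t count;
	uint32_t size;
};

/* Set the high flag; truncating it to 32 bits would lose it. */
uint64_t sketch_set_high_flag(uint64_t status)
{
	assert((uint32_t) SKETCH_HIGH_FLAG == 0);
	return status | SKETCH_HIGH_FLAG;
}

/* Bytes of padding: total size minus the sum of the member sizes. */
size_t sketch_padding_with_hole(void)
{
	return sizeof(struct sketch_with_hole) -
	       (sizeof(uint64_t) + 2 * sizeof(uint32_t));
}
```

Reordering members in this way is how a structure can gain a wider field
without growing, which matches the commit's note that the structure sizes did
not change.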
Milan Broz
a4893bc377 Revert vg_read_internal change, clvmd cannot use vg_read now. (2.02.55) 2009-11-23 10:44:50 +00:00
Petr Rockai
4e11dfe3ca In case we refuse to continue due to missing PVs, print a hint about using
vgreduce --removemissing to remedy the situation.
2009-11-19 13:44:37 +00:00
Petr Rockai
e2683aafe6 The double resume in remove_mirror_images does not happen *always*. Only call
memlock_inc() when it actually does happen.
2009-11-19 13:42:38 +00:00
Petr Rockai
090585a8f4 Un-export vg_read_internal. 2009-11-19 12:13:37 +00:00
Petr Rockai
2f1d6f7f0c Add a missing #include (fix compiler warning). 2009-11-19 12:09:53 +00:00
Petr Rockai
c85222c461 Add an extra memlock_inc() to _remove_mirror_images to properly balance
reference counting (see code comment for details).
2009-11-18 18:23:46 +00:00
Dave Wysochanski
a42efe6bdf Rename validate_vg_create_params to vgcreate_params_validate. 2009-11-01 20:05:17 +00:00
Dave Wysochanski
accb17389c Rename pvcreate_params processing functions to better match <object><action>.
Rename fill_default_pvcreate_params to pvcreate_params_set_defaults.
Rename pvcreate_validate_restore_params to pvcreate_restore_params_validate.
Rename pvcreate_validate_params to pvcreate_params_validate.
2009-11-01 19:51:54 +00:00
Dave Wysochanski
0e6c4e93da Add vg_set_clustered() - move logic from vgchange.
Similar to other vg_set_* functions, we create a vg_set_clustered() function
which does a few checks and sets a flag.  This is where we check for
any limitations of clusters.
2009-10-31 17:30:52 +00:00
Dave Wysochanski
29aa56df68 Add vg_mda_count library function. 2009-10-31 17:26:13 +00:00
Alasdair Kergon
984abde146 Permit snapshots of mirrors. (brassow) 2009-10-26 10:01:56 +00:00
Petr Rockai
b4048242f5 Handle metadata with unknown segment types more gracefully. 2009-10-16 17:41:49 +00:00
Jonathan Earl Brassow
a1bb606aab I saw this in a bug report:
[root@xxxx-01 ~]# lvconvert -m 1 --corelog VG/cmirror
  Unable to convert the log of inactive cluster mirror cmirror

I've tried to clean-up the message a little more, so the name
of the mirror stands out more while preserving the sense that
it's not a problem with the specific device, but the fact that
it is inactive that is causing the problem.

New msg:
  Unable to convert the log of an inactive cluster mirror, cmirror
2009-10-14 14:55:44 +00:00
Dave Wysochanski
21e094d9df Cleanup comment and some whitespace. 2009-10-06 16:00:38 +00:00
Dave Wysochanski
36a1d8166c Refactor pvcreate - split pvcreate_validate_params into recovery/non-recovery.
Split pvcreate_validate_params into recovery and non-recovery parameters.
This is necessary so we can call the non-recovery validate function from
vgextend / vgcreate.  Note in the pvcreate tool case, we must call the
recovery validation function first (see treatment of pe_start and --zero),
and that we add a call to fill_default_pvcreate_params before the validation
functions.
2009-10-05 20:03:25 +00:00
Dave Wysochanski
c24a4ff2cc Allow calling fill_default_pvcreate_params from tools.
We need defaults for pvcreate_params at a higher level - this will
allow us to use a common function from the tools to take defaults,
then fill in any non-defaults from the commandline.

Future patches will refactor vgcreate/vgextend to call this function
if one or more pvcreate parameters are given on the commandline.
2009-10-05 20:03:08 +00:00
Dave Wysochanski
29123aa652 Add pvcreate_params to vg_extend.
Another refactoring for implicit pvcreate support.  We need to get
the pvcreate parameters somehow to the vg_extend routine.  Options
seemed to be:
1. Attach the parameters to struct volume_group.  I personally
did not like this idea in most cases, though one could make an
argument why it might be ok at least for some of the parameters
(e.g. metadatacopies).
2. Pass them in to the extend routine.  This second route seemed
to be the best approach given the constraints.

Future patches will parse the command line and fill in the actual
values for the pvcreate_single call.
Should be no functional change.
2009-10-05 20:02:48 +00:00
Dave Wysochanski
acb4073eed Add pvcreate_params to vg_extend_single_pv.
Should be no functional change.  If this parameter is set to NULL, just fail
the extend if the device is not already a PV.  If non-NULL, try pvcreate_single
before failing.  Note that pvcreate_single() handles the log_error in case
of failure so we just return 0 if pvcreate_single() fails.
2009-10-05 20:02:30 +00:00
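The NULL / non-NULL behaviour described in acb4073eed can be sketched like
this. The types and 'sketch_' functions are hypothetical stand-ins; the real
vg_extend_single_pv and pvcreate_single take different arguments.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_dev {
	int is_pv;	/* non-zero once the device carries a PV label */
};

struct sketch_pp {
	int unused;	/* placeholder for pvcreate parameters */
};

/* Stand-in for pvcreate_single(): always succeeds in this sketch. */
int sketch_pvcreate_single(struct sketch_dev *dev, const struct sketch_pp *pp)
{
	(void) pp;
	dev->is_pv = 1;
	return 1;
}

/* Returns 1 on success, 0 on failure. */
int sketch_extend_single(struct sketch_dev *dev, const struct sketch_pp *pp)
{
	assert(dev != NULL);

	if (!dev->is_pv) {
		if (!pp)
			return 0;	/* no implicit pvcreate requested */
		if (!sketch_pvcreate_single(dev, pp))
			return 0;	/* callee already reported the error */
	}

	/* ... the PV would be added to the VG here ... */
	return 1;
}
```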
Dave Wysochanski
a80fc69320 Refactor vg_extend - add vg_extend_single_pv.
Simple refactor to setup future changes related to implicit pvcreates.
Should be no functional change.
2009-10-05 20:02:04 +00:00
Alasdair Kergon
3d32c5f88b Add percent_range to copy_percent too. 2009-10-01 01:04:27 +00:00
Alasdair Kergon
78ad1549a5 Introduce percent_range_t and centralise snapshot full/mirror in-sync checks. 2009-10-01 00:35:29 +00:00
Alasdair Kergon
d557773841 Consolidate LV allocation into alloc_lv(). 2009-09-28 17:46:15 +00:00
Dave Wysochanski
68fac97a07 Add vg_is_resizeable() and cleanup references.
Clean up VG_RESIZEABLE flag by creating vg_is_resizeable().
Update comment - we no longer have ALLOW_RESIZEABLE.
Also use vg_is_exported() in one place missed by earlier patch.
Should be no functional change.
2009-09-15 18:35:13 +00:00
Dave Wysochanski
fca434258a Add most relevant vg_attr fields as lvm2app 'get' functions.
Of the vgs field vg_attr, a few of the most likely to be used attributes
are clustered, exported, and partial.  This patch adds the following 3
functions:
uint64_t lvm_vg_is_clustered(const vg_t vg)
uint64_t lvm_vg_is_exported(const vg_t vg)
uint64_t lvm_vg_is_partial(const vg_t vg)
2009-09-14 19:43:11 +00:00
Dave Wysochanski
8c7946664c Add max_pv and max_lv vg 'get' lvm2app exports. 2009-09-14 15:45:23 +00:00
Dave Wysochanski
43a1ea4e2f Update vg_remove_single_* functions to use the removed_pvs list.
Now that we've split vg_remove_single into two routines, in the first routine
that only manipulates memory, we move the PVs from the vg->pvs list to the
vg->removed_pvs list.  Then later, we iterate through this list to write the
removed PVs to disk, which removes them from the volume group and places them
into the internal ORPHAN VG.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>


Author: Dave Wysochanski <dwysocha@redhat.com>
2009-09-02 21:39:49 +00:00
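The two-step flow in 43a1ea4e2f (move PVs to a removed list in memory, write
them out later) can be sketched with fixed-size arrays. The real code uses
linked lists on struct volume_group; every name below is hypothetical, and the
disk write is represented only by a counter.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_PVS 8

struct sketch_vg {
	const char *pvs[SKETCH_MAX_PVS];
	size_t pv_count;
	const char *removed_pvs[SKETCH_MAX_PVS];
	size_t removed_count;
};

/* Step 1: memory only.  Move every PV onto the removed list. */
void sketch_remove_in_memory(struct sketch_vg *vg)
{
	assert(vg->pv_count + vg->removed_count <= SKETCH_MAX_PVS);

	while (vg->pv_count)
		vg->removed_pvs[vg->removed_count++] = vg->pvs[--vg->pv_count];
}

/*
 * Step 2: commit.  Walk the removed list; a real implementation would
 * write each PV back to disk as an orphan at this point.  Returns the
 * number of PVs processed.
 */
size_t sketch_remove_commit(struct sketch_vg *vg)
{
	size_t written = 0;

	while (vg->removed_count) {
		vg->removed_count--;
		written++;
	}

	return written;
}
```

Splitting the work this way lets a caller make several in-memory changes and
decide separately when they reach the disk.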
Dave Wysochanski
d50795ed09 Split vg_remove_single into 2 functions - the second part commits to disk.
Split vg_remove_single into vg_remove_check (mandatory checks before
vgremove) and vg_remove (do actual remove by committing to disk).

In liblvm, we'd like to provide a consistent API that allows multiple
changes in memory, then let lvm_vg_write() control the commit to disk.  In
some cases (for example, lvresize calls fsadm) this may not be possible.
However, since we are using an object model and dividing things into small
operations, the most logical model seems to be the lvm_vg_write model, and
handling the special cases as they arrive.  So as best as possible
we move towards this end.

A possible optimization would be to consolidate vg_remove (committing)
code with vgreduce code.  A second possible optimization is making vgreduce
of the last device equivalent to vgremove.  Today, lvm_vg_reduce fails if
vgreduce is called with the last device, but from an object model perspective
we could view this as equivalent to vgremove and allow it.  My gut feel is
we do not want to do this though.


Author: Dave Wysochanski <dwysocha@redhat.com>
2009-09-02 21:39:29 +00:00
Dave Wysochanski
940077d030 Rename internal library function vg_remove to vg_remove_mdas.
Later patches should consolidate the vgremove / vgreduce functions but for
now let's clarify what vg_remove actually does by changing the name.


Author: Dave Wysochanski <dwysocha@redhat.com>
2009-09-02 21:39:07 +00:00
Milan Broz
c2d4398d47 Fix uuid warning in pvcreate to use terminated (and dash formatted) UUID string.
# pvcreate -u udwxr7-BoKY-EeKM-r033-xK6o-4og7-F13sGi /dev/sdc
   uuid udwxr7BoKYEeKMr033xK6o4og7F13sGi|��� already in use on "/dev/sdb1"
 is now
# pvcreate -u udwxr7-BoKY-EeKM-r033-xK6o-4og7-F13sGi /dev/sdc
   uuid udwxr7-BoKY-EeKM-r033-xK6o-4og7-F13sGi already in use on "/dev/sdb1"
2009-08-20 07:03:02 +00:00
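The trailing garbage in the old message is what printing an unterminated
32-character ID looks like. A minimal sketch of dash formatting with explicit
termination follows; 'sketch_id_format' is a hypothetical helper, not the
library's own formatting function, and the 6-4-4-4-4-4-6 grouping is read off
the UUID shown in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_ID_LEN 32

/*
 * Copy a 32-character ID into 'out' as 6-4-4-4-4-4-6 groups separated
 * by dashes and always NUL-terminate it.  Returns 1 on success, 0 if
 * 'out' is too small (32 characters + 6 dashes + 1 terminator = 39).
 */
int sketch_id_format(const char *id, char *out, size_t out_len)
{
	static const size_t group[] = { 6, 4, 4, 4, 4, 4, 6 };
	size_t g, in = 0, o = 0;

	assert(id != NULL && out != NULL);

	if (out_len < SKETCH_ID_LEN + 6 + 1)
		return 0;

	for (g = 0; g < sizeof(group) / sizeof(group[0]); g++) {
		if (g)
			out[o++] = '-';
		memcpy(out + o, id + in, group[g]);
		o += group[g];
		in += group[g];
	}
	out[o] = '\0';	/* the raw ID itself carries no terminator */

	return 1;
}
```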
Dave Wysochanski
b521cadd66 Remove useless _pv_write wrapper. 2009-08-10 17:15:01 +00:00
Mike Snitzer
2aabcc1c1c Add devices/data_alignment_detection to lvm.conf.
Adds 'data_alignment_detection' config option to the devices section of
lvm.conf.  If your kernel provides topology information in sysfs (linux
>= 2.6.31) for the Physical Volume, the start of data area will be
aligned on a multiple of the ’minimum_io_size’ or ’optimal_io_size’
exposed in sysfs.

minimum_io_size is used if optimal_io_size is undefined (0).  If both
md_chunk_alignment and data_alignment_detection are enabled the result
of data_alignment_detection is used.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2009-08-01 17:08:43 +00:00
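The selection rule in 2aabcc1c1c can be written as a short sketch. The
function names are hypothetical, and the unit of the sizes does not matter as
long as it is the same everywhere; reading the values from sysfs is left out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pick the alignment: optimal_io_size if it is defined (non-zero),
 * otherwise minimum_io_size.
 */
uint64_t sketch_detect_alignment(uint64_t minimum_io_size,
				 uint64_t optimal_io_size)
{
	return optimal_io_size ? optimal_io_size : minimum_io_size;
}

/* Round 'pe_start' up to the next multiple of 'alignment'. */
uint64_t sketch_align_up(uint64_t pe_start, uint64_t alignment)
{
	uint64_t rem;

	if (!alignment)
		return pe_start;	/* nothing reported: leave unchanged */

	rem = pe_start % alignment;
	assert(rem < alignment);

	return rem ? pe_start + (alignment - rem) : pe_start;
}
```

For example, with minimum_io_size 8 and optimal_io_size 128, a data area that
would start at 385 is moved to 512.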
Mike Snitzer
57b660356e Add devices/data_alignment_offset_detection to lvm.conf.
If the pvcreate --dataalignmentoffset option is not specified the start
of a PV's aligned data area will be shifted by the associated
'alignment_offset' exposed in sysfs (unless
devices/data_alignment_offset_detection is disabled in lvm.conf).

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2009-08-01 17:07:36 +00:00
Mike Snitzer
04b2a4bdcf Add --dataalignmentoffset to pvcreate to shift start of aligned data area
Adds pe_align_offset to 'struct physical_volume'; is initialized with
set_pe_align_offset().  After pe_start is established pe_align_offset is
added to it.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2009-07-30 17:45:28 +00:00
Alasdair Kergon
9e813cc93b Remove pv_t, vg_t & lv_t handles from lib. Only liblvm uses them.
Rename lvm.h to lvm2app.h for now.
2009-07-29 13:26:01 +00:00
Alasdair Kergon
8762493eb8 \n 2009-07-28 20:41:41 +00:00
Dave Wysochanski
afcd9399a9 Add an open_mode to the vg struct for liblvm - enforce read / write semantics.
For now, a simple way to enforce the read/write semantics is to just save the
open mode of the VG.  If the caller uses lvm_vg_create, the mode is write.
The caller using lvm_vg_open can use either read or write to open the VG.
Once we have this, we enforce the permissions on each API call and don't allow
a caller to modify a VG that has not been opened properly.

This may be better combined with the locking mode, but I view that as future
cleanup, past this initial release.  The initial release should enforce the
basic object semantics though, as described in the lvm.h file.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2009-07-28 15:14:56 +00:00
Dave Wysochanski
9ac1af7160 Add lvm_vg_get_seqno, updating lvm.h and unit test.
Adding the ability to get the seqno is important for an application to
determine if something has changed in a VG.  Otherwise, the only way to
know is to open the VG with write permission and hold the handle.
2009-07-28 13:17:04 +00:00
Dave Wysochanski
1bd72d90a4 Add vg_reduce to metadata.c and metadata-exported.h
This function behaves a little differently from vg_reduce_single, because
it allows removing even the last pv. This has been done to be consistent
with lvm_vg_create, which creates an empty vg.

removed_pvs has been added to the volume_group struct. vg_reduce adds removed
pvs to this list to be able to commit the changes for the pvs in lvm_vg_comm
in liblvm2app.

Initialize removed_pvs list in format-specific volume_group constructors.
Ideally, we should have a base constructor here that initializes the general
non-format specific members of struct volume_group.  But until then, there
are multiple places to initialize these members.  Maybe a better patch would
be a base constructor patch for struct volume_group.  That is more work
though.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Thomas Woerner <twoerner@redhat.com>


Author: Dave Wysochanski <dwysocha@redhat.com>
2009-07-27 17:43:39 +00:00
Dave Wysochanski
8b3755a679 Use vg_size in vg_set_extent_size.
Author: Dave Wysochanski <dwysocha@redhat.com>
2009-07-26 12:41:36 +00:00
Dave Wysochanski
483a7cb6d5 Refactor a few report field calculations into separate functions.
For liblvm 'get' functions, we should share code with the reporting functions.
This means we need common code to return the values for the fields.
In this patch we refactor a few of the fields needed in liblvm.
Unfortunately, for the simple fields that do dereferences of structure
members (for example, vg_extent_count), we cannot call the common function
from the reporting infrastructure without more refactoring.  The reason is
that the dereference of the simple fields is done deep inside the reporting
code (to get the generic "data" pointer), and the display function is a
generic 'size32' function.  We can fix these issues later with more
refactoring.

Should be no functional change and the testsuite should cover any possible
regressions.  The only fields in the report affected by this patch are:
vg_size, vg_free, and pv_mda_count.


Author: Dave Wysochanski <dwysocha@redhat.com>
2009-07-26 12:41:09 +00:00
Dave Wysochanski
9963d0710e Move extents_from_size from lvcreate into internal library so we can reuse.
Author: Dave Wysochanski <dwysocha@redhat.com>
2009-07-26 02:34:09 +00:00
Dave Wysochanski
c42b235610 Move _lvcreate into the internal library and rename to lv_create_single.
After some refactorings, we can now move the bulk of _lvcreate into the
internal library, and we can call from liblvm.  In the future, we should
refactor lv_create_single further, probably by segtype, to reduce the
size of struct lvcreate_params.  For now this is a reasonable refactor
and allows us to re-use the function from liblvm.


Author: Dave Wysochanski <dwysocha@redhat.com>
2009-07-26 02:33:35 +00:00
Dave Wysochanski
c9b4604ba6 Remove use of void * from pvcreate_single.
We should use struct pvcreate_params to utilize compiler typechecking.


Author: Dave Wysochanski <dwysocha@redhat.com>
2009-07-26 02:02:22 +00:00
Dave Wysochanski
aa496e4c23 Move ORPHAN_VG lock outside pvcreate_single.
The implicit pvcreate requires either moving the ORPHAN_VG lock outside
pvcreate_single or somehow having the function know or detect whether
the ORPHAN_VG lock is already held.


Author: Dave Wysochanski <dwysocha@redhat.com>
2009-07-26 01:54:20 +00:00
Dave Wysochanski
89777f9cec Change pvcreate_single to return pv_t and update function description.
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>


Author: Dave Wysochanski <dwysocha@redhat.com>
2009-07-26 01:53:57 +00:00
Dave Wysochanski
9d5a318ac6 Allow pvcreate_single to be called with NULL for default pvcreate params.
Passing NULL for pvcreate parameters gives you default parameters for
pvcreate_single.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>


Author: Dave Wysochanski <dwysocha@redhat.com>
2009-07-26 01:53:30 +00:00
Dave Wysochanski
d4b6a8aa2a Move bulk of pvcreate logic into library.
In preparation for implicit pvcreate during vgcreate / vgextend,
move bulk of pvcreate logic inside library.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>


Author: Dave Wysochanski <dwysocha@redhat.com>
2009-07-26 01:53:09 +00:00
Dave Wysochanski
beeba64080 Remove unneeded pv_create wrapper function.
Author: Dave Wysochanski <dwysocha@redhat.com>
2009-07-26 01:52:19 +00:00
Dave Wysochanski
fce6fb489f Eliminate compile warning introduced by previous commit. 2009-07-24 15:15:26 +00:00
Dave Wysochanski
e6923120b9 Revert previous patch that moved VG_ORPHAN lock inside vg_extend.
We must hold the VG_ORPHAN lock until we commit to disk.  Otherwise,
we risk a race condition on vgcreate / vgextend.  Reverts the following
commit:

commit 72a41480ba
Author: Dave Wysochanski <dwysocha@redhat.com>
Date:   Fri Jul 10 20:09:21 2009 +0000

    Move orphan lock obtain/release inside vg_extend().

    With this change we now have vgcreate/vgextend liblvm functions.
    Note that this changes the lock order of the following functions as the
    orphan lock is now obtained first.  With our policy of non-blocking
    second locks, this should not be a problem.

    Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2009-07-24 15:01:43 +00:00
Dave Wysochanski
2206eb2409 Remove 'is already' message from vg_set_* functions.
These messages are unnecessary in the set functions.  We check for this
condition and print a message in the vgchange tool but not the library
functions.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2009-07-16 20:18:16 +00:00
Dave Wysochanski
e0bd7c7645 Remove extraneous messages for extent_size and alloc_policy upon vgcreate.
When converting to the new liblvm functions, the vgcreate code path
changed to create a new vg, then set values.  As a result of this
change, and the fact that we give a user a message if they try to
set the same value of a VG attribute (extent_size, alloc_policy, etc),
you'll see these 2 extraneous "is already" messages with vgcreate:
tools/lvm vgcreate vg2 /dev/loop2
  Physical extent size of VG vg2 is already 4.00 MB
  Volume group allocation policy is already normal
  Volume group "vg2" successfully created

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>


Author: Dave Wysochanski <dwysocha@redhat.com>
2009-07-16 03:25:26 +00:00
Alasdair Kergon
b8f47d5f69 Use log_error macro consistently throughout in place of log_err. 2009-07-15 20:02:46 +00:00
Alasdair Kergon
99a5d8a9c4 Revert broken commit:
2009-07-15 06:10:51  Un-export vg_read_internal.

lvm-functions.c:774: warning: implicit declaration of function 'vg_read_internal'
2009-07-15 17:26:26 +00:00
Petr Rockai
0d7bacf56e Un-export vg_read_internal. 2009-07-15 06:10:51 +00:00
Petr Rockai
21a98eda88 Port process_each_pv to new vg_read. 2009-07-15 05:50:22 +00:00
Petr Rockai
6ee7d2aa53 Remove lockingfailed().
We provide a lock type that behaves like no_locking, but is not
  clustered. Moreover, it also forbids any write locks. This magically (and
  consistently) prevents use of clustered VGs, or changing local VGs with
  --ignorelockingfailure. As a bonus, we can remove the special hacks in a few
  places. Of course, people looking for trouble can always set their locking_type
  to 0 to override.
2009-07-15 05:49:47 +00:00
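The behaviour attributed to the new lock type in 6ee7d2aa53 can be sketched as
a single decision function. The names and the structure are hypothetical; the
sketch only encodes the two refusals the message mentions (write locks, and
clustered VGs).

```c
#include <assert.h>
#include <stddef.h>

enum sketch_lock {
	SKETCH_LCK_READ,
	SKETCH_LCK_WRITE
};

struct sketch_vg_info {
	int clustered;
};

/* Returns 1 if the lock request is granted, 0 if it is refused. */
int sketch_readonly_lock(const struct sketch_vg_info *vg, enum sketch_lock type)
{
	assert(vg != NULL);

	if (type == SKETCH_LCK_WRITE)
		return 0;	/* no changes to any VG */

	if (vg->clustered)
		return 0;	/* clustered VGs need real cluster locking */

	return 1;	/* read lock on a local VG: behaves like no locking */
}
```

Because the refusal lives in the lock type itself, individual tools no longer
need their own special-case checks.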
Petr Rockai
19089ba331 Refuse to open VG with MISSING_PVs for update unless handles_missing_pvs is set. 2009-07-15 05:47:55 +00:00
Dave Wysochanski
39d6ccdfc7 Define handles to liblvm objects for pv, vg, lv, lvseg, pvseg.
Define the 5 main liblvm objects to be the pv, vg, lv, lvseg, and pvseg.
We need handles defined to all these objects in order for liblvm to be
equivalent to the reporting commands pvs, vgs, and lvs.

- move vg_t, lv_t, and pv_t from metadata-exported.h into lvm.h
- move lv_segment and pv_segment forward declarations into lvm.h
- add lvseg_t and pvseg_t to lvm.h

NOTE: We currently have an inconsistency in handle definitions.
lvm_t is defined as a pointer, while these other handles are just
structures.  We should pick one scheme and be consistent - perhaps
define all handles as pointers (this is what I've seen elsewhere).

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Acked-by: Alasdair G Kergon <agk@redhat.com>
2009-07-14 03:00:30 +00:00
Dave Wysochanski
cec2a2dacc Remove READ_REQUIRE_RESIZEABLE flag from vg_read() interface - no users.
The checks for RESIZEABLE_VG should now be inside the various functions that
have to do such operations.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Acked-by: Alasdair G Kergon <agk@redhat.com>
2009-07-14 02:19:19 +00:00
Dave Wysochanski
ffc12b3fc1 Remove READ_REQUIRE_RESIZEABLE flag from vgsplit.
Remove READ_REQUIRE_RESIZEABLE flag from vgsplit similar to the removal from
vgextend.  Move the check inside the functions that actually move pvs from
one vg structure to another.  Should be no functional change.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Acked-by: Alasdair G Kergon <agk@redhat.com>
2009-07-14 02:16:05 +00:00
Dave Wysochanski
6452d4ae9d Refactor vgsplit - move move_pvs_used_by_lv and move_pv inside library.
In the future we may export these functions or something like them in liblvm
For now this helps in cleaning up the checks for RESIZEABLE since we can
use the internal library function vg_bad_status_bits.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Acked-by: Alasdair G Kergon <agk@redhat.com>
2009-07-14 02:15:21 +00:00
Dave Wysochanski
5b57e82508 Remove READ_REQUIRE_RESIZEABLE from vgextend by moving check inside vg_extend.
Move the check for the RESIZEABLE flag inside the vg_extend function.
When we consolidated the vg locking, reading, and status flag checking,
we tied the check for the RESIZEABLE flag to the vg_read() call.  The problem
with this is you cannot know what other APIs the application may or may not
call after a vg_read() call.  Thus the READ_REQUIRE_RESIZEABLE flag is not
really ideal - ideally we should be checking for this flag on a specific
operation, not inside the vg_read() call.  This patch moves one check inside
the library.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Acked-by: Alasdair G Kergon <agk@redhat.com>
2009-07-14 02:14:04 +00:00
Dave Wysochanski
85d2f1811a Remove unused code vg_lock_and_read() and related flags.
Author: Dave Wysochanski <dwysocha@redhat.com>
2009-07-10 21:19:37 +00:00
Dave Wysochanski
7fa91ec044 Move orphan lock obtain/release inside vg_extend().
With this change we now have vgcreate/vgextend liblvm functions.
Note that this changes the lock order of the following functions as the
orphan lock is now obtained first.  With our policy of non-blocking
second locks, this should not be a problem.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2009-07-10 20:09:21 +00:00
Dave Wysochanski
03660b3045 Move orphan lock inside vg_remove_single.
Move the vg orphan lock inside vg_remove_single, now a complete liblvm
function.  Note that this changes the order of the locks - originally
VG_ORPHAN was obtained first, then the vgname lock.  With the current
policy of non-blocking second locks, this could mean we get a failure
obtaining the orphan lock.  In the case of a vg with lvs being removed,
this could result in the lvs being removed but not the vg.  Such a
scenario could have happened prior though with a different failure.
Other tools were examined for side-effects, and no major problems
were noted.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2009-07-10 20:08:37 +00:00
Dave Wysochanski
42ae96fa3d Remove force parameter from vg_remove_single, now the liblvm function.
Move check for active LVs outside of library function.  The vgremove
liblvm function will fail if there are active LVs.  It will
be the application's responsibility to check this condition and remove
the LVs individually before calling vgremove.  Note also that we've
duplicated the EXPORTED_VG check in vgremove_single (tools) and
vg_remove_single (library).  Duplication seemed the only option here
since we don't want to do the automatic removal of LVs (in the tools)
if the vg is exported, and we still need to protect the library call
from removal if the vg is exported.

We still need to deal with the ORPHAN lock but vg_remove_single is now
very close to our liblvm function.

TODO: Refactor lvremove in a similar way.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2009-07-10 20:07:02 +00:00
Dave Wysochanski
a6ad9c6166 Remove unnecessary parameters from vg_remove_single().
Use vg_t instead of struct volume_group.
Should be no functional change.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2009-07-10 20:05:29 +00:00
Dave Wysochanski
10a27bdfb6 Change vg_create() to take only minimal parameters and obtain a lock.
vg_t *vg_create(struct cmd_context *cmd, const char *vg_name);
This is the first step towards the API called to create a VG.
Call vg_lock_newname() inside this function.  Use _vg_make_handle()
where possible.
Now we have 2 ways to construct a volume group:
1) vg_read: Used when constructing an existing VG from disks
2) vg_create: Used when constructing a new VG
Both of these interfaces obtain a lock, and return a vg_t *.
The usage of _vg_make_handle() inside vg_create() doesn't fit
perfectly but it's ok for now.  Needs some cleanup though and I've
noted "FIXME" in the code.

Add the new vg_create() plus vg 'set' functions for non-default
VG parameters in the following tools:
- vgcreate: Fairly straightforward refactoring.  We just moved
vg_lock_newname inside vg_create so we check the return via
vg_read_error.
- vgsplit: The refactoring here is a bit more tricky.  Originally
we called vg_lock_newname and depending on the error code, we either
read the existing vg or created the new one.  Now vg_create()
calls vg_lock_newname, so we first try to create the VG.  If this
fails with FAILED_EXIST, we can then do the vg_read.  If the
create succeeds, we check the input parameters and set any new
values on the VG.

TODO in future patches:
1. The VG_ORPHAN lock needs some thought.  We may want to treat
this as any other VG, and require the application to obtain a handle
and pass it to other API calls (for example, vg_extend).  Or,
we may find that hiding the VG_ORPHAN lock inside other APIs is
the way to go.  I thought of placing the VG_ORPHAN lock inside
vg_create() and tying it to the vg handle, but was not certain
this was the right approach.
2. Cleanup error paths. Integrate vg_read_error() with vg_create and
vg_read* error codes and/or the new error APIs.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2009-07-09 10:09:33 +00:00
Dave Wysochanski
5d623bde94 Add vg_set_alloc_policy() liblvm function and move vgchange logic inside.
NOTE: vg_set_alloc_policy() returns success if you try to set a value that
is already stored.  The behavior of vgchange is unchanged, though: it still fails.
There is a fixme noted in the code about this inconsistency, which should
be resolved if possible.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2009-07-09 10:08:54 +00:00
Dave Wysochanski
dba458ae9a Add vg_set_max_pv() liblvm function and move vgchange logic inside.
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2009-07-09 10:07:47 +00:00
Dave Wysochanski
a88bfbcb13 Add vg_set_max_lv() liblvm function and move vgchange logic inside.
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2009-07-09 10:06:00 +00:00
Dave Wysochanski
3f6a21ead4 Rename vg_change_pesize to vg_set_extent_size and use vg_t.
In liblvm, we will reserve the word 'change' to mean an API that
both sets one or more values, and commits to disk.  This will be
consistent with the LVM commandline.  The existing vg_change_pesize()
function does not commit to disk, but just changes the extent_size
and ensures all internal structures are updated.  This logic should
be contained in a function that sets the value.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2009-07-09 10:04:52 +00:00
Dave Wysochanski
b1278ba1b8 Remove unused 'cmd' from vg_change_pesize().
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2009-07-09 10:03:37 +00:00
Dave Wysochanski
de5dfec56b Update vg_change_pesize() to contain all validity checks.
It would be nice to have one function that does all the validation
and setting of the VG's pesize.  However, currently some checks
are in the higher-level function _vgchange_pesize(), and some
checks are in the lower function vg_change_pesize().
This patch moves most of the higher-level checks inside
vg_change_pesize.  In one case a failure return code is
changed from ECMD_FAILED to EINVALID_CMD_LINE.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2009-07-09 10:02:15 +00:00
Dave Wysochanski
4c35d6dee6 Remove unneeded LOCK_NONBLOCKING from vg_read() API.
Remove unneeded LOCK_NONBLOCKING from vg_read() API and tools that
use it.  We no longer need this flag anywhere since we now automatically
set LCK_NONBLOCK inside lock_vol() if vgs_locked().
For further details, see:
commit d52b3fd3fe
Author: Dave Wysochanski <dwysocha@redhat.com>
Date:   Wed May 13 13:02:52 2009 +0000

    Remove NON_BLOCKING lock flag from tools and set a policy to auto-set.

    As a simplification to the tools and further liblvm, this patch pushes
    the setting of NON_BLOCKING lock flag inside the lock_vol() call.
    The policy we set is if any existing VGs are currently locked, we
    set the NON_BLOCKING flag.

At some point it may make sense to add this flag back if we get an
RFE from a liblvm user, but for now let's keep it as simple as
possible.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2009-07-08 14:33:17 +00:00
Dave Wysochanski
b251e09035 Remove READ_CHECK_EXISTENCE and vg_might_exist().
Remove READ_CHECK_EXISTENCE and vg_might_exist().
This flag and API is no longer used now that we have a separate
API to check for existence.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2009-07-08 14:31:17 +00:00
Dave Wysochanski
cd082bbea7 Remove unneeded LOCK_KEEP from vg_read() interface.
Remove unneeded LOCK_KEEP from vg_read() interface.
Update comment to clarify cases where _vg_lock_and_read() may return
with an error but the lock held.  Would be nice to make the vg_read()
interface consistent with regards to lock held and error behavior.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2009-07-08 14:28:30 +00:00
Alasdair Kergon
dd1d42d5d0 Permit several segment types to be registered by a single shared object. 2009-07-08 12:36:01 +00:00
Mike Snitzer
bb6a3a9608 Use the MD device's stripe-width, instead of chunk_size, to align the
data blocks of a Physical Volume that is placed directly upon an MD
device.
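
As a rough illustration of the alignment change above (a sketch, not the LVM2 code): the stripe width of a striped MD array is the chunk size multiplied by the number of data-bearing disks, and the start of the PV data area is rounded up to a multiple of that width. The function names, the sector units, and passing the data-disk count in directly are all assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Stripe width in sectors: one chunk on each data-bearing disk. */
uint64_t stripe_width_sectors(uint64_t chunk_sectors, unsigned data_disks)
{
	assert(data_disks > 0);
	return chunk_sectors * data_disks;
}

/* Round a proposed data-area start up to the next multiple of 'alignment'. */
uint64_t align_up(uint64_t start_sectors, uint64_t alignment)
{
	assert(alignment > 0);
	return ((start_sectors + alignment - 1) / alignment) * alignment;
}
```

With 64 KiB chunks (128 sectors) and three data disks, the stripe width is 384 sectors, so a data area that would otherwise start at sector 385 moves to sector 768.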
2009-07-06 19:04:24 +00:00
Dave Wysochanski
b91eb158d4 Don't segfault in vg_release when vg->cmd is NULL.
Sun May  3 13:06:14 CEST 2009  Petr Rockai <me@mornfall.net>
  * Don't segfault in vg_release when vg->cmd is NULL.


Author: Petr Rockai <prockai@redhat.com>
Committer: Dave Wysochanski <dwysocha@redhat.com>
2009-07-01 17:03:38 +00:00
Dave Wysochanski
13e8c7e434 Rework the toollib interface (process_each_*) on top of new vg_read.
Sun May  3 12:32:30 CEST 2009  Petr Rockai <me@mornfall.net>
  * Rework the toollib interface (process_each_*) on top of new vg_read.

Rebased 6/26/09 by Dave W.
- Add skipping message to process_each_lv
- Remove inconsistent_t.
2009-07-01 17:00:50 +00:00
Alasdair Kergon
476d463348 pre-release tidy up 2009-06-30 18:39:31 +00:00
Petr Rockai
11ee855e40 Allow metadata correction even when PVs are missing. 2009-06-10 20:17:32 +00:00
Dave Wysochanski
fe2b3ea0d4 In the new _vg_read_for_update(), we always do the check for CLUSTERED vg
status flag after reading the volume group.  Thus, no need to set the flag
in vg_read() or clear it later before calling _vg_bad_status_bits().

Also, add back in the !lockingfailed() as part of the CLUSTERED flag check.
It's unclear why it was removed when the check was moved from
_vg_bad_status_bits() to inside _vg_lock_and_read().
There was an open question about the last check in the 'if' stmt for
lockingfailed() with a previous patch submitted.  However, I would
defer that right now as it is a separate item and this patch should
be no functional change by including the !lockingfailed().

Petr acked this patch on 5/26; I just forgot to check it in.

Acked-by: Petr Rockai <prockai@redhat.com>
2009-06-10 16:14:40 +00:00
Dave Wysochanski
07b0c948ee Add vg_lock_newname() library function.
Various tools need to check for existence of a VG before doing something
(vgsplit, vgrename, vgcreate).  Currently we don't have an interface to
check for existence, but the existence check is part of the vg_read* call(s).
This patch is an attempt to pull out some of that functionality into a
separate function, and hopefully simplify our vg_read interface, and
move those patches along.

vg_lock_newname() is only concerned about checking whether a vg exists in
the system.  Unfortunately, we cannot just scan the system, but we must first
obtain a lock.  Since we are reserving a vgname, we take a WRITE lock on
the vgname.  Once obtained, we scan the system to ensure the name does
not exist.  The return codes and behavior are described in the function header.
You might think of this function as similar to an open() call with
O_CREAT and O_EXCL flags (returns failure with -EEXIST if file already
exists).
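
The open() analogy can be sketched as follows. This is only an illustration of the O_CREAT | O_EXCL pattern on a plain file; reserve_name() is a hypothetical helper, not an LVM2 function, and it does not model the VG write lock that vg_lock_newname() takes before scanning.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Atomically reserve 'path': succeed only if it did not exist before.
 * Returns 0 on success or a negative errno value (-EEXIST if taken).
 * Hypothetical helper for illustration; it is not part of LVM2.
 */
int reserve_name(const char *path)
{
	int fd;

	assert(path != NULL);

	/* O_EXCL makes the existence check and the creation one atomic step. */
	fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
	if (fd < 0)
		return -errno;

	close(fd);
	return 0;
}
```

The first caller to reserve a given name gets 0; any later caller gets -EEXIST until the name is removed again.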

NOTE: I think including the word "lock" in the function name is important,
as it clearly states the function obtains a lock and makes the code more
readable, especially when it comes to cleanup / unlocking.  The ultimate
function name is somewhat open for debate though so later we may rename.
2009-06-09 14:29:10 +00:00
Milan Broz
a908d0030c Suspend virtual origin before real snapshot.
Preloading the table for a snapshot can trigger a read of the snapshot
metadata (the in-kernel COW header).

The code should suspend the origin first to avoid a possible deadlock
when preloading (and thus calling the in-kernel snapshot constructor)
for an origin with a suspended COW device.

(fixes previous commit)
2009-06-06 16:37:15 +00:00
Milan Broz
66086ce962 Fix double releasing of vg when repairing of vg is requested.
Several commands call process_each_vg() and, in the provided
callback, explicitly recover the VG if it is inconsistent
(vgchange, vgconvert, vgscan).

This means the old VG is released and reread, but the calling function
(process_one_vg) still tries to unlock and release the old VG.

The patch moves the repair logic into the _process_one_vg() function.

It always tries to read the VG (even if inconsistent) and then decides
what to do according to a newly defined parameter.

The patch also unifies the inconsistent-VG error messages.

The only slight change is for the vgremove command, which
now tries to repair the VG before removing it if the force argument is given.
(It worked in a similar way before; only the order of operations changed.)
2009-06-05 20:00:52 +00:00
Milan Broz
771e191e99 Fix rename of active snapshot with virtual origin.
Code must suspend/resume the virtual origin too when renaming a
snapshot; otherwise the old name remains in the kernel.
2009-06-01 15:55:06 +00:00
Milan Broz
31f55a07db Fix convert polling to ignore LV with different UUID.
When mirror convert polling is started (mainly as a background process,
in lvchange -a y or in lvconvert itself) it tries to read VG
and LV identified by its name.

Unfortunately, the VG may already have a different LV under the same name,
and various more or less funny things can happen (note that
_finish_lvconvert_mirror suspends the volume for example).

(the typical example is our testing script which continuously recreates
LVs under the same name in the same VG.)

This patch adds optional uuid parameter which helps to properly
select the monitoring object. For lvconvert polling it is set to LV UUID
and both _get_lvconvert_vg and _get_lvconvert_lv use it to read the proper VG/LV.

(In the pvmove case it is NULL, here we poll for physical volume name).
2009-06-01 14:43:27 +00:00
Milan Broz
59d06d4dc7 Fix log allocation segfault (fix previous commits).
If there is no free area for log, code should break the loop.
(Otherwise it uses uninitialized areas later.)

Easily reproducible using lvconvert --repair
 - kill device with log
 - run lvconvert --repair vg/lv (with no PV usable for log)
2009-06-01 14:23:38 +00:00
Milan Broz
c1fdeec999 Fix readahead calculation problems.
During vgreduce, a failed mirror image is replaced with an error segment;
this segment type always has area_count == 0.
The current code expects at least one area with a device;
the patch fixes this with an additional check (fixes a segfault during vgreduce).

Also do not calculate readahead in every lv_info call; we only need
to cache PV readahead before activation calls, which lock memory.
2009-06-01 12:43:31 +00:00
Alasdair Kergon
eca971ca6c Remove verbose 'visited' messages. 2009-05-30 01:54:29 +00:00
Alasdair Kergon
5cffbf0bb0 Handle multi-extent mirror log allocation when smallest PV has only 1 extent. 2009-05-30 00:09:27 +00:00
Alasdair Kergon
5746e2e769 When creating new LV, double-check that name is not already in use. 2009-05-28 01:59:37 +00:00
Alasdair Kergon
ea0e5e6ea8 Rename internal vorigin LV to match visible LV. 2009-05-28 00:29:14 +00:00
Alasdair Kergon
99113cc588 Suppress 'removed' messages displayed when internal LVs are removed.
Fix lvchange -a and -p for sparse LVs.
Fix lvcreate --virtualsize to activate the new device immediately.
2009-05-27 18:19:21 +00:00
Alasdair Kergon
ea91a71bb9 Fix counting of virtual origin LVs in vg_validate. (mbroz) 2009-05-27 13:19:34 +00:00
Alasdair Kergon
25a2e7b80e Pre-release cleanups. 2009-05-21 03:04:52 +00:00
Milan Broz
d396100278 Use readahead of underlying device and not default (smaller) one.
When we are stacking LV over device, which has for some reason
increased read_ahead (e.g. MD RAID), the read_ahead hint
for libdevmapper is wrong (it is zero).

If the calculated read_ahead hint is zero, patch uses read_ahead of underlying device
(if first segment is PV) when setting DM_READ_AHEAD_MINIMUM_FLAG.

Because we are using dev-cache, it also stores this value in the cache for future use
(if several LVs are over one PV, BLKRAGET is called only once for underlying device.)

This should fix all the remaining problems with readahead mismatch reported
for DM over MD configurations (and similar cases).
2009-05-20 11:09:49 +00:00
Milan Broz
a01e55b6ec Use lock query instead of activate_lv_excl
- switch lvremove to not force activate volume when removing
 - ditto for force resync

 - fix some wrong return codes in lvchange_resync()
2009-05-20 09:55:33 +00:00
Milan Broz
970f241c52 Check max_lv in only one place and force the check only for new volume.
We can temporarily violate max_lv during mirror conversion etc.

(If the operation fails, orphan mirror images are visible to administrator
for manual remove for example. Not that this should ever happen:-)

Force limit only for lvcreate (and vg merge) command.

Patch also adds simple max_lv tests into testsuite
2009-05-13 21:29:10 +00:00
Milan Broz
82cf926094 Remove unneeded import parameter from lv_create_empty. 2009-05-13 21:28:31 +00:00
Milan Broz
afd9ba98c1 Merge lv_is_displayable and lv_is_visible.
Displayable and visible is the same thing.

volumes_count(vg) is now vg_visible_lvs() and always
returns number of LVs from user perspective.
2009-05-13 21:27:43 +00:00
Milan Broz
59d8429cb3 Introduce lv_set_visible & lv_set_invisible and use lv_is_visible always.
The vg->lv_count parameter now always includes the number of visible
logical volumes.

Note that virtual snapshot volume (snapshotX) is never visible,
but it is stored in metadata with visible flag.
2009-05-13 21:26:45 +00:00
Milan Broz
b14c5af76d Fix lv_is_visible to handle virtual origin.
Snapshot is visible if its origin is marked visible,
or if the origin is virtual.
2009-05-13 21:25:45 +00:00
Milan Broz
0b706ac672 Introduce link_lv_to_vg and unlink_lv_from_vg functions.
link_lv_to_vg and unlink_lv_from_vg are the only functions
for adding/removing logical volume from volume group.

Only these functions should manipulate the vg->lvs list.
2009-05-13 21:25:01 +00:00
Milan Broz
d60f341d96 Remove vg->lv_count and use counter function.
This should not cause problems but simplifies code a lot.

(volumes_count is merged with the lvs_visible
function and renamed by a following patch.)
2009-05-13 21:22:57 +00:00
Milan Broz
4b13d5a823 Fix snapshot segment import to not use duplicate segments & replace.
The snapshot segment (snapshotX) is created twice
during the text metadata segment processing.

This can cause temporary violation of max_lv count.

Simplify the code: the snapshot segment is now properly initialized
in the init_snapshot_seg function and does not need to be replaced
by a vg_add_snapshot call.

vg_add_snapshot() is now useful only for adding a new
snapshot, and it shares the same initialization function.

The snapshot name is always generated, so the name parameter can be
removed from the function call.
2009-05-13 21:21:58 +00:00
Alasdair Kergon
b44e3bdc86 better variable name for snapshot counting 2009-05-13 01:48:18 +00:00
Milan Broz
2f9a9d1a7f Remove snapshot_count from VG and use function instead. 2009-05-12 19:12:09 +00:00
Milan Broz
920e68d603 Fix first_seg() call for empty segment list.
The seg variable is a temporary variable for the list iterator;
code cannot expect it to remain NULL after iteration
(it contains a non-NULL pointer here if the list is empty).

Patch fixes first_seg function so it now correctly returns NULL
for empty segment list.
2009-05-12 19:09:21 +00:00
Dave Wysochanski
e08ad14696 Fix error path in vg_make_handle().
Enter the error condition if either of the allocations fail, and
don't use dm_pool_zalloc if dm_pool_create fails.
2009-04-28 17:46:47 +00:00
Alasdair Kergon
87f42fda5e Add sparse devices: lvcreate -s --virtualoriginsize (hidden zero origin).
Add lvs origin_size field.
Fix linux configure --enable-debug to exclude -O2.

Still a few rough edges, but hopefully usable now:
  lvcreate -s vg1 -L 100M --virtualoriginsize 1T
2009-04-25 01:17:59 +00:00
Petr Rockai
e1ea382999 A more thorough PV equality test (that also copes better with MISSING_PVs) in
_is_mirror_image_removable.
2009-04-23 16:43:01 +00:00
Petr Rockai
eeadae1b6a Do not include MISSING_PVs in allocation maps. 2009-04-23 16:41:27 +00:00
Milan Broz
e5656d86d2 Alloc PV internal structure from VG mempool if possible. 2009-04-22 09:31:30 +00:00
Milan Broz
8f3fd69ffa Move metadata backup call after vg_commit.
The backup() call stores metadata from memory.

But in a cluster, the backup() call performs a metadata backup
on remote nodes, and it reads the data from disk.

For metadata backup consistency,
patch moves all backup() calls after vg_commit.

(Moreover, some tools already do that this way.)
2009-04-21 14:31:57 +00:00
Milan Broz
405366fd48 Properly release VG memory pool in metadata manipulation code. 2009-04-10 10:01:08 +00:00
Milan Broz
8e1d5615b4 Introduce memory pool per volume group.
From now on, all code that reads a volume group is responsible for releasing
the allocated memory by calling vg_release(vg).
(For simplicity of use, vg_release can be called with vg == NULL,
following the same logic as free(NULL).)
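
A minimal sketch of the NULL-tolerant release convention, using a hypothetical vg_handle type backed by malloc() rather than the per-VG memory pool this patch introduces:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a VG handle that owns its allocations. */
struct vg_handle {
	char *name;
};

struct vg_handle *vg_handle_create(const char *name)
{
	struct vg_handle *vg;
	size_t len;

	assert(name != NULL);
	len = strlen(name) + 1;

	if (!(vg = malloc(sizeof(*vg))))
		return NULL;

	if (!(vg->name = malloc(len))) {
		free(vg);
		return NULL;
	}
	memcpy(vg->name, name, len);

	return vg;
}

/* Like free(NULL), releasing a NULL handle is a harmless no-op. */
void vg_handle_release(struct vg_handle *vg)
{
	if (!vg)
		return;

	free(vg->name);
	free(vg);
}
```

Because the release function accepts NULL, error paths can call it unconditionally without first checking whether the read succeeded.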

Also provide a simple macro for unlocking and releasing in one step;
tools usually use this approach.

The global memory pool (cmd->mem) should be used only for global
physical volume operations.

This patch has to be applied together with all subsequent patches to complete
the memory-pool-per-VG logic.

Using a separate memory pool saves quite a bit of memory when
using large VGs; this is mainly needed when we have to use
preallocated and locked memory (and should not overflow that
memory space).
2009-04-10 09:59:18 +00:00
Milan Broz
6fe905c705 Properly copy the whole pv structure for later use.
The all_pvs list, used in vg_read, should make its own private
copy of pv structures, otherwise (when vg will use its own pool)
it will point to released memory pool.
The same applies for get_pvs() call.

Patch adds pv_list copy helper and adds explicit memory pool
parameter into _copy_pv.

(Please note that all these helper functions cannot guarantee that
vg related fields are valid - proper vg read & lock must be used
if it is requested.)
2009-04-10 09:56:00 +00:00
Milan Broz
6ce8f8d553 Fix mirror log convert validation question. 2009-04-10 09:53:42 +00:00
Milan Broz
e24f357c23 Fix memory pool leak.
Always call alloc_destroy after finishing an operation
with a handle; otherwise it will leak a memory pool.

Also fix return code in lv_extend.
2009-04-07 10:20:28 +00:00
Milan Broz
ec5703ea07 Allocate new pv->vg_name from pool, it can be destroyed later.
(The mempool rename will be used later by vg private mempools)
2009-04-02 15:01:11 +00:00
Milan Broz
aa8111b3cd fix some issues when compiling with -D DEBUG_POOL
- fix compilation issues
- fix wrong pool object manipulation (lvm dumpconfig triggers assert)
- second iteration in loop _log_parallel_areas operates on non-existing object
2009-03-26 09:25:18 +00:00
Milan Broz
7f436a0f39 Fix lv_count when manipulating with snapshots and max_lv is set.
Patch fixes these problems:
 - during the snapshot creation process, two LVs need to be created:
   one is the cow, the second becomes the snapshot.
   If the code fails in vg_add_snapshot, lvcreate will not remove
   the cow LV.

 - if max_lv is set and VG contains snapshot, it can happen that
   during the activation lv_count is temporarily increased over the limit
   and VG metadata are not properly processed
   see https://bugzilla.redhat.com/show_bug.cgi?id=490298

 - vgcfgrestore allows a restore with max_lv set to a lower value than the actual
   LV count. This later leads to max_lv being completely ignored.

 - vgck doesn't call vg_validate(). It should at least try:-)

Signed-off-by: Milan Broz <mbroz@redhat.com>
2009-03-16 14:34:57 +00:00
Alasdair Kergon
81680dce3c Fix last check-ins: seg can be NULL. 2009-02-28 20:04:24 +00:00
Milan Broz
7b1c853bd9 Try to avoid full rescan if label scan is enough. 2009-02-25 23:29:06 +00:00
Milan Broz
0241c10fd6 Fix validation of dataalignment value introduced in previous commit. 2009-02-23 16:53:42 +00:00
Alasdair Kergon
8929ce6651 Add --dataalignment to pvcreate to specify alignment of data area. (mbroz)
This patch is not fully tested and leaves some related bugs unfixed.

Intended behaviour of the code now:

  pe_start in the lvm2 format PV label header is set only by pvcreate (or
vgconvert -M2) and then preserved in *all* operations thereafter.

  In some specialist cases, after the PV is added to a VG, the pe_start
field in the VG metadata may hold a different value and if so, it
overrides the other one for as long as the PV is in such a VG.

  Currently, the field storing the size of the data area in the PV label
header always holds 0.  As it only has meaning in the context of a
volume group, it is calculated whenever the PV is added to a VG (and can
be derived from extent_size and pe_count in the VG metadata).
2009-02-22 19:00:26 +00:00
Dave Wysochanski
4631e58782 Rename get_vgs() to get_vgnames() and clarify related error messages.
get_vgs() really returns a list of vgnames.  In the future we will use
get_vgs() to return a list of vg structures, similar to get_pvs().
2009-02-03 16:19:25 +00:00
Alasdair Kergon
432d4b9f6e Add as-yet-unused vg_read_error() and vg_might_exist(). (mornfall) 2009-01-27 01:48:47 +00:00
Alasdair Kergon
544deede49 Introduce as-yet-unused replacement vg_read() and vg_read_for_update()
functions.  (mornfall)
2009-01-27 00:40:44 +00:00
Alasdair Kergon
b8fa516016 Replace internal vg_check_status() implementation. (mornfall) 2009-01-26 22:42:59 +00:00
Alasdair Kergon
ff2f094761 Properly enforce cluster locking in as-yet-unused _vg_lock_and_read. (mornfall) 2009-01-26 22:22:07 +00:00
Alasdair Kergon
bc92cde62c Introduce as-yet-unused _vg_lock_and_read() and associated header file
definitions.
2009-01-26 22:13:22 +00:00
Alasdair Kergon
8544a8a254 Rename vg_read() to vg_read_internal(). (mornfall) 2009-01-26 19:01:32 +00:00
Dave Wysochanski
e9f57f2beb Add pv_mda_size to 'pvs' and vg_mda_size to 'vgs'.
Reports the size of the smallest metadata area in a PV or a VG.
Useful to confirm pvcreate --metadatasize or pvmetadatasize setting in
/etc/lvm/lvm.conf file.

NOTE: Actual value in these fields will most always differ from that
given in pvcreate options due to rounding and alignment effects.
2009-01-09 22:44:33 +00:00
Milan Broz
de28fed87b Fix "Calculate mirror log size" commit, the le_count should be always set. 2009-01-06 17:24:21 +00:00
Milan Broz
42dee539e2 Do not issue write beyond lv size.
pvcreate $DEV
vgcreate -s 1k vg_test $DEV
lvcreate -l 1 -n lv1 vg_test
..
/dev/vg_test/lv1: write failed after 1024 of 4096 at 0: No space left on device

Just check for maximum write size in set_lv.
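
The check amounts to clamping the write length to the size of the LV. A minimal sketch, assuming the 4096-byte write seen in the error message above; wipe_length() and its parameters are hypothetical names, not the set_lv code:

```c
#include <assert.h>
#include <stdint.h>

/* Limit a write of 'wanted_bytes' at the start of an LV to the LV's size. */
uint64_t wipe_length(uint64_t wanted_bytes, uint64_t lv_extents,
		     uint64_t extent_bytes)
{
	uint64_t lv_bytes;

	assert(extent_bytes > 0);
	lv_bytes = lv_extents * extent_bytes;

	return wanted_bytes < lv_bytes ? wanted_bytes : lv_bytes;
}
```

For the one-extent LV with 1k extents from the reproducer, the write is limited to 1024 bytes instead of 4096.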
2008-12-19 15:26:01 +00:00
Milan Broz
6d1b3b5385 Calculate mirror log size instead of hardcoding 1 extent size.
It fails for 1k PE now.

Patch adds log_region_size into the allocation handle struct
and uses it in _alloc_parallel_area() for proper log size calculation
instead of a hardcoded 1 extent - which can fail.

Reproducer for incorrect log size calculation:
        DEV=/dev/sd[bcd]

        pvcreate $DEV
        vgcreate -s 1k vg_test $DEV
        lvcreate -m1 -L 12M -n mirr vg_test

https://bugzilla.redhat.com/show_bug.cgi?id=477040

The log size calculation is mostly copied from kernel code.
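
A sketch of what such a calculation might look like. It assumes a log layout of a 2-sector header followed by a bitmap holding one bit per mirror region, padded to 32-bit words; those constants and all names below are assumptions for illustration and should be checked against the actual code.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_BYTES        512u  /* bytes per sector */
#define LOG_HEADER_SECTORS  2u    /* assumed offset of the bitmap within the log */
#define BITS_PER_WORD       32u   /* assumed bitmap word size */

static uint64_t div_up(uint64_t n, uint64_t d)
{
	return (n + d - 1) / d;
}

/*
 * Extents needed for a mirror log.
 * region_sectors: mirror region size; extent_sectors: extent size;
 * area_extents: length of one mirror image in extents.
 */
uint64_t mirror_log_extents(uint64_t region_sectors, uint64_t extent_sectors,
			    uint64_t area_extents)
{
	uint64_t regions, bitmap_bytes, log_sectors;

	assert(region_sectors > 0 && extent_sectors > 0);

	/* One bit per region, rounded up to whole words. */
	regions = div_up(area_extents * extent_sectors, region_sectors);
	bitmap_bytes = div_up(regions, BITS_PER_WORD) * (BITS_PER_WORD / 8);

	/* Header plus bitmap, rounded up to whole sectors. */
	log_sectors = div_up((uint64_t) LOG_HEADER_SECTORS * SECTOR_BYTES +
			     bitmap_bytes, SECTOR_BYTES);

	return div_up(log_sectors, extent_sectors);
}
```

For the reproducer above, assuming a 512 KiB region size (1024 sectors): 1k extents are 2 sectors, a 12M image is 12288 extents or 24 regions, the bitmap takes 4 bytes, and the log needs 3 sectors, which is 2 extents. Under these assumptions a single hardcoded extent is too small.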
2008-12-19 15:24:52 +00:00
Peter Rajnoha
b47952641a Added displayable_lvs_in_vg and lv_is_displayable functions to deal with
the counts of visible LVs from user's perspective consistently throughout
the code.
2008-12-04 15:54:26 +00:00
Alasdair Kergon
56d8844068 more fixes 2008-11-04 15:07:45 +00:00
Alasdair Kergon
2c44337bd5 Right, a simple build (without options) is working again. 2008-11-03 22:14:30 +00:00
Alasdair Kergon
9e71c18092 Fix temp table activation in mirror conversions not to happen in other cmds.
Fix temp table in mirror conversions to use always-present error not zero.
2008-10-23 11:21:04 +00:00
Alasdair Kergon
5da4feac0e Use temp table to set device size when converting mirrors.
(Avoids having same mirror table loaded twice concurrently by first
using a 'zero' table to set the size of the device so when mirror
table is preloaded it doesn't have to be activated immediately.)
2008-10-17 10:57:15 +00:00
Alasdair Kergon
3935c3ecd6 In resume_mirror_images replace activate_lv with resume_lv as workaround.
(The resume has the side-effect of resuming all of the original
mirror's sub-lvs in addition to the new 'error' target middle layer.)
2008-10-17 10:50:14 +00:00
Alasdair Kergon
9c4bf5db4a Fix conversion of md chunk size into sectors. 2008-10-03 14:22:18 +00:00
Milan Broz
9352a2fdad Fix misleading error message when there are no allocatable extents in VG. 2008-09-29 09:59:10 +00:00
Milan Broz
061fa9c4c5 Fix handling of PVs which reappeared with old metadata version. 2008-09-25 15:59:10 +00:00
Milan Broz
affecdc5fc Try to fix possible infinite loop in dependency tree walking (by mornfall). 2008-09-25 15:57:02 +00:00
Alasdair Kergon
8c5bcdabab Improve the way VGs with PVs missing are handled so manual intervention
is required in fewer circumstances.  (mornfall)
2008-09-19 06:42:00 +00:00
Alasdair Kergon
86fb36e2b0 Add device/md_chunk_alignment to lvm.conf 2008-09-19 05:33:37 +00:00
Alasdair Kergon
3c2086efdd adjust pe_align for md chunk size 2008-09-19 05:19:09 +00:00
Alasdair Kergon
f6700b450e remove unused var 2008-09-19 04:30:02 +00:00
Alasdair Kergon
a77f5bf258 Pass struct physical_volume to pe_align. 2008-09-19 04:28:58 +00:00
Alasdair Kergon
df6936c9e1 fix last patch return code 2008-09-19 00:20:39 +00:00
Alasdair Kergon
324e23b72d Avoid shuffling remaining mirror images when removing one, retaining primary. 2008-09-18 19:56:50 +00:00
Alasdair Kergon
4bb7a2f523 Add missing LV error target activation in _remove_mirror_images. 2008-09-18 19:09:47 +00:00
Milan Broz
3a2fb07349 Fix setting of volume limit count if converting to lvm1 format.
Fixes problem when after downconvert to lvm1 VG is broken:

# lvcreate -n lv1 -l 4 vg_test
  Invalid LV in extent map (PV /dev/sdb1, PE 0, LV 0, LE 0)
  ...
2008-08-29 13:41:21 +00:00
Zdenek Kabelac
38a8b563bf get lv_list properly from vg->lst and fix compiler warning 2008-08-13 13:42:35 +00:00
Zdenek Kabelac
e2151fb4af vgremove tries to remove lv snapshot first.
Added function lv_remove_with_dependencies().
2008-08-05 12:05:26 +00:00
Alasdair Kergon
82185ada82 Cease recognising snapshot-in-use percentages returned by early development kernels. 2008-07-15 00:25:52 +00:00
Alasdair Kergon
59743245b4 Fix up cache for PVs without mdas after consistent VG metadata is processed. 2008-06-27 15:18:31 +00:00
Alasdair Kergon
6db4136358 Update validation of safe mirror log type conversions in lvconvert. (brassow) 2008-06-26 23:05:11 +00:00
Alasdair Kergon
36081ccf2d Fix reporting of LV fields alongside unallocated PV segments. 2008-06-25 16:52:27 +00:00
Dave Wysochanski
15db00b53e Refactor pv_create() to take cmd_context - no functional change. 2008-06-24 20:10:32 +00:00
Dave Wysochanski
2d415cf9f8 Add uninitialized_var macro to suppress invalid compiler warnings.
One such warning is seen on fedora9 gcc compiler:
/metadata.c:1923: warning: 'results' may be used uninitialized in this function
2008-06-23 19:04:34 +00:00
Jim Meyering
ac27ef2593 Don't deref uninitialized log_lv upon failed mirror addition.
* mirror.c (add_mirror_images): Ensure that log_lv is initialized.
2008-06-13 12:15:55 +00:00
Alasdair Kergon
bfadd2133e Tweak detection of invalid fid after changes to PVs in VG in _vg_read. 2008-06-08 14:18:44 +00:00
Alasdair Kergon
ec40d92889 post-release 2008-06-06 19:28:35 +00:00
Alasdair Kergon
697e3bb2df back out unnecessary changes for this release 2008-06-06 17:36:19 +00:00
Alasdair Kergon
d0de492ff3 cope with volatile vginfo in vg_read 2008-06-06 11:12:50 +00:00
Alasdair Kergon
57d0dc0db2 Allow for vginfo changing during _vg_read. 2008-06-06 09:48:04 +00:00
Alasdair Kergon
ad134662a2 Drop metadata cache if update fails in vg_revert or vg_commit. 2008-06-03 17:56:54 +00:00
Alasdair Kergon
d9c0105fef Drop metadata cache before writing precommitted metadata instead of after. 2008-05-08 18:06:58 +00:00
Alasdair Kergon
eddc0f3307 post-release - reinstate incomplete enhancements 2008-04-29 16:11:28 +00:00
Alasdair Kergon
04ed52f6a0 pre-release (bug fixes only - enhancements excluded) 2008-04-29 15:58:25 +00:00
Dave Wysochanski
8e8baf89c0 Fix vgsplit internal counting of snapshot LVs. 2008-04-23 14:33:06 +00:00
Alasdair Kergon
fd1b118942 Check lv_count in vg_validate.
Fix internal LV counter when a snapshot is removed.
2008-04-22 12:54:33 +00:00
Milan Broz
581b17def2 Drop cached VG metadata before and after committing changes to it. 2008-04-15 14:46:19 +00:00
Alasdair Kergon
3a370b7350 more pre-release cleanup 2008-04-10 19:59:43 +00:00
Alasdair Kergon
0b2a795ece make list_move consistent with other list fns 2008-04-10 19:14:27 +00:00
Dave Wysochanski
985ca02b6a Add vg_is_clustered() helper function.
Should be no functional change.
2008-04-10 17:09:32 +00:00
Alasdair Kergon
6eb44b5091 Fix vgreduce to use vg_split_mdas to check sufficient mdas remain.
Add (empty) orphan VGs to lvmcache during initialisation.
Fix orphan VG name used for format_pool.
2008-04-08 12:49:21 +00:00
Alasdair Kergon
ad607a23f1 create fids for internal orphan VGs 2008-04-07 22:12:37 +00:00
Milan Broz
5619c629f6 Add detection of clustered mirror log capability.
Currently only check for kernel module presence.
2008-04-07 10:23:47 +00:00
Dave Wysochanski
9da5d7ac02 Add check to vg_commit() to ensure lock is held before writing new VG metadata. 2008-04-04 15:41:20 +00:00
Alasdair Kergon
6e210a6c54 Cache VG metadata internally while VG lock is held. 2008-04-01 22:40:13 +00:00
Dave Wysochanski
9332d2cb9d Add find_lv_in_lv_list() and find_pv_in_pv_list().
Update _add_pvs() to call find_pv_in_pv_list().
2008-03-28 19:08:23 +00:00
Dave Wysochanski
8e32e58e00 Use list_move() in applicable places. 2008-03-26 17:26:32 +00:00
Dave Wysochanski
052bbfba3a Add pvseg_is_allocated() for identifying a PV segment allocated to a LV. 2008-03-26 16:48:10 +00:00
Alasdair Kergon
7284f3f966 preparation for vg cache 2008-03-17 16:51:31 +00:00
Dave Wysochanski
ad8f37df1b Const cleanups in find_* functions. 2008-03-13 22:51:24 +00:00
Alasdair Kergon
60be88a0a6 Fix resetting of MIRROR_IMAGE and VISIBLE_LV after removal of LV. 2008-02-22 13:28:29 +00:00
Alasdair Kergon
4c0f4125ec Fix remove_layer_from_lv to empty the LV before removing it. (2.02.30) 2008-02-22 13:22:44 +00:00
Alasdair Kergon
39d3ec0b51 Add missing no-longer-used segs_using_this_lv test to check_lv_segments. 2008-02-22 13:22:21 +00:00
Jim Meyering
a34a6a3f71 is_orphan: make parameter "const" to avoid compiler warning 2008-02-13 20:01:48 +00:00
Alasdair Kergon
bb097a97ea split orphan VG by format type 2008-02-06 15:47:28 +00:00
Alasdair Kergon
4e9083db10 Fix mirror log name construction during lvconvert. (2.02.30)
Make monitor_dev_for_events recurse through the stack of LVs.
Clean up some more compiler warnings.
Add mirror names test script.
2008-01-31 12:19:36 +00:00
Alasdair Kergon
2871881859 undo a few 'stack' moves 2008-01-30 14:17:29 +00:00
Alasdair Kergon
67cdbd7e4d Some whitespace tidy-ups. 2008-01-30 14:00:02 +00:00
Alasdair Kergon
c51b9fff19 Use stack return macros throughout. 2008-01-30 13:19:47 +00:00
Alasdair Kergon
5dc6c0de80 Fix two check_lv_segments error messages to show whole segment. 2008-01-26 00:30:28 +00:00
Alasdair Kergon
3d13b4677d Refactor mirror log attachment code. 2008-01-26 00:25:04 +00:00
Alasdair Kergon
ec2bd20886 suppress compiler warning 2008-01-22 16:02:26 +00:00
Dave Wysochanski
1ce224d13f Fix vgsplit - print error if vgcreate option given w/existing vg destination
Fix vgsplit - reject split if metadata types or clustered attributes differ
Fix vgsplit - remove physicalextentsize option
Add vgsplit test cases
2008-01-22 02:48:53 +00:00
Alasdair Kergon
0e0a6eb6cf Fix lvcreate --nosync not to wait for non-happening sync. 2008-01-18 22:02:37 +00:00
Alasdair Kergon
7644c656d8 add lvconvert messages 2008-01-18 22:00:46 +00:00
Alasdair Kergon
0c06de632a pre-release review cleanups 2008-01-17 17:17:09 +00:00
Alasdair Kergon
db24ceca33 rename lv_remap_error 2008-01-17 13:54:05 +00:00
Alasdair Kergon
58a63ae973 mirror log stuff 2008-01-17 13:37:51 +00:00
Alasdair Kergon
5cf3c51857 lvconvert/vgreduce fixes 2008-01-17 13:13:54 +00:00
Alasdair Kergon
70955d40a1 fix a _get_vgs return 2008-01-16 22:52:46 +00:00
Alasdair Kergon
79182305ef additional safety check on new segment list 2008-01-16 20:00:01 +00:00
Dave Wysochanski
d865615e9a Create vgs_are_compatible() fn to check whether vgs are compatible for merging.
Add new vgmerge and vgsplit tests to check rejection of incompatible vgs.
Cleanup comments.
Bugzilla: bz251992

---
 lib/metadata/metadata-exported.h |    3 +
 lib/metadata/metadata.c          |   89 +++++++++++++++++++++++++++++++++-
 test/t-vgmerge-usage.sh          |  101 +++++++++++++++++++++++++++++++++++++++
 test/t-vgsplit-operation.sh      |   20 +++++++
 tools/vgmerge.c                  |   69 --------------------------
 tools/vgsplit.c                  |    5 -
 6 files changed, 215 insertions(+), 72 deletions(-)
2008-01-16 19:54:39 +00:00
Alasdair Kergon
ba4d6ad8ea adjust mirror log error message 2008-01-16 19:50:23 +00:00
Alasdair Kergon
17431cddac cope with stacked LVs as well as PVs when deciding which bits of mirrors to remove 2008-01-16 19:38:39 +00:00
Alasdair Kergon
876003dc44 allow a mirror to contain only one mimage 2008-01-16 19:18:51 +00:00
Alasdair Kergon
171b53fb25 export find_temporary_mirror() 2008-01-16 19:13:51 +00:00
Alasdair Kergon
e344497277 move removable_pvs checking 2008-01-16 19:11:39 +00:00
Alasdair Kergon
7d18ea22eb reorder funcs 2008-01-16 19:09:35 +00:00
Alasdair Kergon
72baf0c345 Maintain lists of stacked LV segments using each LV. 2008-01-16 19:00:59 +00:00
Alasdair Kergon
fb3226a3ed use scan_vgs_for_pvs to detect non-orphans without MDAs 2008-01-16 18:15:26 +00:00
Dave Wysochanski
8868a4ffc2 Move more parameter validation into the library.
Update vgrename to call validate_vg_rename_params().
Fix vgcreate and vgsplit default arguments by adding defaults parameter to
fill_vg_create_params().
Add t-vgrename-usage.sh test.
Bugzilla: bz251992
---
 tools/toollib.c  |   32 ++++++++------------------------
 tools/toollib.h  |    5 ++---
 tools/vgcreate.c |   35 +++++++++++++++++++++--------------
 tools/vgrename.c |   35 ++++++-----------------------------
 tools/vgsplit.c  |   21 ++++++++++++++-------
 5 files changed, 51 insertions(+), 77 deletions(-)
2008-01-15 22:56:30 +00:00
Dave Wysochanski
b8daca8570 Allow vgcreate options as input to vgsplit when new vg is split destination. 2008-01-14 21:07:58 +00:00
Alasdair Kergon
06ea7eaa27 Various lvconvert/polldaemon-related fixes from NEC. See lvm-devel
for original patches & explanations.
2008-01-10 18:35:51 +00:00
Milan Broz
a95892f77d Fix a segfault if using pvs with --all argument. (2.02.29) 2008-01-07 20:42:57 +00:00
Alasdair Kergon
ba0c495db7 lvconvert uses polldaemon now 2007-12-22 12:13:29 +00:00
Alasdair Kergon
b9c69aa63a a few more changes/fixes to recent code 2007-12-22 02:13:00 +00:00
Alasdair Kergon
1620864c35 more fixes 2007-12-20 23:12:27 +00:00
Alasdair Kergon
2b3dda7f72 various cleanups in recent patches 2007-12-20 22:37:42 +00:00
Alasdair Kergon
31e9db2690 stacked mirror support (incomplete) 2007-12-20 18:55:46 +00:00
Alasdair Kergon
a69ab65278 Major restructuring of pvmove and lvconvert layer manipulation code 2007-12-20 15:42:55 +00:00
Alasdair Kergon
b680c5c677 export can_split parameter until rest of pvmove allocation restructuring gets done 2007-12-05 22:11:20 +00:00
Alasdair Kergon
940d710ece drop mirrored_pv/mirrored_pe from alloc handle 2007-11-22 14:54:35 +00:00
Alasdair Kergon
3da4613d7b Start refactoring pvmove allocation code. 2007-11-22 13:57:21 +00:00
Alasdair Kergon
a68d8ec833 move pvresize_single back under tools 2007-11-15 22:11:18 +00:00
Alasdair Kergon
e5f7352bef Convert some vg_reads into vg_lock_and_reads 2007-11-15 02:20:03 +00:00
Alasdair Kergon
a6b22cf317 readahead activation code (but no dm support yet) 2007-11-12 20:51:54 +00:00
Alasdair Kergon
b4068515e8 Enhance the management of readahead settings. 2007-11-09 16:51:54 +00:00
Alasdair Kergon
19c865437a Prevent lvconvert -s from using same LV as origin and snapshot. 2007-11-07 16:33:12 +00:00
Alasdair Kergon
00a7c302ea Add pv_mda_free and vg_mda_free fields to reports for raw text format. 2007-11-05 17:17:55 +00:00
Alasdair Kergon
b7940c98c1 fix new lvremove checks - mustn't fail when activation is disabled 2007-11-04 16:28:57 +00:00
Alasdair Kergon
d38bf3616c Fix orphan-related locking in pvdisplay and pvs.
Fix missing VG unlocks in some pvchange error paths.
Add some missing validation of VG names.
Rename validate_vg_name() to validate_new_vg_name().
Change orphan lock to VG_ORPHANS.
Change format1 to use ORPHAN as orphan VG name.
2007-11-02 20:40:05 +00:00
Bryn M. Reeves
8b98c12815 Add is_orphan_vg() and change all hardcoded checks to use it. 2007-11-02 13:06:42 +00:00
Dave Wysochanski
0f8387c2d6 Remove comment about allocation of pv->vg_name. 2007-10-12 21:08:38 +00:00
Dave Wysochanski
0283c439ec Add _alloc_pv() and _free_pv() from _pv_create() code and fix error paths.
Modified original patch by Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
2007-10-12 18:37:19 +00:00
Dave Wysochanski
1b8de4cb25 Add pv_dev_name() to access PV device name.
Patch by Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
2007-10-12 14:29:32 +00:00
Dave Wysochanski
70d9f98ed3 Accessor functions for PV will not modify the given PV.
So we can add 'const' to it.
Patch by Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
2007-10-12 14:08:10 +00:00
Dave Wysochanski
9ea1d647af Non-functional change - refactor lv_create_empty().
Remove struct format_instance param - we can safely obtain
this from vg->fid inside the function.
2007-10-11 19:20:38 +00:00
Dave Wysochanski
8fd14f09e2 Non-functional change - refactor vg_add_snapshot fid parameter. 2007-10-11 18:51:21 +00:00
Dave Wysochanski
bf4f5b21a4 Some const fixups for previous checkins 2007-09-24 21:30:00 +00:00
Dave Wysochanski
dfef7f6942 Add %PVS extents option to lvresize, lvextend, and lvcreate. 2007-09-20 21:39:08 +00:00
Alasdair Kergon
9eea0107ba Fix strdup memory leak in str_list_dup(). 2007-09-17 16:02:46 +00:00