1
0
mirror of git://sourceware.org/git/lvm2.git synced 2024-12-21 13:34:40 +03:00
Commit Graph

3380 Commits

Author SHA1 Message Date
Alasdair G Kergon
a3a5f58c21 reporting: Add devtypes command.
Add internal devtypes reporting command to display built-in recognised
block device types.  (The output does not include any additional
types added by a configuration file.)

> lvm devtypes -o help
  Device Types Fields
  -------------------
    devtype_all            - All fields in this section.
    devtype_name           - Name of Device Type exactly as it appears in /proc/devices.
    devtype_max_partitions - Maximum number of partitions. (How many device minor numbers get reserved for each device.)
    devtype_description    - Description of Device Type.

> lvm devtypes
  DevType       MaxParts Description
  aoe                 16 ATA over Ethernet
  ataraid             16 ATA Raid
  bcache               1 bcache block device cache
  blkext               1 Extended device partitions
...
2013-09-18 01:09:15 +01:00
Alasdair G Kergon
97ba18f4cb filters: Add bcache.
N.B. Using bcache devices as PVs is still experimental.
Problems should be reported to the appropriate mailing lists.
2013-09-16 16:56:55 +01:00
Zdenek Kabelac
b8ea27ac97 cleanup: hide gcc warning
Older gcc is giving misleading warning:

metadata/lv_manip.c:4018: warning: ‘seg’ may be used uninitialized in
this function

But warning free compilation is better.
2013-09-11 23:40:45 +02:00
Jonathan Brassow
2691f1d764 RAID: Make RAID single-machine-exclusive capable in a cluster
Creation, deletion, [de]activation, repair, conversion, scrubbing
and changing operations are all now available for RAID LVs in a
cluster - provided that they are activated exclusively.

The code has been changed to ensure that no LV or sub-LV activation
is attempted cluster-wide.  This includes the often overlooked
operations of activating metadata areas for the brief time it takes
to clear them.  Additionally, some 'resume_lv' operations were
replaced with 'activate_lv_excl_local' when sub-LVs were promoted
to top-level LVs for removal, clearing or extraction.  This was
necessary because it forces the appropriate renaming actions the
occur via resume in the single-machine case, but won't happen in
a cluster due to the necessity of acquiring a lock first.

The *raid* tests have been updated to allow testing in a cluster.
For the most part, this meant creating devices with '-aey' if they
were to be converted to RAID.  (RAID requires the converting LV to
be EX because it is a condition of activation for the RAID LV in
a cluster.)
2013-09-10 16:33:22 -05:00
Jonathan Brassow
ca51435153 Misc/RAID: Enable resume_lv to handle some renaming conflicts.
When images and their associated metadata are removed from a RAID1 LV,
the remaining sub-LVs are "shifted" down to fill the gaps.  For
example, if there is a 3-way mirror:
	[0][1][2]
and we remove device#0, the devices will be shifted down
	[1][2]
and renamed.
	[0][1]

This can create a problem for resume_lv (specifically,
dm_tree_activate_children) during the renaming process though.  This
is because it will attempt to rename the higher indexed sub-LVs first
and find that it cannot because there are currently other sub-LVs with
that name.  The solution is to check for a conflicting name before
attempting to rename.  If a conflict is found and that conflicting
sub-LV is also in the process of renaming, we can defer the current
rename until the conflicting sub-LV has renamed and cleared the
conflict.

Now that resume_lv can handle these types of rename conflicts, we can
remove the workaround in RAID that was attempting to resume a RAID1
LV from the bottom-up in order to force a proper rename in assending
order before attempting a resume on the top-level LV.  This "hack"
only worked for single machine use-cases of LVM.  Clearing this up
paves the way for exclusive activation of RAID LVs in a cluster.
2013-09-09 15:07:28 -05:00
Zdenek Kabelac
f5832d8c49 deactivate: drop readahead calc in deactivation
Skip readahead when device will be deactivated.
2013-09-07 09:13:20 +02:00
Zdenek Kabelac
0670bfeb59 thin: validation catch multiseg thin pool/volumes
Multisegment thin pools and volumes are not supported.
Catch such error code path early.
2013-09-07 03:32:07 +02:00
Zdenek Kabelac
655296609e thin: fix monitoring of thin pool volume
Properly skip unmonitoring of thin pool volume in deactivation code
path. Code makes sure if there is just any thin pool user
it stays monitored with all its resources.
2013-09-07 03:31:04 +02:00
Zdenek Kabelac
4c001a7854 thin: fix resize of stacked thin pool volume
When the pool is created from non-linear target the more complex rules
have to be used and stacking needs to properly decode args for _tdata
LV. Also proper allocation policies are being used according to those
set in lvm2 metadata for data and metadata LVs.

Also properly check for active pool and extra code to active it
temporarily.

With this fix it's now possible to use:

lvcreate -L20 -m2 -n pool vg  --alloc anywhere
lvcreate -L10 -m2 -n poolm vg --alloc anywhere
lvconvert --thinpool vg/pool --poolmetadata vg/poolm

lvresize -L+10 vg/pool
2013-09-07 03:24:48 +02:00
Alasdair G Kergon
78647da1c6 toolcontext: Only reopen stdin if readable.
Don't fail when running lvm commands under versions of nohup that set
up stdin as O_WRONLY!
2013-08-28 23:55:14 +01:00
Alasdair G Kergon
c0f987949b activation: Fix segfault with inactive pvmove LV.
Set flag to avoid recursion back through an inactive pvmove LV when
populating deptree.
2013-08-28 22:56:23 +01:00
Jonathan Brassow
72d6bdd6b9 misc: make lv_is_on_pv use for_each_sub_lv to walk LV tree
Make lv_is_on_pv use for_each_sub_lv to walk the LV tree.  This
reduces code duplication.
2013-08-23 11:03:28 -05:00
Jonathan Brassow
e5c0213168 Thin: Make 'lv_is_on_pv(s)' work with thin types
The pool metadata LV must be accounted for when determining what PVs
are in a thin-pool.  The pool LV must also be accounted for when
checking thin volumes.

This is a prerequisite for pvmove working with thin types.
2013-08-23 08:49:16 -05:00
Jonathan Brassow
f1e3640df3 Misc: Make get_pv_list_for_lv() available to more than just RAID
The function 'get_pv_list_for_lv' will assemble all the PVs that are
used by the specified LV.  It uses 'for_each_sub_lv' to traverse all
of the sub-lvs which may compose it.
2013-08-23 08:40:13 -05:00
Peter Rajnoha
c8daa15270 filter-mpath: remove superfluous error message about mpath major not equal to dm major
This is a regression caused by commit 3bd9048854.
The error message added with that commit "mpath major %d is not dm major %d" is
superfluous.

When scanning for mpath components, we're looking for a parent device.
But this parent device is not necessarily an mpath device (so the dm device)
if it exists - it can be any other device layered on top (e.g. an MD RAID device).
2013-08-21 14:07:01 +02:00
Peter Rajnoha
0563bd0037 fix: some issues reported by coverity
- null_fd resource leak on error path in _reopen_fd_null fn
  - dead code in verify_message in clvmd code
  - dead code in _init_filter_components in toolcontext code
  - null dereference in dm_prepare_selinux_context on error path if
    setfscreatecon fails while resetting SELinux context
2013-08-15 12:23:49 +02:00
Alasdair G Kergon
80bcdb93ff filters: check for mpath before opening devs
Split out the partitioned device filter that needs to open the device
and move the multipath filter in front of it.

When a device is multipathed, sending I/O to the underlying paths may
cause problems, the most obvious being I/O errors visible to lvm if a
path is down.

Revert the incorrect <backtrace> messages added when a device doesn't
pass a filter.

Log each filter initialisation to show sequence.

Avoid duplicate 'Using $device' debug messages.
2013-08-13 23:26:58 +01:00
Jonathan Brassow
abc89422af Mirror: Fix inability to remove VG's cluster flag if it contains a mirror
According to bug 995193, if a volume group
	1) contains a mirror
	2) is clustered
	3) 'locking_type' = 0 is used
then it is not possible to remove the 'c'luster flag from the VG.  This
is due to the way _lv_is_active behaves.

We shouldn't allow the cluster flag to be flipped unless the mirrors in
the cluster are not active.  This is because different kernel modules
are used depending on whether a mirror is cluster or not.  When we
attempt to see if the mirror is active, we first check locally.  If it
is not, then we attempt to check for remotely active instances if the VG
is clustered.  Since the no_lock locking type is LCK_CLUSTERED, but does
not implement 'query_resource', remote_lock_held will always return an
error in this case.  An error from remove_lock_held is treated as though
the lock _is_ held (i.e. the LV is active remotely).  This blocks the
cluster flag from changing.

The solution is to implement 'query_resource' for the no_lock type.  It
will report a message and return 1.  This will allow _lv_is_active to
function properly.  The LV would be considered not active remotely and
the VG can change its flag.
2013-08-12 13:56:47 -05:00
Alasdair G Kergon
28760275e6 logging: tidy log_sys_error when string empty 2013-08-12 18:40:41 +01:00
Peter Rajnoha
2f61478436 workaround: gcc v4.8 on 32 bit param. passing bug when -02 opimization used
gcc -O2 v4.8 on 32 bit architecture is causing a bug in parameter
passing. It does not happen with -01 nor -O0.

The problematic part of the code was strlen use in config.c in
the config_def_check fn and the call for _config_def_check_tree in it:

<snip>
  rplen = strlen(rp);
  if (!_config_def_check_tree(handle, vp, vp + strlen(vp), rp, rp + rplen, CFG_PATH_MAX_LEN - rplen, cn, cmd->cft_def_hash)) ...
</snip>

If compiled with -O0 (correct):

Breakpoint 1, config_def_check (cmd=0x819b050, handle=0x81a04f8) at config/config.c:775
(gdb) p	vp
$1 = 0x8189ee0 <_cfg_path> "config"
(gdb) p	strlen(vp)
$2 = 6
(gdb)
_config_def_check_tree (handle=0x81a04f8, vp=0x8189ee0 <_cfg_path>
"config", pvp=0x8189ee6 <_cfg_path+6> "", rp=0xbfffe1e8 "config",
prp=0xbfffe1ee "", buf_size=58, root=0x81a2568, ht=0x81a65
48) at config/config.c:680
(gdb) p	vp
$4 = 0x8189ee0 <_cfg_path> "config"
(gdb) p	pvp
$5 = 0x8189ee6 <_cfg_path+6> ""

If compiled with -O2 (incorrect):

Breakpoint 1, config_def_check (cmd=cmd@entry=0x8183050, handle=0x81884f8) at config/config.c:775
(gdb) p	vp
$1 = 0x8172fc0 <_cfg_path> "config"
(gdb) p strlen(vp)
$2 = 6
(gdb) p	vp + strlen(vp)
$3 = 0x8172fc6 <_cfg_path+6> ""
(gdb)
_config_def_check_tree (handle=handle@entry=0x81884f8, pvp=0x8172fc7
<_cfg_path+7> "host_list", rp=rp@entry=0xbffff190 "config",
prp=prp@entry=0xbffff196 "", buf_size=buf_size@entry=58, ht=0x
818e548, root=0x818a568, vp=0x8172fc0 <_cfg_path> "config") at
config/config.c:674
(gdb) p	pvp
$4 = 0x8172fc7 <_cfg_path+7> "host_list"

The difference is in passing the "pvp" arg for _config_def_check_tree.
While in the correct case, the value of _cfg_path+6 is passed
(the result of vp + strlen(vp) - see the snippet of the code above),
in the incorrect case, this value is increased by 1 to _cfg_path+7,
hence totally malforming the string that is being processed.

This ends up with incorrect validation check and incorrect warning
messages are issued like:

 "Configuration setting "config/checks" has invalid type. Found integer, expected section."

To workaround this issue, remove the "static" qualifier from the
"static char _cfg_path[CFG_PATH_MAX_LEN]". This causes the optimalizer
to be less aggressive (also shuffling the arg list for
_config_def_check_tree call helps).
2013-08-09 13:24:50 +02:00
Jonathan Brassow
c95f17ea64 Mirror: Fix issue preventing PV creation on mirror LVs
Commit b248ba0a39 attempted to
prevent mirror devices which had a failed device in their
mirrored log from being usable/readable by LVM.  This was to
protect against circular dependancies where one LVM command
could be blocked trying to read one of these affected mirrors
while the LVM command to fix/unblock that mirror was stuck
behind the currently running command.

The above commit went wrong when it used 'device_is_usable()' to
recurse on the mirrored log device to check if it was suspended
or blocked.  The 'device_is_usable' function also contains a check
for reserved names - like *_mlog, etc.  This last check always
triggered when checking a mirror's log simply because of the name,
not because it was suspended or blocked - a false positive.

The solution is to create a new function like 'device_is_usable',
but without the check for reserved names.  Using this new function
(device_is_suspended_or_blocked), we can check the status of a
mirror's log device properly.
2013-08-07 17:42:26 -05:00
Jonathan Brassow
c13d1b11b2 RAID: Make "raid10" the default striped + mirror segment type
When both the '-i' and '-m' arguments are specified on the command
line, use the "raid10" segment type.  This way, the native RAID10
personality is used through dm-raid rather than layering a mirror
on striped LVs.  If the old behavior is desired, the '--type'
argument to use would be "mirror" rather than "raid10".
2013-08-06 14:15:08 -05:00
Jonathan Brassow
7e1083c985 RAID: Make "raid1" the default mirror segment type 2013-08-06 14:13:55 -05:00
Peter Rajnoha
f74e8fe044 thin: fix commit e195b5227e
Check chunk_size range unconditionally.
2013-08-06 16:28:12 +02:00
Zdenek Kabelac
f6dd5a294b exec: pipe open
Function replaces popen() system and avoids shell execution
and argument parsing (no surprices).
2013-08-06 16:18:43 +02:00
Zdenek Kabelac
b6437a6180 cleanup: update exec_cmd comment and error
Use log_sys_error for reporting error of system call.
Fix comment for return value.
2013-08-06 16:16:57 +02:00
Peter Rajnoha
34d207d9b3 lvmetad: fix mda offset/size overflow if >= 4g (32bit)
When reading an info about MDAs from lvmetad, we need to use 64 bit
int to read the value of the offset/size, otherwise the value is
overflows and then it's used throughout!

This is dangerous if we're trying to write such metadata area then,
mostly visible if we're using 2 mdas where the 2nd one is at the end
of the underlying device and hence the value of the mda offset is
high enough to cause problems:

(the offset trimmed to value of 0 instead of 4096m, so we write
at the very start of the disk (or elsewhere if the offset has
some other value!)

[1] raw/~ # lvcreate -s -l 100%FREE vg --virtualsize 4097m
  Logical volume "lvol0" created

[1] raw/~ # pvcreate --metadatacopies 2 /dev/vg/lvol0
  Physical volume "/dev/vg/lvol0" successfully created

[1] raw/~ # hexdump -n 512 /dev/vg/lvol0
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
0000200

[1] raw/~ # pvchange -u /dev/vg/lvol0
  Physical volume "/dev/vg/lvol0" changed
  1 physical volume changed / 0 physical volumes not changed

[1] raw/~ # hexdump -n 512 /dev/vg/lvol0
0000000 d43e d2a5 4c20 4d56 2032 5b78 4135 7225
0000010 4e30 3e2a 0001 0000 0000 0000 0000 0000
0000020 0000 0010 0000 0000 0000 0000 0000 0000
0000030 0000 0000 0000 0000 0000 0000 0000 0000
*
0000200

=======

(the offset overflows to undefined values which is far behind
the end of the disk)

[1] raw/~ # lvcreate -s -l 100%FREE vg --virtualsize 100g
  Logical volume "lvol0" created

[1] raw/~ # pvcreate --metadatacopies 2 /dev/vg/lvol0
  Physical volume "/dev/vg/lvol0" successfully created

[1] raw/~ # pvchange -u /dev/vg/lvol0
  /dev/vg/lvol0: lseek 18446744073708503040 failed: Invalid argument
  /dev/vg/lvol0: lseek 18446744073708503040 failed: Invalid argument
  Failed to store physical volume "/dev/vg/lvol0"
  0 physical volumes changed / 1 physical volume not changed
2013-08-06 13:37:42 +02:00
Peter Rajnoha
e195b5227e thin: apply VG profile if creating a new thin pool
When creating a new thin pool and there's no profile requested
via "lvcreate --profile ...", inherit any VG profile if it's attached.

Currently this applies to these settings:
  allocation/thin_pool_chunk_size
  allocation/thin_pool_discards
  allocation/thin_pool_zero
2013-08-06 11:42:40 +02:00
Zdenek Kabelac
22fc80982a thin: add thin_repair and thin_dump options
Add new configure lvm.conf options for binaries thin_repair
and thin_dump.

Those are part of device-mapper-persistent-data package
and will be used for recovery of thin_pool.
2013-07-31 15:30:47 +02:00
Zdenek Kabelac
ea605d1ec7 thin: metadata resize needs 1.9 version
Version 1.8 is not yet fully usable for metadata resize.
2013-07-31 15:29:27 +02:00
Zdenek Kabelac
7b58f10442 thin: move setting of THIN_POOL
Set flag when attaching data LV which make segment THIN_POOL.
2013-07-31 15:27:38 +02:00
Zdenek Kabelac
4a722c5c8b cleanup: use compile time strlen
Use sizeof instead of strlen().
2013-07-31 15:24:45 +02:00
Alasdair G Kergon
b6bfddcd0a alloc: fix lvextend when stripe number varies
The PREFERRED allocation mechanism requires the number of areas in the
previous LV segment to match the number in the new segment being
allocated.  If they do not match, the code may crash.
  E.g. https://bugzilla.redhat.com/989347

Introduce A_AREA_COUNT_MATCHES and when not set avoid referring
to the previous segment with the contiguous and cling policies.
2013-07-29 19:35:45 +01:00
Peter Rajnoha
ecc9f74988 filters: fix segfault on incorrect global_filter
When using a global_filter and if this filter is incorrectly
specified, we ended up with a segfault:

  raw/~ $ pvs
    Invalid filter pattern "r|/dev/sda".
  Segmentation fault (core dumped)

In the example above a closing '|' character is missing at the end
of the regex. The segfault itself was caused by trying to destroy
the same filter twice in _init_filters fn within the error path
(the "bad" goto target):

bad:
        if (f3)
                f3->destroy(f3);
        if (f4)
                f4->destroy(f4);

Where f3 is the composite filter (sysfs + regex + type + md + mpath filter)
and f4 is the persistent filter which encompasses this composite filter
within persistent filter's 'real' field in 'struct pfilter'.

So in the end, we need to destroy the persistent filter only as
this will also destroy any 'real' filter attached to it.
2013-07-26 13:04:53 +02:00
Alasdair G Kergon
d13e87b9ef cleanup: comments and a message 2013-07-24 22:10:37 +01:00
Jonathan Brassow
f5a205668b Revert a previous change
commit d00d45a8b6 introduced changes
that are causing cluster mirror tests to fail.  Ultimately, I think
the change was right, but a proper clean-up will have to wait.
The portion of the commit we are reverting correlates to the
following commit comment:
    2) lib/metadata/mirror.c:_delete_lv() - should have been calling
       _activate_lv_like_model() with 'mirror_lv'.  This is because
       'mirror_lv' is the LV that the overall operation is being
       performed on.  We need to use this LV as the basis for
       determining whether to activate locally, or across the
       cluster, etc.
It appears that when legs or logs are removed from a mirror, they
are being activated before they are deleted in order to make them
top-level LVs that can be acted upon.  When doing this, it appears
they are not activated based on the characteristics of the mirror
from which they came.  IOW, if the mirror was exclusively active,
the sub-LVs are activated globally.  This is a no-no.  This then
made it impossible to activate_lv_like_model if the model was
"mirror_lv" instead of "lv" in _delete_lv().  Thus, at some point
this change should probably be put back and those location where
the sub-LVs are being improperly activated "shared" instead of
EX should be corrected.
2013-07-24 14:18:07 -05:00
Zdenek Kabelac
5597dc3652 thin: not zeroing for non-zeroed thin pool snaps
Do not zero initial 4KB of thin snapshot volume for thin pool with
disabled zeroing.
2013-07-24 01:15:31 +02:00
Jonathan Brassow
d00d45a8b6 Clean-up: Addressing a few FIXME's
Three fixme's addressed in this commit:
1) lib/metadata/lv_manip.c:_calc_area_multiple() - this could be
   safely changed to a comment explaining that currently because
   RAID10 can only have a 2-way mirror, we don't need to know the
   number of stripes.  However, we will need to know that in the
   future if RAID10 is to support more than 2-way mirroring.

2) lib/metadata/mirror.c:_delete_lv() - should have been calling
   _activate_lv_like_model() with 'mirror_lv'.  This is because
   'mirror_lv' is the LV that the overall operation is being
   performed on.  We need to use this LV as the basis for
   determining whether to activate locally, or across the
   cluster, etc.

3) tools/lvcreate.c:_lvcreate_params() - Minor clean-up.  If
   '-m 0' is given, treat it as though the mirroring argument
   was not given (i.e. as though the requested segment type
   was 'stripe' and not mirror).
2013-07-23 14:46:22 -05:00
Zdenek Kabelac
373f95a921 snapshot: update merging fix
Activation is needed only for clustered VG.
For non-clustered VG skip activation, since deactivate_lv()
is called without problems (no testing for lock presence).

(updates f6ded62291)
2013-07-23 15:15:04 +02:00
Zdenek Kabelac
6311be29e4 thin: use 64bit arithmetic for checking meta size
Avoid overflow since extents are just 32bit values.

(in release fix 87aca628)
2013-07-23 14:58:07 +02:00
Alasdair G Kergon
84801d7c34 thin: rename extend_pool to create_pool 2013-07-23 13:33:14 +01:00
Zdenek Kabelac
f6ded62291 snapshot: fix merging
When the merging of snapshot is finished, we need to clean dm table
intries for snapshot and -cow device. So for merging snapshot
we have to activate_lv plain 'cow' LV and let the table
resolver to its work - shortly deactivation_lv() request
will follow - in cluster this needs LV lock to be held by clvmd.

Also update a test - add small wait - if lvremove is not 'fast enough'
and merging process has not been stopped and $lv1 removed in background.
Ortherwise the following lvcreate occasionally finds name $lv1 still in use.

(in release fix)
2013-07-22 16:26:00 +02:00
Zdenek Kabelac
aed4e9c703 coverity: pointer validation
Check for metadata_lv and make sure we have got proper thin pool segment.

Check we are working with merging snapshot when adding merging target.
2013-07-22 12:41:21 +02:00
Zdenek Kabelac
05a70f2da3 cleanup: simplier string reset 2013-07-22 12:41:21 +02:00
Zdenek Kabelac
ea68f08501 cleanup: remove unused headers 2013-07-22 12:41:21 +02:00
Petr Rockai
6d2604f026 metadata: Fix tracking of read_status flags in _vg_make_handle. 2013-07-22 12:04:47 +02:00
Petr Rockai
3ed7f78ff4 metadata: Do not ignore errors in _vg_update_vg_ondisk. 2013-07-22 12:00:48 +02:00
Petr Rockai
f897fcbd95 metadata: Do not try to maintain an ondisk copy of orphan VGs. 2013-07-22 11:51:35 +02:00
Jonathan Brassow
4eea660191 RAID: Fix segfault when reporting raid_syncaction field on older kernel
The status printed for dm-raid targets on older kernels does not include
the syncaction field.  This is handled by dev_manager_raid_status() just
fine by populating the raid status structure with NULL for that field.
However, lv_raid_sync_action() does not properly handle that field being
NULL.  So, check for it and return 0 if it is NULL.
2013-07-19 10:01:48 -05:00
Peter Rajnoha
f0ab6c33a9 dev-type: dev_get_primary_dev default error code 0, not -1 2013-07-19 15:26:53 +02:00