IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
if the thin_check fail on thin pool - still return successful deactivation,
since lvremove would currently fail.
TODO: find some way to not run check with lvremove.
Add some hack math to allow 16GB devices to be passed as thinpool metadata.
Since kernel has put in limit to not allow which are just bigger then
some predefined constant in kernel but not matching 16GB so any device bigger
is rejected.
FIXME: Current code still might need more tweaks to be more generic.
Since lvm seems to call driver_version(NULL, 0) this would lead
to crash. Though the combination of the code is probably very hard to hit.
If the user doesn't supply version buffer, just skip printing to buffer.
pvcreate gives
WARNING: Ignoring unsupported value for metadata/pvmetadataignore.
It was warning if there is no config file entry instead of only if the node
exists but is empty.
Should be faster then strncpy - since we could avoid clearing 4KB pages
with each strncpy(...,PATH_MAX).
Also it's easy to check whether string fit - and eventually avoid
to continue working we incomplete string.
If we have good enough glibc to return number of needed chars, do not
loop try to reach good size, but use this size directly for allocation,
saving also last strdup.
Since now we start with 16 bytes - skip buffer realloc for shorter string.
Device-mapper in kernel uses '\' as escape character so it's better
to double it to avoid any confusion when using existing device names
with '\' in the table specification.
For example:
dmsetup create x --table "0 8 linear /dev/mapper/a\x20b 0"
should pass just fine now without a need to explicitly escape the '\' char
like this:
dmsetup create x --table "0 8 linear /dev/mapper/a\\x20b 0"
If dm_task_get_name or dm_task_get_names gets called, these will return
unmangled form of the names so the name mangling stays totally transparent
to any libdevmapper user (unless DM_STRING_MANGLING_NONE is used in which
case the name is not touched and it is is returned as it is in kernel).
For example:
dmsetup create "a b" - will create a\x20b device in kernel and so udev will
create /dev/mapper/a\x20b
dm_task_get_name/names will still return "a b"
In AUTO mode, the libdevmapper user can still query the device by using
the mangled ("a\x20b") or unmangled form of the name when calling dm_task_set_name.
If mangled name is provided, it's detected and the name is kept as it is.
If unmangled name is provided, it will be mangled. IOW in AUTO mode it's
totally transparent and it should not require any changes in the code
using libdevmapper.
However, any libdevmapper user must be aware of the fact that the mangled form
of the name appears in /dev/mapper (udev just can't deal with those blacklisted
characters).
dm_task_get_name_mangled will always return mangled form of the name while
the dm_task_get_name_unmangled will always return unmangled form of the name
irrespective of the global setting (dm_set/get_name_mangling_mode).
This is handy in situations where we need to detect whether the name is already
mangled or not. Also display functions make use of it.
Use the DEV_NAME macro to use the mangled form of the name if present,
use normal name otherwise (we store both forms - mangled and unmangled in
struct dm_task). Mangled form should be always preferred over unmangled
with the exception of the situations where we divide one task into several
others (like "create and load") - we need to avoid mangling the name twice
(because of multiple dm_task_set_name calls)!
If dm_task_set_name/newname is called, the name provided will be
automatically translated to correct encoded form with the hex enconding
so any character not on udev whitelist will be mangled with \xNN
format where NN is hex value of the character used.
By default, the name mangling mode used is the one set during
configure with the '--with-default-name-mangling' option.
This option configures the default name mangling mode used, one of:
AUTO, NONE and HEX.
The name mangling is primarily used to support udev character whitelist
(0-9, A-Z, a-z, #*-.:=@_) so any character that is not on udev whitelist
will get translated into an encoded form \xNN where NN is the hex value
of the character.
It was not possible to pass down the DM_[FORCE|NO]SYNC flags to
'dm_tree_node_add_raid_target'. This meant that converting to 'raid1' from
'mirror' would cause a full resync. (It also meant that '--nosync' was
ineffective when creating a 'raid1' LV.)
I've taken the 'reserved' parameter in 'dm_tree_node_add_raid_target' and
used it for the "flags" parameter. Now it is possible to pass the sync
flags and any other flags that may come up.
In case of zero bytes would be read from sysfs, it would store '\0' on
temp_buf[-1] address.
Simplify some buffer length calculation and use strcpy if we've just
checked string fits in give buffer.
Replace jump label error: with bad: commonly used in libdm.
Replace asserts with test for failing memory allocation.
Add at least stack traces.
Index counter starts from 1 (0 reserved for error), so replacing fingerprint.
Since the function dm_get_next_target() returns NULL as 'next' pointer
so it's not a 'real' error - set 0 to all parameters when NULL is
returned because of missing head.
i.e. one of use case::
do {
next = dm_get_next_target(dmt, next, &start, &length,
&target_type, ¶ms);
size += length;
} while (next);
Using PRELOAD part would lead to problems when the problem
would happen before vg_write and vg_commit.
Also this change is necessary for snapshot creation sequence.
This is accomplished by reading associated sysfs information. For a dm device,
this is /sys/dev/block/major:minor/dm/name (supported in kernel version >= 2.6.29,
for older kernels, the behaviour is the same as for non-dm devices).
For a non-dm device, this is a readlink on /sys/dev/block/major:minor, e.g.
/sys/dev/block/253:0 --> ../../devices/virtual/block/dm-0.
The last component of the path is a proper kernel name (block device name).
One can request to read only kernel names by setting the 'prefer_kernel_name'
argument if needed.
LVM- prefix.
Try harder not to leave stray empty devices around (locally or remotely) when
reverting changes after failures while there are inactive tables.
If we know major:minor number of device (which is known after resume) we will
try to use sysfs to set/get read ahead parameters of device.
This avoid potential problem of blocking commands like 'dmsetup info' awaiting
for device being usable for open/close - i.e. overfilled thin pool may block
such command.
Add dm_get_status_thin_pool and dm_get_status_thin functions to
parse 'params' argument which is received via dm_get_next_target.
Returns filed structure allocated from given mempool.
RAID is not like traditional LVM mirroring. LVM mirroring required failed
devices to be removed or the logical volume would simply hang. RAID arrays can
keep on running with failed devices. In fact, for RAID types other than RAID1,
removing a device would mean substituting an error target or converting to a
lower level RAID (e.g. RAID6 -> RAID5, or RAID4/5 to RAID0). Therefore, rather
than removing a failed device unconditionally and potentially allocating a
replacement, RAID allows the user to "replace" a device with a new one. This
approach is a 1-step solution vs the current 2-step solution.
example> lvconvert --replace <dev_to_remove> vg/lv [possible_replacement_PVs]
'--replace' can be specified more than once.
example> lvconvert --replace /dev/sdb1 --replace /dev/sdc1 vg/lv
Avoid creation of target type name when it's longer then
DM_MAX_TYPE_NAME (noticed by static analyzer where the
sp.target_type might be missing '\0' at the end.)
Before patch:
$> dmsetup create long
0 1000 looooooooooooooooooooooooooong
^D
device-mapper: reload ioctl failed: Invalid argument
After patch:
$> dmsetup create xxx
0 1000 looooooooooooooooooooooooooong
Target type name looooooooooooooooooooooooooong is too long.
Command failed
Remove DM_THIN_ERROR_DEVICE_ID from API.
Remove API warning.
Drop code that was using DM_THIN_ERROR_DEVICE_ID (already commented)
Remove debug message which slipped in through some previous commit.
A little code shuffling and adding support for
DM_THIN_ERROR_DEVICE_ID which might be eventually be used
for activation of thin which is going to be deleted.
For now we do not need it lvm.
Add a new node flag send_messages that is used to simplify
test when to call _node_send_messages().
Add call to _node_send_messages when pool is deeper in the tree.
There should be no need for retry for our internal devices - it would be hinding
our own bug in the tree processing.
Update error messages to show also also device name.
No WHATS_NEW - in release fix.
When DEBUG_MEM is used, the memory is trashed with extra pattern before real
free() is called, and as this memory was marked as non accessible when used with
valgrind, make it again usable.
Certain errno codes could be expected in some situations thus
add experimental support for them.
When expected errno is set after ioctl error - function skips error
printing and exits succefully.
Currently only useful for thin pool messages.
Version 2 of the userspace log protocol accepts return information during the
DM_ULOG_CTR exchange. The return information contains the name of the log
device that is being used (if there is one). The kernel can then register the
device via 'dm_get_device'. Amoung other things, this allows for userspace to
assemble a correct dependency tree of devices - critical for LVM handling of
suspend/resume calls.
Also, update dm-log-userspace.h to match the kernel header associated with
this protocol change. (Includes a version inc.)
The upstream kernel version that this file mirrors has changed, here is the
commit message:
commit 86a54a4802df10d23ccd655e2083e812fe990243
Author: Jonathan Brassow <jbrassow@redhat.com>
Date: Thu Jan 13 19:59:52 2011 +0000
dm log userspace: add version number to comms
This patch adds a 'version' field to the 'dm_ulog_request'
structure.
The 'version' field is taken from a portion of the unused
'padding' field in the 'dm_ulog_request' structure. This
was done to avoid changing the size of the structure and
possibly disrupting backwards compatibility.
The version number will help notify user-space daemons
when a change has been made to the kernel/userspace
log API.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
The current code does not always assign proper udev flags to sub-LVs (e.g.
mirror images and log LVs). This shows up especially during a splitmirror
operation in which an image is split off from a mirror to form a new LV.
A mirror with a disk log is actually composed of 4 different LVs: the 2
mirror images, the log, and the top-level LV that "glues" them all together.
When a 2-way mirror is split into two linear LVs, two of those LVs must be
removed. The segments of the image which is not split off to form the new
LV are transferred to the top-level LV. This is done so that the original
LV can maintain its major/minor, UUID, and name. The sub-lv from which the
segments were transferred gets an error segment as a transitory process
before it is eventually removed. (Note that if the error target was not put
in place, a resume_lv would result in two LVs pointing to the same segment!
If the machine crashes before the eventual removal of the sub-LV, the result
would be a residual LV with the same mapping as the original (now linear) LV.)
So, the two LVs that need to be removed are now the log device and the sub-LV
with the error segment. If udev_flags are not properly set, a resume will
cause the error LV to come up and be scanned by udev. This causes I/O errors.
Additionally, when udev scans sub-LVs (or former sub-LVs), it can cause races
when we are trying to remove those LVs. This is especially bad during failure
conditions.
When the mirror is suspended, the top-level along with its sub-LVs are
suspended. The changes (now 2 linear devices and the yet-to-be-removed log
and error LV) are committed. When the resume takes place on the original
LV, there are no longer links to the other sub-lvs through the LVM metadata.
The links are implicitly handled by querying the kernel for a list of
dependencies. This is done in the '_add_dev' function (which is recursively
called for each dependency found) - called through the following chain:
_add_dev
dm_tree_add_dev_with_udev_flags
<*** DM / LVM divide ***>
_add_dev_to_dtree
_add_lv_to_dtree
_create_partial_dtree
_tree_action
dev_manager_activate
_lv_activate_lv
_lv_resume
lv_resume_if_active
When udev flags are calculated by '_get_udev_flags', it is done by referencing
the 'logical_volume' structure. Those flags are then passed down into
'dm_tree_add_dev_with_udev_flags', which in turn passes them to '_add_dev'.
Unfortunately, when '_add_dev' is finding the dependencies, it has no way to
calculate their proper udev_flags. This is because it is below the DM/LVM
divide - it doesn't have access to the logical_volume structure. In fact,
'_add_dev' simply reuses the udev_flags given for the initial device! This
virtually guarentees the udev_flags are wrong for all the dependencies unless
they are reset by some other mechanism. The current code provides no such
mechanism. Even if '_add_new_lv_to_dtree' were called on the sub-devices -
which it isn't - entries already in the tree are simply passed over, failing
to reset any udev_flags. The solution must retain its implicit nature of
discovering dependencies and be able to go back over the dependencies found
to properly set the udev_flags.
My solution simply calls a new function before leaving '_add_new_lv_to_dtree'
that iterates over the dtree nodes to properly reset the udev_flags of any
children. It is important that this function occur after the '_add_dev' has
done its job of querying the kernel for a list of dependencies. It is this
list of children that we use to look up their respective LVs and properly
calculate the udev_flags.
This solution has worked for single machine, cluster, and cluster w/ exclusive
activation.
Make limits for thin data_block_size and device_id part of public API.
FIXME: read them possible from some kernel header file in the future ?
But we may need to support different values for different versions ?
Since it's internal function and we always check for NULL value
before call - this is safe.
Just for case add nonnull attribute so analyzer might better
catch error.
It's 100% equivalent test - since it always happen for the first iteration.
But the check for 'l' is understandable with analyzers - since analyzer
is not smart enough to deduce connection between root->child == NULL.
Before, we used to display "Can't remove open logical volume" which was
generic. There 3 possibilities of how a device could be opened:
- used by another device
- having a filesystem on that device which is mounted
- opened directly by an application
With the help of sysfs info, we can distinguish the first two situations.
The third one will be subject to "remove retry" logic - if it's opened
quickly (e.g. a parallel scan from within a udev rule run), this will
finish quickly and we can remove it once it has finished. If it's a
legitimate application that keeps the device opened, we'll do our best
to remove the device, but we will fail finally after a few retries.
Add dm_device_has_mounted_fs fn to check mounted filesystem on a device.
This requires sysfs directory to be correctly set via dm_set_sysfs_dir
(/sys by default). If sysfs dir is not used or it's set incorrectly,
dm_device_has_{holders,mounted_fs} will return 0!
Transfer of build_dm_uuid() function into libdm made uuid_prefix as parameter,
thus sizeof() was replaced with strlen() and room for '\0' missed.
As it's only fix in current version - no whatsnew.
This is a workaround for long-lasting problem with using the WATCH udev
rule. When trying to remove a DM device, this one can still be opened
while processing the event in parallel (generated based on the WATCH
udev rule).
Let's use this until we have a proper solution.
Makes dumpconfig whole-section output wrong in a different way from before,
but we should be able to merge cft_cmdline properly into cmd->cft now and
remove cascade.
functionality. A number of bugs (copied and pasted all over the code) should
disappear:
- most string lookup based on dm_config_find_node would segfault when
encountering a non-zero integer (the intention there was to print an
error message instead)
- check for required sections in metadata would have been satisfied by
values as well (i.e. not sections)
- encountering a section in place of expected flag value would have
segfaulted (due to assumed but unchecked cn->v != NULL)
leaving behind the LVM-specific parts of the code (convenience wrappers that
handle `struct device` and `struct cmd_context`, basically). A number of
functions have been renamed (in addition to getting a dm_ prefix) -- namely,
all of the config interface now has a dm_config_ prefix.
This patch adds the ability to upconvert a raid1 array - say from 2-way to
3-way. It does not yet support upconverting linear to n-way.
The 'raid' device-mapper target allows for individual components (images) of
an array to be specified for rebuild. This mechanism is used when adding
new images to the array so that the new images can be resync'ed while the
rest of the images in the array can remain 'in-sync'. (There is no
mirror-on-mirror layering required.)
~> lvconvert --splitmirrors 1 --trackchanges vg/lv
The '--trackchanges' option allows a user the ability to use an image of
a RAID1 array for the purposes of temporary read-only access. The image
can be merged back into the array at a later time and only the blocks that
have changed in the array since the split will be resync'ed. This
operation can be thought of as a partial split. The image is never completely
extracted from the array, in that the array reserves the position the device
occupied and tracks the differences between the array and the split image via
a bitmap. The image itself is rendered read-only and the name (<LV>_rimage_*)
cannot be changed. The user can complete the split (permanently splitting the
image from the array) by re-issuing the 'lvconvert' command without the
'--trackchanges' argument and specifying the '--name' argument.
~> lvconvert --splitmirrors 1 --name my_split vg/lv
Merging the tracked image back into the array is done with the '--merge'
option (included in a follow-on patch).
~> lvconvert --merge vg/lv_rimage_<n>
The internal mechanics of this are relatively simple. The 'raid' device-
mapper target allows for the specification of an empty slot in an array
via '- -'. This is what will be used if a partial activation of an array
is ever required. (It would also be possible to use 'error' targets in
place of the '- -'.) If a RAID image is found to be both read-only and
visible, then it is considered separate from the array and '- -' is used
to hold it's position in the array. So, all that needs to be done to
temporarily split an image from the array /and/ cause the kernel target's
bitmap to track (aka "mark") changes made is to make the specified image
visible and read-only. To merge the device back into the array, the image
needs to be returned to the read/write state of the top-level LV and made
invisible.
Adding debuging functionality to lock and unlock memory pool.
2 ways to debug code:
crc - is default checksum/hash of the locked pool.
It gets slower when the pool is larger - so the check is only
made when VG is finaly released and it has been used more then
once.Thus the result is rather informative.
mprotect - quite fast all the time - but requires more memory and
currently it is using posix_memalign() - this could be
later modified to use dm_malloc() and align internally.
Tool segfaults when locked memory is modified and core
could be examined for faulty code section (backtrace).
Only fast memory pools could use mprotect for now -
so such debug builds cannot be combined with DEBUG_POOL.
Implementation described in doc/lvm2-raid.txt.
Basic support includes:
- ability to create RAID 1/4/5/6 arrays
- ability to delete RAID arrays
- ability to display RAID arrays
Notable missing features (not included in this patch):
- ability to clean-up/repair failures
- ability to convert RAID segment types
- ability to monitor RAID segment types