shaba/lvm2

mirror of git://sourceware.org/git/lvm2.git synced 2024-12-22 17:35:59 +03:00

Author	SHA1	Message	Date
Jonathan Brassow	f0be9ac904	cmirrord: Prevent secondary checkpoints from corrupting bitmaps The bug addressed by this patch manifested itself during testing by showing a mirror that never became 'in-sync' after creation. The bug is isolated to distributions that do not have support for openAIS checkpointing (i.e. > RHEL6, > F16). When a node joins a group that is managing a mirror log, the other machines in the group send it a checkpoint representing the current state of the bitmap. More than one machine can send a checkpoint, but only the initial one should be imported. Once the bitmap state has been imported from the initial checkpoint, operations (such as resync, mark, and clear operations) can begin. When subsequent checkpoints are allowed to be imported, it has the effect of erasing all the log operations between the initial checkpoint and the ones that follow. When cmirrord was updated to handle the absence of openAIS checkpointing (commit `62e38da133`), the new import_checkpoint() function failed to honor the 'no_read' parameter. This parameter was designed to avoid reading all but the initial checkpoint. Honoring this parameter has solved the issue of corrupting bitmap data with secondary checkpoints.	2013-08-20 13:21:09 -05:00
Zdenek Kabelac	003f08c164	clogd: fix descriptor leak when daemonizing	2013-08-06 16:21:51 +02:00
Zdenek Kabelac	636c51ae3f	cleanup: use unrelated temporary variables use of tmp_rq2 is unrelated to tmp_rq - so use separate variable.	2013-06-16 00:07:33 +02:00
Zdenek Kabelac	6d0abc6b48	cmirrord: check for result of chdir Error exit if chdir fails.	2012-08-23 14:37:20 +02:00
Zdenek Kabelac	6f3cd63551	cleanup: replace memset with struct initialization Simplifies the code, properly detects too long socket paths, drops unused parameter.	2012-06-22 13:23:03 +02:00
Zdenek Kabelac	461eb1ac6a	cmirrord: add missing checks for kernel_send Log errors if kernel_send fails.	2012-06-20 14:48:26 +02:00
Zdenek Kabelac	865b9d3701	cmirrord: fix cut&paste	2012-06-20 14:41:57 +02:00
Zdenek Kabelac	fb4584b83d	cmirrord: add test for closedir() and close()	2012-06-20 14:40:39 +02:00
Jonathan Earl Brassow	e5b9338ada	Fix bug in cmirror that caused incorrect status info to print on some nodes. Looking at the code in cmirrord/local.c, we can see the various request types handled in different ways. Some information that is non-changing does not need to go around the cluster and can be short-circuited. For example, once the cluster mirror is in-sync, it is pointless to continue sending that query around the cluster. We can save network bandwidth and reply directly back to the kernel. When it comes to status information, there are two types: 'TABLE' and 'INFO'. The 'TABLE' information never changes and belongs to the group of requests that can be safely short-circuited. The 'INFO' information can change - and will change if a device fails. Thus it cannot be short-circuited, but this is exactly what was found. The 'INFO' status request was being short-circuited and therefore never reporting the failure condition to anyone other than the "server" that experienced it directly.	2012-04-26 17:30:49 +00:00
Milan Broz	7991a9636e	Remove some whitespaces. (test commit)	2012-03-10 09:32:46 +00:00
Jonathan Earl Brassow	2ce9693341	s/CPG_/CS_: Various CPG constants are going away, even though CPG itself stays F17 is getting rid of OpenAIS libraries (and checkpointing). While the CPG stuff is staying, some of its constants are being removed. So, we must adjust and use the remaining constants which the CPG constants were based on. [~]# egrep 'CPG_DISPATCH_ALL|CPG_OK' /usr/include// corosync/corotypes.h:#define CPG_DISPATCH_ALL CS_DISPATCH_ALL corosync/corotypes.h:#define CPG_OK CS_OK	2012-03-01 17:41:39 +00:00
Jonathan Earl Brassow	62e38da133	Allow cluster mirrors to handle the absence of the checkpoint lib (libSaCkpt). The OpenAIS checkpoint library is going away; therefore, cmirrord must operate without it. The algorithms that handle the timing of when to send a checkpoint, the determination of what to send, and which ongoing cluster requests are relevant with respect to the checkpoints are unaffected. We need only replace the functions that actually perform the storing/transmitting and retrieving/receiving of the checkpoint data. Rather than store the checkpoint data in an OpenAIS checkpoint file, we simply transmit it along with the message that notifies the incoming node that the checkpoint is ready.	2012-02-29 21:15:34 +00:00
Zdenek Kabelac	a6292f2a6d	Remove unneeded assignments Variables have (or will have) those values set.	2012-02-08 11:36:18 +00:00
Zdenek Kabelac	3a8b6a9948	Keep page_size as signed number Since it's return value from sysconf and is checked for <0.	2012-02-08 11:34:46 +00:00
Jonathan Earl Brassow	3b032963d5	cmirrord now returns log name to kernel in CTR so it can be registered Version 2 of the userspace log protocol accepts return information during the DM_ULOG_CTR exchange. The return information contains the name of the log device that is being used (if there is one). The kernel can then register the device via 'dm_get_device'. Among other things, this allows for userspace to assemble a correct dependency tree of devices - critical for LVM handling of suspend/resume calls. Also, update dm-log-userspace.h to match the kernel header associated with this protocol change. (Includes a version inc.)	2011-10-14 14:18:49 +00:00
Zdenek Kabelac	d9bba4f16f	Check for failing 'stat' and skip this loop iteration (since data in statbuf are invalid). Check whether sysconf managed to find _SC_PAGESIZE. Report at least debug warning about failing unlink (logging scheme here seems to be different than in lvm). Duplicate terminal FDs and use similar code as is made in clvmd and cleanup warns about missing open/close tests. FIXME: Looks like we already have 3 instances of the same code in lvm repo.	2011-09-21 10:42:53 +00:00
Zdenek Kabelac	e9047f4f9c	Detect sscanf recovering_region input error Missing check for sscanf found by static analyzer.	2011-09-06 18:24:27 +00:00
Zdenek Kabelac	b647de3e07	Fix memory leak of allocated bitmap in error path Found by static analyzer.	2011-09-06 18:15:43 +00:00
Zdenek Kabelac	7b83071708	Log unlink() error	2011-09-06 18:11:21 +00:00
Zdenek Kabelac	35ce2b332b	Removed unused pointer Pointer 'duplicate' is unused.	2010-12-20 13:58:38 +00:00
Zdenek Kabelac	9d3be13f4f	Use dm_free for dm_malloc-ed areas in _clog_ctr/_clog_dtr (cmirrord). Use dm_zalloc to obtain zeroed memory block. Use dm_free for dm_ allocated memory blocks. Test close() for error.	2010-12-20 13:57:19 +00:00
Peter Rajnoha	7dfce0e467	Add new dm_prepare_selinux_context fn to libdevmapper and use it throughout. Detect existence of new SELinux selabel interface during configure. Use new dm_prepare_selinux_context instead of dm_set_selinux_context. We should set the SELinux context before the actual file system object creation. The new dm_prepare_selinux_context function sets this using the selabel_lookup fn in conjunction with the setfscreatecon fn. If selinux/label.h interface (that should be a part of the selinux library) is not found during configure, we fall back to the original matchpathcon function instead.	2010-12-13 10:43:56 +00:00
Zdenek Kabelac	44110cd33e	Add missing return for NULL passed buffer Function pull_state() checks for NULL 'buf' - but return for this error path was missing. cmirror code never calls this function with NULL 'buf', so this fix has no effect on current code base, but makes clang happier.	2010-10-26 10:14:41 +00:00
Zdenek Kabelac	321ae653b6	Fix missing initialisation to 0 Add missing init value for variable 'found' which is later tested and may have contained some garbage value.	2010-10-25 12:59:24 +00:00
Jonathan Earl Brassow	34cbedceaf	This patch fixes an issue where cluster mirror write I/O can be opprobriously slow if created with '--nosync'. One of the ways cluster mirrors coordinate I/O and recovery among the different machines is by the use of the log function 'is_remote_recovering()' which lets nodes know if a region they wish to perform a write on is currently being recovered on another node. If the region is being recovered, the I/O is delayed. The 'is_remote_recovering' routine has been optimized to avoid the deluge of requests that would be issued to the userspace log server by maintaining a marker of how far the recovery has gotten. It can then immediately return 'not recovering' if the region being inquired about is less than this mark. Additionally, if the region of concern is greater than the mark, the function will limit the number of transmissions to userspace by assuming the region /is/ being recovered when skipping the transmission. This limits the amount of processing and updates the mark in 1/4 sec time steps. This patch fixes a problem where 'the mark' is not being updated because of faulty logic in the userspace log daemon. When '--nosync' is used to create a cluster mirror, the userspace log daemon never has a chance to update the mark in the normal way. The fix is to set the mark to "complete" if the mirror was created with the --nosync flag.	2010-08-30 18:37:42 +00:00
Jonathan Earl Brassow	53670b18f5	Fix for bug 596453: multiple mirror image failures cause lvm repair... The lvm repair issues I believe are the superficial symptoms of this bug - there are worse issues that are not as clearly seen. From my inline comments: * If the mirror was successfully recovered, we want to always * force every machine to write to all devices - otherwise, * corruption will occur. Here's how: * Node1 suffers a failure and marks a region out-of-sync * Node2 attempts a write, gets by is_remote_recovering, * and queries the sync status of the region - finding * it out-of-sync. * Node2 thinks the write should be a nosync write, but it * hasn't suffered the drive failure that Node1 has yet. * It then issues a generic_make_request directly to * the primary image only - which is exactly the device * that has suffered the failure. * Node2 suffers a lost write - which completely bypasses the * mirror layer because it had gone through generic_m_r. * The file system will likely explode at this point due to * I/O errors. If it wasn't the primary that failed, it is * easily possible in this case to issue writes to just one * of the remaining images - also leaving the mirror inconsistent. * * We let in_sync() return 1 in a cluster regardless of what is * in the bitmap once recovery has successfully completed on a * mirror. This ensures the mirroring code will continue to * attempt to write to all mirror images. The worst that can * happen for reads is that additional read attempts may be * taken.	2010-08-17 23:56:23 +00:00
Jonathan Earl Brassow	498747d792	A misunderstanding of the return value of 'dm_bit' has been causing a data corruption bug in cmirror. 'dm_bit' is only ever used as a boolean operation within LVM, but it can return a range of values. If the bit is set, a power of 2 is returned. If the bit is unset, 0 is returned. 'log_test_bit' (a function in the cluster mirror log daemon code) has switched to using the dm bit operations in rhel6. There are two places in the daemon code where 'log_test_bit' is not used merely as a boolean, but rather the return value is used as the return value for the log functions 'is_clean' and 'in_sync' - having assumed that 'dm_bit' was returning 0 or 1 only. One place the 'in_sync' function is utilized is in 'dm_rh_get_state' - a function that informs the mirroring code how to treat I/O and which devices to read/write from. 'dm_rh_get_state' was checking if the return value of 'in_sync' was 1 to determine if the region was DM_RH_CLEAN. Since 'dm_bit' (and by extension 'log_test_bit' and 'in_sync') was returning powers of 2, DM_RH_CLEAN was rarely being reported as it should have been. Thinking the region was out-of-sync, the mirroring code would write only to the primary device. When the primary device was failed, all of those writes were lost - leaving the entire mirror corrupted.	2010-08-04 18:18:18 +00:00
Fabio M. Di Nitto	8c4e8a185a	Add dm_create_lockfile to libdm to handle pidfiles for all daemons. Switch dmeventd to use dm_create_lockfile and drop duplicate code. Allow clvmd pidfile to be configurable. Switch cmirrord and clvmd to use dm_create_lockfile.	2010-07-13 13:51:01 +00:00
Alasdair Kergon	08f1ddea6c	Use __attribute__ consistently throughout.	2010-07-09 15:34:40 +00:00
Jonathan Earl Brassow	548cc88947	Add error checking for calls to sprintf - it can fail for more reasons than just 'out-of-space'.	2010-06-21 16:07:06 +00:00
Jonathan Earl Brassow	2995925278	daemons/cmirrord/functions.c (part of cmirrord) was referencing linux/kdev_t.h even though it wasn't needed. Strangely, it seems to be causing problems on various architectures (i686) in the function daemons/cmirrord/functions.c:disk_status_info()->sprintf. I'm not sure why this is a problem since none of the macros in kdev_t.h are used in that code, but it certainly doesn't hurt to pull an unnecessary header and it seems to fix the problem.	2010-06-18 20:58:04 +00:00
Zdenek Kabelac	cee2f123a4	Use "" instead of <> for configure.h and libdevmapper.h Move configure.h as the first header for clvmd source files.	2010-06-15 11:00:44 +00:00
Zdenek Kabelac	23b059e7b7	INSTALL rules updates Patch is inspired by Debian's extra patch. - removes OWNER & GROUP make vars they are parts of INSTALL command. - adds INSTALL_PROGRAM for executable, uses $(INSTALL) - adds INSTALL_DATA for non-executable data, uses $(INSTALL) - adds INSTALL_WDATA for writable non-executable data, uses $(INSTALL) - adds configure option --enable-write_install - to support installation of writable files used by distribution - replaces usage of ifeq @LIB_SUFFIX@ with $(LIB_SUFFIX) - installs .a files from static builds without executable flag - installs .a files to $(usrlibdir) instead of $(libdir) - installs all static binaries to $(staticdir) - create .so links for devel package in $(usrlibdir) instead of $(libdir) - makes .so and .so.LIB_VERSION files within builddir - removes VERSIONED_SHLIB and created versioned LIB_SHARED automagically - install LIB_SHARED via install_lib_shared target - install plugins via install_lib_shared_plugin target - prints whole 'install' command during installation instead of less informative "Installing $(something) $(somewhere)" - install multiple man pages with one INSTALL command - use DISTCLEAN_TARGETS instead of creating multiple distclean targets	2010-04-09 21:42:48 +00:00
Zdenek Kabelac	c737d34804	Use vpath instead of VPATH. Usage of VPATH causes trouble when used within $(builddir). Not only source files are being found through VPATH, but targets as well. (make --debug=v) Thus if user builds the code in $(srcdir) and also in some $(builddir) he gets mangled results as some generated files (i.e. .export.sym) are 'reused' from $(srcdir) instead of $(builddir). This patch switches to use vpath where we could explicitly name suffixes that should be looked via vpath - we must take care, we do not generate files with these suffixes: .c, .in, .po, .exported_symbols	2010-04-09 21:34:25 +00:00
Zdenek Kabelac	2384a25499	Fixing compilation warning: implicit declaration of function ‘umask’	2010-03-29 14:05:17 +00:00
Zdenek Kabelac	814aebc4e9	Use $(top_builddir) for inclusion of make.tmpl in Makefiles.	2010-03-04 09:51:37 +00:00
Jonathan Earl Brassow	f972c51364	Was using dm_list_iterate_items when I should have been using *_safe. This had the effect of segfaulting the log daemon when converting a mirror from one log type to another. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>	2010-01-27 22:28:05 +00:00
Alasdair Kergon	86dce60ffe	missing header	2010-01-22 00:43:28 +00:00
Alasdair Kergon	e408cf6906	Deal with a few more compiler warnings.	2010-01-20 02:43:19 +00:00
Alasdair Kergon	9bad95bd95	Remove mknod() and add FIXMEs. In the udev-world, this function should work differently.	2010-01-19 18:21:03 +00:00
Alasdair Kergon	acdd91b3ce	remove more compiler warnings add FIXMEs for incomplete write()s	2010-01-19 17:24:29 +00:00
Alasdair Kergon	fc0c0cb075	Signal handling FIXMEs. A few integer type changes.	2010-01-19 15:58:45 +00:00
Alasdair Kergon	667c6be176	Clean up include files.	2010-01-18 21:07:24 +00:00
Jonathan Earl Brassow	98998134de	Fix some compiler warnings.	2010-01-18 20:58:50 +00:00
Alasdair Kergon	3c4310d6ef	Misc compilation clean-ups.	2010-01-18 20:08:44 +00:00
Jonathan Earl Brassow	27318b98a1	Make the intermachine communication structures architecture independent to allow for mixed architecture clusters. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>	2010-01-15 19:49:35 +00:00
Jonathan Earl Brassow	3579eeb2b0	When moving the cluster log server into the LVM tree, the in memory bitmap tracking was switched from the e2fsprogs implementation to the device-mapper implementation (dm_bitset_t). The latter has a leading uint32_t field designed to hold the number of bits that are being tracked. The code was not properly handling this change in all places. Specifically, when getting the bitmap to/from disk. Endian adjustments will likely need to be made on the accounting field as well, since bitmaps are passed between machines on start-up. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>	2010-01-15 18:48:24 +00:00
Jonathan Earl Brassow	e30f6c899d	At some point "clustered_[core|disk]" was changed to "clustered-[core|disk]". This patch makes the log server recognise the new format. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>	2010-01-15 16:03:19 +00:00
Alasdair Kergon	437219e27d	More makefile cleaning up and fixing. (gentoo)	2009-10-05 13:46:00 +00:00
Alasdair Kergon	db8b5af9d9	Allow for a build directory separate from the source.	2009-10-02 19:10:31 +00:00
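
Commit 498747d792 in the log above describes a bit test that returns a power of 2 rather than 0 or 1, and a caller that compared the result against 1. The following is a minimal sketch of that class of bug, not the cmirrord or libdevmapper code: `raw_test_bit`, `bool_test_bit` and `region_is_clean` are hypothetical names invented for illustration, and the bitmap is a plain array of 32-bit words.

```c
#include <stdint.h>

/* Hypothetical stand-in for a mask-style bit test: returns 0 when the
 * bit is clear and the bit's mask (a power of 2) when it is set. */
static uint32_t raw_test_bit(const uint32_t *words, unsigned bit)
{
    return words[bit / 32] & (UINT32_C(1) << (bit % 32));
}

/* Normalised variant: always returns 0 or 1, so it is safe to hand
 * the result to a caller that compares against 1. */
static int bool_test_bit(const uint32_t *words, unsigned bit)
{
    return raw_test_bit(words, bit) != 0;
}

/* Hypothetical caller that treats "exactly 1" as "region is clean",
 * in the spirit of the check described in the commit message. */
static int region_is_clean(int in_sync_result)
{
    return in_sync_result == 1;
}
```

With bit 3 set, `raw_test_bit` yields 8, so `region_is_clean` rejects it even though the bit is set; passing the value through `bool_test_bit` first gives the intended answer.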

52 Commits
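
Commit f972c51364 in the log above replaces a plain list iteration with its `*_safe` counterpart because entries were being removed during the walk. The sketch below shows the general pattern on a hand-rolled singly linked list; it does not use the libdevmapper `dm_list` API, and `struct node`, `push`, `remove_value` and `length` are hypothetical names chosen for this example.

```c
#include <stdlib.h>

/* Hypothetical minimal list node; not the libdevmapper dm_list type. */
struct node {
    struct node *next;
    int value;
};

/* Push a new node on the front of the list. Returns 0 on success,
 * -1 if allocation fails (the list is left unchanged). */
static int push(struct node **head, int value)
{
    struct node *n = malloc(sizeof(*n));

    if (!n)
        return -1;
    n->value = value;
    n->next = *head;
    *head = n;
    return 0;
}

/* Remove and free every node holding 'value'; returns how many were
 * removed. The successor is saved in 'next' before the current node
 * can be freed, which is the essence of a "safe" iteration. */
static unsigned remove_value(struct node **head, int value)
{
    struct node **link = head;   /* pointer that currently points at cur */
    struct node *cur, *next;
    unsigned removed = 0;

    for (cur = *head; cur; cur = next) {
        next = cur->next;        /* read before cur may be freed */
        if (cur->value == value) {
            *link = next;        /* unlink cur */
            free(cur);
            removed++;
        } else {
            link = &cur->next;
        }
    }
    return removed;
}

/* Count the nodes in the list. */
static unsigned length(const struct node *head)
{
    unsigned n = 0;

    for (; head; head = head->next)
        n++;
    return n;
}
```

An unsafe loop would advance with `cur = cur->next` after `free(cur)`, reading freed memory; that is the kind of access that can crash a daemon in the way the commit message reports.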