1
0
mirror of git://sourceware.org/git/lvm2.git synced 2025-01-10 05:18:36 +03:00
Commit Graph

20 Commits

Author SHA1 Message Date
Jonathan Brassow
bdd7baeab3 cmirrord: Clean-up stray warning message (attempt #2)
There are two types of CPG communications in a corosync cluster:
messages and state transitions.  Cmirrord processes the state
transitions first.

When a cluster mirror issues a POSTSUSPEND, it signals the end of
cluster communication with the rest of the nodes in the cluster.
The POSTSUSPEND marks the last communication of the 'message'
type that will go around the cluster.  The node then calls
cpg_leave which causes a final 'state transition' communication to
all of the nodes.  Once the out-going node receives its own state
transition notice from the cluster, it finalizes the leave.  At this
point, the state of the log is 'INVALID'; but it is possible that
there remains some cluster trafic that was queued up behind the
state transition that still wants to be processed.  It is harmless
to attempt to dispatch any remaining messages - they won't be
delivered because the node is no longer in the cluster.  However,
there was a warning message that was being printed in this case
that is now removed by this patch.  The failure of the dispatch
created a false positive condition that triggered the message.
2014-03-19 14:43:00 -05:00
Jonathan Brassow
52aa3dbcab cmirrord: Clean-up stray warning message
cmirrord polls for messages on the kernel and cluster interfaces.
Sometimes it is possible for messages to be received on the cluster
interface and be waiting for processing while the node is in the
process of leaving the cluster group.  When this happens, the
messages received on the cluster interface are attempted to be
dispatched, but an error is returned because the connection is no
longer valid.  It is a harmless situation.  So, if we get the
specific error (CS_ERR_BAD_HANDLE) and we know that we have left
the group, then simply don't print the message.
2014-03-05 10:44:20 -06:00
Jonathan Brassow
f0be9ac904 cmirrord: Prevent secondary checkpoints from corrupting bitmaps
The bug addressed by this patch manifested itself during testing
by showing a mirror that never became 'in-sync' after creation.
The bug is isolated to distributions that do not have support
for openAIS checkpointing (i.e. > RHEL6, > F16).

When a node joins a group that is managing a mirror log, the other
machines in the group send it a checkpoint representing the current
state of the bitmap.  More than one machine can send a checkpoint,
but only the initial one should be imported.  Once the bitmap state
has been imported from the initial checkpoint, operations (such
as resync, mark, and clear operations) can begin.  When subsequent
checkpoints are allowed to be imported, it has the effect of erasing
all the log operations between the initial checkpoint and the ones
that follow.

When cmirrord was updated to handle the absence of openAIS
checkpointing (commit 62e38da133),
the new import_checkpoint() function failed to honor the 'no_read'
parameter.  This parameter was designed to avoid reading all but
the initial checkpoint.  Honoring this parameter has solved the
issue of corrupting bitmap data with secondary checkpoints.
2013-08-20 13:21:09 -05:00
Zdenek Kabelac
636c51ae3f cleanup: use unrelated temporary variables
use of tmp_rq2 is unrelated to tmp_rq - so use separate
variable.
2013-06-16 00:07:33 +02:00
Zdenek Kabelac
6f3cd63551 cleanup: replace memset with struct initilization
Simplifies the code, properly detects too long socket paths,
drops unused parameter.
2012-06-22 13:23:03 +02:00
Zdenek Kabelac
461eb1ac6a cmirrord: add missing checks for kernel_send
Log errors if kernel_send fails.
2012-06-20 14:48:26 +02:00
Jonathan Earl Brassow
e5b9338ada Fix bug in cmirror that caused incorrect status info to print on some nodes.
Looking at the code in cmirrord/local.c, we can see the various different
request types handled in different ways.  Some information that is non-changing
does not need to go around the cluster and can be short-circuited.  For
example, once the cluster mirror is in-sync, it is pointless to continue
sending that query around the cluster.  We can save network bandwidth and reply
directly back to the kernel.  When it comes to status information, there are
two types 'TABLE' and 'INFO'.  The 'TABLE' information never changes and
belongs to the group of requests that can be safely short-circuited.  The
'STATUS' information can change - and will change if a device fails.  Thus it
cannot be short-circuited, but this is exactly what was found.  The 'STATUS'
information request was being short-circuited and therefore never reporting the
failure condition to anyone other than the "server" that experienced it
directly.
2012-04-26 17:30:49 +00:00
Milan Broz
7991a9636e Remove some whitespaces.
(test commit)
2012-03-10 09:32:46 +00:00
Jonathan Earl Brassow
2ce9693341 s/CPG_/CS_: Various CPG constants are going away, even though CPG itself stays
F17 is getting rid of OpenAIS libraries (and checkpointing).  While the
CPG stuff is staying, some if its constants are being removed.  So, we
must adjust and use the remaining constants which the CPG constants were based on.

[~]# egrep 'CPG_DISPATCH_ALL|CPG_OK' /usr/include/*/*
corosync/corotypes.h:#define CPG_DISPATCH_ALL     CS_DISPATCH_ALL
corosync/corotypes.h:#define CPG_OK               CS_OK
2012-03-01 17:41:39 +00:00
Jonathan Earl Brassow
62e38da133 Allow cluster mirrors to handle the absence of the checkpoint lib (libSaCkpt).
The OpenAIS checkpoint library is going away; therefore, cmirrord must
operate without it.  The algorithms the handle the timing of when to send
a checkpoint, the determination of what to send, and which ongoing cluster
requests are relevent with respect to the checkpoints are unaffected.  We
need only replace the functions that actually perform the storing/transmitting
and retrieving/receiving of the checkpoint data.  Rather than store the
checkpoint data in an OpenAIS checkpoint file, we simply transmit it along
with the message that notifies the incoming node that the checkpoint is
ready.
2012-02-29 21:15:34 +00:00
Zdenek Kabelac
b647de3e07 Fix memory leak of allocated bitmap in error path
Found by static analyzer.
2011-09-06 18:15:43 +00:00
Zdenek Kabelac
321ae653b6 Fix missing initilisation to 0
Add missing init value for variable 'found' which is later tested and may
have contained some garbage value.
2010-10-25 12:59:24 +00:00
Alasdair Kergon
08f1ddea6c Use __attribute__ consistently throughout. 2010-07-09 15:34:40 +00:00
Jonathan Earl Brassow
f972c51364 Was using dm_list_iterate_items when I should have been using
*_safe.  This had the effect of segfaulting the log daemon when
converting a mirror from one log type to another.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
2010-01-27 22:28:05 +00:00
Alasdair Kergon
e408cf6906 Deal with a few more compiler warnings. 2010-01-20 02:43:19 +00:00
Alasdair Kergon
667c6be176 Clean up include files. 2010-01-18 21:07:24 +00:00
Alasdair Kergon
3c4310d6ef Misc compilation clean-ups. 2010-01-18 20:08:44 +00:00
Jonathan Earl Brassow
27318b98a1 Make the intermachine communication structures architecture independant
to allow for mixed architecture clusters.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
2010-01-15 19:49:35 +00:00
Alasdair Kergon
90c8088760 More cmirror makefile fixes from Fabio. 2009-09-14 22:57:46 +00:00
Dave Wysochanski
2433909e20 Add daemons/cmirrord files to git - somehow got messed up with cvs rename.
When clogd was renamed to cmirrord, somehow git got the remove of the old
files but not the add of the new files.  This patch adds the new files.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
2009-09-04 20:44:58 +02:00