The bug addressed by this patch manifested itself during testing
by showing a mirror that never became 'in-sync' after creation.
The bug is isolated to distributions that do not have support
for openAIS checkpointing (i.e., releases later than RHEL6 and F16).
When a node joins a group that is managing a mirror log, the other
machines in the group send it a checkpoint representing the current
state of the bitmap. More than one machine can send a checkpoint,
but only the initial one should be imported. Once the bitmap state
has been imported from the initial checkpoint, operations (such
as resync, mark, and clear operations) can begin. When subsequent
checkpoints are allowed to be imported, it has the effect of erasing
all the log operations between the initial checkpoint and the ones
that follow.
When cmirrord was updated to handle the absence of openAIS
checkpointing (commit 62e38da133),
the new import_checkpoint() function failed to honor the 'no_read'
parameter. This parameter was designed to avoid reading all but
the initial checkpoint. Honoring this parameter has solved the
issue of corrupting bitmap data with secondary checkpoints.
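The behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual cmirrord code: the struct, the fixed bitmap size, the function name import_checkpoint_sketch() and its return values are all assumptions made for the example. Only the idea of a 'no_read' parameter that suppresses all but the initial import comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BITMAP_BYTES 8	/* hypothetical size, for illustration only */

/* Hypothetical, simplified log state; not the real cmirrord structure. */
struct log_sketch {
	unsigned char bitmap[BITMAP_BYTES];
	int checkpoint_imported;	/* set once the initial checkpoint is in */
};

/*
 * Import checkpoint data into the log bitmap.
 * If no_read is non-zero, the checkpoint is ignored so that log
 * operations applied since the initial import are not overwritten.
 * Returns 1 if imported, 0 if skipped, -1 on malformed input.
 */
static int import_checkpoint_sketch(struct log_sketch *lc, int no_read,
				    const unsigned char *data, size_t len)
{
	assert(lc != NULL);

	if (no_read)
		return 0;	/* a later checkpoint: leave the bitmap alone */

	if (!data || len != BITMAP_BYTES)
		return -1;

	memcpy(lc->bitmap, data, len);
	lc->checkpoint_imported = 1;
	return 1;
}
```

In this sketch a caller would pass lc->checkpoint_imported as the no_read argument, so the first checkpoint is applied and every later one is dropped.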
Looking at the code in cmirrord/local.c, we can see the different
request types handled in different ways. Some information that is non-changing
does not need to go around the cluster and can be short-circuited. For
example, once the cluster mirror is in-sync, it is pointless to continue
sending that query around the cluster. We can save network bandwidth and reply
directly back to the kernel. When it comes to status information, there are
two types: 'TABLE' and 'INFO'. The 'TABLE' information never changes and
belongs to the group of requests that can be safely short-circuited. The
'INFO' information can change - and will change if a device fails. Thus it
cannot be short-circuited, but this is exactly what was found. The 'INFO'
status request was being short-circuited and was therefore never reporting the
failure condition to anyone other than the "server" that experienced it
directly.
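The short-circuit decision described above might look something like the sketch below. The enum and the function name are hypothetical and chosen for this example only; the real request types come from the kernel's userspace-log interface and are not reproduced here. The point is just that the unchanging table status may be answered locally, while the changing status request must always go around the cluster.

```c
#include <assert.h>

/* Hypothetical request types, for illustration only. */
enum req_type_sketch {
	REQ_IN_SYNC_QUERY,	/* "is the mirror in-sync?" */
	REQ_STATUS_TABLE,	/* table status: never changes */
	REQ_STATUS_INFO,	/* changing status: differs after a device failure */
	REQ_MARK_REGION		/* a log operation that alters shared state */
};

/*
 * Return 1 if the request can be answered locally without being sent
 * around the cluster, 0 if it must be forwarded.
 */
static int can_short_circuit_sketch(enum req_type_sketch type, int mirror_in_sync)
{
	assert(type >= REQ_IN_SYNC_QUERY && type <= REQ_MARK_REGION);

	switch (type) {
	case REQ_STATUS_TABLE:
		return 1;		/* static information */
	case REQ_IN_SYNC_QUERY:
		return mirror_in_sync;	/* only once the mirror is in-sync */
	case REQ_STATUS_INFO:		/* must reflect failures seen elsewhere */
	default:
		return 0;
	}
}
```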
F17 is getting rid of OpenAIS libraries (and checkpointing). While the
CPG stuff is staying, some of its constants are being removed. So, we
must adjust and use the remaining constants on which the CPG constants were based.
[~]# egrep 'CPG_DISPATCH_ALL|CPG_OK' /usr/include/*/*
corosync/corotypes.h:#define CPG_DISPATCH_ALL CS_DISPATCH_ALL
corosync/corotypes.h:#define CPG_OK CS_OK
The OpenAIS checkpoint library is going away; therefore, cmirrord must
operate without it. The algorithms that handle the timing of when to send
a checkpoint, the determination of what to send, and which ongoing cluster
requests are relevant with respect to the checkpoints are unaffected. We
need only replace the functions that actually perform the storing/transmitting
and retrieving/receiving of the checkpoint data. Rather than store the
checkpoint data in an OpenAIS checkpoint file, we simply transmit it along
with the message that notifies the incoming node that the checkpoint is
ready.
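One way to picture carrying the checkpoint data inside the notification message is the sketch below. The message layout, field names, and both helper functions are assumptions made for illustration; the commit message does not describe the actual wire format used by cmirrord.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical "checkpoint ready" message with the data appended. */
struct ckpt_msg_sketch {
	uint32_t target_node;	/* node the checkpoint is meant for */
	uint32_t data_len;	/* number of payload bytes that follow */
	unsigned char data[];	/* checkpoint payload (e.g. bitmap state) */
};

/* Build a message carrying len bytes of checkpoint data; caller frees. */
static struct ckpt_msg_sketch *ckpt_msg_pack(uint32_t target_node,
					     const void *payload, uint32_t len)
{
	struct ckpt_msg_sketch *m;

	assert(payload != NULL || len == 0);

	m = malloc(sizeof(*m) + len);
	if (!m)
		return NULL;

	m->target_node = target_node;
	m->data_len = len;
	if (len)
		memcpy(m->data, payload, len);
	return m;
}

/*
 * Copy the payload out of a received message of msg_size bytes.
 * Returns the payload length, or -1 if the message is truncated or
 * the output buffer is too small.
 */
static int ckpt_msg_unpack(const struct ckpt_msg_sketch *m, size_t msg_size,
			   void *out, size_t out_size)
{
	if (!m || !out || msg_size < sizeof(*m))
		return -1;
	if (m->data_len > msg_size - sizeof(*m) || m->data_len > out_size)
		return -1;

	memcpy(out, m->data, m->data_len);
	return (int)m->data_len;
}
```

The length checks in ckpt_msg_unpack() matter in this design because the receiver no longer reads from a separate checkpoint store and has only the message itself to validate.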
*_safe. This had the effect of segfaulting the log daemon when
converting a mirror from one log type to another.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
When clogd was renamed to cmirrord, git somehow recorded the removal of the old
files but not the addition of the new files. This patch adds the new files.
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>