IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
we must always restart the lockmanager when the cluster has been
reconfigured and ip addresses has changed. This is to make sure we get a
clusterwide grace period for nfs locking.
if we dont do this and only restart locking on the nodes that were
direclty affected, a different client can take out a conflicting lock
from a different node before affected clients has had a chance to
reclaim all the locks lost during reconfigure.
grace period on rhel5 kernel has bene increased to 90 seconds!
statd-callout:
we must restart lockmanager to ensure a clusterwide grace period for
nfs. this makes locking "more correct" for nfs clients and prevents
other clients/nodes from taking out a conflicting lock while a different
client/node tries to reclaim lost locks.
This makes it "almost consistent" for NFS clients but there is still
the possibility that a cifs client can take out a conflicting lock
before an nfs client has had a chance to reclaim an existing lock.
This can not be solved with anything less than making the kernel nfs
lock manager "samba aware" and making samba aware of the internal state
of the kernel lock manager so that they can cooperate.
we can not just stop/start the lockmanager back to back in rhel5 since
if they are stopped/started too close to eachother then when the new
lockmanager upon starting up sends out statd notifications two things
can happen:
1, new lockmanager sends out notification BEFORE it has registered with
portmapper leading to
lockmanager starts
lockmanager sends notification to the client
client tries to recover the lock and tries to portmap the lockmanager
port on the server.
server is not (yet) registered with portmapper and server responds
"no such program" to hte clients request to discover where lockmanager
is.
client then just completely gives up reclaiming the lock and doesnt
even reattempt the portmapper call after some timeout.
==> lock reclaim failed.
2, if they are started back to back, and a client tries to reclaim the
lock the lockmanager sometimes sends two responses back to back
to the client. one with status NLM_GRANTED (==you got the lock
reclaimed) and one with status NLM_DENIED (==you could not get the lock
reclaimed)
This confuses the client and leads to the server thinking that the
client does have the lock and the client thinking it has not got the
lock and orphaned locks result.
We also send out additional notification messages of different formats
to allow more legacy clients to interoperate with locking.
(This used to be ctdb commit 13208c1aab2942e28dff87e38e6794bf0c026033)
multiple public addresses spread across multiple interfaces on each
node.
this is a massive patch since we have previously made the assumtion that
we only have one public address per node.
get rid of the public_interface argument. the public addresses file
now explicitely lists which interface the address belongs to
(This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8)
nfs-state directory actually exists (by creating it)
or else the lock manager will not start
(This used to be ctdb commit f2d15d04df842538c8d8331796a3c6fbe23463f2)
specific script /etc/ctdb/events.d/00.ctdb
get rid of CTDB_EVENTS_SCRIPT and --event-script
(This used to be ctdb commit 81ccfaf838e5772d4a58eb6a70224b7b39aba9f3)
times to try the reset.
the reset retry attempt is now handled inside the daemon
update the 60.nfs script and remove this parameter that is no longer
used
(This used to be ctdb commit 30fb09b8b9a989e5cfe86b6daf2dcd2487013344)
the tcp connection
change the 60.nfs script to run ctdb killtcp in the foreground so we
dont get lots of these running in parallel when there are a lot of tcp
connections to rst
(This used to be ctdb commit d81616214752882242f2886e94681972a790db80)
- added DatabaseHashSize tunable
- added logging of events inside recovery (for timing)
(This used to be ctdb commit 3593cdb928b91e217faf1b3c537fa28dc82cdace)
- added monitoring of the ethernet link state
When monitoring detects an error, the node loses its public IP address
(This used to be ctdb commit 0af57aead8c983511d25774b4ffe09fa5ff26501)
use a dedicated variable CTDB_MANAGES_NFSLOCK since some might want to
use nfs but no lockmanager
(This used to be ctdb commit 1e8cec86617ffb188bd49c70f074a4b350d3fe3d)