mirror of https://github.com/samba-team/samba.git synced 2024-12-25 23:21:54 +03:00
Commit Graph

1557 Commits

Author SHA1 Message Date
Ronnie Sahlberg
416409d31b add a ctdb command to print the ctdb version
(This used to be ctdb commit 401fb01f8cb06886e2c5c277a9a70512a9b68579)
2008-04-03 17:07:00 +11:00
Ronnie Sahlberg
d6736b3720 we allocated one byte too little in the blob we need to send as the control to the server.
(This used to be ctdb commit 10e585413c217d9b9c32ff3d2fb3d8f24183c458)
2008-04-03 16:35:23 +11:00
Ronnie Sahlberg
6b797f148c From Chris Cowan
Add support in AIX to track the PID of a client that connects to the unix domain socket

(This used to be ctdb commit 4c006c675d577d4a45f4db2929af6d50bc28dd9e)
2008-04-03 10:58:51 +11:00
Ronnie Sahlberg
eaad3e1868 bump version to .32
(This used to be ctdb commit 794ed5852c09deaffd1817f8a443b4711ed4d06f)
2008-04-02 12:09:27 +11:00
Ronnie Sahlberg
e8e67ef576 add a mechanism to force a node to run the eventscripts with arbitrary arguments
ctdb eventscript "command argument argument ..."

(This used to be ctdb commit 118a16e763d8332c6ce4d8b8e194775fb874c8c8)
2008-04-02 11:13:30 +11:00
Ronnie Sahlberg
03d30f405d decorate the memdump output with a nice field for ctdb_client structures to show the pid of the client that attached
(This used to be ctdb commit 0d9314302d0b988b6ab5d533deef40c5b343c249)
2008-04-01 17:17:21 +11:00
Ronnie Sahlberg
27a7f854f5 add improvements to tracking memory usage in ctdbd and the recovery daemon
and a ctdb command to pull the talloc memory map from a recovery daemon
ctdb rddumpmemory

(This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05)
2008-04-01 15:34:54 +11:00
Ronnie Sahlberg
78081de82a from tridge: decorate dumpmemory output so that packets that are queued show up with a little more information to make memory leak debugging easier
(This used to be ctdb commit 890832ba37d92c7996b38735451f93592c37ff79)
2008-04-01 11:31:42 +11:00
Ronnie Sahlberg
0de4f37c91 return 0 if iscsi is disabled
(This used to be ctdb commit b76400e282cab60ac6b6039dbb33d93bb1350199)
2008-03-31 12:58:20 +11:00
Ronnie Sahlberg
a1334246cf make sure the iface string is null-terminated in the addip control packet
(This used to be ctdb commit 983490556bc12fe03de4c22b5fdc12d15c11d43c)
2008-03-31 12:49:39 +11:00
Ronnie Sahlberg
d03bb15eb3 update the iscsi support under RHEL5 to allow one iscsi target to be defined for each public address in the cluster.
update the documentation for iscsi

(This used to be ctdb commit c1130e58296e63be3787ec59690941b2677a3378)
2008-03-31 11:00:08 +11:00
Ronnie Sahlberg
0d7b34c9e5 Add two new controls to add/delete public ip address from a node at runtime.
The controls only modify the runtime setting of which public addresses a node
can serve and do not modify /etc/ctdb/public_addresses.
To make the change permanent you also need to edit /etc/ctdb/public_addresses
manually.

After ip addresses have been added/deleted you need to invoke a recovery
for the ip addresses to be redistributed.

(This used to be ctdb commit f8294d103fdd8a720d0b0c337d3973c7fdf76b5c)
2008-03-27 09:23:27 +11:00
Ronnie Sahlberg
26ec64a571 fix a memory leak
allocate the memory to the 'call' context and not off the 'ctdb' context

(This used to be ctdb commit be89005bd5d13409e377d425db2aad1c0d5b3826)
2008-03-25 11:11:13 +11:00
Ronnie Sahlberg
5bb9b021f9 update to version 1.0.31
(This used to be ctdb commit a0c9a451afde0c99efdc92e1fd418991bb81fa2b)
2008-03-25 09:43:47 +11:00
Ronnie Sahlberg
2863d2cfd1 From M Dietz,
Add back the controls to enable/disable monitoring we used to have for debugging but removed a while ago

(This used to be ctdb commit 8477f6a079e2beb8c09c19702733c4e17f5032fe)
2008-03-25 08:27:38 +11:00
Ronnie Sahlberg
d53424731f in ctdb_call_local() we can not talloc_steal() the returned data and hang it off ctdb.
This can cause a memory leak if the call is terminated before we have managed to respond to the client.
(and the call is talloc_free()d but the data is still hanging off ctdb)

instead we must talloc_steal() the data and hang it off the call structure to avoid the memory leak.

In order to do this we must also change the call structure that is passed into ctdb_call_local() to be allocated through talloc().

This structure was previously either a static variable, or an element of a larger talloc()ed structure (ctdb_call_state or ctdb_client_call_state) so
we must change all creations of a ctdb_call into explicitly creating it through talloc()

(This used to be ctdb commit 4becf32aea088a25686e8bc330eb47d85ae0ef8f)
2008-03-19 13:54:17 +11:00
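The ownership problem described in this commit can be sketched with a toy model. The `Ctx` class and `steal()` helper below are hypothetical stand-ins written for illustration. They are not the talloc API; they only mimic the rule that freeing a parent context frees everything hanging off it.

```python
class Ctx:
    """Toy stand-in for a talloc context: freeing a parent frees its children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = None
        self.children = []
        self.freed = False
        if parent is not None:
            steal(parent, self)

    def free(self):
        # Free children first, then detach from our own parent.
        for child in list(self.children):
            child.free()
        if self.parent is not None:
            self.parent.children.remove(self)
            self.parent = None
        self.freed = True


def steal(new_parent, obj):
    """Re-parent obj under new_parent (loosely mirrors the idea of talloc_steal())."""
    if obj.parent is not None:
        obj.parent.children.remove(obj)
    obj.parent = new_parent
    new_parent.children.append(obj)
    return obj


ctdb = Ctx("ctdb")               # long-lived daemon context
call = Ctx("call", parent=ctdb)  # short-lived per-call context

leaky = steal(ctdb, Ctx("reply_data"))  # old behaviour: data hangs off ctdb
safe = steal(call, Ctx("reply_data"))   # new behaviour: data hangs off the call

call.free()  # the call is terminated before the reply is sent
```

After `call.free()`, `safe.freed` is true because it was owned by the call, while `leaky.freed` is still false: it stays attached to the long-lived `ctdb` context, which is the leak the commit message describes.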
Ronnie Sahlberg
38212ffd9b don't steal reply_data.dptr to ctdb if there is no data, since then we would leak
memory

(This used to be ctdb commit 53c4f483bb122e6fa13abcc6d4584130f20af461)
2008-03-19 12:08:29 +11:00
Ronnie Sahlberg
e19264ea26 change the log level for the message when someone connects to a non-public ip
(This used to be ctdb commit bc9c4f0d52e9b06aceb08cea99ed3fd20b44616c)
2008-03-13 07:54:55 +11:00
Ronnie Sahlberg
74d57f8d51 Redo the vacuuming process to make it scalable.
Vacuuming used to delete one record at a time on all nodes; that was
m*n behaviour and would require a huge storm of ctdb->ctdb controls and just wouldn't scale at all.

The new vacuuming process collects all records to be deleted locally and then only sends 1 control to the other nodes. This control contains a list of all records to be deleted.

(This used to be ctdb commit 9e625ece19a91f362c9539fa73b6b2108f0d9c53)
2008-03-13 07:53:29 +11:00
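A minimal sketch of why batching helps, under the assumption that "1 control to the other nodes" means one control per remote node. The function names and the `(node, keys)` tuples are hypothetical and only count messages; they do not reflect the actual ctdb control format.

```python
def vacuum_one_at_a_time(records, other_nodes):
    """Old scheme: one delete control per record per remote node (m*n controls)."""
    controls = []
    for key in records:
        for node in other_nodes:
            controls.append((node, [key]))
    return controls


def vacuum_batched(records, other_nodes):
    """New scheme: collect the deletions locally, then send one list per remote node."""
    batch = list(records)
    if not batch:
        return []  # nothing to delete, so nothing to send
    return [(node, batch) for node in other_nodes]
```

With m records and n remote nodes, the first function produces m*n controls and the second produces n, each carrying the full list of records to delete.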
Ronnie Sahlberg
e2930588b3 update to version 1.0.30
(This used to be ctdb commit 89529ea81379335b3db09774d192fb7cefe37338)
2008-03-04 13:40:29 +11:00
Ronnie Sahlberg
b1cf2b5653 Update ctdb uptime to provide machine-readable output
(This used to be ctdb commit 4f7f8aa6f178115b551ac35f7df2ec5aad054fe2)
2008-03-04 13:29:48 +11:00
Ronnie Sahlberg
61b52e0e64 provide machine-readable -Y output for 'ctdb getdebug'
(This used to be ctdb commit 646f4d9a01637685e967fb3ecc042fc97c0b7529)
2008-03-04 13:23:06 +11:00
Ronnie Sahlberg
212fbb42d5 make 'ctdb ip' provide machine-readable output using '-Y'
(This used to be ctdb commit 446e2f4e650b12d6fce5677a6841006462c23dba)
2008-03-04 13:18:27 +11:00
Ronnie Sahlberg
5afb32f976 document some public tunables
(This used to be ctdb commit 61fd50e2b3aa9a3ed32bc81a8e28464f267dc490)
2008-03-04 13:06:46 +11:00
Ronnie Sahlberg
4600834377 document some new ctdb commands
(This used to be ctdb commit f3648a8a5b3934ea42c7d2550f729a5bd61a4d0f)
2008-03-04 12:37:24 +11:00
Ronnie Sahlberg
d9b534b59d A new command to 'ctdb'
ctdb moveip <IPADDRESS> <NODE>

which can be used to manually fail an ip address over to a specific node.

This can only be used if DeterministicIPs are disabled and also only if NoIPFailback is enabled.

(This used to be ctdb commit ffee062b7e26a6aa6ad254edb58399040ecaa542)
2008-03-04 12:20:23 +11:00
Ronnie Sahlberg
a89ed0fdc2 add a new tunable 'NoIPFailback'
when this tunable is set, ip addresses will only be failed over when a node
fails. And only those ip addresses held by the failed node will be reallocated
in the cluster.

When a node becomes active again, this will not lead to any failback of ip addresses.

This can reduce the number of "ip address movements" in the cluster since we don't automatically fail an ip address back, but can also lead to an unbalanced cluster since we no longer attempt to spread the ip addresses out evenly across the active nodes.

This tunable can NOT be active at the same time as DeterministicIPs are used.

(This used to be ctdb commit d3b8a461b15bc584fa1785eb5922de6d49d8f6c4)
2008-03-03 12:52:16 +11:00
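The failover-only behaviour described above can be sketched as follows. This is an illustration, not the ctdb allocation code: the function name is hypothetical, and picking the active node that currently holds the fewest addresses is an assumed policy that the commit message does not specify.

```python
def reallocate_on_failure(assignment, failed_node, active_nodes):
    """Move only the addresses held by failed_node; leave all others in place.

    assignment maps ip address -> node number.
    """
    result = dict(assignment)
    for ip, node in assignment.items():
        if node == failed_node:
            # Assumed policy: choose the active node holding the fewest addresses.
            target = min(
                active_nodes,
                key=lambda n: sum(1 for v in result.values() if v == n),
            )
            result[ip] = target
    return result
```

When the failed node later becomes active again, this sketch would simply not be called, so no address moves back. That matches the trade-off in the commit message: fewer address movements, at the cost of a possibly unbalanced cluster.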
Ronnie Sahlberg
e08519b74d when we reallocate the ip addresses for nodes, we must make sure that
a node that has been allocated to serve an ip actually CAN serve that ip
(if we use differing public_addresses files on each node)

(This used to be ctdb commit fdaf7cb2d7682507fbf4c6c2b833b327c93fac08)
2008-03-03 10:53:23 +11:00
Ronnie Sahlberg
57d29f1011 add a num_connected field to the rec structure that holds the number
of connected nodes

num_active only contains the number of active nodes and would thus not count
banned nodes

(This used to be ctdb commit 06d3ce470766ef0b60d68ccd84de5437146cc147)
2008-03-03 10:24:17 +11:00
Ronnie Sahlberg
f6f7f54bd6 add a new tunable : reclockpingperiod
once every such interval :
* the recovery master on each node will update the "connected" count in the
reclock count file (ctdb getreclock)
* if the node thinks it is a recovery master but it detects another node
  that is DISCONNECTED but which still holds a lock to the reclock count file
  this may mean that we have a split cluster.
  if that other node that is DISCONNECTED but still holds the lock on the reclock
  pnn count file, is MORE connected than the local node,
  yield the recmaster role and let the other half of the cluster take over

this adds a second, last chance mechanism to detect split clusters.
IF the cluster is split but GPFS is not yet split, this mechanism makes
the largest half of the cluster become the active half.

(This used to be ctdb commit 07af425f444531942cce8abff112c1524228d287)
2008-03-03 09:19:30 +11:00
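The yield rule in the second bullet can be expressed as a small predicate. This is a sketch of the decision only, assuming the connected counts have already been read from the count file. The function name and the dictionary shape are hypothetical.

```python
def should_yield_recmaster(local_connected, disconnected_lock_holders):
    """Decide whether the local recovery master should step down.

    disconnected_lock_holders maps the pnn of each node that appears
    DISCONNECTED but still holds its lock to the connected count it recorded.
    """
    return any(
        count > local_connected
        for count in disconnected_lock_holders.values()
    )
```

For example, a recovery master that sees 2 connected nodes would yield if a disconnected lock holder reports 3, so the larger half of a split cluster stays active.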
Ronnie Sahlberg
cadd95263f change recmaster from being a local variable in monitor_cluster() to be a member of the ctdb_recoverd structure
(This used to be ctdb commit b7f955338f50c92374b4f559268fb3a1a516aefa)
2008-03-03 07:53:46 +11:00
Ronnie Sahlberg
814570f904 update the reclock pnn count for how many nodes are connected to the current node once every 60 seconds
(This used to be ctdb commit bf1863cc9e2539b2c3e53c664b493b459ebfcc8b)
2008-02-29 13:14:47 +11:00
Ronnie Sahlberg
efa29c6c98 store the num_active variable (number of connected/active nodes) inside the rec
structure and avoid passing this as an extra parameter to do_recovery()

(This used to be ctdb commit 8bb229aa3b4bd41e48d4e4e2e148d8680c8ba436)
2008-02-29 12:55:20 +11:00
Ronnie Sahlberg
e0036942bc add a new file <reclock>.pnn where each recovery daemon can lock that byte at offset==pnn to offer an alternative way to detect which nodes are active instead of relying on CONNECTED being accurate.
(This used to be ctdb commit 21d3319eaf463e2a00637d440ee2d4d15f53bf09)
2008-02-29 12:37:42 +11:00
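The idea of locking the byte at offset == pnn can be sketched with POSIX byte-range locks. The function name below is hypothetical and this is not the ctdb implementation; it only shows the locking pattern on a Unix system.

```python
import fcntl
import os


def lock_pnn_byte(path, pnn):
    """Try to take an exclusive lock on the single byte at offset == pnn.

    Returns the open file descriptor on success (the lock is held for as
    long as the descriptor stays open), or None if another process already
    holds that byte.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # length=1, start=pnn, whence=SEEK_SET: lock exactly one byte.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, pnn, os.SEEK_SET)
    except OSError:
        os.close(fd)
        return None
    return fd
```

Because each node locks a different byte, all nodes can hold their own lock on the same file at once, and a peer can probe a given offset to see whether that node's daemon is still alive. Note that POSIX locks belong to the process, so a second attempt from the same process does not conflict.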
Ronnie Sahlberg
4adeafef11 add a control to get the name of the reclock file from the daemon
(This used to be ctdb commit 9effb22cc1616d684352d7ebabb359e69adb0f52)
2008-02-29 10:03:39 +11:00
Ronnie Sahlberg
7bc8007f93 add a new tunable DisableWhenUnhealthy which when set will cause a node to automatically become DISABLED anytime monitoring fails and the node becomes UNHEALTHY.
Use with caution.

(This used to be ctdb commit c20293360db67f9876b0c84e5e9e12a5868964cb)
2008-02-22 10:33:09 +11:00
Ronnie Sahlberg
0c193b349b document the --start-as-disabled argument
(This used to be ctdb commit 613881a06186dec90fb64a7190ddf4afd7437d67)
2008-02-22 10:01:15 +11:00
Ronnie Sahlberg
f3b474cffb Add debug output to indicate why a node starts up in DISABLED state
(This used to be ctdb commit 8df75775966ead36e1073896fedeff674a6e0587)
2008-02-22 09:52:57 +11:00
Ronnie Sahlberg
39539f6044 Add a new parameter to /etc/sysconfig/ctdb
CTDB_START_AS_DISABLED="yes"

and command line argument
--start-as-disabled

When set, this makes the ctdb node always start in DISABLED mode, so it will not host any public ip addresses.
The administrator must manually "ctdb enable" the node after it has started when the administrator wants the node to start hosting public ip addresses.

Using this option it is possible to start ctdb on a node without causing any reallocation of ip addresses when it is starting. The node will still merge with the cluster and there will still be a recovery phase but the ip address allocations will not change in the cluster.

(This used to be ctdb commit b93d29f43f5306c244c887b54a77bca8a061daf2)
2008-02-22 09:42:52 +11:00
Ronnie Sahlberg
c8503e06cd monitor the amount of free memory and if this threshold is crossed, monitoring will log an OOM message in the ctdb log and shut down ctdb on the node.
by default ctdb does not monitor for OOM.
to enable this you need to uncomment the CTDB_MONITOR_FREE_MEMORY line in /etc/sysconfig/ctdb and specify the amount in MByte free that will trigger OOM and cause ctdb to shut down the node

(This used to be ctdb commit 35627c7450a03f36a353c3dd7cce31ce3433a7ff)
2008-02-21 13:29:28 +11:00
Ronnie Sahlberg
050e6298e6 update version to 1.0.29
(This used to be ctdb commit bb8229d7b479bd486b07fa6cd04100fec02bddee)
2008-02-21 08:37:29 +11:00
Ronnie Sahlberg
16c4e9c4aa make the ctdb reloadnodes reload the nodes file on all nodes and restart the transport
(This used to be ctdb commit 6272ad33b4af6ea9d6fd0ac877df3f75be45d665)
2008-02-21 08:25:01 +11:00
Ronnie Sahlberg
9f99b44fd1 to make it easier/less disruptive to add nodes to a running cluster
add a new control that causes the node to drop the current nodes list
and reread it from the nodes file.
During this operation, the node will also drop the tcp layer and restart it.

When we drop the tcp layer, by talloc_free()ing the ctcp structure
add a destructor to ctcp so that we also can clean up and remove the references in the ctdb structure to the transport layer

add two new commands for the ctdb tool.
one to list all nodes in the nodesfile and the second a command to trigger a node to drop the transport and reinitialize it with the new nodes file

(This used to be ctdb commit 4bc20ac73e9fa94ffd43cccb6eeb438eeff9963c)
2008-02-19 14:44:48 +11:00
Ronnie Sahlberg
87b38e01b2 the ctdb structure must make its own copy of the ->address field and not just
copy the content of the nodes structure.

this ctdb_address structure contains a pointer which is talloced hanging off the structure itself.
If we copy the content of this structure as we did in assigning to ctdb->address from nodes[i]
then if we talloc_free() the node structure we end up with a wild pointer in ctdb->address

(This used to be ctdb commit 644a7248548260d37df432979b129797750907f4)
2008-02-19 14:35:15 +11:00
Ronnie Sahlberg
bef60e8200 read the current debuglevel in each loop in the recovery daemon so that we
pick up when they change in the parent daemon

(This used to be ctdb commit 792d5471ff0c2947b6e66183925860de27f30eaf)
2008-02-18 19:38:04 +11:00
Ronnie Sahlberg
8da0e15a07 from Mathieu PARENT <math.parent@gmail.com>
Simulate "nice service" on systems that do not have "service"

(This used to be ctdb commit d0e6dcbadaf41745d423640e5ff5bafd9f68eb88)
2008-02-13 08:20:20 +11:00
Ronnie Sahlberg
e3770e5f9d From Mathieu PARENT <math.parent@gmail.com>
Set the correct permissions for events.d/README

(This used to be ctdb commit d8953c89adc7d11d2fecc61323b7e1456b56fcaa)
2008-02-13 08:17:53 +11:00
Ronnie Sahlberg
42702fa770 add helpers to stop/start nfs lockmanager on different platforms
(This used to be ctdb commit 3b797d851bd4bdb8ec2b3981061c668d2cf0f97c)
2008-02-11 09:52:09 +11:00
Ronnie Sahlberg
0e31eaed57 create a startstop_nfs function that can start/stop the nfs service of different platforms
(This used to be ctdb commit f6cc6bd1f62138fbf812d1917f7341e2fa2323da)
2008-02-11 09:35:37 +11:00
Ronnie Sahlberg
b58a128f6a update to revision 28
(This used to be ctdb commit ca266a989ba1c6fcac48b1739e7cff7766481df7)
2008-02-08 15:12:06 +11:00