mirror of https://github.com/samba-team/samba.git synced 2025-01-11 05:18:09 +03:00
Commit Graph

1370 Commits

Author SHA1 Message Date
Ronnie Sahlberg
74d57f8d51 Redo the vacuuming process to make it scalable.
Vacuuming used to delete one record at a time on all nodes; that was
m*n behaviour and would require a huge storm of ctdb->ctdb controls and just wouldn't scale at all.

The new vacuuming process collects all records to be deleted locally and then sends only one control to each of the other nodes. This control contains a list of all records to be deleted.

(This used to be ctdb commit 9e625ece19a91f362c9539fa73b6b2108f0d9c53)
2008-03-13 07:53:29 +11:00
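The scaling difference described in the vacuuming commit above can be illustrated with a small sketch. This is not the ctdb implementation (which is written in C); the function names and the representation of a "control" as a `(node, keys)` tuple are hypothetical and only serve to count messages.

```python
def controls_per_record(records, other_nodes):
    """Old scheme: one delete control per record per remote node (m*n)."""
    return [(node, [key]) for key in records for node in other_nodes]


def controls_batched(records, other_nodes):
    """New scheme: one control per remote node, carrying the whole list."""
    keys = list(records)
    if not keys:
        return []
    return [(node, keys) for node in other_nodes]


# Hypothetical workload: 1000 records to vacuum, three other nodes.
records = ["key%d" % i for i in range(1000)]
other_nodes = [1, 2, 3]

old = controls_per_record(records, other_nodes)
new = controls_batched(records, other_nodes)
print(len(old), len(new))  # 3000 3
```

The number of controls in the batched scheme depends only on the number of nodes, not on the number of records.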
Ronnie Sahlberg
e2930588b3 update to version 1.0.30
(This used to be ctdb commit 89529ea81379335b3db09774d192fb7cefe37338)
2008-03-04 13:40:29 +11:00
Ronnie Sahlberg
b1cf2b5653 Update ctdb uptime to provide machine-readable output
(This used to be ctdb commit 4f7f8aa6f178115b551ac35f7df2ec5aad054fe2)
2008-03-04 13:29:48 +11:00
Ronnie Sahlberg
61b52e0e64 provide machine-readable -Y output for 'ctdb getdebug'
(This used to be ctdb commit 646f4d9a01637685e967fb3ecc042fc97c0b7529)
2008-03-04 13:23:06 +11:00
Ronnie Sahlberg
212fbb42d5 make 'ctdb ip' provide machine-readable output using '-Y'
(This used to be ctdb commit 446e2f4e650b12d6fce5677a6841006462c23dba)
2008-03-04 13:18:27 +11:00
Ronnie Sahlberg
5afb32f976 document some public tunables
(This used to be ctdb commit 61fd50e2b3aa9a3ed32bc81a8e28464f267dc490)
2008-03-04 13:06:46 +11:00
Ronnie Sahlberg
4600834377 document some new ctdb commands
(This used to be ctdb commit f3648a8a5b3934ea42c7d2550f729a5bd61a4d0f)
2008-03-04 12:37:24 +11:00
Ronnie Sahlberg
d9b534b59d A new command to 'ctdb'
ctdb moveip <IPADDRESS> <NODE>

which can be used to manually fail an ip address over to a specific node.

This can only be used if DeterministicIPs is disabled and only if NoIPFailback is enabled.

(This used to be ctdb commit ffee062b7e26a6aa6ad254edb58399040ecaa542)
2008-03-04 12:20:23 +11:00
Ronnie Sahlberg
a89ed0fdc2 add a new tunable 'NoIPFailback'
when this tunable is set, ip addresses will only be failed over when a node
fails. And only those ip addresses held by the failed node will be reallocated
in the cluster.

When a node becomes active again, this will not lead to any failback of ip addresses.

This can reduce the number of "ip address movements" in the cluster since we don't automatically fail an ip address back, but it can also lead to an unbalanced cluster since we no longer attempt to spread the ip addresses out evenly across the active nodes.

This tunable can NOT be active at the same time as DeterministicIPs is used.

(This used to be ctdb commit d3b8a461b15bc584fa1785eb5922de6d49d8f6c4)
2008-03-03 12:52:16 +11:00
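The trade-off described in the NoIPFailback commit above can be sketched as follows. This is a simplified model, not ctdb's allocation code: the `reallocate` function, the round-robin rebalancing, and the least-loaded placement rule are all assumptions made for illustration.

```python
def reallocate(assignment, active_nodes, no_ip_failback):
    """Return a new ip -> node mapping.

    assignment: dict mapping ip -> node number (or None if unassigned).
    active_nodes: set of node numbers currently able to host addresses.
    """
    active = sorted(active_nodes)
    if not active:
        return {ip: None for ip in assignment}

    if not no_ip_failback:
        # Failback allowed: spread every address evenly (round-robin here).
        return {ip: active[i % len(active)]
                for i, ip in enumerate(sorted(assignment))}

    # NoIPFailback: keep addresses where they are; only move the ones
    # whose current holder is no longer active.
    new = dict(assignment)
    load = {node: 0 for node in active}
    for node in new.values():
        if node in load:
            load[node] += 1
    for ip in sorted(new):
        if new[ip] not in load:
            target = min(active, key=lambda n: (load[n], n))
            new[ip] = target
            load[target] += 1
    return new


ips = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
balanced = reallocate({ip: None for ip in ips}, {0, 1}, no_ip_failback=False)
after_failure = reallocate(balanced, {0}, no_ip_failback=True)
after_return = reallocate(after_failure, {0, 1}, no_ip_failback=True)
print(after_return)  # every address stays on node 0
```

In this model, when node 1 comes back nothing moves, so the address count stays low-churn but unbalanced, which matches the behaviour the commit message warns about.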
Ronnie Sahlberg
e08519b74d when we reallocate the ip addresses for nodes, we must make sure that
a node that has been allocated to serve an ip actually CAN serve that ip
(if we use differing public_addresses files on each node)

(This used to be ctdb commit fdaf7cb2d7682507fbf4c6c2b833b327c93fac08)
2008-03-03 10:53:23 +11:00
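The check described in the commit above amounts to filtering candidate nodes by their own address list before assigning an address. A minimal sketch, assuming a hypothetical `public_addresses` mapping of node number to the set of addresses listed in that node's public_addresses file:

```python
def pick_node(ip, active_nodes, public_addresses):
    """Return the lowest-numbered active node that can serve ip, else None."""
    candidates = [node for node in sorted(active_nodes)
                  if ip in public_addresses.get(node, set())]
    return candidates[0] if candidates else None


# Hypothetical cluster where the nodes have differing address lists.
public_addresses = {
    0: {"10.0.0.1", "10.0.0.2"},
    1: {"10.0.0.2", "10.0.0.3"},
}
print(pick_node("10.0.0.3", {0, 1}, public_addresses))  # 1
print(pick_node("10.0.0.9", {0, 1}, public_addresses))  # None
```

The tie-break (lowest node number) is arbitrary here; the point is only that a node not listing the address is never chosen.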
Ronnie Sahlberg
57d29f1011 add a num_connected field to the rec structure that holds the number
of connected nodes

num_active only contains the number of active nodes and would thus not count
banned nodes

(This used to be ctdb commit 06d3ce470766ef0b60d68ccd84de5437146cc147)
2008-03-03 10:24:17 +11:00
Ronnie Sahlberg
f6f7f54bd6 add a new tunable : reclockpingperiod
once every such interval :
* the recovery master on each node will update the "connected" count in the
  reclock count file (ctdb getreclock)
* if the node thinks it is a recovery master but it detects another node
  that is DISCONNECTED but which still holds a lock to the reclock count file
  this may mean that we have a split cluster.
  if that other node that is DISCONNECTED but still holds the lock on the reclock
  pnn count file, is MORE connected than the local node,
  yield the recmaster role and let the other half of the cluster take over

this adds a second, last chance mechanism to detect split clusters.
IF the cluster is split but GPFS is not yet split, this mechanism makes
the larger half of the cluster become the active half.

(This used to be ctdb commit 07af425f444531942cce8abff112c1524228d287)
2008-03-03 09:19:30 +11:00
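The yield rule in the reclockpingperiod commit above can be written as a single predicate. This is a sketch of the decision only; the `PeerView` type and its field names are hypothetical, and how the lock state and counts are actually read from the reclock count file is not modelled.

```python
from dataclasses import dataclass


@dataclass
class PeerView:
    pnn: int
    connected: bool        # does the local node see this peer as connected?
    holds_lock: bool       # does the peer still hold its lock in the count file?
    connected_count: int   # the "connected" count the peer last recorded


def should_yield_recmaster(is_recmaster, local_connected_count, peers):
    """True if a disconnected peer that still holds its lock is MORE
    connected than the local node, suggesting we are the smaller half."""
    if not is_recmaster:
        return False
    return any(
        (not peer.connected)
        and peer.holds_lock
        and peer.connected_count > local_connected_count
        for peer in peers
    )


# Hypothetical 5-node cluster split 2/3; the local node is in the 2-node half.
peers = [
    PeerView(pnn=1, connected=True, holds_lock=True, connected_count=2),
    PeerView(pnn=2, connected=False, holds_lock=True, connected_count=3),
]
print(should_yield_recmaster(True, 2, peers))  # True
```

A disconnected peer that no longer holds its lock is ignored, since it is presumably down rather than on the other side of a split.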
Ronnie Sahlberg
cadd95263f change recmaster from being a local variable in monitor_cluster() to be a member of the ctdb_recoverd structure
(This used to be ctdb commit b7f955338f50c92374b4f559268fb3a1a516aefa)
2008-03-03 07:53:46 +11:00
Ronnie Sahlberg
814570f904 update the reclock pnn count for how many nodes are connected to the current node once every 60 seconds
(This used to be ctdb commit bf1863cc9e2539b2c3e53c664b493b459ebfcc8b)
2008-02-29 13:14:47 +11:00
Ronnie Sahlberg
efa29c6c98 store the num_active variable (number of connected/active nodes) inside the rec
structure and avoid passing this as an extra parameter to do_recovery()

(This used to be ctdb commit 8bb229aa3b4bd41e48d4e4e2e148d8680c8ba436)
2008-02-29 12:55:20 +11:00
Ronnie Sahlberg
e0036942bc add a new file <reclock>.pnn where each recovery daemon can lock that byte at offset==pnn to offer an alternative way to detect which nodes are active instead of relying on CONNECTED being accurate.
(This used to be ctdb commit 21d3319eaf463e2a00637d440ee2d4d15f53bf09)
2008-02-29 12:37:42 +11:00
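The idea of locking one byte per node in a shared file can be sketched with POSIX advisory record locks. The example below uses Python's `fcntl.lockf` on a Unix system; the helper name and the temporary file path are hypothetical, and the real daemon does this in C on the cluster filesystem.

```python
import fcntl
import os
import tempfile


def lock_pnn_byte(path, pnn):
    """Take an exclusive, non-blocking lock on the single byte at offset pnn.

    Returns the open file descriptor; the lock is held until it is closed.
    Raises OSError if another process already holds that byte.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Arguments: fd, command, length=1 byte, start=pnn, whence=SEEK_SET.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, pnn, os.SEEK_SET)
    except OSError:
        os.close(fd)
        raise
    return fd


# Hypothetical usage: node 3 marks itself active in a scratch file.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "reclock.pnn")
demo_fd = lock_pnn_byte(demo_path, 3)
os.close(demo_fd)  # closing the descriptor releases the lock
```

Another process can probe the same byte with a non-blocking lock attempt: failure means the owning node's recovery daemon is still alive. Note that POSIX locks do not conflict within a single process, so this probe only works across processes.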
Ronnie Sahlberg
4adeafef11 add a control to get the name of the reclock file from the daemon
(This used to be ctdb commit 9effb22cc1616d684352d7ebabb359e69adb0f52)
2008-02-29 10:03:39 +11:00
Ronnie Sahlberg
7bc8007f93 add a new tunable DisableWhenUnhealthy which when set will cause a node to automatically become DISABLED anytime monitoring fails and the node becomes UNHEALTHY.
Use with caution.

(This used to be ctdb commit c20293360db67f9876b0c84e5e9e12a5868964cb)
2008-02-22 10:33:09 +11:00
Ronnie Sahlberg
0c193b349b document the --start-as-disabled argument
(This used to be ctdb commit 613881a06186dec90fb64a7190ddf4afd7437d67)
2008-02-22 10:01:15 +11:00
Ronnie Sahlberg
f3b474cffb Add debug output to indicate why a node starts up in DISABLED state
(This used to be ctdb commit 8df75775966ead36e1073896fedeff674a6e0587)
2008-02-22 09:52:57 +11:00
Ronnie Sahlberg
39539f6044 Add a new parameter to /etc/sysconfig/ctdb
CTDB_START_AS_DISABLED="yes"

and command line argument
--start-as-disabled

When set, this makes the ctdb node always start in DISABLED mode, so it will not host any public ip addresses.
The administrator must manually "ctdb enable" the node after it has started when the administrator wants the node to start hosting public ip addresses.

Using this option it is possible to start ctdb on a node without causing any reallocation of ip addresses when it is starting. The node will still merge with the cluster and there will still be a recovery phase but the ip address allocations will not change in the cluster.

(This used to be ctdb commit b93d29f43f5306c244c887b54a77bca8a061daf2)
2008-02-22 09:42:52 +11:00
Ronnie Sahlberg
c8503e06cd monitor the amount of free memory and if this threshold is crossed, monitoring will log an OOM error in the ctdb log and shut down ctdb on the node.
by default ctdb does not monitor for OOM.
to enable this you need to uncomment the CTDB_MONITOR_FREE_MEMORY line in /etc/sysconfig/ctdb and specify the amount of free memory in MByte that will trigger OOM and cause ctdb to shut down the node

(This used to be ctdb commit 35627c7450a03f36a353c3dd7cce31ce3433a7ff)
2008-02-21 13:29:28 +11:00
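The free-memory check described in the commit above can be sketched as a threshold comparison. The commit does not say how free memory is measured; the example assumes the `MemFree` field of Linux `/proc/meminfo` and treats "at or below the threshold" as the trigger, both of which are assumptions. The function names are hypothetical.

```python
def free_mbytes(meminfo_text):
    """Extract MemFree from /proc/meminfo-style text, converted to MByte."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemFree:"):
            kbytes = int(line.split()[1])  # the field is reported in kB
            return kbytes // 1024
    raise ValueError("MemFree not found")


def oom_triggered(meminfo_text, threshold_mbytes):
    """True when free memory is at or below the configured threshold."""
    return free_mbytes(meminfo_text) <= threshold_mbytes


# Hypothetical sample; on a live system read the text from /proc/meminfo.
sample = "MemTotal:  2048000 kB\nMemFree:     51200 kB\n"
print(free_mbytes(sample))         # 50
print(oom_triggered(sample, 100))  # True
print(oom_triggered(sample, 10))   # False
```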
Ronnie Sahlberg
050e6298e6 update version to 1.0.29
(This used to be ctdb commit bb8229d7b479bd486b07fa6cd04100fec02bddee)
2008-02-21 08:37:29 +11:00
Ronnie Sahlberg
16c4e9c4aa make the ctdb reloadnodes reload the nodes file on all nodes and restart the transport
(This used to be ctdb commit 6272ad33b4af6ea9d6fd0ac877df3f75be45d665)
2008-02-21 08:25:01 +11:00
Ronnie Sahlberg
9f99b44fd1 to make it easier/less disruptive to add nodes to a running cluster
add a new control that causes the node to drop the current nodes list
and reread it from the nodes file.
During this operation, the node will also drop the tcp layer and restart it.

When we drop the tcp layer, by talloc_free()ing the ctcp structure
add a destructor to ctcp so that we also can clean up and remove the references in the ctdb structure to the transport layer

add two new commands for the ctdb tool.
one to list all nodes in the nodes file and a second to trigger a node to drop the transport and reinitialize it with the new nodes file

(This used to be ctdb commit 4bc20ac73e9fa94ffd43cccb6eeb438eeff9963c)
2008-02-19 14:44:48 +11:00
Ronnie Sahlberg
87b38e01b2 the ctdb structure must make its own copy of the ->address field and not just
copy the content of the nodes structure.

this ctdb_address structure contains a pointer which is talloced hanging off the structure itself.
If we copy the content of this structure as we did in assigning to ctdb->address from nodes[i]
then if we talloc_free() the node structure we end up with a wild pointer in ctdb->address

(This used to be ctdb commit 644a7248548260d37df432979b129797750907f4)
2008-02-19 14:35:15 +11:00
Ronnie Sahlberg
bef60e8200 read the current debuglevel in each loop in the recovery daemon so that we
pick up when it changes in the parent daemon

(This used to be ctdb commit 792d5471ff0c2947b6e66183925860de27f30eaf)
2008-02-18 19:38:04 +11:00
Ronnie Sahlberg
8da0e15a07 from Mathieu PARENT <math.parent@gmail.com>
Simulate "nice service" on systems that do not have "service"

(This used to be ctdb commit d0e6dcbadaf41745d423640e5ff5bafd9f68eb88)
2008-02-13 08:20:20 +11:00
Ronnie Sahlberg
e3770e5f9d From Mathieu PARENT <math.parent@gmail.com>
Set the correct permissions for events.d/README

(This used to be ctdb commit d8953c89adc7d11d2fecc61323b7e1456b56fcaa)
2008-02-13 08:17:53 +11:00
Ronnie Sahlberg
42702fa770 add helpers to stop/start nfs lockmanager on different platforms
(This used to be ctdb commit 3b797d851bd4bdb8ec2b3981061c668d2cf0f97c)
2008-02-11 09:52:09 +11:00
Ronnie Sahlberg
0e31eaed57 create a startstop_nfs function that can start/stop the nfs service of different platforms
(This used to be ctdb commit f6cc6bd1f62138fbf812d1917f7341e2fa2323da)
2008-02-11 09:35:37 +11:00
Ronnie Sahlberg
b58a128f6a update to revision 28
(This used to be ctdb commit ca266a989ba1c6fcac48b1739e7cff7766481df7)
2008-02-08 15:12:06 +11:00
Andrew Tridgell
f6ebcd6a55 carefully step around the recovery area when doing a tdb_wipe_all. This prevents
problems with wipe_all on databases that may need crash recovery

(This used to be ctdb commit e7b1349bf8784c151c2651edd99b3f40ebcece1f)
2008-02-08 14:10:54 +11:00
Andrew Tridgell
1ff6e08f78 don't ship the .git directory in the srpm
(This used to be ctdb commit 0e88962c3f37eb38c4c6bfec918ce833b4481170)
2008-02-08 13:22:47 +11:00
Ronnie Sahlberg
60e2a45454 Merge git://git.samba.org/tridge/ctdb
(This used to be ctdb commit a8762bf8ca1958985896f174ccafe09361092d09)
2008-02-08 08:21:03 +11:00
Andrew Tridgell
fbba202f1a fixed a problem with tdb growing after each recovery
(This used to be ctdb commit d754380961e67271809fed6c44f45356fe7a9c77)
2008-02-07 23:01:06 +11:00
Ronnie Sahlberg
81232a9e29 don't use absolute pathnames for the netstat tool
it can be either in /bin or /usr/bin

(This used to be ctdb commit 4ab09e90a8a81b26d2e2af168cfce3c49a98c0e5)
2008-02-07 15:41:48 +11:00
Ronnie Sahlberg
071021b67f don't use an absolute pathname for the touch command
(This used to be ctdb commit dbfa5cb7f91b5c3c7a2dcf337f60b5c4c188a688)
2008-02-07 15:38:59 +11:00
Ronnie Sahlberg
6820f4ea15 don't use an absolute pathname for the iptables tool
(This used to be ctdb commit 8f87385c09b16c0e32d797c4b442865d8185d9ee)
2008-02-07 15:36:26 +11:00
Ronnie Sahlberg
f992455ce3 don't use an absolute path for the basename command
(This used to be ctdb commit 2519d30162fa3e9d5d81efd374543a2e4dfce545)
2008-02-07 15:33:52 +11:00
Ronnie Sahlberg
35ee7d4999 in the 91.lvs event script
IF lvs has been configured, check that the ipvsadm package has also
been installed since we depend on it.
If not, log an error and return 1

(This used to be ctdb commit 506174bbc47f1176122be2e55099149e3db27d57)
2008-02-07 09:42:35 +11:00
Ronnie Sahlberg
a8ea67203f change the IF interface is a BOND THEN xxx ELSE assume everything is ethernet
into a case and add an arm for ib*) (infiniband interfaces)

Don't try using ethtool on ib devices
(mii_tool doesn't work either)

IB does have a command, ibv_devinfo, which can tell whether a physical port
is up or not, but it seems nontrivial to map this to an interface name such as ib0

(This used to be ctdb commit ab6bd25542946a732b4378f5476edfb466d6c000)
2008-02-07 09:35:46 +11:00
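The three-way dispatch described in the commit above (bond, ib*, everything else treated as ethernet) can be sketched as follows. The real change is a shell `case` statement in an event script; this Python version is only an illustration, and the `bond*` pattern and the returned labels are assumptions.

```python
from fnmatch import fnmatch


def link_check_method(iface):
    """Choose how to check link state for a network interface name."""
    if fnmatch(iface, "bond*"):
        return "bonding"   # bonded interfaces get their own check
    if fnmatch(iface, "ib*"):
        return "skip"      # no ethtool/mii_tool on infiniband devices
    return "ethtool"       # assume everything else is ethernet


for name in ("bond0", "ib0", "eth1"):
    print(name, link_check_method(name))
```

Skipping the check for `ib*` reflects the commit's note that there is no easy way to map `ibv_devinfo` port state to an interface name.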
Ronnie Sahlberg
e365b01cef add documentation on how to set up ha-iscsi with ctdb
(This used to be ctdb commit 1060af64efcba7d2bc8f2187a0005b8c18086017)
2008-02-06 19:08:03 +11:00
Ronnie Sahlberg
2a0e73bff0 add monitoring of iscsi to the eventscript
(This used to be ctdb commit e190c4d71c0b54f4c6615258986770eba15f335d)
2008-02-06 14:26:35 +11:00
Ronnie Sahlberg
7ceb256412 update ctdb revision
(This used to be ctdb commit ec0c7b55d131ad37b5b1b918c886fcb07d85a9e6)
2008-02-06 14:07:53 +11:00
Ronnie Sahlberg
64b6df09a0 update ctdb version
change flags for 41.httpd

(This used to be ctdb commit 88527a4a5423014f9911fa6061632215e153eb7e)
2008-02-06 14:00:04 +11:00
Ronnie Sahlberg
55efef3237 add an eventscript to start/stop iscsi
(This used to be ctdb commit 1aecd8c9dc2855c40c9182f30e4e71bdae5705e3)
2008-02-06 12:41:00 +11:00
Andrew Tridgell
275cd68867 nicer use of structures and use isalpha()
(This used to be ctdb commit 19b5fbcd16596a4b6c22056585dd4bd988db3db7)
2008-02-05 10:36:06 +11:00
Ronnie Sahlberg
3f56526037 Specify and print debuglevels by name and not by number
(This used to be ctdb commit 79ad830294b8b677fbd0c5ad7ed6fbde71f74f8d)
2008-02-05 10:26:23 +11:00
Ronnie Sahlberg
a7aced27fd Merge branch 'master' of git://git.samba.org/tridge/ctdb
(This used to be ctdb commit 08164957f948f0c9f604c260ccf658df9b3440b7)
2008-02-04 20:28:19 +11:00