1
0
mirror of https://github.com/samba-team/samba.git synced 2025-01-26 10:04:02 +03:00

1612 Commits

Author SHA1 Message Date
Michael Adam
951efa1097 ctdb-vacuum: rename private->private_data in vacuum_traverse
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2014-03-06 11:31:11 +11:00
Michael Adam
01f359cafc ctdb-vacuum: extract check for full vacuum run out of ctdb_vacuum_db_full()
This is more consistent.

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2014-03-06 11:31:11 +11:00
Michael Adam
c88fd19714 ctdb-vacuum: add consistency check for counts to ctdb_vacuum_db_fast()
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2014-03-06 11:31:11 +11:00
Michael Adam
5d5907c7cf ctdb-vacuum: use tdb_parse_record instead of tdb_fetch in delete_queue_traverse()
this spares malloc and free

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2014-03-06 11:31:11 +11:00
Michael Adam
fe68b3c494 ctdb-vacuum: simplify delete_record_traverse() - free treats NULL
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2014-03-06 11:31:11 +11:00
Michael Adam
593bddf2e8 ctdb-vacuum: simplify delete_queue_traverse() - free treats NULL pointers.
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2014-03-06 11:31:10 +11:00
Michael Adam
24bec3d31b ctdb-vacuum: reduce indentation in delete_queue_traverse
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2014-03-06 11:31:10 +11:00
Michael Adam
48f2d11588 ctdb-vacuum: treat value 0 of tunable RepackLimit as turned off.
I.e. when RepackLimit is set to 0, no size of the freelist
should trigger a repack in vacuuming.

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2014-03-06 11:31:10 +11:00
Michael Adam
af5568b267 ctdb-vacuum: fix treatment of remaining records and statistics in delete_record_traverse()
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2014-03-06 11:31:10 +11:00
Michael Adam
b4e0b01a8c ctdb-vacuum: cast freelist_size in comparison.
At this point, it is >= 0 anyways.

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2014-03-06 11:31:10 +11:00
Michael Adam
6a46a25530 ctdb-vacuum: improve output of delete list statistics
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2014-03-06 11:31:09 +11:00
Amitay Isaacs
fb2631f5df ctdb-daemon: Do not support connection tracking if there are no public IPs
CTDB tracks connections to be able to send tickle ACKs and gratuitous
ARPs.  When there are no public IPs, there is no need for tickle ACKs
and gratuitous ARPs.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Tue Mar  4 03:01:38 CET 2014 on sn-devel-104
2014-03-04 03:01:38 +01:00
Amitay Isaacs
7d05baa96b ctdb-recoverd: Check if callback function is registered before calling
Fix suggested by by Kevin Osborn <kosborn@overlandstorage.com>.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Feb 27 13:54:59 CET 2014 on sn-devel-104
2014-02-27 13:54:59 +01:00
Amitay Isaacs
026996550d ctdb-daemon: After updating tickles on other nodes, set update flag to false
tcp_update_flag is set to true whenever tickles are added or deleted.
This flag is used to determine whether or not to send tickles list to
other nodes.  Once tickles list is sent to other nodes successfully,
set tcp_update_flag to false, so ctdbd does not keep sending same tickles
list every TickleUpdateInterval (20 seconds).

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2014-02-27 11:49:39 +01:00
Martin Schwenke
0723fedced ctdb-daemon: Implement ctdb_control_startup()
This doesn't implement what was recommended.  That would require
careful error handling, probably with a fallback to this code anyway.
This is simple and does no worse that the current code.  That is, the
new node is updated on the next call to tdb_update_tcp_tickles().

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2014-02-27 11:49:39 +01:00
Amitay Isaacs
75ca1216a6 ctdb-daemon: Fix whitespaces
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2014-02-27 11:49:39 +01:00
Amitay Isaacs
f2cd999189 ctdb-daemon: Always talloc tickle array off vnn instead of ctdb->nodes
This fixes ctdb crash reported in bug #10366.
Fix suggested by Kevin Osborn <kosborn@overlandstorage.com>.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2014-02-27 11:49:39 +01:00
Martin Schwenke
24b734f084 ctdb-recoverd: LCP2 cleanups
* Remove unnecessary candimbl parameter.

  This parameter can be cheaply calculated in
  lcp2_failback_candidate().  The compiler will probably do an
  excellent job optimising it.  :-)

* Clarify a debug statement

  This is much clearer than doing a complex recalculation of a known
  value.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2014-02-19 12:04:47 +11:00
Martin Schwenke
9e5ef44f32 ctdb-recoverd: Optimise check for rebalance candidates in LCP2
Currently this can be checked many times.  However, there's no point
calling the rebalance/failback code at all if there are no rebalance
candidates.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2014-02-19 12:04:47 +11:00
Michael Adam
0535f73c3a ctdb:vacuum: move retrieval of freelist to after vacuum run
The fast vacuum run may have increased the freelist size.

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Feb 14 03:15:30 CET 2014 on sn-devel-104
2014-02-14 03:15:30 +01:00
Michael Adam
bd474985b1 ctdb:vacuum: fix debug message typo in add_record_to_delete_list()
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2014-02-14 11:14:31 +11:00
Martin Schwenke
f1a20d748f ctdb-recoverd: Fix a bug in the LCP2 rebalancing code
srcimbl gets changed on every iteration of the loop.  The value that
should be stored for the new imbalance of the source node is
minsrcimbl.

To help diagnose this, added some extra debug that can be left in.

The extra debug changes the output of a couple of tests.  Note that
the resulting IP allocations in those tests is unchanged - only the
debug output is changed.

Also add some new tests that illustrates the bug.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2014-02-13 02:03:24 +01:00
Amitay Isaacs
276b233c00 ctdb-daemon: Consult CTDB_DEBUG_HUNG_SCRIPT variable before running debug script
If CTDB_DEUB_HUNG_SCRIPT is set, use that instead of the default
debug script.  This code was dropped by mistake in commit
18c1f432102f1a5093927be9276d001180539e50.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Wed Feb 12 08:47:47 CET 2014 on sn-devel-104
2014-02-12 08:47:47 +01:00
Amitay Isaacs
1566790e5a ctdb-daemon: Return negative status only if there are known errors
If event script does not exist or does not have execute permissions, then
return negative errno to distinguish from the exit errors of event script.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
2014-01-31 15:57:49 +11:00
Martin Schwenke
e5778cc172 ctdb/daemon: reloadips must register state of asynchronous controls
Otherwise ctdb_client_async_wait() is a no-op.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2014-01-31 13:36:04 +11:00
Amitay Isaacs
eee450fec2 ctdb-daemon: Simplify listing event scripts using scandir
Instead of using RB tree for sorting the script names (incorrectly since
it's only using the leading numbers in the script name), use scandir
with alphasort.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Tue Jan 21 06:41:25 CET 2014 on sn-devel-104
2014-01-21 06:41:25 +01:00
Amitay Isaacs
cbffbb7c2f ctdb-daemon: Do not run monitor event if any other event is already running
Any currently running monitor events are cancelled if any other events
are scheduled.  However, this does not stop monitor events to be run
when other events are already running.

Keep track of the number of active events and schedule monitor event
only if there are no active events.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2014-01-21 11:30:41 +11:00
Martin Schwenke
e6304d1e1a ctdb/daemon: Untangle serialisation of 1st recovery -> startup -> monitor
At the moment ctdb_check_healthy() is overloaded to wait until the
first recovery is complete, handle the "startup" event and also
actually handle monitoring.  This is untidy and hard to follow.

Instead, have the daemon explicitly wait for 1st recovery after the
"setup" event.  When first recovery is complete, schedule a function
to handle the "startup" event.  When the "startup" event succeeds then
explicitly enable monitoring.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2014-01-17 17:59:41 +11:00
Martin Schwenke
e77d5f99e3 ctdb/recoverd: Do not refuse disabling takeover runs on inactive nodes
Failure might be expected when disabling takeover runs on banned
nodes, since they might be suffering from performance problems or
similar.  More broadly, administrators who reconfigure a cluster that
isn't in a happy state aren't necessarily doing something sensible.

However, allowing takeover runs to be disabled on inactive nodes stops
reconfiguration of stopped nodes.  This is probaby an unreasonable
limitation, so drop it.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2014-01-17 17:59:19 +11:00
Martin Schwenke
a955d0bedc ctdb-recoverd: Ignore failed ipreallocated controls to inactive nodes
Currently timeouts for controls to inactive nodes can cause banning
credits to be applied.  This should not happen.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2014-01-17 17:59:08 +11:00
Amitay Isaacs
a92fd11ad1 ctdb-daemon: Remove ctdb_fork_with_logging()
This function has been replaced with ctdb_vfork_with_logging().

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Jan 16 04:05:35 CET 2014 on sn-devel-104
2014-01-16 04:05:35 +01:00
Amitay Isaacs
97575e1ba0 ctdb-daemon: Remove unused code to run eventscripts
Eventscripts are now executed using a helper.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2014-01-16 12:11:38 +11:00
Amitay Isaacs
18c1f43210 ctdb-daemon: Replace ctdb_fork_with_logging with ctdb_vfork_with_logging (part 2)
Use ctdb_event_helper to run debug-hung-script.sh.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2014-01-16 12:11:38 +11:00
Amitay Isaacs
d86662a925 ctdb-daemon: Replace ctdb_fork_with_logging with ctdb_vfork_with_logging (part 1)
Use ctdb_event_helper to run eventscripts.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2014-01-16 12:11:37 +11:00
Amitay Isaacs
69324b61f0 ctdb-daemon: Add helper process to execute event scripts
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2014-01-16 12:11:37 +11:00
Amitay Isaacs
2879404388 ctdb-daemon: Add ctdb_vfork_with_logging()
This will be used to spawn lightweight helper processes to run
eventscripts.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2014-01-16 11:41:12 +11:00
Amitay Isaacs
7aa20ccb5c ctdb-daemon: No need to call event scripts with CTDB_CALLED_BY_USER
This was added to support external monitoring using CTDB event scripts.
However, it was never used.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2014-01-16 11:41:12 +11:00
Amitay Isaacs
bafa467021 ctdb-daemon: Deprecate RELOAD and STATUS events
These events have never been used.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2014-01-16 11:41:12 +11:00
Martin Schwenke
44a0466ac1 ctdb-recoverd: Only respond to currently queued ipreallocated requests
Otherwise new requests can come in during the latter parts of the
takeover run when the IP allocation algorithm has already run, and the
new requests will be dequeued even though they haven't really be
processed.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
2013-11-27 18:46:16 +01:00
Martin Schwenke
efc77ba6ac ctdb-recoverd: For persistent databases a sequence number of 0 is valid
Otherwise recovery ends up done by RSN when it is unnecessary.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
2013-11-27 18:46:16 +01:00
Amitay Isaacs
4ea721b2c1 ctdb-locking: Use vfork instead of fork to exec helpers
There is a significant overhead using fork() over vfork(), specially
when the child process execs a helper.  The overhead is in memory space
and time.

    # strace -c ./test_fork 1024 200
    count=1024, size=204800, total=200M
    failed fork=0
    time for fork() = 4879.597000 us
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
    100.00    4.543321        3304      1375       375 clone
      0.00    0.000071           0      1033           mmap
      0.00    0.000000           0         1           read
      0.00    0.000000           0         3           write
      0.00    0.000000           0         2           open
      0.00    0.000000           0         2           close
      0.00    0.000000           0         3           fstat
      0.00    0.000000           0         3           mprotect
      0.00    0.000000           0         1           munmap
      0.00    0.000000           0         3           brk
      0.00    0.000000           0         1         1 access
      0.00    0.000000           0         1           execve
      0.00    0.000000           0         1           arch_prctl
    ------ ----------- ----------- --------- --------- ----------------
    100.00    4.543392                  2429       376 total

    # strace -c ./test_vfork 1024 200
    count=1024, size=204800, total=200M
    failed fork=0
    time for fork() = 82.041000 us
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     96.47    0.001204           1      1000           vfork
      3.53    0.000044           0      1033           mmap
      0.00    0.000000           0         1           read
      0.00    0.000000           0         3           write
      0.00    0.000000           0         2           open
      0.00    0.000000           0         2           close
      0.00    0.000000           0         3           fstat
      0.00    0.000000           0         3           mprotect
      0.00    0.000000           0         1           munmap
      0.00    0.000000           0         3           brk
      0.00    0.000000           0         1         1 access
      0.00    0.000000           0         1           execve
      0.00    0.000000           0         1           arch_prctl
    ------ ----------- ----------- --------- --------- ----------------
    100.00    0.001248                  2054         1 total

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
2013-11-27 18:46:16 +01:00
Amitay Isaacs
0eeb73c187 ctdb-locking: Update current lock statistics when lock is scheduled
When a child process is created for a lock request, the current locks
statistics should be updated immediately.  This will provide accurate
information on number of active lock requests.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
2013-11-27 18:46:16 +01:00
Amitay Isaacs
3879e9991f ctdb-locking: Do not merge multiple lock requests to avoid unfair scheduling
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
2013-11-27 18:46:16 +01:00
Amitay Isaacs
094f34e9bf ctdb-locking: Implement active lock requests limit per database
This limit was currently a global limit and not per database.  This
prevents any database freeze lock requests from getting scheduled if
the global limit was reached.

Only individual record requests should be limited and database freeze
requests should always get scheduled.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
2013-11-27 18:46:16 +01:00
Martin Schwenke
028fe930b6 ctdb-recoverd: Fix backward compatibility for CTDB_SRVID_TAKEOVER_RUN
When running a mixed version cluster, compatibility with older
versions was was broken during recent refactorisation.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
2013-11-27 18:46:16 +01:00
Martin Schwenke
6fbf399191 ctdb-recoverd: A node refuses to play against itself
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
2013-11-27 18:46:16 +01:00
Martin Schwenke
2038d166ad ctdb-recoverd: Remove duplicate code to update flags during recovery
This also happens earlier in do_recovery() and the nodemap is not
updated after that, so this update is redundant.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
2013-11-27 18:46:16 +01:00
Amitay Isaacs
6d1b74f052 ctdb-server: Coverity fixes
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
2013-11-19 17:13:03 +01:00
Amitay Isaacs
41d37058ca tunables: Remove obsolete tunables
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit ca5fc3431573c44d55d09d987c715fb53756fc1f)
2013-10-30 15:37:11 +11:00
Martin Schwenke
62076d3089 recoverd: Rebalancing should be done regardless tunable
Rebalance target nodes should be set even if a deferred rebalance is
not configured.  The user can explicitly cause a takeover run.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit afd9b51644af074752d74c412cb4e7ec2eba2c69)
2013-10-30 12:19:49 +11:00