1
0
mirror of https://github.com/samba-team/samba.git synced 2024-12-30 13:18:05 +03:00
Commit Graph

617 Commits

Author SHA1 Message Date
Amitay Isaacs
af3a168ed3 ctdb-util: Do not use mlockall() on AIX
Memory lockdown causes recovery daemon to crash on AIX.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2014-03-04 01:02:11 +01:00
Amitay Isaacs
b8c6bcc365 ctdb-common: mkdir_p should not try to create .
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2014-01-16 11:41:12 +11:00
Amitay Isaacs
d21919c8b4 ctdb-common: Refactor code to keep track of child processes
This code can then be used to track child processes created with vfork().

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
2013-11-27 18:46:16 +01:00
Amitay Isaacs
7562701153 ctdb-common: Coverity fixes
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
2013-11-19 17:13:05 +01:00
Martin Schwenke
bd73e017b0 common: New function ctdb_mkdir_p_or_die()
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 7b971df79b0b63f83555205eacf48d49ca3a273a)
2013-10-25 12:06:06 +11:00
Martin Schwenke
c07e3830b3 common: New function mkdir_p()
Behaves like mkdir -p.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit afe2145d91725daf1399f0a24f1cddcf65f0ec31)
2013-10-25 12:06:06 +11:00
Amitay Isaacs
a42b6e1cad common/util: Use AIX specific code for setting high priority for CTDB daemon
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 7764cf67a61bbf1caad5aa8e2d75a262b9da654c)
2013-10-22 14:33:56 +11:00
Amitay Isaacs
e229ff6133 common: Fix setting of debug level in the client code
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 299fa487549e36572b757852d21471f9e23f6e8f)
2013-10-04 15:15:35 +10:00
Martin Schwenke
066b671de0 utils: Make debug level strings case-insensitive
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit c700dd0c7b6b43b61b3e231643b5d7cbe2f9592a)
2013-09-25 14:35:31 +10:00
Martin Schwenke
d30e269ecc common: Make parse_ip() valgrind-clean
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit c0bb147ca09e82019b05ec22995623cffc3184e2)
2013-09-11 15:35:38 +10:00
Amitay Isaacs
a61a4b1254 common/io: Limit the queue buffer size for fair scheduling via tevent
If we process all the data available in a socket buffer, CTDB can stay busy
processing lots of packets via immediate event mechanism in tevent.  After
processing an immediate event, tevent returns without epoll_wait.  So as long
as there are immediate events, tevent will never poll other FDs.  CTDB will
report this as "Event handling took xx seconds" warning.  This is misleading
since CTDB is very busy processing packets, but never gets to the point of
polling FDs.

The improvement in socket handling made it worse when handling traverse
control.  There were lots of packets filled in the socket buffer quickly and
CTDB stayed busy processing those packets and not polling other FDs and timer
events.  This can lead to controls timing out and in worse case other nodes
marking busy node as disconnected.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 92939c1178d04116d842708bc2d6a9c2950e36cc)
2013-08-22 14:08:52 +10:00
Amitay Isaacs
cfb7f74fa2 Revert "common/io: Keep queue buffer size multiple of 4K"
This reverts commit 5e9b1a7e24d058ff88aaa0563db36a804e866fa9.

This is not the best approach.  Allowing queue buffer size to grow
indefinitely causes large number of CTDB packets to be queued up very
quickly which when processed via immediate events will block CTDB from
processing events from other FDs.  If there are immediate events queued
up, tevent will never process any of the FDs till all immediate events
are processed.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit d8b094e804efc53fae9f44c6ef961b7b5797d290)
2013-08-22 14:08:52 +10:00
Amitay Isaacs
1467b666f2 Revert "LACOUNT: Add back lacount mechanism to defer migrating a fetched/read copy until after default of 20 consecutive requests from the same node"
This reverts commit 035c0d981bde8c0eee8b3f24ba8e2dc817e5b504.

This is a premature optimization.  Record can bounce between nodes
very quickly if it is a contended record.  There is no need to hold a
record on a node unnecessarily.  In case record contention becomes bad,
enabling sticky records on a database is a better idea.

Conflicts:
	include/ctdb_private.h
	server/ctdb_tunables.c

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit ac417b0003f0116f116834ad2ac51482d25cfa0d)
2013-08-22 14:08:52 +10:00
Amitay Isaacs
27fd34e9ff ctdbd: For volatile databases, write an empty record with rsn=0 only on dmaster
Empty record with rsn=0 should not be written on any other node other than
dmaster.  This is however not true for persistent databases.  So currently
apply the check only for volatile databases.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit df83ae7a047dab4803e0d94b1c11df48ae17ca96)
2013-08-22 14:08:52 +10:00
Michael Adam
aa1360aeb2 util: In passing the code, fix a space vs. tab in set_close_on_exec().
Signed-off-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit f9556a6f1fe0046308c8b363e6dcaf3f7ce6f2b7)
2013-08-19 17:12:33 +02:00
Amitay Isaacs
3c0a477911 common: Null terminate process name string so valgrind doesn't complain
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 1c9025fdd08d1cea342af7487d0123015e08831b)
2013-08-14 16:55:51 +10:00
Amitay Isaacs
d349b56e2d common/io: Keep queue buffer size multiple of 4K
Currently queue buffer size is realloc'd every time we need to extend the
buffer.  Small increments can cause memory fragmentation.  Instead always
extend buffer in multiples of 4K.  This should reduce multiple talloc_realloc
calls when there are lots of packets in the socket buffer.

Also, if queue buffer has grown larger than 64K, throw away the buffer once
all the requests in the queue have been processed.  That way queue does not
hold on to large buffers.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 5e9b1a7e24d058ff88aaa0563db36a804e866fa9)
2013-08-09 11:07:37 +10:00
Sumit Bose
3dc280f5b0 IPv6 neighbor solicit cleanup
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit a81edf7eb908659a379f0cb55fd5d04551dc2c37)
2013-07-11 15:16:55 +10:00
Sumit Bose
157f1cfefd Fixes for various issues found by Coverity
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 05bfdbbd0d4abdfbcf28e3930086723508b35952)
2013-07-11 15:16:55 +10:00
Amitay Isaacs
1c21f37e57 ctdbd: Set process names for child processes
This helps distinguish processes in process list in top, perf, etc.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 2493f57ce268d6fe7e4c40a87852c347fd60d29e)
2013-07-10 14:33:19 +10:00
Amitay Isaacs
500b26e48f common/system: Add ctdb_set_process_name() function
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit fc3689c977f48d7988eed0654fb8e5ce4b8bfc8b)
2013-07-10 14:33:19 +10:00
Martin Schwenke
dbd1759eae util: New function ctdb_die()
This is like ctdb_fatal() but exits cleanly without dumping core or
generating a backtrace.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit c0a9456692c88a7a5542cd893d8f326524d3f94e)
2013-07-05 15:52:33 +10:00
Amitay Isaacs
6391f61fbc build: Fix compiler warnings for uninitialized variables
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 5408c5c4050539e5aa06a5e82ceb63a6cb5cef0c)
2013-07-04 20:43:52 +10:00
Mathieu Parent
d82b9ae410 build: Fix tdb.h path to enable building with system TDB library
(This used to be ctdb commit f8bf99de3a5f56be67aaa67ed836458b1cf73e86)
2013-06-14 16:45:27 +10:00
Martin Schwenke
6d9667f01c ctdbd: Add new runstate CTDB_RUNSTATE_FIRST_RECOVERY
This adds more serialisation to the startup, ensuring that the
"startup" event runs after everything to do with the first recovery
(including the "recovered" event).

Given that it now takes longer to get to the "startup" state, the
initscript needs to wait until ctdbd gets to "first_recovery".

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit ed6814ff0a59ddbb1c1b3128b505380f60d7aeb7)
2013-05-24 14:08:07 +10:00
Martin Schwenke
63577c96db ctdbd: Replace ctdb->done_startup with ctdb->runstate
This allows states, including startup and shutdown states, to be
clearly tracked.  This doesn't include regular runtime "states", which
are handled by node flags.

Introduce new functions ctdb_set_runstate(), runstate_to_string() and
runstate_from_string().

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 8076773a9924dcf8aff16f7d96b2b9ac383ecc28)
2013-05-24 14:08:06 +10:00
Amitay Isaacs
03a96f280f logging: Make sure ringbuffer messages are terminated with a newline
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit dbb7c550133c92292a7212bdcaaa79f399b0919b)
2013-05-24 09:15:08 +10:00
Amitay Isaacs
5145b9bcec logging: Fix a bug in ringbuffer
When ringbuffer is full, it does not return any entries.  Simplify
ringbuffer logic by keeping track of number of log entries rather than
last entry.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Pair-Programmed-With: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 939d12b96a0cbebbe6269fa2b14f584058dd6174)
2013-05-23 16:18:23 +10:00
Martin Schwenke
7aa0a49cbd util: ctdb_fork() should call ctdb_set_child_info()
For now we pass NULL as the child name.  Later we'll give ctdb_fork()
and friends an extra argument and pass that through.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit ba8866d40125bab06391a17d48ff06a4a9f9da89)
2013-04-18 13:18:29 +10:00
Martin Schwenke
4ede763f3b util: New functions ctdb_set_child_info() and ctdb_is_child_process()
Must be called by all child processes.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>

(This used to be ctdb commit 59b019a97aad9a731f9080ea5be14d0dbdfe03d6)
2013-04-18 13:18:29 +10:00
Volker Lendecke
d82336f1f3 common/messaging: Use the jenkins hash in ctdb_message
This give a better hash distribution

(This used to be ctdb commit f7f8bde2376f8180a0dca6d7b8d7d2a4a12f4bd8)
2013-04-05 13:13:08 +11:00
Volker Lendecke
a37033bfc9 common/messaging: use tdb_parse_record in message_list_db_fetch
This avoids malloc/free in a hot code path.

(This used to be ctdb commit c137531fae8f7f6392746ce1b9ac6f219775fc29)
2013-04-05 13:12:58 +11:00
Amitay Isaacs
9937adf0ca common/messaging: Abstract db related operations inside db functions
This simplifies the use of message indexdb API and abstracts tdb related code
inside the API.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit bf7296ce9b98563bcb8426cd035dbeab6d884f59)
2013-04-05 13:00:43 +11:00
Amitay Isaacs
8788e6318c common/messaging: Don't forget to free the result returned by tdb_fetch()
This fixes a memory leak in the messaging code.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 20be1f991dd75c2333c9ec9db226432a819f57ba)
2013-04-05 13:00:16 +11:00
Amitay Isaacs
96ad89f438 common/messaging: Free message list header if all message handlers are freed
This makes sure that even if the srvids are not deregistered, the header
structure is freed when the last message handler has been freed as a result of
client going away.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 4e1ec7412866f2d31c41de1bec0fbf788c03051b)
2013-04-05 12:59:25 +11:00
Amitay Isaacs
d4407a6516 common/io: For scheduling immediate events use tevent_schedule_immediate
tevent_schedule_immediate() is much more efficient at handling events that need
to be processed immediately rather than creating timed events with
timeval_zero().

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 11734be353a1e246163eda631d35dfe55d1d6fb1)
2013-03-06 15:32:37 +11:00
Amitay Isaacs
5d7efb4cf1 ctdbd: Add an index db for message list for faster searches
When CTDB is busy with lots of smbd, CTDB was spending too much time in
daemon_check_srvids() which searches a list of srvids in the registered
message handlers.  Using a hash based index significantly improves the
performance of search in a linked list.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 3e09f25d419635f6dd679b48fa65370f7860be7d)
2013-03-06 15:32:33 +11:00
Amitay Isaacs
a2abdc1353 common/io: Rewrite socket handling code to read all available data
This improves the processing of packets considerably.  It has been
observed that there can be as many as 10 packets in the socket buffer and
the current code of reading a single packet from a socket at a time is
not very optimal.  This change reads all the bytes from socket buffer and
then parses to extract multiple packets.  If there are multiple packets,
set up a timed event to process next packet.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit d788bc8f7212b7dc1587ae592242dc8c876f4053)
2013-02-19 17:18:21 +11:00
Martin Schwenke
689384a7b4 Logging: Fix breakage when freeing the log ringbuffer
Commit a82d3ec12f0fda16d6bfa8442a07595de897c10e broke fetching from
the log ringbuffer.  The solution there is still generally good: there
is no need to keep the ringbuffer in children created by
ctdb_fork()... except for those special children that are created to
fetch data from the ringbuffer!

Introduce a new function ctdb_fork_no_free_ringbuffer() that does
everything ctdb_fork() needs to do except free the ringbuffer (i.e. it
is the old ctdb_fork() function).  The new ctdb_fork() function just
calls that function and then frees the ringbuffer in the child.

This means all callers of ctdb_fork() have the convenience of having
the ringbuffer freed.  There are 3 special cases:

* Forking the recovery daemon.  We want to be able to fetch from the
  ringbuffer there.

* The ringbuffer fetching code.  Change the 2 calls in this code (main
  daemon, recovery daemon) to call ctdb_fork_no_free_ringbuffer()
  instead.

While we're here, clear the log ringbuffer when the recovery deamon is
forked, since it will contain a copy of the messages from the main
daemon.

Note to self: always test... even the most obvious patches...  ;-)

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 00db5fa00474f8a83f1aa3b603fd756cc9b49ff4)
2013-02-07 11:26:29 +11:00
Martin Schwenke
8dc3219e9b Logging: Free the ringbuffer in child processes created with ctdb_fork()
At the moment the log ringbuffer is duplicated in every child process.
Althought it is copy-on-write we want to see if it is contributing to
out-of-memory situations when there are a lot of children.

The ringbuffer isn't accessible from any of the children anyway...

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a82d3ec12f0fda16d6bfa8442a07595de897c10e)
2013-02-05 12:40:30 +11:00
Martin Schwenke
f2ba0e8a65 Logging: New function ctdb_log_ringbuffer_free()
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a4f622e85168f59417c11705f1734e0352e1d44a)
2013-02-05 12:40:30 +11:00
Mathieu Parent
264f847631 common: Don't lie on unimplemented gratuitous arp
Signed-off-by: Mathieu Parent <math.parent@gmail.com>

(This used to be ctdb commit b054193d1d19a8eef998fa690899501f79badb8a)
2013-01-22 18:04:00 +11:00
Mathieu Parent
fd8d3cfeba common: FreeBSD+kFreeBSD: Implement get_process_name (same as in Linux)
Signed-off-by: Mathieu Parent <math.parent@gmail.com>

(This used to be ctdb commit 258092aaf6b7a9bdc14f0fb35e8bd7f7dc742b3f)
2013-01-22 18:03:44 +11:00
Mathieu Parent
384b9b2a7b common: Detailed platform-specific FIXME
Signed-off-by: Mathieu Parent <math.parent@gmail.com>

(This used to be ctdb commit d202b2fdd4fd70172e5e44583627b57a1b7ad2ed)
2013-01-22 18:03:41 +11:00
Martin Schwenke
db5dfe891c recoverd: Add CTDB_SRVID_GETLOG and CTDB_SRVID_CLEARLOG
These support getting and clearing logs from the ring-buffer in the
recovery daemon.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit cbca233d1e03b2410e0bb63b936328d4a8b3c7b4)
2012-10-22 11:15:36 +11:00
Amitay Isaacs
1011d10a51 common: Add routines to get process and lock information
Currently these functions are implemented only for Linux.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit be4051326b0c6a0fd301561af10fd15a0e90023b)
2012-10-20 02:48:44 +11:00
Martin Schwenke
8d7562f3f8 common: Debug ctdb_addr_to_str() using new function ctdb_external_trace()
We've seen this function report "Unknown family, 0" and then CTDB
disappeared without a trace.  If we can reproduce it then this might
help us to debug it.

The idea is that you do something like the following in /etc/sysconfig/ctdb:

  export CTDB_EXTERNAL_TRACE="/etc/ctdb/config/gcore_trace.sh"

When we hit this error than we call out to gcore to get a core file so
we can do forensics.  This might block CTDB for a few seconds.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 7895bc003f087ab2f3181df3c464386f59bfcc39)
2012-10-18 20:05:42 +11:00
Martin Schwenke
75347b8668 util: ctdb_fork() closes all sockets opened by the main daemon
Do some other hosuekeeping including stopping tevent.

Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 212298279557a2833ef0f81809b4a5cdac72ca02)
2012-10-05 11:56:12 +10:00
Amitay Isaacs
c44a97dfa1 util: Do not lock down memory when running with local daemons
Thanks to Ronnie for highlighting the issue of memory lockdown on AIX.
Fix typo, use getuid and not getpid.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 21a5cbf9518fafc610939f14874371a52b1dc8b3)
2012-07-26 22:01:50 +10:00
Amitay Isaacs
51c57e87e5 util: Do not try to lockdown memory when running in local daemons mode
Signed-off-by: Amitay Isaacs <amitay@gmail.com>

(This used to be ctdb commit 25f84797a64a683c303b04057aa8113e9fc47c49)
2012-07-16 12:12:05 +10:00