This code can then be used to track child processes created with vfork().
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
Behaves like mkdir -p.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit afe2145d91725daf1399f0a24f1cddcf65f0ec31)
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit c700dd0c7b6b43b61b3e231643b5d7cbe2f9592a)
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit c0bb147ca09e82019b05ec22995623cffc3184e2)
If we process all the data available in a socket buffer, CTDB can stay busy
processing lots of packets via the immediate event mechanism in tevent. After
processing an immediate event, tevent returns without calling epoll_wait. So
as long as there are immediate events, tevent will never poll other FDs. CTDB
will report this as an "Event handling took xx seconds" warning. This is
misleading since CTDB is very busy processing packets, but never gets to the
point of polling FDs.
The improvement in socket handling made this worse when handling traverse
controls. Lots of packets filled the socket buffer quickly, and CTDB stayed
busy processing those packets instead of polling other FDs and timer events.
This can lead to controls timing out and, in the worst case, other nodes
marking the busy node as disconnected.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 92939c1178d04116d842708bc2d6a9c2950e36cc)
This reverts commit 5e9b1a7e24d058ff88aaa0563db36a804e866fa9.
This is not the best approach. Allowing the queue buffer size to grow
indefinitely causes a large number of CTDB packets to be queued up very
quickly, which, when processed via immediate events, will block CTDB from
processing events from other FDs. If there are immediate events queued
up, tevent will not process any of the FDs until all immediate events
are processed.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit d8b094e804efc53fae9f44c6ef961b7b5797d290)
This reverts commit 035c0d981bde8c0eee8b3f24ba8e2dc817e5b504.
This is a premature optimization. A record can bounce between nodes
very quickly if it is contended. There is no need to hold a
record on a node unnecessarily. If record contention becomes bad,
enabling sticky records on a database is a better idea.
Conflicts:
include/ctdb_private.h
server/ctdb_tunables.c
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit ac417b0003f0116f116834ad2ac51482d25cfa0d)
An empty record with rsn=0 should not be written on any node other than
the dmaster. This is however not true for persistent databases, so for now
apply the check only to volatile databases.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit df83ae7a047dab4803e0d94b1c11df48ae17ca96)
Currently the queue buffer is realloc'd every time we need to extend it.
Small increments can cause memory fragmentation. Instead, always extend the
buffer in multiples of 4K. This should reduce the number of talloc_realloc
calls when there are lots of packets in the socket buffer.
Also, if the queue buffer has grown larger than 64K, throw away the buffer
once all the requests in the queue have been processed. That way the queue
does not hold on to large buffers.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 5e9b1a7e24d058ff88aaa0563db36a804e866fa9)
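The two sizing rules in this message (grow in 4K multiples, release once drained if larger than 64K) can be sketched as two small helpers. The names and macros below are hypothetical and the real code operates on a talloc'd buffer; this only shows the arithmetic.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_BUFFER_CHUNK_SKETCH ((size_t)4 * 1024)  /* growth granularity */
#define QUEUE_BUFFER_MAX_SKETCH   ((size_t)64 * 1024) /* release threshold */

/* Round the number of bytes needed up to the next multiple of 4K. */
static size_t queue_buffer_round_up_sketch(size_t needed)
{
	/* Guard against overflow in the addition below. */
	assert(needed <= (size_t)-1 - (QUEUE_BUFFER_CHUNK_SKETCH - 1));
	return (needed + QUEUE_BUFFER_CHUNK_SKETCH - 1)
		/ QUEUE_BUFFER_CHUNK_SKETCH * QUEUE_BUFFER_CHUNK_SKETCH;
}

/*
 * Decide whether to throw the buffer away: only when nothing is left
 * to process and the allocation has grown beyond 64K.
 */
static int queue_buffer_should_release_sketch(size_t allocated,
					      size_t pending_bytes)
{
	return pending_bytes == 0 && allocated > QUEUE_BUFFER_MAX_SKETCH;
}
```

A request for 4097 bytes therefore allocates 8192, and a 68K buffer is dropped only after the queue is empty.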
This helps distinguish processes in process list in top, perf, etc.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 2493f57ce268d6fe7e4c40a87852c347fd60d29e)
This is like ctdb_fatal() but exits cleanly without dumping core or
generating a backtrace.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit c0a9456692c88a7a5542cd893d8f326524d3f94e)
This adds more serialisation to the startup, ensuring that the
"startup" event runs after everything to do with the first recovery
(including the "recovered" event).
Given that it now takes longer to get to the "startup" state, the
initscript needs to wait until ctdbd gets to "first_recovery".
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit ed6814ff0a59ddbb1c1b3128b505380f60d7aeb7)
This allows states, including startup and shutdown states, to be
clearly tracked. This doesn't include regular runtime "states", which
are handled by node flags.
Introduce new functions ctdb_set_runstate(), runstate_to_string() and
runstate_from_string().
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 8076773a9924dcf8aff16f7d96b2b9ac383ecc28)
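A pair of functions like runstate_to_string() and runstate_from_string() is commonly built on one lookup table. The sketch below assumes that shape. The enum, the table, and most state names are illustrative; only "first_recovery" and "startup" appear in these commit messages, and the real signatures may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative run states; not the real CTDB enum. */
enum runstate_sketch {
	RUNSTATE_SKETCH_UNKNOWN,
	RUNSTATE_SKETCH_INIT,
	RUNSTATE_SKETCH_FIRST_RECOVERY,
	RUNSTATE_SKETCH_STARTUP,
	RUNSTATE_SKETCH_RUNNING,
	RUNSTATE_SKETCH_SHUTDOWN
};

/* Single table used for both directions of the mapping. */
static const struct {
	enum runstate_sketch state;
	const char *name;
} runstate_names_sketch[] = {
	{ RUNSTATE_SKETCH_UNKNOWN,        "unknown" },
	{ RUNSTATE_SKETCH_INIT,           "init" },
	{ RUNSTATE_SKETCH_FIRST_RECOVERY, "first_recovery" },
	{ RUNSTATE_SKETCH_STARTUP,        "startup" },
	{ RUNSTATE_SKETCH_RUNNING,        "running" },
	{ RUNSTATE_SKETCH_SHUTDOWN,       "shutdown" }
};

#define RUNSTATE_COUNT_SKETCH \
	(sizeof(runstate_names_sketch) / sizeof(runstate_names_sketch[0]))

static const char *runstate_to_string_sketch(enum runstate_sketch state)
{
	size_t i;

	for (i = 0; i < RUNSTATE_COUNT_SKETCH; i++) {
		if (runstate_names_sketch[i].state == state) {
			return runstate_names_sketch[i].name;
		}
	}
	return "unknown";
}

static enum runstate_sketch runstate_from_string_sketch(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < RUNSTATE_COUNT_SKETCH; i++) {
		if (strcmp(runstate_names_sketch[i].name, name) == 0) {
			return runstate_names_sketch[i].state;
		}
	}
	return RUNSTATE_SKETCH_UNKNOWN;
}
```

Keeping one table means a new state only has to be added in one place for both conversions to stay consistent.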
When the ringbuffer is full, it does not return any entries. Simplify
the ringbuffer logic by keeping track of the number of log entries rather
than the last entry.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Pair-Programmed-With: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 939d12b96a0cbebbe6269fa2b14f584058dd6174)
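Tracking a count removes the usual ambiguity of first/last indices, where "full" and "empty" can look the same. A minimal sketch of that idea follows, with hypothetical names and an int payload standing in for log entries; it is not the CTDB ringbuffer code.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE_SKETCH 4

struct ring_sketch {
	int entries[RING_SIZE_SKETCH];
	size_t first; /* index of the oldest entry */
	size_t count; /* number of valid entries, 0..RING_SIZE_SKETCH */
};

/* Append a value; when full, the oldest entry is overwritten. */
static void ring_add_sketch(struct ring_sketch *r, int value)
{
	size_t slot;

	assert(r != NULL);
	slot = (r->first + r->count) % RING_SIZE_SKETCH;
	r->entries[slot] = value;
	if (r->count < RING_SIZE_SKETCH) {
		r->count++;
	} else {
		/* Full: slot was the oldest entry, so advance first. */
		r->first = (r->first + 1) % RING_SIZE_SKETCH;
	}
}

/* Copy up to max entries, oldest first; returns the number copied. */
static size_t ring_collect_sketch(const struct ring_sketch *r,
				  int *out, size_t max)
{
	size_t n, i;

	assert(r != NULL && out != NULL);
	n = r->count < max ? r->count : max;
	for (i = 0; i < n; i++) {
		out[i] = r->entries[(r->first + i) % RING_SIZE_SKETCH];
	}
	return n;
}
```

Because count is explicit, a full buffer reports RING_SIZE_SKETCH entries rather than none.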
For now we pass NULL as the child name. Later we'll give ctdb_fork()
and friends an extra argument and pass that through.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
(This used to be ctdb commit ba8866d40125bab06391a17d48ff06a4a9f9da89)
Must be called by all child processes.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
(This used to be ctdb commit 59b019a97aad9a731f9080ea5be14d0dbdfe03d6)
This simplifies the use of message indexdb API and abstracts tdb related code
inside the API.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit bf7296ce9b98563bcb8426cd035dbeab6d884f59)
This fixes a memory leak in the messaging code.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 20be1f991dd75c2333c9ec9db226432a819f57ba)
This makes sure that even if the srvids are not deregistered, the header
structure is freed when the last message handler has been freed as a result of
client going away.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 4e1ec7412866f2d31c41de1bec0fbf788c03051b)
tevent_schedule_immediate() is much more efficient at handling events that need
to be processed immediately rather than creating timed events with
timeval_zero().
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 11734be353a1e246163eda631d35dfe55d1d6fb1)
When CTDB is busy with lots of smbd processes, it was spending too much
time in daemon_check_srvids(), which searches a list of srvids in the
registered message handlers. Using a hash based index significantly improves
search performance compared to walking a linked list.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 3e09f25d419635f6dd679b48fa65370f7860be7d)
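To show why an index helps, here is a sketch of a lookup keyed on a 64-bit srvid using a plain chained hash table. This is a swapped-in technique chosen for brevity; the commit itself does not say which structure it uses, and all names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SRVID_BUCKETS_SKETCH 1024

struct srvid_node_sketch {
	uint64_t srvid;
	struct srvid_node_sketch *next;
};

struct srvid_index_sketch {
	struct srvid_node_sketch *buckets[SRVID_BUCKETS_SKETCH];
};

/* Fold the 64-bit id into a bucket number. */
static size_t srvid_bucket_sketch(uint64_t srvid)
{
	return (size_t)((srvid ^ (srvid >> 32)) % SRVID_BUCKETS_SKETCH);
}

/* Register an srvid; returns 0 on success, -1 on allocation failure. */
static int srvid_index_add_sketch(struct srvid_index_sketch *idx,
				  uint64_t srvid)
{
	struct srvid_node_sketch *node;
	size_t b = srvid_bucket_sketch(srvid);

	assert(idx != NULL);
	node = malloc(sizeof(*node));
	if (node == NULL) {
		return -1;
	}
	node->srvid = srvid;
	node->next = idx->buckets[b];
	idx->buckets[b] = node;
	return 0;
}

/* Only one short bucket chain is walked, not the whole handler list. */
static int srvid_index_exists_sketch(const struct srvid_index_sketch *idx,
				     uint64_t srvid)
{
	const struct srvid_node_sketch *node;

	assert(idx != NULL);
	for (node = idx->buckets[srvid_bucket_sketch(srvid)];
	     node != NULL; node = node->next) {
		if (node->srvid == srvid) {
			return 1;
		}
	}
	return 0;
}

static void srvid_index_free_sketch(struct srvid_index_sketch *idx)
{
	size_t b;

	assert(idx != NULL);
	for (b = 0; b < SRVID_BUCKETS_SKETCH; b++) {
		while (idx->buckets[b] != NULL) {
			struct srvid_node_sketch *node = idx->buckets[b];
			idx->buckets[b] = node->next;
			free(node);
		}
	}
}
```

With many registered handlers, each existence check touches one bucket instead of every entry, which is the effect the commit message describes.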
This improves the processing of packets considerably. It has been
observed that there can be as many as 10 packets in the socket buffer, and
the current code, which reads a single packet from a socket at a time, is
inefficient. This change reads all the bytes from the socket buffer and
then parses them to extract multiple packets. If there are multiple packets,
set up a timed event to process the next packet.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit d788bc8f7212b7dc1587ae592242dc8c876f4053)
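The parsing step ("read everything, then extract multiple packets") might look roughly like the sketch below. It assumes each packet starts with a 32-bit length field, in host byte order, that counts the whole packet including the field itself; that framing and every name here are assumptions for illustration. The sketch also invokes the callback directly in a loop, whereas the commit schedules the next packet through an event.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef void (*packet_cb_sketch)(const uint8_t *pkt, uint32_t len,
				 void *private_data);

/* Example callback: add each packet length to a size_t total. */
static void sum_lengths_cb_sketch(const uint8_t *pkt, uint32_t len,
				  void *private_data)
{
	(void)pkt;
	*(size_t *)private_data += len;
}

/*
 * Walk a buffer of length-prefixed packets and hand each complete one
 * to cb. Returns the number of packets delivered, or -1 if a length
 * field is smaller than the field itself (malformed). *consumed is set
 * to the bytes used; a trailing partial packet is left for the next read.
 */
static int extract_packets_sketch(const uint8_t *buf, size_t buflen,
				  size_t *consumed,
				  packet_cb_sketch cb, void *private_data)
{
	size_t off = 0;
	int count = 0;

	assert(buf != NULL && consumed != NULL && cb != NULL);
	while (buflen - off >= sizeof(uint32_t)) {
		uint32_t len;

		/* memcpy avoids unaligned access to the length field. */
		memcpy(&len, buf + off, sizeof(len));
		if (len < sizeof(uint32_t)) {
			*consumed = off;
			return -1;
		}
		if (len > buflen - off) {
			break; /* packet not fully received yet */
		}
		cb(buf + off, len, private_data);
		off += len;
		count++;
	}
	*consumed = off;
	return count;
}
```

The caller would keep any bytes beyond *consumed at the front of the buffer and append the next read after them.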
Commit a82d3ec12f0fda16d6bfa8442a07595de897c10e broke fetching from
the log ringbuffer. The solution there is still generally good: there
is no need to keep the ringbuffer in children created by
ctdb_fork()... except for those special children that are created to
fetch data from the ringbuffer!
Introduce a new function ctdb_fork_no_free_ringbuffer() that does
everything ctdb_fork() needs to do except free the ringbuffer (i.e. it
is the old ctdb_fork() function). The new ctdb_fork() function just
calls that function and then frees the ringbuffer in the child.
This means all callers of ctdb_fork() have the convenience of having
the ringbuffer freed. There are 3 special cases:
* Forking the recovery daemon. We want to be able to fetch from the
ringbuffer there.
* The ringbuffer fetching code. Change the 2 calls in this code (main
daemon, recovery daemon) to call ctdb_fork_no_free_ringbuffer()
instead.
While we're here, clear the log ringbuffer when the recovery daemon is
forked, since it will contain a copy of the messages from the main
daemon.
Note to self: always test... even the most obvious patches... ;-)
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 00db5fa00474f8a83f1aa3b603fd756cc9b49ff4)
At the moment the log ringbuffer is duplicated in every child process.
Although it is copy-on-write, we want to see if it is contributing to
out-of-memory situations when there are a lot of children.
The ringbuffer isn't accessible from any of the children anyway...
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit a82d3ec12f0fda16d6bfa8442a07595de897c10e)
These support getting and clearing logs from the ring-buffer in the
recovery daemon.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit cbca233d1e03b2410e0bb63b936328d4a8b3c7b4)
Currently these functions are implemented only for Linux.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit be4051326b0c6a0fd301561af10fd15a0e90023b)
We've seen this function report "Unknown family, 0" and then CTDB
disappeared without a trace. If we can reproduce it then this might
help us to debug it.
The idea is that you do something like the following in /etc/sysconfig/ctdb:
export CTDB_EXTERNAL_TRACE="/etc/ctdb/config/gcore_trace.sh"
When we hit this error, we call out to gcore to get a core file so
we can do forensics. This might block CTDB for a few seconds.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 7895bc003f087ab2f3181df3c464386f59bfcc39)
Do some other housekeeping, including stopping tevent.
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 212298279557a2833ef0f81809b4a5cdac72ca02)
Thanks to Ronnie for highlighting the issue of memory lockdown on AIX.
Fix typo, use getuid and not getpid.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 21a5cbf9518fafc610939f14874371a52b1dc8b3)