It really is internal.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
(This used to be ctdb commit abb64f62efaa70df4b87c030b96300eafd98e6a3)
"ctdb ping" can time out. How many times should we try?
Instead, depend on the initscript to implement something sane.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
(This used to be ctdb commit 90cb337e5ccf397b69a64298559a428ff508f196)
Using "ctdb ping" and "ctdb status" is fraught with danger. These
commands can time out when ctdbd is running, leading callers to believe
that ctdbd is not running. Timeouts could be increased but we would
still have to handle potential timeouts.
Everything else in the world implements the "status" option by
checking if the relevant process is running. This change makes CTDB
do the same thing and uses standard distro functions.
This change is backward compatible in the sense that a missing
/var/run/ctdb/ directory means that we don't do a PID file check but
just depend on the distro's checking method. Therefore, if CTDB was
started with an older version of this script then "service ctdb
status" will still work.
This script does not support changing the value of CTDB_VALGRIND
between calls. If you start with CTDB_VALGRIND=yes then you need to
check status with the same setting. CTDB_VALGRIND is a debug
variable, so this is acceptable.
This also adds sourcing of /lib/lsb/init-functions to make the Debian
function status_of_proc() available.
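
The status logic described above can be modelled roughly as follows. This is a Python sketch only, not the initscript itself, which is shell and relies on the distro's own functions. The function names and the "ctdbd.pid" file name are assumptions made for illustration.

```python
import os

def pid_is_running(pid):
    """Return True if a process with this PID exists (signal 0 only probes)."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True

def status_from_pidfile(run_dir, pidfile_name="ctdbd.pid"):
    """Return True/False from the PID file, or None if run_dir is missing.

    None means "skip the PID file check and rely on the distro's method",
    which is the backward-compatible path for an older initscript.
    The PID file name is a hypothetical default.
    """
    if not os.path.isdir(run_dir):
        return None
    try:
        with open(os.path.join(run_dir, pidfile_name)) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return False
    return pid_is_running(pid)
```

Unlike "ctdb ping", this check never talks to the daemon, so it cannot time out while ctdbd is busy.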
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
(This used to be ctdb commit 687e2eace4f48400cf5029914f62b6ddabb85378)
Default is not to create a pid file.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
(This used to be ctdb commit 996e74d3db0c50f91b320af8ab7c43ea6b1136af)
For now we pass NULL as the child name. Later we'll give ctdb_fork()
and friends an extra argument and pass that through.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
(This used to be ctdb commit ba8866d40125bab06391a17d48ff06a4a9f9da89)
Must be called by all child processes.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
(This used to be ctdb commit 59b019a97aad9a731f9080ea5be14d0dbdfe03d6)
The comment explains that we use "ctdb stop" and "ctdb continue"
but we should use "ctdb setrecmasterrole off".
Signed-off-by: Michael Adam <obnox@samba.org>
(This used to be ctdb commit 06ac62f890299021220214327f1b611c3cf00145)
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
(This used to be ctdb commit b1577a11d548479ff1a05702d106af9465921ad4)
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
(This used to be ctdb commit 2438f3a4944f7adbcae4cc1b9d5452714244afe7)
This is now done in ctdb_ltdb_store_server(), so this
extra bump can be spared.
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit cad3107b12e8392f786f9a758ee38cf3a3d58538)
Problem:
Recovery can under certain circumstances lead to old record copies
resurrecting: Recovery selects the newest record copy purely by RSN. At
the end of the recovery, the recovery master is the dmaster for all
records in all (non-persistent) databases. And the other nodes locally
hold the complete copy of the databases. The bug is that the recovery
process does not increment the RSN on the recovery master at the end of
the recovery. Now clients acting directly on the recovery master will
directly change a record's content on the recmaster without migration
and hence without RSN bump. So a subsequent recovery can not tell that
the recmaster's copy is newer than the copies on the other nodes, since
their RSN is the same. Hence, if the recmaster is not node 0 (or more
precisely not the active node with the lowest node number), the recovery
will choose copies from nodes with lower number and stick to these.
Here is how to reproduce:
- assume we have a cluster with at least 2 nodes
- ensure that the recmaster is not node 0
  (maybe ensure with "onnode 0 ctdb setrecmasterrole off")
  say recmaster is node 1
- choose a new database name, say "test1.tdb"
  (make sure it is not yet attached as persistent)
- choose a key name, say "key1"
- all cluster nodes should be OK and no recovery should be running
- now do the following on node 1:
  1. dbwrap_tool test1.tdb store key1 uint32 1
  2. dbwrap_tool test1.tdb fetch key1 uint32
     ==> 1
  3. ctdb recover
  4. dbwrap_tool test1.tdb store key1 uint32 2
  5. dbwrap_tool test1.tdb fetch key1 uint32
     ==> 2
  6. ctdb recover
  7. dbwrap_tool test1.tdb fetch key1 uint32
     ==> 1
     ==> BUG
This is a very severe bug, since when applied to Samba's locking.tdb
database, it means that for SMB clients on clustered Samba there is
the potential for locking out oneself from previously opened files
or even worse, data corruption:
Case 1: locking out
- client on recmaster opens file
- recovery propagates open file handle (entry in locking.tdb) to
  other nodes
- client closes file
- client opens the same file
- recovery resurrects old copy of open file record in locking.tdb
  from lower node
- client closes file but fails to delete entry in locking.tdb
- client tries to open same file again but fails, since
  the old record locks it out (since the client is still connected)
Case 2: data corruption
- client1 on recmaster opens file
- recovery propagates open file info to other nodes
- client1 closes the file and disconnects
- client2 opens the same file
- recovery resurrects old copy of locking.tdb record,
  where client2 has no entry, but client1 has.
- but client2 believes it still has a handle
- client3 opens the file and succeeds without
  conflicting with client2
  (the detached entry for client1 is discarded because
  the server does not exist any more).
=> both client2 and client3 believe they have exclusive
   access to the file and writing creates data corruption
Fix:
When storing a record on the dmaster, bump its RSN.
The ctdb_ltdb_store_server() is the central function for storing
a record to a local tdb from the ctdbd server context.
So this is also the place where the RSN of the record to be stored
should be incremented, when storing on the dmaster.
For the case of the record migration, this is currently done in
ctdb_become_dmaster() in ctdb_call.c, but there are other places
such as in recovery, where we should bump the RSN, but currently
don't do it.
So moving the RSN incrementation into ctdb_ltdb_store_server fixes
the recovery-record-resurrection bug.
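
The bug and the fix can be illustrated with a small Python model. It is a deliberately simplified sketch, not the CTDB code: one key, two nodes, and a starting state in which node 1 already holds key1=1 with a higher RSN than node 0. All names here (Record, store_server, recover, client_write, scenario) are made up for the illustration.

```python
from dataclasses import dataclass

@dataclass
class Record:
    value: object
    rsn: int

def store_server(nodes, node, value, rsn, dmaster, bump_on_dmaster):
    """Model of a server-side store: optionally bump the RSN on the dmaster."""
    if bump_on_dmaster and node == dmaster:
        rsn += 1
    nodes[node] = Record(value, rsn)

def recover(nodes, recmaster, bump_on_dmaster):
    """Pick the copy with the highest RSN; ties go to the lowest node number."""
    winner = None
    for num in sorted(nodes):
        if winner is None or nodes[num].rsn > winner.rsn:
            winner = nodes[num]
    value, rsn = winner.value, winner.rsn
    # After recovery the recmaster is dmaster and every node holds a copy.
    for num in list(nodes):
        store_server(nodes, num, value, rsn, recmaster, bump_on_dmaster)

def client_write(nodes, node, value):
    """A client on the dmaster changes the record: no migration, no RSN bump."""
    nodes[node].value = value

def scenario(bump_on_dmaster):
    recmaster = 1
    nodes = {0: Record(None, 0), 1: Record(1, 1)}  # assumed state after step 1
    recover(nodes, recmaster, bump_on_dmaster)     # step 3
    client_write(nodes, recmaster, 2)              # step 4
    recover(nodes, recmaster, bump_on_dmaster)     # step 6
    return nodes[recmaster].value                  # step 7
```

In this model scenario(False) returns the stale value 1, because both copies carry the same RSN and node 0 wins the tie. scenario(True) returns 2, because the recmaster's copy left the first recovery with a strictly higher RSN.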
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit feb1d40b21a160737aead22e398f3c34ff3be8de)
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 4c0cbfbe8b19f2e6fe17093b52c734bec63dd8b7)
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 2e92deef5221ee651028ef87138b3113f1fece91)
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 9f01b8db72780acf2f88f1392bc0a796dd4c6176)
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit e96acf19b4d1e0f951ab92b88869a01ff06398be)
This makes sure that CTDB_CONTROL_TRAVERSE_ALL is compatible with older versions
of CTDB (i.e. 1.2.39 and 1.2.40 branches).
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
(This used to be ctdb commit 5808f0778b39b79ab7a5c7f53ad27947131386ec)
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
(This used to be ctdb commit e691df43d20871468142c8fb83f7c7303c4ec307)
Also, include description of -e option in usage.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 35264e42ade4676468cf7713fa339c784e932953)
When collating IP information for IP layout, only trust the nodes that are
hosting an IP to have correct information about that IP. Ignore what all the
other nodes think.
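
As a rough sketch of the rule above: assume each node reports, for every public IP, which node it believes is hosting it. The report layout and the function name are assumptions for illustration, not the data structures used in CTDB.

```python
def collate_ip_assignments(reports):
    """Collate per-node views into {ip: hosting node}.

    reports maps node number -> {ip: node number that node believes hosts ip}.
    Only a node's claim about an IP it hosts itself is trusted; its opinion
    about IPs hosted elsewhere is ignored. If two nodes both claim the same
    IP, this sketch simply keeps the last claim seen.
    """
    assignments = {}
    for node, view in reports.items():
        for ip, believed_host in view.items():
            if believed_host == node:
                assignments[ip] = node
    return assignments
```

For example, if node 0 hosts 10.0.0.1 but wrongly believes 10.0.0.2 is on node 2, while node 1 actually hosts 10.0.0.2, the result assigns 10.0.0.2 to node 1 and node 0's stale opinion is dropped.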
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 1c7adbccc69ac276d2b957ad16c3802fdb8868ca)
In RHEL 6+, rpc.statd runs as "rpcuser" instead of root as on RHEL 5. This
prevents CTDB tool commands from talking to the daemon, since "rpcuser" cannot
access the CTDB socket.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Pair-Programmed-With: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit fe8c4880b371492a38554868d4ca10918c54e412)
If the socket is set non-blocking before connect, then we should catch
EAGAIN errors and retry. Instead of adding a random number of retries,
better to wait for connect to succeed and then set the socket to
non-blocking.
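
The ordering described here looks roughly like this in Python. It is a sketch of the idea only (the real change is in C), and the function name is made up.

```python
import socket

def connect_then_nonblocking(path):
    """Connect to a Unix domain socket in blocking mode, then switch modes."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # Blocking connect: returns once connected, or raises on failure.
        # No EAGAIN handling or retry loop is needed at this point.
        sock.connect(path)
    except OSError:
        sock.close()
        raise
    # Only now make the socket non-blocking; later reads and writes
    # must be prepared to handle BlockingIOError.
    sock.setblocking(False)
    return sock
```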
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 524ec206e6a5e8b11723f4d8d1251ed5d84063b0)
This reverts commit dc0c58547cd4b20a8e2cd21f3c8363f34fd03e75.
There is a simpler solution than retrying a random number of times: do not set
the socket non-blocking until connect succeeds.
(This used to be ctdb commit 74acc2c568300ef42740cf11299a1b2507047f60)
This simplifies the use of message indexdb API and abstracts tdb related code
inside the API.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit bf7296ce9b98563bcb8426cd035dbeab6d884f59)
This fixes a memory leak in the messaging code.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 20be1f991dd75c2333c9ec9db226432a819f57ba)
This makes sure that even if the srvids are not deregistered, the header
structure is freed when the last message handler has been freed as a result of
client going away.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 4e1ec7412866f2d31c41de1bec0fbf788c03051b)
To log debugging information from child processes that are started
with vfork and exec, do not set close_on_exec on STDOUT and STDERR for
that process.
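
A minimal Python sketch of the same idea, using fcntl: mark descriptors close-on-exec, but leave stdout and stderr alone so that the exec'd child can still log. The helper name is hypothetical and the real code is C.

```python
import fcntl

def set_close_on_exec_except_stdio(fds, keep=(1, 2)):
    """Set FD_CLOEXEC on each fd, skipping stdout (1) and stderr (2)."""
    for fd in fds:
        if fd in keep:
            continue  # the child needs these to emit debugging output
        flags = fcntl.fcntl(fd, fcntl.F_GETFD)
        fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
```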
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 08c53ee609b80f87450a7a1d7dd24fbcdf5ab7bc)
tevent_schedule_immediate() is much more efficient at handling events that need
to be processed immediately rather than creating timed events with
timeval_zero().
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 11734be353a1e246163eda631d35dfe55d1d6fb1)
When CTDB is busy with lots of smbd, CTDB was spending too much time in
daemon_check_srvids() which searches a list of srvids in the registered
message handlers. Using a hash-based index makes this search significantly
faster than walking a linked list.
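
The difference can be sketched in Python with a dictionary standing in for the hash-based index. The class and method names are invented for the example and do not mirror the C API.

```python
from collections import defaultdict

class SrvidIndex:
    """Map srvid -> list of handlers, so a lookup is a hash probe, not a scan."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def register(self, srvid, handler):
        self._handlers[srvid].append(handler)

    def deregister(self, srvid, handler):
        handlers = self._handlers.get(srvid, [])
        if handler in handlers:
            handlers.remove(handler)
        if not handlers:
            # Drop the entry once its last handler is gone.
            self._handlers.pop(srvid, None)

    def exists(self, srvid):
        return srvid in self._handlers

    def check_srvids(self, srvids):
        """Return one boolean per srvid, each found in O(1) on average."""
        return [self.exists(s) for s in srvids]
```

With a linked list, each check walks every registered handler; with the index, the cost per srvid no longer grows with the number of registered handlers.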
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 3e09f25d419635f6dd679b48fa65370f7860be7d)
Moving the IP is an optimisation so should not cause failure.
Refactor and simplify the retry-move-IP into new function
try_moveip().
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 5402f85dde045576cbaf64e01c68e28ed52204e8)
... as the comment says... not just active nodes.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 4f71dca8df19a63f198e2d6d59e605b49ec5e803)
This reduces repetition.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit f505020a5720faa4ecc6414e0bfaa6b3c0e47291)
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit a73bb56991b8c07ed0e9517ffcf0dc264be30487)
This improves the processing of packets considerably. It has been
observed that there can be as many as 10 packets in the socket buffer and
the current code of reading a single packet from a socket at a time is
not very optimal. This change reads all the bytes from socket buffer and
then parses to extract multiple packets. If there are multiple packets,
set up a timed event to process next packet.
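
The read-everything-then-parse approach can be sketched as below. The framing is an assumption for the example: each packet is taken to start with a 4-byte native-endian length that counts the whole packet, including the length field. The function name is hypothetical.

```python
import struct

def extract_packets(buffer):
    """Split buffer into complete packets; return (packets, leftover bytes).

    Assumed framing: a 4-byte native-endian length prefix that covers the
    entire packet. A trailing partial packet is returned as leftover so it
    can be completed by the next read.
    """
    packets = []
    offset = 0
    while len(buffer) - offset >= 4:
        (length,) = struct.unpack_from("=I", buffer, offset)
        if length < 4:
            raise ValueError("invalid packet length %d" % length)
        if len(buffer) - offset < length:
            break  # partial packet: wait for more data
        packets.append(bytes(buffer[offset:offset + length]))
        offset += length
    return packets, bytes(buffer[offset:])
```

A caller would drain the socket with one read, run the bytes through a parser like this, and then schedule processing of each extracted packet in turn.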
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit d788bc8f7212b7dc1587ae592242dc8c876f4053)
In 1f262deaad0818f159f9c68330f7fec121679023, Ronnie changed recovery code
to allocate chunks of 10MB in traverse_pulldb() and traverse_recdb(). The
tunable PullDBPreallocation size was set to 100MB.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit e204fac03412520e877ab04363b3ece02667c55b)
This is an artifact from older versions of Samba. In the newer versions of
Samba, "smbstatus -np" command does not do anything useful, but causes a
traverse in CTDB which is expensive and causes CPU utilization to shoot up.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 053b89c6dbce47001505524606889334559d2ec4)
Commit a82d3ec12f0fda16d6bfa8442a07595de897c10e broke fetching from
the log ringbuffer. The solution there is still generally good: there
is no need to keep the ringbuffer in children created by
ctdb_fork()... except for those special children that are created to
fetch data from the ringbuffer!
Introduce a new function ctdb_fork_no_free_ringbuffer() that does
everything ctdb_fork() needs to do except free the ringbuffer (i.e. it
is the old ctdb_fork() function). The new ctdb_fork() function just
calls that function and then frees the ringbuffer in the child.
This means all callers of ctdb_fork() have the convenience of having
the ringbuffer freed. There are 3 special cases:
* Forking the recovery daemon. We want to be able to fetch from the
ringbuffer there.
* The ringbuffer fetching code. Change the 2 calls in this code (main
daemon, recovery daemon) to call ctdb_fork_no_free_ringbuffer()
instead.
While we're here, clear the log ringbuffer when the recovery daemon is
forked, since it will contain a copy of the messages from the main
daemon.
Note to self: always test... even the most obvious patches... ;-)
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 00db5fa00474f8a83f1aa3b603fd756cc9b49ff4)
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Michael Adam <obnox@samba.org>
(This used to be ctdb commit b940e3a24daa73ca9b2896b7a449240136442b53)
This means it can be set like any other configuration option in the
configuration file, without needing to export it there.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit a0ef73e197dc9147f7718e0813fe803ff0b3d54d)
The amount of data to write into the buffer wasn't constrained
anywhere...
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 9b0d56b16775aa16f33bdfdf831256e085fa3339)
This is quite easy to misconfigure by failing to set the execute bit
on the script. Better to complain loudly.
This is a debugging facility rather than core CTDB functionality, so it
doesn't need a subtle mechanism to disable it at run-time. To disable
the designated script at run-time either edit it to put an "exit 0" at
the top or move it aside and symlink to /bin/true.
This is implemented by actually removing the code that checks that the
file exists and is executable. The output from the shell when the
system() function fails is just as useful.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 3400b2ed34b6eb9496eb55f1aab6f89d2952060d)
Use an environment variable instead. This just means that the
initscript exports CTDB_DEBUG_HUNG_SCRIPT and the code checks for the
environment variable.
The justification for this simplification is that more debug options
will be arriving soon and we want to handle them consistently without
needing to add a command-line option for each. So, the convention
will be to use an environment variable for each debug option.
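
The convention amounts to a lookup like the following, shown here in Python as an illustration only (the daemon itself is C). The function name is made up; only the variable name CTDB_DEBUG_HUNG_SCRIPT comes from the text above.

```python
import os

def debug_hung_script(environ=os.environ):
    """Return the script named by CTDB_DEBUG_HUNG_SCRIPT, or None if unset/empty."""
    script = environ.get("CTDB_DEBUG_HUNG_SCRIPT", "")
    return script or None
```

Adding another debug option then means reading another variable, with no new command-line option to parse.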
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 0581f9a84e58764d194f4e04064c2c5b393c348b)