IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit 6269cdcd15)
(This used to be ctdb commit e0ca2e02120258aabca1e1586a58a8d672484fb5)
This should make it easier to keep all release scripts alined as it will reduce
the difference between them to ideally a few variables
Also moves the tdb script in the scripts directory.
(Imported from commit 6339de7f4f)
(This used to be ctdb commit 8885b2206fba41ec289fda5dfd653ee676aa0dd3)
after recent fixes we need to raise the version to 1.2.1 so that
we can require also the right patched version.
(Imported from commit 70534adee1)
(This used to be ctdb commit 84c971f33c24d32e5599aba7ba83bb474f7ac922)
If the driver is virtio_net then we assume that the link is up rather
than ignoring the check altogether.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 3044d07da2a58260fa06bf489890b279bcf3ec39)
Skip link test for this type of devices
Signed-off-by: Ralph Wuerthner <ralph.wuerthner@de.ibm.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 2ea0a9f1a93781a0d036feb9fcc0d120b182922f)
Now the script child signal handler doesn't do anything, we can unify the
"timeout" and "abort" cases introduced in 9dd25cb751919799.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 439f049c7024d69aa4b87dc811e1772981ad29cb)
Fairly simple: prevent the destructor from killing the script, and do it
explicitly from the debugging child.
We can remove the extra "already dead" test, since this will be detected
in the destructor anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit f8aa83788e3cc10ab7655a90d7b7b17ddbe48685)
In the case of a timeout, we dump a log of what's happening to a file
in /tmp. We do it from the signal handler, which is an unreliable hack
(BZ58365).
Instead, create another (lower-priority) child to do the dump, then
kill the timedout script.
Note that this doesn't quite work as intended (the dump is often run
after the script has been killed), so the next patch resolves this.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 7ee5ecc8d53e78e2dec21197b74a74cc4ae1834c)
addresses and verify that the remote nodes have/keep a consistent view of
assigned addresses.
If a remote node has an inconsistent view of addresses visavi the recovery
master this will trigger a full ip reallocation.
(This used to be ctdb commit f3bf2ab61f8dbbc806ec23a68a87aaedd458e712)
Initialize the child pid to 0 so destructor doesn't try to kill it:
server/eventscript.c:565 Sending SIGTERM to child pid:139742328
Failed to kill child process for eventscript, errno No such process(3)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit fcc63e04beb427c1f48deae6d3d98c78a2a67949)
the script timedout.
Instead send a different signal (SIGABRT) to the child process to silently
kill the process group for the script and its children without logging
anything.
We abort any running "monitor" script anytime any other event is generated
either by ctdbd itself or by "ctdb eventscript ..."
BZ61043
(This used to be ctdb commit 9dd25cb751919799af9d8a23a0725343a8400e58)
made the severity of the decreasing interval log level the same as for the increasing,
they are both just info logs because they don't report errors
(This used to be ctdb commit fde29921f14a815ea68911d758485c9070f4eb2a)
The init script relies on the existence.
This should fix bug #6773 on bugzilla.samba.org:
https://bugzilla.samba.org/show_bug.cgi?id=6773
Michael
(This used to be ctdb commit 35e6aa1630732665deaed1e7fbf2c3bf6664895d)
As the basename of the script will be used for the readd script
from setup_iface_ip_readd_script, it's know easier to identify
what script is called by delete_ip_from_iface() while readding
ips to the interface.
metze
(This used to be ctdb commit 3ee225b0b6ed37c22478bd145ced56b1b9b86842)
This is needed because we need to resetup the routing table when
the delete_ip_from_iface() function readds the ip to the interface.
metze
(This used to be ctdb commit ea87185ec9977006ef72d5a68c875154e4c84099)
This combines the logic into a shell function which can be used by the
"takeip" and "updateip" hooks.
We check the return values of the "ip" commands now
instead of ignoring them.
We now create a setup_script.sh similar to the release_script.sh
which makes it easier to analyze problems.
metze
(This used to be ctdb commit 624e8878851b4957cc7c02e922ec86926d6927ee)
This also initializes the variables correctly for the
shutdown|removenatgw code path to delete_all.
metze
(This used to be ctdb commit 2c2cbed4fcbc868a990fa6b32fc96126ffc61bb5)
This adds a generic infrastructure to register scripts which will
be called when the delete_ip_from_iface() funtion needs to readd
secondary ips to an interface.
metze
(This used to be ctdb commit ac97d65f44e8dc8bf2ec8f68e4db3448521755a2)
We used to talloc_steal c (the command packet) and make it a child of the
"event script state context".
If we failed to create a eventscript child context for some reason,
this would have talloc freed state, but at the same time it would also
implicitely have freed c.
Once ctdb_control_end_recovery() returns the error back to the caller,
the caller would dereference both c, and also outdata which is a child of c
and we would either read garbage data or segv.
Change the ordering so we only talloc_steal c as a child of state IFF
we have successfully created a child context for the script.
BZ61068
(This used to be ctdb commit 259054c3632e42bbaa614ee7e888e6e850733d60)
Othervise, as soon as it terminates, ctdbd will deregister the id automatically.
(This used to be ctdb commit 23b059dcb8074872d7900b225790d4df7da071b6)
(Based on earlier version from Ronnie which modified tdb; this one
is standalone).
When storing records in a tdb that has "automatic seqnum updates"
also check if the actual data for the record has changed or not.
If it has not changed at all, except for possibly the header,
this is likely just a dmaster migration operation in which case
we want to write the record to the tdb but we do not want the tdb
sequence number to be increased.
This resolves the problem of notify.tdb being thrashed under load:
the heuristic in smbd to only reread this when the sequence number
increases (rarely) breaks down.
Before, running nbench --num-progs=512 across 4 nodes, we saw numbers like:
512 1496 118.33 MB/sec execute 60 sec latency 0.00 msec
And turning on latency tracking, this was typical in the logs:
ctdbd: High latency 9380914.000000s for operation lockwait on database notify.tdb
After this commit:
512 2451 143.85 MB/sec execute 60 sec latency 0.00 msec
And no more latency messages...
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 9ed2f8b2fcb7e3f0d795eef22cfa317066490709)
DEBUG(DEBUG_DEBUG,("pnn %u dmaster response %08x\n", ctdb->pnn, ctdb_has
DEBUG(DEBUG_DEBUG,("pnn %u dmaster request on %08x for %u from %u\n",
(This used to be ctdb commit a3473e7a445b14520a49585c460429dfbfe1fce0)
to control whether or not to check if we are swapping, and produce
useful output into the logfile if we are.
For production systems with dedicated nas-heads we should never swap.
But for developer/test systems we often use smaller nondedicated systems where
we can no longer guarantee that we will not be using swap.
(This used to be ctdb commit db87849bf3380914a63a626412bec209dbea7d20)
Recent updates to the test meant that it only worked with the latest
ctdb versions. This changes things so that we never bother matching
the machine readable header, just the actual data in the output. It
also takes a slightly more liberal approach in massaging the human
readable output to ensure it matches the machine readable output.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 8a1cb5dc1ddf82f3b9cbb23e40b3914b3d5c2783)
return success back to the caller instead.
otherwise things like 'ctdb enable -n all' will just finish after the first disabled node has become enabled.
(This used to be ctdb commit f4eb41cd3a1099da8265351818fba9bd4688a188)
We should never enter swap; if we do, show the memory state of the machine and the process list. This will help us diagnose what caused the condition before it's too late and the box starts OOM-killing processes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 627a6d67a0e9e61f8713e62695b3518c51909230)
We can't just drop packets to the list, as those packets could be part
of the core protocol the client is using. This happens (for example)
when Samba is doing a traverse. If we drop a traverse packet then
Samba hangs indefinately. We are better off dropping the ctdb socket
to Samba.
(This used to be ctdb commit a7a86dafa4d88a6bbc6a71b77ed79a178fd802a6)
The TLIST_*() macros are like the DLIST_*() macros, but take both a
head and tail pointer for the list. This means that adding an element
to the end of the list is efficient (it doesn't need to walk the
list).
We should move all uses of the DLIST_*() macros which use
DLIST_ADD_END() to use the TLIST_*() macros instead.
(This used to be ctdb commit 2d05a71349e9ade869b62cf261c2a9a21818a474)
Check if the node is already enabled/disabled and log an information
message if so.
(This used to be ctdb commit c3eec8f10764a647106087099eeb47b7196f7aac)
packets, to avoid the queue to grow excessively if smbd has blocked.
This could cause traverse packets to become discarded in case the main
smbd daemon does a traverse of a database while there is a recovery
(sending a erconfigured message to smbd, causing an avalanche of unlock
messages to be sent across the cluster.)
This avalance of messages could cause also the tranversal message to be
discarded causing the main smbd process to hang indefinitely waiting
for the traversal message that will never arrive.
Bump the maximum queue length before starting to discard messages from
1000 to 1000000 and at the same time rework the queueing slightly so we
can append messages cheaply to the queue instead of walking the list
from head to tail every time.
(This used to be ctdb commit 59ba5d7f80e0465e5076533374fb9ee862ed7bb6)
There was a bug in tdb where the
tdb_brlock(tdb, GLOBAL_LOCK, F_UNLCK, F_SETLKW, 0, 1);
(ending the transaction-"mutex") was done before the
/* remove the recovery marker */
This means that when a transaction is committed there is a window where another
opener of the file sees the transaction marker while the transaction committer
is still fully functional and working on it. This led to transaction being
rolled back by that second opener of the file while transaction_commit() gave
no error to the caller.
This patch moves the F_UNLCK to after the recovery marker was removed, closing
this window.
(This used to be ctdb commit 898b5edfe757cb145960b8f3631029bfd5592119)
If "$1" was empty than loadconfig would load the ctdb config twice.
This stops that from happening.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 0406d406da70aaee7ad6aac236114905c5d03ed2)
Proper fix for 085d1bea78fabf754ef6dd6d323f74a1d361e45c's workaround.
$NFS_TICKLE_SHARED_DIRECTORY was being used before it is set via
loadconfig.
Ronnie actually spotted this one. :-)
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit ee8b2e298351d05197a2e1494f3331433644c1e6)
Also, change the order of the comparison so it is consistent with
others in the script.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 44696e15cdb23e7656d3bb0ead54f509495738a7)
This puts single quotes around everything and uses eval on the
command-lines that actually start ctdbd. The eval causes the single
quotes to be interpreted.
The "redhat" init style no longer uses the Red Hat daemon function.
It loses the quoting and re-splits on spaces. Instead we add an extra
line that uses the success/failure functions to keep things pretty.
Note that this means that we don't respect daemon's
$DAEMON_COREFILE_LIMIT variable but we do our own core file handling
with $CTDB_SUPPRESS_COREFILE anyway. daemon's core file handling was
probably overriding what we were doing anyway, so this can be regarded
as a bug fix.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 522fbb012524fe41a67dbe43589a282dda6bcbe2)
2 changes:
* If a relative nodes file is specified via -f or $CTDB_NODES_FILE but
this file does not exist then try looking for the file in /etc/ctdb
(or $CTDB_BASE if set).
* If a nodes file is specified via -f or $CTDB_NODES_FILE but this
file does not exist (even when checked as per above) then do not
fall back to /etc/ctdb/nodes ((or $CTDB_BASE if set). The old
behaviour was surprising and hid errors.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 60aa570aaa77d293b963105b3f605f9625a4594b)
This is very useful for testing, I use such a script:
cat ~/bin/ethtool
#!/bin/sh
IFACE=$1
case "$IFACE" in
Neth2)
;;
Neth3)
;;
Neth4)
;;
Neth5)
;;
*)
exec /usr/sbin/ethtool $@
;;
esac
ip link set down $IFACE
exec /usr/sbin/ethtool $@
metze
(This used to be ctdb commit 3bab985cf615720eded4d47b4f9f37a9c28840aa)