IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
There are parentheses missing that stop the default pattern from
matching commands with trailing garbage (e.g. "exportfs.orig").
A careful check of POSIX (and running GNU sed with --posix) suggests
that "\|" isn't a supported way of specifying alternation in a regular
expression. Therefore, it is clearer to switch to extended regular
expressions so that this has a chance of being portable (even though
the point is to print /proc/<pid>/stack, which only works on Linux).
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Nov 18 06:37:45 CET 2014 on sn-devel-104
Also add and update tests for statd stack dumps. Update the existing
60.ganesha statd test to do more iterations. Duplicate the result as
a new test for 60.nfs.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
In the process, fix a bug where an extra trace would be printed.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Some implementations may not understand RC3164 format messages on the
UDP socket, so add support for RFC5424 message format.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This has most of the advantages of the old logd with none of the
complexity of the extra process. There are several good syslog
implementations that can listen on the UDP port.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Remove --logfile and --syslog daemon options and replace with
--logging.
Modularise and clean up logging initialisation code. The
initialisation API includes an app_name argument that is currently
unused - this will be used in extensions to the syslog backend.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
As far as we know, nobody uses this and it just complicates the
logging subsystem.
Remove all ringbuffer code and documentation. Update the local
daemons startup code correspondingly.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Volker Lendecke <vl@samba.org>
The major and minor device numbers are hexadecimal not decimal.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Sep 25 07:19:59 CEST 2014 on sn-devel-104
Variables that are not set but exported, may return an empty string
for getenv(). Tested on freebsd.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Wed Sep 17 09:55:47 CEST 2014 on sn-devel-104
The current check is incorrect in 2 ways:
* Commit be71a84565 contained a thinko
that stops virtio_net interfaces from simply being marked up
* virtio_net interfaces can actually be down
virtio_net has supported ethtool since Linux 2.6.29, so just remove
the special case. This means that testing CTDB on very old virtual
machines is not supported.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Thu Jul 31 13:08:47 CEST 2014 on sn-devel-104
This was used to limit damage in the "recovered" event.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Jul 29 10:03:16 CEST 2014 on sn-devel-104
This event was introduced to handle misconfiguration. For example,
where all nodes where configured as NAT gateway slaves.
However, this event can fail when there are performance issues and
capabilities can't be retrieved from a remote node. The problem is
most likely with the remote node, so marking the local node UNHEALTHY
is probably a mistake.
Having a NAT gateway master node only matters in "ipreallocated", so
leave it to do the checking. Given that a node will run
"ipreallocated" as part of the first recovery, this should cause
misconfigurations to be detected nice and early.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Need to be able to recognise a RHEL system. Still use "system" to
start and stop service, since that still works and yields the smallest
change.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Debugging can still be running when a monitor event times out and
scriptstatus output changes.
When debugging a hung script to a log file, write to a temporary file
and move the temporary file over the log file when done. The test
then waits for the log file to appear.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Thu Jul 3 08:19:23 CEST 2014 on sn-devel-104
There shouldn't be an early exit for the "init" event. Just make the
"ctdb scriptstatus" call conditional.
While here, move the comment about only running a single instance to
be near locking code. The comment is more useful there.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Check that the $GANRECDIR symlink points to the location specified by
$CTDB_GANESHA_REC_SUBDIR and replace it if incorrect. This handles
reconfiguration and filesystem changes.
While touching this code:
* Create the $GANRECDIR link as a separate step if it doesn't exist.
This means there is only 1 place where the link is created.
* Change some variables names to the style used for local function
variables.
* Remove some "ln failed" error messages. ln failures will be logged
anyway.
* Add -v to various mkdir/rm/ln commands so that these actions are
logged when they actually do something.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Jun 20 05:40:16 CEST 2014 on sn-devel-104
Backup and restore of the cluster filesystem can upset the operation
of 60.ganesha by changing the contents of this subdirectory.
Allow this subdirectory to be configured to a subdirectory that is
ignored by backup and restore processes.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Wed Jun 11 09:29:22 CEST 2014 on sn-devel-104
The range
CTDB_PER_IP_ROUTING_TABLE_ID_LOW..CTDB_PER_IP_ROUTING_TABLE_ID_HIGH
should not include 253-255. Otherwise policy routing may overwrite
the default system routing tables.
Add some corresponding tests.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is the loop variable. It can't be empty, especially given the
way the list is built. This must have survived from an earlier
version of the script.
Given that there are whitespace changes associated with the above,
clean-up the "virtio_net" avoidance check so that it reads less like
line-noise.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Commit 4ee4925d41 forgot about
CTDB_NATGW_SLAVE_ONLY so it introduces an incorrect failure when this
is set, and CTDB_NATGW_PUBLIC_IFACE or CTDB_NATGW_PUBLIC_IP is unset.
Relax the sanity check to see if CTDB_NATGW_SLAVE_ONLY is set.
Update the documentation to explicitly state that
CTDB_NATGW_PUBLIC_IFACE and CTDB_NATGW_PUBLIC_IP are optional and
unused if CTDB_NATGW_SLAVE_ONLY is set. It would be possible to
insist that CTDB_NATGW_PUBLIC_IFACE and CTDB_NATGW_PUBLIC_IFACE should
be unset in that case. However, it is more reasonable to allow
consistent configuration across nodes except with some nodes
configured slave-only.
Add tests, update infrastructure and fix a thinko in the stub's
"natgwlist" implementation.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Mon Apr 14 06:06:49 CEST 2014 on sn-devel-104
Extend CTDB_NATGW_STATIC_ROUTES so that each network can have an
optional gateway that overrides CTDB_NATGW_DEFAULT_GATEWAY.
Signed-off-by: Martin Schwenke <martin@meltin.net>
This has been implied since the command to add the route has had
errors redirected to /dev/null. If infrastucture (e.g. ADS, DNS) is
on the same network as CTDB_NATGW_PUBLIC_IP then no route is
necessary.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Although the dots in $CTDB_NATGW_PUBLIC_IP could probably only help
match an invalid public IP address, this is only executed once so do
as exact a check as possible.
Use CTDB_BASE instead of hardcoding /etc/ctdb.
Make the error message less redundant.
Signed-off-by: Martin Schwenke <martin@meltin.net>
delete_all() really needed renaming for clarity. While doing this,
might as well rename some of the others that don't start with
"natgw_".
Signed-off-by: Martin Schwenke <martin@meltin.net>
NAT gateway really can't operate unless most of the configuration
variables are set.
A check in delete_all() can be removed - strange that this isn't also
done in the add case.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Commit 176ae6c704 caused these functions
to exit on failure. This is incorrect and broke NAT gateway.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
"statd-callout notify" currently complains until an add-client or
del-client is done.
Given that we might use ctdb.tdb for something else in the future it
makes sense attach to it in the "startup" event. This could be done
in the background but it should be so lightweight that a timeout will
indicate serious problems.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This feature was added quite a while ago but was not enabled by
default. It is a useful feature so enable it to dump stack traces of
up to 5 stuck processes by default.
This can be disabled by setting:
CTDB_NFS_DUMP_STUCK_THREADS=0
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Feb 25 04:06:45 CET 2014 on sn-devel-104
This comment was true when 50.samba was spaghetti because it tried to
automatically manage both smbd (and nmbd) and winbind. It isn't true
anymore.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Wed Feb 19 04:07:12 CET 2014 on sn-devel-104
* Add stack dumps for "interesting" processes that sometimes get
stuck, so try to print stack traces for them if they appear in the
pstree output.
* Add new configuration variables CTDB_DEBUG_HUNG_SCRIPT_LOGFILE and
CTDB_DEBUG_HUNG_SCRIPT_STACKPAT. These are primarily for testing
but the latter may be useful for live debugging.
* Load CTDB configuration so that above configuration variables can be
set/changed without restarting ctdbd.
Add a test that tries to ensure that all of this is working.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
If a primary IP address is being deleted from an interface, the
secondaries are remembered and added back after the primary is
deleted. This is done under a lock shared by the add/del script code.
It is necessary because, by default, Linux deletes secondaries when
the corresponding primary is deleted.
There is a race here between ctdbd and the scripts, since ctdbd
doesn't know about the lock. If ctdbd receives a release IP control
and the IP address is not on an interface then it is regarded as a
"Redundant release of IP" so no "releaseip" event is generated. This
can occur if the IP address in question is a secondary that has been
temporarily dropped. It is more likely if the number of secondaries
is large.
Since Linux 2.6.12 (i.e. 2005) Linux has supported a
promote_secondaries option on interfaces. This option is currently
undocumented but that will change in Linux 3.14. With
promote_secondaries enabled the kernel will not drop secondaries but
will promote a corresponding secondary instead. The kernel does all
necessary locking.
Use promote_secondaries to simplify the code, avoid re-adding
secondaries, avoid re-adding routes and provide improved performance.
This could be done conditionally, with a fallback to legacy
secondary-re-adding code, but no supported Linux distribution is
running a pre-2.6.12 kernel so this is unnecessary.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This adds new files for Ganesha's recovery. myreleaseip_* are used by
the recovery thread on the node where IP is released. The releaseip_*
and tekeip_* files are used by recovery thread where IP is taken over.
Signed-off-by: Srikrishan Malik <srimalik@in.ibm.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Signed-off-by: Srikrishan Malik <srimalik@in.ibm.com>
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Jan 30 11:18:19 CET 2014 on sn-devel-104
Services can be flagged for reconfigure when they release IPs at
shutdown. The flag is never removed and the service is prematurely
reconfigured during the first "ipreallocated" event, before any IPs
are hosted and before the "startup" event has actually started the
services.
$CTDB_VARDIR/state directly contained the service state subdirectories
and is already removed in the "init" event. Just push the service
state subdirectories down a level and put everything else in a
subdirectory.
This way all the eventscript state gets cleaned up every time CTDB
starts up.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Jan 17 09:58:26 CET 2014 on sn-devel-104
Currently the lock is held until the corresponding eventscript
completes, since the process still exists. If the regular part of an
eventscript hangs then the lock might unnecessarily be held for a long
time. The pathological case is when a monitor event gets stuck in
D-wait state and the script times out but can't be killed so the lock
is still held. This can cause an unwanted monitor replay.
Change this so that the lock is released immediately after the
reconfiguration is complete.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
"monitor" events can be cancelled. If a reconfigure action does a
service restart then the "monitor" event can be cancelled at the
inconvenient moment after the service is stopped. In this case the
service stays down and the node may become unhealthy (depending on
whether there are any repair actions in the monitor event).
A long time ago we did service reconfiguration in "monitor" events
following failovers. Service reconfiguration was then moved to the
"ipreallocated" event. However, reconfiguration in "monitor" events
has been kept as a last resort in case an "ipreallocate" event does
not occur. The only important case that this covers is "ctdb
deleteip", where "releaseip" events are generated without a
corresponding "ipreallocated". Therefore, IPs can be deleted without
running the required service reconfiguration.
The supported way of removing IP addresses is now via "ctdb
reloadips", which always causes a takeover run with a corresponding
"ipreallocate" event.
This means that service reconfiguration in "monitor" events is no
longer required and should be removed because it is unsafe.
Also update the associated tests. Make the first confirm that the
monitor event no longer does reconfiguration. Change the others to
test that monitor status is correctly replayed when something else is
doing a reconfigure and currently holds the reconfigure lock.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Dec 17 06:32:35 CET 2013 on sn-devel-104