1
0
mirror of https://github.com/samba-team/samba.git synced 2025-01-08 21:18:16 +03:00
Commit Graph

134 Commits

Author SHA1 Message Date
Volker Lendecke
ed058f4dc3 s3: Fix a winbind race leading to 100% CPU
This fixes a race condition that leads to the winbindd_children list becoming
corrupted. It happens when on a busy winbind SIGCHLD is a bit late.

Imagine a winbind with multiple requests in the queue for a single child. Child
dies, and before the SIGCHLD handler is called we find the socket to be dead.
wb_child_request_done is called, receiving an error from wb_simple_trans_recv.
It closes the socket. Then immediately the wb_child_request_trigger will do
another fork_domain_child before the signal handler is called. This means that
we do another fork_domain_child, we have child->sock==-1 at this point.
fork_domain_child will do a DLIST_ADD(winbindd_children, child) a second time
where the child is already part of that list. This corrupts the list. Then the
signal handler kicks in, spinning in

for (child = winbindd_children; child != NULL; child = child->next) {

forever. Not good. This patch makes sure that both conditions (sock==-1 and not
part of the list) for a winbindd_child struct match up.

Autobuild-User: Volker Lendecke <vlendec@samba.org>
Autobuild-Date: Fri Aug 26 18:51:24 CEST 2011 on sn-devel-104
2011-08-26 18:51:24 +02:00
Volker Lendecke
e0e3d215b1 s3: Use sys_write in fork_domain_child
Counterpart for last checkin. A lot less likely, but not impossible in a child.

Autobuild-User: Volker Lendecke <vlendec@samba.org>
Autobuild-Date: Fri Aug 26 13:14:27 CEST 2011 on sn-devel-104
2011-08-26 13:14:27 +02:00
Volker Lendecke
964e809ce2 s3: Use sys_read in fork_domain_child
I've seen

[2011/08/26 01:44:10.872057,  1] winbindd/winbindd_dual.c:1336(fork_domain_child)
  fork_domain_child: Could not read child status: nread=-1, error=Interrupted system call

on a customer box. Not good.
2011-08-26 11:42:35 +02:00
Volker Lendecke
f0ff6f390a Use tevent_req_oom
This fixes a few Coverity errors
2011-06-20 12:33:24 +02:00
Andrew Bartlett
ad0a07c531 s3-talloc Change TALLOC_ZERO_P() to talloc_zero()
Using the standard macro makes it easier to move code into common, as
TALLOC_ZERO_P isn't standard talloc.
2011-06-09 12:40:08 +02:00
Jeremy Allison
f85e095dd2 More simple const fixups. 2011-05-05 23:56:08 +02:00
Volker Lendecke
d08414b679 s3: Properly deal with exited winbind children
When a winbind child exits, we need to immediately close the socket. If not,
the next request to that child will be sent to a socket without a listener,
leading to a failed request. This failed request will then trigger a proper
re-init.

This patch avoids the one failed request.

Autobuild-User: Volker Lendecke <vlendec@samba.org>
Autobuild-Date: Wed May  4 13:32:16 CEST 2011 on sn-devel-104
2011-05-04 13:32:16 +02:00
Günther Deschner
0bb4701a74 s3: remove various references to server side dcerpc structs (which are not needed).
Guenther
2011-05-02 15:03:44 +02:00
Volker Lendecke
df099e6624 s3: Avoid a potential 100% CPU loop in winbindd
In the clustering case if ctdb is unhappy, winbindd_reinit_after_fork fails.
This can lead to an endless loop depending on the scheduling of the parent vs
child. Parent forks, child is immediately scheduled and exits. Parent gets
SIGCHLD, parent is then scheduled before it sends the request out to the child.
Parent tries to fork again immediately.

The code before this patch did not really take into account that
reinit_after_fork can fail. The code now sends the result of
winbindd_reinit_after_fork to the parent and the parent only considers the
child alive when it got NT_STATUS_OK.

This was seen in 3.4 winbind. winbind has changed significantly since then, so
it might be possible that this does not happen anymore in exactly this way. But
passing up the status of reinit_after_fork and only consider the child alive
when that's ok is the correct thing to do anyway.

Autobuild-User: Volker Lendecke <vlendec@samba.org>
Autobuild-Date: Fri Apr 29 17:58:19 CEST 2011 on sn-devel-104
2011-04-29 17:58:19 +02:00
Volker Lendecke
aa5abcaf7e s3: Make winbindd_reinit_after_fork return NTSTATUS 2011-04-29 16:57:37 +02:00
Volker Lendecke
0757688eb3 s3: In winbind, close parent/child sockets
This should further reduce fd load in winbind children
2011-04-29 16:57:36 +02:00
Günther Deschner
50883cfeb4 s3-tevent: only include ../lib/util/tevent wrappers where needed.
Guenther

Autobuild-User: Günther Deschner <gd@samba.org>
Autobuild-Date: Fri Apr 29 14:00:30 CEST 2011 on sn-devel-104
2011-04-29 14:00:30 +02:00
Günther Deschner
6e3f0d28a4 s3-includes: only include ntdomain.h where needed.
Guenther
2011-03-30 01:13:09 +02:00
Günther Deschner
ab36d597e7 s3-messages: make ndr_messaging.h part of messages.h.
Guenther
2011-03-30 01:13:09 +02:00
Günther Deschner
b2af281e50 s3-messages: only include messages.h where needed.
Guenther
2011-03-30 01:13:09 +02:00
Volker Lendecke
16155812e0 s3: Fix Coverity ID 1048, CHECKED_RETURN
This is a real bug: tevent_req_set_endtime already calls tevent_req_nomem.

Autobuild-User: Volker Lendecke <vlendec@samba.org>
Autobuild-Date: Mon Mar 21 16:29:22 CET 2011 on sn-devel-104
2011-03-21 16:29:22 +01:00
Volker Lendecke
c6c666aa07 s3: Use poll in winbind 2011-02-28 16:40:19 +01:00
Günther Deschner
bc83400d81 nsswitch: make wb_reqtrans a common subsystem.
Guenther
2011-02-17 00:52:42 +01:00
Volker Lendecke
d038b45948 s3: Fix a typo
Autobuild-User: Volker Lendecke <vlendec@samba.org>
Autobuild-Date: Wed Feb  2 18:10:45 CET 2011 on sn-devel-104
2011-02-02 18:10:45 +01:00
Stefan Metzmacher
19d3779274 Revert "s3:events: Call all ready fd event handlers on each iteration of the main loop"
This reverts commit 455fccf86b.

I'll add a more generic fix for this problem.

metze
2011-01-31 16:16:09 +01:00
Volker Lendecke
9c2fcb689b s3:winbind: Fork multiple children per domain
This makes us scale better with many simultaneous winbind requests,
some of which might be slow.

This implementation breaks offline logons, as the cached credentials are
maintained in a child (this needs fixing). So, if the offline logons are
active, only allow one DC connection.

Probably the offline logon and the scalable file server cases are
separate enough so that this patch is useful even with the restriction.
2011-01-21 13:51:27 +01:00
Jeremy Allison
30d29e64cb All calls to event_add_to_select_args() call GetTimeOfDay() and
pass this in as the &now parameter. Push this call inside of
event_add_to_select_args() to the correct point so it doesn't
get called unless needed.

Jeremy.

Autobuild-User: Jeremy Allison <jra@samba.org>
Autobuild-Date: Thu Dec 23 01:08:11 CET 2010 on sn-devel-104
2010-12-23 01:08:11 +01:00
Volker Lendecke
a881d6ab86 wb_reqtrans is not used in libwbclient 2010-12-19 23:25:06 +01:00
Volker Lendecke
6bfd745c61 libwbclient: Put the wb_reqtrans definitions into wb_reqtrans.h 2010-12-19 23:25:06 +01:00
Volker Lendecke
f7d97868e4 s3: Fix bug 7844: Race in winbind
If a child dies, the parent process right away closes the socket.
This is wrong, with tevent we still have events pending. This works
fine for epoll but does not for at least the FreeBSD select variant.
Tevent sticks a closed socket into the select masks. This then
returns an error EBADF. When this happens, the parent winbind dies
instead of forking a new child.

This moves the socket close from the SIGCHLD cleanup function to
the socket receiver. I could not reproduce the parent death anymore
and it did not create an obvious fd leak.

Autobuild-User: Jeremy Allison <jra@samba.org>
Autobuild-Date: Mon Dec  6 23:21:02 CET 2010 on sn-devel-104
2010-12-06 23:21:02 +01:00
Volker Lendecke
fd9ea77a71 "bool ? true : false" is a bit pointless 2010-11-17 12:17:21 +01:00
Andrew Bartlett
f768b32e37 libcli/security Provide a common, top level libcli/security/security.h
This will reduce the noise from merges of the rest of the
libcli/security code, without this commit changing what code
is actually used.

This includes (along with other security headers) dom_sid.h and
security_token.h

Andrew Bartlett

Autobuild-User: Andrew Bartlett <abartlet@samba.org>
Autobuild-Date: Tue Oct 12 05:54:10 UTC 2010 on sn-devel-104
2010-10-12 05:54:10 +00:00
Steven Danneman
455fccf86b s3:events: Call all ready fd event handlers on each iteration of the main loop
Previously, only one fd handler was being called per main message loop
in all smbd child processes.

In the case where multiple fds are available for reading the fd
corresponding to the event closest to the beginning of the event list
would be run.  Obviously this is arbitrary and could cause unfairness.

Usually, the first event fd is the network socket, meaning heavy load
of client requests can starve out other fd events such as oplock
or notify upcalls from the kernel.

In this patch, I have changed the behavior of run_events() to unset
any fd that it has already called a handler function, as well
as decrement the number of fds that were returned from select().
This allows the caller of run_events() to iterate it, until all
available fds have been handled.

I then changed the main loop in smbd child processes to iterate
run_events().  This way, all available fds are handled on each wake
of select, while still checking for timed or signalled events between
each handler function call.  I also added an explicit check for
EINTR from select(), which previously was masked by the fact that
run_events() would handle any signal event before the return code
was checked.

This required a signature change to run_events() but all other callers
should have no change in their behavior.  I also fixed a bug in
run_events() where it could be called with a selrtn value of -1,
doing unecessary looping through the fd_event list when no fds were
available.

Also, remove the temporary echo handler hack, as all fds should be
treated fairly now.
2010-10-01 13:31:33 -07:00
Günther Deschner
b38d0542e1 samba: share select wrappers.
Guenther
2010-10-01 22:30:22 +02:00
Volker Lendecke
bad98e37e7 s3: Add "smbcontrol winbindd ip-dropped <local-ip>"
This is supposed to improve the winbind reconnect time after an ip address
has been moved away from a box. Any kind of HA scenario will benefit from
this, because winbindd does not have to wait for the TCP timeout to kick in
when a local IP address has been dropped and DC replies are not received
anymore.
2010-09-30 14:30:33 +02:00
Volker Lendecke
10f0c785c7 s3: Re-introduce a procid_self()
Giving the parent pid to reinit_after_fork is not a good idea....
None of the other callers do this, checked it.
2010-09-30 14:29:56 +02:00
Björn Jacke
306465a5a4 s3/winbind: use mono time for startup timeout check 2010-09-10 23:10:26 +02:00
Stefan Metzmacher
760948a5d4 s3:winbindd: remove rpc_pipe_client references from winbind_dual_ndr code
metze
2010-08-16 14:30:21 +02:00
Stefan Metzmacher
2ccaa23558 s3:winbindd: add binding_handle to struct winbindd_child
metze
2010-08-16 14:30:20 +02:00
Günther Deschner
c136b84f0d s3-secrets: only include secrets.h when needed.
Guenther
2010-08-05 10:12:25 +02:00
Volker Lendecke
7ac58281ae s3: Remove a direct use of procid_self() 2010-07-18 21:22:41 +02:00
Günther Deschner
76a084feee s3-winbindd: Fix child logfile handling which broke with c67cff0372.
Andreas, please check.

Guenther
2010-07-07 17:01:09 +02:00
Andreas Schneider
c67cff0372 s3-winbind: Create all logfiles in the same directory.
If log file is set in the config file, we should create the log files of
the winbind child processes in the same directory.
2010-07-06 18:38:13 +02:00
Volker Lendecke
7f0e6df883 s3: Pass the new server_id through reinit_after_fork 2010-07-04 17:29:23 +02:00
Andrew Bartlett
cdf0704272 s3:winbindd Rename 'children' to 'winbindd_children' and make static 2010-05-13 10:12:26 +10:00
Günther Deschner
c6ebab846d s3: only include gen_ndr headers where needed.
This shrinks include/includes.h.gch by the size of 7 MB and reduces build time
as follows:

ccache build w/o patch
real    4m21.529s
ccache build with patch
real    3m6.402s

pch build w/o patch
real    4m26.318s
pch build with patch
real    3m6.932s

Guenther
2010-05-06 00:22:59 +02:00
Volker Lendecke
fd3eeb3878 s3: async_domain_request is no longer used 2010-04-25 12:32:02 +02:00
Volker Lendecke
dbb7db6c25 s3: sendto_domain() is lo longer used 2010-04-24 11:12:19 +02:00
Stefan Metzmacher
73577205cf s3:winbindd: fix problems with SIGCHLD handling (bug #7317)
The main problem is that we call CatchChild() within the
parent winbindd, which overwrites the signal handler
that was registered by winbindd_setup_sig_chld_handler().

That means winbindd_sig_chld_handler() and winbind_child_died()
are never triggered when a winbindd domain child dies.
As a result will get "broken pipe" for all requests to that domain.

To reduce the risk of similar bugs in future we call
CatchChild() in winbindd_reinit_after_fork() now.

We also use a full winbindd_reinit_after_fork() in the
cache validation child now instead instead of just resetting
the SIGCHLD handler by hand. This will also fix possible
tdb problems on systems without pread/pwrite and disabled mmap
as we now correctly reopen the tdb handle for the child.

metze
2010-04-01 17:25:11 +02:00
Stefan Metzmacher
0f95d00f49 s3:winbindd: only set child_domain in the child
metze
2010-04-01 13:01:26 +02:00
Roel van Meer
cfc79f222d Fix one of the valgrind warnings from bug #6814 - Fixes for problems reported by valgrind
The timeval passed to event_add_to_select_args() must be initialized
as event_add_to_select_args() uses a timeval_min() on this and next_event.
2010-02-26 14:54:22 -08:00
Volker Lendecke
063900ae63 s3: Fix a typo 2010-01-02 12:09:05 +01:00
Volker Lendecke
2daa084da4 s3: Simplify "setup_domain_child" slightly 2009-12-28 14:59:45 +01:00
Volker Lendecke
50e5f9dc51 s3: Fix some nonempty blank lines 2009-12-26 12:26:06 +01:00
Volker Lendecke
6dc924fcf3 s3: Remove some unused code 2009-12-23 12:02:19 +01:00