mirror of
https://github.com/samba-team/samba.git
synced 2025-01-08 21:18:16 +03:00
bc16a8abe3
There were some reports that, according to strace output, an LDAP server
socket is in CLOSE_WAIT state, returning EAGAIN for writev over and over
(after a call to epoll_wait() each time).
In the tstream_bsd code the problem happens when we have a pending
writev_send, while there's no readv_send pending. In that case
we still ask for TEVENT_FD_READ in order to notice connection errors
early, so we try to call writev even if the socket doesn't report TEVENT_FD_WRITE.
And there are situations where we do that over and over again.
It happens like this with a Linux kernel:
tcp_fin() has this:
    struct tcp_sock *tp = tcp_sk(sk);

    inet_csk_schedule_ack(sk);

    sk->sk_shutdown |= RCV_SHUTDOWN;
    sock_set_flag(sk, SOCK_DONE);

    switch (sk->sk_state) {
    case TCP_SYN_RECV:
    case TCP_ESTABLISHED:
        /* Move to CLOSE_WAIT */
        tcp_set_state(sk, TCP_CLOSE_WAIT);
        inet_csk_enter_pingpong_mode(sk);
        break;
It means RCV_SHUTDOWN gets set as well as TCP_CLOSE_WAIT, but
sk->sk_err is not changed to indicate an error.
tcp_sendmsg_locked has this:
    ...
    err = -EPIPE;
    if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
        goto do_error;

    while (msg_data_left(msg)) {
        int copy = 0;

        skb = tcp_write_queue_tail(sk);
        if (skb)
            copy = size_goal - skb->len;

        if (copy <= 0 || !tcp_skb_can_collapse_to(skb)) {
            bool first_skb;

    new_segment:
            if (!sk_stream_memory_free(sk))
                goto wait_for_space;
    ...
    wait_for_space:
        set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
        if (copied)
            tcp_push(sk, flags & ~MSG_MORE, mss_now,
                     TCP_NAGLE_PUSH, size_goal);

        err = sk_stream_wait_memory(sk, &timeo);
        if (err != 0)
            goto do_error;
It means the check if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
doesn't hit, as we only have RCV_SHUTDOWN set. Once the send buffer is
full, sk_stream_wait_memory() returns -EAGAIN for a non-blocking socket.
tcp_poll has this:
    if (sk->sk_shutdown & RCV_SHUTDOWN)
        mask |= EPOLLIN | EPOLLRDNORM | EPOLLRDHUP;
So we'll get EPOLLIN | EPOLLRDNORM | EPOLLRDHUP triggering
TEVENT_FD_READ and writev/sendmsg keeps getting EAGAIN.
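
The combination described above (readable/RDHUP reported by poll while
send keeps failing with EAGAIN) can be reproduced in a small sketch.
This is not Samba code: demo_half_close() and struct half_close_result
are hypothetical names, and an AF_UNIX socketpair is used as a stand-in
for a TCP connection in CLOSE_WAIT, on the assumption that Linux reports
a half-closed peer the same way for both. The POLLRDHUP fallback value
is also an assumption (the Linux value on most architectures).

```c
#define _GNU_SOURCE /* for POLLRDHUP */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef POLLRDHUP
#define POLLRDHUP 0x2000 /* assumption: Linux value on most architectures */
#endif

/* Hypothetical result type, for illustration only. */
struct half_close_result {
	short revents;  /* what poll() reported for our end */
	int send_errno; /* errno of the send() that finally failed */
};

/* Returns 0 on success, -1 if the sockets could not be set up. */
static int demo_half_close(struct half_close_result *res)
{
	int sv[2];
	char buf[4096];
	struct pollfd pfd;
	ssize_t n;
	int flags;

	assert(res != NULL);

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
		return -1;
	}

	/* The peer stops sending but never reads: our end is half-closed. */
	if (shutdown(sv[1], SHUT_WR) != 0) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}

	flags = fcntl(sv[0], F_GETFL, 0);
	if (flags == -1 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) != 0) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}

	/* Fill the send buffer until the kernel refuses more data. */
	memset(buf, 0, sizeof(buf));
	do {
		n = send(sv[0], buf, sizeof(buf), MSG_NOSIGNAL);
	} while (n > 0);
	res->send_errno = (n < 0) ? errno : 0;

	/* Ask poll() what it thinks about the same socket. */
	pfd.fd = sv[0];
	pfd.events = POLLIN | POLLOUT | POLLRDHUP;
	pfd.revents = 0;
	if (poll(&pfd, 1, 0) < 0) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	res->revents = pfd.revents;

	close(sv[0]);
	close(sv[1]);
	return 0;
}
```

An event loop that reacts to the readable event by retrying the write
would spin in exactly this state: the event stays set and every write
attempt fails without blocking.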
So we need to always clear TEVENT_FD_READ if we don't
have a readable handler, in order to avoid burning CPU.
But we turn it on again after a timeout of 1 second
in order to monitor the error state of the connection.
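
The clear-then-rearm idea can be modelled without tevent as a tiny
state machine. This is only a sketch under stated assumptions: struct
sketch_watch, sketch_on_read_event() and sketch_on_timer() are
hypothetical names, time is passed in explicitly in milliseconds, and
the real code would manipulate tevent fd flags and a tevent timer
instead of a plain struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The 1 second retry interval mentioned in the text above. */
#define SKETCH_READ_RETRY_MS 1000

/* Hypothetical per-socket state, for illustration only. */
struct sketch_watch {
	bool have_readable_handler; /* is a readv request pending? */
	bool read_enabled;          /* do we currently ask for read events? */
	uint64_t reenable_at_ms;    /* when to ask for read events again */
};

/* Called when the event loop reports the socket as readable. */
static void sketch_on_read_event(struct sketch_watch *w, uint64_t now_ms)
{
	assert(w != NULL);

	if (w->have_readable_handler) {
		/* A reader will consume the event, nothing to throttle. */
		return;
	}

	/* Nobody can consume the event: stop asking for it for a while. */
	w->read_enabled = false;
	w->reenable_at_ms = now_ms + SKETCH_READ_RETRY_MS;
}

/* Called from a timer: resume monitoring once the interval has passed. */
static void sketch_on_timer(struct sketch_watch *w, uint64_t now_ms)
{
	assert(w != NULL);

	if (!w->read_enabled && now_ms >= w->reenable_at_ms) {
		w->read_enabled = true;
	}
}
```

With this shape, a socket stuck in a half-closed state wakes the loop
at most about once per second instead of on every iteration.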
And now that our tsocket_bsd_error() helper checks for POLLRDHUP,
we can check if the socket is in an error state before calling the
writable handler when TEVENT_FD_READ was reported.
Only on error will we call the writable handler, which will pick
up the error without calling writev().
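
A check of this kind might look roughly like the following sketch. It
is not the actual tsocket_bsd_error() implementation:
sketch_socket_error() and sketch_call_writable_on_read_event() are
hypothetical names, mapping a hangup to EPIPE is an assumption made for
the example, and the POLLRDHUP fallback value is the Linux value on
most architectures.

```c
#define _GNU_SOURCE /* for POLLRDHUP */
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef POLLRDHUP
#define POLLRDHUP 0x2000 /* assumption: Linux value on most architectures */
#endif

/*
 * Hypothetical helper: returns 0 if the socket looks healthy,
 * otherwise an errno value describing the problem.
 */
static int sketch_socket_error(int fd)
{
	struct pollfd pfd = { .fd = fd, .events = POLLRDHUP, .revents = 0 };
	int err = 0;
	socklen_t len = sizeof(err);

	assert(fd >= 0);

	/* A pending asynchronous error (note: reading SO_ERROR clears it). */
	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) != 0) {
		return errno;
	}
	if (err != 0) {
		return err;
	}

	/* No sk_err set, but the peer may have closed its sending side. */
	if (poll(&pfd, 1, 0) > 0 &&
	    (pfd.revents & (POLLERR | POLLHUP | POLLRDHUP))) {
		return EPIPE; /* assumption: errno chosen for this sketch */
	}

	return 0;
}

/*
 * Decision for a read event that arrives while only a write is pending:
 * invoke the writable handler only if it has an error to report.
 */
static bool sketch_call_writable_on_read_event(int fd)
{
	return sketch_socket_error(fd) != 0;
}
```

The point of the design is that a read event alone never leads to a
writev() attempt; it only leads to an error being delivered.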
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15202
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Ralph Boehme <slow@samba.org>
(cherry picked from commit
tests
doxy.config
tsocket_bsd.c
tsocket_guide.txt
tsocket_helpers.c
tsocket_internal.h
tsocket.c
tsocket.h
wscript_build