BUG/MEDIUM: muxes: enforce buf_wait check in takeover()

The ->takeover() callback is quite tricky. It didn't take care of the
possibility that the original thread's connection handler had been woken
up to handle an event (e.g. read0), failed to get a buffer, registered
against its own thread's buffer_wait queue and left the connection in an
idle state.

A new thread could then come by, perform a takeover(), and when a buffer
was available, the new thread's tasklet would be woken up by the old one
via *_buf_available(), causing all sorts of problems. These problems are
easy to reproduce, by running with shared backend connections and few
buffers (tune.buffers.limit=20, 8 threads, 500 connections, transfer
64kB objects and wait 2-5s for a crash to appear).

The first solution considered consisted in removing the connection from
the idle list, but it turned out to be worse for server deletion (the
connection no longer appearing as idle, making it impossible to find it
in order to close it). Also, idle counts would no longer match the list's
state, and the special case of private connections could be difficult to
handle as the connection could be forcefully re-added to the idle list
after allocation despite being private.

After multiple attempts to address the problem in various ways, it appears
that the only reliable solution for now (without starting to turn many
lists to mt_lists) is to have the takeover() function handle the buf_wait
detection or unregistration itself:

  - when doing a regular takeover aiming at finding an idle connection
    for a new request, connections that are blocked in a buffer_wait
    queue are quite rare and not interesting at all (since not immediately
    usable), so skipping them is sufficient. For this we detect that the
    desired connection belongs to a buffer_wait list by checking its
    buf_wait.list element. Note that this check is *not* thread-safe! The
    LIST_DEL_INIT() is performed by __offer_buffers() after the callback
    was called. But this is sufficient as it is now because the only way
    for the element to be seen as not in a list is after the element was
    last touched by __offer_buffers(), so the situation for this connection
    will not change in a different way later.

  - when doing a server delete, we're running under thread isolation.
    The connection might get taken over to be killed. The only trick is
    that private connections not belonging to any idle list may also
    experience this, and in this case even the idle_conns lock will not
    offer any protection against anything. But since we're running under
    thread isolation, we're certain not to compete with the other thread,
    so it's safe to directly unregister the connection from its owner
    thread. Normally this is already handled by conn_release() in
    cli_parse_delete_server(), which calls mux->destroy(), but this would
    actually update the current thread's queue instead of the origin
    thread's, thus we do need to perform an explicit dequeue before
    completing the takeover.
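
The two cases above can be illustrated with a small standalone model. This
is only a sketch under simplifying assumptions, not the HAProxy code: every
sketch_* name is hypothetical, the per-thread waiter counter merely stands
in for the real per-thread queue bookkeeping, and thread isolation is
reduced to a boolean argument.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_THREADS 4

/* Circular doubly-linked list element. An element pointing to itself is
 * "not in a list", which is the property an in-list check relies on.
 */
struct sketch_list {
	struct sketch_list *n, *p;
};

/* Hypothetical per-thread state: one buffer_wait queue plus a waiter
 * count standing in for the per-thread queue bookkeeping.
 */
struct sketch_thread {
	struct sketch_list buffer_wq;
	int waiters;
};

struct sketch_conn {
	struct sketch_list buf_wait; /* linked while waiting for a buffer */
	int tid;                     /* owner thread */
};

struct sketch_thread sketch_threads[SKETCH_MAX_THREADS];

void sketch_list_init(struct sketch_list *l) { l->n = l->p = l; }
bool sketch_inlist(const struct sketch_list *l) { return l->n != l; }

void sketch_threads_init(void)
{
	for (int i = 0; i < SKETCH_MAX_THREADS; i++) {
		sketch_list_init(&sketch_threads[i].buffer_wq);
		sketch_threads[i].waiters = 0;
	}
}

void sketch_conn_init(struct sketch_conn *c, int tid)
{
	sketch_list_init(&c->buf_wait);
	c->tid = tid;
}

/* The owner thread failed to get a buffer: queue on its own thread. */
void sketch_wait_for_buffer(struct sketch_conn *c)
{
	struct sketch_list *head = &sketch_threads[c->tid].buffer_wq;

	c->buf_wait.p = head->p;
	c->buf_wait.n = head;
	head->p->n = &c->buf_wait;
	head->p = &c->buf_wait;
	sketch_threads[c->tid].waiters++;
}

/* Dequeue from thread <tid>'s queue. <tid> must be the thread the element
 * was queued on, which is not necessarily the calling thread.
 */
void sketch_dequeue(struct sketch_list *el, int tid)
{
	el->n->p = el->p;
	el->p->n = el->n;
	sketch_list_init(el);
	sketch_threads[tid].waiters--;
}

/* Returns 0 on success, -1 when the connection must be skipped. */
int sketch_takeover(struct sketch_conn *c, int new_tid, bool release,
                    bool isolated)
{
	int orig_tid = c->tid;

	if (!release) {
		/* regular takeover: a waiting connection is not usable */
		if (sketch_inlist(&c->buf_wait))
			return -1;
	} else {
		/* server deletion: only safe under thread isolation */
		assert(isolated);
		if (sketch_inlist(&c->buf_wait))
			sketch_dequeue(&c->buf_wait, orig_tid);
	}
	c->tid = new_tid;
	return 0;
}
```

In this model, a connection queued on thread 1 is refused by a regular
takeover and keeps its owner, while a takeover with release set removes it
from thread 1's queue (and decrements thread 1's counter, not the caller's)
before the owner changes.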

With this, the problem now looks solved for HTTP/1, HTTP/2 and FCGI,
though extensive tests were essentially run on HTTP/1 and HTTP/2.

While the problem has been there for a very long time, there should be
no reason to backport it since buffer_wait didn't practically work
before 3.0-dev and the process used to freeze hard very quickly before
we'd even have a chance to meet that race.
Willy Tarreau 2024-05-14 19:26:44 +02:00
parent b0349cf2de
commit 821a04377d
3 changed files with 66 additions and 0 deletions

@@ -4173,6 +4173,14 @@ static int fcgi_takeover(struct connection *conn, int orig_tid, int release)
 	 * has been migrated.
 	 */
 	if (!release) {
+		/* If the connection is attached to a buffer_wait (extremely
+		 * rare), it will be woken up at any instant by its own thread
+		 * and we can't undo it anyway, so let's give up on this one.
+		 * It's not interesting anyway since it's not usable right now.
+		 */
+		if (LIST_INLIST(&fcgi->buf_wait.list))
+			goto fail;
+
 		new_task = task_new_here();
 		new_tasklet = tasklet_new();
 		if (!new_task || !new_tasklet)
@@ -4229,6 +4237,20 @@ static int fcgi_takeover(struct connection *conn, int orig_tid, int release)
 		                            SUB_RETRY_RECV, &fcgi->wait_event);
 	}
 
+	if (release) {
+		/* we're being called for a server deletion and are running
+		 * under thread isolation. That's the only way we can
+		 * unregister a possible subscription of the original
+		 * connection from its owner thread's queue, as this involves
+		 * manipulating thread-unsafe areas. Note that it is not
+		 * possible to just call b_dequeue() here as it would update
+		 * the current thread's bufq_map and not the original one.
+		 */
+		BUG_ON(!thread_isolated());
+		if (LIST_INLIST(&fcgi->buf_wait.list))
+			_b_dequeue(&fcgi->buf_wait, orig_tid);
+	}
+
 	if (new_task)
 		__task_free(new_task);
 	return 0;

@@ -5264,6 +5264,14 @@ static int h1_takeover(struct connection *conn, int orig_tid, int release)
 	 * has been migrated.
 	 */
 	if (!release) {
+		/* If the connection is attached to a buffer_wait (extremely
+		 * rare), it will be woken up at any instant by its own thread
+		 * and we can't undo it anyway, so let's give up on this one.
+		 * It's not interesting anyway since it's not usable right now.
+		 */
+		if (LIST_INLIST(&h1c->buf_wait.list))
+			goto fail;
+
 		new_task = task_new_here();
 		new_tasklet = tasklet_new();
 		if (!new_task || !new_tasklet)
@@ -5320,6 +5328,20 @@ static int h1_takeover(struct connection *conn, int orig_tid, int release)
 		                           SUB_RETRY_RECV, &h1c->wait_event);
 	}
 
+	if (release) {
+		/* we're being called for a server deletion and are running
+		 * under thread isolation. That's the only way we can
+		 * unregister a possible subscription of the original
+		 * connection from its owner thread's queue, as this involves
+		 * manipulating thread-unsafe areas. Note that it is not
+		 * possible to just call b_dequeue() here as it would update
+		 * the current thread's bufq_map and not the original one.
+		 */
+		BUG_ON(!thread_isolated());
+		if (LIST_INLIST(&h1c->buf_wait.list))
+			_b_dequeue(&h1c->buf_wait, orig_tid);
+	}
+
 	if (new_task)
 		__task_free(new_task);
 	return 0;

@@ -7539,6 +7539,14 @@ static int h2_takeover(struct connection *conn, int orig_tid, int release)
 	 * has been migrated.
 	 */
 	if (!release) {
+		/* If the connection is attached to a buffer_wait (extremely
+		 * rare), it will be woken up at any instant by its own thread
+		 * and we can't undo it anyway, so let's give up on this one.
+		 * It's not interesting anyway since it's not usable right now.
+		 */
+		if (LIST_INLIST(&h2c->buf_wait.list))
+			goto fail;
+
 		new_task = task_new_here();
 		new_tasklet = tasklet_new();
 		if (!new_task || !new_tasklet)
@@ -7595,6 +7603,20 @@ static int h2_takeover(struct connection *conn, int orig_tid, int release)
 		                           SUB_RETRY_RECV, &h2c->wait_event);
 	}
 
+	if (release) {
+		/* we're being called for a server deletion and are running
+		 * under thread isolation. That's the only way we can
+		 * unregister a possible subscription of the original
+		 * connection from its owner thread's queue, as this involves
+		 * manipulating thread-unsafe areas. Note that it is not
+		 * possible to just call b_dequeue() here as it would update
+		 * the current thread's bufq_map and not the original one.
+		 */
+		BUG_ON(!thread_isolated());
+		if (LIST_INLIST(&h2c->buf_wait.list))
+			_b_dequeue(&h2c->buf_wait, orig_tid);
+	}
+
 	if (new_task)
 		__task_free(new_task);
 	return 0;