BUG/MEDIUM: server: clear purgeable conns before server deletion

Since the following commit, idle connections are cleared before a server
is deleted. This is better than blocking server deletion due to inactive
connections:

  6e0afb2e274952663957121ea33cb6bae574fc2e
  MEDIUM: server: close idle conn on server deletion

A BUG_ON() was added to ensure that the server idle connection counter
is zero after these connections are removed. However, Willy managed to
trigger it easily by repeatedly and randomly deleting servers on a
single-thread haproxy using a server-template with 1000 instances, while
an h1load client was running in parallel to generate traffic.

This BUG_ON() showed that some connections referencing the server
targeted for deletion remained, even though the server idle list was
empty. In fact, this is caused by connections scheduled for purging.
These connections are moved from the server idle list to a global
toremove_list while still being accounted for by the server.
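
The mismatch described above can be sketched with a toy model. This is
not HAProxy code: every type and function name below is hypothetical,
and the lists are plain single-threaded linked lists. It only shows how
an idle list can be empty while the server counter is still non-zero.

```c
#include <stddef.h>

/* Toy model, not HAProxy code: all names below are hypothetical. */
struct toy_server {
	unsigned int curr_idle_conns;  /* idle conns accounted to the server */
};

struct toy_conn {
	struct toy_server *target;     /* server referenced by the connection */
	struct toy_conn *next;         /* link in whichever list holds it */
};

struct toy_list {
	struct toy_conn *head;
};

static void toy_push(struct toy_list *l, struct toy_conn *c)
{
	c->next = l->head;
	l->head = c;
}

static struct toy_conn *toy_pop(struct toy_list *l)
{
	struct toy_conn *c = l->head;

	if (c) {
		l->head = c->next;
		c->next = NULL;
	}
	return c;
}

/* Register a new idle connection on the server idle list. */
static void toy_add_idle(struct toy_list *idle, struct toy_server *srv,
                         struct toy_conn *c)
{
	c->target = srv;
	srv->curr_idle_conns++;
	toy_push(idle, c);
}

/* Schedule one idle connection for purging: it leaves the idle list,
 * but the server counter is left untouched and the connection still
 * points to the server. Returns 1 if a connection was moved, else 0.
 */
static int toy_schedule_purge(struct toy_list *idle, struct toy_list *toremove)
{
	struct toy_conn *c = toy_pop(idle);

	if (!c)
		return 0;
	toy_push(toremove, c);
	return 1;
}
```

In this model, once every idle connection has gone through
toy_schedule_purge(), the idle list is empty but curr_idle_conns still
holds the number of connections waiting on the purge list.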

A first approach could be to decrement the server idle counter when
moving a connection to the purge list. However, this is functionally
incorrect as these purgeable connections still reference the server, and
releasing them after the server has been freed could cause a crash.

The correct fix for this issue is simply to remove every purgeable
connection before a server is deleted. This patch implements it by
extending cli_parse_delete_server(). It could be enough to only remove
the connections targeting the deleted server, but as these connections
will be purged anyway, it is justified to clear the whole list.
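
The ordering chosen here can be illustrated with a minimal sketch. It is
a toy model under stated assumptions, not the actual implementation:
all names are hypothetical, the release function is assumed to be what
drops the server accounting, and a plain assert() stands in for the
BUG_ON().

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not HAProxy code: every name here is hypothetical. */
struct toy_server {
	unsigned int curr_idle_conns;  /* conns still accounted to the server */
	int deleted;
};

struct toy_conn {
	struct toy_server *target;     /* server referenced by the connection */
	struct toy_conn *next;
};

/* Release a connection: drop its accounting on the referenced server. */
static void toy_conn_release(struct toy_conn *c)
{
	c->target->curr_idle_conns--;
	c->target = NULL;
}

/* Pop and release every connection of a list. Returns the count. */
static unsigned int toy_drain(struct toy_conn **head)
{
	unsigned int n = 0;
	struct toy_conn *c;

	while ((c = *head)) {
		*head = c->next;
		c->next = NULL;
		toy_conn_release(c);
		n++;
	}
	return n;
}

/* Delete a server: drain its idle list, then the whole purge list, and
 * only then check that nothing is accounted to the server anymore.
 */
static void toy_delete_server(struct toy_server *srv,
                              struct toy_conn **idle,
                              struct toy_conn **toremove)
{
	toy_drain(idle);
	toy_drain(toremove);  /* may also release conns of other servers */
	assert(srv->curr_idle_conns == 0);  /* stands in for the BUG_ON() */
	srv->deleted = 1;
}
```

Draining the whole purge list also releases connections that target
other servers. In this model that is harmless, since those connections
were going to be released anyway.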

This must not be backported, unless the above mentioned patch is.
Amaury Denoyelle 2024-05-15 14:28:21 +02:00
parent 231d3d32be
commit 412f1eeb89

@@ -5971,6 +5971,14 @@ static int cli_parse_delete_server(char **args, char *payload, struct appctx *ap
 			conn_release(conn);
 		}
 
+		/* Also remove all purgeable conns as some of them may still
+		 * reference the currently deleted server.
+		 */
+		while ((conn = MT_LIST_POP(&idle_conns[i].toremove_conns,
+		                           struct connection *, toremove_list))) {
+			conn_release(conn);
+		}
+
 		if ((i = ((i + 1 == global.nbthread) ? 0 : i + 1)) == tid)
 			break;
 	}