BUG/MAJOR: server/addr: fix a race during server addr:svc_port updates
For inet families (IP4/IP6), it is expected that a server's addr/port might be updated at runtime from DNS, the CLI or Lua, for instance. Such updates were performed under the server's lock.

Unfortunately, most readers, such as backend.c or sink.c, perform the read without taking the server's lock because they cannot afford slowing down their processing for a type of event which is normally rare. But this could result in bad values being read for the server's addr:svc_port tuple (i.e. during connection establishment) as a result of concurrent updates from external components, which can obviously cause some undesirable effects.

Instead of slowing the readers down, as we consider server addr changes relatively rare, we take another approach and update the addr:port tuple atomically, by performing the changes under full thread isolation when a new change is requested. The changes are performed by a dedicated task which takes care of isolating the current thread and does not depend on other threads (independent code path) to protect against deadlocks.

As such, server addr:port changes will now be performed atomically, but they will not be applied instantly: they are translated to events that the dedicated task picks up from time to time to apply the pending changes.

This bug existed for a very long time and has never been reported so far. It was discovered by reading the code during the implementation of the log backend ("mode log" in backends). As it involves changes in sensitive areas as well as thread isolation, it is probably not worth considering backporting it for now, unless it is proven that it will help to solve bugs that are actually encountered in the field.

This patch depends on:
 - 24da4d3 ("MINOR: tools: use const for read only pointers in ip{cmp,cpy}")
 - c886fb5 ("MINOR: server/ip: centralize server ip updates")
 - the event_hdl API (first seen in 2.8), plus:
   + 683b2ae ("MINOR: server/event_hdl: add SERVER_INETADDR event")
   + "BUG/MEDIUM: server/event_hdl: memory overrun in _srv_event_hdl_prepare_inetaddr()"
   + "MINOR: event_hdl: add global tunables"

Note that the patch may be reworked so that it does not depend on the event_hdl API for older versions; the approach would remain the same, but this would result in a larger patch due to the need to manually implement a global queue of pending updates with a dedicated task responsible for picking up updates and committing them.

An alternative approach could consist in per-server, lock-protected, temporary addr:svc_port storage dedicated to "updaters", where only the most recent values would be kept. The sync task would then use them as source values to atomically update the addr:svc_port members that the runtime readers are actually using.
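To make the race described above concrete, here is a minimal standalone sketch (hypothetical types and values, not HAProxy code) of how a lockless reader can observe a torn addr:port tuple when a writer updates the two members separately:

    /* Deliberately racy sketch: the writer stores addr and svc_port
     * as two separate writes, so a concurrent lockless reader may see
     * a (new addr, old port) combination that never existed.
     */
    #include <pthread.h>
    #include <stdio.h>

    struct srv {
            unsigned int addr;     /* stand-in for struct sockaddr_storage */
            unsigned int svc_port;
    };

    static struct srv s = { .addr = 1, .svc_port = 80 };

    static void *writer(void *arg)
    {
            (void)arg;
            /* update to (2, 443): two non-atomic stores */
            s.addr = 2;
            /* a reader scheduled right here sees addr=2 with svc_port=80 */
            s.svc_port = 443;
            return NULL;
    }

    static void *reader(void *arg)
    {
            (void)arg;
            /* lockless read, as done on the fast path by backend.c/sink.c */
            printf("connecting to %u:%u\n", s.addr, s.svc_port);
            return NULL;
    }

    int main(void)
    {
            pthread_t w, r;

            pthread_create(&w, NULL, writer, NULL);
            pthread_create(&r, NULL, reader, NULL);
            pthread_join(w, NULL);
            pthread_join(r, NULL);
            return 0;
    }

The patch closes this window by applying both members in one step while every other thread is stopped (thread isolation), so readers can never run in the middle of an update.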
parent cb3ec978fd
commit cd994407a9

src/server.c (180 changed lines)
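For reference, the alternative approach mentioned at the end of the commit message (per-server, lock-protected staging of the most recent addr:svc_port) could look roughly like the sketch below. All names here are hypothetical and not part of the patch; it assumes HAProxy's HA_SPIN_* macros from <haproxy/thread.h>:

    /* Hypothetical sketch only: none of these names exist in the patch. */
    struct srv_staged_inetaddr {
            __decl_thread(HA_SPINLOCK_T lock); /* guards the staged values */
            struct sockaddr_storage addr;      /* most recent requested addr */
            unsigned int svc_port;             /* most recent requested port */
            int pending;                       /* set when an update is staged */
    };

    /* updater side: overwrite the staged values, keeping only the latest */
    static void srv_stage_inetaddr(struct srv_staged_inetaddr *staged,
                                   const struct sockaddr_storage *addr,
                                   unsigned int svc_port)
    {
            HA_SPIN_LOCK(SERVER_LOCK, &staged->lock);
            staged->addr = *addr;
            staged->svc_port = svc_port;
            staged->pending = 1;
            HA_SPIN_UNLOCK(SERVER_LOCK, &staged->lock);
    }

    /* sync task side: snapshot under the lock, release it, then commit
     * under thread isolation so lockless readers always see a consistent
     * addr:svc_port tuple (isolating while holding the lock could deadlock)
     */
    static void srv_commit_inetaddr(struct server *srv,
                                    struct srv_staged_inetaddr *staged)
    {
            struct sockaddr_storage addr;
            unsigned int svc_port;
            int pending;

            HA_SPIN_LOCK(SERVER_LOCK, &staged->lock);
            pending = staged->pending;
            addr = staged->addr;
            svc_port = staged->svc_port;
            staged->pending = 0;
            HA_SPIN_UNLOCK(SERVER_LOCK, &staged->lock);

            if (!pending)
                    return;

            /* all other threads are stopped here */
            thread_isolate_full();
            _srv_set_inetaddr(srv, &addr, svc_port);
            thread_release();
    }

The patch instead reuses the event_hdl queue, which avoids hand-rolling this staging area and its dedicated task.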
@@ -72,6 +72,8 @@ __decl_thread(HA_SPINLOCK_T idle_conn_srv_lock);
 struct eb_root idle_conn_srv = EB_ROOT;
 struct task *idle_conn_task __read_mostly = NULL;
 struct list servers_list = LIST_HEAD_INIT(servers_list);
+static struct task *server_atomic_sync_task = NULL;
+static event_hdl_async_equeue server_atomic_sync_queue;
 
 /* SERVER DELETE(n)->ADD global tracker:
  * This is meant to provide srv->rid (revision id) value.
@@ -168,7 +170,8 @@ int srv_getinter(const struct check *check)
 
 /* Update server's addr:svc_port tuple in INET context
  *
- * Must be called with server lock held
+ * Must be called under thread isolation to ensure consistent readings across
+ * all threads (addr:svc_port might be read without srv lock being held).
  */
 void _srv_set_inetaddr(struct server *srv, const struct sockaddr_storage *addr, unsigned int svc_port)
 {
@@ -176,6 +179,163 @@ void _srv_set_inetaddr(struct server *srv, const struct sockaddr_storage *addr,
 	srv->svc_port = svc_port;
 }
 
+/*
+ * Function executed by server_atomic_sync_task to perform atomic updates on
+ * compatible server struct members that are not guarded by any lock since
+ * they are not supposed to change often and are subject to being used in
+ * sensitive code paths.
+ *
+ * Some updates may require thread isolation: we start without isolation
+ * but as soon as we encounter an event that requires isolation, we do so.
+ * Once the event is processed, we keep the isolation until we've processed
+ * the whole batch of events and leave isolation once we're done, as it would
+ * be very costly to try to acquire isolation multiple times in a row.
+ * The task will limit itself to a number of events per run to prevent
+ * thread contention (see: "tune.events.max-events-at-once").
+ *
+ * TODO: if we find out that enforcing isolation is too costly, we may
+ * consider adding thread_isolate_try_full(timeout) or equivalent to the
+ * thread API so that we can do our best not to block harmless threads
+ * for too long if one or multiple threads are still heavily busy. This
+ * would mean that the task would be capable of rescheduling itself to
+ * start again on the current event if it failed to acquire thread
+ * isolation. This would also imply that the event_hdl API allows us
+ * to check an event without popping it from the queue first (remove the
+ * event once it is successfully processed).
+ */
+static void srv_set_addr_desc(struct server *s, int reattach);
+static struct task *server_atomic_sync(struct task *task, void *context, unsigned int state)
+{
+	unsigned int remain = event_hdl_tune.max_events_at_once; // to limit max number of events per batch
+	struct event_hdl_async_event *event;
+
+	/* check for new server events that we care about */
+	while ((event = event_hdl_async_equeue_pop(&server_atomic_sync_queue))) {
+		if (event_hdl_sub_type_equal(event->type, EVENT_HDL_SUB_END)) {
+			/* ending event: no more events to come */
+			event_hdl_async_free_event(event);
+			task_destroy(task);
+			task = NULL;
+			break;
+		}
+
+		if (!remain) {
+			/* STOP: we've already spent all our budget here, and
+			 * considering we possibly are under isolation, we cannot
+			 * keep blocking other threads any longer.
+			 *
+			 * Reschedule the task to finish where we left off if
+			 * there are remaining events in the queue.
+			 */
+			if (!event_hdl_async_equeue_isempty(&server_atomic_sync_queue))
+				task_wakeup(task, TASK_WOKEN_OTHER);
+			break;
+		}
+		remain--;
+
+		/* new event to process */
+		if (event_hdl_sub_type_equal(event->type, EVENT_HDL_SUB_SERVER_INETADDR)) {
+			struct sockaddr_storage new_addr;
+			struct event_hdl_cb_data_server_inetaddr *data = event->data;
+			struct proxy *px;
+			struct server *srv;
+
+			/* server ip:port changed, we must atomically update data members
+			 * to prevent invalid reads by other threads.
+			 */
+
+			/* check if related server still exists */
+			px = proxy_find_by_id(data->server.safe.proxy_uuid, PR_CAP_BE, 0);
+			if (!px)
+				continue;
+			srv = findserver_unique_id(px, data->server.safe.puid, data->server.safe.rid);
+			if (!srv)
+				continue;
+
+			/* prepare new addr based on event cb data */
+			memset(&new_addr, 0, sizeof(new_addr));
+			new_addr.ss_family = data->safe.next.family;
+			switch (new_addr.ss_family) {
+				case AF_INET:
+					((struct sockaddr_in *)&new_addr)->sin_addr.s_addr =
+						data->safe.next.addr.v4.s_addr;
+					break;
+				case AF_INET6:
+					memcpy(&((struct sockaddr_in6 *)&new_addr)->sin6_addr,
+					       &data->safe.next.addr.v6,
+					       sizeof(struct in6_addr));
+					break;
+				default:
+					/* should not happen */
+					break;
+			}
+
+			/*
+			 * this requires thread isolation, which is safe since we're the only
+			 * task working for the current subscription and we don't hold locks
+			 * or resources that other threads may depend on to complete a running
+			 * cycle. Note that we do it this way because we assume that this event
+			 * is rather rare.
+			 */
+			if (!thread_isolated())
+				thread_isolate_full();
+
+			/* apply new addr:port combination */
+			_srv_set_inetaddr(srv, &new_addr, data->safe.next.svc_port);
+
+			/* propagate the changes */
+			if (data->safe.purge_conn) /* force connection cleanup on the given server? */
+				srv_cleanup_connections(srv);
+			srv_set_dyncookie(srv);
+			srv_set_addr_desc(srv, 1);
+		}
+		event_hdl_async_free_event(event);
+	}
+
+	/* some events possibly required thread_isolation:
+	 * now that we are done, we must leave thread isolation before
+	 * returning
+	 */
+	if (thread_isolated())
+		thread_release();
+
+	return task;
+}
+
+/* Try to start the atomic server sync task.
+ *
+ * Returns ERR_NONE on success and a combination of ERR_CODE on failure
+ */
+static int server_atomic_sync_start()
+{
+	struct event_hdl_sub_type subscriptions = EVENT_HDL_SUB_NONE;
+
+	if (server_atomic_sync_task)
+		return ERR_NONE; // nothing to do
+	server_atomic_sync_task = task_new_anywhere();
+	if (!server_atomic_sync_task)
+		goto fail;
+	server_atomic_sync_task->process = server_atomic_sync;
+	event_hdl_async_equeue_init(&server_atomic_sync_queue);
+
+	/* task created, now subscribe to relevant server events in the global list */
+	subscriptions = event_hdl_sub_type_add(subscriptions, EVENT_HDL_SUB_SERVER_INETADDR);
+	if (!event_hdl_subscribe(NULL, subscriptions,
+	                         EVENT_HDL_ASYNC_TASK(&server_atomic_sync_queue,
+	                                              server_atomic_sync_task,
+	                                              NULL,
+	                                              NULL)))
+		goto fail;
+
+	return ERR_NONE;
+
+ fail:
+	task_destroy(server_atomic_sync_task);
+	server_atomic_sync_task = NULL;
+	return ERR_ALERT | ERR_FATAL;
+}
+REGISTER_POST_CHECK(server_atomic_sync_start);
+
 /* fill common server event data members struct
  * must be called with server lock or under thread isolate
  */
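The TODO in the comment above could translate into something like the following inside the event loop (purely illustrative: thread_isolate_try_full() is the suggested addition and does not exist in the thread API at this point, and it assumes event_hdl grows a way to peek an event without popping it):

    /* Illustrative only: thread_isolate_try_full() is the hypothetical
     * primitive suggested by the TODO, not an existing API.
     */
    if (!thread_isolated()) {
            if (!thread_isolate_try_full(10 /* ms budget */)) {
                    /* could not acquire isolation without blocking busy
                     * threads for too long: leave the event in the queue
                     * (requires peek-before-pop support in event_hdl) and
                     * retry on the next wakeup
                     */
                    task_wakeup(task, TASK_WOKEN_OTHER);
                    break;
            }
    }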
@@ -3601,14 +3761,9 @@ int srv_update_addr(struct server *s, void *ip, int ip_sin_family, const char *updater)
 	_srv_event_hdl_prepare(&cb_data.common, s, 0);
 	_srv_event_hdl_prepare_inetaddr(&cb_data.addr, &s->addr, s->svc_port, &new_addr, s->svc_port, 0);
 
-	/* apply the new IP address */
-	_srv_set_inetaddr(s, &new_addr, s->svc_port);
-
+	/* server_atomic_sync_task will apply the changes for us */
 	_srv_event_hdl_publish(EVENT_HDL_SUB_SERVER_INETADDR, cb_data, s);
 
-	srv_set_dyncookie(s);
-	srv_set_addr_desc(s, 1);
-
 	return 0;
 }
 
@@ -3866,17 +4021,8 @@ out:
 		                                ((port_change) ? new_port : s->svc_port),
 		                                1);
 
-		/* apply new ip and port */
-		_srv_set_inetaddr(s,
-		                  ((ip_change) ? &sa : &s->addr),
-		                  ((port_change) ? new_port : s->svc_port));
-
+		/* server_atomic_sync_task will apply the changes for us */
 		_srv_event_hdl_publish(EVENT_HDL_SUB_SERVER_INETADDR, cb_data, s);
-
-		/* force connection cleanup on the given server */
-		srv_cleanup_connections(s);
-		srv_set_dyncookie(s);
-		srv_set_addr_desc(s, 1);
 	}
 	if (updater)
 		chunk_appendf(msg, " by '%s'", updater);
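As a usage note, the per-run budget enforced by server_atomic_sync() comes from the global tunable referenced in the code and in the dependency list ("MINOR: event_hdl: add global tunables"). Assuming it is exposed as a global configuration keyword, a configuration sketch could look like this (the value is only an example):

    global
            tune.events.max-events-at-once 50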
|
Loading…
x
Reference in New Issue
Block a user