BUG/MINOR: soft-stop: always wake up waiting threads on stopping
Currently the soft-stop can lead to old processes remaining alive for as long as two seconds after receiving a soft-stop signal. What happens is that when receiving SIGUSR1, one thread (usually the first one) wakes up, handles the signal, sets "stopping", goes into run_poll_loop(), and discovers that stopping is set, so it also sets its own bit in the stopping_thread_mask bit mask. After this it sees that other threads are not yet willing to stop, so it continues to wait.

From there, other threads which were waiting in poll() expire after one second on poll timeout and enter run_poll_loop() in turn. That's already one second of wait time. They each discover in turn that they're stopping and see that other threads are not yet stopping, so they go back to waiting.

After the end of the first second, all threads know they're stopping and have set their bit in stopping_thread_mask. It's only now that those which started to wait first wake up again on timeout to discover that all other ones are stopping, and can now quit. One second later all threads will have done it and the process will quit.

The total delay is thus strictly larger than one second and up to two seconds.

What the current patch does is simple: when the first thread stops, it sets its own bit in stopping_thread_mask then wakes up all other threads so that they also set theirs. This kills the first second, which corresponds to the time to discover the stopping state. Second, when a thread exits, it wakes all other ones again because some might have gone back to sleep waiting for "jobs" to go down to zero (i.e. closing the last connection). This kills the last second of wait time.

Thanks to this, SIGUSR1 now acts instantly again if there's no active connection, or stops the process immediately after the last connection has left if one was still present.

This should be backported as far as 2.0.
parent 32fba0a629
commit d7a6b2f742
@@ -2832,14 +2832,26 @@ void run_poll_loop()
 		}
 
 		if (!wake) {
-			if (stopping)
+			int i;
+
+			if (stopping) {
 				_HA_ATOMIC_OR(&stopping_thread_mask, tid_bit);
+				/* notify all threads that stopping was just set */
+				for (i = 0; i < global.nbthread; i++)
+					if (((all_threads_mask & ~stopping_thread_mask) >> i) & 1)
+						wake_thread(i);
+			}
 
 			/* stop when there's nothing left to do */
 			if ((jobs - unstoppable_jobs) == 0 &&
-			    (stopping_thread_mask & all_threads_mask) == all_threads_mask)
+			    (stopping_thread_mask & all_threads_mask) == all_threads_mask) {
+				/* wake all threads waiting on jobs==0 */
+				for (i = 0; i < global.nbthread; i++)
+					if (((all_threads_mask & ~tid_bit) >> i) & 1)
+						wake_thread(i);
 				break;
+			}
 		}
 
 		/* If we have to sleep, measure how long */
 		next = wake ? TICK_ETERNITY : next_timer_expiry();
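The two wake-up loops in the patch differ only in the mask they walk. The following is a minimal single-threaded sketch of that selection logic, not HAProxy code: `fake_wake_thread()`, `wake_targets()`, `announce_stopping()` and `announce_exit()` are hypothetical names introduced for illustration, the globals are local stand-ins for the ones used in the diff, and a plain `|=` replaces `_HA_ATOMIC_OR()` because nothing runs concurrently here.

```c
#include <assert.h>

/* Hypothetical stand-ins for the globals referenced by the patch. */
static unsigned long all_threads_mask;
static unsigned long stopping_thread_mask;
static int nbthread;

/* Records which threads were "woken"; stands in for wake_thread(). */
static unsigned long woken_mask;

static void fake_wake_thread(int thr)
{
	woken_mask |= 1UL << thr;
}

/* Wake every thread whose bit is set in <targets>. */
static void wake_targets(unsigned long targets)
{
	int i;

	for (i = 0; i < nbthread; i++)
		if ((targets >> i) & 1)
			fake_wake_thread(i);
}

/* First loop: thread <tid> sees "stopping", sets its own bit, then
 * wakes the threads that have not set theirs yet. Returns the mask
 * of threads that were woken.
 */
static unsigned long announce_stopping(int tid)
{
	assert(tid >= 0 && tid < nbthread);
	woken_mask = 0;
	stopping_thread_mask |= 1UL << tid; /* non-atomic in this sketch */
	wake_targets(all_threads_mask & ~stopping_thread_mask);
	return woken_mask;
}

/* Second loop: thread <tid> is about to leave the poll loop, so it
 * wakes every other thread, since some may be sleeping until jobs
 * reaches zero. Returns the mask of threads that were woken.
 */
static unsigned long announce_exit(int tid)
{
	assert(tid >= 0 && tid < nbthread);
	woken_mask = 0;
	wake_targets(all_threads_mask & ~(1UL << tid));
	return woken_mask;
}
```

With four threads (`nbthread = 4`, `all_threads_mask = 0xF`) and an empty `stopping_thread_mask`, calling `announce_stopping(0)` wakes threads 1 to 3. A later `announce_stopping(2)` wakes only threads 1 and 3, since thread 0 has already set its bit. `announce_exit(0)` wakes threads 1 to 3 regardless of the stopping mask.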