BUG/MINOR: listener: do not immediately resume on transient error

The listener supports a "transient error" situation, which corresponds to
those situations where accept fails badly but poll() reports an event. This
happens for example when a listener is paused, or on out of FD. The same
mechanism is used when facing a maxconn or maxsessrate limitation. When this
happens, the listener is disabled for up to 100ms and put back into the
global listener queue so that it automatically wakes up again as soon as the
conditions change from an existing connection releasing one resource, or the
system recovers from a transient issue.

The listener_accept() function has a bug in its exit path causing a freshly
limited listener to be immediately enabled again because all the conditions
are met (connection count < max). It doesn't take into account the fact that
the listener might have been queued and must first wait for the timeout to
expire before doing so. The impact is that upon certain errors, the faulty
process will busy loop on the accept code without sleeping. This is the
scenario reported and diagnosed by @hedong0411 in issue #382.

This commit fixes it by verifying that the global queue's delay is at least
expired before deciding to resume the listener. Another approach could
consist in having an extra state like LI_DELAY for situations where only a
delay is acceptable, but this would probably not bring anything except more
complex code.

This issue was introduced with the lock-free listener accept code (commits
3f0d02b and 82c9789a) that were backported to 1.8.20+ and 1.9.7+, so this
fix must be backported to the relevant branches.

(cherry picked from commit cdcba115b8a6d3773d5bd3c0fe6f8c239d356eab)
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 81a1ad0f526e5e4647e5603acac57f1fc0fd5184)
Signed-off-by: Willy Tarreau <w@1wt.eu>

parent 708c244026
commit 5de8d1fc35
@@ -1045,7 +1045,10 @@ void listener_accept(int fd)
 		_HA_ATOMIC_SUB(&actconn, 1);
 
 	if ((l->state == LI_FULL && (!l->maxconn || l->nbconn < l->maxconn)) ||
-	    (l->state == LI_LIMITED && ((!p || p->feconn < p->maxconn) && (actconn < global.maxconn)))) {
+	    (l->state == LI_LIMITED &&
+	     ((!p || p->feconn < p->maxconn) && (actconn < global.maxconn) &&
+	      (!tick_isset(global_listener_queue_task->expire) ||
+	       tick_is_expired(global_listener_queue_task->expire, now_ms))))) {
 		/* at least one thread has to this when quitting */
 		resume_listener(l);
 
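
The resume condition changed by this hunk can be looked at in isolation. What follows is only a minimal sketch under stated assumptions, not HAProxy's code: `struct model`, `may_resume()` and the two tick helpers are hypothetical stand-ins written for illustration. The sketch assumes that a tick value of 0 means "no timer armed", and it leaves out the `!p` test on the frontend pointer.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of the state tested on the exit path.
 * None of these definitions are HAProxy's own.
 */
enum li_state { LI_READY, LI_FULL, LI_LIMITED };

struct model {
	enum li_state state;
	unsigned maxconn, nbconn;          /* per-listener; maxconn == 0 means unlimited */
	unsigned fe_maxconn, feconn;       /* frontend limit and current count */
	unsigned global_maxconn, actconn;  /* process-wide limit and current count */
	unsigned queue_expire;             /* queue timer in ms; 0 = not armed (assumption) */
};

/* Stand-ins for the tick helpers named in the diff. */
static bool tick_isset(unsigned t)
{
	return t != 0;
}

static bool tick_is_expired(unsigned t, unsigned now)
{
	/* the signed difference keeps the comparison valid across wrap-around */
	return tick_isset(t) && (int)(t - now) <= 0;
}

/* Returns true when the listener may be resumed right away. */
static bool may_resume(const struct model *m, unsigned now_ms)
{
	if (m->state == LI_FULL)
		return !m->maxconn || m->nbconn < m->maxconn;

	if (m->state == LI_LIMITED)
		return m->feconn < m->fe_maxconn &&
		       m->actconn < m->global_maxconn &&
		       /* the added part: an armed timer must have expired first */
		       (!tick_isset(m->queue_expire) ||
		        tick_is_expired(m->queue_expire, now_ms));

	return false;
}
```

In this model, a LI_LIMITED listener whose connection counts are all below their limits is still held back while `queue_expire` lies in the future. Without the last clause, `may_resume()` would return true immediately, which corresponds to the busy loop described in the commit message.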