REGTESTS: fix random failures with wrong_ip_port_logging.vtc under load

This test has an expect rule for syslog that looks for [cC]D, to
indicate a client abort or timeout during the data phase. The purpose
was to say that when it fails it must be this, but the very low timeout
(1ms) still makes it prone to succeeding if the machine is highly loaded.

This has become more visible since commit e8b1ad4c2b ("BUG/MEDIUM: clock:
also update the date offset on time jumps") because the clock drift
adjustments are more systematic. Since this commit, running 50 such tests
at twice more than the number of CPUs in parallel is sufficient to yield
errors due to some lines appearing as succeeding:

   make reg-tests -- --j $((($(nproc)+1)*2)) --vtestparams -n50 reg-tests/log/wrong_ip_port_logging.vtc

It was observed that pauses up to 300ms were observed in epoll_wait() in
such circumstances, which were properly fixed by the time drift detection..
Another approach would consist in increasing the permitted margin during
which we don't fix the clock drift but that would not be logical since the
base time had really been awaited for.

This should be backported to all stable releases since the commit above
will trigger the issue more often.

(cherry picked from commit 036ab62231003e7e7072986626597682725edff8)
Signed-off-by: Willy Tarreau <w@1wt.eu>
This commit is contained in:
Willy Tarreau 2024-09-09 19:31:14 +02:00
parent ef35df8349
commit 07f1630a77

View File

@ -30,7 +30,7 @@ server s1 {
syslog Slg_1 -level notice {
recv info
expect ~ \"dip\":\"${h1_fe_1_addr}\",\"dport\":\"${h1_fe_1_port}.*\"ts\":\"[cC]D\",\"
expect ~ \"dip\":\"${h1_fe_1_addr}\",\"dport\":\"${h1_fe_1_port}.*\"ts\":\"[cC-][D-]\",\"
} -start
haproxy h1 -conf {