Instead of having a dedicated httpclient instance and its own code
decorrelated from the actual auto update one, the "update ssl
ocsp-response" will now use the update task in order to perform updates.
Since the cli command allows updating responses that were never
included in the auto update tree, a new flag was added to the
certificate_ocsp structure so that the said entry can be inserted into
the tree "by hand" and it won't be reinserted back into the tree after
the update process is performed. The 'update_once' flag "stole" a bit
from the 'fail_count' counter since it is the one less likely to reach
UINT_MAX among the ocsp counters of the certificate_ocsp structure.
This new logic required that every certificate_ocsp entry contain all
the ocsp-related information at all times since entries that are not
supposed to be configured automatically can still be updated through the
cli. The logic of the ssl_sock_load_ocsp was changed accordingly.
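The idea of taking one bit from a counter to store a flag can be sketched as follows. This is only an illustration: the structure name, the field widths and the helper functions are assumptions, not the actual certificate_ocsp layout.

```c
#include <assert.h>
#include <stddef.h>

/* Largest value a 31-bit counter can hold. */
#define FAIL_COUNT_MAX_SKETCH ((1u << 31) - 1)

/* Hypothetical, reduced stand-in for an OCSP entry: only counters are kept.
 * The 31/1 split is an assumption made for this sketch. */
struct ocsp_entry_sketch {
    unsigned int num_success;     /* full-width counter */
    unsigned int num_failure;     /* full-width counter */
    unsigned int fail_count:31;   /* consecutive failures, one bit narrower */
    unsigned int update_once:1;   /* set when the entry was inserted "by hand" */
};

/* Returns 1 if the entry must go back into the update tree once the update
 * is done, 0 if it was a one-shot update requested from the CLI. */
static int must_reinsert(const struct ocsp_entry_sketch *e)
{
    assert(e != NULL);
    return !e->update_once;
}

/* Records a failure, saturating instead of wrapping the narrow counter. */
static void record_failure(struct ocsp_entry_sketch *e)
{
    assert(e != NULL);
    if (e->fail_count < FAIL_COUNT_MAX_SKETCH)
        e->fail_count++;
    e->num_failure++;
}
```

Because the flag is a separate bit-field member, incrementing the counter never touches it, and the structure does not grow.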
The dedicated proxy used for OCSP auto update is renamed OCSP-UPDATE
which should be more explicit than the previous HC_OCSP name. The
reference to the underlying httpclient is simply kept in the
documentation.
The certid is removed from the log line since it is not really
comprehensible and is replaced by the path to the corresponding frontend
certificate.
It is more or less a revert of the commit b65af26e1 ("MEDIUM: mux-pt: Don't
always set a final error on SE on the sending path"). The PT multiplexer is
so simple that an error on the sending path is terminal. Unlike other muxes,
there is no connection level here. However, instead of reporting a final
error by setting SE_FL_ERROR, we set the SE_FL_EOS flag if a read0 was
received on the underlying connection. Concretely, it is always true with
the current design of the raw socket layer. But it is cleaner this way.
Without this patch, it is possible to block a TCP socket if a connection
error is triggered when data are sent (for instance a broken pipe) while the
upper stream does not expect to receive more data.
Note the patch above introduced a regression because error handling at the
connection level is quite simple. All errors are final. But we must keep in
mind it may change. And if so, this will require moving back to a 2-step
error handling in the mux-pt.
This patch must be backported to 2.7.
Released version 2.8-dev5 with the following main changes :
- MINOR: ssl: rename confusing ssl_bind_kws
- BUG/MINOR: config: crt-list keywords mistaken for bind ssl keywords
- BUG/MEDIUM: http-ana: Detect closed SC on opposite side during body forwarding
- BUG/MEDIUM: stconn: Don't rearm the read expiration date if EOI was reached
- MINOR: global: Add an option to disable the data fast-forward
- MINOR: haproxy: Add an command option to disable data fast-forward
- REGTESTS: Remove unsupported feature command in http_splicing.vtc
- BUG/MEDIUM: wdt: fix wrong thread being checked for sleeping
- BUG/MINOR: sched: properly report long_rq when tasks remain in the queue
- BUG/MEDIUM: sched: allow a bit more TASK_HEAVY to be processed when needed
- MINOR: threads: add flags to know if a thread is started and/or running
- MINOR: h3/hq-interop: handle no data in decode_qcs() with FIN set
- BUG/MINOR: mux-quic: transfer FIN on empty STREAM frame
- BUG/MINOR: mworker: prevent incorrect values in uptime
- MINOR: h3: add traces on decode_qcs callback
- BUG/MINOR: quic: Possible unexpected counter incrementation on send*() errors
- MINOR: quic: Add new traces about by connection RX buffer handling
- MINOR: quic: Move code to wakeup the timer task to avoid anti-amplication deadlock
- BUG/MINOR: quic: Really cancel the connection timer from qc_set_timer()
- MINOR: quic: Simplication for qc_set_timer()
- MINOR: quic: Kill the connections on ICMP (port unreachable) packet receipt
- MINOR: quic: Add traces to qc_kill_conn()
- MINOR: quic: Make qc_dgrams_retransmit() return a status.
- BUG/MINOR: quic: Missing call to task_queue() in qc_idle_timer_do_rearm()
- MINOR: quic: Add a trace to identify connections which sent Initial packet.
- MINOR: quic: Add <pto_count> to the traces
- BUG/MINOR: quic: Do not probe with too little Initial packets
- BUG/MINOR: quic: Wrong initialization for io_cb_wakeup boolean
- BUG/MINOR: quic: Do not drop too small datagrams with Initial packets
- BUG/MINOR: quic: Missing padding for short packets
- MINOR: quic: adjust request reject when MUX is already freed
- BUG/MINOR: quic: also send RESET_STREAM if MUX released
- BUG/MINOR: quic: acknowledge STREAM frame even if MUX is released
- BUG/MINOR: h3: prevent hypothetical demux failure on int overflow
- MEDIUM: h3: enforce GOAWAY by resetting higher unhandled stream
- MINOR: mux-quic: define qc_shutdown()
- MINOR: mux-quic: define qc_process()
- MINOR: mux-quic: implement client-fin timeout
- MEDIUM: mux-quic: properly implement soft-stop
- MINOR: quic: mark quic-conn as jobs on socket allocation
- MEDIUM: quic: trigger fast connection closing on process stopping
- MINOR: mux-h2/traces: do not log h2s pointer for dummy streams
- MINOR: mux-h2/traces: add a missing TRACE_LEAVE() in h2s_frt_handle_headers()
- BUG/MEDIUM: quic: Missing TX buffer draining from qc_send_ppkts()
- DEBUG: stream: Add a BUG_ON to never exit process_stream with an expired task
- DOC: config: Fix description of options about HTTP connection modes
- MINOR: proxy: Only consider backend httpclose option for server connections
- BUG/MINOR: haproxy: Fix option to disable the fast-forward
- DOC: config: Add the missing tune.fail-alloc option from global listing
- MINOR: cfgcond: Implement strstr condition expression
- MINOR: cfgcond: Implement enabled condition expression
- REGTESTS: Skip http_splicing.vtc script if fast-forward is disabled
- REGTESTS: Fix ssl_errors.vtc script to wait for connections close
- BUG/MINOR: mworker: stop doing strtok directly from the env
- BUG/MEDIUM: mworker: prevent inconsistent reload when upgrading from old versions
- BUG/MEDIUM: mworker: don't register mworker_accept_wrapper() when master FD is wrong
- MINOR: startup: HAPROXY_STARTUP_VERSION contains the version used to start
- BUG/MINOR: cache: Cache response even if request has "no-cache" directive
- BUG/MINOR: cache: Check cache entry is complete in case of Vary
- MINOR: compiler: add a TOSTR() macro to turn a value into a string
- BUG/MINOR: lua/httpclient: missing free in hlua_httpclient_send()
- BUG/MEDIUM: httpclient/lua: fix a race between lua GC and hlua_ctx_destroy
- MEDIUM: channel: Remove CF_READ_NOEXP flag
- MAJOR: channel: Remove flags to report READ or WRITE errors
- DEBUG: stream/trace: Add sedesc flags in trace messages
- MINOR: channel/stconn: Move rto/wto from the channel to the stconn
- MEDIUM: channel/stconn: Move rex/wex timer from the channel to the sedesc
- MEDIUM: stconn: Don't requeue the stream's task after I/O
- MEDIUM: stconn: Replace read and write timeouts by a unique I/O timeout
- MEDIUM: stconn: Add two date to track successful reads and blocked sends
- MINOR: applet/stconn: Add a SE flag to specify an endpoint does not expect data
- MAJOR: stream: Use SE descriptor date to detect read/write timeouts
- MINOR: stream: Dump the task expiration date in trace messages
- MINOR: stream: Report rex/wex value using the sedesc date in trace messages
- MINOR: stream: Use relative expiration date in trace messages
- MINOR: stconn: Always report READ/WRITE event on shutr/shutw
- CLEANUP: stconn: Remove old read and write expiration dates
- MINOR: stconn: Set half-close timeout using proxy settings
- MINOR: stconn: Remove half-closed timeout
- REGTESTS: cache: Use rxresphdrs to only get headers for 304 responses
- MINOR: stconn: Add functions to set/clear SE_FL_EXP_NO_DATA flag from endpoint
- BUG/MINOR: proto_ux: report correct error when bind_listener fails
- BUG/MINOR: protocol: fix minor memory leak in protocol_bind_all()
- MINOR: proto_uxst: add resume method
- MINOR: listener/api: add lli hint to listener functions
- MINOR: listener: add relax_listener() function
- MINOR: listener: workaround for closing a tiny race between resume_listener() and stopping
- MINOR: listener: make sure we don't pause/resume bypassed listeners
- BUG/MEDIUM: listener: fix pause_listener() suspend return value handling
- BUG/MINOR: listener: fix resume_listener() resume return value handling
- BUG/MEDIUM: resume from LI_ASSIGNED in default_resume_listener()
- MINOR: listener: pause_listener() becomes suspend_listener()
- BUG/MEDIUM: listener/proxy: fix listeners notify for proxy resume
- BUG/MINOR: sock_unix: match finalname with tempname in sock_unix_addrcmp()
- MEDIUM: proto_ux: properly suspend named UNIX listeners
- MINOR: proto_ux: ability to dump ABNS names in error messages
- MINOR: haproxy: always protocol unbind on startup error path
- BUILD: quic: 32-bits compilation issue with %zu in quic_rx_pkts_del()
- BUG/MINOR: ring: do not realign ring contents on resize
- MEDIUM: ring: make the offset relative to the head/tail instead of absolute
- CLEANUP: ring: remove the now unused ring's offset
- MINOR: config: add HAPROXY_BRANCH environment variable
- BUILD: thead: Fix several 32 bits compilation issues with uint64_t variables
- BUG/MEDIUM: fd: avoid infinite loops in fd_add_to_fd_list and fd_rm_from_fd_list
- BUG/MEDIUM: h1-htx: Never copy more than the max data allowed during parsing
- BUG/MINOR: stream: Remove BUG_ON about the task expiration in process_stream()
- MINOR: stream: Handle stream's timeouts in a dedicated function
- MEDIUM: stream: Eventually handle stream timeouts when exiting process_stream()
- MINOR: stconn: Report a send activity when endpoint is willing to consume data
- BUG/MEDIUM: stconn: Report a blocked send if some output data are not consumed
- MEDIUM: mux-h1: Don't expect data from server as long as request is unfinished
- MEDIUM: mux-h2: Don't expect data from server as long as request is unfinished
- MEDIUM: mux-quic: Don't expect data from server as long as request is unfinished
- DOC: config: Clarify the meaning of 'hold' in the 'resolvers' section
- DOC: config: Replace TABs by spaces
- BUG/MINOR: fd: used the update list from the fd's group instead of tgid
- BUG/MEDIUM: fd: make fd_delete() support being called from a different group
- CLEANUP: listener: only store conn counts for local threads
- MINOR: tinfo: make thread_set functions return nth group/mask instead of first
- MEDIUM: quic: improve fatal error handling on send
- MINOR: quic: consider EBADF as critical on send()
- BUG/MEDIUM: connection: Clear flags when a conn is removed from an idle list
- BUG/MINOR: mux-h1: Don't report an error on an early response close
- BUG/MINOR: http-check: Don't set HTX_SL_F_BODYLESS flag with a log-format body
- BUG/MINOR: http-check: Skip C-L header for empty body when it's not mandatory
- BUG/MINOR: http-fetch: recognize IPv6 addresses in square brackets in req.hdr_ip()
- REGTEST: added tests covering smp_fetch_hdr_ip()
- MINOR: quic: simplify return path in send functions
- MINOR: quic: implement qc_notify_send()
- MINOR: quic: purge txbuf before preparing new packets
- MEDIUM: quic: implement poller subscribe on sendto error
- MINOR: quic: notify on send ready
- BUG/MINOR: http-ana: Don't increment conn_retries counter before the L7 retry
- BUG/MINOR: http-ana: Do a L7 retry on read error if there is no response
- BUG/MEDIUM: http-ana: Don't close request side when waiting for response
- BUG/MINOR: mxu-h1: Report a parsing error on abort with pending data
- MINOR: ssl: Destroy ocsp update http_client during cleanup
- MINOR: ssl: Reinsert ocsp update entries later in case of unknown error
- MINOR: ssl: Add ocsp update success/failure counters
- MINOR: ssl: Store specific ocsp update errors in response and update ctx
- MINOR: ssl: Add certificate's path to certificate_ocsp structure
- MINOR: ssl: Add 'show ssl ocsp-updates' CLI command
- MINOR: ssl: Add sample fetches related to OCSP update
- MINOR: ssl: Use dedicated proxy and log-format for OCSP update
- MINOR: ssl: Reorder struct certificate_ocsp members
- MINOR: ssl: Increment OCSP update replay delay in case of failure
- MINOR: ssl: Add way to dump ocsp response in base64
- MINOR: ssl: Add global options to modify ocsp update min/max delay
- REGTESTS: ssl: Fix ocsp update crt-lists
- REGTESTS: ssl: Add test for new ocsp update cli commands
- MINOR: ssl: Add ocsp-update information to "show ssl crt-list"
- BUG/MINOR: ssl: Fix ocsp-update when using "add ssl crt-list"
- MINOR: ssl: Replace now.tv_sec with date.tv_sec in ocsp update task
- BUG/MINOR: ssl: Use 'date' instead of 'now' in ocsp stapling callback
- BUG/MEDIUM: quic: properly handle duplicated STREAM frames
- BUG/MINOR: cli: fix CLI handler "set anon global-key" call
- MINOR: http_ext: adding some documentation, forgot to inline function
- BUG/MINOR: quic: Do not send too small datagrams (with Initial packets)
- MINOR: quic: Add a BUG_ON_HOT() call for too small datagrams
- BUG/MINOR: quic: Ensure to be able to build datagrams to be retransmitted
- BUG/MINOR: quic: v2 Initial packets decryption failed
- MINOR: quic: Add traces about QUIC TLS key update
- BUG/MINOR: quic: Remove force_ack for Initial,Handshake packets
- BUG/MINOR: quic: Ensure not to retransmit packets with no ack-eliciting frames
- BUG/MINOR: quic: Do not resend already acked frames
- BUG/MINOR: quic: Missing detections of amplification limit reached
- MINOR: quic: Send PING frames when probing Initial packet number space
- BUG/MEDIUM: quic: do not crash when handling STREAM on released MUX
- BUG/MAJOR: fd/thread: fix race between updates and closing FD
- BUG/MEDIUM: dns: ensure ring offset is properly reajusted to head
- BUG/MINOR: mux-quic: properly init STREAM frame as not duplicated
- MINOR: quic: Do not accept wrong active_connection_id_limit values
- MINOR: quic: Store the next connection IDs sequence number in the connection
- MINOR: quic: Typo fix for ACK_ECN frame
- MINOR: quic: RETIRE_CONNECTION_ID frame handling (RX)
- MINOR: quic: Useless TLS context allocations in qc_do_rm_hp()
- MINOR: quic: Add spin bit support
- MINOR: quic: Add transport parameters to "show quic"
- BUG/MEDIUM: sink/forwarder: ensure ring offset is properly readjusted to head
- BUG/MINOR: dns: fix ring offset calculation on first read
- BUG/MINOR: dns: fix ring offset calculation in dns_resolve_send()
- MINOR: jwt: Add support for RSA-PSS signatures (PS256 algorithm)
- MINOR: h3: add traces on h3_init_uni_stream() error paths
- MINOR: quic: create a global list dedicated for closing QUIC conns
- MINOR: quic: handle new closing list in show quic
- MEDIUM: quic: release closing connections on stopping
- BUG/MINOR: quic: Wrong RETIRE_CONNECTION_ID sequence number check
- MINOR: fd/cli: report the polling mask in "show fd"
- CLEANUP: sock: always perform last connection updates before wakeup
- MINOR: quic: Do not stress the peer during retransmissions of lost packets
- BUG/MINOR: init: properly detect NUMA bindings on large systems
- BUG/MINOR: thread: report thread and group counts in the correct order
- BUG/MAJOR: fd/threads: close a race on closing connections after takeover
- MINOR: debug: add random delay injection with "debug dev delay-inj"
- BUG/MINOR: mworker: use MASTER_MAXCONN as default maxconn value
- BUG/MINOR: quic: Missing listener accept queue tasklet wakeups
- MINOR: quic_sock: un-statify quic_conn_sock_fd_iocb()
- DOC: config: fix typo "dependeing" in bind thread description
- DOC/CLEANUP: fix typos
This one is printed as the iocb in the "show fd" output, and arguably
this wasn't very convenient as-is:
293 : st=0x000123(cl heopI W:sRa R:sRA) ref=0 gid=1 tmask=0x8 umask=0x0 prmsk=0x8 pwmsk=0x0 owner=0x7f488487afe0 iocb=0x50a2c0(main+0x60f90)
Let's unstatify it and export it so that the symbol can now be resolved
from the various points that need it.
This bug was revealed by h2load tests run as follows:
h2load -t 4 --npn-list h3 -c 64 -m 16 -n 16384 -v https://127.0.0.1:4443/
This opens (-c) 64 QUIC connections and sends (-n) 16384 h3 requests from (-t) 4 threads, i.e.
256 requests per connection. Such tests could not always pass and often ended with
results like the following, as displayed by h2load:
    finished in 53.74s, 38.11 req/s, 493.78KB/s
    requests: 16384 total, 2944 started, 2048 done, 2048 succeeded, 14336
    failed, 14336 errored, 0 timeout
    status codes: 2048 2xx, 0 3xx, 0 4xx, 0 5xx
    traffic: 25.92MB (27174537) total, 102.00KB (104448) headers (space
    savings 1.92%), 25.80MB (27053569) data
    UDP datagram: 3883 sent, 24330 received
                         min         max         mean         sd        +/- sd
    time for request:    48.75ms    502.86ms    134.12ms     75.80ms    92.68%
    time for connect:    20.94ms    331.24ms    189.59ms     84.81ms    59.38%
    time to 1st byte:   394.36ms    417.01ms    406.72ms      9.14ms    75.00%
    req/s           :     0.00      115.45       14.30       38.13      87.50%
The number of successful requests was always a multiple of 256.
Activating the traces also showed that some connections were blocked after having
successfully completed their handshakes due to the fact that the mux was never
started. The mux is started upon the acceptance of the connection.
Under heavy load, some connections were never accepted. From the moment
more than 4 (MAXACCEPT) connections were enqueued before a listener could be
woken up to accept at most 4 connections, the remaining connections were not
accepted, or only later, at the second listener tasklet wakeup.
Add a call to tasklet_wakeup() to the accept list tasklet of the listeners to
wake it up if there are remaining connections to accept after having called
listener_accept(). In this case the listener must not be removed from this
accept list, otherwise at the next call it will not accept anything more.
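The rescheduling logic described above can be modelled with plain counters. This is a simplified sketch, not the listener code: the structure and the function names are hypothetical, and the tasklet wakeup is represented by a return value.

```c
#include <assert.h>

#define MAXACCEPT_SKETCH 4   /* at most this many accepts per wakeup */

/* Hypothetical model of a listener accept queue. */
struct accept_queue_sketch {
    int pending;    /* connections waiting to be accepted */
    int accepted;   /* connections accepted so far */
    int wakeups;    /* times the tasklet asked to be run again */
};

/* Stand-in for listener_accept(): accepts a bounded batch. */
static void accept_batch(struct accept_queue_sketch *q)
{
    int n;

    assert(q->pending >= 0);
    n = q->pending < MAXACCEPT_SKETCH ? q->pending : MAXACCEPT_SKETCH;
    q->pending -= n;
    q->accepted += n;
}

/* One run of the accept tasklet. Returns 1 when connections remain and the
 * tasklet reschedules itself (stand-in for tasklet_wakeup()), 0 otherwise. */
static int accept_tasklet_run(struct accept_queue_sketch *q)
{
    accept_batch(q);
    if (q->pending > 0) {
        q->wakeups++;
        return 1;
    }
    return 0;
}

/* Runs the tasklet until it stops rescheduling itself. */
static void drain(struct accept_queue_sketch *q)
{
    while (accept_tasklet_run(q))
        ;
}
```

Without the self-wakeup, a queue holding more than MAXACCEPT_SKETCH entries would keep its remainder until some unrelated event ran the tasklet again.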
Must be backported to 2.7 and 2.6.
In environments where SYSTEM_MAXCONN is defined when compiling, the
master will use this value instead of the original minimal value which
was set to 100. When this happens, the master process could allocate
RAM excessively since it does not need to have a high maxconn. (For
example if SYSTEM_MAXCONN was set to 100000 or more)
This patch fixes the issue by using the new define MASTER_MAXCONN which
defines a default maxconn of 100 for the master process.
Must be backported as far as 2.5.
The goal is to send signals to random threads at random instants so that
they spin for a random delay in a relax() loop, trying to give back the
CPU to another competing hardware thread, in hope that from time to time
this can trigger in critical areas and increase the chances to provoke a
latent concurrency bug. For now none were observed.
For example, this command starts 64 such tasks waking after random delays
of 0-1ms and delivering signals to trigger such loops on 3 random threads:
    for i in {1..64}; do
      socat - /tmp/sock1 <<< "expert-mode on;debug dev delay-inj 2 3"
    done
This command is only enabled when DEBUG_DEV is set at build time.
As mentioned in commit 237e6a0d6 ("BUG/MAJOR: fd/thread: fix race between
updates and closing FD"), a race was found during stress tests involving
heavy backend connection reuse with many competing closes.
Here the problem is complex. The analysis in commit f69fea64e ("MAJOR:
fd: get rid of the DWCAS when setting the running_mask") that removed
the DWCAS in 2.5 overlooked a few races.
First, a takeover from thread1 could happen just after fd_update_events()
in thread2 validates it holds the tmask bit in the CAS loop. Since thread1
releases running_mask after the operation, thread2 will succeed the CAS
and both will believe the FD is theirs. This does explain the occasional
crashes seen with h1_io_cb() being called on a bad context, or
sock_conn_iocb() seeing conn->subs vanish after checking it. This issue
can be addressed using a DWCAS in both fd_takeover() and fd_update_events()
as it was before the patch above but this is not portable to all archs and
is not easy to adapt for those lacking it, due to some operations still
happening only on individual masks after the thread groups were added.
Second, the check after fd_clr_running() for the current thread being
the last one is not sufficient: at the exact moment the operation
completes, another thread may also set and drop the running bit and see
itself as alone, and both can call _fd_delete_orphan() in parallel. In
order to prevent this from happening, we cannot rely on the absence of
others, we need an explicit flag indicating that the FD must be closed.
One approach that was attempted consisted in playing with the thread_mask
but that was not reliable since it could still match between the late
deletion and the early insertion that follows. Instead, a new FD flag
was added, FD_MUST_CLOSE, that exactly indicates that the call to
_fd_delete_orphan() must be done. It is set by fd_delete(), and
atomically cleared by the first one which checks it, and which is the
only one to call _fd_delete_orphan().
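The "set once, cleared atomically by exactly one caller" pattern behind FD_MUST_CLOSE can be sketched with C11 atomics. The structure, the bit value and the helper names below are assumptions for illustration; the real code also deals with running and thread masks, which are left out here.

```c
#include <assert.h>
#include <stdatomic.h>

#define FD_MUST_CLOSE_SKETCH 0x1u   /* hypothetical bit value */

/* Hypothetical, reduced FD state. */
struct fd_sketch {
    atomic_uint state;    /* flag word */
    atomic_int  closes;   /* how many times the close path ran */
};

/* Stand-in for fd_delete(): records that the FD must be closed. */
static void request_close(struct fd_sketch *fd)
{
    atomic_fetch_or(&fd->state, FD_MUST_CLOSE_SKETCH);
}

/* Called by any thread that is done with the FD. The fetch-and returns the
 * previous value, so only one caller can ever observe the flag as set.
 * Returns 1 if this caller performed the close. */
static int maybe_close(struct fd_sketch *fd)
{
    unsigned int prev = atomic_fetch_and(&fd->state, ~FD_MUST_CLOSE_SKETCH);

    if (prev & FD_MUST_CLOSE_SKETCH) {
        /* stand-in for the single call to _fd_delete_orphan() */
        atomic_fetch_add(&fd->closes, 1);
        return 1;
    }
    return 0;
}
```

However many threads call maybe_close() concurrently, the read-modify-write guarantees a single winner, which a check on "being alone" cannot provide.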
With both points addressed, there's no more visible race left:
- takeover() only happens under the connection list's lock and cannot
  compete with fd_delete() since fd_delete() must first remove the
  connection from the list before deleting the FD. That's also why it
  doesn't need to call _fd_delete_orphan() when dropping its running
  bit.
- takeover() sets its running bit then atomically replaces the thread
  mask, so that until that's done, it doesn't validate the condition
  to end the synchronization loop in fd_update_events(). Once it's OK,
  the previous thread's bit is lost, and this is checked for in
  fd_update_events().
- fd_update_events() can compete with fd_delete() at various places
  which are explained above. Since fd_delete() clears the thread mask
  after setting its running bit and after setting the FD_MUST_CLOSE
  bit, the synchronization loop guarantees that the thread mask is seen
  before going further, and that once it's seen, the FD_MUST_CLOSE flag
  is already present.
- fd_delete() may start while fd_update_events() has already started,
  but fd_delete() must hold a bit in thread_mask before starting, and
  that is checked by the first test in fd_update_events() before setting
  the running_mask.
- the poller's _update_fd() will not compete against _fd_delete_orphan()
  nor fd_insert() thanks to the fd_grab_tgid() that's always done before
  updating the polled_mask, and guarantees that we never pretend that a
  polled_mask has a bit before the FD is added.
The issue is very hard to reproduce and is extremely time-sensitive.
Some tests were required with a 1-ms timeout with request rates
closely matching 1 kHz per server, though certain tests sometimes
benefitted from saturation. It was found that adding the following
slowdown at a few key places helped a lot and managed to trigger the
bug in 0.5 to 5 seconds instead of tens of minutes on a 20-thread
setup:
{ volatile int i = 10000; while (i--); }
Particularly, placing it at key places where only one of running_mask
or thread_mask is set and not the other one yet (e.g. after the
synchronization loop in fd_update_events or after dropping the
running bit) did yield great results.
Many thanks to Olivier Houchard for his expert help analysing these
races and reviewing candidate fixes.
The patch must be backported to 2.5. Note that 2.6 does not have tgid
in FDs, and that it requires a change of output on fd_clr_running() as
we need the previous bit. This is provided by carefully backporting
commit d6e1987612 ("MINOR: fd: make fd_clr_running() return the previous
value instead"). Tests have shown that the lack of tgid is a showstopper
for 2.6 and that unless a better workaround is found, it could still be
preferable to backport the minimum pieces required for fd_grab_tgid()
to 2.6 so that it stays stable in the long run.
In case too many thread groups are needed for the threads, we emit
an error indicating the problem. Unfortunately the threads and groups
counts were reversed.
This can be backported to 2.6.
The NUMA detection code tries not to interfere with any taskset the user
could have specified in init scripts. For this it compares the number of
CPUs available with the number the process is bound to. However, the CPU
count is retrieved after an upper bound of MAX_THREADS has been applied, so
if the machine has more than 64 CPUs, the comparison always fails and
makes haproxy think the user has already enforced a binding, and it does
not pin it anymore to a single NUMA node.
This can be verified by issuing:
$ socat /path/to/sock - <<< "show info" | grep thread
On a dual 48-CPU machine it reports 64, implying that threads are allowed
to run on the second socket:
Nbthread: 64
With this fix, the function properly reports 96, and the output shows 48,
indicating that a single NUMA node was used:
Nbthread: 48
Of course nothing is changed when "no numa-cpu-mapping" is specified:
Nbthread: 64
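The effect of clamping the CPU count before the comparison can be reproduced with a small model. All names here are hypothetical and the logic is reduced to the comparison itself. The figures (96 CPUs, 48 per node, a limit of 64) are the ones quoted above.

```c
#include <assert.h>

#define MAX_THREADS_SKETCH 64

static int clamp_threads(int n)
{
    assert(n >= 0);
    return n > MAX_THREADS_SKETCH ? MAX_THREADS_SKETCH : n;
}

/* Buggy variant: the bound CPU count is clamped before being compared. */
static int user_binding_detected_buggy(int cpus_online, int cpus_bound)
{
    return clamp_threads(cpus_bound) != cpus_online;
}

/* Fixed variant: raw counts are compared, clamping happens afterwards. */
static int user_binding_detected_fixed(int cpus_online, int cpus_bound)
{
    return cpus_bound != cpus_online;
}

/* Thread count picked in this model: a single NUMA node when no user
 * binding is detected, otherwise whatever the process is bound to. */
static int pick_nbthread(int cpus_online, int cpus_bound,
                         int cpus_per_node, int use_fix)
{
    int bound = use_fix
        ? user_binding_detected_fixed(cpus_online, cpus_bound)
        : user_binding_detected_buggy(cpus_online, cpus_bound);

    return clamp_threads(bound ? cpus_bound : cpus_per_node);
}
```

With 96 CPUs online and no taskset, the buggy comparison sees 64 against 96 and concludes that a binding exists, which yields 64 threads. The fixed one sees 96 against 96 and settles on the 48 CPUs of one node.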
This can be backported to 2.4.
This issue was revealed by "Multiple streams" QUIC tracker test which very often
fails (locally) with a file of about 1Mbytes (x4 streams). The log of QUIC tracker
revealed that from its point of view, the 4 files were never all received entirely:
"results" : {
"stream_0_rec_closed" : true,
"stream_0_rec_offset" : 1024250,
"stream_0_snd_closed" : true,
"stream_0_snd_offset" : 15,
"stream_12_rec_closed" : false,
"stream_12_rec_offset" : 72689,
"stream_12_snd_closed" : true,
"stream_12_snd_offset" : 15,
"stream_4_rec_closed" : true,
"stream_4_rec_offset" : 1024250,
"stream_4_snd_closed" : true,
"stream_4_snd_offset" : 15,
"stream_8_rec_closed" : true,
"stream_8_rec_offset" : 1024250,
"stream_8_snd_closed" : true,
"stream_8_snd_offset" : 15
},
But this is in contradiction with other QUIC tracker logs which confirm that haproxy
has really (re)sent the stream at the suspected offset (stream_12_rec_offset):
1152085,
"transport",
"packet_received",
{
"frames" : [
{
"frame_type" : "stream",
"length" : "155",
"offset" : "72689",
"stream_id" : "12"
}
],
"header" : {
"dcid" : "a14479169ebb9dba",
"dcil" : "8",
"packet_number" : "466",
"packet_size" : 190
},
"packet_type" : "1RTT"
}
When detected as lost, the packets are enlisted, then their frames are
requeued in their packet number space by qc_requeue_nacked_pkt_tx_frms().
This was done using a local list which was spliced to the packet number
frame list. This had the bad effect of retransmitting the frames in the reverse
order they had been sent. This is something the QUIC tracker go client
does not like at all!
Removing the frame splicing fixes this issue and allows haproxy to pass the
"Multiple streams" test.
Must be backported to 2.7.
Normally the task_wakeup() in sock_conn_io_cb() is expected to
happen on the same thread the FD is attached to. But due to the
way the code was arranged in the past (with synchronous callbacks)
we continue to update connections after the wakeup, which always
makes the reader have to think deeply whether it's possible or not
to call another thread there. Let's just move the tasklet_wakeup()
at the end to make sure there's no problem with that.
This bug arrived with this commit:
b5a8020e9 MINOR: quic: RETIRE_CONNECTION_ID frame handling (RX)
and was revealed by h3 interop tests with clients like s2n-quic and quic-go
as noticed by Amaury.
Indeed, one must check that the CID matching the sequence number provided by a received
RETIRE_CONNECTION_ID frame does not match the DCID of the packet.
Remove the useless ->curr_cid_seq_num member from the quic_conn struct.
The sequence number lookup must be done in qc_handle_retire_connection_id_frm()
to check the validity of the RETIRE_CONNECTION_ID frame. This function returns the CID to be
retired in the <cid_to_retire> variable passed as parameter if
the frame is valid and if the CID was not already retired.
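The validity checks on a RETIRE_CONNECTION_ID frame can be sketched as below. This is not qc_handle_retire_connection_id_frm() itself: the types, the flat array of CIDs and the status values are assumptions, and only the two rules from the QUIC specification are modelled (the sequence number must have been issued, and it must not designate the DCID of the packet carrying the frame).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical connection ID record. */
struct cid_sketch {
    uint64_t seq_num;
    unsigned char data[20];
    size_t len;
    int retired;
};

enum retire_status_sketch {
    RETIRE_OK,                  /* *cid_to_retire is set */
    RETIRE_IGNORED,             /* unknown or already retired, nothing to do */
    RETIRE_PROTOCOL_VIOLATION   /* connection error */
};

static enum retire_status_sketch
check_retire_cid(struct cid_sketch *cids, size_t ncids,
                 uint64_t next_seq_num, uint64_t seq_num,
                 const unsigned char *pkt_dcid, size_t pkt_dcid_len,
                 struct cid_sketch **cid_to_retire)
{
    size_t i;

    assert(cid_to_retire != NULL);
    *cid_to_retire = NULL;

    /* A sequence number that was never issued is an error. */
    if (seq_num >= next_seq_num)
        return RETIRE_PROTOCOL_VIOLATION;

    for (i = 0; i < ncids; i++) {
        struct cid_sketch *c = &cids[i];

        if (c->seq_num != seq_num)
            continue;
        if (c->retired)
            return RETIRE_IGNORED;
        /* The frame must not retire the DCID of its own packet. */
        if (c->len == pkt_dcid_len && memcmp(c->data, pkt_dcid, c->len) == 0)
            return RETIRE_PROTOCOL_VIOLATION;
        *cid_to_retire = c;
        return RETIRE_OK;
    }
    return RETIRE_IGNORED;
}
```

Doing the lookup inside the check lets the caller receive the CID to retire only when the frame is valid, mirroring the <cid_to_retire> output parameter mentioned above.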
Must be backported to 2.7.
Since the following commit :
commit fb375574f947143e185225558c274ac00a3f8cb4
MINOR: quic: mark quic-conn as jobs on socket allocation
quic-conn instances are marked as jobs. This prevents the haproxy process from
stopping while there are transfers in progress. To not delay process
termination, idle connections are woken up through their MUX instances
to be able to release them immediately.
However, there is no mechanism to wake up quic connections left in the
closing or draining state. This means that haproxy process termination
is delayed until the timer of every closing quic connection has expired.
To improve this, a new function quic_handle_stopping() is called when
haproxy process is stopping. It simply wakes up the idle timer task of
all connections in the global closing list. These connections will thus
be released immediately to not interrupt haproxy process stopping.
This should be backported up to 2.7.
A new global quic-conn list has been added by the previous patch. It will
contain every quic-conn in closing or draining state.
Thus, it is now easier to include or skip them on a "show quic" output :
when the default list on the current thread has been browsed entirely,
either we skip to the next thread or we look at the closing list on the
current thread.
This should be backported up to 2.7.
When a CONNECTION_CLOSE is emitted or received, a QUIC connection enters
the closing or draining state respectively. These states are a loose
equivalent of TCP TIME_WAIT. No data can be exchanged anymore but the
connection is maintained for a certain time to handle packet
reordering or loss.
A new global list has been defined for QUIC connections in
closing/draining state inside thread_ctx structure. Each time a
connection enters one of these states, it will be moved from the
default global list to the new closing list.
The objective of this patch is to quickly filter connections in
closing/draining state. Most notably, this will be used to wake up these
connections and avoid delaying the haproxy process stopping.
A dedicated function qc_detach_th_ctx_list() has been implemented to
transfer a quic-conn from one list instance to the other. This takes
care of back-references attached to a quic-conn instance in case of a
running "show quic".
This should be backported up to 2.7.
This patch adds the support for the PS algorithms when verifying JWT
signatures (rsa-pss). It was not managed during the first implementation
and previously raised an "Unmanaged algorithm" error.
The tests use the same rsa signature as the plain rsa tests (RS256 ...)
and the implementation simply adds a call to
EVP_PKEY_CTX_set_rsa_padding in the function that manages rsa and ecdsa
signatures.
The signatures in the reg-test were built thanks to the PyJWT python
library once again.
With 737d10f ("BUG/MEDIUM: dns: ensure ring offset is properly reajusted
to head") relative offset calculation was fixed in dns_session_io_handler()
and dns_process_req() functions.
But if we compare with the changes performed in the patch that introduced
the bug: d9c7188 ("MEDIUM: ring: make the offset relative to the head/tail
instead of absolute"), we can see that dns_resolve_send() is missing from
the patch.
This patch applies both 737d10f and ("BUG/MINOR: dns: fix ring offset calculation on
first read") to the dns_resolve_send() function.
With this last commit, we should be back to the pre-d9c7188 behavior.
No backport needed.
With 737d10f ("BUG/MEDIUM: dns: ensure ring offset is properly reajusted
to head") ring offset is now properly re-adjusted in dns_session_io_handler()
and dns_process_req().
But the previous patch does not cope well if the first read is performed
on a non-empty ring since relative ofs will be computed from ds->ofs=0 or
dss->ofs_req=0.
In this case, the relative offset could become invalid since we mix up relative
offsets with absolute offsets.
To fix this, we apply the same logic performed in d9c7188 ("MEDIUM: ring:
make the offset relative to the head/tail instead of absolute") for the
cli_io_handler_show_ring() function: that is using b_peek_ofs(buf, 0) to
set the contextual offset instead of hard-coding it to 0.
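The first-read case described above can be sketched as follows. This is only an illustration under stated assumptions: toy_ring, toy_peek_ofs(), toy_abs_to_rel() and the TOY_OFS_UNSET sentinel are hypothetical names, not the haproxy ring API. It shows why starting from a hard-coded 0 is wrong once the head of the ring is no longer at position 0.

```c
#include <stddef.h>

/* Hypothetical, simplified ring: <head> is the absolute index of the
 * oldest byte inside a storage area of <size> bytes.
 */
struct toy_ring {
    size_t size;
    size_t head;
};

/* Sentinel meaning "this reader has never read anything yet" (assumption). */
#define TOY_OFS_UNSET (~(size_t)0)

/* Absolute storage offset of the byte located <rel> bytes after the head. */
static size_t toy_peek_ofs(const struct toy_ring *r, size_t rel)
{
    return (r->head + rel) % r->size;
}

/* Convert a saved absolute offset into an offset relative to the head. */
static size_t toy_abs_to_rel(const struct toy_ring *r, size_t abs)
{
    return (abs + r->size - r->head) % r->size;
}

/* Called each time the reader resumes. On the very first call the saved
 * offset is taken from the current head instead of being assumed to be 0.
 * Returns the head-relative offset to start reading from.
 */
static size_t toy_reader_resume(const struct toy_ring *r, size_t *saved)
{
    if (*saved == TOY_OFS_UNSET)
        *saved = toy_peek_ofs(r, 0);
    return toy_abs_to_rel(r, *saved);
}
```

With a 16-byte ring whose head sits at 5, a first read starts at relative offset 0, whereas converting a hard-coded 0 would give 11, which points into the middle of the stored data.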
This should be considered as a minor bugfix since this bug was discovered by
reading the code: 737d10f already survived a good amount of stress-tests as
shown in GH #2068.
No backport needed as 737d10f is not marked for backports.
Since d9c7188 ("MEDIUM: ring: make the offset relative to the head/tail instead
of absolute"), ring offset calculation has changed: we don't rely on ring->ofs
absolute offset anymore.
But with the above patch, relative offset is not properly calculated in
sink_forward_oc_io_handler() and sink_forward_io_handler().
The issue here is the same as 737d10f ("BUG/MEDIUM: dns: ensure ring offset is
properly reajusted to head") since dns and sink_forward share the same
ring logic:
When the ring is becoming full, ring_write() will try to regain some space to
insert new data by calling b_del() on older messages. Here b_del() moves
buffer's head under the hood, and since ring->ofs cannot be used to "correct"
the relative offset, both sink_forward_oc_io_handler() and
sink_forward_io_handler() start to get invalid offset.
At this point, we will suffer from ring data corruption resulting in unexpected
behavior or process crashes.
This can be easily demonstrated with the following test:
|log-forward syslog
| dgram-bind 127.0.0.1:5114
| log ring@logbuffer local0
|
|ring logbuffer
| format rfc5424
| size 16384
| server logserver 127.0.0.1:5114
Haproxy will forward incoming logs on udp@127.0.0.1:5114 to
tcp@127.0.0.1:5114
Then use the following tcp server:
nc -l -p 5114
With the following udp log sender:
|while [ 1 ]
|do
| logger --udp --server 127.0.0.1 -P 5114 -p user.warn "Test 7"
|done
Once the ring buffer is full (it takes less than a second to fill the 16k
buffer) haproxy starts to misbehave and the log forwarding stops.
We apply the same fix as in 737d10f ("BUG/MEDIUM: dns: ensure ring offset is
properly reajusted to head").
Please note the ~0 case that is handled slightly differently in this patch:
this is required to properly start reading from a non-empty ring. This case
will be fixed in dns related code in the following patch.
This does not need to be backported as d9c7188 was not marked for backports.
Modify quic_transport_params_dump() and the other functions related to
dumping transport parameter values from TRACE() to make their output more
compact.
Add call to quic_transport_params_dump() to dump the transport parameters
from "show quic" CLI command.
Must be backported to 2.7.
Add QUIC_FL_RX_PACKET_SPIN_BIT new RX packet flag to mark an RX packet as having
the spin bit set. Idem for the connection with QUIC_FL_CONN_SPIN_BIT flag.
Implement qc_handle_spin_bit() to set/unset QUIC_FL_CONN_SPIN_BIT for the connection
as soon as a packet number could be deciphered.
Modify quic_build_packet_short_header() to set the spin bit when building
a short packet header.
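A minimal sketch of the spin bit handling on the server side, assuming the rule from RFC 9000 section 17.4 (the server reflects the spin value of the packet that raises the largest received packet number). The 0x20 mask is the spin bit position in the first byte of a short header; the structure, flag value and function names below are hypothetical and are not the ones used by qc_handle_spin_bit() or quic_build_packet_short_header().

```c
#include <stdint.h>

#define TOY_SHORT_HDR_FIXED_BIT 0x40 /* fixed bit of a short header */
#define TOY_SHORT_HDR_SPIN_BIT  0x20 /* spin bit of a short header */
#define TOY_FL_CONN_SPIN_BIT    (1U << 0) /* hypothetical flag value */

struct spin_conn {
    unsigned int flags;
    uint64_t largest_pn; /* largest packet number received so far */
    int has_rx;          /* set once a first packet has been seen */
};

/* Server side: remember the spin value of the packet with the largest
 * packet number. Older (reordered) packets are ignored.
 */
static void toy_handle_spin_bit(struct spin_conn *c, uint64_t pn,
                                uint8_t first_byte)
{
    if (c->has_rx && pn <= c->largest_pn)
        return;
    c->has_rx = 1;
    c->largest_pn = pn;
    if (first_byte & TOY_SHORT_HDR_SPIN_BIT)
        c->flags |= TOY_FL_CONN_SPIN_BIT;
    else
        c->flags &= ~TOY_FL_CONN_SPIN_BIT;
}

/* Build the first byte of a short header (before header protection),
 * <pn_len> being the packet number length in bytes (1 to 4). The key
 * phase bit is left unset in this sketch.
 */
static uint8_t toy_build_short_first_byte(const struct spin_conn *c,
                                          uint8_t pn_len)
{
    uint8_t b = (uint8_t)(TOY_SHORT_HDR_FIXED_BIT | ((pn_len - 1) & 0x03));

    if (c->flags & TOY_FL_CONN_SPIN_BIT)
        b |= TOY_SHORT_HDR_SPIN_BIT;
    return b;
}
```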
Validated by quic-tracker spin bit test.
Must be backported to 2.7.
Add ->curr_cid_seq_num new quic_conn struct member to store the connection
ID sequence number currently used by the connection.
Implement qc_handle_retire_connection_id_frm() to handle this RX frame.
Implement qc_retire_connection_seq_num() to remove a connection ID from its
sequence number.
Implement qc_build_new_connection_id_frm to allocate a new NEW_CONNECTION_ID
frame from a CID.
Modify qc_parse_pkt_frms() which parses the frames of an RX packet to handle
the case of the RETIRE_CONNECTION_ID frame.
Must be backported to 2.7.
Add ->next_cid_seq_num new member to quic_conn struct to store the next
sequence number to be used to allocate a connection ID.
It is initialized to 0 from qc_new_conn() which initializes a connection.
Modify new_quic_cid() to use this variable each time it is called without
giving the possibility to the caller to pass the sequence number for the
connection to be allocated.
Modify quic_build_post_handshake_frames() to use ->next_cid_seq_num
when building NEW_CONNECTION_ID frames after the handshake has been completed.
Limit the number of connection IDs provided to the peer to the minimum
between 4 and the value it sent with active_connection_id_limit transport
parameter. This includes the connection ID used by the connection to send
these new connection IDs.
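The limit above can be sketched as a small helper. The name toy_cids_to_announce() and the TOY_MAX_CIDS constant are hypothetical; only the "minimum between 4 and the peer value, current connection ID included" rule comes from the description.

```c
#include <stdint.h>

#define TOY_MAX_CIDS 4 /* upper bound on connection IDs given to the peer */

/* Return how many NEW_CONNECTION_ID frames may be built after the
 * handshake, given the peer's active_connection_id_limit. The connection
 * ID already in use counts against the limit, hence the "- 1".
 */
static uint64_t toy_cids_to_announce(uint64_t peer_active_cid_limit)
{
    uint64_t max = peer_active_cid_limit < TOY_MAX_CIDS ?
                   peer_active_cid_limit : TOY_MAX_CIDS;

    return max ? max - 1 : 0;
}
```

For a peer advertising a limit of 2, a single extra connection ID is announced; for a peer advertising 8, three are announced.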
Must be backported to 2.7.
A peer must not send active_connection_id_limit values smaller than 2
which is also the minimum value when not sent.
Make the transport parameters decoding fail in this case.
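A sketch of this check, assuming a decoder which knows whether the parameter was present. The function name and its calling convention are hypothetical; the minimum of 2 and the default of 2 when the parameter is absent are the ones stated above.

```c
#include <stdint.h>

#define TOY_ACTIVE_CID_LIMIT_MIN 2 /* minimum and default value */

/* Validate active_connection_id_limit. <present> tells whether the peer
 * sent the parameter. Returns 1 on success with the effective value stored
 * into <*out>, or 0 if transport parameters decoding must fail.
 */
static int toy_check_active_cid_limit(int present, uint64_t value,
                                      uint64_t *out)
{
    if (!present) {
        *out = TOY_ACTIVE_CID_LIMIT_MIN;
        return 1;
    }
    if (value < TOY_ACTIVE_CID_LIMIT_MIN)
        return 0;
    *out = value;
    return 1;
}
```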
Must be backported to 2.7.
STREAM frame retransmission has been recently fixed. A new boolean field
<dup> was created for quic_stream frame type. It is set for duplicated
STREAM frame to ensure extra checks on the underlying buffer are
conducted before sending the frame. All of this has been implemented by
this commit :
315a4f6ae54da17fd28f7a14373b05bab0b5aa08
BUG/MEDIUM: quic: do not crash when handling STREAM on released MUX
However, the above commit is incomplete. In the MUX code, when a new
STREAM frame is created, <dup> is left uninitialized. In most cases this
is harmless as it will only add extra unneeded checks before sending the
frame. So this is mainly a performance issue.
There is however one case where this bug will lead to a crash : when the
response consists only of an empty STREAM frame. In this case, the empty
frame will be silently removed as it is incorrectly assimilated to an
already acked frame range in qc_build_frms(). This can trigger a
BUG_ON() on the MUX code as a qcs instance is still in the send list
after qc_send_frames() invocation.
Note that it is extremely rare to have only an empty STREAM frame. It
was reproduced with HTTP/0.9 where no HTTP status line exists on an
empty body. I do not know if this is possible on HTTP/3 as a status line
should be present each time in a HEADERS frame.
Properly initialize <dup> field to 0 on each STREAM frames generated by
the QUIC MUX to fix this issue.
This crash may be linked to github issue #2049.
This should be backported up to 2.6.
Since the below patch, ring offset calculation for readers has changed.
commit d9c718863384e32307f65a9ce319dc362b73feb6
MEDIUM: ring: make the offset relative to the head/tail instead of absolute
For readers, this requires adjusting their offsets to be relative to the
ring head each time read is resumed. Indeed, buffer head can change any
time a ring_write() is performed after older entries were purged.
This operation was not performed on the DNS code which causes the offset
to become invalid. In most cases, the following BUG_ON() was triggered :
FATAL: bug condition "msg_len + ofs + cnt + 1 > b_data(buf)" matched
at src/dns.c:522
Fix this by adjusting DNS reader offsets when entering
dns_session_io_handler() and dns_process_req().
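The re-adjustment can be illustrated with a toy model. This is a sketch under assumptions, not the haproxy ring code: demo_ring and its helpers are hypothetical. The reader saves an absolute storage position when it pauses, and converts it back to a head-relative offset when it resumes, so that a purge performed by the writer in between is taken into account.

```c
#include <stddef.h>

/* Hypothetical ring: <head> is the absolute index of the oldest byte. */
struct demo_ring {
    size_t size;
    size_t head;
};

/* Reader pauses: turn its head-relative offset into an absolute position. */
static size_t demo_save_pos(const struct demo_ring *r, size_t rel)
{
    return (r->head + rel) % r->size;
}

/* Writer purges <n> bytes of old entries to regain room: the head moves. */
static void demo_purge(struct demo_ring *r, size_t n)
{
    r->head = (r->head + n) % r->size;
}

/* Reader resumes: recompute the offset relative to the *current* head. */
static size_t demo_resume_pos(const struct demo_ring *r, size_t abs)
{
    return (abs + r->size - r->head) % r->size;
}
```

For instance, a reader paused 6 bytes after a head located at 2 saved position 8. If 4 bytes are purged meanwhile, it must resume 2 bytes after the new head; reusing the stale relative offset 6 would make it read from position 12, in the middle of another message.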
This bug was reproduced by using a backend with 10 servers using SRV
record resolution on a single resolvers section. A BUG_ON() crash would
occur after less than 5 minutes of process execution.
This does not need to be backported as the above patch is not.
This should fix github issue #2068.
While running some L7 retries tests, Christopher and I stumbled upon a
very strange behavior showing some occasional server timeouts when the
server closes keep-alive connections quickly. The issue can be
reproduced with the following config:
    global
        expose-experimental-directives
        #tune.fd.edge-triggered on  # can speed up the issue

    defaults
        mode http
        timeout client 5s
        timeout server 10s
        timeout connect 2s

    listen f
        bind :8001
        http-reuse always
        retry-on all-retryable-errors
        server next 127.0.0.1:8002

    frontend b
        bind :8002
        timeout http-keep-alive 1   # one ms
        redirect location /
Sending fast requests without reusing the client connection on port 8001
with a single connection and at least 3 threads on haproxy occasionally
shows some pauses (below with timeout server 2s):
$ taskset -c 2,3 h1load -e -t 1 -r 1 -c 1 http://127.0.0.1:8001/
# time conns tot_conn tot_req tot_bytes err cps rps bps ttfb
1 1 9794 9793 959714 0 9k79 9k79 7M67 42.94u
2 1 9794 9793 959714 0 0.00 0.00 0.00 -
3 1 9794 9793 959714 0 0.00 0.00 0.00 -
4 0 16015 16015 1569470 0 6k22 6k22 4M87 522.9u
5 0 18657 18656 1828190 2 2k63 2k63 2M06 39.22u
If this doesn't happen, limiting to a request rate close to 1/timeout
may help.
What is happening is that after several migrations, a late report
via fd_update_events() may detect that the thread is not welcome, and
will want to program an update so that the current thread's poller
disables its polling on it. It is allowed to do so because it used
fd_grab_tgid(). But what if _fd_delete_orphan() was just starting to
be called and already reset the update_mask ? We'll end up with a bit
present in the update mask, then _fd_delete_orphan() resets the tgid,
which will prevent the poller from consuming that update. The update
is not needed anymore since the FD was closed, but in this case nobody
will clear this bit until the same FD is reused again and cleared. And
as long as the thread's bit remains in the update_mask, no new updates
will be programmed for the next use of this FD on the same thread since
due to the bit being present, fd_nbupdt will not be changed. This is
what is causing this timeout.
The fix consists in making sure _fd_delete_orphan() waits for the
occasional watchers to leave, and to do this before clearing the
update_mask. This will be either fd_update_events() trying to check
its thread_mask, or the poller checking its updates, so that's pretty
short. But it definitely closes this race.
This fix is needed since the introduction of fd_grab_tgid(), hence 2.7.
Note that while testing the fix, another related issue concerning the
atomicity of running_mask vs thread_mask popped up and will have to be
fixed till 2.5 as part of another patch. It may make the tests for this
fix occasionally trigger a few BUG_ON() or face a null conn->subs in
sock_conn_iocb(), though these ones are much more difficult to trigger.
This is not caused by this fix.
The MUX instance is released before its quic-conn counterpart. On
termination, an H3 GOAWAY is emitted to prevent the client from opening new
streams for this connection.
The quic-conn instance will stay alive until all opened streams data are
acknowledged. If the client tries to open a new stream during this
interval despite the GOAWAY, quic-conn is responsible for requesting its
immediate closure with a STOP_SENDING + RESET_STREAM.
This behavior was already implemented but the received packet with the
new STREAM was never acknowledged. This was fixed with the following
commit :
commit 156a89aef8c63910502b266251dc34f648a99fae
BUG/MINOR: quic: acknowledge STREAM frame even if MUX is released
However, this patch introduces a regression as it did not skip the call
to qc_handle_strm_frm() despite the MUX instance being released. This
can cause a segfault when using qcc_get_qcs() on a released MUX
instance. To fix this, add a missing break statement which will skip
qc_handle_strm_frm() when the MUX instance is not initialized.
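The control flow of the fix can be sketched as below. Everything here is hypothetical (types, fields and function name); it only illustrates how a break placed after the "MUX released" branch keeps the acknowledgment while skipping the stream handling.

```c
enum toy_frm_type { TOY_FRM_STREAM, TOY_FRM_PING };

struct rx_conn {
    int mux_ready;         /* 0 once the MUX has been released */
    int ack_needed;        /* the received packet must be acknowledged */
    int stop_sending_sent; /* closure of the stream was requested */
    int stream_handled;    /* number of frames passed to the MUX */
};

static void toy_parse_frame(struct rx_conn *c, enum toy_frm_type type)
{
    switch (type) {
    case TOY_FRM_STREAM:
        c->ack_needed = 1;
        if (!c->mux_ready) {
            /* request immediate closure of the stream */
            c->stop_sending_sent = 1;
            /* without this break, the code below would run
             * against a released MUX instance
             */
            break;
        }
        c->stream_handled++;
        break;
    case TOY_FRM_PING:
        c->ack_needed = 1;
        break;
    }
}
```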
This crash was reproduced using a short timeout client and sending
several requests with delay between them by using a modified aioquic. It
produces a crash with the following backtrace :
#0 0x000055555594d261 in __eb64_lookup (x=4, root=0x7ffff4091f60) at include/import/eb64tree.h:132
#1 eb64_lookup (root=0x7ffff4091f60, x=4) at src/eb64tree.c:37
#2 0x000055555563fc66 in qcc_get_qcs (qcc=0x7ffff4091dc0, id=4, receive_only=1, send_only=0, out=0x7ffff780ca70) at src/mux_quic.c:668
#3 0x0000555555641e1a in qcc_recv (qcc=0x7ffff4091dc0, id=4, len=40, offset=0, fin=1 '\001', data=0x7ffff40c4fef "\001&") at src/mux_quic.c:974
#4 0x0000555555619d28 in qc_handle_strm_frm (pkt=0x7ffff4088e60, strm_frm=0x7ffff780cf50, qc=0x7ffff7cef000, fin=1 '\001') at src/quic_conn.c:2515
#5 0x000055555561d677 in qc_parse_pkt_frms (qc=0x7ffff7cef000, pkt=0x7ffff4088e60, qel=0x7ffff7cef6c0) at src/quic_conn.c:3050
#6 0x00005555556230aa in qc_treat_rx_pkts (qc=0x7ffff7cef000, cur_el=0x7ffff7cef6c0, next_el=0x0) at src/quic_conn.c:4214
#7 0x0000555555625fee in quic_conn_app_io_cb (t=0x7ffff40c1fa0, context=0x7ffff7cef000, state=32848) at src/quic_conn.c:4640
#8 0x00005555558a676d in run_tasks_from_lists (budgets=0x7ffff780d470) at src/task.c:596
#9 0x00005555558a725b in process_runnable_tasks () at src/task.c:876
#10 0x00005555558522ba in run_poll_loop () at src/haproxy.c:2945
#11 0x00005555558529ac in run_thread_poll_loop (data=0x555555d14440 <ha_thread_info+64>) at src/haproxy.c:3141
#12 0x00007ffff789ebb5 in ?? () from /usr/lib/libc.so.6
#13 0x00007ffff7920d90 in ?? () from /usr/lib/libc.so.6
This should fix github issue #2067.
This must be backported up to 2.6.
In very rare cases, the Initial packet number space
must be probed even if there are no more in-flight CRYPTO frames.
In such cases, a PING frame is sent in an Initial packet. As this
packet is ack-eliciting, it must be padded by the server. qc_do_build_pkt()
is modified to do so.
Take the opportunity of this patch to modify the trace for TX frames to
easily distinguish them from other frame-related traces.
Must be backported to 2.7.
Mark the connection as limited by the anti-amplification limit when trying to
probe the peer.
Wake up the connection PTO/loss detection timer as soon as a datagram is
received. This was previously done only when the datagram was dropped.
This fixes deadlock issues revealed by some interop runner tests.
Must be backported to 2.7 and 2.6.
Some frames are marked as already acknowledged from duplicated packets
whose original packet has been acknowledged. There is no need
to resend such packets or frames.
Implement qc_pkt_with_only_acked_frms() to detect packet with only
already acknowledged frames inside and use it from qc_prep_fast_retrans()
which selects the packet to be retransmitted.
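A possible shape for such a helper, given as a sketch only: the structures, the flag value and toy_pick_retrans() are hypothetical and much simpler than the real quic_tx_packet and quic_frame lists.

```c
#include <stddef.h>

#define TOY_FL_TX_FRAME_ACKED 0x01 /* hypothetical flag value */

struct toy_frame {
    unsigned int flags;
};

struct toy_pkt {
    const struct toy_frame *frms; /* frames carried by the packet */
    size_t nb;                    /* number of frames */
};

/* Return 1 if every frame of <pkt> was already acknowledged, 0 otherwise. */
static int toy_pkt_with_only_acked_frms(const struct toy_pkt *pkt)
{
    size_t i;

    for (i = 0; i < pkt->nb; i++) {
        if (!(pkt->frms[i].flags & TOY_FL_TX_FRAME_ACKED))
            return 0;
    }
    return 1;
}

/* Select the first packet worth retransmitting, skipping the ones which
 * only carry acknowledged frames. Returns its index, or -1 if none.
 */
static int toy_pick_retrans(const struct toy_pkt *pkts, size_t nb)
{
    size_t i;

    for (i = 0; i < nb; i++) {
        if (!toy_pkt_with_only_acked_frms(&pkts[i]))
            return (int)i;
    }
    return -1;
}
```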
Must be backported to 2.6 and 2.7.
Even if there is a check in callers of qc_prep_hdshk_fast_retrans() and
qc_prep_fast_retrans() to prevent retransmissions of packets with no ack-eliciting
frames, these two functions should take care not to do that, especially if
someone decides to modify their implementations in the future.
Must be backported to 2.6 and 2.7.
This is an old bug which arrived in this commit, probably due to a misinterpretation
of the RFC, where the desired effect was to acknowledge all the
handshake packets:
77ac6f566 BUG/MINOR: quic: Missing acknowledgments for trailing packets
This had the bad effect of acknowledging all the handshake packets, even the
ones which are not ack-eliciting.
Must be backported to 2.7 and 2.6.
Dump the secret used to derive the next one during a key update initiated by the
client, and dump the resulting new secret along with the new key and IV to be used to
decrypt Application level packets.
Also add a trace when the key update is supposed to be initiated on haproxy side.
This has already helped in diagnosing an issue revealed by the key update interop
test with xquic as client.
Must be backported to 2.7.
v2 interop runner test revealed this bug as follows:
[01|quic|4|c_conn.c:4087] new packet : qc@0x7f62ec026e30 pkt@0x7f62ec056390 el=I pn=491940080 rel=H
[01|quic|5|c_conn.c:1509] qc_pkt_decrypt(): entering : qc@0x7f62ec026e30
[01|quic|0|c_conn.c:1553] quic_tls_decrypt() failed : qc@0x7f62ec026e30
[01|quic|5|c_conn.c:1575] qc_pkt_decrypt(): leaving : qc@0x7f62ec026e30
[01|quic|0|c_conn.c:4091] packet decryption failed -> dropped : qc@0x7f62ec026e30 pkt@0x7f62ec056390 el=I pn=491940080
Only the decryption of v2 Initial packets received from clients was impacted. There
is no issue when encrypting v2 Initial packets. This is due to the fact that, when
the version is negotiated, the client may send two versions of Initial packets (currently v1,
then v2). The selection was done for the TX path but not on the RX path.
Implement qc_select_tls_ctx() to select the correct TLS cipher context for all
types of packets and call this function before removing the header protection
and before deciphering the packet.
Must be backported to 2.7.
When retransmitting datagrams with two coalesced packets inside, the second
packet was not taken into consideration when checking that there is enough room
on the network for the datagram, especially when limited by the anti-amplification limit.
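The check can be sketched as follows for the anti-amplification part, assuming the 3x rule from RFC 9000 section 8.1. The structure and function names are hypothetical, and congestion control is deliberately left out of this sketch.

```c
#include <stddef.h>

struct amp_conn {
    size_t rx_bytes;    /* bytes received from the peer so far */
    size_t tx_bytes;    /* bytes already sent to the peer */
    int addr_validated; /* no limit once the peer address is validated */
};

/* Bytes which may still be sent before address validation. */
static size_t toy_amp_budget(const struct amp_conn *c)
{
    size_t limit = 3 * c->rx_bytes;

    return limit > c->tx_bytes ? limit - c->tx_bytes : 0;
}

/* Return 1 if a datagram made of two coalesced packets may be
 * retransmitted. The point of the fix is that <second_len> is part of the
 * datagram length; pass 0 when there is no second packet.
 */
static int toy_may_retransmit_dgram(const struct amp_conn *c,
                                    size_t first_len, size_t second_len)
{
    size_t dglen = first_len + second_len;

    if (c->addr_validated)
        return 1;
    return dglen <= toy_amp_budget(c);
}
```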
Must be backported to 2.6 and 2.7.
Before building a packet into a datagram, ensure there is sufficient space for at
least 1200 bytes. Also pad datagrams with only one ack-eliciting Initial packet
inside.
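A sketch of both checks, with hypothetical names. It ignores packet protection: in a real implementation the PADDING frames (type 0x00) have to be part of the packet payload before it is encrypted, whereas this example simply zero-fills the end of a plain buffer.

```c
#include <stddef.h>
#include <string.h>

#define TOY_INITIAL_MIN_DGRAM 1200 /* minimal size of a padded datagram */

/* Return 1 if <room> bytes are enough to start building the packet. */
static int toy_can_build_initial(size_t room)
{
    return room >= TOY_INITIAL_MIN_DGRAM;
}

/* Pad a datagram of <len> bytes stored in <buf> (capacity <cap>) up to
 * TOY_INITIAL_MIN_DGRAM bytes with zeroes. Returns the new length, or
 * <len> unchanged if no padding is needed or possible.
 */
static size_t toy_pad_dgram(unsigned char *buf, size_t len, size_t cap)
{
    if (len >= TOY_INITIAL_MIN_DGRAM || cap < TOY_INITIAL_MIN_DGRAM)
        return len;
    memset(buf + len, 0, TOY_INITIAL_MIN_DGRAM - len);
    return TOY_INITIAL_MIN_DGRAM;
}
```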
Must be backported to 2.7 and 2.6.
Making http_7239_valid_obfsc() inline because it is only called by inline
functions.
Removing dead comment and documenting proxy_http_parse_{7239,xff,xot} functions.
No backport needed.
Anonymization mode has two CLI handlers "set anon <on|off>" and "set
anon global-key". The last one only requires admin level. However, as
cli_find_kw() is implemented, only the first handler will be retrieved
as they both start with the same prefix "set anon".
As a result, the wrong handler is executed for "set anon
global-key", with an error message about an invalid keyword. To fix this,
handler definitions have been separated for the "set anon on" and "set
anon off" commands. This keeps changes minimal while retaining
the same "set anon" prefix for each command.
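The lookup issue can be reproduced with a toy first-match keyword table. This is not the cli_find_kw() implementation, only a sketch assuming that the first entry whose words all match the beginning of the command line wins; all names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_WORDS 4

struct toy_cli_kw {
    const char *words[TOY_MAX_WORDS]; /* keyword words, NULL-padded */
    const char *handler;              /* handler name, for illustration */
};

/* Return the first entry of <kws> whose words all match the first words
 * of the NULL-terminated <args> array, or NULL if there is none.
 */
static const struct toy_cli_kw *toy_find_kw(const struct toy_cli_kw *kws,
                                            size_t nb,
                                            const char *const *args)
{
    size_t i, w;

    for (i = 0; i < nb; i++) {
        for (w = 0; w < TOY_MAX_WORDS && kws[i].words[w]; w++) {
            if (!args[w] || strcmp(args[w], kws[i].words[w]) != 0)
                break;
        }
        /* match if all the words of the entry were consumed */
        if (w == TOY_MAX_WORDS || !kws[i].words[w])
            return &kws[i];
    }
    return NULL;
}
```

With a table holding { "set", "anon" } followed by { "set", "anon", "global-key" }, the command "set anon global-key" is caught by the first entry. Declaring { "set", "anon", "on" } and { "set", "anon", "off" } separately lets the third word decide.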
Also take this opportunity to fix a reference to a non-existing "set
global-key" CLI handler in the documentation.
This must be backported up to 2.7.
When a STREAM frame is re-emitted, it will point to the same stream
buffer as the original one. If an ACK is received for either one of
these frames, the underlying buffer may be freed. Thus, if the second
frame is declared as lost and scheduled for retransmission, we must
ensure that the underlying buffer is still allocated or interrupt the
retransmission.
The stream buffer is stored in an eb_tree indexed by the stream ID. To avoid
a tree lookup each time a STREAM frame is re-emitted, a lost
STREAM frame is flagged as QUIC_FL_TX_FRAME_LOST.
In most cases, this code is functional. However, there are several
potential issues which may cause a segfault :
- when explicitly probing with a STREAM frame, the frame won't be
flagged as lost
- when splitting a STREAM frame during retransmission, the flag is not
copied
To fix both these cases, QUIC_FL_TX_FRAME_LOST flag has been converted
to a <dup> field in quic_stream structure. This field is now properly
copied when splitting a STREAM frame. Also, as this is now an inner
quic_frame field, it will be copied automatically on qc_frm_dup()
invocation thus ensuring that it will be set on probing.
This issue was encountered randomly with the following backtrace :
#0 __memmove_avx512_unaligned_erms ()
#1 0x000055f4d5a48c01 in memcpy (__len=18446698486215405173, __src=<optimized out>,
#2 quic_build_stream_frame (buf=0x7f6ac3fcb400, end=<optimized out>, frm=0x7f6a00556620,
#3 0x000055f4d5a4a147 in qc_build_frm (buf=buf@entry=0x7f6ac3fcb5d8,
#4 0x000055f4d5a23300 in qc_do_build_pkt (pos=<optimized out>, end=<optimized out>,
#5 0x000055f4d5a25976 in qc_build_pkt (pos=0x7f6ac3fcba10,
#6 0x000055f4d5a30c7e in qc_prep_app_pkts (frms=0x7f6a0032bc50, buf=0x7f6a0032bf30,
#7 qc_send_app_pkts (qc=0x7f6a0032b310, frms=0x7f6a0032bc50) at src/quic_conn.c:4184
#8 0x000055f4d5a35f42 in quic_conn_app_io_cb (t=0x7f6a0009c660, context=0x7f6a0032b310,
This should fix github issue #2051.
This should be backported up to 2.6.
In the OCSP response callback, instead of using the actual date of the
system, the scheduler's 'now' timer is used when checking a response's
validity.
This patch can be backported to all stable versions.