Christopher Faulet
cac1e39e86 BUG/MEDIUM: h2: Only report early HTX EOM for tunneled streams
For regular H2 messages, the HTX EOM flag is synonymous with the end of input,
so the SE_FL_EOI flag must also be set on the stream-endpoint descriptor.
However, there is an exception: for tunneled streams, the end of message is
reported on the HTX message just after the headers, but in that case, no end
of input is reported on the SE.
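
As a minimal sketch of that rule (HTX_FL_EOM, SE_FL_EOI and se_fl_set() are
actual haproxy identifiers; the tunnel predicate is hypothetical):

    /* regular message: EOM on the HTX message implies end of input */
    if (htx->flags & HTX_FL_EOM) {
        if (!h2s_is_tunnel(h2s))            /* hypothetical predicate */
            se_fl_set(h2s->sd, SE_FL_EOI);
        /* tunnel: EOM is reported right after the headers, no EOI yet */
    }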

But here, there is a bug: the "early" EOM is also reported on the HTX message
when there is no payload (for instance a content-length set to 0). If there
is no ES flag on the H2 HEADERS frame, this is an unexpected case, because
the applicative stream, and most probably the opposite endpoint, consider the
message finished and switch it to its DONE state (or the equivalent on the
endpoint). But if an extra H2 frame with the ES flag is received, a TRAILERS
frame or an empty DATA frame, an extra EOT HTX block is pushed to carry the
HTX EOM flag. So an extra HTX block is emitted for a regular HTX message. It
is totally invalid; it must never happen.

Because it is an undefined behavior, it is difficult to predict the result.
But it definitely prevents the applicative stream from properly handling
aborts and errors, because data remain blocked in the channel buffer: the
end of the message was seen, so no more data are forwarded.

It seems to be an issue for 2.8 and above; it is harder to evaluate for older
versions.

This patch must be backported as far as 2.4.

(cherry picked from commit 6743e128f34fba297f2cac836a4f11b84acd503a)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-09-03 18:33:34 +02:00
Christopher Faulet
aa43ed1719 BUG/MEDIUM: http-ana: Report error on write error waiting for the response
When we are waiting for the server response, if an error is pending on the
frontend side (a write error on the client), it is handled as an abort and
all regular response analyzers are removed, except the one responsible for
releasing the filters, if any. However, while it is handled as an abort, the
error is not reported, as usual, via the http_reply_and_close() function.
This is an issue because, in that case, the channel buffers are not reset.

Because of this bug, it is possible to block a stream infinitely: the
request side is waiting for the response side, and the response side is
blocked because filters must be released, which cannot be done because
data remain blocked in the channel buffers.

So, in that case, calling http_reply_and_close() with no message is enough
to unblock the stream.
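
As an illustration, the fix roughly amounts to the following call in the
write-error path (hedged sketch; the exact signature may differ between
versions):

    /* report the abort and reset the channel buffers; passing no
     * message (NULL) is enough to unblock the stream */
    http_reply_and_close(s, txn->status, NULL);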

This patch must be backported as far as 2.8.

(cherry picked from commit 0ba6202796fe24099aeff89a5a4b83af99fc027b)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-09-03 18:33:26 +02:00
Amaury Denoyelle
0a06f5f5ae BUG/MEDIUM: quic: prevent conn freeze on 0RTT undeciphered content
Received QUIC packets are stored in the quic_conn Rx buffer after header
protection removal in qc_rx_pkt_handle(). These packets are then processed
and removed by the quic_conn I/O handler via qc_treat_rx_pkts().

If HP cannot be removed, packets are still copied into quic_conn Rx
buffer. This can happen if encryption level TLS keys are not yet
available. The packet remains in the buffer until HP can be removed and
its content processed.

An issue occurs if a client emits a 0-RTT packet but haproxy does not have
the shared secret, for example after a haproxy process restart. In this
case, the packet is copied into the quic_conn Rx buffer but its HP won't
ever be removed. This prevents the buffer from being purged. After some
time, if the client has emitted enough packets, the Rx buffer won't have any
space left and received packets are dropped. This will cause the connection
to freeze.

To fix this, remove any 0-RTT buffered packets on handshake completion.
At this stage, 0-RTT packets are no longer necessary. The client is
expected to reemit its content in 1-RTT packets, which are properly
deciphered.

This can easily be reproduced with HTTP/3 POST requests or by retrieving a
big enough object, which will fill the Rx buffer with ACK frames. Here is a
picoquic command to provoke the issue on haproxy startup:

$ picoquicdemo -Q -v 00000001 -a h3 <hostname> 20443 "/?s=1g"

Note that allow-0rtt must be present on the bind line to trigger the
issue. Else haproxy will reject any 0-RTT packets.
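
For reference, a bind line able to trigger the issue could look like this
(hedged example; addresses and certificate path are illustrative):

    frontend fe
        mode http
        bind quic4@:20443 ssl crt /etc/haproxy/cert.pem alpn h3 allow-0rtt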

This must be backported up to 2.6.

This could be one of the reasons for github issue #2549 but it's unsure
for now.

(cherry picked from commit bba6baff306820a01ea66fddde5acbad11c601b6)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-09-03 18:33:14 +02:00
William Lallemand
5072a968c3 BUG/MEDIUM: ssl: 0-RTT initialized at the wrong place for AWS-LC
Revert patch fcc8255 "MINOR: ssl_sock: Early data disabled during
SSL_CTX switching (aws-lc)". The patch was done in the wrong callback,
which is never built for AWS-LC, and applied options on the SSL_CTX
instead of the SSL, which should never be done elsewhere than in the
configuration parsing.

This was probably triggered by successfully linking haproxy against
AWS-LC without using USE_OPENSSL_AWSLC.

The patch also reintroduced SSL_CTX_set_early_data_enabled() in
ssl_quic_initial_ctx() and ssl_sock_initial_ctx(). So the initial_ctx
does have the right setting, but it still needs to be applied to the
SSL_CTX selected during the clienthello, because that is where we need it.

Must be backported to 3.0. (ssl_clienthello.c part was in ssl_sock.c)

(cherry picked from commit 1889b86561ee67696760111c6df5759c628430dc)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-09-03 18:33:00 +02:00
William Lallemand
5bf426baa4 BUG/MEDIUM: ssl: reactivate 0-RTT for AWS-LC
Then reactivate HAVE_SSL_0RTT and HAVE_SSL_0RTT_QUIC for AWS-LC, which
were wrongly deactivated in f5353f2c ("MINOR: ssl: add HAVE_SSL_0RTT
constant").

Must be backported to 3.0.

(cherry picked from commit 56eefd6827b42afcefed7cc41d2cc38f5c1a2172)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-09-03 18:31:27 +02:00
Willy Tarreau
01aeb7495f BUG/MINOR: stconn: bs.id and fs.id had their dependencies incorrect
The backend depends on the response and the frontend on the request, not
the other way around. In addition, they used to depend on L6 (hence
contents in the channel buffers) while they should only depend on L5
(permanent info known in the mux).
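
A hedged sketch of the corrected declarations (the fetch function name is
illustrative; the SMP_USE_* constants are the real haproxy ones):

    /* fs.id: frontend stream, depends on the request side (L5, client) */
    { "fs.id", smp_fetch_stream_id, 0, NULL, SMP_T_SINT, SMP_USE_L5CLI },
    /* bs.id: backend stream, depends on the response side (L5, server) */
    { "bs.id", smp_fetch_stream_id, 0, NULL, SMP_T_SINT, SMP_USE_L5SRV },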

This came in 2.9 with commit 24059615a7 ("MINOR: Add sample fetches to
get the frontend and backend stream ID") so this can be backported there.

(cherry picked from commit 61dd0156c82ea051779e6524cad403871c31fc5a)
Signed-off-by: Willy Tarreau <w@1wt.eu>
(cherry picked from commit 376b147ffffef0a1f898d72d1d70f10f07d2e5a4)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-09-03 18:31:22 +02:00
Christopher Faulet
40142d2b95 BUILD: mux-pt: Use the right name for the sedesc variable
A typo was introduced in 760d26a86 ("BUG/MEDIUM: mux-pt/mux-h1: Release the
pipe on connection error on sending path"). The sedesc variable is 'sd', not
'se'.

This patch must be backported with the commit above.

(cherry picked from commit d9f41b1d6e811022372ce541e67b047bd18630a9)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-09-03 18:31:16 +02:00
Christopher Faulet
7749062aba BUG/MEDIUM: mux-pt/mux-h1: Release the pipe on connection error on sending path
When data are sent using kernel splicing, if a connection error occurred,
the pipe must be released. Indeed, in that case, no more data can be sent
and there is no reason not to release the pipe. But it is in fact an issue
for the stream, because the channel will not appear as empty. This may
prevent the stream from being released. This happens on 2.8 when a filter
is also attached on it. On 2.9 and above, there seems to be no issue, but it
is hard to be sure and the current patch remains valid in all cases. On 2.6
and lower, the code is not the same and, AFAIK, there is no issue.

This patch must be backported to 2.8. However, on 2.8, there is no zero-copy
data forwarding, so the patch must be adapted: there are no done_ff/resume_ff
callback functions for muxes. The pipe must be released in sc_conn_send()
when an error flag is set on the SE, after the call to the snd_pipe callback
function.
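
A hedged sketch of that 2.8 adaptation (placement per the note above;
sc_ep_test() and put_pipe() are actual haproxy helpers, the surrounding
shape is assumed):

    /* in sc_conn_send(), after the snd_pipe callback: on error, no
     * more data can be sent, so release the pipe */
    if (sc_ep_test(sc, SE_FL_ERROR) && oc->pipe) {
        put_pipe(oc->pipe);
        oc->pipe = NULL;
    }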

(cherry picked from commit 760d26a8625f3af2b6939037a40f19b5f8063be1)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-09-03 18:31:09 +02:00
Christopher Faulet
e2a93b6492 BUG/MEDIUM: stconn: Report error on SC on send if a previous SE error was set
When a send on a connection is performed, if a SE error (or a pending error)
was already reported earlier, we leave immediately; no send is performed.
However, we must be sure to report the error at the SC level if necessary.
Indeed, the SE error may have been reported during the zero-copy data
forwarding, i.e. during a receive on the opposite side. In that case, we may
have missed the opportunity to report it at the SC level.

The patch must be backported as far as 2.8.

(cherry picked from commit 5dc45445ff18207dbacebf1f777e1f1abcd5065d)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-09-03 18:31:01 +02:00
Aurelien DARRAGON
b2dabc930c BUG/MEDIUM: server/addr: fix tune.events.max-events-at-once event miss and leak
An issue has been introduced with cd99440 ("BUG/MAJOR: server/addr: fix
a race during server addr:svc_port updates").

Indeed, in the above commit we implemented the atomic_sync task which is
responsible for consuming pending server events to apply the changes
atomically. For now only server's addr updates are concerned.

To prevent the task from causing contention, a budget was assigned to it.
It can be controlled with the global tunable
'tune.events.max-events-at-once': the task may not process more than this
number of events at once.

However, a bug was introduced with this budget logic: each time the task
has to be interrupted because it runs out of budget, we reschedule the
task to finish where it left off, but the current event, which was already
removed from the queue, hasn't been processed yet. This means that one
pending event (every tune.events.max-events-at-once events) is effectively
lost.

When the atomic_sync task deals with a large number of concurrent events,
this bug has 2 known consequences: first, a server's addr/port update
will be lost every 'tune.events.max-events-at-once' events. This can of
course cause reliability issues, because if the event is not republished
periodically, the server could stay in a stale state for an indefinite
amount of time. This is the case when the DNS server flaps for instance:
some servers may not come back UP after the incident, as described in
GH #2666.

Another issue is that the lost event was not cleaned up, resulting in a
small memory leak. So in the end, it means that the bug is likely to
cause more and more degradation over time until haproxy is restarted.

As a workaround, 'tune.events.max-events-at-once' may be set to the
maximum number of events expected per batch. Note however that this value
cannot exceed 10 000, otherwise it could cause the watchdog to trigger due
to the task being busy for too long and preventing other threads from
making any progress. Setting higher values may not be optimal for common
workloads so it should only be used to mitigate the bug while waiting for
this fix.
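
For example (mitigation only, see the limits above):

    global
        tune.events.max-events-at-once 10000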

Since tune.events.max-events-at-once defaults to 100, this bug only
affects configs that involve more than 100 servers whose addr:port
properties are likely to be updated at the same time (batched updates
from cli, lua, dns...).

To fix the bug, we move the budget check after the current event is fully
handled. For that we went from a basic 'while' to a 'do..while' loop, as we
assume from the config that 'tune.events.max-events-at-once' cannot be 0.
While at it, we reschedule the task once thread isolation ends (it was not
required to perform the reschedule while under isolation) to hand control
back faster to waiting threads.
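
A simplified sketch of the loop change (helper names are illustrative):

    int budget = max_events_at_once;   /* guaranteed > 0 by config checks */
    do {
        struct event *ev = dequeue_event();  /* illustrative helper */
        if (!ev)
            break;
        handle_and_free(ev);  /* event fully handled before budget check */
    } while (--budget);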

This patch should be backported up to 2.9 with cd99440. It should fix
GH #2666.

(cherry picked from commit 8f1fd96d17588fb571959901bd20d4239b1a96af)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-09-03 16:39:52 +02:00
Willy Tarreau
7a59afa93b [RELEASE] Released version 3.0.4
Released version 3.0.4 with the following main changes :
    - MINOR: proto: extend connection thread rebind API
    - BUILD: listener: silence a build warning about unused value without threads
    - BUG/MEDIUM: quic: prevent crash on accept queue full
    - CLEANUP: proto: rename TID affinity callbacks
    - CLEANUP: quic: rename TID affinity elements
    - BUG/MINOR: session: Eval L4/L5 rules defined in the default section
    - BUG/MEDIUM: debug/cli: fix "show threads" crashing with low thread counts
    - DOC: install: don't reference removed CPU arg
    - BUG/MEDIUM: ssl_sock: fix deadlock in ssl_sock_load_ocsp() on error path
    - BUG/MAJOR: mux-h2: force a hard error upon short read with pending error
    - DOC: configuration: issuers-chain-path not compatible with OCSP
    - DOC: config: improve the http-keep-alive section
    - BUG/MINOR: stick-table: fix crash for src_inc_gpc() without stkcounter
    - BUG/MINOR: server: Don't warn fallback IP is used during init-addr resolution
    - BUG/MINOR: cli: Atomically inc the global request counter between CLI commands
    - BUG/MINOR: quic: Non optimal first datagram.
    - MEDIUM: sink: don't set NOLINGER flag on the outgoing stream interface
    - BUG/MINOR: quic: Lack of precision when computing K (cubic only cc)
    - BUG/MEDIUM: jwt: Clear SSL error queue on error when checking the signature
    - MINOR: quic: Dump TX in flight bytes vs window values ratio.
    - MINOR: quic: Add information to "show quic" for CUBIC cc.
    - MEDIUM: h1: allow to preserve keep-alive on T-E + C-L
    - MINOR: queue: add a function to check for TOCTOU after queueing
    - BUG/MEDIUM: queue: deal with a rare TOCTOU in assign_server_and_queue()
    - MEDIUM: init: set default for fd_hard_limit via DEFAULT_MAXFD (take #2)
    - BUG/MEDIUM: init: fix fd_hard_limit default in compute_ideal_maxconn
    - Revert "MEDIUM: sink: don't set NOLINGER flag on the outgoing stream interface"
    - MEDIUM: log: relax some checks and emit diag warnings instead in lf_expr_postcheck()
    - DOC: quic: fix default minimal value for max window size
    - MINOR: proxy: Add support of 429-Too-Many-Requests in retry-on status
    - BUG/MEDIUM: mux-h2: Set ES flag when necessary on 0-copy data forwarding
    - BUG/MEDIUM: stream: Prevent mux upgrades if client connection is no longer ready
    - BUG/MINOR: proxy: Match on 429 status when trying to perform a L7 retry
    - BUG/MEDIUM: mux-pt: Never fully close the connection on shutdown
    - BUG/MEDIUM: cli: Always release back endpoint between two commands on the mcli
    - BUG/MINOR: quic: unexploited retransmission cases for Initial pktns.
    - BUG/MEDIUM: mux-h1: Properly handle empty message when an error is triggered
    - MINOR: mux-h2: try to clear DEM_MROOM and MUX_MFULL at more places
    - BUG/MAJOR: mux-h2: always clear MUX_MFULL and DEM_MROOM when clearing the mbuf
    - BUG/MINOR: quic: Too short datagram during 0-RTT handshakes (aws-lc only)
    - BUG/MINOR: Crash on 0-RTT RX packet after dropping Initial pktns
    - BUG/MEDIUM: mux-pt: Fix condition to perform a shutdown for writes in mux_pt_shut()
2024-09-03 15:37:09 +02:00
Christopher Faulet
710f2389d4 BUG/MEDIUM: mux-pt: Fix condition to perform a shutdown for writes in mux_pt_shut()
A regression was introduced in commit 76fa71f7a ("BUG/MEDIUM: mux-pt:
Never fully close the connection on shutdown") because of a typo on the
connection flags. The CO_FL_SOCK_WR_SH flag must be tested to prevent a call
to conn_sock_shutw(), not CO_FL_SOCK_RD_SH.
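
A hedged sketch of the corrected test (the 'clean' argument is illustrative):

    /* only shut writes if not already done; this used to wrongly
     * test CO_FL_SOCK_RD_SH */
    if (!(conn->flags & CO_FL_SOCK_WR_SH))
        conn_sock_shutw(conn, 1);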

Concretely, most of the time, it is harmless because the shutdown for writes
is always performed before any shutdown for reads, except in the case
described by the commit above. But it is not clear whether it has an impact
or not.

This patch must be backported with the commit above, so as far as 2.9.

(cherry picked from commit e1cae428791abf4e4fdf3969761eaaafd45df636)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-09-03 15:31:58 +02:00
Frederic Lecaille
2f7ae07342 BUG/MINOR: Crash on 0-RTT RX packet after dropping Initial pktns
This bug arrived with this naive commit:

    BUG/MINOR: quic: Too short datagram during 0-RTT handshakes (aws-lc only)

which omitted to consider the case where the Initial packet number space
could be discarded before receiving 0-RTT packets.

To fix this, append/insert the 0-RTT (early-data) packet number space
into the encryption level list depending on the presence or not of
the Initial packet number space.

This issue was revealed when using aws-lc as TLS stack in GH #2701 issue.
Thank you to @Tristan971 for having reported this issue.

Must be backported where the commit mentioned above is supposed to be
backported: as far as 2.9.

(cherry picked from commit 7e19432fd41e9c0146f0227b43d0dd3dc740e20b)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-09-03 15:29:32 +02:00
Frederic Lecaille
19d9009a91 BUG/MINOR: quic: Too short datagram during 0-RTT handshakes (aws-lc only)
By "aws-lc only", one means that this bug was first revealed by the aws-lc
stack. This does not mean it cannot appear with new versions of other TLS
stacks which have never revealed it so far.

This bug was reported by Ilya (@chipitsine) in GH #2657, where some QUIC
interop tests (resumption, zerortt) could lead to a crash with haproxy
compiled against the aws-lc TLS stack. These crashes were triggered by this
BUG_ON() which detects that too short datagrams with at least one
ack-eliciting Initial packet inside could be built.

  <0>2024-07-31T15:13:42.562717+02:00 [01|quic|5|quic_tx.c:739] qc_prep_pkts():
  next encryption level : qc@0x61d000041080 idle_timer_task@0x60d000006b80 flags=0x6000058

  FATAL: bug condition "first_pkt->type == QUIC_PACKET_TYPE_INITIAL && (first_pkt->flags & (1UL << 0)) && length < 1200" matched at src/quic_tx.c:163
  call trace(12):
  | 0x563ea447bc02 [ba d9 00 00 00 48 8d 35]: main-0x1958ce
  | 0x563ea4482703 [e9 73 fe ff ff ba 03 00]: qc_send+0x17e4/0x1b5d
  | 0x563ea4488ab4 [85 c0 0f 85 00 f6 ff ff]: quic_conn_io_cb+0xab1/0xf1c
  | 0x563ea468e6f9 [48 c7 c0 f8 55 ff ff 64]: run_tasks_from_lists+0x173/0x9c2
  | 0x563ea468f24a [8b 7d a0 29 c7 85 ff 0f]: process_runnable_tasks+0x302/0x6e6
  | 0x563ea4610893 [83 3d aa 65 44 00 01 0f]: run_poll_loop+0x6e/0x57b
  | 0x563ea4611043 [48 8b 1d 46 c7 1d 00 48]: main-0x48d
  | 0x7f64d05fb609 [64 48 89 04 25 30 06 00]: libpthread:+0x8609
  | 0x7f64d0520353 [48 89 c7 b8 3c 00 00 00]: libc:clone+0x43/0x5e

That said, everything was correctly done by qc_prep_pkts() to prevent such a
case. But this relied on the hypothesis that the list of encryption levels it
used was always built in the same order as follows for 0-RTT sessions:

    initial, early-data, handshake, application

But this order is determined by the order in which the TLS stack derives the
secrets for these encryption levels. For aws-lc, this order is not the same;
it is as follows:

    initial, handshake, application, early-data

During 0-RTT sessions, the server may have to build three ack-eliciting
packets (with CRYPTO data inside) to reply to the first client packet:
initial, handshake, application. qc_prep_pkts() adds a PADDING frame to the
last built packet for the last encryption level in the list. But after the
application level encryption, there is the early-data encryption level. This
prevented qc_prep_pkts() from building a padded application level packet
last, to send a 1200-byte datagram.

To fix this, always insert early-data encryption level after the initial
encryption level into the encryption levels list when initializing this encryption
level from quic_conn_enc_level_init().

Must be backported as far as 2.9.

(cherry picked from commit e12620a8a909805755ae1e8b8552c187202a9f3f)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-09-03 15:29:25 +02:00
Willy Tarreau
c725db17e8 BUG/MAJOR: mux-h2: always clear MUX_MFULL and DEM_MROOM when clearing the mbuf
There exists an extremely tricky code path that was revealed in 3.0 by
the glitches feature, though it might theoretically have existed before.

TL;DR: a mux mbuf may be full after successfully sending GOAWAY, and
discard its remaining contents without clearing H2_CF_MUX_MFULL and
H2_CF_DEM_MROOM, then endlessly loop in h2_send(), until the watchdog
takes care of it.

What can happen is the following: Some data are received, h2_io_cb() is
called. h2_recv() is called to receive the incoming data. Then
h2_process() is called and in turn calls h2_process_demux() to process
input data. At some point, a glitch limit is reached and h2c_error() is
called to close the connection. The input frame was incomplete, so some
data are left in the demux buffer. Then h2_send() is called, which in
turn calls h2_process_mux(), which manages to queue the GOAWAY frame,
turning the state to H2_CS_ERROR2. The frame is sent, and h2_process()
calls h2_send() a last time (doing nothing) and leaves. The streams
are all woken up to notify about the error.

Multiple backend streams were waiting to be scheduled and are woken up
in turn, before their parents are notified, and communicate with the
h2 mux in zero-copy-forward mode: they request a buffer via h2_nego_ff(),
fill it, and commit it with h2_done_ff(). At some point the mux's output
buffer is full, and gets flagged with H2_CF_MUX_MFULL.

The io_cb is called again to process more incoming data. h2_send() isn't
called (polled) or does nothing (e.g. TCP socket buffers full). h2_recv()
may or may not do anything (doesn't matter). h2_process() is called since
some data remain in the demux buf. It goes till the end, where it finds
st0 == H2_CS_ERROR2 and clears the mbuf. We're now in a situation where
the mbuf is empty and MFULL is still present.

Then it calls h2_send(), which doesn't call h2_process_mux() due to
MFULL, doesn't enter the for() loop since all buffers are empty, then
keeps sent=0, which doesn't allow to clear the MFULL flag, and since
"done" was not reset, it loops forever there.

Note that the glitches make the issue more reproducible but theoretically
it could happen with any other GOAWAY (e.g. PROTOCOL_ERROR). What makes
it not happen with the data produced on the parsing side is that we
process a single buffer of input at once, and there's no way to amplify
this to 30 buffers of responses (RST_STREAM, GOAWAY, SETTINGS ACK,
WINDOW_UPDATE, PING ACK etc are all quite small), and since the mbuf is
cleared upon every exit from h2_process() once the error was sent, it is
not possible to accumulate response data across multiple calls. And the
regular h2_snd_buf() path checks for st0 >= H2_CS_ERROR so it will not
produce any data there either.

Probably h2_nego_ff() should check for H2_CS_ERROR before accepting
to deliver a buffer, but this needs to be carefully studied. In the
meantime, the real problem is that the MFULL flag was kept when clearing the
buffer, making the two inconsistent.
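
As a hedged sketch, the fix keeps the flags consistent wherever the mbuf is
discarded (the release call is illustrative of what the real code does):

    /* discarding pending output: the mbuf is empty again, so the
     * room flags must not survive */
    h2_release_mbuf(h2c);                               /* illustrative */
    h2c->flags &= ~(H2_CF_MUX_MFULL | H2_CF_DEM_MROOM);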

Since it doesn't seem possible to trigger this sequence without the
zero-copy-forward mechanism, this fix needs to be backported as far as
2.9, along with previous commit "MINOR: mux-h2: try to clear DEM_MROOM
and MUX_MFULL at more places" which will strengthen the consistency
between these checks.

Many thanks to Annika Wickert for her detailed report that allowed us to
diagnose this problem. CVE-2024-45506 was assigned to it.

(cherry picked from commit 830e50561c6636be4ada175d03e8df992abbbdcd)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-09-03 14:59:09 +02:00
Willy Tarreau
d636e51545 MINOR: mux-h2: try to clear DEM_MROOM and MUX_MFULL at more places
The code leading to H2_CF_MUX_MFULL and H2_CF_DEM_MROOM being cleared
is quite complex, and assumptions about its state are extremely difficult
when reading the code. There are indeed long sequences where the mux might
possibly be empty while still having the flag set, until it reaches h2_send()
which will clear it after the last send. Even then it's not obvious whether
it's always guaranteed to release the flag when invoked in multiple passes.
Let's just simplify the condition so that h2_send() does not depend on
"sent" anymore and that h2_timeout_task() doesn't leave the flags set on
an empty buffer. While it doesn't seem to fix anything, it will
make the code more robust against future changes.

(cherry picked from commit e9cdedb39b2020a7eb1ae5d8462b391d4301fb93)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-09-03 14:59:09 +02:00
Christopher Faulet
dfefb9953e BUG/MEDIUM: mux-h1: Properly handle empty message when an error is triggered
When a 400/408/500/501 error is returned by the H1 multiplexer, we first try
to get the error message of the proxy before using the default one. This may
be configured to be mapped on /dev/null or on an empty file. In that case,
no message is emitted, as expected. But everything is handled as if the error
was successfully sent.

However, there is a bug here. In the h1_send_error() function, this case is
not properly handled: the H1C_F_ABRTED flag is not set on the H1 connection
as it should be, and the h1_close() function is not called, leaving the H1
connection in an undefined state.
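
A hedged sketch of the missing handling (H1C_F_ABRTED and h1_close() are
named above; the surrounding condition is illustrative):

    /* even when the configured error message is empty, the connection
     * must be marked as aborted and scheduled for closing */
    h1c->flags |= H1C_F_ABRTED;
    h1_close(h1c);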

It is especially an issue when an "empty" 408-Request-Time-out error is
emitted while there are data blocked in the output buffer. In that case, the
connection remains open until the client closes, and a "cR--"/408 is logged
repeatedly, every time the client timeout is reached.

This patch must be backported as far as 2.8.

(cherry picked from commit 0d4271cdae18780de79e1ce997d562f91eeee316)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-09-03 14:39:30 +02:00
Frederic Lecaille
657e745c16 BUG/MINOR: quic: unexploited retransmission cases for Initial pktns.
qc_prep_hdshk_fast_retrans()'s job is to pick some packets to be retransmitted
from the Initial and Handshake packet number spaces. A packet may be coalesced
with a first one into the same datagram. When a coalesced packet is inspected
for retransmission, it is skipped if its length would make the total datagram
length it is attached to exceed the anti-amplification limit. But in this
case, the first packet must be kept for the current retransmission. This is
tracked by this trace statement:
    TRACE_PROTO("will probe Initial packet number space", QUIC_EV_CONN_SPPKTS, qc);
This was not the case because of the wrong "goto end" statement. The latter
must be run only if the Initial packet number space must not be probed with
the first packet found coalesced to another one which must be skipped.

This bug was revealed by the AWS-LC interop runner with handshakeloss and
handshakecorruption, which always fail because this stack leads the server
to send more Initial packets.

Thank you to Ilya (@chipitsine) for this issue report in GH #2663.

Must be backported as far as 2.6.

(cherry picked from commit 15a737eb5fc54bbc8aa5cadad054a69badde5b8e)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-09-03 13:56:55 +02:00
Christopher Faulet
008f445a4f BUG/MEDIUM: cli: Always release back endpoint between two commands on the mcli
When several commands are chained on the master CLI, the same client
connection is used. Because it is a TCP connection, the PT mux is used,
meaning there is no stream at the mux level. It is not possible to release
the applicative stream between commands as in HTTP. So, to work around
this limitation, between two commands, the master CLI resets the
stream. It does exactly what was performed in HTTP to manage keep-alive
connections on old HAProxy versions.

But this part was copied from code dealing with connections only, while the
back endpoint can be an applet or a mux for the master cli. The previous fix
on the PT mux ("BUG/MEDIUM: mux-pt: Never fully close the connection on
shutdown") revealed a bug. Between two commands, the back endpoint was only
released if the connection's XPRT was closed. This works if the back
endpoint is an applet, because there is no connection. But for commands sent
to a worker, a connection is used. At this stage, this only works if the
connection's XPRT is closed. Otherwise, the old endpoint is never detached,
leading to undefined behavior on the next command execution (most probably a
crash).

Without the commit above, the connection's XPRT is always closed on
shutdown. This is no longer true. At this stage, we must unconditionally
release the back endpoint by resetting the corresponding sedesc to fix the
bug.

This patch must be backported with the commit above in all stable
versions. On 2.4 and lower, it will need to be adapted.

(cherry picked from commit d4781bd5e7f0e0c0491fb7dfccb4da5c15055e3f)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-09-03 07:51:16 +02:00
Christopher Faulet
8c94c485b0 BUG/MEDIUM: mux-pt: Never fully close the connection on shutdown
When a shutdown is reported to the mux (shutdown for reads or writes), the
connection is immediately fully closed if the mux detects the connection is
closed in both directions. Only the passthrough multiplexer is able to
perform this action at this stage, because there is no stream and no internal
data. Other muxes perform a full connection close during the mux's release
stage. It was working quite well until recently. But, in theory, the bug is
quite old.

In fact, it seems possible for the lower layer to report an error on the
connection at the same time as a shutdown is performed on the mux. Depending
on how events are scheduled, the following may happen:

 1. A connection error is detected at the fd layer and a wakeup is
    scheduled on the mux to handle the event.

 2. A shutdown for writes is performed on the mux. Here the mux decides to
    fully close the connection. If the xprt is not used to log info, it is
    released.

 3. The mux is finally woken up. It tries to retrieve data from the xprt
    because it is not aware there was an error. This leads to a crash
    because of a NULL-deref.

By reading the code, it is not obvious. But it seems possible with SSL
connection when the handshake is rearmed. It happens when a
SSL_ERROR_WANT_WRITE is reported on a SSL_read() attempt or a
SSL_ERROR_WANT_READ on a SSL_write() attempt.

This bug is only visible if the XPRT is not used to log info, so it is not
so common.

This patch should fix the 2nd crash reported in the issue #2656. It must
first be backported as far as 2.9 and then slowly to all stable versions.

(cherry picked from commit 76fa71f7a8d27006ea1b06b417963501d3a5fcab)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-09-03 07:51:08 +02:00
Christopher Faulet
46f72f4379 BUG/MINOR: proxy: Match on 429 status when trying to perform a L7 retry
Support for 429 was recently added to L7 retries (0d142e075 "MINOR: proxy:
Add support of 429-Too-Many-Requests in retry-on status"). But the
l7_status_match() function was not properly updated. The switch statement
must match the 429 status to be able to perform a L7 retry.
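
A hedged sketch of the missing switch arm (the flag name is assumed from the
PR_RE_* naming scheme mentioned in the commit below):

    case 429:
        return (s->be->retry_type & PR_RE_429);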

This patch must be backported if the commit above is backported. It is
related to #2687.

(cherry picked from commit 62c9d51ca4d4f870723522b30d368d984f536e7e)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-09-02 20:09:33 +02:00
Christopher Faulet
13437097c3 BUG/MEDIUM: stream: Prevent mux upgrades if client connection is no longer ready
If an early error occurred on the client connection, we must prevent any
multiplexer upgrades. Indeed, it is unexpected for a mux to be initialized
with no xprt. On a normal workflow it is impossible. So it is not an
issue. But if a mux upgrade is performed at the stream level, an early error
on the connection may have already been handled by the previous mux and the
connection may be already fully closed. If the mux upgrade is still
performed, a crash can be experienced.

It is possible to have a crash with an implicit TCP>HTTP upgrade if there is no
data in the input buffer. But it is also possible to get a crash with an
explicit "switch-mode http" rule.

It must be backported to all stable versions. In 2.2, the patch must be
applied directly in stream_set_backend() function.

(cherry picked from commit e4812404c541018ba521abf6573be92553ba7c53)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-09-02 20:09:33 +02:00
Christopher Faulet
9e4bdd6fa4 BUG/MEDIUM: mux-h2: Set ES flag when necessary on 0-copy data forwarding
When DATA frames are sent via the 0-copy data forwarding, we must take care
to set the ES flag on the last DATA frame. It should be performed in
h2_done_ff() when IOBUF_FL_EOI flag was set by the producer. This flag is
here to know when the producer has reached the end of input. When this
happens, the h2s state is also updated. It is switched to "half-closed
local" or "closed" state depending on its previous state.

It is mainly an issue on uploads, because the server may be blocked waiting
for the end of the request. A workaround is to disable the 0-copy forwarding
support in the H2 multiplexer by setting the "tune.h2.zero-copy-fwd-send"
directive to off in your global section.
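
The workaround thus looks like this:

    global
        tune.h2.zero-copy-fwd-send off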

This patch should fix the issue #2665. It must be backported as far as 2.9.

(cherry picked from commit 4ef5251c44b83ed2f9495d200827f8696f16cd60)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-09-02 20:09:33 +02:00
Christopher Faulet
ac2dc762f8 MINOR: proxy: Add support of 429-Too-Many-Requests in retry-on status
The "429" status can now be specified on retry-on directives. PR_RE_* flags
were updated to remains sorted.
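
For example (illustrative backend):

    backend be
        retries 3
        retry-on 429
        server s1 192.0.2.10:80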

This patch should fix the issue #2687. It is quite simple so it may safely
be backported to 3.0 if necessary.

(cherry picked from commit 0d142e0756986b56819ecb2d131a0c4b30ae899f)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-09-02 20:09:33 +02:00
Amaury Denoyelle
edc6b40079 DOC: quic: fix default minimal value for max window size
It is possible to override the default QUIC congestion algorithm on a
bind line. With the same setting, it is also possible to specify the
maximum congestion window size.

The parser rejects values outside of the range between 10k and 4g. This
is in contradiction with the documentation, which specifies 1k as the lower
value. Correct this value in the documentation.

This should be backported up to 2.9.

(cherry picked from commit 103d8607776dbbf6f64eaf82359ec7a5dd7e3ebb)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-09-02 20:09:25 +02:00
Aurelien DARRAGON
3f4ef20698 MEDIUM: log: relax some checks and emit diag warnings instead in lf_expr_postcheck()
With 7a21c3a ("MAJOR: log: implement proper postparsing for logformat
expressions"), which finally made postparsing checks reliable, we started
to get reports from users who couldn't start haproxy 3.0 with configs that
used to work in the past. The current situation is described in GH #2642.

While the checks are mostly relevant, it turns out they are not strictly
needed anymore from a technical point of view. Most of them were useful in
the early logformat implementation to prevent runtime bugs due to the use of
an alias or fetch at runtime from an incompatible proxy. It's been a few
versions already that the code handling fetches and log aliases is robust
enough to support fetches/aliases used from the wrong context: all that
happens is that the fetch/alias will silently fail if it's not available.

This can be proved by the fact that even if the postparsing checks were
partially broken in the past, it didn't cause runtime issues (at least
on recent haproxy versions).

Most of these checks can now be seen as configuration hints: when a check
triggers, it will indicate a configuration inconsistency in most cases,
but there are some corner cases where it is not possible to know at config
time if the conditions will be met for the alias/fetch to work properly.
So instead of failing with a hard error like we did so far, let's just be
more permissive and report our findings using "diag_warning": such
warnings are only emitted when haproxy is started with the '-dD' cli option.
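
For example, to get these findings while checking a config:

    $ haproxy -dD -c -f /etc/haproxy/haproxy.cfg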

We also took this opportunity to improve messages clarity and make them
more precise (report the offending item instead of complaining about the
whole expression because of a single element).

With this patch, configs that used to start before 7a21c3a shouldn't
trigger hard errors anymore.

This may be backported in 3.0.

(cherry picked from commit 41ca89bc6fe96a660fea992643dbc2c0844a609e)
[ada: ctx adjt]
Signed-off-by: Aurelien DARRAGON <adarragon@haproxy.com>
2024-08-16 14:34:21 +02:00
Willy Tarreau
9b84f98ae3 Revert "MEDIUM: sink: don't set NOLINGER flag on the outgoing stream interface"
This reverts commit 514a3110f0cb33a04fa5bd786927ad98aebfe72b.

This one was backported by mistake; it wasn't meant to be. It should not
harm anyway, but better not to backport stuff that doesn't need it.

Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 16:30:42 +02:00
Valentine Krasnobaeva
bfd43e7999 BUG/MEDIUM: init: fix fd_hard_limit default in compute_ideal_maxconn
This commit fixes 41275a691 ("MEDIUM: init: set default for fd_hard_limit via
DEFAULT_MAXFD").

fd_hard_limit is taken into account implicitly via the 'ideal_maxconn' value
in all maxconn adjustments, when global.rlimit_memmax is set:

	MIN(global.maxconn, capped by global.rlimit_memmax, ideal_maxconn);

It also caps the provided global.rlimit_nofile, if it couldn't be set as the
current process fd limit (see more details in the main() code).

So, let's set the default value for fd_hard_limit only when no other
haproxy-specific limit is provided, i.e. rlimit_memmax, maxconn or
rlimit_nofile. Otherwise we may break users' configs.
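
For example, an explicit setting from the configuration always wins over the
built-in default:

    global
        fd-hard-limit 1048576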

Please note that in master-worker mode, the master does not need
DEFAULT_MAXFD (1048576) either, as we explicitly limit its maxconn to 100.

Must be backported in all stable versions until v2.6.0, including v2.6.0,
like the commit above.

(cherry picked from commit 16a5fac4bba1cb2bb6cf686066256aa141515feb)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 14:20:15 +02:00
Valentine Krasnobaeva
d6c8f7d7ae MEDIUM: init: set default for fd_hard_limit via DEFAULT_MAXFD (take #2)
Let's provide a default value for fd_hard_limit, if it's not set in the
configuration. With this patch we can also set a specific default via the
compile-time variable DEFAULT_MAXFD. Hopefully this will be helpful for
haproxy package maintainers.

    make -j 8 TARGET=linux-glibc DEBUG=-DDEFAULT_MAXFD=50000

If haproxy is compiled without DEFAULT_MAXFD defined, the default will be set
to 1048576.

This is done to avoid the process being killed by its watchdog when it is
started without any limitation in its configuration or on the command line
and the hard RLIMIT_NOFILE is extremely huge (~1000000000). In this case we
use compute_ideal_maxconn() to calculate maxconn and maxsock; maxsock defines
the size of the internal fdtab, which becomes very large as well. When
the process starts to simply loop over this fdtab (O(n)), this takes a lot of
time, so the watchdog does its job.

To avoid this, maxconn is now always reduced to some reasonable value, either
by an explicit global.fd-hard-limit from the configuration, or by its
default. The default may be changed at build time and then overwritten by
global.fd-hard-limit at runtime. An explicit global.fd-hard-limit from the
configuration always has precedence over DEFAULT_MAXFD, if set.

Must be backported in all stable versions until v2.6.0, including v2.6.0.

(cherry picked from commit 41275a691839df5f8dc7cb9faa4e259fbb755d34)
[wt: the discussion around this patch came to an agreement on the list:
     https://www.mail-archive.com/haproxy@formilux.org/msg45098.html ]
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 14:19:27 +02:00
Willy Tarreau
68492650d3 BUG/MEDIUM: queue: deal with a rare TOCTOU in assign_server_and_queue()
After checking that a server or backend is full, it remains possible
to call pendconn_add() just after the last pending request finishes, so
that there's no more connection on the server for very low maxconn
(typically 1), leaving new ones in queue till the timeout.

The approach depends on where the request was queued, though:
  - when queued on a server, we can simply detect that we may dequeue
    pending requests and wake them up, it will wake our request and
    that's fine. This needs to be done in srv_redispatch_connect() when
    the server is set.

  - when queued on a backend, it means that all servers are done with
    their requests. It means that all servers were full before the
    check and all were empty after. In practice this will only concern
    configs with fewer servers than threads. It's where the issue was
    first spotted, and it's very hard to reproduce with more than one
    server. In this case we need to load-balance again in order to find
    a spare server (or even to fail). For this, we call the newly added
    dedicated function pendconn_must_try_again() that tells whether or
    not a blocked pending request was dequeued and needs to be retried
    (see the sketch after this list).
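
A hedged sketch of the backend-queue case (control flow is illustrative):

    /* after pendconn_add(): recheck, since the last request may have
     * finished in between */
    if (pendconn_must_try_again(pc))
        goto redo_lb;   /* dequeued while blocked: LB again or fail */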

This should be backported along with pendconn_must_try_again() to all
stable versions, but with extreme care because over time the queue's
locking evolved.

(cherry picked from commit 5541d4995d6d9e8e7956423d26c26bebe8f0eaea)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 12:11:53 +02:00
Willy Tarreau
94f85bd646 MINOR: queue: add a function to check for TOCTOU after queueing
There's a rare TOCTOU case that happens from time to time with maxconn 1
and multiple threads. Between the moment we see the queue full and the
moment we queue a request, it's possible that the last request on the
server or proxy ended and that no other one is left to offer it its place.

Given that all this code path is performance-critical and we cannot afford
to increase the lock duration, better to recheck for the condition after
queueing. For this we need to be able to check for the condition and
cleanly dequeue a request. That's what this patch provides via the new
function pendconn_must_try_again(). It will catch more requests than
absolutely needed, but it will catch them all. It may find that around
1/1000 of requests are at risk, though testing shows that in practice,
it's around 1 per million that really gets stuck (other ones benefit
from timing and finishing late requests). Maybe in the future some
conditions might be refined, but it's harmless.

What happens to such requests is that they're dequeued and their pendconn
freed, so that the caller can decide to try to LB or queue them again. For
now the function is not used, it's just added separately for easier tracking.

(cherry picked from commit 1a8f3a368f1d212f5c2869d400fb07c78b2e7f45)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 12:11:48 +02:00
Willy Tarreau
822602b3ff MEDIUM: h1: allow to preserve keep-alive on T-E + C-L
In 2.5-dev9, commit 631c7e866 ("MEDIUM: h1: Force close mode for invalid
uses of T-E header") enforced a recently arrived new security rule in the
HTTP specification aiming at preventing a class of content-smuggling
attacks involving HTTP/1.0 agents. It consists in handling the very rare
T-E + C-L requests or responses in close mode.

It happens that it does have an impact on a rare few very old clients
(probably running insecure TLS stacks by the way) that continue to send
both with their POST requests. The impact is that for each and every
request they'll have to reconnect, possibly negotiating a full TLS
handshake that becomes harmful to the machine in terms of CPU computation.

This commit adds a new option "h1-do-not-close-on-insecure-transfer-encoding"
that does exactly what it says, it just asks not to close on such messages,
even though the message continues to be sanitized and C-L dropped. It means
that the risk is only between the sender and haproxy, which is limited, and
might be the only acceptable solution for such environments having to deal
with broken implementations.

The cases are so rare that it should not need to be backported, or in the
worst case, to the latest LTS if there is any demand.

(cherry picked from commit 2dab1ba84b11fe43baa91642ffcddb90e9ec09d2)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 12:11:39 +02:00
Frederic Lecaille
70f6d9986e MINOR: quic: Add information to "show quic" for CUBIC cc.
Add a new ->state_cli() callback to the quic_cc_algo struct to define a
function called by the "show quic (cc|full)" commands to dump some
information about the congestion algorithm internal state currently in use
by the QUIC connections.

Implement this callback for the CUBIC algorithm to dump its internal
variables:
   - K: the time to reach the cubic curve inflexion point,
   - last_w_max: the last maximum window value reached before entering
     the last recovery period. This is also the window value at the
     inflexion point of the cubic curve,
   - wdiff: the difference between the current window value and last_w_max,
     so negative before the inflexion point, and positive after.
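
The dump can then be consulted over the stats socket, in the same way as the
other CLI commands shown in this log (socket path illustrative):

    $ socat /var/run/haproxy.sock - <<< "show quic cc"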

(cherry picked from commit 76ff8afa2d9eb0206bc72f4e2f8ad230720dfb94)
[wt: adjusted ctx in quic_cli since no GSO]
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 11:56:13 +02:00
Frederic Lecaille
31ba0d2abe MINOR: quic: Dump TX in flight bytes vs window values ratio.
Display, for each packet number space, the ratio of the number of bytes in
flight versus the current window value, in percent.

(cherry picked from commit 4abaadd842de23e25938e32add86ab37d8f67e24)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 11:56:13 +02:00
Christopher Faulet
22ef1a993a BUG/MEDIUM: jwt: Clear SSL error queue on error when checking the signature
When the signature included in a JWT is verified, if an error occurs, one
or more SSL errors are queued and never cleared. These errors may then be
caught by the SSL stack, and a fatal SSL error may be erroneously reported
during an SSL receive or send.

So we must take care to clear the SSL error queue when the signature
verification fails.
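
A hedged sketch of the fix (ERR_clear_error() is the standard OpenSSL call
for this; its placement here is illustrative):

    /* signature verification failed: drain the thread-local SSL
     * error queue so later I/O does not see stale errors */
    ERR_clear_error();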

This patch should fix issue #2643. It must be backported as far as 2.6.

(cherry picked from commit 46b1fec0e9a6afe2c12fd4dff7c8a0d788aa6dd4)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 11:56:13 +02:00
Frederic Lecaille
3b51c3db6f BUG/MINOR: quic: Lack of precision when computing K (cubic only cc)
The cubic K variable is stored in ms, but the formula used the second as the
unit for the window difference parameter from which K is computed, without
considering the loss of information. The result was then converted to ms
(K *= 1000). This led to a lack of precision and to values that were
multiples of 1000.

To fix this, use the same formula but with the window difference in ms as the
parameter passed to the cubic function, and remove the conversion.

Must be backported as far as 2.6.

(cherry picked from commit a6d40e09f7c9ebb99d5f08630a6482fbb7d04d26)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 11:56:13 +02:00
Aurelien DARRAGON
514a3110f0 MEDIUM: sink: don't set NOLINGER flag on the outgoing stream interface
Given that sink applets are responsible for conveying messages from the
ring to the tcp server endpoint, no protocol timeouts or errors are expected
there; it is a unidirectional flow of data over TCP.

As such, the NOLINGER flag, which was inherited from the peers applet (see
dbd026792 "BUG/MEDIUM: peers: set NOLINGER on the outgoing stream
interface"), is not desirable in the sink context:

The reason why we have the NOLINGER flag set is to ensure the connection
is closed right away and avoid a 60s TIME_WAIT delay on closed sockets.
The downside is that messages sent right before closing the socket are
not guaranteed to make it to the server, because closing with the NOLINGER
flag set will result in an RST packet being emitted right away, which could
prevent in-flight messages from being properly delivered.

Unlike peers applets, the only cases where sink applets are expected to
close the connection are upon unexpected error or upon stopping, which are
relatively rare events. Thanks to the previous commit, the ERROR flag is
already set in case of error, so the use of NOLINGER is not mandatory for
the RST to be sent. Now for the stopping case, it only happens once in the
process lifetime, so it's acceptable to close the socket using EOS+EOI
flags without the NOLINGER option set.

So in our case, it is preferable to ensure messages get properly delivered,
knowing that closed sockets may pile up in TIME_WAIT. This means
removing the NOLINGER flag on the outgoing stream interface for sink
applets. It is a prerequisite for upcoming patches in order to cleanly
shut the applet during runtime without risking sending the RST packet
before all pending messages were sent to the endpoint.

(cherry picked from commit 0821460e3f7c8212340507c86e193c4bed210789)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 11:56:13 +02:00
Frederic Lecaille
8d1fe2391a BUG/MINOR: quic: Non optimal first datagram.
This bug arrived with this commit:

     b068e758f MINOR: quic: simplify rescheduling for handshake

This commit introduced a bad side effect: haproxy always replied with an
ACK-only datagram when it received the first client Initial packet, then
handled the CRYPTO data inside, and finally sent its own CRYPTO data. This
broke the packet coalescing rule whose aim is to optimally build and send
as many QUIC packets as possible per datagram.

To fix this, simply partially revert this commit, to make the low level I/O
task return again if some CRYPTO data were received. This will delay the
acknowledgement, which will again be sent in the same datagram as the
CRYPTO data.

Must be backported to 3.0.

(cherry picked from commit 402ce29e9e8e2d8b32c65e21a95e33f2f4d6373c)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 11:56:13 +02:00
Christopher Faulet
29a6545fec BUG/MINOR: cli: Atomically inc the global request counter between CLI commands
The global request counter is used to set the stream id (s->uniq_id). It is
incremented at different places, and it must be atomically incremented
because it is a global value. However, in the analyzer dealing with CLI
command responses, this was not the case. It is now fixed.
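
A hedged sketch of the corrected increment (the macro is the regular haproxy
atomic helper; the field name is assumed):

    s->uniq_id = _HA_ATOMIC_FETCH_ADD(&global.req_count, 1);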

This patch must be backported to all stable versions.

(cherry picked from commit 3cdb3fa5d95afc33465f894640217ff87b0c0562)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 11:56:13 +02:00
Christopher Faulet
1c15af0dc7 BUG/MINOR: server: Don't warn fallback IP is used during init-addr resolution
When a fallback IP address is provided in the list of methods to use to
resolve the server address, a warning is emitted if the previous methods
failed. The aim is to inform that this address will be used for the
server. However, this is a valid use-case; it is the expected behavior.
There is no reason to emit a warning. Having a message during HAProxy
startup to inform that the fallback IP address will be used is probably a
good idea, but it should be a notice, not a warning. Otherwise, checking the
configuration validity will always fail, just like starting HAProxy in
zero-warning mode, while the option was set on purpose.
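
For example (illustrative addresses), the fallback IP is simply the last
resolution method on the init-addr list:

    server s1 app.example.com:80 init-addr libc,192.0.2.10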

This patch should fix the issue #2627. It must be backported to all stable
versions.

(cherry picked from commit abaafda4850c64183bb69d6a62675b7405993f4a)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 11:56:13 +02:00
Amaury Denoyelle
69dfddcf42 BUG/MINOR: stick-table: fix crash for src_inc_gpc() without stkcounter
Since 2.5, an array of GPC is provided to replace legacy gpc0/gpc1.
src_inc_gpc is a sample fetch which is used to increment counters in
this array.

A crash occurs if src_inc_gpc is used without any previous track-sc
rule. This is caused by an error in smp_fetch_sc_inc_gpc(). When the
temporary stick counter is created via smp_create_src_stkctr(), the table
pointer arg value used is not correct: it points to the counter ID
instead of the table argument. To fix this, use the proper sample fetch
second arg.

This can be reproduced with the following config :
  acl mark src_inc_gpc(0,<table>) -m bool
  tcp-request connection accept if mark

This should be backported up to 2.6.

(cherry picked from commit ea7ea5198a0ed9352f88425c884e6437ecc9ebdc)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 11:56:13 +02:00
Willy Tarreau
8a77ac322f DOC: config: improve the http-keep-alive section
Nathan Wehrman suggested this add-on to try to better explain the
interactions between http-keep-alive and other timeouts, and the
impacts on protocols (HTTP/1, HTTP/2 etc).

(cherry picked from commit 2bd269cf2a1345a48e5398149d3ead04ff059266)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 11:56:13 +02:00
William Lallemand
a706f30b8d DOC: configuration: issuers-chain-path not compatible with OCSP
State that issuers-chain-path is not compatible with OCSP features.

Must be backported in every stable version.

(cherry picked from commit 8a3e4a608b5cfd50f080d082f21cf5b673fdc292)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 11:56:13 +02:00
Willy Tarreau
3efbe47cf2 BUG/MAJOR: mux-h2: force a hard error upon short read with pending error
A risk of truncated packet was addressed in 2.9 by commit 19fb19976f
("BUG/MEDIUM: mux-h2: Only Report H2C error on read error if demux
buffer is empty") by ignoring CO_FL_ERROR after a recv() call as long
as some data remained present in the buffer. However it has a side
effect due to the fact that some frame processors only deal with full
frames, for example, HEADERS. The side effect is that an incomplete
frame will not be processed and will remain in the buffer, preventing
the error from being taken into account, so the I/O handler wakes up
the H2 parser to handle the error, and that one just subscribes for
more data, and this loops forever wasting CPU cycles.

Note that this only happens with errors at the SSL layer exclusively,
otherwise we'd have a read0 pending that would properly be detected:

  conn->flags = CO_FL_XPRT_TRACKED | CO_FL_ERROR | CO_FL_XPRT_READY | CO_FL_CTRL_READY
  conn->err_code = CO_ERR_SSL_FATAL
  h2c->flags  = H2_CF_ERR_PENDING | H2_CF_WINDOW_OPENED | H2_CF_MBUF_HAS_DATA | H2_CF_DEM_IN_PROGRESS | H2_CF_DEM_SHORT_READ

The condition to report the error in h2_recv() needs to be refined, so
that connection errors are taken into account either when the buffer is
empty, or when there's an incomplete frame, since we're certain it will
never be completed. We're certain to enter that function because
H2_CF_DEM_SHORT_READ implies too short a frame, and earlier there's a
protocol check to validate that no frame size is larger than bufsize,
hence a H2_CF_DEM_SHORT_READ implies there's some room left in the
buffer and we're allowed to try to receive.

The conditions to reproduce the bug seem super hard to meet, but were
observed once by Patrick Hemmer, who had the reflex to capture lots of
information that allowed us to explain the problem. In order to reproduce
it, the SSL code had to be significantly modified to alter received
contents at very empirical places, but that was sufficient to reproduce
it and confirm that the current patch works as expected.

The bug was tagged MAJOR because when it triggers there's no other
solution to get rid of it but to restart the process. However, given how
hard it is to trigger in a lab, it does not seem very likely to occur
in the field.

This needs to be backported to 2.9.

(cherry picked from commit 4de03e42cdc6361cf9d26e9c599469617e88eab5)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 11:56:13 +02:00
Valentine Krasnobaeva
47852a2c86 BUG/MEDIUM: ssl_sock: fix deadlock in ssl_sock_load_ocsp() on error path
We could run under heavy load in containers or on premises, and some
automatic tool could use the CLI in parallel to check OCSP update statuses
or to upload new OCSP responses. So, the calloc() used to store OCSP update
callback arguments may fail, and ocsp_tree_lock needs to be unlocked when
exiting due to this failure.
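
A hedged sketch of the fixed error path (the lock label and error label are
assumed; ocsp_tree_lock is named above):

    cb_arg = calloc(1, sizeof(*cb_arg));
    if (!cb_arg) {
        HA_SPIN_UNLOCK(OCSP_LOCK, &ocsp_tree_lock);
        goto error;
    }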

This needs to be backported in all stable versions until v2.4.0 included.

(cherry picked from commit 9371c28c28311f34d03c6e44bbeaf2214a1bec44)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 11:56:13 +02:00
Lukas Tribus
620284d0d0 DOC: install: don't reference removed CPU arg
Remove reference to the removed CPU= build argument in commit 018443b8a1
("BUILD: makefile: get rid of the CPU variable").

This should be backported to 3.0.

(cherry picked from commit a9e3decd7602db661cdaec91f105ac6459f13637)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 11:56:13 +02:00
Willy Tarreau
90494537b2 BUG/MEDIUM: debug/cli: fix "show threads" crashing with low thread counts
The "show threads" command introduced early in the 2.0 dev cycle uses
appctx->st1 to store its context (the number of the next thread to dump).
It goes back to an era where contexts were shared between the various
applets and the CLI's command handlers.

In fact it was already not good by then, because st1 could possibly have
APPCTX_CLI_ST1_PAYLOAD (2) in it, which would make the dump start at
thread 2, though it was extremely unlikely.

When contexts were finally cleaned up and moved to their own storage,
this one was overlooked, maybe due to using st1 instead of st2 like
most others. So it continues to rely on st1, and more recently some
new flags were appended, one of which is APPCTX_CLI_ST1_LASTCMD (16)
and is always there. This results in "show threads" believing it must
start the dump at thread 16, and if this thread is not present, it can
simply crash the process. A tiny reproducer is:

  global
    nbthread 1
    stats socket /tmp/sock1 level admin mode 666

  $ socat /tmp/sock1 - <<< "show threads"

The fix for modern versions simply consists in assigning a context to
this command from the applet storage. We're using a single int, no need
for a struct, an int* will do it. That's valid till 2.6.

Prior to 2.6, better switch to appctx->ctx.cli.i0 or i1 which are all
properly initialized before the command is executed.

This must be backported to all stable versions.

Thanks to Andjelko Horvat for the report and the reproducer.

(cherry picked from commit e0e2b6613214212332de4cbad2fc06cf4774c1b0)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 11:56:13 +02:00
Christopher Faulet
9a55572ff8 BUG/MINOR: session: Eval L4/L5 rules defined in the default section
It is possible to define TCP/HTTP rules in a named default section to
inherit from it in a proxy. However, there is an issue with L4/L5 rules:
only the lists of the current frontend are checked to know if an eval must
be performed; nothing is done for an empty list. Of course, the lists of the
default proxy must also be checked to be sure not to ignore default L4/L5
rules. It is now fixed.
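
For illustration, such an inherited L4 rule looks like this (hedged example;
addresses are illustrative):

    defaults tcp-base
        mode tcp
        timeout client 10s
        timeout connect 5s
        timeout server 10s
        tcp-request connection reject if { src 192.0.2.0/24 }

    frontend fe from tcp-base
        bind :8080
        default_backend be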

This patch should fix the issue #2637. It must be backported as far as 2.6.

(cherry picked from commit 076444550583acc11ef7fce7e7e740f039125696)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 11:56:13 +02:00
Amaury Denoyelle
aca17100a0 CLEANUP: quic: rename TID affinity elements
This commit is the renaming counterpart of the previous one, this time
for the quic_conn module. Several elements related to TID affinity updates
in quic_conn have been renamed: public functions, but also a flag, renamed
to QUIC_FL_CONN_TID_REBIND, and a trace event, renamed to
QUIC_EV_CONN_BIND_TID.

This should be backported with the same instruction as the previous
commit.

(cherry picked from commit 3be58fc720c406ce4f4dfc70b87662cef4838886)
[wt: dropped the BUG_ON() from quic_conn since bfdf145859d not backported]
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 11:56:13 +02:00
Amaury Denoyelle
8669ecdf8a CLEANUP: proto: rename TID affinity callbacks
Since the following patch, the protocol API to update a connection's TID
affinity has been extended.
  commit 1a43b9f32c71267e3cb514aa70a13c75adb20742
  MINOR: proto: extend connection thread rebind API

The single callback set_affinity has been split into 3 different
functions which are called at different stages during listener_accept(),
depending on accept queue push success or not. However, the naming was
rendered confusing by the use of the function prefixes 1 and 2.

Rename the proto callbacks related to TID affinity updates and use the
following names:

* bind_tid_prep
* bind_tid_commit
* bind_tid_reset

This commit should probably be backported at least up to 3.0 with the
above patch. This is because the fix was recently backported, and it
would allow keeping changes minimal between the two versions. It could
even be backported up to 2.8 if there is no major conflict.

(cherry picked from commit 9fbe8b03346a98cc8cc7b47eaa68935b1d4b3916)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 11:56:13 +02:00