haproxy

Author	SHA1	Message	Date
Amaury Denoyelle	01a5be8c38	CLEANUP: mux-quic: remove stconn usage in h3/hq Small cleanup on snd_buf for application protocol layer. * do not export h3_snd_buf * replace stconn by a qcs argument. This is better as h3/hq-interop only uses the qcs instance. This should be backported up to 2.6. (cherry picked from commit 8d4ac48d3def189190c29b6f1f5d697b180f7e30) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-19 11:41:38 +02:00
Amaury Denoyelle	57b3c47e70	BUG/MEDIUM: mux-quic: fix crash on early app-ops release H3 SETTINGS emission has recently been delayed. The idea is to send it with the first STREAM to reduce sendto syscall invocation. This was implemented in the following patch : 3dd79d378c86b3ebf60e029f518add5f1ed54815 MINOR: h3: Send the h3 settings with others streams (requests) This patch works fine under nominal conditions. However, it will cause a crash if a HTTP/3 connection is released before having sent any data, for example when receiving an invalid first request. In this case, qc_release will first free qcc.app_ops HTTP/3 application protocol layer via release callback. Then qc_send is called to emit any closing frames built by app_ops release invocation. However, in qc_send, as no data has been sent, it will try to complete application layer protocol initialization, with a SETTINGS emission for HTTP/3. Thus, qcc.app_ops is reused, which is invalid as it has just been freed. This will cause a crash with h3_finalize in the call stack. This bug can be reproduced artificially by generating incomplete HTTP/3 requests. This will in time trigger http-request timeout without any data sent. This is done by editing qc_handle_strm_frm function. - ret = qcc_recv(qc->qcc, strm_frm->id, strm_frm->len, + ret = qcc_recv(qc->qcc, strm_frm->id, strm_frm->len - 1, strm_frm->offset.key, strm_frm->fin, (char *)strm_frm->data); To fix this, application layer closing API has been adjusted to be done in two steps. A new shutdown callback is implemented : it is used by the HTTP/3 layer to generate GOAWAY frame in qc_release prologue. Application layer context qcc.app_ops is then freed later in qc_release via the release operation which is now only used to liberate app layer resources. This fixes the problem as the intermediary qc_send invocation will be able to reuse app_ops before it is freed. 
This patch fixes the crash, but it would be better to adjust H3 SETTINGS emission in case of early connection closing : in this case, there is no need to send it. This should be implemented in a future patch. This should fix the crash recently experienced by Tristan in github issue #1801. This must be backported up to 2.6. (cherry picked from commit f8aaf8bdfa40e21b1a2f600c3ed6455bf9b6a763) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-19 11:41:25 +02:00
William Lallemand	86120e4953	MEDIUM: quic: separate path for rx and tx with set_encryption_secrets With quicTLS the set_encryption_secrets callback is always called with the read_secret and the write_secret. However this is not the case with libreSSL, which uses the set_read_secret()/set_write_secret() mechanism. It still provides the set_encryption_secrets() callback, which is called with a NULL parameter for the write_secret during the read, and for the read_secret during the write. The exchange key was not designed in haproxy to be called separately for read and write, so this patch allows calls with the read or write key set to NULL. (cherry picked from commit 95fc737fc6edfa2575ce982b739184e99475c215) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-19 11:41:19 +02:00
Emeric Brun	2cc1ed89d4	BUG/MEDIUM: sink: bad init sequence on tcp sink from a ring. The init of tcp sink, particularly for SSL, was done too early in the code, during parsing, and this can cause a crash especially if nbthread was not configured. This was detected by William using ASAN on a new regtest on log forward. This patch adds the 'struct proxy' created for a sink to a list and this list is now submitted to the same init code as the main proxies list or the log_forward's proxies list. Doing this, we are assured to use the right init sequence. It also removes the init code for ssl from post section parsing. This patch should be backported as far as v2.2 Note: this fix uses 'goto' labels created by commit 'BUG/MAJOR: log-forward: Fix log-forward proxies not fully initialized' but this code didn't exist before v2.3 so this patch needs to be adapted for v2.2. (cherry picked from commit d6e581de4be1d3564d771056303242c9ae930c40) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-19 11:36:59 +02:00
Aurelien DARRAGON	1f9d9f53f0	MINOR: proxy/listener: support for additional PAUSED state This patch is a prerequisite for #1626. Adding PAUSED state to the list of available proxy states. The flag is set when the proxy is paused at runtime (pause_listener()). It is cleared when the proxy is resumed (resume_listener()). It should be backported to 2.6, 2.5 and 2.4 (cherry picked from commit d46f437de69d5d4d84a207531a3ba6f8d3d697dc) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-12 17:54:22 +02:00
Aurelien DARRAGON	614f99ee0a	MINOR: listener: small API change A minor API change was performed in listener(.c/.h) to restore consistency between stop_listener() and (resume/pause)_listener() functions. LISTENER_LOCK was never locked prior to calling stop_listener(): lli variable hint is thus not useful anymore. Added PROXY_LOCK locking in (resume/pause)_listener() functions with related lpx variable hint (prerequisite for #1626). It should be backported to 2.6, 2.5 and 2.4 (cherry picked from commit 001328873c352e5e4b1df0dcc8facaf2fc1408aa) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-12 17:54:22 +02:00
Frédéric Lécaille	ba627a66ce	MINOR: h3: Send the h3 settings with others streams (requests) This is the ->finalize application callback which prepares the unidirectional STREAM frames for h3 settings and wakeup the mux I/O handler to send them. As haproxy is at the same time always waiting for the client request, this makes haproxy call sendto() to send only about 20 bytes of stream data. Furthermore in case of heavy loss, this give less chances to short h3 requests to succeed. Drawback: as at this time the mux sends its streams by their IDs ascending order the stream 0 is always embedded before the unidirectional stream 3 for h3 settings. Nevertheless, as these settings may be lost and received after other h3 request streams, this is permitted by the RFC. Perhaps there is a better way to do. This will have to be checked with Amaury. Must be backported to 2.6. (cherry picked from commit 3dd79d378c86b3ebf60e029f518add5f1ed54815) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-12 17:54:22 +02:00
Frédéric Lécaille	68436e33d1	BUG/MINOR: quic: Speed up the handshake completion only one time It is possible to speed up the handshake completion but only once per connection as mentioned in RFC 9002 "6.2.3. Speeding up Handshake Completion". Add a flag to prevent this process from being run several times (see https://www.rfc-editor.org/rfc/rfc9002#name-speeding-up-handshake-compl). Must be backported to 2.6. (cherry picked from commit bb995eafc7e8e7d0457e1c3af17a98ef94d8b40b) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-12 17:54:22 +02:00
Willy Tarreau	397fcc008d	MINOR: sched: store the current profile entry in the thread context The profile entry that corresponds to the current task/tasklet being profiled is now stored into the thread's context. This will allow it to be accessed from the tasks themselves. This is needed for an upcoming fix. (cherry picked from commit 1efddfa6bfdcaf57198866db67e49b40442d278f) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-12 17:54:22 +02:00
Willy Tarreau	1cb273c718	BUG/MINOR: sched: properly account for the CPU time of dying tasks When task profiling is enabled, the scheduler can measure and report the cumulated time spent in each task and their respective latencies. But this was wrong for tasks with few wakeups as well as for self-waking ones, because the call date needed to measure how long it takes to process the task is retrieved in the task itself (->wake_date was turned to the call date), and we could face two conditions: - a new wakeup while the task is executing would reset the ->wake_date field before returning and make abnormally low values being reported; that was likely the case for task_run_applet for self-waking applets; - when the task dies, NULL is returned and the call date couldn't be retrieved, so that CPU time was not being accounted for. This was particularly visible with process_stream() which is usually called only twice per request, and whose time was systematically halved. The cleanest solution here is to keep in mind that the scheduler already uses quite a bit of local context in th_ctx, and place the intermediary values there so that they cannot vanish. The wake_date has to be reset immediately once read, and only its copy is used along the function. Note that this must be done both for tasks and tasklet, and that until recently tasklets were also able to report wrong values due to their sole dependency on TH_FL_TASK_PROFILING between tests. One nice benefit for future improvements is that such information will now be available from the task without having to be stored into the task itself anymore. Since the tasklet part was computed on wrapping 32-bit arithmetics and the task one was on 64-bit, the values were now consistently moved to 32-bit as it's already largely sufficient (4s spent in a task is more than twice what the watchdog would tolerate). Some further cleanups might be necessary, but the patch aimed at staying minimal. 
Task profiling output after 1 million HTTP request previously looked like this:
Tasks activity:
  function                        calls    cpu_tot    cpu_avg    lat_tot    lat_avg
  h1_io_cb                      2012338     4.850s    2.410us     12.91s    6.417us
  process_stream                2000136     9.594s    4.796us     34.26s    17.13us
  sc_conn_io_cb                 2000135     1.973s    986.0ns     30.24s    15.12us
  h1_timeout_task                   137          -          -    2.649ms    19.34us
  accept_queue_process               49    152.3us    3.107us    321.7yr    6.564yr
  main+0x146430                       7    5.250us    750.0ns    25.92us    3.702us
  srv_cleanup_idle_conns              1    559.0ns    559.0ns    918.0ns    918.0ns
  task_run_applet                     1          -          -    2.162us    2.162us
Now it looks like this:
Tasks activity:
  function                        calls    cpu_tot    cpu_avg    lat_tot    lat_avg
  h1_io_cb                      2014194     4.794s    2.380us     13.75s    6.826us
  process_stream                2000151     20.01s    10.00us     36.04s    18.02us
  sc_conn_io_cb                 2000148     2.167s    1.083us     32.27s    16.13us
  h1_timeout_task                   198    54.24us    273.0ns    3.487ms    17.61us
  accept_queue_process               52    158.3us    3.044us    409.9us    7.882us
  main+0x1466e0                      18    16.77us    931.0ns    63.98us    3.554us
  srv_cleanup_toremove_conns          8    282.1us    35.26us    546.8us    68.35us
  srv_cleanup_idle_conns              3    149.2us    49.73us    8.131us    2.710us
  task_run_applet                     3    268.1us    89.38us    11.61us    3.871us
Note the two-fold difference on process_stream(). This feature is essentially used for debugging so it has extremely limited impact. However it's used quite a bit more in bug reports and it would be desirable that at least 2.6 gets this fix backported. It depends on at least these two previous patches which will then also have to be backported:
  MINOR: task: permanently enable latency measurement on tasklets
  CLEANUP: task: rename ->call_date to ->wake_date
(cherry picked from commit 62b5b96bcc91985cb6bf6a30264ef3c54315c7c7) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-12 17:54:22 +02:00
Christopher Faulet	43bd98151e	BUG/MINOR: task: Fix detection of tasks profiling in tasklet_wakeup_after() The regression was introduced when ad548b54a7 ["MINOR: task: Add tasklet_wakeup_after()"] was backported to 2.6 (21e0c31695). TH_FL_TASK_PROFILING flag does not exist. To detect if tasks profiling is enabled, "task_profiling_mask" variable must be used. It is a 2.6-specific issue. Thus there is no upstream commit ID. This patch must be backported if the commit above is also backported. For now, no backport is needed.	2022-09-12 17:54:22 +02:00
Willy Tarreau	41f645a05a	CLEANUP: task: rename ->call_date to ->wake_date This field is misnamed because its real and important content is the date the task was woken up, not the date it was called. It temporarily holds the call date during execution but this remains confusing. In fact before the latency measurements were possible it was indeed a call date. Thus it will now be called wake_date. This change is necessary because a subsequent fix will require the introduction of the real call date in the thread ctx. (cherry picked from commit 04e50b3d325fa35ce9557701513773a8a84e9230) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-12 17:54:22 +02:00
Willy Tarreau	c432738cb0	MINOR: task: permanently enable latency measurement on tasklets When tasklet latency measurement was enabled in 2.4 with commit b2285de04 ("MINOR: tasks: also compute the tasklet latency when DEBUG_TASK is set"), the feature was conditioned on DEBUG_TASK because the field would add 8 bytes to the struct tasklet. This approach was not a very good idea because the struct ends on an int anyway thus it does finish with a 32-bit hole regardless of the presence of this field. What is true however is that adding it turned a 64-byte struct to 72-byte when caller debugging is enabled. This patch revisits this with a minor change. Now only the lowest 32 bits of the call date are stored, so they always fit in the remaining hole, and this allows to remove the dependency on DEBUG_TASK. With debugging off, we're now seeing a 48-byte struct, and with debugging on it's exactly 64 bytes, thus still exactly one cache line. 32 bits allow a latency of 4 seconds on a tasklet, which already indicates a completely dead process, so there's no point storing the upper bits at all. And even in the event it would happen once in a while, the lost upper bits do not really add any value to the debug reports. Also, now one tasklet wakeup every 4 billion will not be sampled due to the test on the value itself. Similarly we just don't care, it's statistics and the measurements are not 9-digit accurate anyway. (cherry picked from commit 768c2c5678d462a3622492a1230946978292571e) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-12 17:54:22 +02:00
Willy Tarreau	87f8732e1b	BUG/MINOR: task: make task_instant_wakeup() work on a task not a tasklet There's a subtle (harmless) bug in task_instant_wakeup(). As it uses some tasklet code instead of some task code, the debug part also acts on the tasklet equivalent, and the call_date is only set when DEBUG_TASK is set instead of unconditionally like with tasks. As such, without this debugging macro, call dates are not updated for tasks woken this way. There isn't any impact yet because this function was introduced in 2.6 to solve certain classes of issues and is not used yet, and in the worst case it would only affect the reported latency time. This may be backported to 2.6 in case a future fix would depend on it but currently will not fix existing code. (cherry picked from commit 0fae3a0360314285a17153cac76413184143ee74) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-12 17:54:22 +02:00
Willy Tarreau	c94b84c6ba	BUG/MINOR: task: always reset a new tasklet's call date The tasklet's call date was not reset, so if profiling was enabled while some tasklets were in the run queue, their initial random value could be used to preload a bogus initial latency value into the task profiling bin. Let's just zero the initial value. This should be backported to 2.4 as it was brought with initial commit b2285de04 ("MINOR: tasks: also compute the tasklet latency when DEBUG_TASK is set"). The impact is very low though. (cherry picked from commit f27acd961e9b4291f80bc54100e57969ec4372ec) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-12 17:54:22 +02:00
Frédéric Lécaille	d404122c2f	BUG/MINOR: quic: Wrong connection ID to thread ID association To work, quic_pin_cid_to_tid() must set cid[0] to a value with <target_tid> as <global.nbthread> modulo. For any integer n, (n - (n % m)) + d is always congruent to d modulo m (with d < m). So, this statement seemed correct: cid[0] = cid[0] - (cid[0] % global.nbthread) + target_tid; except when n wraps or when another modulo is applied to the addition result. Here, in 8-bit modulo arithmetic, if m does not divide 256, this cannot work for values which wrap when incremented by d. For instance, with n=255, m=3 and d=1, the formula result is 0, whose remainder modulo 3 is 0 (it should be d). To fix this, we first limit cid[0] to 255 - <target_tid> to prevent cid[0] from wrapping. Thank you to @esb for having reported this issue in GH #1855. Must be backported to 2.6 (cherry picked from commit 3122c75fd1f9a73a13ec533a4f313be0af1c5348) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-12 17:54:22 +02:00
William Lallemand	22181db9ba	BUILD: quic: temporarly ignore chacha20_poly1305 for libressl LibreSSL does not implement EVP_chacha20_poly1305() with EVP_CIPHER but uses the EVP_AEAD API instead: https://man.openbsd.org/EVP_AEAD_CTX_init This patch disables this cipher for libreSSL for now. (cherry picked from commit d2be9d4c48b71b2132938dbfac36142cc7b8f7c4) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-12 17:54:22 +02:00
William Lallemand	d452b03784	BUILD: ssl: fix ssl_sock_switchtx_cbk when no client_hello_cb When building HAProxy with USE_QUIC and libressl 3.6.0, the ssl_sock_switchtx_cbk symbol is not found because libressl does not implement the client_hello_cb. A ssl_sock_switchtx_cbk version for the servername callback is available but wasn't exported correctly. (cherry picked from commit 844009d77ac42182ab4d5cf3efaaf227318505a1) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-12 17:54:22 +02:00
William Lallemand	8aa94c3778	BUILD: quic: add some ifdef around the SSL_ERROR_* for libressl SSL_ERROR_WANT_ASYNC, SSL_ERROR_WANT_ASYNC_JOB and SSL_ERROR_WANT_CLIENT_HELLO_CB do not seem to be supported by libressl. (cherry picked from commit 6d74e179ee012c2b4eb282c2b63f87e9a6235251) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-12 17:54:22 +02:00
Willy Tarreau	ddcaae84cf	DEBUG: stream: minor rearrangement of a few fields in struct stream. Some recent traces started to show confusing stream pointers ending with 0xe. The reason was that the stream's obj_type was almost unused in the past and was stuffed in a hole in the structure. But now it's present in all "show sess all" outputs and having to mentally match this value against another one that's 0x17e lower is painful. The solution here is to move the obj_type at the top, like in almost every other structure, but without breaking the efficient layout. This patch moves a few fields around and manages to both plug some holes (16 bytes saved, 976 to 960) and avoid channels needlessly crossing cache boundaries (res was spread over 3 lines vs 2 now). Nothing else was changed. It would be desirable to backport this to 2.6 since it's where dumps are currently being processed the most. (cherry picked from commit 178dda6b41caa7baef02ac4754b1c97c6dd481fb) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-09-02 17:27:55 +02:00
Willy Tarreau	1f4f25e3cb	BUILD: debug: make sure debug macros are never empty As outlined in commit f7ebe584d7 ("BUILD: debug: Add braces to if statement calling only CHECK_IF()"), the BUG_ON() family of macros is incorrectly defined to be empty when debugging is disabled, and that can lead to trouble. Make sure they always fall back to the usual "do { } while (0)". This may be backported to 2.6 if needed, though no such issue was met there to date. (cherry picked from commit d8009a1ca6607bfe08978476ae2c77679b2b5453) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-09-01 08:38:18 +02:00
Willy Tarreau	79a8dce5c1	MINOR: ring: add support for a backing-file This mmaps a file which will serve as the backing-store for the ring's contents. The idea is to provide a way to retrieve sensitive information (last logs, debugging traces) even after the process stops and even after a possible crash. Right now this was possible by connecting to the CLI and dumping the contents of the ring live, but this is not handy and consumes quite a bit of resources before it is needed. With a backing file, the ring is effectively RAM-mapped file, so that contents stored there are the same as those found in the file (the OS doesn't guarantee immediate sync but if the process dies it will be OK). Note that doing that on a filesystem backed by a physical device is a bad idea, as it will induce slowdowns at high loads. It's really important that the device is RAM-based. Also, this may have security implications: if the file is corrupted by another process, the storage area could be corrupted, causing haproxy to crash or to overwrite its own memory. As such this should only be used for debugging. (cherry picked from commit 0b8e9ceb12ee7ba5f5d3fada2610920a97014dc8) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-09-01 08:38:18 +02:00
Willy Tarreau	004b445c58	MINOR: ring: support creating a ring from a linear area Instead of allocating two parts, one for the ring struct itself and one for the storage area, ring_make_from_area() will arrange the two inside the same memory area, with the storage starting immediately after the struct. This will allow to store a complete ring state in shared memory areas for example. (cherry picked from commit 6df10d872b84121b4d0e1fbd7bf91fd8defb3680) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-09-01 08:38:18 +02:00
Willy Tarreau	a191123583	BUILD: ring: forward-declare struct appctx to avoid a build warning When using ring.h standalone it emits warnings about appctx. Let's forward-declare it. (cherry picked from commit 8df098c2b1fb9d73b55c27a4b4dcd47690493d26) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-09-01 08:38:18 +02:00
Frédéric Lécaille	407884a80c	BUG/MINOR: quic: Missing header protection AES cipher context initialisations (draft-v2) This bug arrived with this commit: "MINOR: quic: Add reusable cipher contexts for header protection" haproxy could crash because of missing cipher contexts initializations for the header protection and draft-v2 Initial secrets. This was due to the fact that these initialization both for RX and TX secrets were done outside of qc_new_isecs(). The role of this function is definitively to initialize these cipher contexts in addition to the derived secrets. Indeed this function is called by qc_new_conn() which initializes the connection but also by qc_conn_finalize() which also calls qc_new_isecs() in case of a different QUIC version was negotiated by the peers from the one used by the client for its first Initial packet. This was reported by "v2" QUIC interop test with at least picoquic as client. Must be backported to 2.6. (cherry picked from commit c242832af31315315a0c8791af96aad5caa0ec38) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00
Frédéric Lécaille	5f714c9932	CLEANUP: quic: No more use ->rx_list MT_LIST entry point (quic_rx_packet) This quic_rx_packet is definitively no more used. Should be backported to 2.6 to ease the future backports. (cherry picked from commit f34c1c956827a2f973cd160be9d8c97ede00e979) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00
Frédéric Lécaille	56eb0cd44c	MINOR: quic: Replace MT_LISTs by LISTs for RX packets. Replace ->rx.pqpkts quic_enc_level struct member MT_LIST by a LIST. Same thing for ->list quic_rx_packet struct member MT_LIST. Update the code accordingly. This was a remnant of the multithreading support (several threads by connection). Must be backported to 2.6 (cherry picked from commit a2d8ad20a3fbe121e4b5ba531b2fccc48a8b1a59) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00
Frédéric Lécaille	867818a00b	BUG/MINOR: mux-quic: Fix memleak on QUIC stream buffer for unacknowledged data Some clients send CONNECTION_CLOSE frame without acknowledging the STREAM data haproxy has sent. In this case, when closing the connection if there were remaining data in QUIC stream buffers, they were not released. Add a <closing> boolean option to qc_stream_desc_free() to force the stream buffer memory releasing upon closing connection. Thank you to Tristan for having reported such a memory leak issue in GH #1801. Must be backported to 2.6. (cherry picked from commit ea4a5cbbdfa71cd287d453dffbdf643846754bba) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00
Frédéric Lécaille	b421b72b64	MINOR: quic: Add reusable cipher contexts for header protection Implement quic_tls_rx_hp_ctx_init() and quic_tls_tx_hp_ctx_init() to initialize such header protection cipher contexts for each RX and TX parts and for each packet number spaces, only once per connection. Make qc_new_isecs() call these two functions to initialize the cipher contexts of the Initial secrets. Same thing for ha_quic_set_encryption_secrets() to initialize the cipher contexts of the subsequent derived secrets (0RTT, 1RTT, Handshake). Modify qc_do_rm_hp() and quic_apply_header_protection() to reuse these cipher contexts. Note that there is no need to modify the key update for the header protection. The header protection secrets are never updated. (cherry picked from commit 86a53c566935c8f331a694b50a49f918364d0aa2) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00
Amaury Denoyelle	0163cf9cc2	REORG: h2: extract cookies concat function in http_htx As specified by RFC 7540, multiple cookie headers are merged in a single entry before passing it to a HTTP/1.1 connection. This step is implemented during headers parsing in h2 module. Extract this code in the generic http_htx module. This will allow to reuse it quickly for HTTP/3 implementation which has the same requirement for cookie headers. (cherry picked from commit 2c5a7ee3330dfad050942992d0431e4f5f881e7a) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00
Amaury Denoyelle	acb7e598de	BUG/MEDIUM: quic: fix crash on MUX send notification MUX notification on TX has been edited recently : it will be notified only when sending its own data, and not for example on retransmission by the quic-conn layer. This is subject of the patch : b29a1dc2f4a334c1c7fea76c59abb4097422c05c BUG/MINOR: quic: do not notify MUX on frame retransmit A new flag QUIC_FL_CONN_RETRANS_LOST_DATA has been introduced to differentiate qc_send_app_pkts invocation by MUX and directly by the quic-conn layer in quic_conn_app_io_cb(). However, this is a first problem as internal quic-conn layer usage is not limited to retransmission. For example for NEW_CONNECTION_ID emission. Another, more important problem is that send functions are also called through quic_conn_io_cb() which has not been protected from MUX notification. This could probably result in a crash when trying to notify the MUX. To fix both problems, quic-conn flagging has been inverted : when used by the MUX, quic-conn is flagged with QUIC_FL_CONN_TX_MUX_CONTEXT. To improve the API, MUX must now use qc_send_mux, which ensures the flag is set. qc_send_app_pkts is now static and can only be used by the quic-conn layer. This must be backported wherever the previously mentioned patch is. (cherry picked from commit 704675656bf8b577971f1bbc3be186f4cc362632) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00
Amaury Denoyelle	f4758e8938	BUG/MINOR: quic: do not notify MUX on frame retransmit On STREAM emission, quic-conn notifies MUX through a callback named qcc_streams_sent_done(). This also happens on retransmission : in this case offset are examined and notification is ignored if already seen. However, this behavior has slightly changed since e53b489826ba9760a527b461095402ca05d2b6be BUG/MEDIUM: mux-quic: fix server chunked encoding response Indeed, if offset diff is NULL, frame is now not ignored. This is to support FIN notification with a final empty STREAM frame. A side-effect of this is that if the last stream frame is retransmitted, it won't be ignored in qcc_streams_sent_done(). In most cases, this side-effect is harmless as qcs instance will soon be freed after being closed. But if qcs is still alive, this will cause a BUG_ON crash as it is considered as locally closed. This bug depends on delay condition and seems to be extremely rare. But it might be the reason for a crash seen on interop with s2n client on http3 testcase : FATAL: bug condition "qcs->st == QC_SS_CLO" matched at src/mux_quic.c:372 call trace(16): \| 0x558228912b0d [b8 01 00 00 00 c6 00 00]: main-0x1c7878 \| 0x558228917a70 [48 8b 55 d8 48 8b 45 e0]: qcc_streams_sent_done+0xcf/0x355 \| 0x558228906ff1 [e9 29 05 00 00 48 8b 05]: main-0x1d3394 \| 0x558228907cd9 [48 83 c4 10 85 c0 0f 85]: main-0x1d26ac \| 0x5582289089c1 [48 83 c4 50 85 c0 75 12]: main-0x1d19c4 \| 0x5582288f8d2a [48 83 c4 40 48 89 45 a0]: main-0x1e165b \| 0x5582288fc4cc [89 45 b4 83 7d b4 ff 74]: qc_send_app_pkts+0xc6/0x1f0 \| 0x5582288fd311 [85 c0 74 12 eb 01 90 48]: main-0x1dd074 \| 0x558228b2e4c1 [48 c7 c0 d0 60 ff ff 64]: run_tasks_from_lists+0x4e6/0x98e \| 0x558228b2f13f [8b 55 80 29 c2 89 d0 89]: process_runnable_tasks+0x7d6/0x84c \| 0x558228ad9aa9 [8b 05 75 16 4b 00 83 f8]: run_poll_loop+0x80/0x48c \| 0x558228ada12f [48 8b 05 aa c5 20 00 48]: main-0x256 \| 0x7ff01ed2e609 [64 48 89 04 25 30 06 00]: libpthread:+0x8609 \| 
0x7ff01e8ca163 [48 89 c7 b8 3c 00 00 00]: libc:clone+0x43/0x5e To reproduce it locally, code was artificially patched to produce retransmission and avoid qcs liberation. In order to fix this and avoid future classes of similar problems, the best way is to not call qcc_streams_sent_done() to notify MUX for retransmission. To implement this, we test if any of QUIC_FL_CONN_RETRANS_OLD_DATA or the new flag QUIC_FL_CONN_RETRANS_LOST_DATA is set. A new wrapper qc_send_app_retransmit() has been added to set the new flag as a complement to already existing qc_send_app_probing(). This must be backported up to 2.6. (cherry picked from commit b29a1dc2f4a334c1c7fea76c59abb4097422c05c) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00
Amaury Denoyelle	bacf177e51	MINOR: quic: refactor application send Adjust qc_send_app_pkts function : remove <old_data> arg and provide a new wrapper function qc_send_app_probing() which should be used instead when probing with old data. This simplifies the interface of the default function, most notably for the MUX which does not interfere with retransmission. QUIC_FL_CONN_RETRANS_OLD_DATA flag is set/unset directly in the wrapper qc_send_app_probing(). At the same time, function documentation has been updated to clarify arguments and return values. This commit will be useful for the next patch to differentiate MUX and retransmission send context. As a consequence, the current patch should be backported wherever the next one will be. (cherry picked from commit cc130473646f5b86be879fd78e0be5581c784ddc) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00
Amaury Denoyelle	96f5196a75	BUG/MEDIUM: mux-quic: reject uni stream ID exceeding flow control Emit STREAM_LIMIT_ERROR if a client tries to open an unidirectional stream with an ID greater than the value specified by our flow-control limit. The code is similar to the bidirectional stream opening. MAX_STREAMS_UNI emission is not implemented for the moment and is left as a TODO. This should not be too urgent for the moment : in HTTP/3, a client has only a limited use for unidirectional streams (H3 control stream + 2 QPACK streams). This is covered by the value provided by haproxy in transport parameters. This patch has been tagged with BUG as it should have prevented last crash reported on github issue #1808 when opening a new unidirectional stream with an invalid ID. However, it is probably not the main cause of the bug contrary to the patch commit 11a6f4007b908b49ecd3abd5cd10fba177f07c11 BUG/MINOR: quic: Wrong status returned by qc_pkt_decrypt() This must be backported up to 2.6. (cherry picked from commit bf3c208760861ee36590fc4a5a579b8808818bd9) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00
Amaury Denoyelle	f2ff91ac05	MINOR: qpack: report error on enc/dec stream close As specified by RFC 9204, encoder and decoder streams must not be closed. If the peer behaves incorrectly and closes one of them, emit a H3_CLOSED_CRITICAL_STREAM connection error. To implement this, QPACK stream decoding API has been slightly adjusted. Firstly, fin parameter is passed to notify about FIN STREAM bit. Secondly, qcs instance is passed via unused void* context. This allows to use qcc_emit_cc_app() function to report a CONNECTION_CLOSE error. (cherry picked from commit 26aa399d6b245da3e82e768dd15931263842d7d2) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00
Frédéric Lécaille	c8f907b25a	BUG/MEDIUM: quic: Wrong use of <token_odcid> in qc_lsntr_pkt_rcv() This commit was not complete: "BUG/MEDIUM: quic: Possible use of uninitialized <odcid> variable in qc_lstnr_params_init()" <token_odcid> should have been directly passed to qc_lstnr_params_init() without dereferencing it to prevent haproxy to have new chances to crash! Must be backported to 2.6. (cherry picked from commit 7629f5d6709c539c6c9012949411281144c82f53) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00
Frédéric Lécaille	9de435607a	BUG/MEDIUM: quic: Possible use of uninitialized <odcid> variable in qc_lstnr_params_init() When receiving a token into a client Initial packet without a cluster secret defined by configuration, the <odcid> variable used to parse the ODCID from the token could be used without having been initialized. Such a packet must be dropped. So the sufficient part of this patch is this check: + } + else if (!global.cluster_secret && token_len) { + /* Impossible case: a token was received without configured + * cluster secret. + */ + TRACE_PROTO("Packet dropped", QUIC_EV_CONN_LPKT, + NULL, NULL, NULL, qv); + goto drop; } Take the opportunity of this patch to rework this part of the code, where such a packet must be dropped, and make it more readable by removing the <check_token> variable. When an ODCID is parsed from a token, a new <token_odcid> pointer variable is set to the address of the parsed ODCID. This way, if it is not set but used, it will make haproxy crash. This was not always the case with an uninitialized local variable. Adapt the API to use such a pointer variable: <token> boolean variable is removed from qc_lstnr_params_init() prototype. This must be backported to 2.6. (cherry picked from commit e9325e97c2b0a6c16fc39e4d231506b9f407d741) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00
Amaury Denoyelle	57993bc05b	MINOR: mux-quic: define new traces Add new traces to help debugging on QUIC MUX. Most notably, the following functions are now traced : * qcc_emit_cc * qcs_free * qcs_consume * qcc_decode_qcs * qcc_emit_cc_app * qcc_install_app_ops * qcc_release_remote_stream * qcc_streams_sent_done * qc_init (cherry picked from commit 4c9a1642c13499a6ebeb5e300cacdf8f5a342b62) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00
Frédéric Lécaille	5153e6fee7	MINOR: quic: Remove useless lock for RX packets This lock was there to be able to handle the RX packets for a connection from several threads. This is no longer needed since a QUIC connection is always handled by the same thread. May be backported to 2.6 (cherry picked from commit a6920a25d98d8b126fe029620e0165d0d2ad15b2) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00
Frédéric Lécaille	402bdb2c7d	MEDIUM: quic: xprt traces rework Add as many TRACE_ENTER() and TRACE_LEAVE() calls as possible to every function. Note that some functions do not have any access to a quic_conn argument when receiving or parsing datagrams at very low level. (cherry picked from commit a8b2f843d203f3bc6c363842ee368d59266c3663) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00
Amaury Denoyelle	acc7ed3a89	MINOR: quic: replace custom buf on Tx by default struct buffer On first prototype version of QUIC, emission was multithreaded. To support this, a custom thread-safe ring-buffer has been implemented with qring/cbuf. Now the thread model has been adjusted : a quic-conn is always used on the same thread and emission is not multi-threaded. Thus, qring/cbuf usage can be replaced by a standard struct buffer. The code has been simplified even more as for now buffer is always drained after a prepare/send invocation. This is the case since a datagram is always considered as sent even on sendto() error. BUG_ON statement guards are here to ensure that this model is always valid. Thus, code to handle data wrapping and consume too small contiguous space with a 0-length datagram is removed. (cherry picked from commit f2476053f9304709dea595db413537023dba2d0e) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00
Amaury Denoyelle	3366213eab	BUG/MINOR: quic: adjust errno handling on sendto qc_snd_buf returned a size_t which means that it was never negative despite its documentation. Thus the caller who checked for this was never informed of a sendto error. Clean this by changing the return value of qc_snd_buf() to an integer. A 0 is returned on success. Every other value is considered as an error. This commit should be backported up to 2.6. Note that to not cause malfunctions, it must be backported after the previous patch : 906b0589546b700b532472ede019e5c5a8ac1f38 MINOR: quic: explicitely ignore sendto error This is to ensure that a sendto error does not cause send to be interrupted which may cause a stalled transfer without a proper retry mechanism. The impact of this bug seems null as caller explicitly ignores sendto error. However this part of code seems to be subject to strange issues and it may fix them in part. It may be of interest for github issue #1808. (cherry picked from commit 6715cbf97f5142f18748d1575632082d3b0fbe91) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00
Frédéric Lécaille	0dfe70e17e	MINOR: quic: Add two new stats counters for sendto() errors Add "quic_socket_full" new stats counter for sendto() errors with EAGAIN as errno, and "quic_sendto_err" counter for any other error. (cherry picked from commit 8ecb7363b5ba5eb850081a79655988d441c0e881) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00
Amaury Denoyelle	ba1448156a	MEDIUM: mux-quic: implement http-request timeout Implement http-request timeout for QUIC MUX. It is used when the connection is opened and is triggered if no HTTP request is received in time. By HTTP request we mean at least a QUIC stream with a full header section. Then qcs instance is attached to a sedesc and upper layer is then responsible to wait for the rest of the request. This timeout is also used when new QUIC streams are opened during the connection lifetime to wait for full HTTP request on them. As it's possible to demux multiple streams in parallel with QUIC, each waiting stream is registered in a list <opening_list> stored in qcc with <start> as timestamp in qcs for the stream opening. Once a qcs is attached to a sedesc, it is removed from <opening_list>. When refreshing MUX timeout, if <opening_list> is not empty, the first waiting stream is used to set MUX timeout. This is efficient as streams are stored in the list in their creation order so CPU usage is minimal. Also, the size of the list is automatically restricted by flow control limitation so it should not grow too much. Streams are inserted in <opening_list> by application protocol layer. This is because only application protocol can differentiate streams for HTTP messaging from internal usage. A function qcs_wait_http_req() has been added to register a request stream by app layer. QUIC MUX can then remove it from the list in qc_attach_sc(). As a side-note, it was necessary to implement attach qcc_app_ops callback on hq-interop module to be able to insert a stream in waiting list. Without this, a BUG_ON statement would be triggered when trying to remove the stream on sedesc attach. This is to ensure that every request stream is registered for http-request timeout. MUX timeout is explicitly refreshed on MAX_STREAM_DATA and STOP_SENDING frame parsing to schedule http-request timeout if a new stream has been instantiated. 
It was already done on STREAM parsing due to a previous patch. (cherry picked from commit 30e260e2e6bd22d0ebece004dc84f13d17ae79f4) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00
Amaury Denoyelle	ac720058ac	MINOR: h3: support HTTP request framing state Store the current step of HTTP message in h3s stream. This reports if we are in the parsing of headers, content or trailers section. A new enum h3s_st_req is defined for this. This field is stored in h3s struct but only used for request stream. It is left undefined for other streams (control or QPACK streams). h3_is_frame_valid() has been extended to take into account this state information. A connection error H3_FRAME_UNEXPECTED is reported if an invalid frame according to the current state is received; for example a DATA frame at the beginning of a stream. (cherry picked from commit 8d818c6eabf71f45d5cd46e136b60bcb4dde50d9) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00
Amaury Denoyelle	2f23adfa94	MEDIUM: mux-quic: implement http-keep-alive timeout Complete QUIC MUX timeout refresh function by using http-keep-alive timeout. It is used when the connection is idle after having handled at least one request. To implement this a new member <idle_start> has been defined in qcc structure. This is used as timestamp for when the connection became idle and is used as base time for http keep-alive timeout. (cherry picked from commit bd6ec1bf845736d7447669077b4f3f2c4bd48011) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00
Amaury Denoyelle	88b2019b97	MINOR: mux-quic: count in-progress requests Add a new qcc member named <nb_hreq>. Its purpose is close to <nb_sc> which represents the number of attached stream connectors. Both are incremented inside qc_attach_sc(). The difference is on the decrement operation. While <nb_sc> is decremented on sedesc detach callback, <nb_hreq> is decremented when the qcs is locally closed. In most cases, <nb_hreq> will be decremented before <nb_sc>. However, it will be the reverse if a stream must be kept alive after detach callback. The main purpose of this field is to implement http-keep-alive timeout. Both <nb_sc> and <nb_hreq> must be null to activate the http-keep-alive timeout. (cherry picked from commit c603de4d84f06321566f80461f5cf4231f52af4e) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00
Amaury Denoyelle	bc8daecd14	MINOR: mux-quic: save proxy instance into qcc Store a reference to proxy in the qcc structure. This will be useful to access proxy members outside of qcc_init(). Most notably, this change is required to implement timeout refreshing by using the various timeouts configured at the proxy level. (cherry picked from commit 07bf8f4d8631c29297ded195d9d4004fba2c030d) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00
Amaury Denoyelle	524a4f648d	MINOR: mux-quic: support app graceful shutdown Adjust qcc_emit_cc_app() to allow the delay of emission of a CONNECTION_CLOSE. This will only set the error code but the quic-conn layer is not flagged for immediate close. The quic-conn will be responsible to shut the connection when deemed suitable. This change will allow to implement application graceful shutdown, such as HTTP/3 with GOAWAY emission. This will allow to emit closing frames on MUX release. Once all work is done at the lower layer, the quic-conn should emit a CONNECTION_CLOSE with the registered error code. (cherry picked from commit d666d740d206b1fde607fd6ca6097fd7bf9c11bd) Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00
Amaury Denoyelle	78d1e9adad	MINOR: quic: define a generic QUIC error type Define a new structure quic_err to abstract a QUIC error type. This allows to easily differentiate a transport and an application error code. This simplifies error transmission from QUIC MUX and H3 layers. This new type is defined in quic_frame module. It is used to replace <err_code> field in <quic_conn>. QUIC_FL_CONN_APP_ALERT flag is removed as it is now useless. Utility functions are defined to be able to quickly instantiate transport, tls and application errors. (cherry picked from commit 57e6db7021483f2cd4e903397cb4ad9890d0719e) [wt: code adjusted in qc_build_cc_frm() instead due to a495c12282 already being backported] Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-08-31 10:43:54 +02:00


6333 Commits