haproxy

Author	SHA1	Message	Date
William Lallemand	b1351c1a05	BUG/MINOR: ssl: shut the ca-file errors emitted during httpclient init With an OpenSSL library which use the wrong OPENSSLDIR, HAProxy tries to load the OPENSSLDIR/certs/ into @system-ca, but emits a warning when it can't. This patch fixes the issue by allowing to shut the error when the SSL configuration for the httpclient is not explicit. Must be backported in 2.6. (cherry picked from commit `0a2d63236c`) [wla: context changed in httpclient_precheck()] Signed-off-by: William Lallemand <wlallemand@haproxy.org>	2022-11-25 09:58:29 +01:00
Willy Tarreau	51743eea8b	BUG/MINOR: server/idle: at least use atomic stores when updating max_used_conns In 2.2, some idle conns usage metrics were added by commit `cf612a045` ("MINOR: servers: Add a counter for the number of currently used connections."), which mentioned that the operation doesn't need to be atomic since we're not seeking exact values. This is true but at least we should use atomic stores to make sure not to cause invalid values to appear on archs that wouldn't guarantee atomicity when writing an int, such as writing two 16-bit words. This is pretty unlikely on our targets but better keep the code safe against this. This may be backported as far as 2.2. (cherry picked from commit `9dc231a6b2`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-25 09:25:57 +01:00
Frédéric Lécaille	38c47fb838	BUG/MAJOR: quic: Crash after discarding packet number spaces This previous patch was not sufficient to prevent haproxy from crashing when some Handshake packets had to be inspected before being possibly retransmitted: "BUG/MAJOR: quic: Crash upon retransmission of dgrams with several packets" That patch introduced another issue: access to packets which have been released while still attached to others (in the same datagram). This was the case for instance when discarding the Initial packet number space before inspecting a Handshake packet in the same datagram through its ->prev or member in our case. This patch implements quic_tx_packet_dgram_detach() which detaches a packet from the adjacent ones in the same datagram to be called when acknowledging a packet (as done in the previous commit) and when releasing its memory. This way, we are sure the released packets will not be accessed during retransmissions. Thank you to @gabrieltz for having reported this issue in GH #1903. Must be backported to 2.6. (cherry picked from commit `74b5f7b31b`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-25 09:25:23 +01:00
Frédéric Lécaille	05e719b7bd	BUG/MAJOR: quic: Crash upon retransmission of dgrams with several packets As revealed by some traces provided by @gabrieltz in GH #1903 issue, there are clients (chrome I guess) which acknowledge only one packet among others in the same datagram. This is the case for the first datagram sent by a QUIC haproxy listener made of an Initial packet followed by a Handshake one. In this identified case, only the Handshake packet is acknowledged. But if the client is able to respond with a Handshake packet (ACK frame) this is because it has successfully parsed the Initial packet. So, why not also acknowledge it? AFAIK, this is mandatory. On our side, when retransmitting this datagram, the Handshake packet was accessed from the Initial packet after having been released. Anyway. There is an issue on our side. Obviously, we must not expect an implementation to respect the RFC especially when it wants to build an attack ;) With this simple patch for each TX packet we send, we also set the previous one in addition to the next one. When a packet is acknowledged, we detach the previous one and the next one in the same datagram from this packet, so that it cannot be resent when resending these packets (the previous one, in our case). Thank you to @gabrieltz for having reported this issue. Must be backported to 2.6. (cherry picked from commit `814645f42f`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-25 09:25:23 +01:00
Amaury Denoyelle	d4c880d649	BUG/MINOR: quic: fix subscribe operation Subscribing was not properly designed between quic-conn and quic MUX layers. Align this with other haproxy components : <subs> field is moved from the MUX to the quic-conn structure. All mentions of qcc MUX are cleaned up in quic_conn_subscribe()/quic_conn_unsubscribe(). Thanks to this change, ACK reception notification has been simplified. It's now unnecessary to check for the MUX existence before waking it. Instead, if <subs> quic-conn field is set, just wake up the upper layer tasklet without mentioning MUX. This should probably be extended to other parts of the quic-conn code. This should be backported up to 2.6. (cherry picked from commit `bbb1c68508`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-17 16:34:22 +01:00
Amaury Denoyelle	9c15bd5d37	MINOR: quic: display unknown error sendto counter on stat page This patch completes the previous incomplete commit. The new counter sendto_err_unknown is now displayed on stats page/CLI show stats. This is related to github issue #1903. This should be backported up to 2.6. (cherry picked from commit `7941ead3aa`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:52:33 +02:00
Amaury Denoyelle	934659e0ce	MINOR: quic: do not crash on unhandled sendto error Remove ABORT_NOW() statement on unhandled sendto error. Instead use a dedicated counter sendto_err_unknown to report these cases. If we detect increment of this counter, strace can be used to detect errno value : $ strace -p $(pidof haproxy) -f -e trace=sendto -Z This should be backported up to 2.6. This should help to debug github issue #1903. (cherry picked from commit `1d9f170edd`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:52:22 +02:00
Amaury Denoyelle	9287b4047b	BUG/MINOR: mux-quic: complete flow-control for uni streams Max stream data was not enforced and respected for local/remote uni streams. Previously, qcs instances incorrectly reused the limit defined from bidirectional ones. This is now fixed. Two fields are added in qcc structure connection : * value for local flow control to enforce on remote uni streams * value for remote flow control to respect on local uni streams These two values can be reused to properly initialize the msd field of a qcs instance in qcs_new(). The rest of the code is similar. This must be backported up to 2.6. (cherry picked from commit `176174f7e4`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:50:54 +02:00
William Lallemand	3fd456abc7	BUG/MEDIUM: httpclient/lua: crash when the lua task timeout before the httpclient When the lua task finished before the httpclient that are associated to it, there is a risk that the httpclient try to task_wakeup() the lua task which does not exist anymore. To fix this issue the httpclient used in a lua task are stored in a list, and the httpclient are destroyed at the end of the lua task. Must be backported in 2.5 and 2.6. (cherry picked from commit `bb581423b3`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:49:59 +02:00
Amaury Denoyelle	2b697ca18f	MINOR: quic: define first packet flag Received packets treatment has some difference regarding if this is the first one or not of the encapsulating datagram. Previously, this was set via a function argument. Simplify this by defining a new Rx packet flag named QUIC_FL_RX_PACKET_DGRAM_FIRST. This change does not have functional impact. It will simplify API when qc_lstnr_pkt_rcv() is broken into several functions : their number of arguments will be reduced thanks to this patch. This should be backported up to 2.6. (cherry picked from commit `deb7c87f55`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:49:29 +02:00
Amaury Denoyelle	2f0d2c3197	MINOR: quic: extend pn_offset field from quic_rx_packet pn_offset field was only set if header protection cannot be removed. Extend the usage of this field : it is now set everytime on packet parsing in qc_lstnr_pkt_rcv(). This change helps to clean up API of Rx functions by removing unnecessary variables and function argument. This change has no functional impact. It is a part of a refactoring series on qc_lstnr_pkt_rcv(). The objective is facilitate integration of FD-owned socket patches. This should be backported up to 2.6. (cherry picked from commit `845169da58`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:49:25 +02:00
Amaury Denoyelle	60abfa542e	MINOR: quic: add version field on quic_rx_packet Add a new field version on quic_rx_packet structure. This is set on header parsing in qc_lstnr_pkt_rcv() function. This change has no functional impact. It is a part of a refactoring series on qc_lstnr_pkt_rcv(). The objective is facilitate integration of FD-owned socket patches. This should be backported up to 2.6. (cherry picked from commit `0eae57273b`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:49:22 +02:00
Amaury Denoyelle	6f2acccd90	CLEANUP: quic: improve naming for rxbuf/datagrams handling QUIC datagrams are read from a random thread. They are then redispatched to the connection thread according to the first packet DCID. These operations are implemented through a special buffer designed to avoid locking. Refactor this code with the following changes : * <rxbuf> type is renamed <quic_receiver_buf>. Its list element is also renamed to highlight its attach point to a receiver. * <quic_dgram> and <quic_receiver_buf> definitions are moved to quic_sock-t.h. This helps to reduce the size of quic_conn-t.h. * <quic_dgram> list elements are renamed to highlight their attach point into a <quic_receiver_buf> and a <quic_dghdlr>. This should be backported up to 2.6. (cherry picked from commit `1cba8d60f3`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:46:11 +02:00
Amaury Denoyelle	94cf0e576d	CLEANUP: quic: remove unused rxbufs member in receiver rxbuf is the structure used to store QUIC datagrams and redispatch them to the connection thread. Each receiver manages a list of rxbuf. This was stored both as an array and a mt_list. Currently, only mt_list is needed so removed <rxbufs> member from receiver structure. This should be backported up to 2.6. (cherry picked from commit `8c4d062d25`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:46:08 +02:00
Frédéric Lécaille	118e94d0c0	MINOR: quic: Split the secrets key allocation in two parts Implement quic_tls_secrets_keys_alloc()/quic_tls_secrets_keys_free() to allocate the memory for only one direction (RX or TX). Modify ha_quic_set_encryption_secrets() to call these functions for one of this direction (or both). So, for now on we can rely on the value of the secret keys to know if it was derived. Remove QUIC_FL_TLS_SECRETS_SET flag which is no more useful. Consequently, the secrets are dumped by the traces only if derived. Must be backported to 2.6. (cherry picked from commit `e1a49cfd4d`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:46:04 +02:00
Frédéric Lécaille	e0212bcb47	BUG/MINOR: quic: Stalled 0RTT connections with big ClientHello TLS message This issue was reproduced with -Q picoquic client option to split a big ClientHello message into two Initial packets and haproxy as server without any knowledge of any previous 0RTT session (restarted after a first 0RTT session). The 0RTT received packets were removed from their queue when the second Initial packet was parsed, and the QUIC handshake state never progressed and remained at Initial state. To avoid such situations, after having treated some Initial packets we always check if there are 0RTT packets to parse and we never remove them from their queue. This will be done after the handshake is completed or upon idle timeout expiration. Also add more traces to be able to analyze the handshake progression. Tested with ngtcp2 and picoquic. Must be backported to 2.6. (cherry picked from commit `4aa7d8197a`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:46:01 +02:00
Frédéric Lécaille	c094639d65	MINOR: quic: Use a non-contiguous buffer for RX CRYPTO data Implement quic_get_ncbuf() to dynamically allocate a new ncbuf to be attached to any quic_cstream struct which needs such a buffer. Note that there is no quic_cstream for 0RTT encryption level. quic_free_ncbuf() is added to release the memory allocated for a non-contiguous buffer. Modify qc_handle_crypto_frm() to call this function and allocate an ncbuf for crypto data which are not received in order. The crypto data which are received in order are not buffered but provide to the TLS stack (calling qc_provide_cdata()). Modify qc_treat_rx_crypto_frms() which is called after having provided the in order received crypto data to the TLS stack to provide again the remaining crypto data which has been buffered, if possible (if they are in order). Each time buffered CRYPTO data were consumed, we try to release the memory allocated for the non-contiguous buffer (ncbuf). Also move rx.crypto.offset quic_enc_level struct member to rx.offset quic_cstream struct member. Must be backported to 2.6. (cherry picked from commit `9f9263ed13`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:45:58 +02:00
Frédéric Lécaille	6b84c17d99	MINOR: quic: New quic_cstream object implementation Add new quic_cstream struct definition to implement the CRYPTO data stream. This is a simplification of the qcs object (QUIC streams) for the CRYPTO data without any information about the flow control. They are not attached to any tree, but to a QUIC encryption level, one per encryption level except for the early data encryption level (for 0RTT). A stream descriptor is also allocated for each CRYPTO data stream. Must be backported to 2.6. (cherry picked from commit `7e3f7c47e9`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:45:49 +02:00
Willy Tarreau	b088056f3d	CLEANUP: quic/receiver: remove the now unused tx_qring list The tx_qrings[] and tx_qring_list in the receiver are not used anymore since commit `f2476053f` ("MINOR: quic: replace custom buf on Tx by default struct buffer"), the only place where they're referenced was in quic_alloc_tx_rings_listener(), which by the way implies that these were not even freed on exit. Let's just remove them. This should be backported to 2.6 since the commit above also was. (cherry picked from commit `cab054bbf9`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:42:38 +02:00
Amaury Denoyelle	3701ab7823	MEDIUM: quic: retrieve frontend destination address Retrieve the frontend destination address for a QUIC connection. This address is retrieve from the first received datagram and then stored in the associated quic-conn. This feature relies on IP_PKTINFO or affiliated flags support on the socket. This flag is set for each QUIC listeners in sock_inet_bind_receiver(). To retrieve the destination address, recvfrom() has been replaced by recvmsg() syscall. This operation and parsing of msghdr structure has been extracted in a wrapper quic_recv(). This change is useful to finalize the implementation of 'dst' sample fetch. As such, quic_sock_get_dst() has been edited to return local address from the quic-conn. As a best effort, if local address is not available due to kernel non-support of IP_PKTINFO, address of the listener is returned instead. This should be backported up to 2.6. (cherry picked from commit `97ecc7a8ea`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:42:27 +02:00
Amaury Denoyelle	68f165013c	MINOR: quic: limit usage of ssl_sock_ctx in favor of quic_conn Continue on the cleanup of QUIC stack and components. quic_conn uses internally a ssl_sock_ctx to handle mandatory TLS QUIC integration. However, this is merely as a convenience, and it is not equivalent to stackable ssl xprt layer in the context of HTTP1 or 2. To better emphasize this, ssl_sock_ctx usage in quic_conn has been removed wherever it is not necessary : namely in functions not related to TLS. quic_conn struct now contains its own wait_event for tasklet quic_conn_io_cb(). This should be backported up to 2.6. (cherry picked from commit `2ed840015f`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-10 08:43:48 +02:00
Willy Tarreau	d22ff05c3e	MINOR: fd: add a new function to only raise RLIMIT_NOFILE In issue #1866 an issue was reported under docker, by which a user cannot lower the number of FD needed. It looks like a restriction imposed in this environment, but it results in an error while it ought not have to in the case of shrinking. This patch adds a new function raise_rlim_nofile() that takes the desired new setting, compares it to the current one, and only calls setrlimit() if one of the values in the new setting is larger than the older one. As such it will continue to emit warnings and errors in case of failure to raise the limit but will never shrink it. This patch is only preliminary to another one, but will have to be backported where relevant (likely only 2.6). (cherry picked from commit `922a907926`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-10 08:43:48 +02:00
Amaury Denoyelle	4a6be622b2	CLEANUP: quic: create a dedicated quic_conn module xprt_quic module was too large and did not reflect the true architecture by contrast to the other protocols in haproxy. Extract code related to XPRT layer and keep it under xprt_quic module. This code should only contains a simple API to communicate between QUIC lower layer and connection/MUX. The vast majority of the code has been moved into a new module named quic_conn. This module is responsible to the implementation of QUIC lower layer. Conceptually, it overlaps with TCP kernel implementation when comparing QUIC and HTTP1/2 stacks of haproxy. This should be backported up to 2.6. (cherry picked from commit `92fa63f735`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-10 08:43:48 +02:00
Amaury Denoyelle	228883ca32	CLEANUP: quic: remove duplicated varint code from xprt_quic.h There was some identical code between xprt_quic and quic_enc modules. This concerns helper on QUIC varint type. Keep only the version in quic_enc file : this should help to reduce dependency on xprt_quic module. Note that quic_max_int_by_size() has been removed and is replaced by the identical quic_max_int(). This should be backported up to 2.6. (cherry picked from commit `a2639383ec`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-10 08:43:48 +02:00
Amaury Denoyelle	69baa24752	CLEANUP: quic: fix headers Clean up quic sources by adjusting headers list included depending on the actual dependency of each source file. On some occasion, xprt_quic.h was removed from included list. This is useful to help reducing the dependency on this single file and cleaning up QUIC haproxy architecture. This should be backported up to 2.6. (cherry picked from commit `5c25dc5bfd`) [cf: Include <haproxy/global.h> from cfgparse-quic.c instead of only <haproxy/global-t.h">. On 2.7, it is shipped with "tools.h" (tools.h > cli.h > global.h). But not on the 2.6] Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-10 08:41:20 +02:00
Amaury Denoyelle	90a008239e	BUG/MINOR: quic: adjust quic_tls prototypes Two prototypes in quic_tls module were not identical to the actual function definition. * quic_tls_decrypt2() : the second argument const attribute is not present, to be able to use it with EVP_CIPHER_CTX_ctrl(). As a consequence of this change, token field of quic_rx_packet is now declared as non-const. * quic_tls_generate_retry_integrity_tag() : the second argument type differs between the two. Adjust this by fixing it to unsigned char to match EVP_EncryptUpdate() SSL function. This situation did not seem to have any visible effect. However, this is clearly an undefined behavior and should be treated as a bug. This should be backported up to 2.6. (cherry picked from commit `f3c40f83fb`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-10 07:43:42 +02:00
Amaury Denoyelle	adf910e519	CLEANUP: quic: remove global var definition in quic_tls header Some variables related to QUIC TLS were defined in a header file : their definitions are now moved properly in the implementation file, with only declarations in the header. This should be backported up to 2.6. (cherry picked from commit `a19bb6f0b2`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-10 07:43:39 +02:00
Willy Tarreau	22beb2ad21	BUG/MINOR: backend: only enforce turn-around state when not redispatching In github issue #1878, Bart Butler reported observing turn-around states (1 second pause) after connection retries going to different servers, while this ought not happen. In fact it does happen because back_handle_st_cer() enforces the TAR state for any algo that's not round-robin. This means that even leastconn has it, as well as hashes after the number of servers changed. Prior to doing that, the call to stream_choose_redispatch() has already had a chance to perform the correct choice and to check the algo and the number of retries left. So instead we should just let that function deal with the algo when needed (and focus on deterministic ones), and let the former just obey. Bart confirmed that the fixed version works as expected (no more delays during retries). This may be backported to older releases, though it doesn't seem very important. At least Bart would like to have it in 2.4 so let's go there for now after it has cooked a few weeks in 2.6. (cherry picked from commit `406efb96d1`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-10 07:43:13 +02:00
Willy Tarreau	c564abd107	BUG/MAJOR: conn-idle: fix hash indexing issues on idle conns Idle connections do not work on 32-bit machines due to an alignment issue causing the connection nodes to be indexed with their lower 32-bits set to zero and the higher 32 ones containing the 32 lower bits of the hash. The cause is the use of ebmb_node with an aligned data, as on this platform ebmb_node is only 32-bit aligned, leaving a hole before the following hash which is a uint64_t: $ pahole -C conn_hash_node ./haproxy struct conn_hash_node { struct ebmb_node node; /* 0 20 */ /* XXX 4 bytes hole, try to pack */ int64_t hash; /* 24 8 */ struct connection * conn; /* 32 4 */ /* size: 40, cachelines: 1, members: 3 */ /* sum members: 32, holes: 1, sum holes: 4 */ /* padding: 4 */ /* last cacheline: 40 bytes */ }; Instead, eb64 nodes should be used when it comes to simply storing a 64-bit key, and that is what this patch does. For backports, a variant consisting in simply marking the "hash" member with a "packed" attribute on the struct also does the job (tested), and might be preferable if the fix is difficult to adapt. Only 2.6 and 2.5 are affected by this. (cherry picked from commit `8522348482`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-10 07:40:32 +02:00
Aurelien DARRAGON	9779c1b5d7	BUG/MINOR: log: improper behavior when escaping log data Patrick Hemmer reported an improper log behavior when using log-format to escape log data (+E option): Some bytes were truncated from the output: - escape_string() function now takes an extra parameter that allows the caller to specify input string stop pointer in case the input string is not guaranteed to be zero-terminated. - Minor checks were added into lf_text_len() to make sure dst string will not overflow. - lf_text_len() now makes proper use of escape_string() function. This should be backported as far as 1.8. (cherry picked from commit `c5bff8e550`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-20 16:31:39 +02:00
Amaury Denoyelle	28e18246be	BUG/MEDIUM: mux-quic: properly trim HTX buffer on snd_buf reset MUX QUIC snd_buf operation will return early if a qcs instance is reset. In this case, HTX is left untouched and the callback returns the whole buffer size. This leads to an undefined behavior as the stream layer is notified about a transfer but does not see its HTX buffer emptied. In the end, the transfer may stall which will lead to a leak on session. To fix this, HTX buffer is now reset when snd_buf is short-circuited. This should fix the issue as now the stream layer can continue the transfer until its completion. This patch has already been tested by Tristan and is reported to solve the github issue #1801. This should be backported up to 2.6. (cherry picked from commit `0ed617ac2f`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-20 15:58:22 +02:00
Amaury Denoyelle	53e4116b6b	MINOR: mux-quic: refactor snd_buf Factorize common code between h3 and hq-interop snd_buf operation. This is inserted in MUX QUIC snd_buf own callback. The h3/hq-interop API has been adjusted to directly receive a HTX message instead of a plain buf. This led to extracting part of MUX QUIC snd_buf in qmux_http module. This should be backported up to 2.6. (cherry picked from commit `9534e59bb9`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-20 15:58:18 +02:00
Amaury Denoyelle	0859fbf203	REORG: mux-quic: export HTTP related function in a dedicated file Extract function dealing with HTX outside of MUX QUIC. For the moment, only rcv_buf stream operation is concerned. The main objective is to be able to support both TCP and HTTP proxy mode with a common base and add specialized modules on top of it. This should be backported up to 2.6. (cherry picked from commit `d80fbcaca2`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-20 15:58:14 +02:00
Amaury Denoyelle	f3cab28f96	REORG: mux-quic: extract traces in a dedicated source file QUIC MUX implements several APIs to interface with stream, quic-conn and app-ops layers. It is planned to better separate these roles, possibly by using several files. The first step is to extract QUIC MUX traces in a dedicated source file. This will allow to reuse traces in multiple files. The main objective is to be able to support both TCP and HTTP proxy mode with a common base and add specialized modules on top of it. This should be backported up to 2.6. (cherry picked from commit `36d50bff22`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-20 15:58:06 +02:00
Amaury Denoyelle	4075ed06e6	BUG/MEDIUM: mux-quic: fix nb_hreq decrement nb_hreq is a counter on qcc for active HTTP requests. It is incremented for each qcs where a full HTTP request was received. It is decremented when the stream is closed locally : - on HTTP response fully transmitted - on stream reset A bug will occur if a stream is reset without having processed a full HTTP request. nb_hreq will be decremented whereas it was not incremented. This will lead to a crash when building with DEBUG_STRICT=2. If BUG_ON_HOT is not active, nb_hreq counter will wrap which may break the timeout logic for the connection. This bug was triggered on haproxy.org. It can be reproduced by simulating the reception of a STOP_SENDING frame instead of a STREAM one by patching qc_handle_strm_frm() : + if (quic_stream_is_bidi(strm_frm->id)) + qcc_recv_stop_sending(qc->qcc, strm_frm->id, 0); + //ret = qcc_recv(qc->qcc, strm_frm->id, strm_frm->len, + // strm_frm->offset.key, strm_frm->fin, + // (char *)strm_frm->data); To fix this bug, a qcs is now flagged with a new QC_SF_HREQ_RECV. This is set when the full HTTP request is received. When the stream is closed locally, nb_hreq will be decremented only if this flag was set. This must be backported up to 2.6. (cherry picked from commit `afb7b9d8e5`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-20 15:52:18 +02:00
Amaury Denoyelle	01a5be8c38	CLEANUP: mux-quic: remove stconn usage in h3/hq Small cleanup on snd_buf for application protocol layer. * do not export h3_snd_buf * replace stconn by a qcs argument. This is better as h3/hq-interop only uses the qcs instance. This should be backported up to 2.6. (cherry picked from commit `8d4ac48d3d`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-19 11:41:38 +02:00
Amaury Denoyelle	57b3c47e70	BUG/MEDIUM: mux-quic: fix crash on early app-ops release H3 SETTINGS emission has recently been delayed. The idea is to send it with the first STREAM to reduce sendto syscall invocation. This was implemented in the following patch : `3dd79d378c` MINOR: h3: Send the h3 settings with others streams (requests) This patch works fine under nominal conditions. However, it will cause a crash if an HTTP/3 connection is released before having sent any data, for example when receiving an invalid first request. In this case, qc_release will first free qcc.app_ops HTTP/3 application protocol layer via release callback. Then qc_send is called to emit any closing frames built by app_ops release invocation. However, in qc_send, as no data has been sent, it will try to complete application layer protocol initialization, with a SETTINGS emission for HTTP/3. Thus, qcc.app_ops is reused, which is invalid as it has been just freed. This will cause a crash with h3_finalize in the call stack. This bug can be reproduced artificially by generating incomplete HTTP/3 requests. This will in time trigger http-request timeout without any data sent. This is done by editing qc_handle_strm_frm function. - ret = qcc_recv(qc->qcc, strm_frm->id, strm_frm->len, + ret = qcc_recv(qc->qcc, strm_frm->id, strm_frm->len - 1, strm_frm->offset.key, strm_frm->fin, (char *)strm_frm->data); To fix this, application layer closing API has been adjusted to be done in two steps. A new shutdown callback is implemented : it is used by the HTTP/3 layer to generate GOAWAY frame in qc_release prologue. Application layer context qcc.app_ops is then freed later in qc_release via the release operation which is now only used to liberate app layer resources. This fixes the problem as the intermediary qc_send invocation will be able to reuse app_ops before it is freed.
This patch fixes the crash, but it would be better to adjust H3 SETTINGS emission in case of early connection closing : in this case, there is no need to send it. This should be implemented in a future patch. This should fix the crash recently experienced by Tristan in github issue #1801. This must be backported up to 2.6. (cherry picked from commit `f8aaf8bdfa`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-19 11:41:25 +02:00
William Lallemand	86120e4953	MEDIUM: quic: separate path for rx and tx with set_encryption_secrets With quicTLS the set_encryption_secrets callback is always called with the read_secret and the write_secret. However this is not the case with LibreSSL, which uses the set_read_secret()/set_write_secret() mechanism. It still provides the set_encryption_secrets() callback, which is called with a NULL parameter for the write_secret during the read, and for the read_secret during the write. The exchange key was not designed in haproxy to be called separately for read and write, so this patch allows calls with the read or write key set to NULL. (cherry picked from commit `95fc737fc6`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-19 11:41:19 +02:00
Emeric Brun	2cc1ed89d4	BUG/MEDIUM: sink: bad init sequence on tcp sink from a ring. The init of tcp sink, particularly for SSL, was done too early in the code, during parsing, and this can cause a crash especially if nbthread was not configured. This was detected by William using ASAN on a new regtest on log forward. This patch adds the 'struct proxy' created for a sink to a list and this list is now submitted to the same init code as the main proxies list or the log_forward's proxies list. Doing this, we are assured to use the right init sequence. It also removes the init code for ssl from post section parsing. This patch should be backported as far as v2.2. Note: this fix uses 'goto' labels created by commit 'BUG/MAJOR: log-forward: Fix log-forward proxies not fully initialized' but this code didn't exist before v2.3 so this patch needs to be adapted for v2.2. (cherry picked from commit `d6e581de4b`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-19 11:36:59 +02:00
Aurelien DARRAGON	1f9d9f53f0	MINOR: proxy/listener: support for additional PAUSED state This patch is a prerequisite for #1626. Adding PAUSED state to the list of available proxy states. The flag is set when the proxy is paused at runtime (pause_listener()). It is cleared when the proxy is resumed (resume_listener()). It should be backported to 2.6, 2.5 and 2.4 (cherry picked from commit `d46f437de6`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-12 17:54:22 +02:00
Aurelien DARRAGON	614f99ee0a	MINOR: listener: small API change A minor API change was performed in listener(.c/.h) to restore consistency between stop_listener() and (resume/pause)_listener() functions. LISTENER_LOCK was never locked prior to calling stop_listener(): lli variable hint is thus not useful anymore. Added PROXY_LOCK locking in (resume/pause)_listener() functions with related lpx variable hint (prerequisite for #1626). It should be backported to 2.6, 2.5 and 2.4 (cherry picked from commit `001328873c`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-12 17:54:22 +02:00
Frédéric Lécaille	ba627a66ce	MINOR: h3: Send the h3 settings with others streams (requests) This is the ->finalize application callback which prepares the unidirectional STREAM frames for h3 settings and wakes up the mux I/O handler to send them. As haproxy is at the same time always waiting for the client request, this makes haproxy call sendto() to send only about 20 bytes of stream data. Furthermore, in case of heavy loss, this gives short h3 requests less chance to succeed. Drawback: as at this time the mux sends its streams in ascending ID order, the stream 0 is always embedded before the unidirectional stream 3 for h3 settings. Nevertheless, as these settings may be lost and received after other h3 request streams, this is permitted by the RFC. Perhaps there is a better way to do this. This will have to be checked with Amaury. Must be backported to 2.6. (cherry picked from commit `3dd79d378c`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-12 17:54:22 +02:00
Frédéric Lécaille	68436e33d1	BUG/MINOR: quic: Speed up the handshake completion only one time It is possible to speed up the handshake completion but only one time per connection as mentioned in RFC 9002 "6.2.3. Speeding up Handshake Completion". Add a flag to prevent this process from being run several times (see https://www.rfc-editor.org/rfc/rfc9002#name-speeding-up-handshake-compl). Must be backported to 2.6. (cherry picked from commit `bb995eafc7`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-12 17:54:22 +02:00
Willy Tarreau	397fcc008d	MINOR: sched: store the current profile entry in the thread context The profile entry that corresponds to the current task/tasklet being profiled is now stored into the thread's context. This will allow it to be accessed from the tasks themselves. This is needed for an upcoming fix. (cherry picked from commit `1efddfa6bf`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-12 17:54:22 +02:00
Willy Tarreau	1cb273c718	BUG/MINOR: sched: properly account for the CPU time of dying tasks When task profiling is enabled, the scheduler can measure and report the cumulated time spent in each task and their respective latencies. But this was wrong for tasks with few wakeups as well as for self-waking ones, because the call date needed to measure how long it takes to process the task is retrieved in the task itself (->wake_date was turned to the call date), and we could face two conditions: - a new wakeup while the task is executing would reset the ->wake_date field before returning and cause abnormally low values to be reported; that was likely the case for task_run_applet for self-waking applets; - when the task dies, NULL is returned and the call date couldn't be retrieved, so that CPU time was not being accounted for. This was particularly visible with process_stream() which is usually called only twice per request, and whose time was systematically halved. The cleanest solution here is to keep in mind that the scheduler already uses quite a bit of local context in th_ctx, and place the intermediary values there so that they cannot vanish. The wake_date has to be reset immediately once read, and only its copy is used along the function. Note that this must be done both for tasks and tasklets, and that until recently tasklets were also able to report wrong values due to their sole dependency on TH_FL_TASK_PROFILING between tests. One nice benefit for future improvements is that such information will now be available from the task without having to be stored into the task itself anymore. Since the tasklet part was computed on wrapping 32-bit arithmetic and the task one was on 64-bit, the values are now consistently moved to 32-bit as it's already largely sufficient (4s spent in a task is more than twice what the watchdog would tolerate). Some further cleanups might be necessary, but the patch aimed at staying minimal. 
Task profiling output after 1 million HTTP requests previously looked like this:

  Tasks activity:
    function                       calls   cpu_tot   cpu_avg   lat_tot   lat_avg
    h1_io_cb                     2012338   4.850s    2.410us   12.91s    6.417us
    process_stream               2000136   9.594s    4.796us   34.26s    17.13us
    sc_conn_io_cb                2000135   1.973s    986.0ns   30.24s    15.12us
    h1_timeout_task                  137   -         -         2.649ms   19.34us
    accept_queue_process              49   152.3us   3.107us   321.7yr   6.564yr
    main+0x146430                      7   5.250us   750.0ns   25.92us   3.702us
    srv_cleanup_idle_conns             1   559.0ns   559.0ns   918.0ns   918.0ns
    task_run_applet                    1   -         -         2.162us   2.162us

Now it looks like this:

  Tasks activity:
    function                       calls   cpu_tot   cpu_avg   lat_tot   lat_avg
    h1_io_cb                     2014194   4.794s    2.380us   13.75s    6.826us
    process_stream               2000151   20.01s    10.00us   36.04s    18.02us
    sc_conn_io_cb                2000148   2.167s    1.083us   32.27s    16.13us
    h1_timeout_task                  198   54.24us   273.0ns   3.487ms   17.61us
    accept_queue_process              52   158.3us   3.044us   409.9us   7.882us
    main+0x1466e0                     18   16.77us   931.0ns   63.98us   3.554us
    srv_cleanup_toremove_conns         8   282.1us   35.26us   546.8us   68.35us
    srv_cleanup_idle_conns             3   149.2us   49.73us   8.131us   2.710us
    task_run_applet                    3   268.1us   89.38us   11.61us   3.871us

Note the two-fold difference on process_stream(). This feature is essentially used for debugging so it has extremely limited impact. However it's used quite a bit more in bug reports and it would be desirable that at least 2.6 gets this fix backported. It depends on at least these two previous patches which will then also have to be backported:

  MINOR: task: permanently enable latency measurement on tasklets
  CLEANUP: task: rename ->call_date to ->wake_date

(cherry picked from commit `62b5b96bcc`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-12 17:54:22 +02:00
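The fix described in the commit above (copy the wake date into the thread context, reset it in the task at once, and never read it back from a task that may have died) can be sketched roughly as follows. This is not HAProxy's scheduler code: the structures, field names and the simulated clock (`fake_now`, `now_ns`) are assumptions made up for this illustration, and timestamps are assumed to be 32-bit nanosecond counters.

```c
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; they only mirror the idea in the commit. */
struct profile_entry {
    uint64_t calls;
    uint64_t cpu_time;   /* cumulated processing time, ns */
    uint64_t lat_time;   /* cumulated wakeup-to-call latency, ns */
};

struct task_sketch {
    uint32_t wake_date;  /* low 32 bits of the wakeup date, 0 = not sampled */
    struct task_sketch *(*process)(struct task_sketch *); /* NULL when the task dies */
};

struct thread_ctx_sketch {
    uint32_t sched_wake_date;                  /* local copy of the wake date */
    uint32_t sched_call_date;                  /* date the handler was called */
    struct profile_entry *sched_profile_entry; /* entry being profiled */
};

static uint32_t fake_now;                      /* simulated clock, ns */
static uint32_t now_ns(void) { return fake_now; }

static void run_profiled(struct thread_ctx_sketch *ctx, struct task_sketch *t,
                         struct profile_entry *e)
{
    ctx->sched_profile_entry = e;
    ctx->sched_wake_date = t->wake_date;
    t->wake_date = 0;    /* reset at once: a wakeup during execution may set it again */

    if (ctx->sched_wake_date) {
        ctx->sched_call_date = now_ns();
        e->lat_time += (uint32_t)(ctx->sched_call_date - ctx->sched_wake_date);
        e->calls++;
    }

    /* The handler may free the task and return NULL; nothing below touches it. */
    t->process(t);

    if (ctx->sched_wake_date)
        e->cpu_time += (uint32_t)(now_ns() - ctx->sched_call_date);
}

/* A handler that burns 500 ns of simulated time, then reports the task as dead. */
static struct task_sketch *dying_task(struct task_sketch *t)
{
    (void)t;
    fake_now += 500;
    return NULL;
}

/* Runs one dying task woken at t=100 and called at t=150. */
static struct profile_entry demo_dying_task(void)
{
    struct profile_entry e = { 0, 0, 0 };
    struct thread_ctx_sketch ctx = { 0, 0, NULL };
    struct task_sketch t = { 100, dying_task };

    fake_now = 150;
    run_profiled(&ctx, &t, &e);
    return e;
}
```

In this sketch the CPU time of the dying task is still accounted for, because the call date lives in the thread context rather than in the task that has just disappeared.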
Christopher Faulet	43bd98151e	BUG/MINOR: task: Fix detection of tasks profiling in tasklet_wakeup_after() The regression was introduced when `ad548b54a7` ["MINOR: task: Add tasklet_wakeup_after()"] was backported to 2.6 (`21e0c31695`). TH_FL_TASK_PROFILING flag does not exist. To detect if tasks profiling is enabled, "task_profiling_mask" variable must be used. It is a 2.6-specific issue. Thus there is no upstream commit ID. This patch must be backported if the commit above is also backported. For now, no backport is needed.	2022-09-12 17:54:22 +02:00
Willy Tarreau	41f645a05a	CLEANUP: task: rename ->call_date to ->wake_date This field is misnamed because its real and important content is the date the task was woken up, not the date it was called. It temporarily holds the call date during execution but this remains confusing. In fact before the latency measurements were possible it was indeed a call date. Thus it will now be called wake_date. This change is necessary because a subsequent fix will require the introduction of the real call date in the thread ctx. (cherry picked from commit `04e50b3d32`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-12 17:54:22 +02:00
Willy Tarreau	c432738cb0	MINOR: task: permanently enable latency measurement on tasklets When tasklet latency measurement was enabled in 2.4 with commit `b2285de04` ("MINOR: tasks: also compute the tasklet latency when DEBUG_TASK is set"), the feature was conditioned on DEBUG_TASK because the field would add 8 bytes to the struct tasklet. This approach was not a very good idea because the struct ends on an int anyway, thus it does finish with a 32-bit hole regardless of the presence of this field. What is true however is that adding it turned a 64-byte struct to 72-byte when caller debugging is enabled. This patch revisits this with a minor change. Now only the lowest 32 bits of the call date are stored, so they always fit in the remaining hole, which allows the dependency on DEBUG_TASK to be removed. With debugging off, we're now seeing a 48-byte struct, and with debugging on it's exactly 64 bytes, thus still exactly one cache line. 32 bits allow a latency of 4 seconds on a tasklet, which already indicates a completely dead process, so there's no point storing the upper bits at all. And even in the event it would happen once in a while, the lost upper bits do not really add any value to the debug reports. Also, now one tasklet wakeup every 4 billion will not be sampled due to the test on the value itself. Similarly we just don't care, it's statistics and the measurements are not 9-digit accurate anyway. (cherry picked from commit `768c2c5678`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-12 17:54:22 +02:00
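The 32-bit trick in the commit above relies on unsigned arithmetic wrapping modulo 2^32. A minimal sketch, assuming nanosecond timestamps (which is what makes 32 bits correspond to roughly 4 seconds) and using hypothetical helper names that do not come from HAProxy:

```c
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */

/* Keep only the lowest 32 bits of a 64-bit nanosecond timestamp. */
static inline uint32_t wake_date_low32(uint64_t now_ns)
{
    return (uint32_t)now_ns;
}

/* Latency computed from the truncated wake date. Unsigned subtraction wraps
 * modulo 2^32, so the result is exact as long as the real latency is below
 * 2^32 ns (about 4.29 s), even when the 64-bit clock crossed a 32-bit
 * boundary between the wakeup and the call.
 */
static inline uint32_t latency_ns(uint64_t now_ns, uint32_t wake_date)
{
    return (uint32_t)now_ns - wake_date;
}

/* A stored value of zero is taken to mean "no sample". A wakeup whose low
 * 32 bits happen to be exactly zero is therefore skipped, which is the
 * "one wakeup every 4 billion" case mentioned in the commit message.
 */
static inline int is_sampled(uint32_t wake_date)
{
    return wake_date != 0;
}
```

For example, a wakeup at `0xFFFFFFF0` ns followed by a call at `0x100000010` ns yields a latency of 32 ns even though the upper bits were dropped.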
Willy Tarreau	87f8732e1b	BUG/MINOR: task: make task_instant_wakeup() work on a task not a tasklet There's a subtle (harmless) bug in task_instant_wakeup(). As it uses some tasklet code instead of some task code, the debug part also acts on the tasklet equivalent, and the call_date is only set when DEBUG_TASK is set instead of unconditionally like with tasks. As such, without this debugging macro, call dates are not updated for tasks woken this way. There isn't any impact yet because this function was introduced in 2.6 to solve certain classes of issues and is not used yet, and in the worst case it would only affect the reported latency time. This may be backported to 2.6 in case a future fix would depend on it but currently will not fix existing code. (cherry picked from commit `0fae3a0360`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-12 17:54:22 +02:00
Willy Tarreau	c94b84c6ba	BUG/MINOR: task: always reset a new tasklet's call date The tasklet's call date was not reset, so if profiling was enabled while some tasklets were in the run queue, their initial random value could be used to preload a bogus initial latency value into the task profiling bin. Let's just zero the initial value. This should be backported to 2.4 as it was brought with initial commit `b2285de04` ("MINOR: tasks: also compute the tasklet latency when DEBUG_TASK is set"). The impact is very low though. (cherry picked from commit `f27acd961e`) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-09-12 17:54:22 +02:00

6368 Commits