haproxy

Author	SHA1	Message	Date
Aleksandr Gamzin	0c84063ccb	3.0.7-alt1 - 3.0.7	2024-12-16 17:44:20 +03:00
Aleksandr Gamzin	89344084f5	HAProxy 3.0.7 Merge tag 'v3.0.7' into sisyphus HAProxy 3.0.7	2024-12-16 16:16:43 +03:00
Christopher Faulet	ce353908f1	[RELEASE] Released version 3.0.7 Released version 3.0.7 with the following main changes : - BUG/MEDIUM: pattern: prevent uninitialized reads in pat_match_{str,beg} - MINOR: quic: notify connection layer on handshake completion - BUG/MINOR: stream: unblock stream on wait-for-handshake completion - BUG/MEDIUM: quic: support wait-for-handshake - MINOR: quic: simplify qc_parse_pkt_frms() return path - MINOR: quic: use dynamically allocated frame on parsing - MINOR: quic: extend return value of CRYPTO parsing - BUG/MINOR: quic: repeat packet parsing to deal with fragmented CRYPTO - CLEANUP: guid: remove global tree export - BUG/MINOR: guid/server: ensure thread-safety on GUID insert/delete - BUG/MEDIUM: quic: prevent crash due to CRYPTO parsing error - BUG/MINOR: cli: don't show sockpairs in HAPROXY_CLI and HAPROXY_MASTER_CLI - BUG/MEDIUM: resolvers: Insert a non-executed resulution in front of the wait list - BUG/MEDIUM: mux-h2: Don't send RST_STREAM frame for streams with no ID - BUG/MINOR: Don't report early srv aborts on request forwarding in DONE state - BUG/MEDIUM: checks: make sure to always apply offsets to now_ms in expiration - BUG/MEDIUM: mailers: make sure to always apply offsets to now_ms in expiration - BUG/MINOR: mux_quic: make sure to always apply offsets to now_ms in expiration - BUG/MINOR: peers: make sure to always apply offsets to now_ms in expiration - DOC: config: A a space before ':' for {bs,fs}.aborted and {bs,fs}.rst_code - DOC: config: Fix a typo in "1.3.1. 
The Request line" - BUG/MINOR: http_ana: Report -1 for %Tr for invalid response only - DOC: config: Slightly improve the %Tr documentation - DOC: config: Move wait_end in section about internal samples - DOC: config: Move fs.* and bs.* in section about L5 samples - DOC: lua: fix yield-dependent methods expected contexts - DOC: configuration: explain quotes and spaces in conditional blocks - DOC: configuration: wrap long line for "strstr()" conditional expression - BUG/MINOR: http-ana: Adjust the server status before the L7 retries - BUG/MEDIUM: mux-h2: Increase max number of headers when encoding HEADERS frames - BUG/MEDIUM: mux-h2: Check the number of headers in HEADERS frame after decoding - BUG/MEDIUM: h3: Properly limit the number of headers received - BUG/MEDIUM: h3: Increase max number of headers when sending headers - DOC: config: Improve documentation of tune.http.maxhdr directive - BUG/MEDIUM: debug: don't set the STUCK flag from debug_handler() - BUG/MEDIUM: wdt: fix the stuck detection for warnings - BUG/MINOR: activity/memprofile: reinitialize the free calls on DSO summary - MINOR: activity/memprofile: offer a function to unregister stale info - BUG/MEDIUM: pools/memprofile: always clean stale pool info on pool_destroy() - BUG/MAJOR: mux-h1: Properly handle wrapping on obuf when dumping the first-line - BUG/MAJOR: quic: fix wrong packet building due to already acked frames - DEV: lags/show-sess-to-flags: Properly handle fd state on server side - BUG/MEDIUM: http-ana: Don't release too early the L7 buffer - BUG/MEDIUM: sock: Remove FD_POLL_HUP during connect() if FD_POLL_ERR is not set - MINOR: mux-quic: Don't send an emtpy H3 DATA frame during zero-copy forwarding - BUG/MINOR: log: fix lf_text() behavior with empty string - BUG/MEDIUM: event_hdl: fix uninitialized value in async mode when no data is provided - BUG/MEDIUM: http-ana: Reset request flag about data sent to perform a L7 retry - BUG/MINOR: h1-htx: Use default reason if not set when formatting 
the response - BUG/MINOR: signal: register default handler for SIGINT in signal_init() - BUG/MINOR: quic: remove startup alert if conn socket-owner unsupported - MINOR: mux-h2/traces: add a missing trace on negative initial window size - CLEANUP: mux-h2/traces: reword certain ambiguous traces - BUG/MINOR: server-state: Fix expiration date of srvrq_check tasks	2024-12-12 13:49:49 +01:00
Christopher Faulet	cd69a61378	BUG/MINOR: server-state: Fix expiration date of srvrq_check tasks "hold.timeout" was used as expiration date for srvrq_check tasks. But it is not accurate. The expiration date must be based on the resolution timeouts instead (resolve and retry). The purpose of srvrq_check task is to clean up the server resolution status when outdated info is inherited from the state file. Using "hold.timeout" is not accurate here because hold timeouts concern the resolution response items, not the resolution status of servers. It may be set to a huge value or 0. The expiration date of these tasks must be based on the resolution timeouts instead. So now the ("timeout resolve" + resolve_retries * "timeout retry") value is used. This patch should fix the issue #2816. It must be backported to all stable versions. (cherry picked from commit 647a2906626e9c3d9c3349d338a35798325496f2) Signed-off-by: Willy Tarreau <w@1wt.eu> (cherry picked from commit 3746a7d0639ced74bb9f7cff79181be9a0f18e56) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-12-11 14:55:33 +01:00
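The expiration formula quoted in this commit message can be sketched as below. This is only an illustration: the struct and function names are hypothetical and do not reflect HAProxy's actual data structures.

```c
#include <stdint.h>

/* Hypothetical stand-in for the timing settings of a resolvers section.
 * Field names are illustrative, not HAProxy's real struct layout. */
struct resolv_timing {
    uint32_t timeout_resolve_ms; /* "timeout resolve" */
    uint32_t timeout_retry_ms;   /* "timeout retry" */
    uint32_t resolve_retries;    /* "resolve_retries" */
};

/* Delay after which resolution status inherited from the state file is
 * treated as outdated: timeout resolve + resolve_retries * timeout retry. */
static uint32_t srvrq_check_delay_ms(const struct resolv_timing *t)
{
    return t->timeout_resolve_ms + t->resolve_retries * t->timeout_retry_ms;
}
```

Unlike a hold timeout, which may be 0 or very large, this value is bounded by how long a full resolution attempt with retries can take.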
Willy Tarreau	6928d0a04f	CLEANUP: mux-h2/traces: reword certain ambiguous traces Some h2 traces were not very clear, let's reword them a bit. (cherry picked from commit 7c8e9420a23584c7f366aacbfeb308d949f5c7b3) Signed-off-by: Willy Tarreau <w@1wt.eu> (cherry picked from commit ff64cbe2092bd5a6a2874d7b44afe322f6348e41) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-12-11 14:51:19 +01:00
Willy Tarreau	a016c73f0e	MINOR: mux-h2/traces: add a missing trace on negative initial window size When a negative initial window size is reported, we're going to close the connection, so it's important to report a trace to explain why! This should be backported at least to 3.1 and possibly 3.0 (adapting the context since there's no glitches there). (cherry picked from commit 86823c828f983bd986b150542c7a6482d60b291d) Signed-off-by: Willy Tarreau <w@1wt.eu> (cherry picked from commit 35bb79b797b85f8b89c87b8b925bc50f6f25865c) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-12-11 14:50:32 +01:00
Amaury Denoyelle	ffdd10f627	BUG/MINOR: quic: remove startup alert if conn socket-owner unsupported QUIC relies on several advanced network API features from the kernel to perform optimally. Checks are performed during startup to ensure that these features are supported. A fallback is automatically performed for every incompatible feature. Besides the automatic fallback mechanism, a message is also reported to the user at the same time. Previously, alert level was used, but it is incorrect as it is reserved for unrecoverable errors which should prevent haproxy from starting. Warning level could be used, but this can annoy users running with zero-warning mode. This patch removes the alert message when 'socket-owner connection' mode cannot be activated. Convert the message to a diag level. This allows users to start without forcing configuration modification to hide a warning. Besides, several feature fallbacks, such as the polling mechanism, do not emit any warning either, so it's better to adopt a similar behavior for QUIC features. This must be backported up to 2.8. (cherry picked from commit 6fed219fd786f3fdca155f686cf2fa2f9e572697) Signed-off-by: Willy Tarreau <w@1wt.eu> (cherry picked from commit 24fa1cc97310e436607f64aa1ce3fd4330a26597) [cf: ctx adjt] Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-12-11 14:47:37 +01:00
Valentine Krasnobaeva	b5044b5760	BUG/MINOR: signal: register default handler for SIGINT in signal_init() When haproxy is launched in the background and in a subshell (see example below), according to the POSIX standard (2.11. Signals and Error Handling), it inherits the SIG_IGN signal handler for SIGINT and SIGQUIT from the subshell. $ (./haproxy -f env4.cfg &) So, when haproxy is launched like this, it doesn't stop upon receiving SIGINT. This can be a root cause of some unexpected timeouts when haproxy is started under VTest, as VTest sends SIGINT to the process in order to terminate it. To fix this, let's explicitly register the default signal handler for SIGINT in the signal_init() initcall. This should be backported in all stable versions. (cherry picked from commit d3c20b02469dea6f46369bb91965d8b4924bb2b7) Signed-off-by: Willy Tarreau <w@1wt.eu> (cherry picked from commit a78b02f37d70366ef5afd308de48e8e2c4b54b4a) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-12-11 14:43:41 +01:00
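As a rough illustration of the idea in this commit message, a process can override an inherited SIG_IGN disposition by explicitly installing the default one at startup. The helper name below is hypothetical, and the sketch uses plain ISO C signal(); it is not claimed to match how signal_init() does it in HAProxy.

```c
#include <signal.h>

/* Hypothetical helper: force the default disposition for SIGINT, which may
 * have been inherited as SIG_IGN from a backgrounding subshell.
 * Returns 0 on success, -1 on failure. */
static int restore_default_sigint(void)
{
    return signal(SIGINT, SIG_DFL) == SIG_ERR ? -1 : 0;
}
```

Calling such a helper early during initialization means a later SIGINT terminates the process regardless of how the parent shell had configured the signal.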
Christopher Faulet	c4075d0a74	BUG/MINOR: h1-htx: Use default reason if not set when formatting the response When the response status line is formatted before sending it to the client, if there is no reason set, HAProxy should add one that matches the status code, as stated in the configuration manual. However it is not performed. It is possible to hit this bug when the response comes from a H2 server, because there is no reason field in HTTP/2 and above. This patch should fix the issue #2798. It should be backported to all stable versions. (cherry picked from commit 37487ada739fc86e3acb46c9949196f4f15cc9b1) Signed-off-by: Willy Tarreau <w@1wt.eu> (cherry picked from commit 736d4e2c3550dc9c56e5f05778457466b3ce13d9) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-12-11 14:43:16 +01:00
Christopher Faulet	f763b3d368	BUG/MEDIUM: http-ana: Reset request flag about data sent to perform a L7 retry It is possible to lose the request after several L7 retries, leading to crashes, because the request channel flag stating some data were sent is not properly reset. When a L7 retry is performed, some flags on different entities must be reset to be sure a new connection will be properly retried, just like it was the first one, mainly because there was no connection establishment failure. One of them, on the request channel, is not reset: the flag stating some data were already sent. It is annoying because this flag is used during the connection establishment to know if an error is triggered at the connection level or at the data level. In the latter case, the error must be handled by the HTTP response analyzer, to eventually perform another L7 retry. Because CF_WROTE_DATA flag is not removed when a L7 retry is performed, a subsequent connection establishment error may be handled as a L7 error while in fact the request was never sent. It also means the request was never saved in the buffer used to perform L7 retries. Thus, on the next L7 retries, the request is just lost. This inevitably leads to a bunch of undefined behavior. One of them is a crash, when the request is used to perform the load-balancing. This patch should fix issue #2793. It must be backported to all stable versions. (cherry picked from commit 62f37801c881f68060cedb7a74b5b8cb5fcfec81) Signed-off-by: Willy Tarreau <w@1wt.eu> (cherry picked from commit d0129d2c2a408a9dabd486ee129f3ec8b0199270) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-12-11 14:43:12 +01:00
Aurelien DARRAGON	ce1bec1cc5	BUG/MEDIUM: event_hdl: fix uninitialized value in async mode when no data is provided In _event_hdl_publish(), when we prepare the asynchronous event and no <data> was provided (set to NULL), we forgot to initialize the _data event_hdl_async_event struct member to NULL, which leads to uninitialized reads in event_hdl_async_free_event() when the event is freed: ==1002331== Conditional jump or move depends on uninitialised value(s) ==1002331== at 0x35D9D1: event_hdl_async_free_event (event_hdl.c:224) ==1002331== by 0x1CC8EC: hlua_event_runner (hlua.c:9917) ==1002331== by 0x39AD3F: run_tasks_from_lists (task.c:641) ==1002331== by 0x39B7B4: process_runnable_tasks (task.c:883) ==1002331== by 0x314B48: run_poll_loop (haproxy.c:2976) ==1002331== by 0x315218: run_thread_poll_loop (haproxy.c:3190) ==1002331== by 0x18061D: main (haproxy.c:3747) The bug severity was set to MEDIUM because of its nature, and it's best if this patch can be backported up to 2.8. But in practise it can only be triggered with events that don't provide optional data: since PAT_REF events are the first native events making use of this feature, this bug shouldn't be an issue before f72a66e ("MINOR: pattern: publish event_hdl events on pat_ref updates") (cherry picked from commit dd56616067d19060425940f6906cefe6efcd1955) Signed-off-by: Willy Tarreau <w@1wt.eu> (cherry picked from commit 5b4381c19fbe87ad2972110330c59e1f231449ba) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-12-11 14:43:03 +01:00
Aurelien DARRAGON	bc3cdd6151	BUG/MINOR: log: fix lf_text() behavior with empty string As reported by Baptiste in GH #2797, if a logformat alias leveraging lf_text() ends up printing nothing (empty string), the whole logformat evaluation stops, leading to a garbage log message. This bug was introduced during 3.0 cycle in `fcb7e4b` ("MINOR: log: add lf_rawtext{_len}() functions"). At that time I genuinely thought that if strlcpy2() returned 0, it was due to a lack of space, actually forgetting that the function may simply be called with an empty string. Because of that, lf_text() would return NULL if called with an empty string, and since all lf_*() helpers are expected to return NULL on error, this explains why the logformat evaluation immediately stops in this case. To fix the issue, let's simply consider that strlcpy2() returning 0 is not an error, like it was already the case before. It should be backported in 3.1 and 3.0 with `fcb7e4b`. (cherry picked from commit 3e470471b7e0ec113807f6981699fda9538e7ffc) Signed-off-by: Willy Tarreau <w@1wt.eu> (cherry picked from commit ef8324f124f1ba0a98648edd49723ee2b8819bbe) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-12-11 14:42:35 +01:00
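The pitfall described in this commit message (treating "0 bytes copied" as an error) can be sketched as follows. Both functions are hypothetical stand-ins written for this illustration; they are not HAProxy's strlcpy2() or lf_text().

```c
#include <stddef.h>

/* Hypothetical bounded copy: copies at most size-1 bytes of src into dst,
 * NUL-terminates when size > 0, and returns the number of bytes copied. */
static size_t bounded_copy(char *dst, const char *src, size_t size)
{
    size_t n = 0;

    if (size == 0)
        return 0;
    while (n + 1 < size && src[n] != '\0') {
        dst[n] = src[n];
        n++;
    }
    dst[n] = '\0';
    return n;
}

/* Hypothetical append helper following the "return NULL on error"
 * convention: returns the new write position, or NULL if there is no room
 * at all. A copy of 0 bytes from an empty string is NOT an error. */
static char *append_text(char *dst, const char *src, size_t size)
{
    if (size == 0)
        return NULL;
    return dst + bounded_copy(dst, src, size);
}
```

With this shape, an empty input simply leaves the write position unchanged, so the caller keeps evaluating the remaining fields instead of aborting.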
Christopher Faulet	d680647107	MINOR: mux-quic: Don't send an emtpy H3 DATA frame during zero-copy forwarding It may only happen when there is no data to forward but a last stream frame must be sent with the FIN bit. It is not invalid, but it is useless to send an empty H3 DATA frame in that case. (cherry picked from commit 6697e87ae5e1f569dc87cf690b5ecfc049c4aab0) [ad: This patch was merely considered as an optimization. However, it is in fact mandatory as it fixes a bug on QUIC zero-copy implementation. As such, it must be backported up to 2.9. This bug can happen when iobuf data is null in done_ff, indicating that no data were transferred. Despite this, qcc_send_stream() was always called with data incorrectly incremented to iobuf offset, which is equal to HTTP/3 frame header length. This could cause garbage data emission by QUIC MUX. The most visible effect is that it provokes a BUG_ON() crash when QCS instance is released due to Tx offsets desynchronization. This bug is related to github issue #2678.] Signed-off-by: Amaury Denoyelle <adenoyelle@haproxy.com>	2024-12-05 17:07:37 +01:00
Christopher Faulet	56cd20cb53	BUG/MEDIUM: sock: Remove FD_POLL_HUP during connect() if FD_POLL_ERR is not set epoll_wait() may return EPOLLHUP and/or EPOLLRDHUP after an asynchronous connect(), to indicate that the peer accepted the connection then immediately closed before epoll_wait() returned. When this happens, sock_conn_check() is called to check whether or not the connection was correctly established, and after that the receive channel of the socket is assumed to already be closed. This lets haproxy send the request at best (if RDHUP and not HUP) then immediately close. Over the last two years, there were a few reports about this spuriously happening on connections where network captures proved that the server had not closed at all (and sometimes even received the request and responded to it after haproxy had closed). The logs show that a successful connection is immediately reported on error after the request was sent. After investigations, it appeared that an EPOLLHUP, or possibly an EPOLLRDHUP, can be reported by epoll_wait() during the connect() but in sock_conn_check(), the connect() reports a success. So the connection is validated but the HUP is handled on the first receive and an error is reported. The same behavior could be observed on health-checks, leading HAProxy to consider the server as DOWN while it is not. The only explanation at this point is that it is a kernel bug, notably because it does not even match the documentation for connect() nor epoll. In addition for now it was only observed with Ubuntu kernels 5.4 and 5.15 and was never reproduced on any other one. 
We have no reproducer but here is the typical strace observed: socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 114 fcntl(114, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 setsockopt(114, SOL_TCP, TCP_NODELAY, [1], 4) = 0 connect(114, {sa_family=AF_INET, sin_port=htons(11000), sin_addr=inet_addr("A.B.C.D")}, 16) = -1 EINPROGRESS (Operation now in progress) epoll_ctl(19, EPOLL_CTL_ADD, 114, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP, data={u32=114, u64=114}}) = 0 epoll_wait(19, [{events=EPOLLIN, data={u32=15, u64=15}}, {events=EPOLLIN, data={u32=151, u64=151}}, {events=EPOLLIN, data={u32=59, u64=59}}, {events=EPOLLIN|EPOLLRDHUP, data={u32=114, u64=114}}], 200, 0) = 4 epoll_ctl(19, EPOLL_CTL_MOD, 114, {events=EPOLLOUT, data={u32=114, u64=114}}) = 0 epoll_wait(19, [{events=EPOLLOUT, data={u32=114, u64=114}}, {events=EPOLLIN, data={u32=15, u64=15}}, {events=EPOLLIN, data={u32=10, u64=10}}, {events=EPOLLIN, data={u32=165, u64=165}}], 200, 0) = 4 connect(114, {sa_family=AF_INET, sin_port=htons(11000), sin_addr=inet_addr("A.B.C.D")}, 16) = 0 sendto(114, "POST "..., 1009, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 1009 close(114) = 0 Some resources about this issue: - https://www.spinics.net/lists/netdev/msg876470.html - https://github.com/haproxy/haproxy/issues/1863 - https://github.com/haproxy/haproxy/issues/2368 So, to work around the issue, we have decided to remove the FD_POLL_HUP flag on the FD during the connection establishment if FD_POLL_ERR is not reported too in sock_conn_check(). This way, the call to connect() is able to validate or reject the connection. At the end, if the HUP or RDHUP flags were valid, either connect() would report the error itself, or the next recv() would return 0 confirming the closure that the poller tried to report. EPOLLRDHUP is only an optimization to save a syscall anyway, and this pattern is so rare that nobody will ever notice the extra call to recv(). 
Please note that at least one reporter confirmed that using poll() instead of epoll() also addressed the problem, so that can also be a temporary workaround for those discovering the problem without the ability to immediately upgrade. The event is accounted via a COUNT_IF(), to be able to spot it in future issues. Just in case. This patch should fix issues #1863 and #2368. It may be related to #2751. It should be backported as far as 2.4. In 3.0 and below, the COUNT_IF() must be removed. (cherry picked from commit 7262433183f590377ace31ff96b1fafa4525b7c2) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com> (cherry picked from commit b369bdcddfab9627cc3bacc0e75c9e94ac3b24fa) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-27 15:07:18 +01:00
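The workaround described in this commit message boils down to a small flag filter applied while the connection is still being established. The sketch below uses made-up flag values and a hypothetical function name; it is not the actual sock_conn_check() code nor HAProxy's real FD_POLL_* definitions.

```c
/* Illustrative flag values only; NOT HAProxy's actual FD_POLL_* values. */
#define POLL_FLAG_ERR 0x08u
#define POLL_FLAG_HUP 0x10u

/* Hypothetical filter applied to poller events during connect():
 * a HUP reported without ERR is dropped so that connect() itself decides
 * whether the connection succeeded. A genuine close would still show up
 * later as recv() returning 0. */
static unsigned int filter_connect_events(unsigned int ev)
{
    if ((ev & POLL_FLAG_HUP) && !(ev & POLL_FLAG_ERR))
        ev &= ~POLL_FLAG_HUP;
    return ev;
}
```

When ERR accompanies HUP, the events are left untouched, so real connection failures are still reported immediately.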
Christopher Faulet	525ebfea3a	BUG/MEDIUM: http-ana: Don't release too early the L7 buffer In some cases, the buffer used to store the request to be able to perform a L7 retry is released too early, leading to a crash because a retry is performed with an empty request. First, there is a test on invalid 101 responses that may be caught by the "junk-response" retry policy. Then, it is possible to get an error (empty-response, bad status code...) after an interim response. In both cases, the L7 buffer is already released while it should not be. To fix the issue, the L7 buffer is now released at the end of the AN_RES_WAIT_HTTP analyser, but only when a response was successfully received and processed. In all error cases, the stream is quickly released, with the L7 buffer. So there is no leak and it is safer this way. This patch may fix the issue #2793. It must be backported as far as 2.4. (cherry picked from commit dc15581c02171eeb49ef3ffbab0f583f38482b4c) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-27 15:05:33 +01:00
Christopher Faulet	704e2e4719	DEV: lags/show-sess-to-flags: Properly handle fd state on server side It must be handled as an hexadecimal value. (cherry picked from commit ceb80aed579bab9d8db38aa87790bc04b5c9767a) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-27 15:05:25 +01:00
Frederic Lecaille	00c346726a	BUG/MAJOR: quic: fix wrong packet building due to already acked frames If a packet build was asked to probe the peer with frames which have just been acked, the frames build run by qc_build_frms() could be cancelled by qc_stream_frm_is_acked() whose aim is to check that current frames to be built have not been already acknowledged. In this case the packet build run by qc_do_build_pkt() is not interrupted, leading to the build of an empty packet which should be ack-eliciting. This is a bug detected by the BUG_ON() statement in qc_do_build_pkt(): BUG_ON(qel->pktns->tx.pto_probe && !(pkt->flags & QUIC_FL_TX_PACKET_ACK_ELICITING)); Thank you to @Tristan971 for having reported this issue in GH #2709 This is an old bug which must be backported as far as 2.6. (cherry picked from commit 96b2641fc8ce58eb1875e7b525c57e58e4b794c3) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-27 15:05:19 +01:00
Christopher Faulet	27dd4f4efe	BUG/MAJOR: mux-h1: Properly handle wrapping on obuf when dumping the first-line The formatting of the first-line, for a request or a response, does not properly handle the wrapping of the output buffer. This may lead to a data corruption for the current response or eventually for the previous one. Utility functions used to format the first-line of the request or the response rely on the chunk API. So it is not expected to pass a buffer that wraps. Unfortunately, because of a change performed during the 2.9 dev cycle, the output buffer was directly used instead of a non-wrapping buffer created from it with b_make() function. It is not an issue for the request because its start-line is always the first block formatted in the output buffer. But for the response, the output may be not empty and may wrap. In that case, the response start-line is dumped at a random position in the buffer, corrupting data. AFAIK, it is only an issue if the HTTP request pipelining is used. To fix the issue, we now take care to create a non-wrapping buffer from the output buffer. This patch should fix issues #2779 and #2996. It must be backported as far as 2.9. (cherry picked from commit b150ae46dd97caa5050d8abefc1d9b619ab5ab9a) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:50:23 +01:00
Willy Tarreau	cd587f2b48	BUG/MEDIUM: pools/memprofile: always clean stale pool info on pool_destroy() There's actually a problem with memprofiles: the pool pointer is stored in ->info but some pools are replaced during startup, such as the trash pool, leaving a dangling pointer there, that may randomly report crap or even crash during "show profiling memory". Let's make pool_destroy() call memprof_remove_stale_info() added by previous patch so that these entries are properly unregistered. This must be backported along with the previous patch (MINOR: activity/memprofile: offer a function to unregister stale info) as far as 2.8. (cherry picked from commit ed3ed358676edf058663bde7ec6098b51f8bc745) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:49:51 +01:00
Willy Tarreau	3b5cfb28e1	MINOR: activity/memprofile: offer a function to unregister stale info There's actually a problem with memprofiles: the pool pointer is stored in ->info but some pools are replaced during startup, such as the trash pool, leaving a dangling pointer there. Let's complete the API with a new function memprof_remove_stale_info() that will remove all stale references to this info pointer. It's also present when USE_MEMORY_PROFILING is not set so as to ease the job on callers. (cherry picked from commit 859341c1ec583c586ef36db0b63cd84f3843bfab) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:49:47 +01:00
Willy Tarreau	992c3c2b67	BUG/MINOR: activity/memprofile: reinitialize the free calls on DSO summary In commit 401fb0e87a ("MINOR: activity/memprofile: show per-DSO stats") we added a summary per DSO. However the free calls/tot were not initialized when creating a new entry because initially they were applied to any entry, but since we don't update free calls for non-free capable callers, we still need to reinitialize these entries when reassigning one. Because of this bug, a "show profiling memory" output can randomly show highly negative values on the DSO lines if it turns out that the DSO entry was created on an alloc instead of a realloc/free. Since the commit above was backported to 2.9, this one must go there as well. (cherry picked from commit c42a2b8c945d1b45672a2b1715dfa586daaec657) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:49:33 +01:00
Willy Tarreau	5e704ead9a	BUG/MEDIUM: wdt: fix the stuck detection for warnings If two slow tasks trigger one warning even a few seconds apart, the watchdog code will mistakenly take this for a definite stuck task and kill the process. The reason is that since commit 148eb5875f ("DEBUG: wdt: better detect apparently locked up threads and warn about them") the updated ctxsw count is not the correct one, instead of updating the private counter it resets the public one, preventing it from making progress and making the wdt believe that no progress was made. In addition the initial value was read from [tid] instead of [thr]. Please note that another fix is needed in debug_handler() otherwise the watchdog will fire early after the first warning or thread dump. A simple test for this is to issue several of these commands back-to-back on the CLI, which crashes an unfixed 3.1 very quickly: $ socat /tmp/sock1 - <<< "expert-mode on; debug dev loop 1000" This needs to be backported to 2.9 since the fix above was backported there. The impact on 3.0 and 2.9 is almost inexistent since the watchdog there doesn't apply the shorter warning delay, so the first call already indicates that the thread is stuck. (cherry picked from commit 24ce001771a7609b2a3902fc1f851668ef176c59) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:49:24 +01:00
Willy Tarreau	62d7bc6011	BUG/MEDIUM: debug: don't set the STUCK flag from debug_handler() Since 2.0 with commit `e6a02fa65a` ("MINOR: threads: add a "stuck" flag to the thread_info struct"), the TH_FL_STUCK flag was set by the debugger to flag that a thread was stuck and report it in the output. However, two commits later (`2bfefdbaef` "MAJOR: watchdog: implement a thread lockup detection mechanism"), this flag was used to detect that a thread had already been reported as stuck. The problem is that it seldom happens that a "show threads" command instantly crashes because it calls debug_handler(), which sets the flag, and if the watchdog timer was about to trigger before going back to the scheduler, the watchdog believes that the thread has been stuck for a while and will kill the process. The issue was magnified in 3.1 with the lower-delay warning, because it's possible for a thread to die on the next wakeup after the first warning (which calls debug_handler() hence sets the STUCK flag). One good approach would have been to use two distinct flags, one for "stuck" as reported by the debug handler, and one for "stuck" as seen by the watchdog. However, one could also argue that since the second commit, given that the wdt monitors the threads, there's no point any more for the debug handler to set the flag itself. Removing this code means that two consecutive "show threads" will not report "stuck" until the watchdog sets it, which aligns better with expectations. This can be backported to all stable releases. This code has changed a bit over time, the "if" block and the harmless variables just need to be removed. (cherry picked from commit 1151fe68186cf862882f147de208c509c25d525e) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:49:06 +01:00
Christopher Faulet	9416f10fb9	DOC: config: Improve documentation of tune.http.maxhdr directive The description was improved to clearly mention that it is applied on received requests and responses. In addition, a comment was added about HTTP/2 and HTTP/3 limitation when messages are encoded to be sent. (cherry picked from commit e863d8d6814224961724157c605c77ddab85cbae) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:47:11 +01:00
Christopher Faulet	bda4f94322	BUG/MEDIUM: h3: Increase max number of headers when sending headers In the same way as for H2, the maximum number of headers that can be encoded when headers are sent must be increased to match the limit imposed when they are received. The reasons are the same. On the receive path, the maximum number of headers accepted must be higher than the configured limit to be able to handle pseudo headers and cookie headers. On the sending path, the same limit must be applied because the pseudo headers will consume some extra slots and the cookie header could be split. This patch should be backported as far as 2.6. (cherry picked from commit 3bd9a9e7d7a8d7869015eaf041b3ae7a0761c1d4) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:47:03 +01:00
Christopher Faulet	97f2c2e56e	BUG/MEDIUM: h3: Properly limit the number of headers received The number of headers is limited before the decoding but pseudo headers and cookie headers consume extra slots. In practice, this lowers the maximum number of headers that can be received. To work around this issue, the limit is doubled during the frame decoding to be sure to have enough extra slots. And the number of headers is tested against the configured limit after the HTX message was created to be able to report an error. Unfortunately, no parsing error is reported because the QUIC multiplexer is not able to do so for now. The same is performed on trailers to be consistent with H2. This patch should be backported as far as 2.6. (cherry picked from commit 785e63335374a6db8ef35205cdb36ea726710061) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:47:00 +01:00
Christopher Faulet	0777b7dded	BUG/MEDIUM: mux-h2: Check the number of headers in HEADERS frame after decoding There is no explicit test on the number of headers when a HEADERS frame is received. It is implicitly limited by the size of the header list. But it is twice the configured limit to be sure to decode the frame. So now, a check is performed after the HTX message was created. This way, we are sure to not exceed the configured limit after the decoding stage. If there are too many headers, a parsing error is reported. Note the same is performed on the trailers. This patch should partially address the issue #2685. It should be backported to all stable versions. (cherry picked from commit 63d2760dfa679bea4b7a61a1a8702af23cf26e75) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:46:55 +01:00
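The two-stage limit described in this commit message (room for twice the configured limit while decoding, then a strict check once the message is rebuilt) can be sketched as below. The function names are hypothetical and the code only models the arithmetic, not the actual mux-h2 implementation.

```c
/* Hypothetical helper: number of header slots made available while a
 * HEADERS frame is decoded. The margin absorbs pseudo-headers and split
 * cookie headers, which are reassembled afterwards. */
static unsigned int decode_hdr_slots(unsigned int maxhdr)
{
    return maxhdr * 2;
}

/* Hypothetical post-decoding check: once the message has been rebuilt,
 * the final header count must not exceed the configured limit
 * (tune.http.maxhdr). Returns 1 if acceptable, 0 for a parsing error. */
static int hdr_count_ok(unsigned int nb_hdrs, unsigned int maxhdr)
{
    return nb_hdrs <= maxhdr;
}
```

For example, with a configured limit of 101, decoding gets 202 slots, but a rebuilt message carrying 102 headers would still be rejected.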
Christopher Faulet	e9ad14e73d	BUG/MEDIUM: mux-h2: Increase max number of headers when encoding HEADERS frames When a HEADERS frame is encoded to be sent, the maximum number of headers allowed in the frame is lower than on the receiving path. This can lead to reporting a sending error while the message was accepted. It could be confusing. In addition, the start-line is split into pseudo-headers and this way consumes some header slots, increasing the difference between HEADERS frames encoding and decoding. It is even more noticeable because when a HEADERS frame is decoded, a margin is used to be able to handle split cookie headers. Concretely, on the decoding path, a limit of twice the maximum number of headers allowed in a message (tune.http.maxhdr * 2) is used. On the encoding path, the exact limit is used. It is not consistent. Note that when a frame is decoded, we must use a larger limit because the pseudo headers are reassembled in the start-line and must count for one. But also because, most of the time, the cookies are split into several headers and are reassembled too. To fix the issue, the same ratio is applied on the sending path. A limit must be defined because a dynamic allocation is not acceptable. Twice the configured limit should be good enough to support headers manipulation. This patch should be backported to all stable versions. (cherry picked from commit e415e3cb7aa1feaac3ed703687656e09dd464eb3) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:46:48 +01:00
Christopher Faulet	ca12dcec26	BUG/MINOR: http-ana: Adjust the server status before the L7 retries The server status must be adjusted, if necessary, at each retry. It is properly performed when "obersve layer4" directive is set. But for the layer 7, only the last attempt was considered. When the L7 retries were implemented, all retries were added before the server status adjutement. So only the last attempt was considered. To fix the issue, we must adjut the server status first, and then try to perform a L7 retry. This patch should fix the issue #2679. It must be backported to all stable versions. (cherry picked from commit 2a5da31ccef239e21d17ec34430fdc6b51b9cc67) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:45:53 +01:00
Willy Tarreau	adc7e713f7	DOC: configuration: wrap long line for "strstr()" conditional expression This keyword had too long a description line, let's split it. This can be backported to 2.8. (cherry picked from commit 5c15899410c722e2ff4a01f6d70dc40095b43ff5) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:45:43 +01:00
Willy Tarreau	b92987afa7	DOC: configuration: explain quotes and spaces in conditional blocks Conditional blocks inherit the same tokenizer and argument parser as the rest of the configuration, but are also silently concatenated around groups of spaces and tabs. This can lead to subtle failures for configs containing spaces around commas and parentheses, where a string comparison might silently fail for example. Let's better document this particular case. Thanks to Valentine for analysing and reporting the problem. This can be backported to 2.4. (cherry picked from commit da1620b3175c63b768a8537951667885fef77e8c) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:45:37 +01:00
Aurelien DARRAGON	c693da8896	DOC: lua: fix yield-dependent methods expected contexts Contrary to what the doc states, it is not expected (nor relevant) to use yield-dependent methods such as core.yield() or core.(m)sleep() from contexts that don't support yielding. Such contexts include body, init, fetches and converters. Thus the doc got it wrong since the beginning, because such methods were never supported from the above contexts, yet it was listed in the list of compatible contexts (probably the result of a copy-paste), which is error-prone because it could either cause a Lua runtime error to be thrown, or be ignored in some other cases. It should be backported to all stable versions. (cherry picked from commit 501827ebe0ad8f4121c4397267afbc7968e3d9af) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:44:48 +01:00
Christopher Faulet	b85468d1f3	DOC: config: Move fs.* and bs.* in section about L5 samples These sample fetch functions were added in the wrong section. Move them in the section about sample fetch functions at L5 layer. (cherry picked from commit e68c6852adb7051a30e209c5a0604f192182b42d) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:44:06 +01:00
Christopher Faulet	c6a4e359e3	DOC: config: Move wait_end in section about internal samples wait_end is an internal sample fetch function and not a L6 one. So move it in the corresponding section. (cherry picked from commit 4ccc3f40488bfeed93f0df7d339444fe6503ee4e) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:41:20 +01:00
Christopher Faulet	c9d735fd3f	DOC: config: Slightly improve the %Tr documentation Specify -1 can also be reported for %Tr delay when the response is invalid. (cherry picked from commit e9021a4ca1d6a70cb647441aae78ec4d35bb7c1a) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:40:14 +01:00
Christopher Faulet	9464b240ed	BUG/MINOR: http_ana: Report -1 for %Tr for invalid response only The server response time is erroneously reported as -1 when it is intercepted by HAProxy. As stated in the documentation, the server response time is reported as -1 when the last response header was never seen. It happens when a server timeout is triggered before the server managed to process the request. It also happens if the response is invalid. This may be reported by the mux during the response parsing, but also by the HTTP analyzers. However, in this last case, the response time must only be reported as -1 on 502. This patch must be backported to all stable versions. It should fix the issue #2384. (cherry picked from commit 5863d33fce702c46b77c07d4ea82e036b11417a6) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:40:07 +01:00
Christopher Faulet	598c140650	DOC: config: Fix a typo in "1.3.1. The Request line" At the beginning of the last paragraph of this section, HTTP/3 was used instead of HTTP/2. It is now fixed. (cherry picked from commit 18de419f9647ad5fe0006900e2c1587bffd49c24) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:39:52 +01:00
Christopher Faulet	453076bbcc	DOC: config: A a space before ':' for {bs,fs}.aborted and {bs,fs}.rst_code A space was missing before the ':' for the sample fetch functions above. It was an issue for the text to HTML conversion script. So, let's fix it. (cherry picked from commit 3af2d91b3b6ebe1587bcb17f5fb223436df67253) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:39:43 +01:00
Willy Tarreau	9d74f3692b	BUG/MINOR: peers: make sure to always apply offsets to now_ms in expiration Now_ms can be zero nowadays, so it's not suitable for direct assignment to t->expire, as there's a risk that the timer never wakes up once assigned (TICK_ETERNITY). Let's use tick_add(now_ms, 0) for an immediate wakeup instead. The impact here might be a reconnect programmed upon signal receipt at the wrapping date not having a working timeout. This should be backported where it applies. (cherry picked from commit ed55ff878d5af35dae70f78023ab2141d36e5866) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:35:13 +01:00
Willy Tarreau	b8c4edbc49	BUG/MINOR: mux_quic: make sure to always apply offsets to now_ms in expiration Now_ms can be zero nowadays, so it's not suitable for direct assignment to t->expire, as there's a risk that the timer never wakes up once assigned (TICK_ETERNITY). Let's use tick_add(now_ms, 0) for an immediate wakeup instead. The impact looks null since the task is also woken up, but better not leave such tasks in the timer tree anyway. This should be backported where it applies. (cherry picked from commit f66bfcff96082ce5c98c635c5da7a9ba157a20af) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:35:10 +01:00
Willy Tarreau	4ebe6dcb31	BUG/MEDIUM: mailers: make sure to always apply offsets to now_ms in expiration Now_ms can be zero nowadays, so it's not suitable for direct assignment to t->expire, as there's a risk that the timer never wakes up once assigned (TICK_ETERNITY). Let's use tick_add(now_ms, 0) for an immediate wakeup instead. The impact here might be mailers suddenly stopping. This should be backported where it applies. (cherry picked from commit 841be4cdd15b3d0834a478cc95ebda0f47171b4d) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:35:02 +01:00
Willy Tarreau	1ad88e6a79	BUG/MEDIUM: checks: make sure to always apply offsets to now_ms in expiration Now_ms can be zero nowadays, so it's not suitable for direct assignment to t->expire, as there's a risk that the timer never wakes up once assigned (TICK_ETERNITY). Let's use tick_add(now_ms, 0) for an immediate wakeup instead. The impact here might be health checks suddenly stopping. This should be backported where it applies. (cherry picked from commit 2f287f14f355e734e512732e35aebf993d000792) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:34:36 +01:00
Christopher Faulet	7cf18ab7c8	BUG/MINOR: Don't report early srv aborts on request forwarding in DONE state L7-retries may be ignored if server aborts are detected during the request forwarding, when the request is already in DONE state. When a request was fully processed (so in HTTP_MSG_DONE state) and is waiting to be forwarded to the server, there is a test to detect server aborts, to be able to report the error. However, this test must be skipped if the response was not received yet, to let the response analyzers handle the abort. It is important to properly handle the retries. This test must only be performed if the response analysis was finished. It means the response must be at least in HTTP_MSG_BODY state. This patch should be backported as far as 2.8. (cherry picked from commit a930e99f4699676ea72f72ba1fb99c953da0d74e) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:34:08 +01:00
Christopher Faulet	1b18a4cad1	BUG/MEDIUM: mux-h2: Don't send RST_STREAM frame for streams with no ID On server side, the H2 stream is first created with an unassigned ID (ID == 0). Its ID is assigned when the request is emitted, before formatting the HEADERS frame. However, the session may be aborted during that stage. We must take care to not emit RST_STREAM frame for this stream, because it does not exist yet for the server. It is especially important to do so because, depending on the timing, it may also happen before the H2 PREFACE was sent. This patch must be backported to all stable versions. It is related to issue (cherry picked from commit f065d0009888c394e5f93dfdaa2ae79958b2c2e2) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-22 15:33:00 +01:00
Christopher Faulet	33b0ca4440	BUG/MEDIUM: resolvers: Insert a non-executed resulution in front of the wait list When a resolver is woken up to process DNS resolutions, it is possible to trigger an infinite loop on the resolver's wait list because delayed resolutions are always reinserted at the end of this list. This leads the watchdog to kill the process. Re-inserting them in front of the list fixes the bug. When a resolver tries to send the queries for the resolutions in its wait list, it may be unable to proceed for a resolution. This may happen because the resolution must be skipped (no hostname to resolve, a resolution already in progress) or when an error occurred. In that case, the resolution is re-inserted in the resolver's wait list to be retried later, on a next wakeup. However, the resolution is inserted at the end of the wait list. So it is immediately reevaluated, in the same execution loop, instead of being delayed. Most of the time, it is not an issue because the resolution is considered as not expired on the second run. But it is a problem when the internal time wraps and is equal to 0. In that case, the resolution expiration date is badly computed and it is always considered as expired. If two or more resolutions are in that state, the resolver loops forever on its wait list, until the process is killed by the watchdog. So we can argue that the way the resolution expiration date is computed must be fixed. And it would be true in a perfect world. However, the resolvers code is so crappy that it is hard to be sure to not introduce regressions. It is far easier to re-insert delayed resolutions in front of the wait list. This fixes the issue and at worst, these resolutions will be evaluated one time too many on the next wakeup and only if now_ms was equal to 0 on the prior wakeup. This patch should be backported to all stable versions.
On 2.2, LIST_ADD() must be used instead of LIST_INSERT() (cherry picked from commit 8f28dbeea94e11e2327362755f16d18b301fd153) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-13 10:59:44 +01:00
Valentine Krasnobaeva	e771877f82	BUG/MINOR: cli: don't show sockpairs in HAPROXY_CLI and HAPROXY_MASTER_CLI Before this fix, HAPROXY_CLI and HAPROXY_MASTER_CLI contained, along with the CLI socket addresses, internal sockpairs which are used only for the master CLI (reload sockpair and sockpair shared with a worker process). These internal sockpairs always need to be hidden. At the moment there is no client that uses sockpair addresses for the stats listener or in order to connect to the master CLI. So, let's simply not copy these internal sockpair addresses of MASTER and GLOBAL proxy listeners. As listeners with sockpairs are skipped and they can be present in the listeners list in any order, let's add the semicolon separator between addresses only when there is already some string saved in the trash and we are sure that we are adding a new address to it. Otherwise, we could have such weird output: HAPROXY_MASTER_CLI=unix@/tmp/mcli.sock;; This fix needs to be backported to all stable versions. (cherry picked from commit 113745e6f0c0ef8fe89e89fdfdcc6ed994889d4a) [cf: ctx adjt] Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-13 10:59:22 +01:00
Amaury Denoyelle	51a13e6905	BUG/MEDIUM: quic: prevent crash due to CRYPTO parsing error A packet which contains several split and out of order CRYPTO frames may be parsed multiple times to ensure it can be handled via ncbuf. Only 3 iterations can be performed to prevent excessive CPU usage. There is a risk of crash if packet parsing is interrupted after maximum iterations is reached, or no progress can be made on the ncbuf. This is because <frm> may be dangling after list_for_each_entry_safe(). The crash occurs on qc_frm_free() invocation, on error path of qc_parse_pkt_frms(). To fix it, always reset frm to NULL after list_for_each_entry_safe() to ensure it is not dangling. This should fix new report on github issue #2776. This regression has been triggered by the following patch : 1767196d5b2d8d1e557f7b3911a940000166ecda BUG/MINOR: quic: repeat packet parsing to deal with fragmented CRYPTO As such, it must be backported up to 2.6, after the above patch. (cherry picked from commit 2975e8805d9e84010bf5199a2365d650923dbb2c) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-13 10:57:20 +01:00
Amaury Denoyelle	85fa6d5b77	BUG/MINOR: guid/server: ensure thread-safety on GUID insert/delete Since 3.0, it is possible to assign a GUID to proxies, listeners and servers. These objects are stored in a global tree guid_tree. Proxies and listeners are static. However, servers may be added or deleted at runtime, which implies that guid_tree must be protected. Fix this by declaring a read-write lock to protect tree access. For now, only guid_insert() and guid_remove() are protected using a write lock. Outside of these, GUID tree is not accessed at runtime. If server CLI commands are extended to support GUID as server identifier, lookup operation should be extended with a read lock protection. Note that during stat-file preloading, GUID tree is accessed for lookup. However, as it is performed on startup which is single threaded, there is no need for lock here. A BUG_ON() has been added to ensure this precondition remains true. This bug could cause a segfault when using dynamic servers with GUID. However, it has never been reproduced so far. This must be backported up to 3.0. To avoid a conflict issue, the previous cleanup patch can be merged before it. (cherry picked from commit 8e0e7d9d1af5b2dfec2e625d2c19dd034c36eb04) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-13 10:57:15 +01:00
Amaury Denoyelle	d68329f014	CLEANUP: guid: remove global tree export guid_tree is not directly used outside of functions provided by the guid module. Remove its export from the include file. (cherry picked from commit b70880cdc9c01602197fd124c84ab264f6b4ddfb) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-13 10:57:10 +01:00
Amaury Denoyelle	f3bddfa8eb	BUG/MINOR: quic: repeat packet parsing to deal with fragmented CRYPTO A ClientHello may be split across several different CRYPTO frames, then mixed in a single QUIC packet. This is used notably by clients such as chrome to render the first Initial packet opaque to middleboxes. Each packet frame is handled sequentially. Out-of-order CRYPTO frames are buffered in a ncbuf, until gaps are filled and data is transferred to the SSL stack. If CRYPTO frames are heavily split with small fragments, buffering may fail as ncbuf does not support small gaps. This causes the whole packet to be rejected and unacknowledged. It could be solved if the client reemits its ClientHello after remixing its CRYPTO frames. This patch is written to improve CRYPTO frame parsing. Each CRYPTO frame which cannot be buffered due to ncbuf limitation is now stored in a temporary list. Packet parsing is completed until all frames have been handled. If temporary list is not empty, reparsing is done on the stored frames. With the newly buffered CRYPTO frames, ncbuf insert operation may this time succeed if the frame now covers a whole gap. Reparsing will loop until either no progress can be made or it has been done at least 3 times, to prevent excessive CPU utilization. This patch should fix github issue #2776. This should be backported up to 2.6, after a period of observation. Note that it relies on the following refactor patches : MINOR: quic: extend return value of CRYPTO parsing MINOR: quic: use dynamically allocated frame on parsing MINOR: quic: simplify qc_parse_pkt_frms() return path (cherry picked from commit 1767196d5b2d8d1e557f7b3911a940000166ecda) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2024-11-08 15:54:11 +01:00
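
The four "make sure to always apply offsets to now_ms in expiration" commits listed above share one idea: a tick value of 0 is reserved to mean "no timer" (TICK_ETERNITY), so a wrapping millisecond clock must not be stored raw in an expiration field. The following is a minimal sketch of that idea, written from the commit descriptions and not copied from the HAProxy sources. The names `sketch_tick_add`, `sketch_tick_isset` and `SKETCH_TICK_ETERNITY` are hypothetical, and the use of unsigned 32-bit arithmetic is an assumption made for the illustration.

```c
#include <stdint.h>

/* Assumption taken from the commit messages: a tick of 0 means "no timer". */
#define SKETCH_TICK_ETERNITY 0u

/* Returns non-zero when the tick denotes a real expiration date. */
static inline int sketch_tick_isset(uint32_t tick)
{
	return tick != SKETCH_TICK_ETERNITY;
}

/* Adds a timeout to a wrapping millisecond clock and never returns the
 * reserved value: if the sum lands on 0, it is pushed 1 ms later.
 */
static inline uint32_t sketch_tick_add(uint32_t now_ms, uint32_t timeout_ms)
{
	uint32_t tick = now_ms + timeout_ms; /* wraps modulo 2^32 */

	if (tick == SKETCH_TICK_ETERNITY)
		tick++;
	return tick;
}
```

With the clock at 0, assigning it directly to an expiration field would produce an unset timer, whereas `sketch_tick_add(0, 0)` produces a tick that is set and expires almost immediately, which is the behaviour the commits describe for `tick_add(now_ms, 0)`.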


27370 Commits
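
The "cli: don't show sockpairs in HAPROXY_CLI and HAPROXY_MASTER_CLI" commit in this log describes a separator rule: skipped listeners must not emit a ';', and a ';' is only written when the output already holds an address. The sketch below illustrates that rule under stated assumptions. The `sketch_listener` structure and the `sketch_join_cli_addrs` function are hypothetical and do not reflect the real HAProxy data structures or its trash buffer API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of a listener: only what the sketch needs. */
struct sketch_listener {
	const char *addr;   /* printable address, e.g. "unix@/tmp/mcli.sock" */
	int is_sockpair;    /* non-zero for internal sockpairs, which stay hidden */
};

/* Joins the visible addresses with ';' into <out>. Returns the resulting
 * length, or -1 if <out> is too small.
 */
static int sketch_join_cli_addrs(const struct sketch_listener *l, size_t count,
                                 char *out, size_t size)
{
	size_t len = 0;

	if (size == 0)
		return -1;
	out[0] = '\0';

	for (size_t i = 0; i < count; i++) {
		size_t alen;

		if (l[i].is_sockpair)
			continue; /* skipped entries must not emit a separator */

		alen = strlen(l[i].addr);
		/* room for the optional ';', the address and the trailing NUL */
		if (len + (len ? 1 : 0) + alen + 1 > size)
			return -1;
		if (len)
			out[len++] = ';'; /* only between two visible addresses */
		memcpy(out + len, l[i].addr, alen + 1);
		len += alen;
	}
	return (int)len;
}
```

Because the separator is written before an address instead of after it, hidden entries at the start, middle or end of the list cannot leave a leading, doubled or trailing ';' in the result.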