haproxy

Author	SHA1	Message	Date
Christopher Faulet	369c66d040	BUG/MEDIUM: raw-sock: Don't report connection error if something was received In raw_sock_to_buf(), if a low-level error is reported, we no longer immediately set an error on the connexion if something was received. This may happen when a RST is received with data. This way, we let a chance to the mux to process received data first instead of immediately aborting. This patch should fix some spurious health-check failures. It is pretty hard to observe, but with a server immediately returning the response followed by a RST, without waiting the request, it is possible to have some health-check errors. For instance, with the following tcploop server: tcploop 8000 L Q W N1 A S:"HTTP/1.0 200 OK\r\n\r\n" F K ( Accept -> send response -> FIN -> Close) we can have such strace output: 15:11:21.433005 socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 38 15:11:21.433141 fcntl(38, F_SETFL, O_RDONLY\|O_NONBLOCK) = 0 15:11:21.433233 setsockopt(38, SOL_TCP, TCP_NODELAY, [1], 4) = 0 15:11:21.433359 setsockopt(38, SOL_TCP, TCP_QUICKACK, [0], 4) = 0 15:11:21.433457 connect(38, {sa_family=AF_INET, sin_port=htons(8000), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress) 15:11:21.434215 epoll_ctl(4, EPOLL_CTL_ADD, 38, {events=EPOLLIN\|EPOLLOUT\|EPOLLRDHUP, data={u32=38, u64=38}}) = 0 15:11:21.434468 epoll_wait(4, [{events=EPOLLOUT, data={u32=38, u64=38}}], 200, 21) = 1 15:11:21.434810 recvfrom(38, 0x7f32a83e5020, 16320, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) 15:11:21.435405 sendto(38, "OPTIONS / HTTP/1.0\r\ncontent-leng"..., 41, MSG_DONTWAIT\|MSG_NOSIGNAL, NULL, 0) = 41 15:11:21.435833 epoll_ctl(4, EPOLL_CTL_MOD, 38, {events=EPOLLIN\|EPOLLRDHUP, data={u32=38, u64=38}}) = 0 15:11:21.435907 epoll_wait(4, [{events=EPOLLIN\|EPOLLERR\|EPOLLHUP\|EPOLLRDHUP, data={u32=38, u64=38}}], 200, 17) = 1 15:11:21.436024 recvfrom(38, "HTTP/1.0 200 OK\r\n\r\n", 16320, 0, NULL, NULL) = 19 15:11:21.436189 close(38) = 0 15:11:21.436402 write(2, "[WARNING] (163564) : Server bac"..., 184[WARNING] (163564) : Server back-http/www is DOWN, reason: Socket error, check duration: 5ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue. The response was received, but it is ignored because an error was reported too. The error handling must be refactored. But it a titanic stain. Thus, for now, a good fix is to delay the error report when something was received. The error will be reported on the next receive, if any. This patch should fix the issue #1863, but it must be confirmed. At least it fixes the above example. It must be backported to 2.6. For older versions, it must be evaluated first. (cherry picked from commit dfefebcd7a01faab9988d0aa0811d43bdfc7c665) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-25 09:25:22 +01:00
Aurelien DARRAGON	61882cc68c	BUG/MINOR: http_ana/txn: don't re-initialize txn and req var lists In http_create_txn(): vars_init_head() was performed on both s->vars_txn and s->var_reqres lists. But this is wrong, these two lists are already initialized upon stream creation in stream_new(). Moreover, between stream_new() and http_create_txn(), some variable may be defined (e.g.: by the frontend), resulting in lists not being empty. Because of this "extra" list initialization, already defined variables can be lost. This causes txn dependant code not being able to access previously defined variables as well as memory leak because http_destroy_txn() relies on these lists to perform the purge. This proved to be the case when a frontend sets variables and lua sample fetch is used in backend section as described in GH #1935. Many thanks to Darragh O'Toole for his detailed report. Removing extra var_init_head (x2) in http_create_txn() to fix the issue. Adding somme comments in the code in an attempt to prevent future misuses of s->var_reqres, and s->var_txn lists. It should be backported in every stable version. (This is an old bug that seems to exist since 1.6-dev6) [cf: On 2.0 and 1.8, for the legacy HTTP code, vars_init() are used during the TXN cleanup, when the stream is reused. So, these calls must be moved from http_init_txn() to http_reset_txn() and not removed.] (cherry picked from commit 5ad2b642625b89cdf4f5fd26a598fc480abdc806) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-25 09:25:22 +01:00
Christopher Faulet	9cfdbf2dd7	BUG/MEDIUM: listener: Fix race condition when updating the global mngmt task It is pretty similar to fbb934da90 ("BUG/MEDIUM: stick-table: fix a race condition when updating the expiration task"). When the global management task is running, at the end of its process function, it resets the expire date by setting it to TICK_ETERNITY. In same time, a listener may reach a global limit and decides to schedule the task. Thus it is possible to queue the task and trigger the BUG_ON() on the expire date because its value was set to TICK_ETERNITY in the means time: FATAL: bug condition "task->expire == 0" matched at src/task.c:310 call trace(12): \| 0x662de8 [b8 01 00 00 00 c6 00 00]: __task_queue+0xc7/0x11e \| 0x63b03f [48 b8 04 00 00 00 05 00]: main+0x2535e \| 0x63ed1a [e9 d2 fd ff ff 48 8b 45]: listener_accept+0xf72/0xfda \| 0x6a36d3 [eb 01 90 c9 c3 55 48 89]: sock_accept_iocb+0x82/0x87 \| 0x6af22f [48 8b 05 ca f9 13 00 8b]: fd_update_events+0x35a/0x497 \| 0x42a7a8 [89 45 d8 83 7d d8 02 75]: main-0x1eb539 \| 0x6158fb [48 8b 05 e6 06 1c 00 64]: run_poll_loop+0x2e7/0x319 \| 0x615b6c [48 8b 05 ed 65 1d 00 48]: main-0x175 \| 0x7ffff775bded [e9 69 fe ff ff 48 8b 4c]: libc:+0x8cded \| 0x7ffff77e1370 [48 89 c7 b8 3c 00 00 00]: libc:+0x112370 To fix the bug, a RW lock is introduced. It is used to fix the race condition. A read lock is taken when the task is scheduled, in listener_accpet() and a write lock is used at the end of process function to set the expire date to TICK_ETERNITY. This lock should not be used very often and most of time by "readers". So, the impact should be really limited. This patch should fix the issue #1930. It must be backported as far as 1.8 with some cautions because the code has evolved a lot since then. (cherry picked from commit 13e86d947d1536dbe6d8f4057388727528546ae6) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-25 09:25:22 +01:00
Willy Tarreau	c65d18df9e	BUG/MINOR: pool/cli: use ullong to report total pool usage in bytes As noticed by Gabriel Tzagkarakis in issue #1903, the total pool size in bytes is historically still in 32 bits, but at least we should report the product of the number of objects and their size in 64 bits so that the value doesn't wrap around 4G. This may be backported to all versions. (cherry picked from commit 0c5e9896c79cbec3507894ed33a8553127bc170e) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-25 09:25:22 +01:00
Amaury Denoyelle	566e8d57dc	BUILD: quic: fix dubious 0-byte overflow on qc_release_lost_pkts With GCC 12.2.0 and O2 optimization activated, compiler reports the following warning for qc_release_lost_pkts(). In function ‘quic_tx_packet_refdec’, inlined from ‘qc_release_lost_pkts.constprop’ at src/quic_conn.c:2056:3: include/haproxy/atomic.h:320:41: error: ‘__atomic_sub_fetch_4’ writing 4 bytes into a region of size 0 overflows the destination [-Werror=stringop-overflow=] 320 \| #define HA_ATOMIC_SUB_FETCH(val, i) __atomic_sub_fetch(val, i, __ATOMIC_SEQ_CST) \| ^~~~~~~~~~~~~~~~~~ include/haproxy/quic_conn.h:499:14: note: in expansion of macro ‘HA_ATOMIC_SUB_FETCH’ 499 \| if (!HA_ATOMIC_SUB_FETCH(&pkt->refcnt, 1)) { \| ^~~~~~~~~~~~~~~~~~~ GCC thinks that quic_tx_packet_refdec() can be called with a NULL argument from qc_release_lost_pkts() with <oldest_lost> as arg. This warning is a false positive as <oldest_lost> cannot be NULL in qc_release_lost_pkts() at this stage. This is due to the previous check to ensure that <pkts> list is not empty. This warning is silenced by using ALREADY_CHECKED() macro. This should be backported up to 2.6. This should fix github issue #1852. (cherry picked from commit 3a72ba2aede63e49e53fe22b80e07c9d3c8f72e3) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-25 09:25:22 +01:00
Willy Tarreau	857095c121	BUG/MEDIUM: ring: fix creation of server in uninitialized ring If a "ring" section initialization fails (e.g. due to a duplicate name, invalid chars, or missing memory), any subsequent "server" statement that appears in the same section will crash the config parser by dereferencing the currently NULL cfg_sink. E.g: ring x ring x # fails on "already exists" server srv 1.1.1.1 # crashes on cfg_sink==NULL All other statements have a test for this but "server" was missing it, so this patch adds it. Thanks to Joel Hutchinson for reporting this issue. This must be backported as far as 2.2. (cherry picked from commit 1b662aabbfa32fb6ddeff4ff5f0e3031f12dafd3) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-25 09:25:22 +01:00
Willy Tarreau	abab4bd4d0	DOC: config: fix alphabetical ordering of global section the global section keywords were seriously misordered, and it's visible that some mistakes have induced other ones over time, so it was about time to fix this. Roughly 20% of the keywords were misplaced. This commit only reordered the keywords index and their description, nothing else was changed. It might be backported because it's a real pain to find certain options there. (cherry picked from commit 8e6ad2548ce933ef52113b20f2766d66d16f3e39) [cf: Context adjustment] Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-25 09:25:20 +01:00
Christopher Faulet	5d05aa64de	REG-TESTS: cache: Remove T-E header for 304-Not-Modified responses VTEST does not properly handle 304-Not-Modified responses. If a Transfer-Encoding header (and probably a Content-Lenght header too), it waits for a body. Waiting for a fix, the Transfer-Encoding encoding of cached responses in 2 VTEST scripts are removed. Note it is now an issue because of a fix in the H1 multiplexer : * 226082d13a "BUG/MINOR: mux-h1: Do not send a last null chunk on body-less answers" This patch must be backported with the above commit. (cherry picked from commit a0e1a87948cc1858e82c656d18344051838a4af0) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-18 11:33:37 +01:00
Mickael Torres	611bfc4356	BUG/MINOR: mux-h1: Do not send a last null chunk on body-less answers HEAD answers should not contain any body data. Currently when a "transfer-encoding: chunked" header is returned, a last null-chunk is added to the answer. Some clients choke on it and fail when trying to reuse the connection. Check that the response should not be body-less before sending the null-chunk. This patch should fix #1932. It must be backported as far as 2.4. (cherry picked from commit 226082d13a0d0b83114e933b3a63916b18f9824b) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-18 11:33:37 +01:00
Christopher Faulet	7dbd951a95	BUG/MEDIUM: mux-fcgi: Avoid value length overflow when it doesn't fit at once When the request data are copied in a mbuf, if the free space is too small to copy all data at once, the data length is shortened. When this is performed, we reserve the size of the STDIN recod header and eventually the same for the empty STDIN record if it is the last HTX block of the request. However, there is no test to be sure the free space is large enough. Thus, on this special case, when the mbuf is almost full, it is possible to overflow the value length. Because of this bug, it is possible to experience crashes from time to time. This patch should fix the issue #1923. It must be backported as far as 2.4. (cherry picked from commit 52fd8a1b7b8a5a328cb5f4fabd42d2ca7af78760) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-18 11:33:37 +01:00
Christopher Faulet	df5fcf4146	BUG/MINOR: mux-fcgi: Be sure to send empty STDING record in case of zero-copy When the last HTX DATA block was copied in zero-copy, the empty STDIN record, marking the end of the request data was never sent. Thanks to this patch, it is now sent. This patch must be backported as far as 2.4. (cherry picked from commit e8c7fb35882c453c7d1c2e923a3782a9348d8c5b) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-18 11:33:37 +01:00
Christopher Faulet	24e3037b52	BUG/MINOR: resolvers: Set port before IP address when processing SRV records For a server subject to SRV resolution, when the server's address is set, its dynamic cookie, if any, and its server key are computed. Both are based on the ip/port pair. However, this happens before the server's port is set. Thus the port is equal to 0 at this stage. It is a problem if several servers share the same IP but with different ports because they will share the same dynamic cookie and the same server key, disturbing this way the connection persistency and the session stickiness. This patch must be backported as far as 2.2. (cherry picked from commit 2364b39984e4fd4aa6a88148520d49fe0620c034) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-18 11:33:37 +01:00
Christopher Faulet	98ccf10af1	BUG/MINOR: resolvers: Don't wait periodic resolution on healthcheck failure DNS resoltions may be triggered via a "do-resolve" action or when a connection failure is experienced during a healthcheck. Cached valid responses are used, if possible. But if the entry is expired or if there is no valid response, a new reolution should be performed. However, an resolution is only performed if the "resolve" timeout is expired. Thus, when this comes from a healthcheck, it means no extra resolution is performed at all. Now, when the resolution is performed for a server (SRV or SRVEQ) and no valid response is found, the resolution timer is reset (last_resolution is set to TICK_ETERNITY). Of course, it is only performed if no resolution is already running. Note that this feature was broken 5 years ago when the resolvers code was refactored (67957bd59e). This patch should fix the issue #1906. It affects all stable versions. However, it is probably a good idea to not backport it too far (2.6, maybe 2.4) and with some delay. (cherry picked from commit 68a61b63214d2514f793842b995df31ea46e8fd4) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-18 11:33:37 +01:00
Christopher Faulet	de3a7ba373	BUG/MINOR: http-htx: Fix error handling during parsing http replies When an error is triggered during arguments parsing of an http reply (for instance, from a "return" rule), while a log-format body was expected but not evaluated yet, HAproxy crashes when the body log-format string is released because it was not properly initialized. The list used for the log-format string must be initialized earlier. This patch should fix the issue #1925. It must be backported as far as 2.2. (cherry picked from commit 5a3d9a77e20a66796357627b1dd807a1b22603a5) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-18 11:33:37 +01:00
Aurelien DARRAGON	a7d82157ef	BUG/MEDIUM: wdt/clock: properly handle early task hangs In ae053b30 - BUG/MEDIUM: wdt: don't trigger the watchdog when p is unitialized: wdt is not triggering until prev_cpu_time is initialized to prevent unexpected process termination. Unfortunately this is not enough, some tasks could start immediately after process startup, and in such cases prev_cpu_time could be uninitialized, because prev_cpu_time is set after the polling loop while process_runnable_tasks() is executed before the polling loop. It happens to be the case with lua tasks registered using register_task function from lua script. Those tasks are registered in early init stage of haproxy and they are scheduled to run before the first polling loop, leading to prev_cpu_time being uninitialized (equals 0) on the thread when the task is first executed. Because of this, if such tasks get stuck right away (e.g: blocking IO) the watchdog won't behave as expected and the thread will remain stuck indefinitely. (polling loop for the thread won't run at all as the thread is already stuck) To solve this, we're now making sure that prev_cpu_time is first set before any tasks are processed on the thread. This is done by setting initial prev_cpu_time value directly in clock_init_thread_date() Thanks to Abhijeet Rastogi for reporting this unexpected behavior. It could be backported in every stable versions. (everywhere ae053b30 is, because both are related) (cherry picked from commit 16d6c0cb0935f94859f56ff015743d6c3e77cb60) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-18 11:33:37 +01:00
Willy Tarreau	f41015781b	CI: emit the compiler's version in the build reports Some occasional builds fail only on a specific platform and being able to figure the exact compiler version used there is crucial. It's not easy to guess from the rest of the output, so let's add it before the platform-specific defines, which suit the same needs. (cherry picked from commit a051816c03dacc611b0fd3b8f6247beb6d658abc) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-18 11:33:37 +01:00
Ilya Shipitsin	4b7c0f54eb	CI: enable QUIC for LibreSSL builds since LibreSSL-3.6.x supports QUIC, let us enable it (cherry picked from commit 6397c7c55ff26eb11a7be66fb4d4f033fd47c762) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-17 16:42:10 +01:00
Ilya Shipitsin	a7c11202a8	CI: switch to the "latest" LibreSSL LibreSSL-3.6.0 had some regression, it was fixed in 3.6.1, let us switch back to the latest LibreSSL available (cherry picked from commit 70b2c72687ae6e33d2f81147837d4d7c38ef3b9f) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-17 16:42:05 +01:00
Remi Tricot-Le Breton	09ae90d983	BUG/MINOR: ssl: ocsp structure not freed properly in case of error In case of error, the ocsp item might already be in the ocsp certificate tree but simply freed instead of destroyed through ssl_sock_free_ocsp. This patch can be backported to all stable versions. (cherry picked from commit aa529f776d24ecc84239a97456ed4835b6589350) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-17 16:42:00 +01:00
Remi Tricot-Le Breton	d842a5a47d	BUG/MINOR: ssl: Memory leak of AUTHORITY_KEYID struct when loading issuer When calling ssl_get0_issuer_chain, if akid is not NULL but its keyid is, then the AUTHORITY_KEYID is not freed. This patch can be backported to all stable branches. (cherry picked from commit 1621dc1cc59cf99c8d5e02c0f300f1cd743cafa1) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-17 16:41:55 +01:00
Remi Tricot-Le Breton	6f57618877	BUG/MINOR: ssl: Memory leak of DH BIGNUM fields When running HAProxy with OpenSSLv3, the two BIGNUMs used to build our own DH parameters are not freed. It was not necessary previously because ownership of those parameters was transferred to OpenSSL through the DH_set0_pqg call. This patch should be backported to 2.6. (cherry picked from commit a2c21db155f52089b9474e9a13a8b270f55301b7) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-17 16:41:50 +01:00
Miroslav Zagorac	b0c7490d4e	BUG/MINOR: httpclient: fixed memory allocation for the SSL ca_file The memory for the SSL ca_file was allocated only once (in the function httpclient_create_proxy()) and that pointer was assigned to each created proxy that the HTTP client uses. This would not be a problem if this memory was not freed in each individual proxy when it was deinitialized in the function ssl_sock_free_srv_ctx(). Memory allocation: src/http_client.c, function httpclient_create_proxy(): 1277: if (!httpclient_ssl_ca_file) 1278: httpclient_ssl_ca_file = strdup("@system-ca"); 1280: srv_ssl->ssl_ctx.ca_file = httpclient_ssl_ca_file; Memory deallocation: src/ssl_sock.c, function ssl_sock_free_srv_ctx(): 5613: ha_free(&srv->ssl_ctx.ca_file); This should be backported to version 2.6. (cherry picked from commit a2ec192de38eba294189bf94b76dbb2c4cf9a55b) [cf: Context adjustment] Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-17 16:40:47 +01:00
Ilya Shipitsin	b055edbeeb	CI: add monthly gcc cross compile jobs Build only gcc cross compile jobs are added with monthly run to catch rare errors, mostly 32bit <--> 64bit (cherry picked from commit 5526f922af178dc4b33b9c6e141e7524d7c5539d) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-17 16:36:02 +01:00
Amaury Denoyelle	eaff3aebc6	BUG/MINOR: quic: fix race condition on datagram purging Each datagram is received by a random thread and dispatch to its destination thread linked to the connection. Then, the datagram is handled by the connection thread. Once this is done, datagram buffer pointer is atomically set to NULL to mark it as consumed. Consumed datagrams are purged before recvfrom() invocation on random receiver threads. The check for NULL buffer must thus be done atomically. This was not the case before this patch, which may have triggered race conditions. This bug has been introduced by commit 91b2305ad79bb7086840797b6e98bd791992444f MINOR: quic: implement datagram cleanup for quic_receiver_buf This should be backported up to 2.6 after previously mentionned commit. (cherry picked from commit 0b13e9407173c340d0b8d63c73ff07fdde5e889c) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-17 16:35:41 +01:00
Aurelien DARRAGON	55f37aa5e5	BUG/MINOR: log: fixing bug in tcp syslog_io_handler Octet-Counting syslog_io_handler does specific treatment to handle syslog tcp octet counting: Logic was good, but a sneaky mistake prevented rfc-6587 octet counting from working properly. trash.area was used as an input buffer. It does not make sense here since it is uninitialized. Compilation was unaffected because trash is a thread local "global" variable. buf->area should definitely be used instead. This should be backported as far as 2.4. (cherry picked from commit 7faffdc6abc662470a4706f6c5af7c9c3a8764fb) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-17 16:34:42 +01:00
Amaury Denoyelle	d4c880d649	BUG/MINOR: quic: fix subscribe operation Subscribing was not properly designed between quic-conn and quic MUX layers. Align this as with in other haproxy components : <subs> field is moved from the MUX to the quic-conn structure. All mention of qcc MUX is cleaned up in quic_conn_subscribe()/quic_conn_unsubscribe(). Thanks to this change, ACK reception notification has been simplified. It's now unnecessary to check for the MUX existence before waking it. Instead, if <subs> quic-conn field is set, just wake-up the upper layer tasklet without mentionning MUX. This should probably be extended to other part in quic-conn code. This should be backported up to 2.6. (cherry picked from commit bbb1c68508ceebb98ac4234c906a65a42596e6ea) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-17 16:34:22 +01:00
Amaury Denoyelle	bf2a4b1203	MINOR: quic: remove unnecessary quic_session_accept() A specialized listener accept was previously used for QUIC. This is now unneeded and we can revert to the default one session_accept_fd(). One change of importance is that the call order between conn_xprt_start() and conn_complete_session() is now reverted to the default one. This means that MUX instance is now NULL during qc_xprt_start() and its app-ops layer cannot be set here. This operation has been delayed to qc_init() to prevent a segfault. This should be backported up to 2.6. (cherry picked from commit 0aba11e9e72c6531e131f740135f9d47d2060ca7) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-11-17 16:31:21 +01:00
Willy Tarreau	cf533a19e5	BUG/MEDIUM: stick-table: fix a race condition when updating the expiration task Pierre Cheynier reported a rare crash that can affect stick-tables. When a entry is created, the stick-table's expiration date is updated. But if at exactly the same time the expiration task runs, it finishes by updating its expiration timer without any protection, which may collide with the call to task_queue() in another thread. In this case, it sometimes happens that the first test for TICK_ETERNITY in task_queue() passes, then the "expire" field is reset, then the BUG_ON() triggers, like below: FATAL: bug condition "task->expire == 0" matched at src/task.c:279 call trace(13): \| 0x649d86 [c6 04 25 01 00 00 00 00]: __task_queue+0xc6/0xce \| 0x596bef [eb 90 ba 03 00 00 00 be]: stktable_requeue_exp+0x1ef/0x258 \| 0x596c87 [48 83 bb 90 00 00 00 00]: stktable_touch_with_exp+0x27/0x312 \| 0x563698 [48 8b 4c 24 18 4c 8b 4c]: stream_process_counters+0x3a8/0x6a2 \| 0x569344 [49 8b 87 f8 00 00 00 48]: process_stream+0x3964/0x3b4f \| 0x64a80b [49 89 c7 e9 23 ff ff ff]: run_tasks_from_lists+0x3ab/0x566 \| 0x64ad66 [29 44 24 14 8b 7c 24 14]: process_runnable_tasks+0x396/0x71e \| 0x6184b2 [83 3d 47 b3 a6 00 01 0f]: run_poll_loop+0x92/0x4ff \| 0x618acf [48 8b 1d aa 20 7d 00 48]: main+0x1877ef \| 0x7fc7d6ec1e45 [64 48 89 04 25 30 06 00]: libpthread:+0x7e45 \| 0x7fc7d6c9e4af [48 89 c7 b8 3c 00 00 00]: libc:clone+0x3f/0x5a This one is extremely difficult to reproduce in practice, but adding a printf() in process_table_expire() before assigning the value, while running with an expire delay of 1ms helps a lot and may trigger the crash in less than one minute on a 8-thread machine. Interestingly, depending on the sequencing, this bug could also have made a table fail to expire if the expire field got reset after the last update but before the call to task_queue(). It would require to be quite unlucky so that the table is never touched anymore after the race though. The solution taken by this patch is to take the table's lock when updating its expire value in stktable_requeue_exp(), enclosing the call to task_queue(), and to update the task->expire field while still under the lock in process_table_expire(). Note that thanks to previous changes, taking the table's lock for the update in stktable_requeue_exp() costs almost nothing since we now have the guarantee that this is not done more than 1000 times a second. Since process_table_expire() sets the timeout after returning from stktable_trash_expired() which just released the lock, the two functions were merged so that the task's expire field is updated while still under the lock. Note that this heavily depends on the two previous patches below: CLEANUP: stick-table: remove the unused table->exp_next OPTIM: stick-table: avoid atomic ops in stktable_requeue_exp() when possible This is a bit complicated due to the fact that in 2.7 some parts were made lockless. In 2.6 and older, the second part (the merge of the two functions) will be sufficient since the task_queue() call was already performed under the table's lock, and the patches above are not needed. This needs to be backported as far as 1.8 scrupulously following instructions above. (cherry picked from commit fbb934da9028960f31edf734ab9de08ae799c7d1) [wt: only merged the two functions above to update task->expire under the table's lock; tested and confirmed to fix the problem] Signed-off-by: Willy Tarreau <w@1wt.eu>	2022-11-14 18:31:24 +01:00
Christopher Faulet	4c629b75a3	BUG/MAJOR: stick-table: don't process store-response rules for applets The commit bc7c207f74 ("BUG/MAJOR: stick-tables: do not try to index a server name for applets") tried to catch applets case when we tried to index the server name. However, there is still an issue. The applets are unconditionally casted to servers and this bug exists since a while. it's just luck if it doesn't crash. Now, when store rules are processed, we skip the rule if the stream's target is not a server or, of course, if it is a server but the "non-stick" option is set. However, we still take care to release the sticky session. This patch must be backported to all stable versions. (cherry picked from commit b976640fe178c7084f0db4d443874ebe4746dca8) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 18:28:10 +02:00
William Lallemand	4f60fdd26e	DOC: lua: add a note about compression w/ httpclient Decompression is not supported by the httpclient. (cherry picked from commit a9256194b85b8813019de6df11a196bab4d3f900) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 18:25:56 +02:00
William Lallemand	a9ba03ae89	DOC: management: add forgotten "show startup-logs" The keyword was never documented. Must be backported in all maintained versions. (cherry picked from commit f76b3b47ea0c8fc147d614bf4982d15e7de2aa65) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 18:25:19 +02:00
Ilya Shipitsin	910d89986e	CI: SSL: temporarily stick to LibreSSL=3.5.3 recently released 3.6.0 introduced some regression which must be resolved first, let us use 3.5.3 notation instead of "latest" (cherry picked from commit b65fd666665616b313a2465bf4f71c6d86ceb6f2) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:54:29 +02:00
Ilya Shipitsin	b78d3e8595	CI: SSL: use proper version generating when "latest" semantic is used both "OPENSSL_VERSION=latest" and "LIBRESSL_VERSION=latest" processing introduced errors when build-ssl.sh script was invoked. that error in turn led to skipping custom openssl build and haproxy was linked against stock openssl, i.e. openssl-1.1.1 (cherry picked from commit 14711bdc9a0586fc54df25532b51b8b629dd78ef) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:54:19 +02:00
Christopher Faulet	a7456b2d14	BUG/MINOR: sink: Set default connect/server timeout for implicit ring buffers Ring buffers may be implicitly created from log declarations when "tcp@", "tcp6@", "tcp4@" or "uxst@" prefixes are used. These ring buffers rely on unconfigurable proxies. While connect and server timeouts should be defined for explicit ring buffers, it is no possible for implicit ones. However, a default value must be set and TICK_ETERNITY is not an acceptable one. Thus, now "1s" is set for the connect timeout and "5s" is set for server one. This patch may be backported as far as 2.4. (cherry picked from commit d08a25b1f125eaf79475d8e42f2eb51683bf91a0) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:53:54 +02:00
Christopher Faulet	9c2c1e328f	BUG/MINOR: sink: Only use backend capability for the sink proxies When a ring section is parsed, a proxy is created. For now, it has the frontend (PR_CAP_FE) and the internal (PR_CAP_INT) capabilities, in addition to the expected backend capability (PR_CAP_BE). PR_CAP_INT capability was added to silent warning triggered because of PR_CAP_FE capability. Indeed, Because the proxy is declared as a frontend, warnings about missing bind lines and missing client timeout should be triggered during the configuration parsing. These warnings are inhibited because PR_CAP_INT capability is set. It is an issue on the 2.4 because PR_CAP_INT capability does not exist. So warnings are always emitted. But the true bug is that these proxies should not have PR_CAP_FE and PR_CAP_INT capabilities. Removing these capabilities is enough to remove any warnings on the 2.4, with no regression on higher versions. However, it may be a good idea to eval if a dedicated frontend for sinks should be added or not. This way, a true frontend would be used to start the sink applets. In addition, proxies capabilities/modes have to be reviewed to have a less ambiguous API. For instance a dedicate mode for sinks (PR_MODE_SINK ?) may be added. Finally, it could be very nice to have all proxies in the same list, including internal ones. This patch should fix the issue #1900. It must be backported as far as 2.4. (cherry picked from commit 11a707ae52b28a8bd94cff734b088a5bddcc820c) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:53:51 +02:00
Amaury Denoyelle	9c15bd5d37	MINOR: quic: display unknown error sendto counter on stat page This patch complete the previous incomplete commit. The new counter sendto_err_unknown is now displayed on stats page/CLI show stats. This is related to github issue #1903. This should be backported up to 2.6. (cherry picked from commit 7941ead3aa00c9e83fadf70a1d6d515d20421ad0) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:52:33 +02:00
Amaury Denoyelle	934659e0ce	MINOR: quic: do not crash on unhandled sendto error Remove ABORT_NOW() statement on unhandled sendto error. Instead use a dedicated counter sendto_err_unknown to report these cases. If we detect increment of this counter, strace can be used to detect errno value : $ strace -p $(pidof haproxy) -f -e trace=sendto -Z This should be backported up to 2.6. This should help to debug github issue #1903. (cherry picked from commit 1d9f170eddd8703ba550e91322298e88e8280075) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:52:22 +02:00
Christopher Faulet	126f683a2c	BUG/MEDIUM: compression: handle rewrite errors when updating response headers When an HTTP response is compressed by HAProxy, the headers are updated. However it is possible to encounter a rewrite error because the buffer is full. In this case, the compression is aborted. Thus, we must be sure to leave the response in a valid state. For now, it is an issue because the "Content-Encoding" header is added before all other headers manipulations. So if the compression is aborted on error, the "Content-Encoding" header may remain while the payload is not compressed. So now, we take care to leave with a valid response on error by reordering the headers manipulations. It is too painful to really rollback all changes, especially for an edge case. This patch should be backported as far as 2.0. Note that on the 2.0, the legacy HTTP part is also concerned. (cherry picked from commit 910b7577bce977ee08068f22df7a5d824ae4155c) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:50:59 +02:00
Amaury Denoyelle	9287b4047b	BUG/MINOR: mux-quic: complete flow-control for uni streams Max stream data was not enforced and respect for local/remote uni streams. Previously, qcs instances incorrectly reused the limit defined from bidirectional ones. This is now fixed. Two fields are added in qcc structure connection : * value for local flow control to enforce on remote uni streams * value for remote flow control to respect on local uni streams These two values can be reused to properly initialized msd field of a qcs instance in qcs_new(). The rest of the code is similar. This must be backported up to 2.6. (cherry picked from commit 176174f7e4734ca8d7a27a622be44ec386d36f4c) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:50:54 +02:00
William Lallemand	784987a344	BUILD: Makefile: add "USE_SHM_OPEN" on the linux-musl target The startup-logs with the shm works correctly with Alpine and Musl, enable the feature by default for the linux-musl target. (cherry picked from commit 83e9bcaa87f9484f065ef4a234b6fef22f1e91b6) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:50:28 +02:00
William Lallemand	730251540b	CI: github: dump the backtrace of coredumps in the alpine container This patch allows to show the backtrace of a coredump produced in the alpine/musl jobs. It activates some option required by the containers to allow the production of coredump, set a shared directory so the kernel could dump the coredump within the container. Some debug packages were also added. (cherry picked from commit 6435801d09cac089e705baed57f839495f61a972) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:50:18 +02:00
William Lallemand	3e731f871c	REGTESTS: httpclient/lua: test the lua task timeout with the httpclient Test the httpclient when the lua action timeout. The lua timeout is reached before the httpclient is ended. This test that the httpclient are correctly cleaned when destroying the hlua context. Must be backported as far as 2.5. (cherry picked from commit 4ed0a3a88384a3836cbf162025df7ca41cc4fa5e) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:50:06 +02:00
William Lallemand	b398f7368b	BUG/MEDIUM: httpclient: check if the httpclient was released in the IO handler Upon a applet_release(), the applet can be scheduled again and a call to the IO handler is still possible. When the struct httpclient is already free the IO handler could try to access it. This patch fixes the issue by setting svcctx to NULL in the applet_release, and checking its value in the IO handler. Must be backported as far as 2.5. (cherry picked from commit a93eac41f0e9143daccd852c4a26003b5cd4e036) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:50:03 +02:00
William Lallemand	3fd456abc7	BUG/MEDIUM: httpclient/lua: crash when the lua task timeout before the httpclient When the lua task finished before the httpclient that are associated to it, there is a risk that the httpclient try to task_wakeup() the lua task which does not exist anymore. To fix this issue the httpclient used in a lua task are stored in a list, and the httpclient are destroyed at the end of the lua task. Must be backported in 2.5 and 2.6. (cherry picked from commit bb581423b3ba48dfafb53b70205483f766242a6b) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:49:59 +02:00
Christopher Faulet	244e18af77	BUG/MINOR: ring: Properly parse connect timeout The connect timeout in a ring section was not properly parsed. Thus, it was never set and the server timeout may be overwritten, depending on the directives order. The first char of the keyword must be tested, not the third one. This patch is related to the issue #1900. But it does not fix the issue. It must be backported as far as 2.4. (cherry picked from commit 321d100cc8f86e5713fa7325acb30ef0e962d900) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:49:55 +02:00
Christopher Faulet	64153b11e5	BUG/MINOR: log: Preserve message facility when the log target is a ring buffer When a ring is used as log target, the original facility, if any, must be preserved. The default facility must only be used if there no facility was found in the incoming log message. This patch should fix the issue #1901. It must be backported as far as 2.4. (cherry picked from commit cc640e851aa22e9ba7c4b5155d099457d1e3989f) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:49:48 +02:00
Amaury Denoyelle	05f9c5c266	MINOR: quic: extend Retry token check function On Initial packet reception, token is checked for validity through quic_retry_token_check() function. However, some related parts were left in the parent function quic_rx_pkt_retrieve_conn(). Move this code directly into quic_retry_token_check() to facilitate its call in various context. The API of quic_retry_token_check() has also been refactored. Instead of working on a plain char* buffer, it now uses a quic_rx_packet instance. This helps to reduce the number of parameters. This change will allow to check Retry token even if data were received with a FD-owned quic-conn socket. Indeed, in this case, quic_rx_pkt_retrieve_conn() call will probably be skipped. This should be backported up to 2.6. (cherry picked from commit 9e3026c58dcd4444c462193a9a8f1e1027e4f550) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:49:43 +02:00
Amaury Denoyelle	8bcc6909f4	MINOR: quic: refactor packet drop on reception Sometimes, a packet is dropped on reception. Several goto statements are used, mostly to increment a proxy drop counter or drop silently the packet. However, this labels are interleaved. Re-arrang goto labels to simplify this process : * drop label is used to drop a packet with counter incrementation. This is the default method. * drop_silent is the next label which does the same thing but skip the counter incrementation. This is useful when we do not need to report the packet dropping operation. This should be backported up to 2.6. (cherry picked from commit 6e56a9e05524862a7e415d87d0d46f28a5b9a999) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:49:39 +02:00
Amaury Denoyelle	fa1e984fd0	MINOR: quic: split and rename qc_lstnr_pkt_rcv() This change is the following of qc_lstnr_pkt_rcv() refactoring. This function has finally been split into several ones. The first half is renamed quic_rx_pkt_parse(). This function is responsible to parse a QUIC packet header and calculate the packet length. QUIC connection retrieval has been extracted and is now called directly by quic_lstnr_dghdlr(). The second half of qc_lstnr_pkt_rcv() is renamed to qc_rx_pkt_handle(). This function is responsible to copy a QUIC packet content to a quic-conn receive buffer. A third function named qc_rx_check_closing() is responsible to detect if the connection is already in closing state. As this requires to drop the whole datagram, it seems justified to be in a separate function. This change has no functional impact. It is part of a refactoring series on qc_lstnr_pkt_rcv(). The objective is to facilitate the integration of FD-owned quic-conn socket patches. This should be backported up to 2.6. (cherry picked from commit 982896961ccfb9054c2b527b552293716658dc15) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:49:36 +02:00
Amaury Denoyelle	f7fd2a5d50	MINOR: quic: extract connection retrieval Simplify qc_lstnr_pkt_rcv() by extracting code responsible to retrieve the quic-conn instance. This code is put in a dedicated function named quic_rx_pkt_retrieve_conn(). This new function could be skipped if a FD-owned quic-conn socket is used. The first traces of qc_lstnr_pkt_rcv() have been clean up as qc instance is always NULL here : thus qc parameter can be removed without any change. This change has no functional impact. It is a part of a refactoring series on qc_lstnr_pkt_rcv(). The objective is facilitate integration of FD-owned socket patches. This should be backported up to 2.6. (cherry picked from commit 449b1a8f552ddbf08be707fae1824a79fe862955) Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>	2022-10-25 11:49:32 +02:00

1 2 3 4 5 ...

18356 Commits