haproxy

Author	SHA1	Message	Date
Amaury Denoyelle	caa16549b8	MINOR: quic: notify on send ready This patch completes the previous one with poller subscribe of quic-conn owned socket on sendto() error. This ensures that mux-quic is notified if waiting on sending when a transient sendto() error is cleared. As such, qc_notify_send() is called directly inside socket I/O callback. qc_notify_send() internal conditions have thus been completed. This prevents notifying the upper layer until all sending conditions are fulfilled: room in congestion window and no transient error on socket FD. This should be backported up to 2.7.	2023-03-01 14:32:37 +01:00
Amaury Denoyelle	e1a0ee3cf6	MEDIUM: quic: implement poller subscribe on sendto error On sendto() transient error, prior to this patch sending was simulated and we relied on retransmission to retry sending. This could significantly hurt performance. Thanks to quic-conn owned socket support, it is now possible to improve this. On transient error, sending is interrupted and quic-conn socket FD is subscribed on the poller for sending. When send is possible, quic_conn_sock_fd_iocb() will be in charge of restarting sending. A consequence of this change is on the return value of qc_send_ppkts(). This function will now return 0 on transient error if quic-conn has its own socket. This is used to interrupt sending in the calling function. The flag QUIC_FL_CONN_TO_KILL must be checked to differentiate a fatal error from a transient one. This should be backported up to 2.7.	2023-03-01 14:32:37 +01:00
Amaury Denoyelle	147862de61	MINOR: quic: purge txbuf before preparing new packets Sending is implemented in two parts on quic-conn module. First, QUIC packets are prepared in a buffer and then sendto() is called with this buffer as input. qc.tx.buf is used as the input buffer. It must always be empty before starting to prepare new packets in it. Currently, this is guaranteed by the fact that either sendto() is completed, a fatal error is encountered which prevents future sends, or a transient error is encountered and we rely on retransmission to send the remaining data. This will change when poller subscribe of socket FD on sendto() transient error is implemented. In this case, qc.tx.buf will not be emptied to resume sending when the transient error is cleared. To allow the current sending process to work as expected, a new function qc_purge_txbuf() is implemented. It will try to send remaining data before preparing new packets for sending. If successful, txbuf will be emptied and sending can continue. If not, sending will be interrupted. This should be backported up to 2.7.	2023-03-01 14:29:16 +01:00
Amaury Denoyelle	e0fe118dad	MINOR: quic: implement qc_notify_send() Implement qc_notify_send(). This function is responsible for notifying the upper layer subscribed on SUB_RETRY_SEND if sending conditions are back to normal. For the moment, this patch has no functional change as only congestion window room is checked before notifying the upper layer. However, this will be extended when poller subscribe of socket on sendto() error is implemented. qc_notify_send() will thus be responsible for ensuring that all conditions are met before waking up the upper layer. This should be backported up to 2.7.	2023-03-01 14:29:16 +01:00
Amaury Denoyelle	37333864ef	MINOR: quic: simplify return path in send functions This patch simply cleans up return paths used in various send functions of quic-conn module. This will simplify the implementation of poller subscribing on sendto() error which adds another error handling path. This should be backported up to 2.7.	2023-03-01 14:29:16 +01:00
Oto Valek	fa0413f1c7	REGTEST: added tests covering smp_fetch_hdr_ip() Added new testcases for all 4 branches of smp_fetch_hdr_ip(): - a plain IPv4 address - an IPv4 address with a port number - a plain IPv6 address - an IPv6 address wrapped in [] brackets	2023-03-01 14:10:02 +01:00
Oto Valek	d1773e6881	BUG/MINOR: http-fetch: recognize IPv6 addresses in square brackets in req.hdr_ip() If an IPv6 address is enclosed in square brackets [], trim them before calling inet_pton(). This is to comply with RFC7239 6.1 and RFC3986 3.2.2.	2023-03-01 14:09:46 +01:00
Christopher Faulet	d48bfb6983	BUG/MINOR: http-check: Skip C-L header for empty body when it's not mandatory The Content-Length header is always added into the request for an HTTP health-check. However, when there is no payload, this header may be skipped for OPTIONS, GET, HEAD and DELETE methods. In fact, it is a "SHOULD NOT" in the RFC 9110 (#8.6). It is not really an issue in itself but it seems to be an issue for AWS ELB. It returns a 400-Bad-Request if a HEAD/GET request with no payload contains a Content-Length header. So, it is better to skip this header when possible. This patch should fix the issue #2026. It could be backported as far as 2.2.	2023-02-28 18:51:27 +01:00
Christopher Faulet	0506d9de51	BUG/MINOR: http-check: Don't set HTX_SL_F_BODYLESS flag with a log-format body When the HTTP request of a health-check is forged, we must not pretend there is no payload, by setting HTX_SL_F_BODYLESS, if a log-format body was configured. Indeed, a test on the body length was used but it is only valid for a plain string. For a log-format string, a list is used. Note that it is a bug with no consequence for now. This patch must be backported as far as 2.2.	2023-02-28 18:44:15 +01:00
Christopher Faulet	fb5fff19fe	BUG/MINOR: mux-h1: Don't report an error on an early response close If the response is closed before any data was received, we must not report an error to the SE descriptor. It is important to be able to retry on an empty response. This patch should fix the issue #2061. It must be backported to 2.7.	2023-02-28 18:36:46 +01:00
Christopher Faulet	5e1b0e7bf8	BUG/MEDIUM: connection: Clear flags when a conn is removed from an idle list When a connection is removed from the safe list or the idle list, CO_FL_SAFE_LIST and CO_FL_IDLE_LIST flags must be cleared. It is performed when the connection is reused. But not when it is moved into the toremove_conns list. It may be an issue because the multiplexer owning the connection may be woken up before the connection is really removed. If the connection flags are not sanitized, it may think the connection is idle and reinsert it in the corresponding list. From this point, we can imagine several bugs. A UAF or a connection reused with an invalid state for instance. To avoid any issue, the connection flags are sanitized when an idle connection is moved into the toremove_conns list. The same is performed at the right places in the multiplexers. Especially because the connection release may be delayed (for h2 and fcgi connections). This patch should fix the issue #2057. It must carefully be backported as far as 2.2. Especially on the 2.2 where the code is really different. But some conflicts should be expected on the 2.4 too.	2023-02-28 18:36:29 +01:00
Amaury Denoyelle	4bdd069637	MINOR: quic: consider EBADF as critical on send() EBADF on sendto() is considered as a fatal error. As such, it is removed from the list of the transient errors. The connection will be killed when encountered. For the record, EBADF can be encountered on process termination with the listener socket. This should be backported up to 2.7.	2023-02-28 10:51:25 +01:00
Amaury Denoyelle	1febc2d316	MEDIUM: quic: improve fatal error handling on send Send is conducted through qc_send_ppkts() for a QUIC connection. There are two types of errors which can be encountered on sendto() or affiliated syscalls: * transient error. In this case, sending is simulated with the remaining data and retransmission process is used to have the opportunity to retry emission * fatal error. If this happens, the connection should be closed as soon as possible. This is done via qc_kill_conn() function. Until this patch, only ECONNREFUSED errno was considered as fatal. Modify the QUIC send API to be able to differentiate transient and fatal errors more easily. This is done by fixing the return value of the sendto() wrapper qc_snd_buf(): * on fatal error, a negative error code is returned. This is now the case for every errno except EAGAIN, EWOULDBLOCK, ENOTCONN, EINPROGRESS and EBADF. * on a transient error, 0 is returned. This is the case for the listed errno values above and also if a partial send has been conducted by the kernel. * on success, the return value of sendto() syscall is returned. This commit will be useful to be able to handle transient error with a quic-conn owned socket. In this case, the socket should be subscribed to the poller and no simulated send will be conducted. This commit allows errno management to be confined in the quic-sock module which is a nice cleanup. On a final note, EBADF should be considered as fatal. This will be the subject of a next commit. This should be backported up to 2.7.	2023-02-28 10:51:25 +01:00
Willy Tarreau	7b8aac4439	MINOR: tinfo: make thread_set functions return nth group/mask instead of first thread_set_first_group() and thread_set_first_tmask() were modified and renamed to instead return the number and mask of the nth group. Passing zero continues to return the first one, but it will be more convenient to use this way when building shards.	2023-02-28 10:28:47 +01:00
Willy Tarreau	fea8c19119	CLEANUP: listener: only store conn counts for local threads The listeners have a thr_conn[] array indexed on the thread number that is used during connection redispatching to know what threads are the least loaded. Since we introduced thread groups, and based on the fact that a listener may only belong to one group, there's no point storing counters for all threads, we just need to store them for all threads in the group. Doing so reduces the struct listener from 1500 to 632 bytes. This may be backported to 2.7 to save a bit of resources.	2023-02-28 10:28:47 +01:00
Willy Tarreau	061754b249	BUG/MEDIUM: fd: make fd_delete() support being called from a different group There's currently a problem affecting thread groups. Stopping a listener from a different group than the one that runs this listener will trigger the BUG_ON() in fd_delete(). This typically happens by issuing "disable frontend f" on the CLI for the following config since the CLI runs on group 1: global nbthread 2 thread-groups 2 stats socket /tmp/sock1 level admin frontend f mode http bind abns@frt-sock thread 2 This happens because abns sockets cannot be suspended so here this requires a full stop. A first approach would consist in isolating the caller during such rare operations but it turns out that fd_delete() is not robust against even such calling conditions, because it uses its own thread mask with an FD that may be in a different group, and even though the threads would be isolated and running_mask should be zero, we must not mix thread masks from different groups like this. A better solution consists in replacing the bug condition detection with a self-protection. After all it's not trivial to figure all likely call places, and forcing upper layers to protect the code is not clean if we can do it at the bottom. Thus this is what is being done now. We detect a thread group mismatch, and if so, we forcefully isolate ourselves and entirely clean the socket. This has the merit of being much more robust and easier to use (and harder to misuse). Given that such operations are very rare (actually when they happen a crash follows), it's not a problem to waste some time isolating the caller there. This must be backported to 2.7, along with this previous patch: BUG/MINOR: fd: used the update list from the fd's group instead of tgid	2023-02-27 19:26:42 +01:00
Willy Tarreau	c0f6f5755b	BUG/MINOR: fd: used the update list from the fd's group instead of tgid In _fd_delete_orphan() we try to remove the FD from its update list which is supposed to be the current thread group's. However the function might be called from another group during stopping or under isolation, so FD is not queued in the current group's update list but in its own group's list. Let's retrieve the group from the FD instead of using tgid. This should have no impact on existing code since there is no code path calling fd_delete() under thread isolation for now, and other cases are blocked in fd_delete(). This must be backported to 2.7.	2023-02-27 19:26:41 +01:00
Christopher Faulet	b705622336	DOC: config: Replace TABs by spaces It is just a small cleanup. All TABs were replaced by spaces.	2023-02-27 18:01:42 +01:00
Christopher Faulet	24b319b695	DOC: config: Clarify the meaning of 'hold' in the 'resolvers' section This patch improves the 'hold' parameter description in the 'resolvers' section to make it clearer. It explains the differences between all statuses. Thanks to Nick Ramirez for this update. This patch should solve the issue #1694. It could be backported to all stable versions.	2023-02-27 18:00:02 +01:00
Christopher Faulet	85eabfbf67	MEDIUM: mux-quic: Don't expect data from server as long as request is unfinished As for the H1 and H2 stream, the QUIC stream now states it does not expect data from the server as long as the request is unfinished. The aim is the same. We must be sure to not trigger a read timeout on server side if the client is still uploading data. From the moment the end of the request is received and forwarded to upper layer, the QUIC stream reports it expects to receive data from the opposite endpoint. This re-enables read timeout on the server side.	2023-02-27 17:45:45 +01:00
Christopher Faulet	72722c04b0	MEDIUM: mux-h2: Don't expect data from server as long as request is unfinished As for the H1 stream, the H2 stream now states it does not expect data from the server as long as the request is unfinished. The aim is the same. We must be sure to not trigger a read timeout on server side if the client is still uploading data. From the moment the end of the request is received and forwarded to upper layer, the H2 stream reports it expects to receive data from the opposite endpoint. This re-enables read timeout on the server side.	2023-02-27 17:45:45 +01:00
Christopher Faulet	f4b89f162a	MEDIUM: mux-h1: Don't expect data from server as long as request is unfinished On client side, as long as the request is unfinished, the H1 stream states it does not expect data from the server. It does not mean the server must not send its response but only it may wait to receive the whole request with no risk to trigger a read timeout. When the request is finished, the H1 stream reports it expects to receive data from the opposite endpoint. The purpose of this patch is to never report a server timeout on receive if the client is still uploading data. This way, it is possible to have a smaller server timeout than the client one.	2023-02-27 17:45:45 +01:00
Christopher Faulet	59b240c30c	BUG/MEDIUM: stconn: Report a blocked send if some output data are not consumed Instead of reporting a blocked send if nothing is sent, we do it if some output data remain blocked after a write attempt or after a call to the applet's I/O handler. It is mandatory to properly handle write timeouts. Indeed, if an endpoint is blocked for a while but it partially consumed output data, no timeout is triggered. It is especially true for connections. But the same may happen for applets, there is no reason it could not. Of course, if the endpoint decides to partially consume output data because it must wait to move on for any reason, it should use the se/applet API (se/applet_will_consume(), se/applet_wont_consume() and se/applet_need_more_data()). This bug was introduced during the channels timeouts refactoring. No backport is needed.	2023-02-27 17:45:45 +01:00
Christopher Faulet	8aabc8ebfd	MINOR: stconn: Report a send activity when endpoint is willing to consume data When the endpoint (applet or mux) is now willing to consume data while it said it wouldn't, a send activity is reported. Indeed, the writes were blocked because of the endpoint. It is now ready to consume outgoing data. So a send activity must be reported to reset corresponding timers. Concretely, when the flag SE_FL_WONT_CONSUME is removed, a send activity is reported.	2023-02-27 17:45:45 +01:00
Christopher Faulet	e758b5c703	MEDIUM: stream: Eventually handle stream timeouts when exiting process_stream() When we exit from process_stream(), if the task is expired, we try to handle the stream timeouts and we resync the stream-connectors. This avoids a useless immediate wakeup. It is not really an issue, but it is a small improvement in edge cases.	2023-02-27 17:45:45 +01:00
Christopher Faulet	85e568f594	MINOR: stream: Handle stream's timeouts in a dedicated function This will be mandatory to be able to handle stream's timeouts before exiting process_stream(). So, to not duplicate code, all this stuff is moved in a dedicated function.	2023-02-27 17:45:45 +01:00
Christopher Faulet	3bbd2baab3	BUG/MINOR: stream: Remove BUG_ON about the task expiration in process_stream() At the end of process_stream(), a BUG_ON was recently added to abort if we leave the function with an expired task. However, it may happen if an event prevents the timeout from being handled but nothing evolved. In this case, the task expiration is not updated and we expect to catch the timeout on the immediate task wakeup. No backport needed.	2023-02-27 17:45:45 +01:00
Christopher Faulet	c9ec9bc834	BUG/MEDIUM: h1-htx: Never copy more than the max data allowed during parsing A bug during H1 data parsing may lead to copying more data than the maximum allowed. The bug is an overflow on this max threshold when it is lower than the size of an htx_blk structure. At first glance, it means it is possible to not respect the buffer's reserve. So it may lead to rewrite errors but it may also block any progress on the stream if the compression is enabled. In this case, the channel buffer appears as full and the compression must wait for space to proceed. Outside of any bug, it is only possible when there are outgoing data to forward, so the compression filter just waits. Because of this bug, there is nothing to forward. The buffer is just full of input data. Thus nothing moves and the stream is infinitely blocked. To fix the bug, we must be sure to be able to create an HTX block of 1 byte without exceeding the maximum allowed. This patch should fix the issue #2053. It must be backported as far as 2.5.	2023-02-27 17:45:45 +01:00
Aurelien DARRAGON	e51891a01d	BUG/MEDIUM: fd: avoid infinite loops in fd_add_to_fd_list and fd_rm_from_fd_list With 4d9888c ("CLEANUP: fd: get rid of the __GET_{NEXT,PREV} macros") some "volatile" keywords were dropped at various assignment places in fd's code. In fd_add_to_fd_list() and fd_rm_from_fd_list(), because of the absence of the "volatile" keyword, the compiler was able to perform some code optimizations that prevented prev and next variables from being reloaded between locking attempts (goto loop). The result was that fd_add_to_fd_list() and fd_rm_from_fd_list() could enter infinite loops, preventing other threads from working further and ultimately resulting in the watchdog being triggered as described in GH #2011. To fix this, we made sure to re-audit 4d9888c in order to restore the required memory barriers / compiler hints to prevent the compiler from mis-optimizing the code around the fd's locks. That is: using atomic loads to fetch the prev and next values, and restoring the "volatile" cast for cur_list.ent variable assignment in fd_rm_from_fd_list(). Big thanks to @xanaxalan for his help and patience and to @wtarreau for his findings and explanations in regard to compiler's optimizations. This must be backported in 2.7 with 4d9888c ("CLEANUP: fd: get rid of the __GET_{NEXT,PREV} macros")	2023-02-27 16:55:56 +01:00
Frédéric Lécaille	83540ed429	BUILD: thead: Fix several 32 bits compilation issues with uint64_t variables Cast uint64_t as ullong and difference between two uint64_t as llong.	2023-02-24 09:56:50 +01:00
Sébaastien Gross	2a1bcf1a59	MINOR: config: add HAPROXY_BRANCH environment variable This patch adds support for the HAPROXY_BRANCH environment variable. It can be useful if some resources are loaded from different locations when migrating from one version to another. Signed-off-by: Sébastien Gross <sgross@haproxy.com>	2023-02-24 09:45:44 +01:00
Willy Tarreau	a2a3d5dd25	CLEANUP: ring: remove the now unused ring's offset Since the previous patch, the ring's offset is not used anymore. The haring utility remains backward-compatible since it can trust the buffer element that's at the beginning of the map and which still contains all the valid data.	2023-02-24 09:26:30 +01:00
Willy Tarreau	d9c7188633	MEDIUM: ring: make the offset relative to the head/tail instead of absolute The ring's offset currently contains a perpetually growing cursor which is the number of bytes written from the start. It's used by readers to know where to (re)start reading from. It was made absolute because both the head and the tail can change during writes and we needed a fixed position to know where the reader was attached. But this is complicated, error-prone, and limits the ability to reduce the lock's coverage. In fact what is needed is to know where the reader is currently waiting, if at all. And this location is exactly where it stored its count, so the absolute position in the buffer (the seek offset from the first storage byte) does represent exactly this, as it doesn't move (we don't realign the buffer), and is stable regardless of how head/tail changes with writes. This patch modifies this so that the application code now uses this representation instead. The most noticeable change is the initialization, where we've kept ~0 as a marker to go to the end, and it's now set to the tail offset instead of trying to resolve the current write offset against the current ring's position. The offset was also used at the end of the consuming loop, to detect if a new write had happened between the lock being released and taken again, so as to wake the consumer(s) up again. For this we used to take a copy of the ring->ofs before unlocking and comparing with the new value read in the next lock. Since it's not possible to write past the current reader's location, there's no risk of complete rollover, so it's sufficient to check if the tail has changed. Note that the change also has an impact on the haring consumer which needs to adapt as well. But that's good in fact because it will rely on one less variable, and will use offsets relative to the buffer's head, and the change remains backward-compatible.	2023-02-24 09:26:30 +01:00
Willy Tarreau	d0d85d2e36	BUG/MINOR: ring: do not realign ring contents on resize If a ring is resized, we must not zero its head since the contents are preserved in-situ. Till now it used to work because we only resize during boot and we emit very few data (if at all) during boot. But this can change in the future. This can be backported to 2.2 though no older version should notice a difference.	2023-02-24 09:26:30 +01:00
Frédéric Lécaille	b7a13be6cd	BUILD: quic: 32-bits compilation issue with %zu in quic_rx_pkts_del() This issue arrived with this commit: 1dbeb35f8 MINOR: quic: Add new traces about by connection RX buffer handling and revealed by the GH CI as follows: src/quic_conn.c: In function ‘quic_rx_pkts_del’: include/haproxy/trace.h:134:65: error: format ‘%zu’ expects argument of type ‘size_t’, but argument 6 has type ‘uint64_t’ {aka ‘long long unsigned int’} [-Werror=format=] _msg_len = snprintf(_msg, sizeof(_msg), (fmt), ##args); Replace all %zu printf integer format by %llu. Must be backported to 2.7 where the previous is supposed to be backported.	2023-02-24 09:23:07 +01:00
Aurelien DARRAGON	28a6d48a60	MINOR: haproxy: always protocol unbind on startup error path In haproxy startup, all init error paths after the protocol bind step cautiously call protocol_unbind_all() before exiting except one that was conditional. We're not making an exception to the rule and we now properly call protocol_unbind_all() as well. No backport needed as this patch is unnoticeable.	2023-02-23 15:05:05 +01:00
Aurelien DARRAGON	de63efba5a	MINOR: proto_ux: ability to dump ABNS names in error messages In sock_unix_bind_receiver(), uxst_bind_listener() and uxdg_bind_listener(), properly dump ABNS socket names by leveraging sa2str() function which does the hard work for us. UNIX sockets are reported as is (unchanged) while ABNS UNIX sockets are prefixed with 'abns@' to match the syntax used in config file. (they were previously showing as empty strings because of the leading NULL-byte that was not properly handled in this case) This is only a minor debug improvement, however it could be useful to backport it up to 2.4. [for 2.4: you should replace "%s [%s]" by "%s for [%s]" for uxst and uxdg if you want the patch to apply properly]	2023-02-23 15:05:05 +01:00
Aurelien DARRAGON	2338dba18d	MEDIUM: proto_ux: properly suspend named UNIX listeners When a listener is suspended, we expect that it may not process any data for the time it is suspended. Yet for named UNIX socket, as the suspend operation is a no-op at the proto level, recv events on the socket may still be processed by the polling loop. This is quite disturbing as someone may rely on a paused proxy being harmless, which is true for all protos except for named UNIX sockets. To fix this behavior, we explicitly disable io recv events when suspending a named UNIX socket listener (we call disable() method on the listener). The io recv events will automatically be restored when the listener is resumed since the l->enable() method is called at the end of the resume() operation. This could be backported up to 2.4 after a reasonable observation period to make sure that this change doesn't cause unwanted side-effects.	2023-02-23 15:05:05 +01:00
Aurelien DARRAGON	2a7903bbb2	BUG/MINOR: sock_unix: match finalname with tempname in sock_unix_addrcmp() In sock_unix_addrcmp(), named UNIX sockets paths are manually compared in order to properly handle tempname paths (ending with ".XXXX.tmp") that result from the 2-step bind implemented in sock_unix_bind_receiver(). However, this logic does not take into account "final" path names (without the ".XXXX.tmp" suffix). Example: /tmp/test did not match with /tmp/test.1288.tmp prior to this patch Indeed, depending on how the socket addr is retrieved, the same socket could be designated either by its tempname or finalname. socket addr is normally stored with its finalname within a receiver, but a call to getsockname() on the same socket will return the tempname that was used for the bind() call (sock_get_old_sockets() depends on getsockname()). This causes sock_find_compatible_fd() to malfunction with named UNIX sockets (ie: haproxy -x CLI option). To fix this, we slightly modify the check around the temp suffix in sock_unix_addrcmp(): we perform the suffix check even if one of the paths is lacking the temp suffix (with proper precautions). Now the function is able to match: - finalname x finalname - tempname x tempname - finalname x tempname That is: /tmp/test == /tmp/test.1288.tmp == /tmp/test.X.tmp It should be backported up to 2.4	2023-02-23 15:05:05 +01:00
Aurelien DARRAGON	ca8a4b2966	BUG/MEDIUM: listener/proxy: fix listeners notify for proxy resume In 58651b42f ("MEDIUM: listener/proxy: make the listeners notify about proxy pause/resume") we introduced the logic for pause/resume notify using li_ready for pause and li_paused for resume. Unfortunately, relying on li_paused for resume doesn't work reliably if we resume a listener which is only made of receivers that are completely stopped. For example, this could happen with receivers that don't support the LI_PAUSED state like ABNS sockets. This is especially true since pause_listener() was renamed to suspend_listener() to better reflect its actual behavior in ("MINOR: listener: pause_listener() becomes suspend_listener()"). To fix this, we now rely on the li_suspended state in resume_listener() to make sure that suspend_listener() and resume_listener() notify messages are consistent with each other: "Proxy pause" is triggered when there are no more ready listeners. "Proxy resume" is triggered when there are no more suspended listeners. Also, we make use of the new PR_FL_PAUSED proxy flag to make sure we don't report the same event twice. This could be backported up to 2.4 after a reasonable observation period to make sure that this change doesn't cause unwanted side-effects. 
-- Backport notes: This commit depends on: - "MINOR: listener: pause_listener() becomes suspend_listener()" -> 2.4 only, as "MINOR: proxy/listener: support for additional PAUSED state" was not backported: Replace this: |+ if (px && !(px->flags & PR_FL_PAUSED) && !px->li_ready) { | /* PROXY_LOCK is required */ | proxy_cond_pause(px); | ha_warning("Paused %s %s.\n", proxy_cap_str(px->cap), px->id); By this: |+ if (px && !px->li_ready) { | ha_warning("Paused %s %s.\n", proxy_cap_str(px->cap), px->id); | send_log(px, LOG_WARNING, "Paused %s %s.\n", proxy_cap_str(px->cap), px->id); | } And this: |+ if (px && (px->flags & PR_FL_PAUSED) && !px->li_suspended) { | /* PROXY_LOCK is required */ | proxy_cond_resume(px); | ha_warning("Resumed %s %s.\n", proxy_cap_str(px->cap), px->id); By this: |+ if (px && !px->li_suspended) { | ha_warning("Resumed %s %s.\n", proxy_cap_str(px->cap), px->id); | send_log(px, LOG_WARNING, "Resumed %s %s.\n", proxy_cap_str(px->cap), px->id); | }	2023-02-23 15:05:05 +01:00
Aurelien DARRAGON	d3ffba4512	MINOR: listener: pause_listener() becomes suspend_listener() We are simply renaming pause_listener() to suspend_listener() to prevent confusion around listener pausing. A suspended listener can be in two different valid states: - LI_PAUSED: the listener is effectively paused, it will unpause on resume_listener() - LI_ASSIGNED (not bound): the listener does not support the LI_PAUSED state, so it was unbound to satisfy the suspend request, it will correctly re-bind on resume_listener() Besides that, we add the LI_F_SUSPENDED flag to mark suspended listeners in suspend_listener() and unmark them in resume_listener(). We're also adding li_suspended proxy variable to track the number of currently suspended listeners: That is, the number of listeners that were suspended through suspend_listener() and that are either in LI_PAUSED or LI_ASSIGNED state. Counter is increased on successful suspend in suspend_listener() and it is decreased on successful resume in resume_listener() -- Backport notes: -> 2.4 only, as "MINOR: proxy/listener: support for additional PAUSED state" was not backported: Replace this: | /* PROXY_LOCK is require | proxy_cond_resume(px); By this: | ha_warning("Resumed %s %s.\n", proxy_cap_str(px->cap), px->id); | send_log(px, LOG_WARNING, "Resumed %s %s.\n", proxy_cap_str(px->cap), px->id); -> 2.6 and 2.7 only, as "MINOR: listener: make sure we don't pause/resume" was custom patched: Replace this: |@@ -253,6 +253,7 @@ struct listener { | | /* listener flags (16 bits) */ | #define LI_F_FINALIZED 0x0001 /* listener made it to the READY||LIMITED||FULL state at least once, may be suspended/resumed safely */ |+#define LI_F_SUSPENDED 0x0002 /* listener has been suspended using suspend_listener(), it is either is LI_PAUSED or LI_ASSIGNED state */ | | /* Descriptor for a "bind" keyword. The ->parse() function returns 0 in case of | * success, or a combination of ERR_* flags if an error is encountered. The
By this: |@@ -222,6 +222,7 @@ struct li_per_thread { | | #define LI_F_QUIC_LISTENER 0x00000001 /* listener uses proto quic */ | #define LI_F_FINALIZED 0x00000002 /* listener made it to the READY||LIMITED||FULL state at least once, may be suspended/resumed safely */ |+#define LI_F_SUSPENDED 0x00000004 /* listener has been suspended using suspend_listener(), it is either is LI_PAUSED or LI_ASSIGNED state */ | | /* The listener will be directly referenced by the fdtab[] which holds its | * socket. The listener provides the protocol-specific accept() function to	2023-02-23 15:05:05 +01:00
Aurelien DARRAGON	046a75e131	BUG/MEDIUM: resume from LI_ASSIGNED in default_resume_listener() Since fc974887c ("MEDIUM: protocol: explicitly start the receiver before the listener"), resume from LI_ASSIGNED state does not work anymore. This is because the binding part has been divided into 2 distinct steps since: first bind(), then listen(). This new logic was properly implemented in startup sequence through protocol_bind_all() but wasn't properly carried over to the default_resume_listener() function. Fixing default_resume_listener() to comply with the new logic. This should help ABNS sockets to properly rebind in resume_listener() after they have been stopped by pause_listener(): See Redmine:4475 for more context. This commit depends on: - "MINOR: listener: workaround for closing a tiny race between resume_listener() and stopping" - "MINOR: listener: make sure we don't pause/resume bypassed listeners" This could be backported up to 2.4 after a reasonable observation period to make sure that this change doesn't cause unwanted side-effects.	2023-02-23 15:05:05 +01:00
Aurelien DARRAGON	3bb2a38f01	BUG/MINOR: listener: fix resume_listener() resume return value handling In resume_listener(), proto->resume() errors were not properly handled: the function kept flowing down as if no errors were detected. Instead, we're performing an early return when such errors are detected to prevent undefined behaviors. This could be backported up to 2.4. -- Backport notes: This commit depends on: - "MINOR: listener: make sure we don't pause/resume bypassed listeners" -> 2.4 ... 2.7: Replace this: \| if (l->bind_conf->maxconn && l->nbconn >= l->bind_conf->maxconn) { \| l->rx.proto->disable(l); By this: \| if (l->maxconn && l->nbconn >= l->maxconn) { \| l->rx.proto->disable(l);	2023-02-23 15:05:05 +01:00
Aurelien DARRAGON	7a15fa58b1	BUG/MEDIUM: listener: fix pause_listener() suspend return value handling Legacy suspend() return value handling in pause_listener() has been altered over time. First with fb76bd5ca ("BUG/MEDIUM: listeners: correctly report pause() errors") Then with e03204c8e ("MEDIUM: listeners: implement protocol level ->suspend/resume() calls") We aim to restore original function behavior and comply with resume_listener() function description. This is required for resume_listener() and pause_listener() to work as a whole. Now, it is made explicit that pause_listener() may stop a listener if the listener doesn't support the LI_PAUSED state (depending on the protocol family, i.e. ABNS sockets), in this case l->state will be set to LI_ASSIGNED and this won't be considered as an error. This could be backported up to 2.4 after a reasonable observation period to make sure that this change doesn't cause unwanted side-effects. -- Backport notes: This commit depends on: - "MINOR: listener: make sure we don't pause/resume bypassed listeners" -> 2.4: manual change required because "MINOR: proxy/listener: support for additional PAUSED state" was not backported: the contextual patch lines don't match. Replace this: \| if (px && !px->li_ready) { \| /* PROXY_LOCK is required */ By this: \| if (px && !px->li_ready) { \| ha_warning("Paused %s %s.\n", proxy_cap_str(px->cap), px->id);	2023-02-23 15:05:05 +01:00
Aurelien DARRAGON	2370599f96	MINOR: listener: make sure we don't pause/resume bypassed listeners Some listeners are kept in LI_ASSIGNED state but are not supposed to be started since they were bypassed on initial startup (eg: in protocol_bind_all() or in enable_listener()...) Introduce the LI_F_FINALIZED flag: when the variable is non zero it means that the listener made it past the LI_LISTEN state (finalized) at least once so we can safely pause / resume. This way we won't risk starting a previously bypassed listener which never made it that far and thus was not expected to be lazy-started by accident. As listener_pause() and listener_resume() are currently partially broken, such unexpected lazy-start won't happen. But we're trying to restore pause() and resume() behavior so this patch will be required before going any further. We had to re-introduce listeners 'flags' struct member since it was recently moved into bind_conf struct. But here we do have a legitimate need for these listener-only flags. This should only be backported if explicitly required by another commit. -- Backport notes: -> 2.4 and 2.5: The 2-bytes hole we're using in the current patch does not apply, let's use the 4-byte hole located under the 'option' field. 
Replace this: \|@@ -226,7 +226,8 @@ struct li_per_thread { \| struct listener { \| enum obj_type obj_type; /* object type = OBJ_TYPE_LISTENER / \| enum li_state state; / state: NEW, INIT, ASSIGNED, LISTEN, READY, FULL / \|- / 2-byte hole here / \|+ uint16_t flags; / listener flags: LI_F_* / \| int luid; / listener universally unique ID, used for SNMP / \| int nbconn; / current number of connections on this listener / \| unsigned int thr_idx; / thread indexes for queue distribution : (t2<<16)+t1 / By this: \|@@ -209,6 +209,8 @@ struct listener { \| short int nice; / nice value to assign to the instantiated tasks / \| int luid; / listener universally unique ID, used for SNMP / \| int options; / socket options : LI_O_* / \|+ uint16_t flags; / listener flags: LI_F_* / \|+ / 2-bytes hole here / \| __decl_thread(HA_RWLOCK_T lock); \| \| struct fe_counters counters; /* statistics counters / -> 2.4 only: We need to adjust some contextual lines. Replace this: \|@@ -477,7 +478,7 @@ int pause_listener(struct listener l, int lpx, int lli) \| if (!lli) \| HA_RWLOCK_WRLOCK(LISTENER_LOCK, &l->lock); \| \|- if (l->state <= LI_PAUSED) \|+ if (!(l->flags & LI_F_FINALIZED) \|\| l->state <= LI_PAUSED) \| goto end; \| \| if (l->rx.proto->suspend) By this: \|@@ -477,7 +478,7 @@ int pause_listener(struct listener l, int lpx, int lli) \| !(proc_mask(l->rx.settings->bind_proc) & pid_bit)) \| goto end; \| \|- if (l->state <= LI_PAUSED) \|+ if (!(l->flags & LI_F_FINALIZED) \|\| l->state <= LI_PAUSED) \| goto end; \| \| if (l->rx.proto->suspend) And this: \|@@ -535,7 +536,7 @@ int resume_listener(struct listener l, int lpx, int lli) \| if (MT_LIST_INLIST(&l->wait_queue)) \| goto end; \| \|- if (l->state == LI_READY) \|+ if (!(l->flags & LI_F_FINALIZED) \|\| l->state == LI_READY) \| goto end; \| \| if (l->rx.proto->resume) By this: \|@@ -535,7 +536,7 @@ int resume_listener(struct listener l, int lpx, int lli) \| !(proc_mask(l->rx.settings->bind_proc) & pid_bit)) \| goto end; \| \|- if 
(l->state == LI_READY) \|+ if (!(l->flags & LI_F_FINALIZED) \|\| l->state == LI_READY) \| goto end; \| \| if (l->rx.proto->resume) -> 2.6 and 2.7 only: struct listener 'flags' member still exists, let's use it. Remove this from the current patch: \|@@ -226,7 +226,8 @@ struct li_per_thread { \| struct listener { \| enum obj_type obj_type; / object type = OBJ_TYPE_LISTENER / \| enum li_state state; / state: NEW, INIT, ASSIGNED, LISTEN, READY, FULL / \|- / 2-byte hole here / \|+ uint16_t flags; / listener flags: LI_F_* / \| int luid; / listener universally unique ID, used for SNMP / \| int nbconn; / current number of connections on this listener / \| unsigned int thr_idx; / thread indexes for queue distribution : (t2<<16)+t1 / Then, replace this: \|@@ -251,6 +250,9 @@ struct listener { \| EXTRA_COUNTERS(extra_counters); \| }; \| \|+/ listener flags (16 bits) / \|+#define LI_F_FINALIZED 0x0001 / listener made it to the READY\|\|LIMITED\|\|FULL state at least once, may be suspended/resumed safely / \|+ \| / Descriptor for a "bind" keyword. The ->parse() function returns 0 in case of \| * success, or a combination of ERR_* flags if an error is encountered. The \| * function pointer can be NULL if not implemented. The function also has an By this: \|@@ -221,6 +221,7 @@ struct li_per_thread { \| }; \| \| #define LI_F_QUIC_LISTENER 0x00000001 /* listener uses proto quic / \|+#define LI_F_FINALIZED 0x00000002 / listener made it to the READY\|\|LIMITED\|\|FULL state at least once, may be suspended/resumed safely / \| \| / The listener will be directly referenced by the fdtab[] which holds its \| * socket. The listener provides the protocol-specific accept() function to	2023-02-23 15:05:05 +01:00
Aurelien DARRAGON	f5d98938ad	MINOR: listener: workaround for closing a tiny race between resume_listener() and stopping This is an alternative fix that tries to address the same issue as d1ebee177 ("BUG/MINOR: listener: close tiny race between resume_listener() and stopping") while allowing resume_listener() to be more versatile. Indeed, because of the previous fix, resume_listener() is not able to rebind stopped listeners, and this breaks the original behavior that is documented in the function description: "If the listener was only in the assigned state, it's totally rebound. This can happen if a pause() has completely stopped it. If the resume fails, 0 is returned and an error might be displayed." With relax_listener(), we now make sure to check l->state under the listener lock so we don't call resume_listener() when the conditions are not met. As such, concurrently stopped listeners may not be rebound using relax_listener(). Note: the documented race can't happen since 1b927eb3c ("MEDIUM: proto: stop protocols under thread isolation during soft stop"), but older versions are concerned as 1b927eb3c was not marked for backports. Moreover, the patch also prevents the race between protocol_pause_all() and resuming from LIMITED or FULL states. This commit depends on: - "MINOR: listener: add relax_listener() function" This should be backported with d1ebee177 up to 2.4 (d1ebee177 is marked to be backported for all stable versions but the current patch does not apply for versions < 2.4)	2023-02-23 15:05:05 +01:00
Aurelien DARRAGON	bcad7e6319	MINOR: listener: add relax_listener() function There is a need for a small difference between resuming and relaxing a listener. When resuming, we expect that the listener may completely resume, this includes unpausing or rebinding if required. Resuming a listener is a best-effort operation: no matter the current state, try our best to bring the listener up to the LI_READY state. There are some cases where we only want to "relax" listeners that were previously restricted using limit_listener() or listener_full() functions. Here we don't want to resuscitate listeners, we're simply interested in cancelling out the previous restriction. To this day, listener_resume() on an unbound listener is broken, that's why the need for this wasn't felt yet. But we're trying to restore historical listener_resume() behavior, so we better prepare for this by introducing an explicit relax_listener() function that only does what is expected in such cases. This commit depends on: - "MINOR: listener/api: add lli hint to listener functions"	2023-02-23 15:05:05 +01:00
Aurelien DARRAGON	4059e094db	MINOR: listener/api: add lli hint to listener functions Add listener lock hint (AKA lli) to (stop/resume/pause)_listener() functions. All these functions implicitly take the listener lock when they are called: It could be useful to be able to call them while already holding the lock, so we're adding lli hint to make them take the lock only when it is missing. This should only be backported if explicitly required by another commit -- -> 2.4 and 2.5 common backport notes: These 2 commits need to be backported first: - 187396e34 "CLEANUP: listener: function comment typo in stop_listener()" - a57786e87 "BUG/MINOR: listener: null pointer dereference suspected by coverity" -> 2.4 special backport notes: In addition to the previously mentioned dependencies, the patch needs to be slightly adapted to match the corresponding contextual lines: Replace this: \|@@ -471,7 +474,8 @@ int pause_listener(struct listener l, int lpx) \| if (!lpx && px) \| HA_RWLOCK_WRLOCK(PROXY_LOCK, &px->lock); \| \|- HA_RWLOCK_WRLOCK(LISTENER_LOCK, &l->lock); \|+ if (!lli) \|+ HA_RWLOCK_WRLOCK(LISTENER_LOCK, &l->lock); \| \| if (l->state <= LI_PAUSED) \| goto end; By this: \|@@ -471,7 +474,8 @@ int pause_listener(struct listener l, int lpx) \| if (!lpx && px) \| HA_RWLOCK_WRLOCK(PROXY_LOCK, &px->lock); \| \|- HA_RWLOCK_WRLOCK(LISTENER_LOCK, &l->lock); \|+ if (!lli) \|+ HA_RWLOCK_WRLOCK(LISTENER_LOCK, &l->lock); \| \| if ((global.mode & (MODE_DAEMON \| MODE_MWORKER)) && \| !(proc_mask(l->rx.settings->bind_proc) & pid_bit)) Replace this: \|@@ -169,7 +169,7 @@ void protocol_stop_now(void) \| HA_SPIN_LOCK(PROTO_LOCK, &proto_lock); \| list_for_each_entry(proto, &protocols, list) { \| list_for_each_entry_safe(listener, lback, &proto->receivers, rx.proto_list) \|- stop_listener(listener, 0, 1); \|+ stop_listener(listener, 0, 1, 0); \| } \| HA_SPIN_UNLOCK(PROTO_LOCK, &proto_lock); \| } By this: \|@@ -169,7 +169,7 @@ void protocol_stop_now(void) \| HA_SPIN_LOCK(PROTO_LOCK, 
&proto_lock); \| list_for_each_entry(proto, &protocols, list) { \| list_for_each_entry_safe(listener, lback, &proto->receivers, rx.proto_list) \| if (!listener->bind_conf->frontend->grace) \|- stop_listener(listener, 0, 1); \|+ stop_listener(listener, 0, 1, 0); \| } \| HA_SPIN_UNLOCK(PROTO_LOCK, &proto_lock); Replace this: \|@@ -2315,7 +2315,7 @@ void stop_proxy(struct proxy p) \| HA_RWLOCK_WRLOCK(PROXY_LOCK, &p->lock); \| \| list_for_each_entry(l, &p->conf.listeners, by_fe) \|- stop_listener(l, 1, 0); \|+ stop_listener(l, 1, 0, 0); \| \| if (!(p->flags & (PR_FL_DISABLED\|PR_FL_STOPPED)) && !p->li_ready) { \| / might be just a backend / By this: \|@@ -2315,7 +2315,7 @@ void stop_proxy(struct proxy p) \| HA_RWLOCK_WRLOCK(PROXY_LOCK, &p->lock); \| \| list_for_each_entry(l, &p->conf.listeners, by_fe) \|- stop_listener(l, 1, 0); \|+ stop_listener(l, 1, 0, 0); \| \| if (!p->disabled && !p->li_ready) { \| /* might be just a backend */	2023-02-23 15:05:05 +01:00
Aurelien DARRAGON	9c3214c7b4	MINOR: proto_uxst: add resume method resume method was not explicitly defined for uxst protocol family. Here we can safely use the default_resume_listener, just like the uxdg family. This could be backported up to 2.4.	2023-02-23 15:05:05 +01:00
Aurelien DARRAGON	8429627e3c	BUG/MINOR: protocol: fix minor memory leak in protocol_bind_all() In protocol_bind_all() (involved in startup sequence): We only free errmsg (set by fam->bind() attempt) when we make use of it. But this could lead to some memory leaks because there are some cases where we ignore the error message (e.g.: verbose=0 with ERR_WARN messages). As long as errmsg is set, we should always free it. As mentioned earlier, this really is a minor leak because it can only occur under specific conditions (error paths) during the startup phase. This may be backported up to 2.4. -- Backport notes: -> 2.4 only: Replace this: \| ha_warning("Binding [%s:%d] for %s %s: %s\n", \| listener->bind_conf->file, listener->bind_conf->line, \| proxy_type_str(px), px->id, errmsg); By this: \| else if (lerr & ERR_WARN) \| ha_warning("Starting %s %s: %s\n", \| proxy_type_str(px), px->id, errmsg);	2023-02-23 15:05:05 +01:00
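
The suspend/resume behavior described across the listener commits above can be pictured with a small model. The following is only a sketch under stated assumptions, not HAProxy's code: the `*_model` types and functions, the `supports_pause` field and the reduced three-state enum are hypothetical names introduced for illustration. Only the state names (LI_ASSIGNED, LI_PAUSED, LI_READY), the flag names (LI_F_FINALIZED, LI_F_SUSPENDED) and the li_suspend counter come from the commit messages, and the flag values are the 16-bit ones quoted there.

```c
#include <assert.h>

/* Hypothetical, reduced state set; the real enum has more states. */
enum li_state { LI_ASSIGNED, LI_PAUSED, LI_READY };

#define LI_F_FINALIZED 0x0001 /* reached a running state at least once */
#define LI_F_SUSPENDED 0x0002 /* suspended through the suspend call */

struct proxy_model { int li_suspend; }; /* count of suspended listeners */

struct listener_model {
    enum li_state state;
    unsigned int flags;
    int supports_pause;        /* assumption: 0 for families such as ABNS */
    struct proxy_model *px;
};

/* Suspend: pause when the family supports it, otherwise fall back to
 * the unbound LI_ASSIGNED state. Neither outcome is an error here. */
static int suspend_listener_model(struct listener_model *l)
{
    /* Bypassed (never finalized) or already suspended: nothing to do. */
    if (!(l->flags & LI_F_FINALIZED) || l->state <= LI_PAUSED)
        return 1;
    l->state = l->supports_pause ? LI_PAUSED : LI_ASSIGNED;
    l->flags |= LI_F_SUSPENDED;
    l->px->li_suspend++;
    return 1;
}

/* Resume: bring a finalized listener back to LI_READY, clearing the
 * suspended mark and decreasing the counter on success. */
static int resume_listener_model(struct listener_model *l)
{
    if (!(l->flags & LI_F_FINALIZED) || l->state == LI_READY)
        return 1;
    l->state = LI_READY; /* stands in for unpausing or rebinding */
    if (l->flags & LI_F_SUSPENDED) {
        l->flags &= ~LI_F_SUSPENDED;
        l->px->li_suspend--;
    }
    return 1;
}

/* Small walk-through; returns the final li_suspend count. */
static int demo_suspend_resume(void)
{
    struct proxy_model px = { 0 };
    struct listener_model abns = { LI_READY, LI_F_FINALIZED, 0, &px };
    struct listener_model bypassed = { LI_ASSIGNED, 0, 1, &px };

    suspend_listener_model(&abns);
    assert(abns.state == LI_ASSIGNED && px.li_suspend == 1);

    resume_listener_model(&bypassed); /* guard keeps it unstarted */
    assert(bypassed.state == LI_ASSIGNED);

    resume_listener_model(&abns);
    assert(abns.state == LI_READY);
    return px.li_suspend;
}
```

Calling `demo_suspend_resume()` from a `main()` walks through the two cases the messages single out: a listener without pause support lands in LI_ASSIGNED and is counted as suspended, while a bypassed listener that never set LI_F_FINALIZED is left alone by resume.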


19609 Commits