haproxy

Author	SHA1	Message	Date
William Lallemand	6640dbb524	MINOR: cli: declare the CLI frontend as an internal proxy Declare the CLI frontend as an internal proxy so we can check the PR_CAP_INT flag instead of the global.fe_cli pointer for generic use cases.	2021-08-13 15:34:10 +02:00
Emeric Brun	bc5c821cc2	BUG/MEDIUM: cfgcheck: verify existing log-forward listeners during config check User reported that the config check returns an error with the message: "Configuration file has no error but will not start (no listener) => exit(2)." if the configuration presents only a log-forward section with bind or dgram-bind listeners but no listen/backend nor peer sections. The process checked if there was a 'peers' section available with an internal frontend (and so a listener) or a 'listen/backend' section not disabled with at least one configured listener (into the global proxies_list). Since the log-forward proxies appear in a different list, they were not checked. This patch adds a lookup on the 'log-forward' proxies list to check if one of them presents a listener and is not disabled. And this is done only if there was no available listener found into 'listen/backend' sections. I have also studied how to re-work this check considering the 'listeners' counter used after startup/init to keep the same algo and avoid further mistakes but currently this counter seems increased during config parsing and if a proxy is disabled, decreased during startup/init which is done after the current config check. So the fix still does not rely on this counter. This patch should fix the github issue #1346. This patch should be backported as far as 2.3 (so on branches including the "log-forward" feature)	2021-08-13 11:21:57 +02:00
Christopher Faulet	c86bb87f10	BUG/MINOR: lua: Properly catch alloc errors when parsing lua filter directives When a lua filter declaration is parsed, some allocation errors were not properly handled. In addition, we must be sure the filter identifier is defined in lua to duplicate it when the filter configuration is filled. This patch fixes a defect reported in the issue #1347. It only concerns 2.5-dev. No backport needed.	2021-08-13 08:42:00 +02:00
Christopher Faulet	70c4345dbc	BUG/MINOR: lua: Properly check negative offset in Channel/HttpMessage functions In Channel and HTTPMessage classes, several functions use an offset that may be negative to start from the end of incoming data. But, after calculation, the offset must never be negative. However, there is a bug because of a bad cast to unsigned when "input + offset" is performed. The result must be a signed integer. This patch should fix most of the defects reported in the issue #1347. It only affects 2.5-dev. No backport needed.	2021-08-13 08:36:42 +02:00
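The signedness pitfall described in 70c4345dbc can be illustrated with a minimal C sketch. It is not HAProxy's code: the helper name resolve_offset() and its return convention are assumptions made for this illustration only.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper, not HAProxy's actual code: resolve a user-supplied
 * offset against <input> bytes of pending data. A negative offset counts
 * from the end of the data. Returns the resolved offset, or -1 if the
 * result falls outside [0, input].
 */
long resolve_offset(size_t input, long offset)
{
    long pos = offset;

    /* The length must fit in a signed long for the arithmetic below. */
    assert(input <= (size_t)LONG_MAX);

    /* Convert the unsigned length to a signed type *before* adding, so
     * that "input + offset" can become negative instead of wrapping
     * around to a huge unsigned value that would pass a range check.
     */
    if (offset < 0)
        pos = (long)input + offset;

    if (pos < 0 || (size_t)pos > input)
        return -1;
    return pos;
}
```

With 10 bytes of input, an offset of -3 resolves to 7, while -11 is rejected rather than being treated as a very large positive position.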
Christopher Faulet	eae8afaa60	MINOR: filters/lua: Support the HTTP filtering from filters written in lua Now that an HTTPMessage class is available to manipulate HTTP messages from a filter, it is possible to bind HTTP filter callback functions to lua functions. Thus, the following methods may now be defined by a lua filter: * Filter:http_headers(txn, http_msg) * Filter:http_payload(txn, http_msg, offset, len) * Filter:http_end(txn, http_msg) http_headers() and http_end() may return one of the constants filter.CONTINUE, filter.WAIT or filter.ERROR. If nothing is returned, filter.CONTINUE is used as the default value. On its side, http_payload() may return the amount of data to forward. If nothing is returned, all incoming data are forwarded. For now, these functions are not allowed to yield because this interferes with the filter workflow.	2021-08-12 08:57:07 +02:00
Christopher Faulet	78c35471f8	MINOR: filters/lua: Add request and response HTTP messages in the lua TXN When a lua TXN is created from a filter context, the request and the response HTTP message objects are accessible from ".http_req" and ".http_res" fields. For an HTTP proxy, these objects are always defined. Otherwise, for a TCP proxy, no object is created and nil is used instead. From any other context (action or sample fetch), these fields don't exist.	2021-08-12 08:57:07 +02:00
Christopher Faulet	df97ac4584	MEDIUM: filters/lua: Add HTTPMessage class to help HTTP filtering This new class exposes methods to manipulate HTTP messages from a filter written in lua. Like for the HTTP class, there is a bunch of methods to manipulate the message headers. But there are also methods to manipulate the message payload. This part is similar to what is available in the Channel class. Thus the payload can be duplicated, erased, modified or forwarded. For now, only DATA blocks can be retrieved and modified because the current API is limited. No HTTPMessage method is able to yield. Those manipulating the headers are always called on messages containing all the headers, so there is no reason to yield. Those manipulating the payload are called from the http_payload filters callback function where yielding is forbidden. When an HTTPMessage object is instantiated, the underlying Channel object can be retrieved via the ".channel" field. For now this class is not used because the HTTP filtering is not supported yet. It will be the purpose of another commit. There is no documentation for now.	2021-08-12 08:57:07 +02:00
Christopher Faulet	c404f1126c	MEDIUM: filters/lua: Support declaration of some filter callback functions in lua It is now possible to write some filter callback functions in lua. All filter callbacks are not supported yet but the mechanism to call them is now in place. Following method may be defined in the Lua filter class to be bound on filter callbacks: * Filter:start_analyse(txn, chn) * Filter:end_analyse(txn, chn) * Filter:tcp_payload(txn, chn, offset, length) hlua_filter_callback() function is responsible to call the good lua function depending on the filter callback function. Using some flags it is possible to allow a lua call to yield or not, to retrieve a return value or not, and to specify if a channel or an http message must be passed as second argument. For now, the HTTP part has not been added yet. It is also possible to add extra argument adding them on the stack before the call. 3 new functions are exposed by the global object "filter". The first one, filter.wake_time(ms_delay), to set the wake_time when a Lua callback function yields (if allowed). The two others, filter.register_data_filter(filter, chn) and filter.unregister_data_filter(filter, chn), to enable or disable the data filtering on a channel for a specific lua filter instance. start_analyse() and end_analyse() may return one of the constant filter.CONTINUE, filter.WAIT or filter.ERROR. If nothing is returned, filter.CONTINUE is used as the default value. On its side, tcp_payload() may return the amount of data to forward. If nothing is returned, all incoming data are forwarded. For now, these functions are not allowed to yield because this interferes with the filter workflow. 
Here is a simple example:

    MyFilter = {}
    MyFilter.id = "My Lua filter"
    MyFilter.flags = filter.FLT_CFG_FL_HTX
    MyFilter.__index = MyFilter

    function MyFilter:new()
        flt = {}
        setmetatable(flt, MyFilter)
        flt.req_len = 0
        flt.res_len = 0
        return flt
    end

    function MyFilter:start_analyze(txn, chn)
        filter.register_data_filter(self, chn)
    end

    function MyFilter:end_analyze(txn, chn)
        print("<Total> request: "..self.req_len.." - response: "..self.res_len)
    end

    function MyFilter:tcp_payload(txn, chn)
        offset = chn:output()
        len = chn:input()
        if chn:is_resp() then
            self.res_len = self.res_len + len
            print("<TCP:Response> offset: "..offset.." - length: "..len)
        else
            self.req_len = self.req_len + len
            print("<TCP:Request> offset: "..offset.." - length: "..len)
        end
    end
	2021-08-12 08:57:07 +02:00
Christopher Faulet	a1ac5fb28e	MEDIUM: filters/lua: Be prepared to filter TCP payloads For filters written in lua, the tcp payloads will be filtered using methods exposed by the Channel class. So the corresponding C binding functions must be prepared to process payload in a filter context and not only in an action context. The main change is the offset where to start to process data in the channel buffer, and the length of these data. For an action, all input data are considered. But for a filter, it depends on what the filter is allowed to forward when the tcp_payload callback function is called. It depends on previous calls but also on other filters. In addition, when the payload is modified by a lua filter, its context must be updated. Note also that channel functions cannot yield when called from a filter context. For now, it is not possible to define callbacks to filter data and the documentation has not been updated.	2021-08-12 08:57:07 +02:00
Christopher Faulet	8c9e6bba0f	MINOR: lua: Add flags on the lua TXN to know the execution context A lua TXN can be created when a sample fetch, an action or a filter callback function is executed. A flag is now used to track the execute context. Respectively, HLUA_TXN_SMP_CTX, HLUA_TXN_ACT_CTX and HLUA_TXN_FLT_CTX. The filter flag is not used for now.	2021-08-12 08:57:07 +02:00
Christopher Faulet	9f55a5012e	MINOR: lua: Add a function to get a filter attached to a channel class For now, there is no support for filters written in lua. So this function, if called, will always return NULL. But when it will be called in a filter context, it will return the filter structure attached to a channel class. This function is also responsible to set the offset of data that may be processed and the length of these data. When called outside a filter context (so from an action), the offset is the input data position and the length is the input data length. From a filter, the offset and the length of data that may be filtered are retrieved from the filter context.	2021-08-12 08:57:07 +02:00
Christopher Faulet	69c581a092	MEDIUM: filters/lua: Add support for dummy filters written in lua It is now possible to write dummy filters in lua. Only the basis to declare such filters has been added for now. There is no way to declare callbacks to filter anything. Lua filters are for now empty shells. To do so, core.register_filter() must be called, with 3 arguments, the filter's name (as it appears in HAProxy config), the lua class that will be used to instantiate filters and a function to parse arguments passed on the filter line in HAProxy configuration file. The lua filter class must at least define the method new(), without any extra args, to create new instances when streams are created. If this method is not found, the filter will be ignored. Here is a template to declare a new Lua filter:

    // haproxy.conf
    global
        lua-load /path/to/my-filter.lua
        ...

    frontend fe
        ...
        filter lua.my-lua-filter arg1 arg2 arg3
        filter lua.my-lua-filter arg4 arg5

    // my-filter.lua
    MyFilter = {}
    MyFilter.id = "My Lua filter"          -- the filter ID (optional)
    MyFilter.flags = filter.FLT_CFG_FL_HTX -- process HTX streams (optional)
    MyFilter.__index = MyFilter

    function MyFilter:new()
        flt = {}
        setmetatable(flt, MyFilter)
        -- Set any flt fields. self.args can be used
        flt.args = self.args
        return flt -- The new instance of Myfilter
    end

    core.register_filter("my-lua-filter", MyFilter, function(filter, args)
        -- process <args>, an array of strings. For instance:
        filter.args = args
        return filter
    end)

In this example, 2 filters are declared using the same lua class. The parsing function is called for both, with its own copy of the lua class. So each filter will be unique. The global object "filter" exposes some constants and flags, and later some functions, to help writing filters in lua. Internally, when a lua filter is instantiated (so when new() method is called), 2 lua contexts are created, one for the request channel and another for the response channel. It is a prerequisite to let some callbacks yield on one side independently of the other one. There is no documentation for now.	2021-08-12 08:57:07 +02:00
Christopher Faulet	6a79fc16bd	MEDIUM: lua: Improve/revisit the lua api to manipulate channels First of all, the following functions are now considered deprecated: * Channel:dup() * Channel:get() * Channel:getline() * Channel:get_in_len() * Channel:get_out_len() It is just informative, there is no warning and functions may still be used. However it is recommended to use new functions. New functions are more flexible and use a better naming pattern. In addition, the same names will be used in the http_msg class to manipulate http messages from lua filters. The new API is: * Channel:data() * Channel:line() * Channel:append() * Channel:prepend() * Channel:insert() * Channel:remove() * Channel:set() * Channel:input() * Channel:output() * Channel:send() * Channel:forward() * Channel:is_resp() * Channel:is_full() * Channel:may_recv() The lua documentation was updated accordingly.	2021-08-12 08:57:07 +02:00
Christopher Faulet	9a6ffda795	MEDIUM: lua: Process buffer data using an offset and a length The main change is that following functions will now process channel's data using an offset and a length: * hlua_channel_dup_yield() * hlua_channel_get_yield() * hlua_channel_getline_yield() * hlua_channel_append_yield() * hlua_channel_set() * hlua_channel_send_yield() * hlua_channel_forward_yield() So for now, the offset is always the input data position and the length is the input data length. But with the support for filters, from a filter context, these values will be relative to the filter. To make all processing clearer, the function _hlua_channel_dup() has been updated and _hlua_channel_dupline(), _hlua_channel_insert() and _hlua_channel_delete() have been added. This patch is mandatory to allow the support of the filters written in lua.	2021-08-12 08:57:07 +02:00
Christopher Faulet	ba9e21dc68	MINOR: lua: Add a function to get a reference on a table in the stack The hlua_checktable() function may now be used to create and return a reference on a table in stack, given its position. This function ensures it is really a table and throws an exception if not. This patch is mandatory to allow the support of the filters written in lua.	2021-08-12 08:57:07 +02:00
Christopher Faulet	03fb1b26f7	MINOR: filters/lua: Release filters before the lua context This patch is mandatory to allow the support of the filters written in lua.	2021-08-12 08:57:07 +02:00
Christopher Faulet	23976d9e40	BUG/MINOR: lua: Don't yield in channel.append() and channel.set() Lua functions to set or append data to the input part of a channel must not yield because new data may be received while the lua script is suspended. So adding data to the input part in several passes is highly unpredictable and may be interleaved with received data. Note that if necessary, it is still possible to suspend a lua action by returning act.YIELD. This way the whole action will be reexecuted later because of I/O events or a timer. Another solution is to call core.yield(). This bug affects all stable versions. So, it may be backported. But it is probably not necessary because nobody noticed it until now.	2021-08-12 08:57:07 +02:00
Christopher Faulet	2e60aa4dee	BUG/MINOR: lua: Yield in channel functions only if lua context can yield When a script is executed, it is not always allowed to yield. Lua sample fetches and converters cannot yield. For lua actions, it depends on the context. When called from tcp content ruleset, an action may yield until the expiration of the inspect-delay timeout. From http rulesets, yield is not possible. Thus, when channel functions (dup, get, append, send...) are called, instead of yielding when it is not allowed and triggering an error, we just give up. In this case, some functions do nothing (dup, append...), some others just interrupt the in-progress job (send, forward...). But, because these functions don't yield anymore when it is not allowed, the script regains the control and can continue its execution. This patch depends on "MINOR: lua: Add a flag on lua context to know the yield capability at run time". Both may be backported in all stable versions. However, because nobody noticed this bug until now, it is probably not necessary, except if someone asks for it.	2021-08-12 08:57:07 +02:00
Christopher Faulet	1f43a3430e	MINOR: lua: Add a flag on lua context to know the yield capability at run time When a script is executed, a flag is used to allow it to yield. An error is returned if a lua function yields, explicitly or not. But there is no way to get this capability in C functions. So there is no way to choose to yield or not depending on this capability. To fill this gap, the flag HLUA_NOYIELD is introduced and added on the lua context if the current script execution is not authorized to yield. Macros to set, clear and test this flag are also added. This feature will be useful to fix some bugs in lua actions execution.	2021-08-12 08:57:07 +02:00
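The set/clear/test macro pattern mentioned in 1f43a3430e, and the "give up instead of yielding" behaviour from 2e60aa4dee, can be sketched as follows. Only the flag name HLUA_NOYIELD comes from the commit message; its bit value, the struct hlua_ctx type, the CTX_* macro names and try_yield() are assumptions for illustration, not HAProxy's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed bit value: the commit message names the flag but not its value. */
#define HLUA_NOYIELD 0x00000020u

/* Hypothetical stand-in for the lua execution context. */
struct hlua_ctx {
    unsigned int flags;
};

/* Hypothetical macro names for the set, clear and test operations. */
#define CTX_SET_NOYIELD(c)  ((c)->flags |= HLUA_NOYIELD)
#define CTX_CLR_NOYIELD(c)  ((c)->flags &= ~HLUA_NOYIELD)
#define CTX_CANT_YIELD(c)   (((c)->flags & HLUA_NOYIELD) != 0)

/* Returns 1 if the caller may suspend the script, 0 if it has to give up
 * and hand control back to the script without yielding.
 */
int try_yield(const struct hlua_ctx *c)
{
    assert(c != NULL);
    return CTX_CANT_YIELD(c) ? 0 : 1;
}
```

A C binding would test the flag before suspending, so that a script running in a context where yielding is forbidden regains control instead of hitting an error.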
Christopher Faulet	6fcd2d3280	BUG/MINOR: stream: Don't release a stream if FLT_END is still registered When at least one filter is registered on a stream, the FLT_END analyzer is called in both directions when all other analyzers have finished their processing. During this step, filters may release any allocated elements if necessary. So it is important to not skip it. Unfortunately, if both stream interfaces are closed, it is possible to not wait for the end of this analyzer. It is possible to be in this situation if a filter must wait and prevents the analyzer completion. To fix the bug, we now wait until the FLT_END analyzer is no longer registered in both directions before releasing the stream. This patch may be backported as far as 1.7, but AFAIK, no filter is affected by this bug. So the backport seems to be optional for now. In any case, it should remain under observation for some weeks first.	2021-08-12 08:54:16 +02:00
Christopher Faulet	47bfd7b9b7	BUG/MINOR: tcpcheck: Properly detect pending HTTP data in output buffer In tcpcheck_eval_send(), the condition to detect there are still pending data in the output buffer is buggy. Presence of raw data must be tested for TCP connection only. But a condition on the connection was missing to be sure it is not an HTX connection. This patch must be backported as far as 2.2.	2021-08-12 07:49:23 +02:00
William Lallemand	7e7765a451	BUG/MINOR: buffer: fix buffer_dump() formatting The formatting of the buffer_dump() output must be calculated using the relative counter, not the absolute one, or everything will be broken if the <from> variable is not a multiple of 16. Could be backported in all maintained versions.	2021-08-12 00:51:45 +02:00
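The point made in 7e7765a451 is that row breaks in a hex dump should follow the position relative to the start of the dumped range. A minimal sketch follows; dump_hex() and its output layout are hypothetical and do not reproduce buffer_dump().

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not buffer_dump() itself: print buf[from..to) as
 * hex, 16 bytes per row, each row prefixed with the absolute offset of
 * its first byte. Returns the number of rows printed.
 */
int dump_hex(FILE *out, const unsigned char *buf, size_t from, size_t to)
{
    int rows = 0;

    assert(from <= to);

    for (size_t i = from; i < to; i++) {
        /* The relative counter drives the layout, so rows stay aligned
         * even when <from> is not a multiple of 16.
         */
        size_t rel = i - from;

        if (rel % 16 == 0) {
            fprintf(out, "%s%08zx:", rel ? "\n" : "", i);
            rows++;
        }
        fprintf(out, " %02x", buf[i]);
    }
    if (to > from)
        fputc('\n', out);
    return rows;
}
```

Testing `i % 16` instead of `rel % 16` would break the first row early whenever the dump starts at an unaligned position, which is the kind of misformatting the commit describes.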
Amaury Denoyelle	3eb42f91d9	BUG/MEDIUM: server: support both check/agent-check on a dynamic instance A static server is able to support simultaneously both health check and agent-check. Adjust the dynamic server CLI handlers to also support this configuration. This should not be backported, unless dynamic server checks are backported.	2021-08-11 14:41:47 +02:00
Amaury Denoyelle	26cb8342ad	BUG/MEDIUM: check: fix leak on agent-check purge There is currently a leak on agent-check for dynamic servers. When deleted, the check rules and vars are not liberated. This leak grows each time a dynamic server with agent-check is deleted. Replace the manual purge code by a free_check invocation which centralizes all the details on check cleaning. There is no leak for health check because in this case the proxy is the owner of the check vars and rules. This should not be backported, unless dynamic server checks are backported.	2021-08-11 14:40:21 +02:00
Amaury Denoyelle	6d7fc446b4	BUG/MINOR: check: fix leak on add dynamic server with agent-check error If an error occurred during a dynamic server creation, free_check is used to liberate a possible agent-check. However, this does not free the associated vars and rules, as this is done in another function named deinit_srv_agent_check. To simplify the check free and avoid a leak, move free vars/rules in free_check. This is valid because deinit_srv_agent_check also uses free_check. This operation is done only for an agent-check because for a health check, the proxy instance is the owner of check vars/rules. This should not be backported, unless dynamic server checks are backported.	2021-08-11 14:37:42 +02:00
Amaury Denoyelle	25fe1033cb	BUG/MINOR: check: do not reset check flags on purge Do not reset check flags when setting CHK_ST_PURGE. Currently, this change has no impact. However, it is semantically wrong to clear important flags such as CHK_ST_AGENT on purge. Furthermore, this change will become mandatory for a future fix to properly free agent checks on dynamic servers removal. For this, it will be needed to differentiate health/agent-check on purge via CHK_ST_AGENT to properly free agent checks. This must not be backported unless dynamic servers checks are backported.	2021-08-11 14:33:34 +02:00
Amaury Denoyelle	13f2e2ceeb	BUG/MINOR: server: do not use refcount in free_server in stopping mode Currently there is a leak at process shutdown with dynamic servers with check/agent-check activated. Check purges are not executed on process stopping, so the server is not liberated due to its refcount. The solution is simply to ignore the refcount on process stopping mode and free the server on the first free_server invocation. This should not be backported, unless dynamic server checks are backported. In this case, the following commit must be backported first. 7afa5c1843521ec3be7549592d2b38ccc9d68b73 MINOR: global: define MODE_STOPPING	2021-08-09 17:53:30 +02:00
Amaury Denoyelle	7afa5c1843	MINOR: global: define MODE_STOPPING Define a new mode MODE_STOPPING. It is used to indicate that the process is in the stopping stage and no event loop runs anymore.	2021-08-09 17:51:55 +02:00
Amaury Denoyelle	9ba34ae710	BUG/MINOR: check: test if server is not null in purge Test if server is not null before using free_server in the check purge operation. Currently, the null server scenario should not occur as purge is used with refcounted dynamic servers. However, this might not be always the case if purge is used in the future in other cases; thus the test is useful for extensibility. No need to backport, unless dynamic server checks are backported. This has been reported through a coverity report in github issue #1343.	2021-08-09 17:48:34 +02:00
Amaury Denoyelle	b65f4cab6a	MEDIUM: server: implement agent check for dynamic servers This commit is the counterpart for agent check of "MEDIUM: server: implement check for dynamic servers". The "agent-check" keyword is enabled for dynamic servers. The agent check must manually be activated via "enable agent" CLI. This can enable the dynamic server if the agent response is "ready" without an explicit "enable server" CLI.	2021-08-06 11:09:48 +02:00
Amaury Denoyelle	2fc4d39577	MEDIUM: server: implement check for dynamic servers Implement check support for dynamic servers. The "check" keyword is now enabled for dynamic servers. If used, the server check is initialized and the check task started in the "add server" CLI handler. The check is explicitly disabled and must be manually activated via "enable health" CLI handler. The dynamic server refcount is incremented if a check is configured. On "delete server" handler, the check is purged, which decrements the refcount.	2021-08-06 11:09:48 +02:00
Amaury Denoyelle	9ecee0fa36	MINOR: check: enable safe keywords for dynamic servers Implement a collection of keywords deemed safe and useful to dynamic servers. The list of the supported keywords is : - addr - check-proto - check-send-proxy - check-via-socks4 - rise - fall - fastinter - downinter - port - agent-addr - agent-inter - agent-port - agent-send	2021-08-06 11:09:48 +02:00
Amaury Denoyelle	b33a0abc0b	MEDIUM: check: implement check deletion for dynamic servers Implement a mechanism to free a started check on runtime for dynamic servers. A new function check_purge is created for this. The check task will be marked for deletion and scheduled to properly close connection elements and free the task/tasklet/buf_wait elements. This function will be useful to delete a dynamic server with checks.	2021-08-06 11:09:48 +02:00
Amaury Denoyelle	d6b7080cec	MINOR: server: implement a refcount for dynamic servers It is necessary to have a refcount mechanism on dynamic servers to be able to enable check support. Indeed, when deleting a dynamic server with check activated, the check will be asynchronously removed. This is mandatory to properly free the check resources in a thread-safe manner. The server instance must be kept alive for this.	2021-08-06 11:09:48 +02:00
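The refcount idea from d6b7080cec (keep the server alive until an asynchronously purged check has released it) can be sketched with C11 atomics. The struct dyn_server type and the dyn_server_*() functions are hypothetical names for this illustration; HAProxy uses its own server structure and atomic macros.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified stand-in for a dynamic server. */
struct dyn_server {
    atomic_uint refcount;
};

/* Allocate a server holding one reference, owned by its creator. */
struct dyn_server *dyn_server_new(void)
{
    struct dyn_server *s = malloc(sizeof(*s));

    if (s)
        atomic_init(&s->refcount, 1);
    return s;
}

/* Take an extra reference, e.g. when a check is attached to the server. */
void dyn_server_take(struct dyn_server *s)
{
    atomic_fetch_add(&s->refcount, 1);
}

/* Drop one reference. Frees the server and returns 1 when it was the
 * last one, otherwise returns 0 and leaves the server alive.
 */
int dyn_server_drop(struct dyn_server *s)
{
    assert(atomic_load(&s->refcount) > 0);

    if (atomic_fetch_sub(&s->refcount, 1) != 1)
        return 0;
    free(s);
    return 1;
}
```

In this sketch the deletion path would call dyn_server_drop() once and the check purge would call it a second time later, so the memory is only released after both are done with the server.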
Amaury Denoyelle	403dce8e5a	MINOR: check: do not increment global maxsock at runtime global maxsock is used to estimate a number of fd to reserve for internal use, such as checks. It is incremented at startup with the info from the config file. Disable this incrementation in checks functions at runtime. First, it currently serves no purpose to increment it after startup. Worse, it may lead to out-of-bounds accesses on the fdtab. This will be useful to initiate checks for dynamic servers.	2021-08-06 11:08:24 +02:00
Amaury Denoyelle	3c2ab1a0d4	MINOR: check: export check init functions Remove static qualifier on init_srv_check, init_srv_agent_check and start_check_task. These functions will be called in server.c for dynamic servers with checks.	2021-08-06 11:08:04 +02:00
Amaury Denoyelle	f2c27a5c67	MINOR: check: allocate default check ruleset for every backends Allocate default tcp ruleset for every backend without explicit rules defined, even if no server in the backend use check. This change is required to implement checks for dynamic servers. This allocation is done on check_config_validity. It must absolutely be called before check_proxy_tcpcheck (called via post proxy check) which allocate the implicit tcp connect rule.	2021-08-06 11:08:04 +02:00
Amaury Denoyelle	fca18172d9	MINOR: server: initialize fields for dynamic server check Set default inter/rise/fall values for dynamic servers check/agent. This is required because dynamic servers do not inherit from a default-server.	2021-08-06 11:08:04 +02:00
Amaury Denoyelle	7b368339af	MEDIUM: task: implement tasklet kill Implement an equivalent of task_kill for tasklets. This function can be used to request a tasklet deletion in a thread-safe way. Currently this function is unused.	2021-08-06 11:07:48 +02:00
Amaury Denoyelle	c755efd5c6	MINOR: server: unmark deprecated on enable health/agent cli Remove the "DEPRECATED" marker on "enable/disable health/agent" commands. Their purpose is to toggle the check/agent on a server. These commands are still useful because their purpose is not covered by the "set server" command. Mostly, there was confusion with the commands 'set server health/agent', which in fact serve another goal. Note that the indication "use 'set server' instead" has been added since 2016 on the commit 2c04eda8b58636ad2ae44e42b1f50f3b5a24a642 REORG: cli: move "{enable\|disable} health" to server.c and 58d9cb7d22c1b0d8239543443131e3e3658375d0 REORG: cli: move "{enable\|disable} agent" to server.c Besides, these commands will become required to enable check/agent on dynamic servers which will be created with check disabled. This should be backported up to 2.4.	2021-08-06 10:09:50 +02:00
Christopher Faulet	d7da3dd928	BUG/MEDIUM: spoe: Fix policy to close applets when SPOE connections are queued It is the second part of the fix that should solve fairness issues with the connections management inside the SPOE filter. Indeed, in multithreaded mode, when the SPOE detects there are some connections in queue on a server, it closes existing connections by releasing SPOE applets. It is mandatory when a maxconn is set because few connections on a thread may prevent new connections establishment. The first attempt to fix this bug (9e647e5af "BUG/MEDIUM: spoe: Kill applets if there are pending connections and nbthread > 1") introduced a bug. In pipelining mode, SPOE applets might be closed while some frames are pending for the ACK reply. To fix the bug, in the processing stage, if there are some connections in queue, only truly idle applets may process pending requests. In this case, only one request at a time is processed. And at the end of the processing stage, only truly idle applets may be released. It is an empirical workaround, but it should be good enough to solve contention issues when a low maxconn is set. This patch should partly fix the issue #1340. It must be backported as far as 2.0.	2021-08-05 10:07:43 +02:00
Christopher Faulet	6f1296b5c7	BUG/MEDIUM: spoe: Create a SPOE applet if necessary when the last one is released On a thread, when the last SPOE applet is released, if there are still pending streams, a new one is created. Of course, HAProxy must not be stopping. It is important to start a new applet in this case to not abort in-progress jobs, especially when a maxconn is set, because applets may be closed to be fair with connections waiting for a free slot. This patch should partly fix the issue #1340. It depends on the commit "MINOR: spoe: Create a SPOE applet if necessary when the last one on a thread is closed". Both must be backported as far as 2.0.	2021-08-05 10:07:43 +02:00
Christopher Faulet	434b8525ee	MINOR: spoe: Add a pointer on the filter config in the spoe_agent structure There was no way to access the SPOE filter configuration from the agent object. However it could be handy to have it. And in fact, this will be required to fix a bug.	2021-08-05 10:07:43 +02:00
Willy Tarreau	d332f1396b	BUG/MINOR: server: update last_change on maint->ready transitions too Nenad noticed that when leaving maintenance, the servers' last_change field was not updated. This is visible in the Status column of the stats page in front of the state, as the cumulated time spent in the current state is wrong, it starts from the last transition (typically ready->maint). In addition, the backend's state was not updated either, because the down transition is performed by set_backend_down() which also emits a log, and it is this function which was extended to update the backend's last_change, but it's not called for down->up transitions so that was not done. The most visible (and unpleasant) effect of this bug is that it affects slowstart so such a server could immediately restart with a significant load ratio. This should likely be backported to all stable releases.	2021-08-04 19:41:01 +02:00
Willy Tarreau	7b2ac29a92	CLEANUP: fd: remove the now unneeded fd_mig_lock This is not needed anymore since we don't use it when setting the running mask anymore.	2021-08-04 16:03:36 +02:00
Willy Tarreau	f69fea64e0	MAJOR: fd: get rid of the DWCAS when setting the running_mask Right now we're using a DWCAS to atomically set the running_mask while being constrained by the thread_mask. This DWCAS is annoying because we may seriously need it later when adding support for thread groups, for checking that the running_mask applies to the correct group. It turns out that the DWCAS is not strictly necessary because we never need it to set the thread_mask based on the running_mask, only the other way around. And in fact, the running_mask is always cleared alone, and the thread_mask is changed alone as well. The running_mask is only relevant to indicate a takeover when the thread_mask matches it. Any bit set in running and not present in thread_mask indicates a transition in progress. As such, it is possible to re-arrange this by using a regular CAS around a consistency check between running_mask and thread_mask in fd_update_events and by making a CAS on running_mask then an atomic store on the thread_mask in fd_takeover(). The only other case is fd_delete() but that one already sets the running_mask before clearing the thread_mask, which is compatible with the consistency check above. This change has happily survived 10 billion takeovers on a 16-thread machine at 800k requests/s. The fd-migration doc was updated to reflect this change.	2021-08-04 16:03:36 +02:00
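The scheme described in f69fea64e0 (a plain CAS on running_mask wrapped around a consistency check against thread_mask, and in the takeover path a CAS on running_mask followed by an atomic store on thread_mask) can be sketched with C11 atomics. This is a simplified illustration built from the commit message alone: struct fd_entry, fd_try_run() and fd_take_over() are hypothetical names, and the real fd_update_events() and fd_takeover() handle many more cases.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, simplified FD table entry. */
struct fd_entry {
    _Atomic unsigned long running_mask; /* threads currently using the FD */
    _Atomic unsigned long thread_mask;  /* threads allowed to use the FD */
};

/* Try to mark thread <tid_bit> as running on <fd>. Returns 1 on success,
 * or 0 if the FD does not belong to this thread (e.g. it was taken over).
 */
int fd_try_run(struct fd_entry *fd, unsigned long tid_bit)
{
    unsigned long rmask;

    assert(fd != NULL);
    rmask = atomic_load(&fd->running_mask);
    do {
        /* Consistency check, redone on every retry of the CAS. */
        if (!(atomic_load(&fd->thread_mask) & tid_bit))
            return 0;
    } while (!atomic_compare_exchange_weak(&fd->running_mask, &rmask,
                                           rmask | tid_bit));
    return 1;
}

/* Try to migrate <fd> to thread <tid_bit>. Returns 1 on success, or 0 if
 * another thread is currently running on the FD.
 */
int fd_take_over(struct fd_entry *fd, unsigned long tid_bit)
{
    unsigned long expected = 0;

    assert(fd != NULL);
    /* Claim the FD only if nobody is running on it. */
    if (!atomic_compare_exchange_strong(&fd->running_mask, &expected, tid_bit))
        return 0;
    /* A running bit absent from thread_mask signals the transition in
     * progress; the store below ends that transition.
     */
    atomic_store(&fd->thread_mask, tid_bit);
    atomic_fetch_and(&fd->running_mask, ~tid_bit);
    return 1;
}
```

The two masks are never updated together in this sketch, which is the property the commit relies on to replace the double-word CAS with two single-word operations.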
Willy Tarreau	b1f29bc625	MINOR: activity/fd: remove the dead_fd counter This one is set whenever an FD is reported by a poller with a null owner, regardless of the thread_mask. It has become totally meaningless because it only indicates a migrated FD that was not yet reassigned to a thread, but as soon as a thread uses it, the status will change to skip_fd. Thus there is no reason to distinguish between the two, it adds more confusion than it helps. Let's simply drop it.	2021-08-04 16:03:36 +02:00
Amaury Denoyelle	bd8dd841e5	BUG/MINOR: server: remove srv from px list on CLI 'add server' error If an error occurred during the CLI 'add server' handler, the newly created server must be removed from the proxy list if already inserted. Currently, this can happen on the extremely rare error during server id generation if there is no id left. The removal operation is not thread-safe, it must be conducted before releasing the thread isolation. This can be backported up to 2.4. Please note that dynamic server track is not implemented in 2.4, so the release_server_track invocation must be removed for the backport to prevent a compilation error.	2021-08-04 14:57:06 +02:00
Willy Tarreau	ba3ab7907a	MEDIUM: servers: make the server deletion code run under full thread isolation In 2.4, runtime server deletion was brought by commit e558043e1 ("MINOR: server: implement delete server cli command"). A comment remained in the code about a theoretical race between the thread_isolate() call and another thread being in the process of allocating memory before accessing the server via a reference that was grabbed before the memory allocation, since the thread_harmless_now()/thread_harmless_end() pair around mmap() may have the effect of allowing cli_parse_delete_server() to proceed. Now that the full thread isolation is available, let's update the code to rely on this. Now it is guaranteed that competing threads will either be in the poller or queued in front of thread_isolate_full(). This may be backported to 2.4 if any report of breakage suggests the bug really exists, in which case the two following patches will also be needed: MINOR: threads: make thread_release() not wait for other ones to complete MEDIUM: threads: add a stronger thread_isolate_full() call	2021-08-04 14:49:36 +02:00
Willy Tarreau	88d1c5d3fb	MEDIUM: threads: add a stronger thread_isolate_full() call The current principle of running under isolation was made to access sensitive data while being certain that no other thread was using them in parallel, without necessarily having to place locks everywhere. The main use cases are "show sess" and "show fd" which run over long chains of pointers. The thread_isolate() call relies on the "harmless" bit that indicates for a given thread that it's not currently doing such sensitive things, which is advertised using thread_harmless_now() and which ends using thread_harmless_end(), which also waits for possibly concurrent threads to complete their work if they took this opportunity for starting something tricky. As some system calls were notoriously slow (e.g. mmap()), a bunch of thread_harmless_now() / thread_harmless_end() were placed around them to let waiting threads do their work while such other threads were not able to modify memory contents. But this is not sufficient for performing memory modifications. One such example is the server deletion code. By modifying memory, it not only requires that other threads are not playing with it, but also that they are not in the process of touching it. The fact that a pool_alloc() or pool_free() on some structure may call thread_harmless_now() and let another thread start to release the same object's memory is not acceptable. This patch introduces the concept of "idle threads". Threads entering the polling loop are idle, as well as those that are waiting for all others to become idle via the new function thread_isolate_full(). Once thread_isolate_full() is granted, the thread is not idle anymore, and it is released using thread_release() just like regular isolation. Its users have to keep in mind that across this call nothing is granted as another thread might have performed shared memory modifications. But such users are extremely rare and are actually expecting this from their peers as well.
Note that in case of backport, this patch depends on previous patch: MINOR: threads: make thread_release() not wait for other ones to complete	2021-08-04 14:49:36 +02:00


11905 Commits