haproxy

Author	SHA1	Message	Date
Christopher Faulet	16b37510bc	REGTESTS: Add script to test abortonclose option This script tests the abortonclose option for HTTP/1 clients only. It may be backported as far as 2.0. But on 2.2 and prior, the syslog part must be adapted to catch log messages emitted by the proxy during HAProxy startup. The following lines must be added : recv expect ~ "[^:\\[ ]\\[${h1_pid}\\]: Proxy fe1 started." recv expect ~ "[^:\\[ ]\\[${h1_pid}\\]: Proxy fe2 started."	2021-05-06 09:19:20 +02:00
Christopher Faulet	1baef1523d	BUG/MEDIUM: mux-h1: Properly report client close if abortonclose option is set On client side, if the CO_RFL_KEEP_RECV flag is set when h1_rcv_buf() is called, we force subscription for reads to be able to catch read0. This way, the event will be reported to the upper layer to let the stream abort the request. This patch fixes the abortonclose option for H1 connections. It depends on the following patches : * MEDIUM: mux-h1: Don't block reads when waiting for the other side * MINOR: conn-stream: Force mux to wait for read events if abortonclose is set But to be sure the event is handled by the stream, the following patches are also required : * BUG/MINOR: stream-int: Don't block reads in si_update_rx() if chn may receive * MINOR: channel: Rely on HTX version if appropriate in channel_may_recv() The whole series must be backported with caution as far as 2.0, and only after a period of observation to be sure nothing broke.	2021-05-06 09:19:06 +02:00
Christopher Faulet	ec4207cb68	MEDIUM: mux-h1: Don't block reads when waiting for the other side When we are waiting for the other side to read more data, or to read the next request, we must only stop the processing of input data and not the data receipt. This patch doesn't change anything about the subscriptions for reads, so it should not change anything. The only difference is that the H1 connection will try to read data if it is woken up for an I/O event and if it was subscribed for reads. This patch is required to fix the abortonclose option for H1 client connections.	2021-05-06 09:19:06 +02:00
Christopher Faulet	d8219b31e7	MINOR: conn-stream: Force mux to wait for read events if abortonclose is set When the abortonclose option is enabled, to be sure to be immediately notified when a shutdown is received from the client, the frontend conn-stream must be sure the mux will wait for read events. To do so, the CO_RFL_KEEP_RECV flag is set when mux->rcv_buf() is called. This new flag instructs the mux to wait for read events, regardless of its internal state. This patch is required to fix the abortonclose option for H1 client connections.	2021-05-06 09:19:05 +02:00
Christopher Faulet	e0dec4b7b2	BUG/MINOR: stream-int: Don't block reads in si_update_rx() if chn may receive In the si_update_rx() function, the reads may be blocked because we explicitly don't want to read or because of a lack of room in the input buffer. The first condition is valid. However the second one only tests whether the channel is empty or not. It means the reads are blocked if there are still some output data in the input channel, in its buffer or its pipe. This condition is not accurate. The reads must not be blocked if the channel can still receive data. Thus instead of relying on the channel_is_empty() function, we now call channel_may_recv(). This patch is especially useful to be able to catch read0 on client side when we are waiting for a connection to the server, when the abortonclose option is enabled. Otherwise, the client abort is not detected. This patch depends on "MINOR: channel: Rely on HTX version if appropriate in channel_may_recv()". Both must be backported as far as 2.0 after a period of observation to be sure nothing broke.	2021-05-06 09:19:05 +02:00
Christopher Faulet	1c235e57d0	MINOR: channel: Rely on HTX version if appropriate in channel_may_recv() When channel_may_recv() is called for an HTX stream, the HTX version, channel_htx_may_recv() is called. This patch is mandatory to fix a bug related to the abortonclose option.	2021-05-06 09:19:05 +02:00
Willy Tarreau	f6818d637a	BUILD: makefile: add new option USE_MEMORY_PROFILING It is not enabled by default, and may only work on linux-glibc for now, though maybe other platforms could adopt it, possibly with certain restrictions.	2021-05-05 19:09:19 +02:00
Willy Tarreau	ca3afc2456	MINOR: activity: add the profiling.memory global setting This allows to enable/disable memory usage profiling very early, which can be convenient to trace the memory usage in maps, certificates, Lua etc.	2021-05-05 19:09:19 +02:00
Willy Tarreau	993d44d234	MINOR: activity: make "show profiling" also dump the memory usage Now the memory usage stats are dumped. They are first sorted by total alloc+free so that the first ones are always the most relevant, and that most symmetric alloc/free pairs appear next to each other. This way it becomes convenient to only show a small part of them such as: show profiling memory 20 It's worth noting that the sorting is performed upon each call to the iohandler so it is technically possible that an entry could appear twice or be dropped if the ordering changes between two calls. In practice it is not an issue but it's worth mentioning.	2021-05-05 19:09:19 +02:00
Willy Tarreau	42712cb6d4	MINOR: activity: make "show profiling" support a few arguments These ones allow to limit the output to only certain sections and/or a number of lines per dump.	2021-05-05 19:09:19 +02:00
Willy Tarreau	637d85a93e	MINOR: activity: clean up the show profiling io_handler a little bit Let's rearrange it to make it more configurable and allow to iterate over multiple parts (header, tasks, memory etc), to restart from a given line number (previously it didn't work, though fortunately it didn't happen), and to support dumping only certain parts and a given number of lines. A few entries from ctx.cli are now used to store a restart point and the current step.	2021-05-05 19:09:19 +02:00
Willy Tarreau	f93c7be87f	MEDIUM: activity: collect memory allocator statistics with USE_MEMORY_PROFILING When built with USE_MEMORY_PROFILING the main memory allocation functions are diverted to collect statistics per caller. It is a bit tricky because the only way to call the original ones is to find their pointer, which requires dlsym(), and which is not available everywhere. Thus all functions are designed to call their fallback function (the original one), which is preset to an initialization function that is supposed to call dlsym() to resolve the missing symbols, and vanish. This saves expensive tests in the critical path. A second problem is that dlsym() calls calloc() to initialize some error messages. After plenty of tests with posix_memalign(), valloc() and friends, it turns out that returning NULL still makes it happy. Thus we currently use a visit counter (in_memprof) to detect if we're reentering, in which case all allocation functions return NULL. In order to convert a return address to an entry in the stats, we perform a cheap hash consisting in multiplying the pointer by a balanced number (as many zeros as ones) and keeping the middle bits. The hash is already pretty good like this, achieving to store up to 638 entries in a 2048-entry table without collision. But in order to further refine this and improve the fill ratio of the table, in case of collision we move up to 16 adjacent entries to find a free place. This remains quite cheap and manages to store all of these inside a 1024-entries hash table with even less risk of collision. Also, free(NULL) does not produce any stats. By doing so we reduce from 638 to 208 the average number of entries needed for a basic config using SSL. free(NULL) not only provides no information as it's a NOP, but keeping it is pure pollution as it happens all the time. When DEBUG_MEM_STATS is enabled, malloc/calloc/realloc are redefined as macros, preventing the code from compiling. 
Thus, when this option is detected, the macros are undefined as they are pointless there anyway. The functions are optimized to quickly jump to the fallback and as such become almost invisible in terms of processing time, except an extra "if" on a read_mostly variable and a jump. Considering that this only happens for pool misses and library routines, this remains acceptable. Performance tests in SSL (the most stressful test) show less than 1% performance loss when profiling is enabled on 2c4t. The code was written in a way to ease backporting to modern versions (2.2+) if needed, so it keeps the long names for integers and doesn't use the _INC version of the atomic ops.	2021-05-05 19:09:19 +02:00
Willy Tarreau	db87fc7d36	MINOR: activity: declare the storage for memory usage statistics We'll need to store for each call place, the pointer to the caller (the return address to be more exact as with free() it's not uncommon to see tail calls), the number of calls to alloc/free and the total alloc/free bytes. realloc() will be counted either as alloc or free depending on the balance of the size before vs after. We store 1024+1 entries. The first ones are used as hashes and the last one for collisions. When profiling is enabled via the CLI, all the stats are reset.	2021-05-05 18:55:28 +02:00
Willy Tarreau	00dd44f67f	MINOR: activity: add a "memory" entry to "profiling" This adds the necessary flags to permit run-time enabling/disabling of memory profiling. For now this is disabled. A few words were added to the management doc about it and recalling that this is limited to certain OSes.	2021-05-05 18:55:02 +02:00
Willy Tarreau	ef7380f916	CLEANUP: activity: mark the profiling and task_profiling_mask __read_mostly These ones are only read by the scheduler and occasionally written to by the CLI parser, so let's move them to read_mostly so that they do not risk to suffer from cache line pollution.	2021-05-05 18:38:05 +02:00
Willy Tarreau	64192392c4	MINOR: tools: add functions to retrieve the address of a symbol get_sym_curr_addr() will return the address of the first occurrence of the given symbol while get_sym_next_addr() will return the address of the next occurrence of the symbol. These ones return NULL on non-linux, non-ELF, non-USE_DL.	2021-05-05 16:24:52 +02:00
Amaury Denoyelle	d3a88c1c32	MEDIUM: connection: close front idling connection on soft-stop Implement a safe mechanism to close idling front connections which prevent the soft-stop from completing. Every h1/h2 front connection is added to a new per-thread list instance. On shutdown, a new task wakes up and calls the mux wake operation on every connection still present in the new list. A new stopping_list attach point has been added to the connection structure. As this member is only used for frontend connections, it shares the same union as the session_list reserved for backend connections.	2021-05-05 14:39:23 +02:00
Amaury Denoyelle	efc6e95642	MEDIUM: mux_h1: release idling frontend conns on soft-stop In h1_process, if the proxy of a frontend connection is disabled, release the connection. This commit is in preparation to properly close idling front connections on soft-stop. h1_process must still be called, this will be done via a dedicated task which monitors the global variable stopping.	2021-05-05 14:35:36 +02:00
Amaury Denoyelle	99cca08ecc	MINOR: connection: move session_list member in a union Move the session_list attach point into an anonymous union. This member is only used for backend connections. This commit is in preparation for the support of stopping frontend idling connections which will add another member to the union. This change means that special care must be taken to be sure that only backend connections manipulate the session_list. A few BUG_ON have been added as guards against misuse.	2021-05-05 14:35:36 +02:00
Amaury Denoyelle	3109ccfe70	MINOR: srv: close all idle connections on shutdown Implement a function to close all server idle connections. This function is called via a global deinit server handler. The main objective is to avoid leaving sockets in TIME_WAIT state. To limit the set of operations on shutdown and prevent task rescheduling, only the ctrl stack closing is done.	2021-05-05 14:33:51 +02:00
Ilya Shipitsin	04b57a7d1b	CI: Github Actions: switch to LibreSSL-3.3.3 stable LibreSSL-3.3.3 released, let us switch to it	2021-05-05 11:29:05 +02:00
Willy Tarreau	1ab6c0bfd2	MINOR: pools/debug: slightly relax DEBUG_DONT_SHARE_POOLS The purpose of this debugging option was to prevent certain pools from masking other ones when they were shared. For example, task, http_txn, h2s, h1s, h1c, session, fcgi_strm, and connection are all 192 bytes and would normally be merged, but not with this option. The problem is that certain pools are declared multiple times with various parameters, which are often very close, and due to the way the option works, they're not shared either. Good examples of this are captures and stick tables. Some configurations have large numbers of stick-tables of pretty similar types and it's very common to end up with the following when the option is enabled: $ socat - /tmp/sock1 <<< "show pools" | grep stick - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753800=56 - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753880=57 - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753900=58 - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753980=59 - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753a00=60 - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753a80=61 - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753b00=62 - Pool sticktables (224 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x753780=55 In addition to not being convenient, it can have important effects on the memory usage because these pools will not share their entries, so one stick table cannot allocate from another one's pool. This patch solves this by going back to the initial goal which was not to have different pools in the same list. 
Instead of masking the MAP_F_SHARED flag, it simply adds a test on the pool's name, and disables pool sharing if the names differ. This way pools are not shared unless they're of the same name and size, which doesn't hinder debugging. The same test above now returns this: $ socat - /tmp/sock1 <<< "show pools" | grep stick - Pool sticktables (160 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 7 users, @0x3fadb30 [SHARED] - Pool sticktables (224 bytes) : 0 allocated (0 bytes), 0 used, needed_avg 0, 0 failures, 1 users, @0x3facaa0 [SHARED] This is much better. This should probably be backported, in order to limit the side effects of DEBUG_DONT_SHARE_POOLS being enabled in production.	2021-05-05 07:47:29 +02:00
Willy Tarreau	48129be18a	MINOR: debug: add a new "debug dev sym" command in expert mode This command attempts to resolve a pointer to a symbol name. This is convenient during development as it's easier to get such pointers live than by issuing a debugger or calling addr2line.	2021-05-05 07:47:29 +02:00
William Lallemand	5ba80d677d	BUG/MINOR: ssl/cli: fix a lock leak when no memory available This bug was introduced in e5ff4ad ("BUG/MINOR: ssl: fix a trash buffer leak in some error cases"). When cli_parse_set_cert() returns because alloc_trash_chunk() failed, it does not unlock the spinlock which can lead to a deadlock later. Must be backported as far as 2.1 where e5ff4ad was backported.	2021-05-04 16:40:44 +02:00
Willy Tarreau	18b2a9dd87	BUG/MEDIUM: cli: prevent memory leak on write errors Since the introduction of payload support on the CLI in 1.9-dev1 by commit abbf60710 ("MEDIUM: cli: Add payload support"), a chunk is temporarily allocated for the CLI to support defragmenting a payload passed with a command. However it's only released when passing via the CLI_ST_END state (i.e. on clean shutdown), but not on errors. Something as trivial as: $ while :; do ncat --send-only -U /path/to/cli <<< "show stat"; done with a few hundred servers is enough to see the number of allocated trash chunks go through the roof in "show pools". This needs to be backported as far as 2.0.	2021-05-04 16:27:45 +02:00
Christopher Faulet	c31b200872	BUG/MINOR: hlua: Don't rely on top of the stack when using Lua buffers When the lua buffers are used, a variable number of stack slots may be used. Thus we cannot assume that we know where the top of the stack is. It was not an issue for lua < 5.4.3 (at least for small buffers). But 'socket:receive()' now fails with lua 5.4.3 because a light userdata is systematically pushed on the top of the stack when a buffer is initialized. To fix the bug, in hlua_socket_receive(), we save the index of the top of the stack before creating the buffer. This way, we can check the number of arguments, regardless of whether anything was pushed on the stack. Note that the other buffer usages seem to be safe. This patch should solve issue #1240. It should be backported to all stable branches.	2021-05-03 10:34:48 +02:00
Willy Tarreau	080347fe2a	[RELEASE] Released version 2.4-dev18 Released version 2.4-dev18 with the following main changes : - DOC: Fix indentation for `path-strip-dot` normalizer - DOC: Fix RFC reference for the percent-to-uppercase normalizer - DOC: Add RFC references for the path-strip-dot(dot)? normalizers - MINOR: uri_normalizer: Add a `percent-decode-unreserved` normalizer - BUG/MINOR: mux-fcgi: Don't send normalized uri to FCGI application - REORG: htx: Inline htx functions to add HTX blocks in a message - CLEANUP: assorted typo fixes in the code and comments - DOC: general: fix white spaces for HTML converter - BUG/MINOR: ssl: ssl_sock_prepare_ssl_ctx does not return an error code - BUG/MINOR: cpuset: move include guard at the very beginning - BUG/MAJOR: fix build on musl with cpu_set_t support - BUG/MEDIUM: cpuset: fix build on MacOS - BUG/MINOR: htx: Preserve HTX flags when draining data from an HTX message - MEDIUM: htx: Refactor htx_xfer_blks() to not rely on hdrs_bytes field - CLEANUP: htx: Remove unsued hdrs_bytes field from the HTX start-line - BUG/MINOR: mux-h2: Don't encroach on the reserve when decoding headers - MEDIUM: http-ana: handle read error on server side if waiting for response - MINOR: htx: Limit length of headers name/value when a HTX message is dumped - BUG/MINOR: applet: Notify the other side if data were consumed by an applet - BUG/MINOR: hlua: Don't consume headers when starting an HTTP lua service - BUG/MEDIUM: mux-h2: Handle EOM flag when sending a DATA frame with zero-copy - CLEANUP: channel: No longer notify the producer in co_skip()/co_htx_skip() - DOC: general: fix example in set-timeout - CLEANUP: cfgparse: de-uglify early file error handling in readcfgfile() - MINOR: config: add a new "default-path" global directive - BUG/MEDIUM: peers: initialize resync timer to get an initial full resync - BUG/MEDIUM: peers: register last acked value as origin receiving a resync req - BUG/MEDIUM: peers: stop considering ack messages teaching 
a full resync - BUG/MEDIUM: peers: reset starting point if peers appears longly disconnected - BUG/MEDIUM: peers: reset commitupdate value in new conns - BUG/MEDIUM: peers: re-work updates lookup during the sync on the fly - BUG/MEDIUM: peers: reset tables stage flags stages on new conns - MINOR: peers: add informative flags about resync process for debugging - BUG/MEDIUM: time: fix updating of global_now upon clock drift - CLEANUP: freq_ctr: make arguments of freq_ctr_total() const - CLEANUP: hlua: rename hlua_appctx* appctx to luactx - MINOR: server: fix doc/trace on lb algo for dynamic server creation - REGTESTS: server: fix cli_add_server due to previous trace update - REGTESTS: add minimal CLI "add map" tests - DOC: management: move "set var" to the proper place - CLEANUP: map: slightly reorder the add map function - MINOR: map: get rid of map_add_key_value() - MINOR: map: show the current and next pattern version in "show map" - MINOR: map/acl: add the possibility to specify the version in "show map/acl" - MINOR: pattern: support purging arbitrary ranges of generations - MINOR: map/acl: add the possibility to specify the version in "clear map/acl" - MINOR: map/acl: add the "prepare map/acl" CLI command - MINOR: map/acl: add the "commit map/acl" CLI command - MINOR: map/acl: make "add map/acl" support an optional version number - CLEANUP: map/cli: properly align the map/acl help - BUILD: compiler: do not use already defined __read_mostly on dragonfly	2021-05-01 08:25:15 +02:00
Amaury Denoyelle	d272b409d7	BUILD: compiler: do not use already defined __read_mostly on dragonfly DragonflyBSD already has an attribute __read_mostly which serves the same purpose as the one in compiler.h. No need to be backported as it was added in the current 2.4-dev.	2021-04-30 17:16:36 +02:00
Willy Tarreau	29202013c1	CLEANUP: map/cli: properly align the map/acl help Due to extra options on some commands, the help started to become a bit of a mess, so let's realign all the commands.	2021-04-30 15:36:31 +02:00
Willy Tarreau	bb51c44d64	MINOR: map/acl: make "add map/acl" support an optional version number By passing a version number to "add map/acl", it becomes possible to atomically replace maps and ACLs. The principle is that a new version number is first retrieved by calling "prepare map/acl", and this version number is used with "add map" and "add acl". Newly added entries then remain invisible to the matching mechanism but are visible in "show map/acl" when the version number is specified, or may be cleared with "clear map/acl". Finally when the insertion is complete, a "commit map/acl" command must be issued, and the version is atomically updated so that there is no intermediate state with incomplete entries.	2021-04-30 15:36:31 +02:00
Willy Tarreau	7a562ca809	MINOR: map/acl: add the "commit map/acl" CLI command The command is used to atomically replace a map/acl with the pending contents of the designated version. The new version must have been allocated by "prepare map/acl" prior to this. At the moment it is not possible to force the version when adding new entries, so this may only be used to atomically clear an ACL/map.	2021-04-30 15:36:31 +02:00
Willy Tarreau	97218ce3a9	MINOR: map/acl: add the "prepare map/acl" CLI command This command allocates a new version for the map/acl, that will be usable later to prepare the addition of new values to atomically replace existing ones. Technically speaking the operation consists in atomically incrementing the next version. There's no "undo" operation here, if a version is not committed, it will automatically be trashed when committing a newer version.	2021-04-30 15:36:31 +02:00
Willy Tarreau	ff3feeb5cf	MINOR: map/acl: add the possibility to specify the version in "clear map/acl" This will ease maintenance of versioned maps by allowing to clear old or failed updates instead of the current version. Nothing was done to allow clearing everything, though if there was a need for this, implementing "@all" or something equivalent wouldn't require more than 3 lines of code.	2021-04-30 15:36:31 +02:00
Willy Tarreau	a13afe6535	MINOR: pattern: support purging arbitrary ranges of generations Instead of being able to purge only values older than a specific value, let's support arbitrary ranges and make pat_ref_purge_older() just be one special case of this one.	2021-04-30 15:36:31 +02:00
Willy Tarreau	95f753e403	MINOR: map/acl: add the possibility to specify the version in "show map/acl" The maps and ACLs internally all have two versions, the "current" one, which is the one being matched against, and the "next" one, the one being filled during an atomic replacement. Till now the "show" commands only used to show the current one but it can be convenient to be able to show other ones as well, so let's add the ability to do this with "show map" and "show acl". The method used here consists in passing the version number as "@<ver>" before the map/acl name or ID. It would have been better after it but that could create confusion with keys already using such a format.	2021-04-30 15:36:31 +02:00
Willy Tarreau	e3a42a6c2d	MINOR: map: show the current and next pattern version in "show map" The "show map" command wasn't updated when pattern generations were added for atomic reloads, let's report them in the "show map" command that lists all known maps. It will be useful for users.	2021-04-30 15:36:31 +02:00
Willy Tarreau	4053b03caa	MINOR: map: get rid of map_add_key_value() This function was only used once in cli_parse_add_map(), and half of the work it used to do was already known from the caller or testable outside of the lock. Given that we'll need to modify it soon to pass a generation number, let's remerge it in the caller instead, using pat_ref_load() which is the one we'll need.	2021-04-30 15:36:31 +02:00
Willy Tarreau	f7dd0e8796	CLEANUP: map: slightly reorder the add map function The function uses two distinct code paths for single the key/value pair and multiple pairs inserted as payload, each with a copy-paste of the error handling. Let's modify the loop to factor them out.	2021-04-30 15:36:31 +02:00
Willy Tarreau	4000ff0448	DOC: management: move "set var" to the proper place Commit b8bd1ee89 ("MEDIUM: cli: add a new experimental "set var" command") added "get var" and "set var" but "set var" was misplaced in the doc, breaking the alphabetic ordering.	2021-04-30 15:36:31 +02:00
Willy Tarreau	deee369cfa	REGTESTS: add minimal CLI "add map" tests The map_redirect test already tests for "show map", "del map" and "clear map" but doesn't have any "add map" command. Let's add some trivial ones involving one regular entry and two other ones added as payload, checking they are properly returned.	2021-04-29 16:19:03 +02:00
Amaury Denoyelle	996190a70d	REGTESTS: server: fix cli_add_server due to previous trace update The error output for dynamic server creation with an invalid lb algo has changed since the previous commit : MINOR: server: fix doc/trace on lb algo for dynamic server creation The vtest regex should have been updated as well to match it.	2021-04-29 15:38:02 +02:00
Amaury Denoyelle	eafd701dc5	MINOR: server: fix doc/trace on lb algo for dynamic server creation The text mentioned that only backends with a consistent hash method were supported for dynamic servers. In fact, it is only required that the lb algorithm is dynamic.	2021-04-29 14:59:42 +02:00
Willy Tarreau	7e702d13f4	CLEANUP: hlua: rename hlua_appctx* appctx to luactx There is some serious confusion in the lua interface code related to sockets and services coming from the hlua_appctx structs being called "appctx" everywhere, and where the real appctx is reached using appctx->appctx. This part is a bit of a pain to debug so let's rename all occurrences of this local variable to "luactx".	2021-04-28 17:59:21 +02:00
Willy Tarreau	b4476c6a8c	CLEANUP: freq_ctr: make arguments of freq_ctr_total() const freq_ctr_total() doesn't modify the freq counters, it should take a const argument.	2021-04-28 17:44:37 +02:00
Willy Tarreau	fe16126acc	BUG/MEDIUM: time: fix updating of global_now upon clock drift During commit 7e4a557f6 ("MINOR: time: change the global timeval and the the global tick at once") the approach made sure that the new now_ms was always higher than or equal to global_now_ms, but by forgetting the old value. This can cause the first update to global_now_ms to fail if it's already out of sync, going back into the loop, and the subsequent call would then succeed due to commit 4d01f3dcd ("MINOR: time: avoid overwriting the same values of global_now"). And if it goes out of sync, it will fail to update forever, as observed by Ashley Penney in github issue #1194, causing incorrect freq counters calculations everywhere. One possible trigger for this issue is one thread spinning for a few milliseconds while the other ones continue to work. The issue really is that old_now_ms ought not to be modified in the loop as it's used for the CAS. But we don't need to structurally guarantee that global_now_ms grows monotonically as it's computed from the new global_now which is already verified for this via the __tv_islt() test. Thus, dropping any corrections on global_now_ms in the loop is the correct way to proceed as long as this one is always updated to follow global_now. No backport is needed, this is only for 2.4-dev.	2021-04-28 17:43:55 +02:00
Emeric Brun	ccdfbae62c	MINOR: peers: add informative flags about resync process for debugging This patch adds miscellaneous informative flags raised during the initial full resync process performed during the reload, for debugging purposes. 0x00000010: Timeout waiting for a full resync from a local node 0x00000020: Timeout waiting for a full resync from a remote node 0x00000040: Session aborted learning from a local node 0x00000080: Session aborted learning from a remote node 0x00000100: A local node taught us and was fully up to date 0x00000200: A remote node taught us and was fully up to date 0x00000400: A local node taught us but was partially up to date 0x00000800: A remote node taught us but was partially up to date 0x00001000: A local node was assigned for a full resync 0x00002000: A remote node was assigned for a full resync 0x00004000: A resync was explicitly requested This patch could be backported to any supported branch.	2021-04-28 14:23:10 +02:00
Emeric Brun	1a6b43e13e	BUG/MEDIUM: peers: reset tables stage flags stages on new conns The flags used as context to track the current status of each table pushing a full resync to a peer were correctly reset upon receiving a new resync request or confirmation message. But in the case of a local peer sync during reload, the resync request is implicit and those flags were not correctly reset. This could result in a partial initial resync of some tables after reload if the connection with the old process was broken and retried. This patch resets those flags at the end of the handshake for all new connections to be sure to push an entire full resync if needed. This patch should be backported to all supported branches (>= 1.6).	2021-04-28 14:23:10 +02:00
Emeric Brun	8e7a13ed66	BUG/MEDIUM: peers: re-work updates lookup during the sync on the fly Only entries between the opposite of the last 'local update' rotating counter were considered for pushing. This processing worked in most cases because updates are continually pushed trying to reach this point, but there remain some cases where update ids are further away in the past and thus appear to be in the future. The push of updates is then stuck until the head reaches the tail again, which could take a very long time. This patch reworks the lookup so that all positions on the rotating counter are considered in the past until we reach exactly the 'local update' value. This way, the push of updates can no longer get stuck. This patch should be backported to all supported branches (>= 1.6).	2021-04-28 14:23:10 +02:00
Emeric Brun	cc9cce9351	BUG/MEDIUM: peers: reset commitupdate value in new conns The commitupdate value of the table is used to check if the update is still pending for a push for all peers. To be sure to not miss a push we reset it just after a handshake success. This patch should be backported on all supported branches ( >= 1.6 )	2021-04-28 14:23:10 +02:00
Emeric Brun	d9729da982	BUG/MEDIUM: peers: reset starting point if peers appears longly disconnected If two peers are disconnected and during this period they continue to process a large number of local updates, after a reconnection they may take a long time before they resume pushing their updates, because the last pushed update would appear internally to be in the future. This patch fixes this by resetting the cursor on acked updates to the maximum point considered in the past if it appears to be in the future, but it means we may lose some updates. A clean fix would be to update the protocol to be able to signal to a remote peer that it was not updated for too long a period and needs a full resync, but this is not yet supported by the protocol. This patch should be backported to all supported branches (>= 1.6).	2021-04-28 14:23:10 +02:00
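
The versioned map/acl workflow described in commits 97218ce3a9, bb51c44d64, 95f753e403, ff3feeb5cf and 7a562ca809 (prepare, add with a version, show or clear a version, commit) can be illustrated with a small toy model. This is only a sketch of the semantics stated in those commit messages, not HAProxy's implementation: the class and method names are hypothetical, and counter wrap-around is ignored.

```python
class VersionedMap:
    """Toy model of a map with generations (hypothetical, not HAProxy code)."""

    def __init__(self):
        self.curr_gen = 0   # generation seen by the matching mechanism
        self.next_gen = 0   # last generation handed out by prepare()
        self.entries = []   # list of (generation, key, value)

    def prepare(self):
        # "prepare map": allocate a new version by incrementing the counter.
        self.next_gen += 1
        return self.next_gen

    def add(self, key, value, version=None):
        # "add map": without a version, the entry joins the current generation.
        gen = self.curr_gen if version is None else version
        self.entries.append((gen, key, value))

    def show(self, version=None):
        # "show map": list the entries of one generation (current by default).
        gen = self.curr_gen if version is None else version
        return [(k, v) for g, k, v in self.entries if g == gen]

    def clear(self, version=None):
        # "clear map": drop the entries of one generation (current by default).
        gen = self.curr_gen if version is None else version
        self.entries = [e for e in self.entries if e[0] != gen]

    def lookup(self, key):
        # Matching only ever sees the current generation.
        for g, k, v in self.entries:
            if g == self.curr_gen and k == key:
                return v
        return None

    def commit(self, version):
        # "commit map": switch the current generation in one step and trash
        # everything older, including versions that were never committed.
        if not (self.curr_gen <= version <= self.next_gen):
            raise ValueError("version was not allocated by prepare()")
        self.curr_gen = version
        self.entries = [e for e in self.entries if e[0] >= version]


m = VersionedMap()
m.add("example.com", "be_old")
ver = m.prepare()
m.add("example.com", "be_new", version=ver)
before = m.lookup("example.com")   # pending entries are still invisible here
m.commit(ver)
after = m.lookup("example.com")    # the new generation is now the one matched
```

Because lookups filter on a single generation number, switching that number is what makes the replacement appear atomic in this model: there is no point at which a lookup sees a partially filled map.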


14613 Commits
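
The caller-address hashing described in commits f93c7be87f and db87fc7d36 (multiply the return address by a balanced constant, keep the middle bits, probe up to 16 adjacent entries, fall back to one extra collision entry, ignore free(NULL)) can be sketched as follows. This is a minimal illustration under stated assumptions, not HAProxy's code: the multiplier value, the function names, the per-entry fields and the wrap-around during probing are all hypothetical choices made for the example.

```python
TABLE_BITS = 10                  # 1024 hashed entries, as stated in the commits
TABLE_SIZE = 1 << TABLE_BITS
MAX_PROBES = 16                  # adjacent entries visited on collision
OVERFLOW = TABLE_SIZE            # the extra "+1" entry collecting collisions
MASK64 = (1 << 64) - 1
# Hypothetical multiplier: 64-bit, odd, with 32 bits set and 32 bits clear.
MULT = 0x9C5A3C69A53C9653


def hash_addr(addr):
    """Multiply by the balanced constant and keep the middle bits."""
    prod = (addr * MULT) & MASK64
    return (prod >> ((64 - TABLE_BITS) // 2)) & (TABLE_SIZE - 1)


def new_table():
    return [None] * (TABLE_SIZE + 1)


def new_entry(caller):
    return {"caller": caller, "alloc_calls": 0, "alloc_bytes": 0,
            "free_calls": 0, "free_bytes": 0}


def find_slot(table, caller):
    """Return the stats entry for a caller, creating it if needed."""
    start = hash_addr(caller)
    for i in range(MAX_PROBES):
        idx = (start + i) % TABLE_SIZE   # wrapping here is an assumption
        if table[idx] is None:
            table[idx] = new_entry(caller)
            return table[idx]
        if table[idx]["caller"] == caller:
            return table[idx]
    # No free place nearby: account the call in the shared collision entry.
    if table[OVERFLOW] is None:
        table[OVERFLOW] = new_entry(None)
    return table[OVERFLOW]


def record_alloc(table, caller, size):
    slot = find_slot(table, caller)
    slot["alloc_calls"] += 1
    slot["alloc_bytes"] += size


def record_free(table, caller, ptr, size):
    if not ptr:
        return                           # free(NULL) produces no stats
    slot = find_slot(table, caller)
    slot["free_calls"] += 1
    slot["free_bytes"] += size


stats = new_table()
record_alloc(stats, 0x401234, 64)
record_alloc(stats, 0x401234, 64)
record_free(stats, 0x401300, None, 0)    # ignored, creates no entry
```

Bounding the probe length keeps the cost of a lookup fixed, at the price of lumping rare unresolved collisions into a single entry.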