IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Released version 3.0.6 with the following main changes :
- MINOR: connection: No longer include stconn type header in connection-t.h
- BUG/MINOR: h1: do not forward h2c upgrade header token
- BUG/MINOR: h2: reject extended connect for h2c protocol
- MINOR: mux-h1: Set EOI on SE during demux when both side are in DONE state
- BUG/MEDIUM: mux-h1/mux-h2: Reject upgrades with payload on H2 side only
- REGTESTS: h1/h2: Update script testing H1/H2 protocol upgrades
- REGTESTS: shorten a bit the delay for the h1/h2 upgrade test
- BUG/MINOR: mux-quic: report glitches to session
- BUG/MEDIUM: cli: Be sure to catch immediate client abort
- BUG/MEDIUM: cli: Deadlock when setting frontend maxconn
- BUG/MINOR: server: make sure the HMAINT state is part of MAINT
- BUG/MINOR: cfgparse-global: fix allowed args number for setenv
- BUILD: tools: only include execinfo.h for the real backtrace() function
- MINOR: tools: do not attempt to use backtrace() on linux without glibc
- MINOR: task: define two new one-shot events for use with WOKEN_OTHER or MSG
- BUG/MEDIUM: stream: make stream_shutdown() async-safe
- BUG/MINOR: queue: make sure that maintenance redispatches server queue
- MINOR: server: make srv_shutdown_sessions() call pendconn_redistribute()
- BUG/MEDIUM: queue: always dequeue the backend when redistributing the last server
- BUG/MINOR: mux-h1: Fix condition to set EOI on SE during zero-copy forwarding
- BUG/MINOR: http-ana: Disable fast-fwd for unfinished req waiting for upgrade
- MINOR: debug: make mark_tainted() return the previous value
- MINOR: chunk: drop the global thread_dump_buffer
- MINOR: debug: split ha_thread_dump() in two parts
- MINOR: debug: slightly change the thread_dump_pointer signification
- MINOR: debug: make ha_thread_dump_done() take the pointer to be used
- MINOR: debug: replace ha_thread_dump() with its two components
- MEDIUM: debug: on panic, make the target thread automatically allocate its buf
- BUG/MEDIUM: server: server stuck in maintenance after FQDN change
- BUG/MEDIUM: hlua: make hlua_ctx_renew() safe
- BUG/MEDIUM: hlua: properly handle sample func errors in hlua_run_sample_{fetch,conv}()
- BUG/MEDIUM: mux-quic: ensure timeout server is active for short requests
- BUG/MEDIUM: queue: make sure never to queue when there's no more served conns
- BUG/MINOR: httpclient: return NULL when no proxy available during httpclient_new()
- BUG/MEDIUM: stconn: Wait iobuf is empty to shut SE down during a check send
- BUG/MINOR: http-ana: Don't report a server abort if response payload is invalid
- BUG/MEDIUM: stconn: Check FF data of SC to perform a shutdown in sc_notify()
- BUG/MAJOR: filters/htx: Add a flag to state the payload is altered by a filter
- REGTESTS: Never reuse server connection in http-messaging/truncated.vtc
- BUG/MINOR: quic: avoid leaking post handshake frames
- BUG/MEDIUM: quic: avoid freezing 0RTT connections
- DOC: config: fix rfc7239 forwarded typo in desc
- BUG/MINOR: mworker: fix mworker-max-reloads parser
- BUG/MINOR: mux-quic: do not close STREAM with empty FIN if no data sent
- BUG/MEDIUM: stats-html: Never dump more data than expected during 0-copy FF
- BUG/MEDIUM: mux-h2: Remove H2S from send list if data are sent via 0-copy FF
- BUG/MEDIUM: connection/http-reuse: fix address collision on unhandled address families
- MINOR: activity/memprofile: always return "other" bin on NULL return address
- MINOR: activity/memprofile: show per-DSO stats
- BUG/MINOR: server: fix dynamic server leak with check on failed init
- BUG/MEDIUM: stconn: Report blocked send if sends are blocked by an error
- BUG/MINOR: http-ana: Fix wrong client abort reports during responses forwarding
- BUG/MINOR: stconn: Don't disable 0-copy FF if EOS was reported on consumer side
- BUG/MEDIUM: server: fix race on servers_list during server deletion
- BUILD: debug: silence a build warning with threads disabled
- MINOR: pools: export the pools variable
- MINOR: debug: place a magic pattern at the beginning of post_mortem
- MINOR: debug: place the post_mortem struct in its own section.
- MINOR: debug: store important pointers in post_mortem
- MINOR: cli: remove non-printable characters from 'debug dev fd'
- BUG/MINOR: trace: stop rewriting argv with -dt
- BUG/MINOR: ssl/cli: 'set ssl cert' does not check the transaction name correctly
- DOC: config: add missing glitch_{cnt,rate} data types
- DOC: config: add missing glitch_{cnt,rate} sample definitions
- BUG/MEDIUM: mux-h1: Fix how timeouts are applied on H1 connections
- BUG/MINOR: http-ana: Report internal error if an action yields on a final eval
- MINOR: stream: Save last evaluated rule on invalid yield
- BUG/MEDIUM: promex: Fix dump of extra counters
- DOC: config: document connection error 44 (reverse connect failure)
- CLEANUP: connection: properly name the CO_ER_SSL_FATAL enum entry
- BUG/MINOR: quic: fix malformed probing packet building
- MINOR: cli/debug: show dev: add cmdline and version
- MINOR: stream/stats: Expose the current number of streams in stats
- MINOR: stream/stats: Expose the total number of streams ever created in stats
- BUG/MINOR: stats: Fix the name for the total number of streams created
- MINOR: connection: add more connection error codes to cover common errno
- MINOR: rawsock: set connection error codes when returning from recv/send/splice
- MINOR: connection: add new sample fetch functions fc_err_name and bc_err_name
- MINOR: debug: print gdb hints when crashing
- MINOR: debug: do not limit backtraces to stuck threads
- MINOR: debug: also add a pointer to struct global to post_mortem
- MINOR: debug: also add fdtab and acitvity to struct post_mortem
- MINOR: debug: remove the redundant process.thread_info array from post_mortem
- MINOR: wdt: move the local timers to a struct
- MINOR: debug: add a function to dump a stuck thread
- DEBUG: wdt: better detect apparently locked up threads and warn about them
- DEBUG: cli: make it possible for "debug dev loop" to trigger warnings
- DEBUG: wdt: make the blocked traffic warning delay configurable
- DEBUG: wdt: add a stats counter "BlockedTrafficWarnings" in show info
- BUILD: debug: also declare strlen() in __ABORT_NOW()
- BUILD: Missing inclusion header for ssize_t type
- MINOR: debug: move the "recover now" warn message after the optional notes
At the end of the too long processing warning added by commit 0950778b3a
("MINOR: debug: add a function to dump a stuck thread"), there can be some
optional notes about lua and memory trimming. However it's a bit awkward
that they appear after the "trying to recover now" message. Let's just move
that message after the notes.
(cherry picked from commit 5dcf2012fc035e790c118590a12240e0769fbcaa)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Compilation issue detected as follows by gcc:
In file included from src/ncbuf.c:19:
src/ncbuf.c: In function 'ncb_write_off':
include/haproxy/bug.h:144:10: error: unknown type name 'ssize_t'
144 | extern ssize_t write(int, const void *, size_t); \
(cherry picked from commit bc9821fd26b3a118415f579cdfa6e430b03f96da)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Previous commit 8f204fa8ae ("MINOR: debug: print gdb hints when crashing")
broken on the CI where strlen() isn't known. Let's forward-declare it in
the __ABORT_NOW() functions, just like write(). No backport is needed.
(cherry picked from commit 2d27c80288c0acee85326c0574ed70d0b2e486ef)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Every time a warning is issued about traffic being blocked, let's
increment a global counter so that we can check for this situation
in "show info".
(cherry picked from commit 84dd05e7d83eeee4e7b8c64dc656cdd608c78806)
Signed-off-by: Willy Tarreau <w@1wt.eu>
The new global "warn-blocked-traffic-after" allows one to configure
after how much time a warning should be emitted when traffic is blocked.
(cherry picked from commit 6127e5a4e9722c1b47f5a9810fd41892b675557b)
Signed-off-by: Willy Tarreau <w@1wt.eu>
A new argument "warn" allows to force the emission of a warning while
stuck in the loop by making the internal state inconsistent.
(cherry picked from commit 7337c422247b7af342048cfd48ac0aa2a4b7335e)
[wt: backported only to help testing the watchdog backports]
Signed-off-by: Willy Tarreau <w@1wt.eu>
In order to help users detect when threads are behaving abnormally, let's
try to emit a warning when one is no longer making any progress. This will
allow to catch faulty situations more accurately, instead of occasionally
triggering just after the long task. It will also let users know that there
is something wrong with their configuration, and inspect the call trace to
figure whether they're using excessively long rules or Lua for example (the
usual warnings about lua-load vs lua-load-per-thread are still reported).
The warning will only be emitted for threads not yet marked as stuck so
as not to interfere with panic dumps and avoid sending a warning just
before a panic. A tainted flag is set when this happens however (0x2000).
(cherry picked from commit 148eb5875fb7e6c46c0a9eac486dcb7b3bca931d)
Signed-off-by: Willy Tarreau <w@1wt.eu>
There's currently no way to just emit a warning informing that a thread
is stuck without crashing. This is a problem because sometimes users
would benefit from this info to clean up their configuration (e.g. abuse
of map_regm, lua-load etc).
This commit adds a new function ha_stuck_warning() that will emit a
warning indicating that the designated thread has been stuck for XX
milliseconds, with a number of streams blocked, and will make that
thread dump its own state. The warning will then be sent to stderr,
along with some reminders about the impacts of such situations to
encourage users to fix their configuration.
In order not to disrupt operations, a local 4kB buffer is allocated
in the stack. This should be quite sufficient.
For now the function is not used.
(cherry picked from commit 0950778b3a13fe31ff83223827d6692076cba5e5)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Better have a local struct for per-thread timers, as this will allow us
to store extra info that are useful to improve accurate reporting.
(cherry picked from commit 3f4d646849a253f3dc15972e40023495725efe98)
Signed-off-by: Willy Tarreau <w@1wt.eu>
That one is huge and unneeded since we now have the pointer to the
whole thread_info[] array, which does contain the freshest version
of these info and many more. Let's just get rid of it entirely.
(cherry picked from commit 52240680f1d98cc7eb1e762a04becaf54660e96b)
[wt: adjusted ctx in feed_post_mortem_late()]
Signed-off-by: Willy Tarreau <w@1wt.eu>
These ones are often used as well when trying to analyse sequences of
events, let's add them.
(cherry picked from commit da5cf52173853bcacb12c6ebb045fe395d4b3ba6)
Signed-off-by: Willy Tarreau <w@1wt.eu>
The pointer to struct global is also an important element to have in
post_mortem given that it's used a lot to take decisions in the code.
Let's just add it. It's worth noting that we could get rid of argc/argv
at this point since they're also present in the global struct, but they
don't cost much there anyway.
(cherry picked from commit 2f04ebe14aca91f4a0fafcd03a0f310d98d97aaf)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Historically for size limitation reasons, we would only dump the
backtrace of stuck threads. The problem is that when triggering
a panic or other reasons, we have no backtrace, which effectively
limits it to the watchdog timer. It's also visible in "show threads"
which used to report backtraces for all threads in 2.4 and displays
none nowadays, making its use much more limited.
A first approach could be to just dump the thread that triggers the
panic (in addition to stuck threads). But that remains quite limited
since "show threads" would still display nothing. This patch takes a
better approach consisting in dumping all non-idle threads. This way
the output is less polluted that with the older approach (no need to
dump all those waiting in the poller), and all active threads are
visible, in panics as well as in "show threads". As such, the CLI
command "debug dev panic" now dmups backtraces again. This is already
a benefit which will ease testing of various locations against the
ability to resolve useful symbols.
(cherry picked from commit 4adb2d864d7e3ca9df1e39beabf7b2ffa5aee35c)
Signed-off-by: Willy Tarreau <w@1wt.eu>
To make bug reporting easier for users, when crashing, let's suggest
what to do. Typically when a BUG_ON() matches, only the current thread
is useful the vast majority of the time, while when the watchdog
triggers, all threads are interesting.
The messages are printed at the end after the dump. We may adjust these
with wiki links in the future is more detailed instructions are relevant.
(cherry picked from commit 8f204fa8aeadef3faea4471ba9cfd93d9d168960)
Signed-off-by: Willy Tarreau <w@1wt.eu>
These functions return a symbolic error code such as ECONNRESET to keep
logs compact while making them human-readable. It's a good alternative
to the numeric code in that it's more expressive, and a good one to the
full message since it's shorter and more precise (some codes even match
errno names).
The doc was updated so that the symbolic names appear in the table. It
could be useful to backport this feature to help with troubleshooting
some issues, though backporting the doc might possibly be more annoying
in case users have local patches already, so maybe the table update does
not need to be backported in this case.
(cherry picked from commit 601b34fe7bd50c733a437f26817580bbd56c8d56)
Signed-off-by: Willy Tarreau <w@1wt.eu>
For a long time the errno values returned by recv/send/splice() were not
translated to connection error codes. There are not that many eligible
and having them would help a lot when debugging some complex issues where
logs disagree with network traces. Let's add them now.
(cherry picked from commit 822d82caf4165f0f6da681737c7e3db17d01f599)
Signed-off-by: Willy Tarreau <w@1wt.eu>
While we get reports of connection setup errors in fc_err/bc_err, we
don't have the equivalent for the recv/send/splice syscalls. Let's
add provisions for new codes that cover the common errno values that
recv/send/splice can return, i.e. ECONNREFUSED, ENOMEM, EBADF, EFAULT,
EINVAL, ENOTCONN, ENOTSOCK, ENOBUFS, EPIPE. We also add a special case
for when the poller reported the error itself. It's worth noting that
EBADF/EFAULT/EINVAL will generally indicate serious bugs in the code
and should not be reported.
The only thing is that it's quite hard to forcefully (and reliably)
trigger these errors in automated tests as the timing is critical.
Using iptables to manually reset established connections in the
middle of large transfers at least permits to see some ECONNRESET
and/or EPIPE, but the other ones are harder to trigger.
(cherry picked from commit 00c383ff65c6378327382d2c055f66efb098498d)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Because of a copy/paste error, CurrStreams was reused by mistake. It should
be "CumStreams"
No backports needed.
(cherry picked from commit 131b877565db423930909f0c26f25e000cbd6e3b)
Signed-off-by: Willy Tarreau <w@1wt.eu>
A shared counter is added in the thread context to track the total number of
streams created on the thread. This number is then reported in stats. It
will be a useful information to diagnose some bugs.
(cherry picked from commit 273d322b6fa8117423bbdc9b818002563d4fd3a3)
[wt: ctx adj in tinfo-t]
Signed-off-by: Willy Tarreau <w@1wt.eu>
A shared counter is added in the thread context to track the current number
of streams. This number is then reported in stats. It will be a useful
information to diagnose some bugs.
(cherry picked from commit 18ee22ff766bd7399947af3be2b512ac5827b3c8)
[wt: adj ctx in tinfo-t]
Signed-off-by: Willy Tarreau <w@1wt.eu>
'show dev' command is very convenient to obtain haproxy debugging information,
while process is run in container. Let's extend its output with version and
cmdline. cmdline is useful in a way, as it shows absolute binary path and its
arguments, because sometimes the person, who is debugging failing container is
not the same, who has created and deployed it.
argc and argv are stored in the exported global structure, because
feed_post_mortem() is added as a post check function callback in the
post_check_list. So we can't simply change the signature of
feed_post_mortem(), without breaking other post check callbacks APIs.
Parsers are not supposed to modify argv, so we can safely bypass its pointer
to debug_parse_cli_show_dev(), without copying all argument stings somewhere
in the heap or on stack.
(cherry picked from commit 0d79c9bedfa564e3c032c1e910c29949f5133d91)
Signed-off-by: Willy Tarreau <w@1wt.eu>
This bug arrived with this commit:
cdfceb10a MINOR: quic: refactor qc_prep_pkts() loop
which prevents haproxy from sending PING only packets/datagrams (some
packets/datagrams with only PING frame as ack-eliciting frames inside).
Such packets/datagrams are useful in rare cases during retransmissions
when one wants to probe the peer without exceeding the anti-amplification
limit.
Modify the condition passed to qc_build_pkt() to add padding to the current
datagram. One does not want to do that when probing the peer without ack-eliciting
frames passed as <frms> parameter. Indeed qc_build_pkt() calls qc_do_build_pkt()
which supports this case: if <probe> is true (probing required), qc_do_build_pkt()
handles the case where some padding must be added to a PING only packet/datagram.
This is the case when probing with an empty <frms> frame list of ack-eliciting
frames without exceeding the anti-amplification limit from qc_dgrams_retransmit().
Add some comments to qc_build_pkt() and qc_do_build_pkt() to clarify this
as this code is easy to break!
Thank you for @Tristan971 for having reported this issue in GH #2709.
Must be backported to 3.0.
(cherry picked from commit 217e467e89d15f3c22e11fe144458afbf718c8a8)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
It was the only one prefixed with "CO_ERR_", making it harder to batch
process and to look up. It was added in 2.5 by commit 61944f7a73 ("MINOR:
ssl: Set connection error code in case of SSL read or write fatal failure")
so it can be backported as far as 2.6 if needed to help integrate other
patches.
(cherry picked from commit 393957908bf492ff6660fba239106f0da7988fe8)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
It was missing from commit ac1164de7c ("MINOR: connection: define error
for reverse connect"), and can be backported to 3.0 and 2.9.
(cherry picked from commit abed9e0426c2f24522e0053452435082870e3afc)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
When extra counters are dumped for an entity (frontend, backend, server or
listener), there is a filter on capabilities. Some extra counters are not
available for all entities and must be ignored. However, when this was
performed, the field number, used as an index to dump the metric value, was
still incremented while it should not and leads to an overflow or a stats
mix-up.
This patch must be backported to 3.0.
(cherry picked from commit d1adfd9fe41b0f9f67944eec07348213a7debbf3)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
When an action yields while it is not allowed, an internal error is
reported. This interrupts the processing. So info about the last evaluated
rule must be filled.
This patch may be bakcported if needed. If so, the commit ("MINOR: stream:
Save last evaluated rule on invalid yield") must be backported first.
(cherry picked from commit 0b7605491e4ccb66a0468c219306adf354355e0d)
[cf: Of course the mentionned commit to be backported with this one is wrong. It
must be "BUG/MINOR: http-ana: Report internal error if an action yields on
a final eval"].
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
This was already performed for tcp actions at content level, but not for
HTTP actions. It is always a bug, so it must be reported accordingly.
This patch may be backported to all stable versions.
(cherry picked from commit 65ea29dcf85c6553e6dd0613a9c6c506fe22b9ac)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
There were several flaws in the way the different timeouts were applied on
H1 connections. First, the H1C task handling timeouts was not created if no
client/server timeout was specified. But there are other timeouts to
consider. First, the client-fin/server-fin timeouts. But for frontend
connections, http-keey-alive and http-request timeouts may also be used. And
finally, on soft-stop, the close-spread-time value must be considered too.
So at the end, it is probably easier to always create a task to manage H1C
timeouts. Especially since the client/server timeouts are most often set.
Then, when the expiration date of the H1C's task must only be updated if the
considered timeout is set. So tick_add_ifset() must be used instead of
tick_add(). Otherwise, if a timeout is undefined, the taks may expire
immediately while it should in fact never expire.
Finally, the idle expiration date must only be considered for idle
connections.
This patch should be backported in all stable versions, at least as far as
2.6. On the 2.4, it will have to be slightly adapted for the idle_exp
part. On 2.2 and 2.0, the patch will have to be rewrite because
h1_refresh_timeout() is quite different.
(cherry picked from commit 3c09b34325a073e2c110e046f9705b2fddfa91c5)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Following previous commit, when glitch_cnt and glitch_rate data types were
implemented in c9c6b683f ("MEDIUM: stick-tables: add a new stored type for
glitch_cnt and glitch_rate"), newly exposed samples such as
table_glitch_cnt(), table_glitch_rate, src_glitch_cnt() and
src_glitch_rate() were documented but their definitions was missing in
supported keywords list.
It should be backported in 3.0 with c9c6b683f
(cherry picked from commit 0686fd8cfccd7ff12211b8253bf2446d62c90a18)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
When glitch_cnt and glitch_rate data types were implemented in
c9c6b683f ("MEDIUM: stick-tables: add a new stored type for glitch_cnt and
glitch_rate"), the data types list for "stick-table" keyword documentation
was overlooked.
This was reported by Nick Ramirez.
It should be backported in 3.0 with c9c6b683f.
(cherry picked from commit 9a6fc2d474511ead2fe8c39524d23b156d640ef8)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
Since commit 089c13850f ("MEDIUM: ssl: ssl-load-extra-del-ext work
only with .crt"), the 'set ssl cert' CLI command does not check
correctly if the transaction you are trying to update is the right one.
The consequence is that you could commit accidentaly a transaction on
the wrong certificate.
The fix introduces the check again in case you are not using
ssl-load-extra-del-ext.
This must be backported in all stable versions.
(cherry picked from commit 984d2cfb61744bed29ce92cdc5360155cbd8ca44)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
When using trace with -dt, the trace_parse_cmd() function is doing a
strtok which write \0 into the argv string.
When using the mworker mode, and reloading, argv was modified and the
trace won't work anymore because the first : is replaced by a '\0'.
This patch fixes the issue by allocating a temporary string so we don't
modify the source string directly. It also replace strtok by its
reentrant version strtok_r.
Must be backported as far as 2.9.
(cherry picked from commit 596db3ef86844617565a0b4b4ce8358fe6537d87)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
When using 'debug dev fd', the output of laddr and raddr can contain
some garbage.
This patch replaces any control or non-printable character by a '.'.
(cherry picked from commit 944a224358ab2865a3a1c0bf700aba38550b19cc)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Dealing with a core and a stripped executable is a pain when it comes
to finding pools, proxies or thread contexts. Let's put a pointer to
these heads and arrays in the post_mortem struct for easier location.
Other critical lists like this could possibly benefit from being added
later.
Here we now have:
- tgroup_info
- thread_info
- tgroup_ctx
- thread_ctx
- pools
- proxies
Example:
$ objdump -h haproxy|grep post
34 _post_mortem 000014b0 0000000000cfd400 0000000000cfd400 008fc400 2**8
(gdb) set $pm=(struct post_mortem*)0x0000000000cfd400
(gdb) p $pm->tgroup_ctx[0]
$8 = {
threads_harmless = 254,
threads_idle = 254,
stopping_threads = 0,
timers = {
b = {0x0, 0x0}
},
niced_tasks = 0,
__pad = 0xf5662c <ha_tgroup_ctx+44> "",
__end = 0xf56640 <ha_tgroup_ctx+64> ""
}
(gdb) info thr
Id Target Id Frame
* 1 Thread 0x7f9e7706a440 (LWP 21169) 0x00007f9e76a9c868 in raise () from /lib64/libc.so.6
2 Thread 0x7f9e76a60640 (LWP 21175) 0x00007f9e76b343c7 in wait4 () from /lib64/libc.so.6
3 Thread 0x7f9e7613d640 (LWP 21176) 0x00007f9e76b343c7 in wait4 () from /lib64/libc.so.6
4 Thread 0x7f9e7493a640 (LWP 21179) 0x00007f9e76b343c7 in wait4 () from /lib64/libc.so.6
5 Thread 0x7f9e7593c640 (LWP 21177) 0x00007f9e76b343c7 in wait4 () from /lib64/libc.so.6
6 Thread 0x7f9e7513b640 (LWP 21178) 0x00007f9e76b343c7 in wait4 () from /lib64/libc.so.6
7 Thread 0x7f9e6ffff640 (LWP 21180) 0x00007f9e76b343c7 in wait4 () from /lib64/libc.so.6
8 Thread 0x7f9e6f7fe640 (LWP 21181) 0x00007f9e76b343c7 in wait4 () from /lib64/libc.so.6
(gdb) p/x $pm->thread_info[0].pth_id
$12 = 0x7f9e7706a440
(gdb) p/x $pm->thread_info[1].pth_id
$13 = 0x7f9e76a60640
(gdb) set $px = *$pm->proxies
while ($px != 0)
printf "%#lx %s served=%u\n", $px, $px->id, $px->served
set $px = ($px)->next
end
0x125eda0 GLOBAL served=0
0x12645b0 stats served=0
0x1266940 comp served=0
0x1268e10 comp_bck served=0
0x1260cf0 <OCSP-UPDATE> served=0
0x12714c0 <HTTPCLIENT> served=0
(cherry picked from commit e5fccfe0b6397ec2b14ebc3a0d09646442b2018d)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Placing it in its own section will ease its finding, particularly in
gdb which is too dumb to find anything in memory. Now it will be
sufficient to issue this:
$ gdb -ex "info files" -ex "quit" ./haproxy core 2>/dev/null |grep _post_mortem
0x0000000000cfd300 - 0x0000000000cfe780 is _post_mortem
or this:
$ objdump -h haproxy|grep post
34 _post_mortem 00001480 0000000000cfd300 0000000000cfd300 008fc300 2**8
to spot the symbol's address. Then it can be read this way:
(gdb) p *(struct post_mortem *)0x0000000000cfd300
(cherry picked from commit 93c3f2a0b4da77c0317496b8585192fb64ef400f)
Signed-off-by: Willy Tarreau <w@1wt.eu>
In order to ease finding of the post_mortem struct in core dumps, let's
make it start with a recognizable pattern of exactly 32 chars (to
preserve alignment):
"POST-MORTEM STARTS HERE+7654321\0"
It can then be found like this from gdb:
(gdb) find 0x000000012345678, 0x0000000100000000, 'P','O','S','T','-','M','O','R','T','E','M'
0xcfd300 <post_mortem>
1 pattern found.
Or easier with any other more practical tool (who as ever used "find" in
gdb, given that it cannot iterate over maps and is 100% useless?).
(cherry picked from commit 989b02e1930d7ecd1a728c3d18ccfba095cdd636)
Signed-off-by: Willy Tarreau <w@1wt.eu>
We want it to be accessible from debuggers for inspection and it's
currently unavailable. Let's start by exporting it as a first step.
(cherry picked from commit fba48e1c40287f1abb4066935f2436bd0b8cd7a4)
Signed-off-by: Willy Tarreau <w@1wt.eu>
Commit 091de0f9b2 ("MINOR: debug: slightly change the thread_dump_pointer
signification") caused the following warning to be emitted when threads
are disabled:
src/debug.c: In function 'ha_thread_dump_one':
src/debug.c:359:9: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
Let's just disguise the pointer to silence it. It should be backported
where the patch above was backported, since it was part of a series aiming
at making thread dumps more exploitable from core dumps.
(cherry picked from commit f163cbfb7f893a06d158880a753cad01908143d8)
[wt: s/MT_LIST_FOR_EACH_ENTRY_LOCKED/mt_list_for_each_entry_safe/ with
two backup elements in 3.0]
Signed-off-by: Willy Tarreau <w@1wt.eu>
Each server is inserted in a global list named servers_list on
new_server(). This list is then only used to finalize servers
initialization after parsing.
On dynamic server creation, there is no issue as new_server() is under
thread isolation. However, when a server is deleted after its refcount
reached zero, srv_drop() removes it from servers_list without lock
protection. In the longterm, this can cause list corruption and crashes,
especially if multiple adjacent servers are removed in parallel.
To fix this, convert servers_list to a mt_list. This should not impact
performance as servers_list is not used during runtime outside of server
creation/deletion.
This should fix github issue #2733. Thanks to Chris Staite who first
found the issue here.
This must be backported up to 2.6.
(cherry picked from commit 7a02fcaf20dbc19db36052bbc7001bcea3912ab5)
Signed-off-by: Willy Tarreau <w@1wt.eu>
There is no reason to disable the 0-copy data forwarding if an end-of-stream
was reported on the consumer side. Indeed, the consumer will send data in
this case. So there is no reason to check the read side here.
This patch may be backported as far as 2.9.
(cherry picked from commit 362de90f3e4ddd0c15331c6b9cb48b671a6e2385)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
When the response forwarding is aborted, we must not report a client abort
if a EOS was seen on client side. On abort performed by the stream must be
considered.
This bug was introduced when the SHUTR was splitted in 2 flags.
This patch must be backported as far as 2.8.
(cherry picked from commit 5970c6abec3e0ee4ac44364e999cae2cc852f4c8)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
When some data must be sent to the endpoint but an error was previously
reported, nothing is performed and we leave. But, in this case, the SC is not
notified the sends are blocked.
It is indeed an issue if the endpoint reports an error after consuming all
data from the SC. In the endpoint the outgoing data are trashed because of
the error, but on the SC, everything was sent, even if an error was also
reported.
Because of this bug, it is possible to have outgoing data blocked at the SC
level but without any write timeout armed. In some cases, this may lead to
blocking conditions where the stream is never closed.
So now, when outgoing data cannot be sent because an previous error was
triggered, a blocked send is reported. This way, it is possible to report a
write timeout.
This patch should fix the issue #2754. It must be backported as far as 2.8.
(cherry picked from commit fbc3de6e9e59679d2e9ece3984ce31b6a7dd418f)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
If a dynamic server is added with check or agent-check, its refcount is
incremented after server keyword parsing. However, if add server fails
at a later stage, refcount is only decremented once, which prevented the
server to be fully released.
This causes a leak with a server which is detached from most of the
lists but still exits in the system.
This bug is considered minor as only a few conditions may cause a
failure in add server after check/agent-check initialization. This is
the case if there is a naming collision or the dynamic ID cannot be
generated.
To fix this, simply decrement server refcount on add server error path
if either check and/or agent-check are flagged as activated.
This bug is related to github issue #2733. Thanks to Chris Staite who
first found the leak.
This must be backported up to 2.6.
(cherry picked from commit 116178563c2fb57e28a76838cf85c4858b185b76)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
On systems where many libs are loaded, it's hard to track suspected
leaks. Having a per-DSO summary makes it more convenient. That's what
we're doing here by summarizing all calls per DSO before showing the
total.
(cherry picked from commit 401fb0e87a2cea7171e4d37da6094755eb10a972)
Signed-off-by: Willy Tarreau <w@1wt.eu>
It was found in a large "show profiling memory" output that a few entries
have a NULL return address, which causes confusion because this address
will be reused by the next new allocation caller, possibly resulting in
inconsistencies such as "free() ... pool=trash" which makes no sense. The
cause is in fact that the first caller had an entry->info pointing to the
trash pool from a p_alloc/p_free with a NULL return address, and the second
had a different type and reused that entry.
Let's make sure undecodable stacks causing an apparent NULL return address
all lead to the "other" bin.
While this is not exactly a bug, it would make sense to backport it to the
recent branches where the feature is used (probably at least as far as 2.8).
(cherry picked from commit 5091f90479ab4d963b55cb725cee8201d93521d9)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
As described in GH #2765, there were situations where http connections
would be re-used for requests to different endpoints, which is obviously
unexpected. In GH #2765, this occured with httpclient and UNIX socket
combination, but later code analysis revealed that while disabling http
reuse on httpclient proxy helped, it didn't fix the underlying issue since
it was found that conn_calculate_hash_sockaddr() didn't take into account
families such as AF_UNIX or AF_CUST_SOCKPAIR, and because of that the
sock_addr part of the connection wasn't hashed.
To properly fix the issue, let's explicly handle UNIX (both regular and
ABNS) and AF_CUST_SOCKPAIR families, so that the destination address is
properly hashed. To prevent this bug from re-appearing: when the family
isn't known, instead of doing nothing like before, let's fall back to a
generic (unoptimal) hashing which hashes the whole sockaddr_storage struct
As a workaround, http-reuse may be disabled on impacted proxies.
(unfortunately this doesn't help for httpclient since reuse policy
defaults to safe and cannot be modified from the config)
It should be backported to all stable versions.
Shout out to @christopherhibbert for having reported the issue and
provided a trivial reproducer.
[ada: prior to 3.0, ctx adjt is required because conn_hash_update()'s
prototype is slightly different]
(cherry picked from commit b5b40a9843e505ed84153327ab897ca0e8d9a571)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
When data are sent via the zero-copy data forwarding, in h2_done_ff, we must
be sure to remove the H2 stream from the send list if something is send. It
was only performed if no blocking condition was encountered. But we must
also do it if something is sent. Otherwise the transfer may be blocked till
timeout.
This patch must be backported as far as 2.9.
(cherry picked from commit ded28f6e5c210b49ede7edb25cd4b39163759366)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>