Commit Graph

27315 Commits

Author SHA1 Message Date
Willy Tarreau
90494537b2 BUG/MEDIUM: debug/cli: fix "show threads" crashing with low thread counts
The "show threads" command introduced early in the 2.0 dev cycle uses
appctx->st1 to store its context (the number of the next thread to dump).
It goes back to an era where contexts were shared between the various
applets and the CLI's command handlers.

In fact it was already not good by then because st1 could possibly have
APPCTX_CLI_ST1_PAYLOAD (2) in it, that would make the dmup start at
thread 2, though it was extremely unlikely.

When contexts were finally cleaned up and moved to their own storage,
this one was overlooked, maybe due to using st1 instead of st2 like
most others. So it continues to rely on st1, and more recently some
new flags were appended, one of which is APPCTX_CLI_ST1_LASTCMD (16)
and is always there. This results in "show threads" to believe it must
start do dump from thread 16, and if this thread is not present, it can
simply crash the process. A tiny reproducer is:

  global
    nbthread 1
    stats socket /tmp/sock1 level admin mode 666

  $ socat /tmp/sock1 - <<< "show threads"

The fix for modern versions simply consists in assigning a context to
this command from the applet storage. We're using a single int, no need
for a struct, an int* will do it. That's valid till 2.6.

Prior to 2.6, better switch to appctx->ctx.cli.i0 or i1 which are all
properly initialized before the command is executed.

This must be backported to all stable versions.

Thanks to Andjelko Horvat for the report and the reproducer.

(cherry picked from commit e0e2b6613214212332de4cbad2fc06cf4774c1b0)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 11:56:13 +02:00
Christopher Faulet
9a55572ff8 BUG/MINOR: session: Eval L4/L5 rules defined in the default section
It is possible to define TCP/HTTP rules in a named default section to
inherit from it in a proxy. However, there is an issue with L4/L5 rules.
Only the lists of the current frontend are checked to know if an eval must
be performed. Nothing is done for an empty list. Of course, the lists of the
default proxy must also be checked to be sure to not ignored default L4/L5
rules. It is now fixed.

This patch should fix the issue #2637. It must be backported as far as 2.6.

(cherry picked from commit 076444550583acc11ef7fce7e7e740f039125696)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 11:56:13 +02:00
Amaury Denoyelle
aca17100a0 CLEANUP: quic: rename TID affinity elements
This commit is the renaming counterpart of the previous one, this time
for quic_conn module. Several elements related to TID affinity update
from quic_conn has been renamed : public functions, but also flag
renamed to QUIC_FL_CONN_TID_REBIND and trace event to
QUIC_EV_CONN_BIND_TID.

This should be backported with the same instruction as the previous
commit.

(cherry picked from commit 3be58fc720c406ce4f4dfc70b87662cef4838886)
[wt: dropped the BUG_ON() from quic_conn since bfdf145859d not backported]
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 11:56:13 +02:00
Amaury Denoyelle
8669ecdf8a CLEANUP: proto: rename TID affinity callbacks
Since the following patch, protocol API to update a connection TID
affinity has been extended.
  commit 1a43b9f32c71267e3cb514aa70a13c75adb20742
  MINOR: proto: extend connection thread rebind API

The single callback set_affinity has been splitted in 3 different
functions which are called at different stages during listener_accept(),
depending on accept queue push success or not. However, the naming was
rendered confusing by the usage of function prefix 1 and 2.

Rename proto callback related to TID affinity update and use the
following names :

* bind_tid_prep
* bind_tid_commit
* bind_tid_reset

This commit should probably be backported at least up to 3.0 with the
above patch. This is because the fix was recently backported and it
would allow to keep changes minimal between the two versions. It could
even be backported up to 2.8 if there is no major conflict.

(cherry picked from commit 9fbe8b03346a98cc8cc7b47eaa68935b1d4b3916)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 11:56:13 +02:00
Amaury Denoyelle
6772e175ad BUG/MEDIUM: quic: prevent crash on accept queue full
Handshake for quic_conn instances runs on a single non-chosen thread. On
completion, listener_accept() is performed to select the less loaded
thread before initializing connection instance. As such, quic_conn
instance is migrated to the thread with its upper connection.

In case accept queue is full, listener_accept() fallback to local accept
mode, which cause the connection to be assigned to the current thread.
However, this is not supported by QUIC as quic_conn instance is left on
the previously selected thread. In most cases, this will cause a
BUG_ON() due to a task manipulation from an outside thread.

To fix this, handle quic_conn thread rebind in multiple steps using the
new extended protocol API. Several operations have been moved from
qc_set_tid_affinity1() to newly defined qc_set_tid_affinity2(), in
particular CID TID update. This ensures that quic_conn instance is not
prematurely accessed on the new thread until accept queue push is
guaranteed to succeed.

qc_reset_tid_affinity() is also newly defined to reassign the newly
created tasks and tasklets to the current thread. This is necessary to
prevent the BUG_ON() crash described above.

This must be backported up to 2.8 after a period of observation. Note
that it depends on previous patch :
  MINOR: proto: extend connection thread rebind API

(cherry picked from commit 95f624540b87e06e7a3c36b8c1ed4d76f0add2dc)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 11:56:13 +02:00
Willy Tarreau
22db679256 BUILD: listener: silence a build warning about unused value without threads
A variable introduced in commit 1a43b9f32c ("MINOR: proto: extend
connection thread rebind API") is not used without threads and causes a
build warning. Let's just mark it maybe_unused.

Since the commit above is tagged for backporting, this one will need to
be backported along with it.

(cherry picked from commit 0cb874320951bb3202a25c28334657edef77227b)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 11:56:13 +02:00
Amaury Denoyelle
b9d8312e34 MINOR: proto: extend connection thread rebind API
MINOR: listener: define callback for accept queue push

Extend API for connection thread rebind API by replacing single callback
set_affinity by three different ones. Each one of them is used at a
different stage of the operation :

* set_affinity1 is used similarly to previous set_affinity

* set_affinity2 is called directly from accept_queue_push_mp() when an
  entry has been found in accept ring. This operation cannot fail.

* reset_affinity is called after set_affinity1 in case of failure from
  accept_queue_push_mp() due to no space left in accept ring. This is
  necessary for protocols which must reconfigure resources before
  fallback on the current tid.

This patch does not have any functional changes. However, it will be
required to fix crashes for QUIC connections when accept queue ring is
full. As such, it must be backported with it.

(cherry picked from commit 1a43b9f32c71267e3cb514aa70a13c75adb20742)
[wt: backported for ease of maintenance, as suggested in commit
     9fbe8b0334 ("CLEANUP: proto: rename TID affinity callbacks") also
     marked for backporting]
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-29 11:55:21 +02:00
Willy Tarreau
95a607c4b3 [RELEASE] Released version 3.0.3
Released version 3.0.3 with the following main changes :
    - BUG/MINOR: log: fix broken '+bin' logformat node option
    - DEBUG: hlua: distinguish burst timeout errors from exec timeout errors
    - REGTESTS: ssl: fix some regtests 'feature cmd' start condition
    - BUG/MEDIUM: proxy: fix email-alert invalid free
    - DOC: configuration: fix alphabetical order of bind options
    - DOC: management: document ptr lookup for table commands
    - BUG/MAJOR: quic: fix padding with short packets
    - SCRIPTS: git-show-backports: do not truncate git-show output
    - DOC: api/event_hdl: small updates, fix an example and add some precisions
    - BUG/MINOR: h3: fix crash on STOP_SENDING receive after GOAWAY emission
    - BUG/MINOR: mux-quic: fix crash on qcs SD alloc failure
    - BUG/MINOR: h3: fix BUG_ON() crash on control stream alloc failure
    - BUG/MINOR: quic: fix BUG_ON() on Tx pkt alloc failure
    - DEV: flags/show-fd-to-flags: adapt to recent versions
    - BUG/MINOR: hlua: report proper context upon error in hlua_cli_io_handler_fct()
    - BUG/MEDIUM: stick-table: Decrement the ref count inside lock to kill a session
    - DOC: configuration: add details about crt-store in bind "crt" keyword
    - BUG/MINOR: server: fix first server template name lookup UAF
    - MINOR: activity: make the memory profiling hash size configurable at build time
    - BUG/MEDIUM: server/dns: prevent DOWN/UP flap upon resolution timeout or error
    - BUG/MEDIUM: h3: ensure the ":method" pseudo header is totally valid
    - BUG/MEDIUM: h3: ensure the ":scheme" pseudo header is totally valid
    - BUG/MEDIUM: quic: fix race-condition in quic_get_cid_tid()
    - BUG/MINOR: quic: fix race condition in qc_check_dcid()
    - BUG/MINOR: quic: fix race-condition on trace for CID retrieval
    - BUG/MEDIUM: quic: fix possible exit from qc_check_dcid() without unlocking
    - BUG/MINOR: promex: Remove Help prefix repeated twice for each metric
    - BUG/MEDIUM: hlua/cli: Fix lua CLI commands to work with applet's buffers
    - DOC: configuration: more details about the master-worker mode
    - BUG/MEDIUM: server: fix race on server_atomic_sync()
    - BUG/MINOR: jwt: don't try to load files with HMAC algorithm
    - MEDIUM: init: set default for fd_hard_limit via DEFAULT_MAXFD
    - DOC: configuration: update maxconn description
    - BUG/MEDIUM: peers: Fix crash when syncing learn state of a peer without appctx
    - Revert "MEDIUM: init: set default for fd_hard_limit via DEFAULT_MAXFD"
    - BUG/MINOR: jwt: fix variable initialisation
    - BUG/MINOR: h1: Fail to parse empty transfer coding names
    - BUG/MINOR: h1: Reject empty coding name as last transfer-encoding value
    - BUG/MEDIUM: h1: Reject empty Transfer-encoding header
    - BUG/MEDIUM: spoe: Be sure to create a SPOE applet if none on the current thread
    - DEV: flags/quic: decode quic_conn flags
    - BUG/MEDIUM: bwlim: Be sure to never set the analyze expiration date in past
2024-07-11 16:05:19 +02:00
Christopher Faulet
9613cd06a5 BUG/MEDIUM: bwlim: Be sure to never set the analyze expiration date in past
Every time a bandwidth limitation is evaluated on a channel, the analyze
expiration date is renewed, mainly based on the internal bandwidth
limitation filter expiration date. However, when the filter is called while
there is no data to filter, we skip all limitation computations to jump at
the end of the function. At this stage, the analyze expiration date is
renewed before exiting. But here the internal expiration date may be expired
and not reset.

To sum up, it is possible to set the analyze expiration date of a channel in
the past. It is unexpected and this could lead to a loop in process_stream.

To fix the issue, we just now take care to reset the internal expiration
date, if needed, before exiting.

This patch should fix the issue #2634. It must be backported as far as 2.8.

(cherry picked from commit 2cb5b7dca688e59146256833153ad700302004ba)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-11 15:44:58 +02:00
Amaury Denoyelle
49b64e6539 DEV: flags/quic: decode quic_conn flags
Decode quic_conn flags via qc_show_flags() function.

To support this, quic flags definition have been put outside of USE_QUIC
directive.

(cherry picked from commit 19b8c1b7cdda84775eb1afb452eee044d4920d4a)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-11 15:44:58 +02:00
Christopher Faulet
d13428f750 BUG/MEDIUM: spoe: Be sure to create a SPOE applet if none on the current thread
When a message is queued, waiting to be processed by a SPOE applet, there
are some heuristic to know if a new applet must be created or not. There are
2 conditions to skip the applet creation:

  1 - if there are enough idle applets on the current thread, or,

  2 - if the processing rate on the current thread is high enough to handle
      this new message

In the 2nd case, there is a flaw when the number of processed messages falls
to zero while the processing rate is still greater than zero. In that case,
we will skip the SPOE applet creation without taking care to check there is
at least one applet on the current thread.

So now, the conditions above to skip the SPOE applet creation are only
evaluated if there is at least one applet on the current thread.

This patch must be backported to every stable versions.

(cherry picked from commit 5e84f13a0b3b915990a4e25ec9448fdbef3c1a14)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-11 15:44:58 +02:00
Christopher Faulet
276c0ce8e0 BUG/MEDIUM: h1: Reject empty Transfer-encoding header
The Transfer-Encoding headers list the transfer coding that have been
applied to the content in order to form the message body. It is a list of
tokens. And as specified by RFC 9110, a token cannot be empty. When several
coding names are specify as a comma-separated value, this case is properly
handled and an error is triggered. However, an empty header value will just
be skipped and no error is triggered. This could be an issue with some buggy
servers.

Now, empty Transfer-Encoding header are rejected too.

This patch must be backported as far as 2.6.

(cherry picked from commit 4a2dd6f3777959187565edd79475091e155e2161)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-11 15:44:58 +02:00
Christopher Faulet
f403f36c2a BUG/MINOR: h1: Reject empty coding name as last transfer-encoding value
The following Transfer-Encoding header is now rejected with a
400-bad-request:

  Transfer-Encoding: chunked,\r\n

This case was not properly handled and the last empty value was just
ignored.

This patch must be backported as far as 2.6.

(cherry picked from commit 428451fe96d9ad9ba8ef0f0669e145a37d97304d)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-11 15:44:58 +02:00
Christopher Faulet
9df5f65f26 BUG/MINOR: h1: Fail to parse empty transfer coding names
Empty transfer coding names, inside a comma-separated list, are already
rejected. But it is only by chance. Today, it is detected as an unknown
coding names (not "chunked" concretly). Then, it is handled by the H1
multiplexer as an error and a 422-Unprocessable-Content response is
returned.

So, the error is properly detected in this case, but it is not accurate. A
400-bad-request response must be returned instead. Then, it is better to
catch the error during the header parsing. It is the purpose of this patch.

This patch should be backported as far as 2.6.

(cherry picked from commit b8b01027603ae53fdebc7c63c4dacf0908eaef82)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-11 15:44:58 +02:00
William Lallemand
d3ffe441cc BUG/MINOR: jwt: fix variable initialisation
Set the alg variable from sample_conv_jwt_verify_check() to
JWT_ALG_DEFAULT.

This was reported by coverity in #2630, but since you need to use the
first argument to use the 2nd, this has no real impact.

Mut be backported with 883f1bd (as far as 2.6).

(cherry picked from commit 0a1b251c1a2ac55e135db0f5ac3d241e218308b4)
Signed-off-by: Willy Tarreau <w@1wt.eu>
2024-07-11 15:44:58 +02:00
Willy Tarreau
c742566b5b Revert "MEDIUM: init: set default for fd_hard_limit via DEFAULT_MAXFD"
This reverts the following commit:
  e3aefc50d8 ("MEDIUM: init: set default for fd_hard_limit via DEFAULT_MAXFD")

Lukas expressed some concerns about possible consequences of this change
so let's wait for a consensus to be found in mainline before we backport
anything (if at all), as we certainly don't want to change the behavior
after it's backported. No version was released with this patch, it's the
right moment to revert it. For reference, the discussion is here:

    https://www.mail-archive.com/haproxy@formilux.org/msg45098.html

Please note that if it were to be re-introduced later, it should be
applied along with a small fix that already references it.
2024-07-11 15:44:52 +02:00
Christopher Faulet
73cb6e9fa9 BUG/MEDIUM: peers: Fix crash when syncing learn state of a peer without appctx
For a given peer, the synchronization of the learn state is no longer
performed in the peer appctx. It is delayed to be handled by the peers sync
task. It means that for a given peer, it is possible to have finished to
learn and only handle it after the appctx release. So the synchronization
may happen on a peer without appctx.

This was not tested and an unconditionnal wakeup on the appctx could lead to
a crash because of a NULL-deref. It may be experienced by running
reg-tests/peers/tls_basic_sync.vtc script in loop. The fix is obivous. In
sync_peer_learn_state(), we must omit to wakeup the appctx if it was already
released.

This patch should fix issue #2629. It must be backported to 3.0.

(cherry picked from commit 3e2d1476e65ed45a38ed153ad2357d60755be8e9)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-07-05 12:21:52 +02:00
Valentine Krasnobaeva
72abbbb784 DOC: configuration: update maxconn description
Let's update maxconn keyword description, in order to make it clear, which
setting has the precedence over the global.maxconn and the SYSTEM_MAXCONN if
set.

(cherry picked from commit ff024206f0e0235551395c496e1aa7f23b74bf56)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-07-05 12:21:13 +02:00
Valentine Krasnobaeva
e3aefc50d8 MEDIUM: init: set default for fd_hard_limit via DEFAULT_MAXFD
Let's provide a default value for fd_hard_limit, if it's not set in the
configuration. With this patch we could set some specific default via
compile-time variable DEFAULT_MAXFD as well. Hope, this will be helpfull for
haproxy package maintainers.

    make -j 8 TARGET=linux-glibc DEBUG=-DDEFAULT_MAXFD=50000

If haproxy is comipled without DEFAULT_MAXFD defined, the default will be set
to 1048576.

This is done to avoid killing the process by its watchdog, while it started
without any limitations in its configuration or in the command line and the
hard RLIMIT_NOFILE is extremely huge (~1000000000). We use in this case
compute_ideal_maxconn() to calculate maxconn and maxsock, maxsock defines the
size of internal fdtab, which becames very-very large as well. When
the process starts to simply loop over this fdtab (0(n)), this takes a lot of
time, so watchdog does it job.

To avoid this, maxconn now is always reduced to some reasonable value either
by explicit global.fd-hard-limit from configuration, or by its default. The
default may be changed at build-time and overwritten then by
global.fd-hard-limit at runtime. Explicit global.fd-hard-limit from the
configuration has always precedence over DEFAULT_MAXFD, if set.

Must be backported in all stable versions until v2.6.0, including v2.6.0.

(cherry picked from commit 41275a691839df5f8dc7cb9faa4e259fbb755d34)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-07-05 12:20:58 +02:00
William Lallemand
639d800694 BUG/MINOR: jwt: don't try to load files with HMAC algorithm
When trying to use a HMAC algorithm (HS256, HS384, HS512) the
sample_conv_jwt_verify_check() function of the converter tries to load a
file even if it is only supposed to contain a secret instead of a path.

When using lua, the check function is called at runtime so it even tries
to load file at each call... This fixes the issue for HMAC algorithm
but this is still a problem with the other algorithms, since we don't
have a way of pre-loading files before the call.

Another solution must be found to prevent disk IO with lua using other
algorithms.

Must be backported as far as 2.6.

(cherry picked from commit 883f1bdbcec7882a2e4a257e93f92be604467319)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-07-05 12:17:44 +02:00
Amaury Denoyelle
420ecc0a47 BUG/MEDIUM: server: fix race on server_atomic_sync()
The following patch fixes a race condition during server addr/port
update :
  cd994407a9
  BUG/MAJOR: server/addr: fix a race during server addr:svc_port updates

The new update mechanism is implemented via an event update. It uses
thread isolation to guarantee that no other thread is accessing server
addr/port. Furthermore, to ensure server instance is not deleted just
before the event handler, server instance is lookup via its ID in proxy
tree.

However, thread isolation is only entered after server lookup. This
leaves a tiny race condition as the thread will be marked as harmless
and a concurrent thread can delete the server in the meantime. This
causes server_atomic_sync() to manipulated a deleted server instance to
reinsert it in used_server_addr backend tree. This can cause a segfault
during this operation or possibly on a future used_server_addr tree
access.

This issue was detected by criteo. Several backtraces were retrieved,
each related to server addr_node insert or delete operation, either in
srv_set_addr_desc(), or add/delete dynamic server handlers.

To fix this, simply extend thread isolation section to start it before
server lookup. This ensures that once retrieved the server cannot be
deleted until its addr/port are updated. To ensure this issue won't
happen anymore, a new BUG_ON() is added in srv_set_addr_desc().

Also note that ebpt_delete() is now called every time on delete handler
as this is a safe idempotent operation.

To reproduce these crashes, a script was executed to add then remove
different servers every second. In parallel, the following CLI command
was issued repeatdly without any delay to force multiple update on
servers port :

  set server <srv> addr 0.0.0.0 port $((1024 + RANDOM % 1024))

This must be backported at least up to 3.0. If above mentionned patch
has been selected for previous version, this commit must also be
backported on them.

(cherry picked from commit 50ae717624875cd36aac290ac5953062ee7de692)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-07-05 12:17:34 +02:00
William Lallemand
ea62653f9b DOC: configuration: more details about the master-worker mode
Add more details about the master-worker mode in the "master-worker"
global keyword.

Should fix issue #2198.

(cherry picked from commit 419b79492a2ae8c9323b907b9d2da85c1208c372)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-07-03 08:51:01 +02:00
Christopher Faulet
b3839e7d6b BUG/MEDIUM: hlua/cli: Fix lua CLI commands to work with applet's buffers
In 3.0, the CLI applet was rewritten to use its own buffers. However, the
lua part, used to register CLI commands at runtime, was not updated
accordingly. It means the lua CLI commands still try to write in the channel
buffers. This is of course totally unexepected and not supported. Because of
this bug, the applet hangs intead of returning the command result.

The registration of lua CLI commands relies on the lua TCP applets. So the
send and receive functions were fixed to use the applet's buffer when it is
required and still use the channel buffers otherwies. This way, other lua
TCP applets can still run on the legacy mode, without the applet's buffers.

This patch must be backported to 3.0.

(cherry picked from commit e5e36ce09722e63ca4542b0b2ff1a1eb905f8208)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-07-03 08:50:45 +02:00
Christopher Faulet
93097de986 BUG/MINOR: promex: Remove Help prefix repeated twice for each metric
When the support for modules was added, the function producing the #HELP
line of each metric was refactored. Since then, the prefix "#HELP
<metric-name>" is printed twice because a code block was not removed. It is
now fixed.

This patch must be backported to 3.0.

(cherry picked from commit b789cef91f5ac6123b130f5474b44fb7c57106ae)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-07-03 08:50:39 +02:00
Willy Tarreau
01e25a4cf2 BUG/MEDIUM: quic: fix possible exit from qc_check_dcid() without unlocking
Locking of the CID tree was extended in qc_check_dcid() by recent commit
05f59a5 ("BUG/MINOR: quic: fix race condition in qc_check_dcid()") but
there was a direct return from the middle of the function which was not
covered by the unlock, resulting in the function keeping the lock on
success return.

Let's just remove this return and replace it with a variable to merge all
exit paths.

This must be backported wherever the fix above is backported.

(cherry picked from commit 192abc6f834dcd09f310299afe253b17f9985407)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-07-03 08:50:08 +02:00
Amaury Denoyelle
9e913167ae BUG/MINOR: quic: fix race-condition on trace for CID retrieval
quic_rx_pkt_retrieve_conn() is used when parsing a received datagram
from the listener socket. It returned the quic_conn instance
corresponding to the first packet DCID, unless it is mapped to another
thread.

As expected, global CID tree access is protected by a lock in the
function. However, there is a race condition due to the final trace
where qc instance is dereferenced outside of the lock. Fix this by
adding a new trace under lock protection and remove qc deferencement at
function end.

This may fix first crash of github issue #2607.

This must be backported up to 2.8.

(cherry picked from commit bbb9f8248e29e89c288ad55a0fb7c71280a335a0)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-07-03 08:45:26 +02:00
Amaury Denoyelle
177bcf3b4c BUG/MINOR: quic: fix race condition in qc_check_dcid()
qc_check_dcid() is a function which check that a DCID is associated to
the expected quic_conn instance. This is used for quic_conn socket
receive handler as there is a tiny risk that a datagram to another
connection was received on this socket.

As other operations on global CID tree, a lock must be used to protect
against race condition. However, as previous commit, lock was not held
long enough as CID tree node is accessed outside of the lock region. To
fix this, increase critical section until CID dereferencement is done.

The impact of this bug should be similar to the previous one. However,
risk of crash are even less reduced as it should be extremely rare to
receive datagram for other connections on a quic_conn socket. As such,
most of the time first check condition of qc_check_dcid() is enough.

This may fix first crash of issue github #2607.

This must be backported up to 2.8.

(cherry picked from commit 05f59a51ac5ba193ef37447ac88f74d3019c3399)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-07-03 08:45:20 +02:00
Amaury Denoyelle
ab76fa58e8 BUG/MEDIUM: quic: fix race-condition in quic_get_cid_tid()
haproxy generates CID for clients which reuse them as DCID on their
packets. These CID are stored in a global tree quic_cid_trees. Each
operation on this tree must be done under lock protection.

quic_get_cid_tid() is a function which lookups a CID in global tree and
return the associated thread ID. This is used on datagram reception on
listener socket before redispatching the datagram to the correct thread.
This function uses a lock to protect quic_cid_trees access. However,
lock region is too small as CID tree node is accessed outside of it. Fix
this by extending lock protection for CID dereferencement until thread
ID is retrieved.

The impact of this bug is unknown, but it may possible cause crashes.
However, it is probably rare as most of datagram reception is done on
quic_conn socket which does not uses quic_get_cid_tid().

This may fix first crash of github issue #2607.

This must be backported up to 2.8.

(cherry picked from commit 72267ff35f7c82f5a32d99a03124b73d95b00a01)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-07-03 08:45:16 +02:00
Amaury Denoyelle
5ddc4004cb BUG/MEDIUM: h3: ensure the ":scheme" pseudo header is totally valid
Ensure pseudo-header scheme is only constitued of valid characters
according to RFC 9110. If an invalid value is found, the request is
rejected and stream is resetted.

It's the same as for previous commit "BUG/MEDIUM: h3: ensure the
":method" pseudo header is totally valid" except that this time it
applies to the ":scheme" pseudo header.

This must be backported up to 2.6.

(cherry picked from commit a3bed52d1f84ba36af66be4317a5f746d498bdf4)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-07-03 08:45:10 +02:00
Amaury Denoyelle
47d13c68cf BUG/MEDIUM: h3: ensure the ":method" pseudo header is totally valid
Ensure pseudo-header method is only constitued of valid characters
according to RFC 9110. If an invalid value is found, the request is
rejected and stream is resetted.

Previously only characters forbidden in headers were rejected (NUL/CR/LF),
but this is insufficient for :method, where some other forbidden chars
might be used to trick a non-compliant backend server into seeing a
different path from the one seen by haproxy. Note that header injection
is not possible though.

This must be backported up to 2.6.

Many thanks to Yuki Mogi of FFRI Security Inc for the detailed report
that allowed to quicky spot, confirm and fix the problem.

(cherry picked from commit 789d4abd7328f0a745d67698e89bbb888d4d9b2c)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-07-03 08:44:57 +02:00
Aurelien DARRAGON
a8740d1046 BUG/MEDIUM: server/dns: prevent DOWN/UP flap upon resolution timeout or error
This is a complementary patch to c16eba818 ("BUG/MEDIUM: server/dns:
preserve server's port upon resolution timeout or error").

Indeed, since c16eba818, the port is properly preserved, but unsetting
server's address this way results in server_atomic_sync() function
thinking that we're actually setting a new address and not unsetting
the previous one because addr family is != AF_UNSPEC.

Upon DNS timeout, this could be observed:

[WARNING]  (2588257) : Server http/s1 is going DOWN for maintenance (DNS timeout status). 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[WARNING]  (2588257) : Server http/s1 ('test1.localhost') is UP/READY (resolves again).

Notice that server timeouts and then immediately resolves again. Of course
in this case case the server's address was properly set to 0, meaning
that the server will not receive any traffic, but it is confusing and
could result in haproxy temporarily thinking that the server is actually
available while it's not.

To properly fix the issue and restore historical behavior, let's
explicitly set inetaddr's family to AF_UNSPEC after fetching original
server's address.

It should be backported in 3.0 with c16eba818.

(cherry picked from commit 80aba1d2844165d9c6929d31cc9c2fd2e92286ed)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-07-03 08:44:44 +02:00
Willy Tarreau
dd42e4233d MINOR: activity: make the memory profiling hash size configurable at build time
The MEMPROF_HASH_BITS variable was set to 10 without a possibility to
change it (beyond patching the code). After seeing a few reports already
with "other" being listed and a list with close to 1024 entries, it looks
like it's about time to either increase the hash size, or at least make
it configurable for special cases. As a reminder, in order to remain
fast, the algorithm searches no more than 16 places after the hash, so
when a table is almost full, searches are long and new places are rare.

The present patch just makes it possible to redefine it by passing
"-DMEMPROF_HASH_BITS=11" or "-DMEMPROF_HASH_BITS=12" in CFLAGS, and
moves the definition to defaults.h to make it easier to find. Such
values should be way sufficient for the vast majority of use cases.
Maybe in the future we'd change the default. At least this version
should be backported to ease rebuilds, say, till 2.8 or so.

(cherry picked from commit 290659ffd3a2eead918adc387e8842c59fbff2e7)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-07-03 08:43:49 +02:00
Aurelien DARRAGON
e3385b697e BUG/MINOR: server: fix first server template name lookup UAF
This is a follow-up for 7223296 ("BUG/MINOR: server: fix first server
template not being indexed").

Indeed, in 7223296 we added a new call to _srv_parse_set_id_from_prefix()
for the first server before handling additional ones. But we actually
overlooked the fact that _srv_parse_set_id_from_prefix() was already
performed at the end of _srv_parse_tmpl_init() for the same server.

Since _srv_parse_set_id_from_prefix() frees srv->id, it results in UAF
when performing name lookups on the first server, because used_server_name
node key still uses the freed string pointer.

The early _srv_parse_set_id_from_prefix() call (added in 7223296) and
the original one perform the same task, except that the new one is
followed by name node insertion logic required for name lookups to work
properly. So let's simply get rid of the old one at the end of the
function.

_srv_parse_set_id_from_prefix() in the 'err:' label was also removed since
is is now useless as well starting with 7223296 and would trigger the same
bug on error paths. Thanks to Amaury for noticing it.

This bug was discovered while trying to address GH issue #2620.
Thanks to @x-yuri for his detailed report (with working repro).

It should be backported in 3.0 with 7223296.

(cherry picked from commit eec804804212374739556175f81b234d7cc8c6f0)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-07-03 08:43:17 +02:00
William Lallemand
aeb5cbdb23 DOC: configuration: add details about crt-store in bind "crt" keyword
Add some details about the certificate storage cache system in the "crt"
bind keyword.

This should be backported to 3.0. Fix issue #2618.

(cherry picked from commit ba37ad41b26a6ba83581821c13426a7fbe4d2494)
Signed-off-by: William Lallemand <wlallemand@haproxy.com>
2024-07-02 16:49:08 +02:00
Christopher Faulet
933f35fe26 BUG/MEDIUM: stick-table: Decrement the ref count inside lock to kill a session
When we try to kill a session, the shard must be locked before decrementing
the ref count on the session. Otherwise, the ref count can fall to 0 and a
purge task (stktable_trash_oldest or process_table_expire) may release the
session before we have the opportunity to acquire the lock on the shard to
effectively kill the session. This could lead to a double free.

Here is the scenario:

    Thread 1                                 Thread 2

  sktsess_kill(ts)
    if (ATOMIC_DEC(&ts->ref_cnt) != 0)
        return
                   /* here the ref count is 0 */

                                       stktable_trash_oldest()
                                          LOCK(&sh_lock)
                                          if (!ATOMIC_LOAD(&ts->ref_cnf))
                                              __stksess_free(ts)
                                          UNLOCK(&sh_lock)

                  /* here the session was released */
    LOCK(&sh_lock)
    __stksess_free(ts)  <--- double free
    UNLOCK(&sh_lock)

The bug was introduced in 2.9 by the commit 7968fe3889 ("MEDIUM:
stick-table: change the ref_cnt atomically"). The ref count must be
decremented inside the lock for stksess_kill() and sktsess_kill_if_expired()
function.

This patch should fix the issue #2611. It must be backported as far as 2.9. On
the 2.9, there is no sharding. All the table is locked. The patch will have to
be adapted.

(cherry picked from commit 9357873641c5de29b169848fc1c808747818a1eb)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-06-26 15:16:25 +02:00
Aurelien DARRAGON
08cc27085e BUG/MINOR: hlua: report proper context upon error in hlua_cli_io_handler_fct()
As a result of copy pasting, hlua_cli_io_handler_fct() used to report lua
exceptions like E_ETMOUT as "Lua converter" instead of "Lua cli".

Let's fix that.

It could be backported to all stable versions.

[ada: for older versions, HLUA_E_BTMOUT case didn't exist so it has to be
 skipped]

(cherry picked from commit 185d230e2c615ee723270c81e4eb1eec20181918)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-06-26 15:16:24 +02:00
Willy Tarreau
d5c8676f4c DEV: flags/show-fd-to-flags: adapt to recent versions
The script hadn't been updated since it was introduced, and the
hard-coded field 12 doesn't match anymore (it's 16 now). Let's just
use "grep -o cflg..." to extract the desired part more flexibly.
This can be backported at least to 3.0, probably further, but it
will need to be tested prior to this. Better not bring it too far,
it's only used when debugging.

(cherry picked from commit a14c7d194ad27f9f84c9d42aab953a162999252a)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-06-26 15:16:24 +02:00
Amaury Denoyelle
078cb85b89 BUG/MINOR: quic: fix BUG_ON() on Tx pkt alloc failure
On quic_tx_packet allocation failure, it is possible to trigger BUG_ON()
crash on INITIAL packet building. This statement is responsible to
ensure INITIAL packets are padded to 1.200 bytes as required. If a
packet on higher encryption level allocation fails, PADDING frame cannot
properly encoded, despite the INITIAL packet properly built.

This crash happens due to qc_txb_store() invokation after quic_tx_packet
allocation failure to validate already built packets. However, this
statement is unneeded as qc_purge_tx_buf() is called just after. Simply
remove qc_txb_store() to fix this issue.

This was detected using -dMfail.

This should be backported up to 2.6.

(cherry picked from commit d5376b7a874776b4d5d79f9b746d4654df796f85)
[cf: ctx adjt]
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-06-26 15:16:24 +02:00
Amaury Denoyelle
83bd975406 BUG/MINOR: h3: fix BUG_ON() crash on control stream alloc failure
BUG_ON() from qcc_set_error() is triggered on HTTP/3 control stream
allocation failure. This is caused because both h3_finalize() and
qcc_init_stream_local() call qcc_set_error() which is forbidden to
prevent error code erasure.

Fix this by removing qcc_set_error() invocation from h3_finalize() on
allocation failure. Note that this function is still responsible to use
it on SETTING frame emission failure.

This was detected using -dMfail.

This must be backported up to 3.0.

(cherry picked from commit 5718c67c19766c87bb68b7624e1873a887fbbaf1)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-06-26 15:16:24 +02:00
Amaury Denoyelle
96c254fd3e BUG/MINOR: mux-quic: fix crash on qcs SD alloc failure
Since the following commit, sedesc are created since QCS instantiation
in qcs_new().
  086e51017e
  BUG/MEDIUM: mux-quic: Create sedesc in same time of the QUIC stream

However, sedesc is initialized before other QCS mandatory fields. If
sedesc allocation fails, a crash would occur on qcs_free() invocation
for QCS early release. To fix this, delay sedesc allocation until
function end.

This bug was detected using -dMfail.

This should be backported up to 2.6.

(cherry picked from commit 3aded1d3752a12af9b8e48f445218230e6967a06)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-06-26 15:16:24 +02:00
Amaury Denoyelle
5ab8106639 BUG/MINOR: h3: fix crash on STOP_SENDING receive after GOAWAY emission
After emitting a HTTP/3 GOAWAY frame, opening of streams higher than
advertised ID was prevented. h3_attach operation would return success
but without allocating H3S stream context for QCS. In addition, the
stream would be immediately scheduled for RESET_STREAM emission.

Despite the immediate stream close, the current is not sufficient enough
and can cause crashes. When of this occurence can be found if
STOP_SENDING is the first frame received for a stream. A crash would
occur under qcc_recv_stop_sending() after h3_attach invokation, when
h3_close() is used which try to access to H3S context.

To fix this, change h3_attach API. In case of success, H3S stream
context is always allocated, even if the stream will be scheduled for
immediate close. This renders the code more reliable.

This crash should be extremely rare, as it can only happen after GOAWAY
emission, which is only used on soft-stop or reload.

This should solve the second crash occurence reported on GH #2607.

This must be backported up to 2.8.

(cherry picked from commit 85838822ba37a92b2dcc43205a07c2b33208b985)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-06-26 15:16:24 +02:00
Aurelien DARRAGON
75a9d54c67 DOC: api/event_hdl: small updates, fix an example and add some precisions
Fix an example suggesting that using EVENT_HDL_SUB_TYPE(x, y) with y being
0 was valid. Then add some notes to explain how to use
EVENT_HDL_SUB_FAMILY() and EVENT_HDL_SUB_TYPE() with valid values.

Also mention that the feature is available starting from 2.8 and not 2.7.
Finally, perform some purely cosmetic updates.

This could be backported in 2.8.

(cherry picked from commit 13e0972aeac275137b429163def950af88fecd46)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-06-26 15:16:24 +02:00
Amaury Denoyelle
a4c5cae3bb SCRIPTS: git-show-backports: do not truncate git-show output
git-show-backports lists a git-show command which can be used to inspect
all commits subject to backport. This command specifies formatting
option to reproduce default git-show output, especially for commit
messages indented with 4 spaces character. However, it also add wrapping
on message line longer than 72 characters. This reduce lisibility of
messages where large info are written such as backtraces.

Improve this by changing git-show format option. Use a limit value of 0
to disable wrapping while preserving indentation.

This could be backported to every stable version to simplify backporting
process.

(cherry picked from commit b27470fd1d06acd6dc33161e1fdb6743f72770df)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-06-26 15:16:24 +02:00
Amaury Denoyelle
9ecf24b43e BUG/MAJOR: quic: fix padding with short packets
QUIC sending functions were extended to be more flexible. Of all the
changes, they support now iterating over a variable instance of QEL
instance of only 2 previously. This change has rendered PADDING emission
less previsible, which was adjusted via the following patch :

  a60609f1aa3e5f61d2a2286fdb40ebf6936a80ee
  BUG/MINOR: quic: fix padding of INITIAL packets

Its main purpose was to ensure PADDING would only be generated for the
last iterated QEL instance, to avoid unnecessary padding. In parallel, a
BUG_ON() statement ensure that built INITIAL packets are always padded
to 1.200 bytes as necessary before emitted them.

This BUG_ON() statement caused crash in one particular occurence : when
building datagrams that mixed Initial long packets and 1-RTT short
packets. This last occurence type does not have a length field in its
header, contrary to Long packets. This caused a miscalculation for the
necessary padding size, with INITIAL packets not padded enough to reach
the necessary 1.200 bytes size.

This issue was detected on 3.0.2. It can be reproduced by using 0-RTT
combined with latency. Here are the used commands :

  $ ngtcp2-client --tp-file=/tmp/ngtcp2-tp.txt \
    --session-file=/tmp/ngtcp2-session.txt --exit-on-all-streams-close \
    127.0.0.1 20443 "https://[::]/?s=32o"
  $ sudo tc qdisc add dev lo root netem latency 500ms

Note that this issue cannot be reproduced on current dev version.
Indeed, it seems that the following patch introduce a slight change in
packet building ordering :

  cdfceb10ae136b02e51f9bb346321cf0045d58e0
  MINOR: quic: refactor qc_prep_pkts() loop

This must be backported to 3.0.

This should fix github issue #2609.

(cherry picked from commit c714b6bb55e34c7cd2cb3ff7dbed374e6b6eae65)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-06-26 15:16:24 +02:00
Aurelien DARRAGON
acb50d3e0e DOC: management: document ptr lookup for table commands
Add missing documentation and examples for the optional ptr lookup method
for table {show,set,clear} commands introduced in commit 9b2717e7 ("MINOR:
stktable: use {show,set,clear} table with ptr"), as initially described in
GH #2118.

It may be backported in 3.0.

(cherry picked from commit 7422f16da3b84829f2ecf3ff393584b5c5682e06)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-06-26 15:16:24 +02:00
William Lallemand
f06071283a DOC: configuration: fix alphabetical order of bind options
Put the curves, ecdhe, severity-output, v4v6 and v6only keyword at the
right place.

Fix issue #2594.

Could be backported in every stable versions.

(cherry picked from commit 0cc2913aec965dabc579cd90a3d91a440f29967c)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-06-26 15:16:24 +02:00
Aurelien DARRAGON
e8df66a4fc BUG/MEDIUM: proxy: fix email-alert invalid free
In fa90a7d3 ("BUG/MINOR: proxy: fix email-alert leak on deinit()"), I
tried to fix email-alert deinit() leak the simple way by leveraging
existing free_email_alert() helper function which was already used for
freeing email alert settings used in a default section.

However, as described in GH #2608, there is a subtelty that makes
free_email_alert() not suitable for use from free_proxy().

Indeed, proxy 'mailers.name' hint shares the same memory space than the
pointer to the corresponding mailers section (once the proxy is resolved,
name hint is replaced by the pointer to the section). However, since both
values share the same space (through union), we have to take care of not
freeing `mailers.name` once init_email_alert() was called on the proxy.

Unfortunately, free_email_alert() isn't protected against that, causing
double free() during deinit when mailers section is referenced from
multiple proxy sections. Since there is no easy fix, and that the leak in
itself isn't a big deal (fa90a7d3 was simply an opportunistic fix rather
than a must-have given that the leak only occurs during deinit and not
during runtime), let's actually revert the fix to restore legacy behavior
and prevent deinit errors.

Thanks to @snetat for having reported the issue on Github as well as
providing relevant infos to pinpoint the bug.

It should be backported everywhere fa90a7d3 was backported.
[ada: for versions prior to 3.0, simply revert the offending commit using
'git revert' as proxy_free_common() first appears in 3.0]

(cherry picked from commit 8e226682be904a6774f65e90bac0b674888cc293)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-06-26 15:16:24 +02:00
William Lallemand
98da55645c REGTESTS: ssl: fix some regtests 'feature cmd' start condition
Since patch fde517b ("REGTESTS: wolfssl: temporarly disable some failing
reg-tests") some 'feature cmd' lines have an extra quotation mark, so
they were disable in every cases.

Must be backported to 2.9.

(cherry picked from commit 6da0879083749d5f098b8b2f4d459a70260491d2)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-06-26 15:16:24 +02:00
Aurelien DARRAGON
4a744814ed DEBUG: hlua: distinguish burst timeout errors from exec timeout errors
hlua burst timeout was introduced in 58e36e5b1 ("MEDIUM: hlua: introduce
tune.lua.burst-timeout").

It is a safety measure that allows to detect when too much time is spent
on a single lua execution (between 2 interruptions/yields), meaning that
the current thread is not able to perform other tasks. Such scenario
should be avoided because it will cause thread contention which may have
negative performance impact and could cause the watchdog to trigger. When
the burst timeout is exceeded, the current Lua execution is aborted and a
timeout error is reported to the user.

Unfortunately, the same error is currently being reported for cumulative
(AKA execution) timeout and for burst timeout, which may be confusing to
the user.

Indeed, "execution timeout" error historically results from the current
hlua context exceeding the total (cumulative) time it's allowed to run.
It is set per lua context using the dedicated tunables:
 - tune.lua.session-timeout
 - tune.lua.task-timeout
 - tune.lua.service-timeout

We've already faced an user report where the user was able to trigger the
burst timeout and got "Lua task: execution timeout." error while the user
didn't set cumulative timeout. Thus the error was actually confusing
because it was indeed the burst timeout which was causing it due to the
use of cpu-intensive call from within the task without sufficient manual
"yield" keypoints around the cpu-intensive call to ensure it runs on a
dedicated scheduler cycle.

In this patch we make it so burst timeout related errors are reported as
"burst timeout" errors instead of "execution timeout" errors (which
in fact became the generic timeout errors catchall with 58e36e5b1).

To do this, hlua_timer_check() now returns a different value depending if
the exeeded timeout is the burst one or the cumulative one, which allows
us to return either HLUA_E_ETMOUT or HLUA_E_BTMOUT in hlua_ctx_resume().

It should improve the situation described in GH #2356 and may possibly be
backported with 58e36e5b1 to improve error reporting if it applies without
resistance.

(cherry picked from commit 983513d901bb7511ea6b1e8c3bb00d58a9d432f2)
[cf: No reason to backport further]
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-06-26 15:16:24 +02:00
Aurelien DARRAGON
8a110626bf BUG/MINOR: log: fix broken '+bin' logformat node option
In 12d08cf912 ("BUG/MEDIUM: log: don't ignore disabled node's options"),
while trying to restore historical node option inheritance behavior, I
broke the '+bin' logformat node option recently introduced in b7c3d8c87c
("MINOR: log: add +bin logformat node option").

Indeed, because of 12d08cf912, LOG_OPT_BIN is not set anymore on
individual nodes even if it was set globally, making the feature unusable.
('+bin' is also used for binary cbor encoding)

What I should have done instead is include LOG_OPT_BIN in the options
inherited from global ones. This is what's being done in this commit.
Misleading comment was adjusted.

It must be backported in 3.0 with 12d08cf912.

(cherry picked from commit 0030f722a2fa574d1e7d90e6f242e4b6a5ace355)
Signed-off-by: Christopher Faulet <cfaulet@haproxy.com>
2024-06-26 15:16:24 +02:00