When the stats code was moved to an applet, it wasn't completely
cleaned of its usage of the HTTP transaction and it used to store
the HTTP status in txn->status and to set the HTTP request date to
<now> from within the applet. This is totally wrong because the
applet is seen as a server from the HTTP engine, which parses its
response, so the http_txn must not be touched there.
This was made visible by the cache which would always exhibit a
negative TR log, indicating that nowhere in the code we took care of
setting s->logs.tv_request while the code above used to continue to
hide this. Another side effect of this issue is that under load, if
the stats applet call is delayed, the reported t_queue can
appear negative by being below tv_request-tv_accept.
This patch removes the assignment of tv_request and txn->status from
the applet code and instead sets the tv_request if still unset when
connecting to the applet. This ensures that all applets report correct
request timers now.
During high loads it becomes visible that the time drifts between threads,
sometimes showing tens of seconds after several minutes. The root cause is
the per-thread correction which is performed based on a local offset and
local time. But we can't use a unique global time either as we need the
thread-local time to be stable between two poll() calls.
This commit takes a stab at this problem by proceeding this way :
- a global "global_now" date is monotonic and common to all threads.
- each thread has its own local <now> which is resynced with <global_now>
on each invocation of tv_update_date()
- each thread detects its own drift based on its poll() timeout and its
local <now>, and recalculates its adjusted local time
- each thread then ensures its new local time is no older than the current
global time, otherwise it readjusts its local time to match this one
- finally each thread atomically updates the global time to match its own
local one
This guarantees a monotonic global time and a monotonic, stable local time.
It is still possible by definition for two threads to report a minor time
variation on subsequent events but that variation will only be caused by
the moment they sampled the time and is very small. When a common global
time is needed between all threads, global_now could be used as a reference
(with care). The wallclock time used in logs is still <date> anyway.
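The last two steps of the list above can be sketched as follows. This is only an illustration under simplifying assumptions: times are plain microsecond counters rather than struct timeval, and the names global_now_us, local_now_us and update_local_time() are hypothetical, not the ones used in the code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical names: a shared date that never goes backwards, and a
 * per-thread date that stays stable between two poll() calls.
 */
static _Atomic uint64_t global_now_us;
static _Thread_local uint64_t local_now_us;

/* Takes the thread's drift-corrected time, makes sure it is not older
 * than the global time, then publishes it. Returns the new local time.
 */
static uint64_t update_local_time(uint64_t adjusted_us)
{
	uint64_t seen = atomic_load(&global_now_us);

	/* Try to move the global time forward. When the CAS fails, <seen>
	 * is reloaded with the value another thread has just published.
	 */
	while (seen < adjusted_us &&
	       !atomic_compare_exchange_weak(&global_now_us, &seen, adjusted_us))
		;

	/* The global time was already ahead of us: adopt it. */
	if (seen > adjusted_us)
		adjusted_us = seen;

	local_now_us = adjusted_us;
	return local_now_us;
}
```

A thread would call update_local_time() once per polling loop iteration and use the returned value until the next call.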
With threads, it became mandatory to implement a thread-local time with
its own correction. However, it was noticed that during high thread
contention, the time correction could occasionally be wrong, reporting
huge negative or positive timers in logs. This was caused by the
conversion between struct timeval and a single 64-bit offset, due to
an erroneous shift and due to a loss of sign during the conversion.
Given that time_t is not always signed, and that timeval is not really
needed here, better avoid playing dangerous games with these operations
and use a single 64-bit offset holding a signed 32-bit offset for the
seconds part and an unsigned offset for the microsecond part.
It still supports atomic updates and doesn't cause issues anymore.
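A minimal sketch of such a packed offset is shown below. The layout (seconds in the upper 32 bits, microseconds in the lower 32 bits) and the ofs_*() names are assumptions made for the illustration, the message above does not spell them out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: signed seconds in bits 63..32, unsigned microseconds
 * (0..999999) in bits 31..0. A single 64-bit word can then be read and
 * written atomically.
 */
static uint64_t ofs_pack(int32_t sec, uint32_t usec)
{
	assert(usec < 1000000);
	/* go through uint32_t first so that a negative value is not
	 * sign-extended over the whole 64-bit word.
	 */
	return ((uint64_t)(uint32_t)sec << 32) | usec;
}

/* Extracts the seconds and restores their sign. The conversion back to
 * int32_t is implementation-defined in C but behaves as expected on
 * two's complement compilers such as gcc and clang.
 */
static int32_t ofs_sec(uint64_t ofs)
{
	return (int32_t)(uint32_t)(ofs >> 32);
}

static uint32_t ofs_usec(uint64_t ofs)
{
	return (uint32_t)ofs;
}
```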
This code has been used successfully a few times in the past to detect
that a pool was used after being freed. Its main goal is to allocate a
full page for each object so that they are always released individually
and unmapped from memory. This way if any part of the code references the
object after it was freed and before it is reallocated, a segv occurs at
the exact offending location. It does a few extra things such as writing
to the memory area before freeing to detect double-frees and free of
read-only areas, and placing the data at the end of the page instead of
the beginning so that out of bounds accesses are easier to spot. The
amount of memory used with this is huge (about 10 times the regular
usage) but it can be useful sometimes.
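The principle can be sketched as below. This is an illustration only, not the pool code itself: dbg_alloc() and dbg_free() are hypothetical names, and the sketch assumes that the object size is a multiple of the alignment it requires, since the object is placed against the end of the mapping.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Rounds <size> up to a whole number of pages. */
static size_t page_round(size_t size)
{
	size_t pg = (size_t)sysconf(_SC_PAGESIZE);

	return (size + pg - 1) / pg * pg;
}

/* Gives each object its own mapping and places it at the end of it, so
 * that an access past the object leaves the mapping.
 */
static void *dbg_alloc(size_t size)
{
	size_t len = page_round(size);
	char *area = mmap(NULL, len, PROT_READ | PROT_WRITE,
	                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (area == MAP_FAILED)
		return NULL;
	return area + len - size;
}

/* Writes to the object first so that a double free or the release of a
 * read-only area faults right here, then unmaps the whole area so that
 * any later access faults as well.
 */
static void dbg_free(void *ptr, size_t size)
{
	size_t len = page_round(size);
	char *area = (char *)ptr + size - len;

	*(volatile char *)ptr = 0;
	munmap(area, len);
}
```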
If we can't write early data for some reason, don't give up on reading:
there may still be early data to read, and if we don't do so, openssl's
internal state might be inconsistent and the handshake will fail.
The current code only tries to do the handshake when we can't send early
data if we're acting as a client, which is wrong: it has to be done on the
server side too, or we end up in an infinite loop.
While using mmap() to allocate pools for debugging purposes, kill -USR1 caused
libc aborts in deinit() on two calls to free() on proxies' tasks and the global
listener task. The issue comes from the fact that we're using free() to release
a task instead of task_free(), so the task was allocated from a pool and released
using a different method.
This bug has been there since at least 1.5, so a backport is desirable to all
maintained versions.
The cache was trying to remove objects from the tree while they were
already removed from it. We set the key to 0 as a check for not trying
to remove the object from the tree when we are still using the object.
The CLI command "show cache" displays the status of the cache. The first
displayed line shows the shctx information, including how many available
blocks it contains (blocks are 1k by default).
The following lines list the objects stored in the cache tree: the pointer,
the size of the object and how many blocks it uses, a refcount for the
number of users of the object, and the remaining expiration time (which
can be negative if expired).
Example:
$ echo "show cache" | socat - /run/haproxy.sock
0x7fa54e9ab03a: foobar (shctx:0x7fa54e9ab000, available blocks:3921)
0x7fa54ed65b8c (size: 43190 (43 blocks), refcount:2, expire: 2)
0x7fa54ecf1b4c (size: 45238 (45 blocks), refcount:0, expire: 2)
0x7fa54ed70cec (size: 61622 (61 blocks), refcount:0, expire: 2)
0x7fa54ecdbcac (size: 42166 (42 blocks), refcount:1, expire: 2)
0x7fa54ec9736c (size: 44214 (44 blocks), refcount:2, expire: 2)
0x7fa54eca28ec (size: 46262 (46 blocks), refcount:2, expire: -2)
Allow bigger objects to be cached in the shctx. The first
implementation only stored small SSL sessions, but we now want to store
bigger HTTP responses.
Being an external agent, it's confusing that it uses haproxy's internal
types and it seems to have encouraged other implementations to do so.
Let's completely remove any reference to struct sample and use the
native DATA types instead of converting to and from haproxy's sample
types.
Since we switched to notify mode in the systemd unit file in commit
d6942c8, haproxy won't start if the daemon keyword is present in the
configuration.
This change makes sure that haproxy remains in foreground when using
systemd mode and adds a note in the documentation.
The special case of the Cookie header field was overlooked in the
implementation, considering that most servers do handle cookie lists,
but as reported here on discourse it's not the case at all :
https://discourse.haproxy.org/t/h2-cookie-header-splitted-header/1742
This patch fixes this by skipping all occurrences of the Cookie header
in the request while building the H1 request, and then building a single
Cookie header with all values appended at once, according to what is
requested in RFC7540#8.1.2.5.
In order to build the list of values, the list struct is used as a linked
list (as there can't be more cookies than headers). This makes the list
walking quite efficient and ensures all values are quickly found without
having to rescan the list.
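The folding itself can be illustrated with the sketch below. It is deliberately simpler than what is described above: it rescans a plain array instead of chaining the cookie entries inside the header list, and struct hdr and merge_cookies() are hypothetical names. Header names are assumed to be lowercase, as they are in H2.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical header representation for this sketch. */
struct hdr {
	const char *name;
	const char *value;
};

/* Writes into <out> a single "cookie: a; b; c\r\n" line built from all
 * "cookie" entries of <list>, followed by a trailing zero. Returns the
 * line's length, 0 if there was no cookie, or -1 if <out> is too small.
 */
static int merge_cookies(const struct hdr *list, size_t count,
                         char *out, size_t size)
{
	size_t len = 0;
	int found = 0;

	for (size_t i = 0; i < count; i++) {
		if (strcmp(list[i].name, "cookie") != 0)
			continue;

		const char *sep = found ? "; " : "cookie: ";
		size_t slen = strlen(sep);
		size_t vlen = strlen(list[i].value);

		/* keep room for the CRLF and the trailing zero */
		if (len + slen + vlen + 3 > size)
			return -1;

		memcpy(out + len, sep, slen);
		len += slen;
		memcpy(out + len, list[i].value, vlen);
		len += vlen;
		found = 1;
	}

	if (!found)
		return 0;

	memcpy(out + len, "\r\n", 3); /* copies the trailing zero as well */
	return (int)(len + 2);
}
```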
A test case provided by Lukas shows that it properly works :
> GET /? HTTP/1.1
> user-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0
> accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
> accept-language: en-US,en;q=0.5
> accept-encoding: gzip, deflate
> referer: https://127.0.0.1:4443/?expectValue=1511294406
> host: 127.0.0.1:4443
< HTTP/1.1 200 OK
< Server: nginx
< Date: Tue, 21 Nov 2017 20:00:13 GMT
< Content-Type: text/html; charset=utf-8
< Transfer-Encoding: chunked
< Connection: keep-alive
< X-Powered-By: PHP/5.3.10-1ubuntu3.26
< Set-Cookie: HAPTESTa=1511294413
< Set-Cookie: HAPTESTb=1511294413
< Set-Cookie: HAPTESTc=1511294413
< Set-Cookie: HAPTESTd=1511294413
< Content-Encoding: gzip
> GET /?expectValue=1511294413 HTTP/1.1
> user-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0
> accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
> accept-language: en-US,en;q=0.5
> accept-encoding: gzip, deflate
> host: 127.0.0.1:4443
> cookie: SERVERID=s1; HAPTESTa=1511294413; HAPTESTb=1511294413; HAPTESTc=1511294413; HAPTESTd=1511294413
Many thanks to @Nurza, @adrianw and @lukastribus for their helpful reports
and investigations here.
The current H2 to H1 protocol conversion presents some issues which will
require some processing on certain headers before writing them,
so it's not possible to convert HPACK to H1 on the fly.
This commit modifies the headers decoding so that it now works in two
phases : hpack_decode_headers() only decodes the HPACK stream in the
HEADERS frame and puts the result into a list. Headers which require
storage (huffman-compressed or from the dynamic table) are stored in
a chunk allocated by the H2 demuxer. Then once the headers are properly
decoded into this list, h2_make_h1_request() is called with this list
to produce the HTTP/1.1 request into the destination buffer. The list
necessarily enforces a limit. Here we use 2*MAX_HTTP_HDR, which means
that we can have as many individual cookies as we have regular headers
if a client decides to break their cookies into multiple values. This
seems reasonable and will allow the H1 parser to decide whether it's
too much or not.
Thus the output stream is not produced on the fly anymore and this will
make it possible to deal with certain corner cases like repairing the
Cookie header (which for now is not done).
In order to limit header duplication and parsing, the known pseudo headers
continue to be passed by their index : the name element in the list then
has a NULL pointer and the value is the pseudo header's index. Given that
these ones represent about half of the incoming requests and need to be
found quickly, it maintains an acceptable level of performance.
The code was significantly reduced by doing this because the original code
had to deal with HPACK and H1 combinations (eg: index vs not indexed, etc)
and now the HPACK decoding is totally focused on the decompression, and
the H1 encoding doesn't have to deal with the issue of wrapping input for
example.
One bug was addressed here (though it couldn't happen at the moment). The
H2 demuxer used to detect a failure to write the request into the H1 buffer
and would then detect if the output buffer wraps, realign it and try again.
The problem by doing so was that the HPACK context was already modified and
not rewindable. Thus the size check is now performed first and a failure is
reported if it doesn't fit.
The current H2 to H1 protocol conversion presents some issues which will
require some processing on certain headers before writing them,
so it's not possible to convert HPACK to H1 on the fly.
Here we introduce a function which performs half of what hpack_decode_header()
used to do, which is to take a list of headers on input and emit the
corresponding request in HTTP/1.1 format. The code is the same and functions
were renamed to be prefixed with "h2" instead of "hpack", though it ends
up being simpler as the various HPACK-specific cases could be fused into
a single one (ie: add header).
Moving this part here makes a lot of sense as now this code is specific to
what is documented in HTTP/2 RFC 7540 and will be able to deal with special
cases related to H2 to H1 conversion enumerated in section 8.1.
Various error codes which were previously assigned to HPACK were never
used (aside from being negative) and were all replaced by -1 with a comment
indicating what error was detected. The code could be further factored
thanks to this but this commit focuses on compatibility first.
This code is not yet used but builds fine.
We used to return >0 indicating a success when an error was present on the
connection, preventing the caller from detecting and handling it. This for
example happens when sending too many headers in a frame, making the request
impossible to decompress.
Clang reports this warning :
src/server.c:872:14: warning: address of array 'check->desc' will
always evaluate to 'true' [-Wpointer-bool-conversion]
Indeed, check->desc used to be a pointer to a dynamically allocated area
a long time ago and is now an array. Let's remove the useless test.
Clang complains that h2_get_n64() is not used, and a few other protocol
specific functions may fall in that category depending on how the code
evolves. Better mark them unused to silence the warning since it's on
purpose.
While gcc only emits warnings about unused static functions, Clang also
emits such a warning when the functions are inlined. This is a bit
annoying at certain places where functions are provided to manipulate
multiple data types and are not yet used. Let's have a type modifier
"__maybe_unused" which sets the "unused" attribute like the Linux kernel
does. It's elegant as it allows the code author to indicate that they know
that this element might be unused. It works on variables as well, which
is convenient to remove ifdefs around local variables in certain functions,
but doesn't work on labels.
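A minimal sketch of the modifier and of its two uses follows. The macro definition mirrors the Linux kernel one. The functions read_u16() and parse_len() and the DEBUG_PARSE macro are hypothetical and only serve the illustration.

```c
#include <assert.h>

/* Same form as in the Linux kernel, understood by both gcc and clang. */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((unused))
#endif

/* An inline helper provided in advance: with the attribute, clang stays
 * quiet even if nothing calls it yet.
 */
static inline __maybe_unused int read_u16(const unsigned char *p)
{
	return (p[0] << 8) | p[1];
}

/* A local variable that is only read in some builds: no ifdef is needed
 * around its declaration anymore.
 */
static int parse_len(const unsigned char *p)
{
	__maybe_unused int first = p[0];

#ifdef DEBUG_PARSE
	assert(first >= 0);
#endif
	return p[0];
}
```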
Clang emits a warning about these types being redefined in eb32sctree
while they are already defined in eb32tree. Let's simply not redefine
them if eb32tree was already included.
[ plock commit 4c53fd3a0b2b1892817cebd0db012a52f4087850 ]
Pieter Baauw reported a build issue affecting haproxy after plock was
included. It happens that expressions of the form :
if ((const) ? (expr1) : (expr2))
do_something()
always produce code for both expr1 and expr2 on Clang when building
without optimization. The resulting asm code is even funny, basically
doing :
mov reg, 1
cmp reg, 1
...
This causes our sizeof() tests to fail to build because we purposely
dereference a fake function that reports the location and nature of the
inconsistency, but this fake function appears in the object code despite
all conditions being there to avoid it.
However the compiler is still smart enough to optimize away code doing
if (const)
do_something()
So we simply repeat the condition before do_something(), and the dummy
function is not referenced anymore unless really required.
[ plock commit 61e255286ae32e83e1a3174dd7c49eda99880a8b]
There are a few inlines such as pl_barrier() and pl_cpu_relax() which
are used a lot. Unfortunately, while building test code at -O0, inlining
is disabled and these ones are called a lot and show up a lot in any
profile, are traced into when single-stepping with a debugger, etc, thus
they are polluting the landscape. Since they're single-asm statements,
there is no reason for not turning them into macros.
The result becomes fairly visible here at -O0 :
$ size latency.inline latency.macro
   text    data     bss     dec     hex filename
  11431     692     656   12779    31eb treelock.inline
  10967     692     656   12315    301b treelock.macro
And it was verified that regularly optimized code remains strictly identical.
[ plock commit 44081ea493dd78dab48076980e881748e9b33db5 ]
Older compilers (eg: gcc 3.4) don't provide __sync_synchronize() so let's
do it by hand on this platform.
[ plock commit b155d5c762fb9a9793911881f80e61faa6b0e889 ]
Local variables "l", "i" and "ret" were renamed "__pl_l", "__pl_i" and
"__pl_r" respectively, to limit the risk of conflicts with existing
variables in application code.
[ plock commit bfac5887ebabb8ef753b0351f162265767eb219b ]
Local variable "t" was renamed "__pl_t" to limit the risk of conflicts
with existing variables in application code.
Call the shctx free_blocks callback in order to remove the row from the
cache tree.
Put the row in the hot list during allocation, which prevents its blocks
from being stolen by a free or a row_reserve.
This patch adds support for `Type=notify` to the systemd unit.
Supporting `Type=notify` improves both starting as well as reloading
of the unit, because systemd is notified when the action has completed.
See this quote from `systemd.service(5)`:
> Note however that reloading a daemon by sending a signal (as with the
> example line above) is usually not a good choice, because this is an
> asynchronous operation and hence not suitable to order reloads of
> multiple services against each other. It is strongly recommended to
> set ExecReload= to a command that not only triggers a configuration
> reload of the daemon, but also synchronously waits for it to complete.
By making systemd aware of a reload in progress it is able to wait until
the reload actually succeeded.
This patch introduces both a new `USE_SYSTEMD` build option which controls
including the sd-daemon library as well as a `-Ws` runtime option which
runs haproxy in master-worker mode with systemd support.
When haproxy is running in master-worker mode with systemd support it will
send status messages to systemd using `sd_notify(3)` in the following cases:
- The master process forked off the worker processes (READY=1)
- The master process entered the `mworker_reload()` function (RELOADING=1)
- The master process received the SIGUSR1 or SIGTERM signal (STOPPING=1)
Change the unit file to specify `Type=notify` and replace master-worker
mode (`-W`) with master-worker mode with systemd support (`-Ws`).
Future evolutions of this feature could include making use of the `STATUS`
feature of `sd_notify()` to send information about the number of active
connections to systemd. This would require bidirectional communication
between the master and the workers and thus is left for future work.
Commit 9aaf778 ("MAJOR: connection : Split struct connection into struct
connection and struct conn_stream.") had to change the way the stream
interface deals with incoming data to accommodate the mux. A break
statement got lost during a change, leading to the receive call being
performed twice even when CF_READ_DONTWAIT is set. The most noticeable
effect is that it made the bug described in commit 33982cb ("BUG/MAJOR:
stream: ensure analysers are always called upon close") much easier to
reproduce as it would appear even with an HTTP frontend.
Let's just restore the stream-interface flag and the break here, as in
the previous code.
No backport is needed as this was introduced during 1.8-dev.
A recent issue affecting HTTP/2 + redirect + cache has uncovered an old
problem affecting all existing versions regarding the way events are
reported to analysers.
It happens that when an event is reported, analysers see it and may
decide to temporarily pause processing and prevent other analysers from
processing the same event. Then the event may be cleared and upon the
next call to the analysers, some of them will never see it.
This is exactly what happens with CF_READ_NULL if it is received before
the request is processed, like during redirects : the first time, some
analysers see it, pause, then the event may be converted to a SHUTW and
cleared, and on next call, there's nothing to process. In practice it's
hard to get the CF_READ_NULL flag during the request because requests
have CF_READ_DONTWAIT, preventing the read0 from happening. But on
HTTP/2 it's presented along with any incoming request. Also on a TCP
frontend the flag is not set and it's possible to read the NULL before
the request is parsed.
This causes a problem when filters are present because flt_end_analyse
needs to be called to release allocated resources and remove the
CF_FLT_ANALYZE flag. And the loss of this event prevents the analyser
from being called and from removing itself, preventing the connection
from ever ending.
This problem just shows that the event processing needs a serious revamp
after 1.8. In the meantime we can deal with the really problematic case
which is that we *want* to call analysers if CF_SHUTW is set on any side
as it's the last opportunity to terminate a processing. It may
occasionally result in some analysers being called for nothing in
half-closed situations but it will take care of the issue.
An example of problematic configuration triggering the bug in 1.7 is :
frontend tcp
bind :4445
default_backend http
backend http
redirect location /
compression algo identity
Then submitting requests which immediately close will cause streams to
accumulate which will never be freed :
$ printf "GET / HTTP/1.1\r\n\r\n" >/dev/tcp/0/4445
This fix must be backported to 1.7 as well as any version where commit
c0c672a ("BUG/MINOR: http: Fix conditions to clean up a txn and to
handle the next request") was backported. This commit didn't cause the
bug but made it much more likely to happen.
Upon stream instantiation, we used to enable channel auto connect
and auto close to ease TCP processing. But commit 9aaf778 ("MAJOR:
connection : Split struct connection into struct connection and
struct conn_stream.") has revealed that it was a bad idea because
this commit enables reading of the trailing shutdown that may follow
a small request, resulting in a read and a shutr turned into shutw
before the stream even has a chance to apply the filters. This
causes an issue with impossible situations where the backend stream
interface is still in SI_ST_INI with a closed output, which blocks
some streams for example when performing a redirect with filters
enabled.
Let's change this so that we only enable these two flags if there is
no analyser on the stream. This way process_stream() has a chance to
let the analysers decide whether or not to allow the shutdown event
to be transferred to the other side.
It doesn't seem possible to trigger this issue before 1.8, so for now
it is preferable not to backport this fix.
Released version 1.8-rc4 with the following main changes :
- BUG/MEDIUM: cache: does not cache if no Content-Length
- BUILD: thread/pipe: fix build without threads
- BUG/MINOR: spoe: check buffer size before acquiring or releasing it
- MINOR: debug/flags: Add missing flags
- MINOR: threads: Use __decl_hathreads to declare locks
- BUG/MINOR: buffers: Fix b_alloc_margin to be "fonctionnaly" thread-safe
- BUG/MAJOR: ebtree/scope: fix insertion and removal of duplicates in scope-aware trees
- BUG/MAJOR: ebtree/scope: fix lookup of next node in scope-aware trees
- MINOR: ebtree/scope: add a function to find next node from a parent
- MINOR: ebtree/scope: simplify the lookup functions by using eb32sc_next_with_parent()
- BUG/MEDIUM: mworker: Fix re-exec when haproxy is started from PATH
- BUG/MEDIUM: cache: use msg->sov to forward header
- MINOR: cache: forward data with headers
- MINOR: cache: disable cache if shctx_row_data_append fail
- BUG/MINOR: threads: tid_bit must be a unsigned long
- CLEANUP: tasks: Remove useless double test on rq_next
- BUG/MEDIUM: standard: itao_str/idx and quote_str/idx must be thread-local
- MINOR: tools: add a function to dump a scope-aware tree to a file
- MINOR: tools: improve the DOT dump of the ebtree
- MINOR: tools: emphasize the node being worked on in the tree dump
- BUG/MAJOR: ebtree/scope: properly tag upper nodes during insertion
- DOC: peers: Add a first version of peers protocol v2.1.
- CONTRIB: Wireshark dissector for HAProxy Peer Protocol.
- MINOR: mworker: display an accurate error when the reexec fail
- BUG/MEDIUM: mworker: wait again for signals when execvp fail
- BUG/MEDIUM: mworker: does not deinit anymore
- BUG/MEDIUM: mworker: does not close inherited FD
- MINOR: tests: add a python wrapper to test inherited fd
- BUG/MINOR: Allocate the log buffers before the proxies startup
- MINOR: tasks: Use a bitfield to track tasks activity per-thread
- MAJOR: polling: Use active_tasks_mask instead of tasks_run_queue
- MINOR: applets: Use a bitfield to track applets activity per-thread
- MAJOR: polling: Use active_appels_mask instead of applets_active_queue
- MEDIUM: applets: Don't process more than 200 active applets at once
- MINOR: stream: Add thread-mask of tasks/FDs/applets in "show sess all" command
- MINOR: SSL: Store the ASN1 representation of client sessions.
- MINOR: ssl: Make sure we don't shutw the connection before the handshake.
- BUG/MEDIUM: deviceatlas: ignore not valuable HTTP request data
A customer reported a crash of the module when some headers were not set
in the HTTP request. The module now ignores them, since empty data have
no value for the detection.
Needs to be backported to 1.7.
Instead of storing the SSL_SESSION pointer directly in the struct server,
store the ASN1 representation, otherwise, session resumption is broken with
TLS 1.3, when multiple outgoing connections want to use the same session.
Now, we process at most 200 active applets per call to applet_run_active. We use
the same limit as the tasks. With the cache filter and the SPOE, the number of
active applets can now be huge. So, it is important to limit the number of
applets processed in applet_run_active.
applets_active_queue is the active queue size. It is a global variable, so it
is suboptimal: we may be led to consider that there are active applets for a
thread while in fact all active applets are assigned to the other threads. In
such cases, the polling loop will be evaluated many more times than necessary.
Instead, we now check if the thread id is set in the bitfield active_applets_mask.
This is specific to threads, no backport is needed.
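The idea of a per-thread activity bitfield can be sketched as below. Only the principle comes from the message above: the names active_mask, mark_active(), mark_idle() and thread_has_work() are hypothetical and the real code does not necessarily use C11 atomics.

```c
#include <assert.h>
#include <stdatomic.h>

/* One bit per thread, set while that thread has at least one active
 * applet. Hypothetical name for this sketch.
 */
static _Atomic unsigned long active_mask;

static void mark_active(unsigned int tid)
{
	assert(tid < 8 * sizeof(unsigned long));
	atomic_fetch_or(&active_mask, 1UL << tid);
}

static void mark_idle(unsigned int tid)
{
	assert(tid < 8 * sizeof(unsigned long));
	atomic_fetch_and(&active_mask, ~(1UL << tid));
}

/* The polling loop of thread <tid> only needs to look at its own bit
 * instead of a global counter that also covers the other threads.
 */
static int thread_has_work(unsigned int tid)
{
	assert(tid < 8 * sizeof(unsigned long));
	return (int)((atomic_load(&active_mask) >> tid) & 1UL);
}
```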