Indicate for each statistic which types may have a value for
that statistic.
Explain some of the provided statistics a little more deeply.
(cherry picked from commit ebe62d645b45aa2210ef848fa16805a0aba7d75a)
Listening to an abstract namespace socket is quite convenient but
comes with some drawbacks that must be clearly understood when the
socket is being listened to by multiple processes. The trouble is
that the socket cannot be rebound if a new process attempts a soft
restart and fails, so only one of the initially bound processes
will still be bound to it, while the other ones will fail to rebind. For
most situations it's not an issue but it needs to be indicated.
(cherry picked from commit 70f72e0c90691c72cb72306b718f785902270015)
Abstract namespace sockets ignore the shutdown() call and do not make
it possible to temporarily stop listening. The issue it causes is that
during a soft reload, the new process cannot bind, complaining that the
address is already in use.
This change registers a new pause() function for unix sockets and
completely unbinds the abstract ones since it's possible to rebind
them later. It requires the two previous patches as well as preceding
fixes.
This fix should be backported into 1.5 since the issue appears there.
(cherry picked from commit fd0e008d9d4db2f860b739bd28f6cd31d9aaf2b5)
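The "address already in use" failure described above can be reproduced outside of haproxy with a minimal sketch. It assumes Linux (abstract namespace sockets are Linux-specific), and bind_abstract() is a hypothetical helper written for this illustration, not a function from the haproxy tree:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Bind and listen on a Linux abstract namespace socket called <name>.
 * Returns the fd on success, or -1 with errno set by the failing call.
 * Hypothetical helper, for illustration only.
 */
static int bind_abstract(const char *name)
{
	struct sockaddr_un addr;
	size_t len = strlen(name);
	int fd;

	if (len > sizeof(addr.sun_path) - 1) {
		errno = ENAMETOOLONG;
		return -1;
	}

	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	/* a leading NUL byte in sun_path selects the abstract namespace */
	memcpy(addr.sun_path + 1, name, len);

	if (bind(fd, (struct sockaddr *)&addr,
	         offsetof(struct sockaddr_un, sun_path) + 1 + len) < 0 ||
	    listen(fd, 16) < 0) {
		int err = errno;

		close(fd);
		errno = err;
		return -1;
	}
	return fd;
}
```

While a first descriptor returned by bind_abstract() is still open, a second call with the same name is expected to fail with EADDRINUSE. Once the first descriptor is closed, the name can be bound again, which is the property the unbind-on-pause approach relies on.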
When a listener resumes operations, supporting a full rebind makes it
possible to perform a full stop as a pause(). This will be used for
pausing abstract namespace unix sockets.
(cherry picked from commit 1c4b814087189b4b0225a473b7cb0a844bc30839)
In order to fix the abstract socket pause mechanism during soft restarts,
we'll need to proceed differently depending on the socket protocol. The
pause_listener() function already supports some protocol-specific handling
for the TCP case.
This commit makes this cleaner by adding a new ->pause() function to the
protocol struct, which, if defined, may be used to pause a listener of a
given protocol.
For now, only TCP has been adapted, with the specific code moved from
pause_listener() to tcp_pause_listener().
(cherry picked from commit 092d865c53de80afc847c5ff0a079b414041ce2a)
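The idea of an optional per-protocol ->pause() hook can be sketched as follows. The structures are trimmed-down stand-ins and demo_tcp_pause() is a hypothetical hook; neither reproduces the real haproxy definitions or the real TCP pause sequence:

```c
#include <stddef.h>
#include <sys/socket.h>

struct listener;

/* Hypothetical, trimmed-down protocol descriptor: only the hook matters. */
struct protocol {
	const char *name;
	int (*pause)(struct listener *l);   /* optional, may be NULL */
};

struct listener {
	int fd;
	int paused;
	const struct protocol *proto;
};

/* Example protocol-specific hook: returns > 0 on success, <= 0 on error. */
static int demo_tcp_pause(struct listener *l)
{
	return shutdown(l->fd, SHUT_RD) == 0 ? 1 : -1;
}

/* Generic entry point: delegate to the protocol hook when one is defined.
 * Returns non-zero on success, zero on failure.
 */
static int pause_listener(struct listener *l)
{
	if (l->proto->pause && l->proto->pause(l) <= 0)
		return 0;
	l->paused = 1;
	return 1;
}
```

With this layout, a protocol that needs nothing special simply leaves the hook set to NULL, and the generic code no longer has to know about any protocol in particular.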
Jan Seda noticed that abstract sockets are incompatible with soft reload,
because the new process cannot bind and immediately fails. This patch marks
the binding as retryable and not fatal so that the new process can try to
bind again after sending a signal to the old process.
Note that this fix is not enough to completely solve the problem, but it
is necessary. This patch should be backported to 1.5.
(cherry picked from commit 3c5efa2b3268f31cffc2c18887010d4bc906a066)
This is currently harmless, but when stopping a listener, its fd is
closed but not set to -1, so it is not possible to re-open it again.
Currently this has no impact, but it may have one after the abstract
sockets are modified to perform a complete close on soft-reload.
The fix can be backported to 1.5 and may even apply to 1.4 (protocols.c).
(cherry picked from commit 39447b6a5799a160eae452db920fd0735a78638b)
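The pattern behind this fix is small enough to show in isolation. The structure and function names below are hypothetical and only illustrate the "close, then mark as closed" idiom:

```c
#include <unistd.h>

/* Hypothetical stand-in for a listener, reduced to its file descriptor. */
struct demo_listener {
	int fd;
};

/* Close the listening socket and mark it as closed, so that later code
 * can tell that the listener has to be opened again before use.
 */
static void stop_listener(struct demo_listener *l)
{
	if (l->fd >= 0) {
		close(l->fd);
		l->fd = -1;
	}
}
```

Calling stop_listener() twice is harmless here, since the second call sees fd == -1 and does nothing.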
As usual, when touching any is* function, Solaris complains about the
type of the element being checked. Better backport this to 1.5 since
nobody knows what the emitted code looks like since macros are used
instead of functions.
(cherry picked from commit 506c69a50e8d434b6b0c2c89b0402f220830644d)
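For reference, the usual way to keep is* calls portable is to cast the argument to unsigned char, since passing a negative plain char is undefined behaviour and implementations built on lookup-table macros tend to warn about it. count_digits() below is a hypothetical example, not code from the patch:

```c
#include <ctype.h>
#include <stddef.h>

/* Count the leading decimal digits in <s>. */
static size_t count_digits(const char *s)
{
	size_t n = 0;

	/* cast to unsigned char: a negative char must not reach isdigit() */
	while (isdigit((unsigned char)s[n]))
		n++;
	return n;
}
```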
When bind() fails (function uxst_bind_listener()), the fail path doesn't
consider the abstract namespace and tries to unlink paths held in
uninitialized memory (tempname and backname). See the strace excerpt;
the strings still hold the path from test1.
===============================================================================================
23722 bind(5, {sa_family=AF_FILE, path=@"test2"}, 110) = -1 EADDRINUSE (Address already in use)
23722 unlink("/tmp/test1.sock.23722.tmp") = -1 ENOENT (No such file or directory)
23722 close(5) = 0
23722 unlink("/tmp/test1.sock.23722.bak") = -1 ENOENT (No such file or directory)
===============================================================================================
This patch should be backported to 1.5.
(cherry picked from commit 7319b64fc4c9b7e04726816c6cc02f6ecf66a0a4)
There is a very small typo in the statistics interface: a "set" in
lowercase where all others are uppercase "Set".
(cherry picked from commit 8c27bcaea0116247ee055c5481a63507de4fe6e4)
With all the goodies supported by logformat, people find that the limit
of 1024 chars for log lines is too short. Some servers do not support
larger lines and can simply drop them, so changing the default value is
not always the best choice.
This patch takes a different approach. Log line length is specified per
log server on the "log" line, with a value between 80 and 65535. That
way it's possible to satisfy all needs, even with some fat local servers
and small remote ones.
(cherry picked from commit 18324f574f349d510622ff45635de899437a3a11)
This value was set in log.h without any #ifndef around, so when one
wanted to change it, a patch was needed. Let's move it to defaults.h
with the usual #ifndef so that it's easier to change it.
(cherry picked from commit 4e957907aa117c07214ab84ba2a58f2fc1666931)
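The "usual #ifndef" mentioned above looks like the fragment below. The macro name MAX_SYSLOG_LEN and the buffer are assumptions made for this illustration; only the default of 1024 comes from the neighbouring log-length commit:

```c
/* Default maximum log line length, overridable at build time.
 * MAX_SYSLOG_LEN is an assumed name used for illustration.
 */
#ifndef MAX_SYSLOG_LEN
#define MAX_SYSLOG_LEN 1024
#endif

/* Example consumer: a buffer sized from the (possibly overridden) value. */
static char logline[MAX_SYSLOG_LEN + 1];
```

With such a guard, the value can be changed without patching the source, for instance by passing -DMAX_SYSLOG_LEN=2048 to the compiler.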
This used to cause a build failure since 1.5.0, as reported by
Timothy Shelton. The proxy protocol doc was also added.
(cherry picked from commit ca3094d0b1531ce62fc1970aa7396a01330bb5c1)
I've been facing multiple configurations which involved track-sc* rules
in tcp-request content without the "if ..." to force it to wait for the
contents, resulting in random behaviour with contents sometimes retrieved
and sometimes not.
Reading the doc doesn't make it clear either that the tracking will be
performed only if data are already there and that waiting on an ACL is
the only way to avoid this.
Since this behaviour is not natural and we now have the ability to fix
it, this patch ensures that if input data are still moving, instead of
silently dropping them, we naturally wait for them to stabilize up to
the inspect-delay. This way it is no longer necessary to implement an
ACL-based condition to force waiting for data, even though the behaviour
is unchanged when an ACL is present.
The most obvious usage will be when track-sc is followed by any HTTP
sample expression, there's no need anymore for adding "if HTTP".
It's probably worth backporting this to 1.5 to avoid further configuration
issues. Note that it requires previous patch.
(cherry picked from commit 1b71eb581ec1637879f725421efb95ad69f0ea4f)
stktable_fetch_key() does not indicate whether it returns NULL because
the input sample was not found or because it's unstable. It causes trouble
with track-sc* rules. Just like with sample_fetch_string(), we want it to
be able to give more information to the caller about what it found. Thus,
now we use the pointer to a sample passed by the caller, and fill it with
the information we have about the sample. That way, even if we return NULL,
the caller has the ability to check whether a sample was found and if it is
still changing or not.
(cherry picked from commit b5975defba61e7ef37ae771614166d0970ede04e)
We used to only clear flags when reusing the static sample before calling
sample_process(), but that's not enough because there's a context in samples
that can be used by some fetch functions such as auth, headers and cookies,
and not reinitializing it risks that a pointer of a different type is used
in the wrong context.
An example configuration which triggers the case consists in mixing hdr()
and http_auth_group() which both make use of contexts :
    http-request add-header foo2 %[hdr(host)],%[http_auth_group(foo)]
The solution is simple, initialize all the sample and not just the flags.
This fix must be backported into 1.5 since it was introduced in 1.5-dev19.
(cherry picked from commit 6c616e0b96106dd33d183afbda31e72799e967c3)
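A minimal sketch of the difference between clearing the flags and clearing the whole sample follows. struct demo_sample is a hypothetical, simplified stand-in; the real sample structure has more fields:

```c
#include <string.h>

/* Hypothetical, simplified sample. */
struct demo_sample {
	unsigned int flags;
	int type;
	union {
		void *ptr;
		long num;
	} ctx;   /* private context left behind by the previous fetch */
};

/* Reset everything, not just the flags, before reusing a static sample,
 * so that a fetch function never sees another function's context.
 */
static void sample_reset(struct demo_sample *smp)
{
	memset(smp, 0, sizeof(*smp));
}
```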
Baptiste Assmann reported a corner case in the releasing of stick-counters:
we release content-aware counters before logging. In the past it was not a
problem, but since we can now log them, it prevents one from logging
their value. Simply switching the log production and the release of the
counter fixes the issue.
This should be backported into 1.5.
(cherry picked from commit d713bcc326da5d1ac80adab666d7710f3e37650c)
'ssl_sock_get_common_name' applied to a connection was also renamed
'ssl_sock_get_remote_common_name'. Currently, this function is only used
with protocol PROXYv2 to retrieve the client certificate's common name.
A further usage could be to retrieve the server certificate's common name
on an outgoing connection.
(cherry picked from commit 0abf836ecb32767fa1f9ad598f3e236e073491bd)
The sample fetch function "base" makes use of the trash which is also
used by set-header/add-header etc... everything which builds a formatted
line. So we end up with some junk in the header if base is in use. Let's
fix this as all other fetches by using a trash chunk instead.
This bug was reported by Baptiste Assmann, and also affects 1.5.
(cherry picked from commit 3caf2afabe89fb0ef0886cd1d8ea99ef21ec3491)
Released version 1.5.1 with the following main changes :
- BUG/MINOR: config: http-request replace-header arg typo
- BUG/MINOR: ssl: rejects OCSP response without nextupdate.
- BUG/MEDIUM: ssl: Fix to not serve expired OCSP responses.
- BUG/MINOR: ssl: Fix OCSP resp update fails with the same certificate configured twice. (cherry picked from commit 1d3865b096b43b9a6d6a564ffb424ffa6f1ef79f)
- BUG/MEDIUM: Consistently use 'check' in process_chk
- BUG/MAJOR: session: revert all the crappy client-side timeout changes
- BUG/MINOR: logs: properly initialize and count log sockets
http-request replace-header was introduced with a typo which prevents it
from being conditioned by an ACL.
This patch fixes this issue.
(cherry picked from commit 92df370621b6e1286ef273310ad47371456a5cf0)
Commit 81ae195 ("[MEDIUM] add support for logging via a UNIX socket")
merged in 1.3.14 introduced a few minor issues with log sockets. All
of them happen only when a failure is encountered when trying to set
up the logging socket (eg: socket family is not available or is
temporarily short in resources).
The first socket which experiences an error causes the socket setup
loop to abort, possibly preventing any log from being sent if it was
the first logger. The second issue is that if this socket finally
succeeds after a second attempt, errors are reported for the wrong
logger (eg: logger #1 failed instead of #2). The last point is that
we now have multiple loggers, and it's a waste of time to walk over
their list for every log while they're almost always properly set up.
So in order to fix all this, let's merge the two lists. If a logger
experiences an error, it simply sends an alert and skips to the next
one. That way they don't prevent messages from being sent and are
all properly accounted for.
(cherry picked from commit c7c7be21bf6c7e9afd897d4bf451dc450187a77e)
This is the 3rd regression caused by the changes below. The latest to
date was reported by Finn Arne Gangstad. If a server responds with no
content-length and the client's FIN is never received, either we leak
the client-side FD or we spin at 100% CPU if timeout client-fin is set.
Enough is enough. The amount of tricks needed to cover these side-effects
starts to look like used toilet paper stacked over a chocolate cake. I
don't want to eat that cake anymore!
All this to avoid reporting a server-side timeout when a client stops
uploading data and haproxy expires faster than the server... A lot of
"ifs" resulting in a technically valid log that doesn't always please
users, and whose alternative causes that many issues for all other
users.
So let's revert this crap merged since 1.5-dev25 :
Revert "CLEANUP: http: don't clear CF_READ_NOEXP twice"
This reverts commit 1592d1e72a4a2d25a554c299ae95a3e6cad80bf1.
Revert "BUG/MEDIUM: http: clear CF_READ_NOEXP when preparing a new transaction"
This reverts commit 77d29029af1c44216b190dd7442964b9d8f45257.
Revert "BUG/MEDIUM: session: don't clear CF_READ_NOEXP if analysers are not called"
This reverts commit 0943757a2144761c60e416b5ed07baa76934f5a4.
Revert "BUG/MEDIUM: http: disable server-side expiration until client has sent the body"
This reverts commit 3bed5e9337fd6eeab0f0006ebefcbe98ee5c4f9f.
Revert "BUG/MEDIUM: http: correctly report request body timeouts"
This reverts commit b9edf8fbecc9d1b5c82794735adcc367a80a4ae2.
Revert "BUG/MEDIUM: http/session: disable client-side expiration only after body"
This reverts commit b1982e27aaff2a92a389a9f1bc847e3bb8fdb4f2.
If a cleaner AND SAFER way to do something equivalent is found in 1.6-dev, we *might*
consider backporting it to 1.5, but given the vicious bugs that have surfaced
since, I doubt it will happen any time soon.
Fortunately, that crap never made it into 1.4 so no backport is needed.
(cherry picked from commit 6f0a7bac282c9b2082dc763977b7721b6b002089)
For some browsers (Firefox), an expired OCSP response causes unwanted behavior.
HAProxy now stops serving an OCSP response once its nextupdate date minus
the supported time skew (#define OCSP_MAX_RESPONSE_TIME_SKEW) is
in the past.
(cherry picked from commit 4f3c87a5d942d4d0649c35805ff4e335970b87d4)
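The expiry rule above boils down to one comparison. In this sketch the helper name is hypothetical and the 300 second value is taken from the "300s supported time skew" entry of the 1.5.0 changelog:

```c
#include <time.h>

/* Supported time skew in seconds (value assumed from the 1.5.0 changelog). */
#define OCSP_MAX_RESPONSE_TIME_SKEW 300

/* Return 1 if a response whose nextUpdate field is <next_update> may still
 * be served at time <now>, or 0 once nextUpdate minus the skew is in the
 * past. Hypothetical helper, for illustration only.
 */
static int ocsp_response_is_fresh(time_t next_update, time_t now)
{
	return next_update - OCSP_MAX_RESPONSE_TIME_SKEW >= now;
}
```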
I am not entirely sure that this is a bug, but it seems
to me that it may cause a problem if an agent-check is
configured and there is some kind of error making a connection for it.
Signed-off-by: Simon Horman <horms@verge.net.au>
(cherry picked from commit ccaabcdfca23851af6fd83f4f3265284d283e2ab)
Released version 1.5.0 with the following main changes :
- MEDIUM: ssl: ignored file names ending as '.issuer' or '.ocsp'.
- MEDIUM: ssl: basic OCSP stapling support.
- MINOR: ssl/cli: Fix unapropriate comment in code on 'set ssl ocsp-response'
- MEDIUM: ssl: add 300s supported time skew on OCSP response update.
- MINOR: checks: mysql-check: Add support for v4.1+ authentication
- MEDIUM: ssl: Add the option to use standardized DH parameters >= 1024 bits
- MEDIUM: ssl: fix detection of ephemeral diffie-hellman key exchange by using the cipher description.
- MEDIUM: http: add actions "replace-header" and "replace-values" in http-req/resp
- MEDIUM: Break out check establishment into connect_chk()
- MEDIUM: Add port_to_str helper
- BUG/MEDIUM: fix ignored values for half-closed timeouts (client-fin and server-fin) in defaults section.
- BUG/MEDIUM: Fix unhandled connections problem with systemd daemon mode and SO_REUSEPORT.
- MINOR: regex: fix a little configuration memory leak.
- MINOR: regex: Create JIT compatible function that return match strings
- MEDIUM: regex: replace all standard regex function by own functions
- MEDIUM: regex: Remove null terminated strings.
- MINOR: regex: Use native PCRE API.
- MINOR: missing regex.h include
- DOC: Add Exim as Proxy Protocol implementer.
- BUILD: don't use type "uint" which is not portable
- BUILD: stats: workaround stupid and bogus -Werror=format-security behaviour
- BUG/MEDIUM: http: clear CF_READ_NOEXP when preparing a new transaction
- CLEANUP: http: don't clear CF_READ_NOEXP twice
- DOC: fix proxy protocol v2 decoder example
- DOC: fix remaining occurrences of "pattern extraction"
- MINOR: log: allow the HTTP status code to be logged even in TCP frontends
- MINOR: logs: don't limit HTTP header captures to HTTP frontends
- MINOR: sample: improve sample_fetch_string() to report partial contents
- MINOR: capture: extend the captures to support non-header keys
- MINOR: tcp: prepare support for the "capture" action
- MEDIUM: tcp: add a new tcp-request capture directive
- MEDIUM: session: allow shorter retry delay if timeout connect is small
- MEDIUM: session: don't apply the retry delay when redispatching
- MEDIUM: session: redispatch earlier when possible
- MINOR: config: warn when tcp-check rules are used without option tcp-check
- BUG/MINOR: connection: make proxy protocol v1 support the UNKNOWN protocol
- DOC: proxy protocol example parser was still wrong
- DOC: minor updates to the proxy protocol doc
- CLEANUP: connection: merge proxy proto v2 header and address block
- MEDIUM: connection: add support for proxy protocol v2 in accept-proxy
- MINOR: tools: add new functions to quote-encode strings
- DOC: clarify the CSV format
- MEDIUM: stats: report the last check and last agent's output on the CSV status
- MINOR: freq_ctr: introduce a new averaging method
- MEDIUM: session: maintain per-backend and per-server time statistics
- MEDIUM: stats: report per-backend and per-server time stats in HTML and CSV outputs
- BUG/MINOR: http: fix typos in previous patch
- DOC: remove the ultra-obsolete TODO file
- DOC: update roadmap
- DOC: minor updates to the README
- DOC: mention the maxconn limitations with the select poller
- DOC: commit a few old design thoughts files
These ones were design notes and ideas collected during the 1.5
development phase lying on my development machine. There might still
be some value in keeping them for future reference since they mention
certain corner cases.
Select()'s safe area is limited to 1024 FDs, and anything higher
than this will report "select: FAILED" on startup in debug mode,
so better document it.
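The limit comes from the fixed size of fd_set. A minimal check, using a hypothetical helper name, could look like this:

```c
#include <sys/select.h>

/* Return 1 if <fd> can safely be stored in an fd_set, 0 otherwise.
 * FD_SETSIZE is commonly 1024, which is the limit discussed above.
 */
static int fd_fits_in_select(int fd)
{
	return fd >= 0 && fd < FD_SETSIZE;
}
```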
The support is all based on static responses. This doesn't add any
request / response logic to HAProxy, but allows a way to update
information through the socket interface.
Currently certificates specified using "crt" or "crt-list" on "bind" lines
are loaded as PEM files.
For each PEM file, haproxy checks for the presence of a file at the same path
suffixed by ".ocsp". If such a file is found, support for the TLS Certificate
Status Request extension (also known as "OCSP stapling") is automatically
enabled. The content of this file is optional. If not empty, it must contain
a valid OCSP Response in DER format. In order to be valid an OCSP Response
must comply with the following rules: it has to indicate a good status,
it has to be a single response for the certificate of the PEM file, and it
has to be valid at the moment of addition. If these rules are not respected
the OCSP Response is ignored and a warning is emitted. In order to identify
which certificate an OCSP Response applies to, the issuer's certificate is
necessary. If the issuer's certificate is not found in the PEM file, it will
be loaded from a file at the same path as the PEM file suffixed by ".issuer"
if it exists, otherwise it will fail with an error.
It is possible to update an OCSP Response from the unix socket using:
    set ssl ocsp-response <response>
This command is used to update an OCSP Response for a certificate (see "crt"
on "bind" lines). Same controls are performed as during the initial loading of
the response. The <response> must be passed as a base64 encoded string of the
DER encoded response from the OCSP server.
Example:
    openssl ocsp -issuer issuer.pem -cert server.pem \
                 -host ocsp.issuer.com:80 -respout resp.der
    echo "set ssl ocsp-response $(base64 -w 10000 resp.der)" | \
                 socat stdio /var/run/haproxy.stat
This feature is automatically enabled on openssl 0.9.8h and above.
This work was performed jointly by Dirkjan Bussink of GitHub and
Emeric Brun of HAProxy Technologies.
The pcreposix layer (in the pcre project) executes strlen() to find
the length of the string. When we use the "regex_exec*2" functions,
the length is used to add a final \0, and then pcreposix executes
strlen() again to compute the length.
If we use the native PCRE API, the length is provided as an
argument, and these operations disappear.
This is useful because PCRE regexes are more used than POSIX regexes.
The new regex functions can take a string and a length. The HAProxy buffers
are not null-terminated, and using the regex_exec* functions implies
adding this null character. This patch replaces these functions with the
ones which take a string and a length as input.
Only the file "proto_http.c" is changed because it is executed more often
than the others. The file "checks.c" has a very low usage, so it is not
worth changing it. Furthermore, the buffers used by "checks.c" are
null-terminated.
This patch removes all references to the standard regex functions in haproxy.
The last remaining references are only in the regex.[ch] files.
In the file src/checks.c, the original function uses a "pmatch" array.
In fact this array is unused. This patch removes it.
This patch renames "regex_exec" to "regex_exec2". It adds new
"regex_exec", "regex_exec_match" and "regex_exec_match2" functions. The
match variants can match a regex and return an array containing the
matching parts. Otherwise, these functions use the compiled method (JIT,
PCRE or POSIX).
JIT requires a subject with a length. PCREPOSIX and native POSIX regex
require a null-terminated subject. The regex_exec* functions are split
into two versions. The first version takes a null-terminated string, but
it executes strlen() on the subject if it is compiled with JIT. The second
version (suffixed with "2") takes the subject and the length. This
version adds a null character to the subject if it is compiled with
PCREPOSIX or native POSIX functions.
The documentation of posix regex and pcreposix says that the function
returns 0 if the string matches, otherwise it returns REG_NOMATCH. The
REG_NOMATCH macro takes the value 1 with posix regex and the value 17
with pcreposix. The documentation of the native pcre API (used with
JIT) says that it returns a negative number if there is no match,
otherwise it returns 0 or a positive number.
This patch also fixes the return codes of the regex_exec* functions. Now,
these functions return true if the string matches, otherwise they return
false.
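The two calling conventions and the normalized boolean return value can be sketched with the plain POSIX API. The demo_* names are hypothetical, and the length-based variant copies the subject into a terminated buffer, which is a simplification chosen for this example and not necessarily what the patch does:

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Match a NUL-terminated subject. Returns 1 on match, 0 otherwise,
 * whatever value REG_NOMATCH has in the underlying library.
 */
static int demo_regex_exec(const regex_t *preg, const char *subject)
{
	return regexec(preg, subject, 0, NULL, 0) == 0;
}

/* Match a subject given as pointer + length. POSIX regexec() needs a
 * NUL-terminated string, so a terminated copy is made first.
 * Returns 1 on match, 0 on no match or allocation failure.
 */
static int demo_regex_exec2(const regex_t *preg, const char *subject,
                            size_t len)
{
	char *copy = malloc(len + 1);
	int ret;

	if (!copy)
		return 0;
	memcpy(copy, subject, len);
	copy[len] = '\0';
	ret = regexec(preg, copy, 0, NULL, 0) == 0;
	free(copy);
	return ret;
}
```

A backend that takes the length as an argument, such as the native PCRE API mentioned above, would not need the copy at all, which is the saving the native API patch is after.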
When I renamed the modify-header action to replace-value, one of them
was mistakenly set to "replace-val" instead. Additionally, differentiation
of the two actions must be done on args[0][8] and not *args[8]. Thanks
Thierry for spotting...
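The indexing mistake is easy to see on a small example. The enum and the parser below are hypothetical; only the two keywords and the args[0][8] test come from the message above:

```c
#include <string.h>

enum demo_action { ACT_UNKNOWN, ACT_REPLACE_HDR, ACT_REPLACE_VAL };

/* <args> is the array of words of a configuration line. */
static enum demo_action parse_replace_action(const char **args)
{
	if (strcmp(args[0], "replace-header") == 0 ||
	    strcmp(args[0], "replace-value") == 0) {
		/* args[0][8] is the 9th character of the FIRST word, i.e. the
		 * 'h' or 'v' right after "replace-". *args[8] would instead
		 * read the first character of the 9th word of the line.
		 */
		return args[0][8] == 'h' ? ACT_REPLACE_HDR : ACT_REPLACE_VAL;
	}
	return ACT_UNKNOWN;
}
```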
This patch adds two new actions to http-request and http-response rulesets :
- replace-header : replace a whole header line, suited for headers
which might contain commas
- replace-value : replace a single header value, suited for headers
defined as lists.
The match consists in a regex, and the replacement string takes a log-format
and supports back-references.
The time statistics computed by previous patches are now reported in the
HTML stats in the tips related to the total sessions for backend and servers,
and as separate columns for the CSV stats.
Using the last rate counters, we now compute the queue, connect, response
and total times per server and per backend with a 95% accuracy over the last
1024 samples. The operation is cheap so we don't need to condition it.
While the current functions report average event counts per period, we are
also interested in average values per event. For this we use a different
method. The principle is to rely on a long tail which sums the new value
with a fraction of the previous value, resulting in a sliding window of
infinite length depending on the precision we're interested in.
The idea is that we always keep (N-1)/N of the sum and add the new sampled
value. The sum over N values can be computed with a simple program for a
constant value 1 at each iteration :
    \sum_{x=1}^{N} \left( \frac{N-1}{N} \right)^x \approx N \cdot \frac{e-1}{e}
Note: I'm not sure how to demonstrate this but at least this is easily
verified with a simple program, the sum equals N * 0.632120 for any N
moderately large (tens to hundreds).
Inserting a constant sample value V here simply results in :
sum = V * N * (e - 1) / e
But we don't want to integrate over a small period, but infinitely. Let's
cut the infinity in P periods of N values. Each period M is exactly the same
as period M-1 with a factor of ((N-1)/N)^N applied. A test shows that given a
large N :
    \left( \frac{N-1}{N} \right)^N \approx \frac{1}{e}
Our sum is now a sum of each factor times :
    \sum_{x=1}^{N \cdot P} V \left( \frac{N-1}{N} \right)^x \approx V N \cdot \frac{e-1}{e} \cdot \sum_{x=0}^{P} \frac{1}{e^x}
For P "large enough", in tests we get this :
    \sum_{x=0}^{P} \frac{1}{e^x} \approx \frac{e}{e-1}
This simplifies the sum above :
    \sum_{x=1}^{N \cdot P} V \left( \frac{N-1}{N} \right)^x = V N
So basically, by summing values and applying an (N-1)/N factor to the previous
result each time, we just get N times the values over the long term, so we can
recover the constant value V by dividing by N.
A value added at the entry of the sliding window of N values will thus be
reduced to 1/e or 36.8% after N terms have been added. After a second batch,
it will only be 1/e^2, or 13.5%, and so on. So practically speaking, each
old period of N values represents only a quickly fading ratio of the global
sum :
    period    ratio
       1      36.8%
       2      13.5%
       3      4.98%
       4      1.83%
       5      0.67%
       6      0.25%
       7      0.09%
       8      0.033%
       9      0.012%
      10      0.0045%
So after 10N samples, the initial value has already faded out by a factor of
22026, which is quite fast. If the sliding window is 1024 samples wide, it
means that a sample will only count for 1/22k of its initial value after 10k
samples went after it, which results in half of the value it would represent
using an arithmetic mean. The benefit of this method is that it's very cheap
in terms of computations when N is a power of two. This is very well suited
to record response times as large values will fade out faster than with an
arithmetic mean and will depend on sample count and not time.
Demonstrating all the above assumptions with maths instead of a program is
left as an exercise for the reader.
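The "simple program" alluded to above can be as short as the sketch below. The demo_* names are hypothetical and floating point is used for readability; this does not claim to be the integer implementation that the patch adds:

```c
/* Running sum over a sliding window of <n> samples: keep (n-1)/n of the
 * previous sum, then add the new sample <v>. Returns the new sum.
 */
static double demo_swrate_add(double sum, unsigned int n, double v)
{
	return sum - sum / n + v;
}

/* Average value per sample recovered from the running sum. */
static double demo_swrate_avg(double sum, unsigned int n)
{
	return sum / n;
}
```

Feeding a constant V = 100 for 20*N iterations with N = 1024 brings the average very close to 100, and feeding N zeroes afterwards leaves roughly 1/e of it, in line with the ratios discussed above.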
Now that we can quote unsafe strings, it becomes possible to dump the health
check responses on the CSV page as well. The two new fields are "last_chk"
and "last_agt".
Indicate that the text cells in the CSV format may contain quotes to
escape ambiguous texts. We don't have this case right now since we limit
the output, but it may happen in the future.
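As an illustration of what quoting an ambiguous text cell can mean, here is a minimal sketch. It assumes the common CSV convention where a cell containing a quote, a comma or a line break is enclosed in double quotes and inner quotes are doubled. csv_quote() is a hypothetical helper, not the encoder added to the tools:

```c
#include <stddef.h>
#include <string.h>

/* Write <src> into <dst> (of <size> bytes) as a CSV text cell.
 * Returns the length written, or -1 if the result does not fit.
 * Assumed convention: enclose in quotes when needed, double inner quotes.
 */
static int csv_quote(char *dst, size_t size, const char *src)
{
	size_t o = 0;
	int need_quotes = strpbrk(src, "\",\r\n") != NULL;

	if (need_quotes) {
		if (o + 1 >= size)
			return -1;
		dst[o++] = '"';
	}
	for (; *src; src++) {
		if (*src == '"') {
			/* double the quote so that it stays unambiguous */
			if (o + 1 >= size)
				return -1;
			dst[o++] = '"';
		}
		if (o + 1 >= size)
			return -1;
		dst[o++] = *src;
	}
	if (need_quotes) {
		if (o + 1 >= size)
			return -1;
		dst[o++] = '"';
	}
	if (o >= size)
		return -1;
	dst[o] = '\0';
	return (int)o;
}
```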