                   -----------------------------------------
                          Filters Guide - version 2.4
                          ( Last update: 2021-02-24 )
                   -----------------------------------------
                          Author : Christopher Faulet
              Contact : christopher dot faulet at capflam dot org


ABSTRACT
--------

The filters support was introduced in HAProxy 1.7. It is a way to extend
HAProxy without touching its core code and, to a certain extent, without
knowing its internals. This feature eases contributions and reduces the impact
of changes. Another advantage is to simplify HAProxy by replacing some parts
by filters. As we will see, and as an example, the HTTP compression is the
first feature moved into a filter.

This document describes how to write a filter and what to keep in mind to do
so. It also talks about the known limits and the pitfalls to avoid.

As said, filters are still quite new. The API is not frozen and will be
updated/modified/improved/extended as needed.



SUMMARY
-------

  1.    Filters introduction
  2.    How to use filters
  3.    How to write a new filter
  3.1.     API Overview
  3.2.     Defining the filter name and its configuration
  3.3.     Managing the filter lifecycle
  3.3.1.       Dealing with threads
  3.4.     Handling the streams activity
  3.5.     Analyzing the channels activity
  3.6.     Filtering the data exchanged
  4.    FAQ



1. FILTERS INTRODUCTION
-----------------------

First of all, to fully understand how filters work and how to create one, it is
best to know, at least from a distance, what a proxy (frontend/backend), a
stream and a channel are in HAProxy and how these entities are linked to each
other. doc/internals/entities.pdf is a good overview.

Then, to support filters, many callbacks have been added to HAProxy at
different places, mainly around channel analyzers. Their purpose is to allow
filters to be involved in the data processing, from the stream
creation/destruction to the data forwarding. Depending on what it should do, a
filter can implement all or part of these callbacks. For now, existing
callbacks are focused on streams. But future improvements could enlarge the
filters scope. For instance, it could be useful to handle events at the
connection level.

In the HAProxy configuration file, a filter is declared in a proxy section,
except defaults. So the configuration corresponding to a filter declaration is
attached to a specific proxy, and will be shared by all its instances. It is
opaque from the HAProxy point of view: it is the filter's responsibility to
manage it. Each filter declaration matches a unique configuration. Several
declarations of the same filter in the same proxy will be handled as different
filters by HAProxy.

A filter instance is represented by a partially opaque context (or a state)
attached to a stream and passed as argument to callbacks. Through this context,
filter instances are stateful. Depending on whether the filter is declared in a
frontend or a backend section, its instances will be created, respectively,
when a stream is created or when a backend is selected. Their behaviors will
also be different. Only instances of filters declared in a frontend section
will be aware of the creation and the destruction of the stream, and will take
part in the channels analyzing before the backend is defined.

It is important to remember that the configuration of a filter is shared by all
its instances, while the context of an instance is owned by a unique stream.

Filters are designed to be chained. It is possible to declare several filters
in the same proxy section. The declaration order is important because filters
will be called one after the other respecting this order. Frontend and backend
filters are also chained, frontend ones being called first. Even if the filters
processing is serialized, each filter will behave as if it were alone (unless
it was developed to be aware of other filters). For all that, some constraints
are imposed on filters, especially when data exchanged between the client and
the server are processed. We will discuss these constraints again when we
tackle the subject of writing a filter.



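As a rough illustration of the chaining rule, the standalone C sketch below
computes the order in which filters would be called for a stream: frontend
filters first, then backend ones, each group in declaration order. It does not
use any HAProxy code. 'demo_call_order' and its parameters are hypothetical
names chosen for this example only.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of HAProxy. <fe> and <be> hold the names of
 * the filters declared on the frontend and on the backend, in declaration
 * order. The names are written to <out>, separated by spaces, in the order
 * the filters would be called. Returns the number of filters in the chain.
 */
static size_t demo_call_order(const char **fe, size_t nb_fe,
                              const char **be, size_t nb_be,
                              char *out, size_t outsz)
{
    size_t i;

    if (!outsz)
        return 0;
    out[0] = '\0';
    for (i = 0; i < nb_fe + nb_be; i++) {
        /* Frontend filters come first, backend filters follow */
        const char *name = (i < nb_fe) ? fe[i] : be[i - nb_fe];

        if (i)
            strncat(out, " ", outsz - strlen(out) - 1);
        strncat(out, name, outsz - strlen(out) - 1);
    }
    return nb_fe + nb_be;
}
```

With 'trace' and 'compression' declared on the frontend and 'my_filter' on the
backend, the resulting order is "trace compression my_filter".
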
2. HOW TO USE FILTERS
---------------------

To use a filter, the parameter 'filter' should be used, followed by the filter
name and, optionally, its configuration in the desired listen, frontend or
backend section. For instance :

    listen test
        ...
        filter trace name TST
        ...


See doc/configuration.txt for a formal definition of the parameter 'filter'.
Note that additional parameters on the filter line must be parsed by the filter
itself.

The list of available filters is reported by 'haproxy -vv' :

    $> haproxy -vv
    HAProxy version 1.7-dev2-3a1d4a-33 2016/03/21
    Copyright 2000-2016 Willy Tarreau <willy@haproxy.org>

    [...]

    Available filters :
        [COMP] compression
        [TRACE] trace


Multiple filter lines can be used in a proxy section to chain filters. Filters
will be called in the declaration order.

Some filters can support implicit declarations in certain circumstances
(without the filter line). This is not recommended for new features but is
useful for existing ones moved into a filter, for backward compatibility
reasons. Implicit declarations are supported when there is only one filter used
on a proxy. When several filters are used, explicit declarations are mandatory.
The HTTP compression filter is one of these filters. Alone, using 'compression'
keywords is enough to use it. But when at least a second filter is used, a
filter line must be added.

    # filter line is optional
    listen t1
        bind *:80
        compression algo gzip
        compression offload
        server srv x.x.x.x:80

    # filter line is mandatory for the compression filter
    listen t2
        bind *:81
        filter trace name T2
        filter compression
        compression algo gzip
        compression offload
        server srv x.x.x.x:80



3. HOW TO WRITE A NEW FILTER
----------------------------

To write a filter, there are 2 header files to explore :

  * include/haproxy/filters-t.h : This is the main header file, containing all
                                  important structures to use. It represents
                                  the filter API.

  * include/haproxy/filters.h : This header file contains helper functions that
                                may be used. It also contains the internal API
                                used by HAProxy to handle filters.

To ease the filters integration, it is better to follow some conventions :

  * Use 'flt_' prefix to name the filter (e.g. flt_http_comp or flt_trace).

  * Keep everything related to the filter in the same file.

The filter 'trace' can be used as a template to write a new filter. It is a
good start to see how filters really work.

3.1. API OVERVIEW
-----------------

Writing a filter boils down to writing functions and attaching them to the
existing callbacks. Available callbacks are listed in the following structure :

    struct flt_ops {
        /*
         * Callbacks to manage the filter lifecycle
         */
        int  (*init)             (struct proxy *p, struct flt_conf *fconf);
        void (*deinit)           (struct proxy *p, struct flt_conf *fconf);
        int  (*check)            (struct proxy *p, struct flt_conf *fconf);
        int  (*init_per_thread)  (struct proxy *p, struct flt_conf *fconf);
        void (*deinit_per_thread)(struct proxy *p, struct flt_conf *fconf);

        /*
         * Stream callbacks
         */
        int  (*attach)            (struct stream *s, struct filter *f);
        int  (*stream_start)      (struct stream *s, struct filter *f);
        int  (*stream_set_backend)(struct stream *s, struct filter *f,
                                   struct proxy *be);
        void (*stream_stop)       (struct stream *s, struct filter *f);
        void (*detach)            (struct stream *s, struct filter *f);
        void (*check_timeouts)    (struct stream *s, struct filter *f);

        /*
         * Channel callbacks
         */
        int  (*channel_start_analyze)(struct stream *s, struct filter *f,
                                      struct channel *chn);
        int  (*channel_pre_analyze)  (struct stream *s, struct filter *f,
                                      struct channel *chn,
                                      unsigned int an_bit);
        int  (*channel_post_analyze) (struct stream *s, struct filter *f,
                                      struct channel *chn,
                                      unsigned int an_bit);
        int  (*channel_end_analyze)  (struct stream *s, struct filter *f,
                                      struct channel *chn);

        /*
         * HTTP callbacks
         */
        int  (*http_headers)(struct stream *s, struct filter *f,
                             struct http_msg *msg);
        int  (*http_payload)(struct stream *s, struct filter *f,
                             struct http_msg *msg, unsigned int offset,
                             unsigned int len);
        int  (*http_end)    (struct stream *s, struct filter *f,
                             struct http_msg *msg);

        void (*http_reset)  (struct stream *s, struct filter *f,
                             struct http_msg *msg);
        void (*http_reply)  (struct stream *s, struct filter *f,
                             short status,
                             const struct buffer *msg);

        /*
         * TCP callbacks
         */
        int  (*tcp_payload)(struct stream *s, struct filter *f,
                            struct channel *chn, unsigned int offset,
                            unsigned int len);
    };


We will explain in the following parts when these callbacks are called and what
they should do.

Filters are declared in proxy sections. So each proxy has an ordered list of
filters, possibly empty if no filter is used. When the configuration of a proxy
is parsed, each filter line represents an entry in this list. In the structure
'proxy', the filters configurations are stored in the field 'filter_configs',
each one of type 'struct flt_conf *' :

    /*
     * Structure representing the filter configuration, attached to a proxy and
     * accessible from a filter when instantiated in a stream
     */
    struct flt_conf {
        const char     *id;    /* The filter id */
        struct flt_ops *ops;   /* The filter callbacks */
        void           *conf;  /* The filter configuration */
        struct list     list;  /* Next filter for the same proxy */
        unsigned int    flags; /* FLT_CFG_FL_* */
    };

  * 'flt_conf.id' is an identifier, defined by the filter. It can be
    NULL. HAProxy does not use this field. Filters can use it in log messages
    or as a unique identifier to check multiple declarations. It is the
    filter's responsibility to free it, if necessary.

  * 'flt_conf.conf' is opaque. It is the internal configuration of a filter,
    generally allocated and filled by its parsing function (See § 3.2). It is
    the filter's responsibility to free it.

  * 'flt_conf.ops' references the callbacks implemented by the filter. This
    field must be set during the parsing phase (See § 3.2) and can be refined
    during the initialization phase (See § 3.3). If it is dynamically
    allocated, it is the filter's responsibility to free it.

  * 'flt_conf.flags' is a bitfield to specify the filter capabilities. For now,
    only FLT_CFG_FL_HTX may be set when a filter is able to process HTX
    streams. If not set, the filter is excluded from the HTTP filtering.


The filter configuration is global and shared by all its instances. A filter
instance is created in the context of a stream and attached to this stream. In
the structure 'stream', the field 'strm_flt' is the state of all filter
instances attached to a stream :

    /*
     * Structure representing the "global" state of filters attached to a
     * stream.
     */
    struct strm_flt {
        struct list    filters;    /* List of filters attached to a stream */
        struct filter *current[2]; /* From which filter resume processing, for
                                    * a specific channel. This is used for
                                    * resumable callbacks only. If NULL, we
                                    * start from the first filter.
                                    * 0: request channel, 1: response channel
                                    */
        unsigned short flags;      /* STRM_FL_* */
        unsigned char  nb_req_data_filters; /* Number of data filters
                                             * registered on the request
                                             * channel */
        unsigned char  nb_rsp_data_filters; /* Number of data filters
                                             * registered on the response
                                             * channel */
        unsigned long long offset[2]; /* Global offset of input data already
                                       * filtered for a specific channel.
                                       * 0: request channel, 1: response
                                       * channel */
    };


Filter instances attached to a stream are stored in the field
'strm_flt.filters', each instance being of type 'struct filter *' :

    /*
     * Structure representing a filter instance attached to a stream
     *
     * 2D-Array fields are used to store info per channel. The first index
     * stands for the request channel, and the second one for the response
     * channel. Especially, <next> and <fwd> are offsets representing amount of
     * data that the filter has, respectively, parsed and forwarded on a
     * channel. Filters can access these values using FLT_NXT and FLT_FWD
     * macros.
     */
    struct filter {
        struct flt_conf *config; /* The filter's configuration */
        void            *ctx;    /* The filter context (opaque) */
        unsigned short   flags;  /* FLT_FL_* */
        unsigned long long offset[2]; /* Offset of input data already filtered
                                       * for a specific channel.
                                       * 0: request channel, 1: response
                                       * channel */
        unsigned int pre_analyzers;   /* bit field indicating analyzers to
                                       * pre-process */
        unsigned int post_analyzers;  /* bit field indicating analyzers to
                                       * post-process */
        struct list  list;            /* Next filter for the same proxy/stream
                                       */
    };

  * 'filter.config' is the filter configuration previously described. All
    instances of a filter share it.

  * 'filter.ctx' is an opaque context. It is managed by the filter, so it is
    its responsibility to free it.

  * 'filter.pre_analyzers' and 'filter.post_analyzers' will be described later
    (See § 3.5).

  * 'filter.offset' will be described later (See § 3.6).


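To make the split between the shared configuration and the per-stream context
more concrete, here is a minimal standalone sketch. It is not HAProxy code:
'demo_conf', 'demo_ctx', 'demo_instance', 'demo_attach' and 'demo_detach' are
hypothetical stand-ins that only mimic the 'config' and 'ctx' fields described
above. As a simplification, the helper links the configuration itself, whereas
the instance handed to a real filter already references its configuration.

```c
#include <stdlib.h>

/* Hypothetical configuration: one per 'filter' line, shared by instances */
struct demo_conf {
    const char   *name;        /* e.g. the value of a "name" option */
    unsigned long nb_attached; /* number of instances created so far */
};

/* Hypothetical per-stream context: private to a single instance */
struct demo_ctx {
    unsigned long bytes_seen;
};

/* Reduced stand-in for 'struct filter' */
struct demo_instance {
    struct demo_conf *config; /* shared, never freed by an instance */
    void             *ctx;    /* private, freed on detach */
};

/* Create the context of a new instance. Returns -1 on error, else 1. */
static int demo_attach(struct demo_instance *inst, struct demo_conf *conf)
{
    struct demo_ctx *ctx = calloc(1, sizeof(*ctx));

    if (!ctx)
        return -1;
    inst->config = conf;
    inst->ctx    = ctx;
    conf->nb_attached++;
    return 1;
}

/* Release the private context only: the configuration outlives instances */
static void demo_detach(struct demo_instance *inst)
{
    free(inst->ctx);
    inst->ctx = NULL;
}
```

Two instances created from the same 'demo_conf' see the same configuration
pointer but distinct contexts. In a multi-threaded process, a shared counter
such as 'nb_attached' would need atomic updates or a lock, which this sketch
leaves out.
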
3.2. DEFINING THE FILTER NAME AND ITS CONFIGURATION
---------------------------------------------------

During the filter development, the first thing to do is to add it to the
supported filters. To do so, its name must be registered as a valid keyword on
the filter line :

    /* Declare the filter parser for "my_filter" keyword */
    static struct flt_kw_list flt_kws = { "MY_FILTER_SCOPE", { }, {
            { "my_filter", parse_my_filter_cfg, NULL /* private data */ },
            { NULL, NULL, NULL },
        }
    };
    INITCALL1(STG_REGISTER, flt_register_keywords, &flt_kws);


Then the filter internal configuration must be defined. For instance :

    struct my_filter_config {
        struct proxy *proxy;
        char         *name;
        /* ... */
    };


All callbacks implemented by the filter must then be declared. Here, a global
variable is used :

    struct flt_ops my_filter_ops = {
        .init   = my_filter_init,
        .deinit = my_filter_deinit,
        .check  = my_filter_config_check,

        /* ... */
    };


Finally, the function to parse the filter configuration must be written, here
'parse_my_filter_cfg'. This function must parse all remaining keywords on the
filter line :

    /* Return -1 on error, else 0 */
    static int
    parse_my_filter_cfg(char **args, int *cur_arg, struct proxy *px,
                        struct flt_conf *flt_conf, char **err, void *private)
    {
        struct my_filter_config *my_conf;
        int pos = *cur_arg;

        /* Allocate the internal configuration used by the filter */
        my_conf = calloc(1, sizeof(*my_conf));
        if (!my_conf) {
            memprintf(err, "%s : out of memory", args[*cur_arg]);
            return -1;
        }
        my_conf->proxy = px;

        /* ... */

        /* Parse all keywords supported by the filter and fill the internal
         * configuration */
        pos++; /* Skip the filter name */
        while (*args[pos]) {
            if (!strcmp(args[pos], "name")) {
                if (!*args[pos + 1]) {
                    memprintf(err, "'%s' : '%s' option without value",
                              args[*cur_arg], args[pos]);
                    goto error;
                }
                my_conf->name = strdup(args[pos + 1]);
                if (!my_conf->name) {
                    memprintf(err, "%s : out of memory", args[*cur_arg]);
                    goto error;
                }
                pos += 2;
            }

            /* ... parse other keywords ... */
        }
        *cur_arg = pos;

        /* Set callbacks supported by the filter */
        flt_conf->ops = &my_filter_ops;

        /* Last, save the internal configuration */
        flt_conf->conf = my_conf;
        return 0;

      error:
        free(my_conf->name); /* free(NULL) is a no-op */
        free(my_conf);
        return -1;
    }


WARNING : In this parsing function, 'flt_conf->ops' must be initialized. All
          arguments of the filter line must also be parsed. This is mandatory.

In the previous example, the filter line should be read as follows :

    filter my_filter name MY_NAME ...


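The keyword loop of such a parsing function can be exercised outside HAProxy
with the reduced sketch below. It is only an illustration under assumptions:
'demo_parse_filter_line' is a hypothetical name, the argument array is assumed
to be terminated by an empty string with the filter name at index <*cur_arg>,
and the value is referenced instead of duplicated to keep the example short.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the keyword loop of a filter parsing function.
 * <args> is the tokenized 'filter' line, terminated by an empty string, and
 * <*cur_arg> is the index of the filter name. On success, <*name> points to
 * the value of the "name" option (NULL if absent) and <*cur_arg> is moved
 * past the last argument consumed. Returns -1 on error, else 0.
 */
static int demo_parse_filter_line(char **args, int *cur_arg, const char **name)
{
    int pos = *cur_arg + 1; /* Skip the filter name */

    *name = NULL;
    while (*args[pos]) {
        if (strcmp(args[pos], "name") != 0)
            break;          /* Unknown keyword: stop parsing here */
        if (!*args[pos + 1])
            return -1;      /* "name" option without value */
        *name = args[pos + 1];
        pos += 2;
    }
    *cur_arg = pos;
    return 0;
}
```

For the line "filter my_filter name MY_NAME", called with <*cur_arg> set to 1,
the function returns 0, sets <*name> to "MY_NAME" and moves <*cur_arg> to 4. A
trailing "name" without value makes it return -1 and leaves <*cur_arg>
untouched.
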
Optionally, by implementing the 'flt_ops.check' callback, an extra step is
added to check the internal configuration of the filter after the parsing
phase, when the HAProxy configuration is fully defined. For instance :

    /* Check configuration of the filter for a specified proxy.
     * Return 1 on error, else 0. */
    static int
    my_filter_config_check(struct proxy *px, struct flt_conf *my_conf)
    {
        if (px->mode != PR_MODE_HTTP) {
            ha_alert("The filter 'my_filter' cannot be used in non-HTTP mode.\n");
            return 1;
        }

        /* ... */

        return 0;
    }



3.3. MANAGING THE FILTER LIFECYCLE
----------------------------------

Once the configuration is parsed and checked, filters are ready to be used.
There are two main callbacks to manage the filter lifecycle :

  * 'flt_ops.init' : It initializes the filter for a proxy. This callback may
                     be defined to finish the filter configuration.

  * 'flt_ops.deinit' : It cleans up what the parsing function and the init
                       callback have done. This callback is useful to release
                       memory allocated for the filter configuration.

Here is an example :

    /* Initialize the filter. Returns -1 on error, else 0. */
    static int
    my_filter_init(struct proxy *px, struct flt_conf *fconf)
    {
        struct my_filter_config *my_conf = fconf->conf;

        /* ... */

        return 0;
    }

    /* Free resources allocated by the filter. */
    static void
    my_filter_deinit(struct proxy *px, struct flt_conf *fconf)
    {
        struct my_filter_config *my_conf = fconf->conf;

        if (my_conf) {
            free(my_conf->name);
            /* ... */
            free(my_conf);
        }
        fconf->conf = NULL;
    }


3.3.1. DEALING WITH THREADS
---------------------------

When HAProxy is compiled with the threads support and started with more than
one thread (global.nbthread > 1), then it is possible to manage the filter per
thread with the following callbacks :

  * 'flt_ops.init_per_thread': It initializes the filter for each thread. It
                               works the same way as 'flt_ops.init' but in the
                               context of a thread. This callback is called
                               after the thread creation.

  * 'flt_ops.deinit_per_thread': It cleans up what the init_per_thread callback
                                 has done. It is called in the context of a
                                 thread, before exiting it.

It is the filter's responsibility to deal with concurrency. check, init and
deinit callbacks are called on the main thread. All others are called on a
"worker" thread (not always the same). It is also the filter's responsibility
to know if HAProxy is started with more than one thread. If it is started with
one thread (or compiled without the threads support), these callbacks will be
silently ignored (in this case, global.nbthread will always be equal to one).


3.4. HANDLING THE STREAMS ACTIVITY
----------------------------------

It may be interesting to handle streams activity. For now, there are three
callbacks that may be defined to do so :

  * 'flt_ops.stream_start' : It is called when a stream is started. This
                             callback can fail by returning a negative value.
                             It will be considered as a critical error by
                             HAProxy which will disable the listener for a
                             short time.

  * 'flt_ops.stream_set_backend' : It is called when a backend is set for a
                                   stream. This callback will be called for
                                   all filters attached to a stream (frontend
                                   and backend). Note this callback is not
                                   called if the frontend and the backend are
                                   the same.

  * 'flt_ops.stream_stop' : It is called when a stream is stopped. This
                            callback always succeeds. Anyway, it is too late
                            to return an error.

For instance :

    /* Called when a stream is created. Returns -1 on error, else 0. */
    static int
    my_filter_stream_start(struct stream *s, struct filter *filter)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... */

        return 0;
    }

    /* Called when a backend is set for a stream */
    static int
    my_filter_stream_set_backend(struct stream *s, struct filter *filter,
                                 struct proxy *be)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... */

        return 0;
    }

    /* Called when a stream is destroyed */
    static void
    my_filter_stream_stop(struct stream *s, struct filter *filter)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... */
    }


WARNING : Handling the streams creation and destruction is only possible for
          filters defined on proxies with the frontend capability.

In addition, it is possible to handle creation and destruction of filter
instances using the following callbacks:

  * 'flt_ops.attach' : It is called after a filter instance creation, when it
                       is attached to a stream. This happens when the stream
                       is started for filters defined on the stream's
                       frontend and when the backend is set for filters
                       declared on the stream's backend. It is possible to
                       ignore the filter, if needed, by returning 0. This
                       could be useful to have conditional filtering.

  * 'flt_ops.detach' : It is called when a filter instance is detached from a
                       stream, before its destruction. This happens when the
                       stream is stopped for filters defined on the stream's
                       frontend and when the analyze ends for filters defined
                       on the stream's backend.

For instance :

    /* Called when a filter instance is created and attached to a stream */
    static int
    my_filter_attach(struct stream *s, struct filter *filter)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        if (/* ... */)
            return 0; /* Ignore the filter here */
        return 1;
    }

    /* Called when a filter instance is detached from a stream, just before its
     * destruction */
    static void
    my_filter_detach(struct stream *s, struct filter *filter)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... */
    }

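The conditional filtering allowed by the attach callback can be pictured with
the standalone sketch below. It is not HAProxy code: the callback type, the
two sample callbacks and 'demo_attach_all' are hypothetical, and treating a
negative return value as an error is an assumption of this example, since the
text above only describes the value 0.

```c
#include <stddef.h>

/* Hypothetical attach callback: <0 on error (assumed), 0 to be ignored,
 * any positive value to keep the instance on the stream. */
typedef int (*demo_attach_cb)(int stream_is_http);

/* A filter that always wants to be attached */
static int demo_attach_always(int stream_is_http)
{
    (void)stream_is_http;
    return 1;
}

/* A filter that asks to be ignored on non-HTTP streams */
static int demo_attach_http_only(int stream_is_http)
{
    return stream_is_http ? 1 : 0;
}

/*
 * Call every attach callback in order and count the instances kept on the
 * stream. A filter returning 0 is skipped. Returns -1 if a callback fails.
 */
static int demo_attach_all(const demo_attach_cb *cbs, size_t nb, int is_http)
{
    int kept = 0;
    size_t i;

    for (i = 0; i < nb; i++) {
        int ret = cbs[i](is_http);

        if (ret < 0)
            return -1;
        if (ret > 0)
            kept++;
    }
    return kept;
}
```

With both sample callbacks registered, two instances are kept on an HTTP
stream and only one on a non-HTTP stream.
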
Finally, it may be interesting to notify the filter when the stream is woken up
because of an expired timer. This gives it a chance to check some internal
timeouts, if any. To do so the following callback must be used :

  * 'flt_ops.check_timeouts' : It is called when a stream is woken up because
                               of an expired timer.

For instance :

    /* Called when a stream is woken up because of an expired timer */
    static void
    my_filter_check_timeouts(struct stream *s, struct filter *filter)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... */
    }


3.5. ANALYZING THE CHANNELS ACTIVITY
------------------------------------

The main purpose of filters is to take part in the channels analyzing. To do
so, there are 2 callbacks, 'flt_ops.channel_pre_analyze' and
'flt_ops.channel_post_analyze', called respectively before and after each
analyzer attached to a channel, except analyzers responsible for the data
forwarding (TCP or HTTP). Concretely, on the request channel, these callbacks
can be called around the following analyzers :

  * tcp_inspect_request        (AN_REQ_INSPECT_FE and AN_REQ_INSPECT_BE)
  * http_wait_for_request      (AN_REQ_WAIT_HTTP)
  * http_wait_for_request_body (AN_REQ_HTTP_BODY)
  * http_process_req_common    (AN_REQ_HTTP_PROCESS_FE)
  * process_switching_rules    (AN_REQ_SWITCHING_RULES)
  * http_process_req_common    (AN_REQ_HTTP_PROCESS_BE)
  * http_process_tarpit        (AN_REQ_HTTP_TARPIT)
  * process_server_rules       (AN_REQ_SRV_RULES)
  * http_process_request       (AN_REQ_HTTP_INNER)
  * tcp_persist_rdp_cookie     (AN_REQ_PRST_RDP_COOKIE)
  * process_sticking_rules     (AN_REQ_STICKING_RULES)

And on the response channel :

  * tcp_inspect_response    (AN_RES_INSPECT)
  * http_wait_for_response  (AN_RES_WAIT_HTTP)
  * process_store_rules     (AN_RES_STORE_RULES)
  * http_process_res_common (AN_RES_HTTP_PROCESS_BE)

Unlike the callbacks seen previously, 'flt_ops.channel_pre_analyze' can
interrupt the stream processing. So a filter can decide not to execute the
analyzer that follows and to wait for the next iteration. If there is more than
one filter, the following ones are skipped. On the next iteration, the
filtering resumes where it was stopped, i.e. on the filter that has previously
stopped the processing. So it is possible for a filter to stop the stream
processing on a specific analyzer for a while before continuing. Moreover, this
callback can be called many times for the same analyzer, until it finishes its
processing. For instance :

    /* Called before a processing happens on a given channel.
     * Returns a negative value if an error occurs, 0 if it needs to wait,
     * any other value otherwise. */
    static int
    my_filter_chn_pre_analyze(struct stream *s, struct filter *filter,
                              struct channel *chn, unsigned an_bit)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        switch (an_bit) {
            case AN_REQ_WAIT_HTTP:
                if (/* wait that a condition is verified before continuing */)
                    return 0;
                break;
            /* ... */
        }
        return 1;
    }

  * 'an_bit' is the analyzer id. All analyzers are listed in
    'include/haproxy/channel-t.h'.

  * 'chn' is the channel on which the analyzing is done. It is possible to
    determine if it is the request or the response channel by testing if
    CF_ISRESP flag is set :

      ((chn->flags & CF_ISRESP) == CF_ISRESP)


In the previous example, the stream processing is blocked before receipt of the
HTTP request until a condition is verified.

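The resume mechanism can be simulated outside HAProxy with the sketch below.
It is only a model under assumptions: 'demo_strm_flt', 'demo_pre_analyze' and
'demo_run_pre_analyze' are hypothetical names, filters are reduced to an index
in a fixed-size chain, and each filter simply asks to wait a given number of
times.

```c
#include <stddef.h>

#define DEMO_NB_FILTERS 3

/* Hypothetical stand-in for the per-stream state of the filter chain */
struct demo_strm_flt {
    int waits_left[DEMO_NB_FILTERS]; /* times each filter still asks to wait */
    int calls[DEMO_NB_FILTERS];      /* times each filter was called */
    int current;                     /* filter to resume from, -1 for first */
};

/* Stand-in pre-analyze callback: returns 0 to wait, 1 to continue */
static int demo_pre_analyze(struct demo_strm_flt *sf, int idx)
{
    sf->calls[idx]++;
    if (sf->waits_left[idx] > 0) {
        sf->waits_left[idx]--;
        return 0;
    }
    return 1;
}

/*
 * Run one iteration over the chain. Returns 1 if every filter let the
 * analyzer run, 0 if a filter interrupted the processing. In the latter
 * case, the position is saved so that the next iteration resumes on the
 * filter that stopped, without calling the previous ones again.
 */
static int demo_run_pre_analyze(struct demo_strm_flt *sf)
{
    int idx = (sf->current < 0) ? 0 : sf->current;

    for (; idx < DEMO_NB_FILTERS; idx++) {
        if (!demo_pre_analyze(sf, idx)) {
            sf->current = idx;
            return 0;
        }
    }
    sf->current = -1;
    return 1;
}
```

If the second filter asks to wait twice, the first filter is called once, the
second one three times and the third one once, after three iterations in
total.
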
'flt_ops.channel_post_analyze', for its part, is not resumable. It returns a
negative value if an error occurs, any other value otherwise. It is called when
a filterable analyzer finishes its processing, so only once for a given
analyzer. For instance :

    /* Called after a processing happens on a given channel.
     * Returns a negative value if an error occurs, any other
     * value otherwise. */
    static int
    my_filter_chn_post_analyze(struct stream *s, struct filter *filter,
                               struct channel *chn, unsigned an_bit)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);
        struct http_msg         *msg;

        switch (an_bit) {
            case AN_REQ_WAIT_HTTP:
                if (/* A test on received headers before any other treatment */) {
                    msg = ((chn->flags & CF_ISRESP) ? &s->txn->rsp : &s->txn->req);
                    s->txn->status = 400;
                    msg->msg_state = HTTP_MSG_ERROR;
                    http_reply_and_close(s, s->txn->status, http_error_message(s));
                    return -1; /* This is an error ! */
                }
                break;
            /* ... */
        }
        return 1;
    }


Pre and post analyzer callbacks of a filter are not automatically called. They
must be registered explicitly on analyzers, updating the value of
'filter.pre_analyzers' and 'filter.post_analyzers' bit fields. All analyzer
bits are listed in 'include/haproxy/channel-t.h'. Here is an example :

    static int
    my_filter_stream_start(struct stream *s, struct filter *filter)
    {
        /* ... */

        /* Register the pre analyzer callback on all request and response
         * analyzers */
        filter->pre_analyzers |= (AN_REQ_ALL | AN_RES_ALL);

        /* Register the post analyzer callback only on AN_REQ_WAIT_HTTP and
         * AN_RES_WAIT_HTTP analyzers */
        filter->post_analyzers |= (AN_REQ_WAIT_HTTP | AN_RES_WAIT_HTTP);

        /* ... */
        return 0;
    }


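How such bit fields gate the callbacks can be sketched with plain bit tests.
The example below is standalone and purely illustrative: the 'DEMO_AN_*'
values are hypothetical and do not match the real analyzer bits, and
'demo_filter', 'demo_register', 'demo_call_pre' and 'demo_call_post' are names
invented for this sketch.

```c
/* Hypothetical analyzer bits, unrelated to the real AN_* values */
#define DEMO_AN_REQ_INSPECT 0x01u
#define DEMO_AN_REQ_WAIT    0x02u
#define DEMO_AN_REQ_BODY    0x04u
#define DEMO_AN_REQ_ALL     (DEMO_AN_REQ_INSPECT | DEMO_AN_REQ_WAIT | \
                             DEMO_AN_REQ_BODY)

/* Reduced stand-in for the two bit fields of a filter instance */
struct demo_filter {
    unsigned int pre_analyzers;
    unsigned int post_analyzers;
};

/* Register the pre callback on all analyzers, the post one on a single one */
static void demo_register(struct demo_filter *f)
{
    f->pre_analyzers  |= DEMO_AN_REQ_ALL;
    f->post_analyzers |= DEMO_AN_REQ_WAIT;
}

/* Non-zero if the pre-analyze callback must run for analyzer <an_bit> */
static int demo_call_pre(const struct demo_filter *f, unsigned int an_bit)
{
    return (f->pre_analyzers & an_bit) != 0;
}

/* Non-zero if the post-analyze callback must run for analyzer <an_bit> */
static int demo_call_post(const struct demo_filter *f, unsigned int an_bit)
{
    return (f->post_analyzers & an_bit) != 0;
}
```

Before registration no callback is selected. After 'demo_register', the pre
callback is selected for every analyzer of the sketch while the post callback
is selected for 'DEMO_AN_REQ_WAIT' only.
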
To surround the activity of a filter during the channel analyzing, two new
analyzers have been added :

  * 'flt_start_analyze' (AN_REQ/RES_FLT_START_FE and
    AN_REQ/RES_FLT_START_BE) : For a specific filter, this analyzer is called
    before any call to the 'channel_pre_analyze' callback. From the filter
    point of view, it calls the 'flt_ops.channel_start_analyze' callback.

  * 'flt_end_analyze' (AN_REQ/RES_FLT_END) : For a specific filter, this
    analyzer is called when all other analyzers have finished their
    processing. From the filter point of view, it calls the
    'flt_ops.channel_end_analyze' callback.

These analyzers are called only once per stream.

'flt_ops.channel_start_analyze' and 'flt_ops.channel_end_analyze' callbacks can
interrupt the stream processing, as 'flt_ops.channel_pre_analyze'. Here is an
example :

    /* Called when analyze starts for a given channel
     * Returns a negative value if an error occurs, 0 if it needs to wait,
     * any other value otherwise. */
    static int
    my_filter_chn_start_analyze(struct stream *s, struct filter *filter,
                                struct channel *chn)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... TODO ... */

        return 1;
    }

    /* Called when analyze ends for a given channel
     * Returns a negative value if an error occurs, 0 if it needs to wait,
     * any other value otherwise. */
    static int
    my_filter_chn_end_analyze(struct stream *s, struct filter *filter,
                              struct channel *chn)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* ... TODO ... */

        return 1;
    }


Workflow on channels can be summarized as follows :

    FE: Called for filters defined on the stream's frontend
    BE: Called for filters defined on the stream's backend

    Stream creation
      1. flt_ops.attach (FE)
      2. flt_ops.stream_start (FE)
         ...

    FRONTEND
      3. flt_start_analyze (FE), calling flt_ops.channel_start_analyze. It is
         repeated as long as a filter needs to wait.
      4. For each analyzer (FE), possibly several times :
           flt_ops.channel_pre_analyze -> analyzer (FE)
                                       -> flt_ops.channel_post_analyze

    Backend selection
      5. flt_ops.attach (BE)
      6. flt_ops.stream_set_backend (FE+BE)
         ...

    BACKEND
      7. flt_start_analyze (BE), calling flt_ops.channel_start_analyze. It is
         repeated as long as a filter needs to wait.
      8. For each analyzer (FE+BE), possibly several times :
           flt_ops.channel_pre_analyze -> analyzer (FE+BE)
                                       -> flt_ops.channel_post_analyze
         ...
      9. [ data filtering (see below) ]
         ...
     10. flt_end_analyze (FE+BE), calling flt_ops.channel_end_analyze. It is
         repeated as long as a filter needs to wait.

    Stream termination
     11. flt_ops.detach (BE)
     12. flt_ops.stream_stop (FE)
     13. flt_ops.detach (FE)

Zooming in on an analyzer box, we have:
|
|
|
|
...
|
|
|
|
|
V
|
|
|
|
|
+-----------<-----------+
|
|
| |
|
|
+-----------------+--------------------+ |
|
|
| | | |
|
|
| +--------<---------+ | |
|
|
| | | | |
|
|
| V | | |
|
|
| flt_ops.channel_pre_analyze ->-+ | ^
|
|
| | | |
|
|
| | | |
|
|
| V | |
|
|
| analyzer --------->-----+--+
|
|
| | |
|
|
| | |
|
|
| V |
|
|
| flt_ops.channel_post_analyze |
|
|
| | |
|
|
| | |
|
|
+-----------------+--------------------+
|
|
|
|
|
V
|
|
...
|
|
|
|
|
|
3.6. FILTERING THE DATA EXCHANGED
|
|
-----------------------------------
|
|
|
|
WARNING : To fully understand this part, it is important to be aware of how the
          buffers work in HAProxy. For the HTTP part, it is also important to
          understand how data are parsed and structured, and how the internal
          representation, called HTX, works. See doc/internals/buffer-api.txt
          and doc/internals/htx-api.txt for details.
|
|
|
|
Data filtering is an extended feature of filters. By default, a filter does not
look into the data exchanged between the client and the server because it is
expensive: instead of forwarding data without any processing, each byte needs
to be buffered.
|
|
|
|
So, to enable data filtering on a channel, the 'register_data_filter' function
must be called, at any time, from one of the previous callbacks. Conversely, to
disable it, the 'unregister_data_filter' function must be called. For instance :
|
|
|
|
    static int
    my_filter_http_headers(struct stream *s, struct filter *filter,
                           struct http_msg *msg)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);

        /* 'msg->chn' must be the request channel */
        if (!(msg->chn->flags & CF_ISRESP)) {
            struct htx *htx;
            struct ist hdr;
            struct http_hdr_ctx ctx;

            htx = htxbuf(&msg->chn->buf);

            /* Enable the data filtering for the request if 'X-Filter' header
             * is set to 'true'. */
            hdr = ist("X-Filter");
            ctx.blk = NULL;
            if (http_find_header(htx, hdr, &ctx, 0) &&
                ctx.value.len >= 4 && memcmp(ctx.value.ptr, "true", 4) == 0)
                register_data_filter(s, msg->chn, filter);
        }

        return 1;
    }
|
|
|
|
Here, the data filtering is enabled if the HTTP header 'X-Filter' is found and
|
|
set to 'true'.
|
|
|
|
If several filters are declared, the evaluation order remains the same,
regardless of the order in which they register for data filtering. Registrations
must be performed before the data forwarding step. However, a filter may be
unregistered from data filtering at any time.
|
|
|
|
Depending on the stream type, TCP or HTTP, the way to handle data filtering is
different. HTTP data are structured while TCP data are raw, and there are more
callbacks for HTTP streams to fully handle all steps of an HTTP transaction. But
the main part is the same. The data filtering is performed in one callback,
called in a loop on input data starting at a specific offset for a given
length. Data analyzed by a filter are considered as forwarded from its point of
view. Because filters are chained, a filter never analyzes more data than its
predecessors. Thus only data analyzed by the last filter are effectively
forwarded. This means that, at any time, any filter may choose not to analyze
all available data (available from its point of view), which blocks the data
forwarding.
|
|
|
|
Internally, each filter owns 2 offsets, one per channel, representing the number
of bytes already analyzed in the available input data. There is also a pair of
offsets at the stream level, in the strm_flt object, representing the total
number of bytes already forwarded. These offsets may be retrieved and updated
using the following macros :
|
|
|
|
* FLT_OFF(flt, chn)
|
|
|
|
* FLT_STRM_OFF(s, chn)
|
|
|
|
where 'flt' is the 'struct filter' passed as argument in all callbacks, 's' the
|
|
filtered stream and 'chn' is the considered channel. However, there is no reason
|
|
for a filter to use these macros or take care of these offsets.
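
The chaining rule above can be illustrated with a small standalone model. This
is only a sketch, not HAProxy code: 'struct model_filter', 'model_payload_pass'
and the 'max_per_call' limit are hypothetical names invented for the
illustration. Each filter keeps an offset of analyzed bytes, a filter only sees
what its predecessor analyzed, and only what the last filter analyzed is counted
as forwarded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one filter in the chain (not a HAProxy type). */
struct model_filter {
    unsigned int max_per_call;   /* 0 means "analyze everything available" */
    unsigned long long off;      /* total bytes analyzed by this filter */
};

/* Runs one pass over 'avail' bytes of input that are not forwarded yet.
 * 'strm_off' models the stream-level offset (total bytes forwarded).
 * Returns the number of bytes forwarded by this pass. */
unsigned int model_payload_pass(struct model_filter *flts, size_t nflts,
                                unsigned long long *strm_off,
                                unsigned int avail)
{
    unsigned int data = avail;  /* bytes visible to the current filter */
    unsigned int ret = avail;   /* bytes analyzed by the last filter called */
    size_t i;

    for (i = 0; i < nflts; i++) {
        /* Bytes this filter analyzed in earlier passes, still unforwarded */
        unsigned int offset = (unsigned int)(flts[i].off - *strm_off);

        assert(offset <= data);
        ret = data - offset;
        if (flts[i].max_per_call && ret > flts[i].max_per_call)
            ret = flts[i].max_per_call;
        flts[i].off += ret;
        /* The next filter never sees more than this one has analyzed */
        data = offset + ret;
    }
    /* Only what the last filter analyzed is effectively forwarded */
    *strm_off += ret;
    return ret;
}
```

With a first filter that analyzes everything and a second one limited to 30
bytes per call, a pass over 100 available bytes forwards only 30 bytes, although
the first filter has already analyzed all 100.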
|
|
|
|
|
|
3.6.1 FILTERING DATA ON TCP STREAMS
|
|
-----------------------------------
|
|
|
|
Data filtering for TCP streams is the easy case, because HAProxy does not parse
these data: they are stored raw in the buffer. So there is only one callback to
consider:
|
|
|
|
  * 'flt_ops.tcp_payload' : This callback is called when input data are
    available. If not defined, all available data will be considered as analyzed
    and forwarded from the filter point of view.
|
|
|
|
This callback is called only if the filter is registered to analyze TCP
|
|
data. Here is an example :
|
|
|
|
    /* Returns a negative value if an error occurs, else the number of
     * consumed bytes. */
    static int
    my_filter_tcp_payload(struct stream *s, struct filter *filter,
                          struct channel *chn, unsigned int offset,
                          unsigned int len)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);
        int ret = len;

        /* Do not parse more than 'my_conf->max_parse' bytes at a time */
        if (my_conf->max_parse != 0 && ret > my_conf->max_parse)
            ret = my_conf->max_parse;

        /* if available data are not completely parsed, wake up the stream to
         * be sure to not freeze it. The best is probably to set a
         * chn->analyse_exp timer */
        if (ret != len)
            task_wakeup(s->task, TASK_WOKEN_MSG);
        return ret;
    }
|
|
|
|
Note, however, that tunnelled data of an HTTP stream may also be filtered via
this callback. Tunnelled data are data exchanged after an HTTP tunnel is
established between the client and the server, via an HTTP CONNECT or via a
protocol upgrade. In this case, the data are structured. Of course, to do so,
the filter must be able to parse HTX data and must have the FLT_CFG_FL_HTX flag
set. At any time, the IS_HTX_STRM() macro may be used on the stream to know if
it is an HTX stream or a TCP stream.
|
|
|
|
|
|
3.6.2 FILTERING DATA ON HTTP STREAMS
|
|
------------------------------------
|
|
|
|
HTTP data filtering is a bit more complex because the data are structured and
represented in an internal format, called HTX. So basically there is the HTTP
counterpart to the previous callback :
|
|
|
|
* 'flt_ops.http_payload' : This callback is called when input data are
|
|
available. If not defined, all available data will be considered as analyzed
|
|
and forwarded for the filter.
|
|
|
|
But the prototype of this callback is slightly different. Instead of having
the channel as parameter, we have the HTTP message (struct http_msg). This
callback is called only if the filter is registered to analyze HTTP data. Here
is an example :
|
|
|
|
    /* Returns a negative value if an error occurs, else the number of
     * consumed bytes. */
    static int
    my_filter_http_payload(struct stream *s, struct filter *filter,
                           struct http_msg *msg, unsigned int offset,
                           unsigned int len)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);
        struct htx *htx = htxbuf(&msg->chn->buf);
        struct htx_ret htxret = htx_find_offset(htx, offset);
        struct htx_blk *blk;

        /* 'blk' is the block containing 'offset'; 'offset' becomes relative
         * to the start of that block. */
        blk = htxret.blk;
        offset = htxret.ret;
        for (; blk; blk = htx_get_next_blk(htx, blk)) {
            enum htx_blk_type type = htx_get_blk_type(blk);

            if (type == HTX_BLK_UNUSED)
                continue;
            else if (type == HTX_BLK_DATA) {
                /* filter data */
            }
            else
                break;
        }

        return len;
    }
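
The loop in the example above leaves open how 'offset' and 'len' bound the data
to filter. The following standalone sketch models that bookkeeping. It does not
use the HTX API: 'struct model_blk', the 'MODEL_BLK_*' types and
'model_filter_range' are hypothetical names, and blocks are reduced to a type
and a size. It skips the blocks located before 'offset', then walks DATA blocks
until 'len' bytes are covered or a block of another type is found.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for HTX block types (not the HAProxy definitions). */
enum model_blk_type { MODEL_BLK_UNUSED, MODEL_BLK_DATA, MODEL_BLK_OTHER };

struct model_blk {
    enum model_blk_type type;
    unsigned int size;          /* payload size in bytes, 0 for unused blocks */
};

/* Returns the number of DATA bytes found in [offset, offset + len), stopping
 * at the first block that is neither DATA nor UNUSED. */
unsigned int model_filter_range(const struct model_blk *blks, size_t nblks,
                                unsigned int offset, unsigned int len)
{
    unsigned int consumed = 0;
    size_t i = 0;

    /* Skip whole blocks located before 'offset'. Afterwards, 'offset' is
     * relative to the start of blks[i]. */
    while (i < nblks && offset >= blks[i].size) {
        offset -= blks[i].size;
        i++;
    }

    for (; i < nblks && consumed < len; i++) {
        unsigned int chunk;

        if (blks[i].type == MODEL_BLK_UNUSED)
            continue;
        if (blks[i].type != MODEL_BLK_DATA)
            break;

        assert(offset <= blks[i].size);
        chunk = blks[i].size - offset;
        if (chunk > len - consumed)
            chunk = len - consumed;

        /* ... filter 'chunk' bytes starting at 'offset' in this block ... */
        consumed += chunk;
        offset = 0;             /* following blocks are read from their start */
    }
    return consumed;
}
```

In this model, only the first block visited can be entered in the middle, and
the last one may be left before its end when 'len' is reached.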
|
|
|
|
In addition, there are two other callbacks :
|
|
|
|
  * 'flt_ops.http_headers' : This callback is called just before the HTTP body
    forwarding and after any processing on the request/response HTTP
    headers. When defined, this callback is always called for HTTP streams
    (i.e. without needing to register for data filtering).
    Here is an example :
|
|
|
|
|
|
    /* Returns a negative value if an error occurs, 0 if it needs to wait,
     * any other value otherwise. */
    static int
    my_filter_http_headers(struct stream *s, struct filter *filter,
                           struct http_msg *msg)
    {
        struct my_filter_config *my_conf = FLT_CONF(filter);
        struct htx *htx = htxbuf(&msg->chn->buf);
        struct htx_sl *sl = http_get_stline(htx);
        int32_t pos;

        for (pos = htx_get_first(htx); pos != -1; pos = htx_get_next(htx, pos)) {
            struct htx_blk *blk = htx_get_blk(htx, pos);
            enum htx_blk_type type = htx_get_blk_type(blk);
            struct ist n, v;

            if (type == HTX_BLK_EOH)
                break;
            if (type != HTX_BLK_HDR)
                continue;

            n = htx_get_blk_name(htx, blk);
            v = htx_get_blk_value(htx, blk);
            /* Do something on the header name/value */
        }

        return 1;
    }
|
|
|
|
  * 'flt_ops.http_end' : This callback is called when the whole HTTP message
    has been processed. It may interrupt the stream processing. So, it could be
    used to synchronize the HTTP request with the HTTP response, for instance :
|
|
|
|
    /* Returns a negative value if an error occurs, 0 if it needs to wait,
     * any other value otherwise. */
    static int
    my_filter_http_end(struct stream *s, struct filter *filter,
                       struct http_msg *msg)
    {
        struct my_filter_ctx *my_ctx = filter->ctx;

        if (!(msg->chn->flags & CF_ISRESP)) /* The request */
            my_ctx->end_of_req = 1;
        else /* The response */
            my_ctx->end_of_rsp = 1;

        /* Both the request and the response are finished */
        if (my_ctx->end_of_req == 1 && my_ctx->end_of_rsp == 1)
            return 1;

        /* Wait */
        return 0;
    }
|
|
|
|
Finally, there are 2 informational callbacks :
|
|
|
|
* 'flt_ops.http_reset' : This callback is called when an HTTP message is
|
|
reset. This happens either when a 1xx informational response is received, or
|
|
if we're retrying to send the request to the server after it failed. It
|
|
could be useful to reset the filter context before receiving the true
|
|
response.
|
|
By checking s->txn->status, it is possible to know why this callback is
|
|
called. If it's a 1xx, we're called because of an informational
|
|
message. Otherwise, it is a L7 retry.
|
|
|
|
  * 'flt_ops.http_reply' : This callback is called when, at any time, HAProxy
    decides to stop the processing of an HTTP message and to send an internal
    response to the client. This mainly happens when an error or a redirect
    occurs.
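
The status check described for 'flt_ops.http_reset' can be sketched as follows.
This is a standalone illustration, not HAProxy code: 'struct model_ctx', its
fields and 'model_http_reset' are hypothetical, and the status is passed as a
plain integer where a real filter would read s->txn->status. The response state
is cleared in both cases, as a guess at what "resetting the filter context"
could mean for a simple filter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reasons for a reset, derived from the transaction status. */
enum model_reset_reason { MODEL_RESET_INFORMATIONAL, MODEL_RESET_L7_RETRY };

/* Hypothetical per-stream filter context. */
struct model_ctx {
    unsigned long long rsp_bytes;   /* response bytes seen so far */
    int end_of_rsp;                 /* set once the response is finished */
};

/* Clears the response state kept in 'ctx' and tells why the message was
 * reset: a 1xx status means an informational response was received, any
 * other value is treated as an L7 retry. */
enum model_reset_reason model_http_reset(struct model_ctx *ctx, int status)
{
    assert(ctx != NULL);

    /* The true response is still to come: forget what was recorded */
    ctx->rsp_bytes = 0;
    ctx->end_of_rsp = 0;

    if (status >= 100 && status < 200)
        return MODEL_RESET_INFORMATIONAL;
    return MODEL_RESET_L7_RETRY;
}
```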
|
|
|
|
|
|
3.6.3 REWRITING DATA
|
|
--------------------
|
|
|
|
The last part, and the trickiest one about data filtering, is data rewriting.
For now, the filter API does not offer many functions to handle it. There are
only functions to notify HAProxy that the data size has changed, to let it
update the internal state of filters. It is the developer's responsibility to
update the data itself, i.e. the buffer offsets, using the following function :
|
|
|
|
  * 'flt_update_offsets()' : This function must be called when a filter alters
    incoming data. It updates the offsets of the stream and of all filters
    preceding the calling one. Not calling this function when a filter changes
    the size of incoming data leads to undefined behavior.
|
|
|
|
A good example of a filter changing the data size is the HTTP compression filter.
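
The offset adjustment can be pictured with the standalone sketch below. It
mirrors the description given above rather than HAProxy's actual
implementation: 'struct model_filter', 'model_update_offsets' and the use of an
array index to identify the calling filter are assumptions made for the
illustration. 'delta' is the size change, negative when data shrink.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a filter: only its analyzed-bytes offset is kept. */
struct model_filter {
    unsigned long long off;
};

/* Applies a size change of 'delta' bytes made by the filter at index
 * 'caller'. The stream offset and the offsets of the filters placed before
 * the caller are shifted; the caller and its successors are left untouched. */
void model_update_offsets(struct model_filter *flts, size_t nflts,
                          size_t caller, unsigned long long *strm_off,
                          int delta)
{
    size_t i;

    assert(caller < nflts);

    /* Adding a negative int to an unsigned 64-bit value wraps modulo 2^64,
     * which yields the expected subtraction. */
    *strm_off += delta;
    for (i = 0; i < caller; i++)
        flts[i].off += delta;
}
```

For instance, if the third filter of a chain removes 10 bytes, the offsets of
the first two filters and of the stream decrease by 10, while the third
filter's own offset is unchanged.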
|