[DOC] added some docs about http headers storage and acls

Willy Tarreau 2007-04-01 09:44:10 +02:00
parent 422505801f
commit 985fc56734
3 changed files with 274 additions and 0 deletions


@ -115,3 +115,148 @@ Sinon, peut-
req in switch URI =^ "/images/" images:"/"
2007/03/31 - More precise requirements.
1) no branching extensions or anything similar in "listen" sections, that is too complex.
Distinguish incoming data (in) from outgoing data (out).
The frontend only sees incoming requests and outgoing responses.
The backend sees requests in/out and responses in/out.
The frontend allows one set of request filters to branch to another. The
frontend and the request filter sets can branch to a backend.
-----------+--------+----------+----------+---------+----------+
   \ Where |        |          |          |         |          |
    \_____ | Listen | Frontend | ReqRules | Backend | RspRules |
          \|        |          |          |         |          |
Capability |        |          |          |         |          |
-----------+--------+----------+----------+---------+----------+
Frontend   |   X    |    X     |          |         |          |
-----------+--------+----------+----------+---------+----------+
FiltReqIn  |   X    |    X     |    X     |    X    |          |
-----------+--------+----------+----------+---------+----------+
JumpFiltReq|   X    |    X     |    X     |         |          | \
-----------+--------+----------+----------+---------+----------+  > = ReqJump
SetBackend |   X    |    X     |    X     |         |          | /
-----------+--------+----------+----------+---------+----------+
FiltReqOut |        |          |          |    X    |          |
-----------+--------+----------+----------+---------+----------+
FiltRspIn  |   X    |          |          |    X    |    X     |
-----------+--------+----------+----------+---------+----------+
JumpFiltRsp|        |          |          |    X    |    X     |
-----------+--------+----------+----------+---------+----------+
FiltRspOut |        |    X     |          |    X    |    X     |
-----------+--------+----------+----------+---------+----------+
Backend    |   X    |          |          |    X    |          |
-----------+--------+----------+----------+---------+----------+
In conclusion
-------------
At least 8 basic capabilities need to be distinguished :
- ability to accept connections (frontend)
- ability to filter incoming requests
- ability to branch to a backend or to a set of request rules
- ability to filter outgoing requests
- ability to filter incoming responses
- ability to branch to another set of response rules
- ability to filter the outgoing response
- ability to manage servers (backend)
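The eight capabilities and the table above map naturally onto bit masks. The following is only a sketch: every identifier (CAP_*, SEC_*, section_can) is invented for illustration, and JumpFiltReq and SetBackend are merged into a single ReqJump bit as the table suggests.

```c
#include <assert.h>
#include <stdbool.h>

/* One bit per basic capability (names are illustrative, not from the notes). */
enum cap {
    CAP_FRONTEND     = 1 << 0, /* accept connections */
    CAP_FILT_REQ_IN  = 1 << 1, /* filter incoming requests */
    CAP_REQ_JUMP     = 1 << 2, /* branch to a backend or a request rule set */
    CAP_FILT_REQ_OUT = 1 << 3, /* filter outgoing requests */
    CAP_FILT_RSP_IN  = 1 << 4, /* filter incoming responses */
    CAP_RSP_JUMP     = 1 << 5, /* branch to another response rule set */
    CAP_FILT_RSP_OUT = 1 << 6, /* filter the outgoing response */
    CAP_BACKEND      = 1 << 7  /* manage servers */
};

enum section {
    SEC_LISTEN, SEC_FRONTEND, SEC_REQRULES, SEC_BACKEND, SEC_RSPRULES, SEC_COUNT
};

/* Columns of the capability table, transcribed as bit masks. */
static const unsigned section_caps[SEC_COUNT] = {
    [SEC_LISTEN]   = CAP_FRONTEND | CAP_FILT_REQ_IN | CAP_REQ_JUMP |
                     CAP_FILT_RSP_IN | CAP_BACKEND,
    [SEC_FRONTEND] = CAP_FRONTEND | CAP_FILT_REQ_IN | CAP_REQ_JUMP |
                     CAP_FILT_RSP_OUT,
    [SEC_REQRULES] = CAP_FILT_REQ_IN | CAP_REQ_JUMP,
    [SEC_BACKEND]  = CAP_FILT_REQ_IN | CAP_FILT_REQ_OUT | CAP_FILT_RSP_IN |
                     CAP_RSP_JUMP | CAP_FILT_RSP_OUT | CAP_BACKEND,
    [SEC_RSPRULES] = CAP_FILT_RSP_IN | CAP_RSP_JUMP | CAP_FILT_RSP_OUT
};

/* True if section type `s` offers every capability in `caps`. */
bool section_can(enum section s, unsigned caps)
{
    assert((unsigned)s < SEC_COUNT);
    return (section_caps[s] & caps) == caps;
}
```

A configuration parser could call section_can() to reject a keyword used in a section that lacks the matching capability.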
Remark
------
- we often need to apply a small processing step to a host/uri/other
combination. This step may consist of a few filters as well as a rewrite of
the (host,uri) pair.
Proposal : ACL
Syntax :
--------
acl <name> <what> <operator> <value> ...
This creates an ACL referenced under the name <name>, which is validated if
applying at least one of the values <value> with the operator <operator> to
the subject <what> is validated.
Operators :
-----------
Always 2 characters :
[=!][~=*^%/.]
First character :
'=' : OK if the test succeeds
'!' : OK if the test fails.
Second character :
'~' : matches against a regex
'=' : compares string to string
'*' : compares the end of the string (ex: =* ".mydomain.com")
'^' : compares the beginning of the string (ex: =^ "/images/")
'%' : searches for a substring
'/' : matches a whole word, accepting '/' as a delimiter
'.' : matches a whole word, accepting '.' as a delimiter
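The two-character operators could be evaluated along the following lines. This is a sketch under several assumptions: the function names are invented, regexes are taken to be POSIX extended, a "whole word" is read as an occurrence bounded by the delimiter or by either end of the subject, and '!' is applied to each value individually, which is the literal reading of the notes.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <string.h>

/* True if `val` occurs in `subj` as a whole word bounded by `delim` or by
 * the ends of the string (assumed meaning of the '/' and '.' operators). */
bool match_word(const char *subj, const char *val, char delim)
{
    size_t vlen = strlen(val);
    const char *p = subj;

    if (vlen == 0)
        return false;
    while ((p = strstr(p, val)) != NULL) {
        bool left_ok  = (p == subj) || (p[-1] == delim);
        bool right_ok = (p[vlen] == '\0') || (p[vlen] == delim);

        if (left_ok && right_ok)
            return true;
        p++; /* try the next occurrence */
    }
    return false;
}

/* Apply the second operator character to a single value. */
bool match_one(const char *subj, char how, const char *val)
{
    size_t slen = strlen(subj), vlen = strlen(val);
    regex_t re;
    bool ok;

    switch (how) {
    case '~':
        if (regcomp(&re, val, REG_EXTENDED | REG_NOSUB) != 0)
            return false; /* invalid regex never matches */
        ok = regexec(&re, subj, 0, NULL, 0) == 0;
        regfree(&re);
        return ok;
    case '=': return strcmp(subj, val) == 0;
    case '*': return slen >= vlen && strcmp(subj + slen - vlen, val) == 0;
    case '^': return strncmp(subj, val, vlen) == 0;
    case '%': return strstr(subj, val) != NULL;
    case '/': return match_word(subj, val, '/');
    case '.': return match_word(subj, val, '.');
    default:  return false; /* unknown operator */
    }
}

/* The ACL is validated if at least one value passes the test; with a
 * leading '!' a value passes when its comparison fails. */
bool acl_match(const char *subj, const char *op,
               const char *const *vals, size_t nvals)
{
    assert(op != NULL && op[0] != '\0' && op[1] != '\0');
    for (size_t i = 0; i < nvals; i++) {
        bool hit = match_one(subj, op[1], vals[i]);

        if (op[0] == '!' ? !hit : hit)
            return true;
    }
    return false;
}
```

With these rules, a host of "www01.example.com" is accepted by the operator "=." and the values www www01 dev preprod, because "www01" is delimited by the start of the string and a dot, while "www" alone is not.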
An action is then executed conditionally if all of the listed ACLs are
validated (or invalidated for those preceded by a "!") :
<what> <where> <action> on [!]<aclname> ...
Example :
---------
acl www_pub host =. www www01 dev preprod
acl imghost host =. images
acl imgdir uri =/ img
acl imagedir uri =/ images
acl msie h(user-agent) =% "MSIE"
set_host "images" on www_pub imgdir
remap_uri "/img" "/" on www_pub imgdir
remap_uri "/images" "/" on www_pub imagedir
setbe images on imghost
reqdel "Cookie" on all
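The "on" clause of such rules combines ACLs with a logical AND, each term being optionally negated. A minimal sketch of that evaluation is shown here, assuming the ACLs have already been evaluated into a table of booleans; struct cond_term and cond_holds are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One term of an "on" condition: an ACL reference, optionally negated. */
struct cond_term {
    size_t acl;    /* index into the table of evaluated ACLs */
    bool   negate; /* true when the name was written as !<aclname> */
};

/* The action runs only if every term holds: plain ACLs must be validated,
 * negated ones must not be. An empty condition always holds. */
bool cond_holds(const struct cond_term *terms, size_t nterms,
                const bool *acl_result)
{
    assert(nterms == 0 || (terms != NULL && acl_result != NULL));
    for (size_t i = 0; i < nterms; i++) {
        if (acl_result[terms[i].acl] == terms[i].negate)
            return false;
    }
    return true;
}
```

For a rule ending in "on www_pub imgdir", both ACL results must be true; a term written "!imgdir" would instead require the imgdir result to be false.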
Possible actions :
req {in|out} {append|delete|rem|add|set|rep|mapuri|rewrite|reqline|deny|allow|setbe|tarpit}
resp {in|out} {append|delete|rem|add|set|rep|maploc|rewrite|stsline|deny|allow}
req in append <line>
req in delete <line_regex>
req in rem <header>
req in add <header> <new_value>
req in set <header> <new_value>
req in rep <header> <old_value> <new_value>
req in mapuri <old_uri_prefix> <new_uri_prefix>
req in rewrite <old_uri_regex> <new_uri>
req in reqline <old_req_regex> <new_req>
req in deny
req in allow
req in tarpit
req in setbe <backend>
resp out maploc <old_location_prefix> <new_loc_prefix>
resp out stsline <old_sts_regex> <new_sts_regex>
Strings must be delimited by the same character at the beginning and at the
end, and that character must be escaped if it appears within the string.
Everything between the closing character and the first following space is
treated as options passed to the processing. For example :
req in rep host /www/i /www/
req in rep connection /keep-alive/i "close"
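The delimiter rule can be sketched as a small parser. The notes do not say which character performs the escaping, so a backslash is assumed here; parse_delimited and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Parse one delimited argument such as /www/i or "close".
 * The pattern is copied to `pat` with escapes removed, and the option
 * characters that follow the closing delimiter are copied to `opts`.
 * Returns the number of input bytes consumed, or 0 on error (empty input,
 * missing closing delimiter, or output buffer too small).
 * ASSUMPTION: the escape character is a backslash.
 */
size_t parse_delimited(const char *in, char *pat, size_t patsz,
                       char *opts, size_t optsz)
{
    size_t i = 1, p = 0, o = 0;
    char delim;

    assert(in != NULL && pat != NULL && opts != NULL);
    delim = in[0];
    if (delim == '\0' || patsz == 0 || optsz == 0)
        return 0;
    while (in[i] != delim) {
        if (in[i] == '\\' && in[i + 1] != '\0')
            i++;                  /* keep the escaped character literally */
        if (in[i] == '\0' || p + 1 >= patsz)
            return 0;             /* unterminated string or pattern too long */
        pat[p++] = in[i++];
    }
    pat[p] = '\0';
    i++;                          /* skip the closing delimiter */
    while (in[i] != '\0' && in[i] != ' ' && in[i] != '\t') {
        if (o + 1 >= optsz)
            return 0;
        opts[o++] = in[i++];
    }
    opts[o] = '\0';
    return i;
}
```

Applied to the first example, `/www/i /www/` yields the pattern "www" with the option "i", and parsing can resume after the returned offset for the second string.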
It would be convenient to be able to perform a remap together with a setbe.
Captures: split them into in/out. Make them conditional ?


@ -0,0 +1,124 @@
2007/03/30 - Header storage in trees
This documentation describes how to store headers in radix trees, providing
fast access to any known position, while retaining the ability to grow/reduce
any arbitrary header without having to recompute all positions.
Principle :
We have a radix tree stored in an integer array. Each entry holds the total
number of bytes used by the headers located between its parent node and
itself, the parent of an index being that index with its lowest set bit
cleared. This ensures that we can compute any header's position in O(log(N))
where N is the number of headers.
Example with N=16 :
           +-----------------------+
           |                       |
     +-----------+           +-----------+
     |           |           |           |
  +-----+     +-----+     +-----+     +-----+
  |     |     |     |     |     |     |     |
 +--+  +--+  +--+  +--+  +--+  +--+  +--+  +--+
 |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
 0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
To reach header 6, we have to compute hdr[0]+hdr[4]+hdr[6]
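The lookup can be written as a short loop that clears the lowest set bit of the index at each step. This sketch assumes that hdr[0] holds the offset of the first header and that the array holds plain ints; tree_pos is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Offset of header `idx` in the buffer. hdr[i] is assumed to hold the number
 * of bytes between the parent of node i and node i itself, and hdr[0] the
 * offset of the first header.
 */
int tree_pos(const int *hdr, size_t n, size_t idx)
{
    int pos;

    assert(hdr != NULL && idx < n);
    pos = hdr[0];
    while (idx != 0) {
        pos += hdr[idx];
        idx &= idx - 1; /* clear the lowest set bit: move to the parent */
    }
    return pos;
}
```

For idx = 6 the loop visits 6 (0110), then 4 (0100), then stops at 0, which gives hdr[0]+hdr[4]+hdr[6].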
With this method, it becomes easy to grow any header and update the array.
To achieve this, for each bit position starting from the lowest, we replace
that bit and all bits on its right with one 1 followed by zeroes. The entry at
the resulting index is updated if that index is higher than the current
position, and we stop when the index is above the number of stored headers.
For instance, if we want to grow hdr[6], we proceed like this :
6 = 0110 (BIN)
Let's consider the values to update :
(bit 0) : (0110 & ~0001) | 0001 = 0111 = 7 > 6 => update
(bit 1) : (0110 & ~0011) | 0010 = 0110 = 6 <= 6 => leave it
(bit 2) : (0110 & ~0111) | 0100 = 0100 = 4 <= 6 => leave it
(bit 3) : (0110 & ~1111) | 1000 = 1000 = 8 > 6 => update
(bit 4) : larger than array size, stop.
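The update rule can be sketched in C as follows. The name tree_resize is hypothetical, and the array is assumed to hold plain ints so that a negative delta shrinks a header.

```c
#include <assert.h>
#include <stddef.h>

/* Grow header `idx` by `delta` bytes (negative to shrink) and fix up every
 * entry whose span covers it. */
void tree_resize(int *hdr, size_t n, size_t idx, int delta)
{
    assert(hdr != NULL && idx < n);
    for (size_t bit = 1; ; bit <<= 1) {
        /* replace this bit and all lower ones with a 1 followed by zeroes */
        size_t node = (idx & ~((bit << 1) - 1)) | bit;

        if (node >= n)
            break;              /* past the stored headers: done */
        if (node > idx)
            hdr[node] += delta; /* this entry's span includes header idx */
    }
}
```

Only entries located after idx change, so the offset of idx itself and of every earlier header stays the same, while all later offsets move by delta.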
It's easy to walk through the tree too. We only have one iteration per bit
changing from X to the ancestor, and one per bit from the ancestor to Y.
The ancestor is found while walking. To go from X to Y :
    pos = pos(X)
    while (Y != X) {
        if (Y > X) {
            // walk from Y to ancestor
            pos += hdr[Y]
            Y &= (Y - 1)
        } else {
            // walk from X to ancestor
            pos -= hdr[X]
            X &= (X - 1)
        }
    }
However, it is not trivial anymore to linearly walk the tree. We have to move
from a known place to another known place, but a jump to next entry costs the
same as a jump to a random place.
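To make that cost concrete, here is a sketch of a linear traversal built on the X-to-Y walk: each step to the next entry is a full walk through the common ancestor. Both function names are hypothetical, and hdr[0] is assumed to hold the offset of the first header.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of header y, given the already known offset pos_x of header x. */
int tree_walk(const int *hdr, size_t x, int pos_x, size_t y)
{
    int pos = pos_x;

    assert(hdr != NULL);
    while (y != x) {
        if (y > x) {
            pos += hdr[y]; /* climb from y towards the common ancestor */
            y &= y - 1;
        } else {
            pos -= hdr[x]; /* climb from x towards the common ancestor */
            x &= x - 1;
        }
    }
    return pos;
}

/* Fill out[i] with the offset of every header by stepping from each entry
 * to the next one. */
void tree_offsets(const int *hdr, size_t n, int *out)
{
    assert(hdr != NULL && out != NULL);
    if (n == 0)
        return;
    out[0] = hdr[0];
    for (size_t i = 1; i < n; i++)
        out[i] = tree_walk(hdr, i - 1, out[i - 1], i);
}
```

Stepping from 7 to 8 in a 16-entry tree, for instance, climbs from both indexes all the way down to 0, which takes four iterations for a single step forward.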
Other caveats :
- it is not possible to remove a header, it is only possible to empty it.
- it is not possible to insert a header, as that would imply a renumbering.
=> this means that a "defrag" function is required. New headers should
preferably be appended; failing that, they should reuse the slot of a
destroyed header, and should only be inserted if absolutely required.
When we have this, we can then focus on a 32-bit header descriptor which would
look like this :
{
    unsigned line_len :13;    /* total line length, including CRLF */
    unsigned name_len :6;     /* header name length, max 63 chars */
    unsigned sp1 :5;          /* max spaces before value : 31 */
    unsigned sp2 :8;          /* max spaces after value : 255 */
}
Example :
Connection:      close           \r\n
<---------+-----+-----+-------------> line_len
<-------->|     |     |               name_len
          <----->     |               sp1
                      <-------------> sp2
Rem:
- if there are more than 31 spaces before the value, the buffer will have to
be moved before being registered
- if there are more than 255 spaces after the value, the buffer will have to
be moved before being registered
- we can use the empty header name as an indicator for a deleted header
- it would be wise to format a new request before sending lots of random
spaces to the servers.
- normal clients do not send such crap, so those operations *may* reasonably
be more expensive than the rest provided that other ones are very fast.
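The limits listed above can be checked while the descriptor is being filled. The sketch below assumes that sp1 and sp2 count only space characters (the colon is accounted for separately), that the line ends with a 2-byte CRLF, and that a failure means the caller has to rewrite the line first. The struct tag and the function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct hdr_desc {
    unsigned line_len : 13; /* total line length, including CRLF */
    unsigned name_len : 6;  /* header name length, max 63 chars */
    unsigned sp1      : 5;  /* spaces before value, max 31 */
    unsigned sp2      : 8;  /* spaces after value, max 255 */
};

/*
 * Describe one header line of `len` bytes ending in CRLF.
 * Returns 0 on success, -1 if the line is malformed or a field does not fit
 * in the descriptor.
 */
int hdr_describe(const char *line, size_t len, struct hdr_desc *d)
{
    size_t name = 0, sp1 = 0, sp2 = 0, end;

    assert(line != NULL && d != NULL);
    if (len < 3 || len > 8191 ||
        line[len - 2] != '\r' || line[len - 1] != '\n')
        return -1;
    end = len - 2;                 /* index of the CR */
    while (name < end && line[name] != ':')
        name++;
    if (name == end || name > 63)
        return -1;                 /* no colon, or name too long */
    while (name + 1 + sp1 < end && line[name + 1 + sp1] == ' ')
        sp1++;                     /* spaces between the colon and the value */
    while (end - sp2 > name + 1 + sp1 && line[end - sp2 - 1] == ' ')
        sp2++;                     /* spaces between the value and the CRLF */
    if (sp1 > 31 || sp2 > 255)
        return -1;                 /* too many spaces to encode */
    d->line_len = (unsigned)len;
    d->name_len = (unsigned)name;
    d->sp1 = (unsigned)sp1;
    d->sp2 = (unsigned)sp2;
    return 0;
}
```

A header with 40 spaces before its value is rejected here, which corresponds to the case where the buffer has to be moved before the header can be registered.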
It would be handy to have the following macros :
hdr_eon(hdr) => end of name
hdr_sov(hdr) => start of value
hdr_eof(hdr) => end of value
hdr_vlen(hdr) => length of value
hdr_hlen(hdr) => total header length
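With the 32-bit descriptor, these macros reduce to a few additions. The definitions below are one possible reading, not a settled design: they assume the macros take a pointer to the descriptor, that offsets are relative to the start of the line, that the colon occupies one byte after the name, and that CRLF is 2 bytes. hdr_value is an added helper used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct hdr_desc {
    unsigned line_len : 13; /* total line length, including CRLF */
    unsigned name_len : 6;  /* header name length */
    unsigned sp1      : 5;  /* spaces before value */
    unsigned sp2      : 8;  /* spaces after value */
};

/* All offsets are relative to the first byte of the header line. */
#define hdr_eon(h)  ((unsigned)(h)->name_len)                  /* end of name (the colon) */
#define hdr_sov(h)  (hdr_eon(h) + 1u + (h)->sp1)               /* start of value */
#define hdr_eof(h)  ((unsigned)(h)->line_len - 2u - (h)->sp2)  /* end of value */
#define hdr_vlen(h) (hdr_eof(h) - hdr_sov(h))                  /* length of value */
#define hdr_hlen(h) ((unsigned)(h)->line_len)                  /* total header length */

/* Pointer to the value inside `line`; its length is stored in *len. */
const char *hdr_value(const char *line, const struct hdr_desc *h, size_t *len)
{
    assert(line != NULL && h != NULL && len != NULL);
    *len = hdr_vlen(h);
    return line + hdr_sov(h);
}
```

For "Connection: close" followed by two spaces and CRLF (line_len 21, name_len 10, sp1 1, sp2 2), the value starts at offset 12, ends at offset 17 and is 5 bytes long.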
A 48-bit encoding would look like this :
Connection:      close           \r\n
<---------+------+---+--------------> eoh = 16 bits
<-------->|      |   |                eon = 8 bits
<--------------->|   |                sov = 8 bits
                 <--->                vlen = 16 bits


@ -0,0 +1,5 @@
Many interesting RFCs and drafts are linked to from this site :
http://www.web-cache.com/Writings/protocols-standards.html