2005-04-16 15:20:36 -07:00
#
# Traffic control configuration.
#
menuconfig NET_SCHED
bool "QoS and/or fair queueing"
select NET_SCH_FIFO
---help---
When the kernel has several packets to send out over a network
device, it has to decide which ones to send first, which ones to
delay, and which ones to drop. This is the job of the queueing
disciplines; several different algorithms for how to do this
"fairly" have been proposed.
If you say N here, you will get the standard packet scheduler, which
is a FIFO (first come, first served). If you say Y here, you will be
able to choose from among several alternative algorithms which can
then be attached to different network devices. This is useful for
example if some of your network devices are real time devices that
need a certain minimum data flow rate, or if you need to limit the
maximum data flow rate for traffic which matches specified criteria.
This code is considered to be experimental.
To administer these schedulers, you'll need the user-level utilities
from the package iproute2+tc at <ftp://ftp.tux.org/pub/net/ip-routing/>.
That package also contains some documentation; for more, check out
<http://www.linuxfoundation.org/collaborate/workgroups/networking/iproute2>.
This Quality of Service (QoS) support will enable you to use
Differentiated Services (diffserv) and Resource Reservation Protocol
(RSVP) on your Linux router if you also say Y to the corresponding
classifiers below. Documentation and software are at
<http://diffserv.sourceforge.net/>.
If you say Y here and to "/proc file system" below, you will be able
to read status information about packet schedulers from the file
/proc/net/psched.
The available schedulers are listed in the following questions; you
can say Y to as many as you like. If unsure, say N now.
if NET_SCHED
comment "Queueing/Scheduling"
config NET_SCH_CBQ
tristate "Class Based Queueing (CBQ)"
---help---
Say Y here if you want to use the Class-Based Queueing (CBQ) packet
scheduling algorithm. This algorithm classifies the waiting packets
into a tree-like hierarchy of classes; the leaves of this tree are
in turn scheduled by separate algorithms.
See the top of <file:net/sched/sch_cbq.c> for more details.
CBQ is a commonly used scheduler, so if you're unsure, you should
say Y here. Then say Y to all the queueing algorithms below that you
want to use as leaf disciplines.
To compile this code as a module, choose M here: the
module will be called sch_cbq.
config NET_SCH_HTB
tristate "Hierarchical Token Bucket (HTB)"
---help---
Say Y here if you want to use the Hierarchical Token Buckets (HTB)
packet scheduling algorithm. See
<http://luxik.cdi.cz/~devik/qos/htb/> for complete manual and
in-depth articles.
HTB is very similar to CBQ regarding its goals; however, it has
different properties and a different algorithm.
To compile this code as a module, choose M here: the
module will be called sch_htb.
config NET_SCH_HFSC
tristate "Hierarchical Fair Service Curve (HFSC)"
---help---
Say Y here if you want to use the Hierarchical Fair Service Curve
(HFSC) packet scheduling algorithm.
To compile this code as a module, choose M here: the
module will be called sch_hfsc.
config NET_SCH_ATM
tristate "ATM Virtual Circuits (ATM)"
depends on ATM
---help---
Say Y here if you want to use the ATM pseudo-scheduler. This
provides a framework for invoking classifiers, which in turn
select classes of this queuing discipline. Each class maps
the flow(s) it is handling to a given virtual circuit.
See the top of <file:net/sched/sch_atm.c> for more details.
To compile this code as a module, choose M here: the
module will be called sch_atm.
config NET_SCH_PRIO
tristate "Multi Band Priority Queueing (PRIO)"
---help---
Say Y here if you want to use an n-band priority queue packet
scheduler.
To compile this code as a module, choose M here: the
module will be called sch_prio.
2008-09-12 16:29:34 -07:00
config NET_SCH_MULTIQ
tristate "Hardware Multiqueue-aware Multi Band Queuing (MULTIQ)"
---help---
Say Y here if you want to use an n-band queue packet scheduler
to support devices that have multiple hardware transmit queues.
To compile this code as a module, choose M here: the
module will be called sch_multiq.
config NET_SCH_RED
tristate "Random Early Detection (RED)"
---help---
Say Y here if you want to use the Random Early Detection (RED)
packet scheduling algorithm.
See the top of <file:net/sched/sch_red.c> for more details.
To compile this code as a module, choose M here: the
module will be called sch_red.
net_sched: SFB flow scheduler
This is the Stochastic Fair Blue scheduler, based on work from :
W. Feng, D. Kandlur, D. Saha, K. Shin. Blue: A New Class of Active Queue
Management Algorithms. U. Michigan CSE-TR-387-99, April 1999.
http://www.thefengs.com/wuchang/blue/CSE-TR-387-99.pdf
This implementation is based on work done by Juliusz Chroboczek
General SFB algorithm can be found in figure 14, page 15:
B[l][n] : L x N array of bins (L levels, N bins per level)
enqueue()
    Calculate hash function values h{0}, h{1}, .. h{L-1}
    Update bins at each level
    for i = 0 to L - 1
        if (B[i][h{i}].qlen > bin_size)
            B[i][h{i}].p_mark += p_increment;
        else if (B[i][h{i}].qlen == 0)
            B[i][h{i}].p_mark -= p_decrement;
    p_min = min(B[0][h{0}].p_mark ... B[L-1][h{L-1}].p_mark);
    if (p_min == 1.0)
        ratelimit();
    else
        mark/drop with probability p_min;
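The bin update described by this pseudocode can be sketched in Python. This is only an illustration, not the kernel code: the class name `SfbSketch`, the default parameter values, the salted BLAKE2b hashing, the clamping of `p_mark` to [0, 1], and the per-bin `qlen` bookkeeping in `enqueue()`/`dequeue()` are all assumptions made for the example.

```python
import hashlib


class SfbSketch:
    """Toy model of the SFB bin update; names and defaults are illustrative."""

    def __init__(self, levels=8, bins=16, bin_size=20,
                 p_increment=0.0025, p_decrement=0.00025):
        self.levels = levels
        self.bins = bins
        self.bin_size = bin_size
        self.p_increment = p_increment
        self.p_decrement = p_decrement
        # B[l][n]: per-bin queue length and marking probability.
        self.qlen = [[0] * bins for _ in range(levels)]
        self.p_mark = [[0.0] * bins for _ in range(levels)]

    def _hashes(self, flow_id):
        # One hash per level (assumption: BLAKE2b salted with the level index).
        out = []
        for level in range(self.levels):
            digest = hashlib.blake2b(flow_id.encode(),
                                     salt=bytes([level])).digest()
            out.append(int.from_bytes(digest[:4], "big") % self.bins)
        return out

    def enqueue(self, flow_id):
        """Update the bins for one packet and return p_min for its flow."""
        hashes = self._hashes(flow_id)
        for level, h in enumerate(hashes):
            if self.qlen[level][h] > self.bin_size:
                self.p_mark[level][h] = min(
                    1.0, self.p_mark[level][h] + self.p_increment)
            elif self.qlen[level][h] == 0:
                self.p_mark[level][h] = max(
                    0.0, self.p_mark[level][h] - self.p_decrement)
            self.qlen[level][h] += 1  # assumed bookkeeping, not in the figure
        return min(self.p_mark[level][h] for level, h in enumerate(hashes))

    def dequeue(self, flow_id):
        """Release one packet of the flow from every bin it hashes to."""
        for level, h in enumerate(self._hashes(flow_id)):
            self.qlen[level][h] = max(0, self.qlen[level][h] - 1)
```

A caller would mark or drop the packet with the returned probability, and fall back to rate limiting when it reaches 1.0.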
I adapted Juliusz's code to meet current kernel standards,
and made various changes to address previous comments:
http://thread.gmane.org/gmane.linux.network/90225
http://thread.gmane.org/gmane.linux.network/90375
Default flow classifier is the rxhash introduced by RPS in 2.6.35, but
we can use an external flow classifier if wanted.
tc qdisc add dev $DEV parent 1:11 handle 11: \
est 0.5sec 2sec sfb limit 128
tc filter add dev $DEV protocol ip parent 11: handle 3 \
flow hash keys dst divisor 1024
Notes:
1) SFB default child qdisc is pfifo_fast. It can be changed by another
qdisc but a child qdisc MUST not drop a packet previously queued. This
is because SFB needs to handle a dequeued packet in order to maintain
its virtual queue states. pfifo_head_drop or CHOKe should not be used.
2) ECN is enabled by default, unlike RED/CHOKe/GRED
With help from Patrick McHardy & Andi Kleen
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Juliusz Chroboczek <Juliusz.Chroboczek@pps.jussieu.fr>
CC: Stephen Hemminger <shemminger@vyatta.com>
CC: Patrick McHardy <kaber@trash.net>
CC: Andi Kleen <andi@firstfloor.org>
CC: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-23 10:56:17 +00:00
config NET_SCH_SFB
tristate "Stochastic Fair Blue (SFB)"
---help---
Say Y here if you want to use the Stochastic Fair Blue (SFB)
packet scheduling algorithm.
See the top of <file:net/sched/sch_sfb.c> for more details.
To compile this code as a module, choose M here: the
module will be called sch_sfb.
config NET_SCH_SFQ
tristate "Stochastic Fairness Queueing (SFQ)"
---help---
Say Y here if you want to use the Stochastic Fairness Queueing (SFQ)
packet scheduling algorithm.
See the top of <file:net/sched/sch_sfq.c> for more details.
To compile this code as a module, choose M here: the
module will be called sch_sfq.
config NET_SCH_TEQL
tristate "True Link Equalizer (TEQL)"
---help---
Say Y here if you want to use the True Link Equalizer (TEQL) packet
scheduling algorithm. This queueing discipline allows the combination
of several physical devices into one virtual device.
See the top of <file:net/sched/sch_teql.c> for more details.
To compile this code as a module, choose M here: the
module will be called sch_teql.
config NET_SCH_TBF
tristate "Token Bucket Filter (TBF)"
---help---
Say Y here if you want to use the Token Bucket Filter (TBF) packet
scheduling algorithm.
See the top of <file:net/sched/sch_tbf.c> for more details.
To compile this code as a module, choose M here: the
module will be called sch_tbf.
config NET_SCH_GRED
tristate "Generic Random Early Detection (GRED)"
---help---
Say Y here if you want to use the Generic Random Early Detection
(GRED) packet scheduling algorithm for some of your network devices
(see the top of <file:net/sched/sch_red.c> for details and
references about the algorithm).
To compile this code as a module, choose M here: the
module will be called sch_gred.
config NET_SCH_DSMARK
tristate "Differentiated Services marker (DSMARK)"
---help---
Say Y if you want to schedule packets according to the
Differentiated Services architecture proposed in RFC 2475.
Technical information on this method, with pointers to associated
RFCs, is available at <http://www.gta.ufrj.br/diffserv/>.
To compile this code as a module, choose M here: the
module will be called sch_dsmark.
config NET_SCH_NETEM
tristate "Network emulator (NETEM)"
---help---
Say Y if you want to emulate network delay, loss, and packet
re-ordering. This is often useful to simulate networks when
testing applications or protocols.
To compile this driver as a module, choose M here: the module
will be called sch_netem.
If unsure, say N.
2008-11-20 04:10:00 -08:00
config NET_SCH_DRR
tristate "Deficit Round Robin scheduler (DRR)"
help
Say Y here if you want to use the Deficit Round Robin (DRR) packet
scheduling algorithm.
To compile this driver as a module, choose M here: the module
will be called sch_drr.
If unsure, say N.
2011-01-17 08:06:09 +00:00
config NET_SCH_MQPRIO
tristate "Multi-queue priority scheduler (MQPRIO)"
help
Say Y here if you want to use the Multi-queue Priority scheduler.
This scheduler allows QOS to be offloaded on NICs that have support
for offloading QOS schedulers.
To compile this driver as a module, choose M here: the module will
be called sch_mqprio.
If unsure, say N.
2011-02-02 15:21:10 +00:00
config NET_SCH_CHOKE
tristate "CHOose and Keep responsive flow scheduler (CHOKE)"
help
Say Y here if you want to use the CHOKe packet scheduler (CHOose
and Keep for responsive flows, CHOose and Kill for unresponsive
flows). This is a variation of RED which tries to penalize flows
that monopolize the queue.
To compile this code as a module, choose M here: the
module will be called sch_choke.
2011-04-04 05:30:58 +00:00
config NET_SCH_QFQ
tristate "Quick Fair Queueing scheduler (QFQ)"
help
Say Y here if you want to use the Quick Fair Queueing Scheduler (QFQ)
packet scheduling algorithm.
To compile this driver as a module, choose M here: the module
will be called sch_qfq.
If unsure, say N.
codel: Controlled Delay AQM
An implementation of CoDel AQM, from Kathleen Nichols and Van Jacobson.
http://queue.acm.org/detail.cfm?id=2209336
This AQM's main input is no longer queue size in bytes or packets, but the
delay packets experience in the (FIFO) queue.
As we don't have infinite memory, we can still drop packets in enqueue()
in case of massive load, but the main idea of CoDel is to drop packets in
dequeue(), using a control law based on two simple parameters:
target : target sojourn time (default 5ms)
interval : width of moving time window (default 100ms)
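How the two parameters interact can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in include/net/codel.h: the names `CodelSketch`, `above_target_for_interval` and `control_law` are hypothetical, times are plain floats in milliseconds, and the inverse-square-root spacing of drops is taken from the published CoDel description rather than from this commit message.

```python
import math


def control_law(now_ms, interval_ms, count):
    """Time of the next drop: the spacing shrinks as 1/sqrt(count)."""
    return now_ms + interval_ms / math.sqrt(count)


class CodelSketch:
    """Tracks whether the sojourn time stayed above target for one interval."""

    def __init__(self, target_ms=5.0, interval_ms=100.0):
        self.target = target_ms
        self.interval = interval_ms
        self.first_above_time = None  # deadline set when target is first exceeded

    def above_target_for_interval(self, now_ms, sojourn_ms):
        if sojourn_ms < self.target:
            # Queue delay is fine again: forget the pending deadline.
            self.first_above_time = None
            return False
        if self.first_above_time is None:
            # First time above target: give the queue one interval to drain.
            self.first_above_time = now_ms + self.interval
            return False
        return now_ms >= self.first_above_time
```

With the defaults, a packet is only eligible for dropping once the delay has stayed above 5 ms for a full 100 ms window; after that, successive drops are scheduled closer and closer together until the delay falls back under the target.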
Based on initial work from Dave Taht.
Refactored to help future codel inclusion as a plugin for other Linux
qdiscs (FQ_CODEL, ...), like RED.
include/net/codel.h contains the codel algorithm, kept as close as possible to
Kathleen's reference.
net/sched/sch_codel.c contains the linux qdisc specific glue.
Separate structures permit a memory efficient implementation of fq_codel
(to be sent as a separate work) : Each flow has its own struct
codel_vars.
timestamps are taken at enqueue() time with 1024 ns precision, allowing
a range of 2199 seconds in queue, and 100Gb links support. iproute2 uses
usec as base unit.
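The 2199 second figure follows from the tick size if one assumes a 32-bit timestamp compared with signed arithmetic, so that only half of the counter range is usable. That assumption is not stated in the text above; the function name below is hypothetical.

```python
UNIT_NS = 1024  # one timestamp tick, as stated in the text


def usable_range_seconds(counter_bits=32):
    """Usable time span when timestamps are compared as signed differences."""
    half_range_ticks = 2 ** (counter_bits - 1)  # assumption: signed compare
    return half_range_ticks * UNIT_NS / 1e9


print(f"{usable_range_seconds():.2f} s")
```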
Selected packets are dropped, unless ECN is enabled and packets can get
ECN mark instead.
Tested from 2Mb to 10Gb speeds with no particular problems, on ixgbe and
tg3 drivers (BQL enabled).
Usage: tc qdisc ... codel [ limit PACKETS ] [ target TIME ]
[ interval TIME ] [ ecn ]
qdisc codel 10: parent 1:1 limit 2000p target 3.0ms interval 60.0ms ecn
Sent 13347099587 bytes 8815805 pkt (dropped 0, overlimits 0 requeues 0)
rate 202365Kbit 16708pps backlog 113550b 75p requeues 0
count 116 lastcount 98 ldelay 4.3ms dropping drop_next 816us
maxpacket 1514 ecn_mark 84399 drop_overlimit 0
CoDel must be seen as a base module, and should be used keeping in mind
there is still a FIFO queue. So a typical setup will probably need a
hierarchy of several qdiscs and packet classifiers to be able to meet
whatever constraints a user might have.
One possible example would be to use fq_codel, which combines Fair
Queueing and CoDel, in replacement of sfq / sfq_red.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dave Taht <dave.taht@bufferbloat.net>
Cc: Kathleen Nichols <nichols@pollere.com>
Cc: Van Jacobson <van@pollere.net>
Cc: Tom Herbert <therbert@google.com>
Cc: Matt Mathis <mattmathis@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-10 07:51:25 +00:00
config NET_SCH_CODEL
tristate "Controlled Delay AQM (CODEL)"
help
Say Y here if you want to use the Controlled Delay (CODEL)
packet scheduling algorithm.
To compile this driver as a module, choose M here: the module
will be called sch_codel.
If unsure, say N.
fq_codel: Fair Queue Codel AQM
Fair Queue Codel packet scheduler
Principles :
- Packets are classified (internal classifier or external) on flows.
- This is a Stochastic model (as we use a hash, several flows might
be hashed on same slot)
- Each flow has a CoDel managed queue.
- Flows are linked onto two (Round Robin) lists,
so that new flows have priority on old ones.
- For a given flow, packets are not reordered (CoDel uses a FIFO)
- head drops only.
- ECN capability is on by default.
- Very low memory footprint (64 bytes per flow)
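The interaction of the two round-robin lists can be sketched in Python. This is a deliberately reduced model: CoDel itself, hashing, and packet limits are left out, and the names `FqSketch`, `new_flows`, `old_flows` and `deficit` are chosen for the example rather than taken from the implementation. A flow that becomes active starts on the new list with one quantum of credit; once it exhausts its credit or empties its queue it moves to the old list.

```python
from collections import deque


class FqSketch:
    """Two-list deficit round robin, without the per-flow CoDel stage."""

    def __init__(self, quantum=1514):
        self.quantum = quantum
        self.queues = {}          # flow id -> deque of packet lengths
        self.deficit = {}         # flow id -> remaining byte credit
        self.new_flows = deque()
        self.old_flows = deque()

    def enqueue(self, flow, length):
        self.queues.setdefault(flow, deque()).append(length)
        if flow not in self.new_flows and flow not in self.old_flows:
            # Inactive flow becomes active: it gets priority as a new flow.
            self.deficit[flow] = self.quantum
            self.new_flows.append(flow)

    def dequeue(self):
        while True:
            head = self.new_flows if self.new_flows else self.old_flows
            if not head:
                return None
            flow = head[0]
            if self.deficit[flow] <= 0:
                # Credit used up: refill and move to the tail of the old list.
                self.deficit[flow] += self.quantum
                head.popleft()
                self.old_flows.append(flow)
                continue
            queue = self.queues[flow]
            if not queue:
                head.popleft()
                # An emptied new flow waits one round on the old list first.
                if head is self.new_flows and self.old_flows:
                    self.old_flows.append(flow)
                continue
            length = queue.popleft()  # FIFO per flow, so no reordering
            self.deficit[flow] -= length
            return flow, length
```

In this model a short flow that starts while a bulk flow is backlogged is served before the bulk flow gets its next turn, which is the "new flows have priority on old ones" behaviour listed above.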
tc qdisc ... fq_codel [ limit PACKETS ] [ flows number ]
[ target TIME ] [ interval TIME ] [ noecn ]
[ quantum BYTES ]
defaults : 1024 flows, 10240 packets limit, quantum : device MTU
target : 5ms (CoDel default)
interval : 100ms (CoDel default)
Impressive results on load :
class htb 1:1 root leaf 10: prio 0 quantum 1514 rate 200000Kbit ceil 200000Kbit burst 1475b/8 mpu 0b overhead 0b cburst 1475b/8 mpu 0b overhead 0b level 0
Sent 43304920109 bytes 33063109 pkt (dropped 0, overlimits 0 requeues 0)
rate 201691Kbit 28595pps backlog 0b 312p requeues 0
lended: 33063109 borrowed: 0 giants: 0
tokens: -912 ctokens: -912
class fq_codel 10:1735 parent 10:
(dropped 1292, overlimits 0 requeues 0)
backlog 15140b 10p requeues 0
deficit 1514 count 1 lastcount 1 ldelay 7.1ms
class fq_codel 10:4524 parent 10:
(dropped 1291, overlimits 0 requeues 0)
backlog 16654b 11p requeues 0
deficit 1514 count 1 lastcount 1 ldelay 7.1ms
class fq_codel 10:4e74 parent 10:
(dropped 1290, overlimits 0 requeues 0)
backlog 6056b 4p requeues 0
deficit 1514 count 1 lastcount 1 ldelay 6.4ms dropping drop_next 92.0ms
class fq_codel 10:628a parent 10:
(dropped 1289, overlimits 0 requeues 0)
backlog 7570b 5p requeues 0
deficit 1514 count 1 lastcount 1 ldelay 5.4ms dropping drop_next 90.9ms
class fq_codel 10:a4b3 parent 10:
(dropped 302, overlimits 0 requeues 0)
backlog 16654b 11p requeues 0
deficit 1514 count 1 lastcount 1 ldelay 7.1ms
class fq_codel 10:c3c2 parent 10:
(dropped 1284, overlimits 0 requeues 0)
backlog 13626b 9p requeues 0
deficit 1514 count 1 lastcount 1 ldelay 5.9ms
class fq_codel 10:d331 parent 10:
(dropped 299, overlimits 0 requeues 0)
backlog 15140b 10p requeues 0
deficit 1514 count 1 lastcount 1 ldelay 7.0ms
class fq_codel 10:d526 parent 10:
(dropped 12160, overlimits 0 requeues 0)
backlog 35870b 211p requeues 0
deficit 1508 count 12160 lastcount 1 ldelay 15.3ms dropping drop_next 247us
class fq_codel 10:e2c6 parent 10:
(dropped 1288, overlimits 0 requeues 0)
backlog 15140b 10p requeues 0
deficit 1514 count 1 lastcount 1 ldelay 7.1ms
class fq_codel 10:eab5 parent 10:
(dropped 1285, overlimits 0 requeues 0)
backlog 16654b 11p requeues 0
deficit 1514 count 1 lastcount 1 ldelay 5.9ms
class fq_codel 10:f220 parent 10:
(dropped 1289, overlimits 0 requeues 0)
backlog 15140b 10p requeues 0
deficit 1514 count 1 lastcount 1 ldelay 7.1ms
qdisc htb 1: root refcnt 6 r2q 10 default 1 direct_packets_stat 0 ver 3.17
Sent 43331086547 bytes 33092812 pkt (dropped 0, overlimits 66063544 requeues 71)
rate 201697Kbit 28602pps backlog 0b 260p requeues 71
qdisc fq_codel 10: parent 1:1 limit 10240p flows 65536 target 5.0ms interval 100.0ms ecn
Sent 43331086547 bytes 33092812 pkt (dropped 949359, overlimits 0 requeues 0)
rate 201697Kbit 28602pps backlog 189352b 260p requeues 0
maxpacket 1514 drop_overlimit 0 new_flow_count 5582 ecn_mark 125593
new_flows_len 0 old_flows_len 11
PING 172.30.42.18 (172.30.42.18) 56(84) bytes of data.
64 bytes from 172.30.42.18: icmp_req=1 ttl=64 time=0.227 ms
64 bytes from 172.30.42.18: icmp_req=2 ttl=64 time=0.165 ms
64 bytes from 172.30.42.18: icmp_req=3 ttl=64 time=0.166 ms
64 bytes from 172.30.42.18: icmp_req=4 ttl=64 time=0.151 ms
64 bytes from 172.30.42.18: icmp_req=5 ttl=64 time=0.164 ms
64 bytes from 172.30.42.18: icmp_req=6 ttl=64 time=0.172 ms
64 bytes from 172.30.42.18: icmp_req=7 ttl=64 time=0.175 ms
64 bytes from 172.30.42.18: icmp_req=8 ttl=64 time=0.183 ms
64 bytes from 172.30.42.18: icmp_req=9 ttl=64 time=0.158 ms
64 bytes from 172.30.42.18: icmp_req=10 ttl=64 time=0.200 ms
10 packets transmitted, 10 received, 0% packet loss, time 8999ms
rtt min/avg/max/mdev = 0.151/0.176/0.227/0.022 ms
Much better than SFQ because of the priority given to new flows, and a fast
path dirtying fewer cache lines.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-11 09:30:50 +00:00
config NET_SCH_FQ_CODEL
tristate "Fair Queue Controlled Delay AQM (FQ_CODEL)"
help
Say Y here if you want to use the FQ Controlled Delay (FQ_CODEL)
packet scheduling algorithm.
To compile this driver as a module, choose M here: the module
will be called sch_fq_codel.
If unsure, say N.
pkt_sched: fq: Fair Queue packet scheduler
- Uses perfect flow match (not stochastic hash like SFQ/FQ_codel)
- Uses the new_flow/old_flow separation from FQ_codel
- New flows get an initial credit allowing IW10 without added delay.
- Special FIFO queue for high prio packets (no need for PRIO + FQ)
- Uses a hash table of RB trees to locate the flows at enqueue() time
- Smart on demand gc (at enqueue() time, RB tree lookup evicts old
unused flows)
- Dynamic memory allocations.
- Designed to allow millions of concurrent flows per Qdisc.
- Small memory footprint : ~8K per Qdisc, and 104 bytes per flow.
- Single high resolution timer for throttled flows (if any).
- One RB tree to link throttled flows.
- Ability to have a max rate per flow. We might add a socket option
to add per socket limitation.
Attempts have been made to add TCP pacing in TCP stack, but this
seems to add complex code to an already complex stack.
TCP pacing is welcomed for flows having idle times, as the cwnd
permits TCP stack to queue a possibly large number of packets.
This removes the 'slow start after idle' choice, hitting badly
large BDP flows, and applications delivering chunks of data
as video streams.
Nicely spaced packets :
Here interface is 10Gbit, but flow bottleneck is ~20Mbit
cwin is big, yet FQ avoids the typical bursts generated by TCP
(as in netperf TCP_RR -- -r 100000,100000)
15:01:23.545279 IP A > B: . 78193:81089(2896) ack 65248 win 3125 <nop,nop,timestamp 1115 11597805>
15:01:23.545394 IP B > A: . ack 81089 win 3668 <nop,nop,timestamp 11597985 1115>
15:01:23.546488 IP A > B: . 81089:83985(2896) ack 65248 win 3125 <nop,nop,timestamp 1115 11597805>
15:01:23.546565 IP B > A: . ack 83985 win 3668 <nop,nop,timestamp 11597986 1115>
15:01:23.547713 IP A > B: . 83985:86881(2896) ack 65248 win 3125 <nop,nop,timestamp 1115 11597805>
15:01:23.547778 IP B > A: . ack 86881 win 3668 <nop,nop,timestamp 11597987 1115>
15:01:23.548911 IP A > B: . 86881:89777(2896) ack 65248 win 3125 <nop,nop,timestamp 1115 11597805>
15:01:23.548949 IP B > A: . ack 89777 win 3668 <nop,nop,timestamp 11597988 1115>
15:01:23.550116 IP A > B: . 89777:92673(2896) ack 65248 win 3125 <nop,nop,timestamp 1115 11597805>
15:01:23.550182 IP B > A: . ack 92673 win 3668 <nop,nop,timestamp 11597989 1115>
15:01:23.551333 IP A > B: . 92673:95569(2896) ack 65248 win 3125 <nop,nop,timestamp 1115 11597805>
15:01:23.551406 IP B > A: . ack 95569 win 3668 <nop,nop,timestamp 11597991 1115>
15:01:23.552539 IP A > B: . 95569:98465(2896) ack 65248 win 3125 <nop,nop,timestamp 1115 11597805>
15:01:23.552576 IP B > A: . ack 98465 win 3668 <nop,nop,timestamp 11597992 1115>
15:01:23.553756 IP A > B: . 98465:99913(1448) ack 65248 win 3125 <nop,nop,timestamp 1115 11597805>
15:01:23.554138 IP A > B: P 99913:100001(88) ack 65248 win 3125 <nop,nop,timestamp 1115 11597805>
15:01:23.554204 IP B > A: . ack 100001 win 3668 <nop,nop,timestamp 11597993 1115>
15:01:23.554234 IP B > A: . 65248:68144(2896) ack 100001 win 3668 <nop,nop,timestamp 11597993 1115>
15:01:23.555620 IP B > A: . 68144:71040(2896) ack 100001 win 3668 <nop,nop,timestamp 11597993 1115>
15:01:23.557005 IP B > A: . 71040:73936(2896) ack 100001 win 3668 <nop,nop,timestamp 11597993 1115>
15:01:23.558390 IP B > A: . 73936:76832(2896) ack 100001 win 3668 <nop,nop,timestamp 11597993 1115>
15:01:23.559773 IP B > A: . 76832:79728(2896) ack 100001 win 3668 <nop,nop,timestamp 11597993 1115>
15:01:23.561158 IP B > A: . 79728:82624(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
15:01:23.562543 IP B > A: . 82624:85520(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
15:01:23.563928 IP B > A: . 85520:88416(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
15:01:23.565313 IP B > A: . 88416:91312(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
15:01:23.566698 IP B > A: . 91312:94208(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
15:01:23.568083 IP B > A: . 94208:97104(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
15:01:23.569467 IP B > A: . 97104:100000(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
15:01:23.570852 IP B > A: . 100000:102896(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
15:01:23.572237 IP B > A: . 102896:105792(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
15:01:23.573639 IP B > A: . 105792:108688(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
15:01:23.575024 IP B > A: . 108688:111584(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
15:01:23.576408 IP B > A: . 111584:114480(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
15:01:23.577793 IP B > A: . 114480:117376(2896) ack 100001 win 3668 <nop,nop,timestamp 11597994 1115>
TCP timestamps show that most packets from B were queued in the same ms
timeframe (TSval 1159799{3,4}), but FQ managed to send them right
in time to avoid a big burst.
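The spacing in the trace can be turned into an approximate per-flow rate. The sketch below uses timestamps copied from the capture above (microsecond part only) and counts TCP payload bytes, so header overhead is ignored; the function name is hypothetical.

```python
def implied_rate_mbit(payload_bytes, t0_us, t1_us):
    """Payload rate implied by the gap between two consecutive segments.

    Bits per microsecond is numerically equal to Mbit/s.
    """
    return payload_bytes * 8 / (t1_us - t0_us)


# A > B: 15:01:23.545279 -> 15:01:23.546488, 2896-byte segments
a_to_b = implied_rate_mbit(2896, 545279, 546488)
# B > A: 15:01:23.555620 -> 15:01:23.557005, 2896-byte segments
b_to_a = implied_rate_mbit(2896, 555620, 557005)

print(f"A>B ~{a_to_b:.1f} Mbit/s, B>A ~{b_to_a:.1f} Mbit/s")
```

Both directions come out in the high teens of Mbit/s, which is in line with the ~20Mbit bottleneck mentioned above and far below the 10Gbit line rate.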
In slow start or steady state, very few packets are throttled [1]
FQ gets a bunch of tunables as :
limit : max number of packets on whole Qdisc (default 10000)
flow_limit : max number of packets per flow (default 100)
quantum : the credit per RR round (default is 2 MTU)
initial_quantum : initial credit for new flows (default is 10 MTU)
maxrate : max per flow rate (default : unlimited)
buckets : number of RB trees (default : 1024) in hash table.
(consumes 8 bytes per bucket)
[no]pacing : disable/enable pacing (default is enable)
All of them can be changed on a live qdisc.
$ tc qd add dev eth0 root fq help
Usage: ... fq [ limit PACKETS ] [ flow_limit PACKETS ]
[ quantum BYTES ] [ initial_quantum BYTES ]
[ maxrate RATE ] [ buckets NUMBER ]
[ [no]pacing ]
$ tc -s -d qd
qdisc fq 8002: dev eth0 root refcnt 32 limit 10000p flow_limit 100p buckets 256 quantum 3028 initial_quantum 15140
Sent 216532416 bytes 148395 pkt (dropped 0, overlimits 0 requeues 14)
backlog 0b 0p requeues 14
511 flows, 511 inactive, 0 throttled
110 gc, 0 highprio, 0 retrans, 1143 throttled, 0 flows_plimit
[1] Except if initial srtt is overestimated, as if using
cached srtt in tcp metrics. We'll provide a fix for this issue.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-29 15:49:55 -07:00
config NET_SCH_FQ
tristate "Fair Queue"
help
Say Y here if you want to use the FQ packet scheduling algorithm.
FQ does flow separation, and is able to respect pacing requirements
set by TCP stack into sk->sk_pacing_rate (for locally generated
traffic).
To compile this driver as a module, choose M here: the module
will be called sch_fq.
If unsure, say N.
config NET_SCH_INGRESS
tristate "Ingress Qdisc"
depends on NET_CLS_ACT
---help---
Say Y here if you want to use classifiers for incoming packets.
If unsure, say Y.
To compile this code as a module, choose M here: the
module will be called sch_ingress.
2012-02-05 13:51:32 +00:00
config NET_SCH_PLUG
tristate "Plug network traffic until release (PLUG)"
---help---
This queuing discipline allows userspace to plug/unplug a network
output queue, using the netlink interface. When it receives an
enqueue command it inserts a plug into the outbound queue that
causes following packets to enqueue until a dequeue command arrives
over netlink, causing the plug to be removed and resuming the normal
packet flow.
This module also provides a generic "network output buffering"
functionality (aka output commit), wherein upon arrival of a dequeue
command, only packets up to the first plug are released for delivery.
The Remus HA project uses this module to enable speculative execution
of virtual machines by allowing the generated network output to be rolled
back if needed.
For more information, please refer to http://wiki.xensource.com/xenwiki/Remus
Say Y here if you are using this kernel for Xen dom0 and
want to protect Xen guests with Remus.
To compile this code as a module, choose M here: the
module will be called sch_plug.
comment "Classification"
config NET_CLS
boolean
config NET_CLS_BASIC
tristate "Elementary classification (BASIC)"
select NET_CLS
---help---
Say Y here if you want to be able to classify packets using
only extended matches and actions.
To compile this code as a module, choose M here: the
module will be called cls_basic.
config NET_CLS_TCINDEX
tristate "Traffic-Control Index (TCINDEX)"
select NET_CLS
---help---
Say Y here if you want to be able to classify packets based on
traffic control indices. You will want this feature if you want
to implement Differentiated Services together with DSMARK.
To compile this code as a module, choose M here: the
module will be called cls_tcindex.
config NET_CLS_ROUTE4
tristate "Routing decision (ROUTE)"
depends on INET
select IP_ROUTE_CLASSID
select NET_CLS
---help---
If you say Y here, you will be able to classify packets
according to the route table entry they matched.
To compile this code as a module, choose M here: the
module will be called cls_route.
config NET_CLS_FW
tristate "Netfilter mark (FW)"
select NET_CLS
---help---
If you say Y here, you will be able to classify packets
according to netfilter/firewall marks.
To compile this code as a module, choose M here: the
module will be called cls_fw.
config NET_CLS_U32
tristate "Universal 32bit comparisons w/ hashing (U32)"
select NET_CLS
---help---
Say Y here to be able to classify packets using a universal
comparison scheme based on 32-bit pieces.
To compile this code as a module, choose M here: the
module will be called cls_u32.
config CLS_U32_PERF
bool "Performance counters support"
depends on NET_CLS_U32
---help---
Say Y here to make u32 gather additional statistics useful for
fine tuning u32 classifiers.
config CLS_U32_MARK
bool "Netfilter marks support"
depends on NET_CLS_U32
---help---
Say Y here to be able to use netfilter marks as u32 key.
config NET_CLS_RSVP
tristate "IPv4 Resource Reservation Protocol (RSVP)"
select NET_CLS
---help---
The Resource Reservation Protocol (RSVP) permits end systems to
request a minimum and maximum data flow rate for a connection; this
is important for real time data such as streaming sound or video.
Say Y here if you want to be able to classify outgoing packets based
on their RSVP requests.
To compile this code as a module, choose M here: the
module will be called cls_rsvp.
config NET_CLS_RSVP6
tristate "IPv6 Resource Reservation Protocol (RSVP6)"
select NET_CLS
---help---
The Resource Reservation Protocol (RSVP) permits end systems to
request a minimum and maximum data flow rate for a connection; this
is important for real time data such as streaming sound or video.
Say Y here if you want to be able to classify outgoing packets based
on their RSVP requests and you are using the IPv6 protocol.
To compile this code as a module, choose M here: the
module will be called cls_rsvp6.
[NET_SCHED]: Add flow classifier
Add new "flow" classifier, which is meant to extend the SFQ hashing
capabilities without hard-coding new hash functions and also allows
deterministic mappings of keys to classes, replacing some out of tree
iptables patches like IPCLASSIFY (maps IPs to classes), IPMARK (maps
IPs to marks, with fw filters to classes), ...
Some examples:
- Classic SFQ hash:
tc filter add ... flow hash \
keys src,dst,proto,proto-src,proto-dst divisor 1024
- Classic SFQ hash, but using information from conntrack to work properly in
combination with NAT:
tc filter add ... flow hash \
keys nfct-src,nfct-dst,proto,nfct-proto-src,nfct-proto-dst divisor 1024
- Map destination IPs of 192.168.0.0/24 to classids 1-256:
tc filter add ... flow map \
key dst addend -192.168.0.0 divisor 256
- alternatively:
tc filter add ... flow map \
key dst and 0xff
- similar, but reverse ordered:
tc filter add ... flow map \
key dst and 0xff xor 0xff
Perturbation is currently not supported because we can't reliably kill the
timer on destruction.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 18:37:42 -08:00
config NET_CLS_FLOW
tristate "Flow classifier"
select NET_CLS
---help---
If you say Y here, you will be able to classify packets based on
a configurable combination of packet keys. This is mostly useful
in combination with SFQ.
To compile this code as a module, choose M here: the
module will be called cls_flow.
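# Illustrative sketch only, not part of this file: a .config fragment that
# builds the flow classifier together with SFQ, both as modules. That SFQ
# is selected through CONFIG_NET_SCH_SFQ is an assumption based on its
# entry elsewhere in this Kconfig.
#   CONFIG_NET_SCH_SFQ=m
#   CONFIG_NET_CLS_FLOW=m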
2008-11-07 22:56:00 -08:00
config NET_CLS_CGROUP
2010-03-23 05:24:03 +00:00
tristate "Control Group Classifier"
2008-11-07 22:56:00 -08:00
select NET_CLS
depends on CGROUPS
---help---
Say Y here if you want to classify packets based on the control
group (cgroup) of their process.
2010-03-23 05:24:03 +00:00
To compile this code as a module, choose M here: the
module will be called cls_cgroup.
2005-04-16 15:20:36 -07:00
config NET_EMATCH
bool "Extended Matches"
2005-11-01 15:13:02 +01:00
select NET_CLS
2005-04-16 15:20:36 -07:00
---help---
Say Y here if you want to use extended matches on top of classifiers
and select the extended matches below.
Extended matches are small classification helpers not worth writing
2005-11-01 15:13:02 +01:00
a separate classifier for.
2005-04-16 15:20:36 -07:00
2005-11-01 15:13:02 +01:00
A recent version of the iproute2 package is required to use
2005-04-16 15:20:36 -07:00
extended matches.
config NET_EMATCH_STACK
int "Stack size"
depends on NET_EMATCH
default "32"
---help---
Size of the local stack variable used while evaluating the tree of
ematches. Limits the depth of the tree, i.e. the number of
2005-06-08 15:10:22 -07:00
encapsulated precedences. Every level requires 4 bytes of additional
2005-04-16 15:20:36 -07:00
stack space.
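# Worked example, assuming the value counts tree levels as the help text
# above describes: with the default of 32 levels at 4 bytes per level,
# the ematch evaluation stack takes 32 * 4 = 128 bytes. Doubling the
# value to 64 would take 64 * 4 = 256 bytes.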
config NET_EMATCH_CMP
tristate "Simple packet data comparison"
depends on NET_EMATCH
---help---
Say Y here if you want to be able to classify packets based on
simple packet data comparisons for 8-, 16-, and 32-bit values.
To compile this code as a module, choose M here: the
module will be called em_cmp.
config NET_EMATCH_NBYTE
tristate "Multi byte comparison"
depends on NET_EMATCH
---help---
Say Y here if you want to be able to classify packets based on
multi-byte comparisons, mainly useful for IPv6 address comparisons.
To compile this code as a module, choose M here: the
module will be called em_nbyte.
config NET_EMATCH_U32
2005-11-01 15:13:02 +01:00
tristate "U32 key"
2005-04-16 15:20:36 -07:00
depends on NET_EMATCH
---help---
Say Y here if you want to be able to classify packets using
the famous u32 key in combination with logical relations.
To compile this code as a module, choose M here: the
module will be called em_u32.
config NET_EMATCH_META
tristate "Metadata"
depends on NET_EMATCH
---help---
2006-01-11 16:40:30 -08:00
Say Y here if you want to be able to classify packets based on
2005-04-16 15:20:36 -07:00
metadata such as load average, netfilter attributes, socket
attributes and routing decisions.
To compile this code as a module, choose M here: the
module will be called em_meta.
2005-06-23 21:00:58 -07:00
config NET_EMATCH_TEXT
tristate "Textsearch"
depends on NET_EMATCH
2005-06-23 23:55:41 -07:00
select TEXTSEARCH
2005-06-24 17:39:03 -07:00
select TEXTSEARCH_KMP
2005-08-25 16:23:11 -07:00
select TEXTSEARCH_BM
2005-06-24 17:39:03 -07:00
select TEXTSEARCH_FSM
2005-06-23 21:00:58 -07:00
---help---
2005-11-01 15:13:02 +01:00
Say Y here if you want to be able to classify packets based on
2005-06-24 17:39:03 -07:00
textsearch comparisons.
2005-06-23 21:00:58 -07:00
To compile this code as a module, choose M here: the
module will be called em_text.
2012-07-04 05:32:03 +02:00
config NET_EMATCH_CANID
tristate "CAN Identifier"
2012-11-23 00:44:57 +00:00
depends on NET_EMATCH && (CAN=y || CAN=m)
2012-07-04 05:32:03 +02:00
---help---
Say Y here if you want to be able to classify CAN frames based
on CAN Identifier.
To compile this code as a module, choose M here: the
module will be called em_canid.
2012-07-11 10:56:57 +00:00
config NET_EMATCH_IPSET
tristate "IPset"
depends on NET_EMATCH && IP_SET
---help---
Say Y here if you want to be able to classify packets based on
ipset membership.
To compile this code as a module, choose M here: the
module will be called em_ipset.
2005-04-16 15:20:36 -07:00
config NET_CLS_ACT
2005-11-01 15:13:02 +01:00
bool "Actions"
2005-04-16 15:20:36 -07:00
---help---
2005-11-01 15:13:02 +01:00
Say Y here if you want to use traffic control actions. Actions
get attached to classifiers and are invoked after a successful
classification. They are used to overwrite the classification
result, instantly drop or redirect packets, etc.
A recent version of the iproute2 package is required to use
actions.
2005-04-16 15:20:36 -07:00
config NET_ACT_POLICE
2005-11-01 15:13:02 +01:00
tristate "Traffic Policing"
2005-04-16 15:20:36 -07:00
depends on NET_CLS_ACT
---help---
2005-11-01 15:13:02 +01:00
Say Y here if you want to do traffic policing, i.e. strict
bandwidth limiting. This action replaces the existing policing
module.
To compile this code as a module, choose M here: the
2010-02-08 22:41:44 -08:00
module will be called act_police.
2005-04-16 15:20:36 -07:00
config NET_ACT_GACT
2005-11-01 15:13:02 +01:00
tristate "Generic actions"
2005-04-16 15:20:36 -07:00
depends on NET_CLS_ACT
---help---
2005-11-01 15:13:02 +01:00
Say Y here to take generic actions such as dropping and
accepting packets.
To compile this code as a module, choose M here: the
2010-02-08 22:41:44 -08:00
module will be called act_gact.
2005-04-16 15:20:36 -07:00
config GACT_PROB
2005-11-01 15:13:02 +01:00
bool "Probability support"
2005-04-16 15:20:36 -07:00
depends on NET_ACT_GACT
---help---
2005-11-01 15:13:02 +01:00
Say Y here to use the generic action randomly or deterministically.
2005-04-16 15:20:36 -07:00
config NET_ACT_MIRRED
2005-11-01 15:13:02 +01:00
tristate "Redirecting and Mirroring"
2005-04-16 15:20:36 -07:00
depends on NET_CLS_ACT
---help---
2005-11-01 15:13:02 +01:00
Say Y here to allow packets to be mirrored or redirected to
other devices.
To compile this code as a module, choose M here: the
2010-02-08 22:41:44 -08:00
module will be called act_mirred.
2005-04-16 15:20:36 -07:00
config NET_ACT_IPT
2005-11-01 15:13:02 +01:00
tristate "IPtables targets"
2005-04-16 15:20:36 -07:00
depends on NET_CLS_ACT && NETFILTER && IP_NF_IPTABLES
---help---
2006-06-30 18:53:46 +02:00
Say Y here to be able to invoke iptables targets after successful
2005-11-01 15:13:02 +01:00
classification.
To compile this code as a module, choose M here: the
2010-02-08 22:41:44 -08:00
module will be called act_ipt.
2005-04-16 15:20:36 -07:00
2007-09-27 12:48:05 -07:00
config NET_ACT_NAT
tristate "Stateless NAT"
depends on NET_CLS_ACT
---help---
Say Y here to do stateless NAT on IPv4 packets. You should use
netfilter for NAT unless you know what you are doing.
To compile this code as a module, choose M here: the
2010-02-08 22:41:44 -08:00
module will be called act_nat.
2007-09-27 12:48:05 -07:00
2005-04-16 15:20:36 -07:00
config NET_ACT_PEDIT
2005-11-01 15:13:02 +01:00
tristate "Packet Editing"
2005-04-16 15:20:36 -07:00
depends on NET_CLS_ACT
---help---
2005-11-01 15:13:02 +01:00
Say Y here if you want to mangle the content of packets.
2005-04-16 15:20:36 -07:00
2005-11-01 15:13:02 +01:00
To compile this code as a module, choose M here: the
2010-02-08 22:41:44 -08:00
module will be called act_pedit.
2005-04-16 15:20:36 -07:00
2005-04-24 20:10:16 -07:00
config NET_ACT_SIMP
2005-11-01 15:13:02 +01:00
tristate "Simple Example (Debug)"
2005-04-24 20:10:16 -07:00
depends on NET_CLS_ACT
---help---
2005-11-01 15:13:02 +01:00
Say Y here to add a simple action for demonstration purposes.
It is meant as an example and for debugging purposes. It will
print a configured policy string followed by the packet count
to the console for every packet that passes by.
If unsure, say N.
To compile this code as a module, choose M here: the
2010-02-08 22:41:44 -08:00
module will be called act_simple.
2005-11-01 15:13:02 +01:00
2008-09-12 16:30:20 -07:00
config NET_ACT_SKBEDIT
tristate "SKB Editing"
depends on NET_CLS_ACT
---help---
Say Y here to change skb priority or queue_mapping settings.
If unsure, say N.
To compile this code as a module, choose M here: the
2010-02-08 22:41:44 -08:00
module will be called act_skbedit.
2008-09-12 16:30:20 -07:00
2010-08-18 13:10:35 +00:00
config NET_ACT_CSUM
tristate "Checksum Updating"
2010-08-23 20:42:11 -07:00
depends on NET_CLS_ACT && INET
2010-08-18 13:10:35 +00:00
---help---
Say Y here to update common checksums after direct
packet alterations.
To compile this code as a module, choose M here: the
module will be called act_csum.
2005-11-01 15:13:02 +01:00
config NET_CLS_IND
bool "Incoming device classification"
2005-11-17 15:22:39 -08:00
depends on NET_CLS_U32 || NET_CLS_FW
2005-11-01 15:13:02 +01:00
---help---
Say Y here to extend the u32 and fw classifiers to support
classification based on the incoming device. This option is
likely to disappear in favour of the metadata ematch.
2005-11-17 15:22:39 -08:00
endif # NET_SCHED
2007-10-18 21:56:38 -07:00
config NET_SCH_FIFO
bool