linux

iv/linux

Michal Soltys 678a6241c6 net/sched/sch_hfsc.c: keep fsc and virtual times in sync; fix an old bug

This patch simplifies how we update fsc and calculate vt from it, while keeping the expected functionality identical to how hfsc behaves currently. It also fixes an issue introduced by a very old patch.

The idea is that instead of correcting cl_vt before the fsc curve update (rtsc_min) and correcting cl_vt after the calculation (rtsc_y2x) to keep cl_vt local to the current period, we can simply rely on virtual times and curve values always being in sync, analogously to how rsc and usc function, except that we use virtual time here.

Why hasn't it been done this way since the beginning? The likely reason (judging by the code trying to correct curves whenever possible) was to keep the virtual times as small as possible, as they have a tendency to "gallop" forward whenever their siblings and other fair sharing subtrees are idling. On top of that, the current code is subtly bugged, so the cumulative time (without any corrections) is always kept and used in init_vf() when a new backlog period begins (using cl_cvtoff).

Is a cumulative value safe? Generally yes, though corner cases are easy to create. For example consider:

    1gbit interface
    some 100kbit leaf, everything else idle

With the current tick (64ns), 1s is 15625000 ticks, but the leaf is alone and this is virtual time, so in reality it's 10000 times more. In other words, 38 bits are needed to hold 1 second, 54 for 1 day, 59 for 1 month, 63 for 1 year (all logarithms rounded up). It's getting somewhat dangerous, but it also requires a setup that justifies such values, not to mention a class permanently backlogged for a year. In a near most extreme case (10gbit, 10kbit leaf), we have "enough" to hold ~13.6 days in 64 bits.

Well, the issue remains mostly theoretical and cl_cvtoff has been working fine for all those years.
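The bit counts quoted above can be reproduced with a few lines of C. This is only a sketch for checking the arithmetic, not code from the patch: the helper names bits_needed(), vt_ticks() and wrap_seconds() are made up for illustration, and the "speedup" parameter is assumed to be the ratio of interface rate to leaf rate (10000 for 1gbit/100kbit, 1000000 for 10gbit/10kbit).

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to represent v, i.e. the smallest b with v < 2^b. */
unsigned bits_needed(uint64_t v)
{
	unsigned bits = 0;

	while (v) {
		bits++;
		v >>= 1;
	}
	return bits;
}

/*
 * Virtual-time ticks accumulated over 'seconds' of wall-clock time.
 * 'speedup' is an assumption of this sketch: how much faster vt advances
 * than real time when a slow leaf has the whole link to itself.
 * The caller must keep the product below 2^64.
 */
uint64_t vt_ticks(uint64_t ticks_per_sec, uint64_t speedup, uint64_t seconds)
{
	return ticks_per_sec * speedup * seconds;
}

/* Whole seconds of wall-clock time before a 64-bit vt counter would wrap. */
uint64_t wrap_seconds(uint64_t ticks_per_sec, uint64_t speedup)
{
	assert(ticks_per_sec > 0 && speedup > 0);
	return UINT64_MAX / (ticks_per_sec * speedup);
}
```

With 15625000 ticks per second and a speedup of 10000, bits_needed(vt_ticks(...)) gives 38 for 1 second, 54 for 86400 seconds, 59 for 30 days and 63 for 365 days. For the 10gbit/10kbit case, wrap_seconds(15625000, 1000000) is 1180591 seconds, a bit over 13.6 days.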
Sensible configurations are de facto immune to this issue, and not-so-sensible ones can solve it with a cronjob whose period is inversely proportional to the insanity of the setup =)

Now let's explain the subtle bug mentioned earlier.

The issue is related to how offsets are kept and how we calculate virtual times and update fair service curve(s). The issue itself is subtle, but easy to observe with long m1 segments. It was introduced in a rather old patch:

Commit 99296150c7: "[NET_SCHED]: O(1) children vtoff adjustment in HFSC scheduler"

(available in git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git)

Originally, when a new backlog period was started, cl_vtoff of each sibling was updated with cl_cvtmax from the past period, naturally moving all cl_vt to the proper starting point. That patch adjusted it so that the cumulative offset is kept in the parent and there is no need to traverse the list (as any subsequent child activation derives its new vt from the already active sibling(s)).

But with this change, cl_vtoff (of each sibling) is no longer persistent across inactivity periods, as it's calculated from the parent's cl_cvtoff on a new backlog period, conflicting with the following curve correction from the previous period:

    if (cl->cl_virtual.x == vt) {
        cl->cl_virtual.x -= cl->cl_vtoff;
        cl->cl_vtoff = 0;
    }

This essentially tries to keep the curve as if it was local to the period and resets cl_vtoff (the cumulative vt offset of the class) to 0 when possible (read: when we have an intersection or if the new curve is below the old one). But then it's recalculated from cl_cvtoff on the next active period. The rtsc_min() call preceding the above if() then doesn't really do what we expect in such a scenario, as it calculates the minimum of the corrected curve (from the previous backlog period) and the new uncorrected curve (with the offset derived from cl_cvtoff).
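The mismatch can be illustrated with a toy model. This is not the kernel code: toy_curve, toy_x2y(), toy_min_case() and reactivate() are hypothetical names, the three-way decision is only loosely modeled on the concave branch of rtsc_min(), and the numbers are assumptions chosen to match the tc example in this commit message (m1 80mbit, d 10s, m2 20mbit, 30s of offset, 800 Mbit already served). Virtual time is in milliseconds and service in kbit; the 100 ms jump at reactivation is also assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Toy two-piece linear curve: slope m1 for dx of virtual time, then m2. */
struct toy_curve {
	int64_t x, y;	/* anchor: virtual time (ms), service (kbit) */
	int64_t m1, dx;	/* first slope (kbit/ms) and its length (ms) */
	int64_t m2;	/* second slope (kbit/ms) */
};

enum toy_min_result { OLD_KEPT, NEW_REPLACES_OLD, CURVES_INTERSECT };

/* Service value of the curve at virtual time x. */
int64_t toy_x2y(const struct toy_curve *c, int64_t x)
{
	if (x <= c->x)
		return c->y;
	if (x <= c->x + c->dx)
		return c->y + c->m1 * (x - c->x);
	return c->y + c->m1 * c->dx + c->m2 * (x - c->x - c->dx);
}

/*
 * Decide how min(old curve, same-shaped new curve anchored at (x, y))
 * would turn out for a concave curve (m1 > m2).
 */
enum toy_min_result toy_min_case(const struct toy_curve *old,
				 int64_t x, int64_t y)
{
	int64_t dy = old->m1 * old->dx;

	assert(old->m1 > old->m2);	/* concave curves only */
	if (toy_x2y(old, x) <= y)
		return OLD_KEPT;		/* old curve is below */
	if (toy_x2y(old, x + old->dx) >= y + dy)
		return NEW_REPLACES_OLD;	/* full m1 segment again */
	return CURVES_INTERSECT;		/* brief m1, then old m2 */
}

/* Reactivate a class that has received 800000 kbit at virtual time vt. */
enum toy_min_result reactivate(int64_t anchor_x, int64_t vt)
{
	struct toy_curve old = { anchor_x, 0, 80, 10000, 20 };

	return toy_min_case(&old, vt, 800000);
}
```

In this model, reactivate(30000, 40100) keeps anchor and vt in the same cumulative coordinates and yields CURVES_INTERSECT, which corresponds to a short burst followed by the m2 rate. reactivate(0, 40100) models an anchor that lost its 30 s offset while the new vt still carries it, and yields NEW_REPLACES_OLD, meaning the whole m1 segment is granted again.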
Example (A is class 1:10, B is class 1:11):

    tc class add dev $ife parent 1:0 classid 1:1 hfsc ls m2 100mbit ul m2 100mbit
    tc class add dev $ife parent 1:1 classid 1:10 hfsc ls m1 80mbit d 10s m2 20mbit
    tc class add dev $ife parent 1:1 classid 1:11 hfsc ls m2 20mbit

    start B, keep it backlogged, let it run 6s (30s worth of vt as A is idle)
    pause B briefly to force cl_cvtoff update in parent (whole 1:1 going idle)
    start A, let it run 10s
    pause A briefly to force rtsc_min()

At this point we would expect A to continue at 20mbit after a brief moment of 80mbit. But instead A will use 80mbit for the full 10s again. It's the effect of first correcting A (during 'start A'), and then, after unpausing, calculating rtsc_min() from the old corrected and the new uncorrected curve.

The patch fixes this bug and keeps vt and fsc in sync (virtual times are cumulative, not local to the backlog period).

Signed-off-by: Michal Soltys <soltys@ziu.info>
Signed-off-by: David S. Miller <davem@davemloft.net>		2016-08-08 16:06:47 -07:00
..
act_api.c	net_sched: get rid of struct tcf_common	2016-07-25 21:49:20 -07:00
act_bpf.c	net_sched: move tc_action into tcf_common	2016-07-25 21:49:19 -07:00
act_connmark.c	net_sched: move tc_action into tcf_common	2016-07-25 21:49:19 -07:00
act_csum.c	net_sched: move tc_action into tcf_common	2016-07-25 21:49:19 -07:00
act_gact.c	net_sched: move tc_action into tcf_common	2016-07-25 21:49:19 -07:00
act_ife.c	net_sched: move tc_action into tcf_common	2016-07-25 21:49:19 -07:00
act_ipt.c	net_sched: move tc_action into tcf_common	2016-07-25 21:49:19 -07:00
act_meta_mark.c	Support to encoding decoding skb mark on IFE action	2016-03-01 17:15:23 -05:00
act_meta_skbprio.c	Support to encoding decoding skb prio on IFE action	2016-03-01 17:15:23 -05:00
act_mirred.c	net_sched: move tc_action into tcf_common	2016-07-25 21:49:19 -07:00
act_nat.c	net_sched: move tc_action into tcf_common	2016-07-25 21:49:19 -07:00
act_pedit.c	net_sched: move tc_action into tcf_common	2016-07-25 21:49:19 -07:00
act_police.c	net_sched: get rid of struct tcf_common	2016-07-25 21:49:20 -07:00
act_simple.c	net_sched: move tc_action into tcf_common	2016-07-25 21:49:19 -07:00
act_skbedit.c	net_sched: move tc_action into tcf_common	2016-07-25 21:49:19 -07:00
act_vlan.c	net_sched: move tc_action into tcf_common	2016-07-25 21:49:19 -07:00
cls_api.c	net, cls: also reject deleting all filters when TCA_KIND present	2016-06-16 22:50:16 -07:00
cls_basic.c	net_sched: destroy proto tp when all filters are gone	2015-03-09 15:35:55 -04:00
cls_bpf.c	bpf: refactor bpf_prog_get and type check into helper	2016-07-01 16:00:47 -04:00
cls_cgroup.c	cls_cgroup: factor out classid retrieval	2015-07-20 12:41:30 -07:00
cls_flow.c	sched: cls_flow: use skb_to_full_sk() helper	2015-11-08 20:56:39 -05:00
cls_flower.c	net/sched: flower: Return error when hw can't offload and skip_sw is set	2016-06-14 22:37:26 -07:00
cls_fw.c	net: revert "net_sched: move tp->root allocation into fw_init()"	2015-09-24 14:33:30 -07:00
cls_matchall.c	net/sched: Add match-all classifier hw offloading.	2016-07-24 23:11:59 -07:00
cls_route.c	net_sched: destroy proto tp when all filters are gone	2015-03-09 15:35:55 -04:00
cls_rsvp6.c
cls_rsvp.c
cls_rsvp.h	net_sched: convert rsvp to call tcf_exts_destroy from rcu callback	2015-08-26 11:01:45 -07:00
cls_tcindex.c	net_sched: convert tcindex to call tcf_exts_destroy from rcu callback	2015-08-26 11:01:44 -07:00
cls_u32.c	net: cls_u32: be more strict about skip-sw flag for knodes	2016-06-08 21:43:14 -07:00
em_canid.c	net: sched: remove tcf_proto from ematch calls	2014-10-06 18:02:32 -04:00
em_cmp.c
em_ipset.c	netfilter: x_tables: Pass struct net in xt_action_param	2015-09-18 21:58:14 +02:00
em_meta.c	qdisc: constify meta_type_ops structures	2016-04-14 00:35:30 -04:00
em_nbyte.c	net: sched: remove tcf_proto from ematch calls	2014-10-06 18:02:32 -04:00
em_text.c	net: Remove state argument from skb_find_text()	2015-02-22 15:59:54 -05:00
em_u32.c
ematch.c	ematch: Fix auto-loading of ematch modules.	2015-02-20 15:30:56 -05:00
Kconfig	net/sched: introduce Match-all classifier	2016-07-24 23:11:59 -07:00
Makefile	net/sched: introduce Match-all classifier	2016-07-24 23:11:59 -07:00
sch_api.c	sched: remove NET_XMIT_POLICED	2016-06-12 22:02:11 -04:00
sch_atm.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_blackhole.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_cbq.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_choke.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_codel.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_drr.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_dsmark.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_fifo.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2016-06-30 05:03:36 -04:00
sch_fq_codel.c	net_sched: fq_codel: cache skb->truesize into skb->cb	2016-06-25 12:19:35 -04:00
sch_fq.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_generic.c	net_sched: generalize bulk dequeue	2016-06-25 12:19:35 -04:00
sch_gred.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_hfsc.c	net/sched/sch_hfsc.c: keep fsc and virtual times in sync; fix an old bug	2016-08-08 16:06:47 -07:00
sch_hhf.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_htb.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2016-07-24 00:53:32 -04:00
sch_ingress.c	net: sched: fix tc_should_offload for specific clsact classes	2016-06-07 16:59:53 -07:00
sch_mq.c	net: sched: do not acquire qdisc spinlock in qdisc/class stats dump	2016-06-07 16:37:14 -07:00
sch_mqprio.c	net: sched: do not acquire qdisc spinlock in qdisc/class stats dump	2016-06-07 16:37:14 -07:00
sch_multiq.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_netem.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2016-06-30 05:03:36 -04:00
sch_pie.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_plug.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_prio.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2016-06-30 05:03:36 -04:00
sch_qfq.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_red.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_sfb.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_sfq.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_tbf.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_teql.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00