linux/net
Daniel Borkmann a7288c4dd5 net: sctp: improve sctp_select_active_and_retran_path selection
In function sctp_select_active_and_retran_path(), we walk the
transport list in order to look for the two most recently used
ACTIVE transports (trans_pri, trans_sec). In case we didn't find
anything ACTIVE, we currently just camp on a possibly PF or
INACTIVE transport that is primary path; this behavior actually
dates back to linux-history tree of the very early days of
lksctp, and can yield a behavior that chooses suboptimal
transport paths.

Instead, be a bit more clever by reusing and extending the
recently introduced sctp_trans_elect_best() handler. In case
both transports are evaluated to have the same score resulting
from their states, break the tie by looking at: 1) transport
patch error count 2) last_time_heard value from each transport.

This is analogous to Nishida's Quick Failover draft [1],
section 5.1, 3:

  The sender SHOULD avoid data transmission to PF destinations.
  When all destinations are in either PF or Inactive state,
  the sender MAY either move the destination from PF to active
  state (and transmit data to the active destination) or the
  sender MAY transmit data to a PF destination. In the former
  scenario, (i) the sender MUST NOT notify the ULP about the
  state transition, and (ii) MUST NOT clear the destination's
  error counter. It is recommended that the sender picks the
  PF destination with least error count (fewest consecutive
  timeouts) for data transmission. In case of a tie (multiple PF
  destinations with same error count), the sender MAY choose the
  last active destination.

Thus for sctp_select_active_and_retran_path(), we keep track of
the best, if any, transport that is in PF state and in case no
ACTIVE transport has been found (hence trans_{pri,sec} is NULL),
we select the best out of the three: current primary_path and
retran_path as well as a possible PF transport.

The secondary may still camp on the original primary_path as
before. The change in sctp_trans_elect_best() with a more fine
grained tie selection also improves at the same time path selection
for sctp_assoc_update_retran_path() in case of non-ACTIVE states.

  [1] http://tools.ietf.org/html/draft-nishida-tsvwg-sctp-failover-05

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 12:23:17 -07:00
..
9p A bunch of updates and cleanup within the transport layer, 2014-04-11 14:14:57 -07:00
802
8021q Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-05-24 00:32:30 -04:00
appletalk net: Split sk_no_check into sk_no_check_{rx,tx} 2014-05-23 16:28:53 -04:00
atm atm: remove commented out check 2014-05-30 17:17:04 -07:00
ax25 net: Fix use after free by removing length arg from sk_data_ready callbacks. 2014-04-11 16:15:36 -04:00
batman-adv Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-06-03 23:32:12 -07:00
bluetooth Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next 2014-06-05 14:10:07 -04:00
bridge bridge: memorize and export selected IGMP/MLD querier port 2014-06-10 23:50:47 -07:00
caif net: Fix use after free by removing length arg from sk_data_ready callbacks. 2014-04-11 16:15:36 -04:00
can can: add hash based access to single EFF frame filters 2014-05-19 09:38:24 +02:00
ceph crush: decode and initialize chooseleaf_vary_r 2014-05-16 21:29:55 +04:00
core net: filter: cleanup A/X name usage 2014-06-11 00:13:16 -07:00
dcb net: Use netlink_ns_capable to verify the permisions of netlink messages 2014-04-24 13:44:54 -04:00
dccp net: Eliminate no_check from protosw 2014-05-23 16:28:53 -04:00
decnet net: Split sk_no_check into sk_no_check_{rx,tx} 2014-05-23 16:28:53 -04:00
dns_resolver dns_resolver: Do not accept domain names longer than 255 chars 2014-06-05 00:05:53 -07:00
dsa Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-05-24 00:32:30 -04:00
ethernet
hsr hsr: replace del_timer by del_timer_sync 2014-03-27 15:28:06 -04:00
ieee802154 6lowpan_rtnl: fix off by one while fragmentation 2014-06-02 10:39:42 -07:00
ipv4 gre: allow changing mac address when device is up 2014-06-10 22:46:42 -07:00
ipv6 ipv6: Shrink udp_v6_mcast_next() to one socket variable 2014-06-05 16:23:08 -07:00
ipx net: Split sk_no_check into sk_no_check_{rx,tx} 2014-05-23 16:28:53 -04:00
irda
iucv af_iucv: correct cleanup if listen backlog is full 2014-05-30 17:35:23 -07:00
key af_key: Replace comma with semicolon 2014-05-30 17:48:58 -07:00
l2tp l2tp: call udp{6}_set_csum 2014-06-04 22:46:38 -07:00
lapb
llc
mac80211 Merge git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next 2014-05-29 12:55:38 -04:00
mac802154 mac802154: don't deliver packets to devices that are down 2014-06-11 12:10:19 -07:00
mpls gre: Call gso_make_checksum 2014-06-04 22:46:38 -07:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2014-06-05 15:35:04 -07:00
netlabel
netlink Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-06-03 23:32:12 -07:00
netrom net: Fix use after free by removing length arg from sk_data_ready callbacks. 2014-04-11 16:15:36 -04:00
nfc NFC: nfc_sock_link() can be static 2014-05-26 00:53:10 +02:00
openvswitch vxlan: Add support for UDP checksums (v4 sending, v6 zero csums) 2014-06-04 22:46:39 -07:00
packet net: Use netlink_ns_capable to verify the permisions of netlink messages 2014-04-24 13:44:54 -04:00
phonet net: Use netlink_ns_capable to verify the permisions of netlink messages 2014-04-24 13:44:54 -04:00
rds rds/tcp_listen: Replace comma with semicolon 2014-05-30 17:48:58 -07:00
rfkill Merge git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next 2014-05-22 13:58:36 -04:00
rose net: Fix use after free by removing length arg from sk_data_ready callbacks. 2014-04-11 16:15:36 -04:00
rxrpc af_rxrpc: Fix XDR length check in rxrpc key demarshalling. 2014-05-16 15:24:47 -04:00
sched net: use the new API kvfree() 2014-06-05 00:49:51 -07:00
sctp net: sctp: improve sctp_select_active_and_retran_path selection 2014-06-11 12:23:17 -07:00
sunrpc sunrpc: Remove sk_no_check setting 2014-05-23 16:28:53 -04:00
tipc tipc: Don't reset the timeout when restarting 2014-05-24 14:11:41 -04:00
unix net: unix: Align send data_len up to PAGE_SIZE 2014-05-16 16:04:03 -04:00
vmw_vsock vsock: Make transport the proto owner 2014-05-05 13:13:50 -04:00
wimax
wireless Merge git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next 2014-05-29 12:55:38 -04:00
x25 net: Fix use after free by removing length arg from sk_data_ready callbacks. 2014-04-11 16:15:36 -04:00
xfrm Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-06-03 23:32:12 -07:00
compat.c net/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter types 2014-03-06 16:30:45 +01:00
Kconfig Merge branch 'for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2014-04-03 13:05:42 -07:00
Makefile
nonet.c
socket.c net: use SYSCALL_DEFINEx for sys_recv 2014-04-16 15:15:05 -04:00
sysctl_net.c