linux/net/tipc
Parthasarathy Bhuvaragan 333f796235 tipc: fix a race condition leading to subscriber refcnt bug
Until now, the requests sent to topology server are queued
to a workqueue by the generic server framework.
These messages are processed by worker threads and trigger the
registered callbacks.
To reduce latency on uniprocessor systems, explicit rescheduling
is performed using cond_resched() after MAX_RECV_MSG_COUNT(25)
messages.

This implementation on SMP systems leads to an subscriber refcnt
error as described below:
When a worker thread yields by calling cond_resched() in a SMP
system, a new worker is created on another CPU to process the
pending workitem. Sometimes the sleeping thread wakes up before
the new thread finishes execution.
This breaks the assumption on ordering and being single threaded.
The fault is more frequent when MAX_RECV_MSG_COUNT is lowered.

If the first thread was processing subscription create and the
second thread processing close(), the close request will free
the subscriber and the create request oops as follows:

[31.224137] WARNING: CPU: 2 PID: 266 at include/linux/kref.h:46 tipc_subscrb_rcv_cb+0x317/0x380         [tipc]
[31.228143] CPU: 2 PID: 266 Comm: kworker/u8:1 Not tainted 4.5.0+ #97
[31.228377] Workqueue: tipc_rcv tipc_recv_work [tipc]
[...]
[31.228377] Call Trace:
[31.228377]  [<ffffffff812fbb6b>] dump_stack+0x4d/0x72
[31.228377]  [<ffffffff8105a311>] __warn+0xd1/0xf0
[31.228377]  [<ffffffff8105a3fd>] warn_slowpath_null+0x1d/0x20
[31.228377]  [<ffffffffa0098067>] tipc_subscrb_rcv_cb+0x317/0x380 [tipc]
[31.228377]  [<ffffffffa00a4984>] tipc_receive_from_sock+0xd4/0x130 [tipc]
[31.228377]  [<ffffffffa00a439b>] tipc_recv_work+0x2b/0x50 [tipc]
[31.228377]  [<ffffffff81071925>] process_one_work+0x145/0x3d0
[31.246554] ---[ end trace c3882c9baa05a4fd ]---
[31.248327] BUG: spinlock bad magic on CPU#2, kworker/u8:1/266
[31.249119] BUG: unable to handle kernel NULL pointer dereference at 0000000000000428
[31.249323] IP: [<ffffffff81099d0c>] spin_dump+0x5c/0xe0
[31.249323] PGD 0
[31.249323] Oops: 0000 [#1] SMP

In this commit, we
- rename tipc_conn_shutdown() to tipc_conn_release().
- move connection release callback execution from tipc_close_conn()
  to a new function tipc_sock_release(), which is executed before
  we free the connection.
Thus we release the subscriber during connection release procedure
rather than connection shutdown procedure.

Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-14 16:46:46 -04:00
..
addr.c tipc: simplify include dependencies 2015-05-14 12:24:45 -04:00
addr.h tipc: simplify include dependencies 2015-05-14 12:24:45 -04:00
bcast.c tipc: remove pre-allocated message header in link struct 2016-03-06 23:01:20 -05:00
bcast.h tipc: remove pre-allocated message header in link struct 2016-03-06 23:01:20 -05:00
bearer.c tipc: stricter filtering of packets in bearer layer 2016-04-07 17:00:13 -04:00
bearer.h tipc: remove remnants of old broadcast code 2016-04-13 17:49:11 -04:00
core.c tipc: create broadcast transmission link at namespace init 2015-10-24 06:56:27 -07:00
core.h tipc: reduce code dependency between binding table and node layer 2015-11-20 14:06:10 -05:00
discover.c tipc: eliminate buffer leak in bearer layer 2016-04-07 17:00:13 -04:00
discover.h tipc: eliminate buffer leak in bearer layer 2016-04-07 17:00:13 -04:00
eth_media.c tipc: make media address offset a common define 2015-02-27 18:18:48 -05:00
ib_media.c tipc: rename media/msg related definitions 2015-02-27 18:18:48 -05:00
Kconfig tipc: add ip/udp media type 2015-03-05 22:08:42 -05:00
link.c tipc: move netlink policies to netlink.c 2016-03-07 14:56:41 -05:00
link.h tipc: remove pre-allocated message header in link struct 2016-03-06 23:01:20 -05:00
Makefile tipc: add ip/udp media type 2015-03-05 22:08:42 -05:00
msg.c tipc: let broadcast packet reception use new link receive function 2015-10-24 06:56:37 -07:00
msg.h tipc: stricter filtering of packets in bearer layer 2016-04-07 17:00:13 -04:00
name_distr.c tipc: reduce code dependency between binding table and node layer 2015-11-20 14:06:10 -05:00
name_distr.h tipc: reduce code dependency between binding table and node layer 2015-11-20 14:06:10 -05:00
name_table.c tipc: move netlink policies to netlink.c 2016-03-07 14:56:41 -05:00
name_table.h tipc: convert legacy nl name table dump to nl compat 2015-02-09 13:20:48 -08:00
net.c tipc: move netlink policies to netlink.c 2016-03-07 14:56:41 -05:00
net.h tipc: make tipc node table aware of net namespace 2015-01-12 16:24:32 -05:00
netlink_compat.c tipc: fix null deref crash in compat config path 2016-02-25 17:04:48 -05:00
netlink.c tipc: move netlink policies to netlink.c 2016-03-07 14:56:41 -05:00
netlink.h tipc: move netlink policies to netlink.c 2016-03-07 14:56:41 -05:00
node.c tipc: move netlink policies to netlink.c 2016-03-07 14:56:41 -05:00
node.h tipc: narrow down interface towards struct tipc_link 2015-11-20 14:06:10 -05:00
server.c tipc: fix a race condition leading to subscriber refcnt bug 2016-04-14 16:46:46 -04:00
server.h tipc: fix a race condition leading to subscriber refcnt bug 2016-04-14 16:46:46 -04:00
socket.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-03-08 12:34:12 -05:00
socket.h tipc: clean up socket layer message reception 2015-07-26 16:31:50 -07:00
subscr.c tipc: fix a race condition leading to subscriber refcnt bug 2016-04-14 16:46:46 -04:00
subscr.h tipc: remove struct tipc_name_seq from struct tipc_subscription 2016-02-06 03:40:43 -05:00
sysctl.c tipc: add name distributor resiliency queue 2014-09-01 17:51:48 -07:00
udp_media.c tipc: make sure IPv6 header fits in skb headroom 2016-03-14 12:23:12 -04:00