linux/net
Chuck Lever 821c791a0b xprtrdma: Segment head and tail XDR buffers on page boundaries
A single memory allocation is used for the pair of buffers wherein
the RPC client builds an RPC call message and decodes its matching
reply. These buffers are sized based on the maximum possible size
of the RPC call and reply messages for the operation in progress.

This means that as the call buffer increases in size, the start of
the reply buffer is pushed farther into the memory allocation.

RPC requests are growing in size. It used to be that both the call
and reply buffers fit inside a single page.

But these days, thanks to NFSv4 (and especially security labels in
NFSv4.2) the maximum call and reply sizes are large. NFSv4.0 OPEN,
for example, now requires a 6KB allocation for a pair of call and
reply buffers, and NFSv4 LOOKUP is not far behind.

As the maximum size of a call increases, the reply buffer is pushed
far enough into the buffer's memory allocation that a page boundary
can appear in the middle of it.

When the maximum possible reply size is larger than the client's
RDMA receive buffers (currently 1KB), the client has to register a
Reply chunk for the server to RDMA Write the reply into.

The logic in rpcrdma_convert_iovs() assumes that xdr_buf head and
tail buffers would always be contained on a single page. It supplies
just one segment for the head and one for the tail.

FMR, for example, registers up to a page boundary (only a portion of
the reply buffer in the OPEN case above). But without additional
segments, it doesn't register the rest of the buffer.

When the server tries to write the OPEN reply, the RDMA Write fails
with a remote access error since the client registered only part of
the Reply chunk.

rpcrdma_convert_iovs() must split the XDR buffer into multiple
segments, each of which are guaranteed not to contain a page
boundary. That way fmr_op_map is given the proper number of segments
to register the whole reply buffer.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Devesh Sharma <devesh.sharma@broadcom.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-03-14 14:55:53 -04:00
..
6lowpan 6lowpan: fix debugfs interface entry name 2015-12-20 08:21:00 +01:00
9p Rework and error handling fixes, primarily in the fscatch and fd transports. 2016-01-24 12:39:09 -08:00
802
8021q net: Rename NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASK 2015-12-15 16:50:08 -05:00
appletalk appletalk: fix erroneous return value 2016-02-18 14:59:34 -05:00
atm net: Generalise wq_has_sleeper helper 2015-11-30 14:47:33 -05:00
ax25 net: add validation for the socket syscall protocol argument 2015-12-14 16:09:30 -05:00
batman-adv batman-adv: Avoid endless loop in bat-on-bat netdevice check 2016-02-16 22:16:33 +08:00
bluetooth Bluetooth: hci_core: Avoid mixing up req_complete and req_complete_skb 2016-02-20 08:52:28 +01:00
bridge bridge: mdb: avoid uninitialized variable warning 2016-02-16 15:37:28 -05:00
caif net: caif: fix erroneous return value 2016-02-18 14:59:35 -05:00
can can: avoid using timeval for uapi 2015-10-13 17:42:34 +02:00
ceph libceph: don't spam dmesg with stray reply warnings 2016-02-24 20:28:51 +01:00
core net: make netdev_for_each_lower_dev safe for device removal 2016-02-19 15:29:26 -05:00
dcb net/dcb: make dcbnl.c explicitly non-modular 2015-10-09 07:52:27 -07:00
dccp tcp/dccp: fix another race at listener dismantle 2016-02-18 11:35:51 -05:00
decnet net: add validation for the socket syscall protocol argument 2015-12-14 16:09:30 -05:00
dns_resolver net: dns_resolver: convert time_t to time64_t 2015-11-18 16:27:46 -05:00
dsa net: dsa: Unregister slave_dev in error path 2016-02-17 22:05:16 -05:00
ethernet net: Add eth_platform_get_mac_address() helper. 2016-01-06 16:31:56 -05:00
hsr net/hsr: fix a warning message 2015-11-23 14:56:15 -05:00
ieee802154 inet: kill unused skb_free op 2016-01-05 22:25:57 -05:00
ipv4 rtnl: RTM_GETNETCONF: fix wrong return value 2016-02-19 15:33:46 -05:00
ipv6 rtnl: RTM_GETNETCONF: fix wrong return value 2016-02-19 15:33:46 -05:00
ipx
irda irda: fix a potential use-after-free in ircomm_param_request 2016-01-29 22:56:46 -08:00
iucv af_iucv: Validate socket address length in iucv_sock_bind() 2016-01-19 14:21:08 -05:00
key af_key: fix two typos 2015-10-23 03:05:19 -07:00
l2tp l2tp: Fix error creating L2TP tunnels 2016-02-17 15:34:47 -05:00
l3mdev net: Add netif_is_l3_slave 2015-10-07 04:27:43 -07:00
lapb
llc
mac80211 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-02-01 15:56:08 -08:00
mac802154 mac802154: constify ieee802154_llsec_ops structure 2016-01-04 20:40:41 +01:00
mpls Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-12-17 22:08:28 -05:00
netfilter netfilter: nft_counter: fix erroneous return values 2016-02-08 13:05:02 +01:00
netlabel
netlink netlink: not trim skb for mmaped socket when dump 2016-01-29 20:25:17 -08:00
netrom
nfc NFC 4.5 pull request 2016-01-04 21:48:15 -05:00
openvswitch lwt: fix rx checksum setting for lwt devices tunneling over ipv6 2016-02-19 15:39:30 -05:00
packet packet: Allow packets with only a header (but no payload) 2015-11-29 22:17:17 -05:00
phonet phonet: properly unshare skbs in phonet_rcv() 2016-01-12 12:05:38 -05:00
rds Initial roundup of 4.5 merge window patches 2016-01-23 18:45:06 -08:00
rfkill rfkill: fix rfkill_fop_read wait_event usage 2016-01-26 11:32:05 +01:00
rose
rxrpc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2016-01-12 18:57:02 -08:00
sched net_sched fix: reclassification needs to consider ether protocol changes 2016-02-18 11:14:19 -05:00
sctp sctp: Fix port hash table size computation 2016-02-21 21:52:51 -05:00
sunrpc xprtrdma: Segment head and tail XDR buffers on page boundaries 2016-03-14 14:55:53 -04:00
switchdev switchdev: Require RTNL mutex to be held when sending FDB notifications 2016-01-28 16:21:31 -08:00
tipc tipc: unlock in error path 2016-02-19 15:38:44 -05:00
unix af_unix: Don't use continue to re-execute unix_stream_read_generic loop 2016-02-19 23:50:31 -05:00
vmw_vsock vsock: Fix blocking ops call in prepare_to_wait 2016-02-13 05:57:39 -05:00
wimax
wireless regulatory: fix world regulatory domain data 2016-01-14 11:10:13 +01:00
x25
xfrm net: preserve IP control block during GSO segmentation 2016-01-15 14:35:24 -05:00
compat.c
Kconfig net, sched: add clsact qdisc 2016-01-10 22:13:15 -05:00
Makefile net: Introduce L3 Master device abstraction 2015-09-29 20:40:32 -07:00
socket.c kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
sysctl_net.c net: sysctl: fix a kmemleak warning 2015-10-23 06:22:08 -07:00