312 Commits

Author SHA1 Message Date
Wenjia Zhang
aec45e857c s390/qeth: use vzalloc for QUERY OAT buffer
qeth_query_oat_command() currently allocates the kernel buffer for
the SIOC_QETH_QUERY_OAT ioctl with kzalloc. So on systems with
fragmented memory, large allocations may fail (eg. the qethqoat tool by
default uses 132KB).

Solve this issue by using vzalloc, backing the allocation with
non-contiguous memory.

Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-12 13:12:51 -07:00
Julian Wiedmann
04db741d0d s390/qeth: switch on SG by default for IQD devices
Scatter-gather transmit brings a nice performance boost. Considering the
rather large MTU sizes at play, it's also totally the Right Thing To Do.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-12 13:12:51 -07:00
Julian Wiedmann
778b1ac737 s390/qeth: indicate error when netdev allocation fails
Bailing out on allocation error is nice, but we also need to tell the
ccwgroup core that creating the qeth groupdev failed.

Fixes: d3d1b205e89f ("s390/qeth: allocate netdevice early")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-12 13:12:51 -07:00
Julian Wiedmann
f15cdaf237 s390/qeth: don't restrict qeth_card to DMA memory
Allocating the main qeth_card struct with GFP_DMA blocks us from moving
it into netdev_priv(). But the only reason why we need DMA memory is the
ccw1 structs embedded into each ccw channel. So extract those into
separate allocations, like we already do for the cmd buffers.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09 14:02:50 -07:00
Julian Wiedmann
95f4d8b75a s390/qeth: clean up card initialization
The qeth_card struct is kzalloc-ed, so remove all the redundant
0-initializations. While at it, split up what's left of
qeth_determine_card_type().

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09 14:02:50 -07:00
Julian Wiedmann
24142fd8d8 s390/qeth: do basic setup for data channel
The data channel currently doesn't need a setup operation, because we
don't use pre-allocated cmd buffers for its IO. But subsequent changes
will introduce further setup that also applies to the data channel.
This refactors things a bit, so that the new stuff can then be
automatically applied to all channels.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09 14:02:50 -07:00
Julian Wiedmann
45ca2fd646 s390/qeth: use qeth_setup_ccw() to set up all CCWs
Re-work the helper a little bit, so that it can be used for all CCWs
that qeth issues.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09 14:02:50 -07:00
Julian Wiedmann
750b162598 s390/qeth: reduce hard-coded access to ccw channels
Where possible use accessor macros and local pointers to access the ccw
channels. This makes it less likely to miss a spot.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09 14:02:50 -07:00
Julian Wiedmann
73657a3e5b s390/qeth: extract helper for MPC protocol type
Just a little code deduplication.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09 14:02:50 -07:00
Julian Wiedmann
5f89eca577 s390/qeth: speed up L2 IQD xmit
Modify the L2 OSA xmit path so that it also supports L2 IQD devices
(in particular, their HW header requirements). This allows IQD devices
to advertise NETIF_F_SG support, and eliminates the allocation overhead
for the HW header.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 10:12:30 -07:00
Julian Wiedmann
a7c2f4a332 s390/qeth: add support for constrained HW headers
Some transmit modes require that the HW header is located in the same
page as the initial protocol headers in skb->data. Let callers specify
the size of this contiguous header range, and enforce it when building
the HW header.

While at it, apply some gentle renaming to the relevant L2 code so that
it matches the L3 code.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 10:12:30 -07:00
Julian Wiedmann
ba86ceee9d s390/qeth: merge linearize-check into HW header construction
When checking whether an skb needs to be linearized to fit into an IO
buffer, it's desirable to consider the skb's final size and layout
(ie. after the HW header was added). But a subsequent linearization can
then cause the re-positioned HW header to violate its alignment
restrictions.

Dealing with this situation in two different code paths is quite tricky.
This patch integrates a) linearize-check and b) HW header construction
into one 3 step-sequence:
1. evaluate how the HW header needs to be added (to identify if it takes
   up an additional buffer element), then
2. check if the required buffer elements exceed the device's limit.
   Linearize when necessary and re-evaluate the HW header placement.
3. Add the HW header in the best-possible way:
   a) push, without taking up an additional buffer element
   b) push, but consume another buffer element
   c) allocate a header object from the cache.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 10:12:30 -07:00
Julian Wiedmann
d2a274b25b s390/qeth: add statistics for consumed buffer elements
Nowadays an skb fragment typically spans over multiple pages. So replace
the obsolete, SG-only 'fragments' counter with one that tracks the
consumed buffer elements. This is what actually matters for performance.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 10:12:30 -07:00
Julian Wiedmann
72f219da79 s390/qeth: use core MTU range checking
qeth's ndo_change_mtu() only applies some trivial bounds checking. Set
up dev->min_mtu properly, so that dev_set_mtu() can do this for us.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 10:12:30 -07:00
Julian Wiedmann
8ce7a9e064 s390/qeth: simplify max MTU handling
When the MPC initialization code discovers the HW-specific max MTU,
apply the resulting changes straight to the netdevice.

If this is the device's first initialization, also set its MTU
(HiperSockets: the max MTU; else: a layer-specific default value).
Then cap the current MTU by the new max MTU.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 10:12:30 -07:00
Julian Wiedmann
92d2720969 s390/qeth: don't cache HW port number
The netdevice is always available now, so get the portno from there.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 10:12:30 -07:00
Julian Wiedmann
d3d1b205e8 s390/qeth: allocate netdevice early
Allocation of the netdevice is currently delayed until a qeth card first
goes online. This complicates matters in several places, where we need
to cache values instead of applying them straight to the netdevice.

Improve on this by moving the allocation up to where the qeth card
itself is created. This is also one step in direction of eventually
placing the qeth card into netdev_priv().

In all subsequent code, remove the now redundant checks whether
card->dev is valid.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 10:12:29 -07:00
Julian Wiedmann
addc5ee872 s390/qeth: remove redundant netif_carrier_ok() checks
netif_carrier_off() does its own checking.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 10:12:29 -07:00
Julian Wiedmann
a702349a40 s390/qeth: fix race in used-buffer accounting
By updating q->used_buffers only _after_ do_QDIO() has completed, there
is a potential race against the buffer's TX completion. In the unlikely
case that the TX completion path wins, qeth_qdio_output_handler() would
decrement the counter before qeth_flush_buffers() even incremented it.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-21 10:12:29 -07:00
Julian Wiedmann
09960b3a0a s390/qeth: clean up exported symbols
Remove some redundant EXPORTs. While at it, also move some L2-only
prototypes into the proper header file.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 16:42:39 -07:00
Julian Wiedmann
6d8769abe4 s390/qeth: consolidate ccwgroup driver definition
Reshuffle the code a bit so that everything is in one place.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 16:42:39 -07:00
Julian Wiedmann
86c0cdb9e0 s390/qeth: clean up Output Queue selection
Consolidate duplicated code, fix the misuse of RTN_UNSPEC and simplify
the handling of non-unicast traffic on IQD devices.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 16:42:39 -07:00
Julian Wiedmann
f67a43a73b s390/qeth: remove unused buffer->aob pointer
Except for tracing, the pointer is not used.

At the same time, accessing it from qeth_qdio_output_handler() is racy:
whenever qeth_qdio_cq_handler() gets control, its call to
qeth_qdio_handle_aob() frees the AOB.

So the AOB pointer that qeth_qdio_output_handler() stores into 'buffer'
can go stale at any time, and trigger a use-after-free.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 16:42:39 -07:00
Julian Wiedmann
3b346c1814 s390/qeth: various buffer management cleanups
Use the new qeth_scrub_qdio_buffer() helper, remove an extra parameter
from qeth_clear_output_buffer(), init the bufstates.user field just once
(in qeth_flush_buffers()) and remove some noisy trace messages.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 16:42:39 -07:00
Julian Wiedmann
d025da9eb1 s390/qeth: consistently re-enable device features
commit e830baa9c3f0 ("qeth: restore device features after recovery") and
commit ce3443564145 ("s390/qeth: rely on kernel for feature recovery")
made sure that the HW functions for device features get re-programmed
after recovery.

But we missed that the same handling is also required when a card is
first set offline (destroying all HW context), and then online again.
Fix this by moving the re-enable action out of the recovery-only path.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-30 21:19:48 +09:00
Julian Wiedmann
ce28867fd2 s390/qeth: don't clobber buffer on async TX completion
If qeth_qdio_output_handler() detects that a transmit requires async
completion, it replaces the pending buffer's metadata object
(qeth_qdio_out_buffer) so that this queue buffer can be re-used while
the data is pending completion.

Later when the CQ indicates async completion of such a metadata object,
qeth_qdio_cq_handler() tries to free any data associated with this
object (since HW has now completed the transfer). By calling
qeth_clear_output_buffer(), it erronously operates on the queue buffer
that _previously_ belonged to this transfer ... but which has been
potentially re-used several times by now.
This results in double-free's of the buffer's data, and failing
transmits as the buffer descriptor is scrubbed in mid-air.

The correct way of handling this situation is to
1. scrub the queue buffer when it is prepared for re-use, and
2. later obtain the data addresses from the async-completion notifier
   (ie. the AOB), instead of the queue buffer.

All this only affects qeth devices used for af_iucv HiperTransport.

Fixes: 0da9581ddb0f ("qeth: exploit asynchronous delivery of storage blocks")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-30 21:19:48 +09:00
Julian Wiedmann
4664610537 Revert "s390/qeth: use Read device to query hypervisor for MAC"
This reverts commit b7493e91c11a757cf0f8ab26989642ee4bb2c642.

On its own, querying RDEV for a MAC address works fine. But when upgrading
from a qeth that previously queried DDEV on a z/VM NIC (ie. any kernel with
commit ec61bd2fd2a2), the RDEV query now returns a _different_ MAC address
than the DDEV query.

If the NIC is configured with MACPROTECT, z/VM apparently requires us to
use the MAC that was initially returned (on DDEV) and registered. So after
upgrading to a kernel that uses RDEV, the SETVMAC registration cmd for the
new MAC address fails and we end up with a non-operabel interface.

To avoid regressions on upgrade, switch back to using DDEV for the MAC
address query. The downgrade path (first RDEV, later DDEV) is fine, in this
case both queries return the same MAC address.

Fixes: b7493e91c11a ("s390/qeth: use Read device to query hypervisor for MAC")
Reported-by: Michal Kubecek <mkubecek@suse.com>
Tested-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-30 21:19:48 +09:00
Kees Cook
6396bb2215 treewide: kzalloc() -> kcalloc()
The kzalloc() function has a 2-factor argument form, kcalloc(). This
patch replaces cases of:

        kzalloc(a * b, gfp)

with:
        kcalloc(a * b, gfp)

as well as handling cases of:

        kzalloc(a * b * c, gfp)

with:

        kzalloc(array3_size(a, b, c), gfp)

as it's slightly less ugly than:

        kzalloc_array(array_size(a, b), c, gfp)

This does, however, attempt to ignore constant size factors like:

        kzalloc(4 * 1024, gfp)

though any constants defined via macros get caught up in the conversion.

Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.

The Coccinelle script used for this was:

// Fix redundant parens around sizeof().
@@
type TYPE;
expression THING, E;
@@

(
  kzalloc(
-	(sizeof(TYPE)) * E
+	sizeof(TYPE) * E
  , ...)
|
  kzalloc(
-	(sizeof(THING)) * E
+	sizeof(THING) * E
  , ...)
)

// Drop single-byte sizes and redundant parens.
@@
expression COUNT;
typedef u8;
typedef __u8;
@@

(
  kzalloc(
-	sizeof(u8) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(__u8) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(char) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(unsigned char) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(u8) * COUNT
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(__u8) * COUNT
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(char) * COUNT
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(unsigned char) * COUNT
+	COUNT
  , ...)
)

// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@

(
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * (COUNT_ID)
+	COUNT_ID, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * COUNT_ID
+	COUNT_ID, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * (COUNT_CONST)
+	COUNT_CONST, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * COUNT_CONST
+	COUNT_CONST, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * (COUNT_ID)
+	COUNT_ID, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * COUNT_ID
+	COUNT_ID, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * (COUNT_CONST)
+	COUNT_CONST, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * COUNT_CONST
+	COUNT_CONST, sizeof(THING)
  , ...)
)

// 2-factor product, only identifiers.
@@
identifier SIZE, COUNT;
@@

- kzalloc
+ kcalloc
  (
-	SIZE * COUNT
+	COUNT, SIZE
  , ...)

// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@

(
  kzalloc(
-	sizeof(TYPE) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(TYPE) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(TYPE) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(TYPE) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(THING) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kzalloc(
-	sizeof(THING) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kzalloc(
-	sizeof(THING) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kzalloc(
-	sizeof(THING) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
)

// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@

(
  kzalloc(
-	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  kzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  kzalloc(
-	sizeof(THING1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  kzalloc(
-	sizeof(THING1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  kzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
|
  kzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
)

// 3-factor product, only identifiers, with redundant parens removed.
@@
identifier STRIDE, SIZE, COUNT;
@@

(
  kzalloc(
-	(COUNT) * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	(COUNT) * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	(COUNT) * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	(COUNT) * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
)

// Any remaining multi-factor products, first at least 3-factor products,
// when they're not all constants...
@@
expression E1, E2, E3;
constant C1, C2, C3;
@@

(
  kzalloc(C1 * C2 * C3, ...)
|
  kzalloc(
-	(E1) * E2 * E3
+	array3_size(E1, E2, E3)
  , ...)
|
  kzalloc(
-	(E1) * (E2) * E3
+	array3_size(E1, E2, E3)
  , ...)
|
  kzalloc(
-	(E1) * (E2) * (E3)
+	array3_size(E1, E2, E3)
  , ...)
|
  kzalloc(
-	E1 * E2 * E3
+	array3_size(E1, E2, E3)
  , ...)
)

// And then all remaining 2 factors products when they're not all constants,
// keeping sizeof() as the second factor argument.
@@
expression THING, E1, E2;
type TYPE;
constant C1, C2, C3;
@@

(
  kzalloc(sizeof(THING) * C2, ...)
|
  kzalloc(sizeof(TYPE) * C2, ...)
|
  kzalloc(C1 * C2 * C3, ...)
|
  kzalloc(C1 * C2, ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * (E2)
+	E2, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * E2
+	E2, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * (E2)
+	E2, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * E2
+	E2, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	(E1) * E2
+	E1, E2
  , ...)
|
- kzalloc
+ kcalloc
  (
-	(E1) * (E2)
+	E1, E2
  , ...)
|
- kzalloc
+ kcalloc
  (
-	E1 * E2
+	E1, E2
  , ...)
)

Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-12 16:19:22 -07:00
Linus Torvalds
1c8c5a9d38 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Add Maglev hashing scheduler to IPVS, from Inju Song.

 2) Lots of new TC subsystem tests from Roman Mashak.

 3) Add TCP zero copy receive and fix delayed acks and autotuning with
    SO_RCVLOWAT, from Eric Dumazet.

 4) Add XDP_REDIRECT support to mlx5 driver, from Jesper Dangaard
    Brouer.

 5) Add ttl inherit support to vxlan, from Hangbin Liu.

 6) Properly separate ipv6 routes into their logically independant
    components. fib6_info for the routing table, and fib6_nh for sets of
    nexthops, which thus can be shared. From David Ahern.

 7) Add bpf_xdp_adjust_tail helper, which can be used to generate ICMP
    messages from XDP programs. From Nikita V. Shirokov.

 8) Lots of long overdue cleanups to the r8169 driver, from Heiner
    Kallweit.

 9) Add BTF ("BPF Type Format"), from Martin KaFai Lau.

10) Add traffic condition monitoring to iwlwifi, from Luca Coelho.

11) Plumb extack down into fib_rules, from Roopa Prabhu.

12) Add Flower classifier offload support to igb, from Vinicius Costa
    Gomes.

13) Add UDP GSO support, from Willem de Bruijn.

14) Add documentation for eBPF helpers, from Quentin Monnet.

15) Add TLS tx offload to mlx5, from Ilya Lesokhin.

16) Allow applications to be given the number of bytes available to read
    on a socket via a control message returned from recvmsg(), from
    Soheil Hassas Yeganeh.

17) Add x86_32 eBPF JIT compiler, from Wang YanQing.

18) Add AF_XDP sockets, with zerocopy support infrastructure as well.
    From Björn Töpel.

19) Remove indirect load support from all of the BPF JITs and handle
    these operations in the verifier by translating them into native BPF
    instead. From Daniel Borkmann.

20) Add GRO support to ipv6 gre tunnels, from Eran Ben Elisha.

21) Allow XDP programs to do lookups in the main kernel routing tables
    for forwarding. From David Ahern.

22) Allow drivers to store hardware state into an ELF section of kernel
    dump vmcore files, and use it in cxgb4. From Rahul Lakkireddy.

23) Various RACK and loss detection improvements in TCP, from Yuchung
    Cheng.

24) Add TCP SACK compression, from Eric Dumazet.

25) Add User Mode Helper support and basic bpfilter infrastructure, from
    Alexei Starovoitov.

26) Support ports and protocol values in RTM_GETROUTE, from Roopa
    Prabhu.

27) Support bulking in ->ndo_xdp_xmit() API, from Jesper Dangaard
    Brouer.

28) Add lots of forwarding selftests, from Petr Machata.

29) Add generic network device failover driver, from Sridhar Samudrala.

* ra.kernel.org:/pub/scm/linux/kernel/git/davem/net-next: (1959 commits)
  strparser: Add __strp_unpause and use it in ktls.
  rxrpc: Fix terminal retransmission connection ID to include the channel
  net: hns3: Optimize PF CMDQ interrupt switching process
  net: hns3: Fix for VF mailbox receiving unknown message
  net: hns3: Fix for VF mailbox cannot receiving PF response
  bnx2x: use the right constant
  Revert "net: sched: cls: Fix offloading when ingress dev is vxlan"
  net: dsa: b53: Fix for brcm tag issue in Cygnus SoC
  enic: fix UDP rss bits
  netdev-FAQ: clarify DaveM's position for stable backports
  rtnetlink: validate attributes in do_setlink()
  mlxsw: Add extack messages for port_{un, }split failures
  netdevsim: Add extack error message for devlink reload
  devlink: Add extack to reload and port_{un, }split operations
  net: metrics: add proper netlink validation
  ipmr: fix error path when ipmr_new_table fails
  ip6mr: only set ip6mr_table from setsockopt when ip6mr_new_table succeeds
  net: hns3: remove unused hclgevf_cfg_func_mta_filter
  netfilter: provide udp*_lib_lookup for nf_tproxy
  qed*: Utilize FW 8.37.2.0
  ...
2018-06-06 18:39:49 -07:00
Thomas Gleixner
604a98f1df Merge branch 'timers/urgent' into timers/core
Pick up urgent fixes to apply dependent cleanup patch
2018-05-02 16:11:12 +02:00
Kittipon Meesompop
d7e6ed97b5 s390/qeth: add IPv6 RX checksum offload support
Check if a qeth device supports IPv6 RX checksum offload, and hook it up
into the existing NETIF_F_RXCSUM support.
As NETIF_F_RXCSUM is now backed by a combination of HW Assists, we need
to be a little smarter when dealing with errors during a configuration
change:
- switching on NETIF_F_RXCSUM only makes sense if at least one HW Assist
  was enabled successfully.
- for switching off NETIF_F_RXCSUM, all available HW Assists need to be
  deactivated.

Signed-off-by: Kittipon Meesompop <kmeesomp@linux.vnet.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-27 13:38:49 -04:00
Kittipon Meesompop
571f9dd802 s390/qeth: add IPv6 TX checksum offload support
Check if a qeth device supports IPv6 TX checksum offload, and advertise
NETIF_F_IPV6_CSUM accordingly. Add support for setting the relevant bits
in IPv6 packet descriptors.

Currently this has only limited use (ie. UDP, or Jumbo Frames). For any
TCP traffic with a standard MSS, the TCP checksum gets calculated
as part of the linear GSO segmentation.

Signed-off-by: Kittipon Meesompop <kmeesomp@linux.vnet.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-27 13:38:49 -04:00
Kittipon Meesompop
a8155b009f s390/qeth: extend Checksum Offload Assists for IPv6
Add some wrappers to make the protocol-specific Assist code a little
more generic, and use them for sending protocol-agnostic commands in
the Checksum Offload Assist code.

Signed-off-by: Kittipon Meesompop <kmeesomp@linux.vnet.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-27 13:38:49 -04:00
Kittipon Meesompop
ee75fb8615 s390/qeth: query IPv6 assists during hardsetup
For new functionality, the L2 subdriver will start using IPv6 assists.
So move the query from the L3 subdriver into the common setup path.

Signed-off-by: Kittipon Meesompop <kmeesomp@linux.vnet.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-27 13:38:49 -04:00
Kittipon Meesompop
3aade31b2f s390/qeth: add stats counter for RX csum offload
This matches the statistics we gather for the TX offload path.

Signed-off-by: Kittipon Meesompop <kmeesomp@linux.vnet.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-27 13:38:48 -04:00
Julian Wiedmann
d4ac024688 s390/qeth: convert vlan spinlock to mutex
As the vid_list is only accessed from process context, there's no need to
protect it with a spinlock.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-27 13:38:47 -04:00
Julian Wiedmann
7bcd64eb8c s390/qeth: skip QDIO queue handler indirection
Both qeth sub drivers use the same QDIO queue handlers, there's no need
to expose them via the driver's discipline. No functional change.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-27 13:38:47 -04:00
Julian Wiedmann
b7493e91c1 s390/qeth: use Read device to query hypervisor for MAC
For z/VM NICs, qeth needs to consider which of the three CCW devices in
an MPC group it uses for requesting a managed MAC address.

On the Base device, the hypervisor returns a default MAC which is
pre-assigned when creating the NIC (this MAC is also returned by the
READ MAC primitive). Querying any other device results in the allocation
of an additional MAC address.

For consistency with READ MAC and to avoid using up more addresses than
necessary, it is preferable to use the NIC's default MAC. So switch the
the diag26c over to using a NIC's Read device, which should always be
identical to the Base device.

Fixes: ec61bd2fd2a2 ("s390/qeth: use diag26c to get MAC address on L2")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-22 14:42:32 -04:00
Julian Wiedmann
db71bbbd11 s390/qeth: fix request-side race during cmd IO timeout
Submitting a cmd IO request (usually on the WRITE device, but for IDX
also on the READ device) is currently done with ccw_device_start()
and a manual timeout in the caller.
On timeout, the caller cleans up the related resources (eg. IO buffer).
But 1) the IO might still be active and utilize those resources, and
    2) when the IO completes, qeth_irq() will attempt to clean up the
       same resources again.

Instead of introducing additional resource locking, switch to
ccw_device_start_timeout() to ensure IO termination after timeout, and
let the IRQ handler alone deal with cleaning up after a request.

This also removes a stray write->irq_pending reset from
clear_ipacmd_list(). The routine doesn't terminate any pending IO on
the WRITE device, so this should be handled properly via IO timeout
in the IRQ handler.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-22 14:42:32 -04:00
Julian Wiedmann
a936b1ef37 s390/qeth: handle failure on workqueue creation
Creating the global workqueue during driver init may fail, deal with it.
Also, destroy the created workqueue on any subsequent error.

Fixes: 0f54761d167f ("qeth: Support VEPA mode")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-22 14:42:31 -04:00
Julian Wiedmann
901e3f49fa s390/qeth: avoid control IO completion stalls
For control IO, qeth currently tracks the index of the buffer that it
expects to complete the next IO on each qeth_channel. If the channel
presents an IRQ while this buffer has not yet completed, no completion
processing for _any_ completed buffer takes place.
So if the 'next buffer' is skipped for any sort of reason* (eg. when it
is released due to error conditions, before the IO is started), the
buffer obviously won't switch to PROCESSED until it is eventually
allocated for a _different_ IO and completes.
Until this happens, all completion processing on that channel stalls
and pending requests possibly time out.

As a fix, remove the whole 'next buffer' logic and simply process any
IO buffer right when it completes. A channel will never have more than
one IO pending, so there's no risk of processing out-of-sequence.

*Note: currently just one location in the code really handles this problem,
       by advancing the 'next' index manually.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-22 14:42:31 -04:00
Julian Wiedmann
686c97ee29 s390/qeth: fix error handling in adapter command callbacks
Make sure to check both return code fields before(!) processing the
command response. Otherwise we risk operating on invalid data.

This matches an earlier fix for SETASSPARMS commands, see
commit ad3cbf613329 ("s390/qeth: fix error handling in checksum cmd callback").

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-22 14:42:31 -04:00
Deepa Dinamani
0d55303c51 compat: Move compat_timespec/ timeval to compat_time.h
All the current architecture specific defines for these
are the same. Refactor these common defines to a common
header file.

The new common linux/compat_time.h is also useful as it
will eventually be used to hold all the defines that
are needed for compat time types that support non y2038
safe types. New architectures need not have to define these
new types as they will only use new y2038 safe syscalls.
This file can be deleted after y2038 when we stop supporting
non y2038 safe syscalls.

The patch also requires an operation similar to:

git grep "asm/compat\.h" | cut -d ":" -f 1 |  xargs -n 1 sed -i -e "s%asm/compat.h%linux/compat.h%g"

Cc: acme@kernel.org
Cc: benh@kernel.crashing.org
Cc: borntraeger@de.ibm.com
Cc: catalin.marinas@arm.com
Cc: cmetcalf@mellanox.com
Cc: cohuck@redhat.com
Cc: davem@davemloft.net
Cc: deller@gmx.de
Cc: devel@driverdev.osuosl.org
Cc: gerald.schaefer@de.ibm.com
Cc: gregkh@linuxfoundation.org
Cc: heiko.carstens@de.ibm.com
Cc: hoeppner@linux.vnet.ibm.com
Cc: hpa@zytor.com
Cc: jejb@parisc-linux.org
Cc: jwi@linux.vnet.ibm.com
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s390@vger.kernel.org
Cc: mark.rutland@arm.com
Cc: mingo@redhat.com
Cc: mpe@ellerman.id.au
Cc: oberpar@linux.vnet.ibm.com
Cc: oprofile-list@lists.sf.net
Cc: paulus@samba.org
Cc: peterz@infradead.org
Cc: ralf@linux-mips.org
Cc: rostedt@goodmis.org
Cc: rric@kernel.org
Cc: schwidefsky@de.ibm.com
Cc: sebott@linux.vnet.ibm.com
Cc: sparclinux@vger.kernel.org
Cc: sth@linux.vnet.ibm.com
Cc: ubraun@linux.vnet.ibm.com
Cc: will.deacon@arm.com
Cc: x86@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: James Hogan <jhogan@kernel.org>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-04-19 13:29:54 +02:00
Linus Torvalds
becdce1c66 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:

 - Improvements for the spectre defense:
    * The spectre related code is consolidated to a single file
      nospec-branch.c
    * Automatic enable/disable for the spectre v2 defenses (expoline vs.
      nobp)
    * Syslog messages for specve v2 are added
    * Enable CONFIG_GENERIC_CPU_VULNERABILITIES and define the attribute
      functions for spectre v1 and v2

 - Add helper macros for assembler alternatives and use them to shorten
   the code in entry.S.

 - Add support for persistent configuration data via the SCLP Store Data
   interface. The H/W interface requires a page table that uses 4K pages
   only, the code to setup such an address space is added as well.

 - Enable virtio GPU emulation in QEMU. To do this the depends
   statements for a few common Kconfig options are modified.

 - Add support for format-3 channel path descriptors and add a binary
   sysfs interface to export the associated utility strings.

 - Add a sysfs attribute to control the IFCC handling in case of
   constant channel errors.

 - The vfio-ccw changes from Cornelia.

 - Bug fixes and cleanups.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (40 commits)
  s390/kvm: improve stack frame constants in entry.S
  s390/lpp: use assembler alternatives for the LPP instruction
  s390/entry.S: use assembler alternatives
  s390: add assembler macros for CPU alternatives
  s390: add sysfs attributes for spectre
  s390: report spectre mitigation via syslog
  s390: add automatic detection of the spectre defense
  s390: move nobp parameter functions to nospec-branch.c
  s390/cio: add util_string sysfs attribute
  s390/chsc: query utility strings via fmt3 channel path descriptor
  s390/cio: rename struct channel_path_desc
  s390/cio: fix unbind of io_subchannel_driver
  s390/qdio: split up CCQ handling for EQBS / SQBS
  s390/qdio: don't retry EQBS after CCQ 96
  s390/qdio: restrict buffer merging to eligible devices
  s390/qdio: don't merge ERROR output buffers
  s390/qdio: simplify math in get_*_buffer_frontier()
  s390/decompressor: trim uncompressed image head during the build
  s390/crypto: Fix kernel crash on aes_s390 module remove.
  s390/defkeymap: fix global init to zero
  ...
2018-04-09 09:04:10 -07:00
Sebastian Ott
ded27d8d2e s390/cio: rename struct channel_path_desc
Rename struct channel_path_desc to struct channel_path_desc_fmt0
to fit the scheme. Provide a macro for the function wrappers that
gather this and related data from firmware.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-03-26 16:13:11 +02:00
David S. Miller
03fe2debbb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Fun set of conflict resolutions here...

For the mac80211 stuff, these were fortunately just parallel
adds.  Trivially resolved.

In drivers/net/phy/phy.c we had a bug fix in 'net' that moved the
function phy_disable_interrupts() earlier in the file, whilst in
'net-next' the phy_error() call from this function was removed.

In net/ipv4/xfrm4_policy.c, David Ahern's changes to remove the
'rt_table_id' member of rtable collided with a bug fix in 'net' that
added a new struct member "rt_mtu_locked" which needs to be copied
over here.

The mlxsw driver conflict consisted of net-next separating
the span code and definitions into separate files, whilst
a 'net' bug fix made some changes to that moved code.

The mlx5 infiniband conflict resolution was quite non-trivial,
the RDMA tree's merge commit was used as a guide here, and
here are their notes:

====================

    Due to bug fixes found by the syzkaller bot and taken into the for-rc
    branch after development for the 4.17 merge window had already started
    being taken into the for-next branch, there were fairly non-trivial
    merge issues that would need to be resolved between the for-rc branch
    and the for-next branch.  This merge resolves those conflicts and
    provides a unified base upon which ongoing development for 4.17 can
    be based.

    Conflicts:
            drivers/infiniband/hw/mlx5/main.c - Commit 42cea83f9524
            (IB/mlx5: Fix cleanup order on unload) added to for-rc and
            commit b5ca15ad7e61 (IB/mlx5: Add proper representors support)
            add as part of the devel cycle both needed to modify the
            init/de-init functions used by mlx5.  To support the new
            representors, the new functions added by the cleanup patch
            needed to be made non-static, and the init/de-init list
            added by the representors patch needed to be modified to
            match the init/de-init list changes made by the cleanup
            patch.
    Updates:
            drivers/infiniband/hw/mlx5/mlx5_ib.h - Update function
            prototypes added by representors patch to reflect new function
            names as changed by cleanup patch
            drivers/infiniband/hw/mlx5/ib_rep.c - Update init/de-init
            stage list to match new order from cleanup patch
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-23 11:31:58 -04:00
Julian Wiedmann
a6c3d93963 s390/qeth: on channel error, reject further cmd requests
When the IRQ handler determines that one of the cmd IO channels has
failed and schedules recovery, block any further cmd requests from
being submitted. The request would inevitably stall, and prevent the
recovery from making progress until the request times out.

This sort of error was observed after Live Guest Relocation, where
the pending IO on the READ channel intentionally gets terminated to
kick-start recovery. Simultaneously the guest executed SIOCETHTOOL,
triggering qeth to issue a QUERY CARD INFO command. The command
then stalled in the inoperabel WRITE channel.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-22 11:52:30 -04:00
Julian Wiedmann
17bf8c9b3d s390/qeth: lock read device while queueing next buffer
For calling ccw_device_start(), issue_next_read() needs to hold the
device's ccwlock.
This is satisfied for the IRQ handler path (where qeth_irq() gets called
under the ccwlock), but we need explicit locking for the initial call by
the MPC initialization.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-22 11:52:30 -04:00
Julian Wiedmann
1063e432bb s390/qeth: when thread completes, wake up all waiters
qeth_wait_for_threads() is potentially called by multiple users, make
sure to notify all of them after qeth_clear_thread_running_bit()
adjusted the thread_running_mask. With no timeout, callers would
otherwise stall.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-22 11:52:30 -04:00
Julian Wiedmann
6be687395b s390/qeth: free netdevice when removing a card
On removal, a qeth card's netdevice is currently not properly freed
because the call chain looks as follows:

qeth_core_remove_device(card)
	lx_remove_device(card)
		unregister_netdev(card->dev)
		card->dev = NULL			!!!
	qeth_core_free_card(card)
		if (card->dev)				!!!
			free_netdev(card->dev)

Fix it by free'ing the netdev straight after unregistering. This also
fixes the sysfs-driven layer switch case (qeth_dev_layer2_store()),
where the need to free the current netdevice was not considered at all.

Note that free_netdev() takes care of the netif_napi_del() for us too.

Fixes: 4a71df50047f ("qeth: new qeth device driver")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-22 11:52:30 -04:00