591282 Commits

Author SHA1 Message Date
Alexander Duyck
d7fb5a8049 gso: Do not perform partial GSO if number of partial segments is 1 or less
In the event that the number of partial segments is equal to 1 we don't
really need to perform partial segmentation offload.  As such we should
skip multiplying the MSS and instead just clear the partial_segs value
since it will not provide any gain to advertise the frame as being GSO when
it is a single frame.
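
The decision described above can be sketched as follows. This is a minimal user-space illustration, not the kernel's segmentation code; `struct gso_plan` and `plan_partial_gso()` are hypothetical names introduced only for this example.

```c
#include <assert.h>

/* Hypothetical result type: what would be advertised to the device. */
struct gso_plan {
    unsigned int mss;          /* segment size handed to the device */
    unsigned int partial_segs; /* 0 means "do not use partial GSO" */
};

/* Hypothetical helper: decide whether partial GSO is worthwhile. */
static struct gso_plan plan_partial_gso(unsigned int len, unsigned int mss)
{
    struct gso_plan plan = { mss, 0 };

    assert(mss > 0);
    plan.partial_segs = len / mss;
    if (plan.partial_segs > 1)
        plan.mss = mss * plan.partial_segs; /* one large frame covering all full segments */
    else
        plan.partial_segs = 0;              /* a single frame gains nothing from GSO */
    return plan;
}
```

With a 1400-byte MSS, a 4200-byte payload yields three partial segments and a multiplied MSS, while a 1400-byte payload leaves the MSS untouched and clears `partial_segs`.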

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 13:32:26 -04:00
Jiri Benc
f132ae7c46 gre: change gre_parse_header to return the header length
It's easier for gre_parse_header to return the header length instead of
filling it into a parameter. That way, the callers that don't care about the
header length can just check whether the returned value is lower than zero.
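
The calling convention can be illustrated with a small user-space sketch. `sketch_gre_header_len()` is a hypothetical function, not the kernel's gre_parse_header(); it only models the "length on success, negative errno on failure" return value, using the GRE checksum, key and sequence flag bits of the first header byte.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_GRE_CSUM 0x80 /* C bit in the first header byte */
#define SKETCH_GRE_KEY  0x20 /* K bit */
#define SKETCH_GRE_SEQ  0x10 /* S bit */

/* Returns the header length in bytes, or a negative errno on error. */
static int sketch_gre_header_len(const uint8_t *buf, size_t len)
{
    int hdr_len = 4; /* flags/version plus protocol type */

    assert(buf != NULL);
    if (len < 4)
        return -EINVAL;
    if (buf[0] & SKETCH_GRE_CSUM)
        hdr_len += 4; /* checksum and reserved field */
    if (buf[0] & SKETCH_GRE_KEY)
        hdr_len += 4;
    if (buf[0] & SKETCH_GRE_SEQ)
        hdr_len += 4;
    if ((size_t)hdr_len > len)
        return -EINVAL; /* optional fields would run past the buffer */
    return hdr_len;
}
```

A caller that does not need the length can simply test `sketch_gre_header_len(buf, len) < 0`.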

In gre_err, the tunnel header must not be pulled. See commit b7f8fe251e46
("gre: do not pull header in ICMP error processing") for details.

This patch reduces the conflict between the mentioned commit and commit
95f5c64c3c13 ("gre: Move utility functions to common headers").

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 12:44:45 -04:00
Eric Dumazet
d4011239f4 tcp: guarantee forward progress in tcp_sendmsg()
Under high rx pressure, it is possible tcp_sendmsg() never has a
chance to allocate an skb and loop forever as sk_flush_backlog()
would always return true.

Fix this by calling sk_flush_backlog() only if one skb had been
allocated and filled before last backlog check.
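
A simplified model of the fix, assuming a hypothetical `sketch_flush_backlog()` that always reports pending backlog (the worst case under rx pressure). The names and the loop structure are illustrative only and are not the actual tcp_sendmsg() code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for sk_flush_backlog(): under sustained rx
 * pressure there is always backlog to process, so it returns true. */
static bool sketch_flush_backlog(void)
{
    return true;
}

/* Returns how many chunks were queued within max_iterations loop passes. */
static int sketch_sendmsg(int chunks, int max_iterations)
{
    bool process_backlog = false; /* set only after an skb was filled */
    int sent = 0;

    assert(chunks >= 0);
    for (int i = 0; i < max_iterations && sent < chunks; i++) {
        if (process_backlog && sketch_flush_backlog()) {
            process_backlog = false; /* at most one flush per filled skb */
            continue;                /* models "goto restart" */
        }
        sent++;                      /* allocate and fill one skb */
        process_backlog = true;
    }
    return sent;
}
```

Because the flag is cleared after each flush, every second pass through the loop is guaranteed to queue data even when the backlog never drains.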

Fixes: d41a69f1d390 ("tcp: make tcp_sendmsg() aware of socket backlog")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 12:44:36 -04:00
Tony Nguyen
8b44a8a09d ixgbevf: Remove unused parameter
ixgbevf_update_xcast_mode() is not using the netdev parameter;
removing it since it's unnecessary.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-04 00:24:32 -07:00
Usha Ketineni
8829009d2f ixgbe: Disable DCB and FCoE for X550EM_x and x550em_a
This patch adds IXGBE_FLAG_DCB_CAPABLE flag that is set
for all MACs other than X550EM_x and x550em_a. DCB and
FCoE are disabled for these MACs. DCB initialization
code is moved to a separate function.

Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com>
Tested-by: Ronald Bynoe <ronald.j.bynoe@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-04 00:24:32 -07:00
Alexander Duyck
2f8214fe68 ixgbevf: Use mac_ops instead of trying to identify NIC type
This change makes it so that we can just use function pointers instead of
having to identify if a given VF is running on a Linux or Windows PF.  By
doing this we can avoid having to pull too much information out of the
lower layers and can instead just make use of the mac_ops pointers since
they should differ between the two types of VFs anyway.
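
The function-pointer approach can be sketched in plain C. All names here (`sketch_hw`, `sketch_mac_ops`, the two ops tables) are hypothetical and only illustrate dispatching through an ops table instead of testing which kind of PF the VF runs on.

```c
#include <assert.h>
#include <string.h>

struct sketch_hw;

/* Hypothetical ops table: one entry per operation that differs by PF type. */
struct sketch_mac_ops {
    const char *(*update_xcast_mode)(struct sketch_hw *hw);
};

struct sketch_hw {
    const struct sketch_mac_ops *ops; /* chosen once, at probe time */
};

static const char *sketch_mbx_update(struct sketch_hw *hw)
{
    (void)hw;
    return "hardware mailbox";
}

static const char *sketch_hv_update(struct sketch_hw *hw)
{
    (void)hw;
    return "software-mediated path";
}

static const struct sketch_mac_ops sketch_linux_pf_ops = { sketch_mbx_update };
static const struct sketch_mac_ops sketch_hyperv_pf_ops = { sketch_hv_update };

/* Callers never ask "which PF is this?"; they just call through ops. */
static const char *sketch_set_xcast_mode(struct sketch_hw *hw)
{
    assert(hw != NULL && hw->ops != NULL);
    return hw->ops->update_xcast_mode(hw);
}
```

The upper layers stay identical for both VF types; only the table assigned during setup differs.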

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-04 00:24:31 -07:00
Babu Moger
33b0eb1596 ixgbevf: Change the relaxed order settings in VF driver for sparc
We noticed performance issues with the VF interface on sparc compared
to the PF. Setting the RX to IXGBE_DCA_RXCTRL_DATA_WRO_EN brings it
on par with the PF. This also matches the default sparc setting in the
PF driver.

Signed-off-by: Babu Moger <babu.moger@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-04 00:24:31 -07:00
Preethi Banala
45a88dfcd8 ixgbe: Revise populating few registers and macro definitions
Revise how a few registers are populated in ixgbe_get_regs() and
revise some macro definitions.
Before applying patch:
$ du -k objs/drivers/net/ethernet/intel/ixgbe/ixgbe.ko
8572    objs/drivers/net/ethernet/intel/ixgbe/ixgbe.ko
After applying patch:
$ du -k objs/drivers/net/ethernet/intel/ixgbe/ixgbe.ko
8568    objs/drivers/net/ethernet/intel/ixgbe/ixgbe.ko

Signed-off-by: Preethi Banala <preethi.banala@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-04 00:24:31 -07:00
Preethi Banala
4c4f8023be ixgbe: Return 64 bit stats values
The code was ignoring the upper 32 bits of the stats registers. This patch
correctly fills out each 64-bit value in two 32-bit words.
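
Splitting a 64-bit counter into two 32-bit words can be sketched as below. `sketch_put_stat64()` is a hypothetical helper, and the low-word-first ordering is an assumption made for the example, not something stated in this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store a 64-bit counter into two consecutive
 * 32-bit slots of a register dump (low word first, by assumption). */
static void sketch_put_stat64(uint32_t *regs, uint64_t value)
{
    assert(regs != NULL);
    regs[0] = (uint32_t)(value & 0xffffffffu); /* lower 32 bits */
    regs[1] = (uint32_t)(value >> 32);         /* upper 32 bits, previously dropped */
}
```

A counter that has grown past 32 bits now survives the dump instead of being truncated.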

Signed-off-by: Preethi Banala <preethi.banala@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-04 00:24:30 -07:00
Preethi Banala
61ff59d81c ixgbe: Remove duplicate and unused device ID definitions
Remove duplicate and unused device ID definitions.

Signed-off-by: Preethi Banala <preethi.banala@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-04 00:24:30 -07:00
Emil Tantilov
740234f070 ixgbe: check EEPROM for WOL support for X540 and above
This change aims to simplify the logic we use to determine WOL
support by reading the EEPROM bits for MACs X540 and newer.

Also some cleanups in ixgbe_wol_supported() - changed return type to
bool and removed redundant return variable by simply using return after
the checks.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-04 00:24:29 -07:00
Emil Tantilov
00103a6ce3 ixgbe: add WoL support for some 82599 subdevice IDs
We had some 82599 subdevice IDs missing from the list of parts that
support WoL.

Reported-by: Neil Horman <nhorman@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-04 00:24:29 -07:00
KY Srinivasan
c6d45171d7 ixgbevf: Support Windows hosts (Hyper-V)
On Hyper-V, the VF/PF communication is via a software-mediated path
as opposed to the hardware mailbox. Make the necessary
adjustments to support Hyper-V.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-04 00:24:29 -07:00
KY Srinivasan
b4363fbd8d ixgbevf: Add the device ID's presented while running on Hyper-V
Intel SR-IOV cards present different IDs when running on Hyper-V.
Add the device IDs presented while running on Hyper-V.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-04 00:24:28 -07:00
Amritha Nambiar
1cdaaf5405 ixgbe: Match on multiple headers for cls_u32 offloads
Adds support for setting filters that match on multiple header fields (L3, L4).
This is achieved in the following order:
1. Create a leaf hash table for the next header.
2. Create a link to the leaf hash table from the base hash table with
   matches on next header type and current header fields.
3. Add filter in leaf hash table with match on next header fields and
   action.

Verified with the following filters :

Match TCP and DIP:
        handle 1: u32 divisor 1
        u32 ht 800: order 1 link 1: \
        offset at 0 mask 0f00 shift 6 plus 0 eat \
        match ip protocol 6 ff match ip dst 10.0.0.1/32
        match tcp src 28 ffff action drop

Delete the filter:

Match on DIP, SIP, UDP (SPort, DPort):
        handle 2: u32 divisor 1
        u32 ht 800: order 2 link 2: \
        offset at 0 mask 0f00 shift 6 plus 0 eat \
        match ip dst 15.0.0.2/32 match ip protocol 17 ff \
        match ip src 15.0.0.1/32
        match udp src 30 ffff match udp dst 32 ffff action drop

Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-04 00:24:28 -07:00
Sridhar Samudrala
947f8a4552 ixgbe: Add support for redirect action to cls_u32 offloads
This patch enables 'redirect' to an SR-IOV VF or an offloaded macvlan
device queue via tc 'mirred' action.

Verified with the following script, which creates SR-IOV VFs and an
offloaded macvlan, and adds tc u32 filters with a redirect action to the
associated netdevs.

 # add ingress qdisc.
 tc qdisc add dev p4p1 ingress

 # enable hw tc offload.
 ethtool -K p4p1 hw-tc-offload on

 # create 4 sriov VFs and bring up the first one.
 echo 4 > /sys/class/net/p4p1/device/sriov_numvfs
 sleep 1
 ip link set p4p1 up
 ip link set p4p1_0 up

 # create a offloaded macvlan device and bring it up.
 ethtool -K p4p1 l2-fwd-offload on
 ip link add link p4p1 name mvlan_1 type macvlan
 ip link set mvlan_1 up

 # add u32 filter with action to redirect to VF netdev
 tc filter add dev p4p1 parent ffff: protocol ip prio 99 \
    handle 800:0:1 u32 ht 800: \
    match ip src 192.168.1.3/32 \
    action mirred egress redirect dev p4p1_0

 # add u32 filter with action to redirect to macvlan netdev
 tc filter add dev p4p1 parent ffff: protocol ip prio 99 \
    handle 800:0:2 u32 ht 800: \
    match ip src 192.168.2.3/32 \
    action mirred egress redirect dev mvlan_1

Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-04 00:24:28 -07:00
Sridhar Samudrala
229d285081 net_sched: act_mirred: add helper inlines to access tcf_mirred info
Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-04 00:24:27 -07:00
David S. Miller
cba6532100 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/ipv4/ip_gre.c

Minor conflicts between tunnel bug fixes in net and
ipv6 tunnel cleanups in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 00:52:29 -04:00
Linus Torvalds
7391daf2ff Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "Some straggler bug fixes:

   1) Batman-adv DAT must consider VLAN IDs when choosing candidate
      nodes, from Antonio Quartulli.

   2) Fix botched reference counting of vlan objects and neigh nodes in
      batman-adv, from Sven Eckelmann.

   3) netem can crash when it sees GSO packets, the fix is to segment
      them upon ->enqueue.  Fix from Neil Horman with help from Eric
      Dumazet.

   4) Fix VXLAN dependencies in mlx5 driver Kconfig, from Matthew
      Finlay.

   5) Handle VXLAN ops outside of rcu lock, via a workqueue, in mlx5,
      since it can sleep.  Fix also from Matthew Finlay.

   6) Check mdiobus_scan() return values properly in pxa168_eth and macb
      drivers.  From Sergei Shtylyov.

   7) If the netdevice doesn't support checksumming, disable
      segmentation.  From Alexander Duyck.

   8) Fix races between RDS tcp accept and sending, from Sowmini
      Varadhan.

   9) In macb driver, probe MDIO bus before we register the netdev,
      otherwise we can try to open the device before it is really ready
      for that.  Fix from Florian Fainelli.

  10) Netlink attribute size for ILA "tunnels" not calculated properly,
      fix from Nicolas Dichtel"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  ipv6/ila: fix nlsize calculation for lwtunnel
  net: macb: Probe MDIO bus before registering netdev
  RDS: TCP: Synchronize accept() and connect() paths on t_conn_lock.
  RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock
  vxlan: Add checksum check to the features check function
  net: Disable segmentation if checksumming is not supported
  net: mvneta: Remove superfluous SMP function call
  macb: fix mdiobus_scan() error check
  pxa168_eth: fix mdiobus_scan() error check
  net/mlx5e: Use workqueue for vxlan ops
  net/mlx5e: Implement a mlx5e workqueue
  net/mlx5: Kconfig: Fix MLX5_EN/VXLAN build issue
  net/mlx5: Unmap only the relevant IO memory mapping
  netem: Segment GSO packets on enqueue
  batman-adv: Fix reference counting of hardif_neigh_node object for neigh_node
  batman-adv: Fix reference counting of vlan object for tt_local_entry
  batman-adv: B.A.T.M.A.N V - make sure iface is reactivated upon NETDEV_UP event
  batman-adv: fix DAT candidate selection (must use vid)
2016-05-03 15:07:50 -07:00
Linus Torvalds
610603a520 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse fixes from Miklos Szeredi:
 "Fix a regression and update the MAINTAINERS entry for fuse"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: update mailing list in MAINTAINERS
  fuse: Fix return value from fuse_get_user_pages()
2016-05-03 14:23:58 -07:00
Nicolas Dichtel
79e8dc8b80 ipv6/ila: fix nlsize calculation for lwtunnel
The handler 'ila_fill_encap_info' adds one attribute: ILA_ATTR_LOCATOR.

Fixes: 65d7ab8de582 ("net: Identifier Locator Addressing module")
CC: Tom Herbert <tom@herbertland.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 16:21:33 -04:00
Wei Wang
26879da587 ipv6: add new struct ipcm6_cookie
In the sendmsg function of UDP, raw, ICMP and l2tp sockets, we use local
variables like hlimits, tclass, opt and dontfrag and pass them to corresponding
functions like ip6_make_skb, ip6_append_data and xxx_push_pending_frames.
This is not a good practice and makes it hard to add new parameters.
This fix introduces a new struct ipcm6_cookie similar to ipcm_cookie in
ipv4 and include the above mentioned variables. And we only pass the
pointer to this structure to corresponding functions. This makes it easier
to add new parameters in the future and makes the function cleaner.
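
The refactoring pattern can be sketched as follows. The struct layout and the "negative means unset" convention are assumptions made for illustration; `sketch_cookie` and `sketch_effective_hlimit()` are hypothetical names, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cookie bundling the per-call send parameters. */
struct sketch_cookie {
    int hlimit;      /* hop limit, negative if not set by the caller */
    int tclass;      /* traffic class, negative if not set */
    int dontfrag;    /* don't-fragment preference, negative if not set */
    const void *opt; /* optional extension data, may be NULL */
};

/* One pointer replaces four separate arguments; adding a new field
 * later does not change any function signature. */
static int sketch_effective_hlimit(const struct sketch_cookie *c, int default_hlimit)
{
    assert(c != NULL);
    return c->hlimit < 0 ? default_hlimit : c->hlimit;
}
```

A sendmsg-style caller would fill one such cookie on its stack and pass its address down the call chain.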

Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 16:08:14 -04:00
Florian Fainelli
cf6696608a net: macb: Probe MDIO bus before registering netdev
The current sequence makes us register for a network device prior to
registering and probing the MDIO bus which could lead to some unwanted
consequences, like a thread of execution calling into ndo_open before
register_netdev() returns, while the MDIO bus is not ready yet.

Rework the sequence to register for the MDIO bus, and therefore attach
to a PHY prior to calling register_netdev(), which implies reworking the
error path a bit.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 16:06:05 -04:00
David S. Miller
b365d955f3 Merge branch 'rds-fixes'
Sowmini Varadhan says:

====================
RDS: TCP: synchronization during connection startup

This patch series ensures that the passive (accept) side of the
TCP connection used for RDS-TCP is correctly synchronized with
any concurrent active (connect) attempts for a given pair of peers.

Patch 1 in the series makes sure that the t_sock in struct
rds_tcp_connection is only reset after any threads in rds_tcp_xmit
have completed (otherwise a null-ptr deref may be encountered).
Patch 2 synchronizes rds_tcp_accept_one() with the rds_tcp*connect()
path.

v2: review comments from Santosh Shilimkar, other spelling corrections
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 16:03:45 -04:00
Sowmini Varadhan
bd7c5f983f RDS: TCP: Synchronize accept() and connect() paths on t_conn_lock.
An arbitration scheme for duelling SYNs is implemented as part of
commit 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an
outgoing socket in rds_tcp_accept_one()") which ensures that both nodes
involved will arrive at the same arbitration decision. However, this
needs to be synchronized with an outgoing SYN to be generated by
rds_tcp_conn_connect(). This commit achieves the synchronization
through the t_conn_lock mutex in struct rds_tcp_connection.

The rds_conn_state is checked in rds_tcp_conn_connect() after acquiring
the t_conn_lock mutex.  A SYN is sent out only if the RDS connection is
not already UP (an UP would indicate that rds_tcp_accept_one() has
completed 3WH, so no SYN needs to be generated).

Similarly, the rds_conn_state is checked in rds_tcp_accept_one() after
acquiring the t_conn_lock mutex. The only acceptable states (to
allow continuation of the arbitration logic) are UP (i.e., outgoing SYN
was SYN-ACKed by peer after it sent us the SYN) or CONNECTING (we sent
outgoing SYN before we saw incoming SYN).
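
The two state checks can be written as simple predicates. This sketch leaves out the locking itself: in the commit both checks run while holding the t_conn_lock mutex. The enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of connection states for this illustration. */
enum sketch_conn_state {
    SKETCH_CONN_DOWN,
    SKETCH_CONN_CONNECTING,
    SKETCH_CONN_UP,
    SKETCH_CONN_ERROR
};

/* Connect side: send a SYN only if the connection is not already UP. */
static bool sketch_should_send_syn(enum sketch_conn_state state)
{
    return state != SKETCH_CONN_UP;
}

/* Accept side: arbitration may continue only in UP or CONNECTING. */
static bool sketch_accept_may_continue(enum sketch_conn_state state)
{
    return state == SKETCH_CONN_UP || state == SKETCH_CONN_CONNECTING;
}
```

Evaluating both predicates under the same lock is what keeps the connect and accept paths from acting on a stale state.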

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 16:03:44 -04:00
Sowmini Varadhan
eb19284026 RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock
There is a race condition between rds_send_xmit -> rds_tcp_xmit
and the code that deals with resolution of duelling syns added
by commit 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an
outgoing socket in rds_tcp_accept_one()").

Specifically, we may end up dereferencing a null pointer in rds_send_xmit
if we have the interleaving sequence:
    rds_tcp_accept_one                   rds_send_xmit

                                         conn is RDS_CONN_UP, so
                                         invoke rds_tcp_xmit

                                         tc = conn->c_transport_data
    rds_tcp_restore_callbacks
        /* reset t_sock */
                                         null ptr deref from tc->t_sock

The race condition can be avoided without adding the overhead of
additional locking in the xmit path: have rds_tcp_accept_one wait
for rds_tcp_xmit threads to complete before resetting callbacks.
The synchronization can be done in the same manner as rds_conn_shutdown().
First set the rds_conn_state to something other than RDS_CONN_UP
(so that new threads cannot get into rds_tcp_xmit()), then wait for
RDS_IN_XMIT to be cleared in the conn->c_flags indicating that any
threads in rds_tcp_xmit are done.

Fixes: 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an
outgoing socket in rds_tcp_accept_one()")
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 16:03:44 -04:00
Eric Dumazet
1d2077ac01 net: add __sock_wfree() helper
Hosts sending a lot of ACK packets exhibit high sock_wfree() cost
because of a cache line miss when testing SOCK_USE_WRITE_QUEUE.

We could move this flag close to sk_wmem_alloc but it is better
to perform the atomic_sub_and_test() on a clean cache line,
as it avoids one extra bus transaction.

skb_orphan_partial() can also have a fast track for packets that either
are TCP acks, or already went through another skb_orphan_partial()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 16:02:36 -04:00
David S. Miller
42c8819b8d Merge branch 'tunnel-csum-and-sg-offloads'
Alexander Duyck says:

====================
Fixes for tunnel checksum and segmentation offloads

This patch series is a subset of patches I had submitted for net-next.  I
plan to drop these two patches from the v3 of "Fix Tunnel features and
enable GSO partial for several drivers" and I am instead submitting them
for net since these are truly fixes and likely will need to be backported
to stable branches.

This series addresses 2 specific issues.  The first is that we could
request TSO on a v4 inner header while not supporting checksum offload of
the outer IPv6 header.  The second is that we could request an IPv6 inner
checksum offload without validating that we could actually support an inner
IPv6 checksum offload.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 16:00:55 -04:00
Alexander Duyck
af67eb9e7e vxlan: Add checksum check to the features check function
We need to perform an additional check on the inner headers to determine if
we can offload the checksum for them.  Previously this check didn't occur
so we would generate an invalid frame in the case of an IPv6 header
encapsulated inside of an IPv4 tunnel.  To fix this I added a secondary
check to vxlan_features_check so that we can verify that we can offload the
inner checksum.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 16:00:54 -04:00
Alexander Duyck
996e802187 net: Disable segmentation if checksumming is not supported
In the case of the mlx4 and mlx5 driver they do not support IPv6 checksum
offload for tunnels.  With this being the case we should disable GSO in
addition to the checksum offload features when we find that a device cannot
perform a checksum on a given packet type.
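
The rule can be sketched with a feature bitmask. The flag names and `sketch_features_check()` are hypothetical and stand in for the real netdev feature bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits for the illustration. */
#define SKETCH_F_CSUM (1u << 0) /* checksum offload */
#define SKETCH_F_GSO  (1u << 1) /* segmentation offload */
#define SKETCH_F_SG   (1u << 2) /* scatter-gather, unrelated to checksums */

/* If the device cannot checksum this packet type, it cannot segment
 * it either, so both features are masked out together. */
static uint32_t sketch_features_check(uint32_t features, bool can_csum)
{
    if (!can_csum)
        features &= ~(uint32_t)(SKETCH_F_CSUM | SKETCH_F_GSO);
    return features;
}
```

Features that do not depend on checksumming, such as the scatter-gather bit here, are left untouched.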

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 16:00:54 -04:00
David S. Miller
e34b1638d0 Merge branch 'tipc-next'
Jon Maloy says:

====================
tipc: redesign socket-level flow control

The socket-level flow control in TIPC has long been due for a major
overhaul. This series fixes this.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 15:51:17 -04:00
Jon Paul Maloy
10724cc7bb tipc: redesign connection-level flow control
There are two flow control mechanisms in TIPC; one at link level that
handles network congestion, burst control, and retransmission, and one
at connection level whose only remaining task is to prevent overflow
in the receiving socket buffer. In TIPC, the latter task has to be
solved end-to-end because messages can not be thrown away once they
have been accepted and delivered upwards from the link layer, i.e., we
can never permit the receive buffer to overflow.

Currently, this algorithm is message based. A counter in the receiving
socket keeps track of number of consumed messages, and sends a dedicated
acknowledge message back to the sender for every 256 consumed messages.
A counter at the sending end keeps track of the sent, not yet
acknowledged messages, and blocks the sender if this number ever reaches
512 unacknowledged messages. When the missing acknowledge arrives, the
socket is then woken up for renewed transmission. This works well for
keeping the message flow running, as it almost never happens that a
sender socket is blocked this way.

A problem with the current mechanism is that it potentially is very
memory consuming. Since we don't distinguish between small and large
messages, we have to dimension the socket receive buffer according
to a worst-case of both. I.e., the window size must be chosen large
enough to sustain a reasonable throughput even for the smallest
messages, while we must still consider a scenario where all messages
are of maximum size. Hence, the current fixed window size of 512 messages
and a maximum message size of 66k results in a receive buffer of 66 MB
when truesize(66k) = 131k is taken into account. It is possible to do
much better.

This commit introduces an algorithm where we instead use 1024-byte
blocks as base unit. This unit, always rounded upwards from the
actual message size, is used when we advertise windows as well as when
we count and acknowledge transmitted data. The advertised window is
based on the configured receive buffer size in such a way that even
the worst-case truesize/msgsize ratio always is covered. Since the
smallest possible message size (from a flow control viewpoint) now is
1024 bytes, we can safely assume this ratio to be less than four, which
is the value we are now using.
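
The block-based accounting can be sketched as below. The helper names are hypothetical, and plain ceiling division is used as one way to "round upwards"; the exact rounding used by the implementation is not stated here.

```c
#include <assert.h>

#define SKETCH_BLK_SZ 1024u    /* base unit from the description above */
#define SKETCH_TRUESIZE_RATIO 4u /* assumed worst-case truesize/msgsize ratio */

/* Number of 1024-byte blocks charged for a message (rounded upwards). */
static unsigned int sketch_msg_blocks(unsigned int msg_len)
{
    assert(msg_len > 0);
    return (msg_len + SKETCH_BLK_SZ - 1) / SKETCH_BLK_SZ;
}

/* Window to advertise, in blocks, for a given receive buffer size. */
static unsigned int sketch_adv_window_blocks(unsigned int rcvbuf_bytes)
{
    return rcvbuf_bytes / SKETCH_BLK_SZ / SKETCH_TRUESIZE_RATIO;
}
```

Under these assumptions a 2 MB (2 * 1024 * 1024 byte) receive buffer advertises a window of 512 blocks, and a 1025-byte message is charged two blocks.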

This way, we have been able to reduce the default receive buffer size
from 66 MB to 2 MB with maintained performance.

In order to keep this solution backwards compatible, we introduce a
new capability bit in the discovery protocol, and use this throughout
the message sending/reception path to always select the right unit.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 15:51:16 -04:00
Jon Paul Maloy
60020e1857 tipc: propagate peer node capabilities to socket layer
During neighbor discovery, nodes advertise their capabilities as a bit
map in a dedicated 16-bit field in the discovery message header. This
bit map has so far only been stored in the node structure on the peer
nodes, but we now see the need to keep a copy even in the socket
structure.

This commit adds this functionality.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 15:51:15 -04:00
Jon Paul Maloy
7c8bcfb125 tipc: re-enable compensation for socket receive buffer double counting
In the refactoring commit d570d86497ee ("tipc: enqueue arrived buffers
in socket in separate function") we did by accident replace the test

if (sk->sk_backlog.len == 0)
     atomic_set(&tsk->dupl_rcvcnt, 0);

with

if (sk->sk_backlog.len)
     atomic_set(&tsk->dupl_rcvcnt, 0);

This effectively disables the compensation we have for the double
receive buffer accounting that occurs temporarily when buffers are
moved from the backlog to the socket receive queue. Until now, this
has gone unnoticed because of the large receive buffer limits we are
applying, but becomes indispensable when we reduce this buffer limit
later in this series.

We now fix this by inverting the mentioned condition.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 15:51:14 -04:00
Oliver Neukum
2b84af94a3 rtl8152: correct speed testing
Allow for SS+ USB

Signed-off-by: Oliver Neukum <ONeukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 15:49:34 -04:00
Oliver Neukum
ea0798423c usbnet: correct speed testing
Allow for SS+ USB

Signed-off-by: Oliver Neukum <ONeukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 15:49:34 -04:00
Oliver Neukum
8caf115c72 brcm80211: correct speed testing
Allow for SS+ USB

Signed-off-by: Oliver Neukum <ONeukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 15:49:34 -04:00
Manish Chopra
c0f31a05f5 qed: Apply tunnel configurations after PF start
Configure and enable various tunnels on the
adapter after PF start.

This change was missed as a part of
'commit 464f664501816ef5fbbc00b8de96f4ae5a1c9325
("qed: Add infrastructure support for tunneling")'

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Yuval Mintz <yuval.mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 15:48:32 -04:00
Anna-Maria Gleixner
0e28bf93a2 net: mvneta: Remove superfluous SMP function call
Since commit 3b9d6da67e11 ("cpu/hotplug: Fix rollback during error-out
in __cpu_disable()") it is ensured that callbacks of CPU_ONLINE and
CPU_DOWN_PREPARE are processed on the hotplugged CPU. Due to this SMP
function calls are no longer required.

Replace smp_call_function_single() with a direct call to
mvneta_percpu_enable() or mvneta_percpu_disable(). The functions do
not require to be called with interrupts disabled, therefore the
smp_call_function_single() calling convention is not preserved.

Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 15:27:18 -04:00
David S. Miller
da7daf5b76 Merge branch 'stmmac-dwmac-socfpga-cleanup'
Joachim Eastwood says:

====================
stmmac: dwmac-socfpga refactor+cleanup

This patch aims to remove the init/exit callbacks from the dwmac-
socfpga driver and instead use standard PM callbacks. Doing this
will also allow us to cleanup the driver.

Eventually the init/exit callbacks will be deprecated and removed
from all drivers dwmac-* except for dwmac-generic. Drivers will be
refactored to use standard PM and remove callbacks.

This patch set should not change the behavior of the driver itself,
it only moves code around. The only exception to this is patch
number 4 which restores the resume callback behavior which was
changed in the "net: stmmac: socfpga: Remove re-registration of
reset controller" patch. I believe calling phy_resume() only
from the resume callback and not probe is the right thing to do.

Changes from v1:
 - Rebase on net-next

One heads-up here:
The first patch changes the prototype of a couple of
functions used in Alexandre's "add Ethernet glue logic for
stm32 chip" patch [1] and will cause build failures for
dwmac-stm32.c if not fixed up!
If Alexandre's patch set is applied first I will gladly
rebase my patch set to account for his driver as well.

[1] https://patchwork.ozlabs.org/patch/614405/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 15:22:21 -04:00
Joachim Eastwood
0f400a87dc stmmac: dwmac-socfpga: kill init() and rename setup() to set_phy_mode()
Remove old init callback which now contains only a call to
socfpga_dwmac_setup(). Also rename socfpga_dwmac_setup() to indicate
what the function really does.

Signed-off-by: Joachim Eastwood <manabian@gmail.com>
Tested-by: Marek Vasut <marex@denx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 15:22:20 -04:00
Joachim Eastwood
5373724724 stmmac: dwmac-socfpga: call phy_resume() only in resume callback
Calling phy_resume() should only be needed during driver resume to
work around a hardware erratum.

Signed-off-by: Joachim Eastwood <manabian@gmail.com>
Tested-by: Marek Vasut <marex@denx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 15:22:20 -04:00
Joachim Eastwood
70cb136f77 stmmac: dwmac-socfpga: keep a copy of stmmac_rst in driver priv data
The dwmac-socfpga driver needs to control the reset usually managed
by the core driver to set the PHY mode. Take a copy of the reset
handle from core priv data so it can be used by the driver later.

This also allows us to move reset handling into socfpga_dwmac_setup()
where the code that needs it is located.

Signed-off-by: Joachim Eastwood <manabian@gmail.com>
Tested-by: Marek Vasut <marex@denx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 15:22:20 -04:00
Joachim Eastwood
56868deece stmmac: dwmac-socfpga: add PM ops and resume function
Implement the needed PM callbacks in the driver instead of
relying on the init/exit hooks in stmmac_platform. This gives
the driver more flexibility in how the code is organized.

Eventually the init/exit callbacks will be deprecated in favor
of the standard PM callbacks and driver remove function.

Signed-off-by: Joachim Eastwood <manabian@gmail.com>
Tested-by: Marek Vasut <marex@denx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 15:22:19 -04:00
Joachim Eastwood
f4e7bd81b1 stmmac: let remove/resume/suspend functions take device pointer
Change stmmac_remove/resume/suspend to take a device pointer so
they can be used directly by drivers that don't need to perform
anything device specific.

This lets us remove the PCI pm functions and later simplify the
platform drivers.

Signed-off-by: Joachim Eastwood <manabian@gmail.com>
Tested-by: Marek Vasut <marex@denx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 15:22:19 -04:00
Sergei Shtylyov
ce24c2b8a9 macb: fix mdiobus_scan() error check
Now mdiobus_scan() returns ERR_PTR(-ENODEV) instead of NULL if the PHY
device ID was read as all ones. As this was not an error before, this
value should be filtered out now in this driver.

Fixes: b74766a0a0fe ("phylib: don't return NULL from get_phy_device()")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 15:03:09 -04:00
Sergei Shtylyov
6dd7454258 pxa168_eth: fix mdiobus_scan() error check
Since mdiobus_scan() returns either an error code or NULL on error, the
driver should check for both, not only for NULL, otherwise a crash is
imminent...
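
The pointer-or-error convention can be modelled in user space as below. These helpers are hypothetical re-creations for illustration and are not the kernel's ERR_PTR()/IS_ERR() macros, although they follow the same idea of encoding small negative errno values in the top of the address range.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static void *sketch_err_ptr(long err)
{
    assert(err < 0 && err >= -SKETCH_MAX_ERRNO);
    return (void *)(intptr_t)err;
}

/* True if the pointer lies in the range reserved for error codes. */
static bool sketch_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* The check a caller needs when a function may return either form. */
static bool sketch_is_err_or_null(const void *p)
{
    return p == NULL || sketch_is_err(p);
}
```

A driver that tests only for NULL would treat an encoded `-ENODEV` as a valid device pointer and dereference it, which is the crash the commit message warns about.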

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 15:03:08 -04:00
Sven Eckelmann
64ae744553 batman-adv: Split batadv_iv_ogm_orig_del_if function
batadv_iv_ogm_orig_del_if handles two different buffers bcast_own and
bcast_own_sum which should be resized. The error handling for allocating
these two buffers causes the complexity of this function. This can
be avoided completely when the function is split into a main function
handling the locking, freeing and call of the subfunctions.

The subfunction can then independently handle the resize of the buffers.
This also makes it easy to reuse the old buffer (which always is larger) in
case a smaller buffer could not be allocated without increasing the code
complexity.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
2016-05-04 02:22:03 +08:00
Simon Wunderlich
86de37c1fb batman-adv: Merge batadv_v_ogm_orig_update into batadv_v_ogm_route_update
Since batadv_v_ogm_orig_update() was only called from one place and the
calling function became very short, merge these two functions together.

This should also reflect the protocol description of B.A.T.M.A.N. V
better.

Signed-off-by: Simon Wunderlich <simon@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
2016-05-04 02:22:03 +08:00
Simon Wunderlich
efcc9d3069 batman-adv: move and restructure batadv_v_ogm_forward
To match our code better to the protocol description of B.A.T.M.A.N. V,
move batadv_v_ogm_forward() out into batadv_v_ogm_process_per_outif()
and move all checks directly deciding whether the OGM should be
forwarded into batadv_v_ogm_forward().

Signed-off-by: Simon Wunderlich <simon@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
2016-05-04 02:22:03 +08:00