4103 Commits

Author SHA1 Message Date
Sagi Grimberg
78eda2bb65 IB/mlx5, iser, isert: Add Signature API additions
Expose more signature setting parameters. We modify the signature API
to allow usage of some new execution parameters relevant to data
integrity feature.

This patch modifies ib_sig_domain structure by:

- Deprecate DIF type in signature API (operation will
  be determined by the parameters alone, no DIF type awareness)
- Add APPTAG check bitmask (for input domain)
- Add REFTAG remap (increment) flag for each domain
- Add APPTAG/REFTAG escape options for each domain

The mlx5 driver is modified to follow the new parameters in HW
signature setup.

At the moment the callers (iser/isert) hard-code new parameters (by
DIF type). In the future, callers will retrieve them from the scsi
command structure.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:10:53 -07:00
Sagi Grimberg
3d73cf1a2a Target/iser: Centralize ib_sig_domain setting
Later there will be more parameters to set, so we want to do it in a
centralized place.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:10:53 -07:00
Sagi Grimberg
92792c0a19 IB/iser: Centralize ib_sig_domain settings
Later there will be more parameters to set, so we want to do it in a
centralized place.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:10:53 -07:00
Sagi Grimberg
142537f4e5 IB/mlx5: Use extended internal signature layout
Rather than using the basic BSF layout which utilizes a pre-configured
signature settings (sufficient for current DIF implementation), we use
the extended BSF layout to expose advanced signature settings. These
settings will also be exposed to the user later.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:10:53 -07:00
Sagi Grimberg
f043032ef1 IB/iser: Set IP_CSUM as default guard type
In the future this will be a per-command parameter so we can lose it,
but in the mean time IP_CSUM is a lot lighter for SW layers to
compute, set it as default.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:10:53 -07:00
Sagi Grimberg
6f5f8a016e IB/iser: Remove redundant assignment
We clear the struct before - no need to do 0 assignment.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:10:53 -07:00
Sagi Grimberg
fd22f78cf7 IB/mlx5: Use enumerations for PI copy mask
In case input and output space parameters match, we can use a copy
mask from input and output space.  Use enums for those.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:10:53 -07:00
Or Gerlitz
b261aeafe1 IB/iser: Bump version, add maintainer
Update the driver version and add Sagi Grimberg as maintainer

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:46 -07:00
Sagi Grimberg
dc05ac36f7 IB/iser: Fix/add kernel-doc style description in iscsi_iser.c
This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:46 -07:00
Sagi Grimberg
cd88621a9e IB/iser: Add/Fix kernel doc style descriptions in iscsi_iser.h
- iser_hdr
- iser_data_buf
- iser_mem_reg
- iser_regd_buf
- iser_tx_desc
- iser_rx_desc
- iser_device
- iser_pi_context
- iser_conn
- ib_conn
- iser_comp
- iscsi_iser_task
- iser_global

While we're at it, change nit alignments in this file

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:46 -07:00
Sagi Grimberg
e9d49b82f1 IB/iser: Nit - add space after __func__ in iser logging
Change logging: "iser:XXXX" to "iser: XXXX"

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:46 -07:00
Ariel Nahum
bba0a3c9d7 IB/iser: Change iscsi_conn_stop log level to info
Match to the debug level of all functions in connect/disconnect flows.

Signed-off-by: Ariel Nahum <arieln@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:42 -07:00
Sagi Grimberg
6df5a128f0 IB/iser: Suppress scsi command send completions
Singal completion of every 32 scsi commands and suppress all the rest.
We don't do anything upon getting the completion so no need to "just
consume" it.  Cleanup of scsi command is done in cleanup_task callback.

Still keep dataout and control send completions as we may need to
cleanup there. This helps reducing the amount of interrupts/completions
in the IO path.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:07 -07:00
Sagi Grimberg
6e6fe2fb1d IB/iser: Optimize completion polling
Poll in batch of 16. Since we don't want it on the stack, keep under
iser completion context (iser_comp).

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:07 -07:00
Sagi Grimberg
ff3dd52d26 IB/iser: Use beacon to indicate all completions were consumed
Avoid post_send counting (atomic) in the IO path just to keep track of
how many completions we need to consume.  Use a beacon post to indicate
that all prior posts completed.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:07 -07:00
Sagi Grimberg
6aabfa76f5 IB/iser: Use single CQ for RX and TX
This will solve a possible condition where we might miss TX completion
(flush error) during session teardown.  Since we are using a single
CQ, we don't need to actively drain the TX CQ, instead just wait for
flush_completion (when counters reach zero) and remove iser_poll_for_flush_errors().

This patch might introduce a minor performance regression on its own,
but the next patches will enhance performance using a single CQ for RX
and TX.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:07 -07:00
Sagi Grimberg
183cfa434e IB/iser: Use internal polling budget to avoid possible live-lock
We need a way to guarentee that we don't stay in soft-IRQ context for
too long.  We might starve other pending CQ tasklets or worse lock
against application trying to issue IO on the running CPU.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Ariel Nahum <arieln@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:06 -07:00
Sagi Grimberg
bf17554035 IB/iser: Centralize iser completion contexts
Introduce iser_comp which centralizes all iser completion related
items and is referenced by iser_device and each ib_conn.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:06 -07:00
Ariel Nahum
aea8f4df6d IB/iser: Use iser_warn instead of BUG_ON in iser_conn_release
In case iscsid was violently killed (SIGKILL) during its error
recovery stage, we may never get a connection teardown sequence for
some of the old connections.  No harm done, but when we try to unload
the module we will need to cleanup all these connections.  So we
actually may end-up here - so it's not a BUG_ON(), just give a relaxed
warning that this happened and continue with normal unload.  BUG_ON()
will cause segfault on module_exit and we don't want that.

Signed-off-by: Ariel Nahum <arieln@mellanox.com>
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:06 -07:00
Sagi Grimberg
8c204e69ce IB/iser: Signal iSCSI layer that transport is broken in error completions
Previously we notified iscsi layer about the connection layer when
we consumed all of our flush errors. This was racy as there
was no guarentee that iscsi_conn wasn't terminated by then (which ends
up in an invalid memory access). In case we got a non FLUSH error
completion, we are guarenteed that iscsi_conn is still alive. We should
notify iSCSI layer with iscsi_conn_failure to initiate error handling.

While we are at it, add a nice kernel-doc style documentation.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Ariel Nahum <arieln@mellanox.com>
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:06 -07:00
Sagi Grimberg
3a940daf6f IB/iser: Protect tasks cleanup in case IB device was already released
Bailout in case a task cleanup (iscsi_iser_cleanup_task) is called
after the IB device was removed (DEVICE_REMOVAL CM event).  We also
call iscsi_conn_stop with a lock taken to prevent DEVICE_REMOVAL and
tasks cleanup from racing.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Ariel Nahum <arieln@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:06 -07:00
Ariel Nahum
ec370e2b63 IB/iser: Unbind at conn_stop stage
Previously we didn't need to unbind the iser_conn and iscsi_conn since
we always relied on iscsi daemon to teardown the connection and never
let it finish before we cleanup all that is needed in iser.  This is
not the case anymore (for DEVICE_REMOVAL event).  So avoid any possible
chance we cause iscsi_conn dereference after iscsi_conn was freed.

We also call iser_conn_terminate (safe to call multiple times) just
for the corner case of iscsi daemon stopping an old connection before
invoking endpoint removal (might happen if it was violently killed).

Notice we are unbinding under a lock - which is required.

Signed-off-by: Ariel Nahum <arieln@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:06 -07:00
Sagi Grimberg
c107a6c0cf IB/iser: Don't bound release_work completions timeouts
We no longer rely on iscsi connection teardown sequence, so no need to
give a grace period and continue cleanup if it expired. Have
iser_conn_release wait for full completion before freeing iser_conn.

ib_completion:
	Guaranteed to come when:
	    - Got DISCONNECTED/ADDR_CHANGE event or
	    - iSCSI called ep_disconnect/conn_stop
	Guaranteed to finish when:
	    - Got TIMEWAIT_EXIT/DEVICE_REMOVAL event
	    - All Flush errors are consumed
	    - IB related resources are destroyed

stop_completion:
	Guaranteed to come when:
	    - iSCSI calls conn_stop
	Guaranteed to finish when:
	    - All inflight tasks were cleaned up

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Ariel Nahum <arieln@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:06 -07:00
Sagi Grimberg
c47a3c9ed5 IB/iser: Fix DEVICE REMOVAL handling in the absence of iscsi daemon
iscsi daemon is in user-space, thus we can't rely on it to be invoked
at connection teardown (if not running or does not receive CPU time).

This patch addresses the issue by re-structuring iSER connection
teardown logic and CM events handling.

The CM events will dictate the RDMA resources destruction (ib_conn)
and iser_conn is kept around as long as iscsi_conn is left around
allowing iscsi/iser callbacks to continue after RDMA transport was
destroyed.

This patch introduces a separation in logic when handling CM events:

- DISCONNECTED_HANDLER, ADDR_CHANGED
  This events indicate the start of teardown process.
  Actions:
  1. Terminate the connection: rdma_disconnect (send DREQ/DREP)
  2. Notify iSCSI of connection failure
  3. Change state to TERMINATING
  4. Poll for all flush errors to be consumed

- TIMEWAIT_EXIT, DEVICE_REMOVAL
  These events indicate the final stage of termination process and
  we can free RDMA related resources.
  Actions:
  1. Call disconnected handler (we are not guaranteed that DISCONNECTED
     event was invoked in the past)
  2. Cleanup RDMA related resources
  3. For DEVICE_REMOVAL return non-zero rc from cma_handler to
     implicitly destroy the cm_id (Can't rely on user-space, make sure
     we have forward progress)

We replace flush_completion (indicate all flushes were consumed) with
ib_completion (rdma resources were cleaned up).

The iser_conn_release_work will wait for teardown completions:

- conn_stop was completed (tasks were cleaned-up) - stop_completion
- RDMA resources were destroyed - ib_completion

And then will continue to free iser connection representation (iser_conn).

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Ariel Nahum <arieln@mellanox.com>
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:06 -07:00
Sagi Grimberg
96f15198c1 IB/iser: Extend iser_free_ib_conn_res()
Put all connection IB related resources release in this routine.  One
exception is the cm_id which cannot be destroyed as the routine is
protected by the state mutex.  Also move its position to avoid forward
declaration.  While at it fix qp NULL assignment.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Ariel Nahum <arieln@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:06 -07:00
Roi Dayan
6bb0279f95 IB/iser: Remove unused variables and dead code
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:06 -07:00
Sagi Grimberg
a4ee3539f6 IB/iser: Re-introduce ib_conn
Structure that describes the RDMA relates connection objects.  Static
member of iser_conn.

This patch does not change any functionality

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:06 -07:00
Sagi Grimberg
5716af6e52 IB/iser: Rename ib_conn -> iser_conn
Two reasons why we choose to do this:

1. No point today calling struct iser_conn by another name ib_conn
2. In the next patches we will restructure iser control plane representation
   - struct iser_conn: connection logical representation
   - struct ib_conn: connection RDMA layout representation

This patch does not change any functionality.

Signed-off-by: Ariel Nahum <arieln@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-10-09 00:06:06 -07:00
Linus Torvalds
452b6361c4 Last late set of InfiniBand/RDMA fixes for 3.17:
- Fixes for the new memory region re-registration support
  - iSER initiator error path fixes
  - Grab bag of small fixes for the qib and ocrdma hardware drivers
  - Larger set of fixes for mlx4, especially in RoCE mode
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJUIexdAAoJEENa44ZhAt0hP10QAJztxlS2a8U3JCJzthwSYxlI
 ohT9487iLk1uEcj4Z3i7w2ERRUzXaHbRTktNHFjwfRb8x2qMUgT2PfD6/30sQ250
 nJAk3FRFNipxKkJSfmcc3+O4r91i4F+CaN8DGypaBDHcupeD2drKocl/Iu5MIvkG
 e5CzLlS7i/xrWKmgYP4bIqqFZsqQ+2rJrYBDybuLZSaZNd0PTDE3yCDihfOcsxjn
 TeOCVbm5895fPRtxzeCGHy8bXbYYN9vItuhtHC+sntYtbhNJhjpmP+1yD6M2SoZR
 34sGd7AA1j1H6ATmanzeW2aALkFYPIuGihDbbnRQlDG1v09lEPfP2GtfLxoQ9Ibo
 nfe2rsthzV6Qh2xcXjn6KicgV7bb6aSUXEK24zKx7O3MkOvHkOC/JIIrd9dFe+uj
 R7pUd3XlAk8SBhTQ4gLub06Dl7ynzSRArwcdMTHp30LvtnjJZoQR67WGGrsdwlIW
 MV43105i7iLCcdaSd0ihKnR6OFlSh13Z0wpu+B386bwxkHxjFJXkVHxOJir/iAk9
 cW4RXbA/ic7nwIjes4GbMNDOvdJO2tDcg9KGSgiDY3kC5GksPqfxXYVDlMB2rFoE
 PhfQ8TOcbZYTmlcKLMpMIFXP484VPhWQJeYWPOf9KGS6aW5QRNPsPCmAvaoSXWLs
 GVSlvjbE6O7MgonqG1Jh
 =Kpm1
 -----END PGP SIGNATURE-----

Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

Pull infiniband/rdma fixes from Roland Dreier:
 "Last late set of InfiniBand/RDMA fixes for 3.17:

   - fixes for the new memory region re-registration support
   - iSER initiator error path fixes
   - grab bag of small fixes for the qib and ocrdma hardware drivers
   - larger set of fixes for mlx4, especially in RoCE mode"

* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (26 commits)
  IB/mlx4: Fix VF mac handling in RoCE
  IB/mlx4: Do not allow APM under RoCE
  IB/mlx4: Don't update QP1 in native mode
  IB/mlx4: Avoid accessing netdevice when building RoCE qp1 header
  mlx4: Fix mlx4 reg/unreg mac to work properly with 0-mac addresses
  IB/core: When marshaling uverbs path, clear unused fields
  IB/mlx4: Avoid executing gid task when device is being removed
  IB/mlx4: Fix lockdep splat for the iboe lock
  IB/mlx4: Get upper dev addresses as RoCE GIDs when port comes up
  IB/mlx4: Reorder steps in RoCE GID table initialization
  IB/mlx4: Don't duplicate the default RoCE GID
  IB/mlx4: Avoid null pointer dereference in mlx4_ib_scan_netdevs()
  IB/iser: Bump version to 1.4.1
  IB/iser: Allow bind only when connection state is UP
  IB/iser: Fix RX/TX CQ resource leak on error flow
  RDMA/ocrdma: Use right macro in query AH
  RDMA/ocrdma: Resolve L2 address when creating user AH
  mlx4: Correct error flows in rereg_mr
  IB/qib: Correct reference counting in debugfs qp_stats
  IPoIB: Remove unnecessary port query
  ...
2014-09-23 16:47:34 -07:00
Linus Torvalds
98f75b8291 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) If the user gives us a msg_namelen of 0, don't try to interpret
    anything pointed to by msg_name.  From Ani Sinha.

 2) Fix some bnx2i/bnx2fc randconfig compilation errors.

    The gist of the issue is that we firstly have drivers that span both
    SCSI and networking.  And at the top of that chain of dependencies
    we have things like SCSI_FC_ATTRS and SCSI_NETLINK which are
    selected.

    But since select is a sledgehammer and ignores dependencies,
    everything to select's SCSI_FC_ATTRS and/or SCSI_NETLINK has to also
    explicitly select their dependencies and so on and so forth.

    Generally speaking 'select' is supposed to only be used for child
    nodes, those which have no dependencies of their own.  And this
    whole chain of dependencies in the scsi layer violates that rather
    strongly.

    So just make SCSI_NETLINK depend upon it's dependencies, and so on
    and so forth for the things selecting it (either directly or
    indirectly).

    From Anish Bhatt and Randy Dunlap.

 3) Fix generation of blackhole routes in IPSEC, from Steffen Klassert.

 4) Actually notice netdev feature changes in rtl_open() code, from
    Hayes Wang.

 5) Fix divide by zero in bond enslaving, from Nikolay Aleksandrov.

 6) Missing memory barrier in sunvnet driver, from David Stevens.

 7) Don't leave anycast addresses around when ipv6 interface is
    destroyed, from Sabrina Dubroca.

 8) Don't call efx_{arch}_filter_sync_rx_mode before addr_list_lock is
    initialized in SFC driver, from Edward Cree.

 9) Fix missing DMA error checking in 3c59x, from Neal Horman.

10) Openvswitch doesn't emit OVS_FLOW_CMD_NEW notifications accidently,
    fix from Samuel Gauthier.

11) pch_gbe needs to select NET_PTP_CLASSIFY otherwise we can get a
    build error.

12) Fix macvlan regression wherein we stopped emitting
    broadcast/multicast frames over software devices.  From Nicolas
    Dichtel.

13) Fix infiniband bug due to unintended overflow of skb->cb[], from
    Eric Dumazet.  And add an assertion so this doesn't happen again.

14) dm9000_parse_dt() should return error pointers, not NULL.  From
    Tobias Klauser.

15) IP tunneling code uses this_cpu_ptr() in preemptible contexts, fix
    from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (87 commits)
  net: bcmgenet: call bcmgenet_dma_teardown in bcmgenet_fini_dma
  net: bcmgenet: fix TX reclaim accounting for fragments
  ipv4: do not use this_cpu_ptr() in preemptible context
  dm9000: Return an ERR_PTR() in all error conditions of dm9000_parse_dt()
  r8169: fix an if condition
  r8152: disable ALDPS
  ipoib: validate struct ipoib_cb size
  net: sched: shrink struct qdisc_skb_cb to 28 bytes
  tg3: Work around HW/FW limitations with vlan encapsulated frames
  macvlan: allow to enqueue broadcast pkt on virtual device
  pch_gbe: 'select' NET_PTP_CLASSIFY.
  scsi: Use 'depends' with LIBFC instead of 'select'.
  openvswitch: restore OVS_FLOW_CMD_NEW notifications
  genetlink: add function genl_has_listeners()
  lib: rhashtable: remove second linux/log2.h inclusion
  net: allow macvlans to move to net namespace
  3c59x: Fix bad offset spec in skb_frag_dma_map
  3c59x: Add dma error checking and recovery
  sparc: bpf_jit: fix support for ldx/stx mem and SKF_AD_VLAN_TAG
  can: at91_can: add missing prepare and unprepare of the clock
  ...
2014-09-22 18:23:33 -07:00
Eric Dumazet
b49fe36208 ipoib: validate struct ipoib_cb size
To catch future errors sooner.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-22 14:29:55 -04:00
Roland Dreier
3bdad2d13f Merge branches 'core', 'ipoib', 'iser', 'mlx4', 'ocrdma' and 'qib' into for-next 2014-09-22 10:05:40 -07:00
Jack Morgenstein
25476b0209 IB/mlx4: Fix VF mac handling in RoCE
We had several problems here.  First, a race condition on QP1 mac
handling between mlx4_ib_update_qps and mlx4_ib_modify_qp, which is
fixed by taking the qp mutex in mlx4_ib_update_qps.

Also, qp->pri.smac_port was not updated in mlx4_ib_update_qps.

Last, in __mlx4_ib_modify_qp we did not properly handle the case where
the mac is zero, but port is non-zero.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-09-22 09:46:53 -07:00
Jack Morgenstein
3dec487888 IB/mlx4: Do not allow APM under RoCE
Automatic Path Migration is not supported under RoCE. Therefore,
return a "not-supported" error if the caller attempts to set an
alternate path in a QP context.

In addition, if there are no IB ports configured, do not report
APM capability in the device flags returned by mlx4_ib_query_device.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-09-22 09:46:53 -07:00
Jack Morgenstein
d24d9f4338 IB/mlx4: Don't update QP1 in native mode
For native functions (non-SR-IOV), there's no reason to update
the smac_index, as QP1 is a GSI QP.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-09-22 09:46:53 -07:00
Jack Morgenstein
3e0629cb6c IB/mlx4: Avoid accessing netdevice when building RoCE qp1 header
The source MAC is needed in RoCE when building the QP1 header.

Currently, this is obtained from the source net device. However, the net
device may not yet exist, or can be destroyed in parallel to this QP1 send
operation (e.g through the VPI port change flow) so accessing it may cause
a kernel crash.

To fix this, we maintain a source MAC cache per port for the net device in
struct mlx4_ib_roce.  This cached MAC is initialized to be the default MAC
address obtained during HCA initialization via QUERY_PORT. This cached MAC
is updated via the netdev event notifier handler.

Since the cached MAC is held in an atomic64 object, we do not need locking
when accessing it.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-09-22 09:46:53 -07:00
Matan Barak
a59c5850f0 IB/core: When marshaling uverbs path, clear unused fields
When marsheling a user path to the kernel struct ib_sa_path, need
to zero smac, dmac and set the vlan id to the "no vlan" value.

Fixes: dd5f03beb4f7 ("IB/core: Ethernet L2 attributes in verbs/cm structures")
Reported-by: Aleksey Senin <alekseys@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-09-22 09:46:52 -07:00
Moni Shoua
4bf9715f18 IB/mlx4: Avoid executing gid task when device is being removed
When device is being removed (e.g during VPI port link type change
from ETH to IB), tasks for gid table changes should not be executed.

Flush the current queue of tasks and block further tasks from entering the queue.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-09-22 09:46:52 -07:00
Jack Morgenstein
dba3ad2add IB/mlx4: Fix lockdep splat for the iboe lock
Chuck Lever reported the following stack trace:

    =================================
    [ INFO: inconsistent lock state ]
    3.16.0-rc2-00024-g2e78883 #17 Tainted: G            E
    ---------------------------------
    inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
    swapper/0/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
    (&(&iboe->lock)->rlock){+.?...}, at: [<ffffffffa065f68b>] mlx4_ib_addr_event+0xdb/0x1a0 [mlx4_ib]
    {SOFTIRQ-ON-W} state was registered at:
     [<ffffffff810b3110>] mark_irqflags+0x110/0x170
     [<ffffffff810b4806>] __lock_acquire+0x2c6/0x5b0
     [<ffffffff810b4bd9>] lock_acquire+0xe9/0x120
     [<ffffffff815f7f6e>] _raw_spin_lock+0x3e/0x80
     [<ffffffffa0661084>] mlx4_ib_scan_netdevs+0x34/0x260 [mlx4_ib]
     [<ffffffffa06612db>] mlx4_ib_netdev_event+0x2b/0x40 [mlx4_ib]
     [<ffffffff81522219>] register_netdevice_notifier+0x99/0x1e0
     [<ffffffffa06626e3>] mlx4_ib_add+0x743/0xbc0 [mlx4_ib]
     [<ffffffffa05ec168>] mlx4_add_device+0x48/0xa0 [mlx4_core]
     [<ffffffffa05ec2c3>] mlx4_register_interface+0x73/0xb0 [mlx4_core]
     [<ffffffffa05c505e>] cm_req_handler+0x13e/0x460 [ib_cm]
     [<ffffffff810002e2>] do_one_initcall+0x112/0x1c0
     [<ffffffff810e8264>] do_init_module+0x34/0x190
     [<ffffffff810ea62f>] load_module+0x5cf/0x740
     [<ffffffff810ea939>] SyS_init_module+0x99/0xd0
     [<ffffffff815f8fd2>] system_call_fastpath+0x16/0x1b
    irq event stamp: 336142
    hardirqs last  enabled at (336142): [<ffffffff810612f5>] __local_bh_enable_ip+0xb5/0xc0
    hardirqs last disabled at (336141): [<ffffffff81061296>] __local_bh_enable_ip+0x56/0xc0
    softirqs last  enabled at (336004): [<ffffffff8106123a>] _local_bh_enable+0x4a/0x50
    softirqs last disabled at (336005): [<ffffffff810617a4>] irq_exit+0x44/0xd0

    other info that might help us debug this:
    Possible unsafe locking scenario:

          CPU0
          ----
     lock(&(&iboe->lock)->rlock);
     <Interrupt>
       lock(&(&iboe->lock)->rlock);

    *** DEADLOCK ***

The above problem was caused by the spin lock being taken both in the process
context and in a soft-irq context (in a netdev notifier handler).

The required fix is to use spin_lock/unlock_bh() instead of spin_lock/unlock
on the iboe lock.

Reported-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-09-22 09:46:52 -07:00
Moni Shoua
bccb84f1df IB/mlx4: Get upper dev addresses as RoCE GIDs when port comes up
When a RoCE port becomes active and the netdev of the port has upper
device (e.g bond/team), GIDs derived from the upper dev should appear
in the port's RoCE GID table.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-09-22 09:46:52 -07:00
Moni Shoua
655b2aaefc IB/mlx4: Reorder steps in RoCE GID table initialization
There's no need to reset the gid table twice and we need to do it only
for Ethernet ports. Also, no need to actively scan ndetdevs since it's
being done immediatly after we register netdev notifiers.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-09-22 09:46:52 -07:00
Moni Shoua
f5c4834d93 IB/mlx4: Don't duplicate the default RoCE GID
When reading the IPv6 addresses from the net-device, make sure to
avoid adding a duplicate entry to the GID table because of equality
between the default GID we generate and the default IPv6 link-local
address of the device.

Fixes: acc4fccf4eff ("IB/mlx4: Make sure GID index 0 is always occupied")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-09-22 09:46:52 -07:00
Moni Shoua
e381835cf1 IB/mlx4: Avoid null pointer dereference in mlx4_ib_scan_netdevs()
When Ethernet netdev is not present for a port (e.g. when the link
layer type of the port is InfiniBand) it's possible to dereference a
null pointer when we do netdevice scanning.

To fix that, we move a section of code that needs to run only when
netdev is present to a proper if () statement.

Fixes: ad4885d279b6 ("IB/mlx4: Build the port IBoE GID table properly under bonding")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-09-22 09:46:52 -07:00
Or Gerlitz
61aabb3c91 IB/iser: Bump version to 1.4.1
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-09-22 09:46:42 -07:00
Sagi Grimberg
91eb1df39a IB/iser: Allow bind only when connection state is UP
We need to fail the bind operation if the iser connection state != UP
(started teardown) and this should be done under the state lock.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-09-22 09:46:42 -07:00
Roi Dayan
c33b15f00b IB/iser: Fix RX/TX CQ resource leak on error flow
When failing to allocate TX CQ we already allocated RX CQ, so we need to make
sure we release it. Also, when failing to register notification to the RX CQ
we currently leak both RX and TX CQs of the current index, fix that too.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-09-22 09:46:42 -07:00
devesh.sharma@emulex.com
f0c2c225df RDMA/ocrdma: Use right macro in query AH
ocrdma_query_ah() does not use correct macro, and checks the wrong bit
for the validity of address handle in vector table.  Fix this.

Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-09-22 09:37:43 -07:00
devesh.sharma@emulex.com
1be528bcb8 RDMA/ocrdma: Resolve L2 address when creating user AH
Because of IP-based GIDs, userspace AHs must have MAC and VLAN ID
resolved separately.  Presently, user AHs are broken for ocrdma.  This
patch resolves L2 addresses while creating user AH and obtains the
right DMAC and VLAN ID before creating AH.

Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-09-22 09:37:42 -07:00
Matan Barak
4ff0acca73 mlx4: Correct error flows in rereg_mr
This patch addresses feedback from Sagi Grimberg on the rereg_mr
implementation of mlx4.  The following are fixed:

1. Set the correct pd_flags
2. Make sure we change the iova and size MR fields only after
   successful write and allocation of the MTTs.
3. Make the error checking more robust

Fixes: e630664c8383 ("mlx4_core: Add helper functions to support MR re-registration")
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-09-22 08:47:47 -07:00
Mike Marciniszyn
85cbb7c728 IB/qib: Correct reference counting in debugfs qp_stats
This particular reference count is not needed with the rcu protection,
and the current code leaks a reference count, causing a hang in
qib_qp_destroy().

Cc: <stable@vger.kernel.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-09-19 10:18:32 -07:00