4594 Commits

Author SHA1 Message Date
Doug Ledford
efc82eeeae IB/ipoib: No longer use flush as a parameter
Various places in the IPoIB code had a deadlock related to flushing
the ipoib workqueue.  Now that we have per device workqueues and a
specific flush workqueue, there is no longer a deadlock issue with
flushing the device specific workqueues and we can do so unilaterally.

Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:06:18 -04:00
Doug Ledford
0b39578bcd IB/ipoib: Use dedicated workqueues per interface
During my recent work on the rtnl lock deadlock in the IPoIB driver, I
saw that even once I fixed the apparent races for a single device, as
soon as that device had any children, new races popped up.  It turns
out that this is because no matter how well we protect against races
on a single device, the fact that all devices use the same workqueue,
and flush_workqueue() flushes *everything* from that workqueue means
that we would also have to prevent all races between different devices
(for instance, ipoib_mcast_restart_task on interface ib0 can race with
ipoib_mcast_flush_dev on interface ib0.8002, resulting in a deadlock on
the rtnl_lock).

There are several possible solutions to this problem:

Make carrier_on_task and mcast_restart_task try to take the rtnl for
some set period of time and if they fail, then bail.  This runs the
real risk of dropping work on the floor, which can end up being its
own separate kind of deadlock.

Set some global flag in the driver that says some device is in the
middle of going down, letting all tasks know to bail.  Again, this can
drop work on the floor.

Or the method this patch attempts to use, which is when we bring an
interface up, create a workqueue specifically for that interface, so
that when we take it back down, we are flushing only those tasks
associated with our interface.  In addition, keep the global
workqueue, but now limit it to only flush tasks.  In this way, the
flush tasks can always flush the device specific work queues without
having deadlock issues.

Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:06:18 -04:00
Doug Ledford
894021a752 IB/ipoib: Make the carrier_on_task race aware
We blindly assume that we can just take the rtnl lock and that will
prevent races with downing this interface.  Unfortunately, that's not
the case.  In ipoib_mcast_stop_thread() we will call flush_workqueue()
in an attempt to clear out all remaining instances of ipoib_join_task.
But, since this task is put on the same workqueue as the join task,
the flush_workqueue waits on this thread too.  But this thread is
deadlocked on the rtnl lock.  The better thing here is to use trylock
and loop on that until we either get the lock or we see that
FLAG_OPER_UP has been cleared, in which case we don't need to do
anything anyway and we just return.

While investigating which flag should be used, FLAG_ADMIN_UP or
FLAG_OPER_UP, it was determined that FLAG_OPER_UP was the more
appropriate flag to use.  However, there was a mix of these two flags in
use in the existing code.  So while we check for that flag here as part
of this race fix, also cleanup the two places that had used the less
appropriate flag for their tests.

Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:06:17 -04:00
Doug Ledford
c84ca6d2b1 IB/ipoib: Consolidate rtnl_lock tasks in workqueue
The ipoib_mcast_flush_dev routine is called with the rtnl_lock held and
needs to keep it held.  It also needs to call flush_workqueue() to flush
out any outstanding work.  In the past, we've had to try and make sure
that we didn't flush out any outstanding join completions because they
also wanted to grab rtnl_lock() and that would deadlock.  It turns out
that the only thing in the join completion handler that needs this lock
can be safely moved to our carrier_on_task, thereby reducing the
potential for the join completion code and the flush code to deadlock
against each other.

Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:06:17 -04:00
Doug Ledford
be7aa663fc IB/ipoib: change init sequence ordering
In preparation for using per device work queues, we need to move the
start of the neighbor thread task to after ipoib_ib_dev_init and move
the destruction of the neighbor task to before ipoib_ib_dev_cleanup.
Otherwise we will end up freeing our workqueue with work possibly
still on it.

Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:06:17 -04:00
Doug Ledford
e135106fac IB/ipoib: factor out ah flushing
Create a an ipoib_flush_ah and ipoib_stop_ah routines to use at
appropriate times to flush out all remaining ah entries before we shut
the device down.

Because neighbors and mcast entries can each have a reference on any
given ah, we must make sure to free all of those first before our ah
will actually have a 0 refcount and be able to be reaped.

This factoring is needed in preparation for having per-device work
queues.  The original per-device workqueue code resulted in the following
error message:

<ibdev>: ib_dealloc_pd failed

That error was tracked down to this issue.  With the changes to which
workqueues were flushed when, there were no flushes of the per device
workqueue after the last ah's were freed, resulting in an attempt to
dealloc the pd with outstanding resources still allocated.  This code
puts the explicit flushes in the needed places to avoid that problem.

Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:06:17 -04:00
Yann Droneaud
66578b0b2f IB/core: don't disallow registering region starting at 0x0
In a call to ib_umem_get(), if address is 0x0 and size is
already page aligned, check added in commit 8494057ab5e4
("IB/uverbs: Prevent integer overflow in ib_umem_get address
arithmetic") will refuse to register a memory region that
could otherwise be valid (provided vm.mmap_min_addr sysctl
and mmap_low_allowed SELinux knobs allow userspace to map
something at address 0x0).

This patch allows back such registration: ib_umem_get()
should probably don't care of the base address provided it
can be pinned with get_user_pages().

There's two possible overflows, in (addr + size) and in
PAGE_ALIGN(addr + size), this patch keep ensuring none
of them happen while allowing to pin memory at address
0x0. Anyway, the case of size equal 0 is no more (partially)
handled as 0-length memory region are disallowed by an
earlier check.

Link: http://mid.gmane.org/cover.1428929103.git.ydroneaud@opteya.com
Cc: <stable@vger.kernel.org> # 8494057ab5e4 ("IB/uverbs: Prevent integer overflow in ib_umem_get address arithmetic")
Cc: Shachar Raindel <raindel@mellanox.com>
Cc: Jack Morgenstein <jackm@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:05:02 -04:00
Yann Droneaud
8abaae62f3 IB/core: disallow registering 0-sized memory region
If ib_umem_get() is called with a size equal to 0 and an
non-page aligned address, one page will be pinned and a
0-sized umem will be returned to the caller.

This should not be allowed: it's not expected for a memory
region to have a size equal to 0.

This patch adds a check to explicitly refuse to register
a 0-sized region.

Link: http://mid.gmane.org/cover.1428929103.git.ydroneaud@opteya.com
Cc: <stable@vger.kernel.org>
Cc: Shachar Raindel <raindel@mellanox.com>
Cc: Jack Morgenstein <jackm@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 16:05:02 -04:00
Yishai Hadas
56c1d2335b IB/mlx4: Change alias guids default to be host assigned
Change the default mode to be HOST assigned instead of SM assigned. This is
the expected operational mode, because it doesn't depend on SM availability.

As PF generates random GUIDs as the initial admin values, this gives
out of the box experience.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 15:51:50 -04:00
Yishai Hadas
ee59fa0d7e IB/mlx4: Request alias GUID on demand
Request GIDs from the SM on demand, i.e., when a VF actually needs them,
and release them when the GIDs are no longer in use.

In cloud environments, this is useful for GID migrations, in which a
GID is assigned to a VF on the destination HCA, while the VF on the
source HCA is shutdown (but the GID was not administratively released).

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 15:51:50 -04:00
Yishai Hadas
f547960128 IB/mlx4: Change init flow to request alias GUIDs for active VFs
Change the init flow to ask GUIDs only for active VFs. This is done for
both SM & HOST modes so that there is no need any more to maintain the
ownership record type.

In case SM mode is used, the initial value will be 0, ask the SM to assign,
for the HOST mode the initial value will be the HOST generated GUID.

This will enable out of the box experience for both probed and attached VFs.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 15:51:50 -04:00
Yishai Hadas
2350f24774 IB/mlx4: Manage admin alias GUID upon admin request
Set the admin alias GUID per the administrator's request via the sysfs
mechanism into the core layer.

The "get" request returns the current value. However, if the administrator
requests the SM to assign a new value by requesting 0, the SM assigned
GUID is returned.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 15:51:50 -04:00
Yishai Hadas
99ee4df6aa IB/mlx4: Alias GUID adding persistency support
If the SM rejects an alias GUID request the PF driver keeps trying to acquire
the specified GUID indefinitely, utilizing an exponential backoff scheme.

Retrying is managed per GUID entry. Each entry that wasn't applied holds its
next retry information. Retry requests to the SM consist of records of 8
consecutive GUIDS. Each record that contains GUIDs requiring retries holds its
next time-to-run based on the retry information of all its GUID entries. The
record having the lowest retry time will run first when that retry time
arrives.

Since the method (SET or DELETE) as sent to the SM applies to all the GUIDs in
the record, we must handle SET requests and DELETE requests in separate SM
messages (one for SETs and the other for DELETEs).

To avoid race conditions where a GUID entry request (set or delete) was
modified after the SM request was sent, we save the method and the requested
indices as part of the callback's context -- thus, only the requested indexes
are evaluated when the response is received.

When an GUID entry is approved we turn off its retry-required bit, this
prevents redundant SM retries from occurring on that record.

The port down event should be sent only when previously it was up. Likewise,
the port up event should be sent only if previously the port was down.

Synchronization was added around the flows that change entries and record state
to prevent race conditions.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-04-15 15:51:49 -04:00
David Howells
75c3cfa855 VFS: assorted weird filesystems: d_inode() annotations
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-15 15:06:58 -04:00
Christoph Hellwig
9ac8928e6a target: simplify the target template registration API
Instead of calling target_fabric_configfs_init() +
target_fabric_configfs_register() / target_fabric_configfs_deregister()
target_fabric_configfs_free() from every target driver, rewrite the API
so that we have simple register/unregister functions that operate on
a const operations vector.

This patch also fixes a memory leak in several target drivers. Several
target drivers namely called target_fabric_configfs_deregister()
without calling target_fabric_configfs_free().

A large part of this patch is based on earlier changes from
Bart Van Assche <bart.vanassche@sandisk.com>.

(v2: Add a new TF_CIT_SETUP_DRV macro so that the core configfs code
can declare attributes as either core only or for drivers)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-04-14 12:28:41 -07:00
Al Viro
4961772560 infinibad: weird APIs switched to ->write_iter()
Things Not To Do When Writing A Driver, part 1001st:
have writev() and write() on the same file doing completely
different things.  As in, "interpret very different sets of
commands".

	We _can_ handle that, but it's a bloody bad idea.
Don't do that in new drivers.  Ever.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:29:42 -04:00
Al Viro
237dae8890 Merge branch 'iocb' into for-davem
trivial conflict in net/socket.c and non-trivial one in crypto -
that one had evaded aio_complete() removal.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-09 00:01:38 -04:00
Sagi Grimberg
9e35eff449 iser-target: Bump version to 1.0
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-04-07 23:27:57 -07:00
Sagi Grimberg
dac6ab305d iser-target: Remove conn_ prefix from struct isert_conn members
These variables are always accessed via struct isert_conn so
no need to have a "conn_" prefix for them.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-04-07 23:27:57 -07:00
Sagi Grimberg
992607e813 iser-target: Remove un-needed rdma_listen backlog
iser target can handle as many connect request as
the fabric sends to it. This backlog should not set as
a back-pressure mechanism (which is not very useful).

isert does need a back-pressure mechanism, but it should
be added in isert by monitoring the number of pending
established connections (will be added in a later stage).

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-04-07 23:27:56 -07:00
Sagi Grimberg
57df81e3b1 iser-target: Remove redundant check on the device
In iser_connect_release there is no chance that
the iser device is set to NULL, if this happens
we have a BUG. So use BUG_ON.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-04-07 23:27:56 -07:00
Sagi Grimberg
c6b8e9180d iser-target: Get rid of redundant max_accept
Not sure what it was used for, but there is
no real need for it now as I see it. Go ahead
and get rid of it.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-04-07 23:27:55 -07:00
Sagi Grimberg
ae9ea9ed38 iser-target: Split some logic in isert_connect_request to routines
Move login buffer alloc/free code to dedicated
routines and introduce isert_conn_init which
initializes the connection lists and locks.

Simplifies and cleans up the code a little bit.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-04-07 23:27:55 -07:00
Sagi Grimberg
cf8ae95823 iser-target: Rename device find/release routines
isert_device_find_by_ib_dev and isert_device_try_release
can have a better, more common name like isert_device_[get|put].

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-04-07 23:27:54 -07:00
Sagi Grimberg
7748681bb8 iser-target: Rename rend/recv completion routines
Make receive/send completion handling routines symmetrical.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-04-07 23:27:54 -07:00
Sagi Grimberg
fd8205e883 iser-target: Remove redundant assignment to local variable
No need to keep a local ib_dev as a device pointer.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-04-07 23:27:53 -07:00
Sagi Grimberg
172369c570 iser-target: Introduce isert_[alloc|free]_comps
Move the code for completion context handling to dedicated
routines. This simplifies the code and removes code duplication.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-04-07 23:27:52 -07:00
Sagi Grimberg
40fc069ad8 iser-target: Split isert_setup_qp
Simplify iser QP creation by splitting some unrelated
logic bulks to routines.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-04-07 23:27:52 -07:00
Sagi Grimberg
6700425eb0 iser-target: Remove redundant casting on void pointers
No need to cast void pointers.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-04-07 23:27:51 -07:00
Sagi Grimberg
fb14027141 iser-target: Remove redundant local variable
No need for this assignment.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-04-07 23:27:51 -07:00
Sagi Grimberg
b859203473 iser-target: Remove dead code
unmap_list is unused.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-04-07 23:27:50 -07:00
Sagi Grimberg
e26e6ef703 iser-target: Remove redundant check on recv completion
We have a switch default for this.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-04-07 23:27:50 -07:00
Sagi Grimberg
67cb394925 iser-target: Use a single DMA MR and PD per device
This is to favor the HCA cache hit rate using less MRs
and PDs. This commit partially reverts commit:
"eb6ab13 IB/isert: separate connection protection domains and dma MRs"

At the time I thought this would be needed.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-04-07 23:27:49 -07:00
Sagi Grimberg
4a579da258 iser-target: Fix possible deadlock in RDMA_CM connection error
Before we reach to connection established we may get an
error event. In this case the core won't teardown this
connection (never established it), so we take care of freeing
it ourselves.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Cc: <stable@vger.kernel.org> # v3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-04-07 23:27:49 -07:00
Sagi Grimberg
364189f0ad iser-target: Fix session hang in case of an rdma read DIF error
This hang was a result of a missing command put when
a DIF error occurred during a rdma read (and we sent
an CHECK_CONDITION error without passing it to the
backend).

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Cc: <stable@vger.kernel.org> # v3.14+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-04-07 23:27:48 -07:00
David S. Miller
c85d6975ef Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/mellanox/mlx4/cmd.c
	net/core/fib_rules.c
	net/ipv4/fib_frontend.c

The fib_rules.c and fib_frontend.c conflicts were locking adjustments
in 'net' overlapping addition and removal of code in 'net-next'.

The mlx4 conflict was a bug fix in 'net' happening in the same
place a constant was being replaced with a more suitable macro.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-06 22:34:15 -04:00
Saeed Mahameed
64613d9499 net/mlx5_core: Extend struct mlx5_interface to support multiple protocols
Preparation for ethernet driver.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:33:43 -04:00
Saeed Mahameed
ce0f750932 net/mlx5_core: Modify arm CQ in preparation for upcoming Ethernet driver
Pass consumer index as a parameter to arm CQ

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:33:43 -04:00
Saeed Mahameed
233d05d28a net/mlx5_core: Move completion eqs from mlx5_ib to mlx5_core
Preparation for ethernet driver.
These functions will be used in drivers other than mlx5_ib.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:33:42 -04:00
Saeed Mahameed
6cf0a15f07 IB/mlx5: Fix Mellanox copyright note
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:33:42 -04:00
Saeed Mahameed
b812b5441e net/mlx5_core: Clear doorbell record inside mlx5_db_alloc()
Do it in one place instead of every where the function is invoked

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:33:41 -04:00
Eli Cohen
7bef7ad24b net/mlx5_core: Coding style fix
Put a line of space before return and next statement.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:33:40 -04:00
Haggai Abramonvsky
c3c6c9c810 net/mlx5_core: Fix call to mlx5_core_qp_modify
Pass 0 in the sqd_event parameter.

Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:33:40 -04:00
Ido Shamay
a130b59057 net/mlx4: Add SET_PORT opcode modifiers enumeration
The calls to SET_PORT used hard-code numbers, when supplying command's
opcode modifiers, fix that to use well defined constants.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:25:03 -04:00
Nicolas Dichtel
5aa7add8f1 infiniband/ipoib: implement ndo_get_iflink
Don't use dev->iflink anymore.

CC: Roland Dreier <roland@kernel.org>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 14:05:01 -04:00
Shachar Raindel
8494057ab5 IB/uverbs: Prevent integer overflow in ib_umem_get address arithmetic
Properly verify that the resulting page aligned end address is larger
than both the start address and the length of the memory area requested.

Both the start and length arguments for ib_umem_get are controlled by
the user. A misbehaving user can provide values which will cause an
integer overflow when calculating the page aligned end address.

This overflow can cause also miscalculation of the number of pages
mapped, and additional logic issues.

Addresses: CVE-2014-8159
Cc: <stable@vger.kernel.org>
Signed-off-by: Shachar Raindel <raindel@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2015-04-02 09:53:59 -07:00
Christoph Hellwig
e2e40f2c1e fs: move struct kiocb to fs.h
struct kiocb now is a generic I/O container, so move it to fs.h.
Also do a #include diet for aio.h while we're at it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-03-25 20:28:11 -04:00
Majd Dibbiny
61a3855bb7 IB/mlx4: Saturate RoCE port PMA counters in case of overflow
For RoCE ports, we set the u32 PMA values based on u64 HCA counters. In case of
overflow, according to the IB spec, we have to saturate a counter to its
max value, do that.

Fixes: c37791349cc7 ('IB/mlx4: Support PMA counters for IBoE')
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18 15:17:11 -04:00
Moni Shoua
217e8b16a4 IB/mlx4: Verify net device validity on port change event
Processing an event is done in a different context from the one when
the event was dispatched. This requires a check that the slave
net device is still valid when the event is being processed. The check is done
under the iboe lock which ensure correctness.

Fixes: a57500903093 ('IB/mlx4: Add port aggregation support')
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18 15:17:11 -04:00
Linus Torvalds
be5e6616dd Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs updates from Al Viro:
 "Assorted stuff from this cycle.  The big ones here are multilayer
  overlayfs from Miklos and beginning of sorting ->d_inode accesses out
  from David"

* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (51 commits)
  autofs4 copy_dev_ioctl(): keep the value of ->size we'd used for allocation
  procfs: fix race between symlink removals and traversals
  debugfs: leave freeing a symlink body until inode eviction
  Documentation/filesystems/Locking: ->get_sb() is long gone
  trylock_super(): replacement for grab_super_passive()
  fanotify: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions
  Cachefiles: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions
  VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry)
  SELinux: Use d_is_positive() rather than testing dentry->d_inode
  Smack: Use d_is_positive() rather than testing dentry->d_inode
  TOMOYO: Use d_is_dir() rather than d_inode and S_ISDIR()
  Apparmor: Use d_is_positive/negative() rather than testing dentry->d_inode
  Apparmor: mediated_filesystem() should use dentry->d_sb not inode->i_sb
  VFS: Split DCACHE_FILE_TYPE into regular and special types
  VFS: Add a fallthrough flag for marking virtual dentries
  VFS: Add a whiteout dentry type
  VFS: Introduce inode-getting helpers for layered/unioned fs environments
  Infiniband: Fix potential NULL d_inode dereference
  posix_acl: fix reference leaks in posix_acl_create
  autofs4: Wrong format for printing dentry
  ...
2015-02-22 17:42:14 -08:00