75 Commits

Author SHA1 Message Date
Zack Rusin
9ca476acd5 drm/vmwgfx: Remove usage of MOBFMT_RANGE
Using MOBFMT_RANGE in the early days of guest backed objects was a major
performance win but that has changed a lot since. There's no more
a performance reason to use MOBFMT_RANGE. The device can/will still
profit from the pages being contiguous but marking them as MOBFMT_RANGE
no longer matters.
Benchmarks (e.g. heaven, valley) show that creating page tables
for mob memory is actually faster than using mobfmt ranges.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211206172620.3139754-12-zack@kde.org
2021-12-09 13:16:34 -05:00
Zack Rusin
8afa13a058 drm/vmwgfx: Implement DRIVER_GEM
This is initial change adding support for DRIVER_GEM to vmwgfx. vmwgfx
was written before GEM and has always used TTM. Over the years the
TTM buffers started inherting from GEM objects but vmwgfx never
implemented GEM making it quite awkward. We were directly setting
variables in GEM objects to not make DRM crash.

This change brings vmwgfx inline with other DRM drivers and allows us
to use a lot of DRM helpers which have depended on drivers with GEM
support.

Due to historical reasons vmwgfx splits the idea of a buffer and surface
which makes it a littly tricky since either one can be used in most
of our ioctl's which take user space handles. For now our BO's are
GEM objects and our surfaces are opaque objects which are backed by
GEM objects. In the future I'd like to combine those into a single
BO but we don't want to break any of our existing ioctl's so it will
take time to do it in a non-destructive way.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211206172620.3139754-5-zack@kde.org
2021-12-09 13:16:16 -05:00
Zack Rusin
8aadeb8ad8 drm/vmwgfx: Remove the dedicated memory accounting
vmwgfx shared very elaborate memory accounting with ttm. It was moved
from ttm to vmwgfx in change
f07069da6b4c ("drm/ttm: move memory accounting into vmwgfx v4")
but because of complexity it was hard to maintain. Some parts of the code
weren't freeing memory correctly and  some were missing accounting all
together. While those would be fairly easy to fix the fundamental reason
for memory accounting in the driver was the ability to invoke shrinker
which is part of TTM code as well (with support for unified memory
hopefully coming soon).

That meant that vmwgfx had a lot of code that was either unused or
duplicating code from TTM. Removing this code also prevents excessive
calls to global swapout which were common during memory pressure
because both vmwgfx and TTM would invoke the shrinker when memory
usage reached half of RAM.

Fixes: f07069da6b4c ("drm/ttm: move memory accounting into vmwgfx v4")
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211206172620.3139754-2-zack@kde.org
2021-12-09 13:16:10 -05:00
Zack Rusin
f6be23264b drm/vmwgfx: Introduce a new placement for MOB page tables
For larger (bigger than a page) and noncontiguous mobs we have
to create page tables that allow the host to find the memory.
Those page tables just used regular system memory. Unfortunately
in TTM those BO's are not allowed to be busy thus can't be
fenced and we have to fence those bo's  because we don't want
to destroy the page tables while the host is still executing
the command buffers which might be accessing them.

To solve it we introduce a new placement VMW_PL_SYSTEM which
is very similar to TTM_PL_SYSTEM except that it allows
fencing. This fixes kernel oops'es during unloading of the driver
(and pci hot remove/add) which were caused by busy BO's in
TTM_PL_SYSTEM being present in the delayed deletion list in
TTM (TTM_PL_SYSTEM manager is destroyed before the delayed
deletions are executed)

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211105193845.258816-5-zackr@vmware.com
2021-12-01 11:58:35 -05:00
Maxime Ripard
2f76520561
Merge drm/drm-next into drm-misc-next
Kickstart new drm-misc-next cycle.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2021-09-14 09:25:30 +02:00
Linus Torvalds
23852bec53 RDMA v5.15 merge window Pull Request
- Various cleanup and small features for rtrs
 
 - kmap_local_page() conversions
 
 - Driver updates and fixes for: efa, rxe, mlx5, hfi1, qed, hns
 
 - Cache the IB subnet prefix
 
 - Rework how CRC is calcuated in rxe
 
 - Clean reference counting in iwpm's netlink
 
 - Pull object allocation and lifecycle for user QPs to the uverbs core
   code
 
 - Several small hns features and continued general code cleanups
 
 - Fix the scatterlist confusion of orig_nents/nents introduced in an
   earlier patch creating the append operation
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAmEudRgACgkQOG33FX4g
 mxraJA//c6bMxrrTVrzmrtrkyYD4tYWE8RDfgvoyZtleZnnEOJeunCQWakQrpJSv
 ukSnOGCA3PtnmRMdV54f/11YJ/7otxOJodSO7jWsIoBrqG/lISAdX8mn2iHhrvJ0
 dIaFEFPLy0WqoMLCJVIYIupR0IStVHb/mWx0uYL4XnnoYKyt7f7K5JMZpNWMhDN2
 ieJw0jfrvEYm8pipWuxUvB16XARlzAWQrjqLpMRI+jFRpbDVBY21dz2/LJvOJPrA
 LcQ+XXsV/F659ibOAGm6bU4BMda8fE6Lw90B/gmhSswJ205NrdziF5cNYHP0QxcN
 oMjrjSWWHc9GEE7MTipC2AH8e36qob16Q7CK+zHEJ+ds7R6/O/8XmED1L8/KFpNA
 FGqnjxnxsl1y27mUegfj1Hh8PfoDp2oVq0lmpEw0CYo4cfVzHSMRrbTR//XmW628
 Ie/mJddpFK4oLk+QkSNjSLrnxOvdTkdA58PU0i84S5eUVMNm41jJDkxg2J7vp0Zn
 sclZsclhUQ9oJ5Q2so81JMWxu4JDn7IByXL0ULBaa6xwQTiVEnyvSxSuPlflhLRW
 0vI2ylATYKyWkQqyX7VyWecZJzwhwZj5gMMWmoGsij8bkZhQ/VaQMaesByzSth+h
 NV5UAYax4GqyOQ/tg/tqT6e5nrI1zof87H64XdTCBpJ7kFyQ/oA=
 =ZwOe
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "This is quite a small cycle, no major series stands out. The HNS and
  rxe drivers saw the most activity this cycle, with rxe being broken
  for a good chunk of time. The significant deleted line count is due to
  a SPDX cleanup series.

  Summary:

   - Various cleanup and small features for rtrs

   - kmap_local_page() conversions

   - Driver updates and fixes for: efa, rxe, mlx5, hfi1, qed, hns

   - Cache the IB subnet prefix

   - Rework how CRC is calcuated in rxe

   - Clean reference counting in iwpm's netlink

   - Pull object allocation and lifecycle for user QPs to the uverbs
     core code

   - Several small hns features and continued general code cleanups

   - Fix the scatterlist confusion of orig_nents/nents introduced in an
     earlier patch creating the append operation"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (90 commits)
  RDMA/mlx5: Relax DCS QP creation checks
  RDMA/hns: Delete unnecessary blank lines.
  RDMA/hns: Encapsulate the qp db as a function
  RDMA/hns: Adjust the order in which irq are requested and enabled
  RDMA/hns: Remove RST2RST error prints for hw v1
  RDMA/hns: Remove dqpn filling when modify qp from Init to Init
  RDMA/hns: Fix QP's resp incomplete assignment
  RDMA/hns: Fix query destination qpn
  RDMA/hfi1: Convert to SPDX identifier
  IB/rdmavt: Convert to SPDX identifier
  RDMA/hns: Bugfix for incorrect association between dip_idx and dgid
  RDMA/hns: Bugfix for the missing assignment for dip_idx
  RDMA/hns: Bugfix for data type of dip_idx
  RDMA/hns: Fix incorrect lsn field
  RDMA/irdma: Remove the repeated declaration
  RDMA/core/sa_query: Retry SA queries
  RDMA: Use the sg_table directly and remove the opencoded version from umem
  lib/scatterlist: Fix wrong update of orig_nents
  lib/scatterlist: Provide a dedicated function to support table append
  RDMA/hns: Delete unused hns bitmap interface
  ...
2021-09-02 14:47:21 -07:00
Maor Gottlieb
90e7a6de62 lib/scatterlist: Provide a dedicated function to support table append
RDMA is the only in-kernel user that uses __sg_alloc_table_from_pages to
append pages dynamically. In the next patch. That mode will be extended
and that function will get more parameters. So separate it into a unique
function to make such change more clear.

Link: https://lore.kernel.org/r/20210824142531.3877007-2-maorg@nvidia.com
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-24 15:21:14 -03:00
Christian König
d5f45d1e2f drm/ttm: remove ttm_tt_destroy_common v2
Move the functionality into ttm_tt_fini and ttm_bo_tt_destroy instead.

We don't need this any more since we removed the unbind from the destroy
code paths in the drivers.

Also add a warning to ttm_tt_fini() if we try to fini a still populated TT
object.

v2: instead of reverting the patch move the functionality to different
places.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210728130552.2074-5-christian.koenig@amd.com
2021-08-23 13:54:55 +02:00
Christian König
e54163e918 drm/vmwgfx: unbind in vmw_ttm_unpopulate
Doing this in vmw_ttm_destroy() is to late.

It turned out that this is not a good idea at all because it leaves pointers
to freed up system memory pages in the GART tables of the drivers.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210728130552.2074-1-christian.koenig@amd.com
2021-08-23 13:42:25 +02:00
Zack Rusin
8d9a8d9bd5 drm/vmwgfx: inline access to the pages from the piter
The indirection doesn't make sense because we always go through
the same function pointer. Instead of the extra indirection
lets inline the access to the current page.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210609172307.131929-7-zackr@vmware.com
2021-06-12 00:01:02 -04:00
Zack Rusin
f674a218c6 drm/vmwgfx: remove code that was using physical page addresses
This code has been unused for a while now. When the explicit checks
for whether the driver is running on top of non-coherent swiotlb
have been deprecated we lost the ability to fallback to physical
mappings. Instead of trying to readd a module parameter to force
usage of physical addresses it's better to just force coherent
TTM pages via the force_coherent module parameter making this
code pointless.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210609172307.131929-6-zackr@vmware.com
2021-06-12 00:01:01 -04:00
Nirmoy Das
5b7a2c92b6 drm/vmwgfx: use ttm_bo_move_null() when there is nothing to move
Use ttm_bo_move_null() instead of ttm_bo_assign_mem().

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210608181306.90008-1-nirmoy.das@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>
2021-06-09 09:10:22 +02:00
Christian König
bfa3357ef9 drm/ttm: allocate resource object instead of embedding it v2
To improve the handling we want the establish the resource object as base
class for the backend allocations.

v2: add missing error handling

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210602100914.46246-1-christian.koenig@amd.com
2021-06-04 15:16:45 +02:00
Christian König
d3116756a7 drm/ttm: rename bo->mem and make it a pointer
When we want to decouble resource management from buffer management we need to
be able to handle resources separately.

Add a resource pointer and rename bo->mem so that all code needs to
change to access the pointer instead.

No functional change.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210430092508.60710-4-christian.koenig@amd.com
2021-06-02 11:07:25 +02:00
Thomas Zimmermann
cbc5caf778 drm/vmwgfx: Inline vmw_verify_access()
Vmwgfx is the only user of the TTM's verify_access callback. Inline
the call and avoid the indirection through the function pointer.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Zack Rusin <zackr@vmware.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210525151055.8174-7-tzimmermann@suse.de
2021-05-26 20:56:52 +02:00
Zack Rusin
2cd80dbd35 drm/vmwgfx: Add basic support for SVGA3
SVGA3 is the next version of our PCI device. Some of the changes
include using MMIO for register accesses instead of ioports,
deprecating the FIFO MMIO and removing a lot of the old and
legacy functionality. SVGA3 doesn't support guest backed
objects right now so everything except 3D is working.

v2: Fixes all the static analyzer warnings

Signed-off-by: Zack Rusin <zackr@vmware.com>
Cc: Martin Krastev <krastevm@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210505191007.305872-1-zackr@vmware.com
2021-05-11 13:37:15 -04:00
Christian König
6cf9dc238c drm/vmwgfx: clean up vmw_move_notify v2
Instead of swapping bo->mem just give old and new as parameters.

Also drop unused parameters and code.

v2: cleanup stale documentation as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210315191432.153826-3-christian.koenig@amd.com
2021-03-16 16:30:25 +01:00
Christian König
f07069da6b drm/ttm: move memory accounting into vmwgfx v4
This is just another feature which is only used by VMWGFX, so move
it into the driver instead.

I've tried to add the accounting sysfs file to the kobject of the drm
minor, but I'm not 100% sure if this works as expected.

v2: fix typo in KFD and avoid 64bit divide
v3: fix init order in VMWGFX
v4: use pdev sysfs reference instead of drm

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Zack Rusin <zackr@vmware.com> (v3)
Tested-by: Nirmoy Das <nirmoy.das@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210208133226.36955-2-christian.koenig@amd.com
2021-02-09 17:27:33 +01:00
Christian König
8af8a109b3 drm/ttm: device naming cleanup
Rename ttm_bo_device to ttm_device.
Rename ttm_bo_driver to ttm_device_funcs.
Rename ttm_bo_global to ttm_global.

Move global and device related functions to ttm_device.[ch].

No functional change.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/415222/
2021-01-21 14:51:45 +01:00
Lee Jones
a38feeaac2 drm/vmwgfx/vmwgfx_ttm_buffer: Supply some missing parameter descriptions
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c:275: warning: Function parameter or member 'p_offset' not described in 'vmw_piter_start'
 drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c:676: warning: Function parameter or member 'evict' not described in 'vmw_move_notify'

Cc: VMware Graphics <linux-graphics-maintainer@vmware.com>
Cc: Roland Scheidegger <sroland@vmware.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Zack Rusin <zackr@vmware.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210115181313.3431493-12-lee.jones@linaro.org
2021-01-19 14:18:37 -05:00
Zack Rusin
9703bb3292 drm/vmwgfx: Switch to a managed drm device
To cleanup some of the error handling and prepare for some
other work lets switch to a managed drm device. It will
let us get a better handle on some of the error paths.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Link: https://patchwork.freedesktop.org/patch/414039/?series=85516&rev=2
2021-01-14 12:12:48 -05:00
Christian König
4c515bb187 drm/vmwgfx: switch to ttm_sg_tt_init
According to Daniel VMWGFX doesn't support DMA-buf anyway.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/403834/
2020-11-30 15:00:12 +01:00
Dave Airlie
ebdf565169 drm/ttm: add multihop infrastrucutre (v3)
Currently drivers get called to move a buffer, but if they have to
move it temporarily through another space (SYSTEM->VRAM via TT)
then they can end up with a lot of ttm->driver->ttm call stacks,
if the temprorary space moves requires eviction.

Instead of letting the driver do all the placement/space for the
temporary, allow it to report back (-EMULTIHOP) and a placement (hop)
to the move code, which will then do the temporary move, and the
correct placement move afterwards.

This removes a lot of code from drivers, at the expense of
adding some midlayering. I've some further ideas on how to turn
it inside out, but I think this is a good solution to the call
stack problems.

v2: separate out the driver patches, add WARN for getting
MULTHOP in paths we shouldn't (Daniel)
v3: use memset (Christian)

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: hristian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201109005432.861936-2-airlied@gmail.com
2020-11-11 11:11:03 +10:00
Maxime Ripard
c489573b5b
Merge drm/drm-next into drm-misc-next
Daniel needs -rc2 in drm-misc-next to merge some patches

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2020-11-02 11:17:54 +01:00
Christian König
8567d51555 drm/vmwgfx: switch to new allocator
It should be able to handle all cases now.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Madhav Chauhan <madhav.chauhan@amd.com>
Tested-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.freedesktop.org/patch/397083/?series=83051&rev=1
2020-10-29 15:57:27 +01:00
Christian König
e34b8feeaa drm/ttm: merge ttm_dma_tt back into ttm_tt
It makes no difference to kmalloc if the structure
is 48 or 64 bytes in size.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/396950/
2020-10-26 14:45:42 +01:00
Dave Airlie
6a6e5988a2 drm/ttm: replace last move_notify with delete_mem_notify
The move notify callback is only used in one place, this should
be removed in the future, but for now just rename it to the use
case which is to notify the driver that the GPU memory is to be
deleted.

Drivers can be cleaned up after this separately.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021044031.1752624-2-airlied@gmail.com
2020-10-22 10:11:49 +10:00
Dave Airlie
bfe5e585b4 drm/ttm: move last binding into the drivers.
This moves the call to tt binding into the driver move,
and drops the driver callback.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201020010319.1692445-8-airlied@gmail.com
2020-10-21 13:46:07 +10:00
Dave Airlie
6d82000329 drm/ttm: drop move notify around move.
The drivers now do this in the move callback.

move_notify is still needed in the destroy path.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201020010319.1692445-7-airlied@gmail.com
2020-10-21 13:44:28 +10:00
Dave Airlie
f227ccc961 drm/ttm: drop unbind callback.
The drivers now control this, so drop unbinding.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201020010319.1692445-5-airlied@gmail.com
2020-10-21 13:43:54 +10:00
Dave Airlie
29a1d482e4 drm/ttm: add move to system into drivers
This moves the to system move into the drivers, and moves all
the unbinds in the move path under driver control

Note: radeon/nouveau already wait so don't duplicate it.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201020010319.1692445-4-airlied@gmail.com
2020-10-21 13:43:43 +10:00
Dave Airlie
c37d951cb4 drm/ttm: add move old to system to drivers.
Uninline ttm_bo_move_ttm. Eventually want to unhook the unbind out.

Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019071314.1671485-5-airlied@gmail.com
2020-10-20 05:04:04 +10:00
Linus Torvalds
a1e16bc7d5 RDMA 5.10 pull request
The typical set of driver updates across the subsystem:
 
  - Driver minor changes and bug fixes for mlx5, efa, rxe, vmw_pvrdma, hns,
    usnic, qib, qedr, cxgb4, hns, bnxt_re
 
  - Various rtrs fixes and updates
 
  - Bug fix for mlx4 CM emulation for virtualization scenarios where MRA
    wasn't working right
 
  - Use tracepoints instead of pr_debug in the CM code
 
  - Scrub the locking in ucma and cma to close more syzkaller bugs
 
  - Use tasklet_setup in the subsystem
 
  - Revert the idea that 'destroy' operations are not allowed to fail at
    the driver level. This proved unworkable from a HW perspective.
 
  - Revise how the umem API works so drivers make fewer mistakes using it
 
  - XRC support for qedr
 
  - Convert uverbs objects RWQ and MW to new the allocation scheme
 
  - Large queue entry sizes for hns
 
  - Use hmm_range_fault() for mlx5 On Demand Paging
 
  - uverbs APIs to inspect the GID table instead of sysfs
 
  - Move some of the RDMA code for building large page SGLs into
    lib/scatterlist
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl+J37MACgkQOG33FX4g
 mxrKfRAAnIecwdE8df0yvVU5k0Eg6qVjMy9MMHq4va9m7g6GpUcNNI0nIlOASxH2
 l+9vnUQS3ebgsPeECaDYzEr0hh/u53+xw2g4WV5ts/hE8KkQ6erruXb9kasCe8yi
 5QWJ9K36T3c03Cd3EeH6JVtytAxuH42ombfo9BkFLPVyfG/R2tsAzvm5pVi73lxk
 46wtU1Bqi4tsLhyCbifn1huNFGbHp08OIBPAIKPUKCA+iBRPaWS+Dpi+93h3g3Bp
 oJwDhL9CBCGcHM+rKWLzek3Dy87FnQn7R1wmTpUFwkK+4AH3U/XazivhX035w1vL
 YJyhakVU0kosHlX9hJTNKDHJGkt0YEV2mS8dxAuqilFBtdnrVszb5/MirvlzC310
 /b5xCPSEusv9UVZV0G4zbySVNA9knZ4YaRiR3VDVMLKl/pJgTOwEiHIIx+vs3ejk
 p8GRWa1SjXw5LfZEQcq39J689ljt6xjCTonyuBSv7vSQq5v8pjBxvHxiAe2FIa2a
 ZyZeSCYoSh0SwJQukO2VO7aprhHP3TcCJ/987+X03LQ8tV2VWPktHqm62YCaDcOl
 fgiQuQdPivRjDDkJgMfDWDGKfZeHoWLKl5XsJhWByt0lablVrsvc+8ylUl1UI7gI
 16hWB/Qtlhfwg10VdApn+aOFpIS+s5P4XIp8ik57MZO+VeJzpmE=
 =LKpl
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "A usual cycle for RDMA with a typical mix of driver and core subsystem
  updates:

   - Driver minor changes and bug fixes for mlx5, efa, rxe, vmw_pvrdma,
     hns, usnic, qib, qedr, cxgb4, hns, bnxt_re

   - Various rtrs fixes and updates

   - Bug fix for mlx4 CM emulation for virtualization scenarios where
     MRA wasn't working right

   - Use tracepoints instead of pr_debug in the CM code

   - Scrub the locking in ucma and cma to close more syzkaller bugs

   - Use tasklet_setup in the subsystem

   - Revert the idea that 'destroy' operations are not allowed to fail
     at the driver level. This proved unworkable from a HW perspective.

   - Revise how the umem API works so drivers make fewer mistakes using
     it

   - XRC support for qedr

   - Convert uverbs objects RWQ and MW to new the allocation scheme

   - Large queue entry sizes for hns

   - Use hmm_range_fault() for mlx5 On Demand Paging

   - uverbs APIs to inspect the GID table instead of sysfs

   - Move some of the RDMA code for building large page SGLs into
     lib/scatterlist"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (191 commits)
  RDMA/ucma: Fix use after free in destroy id flow
  RDMA/rxe: Handle skb_clone() failure in rxe_recv.c
  RDMA/rxe: Move the definitions for rxe_av.network_type to uAPI
  RDMA: Explicitly pass in the dma_device to ib_register_device
  lib/scatterlist: Do not limit max_segment to PAGE_ALIGNED values
  IB/mlx4: Convert rej_tmout radix-tree to XArray
  RDMA/rxe: Fix bug rejecting all multicast packets
  RDMA/rxe: Fix skb lifetime in rxe_rcv_mcast_pkt()
  RDMA/rxe: Remove duplicate entries in struct rxe_mr
  IB/hfi,rdmavt,qib,opa_vnic: Update MAINTAINERS
  IB/rdmavt: Fix sizeof mismatch
  MAINTAINERS: CISCO VIC LOW LATENCY NIC DRIVER
  RDMA/bnxt_re: Fix sizeof mismatch for allocation of pbl_tbl.
  RDMA/bnxt_re: Use rdma_umem_for_each_dma_block()
  RDMA/umem: Move to allocate SG table from pages
  lib/scatterlist: Add support in dynamic allocation of SG table from pages
  tools/testing/scatterlist: Show errors in human readable form
  tools/testing/scatterlist: Rejuvenate bit-rotten test
  RDMA/ipoib: Set rtnl_link_ops for ipoib interfaces
  RDMA/uverbs: Expose the new GID query API to user space
  ...
2020-10-17 11:18:18 -07:00
Christian König
ce65b87400 drm/ttm: nuke caching placement flags
Changing the caching on the fly never really worked
flawlessly.

So stop this completely and just let drivers specific the
desired caching in the tt or bus object.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Link: https://patchwork.freedesktop.org/patch/394256/
2020-10-15 12:51:35 +02:00
Christian König
1cf65c4518 drm/ttm: add caching state to ttm_bus_placement
And implement setting it up correctly in the drivers.

This allows getting rid of the placement flags for this.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Link: https://patchwork.freedesktop.org/patch/394254/
2020-10-15 12:51:13 +02:00
Christian König
1b4ea4c598 drm/ttm: set the tt caching state at creation time
All drivers can determine the tt caching state at creation time,
no need to do this on the fly during every validation.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Link: https://patchwork.freedesktop.org/patch/394253/
2020-10-15 12:50:40 +02:00
Dave Airlie
bcff5d3e3b drm/vmwgfx: add a move callback.
This just copies the fallback to vmwgfx, I'm going to iterate on this
a bit until it's not the same as the fallback path.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201006000644.1005758-4-airlied@gmail.com
2020-10-07 15:41:54 +10:00
Dave Airlie
279a301021 drm/vmwgfx: move null mem checks outside move notifies
Both fns checked mem == NULL, just move the check outside.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201006000644.1005758-3-airlied@gmail.com
2020-10-07 15:41:49 +10:00
Maor Gottlieb
07da1223ec lib/scatterlist: Add support in dynamic allocation of SG table from pages
Extend __sg_alloc_table_from_pages to support dynamic allocation of
SG table from pages. It should be used by drivers that can't supply
all the pages at one time.

This function returns the last populated SGE in the table. Users should
pass it as an argument to the function from the second call and forward.
As before, nents will be equal to the number of populated SGEs (chunks).

With this new extension, drivers can benefit the optimization of merging
contiguous pages without a need to allocate all pages in advance and
hold them in a large buffer.

E.g. with the Infiniband driver that allocates a single page for hold the
pages. For 1TB memory registration, the temporary buffer would consume only
4KB, instead of 2GB.

Link: https://lore.kernel.org/r/20201004154340.1080481-2-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-05 20:45:45 -03:00
Christian König
fbe86ca567 drm/vmwgfx: switch over to the new pin interface v2
Stop using TTM_PL_FLAG_NO_EVICT.

v2: fix unconditional pinning

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.freedesktop.org/patch/391601/?series=81973&rev=1
2020-09-24 16:16:49 +02:00
Christian König
b254557cb2 drm/vmwgfx: stop using ttm_bo_create v2
Implement in the driver instead since it is the only user of that function.

v2: fix usage of ttm_bo_init_reserved

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.freedesktop.org/patch/391614/?series=81973&rev=1
2020-09-24 16:16:49 +02:00
Christian König
a3b3bef335 drm/vmwgfx: remove unused placement combination
Just some dead code cleanup.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.freedesktop.org/patch/391599/?series=81973&rev=1
2020-09-24 16:16:49 +02:00
Dave Airlie
6ea6be7708 drm-misc-next for 5.10:
UAPI Changes:
 
 Cross-subsystem Changes:
 
 Core Changes:
   - dev: More devm_drm convertions and removal of drm_dev_init
 
 Driver Changes:
   - i915: selftests improvements
   - panfrost: support for Amlogic SoC
   - vc4: one fix
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCX2jGxQAKCRDj7w1vZxhR
 xR3DAQCiZOnaxVcY49iG4343Z1aHHaIEShbnB0bDdaWstn7kiQD/UXBXUoOSFoFQ
 FkTsW31JsdXNnWP5e6/eJd2Lb6waVAA=
 =VlsU
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2020-09-21' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.10:

UAPI Changes:

Cross-subsystem Changes:
  - virtio: Merged a PR for patches that will affect drm/virtio

Core Changes:
  - dev: More devm_drm convertions and removal of drm_dev_init
  - atomic: Split out drm_atomic_helper_calc_timestamping_constants of
    drm_atomic_helper_update_legacy_modeset_state
  - ttm: More rework

Driver Changes:
  - i915: selftests improvements
  - panfrost: support for Amlogic SoC
  - vc4: one fix
  - tree-wide: conversions to devm_drm_dev_alloc,
  - ast: simplifications of the atomic modesetting code
  - panfrost: multiple fixes
  - vc4: multiple fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20200921152956.2gxnsdgxmwhvjyut@gilmour.lan
2020-09-23 09:52:24 +10:00
Dave Airlie
37bff6542c drm/ttm: move unbind into the tt destroy.
This moves unbind into the driver side on destroy paths.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200917043040.146575-4-airlied@gmail.com
2020-09-18 06:15:24 +10:00
Dave Airlie
7626168fd1 drm/ttm: flip tt destroy ordering.
Call the driver first and have it call the common code cleanup.

This is useful later to fix unbind.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200917043040.146575-3-airlied@gmail.com
2020-09-18 06:14:41 +10:00
Dave Airlie
0b988ca1c7 drm/ttm: protect against reentrant bind in the drivers
This moves the generic tracking into the drivers and protects
against reentrancy in the drivers. It fixes up radeon and agp
to be able to query the bound status as that is required.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200917043040.146575-2-airlied@gmail.com
2020-09-18 06:14:00 +10:00
Dave Airlie
b40be05ed2 Merge branch 'for-5.10-drm-sg-fix' of https://github.com/mszyprow/linux into drm-next
Please pull a set of fixes for various DRM drivers that finally resolve
incorrect usage of the scatterlists (struct sg_table nents and orig_nents
entries), what causes issues when IOMMU is used.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200910080505.24456-1-m.szyprowski@samsung.com
2020-09-17 16:07:11 +10:00
Dave Airlie
7eec915138 drm/ttm/tt: add wrappers to set tt state.
This adds 2 getters and 4 setters, however unbound and populated
are currently the same thing, this will change, it also drops
a BUG_ON that seems not that useful.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915024007.67163-2-airlied@gmail.com
2020-09-16 09:33:24 +10:00
Christian König
48e07c23cb drm/ttm: nuke memory type flags
It's not supported to specify more than one of those flags.
So it never made sense to make this a flag in the first place.

Nuke the flags and specify directly which memory type to use.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/389826/?series=81551&rev=1
2020-09-11 13:31:23 +02:00
Marek Szyprowski
c915c2cbaf drm: vmwgfx: fix common struct sg_table related issues
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
2020-09-10 08:18:35 +02:00