493 Commits

Author SHA1 Message Date
Benjamin Dotan
cf1ed52d12 accel/habanalabs/gaudi2 : remove psoc_arc access
Because firmware is blocking PSOC_ARC_DBG, we need to disable access
to this block.

Signed-off-by: Benjamin Dotan <bdotan@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:20 +03:00
Igor Grinberg
01ab1629ad accel/habanalabs/gaudi2: prepare to remove cpu_rst_status
The soft reset has transitioned to CPUCP packet instead of plain
register write and is about to be removed from the struct cpu_dyn_regs.
As a preparation for removing the cpu_rst_status field from
struct cpu_dyn_regs, switch to use the plain macro - this keeps the
backward compatibility.

Signed-off-by: Igor Grinberg <igrinberg@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:20 +03:00
Tomer Tayar
57963ff8ad accel/habanalabs: Move ioctls to the device specific ioctls range
To use drm_ioctl(), move the ioctls to the device specific ioctls
range at [DRM_COMMAND_BASE, DRM_COMMAND_END).

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:20 +03:00
Tomer Tayar
fe77368c0f accel/habanalabs: register compute device as an accel device
Register the compute device as an accel device, and remove the creation
of the habanalabs compute char device.

The IOCTLs in this patch are still handled by the current driver
handler. Moving to DRM IOCTL handling requires moving the IOCTLs
numbers to a specific range, so it will be handled in subsequent
patches.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:20 +03:00
Ofir Bitton
a8ab1a81cc accel/habanalabs: add info ioctl for engine error reports
User gets notification for every engine error report, but he still
lacks the exact engine information. Hence, we allow user to query
for the exact engine reported an error.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:19 +03:00
Tomer Tayar
10926f6005 accel/habanalabs: set default device release watchdog T/O as 30 sec
After being notified about certain errors, user is expected to finish
his post-errors actions and to release the device within some timeout,
after which is deice is being reset.
The default timeout value is 5 sec, which in some case is not enough for
a user application to collect debug data.
Increase the default value to 30 sec.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:19 +03:00
Dani Liberman
8887279092 accel/habanalabs: handle f/w reserved dram space request
It is possible for FW to request reserved space in dram.
If the device supports this option, it will retrieve the size from the
f/w and will reserve it.

Currently we add the common code infrastructure to support it.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:19 +03:00
Oded Gabbay
fa46c7bb50 accel/habanalabs/gaudi2: fix missing check of kernel ctx
If we are initializing the kernel context when we have a Gaudi2 device,
we don't need to do any late initializing of that context with
specific Gaudi2 code.

Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:19 +03:00
Igor Grinberg
15c0bb1623 accel/habanalabs/gaudi2: prepare to remove soft_rst_irq
The soft reset has transitioned to CPUCP packet instead of plain
register write and is about to be removed from the struct cpu_dyn_regs.
As a preparation for removing the gic_host_soft_rst_irq field from
struct cpu_dyn_regs, switch to use the plain macro - this keeps the
backward compatibility.

Signed-off-by: Igor Grinberg <igrinberg@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:19 +03:00
Ofir Bitton
1e3a78270b accel/habanalabs/gaudi2: unsecure tpc count registers
As TPC kernels now must use those registers we unsecure them.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:19 +03:00
Tomer Tayar
5a8487ac54 accel/habanalabs/gaudi2: un-secure register for engine cores interrupt
The F/W dynamically allocates one of the PSOC scratchpad registers for
the engine cores, so they can raise events towards the F/W.
To allow the engine cores to access this register, this register must be
non-secured.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:19 +03:00
Juerg Haefliger
b03dc2b621 accel/habanalabs/gaudi: Add MODULE_FIRMWARE macros
The module loads firmware so add MODULE_FIRMWARE macros to provide that
information via modinfo.

Signed-off-by: Juerg Haefliger <juerg.haefliger@canonical.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:19 +03:00
Ivan Orlov
e11a7d2ca5 accel: make accel_class a static const structure
Now that the driver core allows for struct class to be in read-only
memory, move the accel_class structure to be declared at build time
placing it into read-only memory, instead of having to be dynamically
allocated at boot time.

Cc: dri-devel@lists.freedesktop.org
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:19 +03:00
Ofir Bitton
d33c3d0541 accel/habanalabs: dump temperature threshold boot error
Add dump of an error reported from f/w during boot time.
This error indicates a failure with setting temperature threshold.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:19 +03:00
Oded Gabbay
37d72439a4 accel/habanalabs: reset device if scrubbing failed
If scrubbing memory after user released device has failed it means
the device is in a bad state and should be reset.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
2023-10-09 12:37:19 +03:00
Oded Gabbay
89803af535 accel/habanalabs: remove pdev check on idle check
Our simulator supports idle check so no need anymore to check if pdev
exists.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
2023-10-09 12:37:18 +03:00
farah kassabri
2da9f8d805 accel/habanalabs: fix wait_for_interrupt abortion flow
When the driver needs to abort waiters for interrupts, for cases
such as critical events that occur and driver need to do hard reset,
in such scenario the driver will complete the fence to wake up the
waiting thread, and will set the fence error indication.
The return value of the completion API will be greater than 0
since it will return the timeout, but as this indicates successful
completion, the driver should mark it as aborted.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:18 +03:00
farah kassabri
eaa43a06b7 accel/habanalabs: Allow single timestamp registration request at a time
Protect against concurrency of user requesting to register a timestamp
offset (where the driver fills the timestamp when the command submission
has finished executing) to a specific user interrupt ID. The
protection is basically to allow only one timestamp registration
request to be handled at a time.

This is needed because the user can decide to re-use a timestamp
offset (register an already registered offset, to a different
interrupt ID). This means the request will cause the timestamp node to
move from one interrupt list to another interrupt list. In such
scenario, without proper protection, we could end up adding the same
node twice to the interrupts wait lists.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:18 +03:00
Koby Elbaz
964b1f675d accel/habanalabs: rename fd_list to hpriv_list
Every time an FD is returned to the user, the driver adds
a corresponding private structure to the list.
Yet, it's still a list of private structures rather than of FDs.
Remove, as well, an unnecessary comment.

Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:18 +03:00
Koby Elbaz
942f18c56d accel/habanalabs: call put_pid after hpriv list is updated
Because we might still be using related resources, decrementing PID's
reference count should be done at later stages of the device release.
A good place is right after the representing private structure is
removed from LKD's list.

Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:18 +03:00
Koby Elbaz
2b541cf913 accel/habanalabs: print return code when process termination fails
As part of driver teardown, we attempt to kill all user processes.
It shouldn't fail, but if it does we want to print the error code that
the kapi returned to us.

Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:18 +03:00
farah kassabri
bffd2f16ae accel/habanalabs: fix standalone preboot descriptor request
The preboot used to statically allocate memory for the comms descriptor
on the device memory when driver requested the descriptor information.
Now preboot moved to dynamic memory allocation where it wants to check
the size the driver expects vs. what the f/w expects.
Note there are no backward compatibility issues as older f/w versions
simply ignore this value.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:18 +03:00
Dani Liberman
43d8acce60 accel/habanalabs: handle arc farm razwi
Implement razwi handling for arc farm and add it to arc farm sei
event handler.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:18 +03:00
Ofir Bitton
f17182d036 accel/habanalabs: stop fetching MME SBTE error cause
Because in this case we have only a single possible cause, we can
safely stop fetching the cause from firmware.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:18 +03:00
Koby Elbaz
e4a97d6b62 accel/habanalabs: set device status 'malfunction' while in rmmod
hl_device_status() returns the status of an acquired device.
If a device is going down (following an rmmod cmd),
it should be marked as an unusable/malfunctioning device, and
hence should not be acquired.
However, since this was not the case so far (i.e., a device going
down would inaccurately return 'in reset' status allowing the user
to acquire the device) it introduced a bug where as part of a reset
flow, the driver could not kill processes that have not run yet, and
since those processes aren't blocked from reacquiring a device,
we get eventually a new flow of a driver attempting to kill all
processes in a list that can't be ever really empty.

Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:18 +03:00
Tomer Tayar
e7b2902a33 accel/habanalabs: print task name upon creation of a user context
It is useful for debug to know which user process have acquired the
device.
Add this info to the relevant debug print, in addition to the already
printed user context's ASID.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:18 +03:00
Tomer Tayar
7dccb064a7 accel/habanalabs: print task name and request code upon ioctl failure
When an ioctl fails, it is useful to know what is the task command name
and the full ioctl request code, in addition to the task pid and the
ioctl number.
Add the additional information to the relevant debug error prints.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:17 +03:00
Ofir Bitton
c6a4f256ae accel/habanalabs: notify user about undefined opcode event
In order for user to be aware of undefined opcode events, we must
store all relevant information and notify user about the failure.
The user will fetch the stored info via info ioctl.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:17 +03:00
Tomer Tayar
a35c997601 accel/habanalabs: update pending reset flags with new reset requests
If hl_device_cond_reset() is called while a reset is already pending but
hasn't started, the reset request will be dropped.
If the flags of the new request are more severe, e.g. a hard reset while
the pending reset is a compute reset, the eventual reset won't be
suitable for the device status.

To prevent such cases, update the pending reset flags with the new
requests flags before the requests are dropped.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:17 +03:00
Tomer Tayar
5d89ce6f8c accel/habanalabs: prevent immediate hard reset due to 2 adjacent H/W events
When a H/W event is received while a user is registered to events, no
immediate hard reset will happen, and instead the user will be notified
and will have some time to handle it and eventually release the
device, after which the reset will be done.
If a user, as part of the handling and as part of the cleanup steps
towards releasing the device, unregisters from receiving those events,
and at that time an adjacent H/W event is received, it will be assumed
that the user is not registered to events and thus an immediate hard
reset is required.

To prevent such an unwanted immediate reset, modify the driver to
perform it if the user is not registered to events AND we don't already
have a pending reset for a previous H/W event.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-10-09 12:37:17 +03:00
Andreas Gruenbacher
6309727ef2 kthread: add kthread_stop_put
Add a kthread_stop_put() helper that stops a thread and puts its task
struct.  Use it to replace the various instances of kthread_stop()
followed by put_task_struct().

Remove the kthread_stop_put() macro in usbip that is similar but doesn't
return the result of kthread_stop().

[agruenba@redhat.com: fix kerneldoc comment]
  Link: https://lkml.kernel.org/r/20230911111730.2565537-1-agruenba@redhat.com
[akpm@linux-foundation.org: document kthread_stop_put()'s argument]
Link: https://lkml.kernel.org/r/20230907234048.2499820-1-agruenba@redhat.com
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-04 10:41:57 -07:00
Kees Cook
5e6a1c803f accel/ivpu: Annotate struct ivpu_job with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct ivpu_job.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Cc: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Cc: Oded Gabbay <ogabbay@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Tom Rix <trix@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Cc: llvm@lists.linux.dev
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://lore.kernel.org/r/20230922175416.work.272-kees@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2023-09-28 16:39:08 -07:00
Dave Airlie
79fb229b88 Merge tag 'drm-misc-next-2023-09-27' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v6.7-rc1:

UAPI Changes:
- drm_file owner is now updated during use, in the case of a drm fd
  opened by the display server for a client, the correct owner is
  displayed.
- Qaic gains support for the QAIC_DETACH_SLICE_BO ioctl to allow bo
  recycling.

Cross-subsystem Changes:
- Disable boot logo for au1200fb, mmpfb and unexport logo helpers.
  Only fbcon should manage display of logo.
- Update freescale in MAINTAINERS.
- Add some bridge files to bridge in MAINTAINERS.
- Update gma500 driver repo in MAINTAINERS to point to drm-misc.

Core Changes:
- Move size computations to drm buddy allocator.
- Make drm_atomic_helper_shutdown(NULL) a nop.
- Assorted small fixes in drm_debugfs, DP-MST payload addition error handling.
- Fix DRM_BRIDGE_ATTACH_NO_CONNECTOR handling.
- Handle bad (h/v)sync_end in EDID by clipping to htotal.
- Build GPUVM as a module.

Driver Changes:
- Simple drivers don't need to cache prepared result.
- Call drm_atomic_helper_shutdown() in shutdown/unbind for a whole lot
  more drm drivers.
- Assorted small fixes in amdgpu, ssd130x, bridge/it6621, accel/qaic,
  nouveau, tc358768.
- Add NV12 for komeda writeback.
- Add arbitration lost event to synopsis/dw-hdmi-cec.
- Speed up s/r in nouveau by not restoring some big bo's.
- Assorted nouveau display rework in preparation for GSP-RM,
  especially related to how the modeset sequence works and
  the DP sequence in relation to link training.
- Update anx7816 panel.
- Support NVSYNC and NHSYNC in tegra.
- Allow multiple power domains in simple driver.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/f1fae5eb-25b8-192a-9a53-215e1184ce81@linux.intel.com
2023-09-29 08:27:15 +10:00
Stanislaw Gruszka
d776f654d0 accel/ivpu: Compile ivpu_debugfs.c conditionally
Only compile ivpu_debugfs.c file with CONFIG_DEBUG_FS.

Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230907072610.433497-2-stanislaw.gruszka@linux.intel.com
2023-09-27 13:11:51 +02:00
Stanislaw Gruszka
c78199a78f accel/ivpu: Update debugfs to latest changes in DRM
Use new drm debugfs helpers. This is needed after changes from
commit 78346ebf9f94 ("drm/debugfs: drop debugfs_init() for the render
and accel node v2").

Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230907072610.433497-1-stanislaw.gruszka@linux.intel.com
2023-09-27 13:11:44 +02:00
Karol Wachowski
645d694559 accel/ivpu: Use cached buffers for FW loading
Create buffers with cache coherency on the CPU side (write-back) while
disabling snooping on the VPU side. These buffers require an explicit
cache flush after each CPU-side modification.

Configuring pages as write-combined may introduce significant delays,
potentially taking hundreds of milliseconds for 64 MB buffers.

Added internal DRM_IVPU_BO_NOSNOOP mask which disables snooping on the
VPU side. Allocate FW runtime memory buffer (64 MB) as cached with
snooping-disabled.

This fixes random long FW loading times and boot params memory
corruption on warmboot (due to missed wmb).

Fixes: 02d5b0aacd05 ("accel/ivpu: Implement firmware parsing and booting")
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230926120943.GD846747@linux.intel.com
2023-09-27 07:40:43 +02:00
Karol Wachowski
09bb81cf24 accel/ivpu/40xx: Fix missing VPUIP interrupts
Move sequence of masking and unmasking global interrupts from buttress
interrupt handler to generic one that handles both VPUIP and BTRS
interrupts.

Unmasking global interrupts will re-trigger MSI for any pending interrupts.
Lack of this sequence can randomly cause to miss any VPUIP interrupt that
comes after reading VPU_40XX_HOST_SS_ICB_STATUS_0 and before clearing
all active interrupt sources.

Fixes: 79cdc56c4a54 ("accel/ivpu: Add initial support for VPU 4")
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230925121137.872158-6-stanislaw.gruszka@linux.intel.com
2023-09-27 07:40:37 +02:00
Karol Wachowski
ec3e3adc6d accel/ivpu/40xx: Disable frequency change interrupt
Do not enable frequency change interrupt on 40xx as it might
lead to an interrupt storm in current design.

FREQ_CHANGE interrupt is triggered on D0I2 entry which will cause
KMD to check VPU interrupt sources by reading VPUIP registers.
Access to those registers will toggle necessary clocks and trigger
another FREQ_CHANGE interrupt possibly ending in an infinite loop.

FREQ_CHANGE interrupt has only debug purposes and can be permanently
disabled.

Fixes: 79cdc56c4a54 ("accel/ivpu: Add initial support for VPU 4")
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230925121137.872158-5-stanislaw.gruszka@linux.intel.com
2023-09-27 07:40:30 +02:00
Karol Wachowski
6c3f2f90cc accel/ivpu/40xx: Ensure clock resource ownership Ack before Power-Up
We need to wait for the CLOCK_RESOURCE_OWN_ACK bit to be set
after configuring the workpoint. This step ensures that the VPU
microcontroller clock is actively toggling and ready for operation.

Previously, we relied solely on the READY bit in the VPU_STATUS
register, which indicated the completion of the workpoint download.
However, this approach was insufficient, as the READY bit could be set
while the device was still running on a sideband clock until the PLL
locked. To guarantee that the PLL is locked and the device is running on
the main clock source, we now wait for the CLOCK_RESOURCE_OWN_ACK before
proceeding with the remainder of the power-up sequence.

Fixes: 79cdc56c4a54 ("accel/ivpu: Add initial support for VPU 4")
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230925121137.872158-4-stanislaw.gruszka@linux.intel.com
2023-09-27 07:40:23 +02:00
Jacek Lawrynowicz
0026525550 accel/ivpu: Don't flood dmesg with VPU ready message
Use ivpu_dbg() to print the VPU ready message so it doesn't pollute
the dmesg.

Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230925121137.872158-3-stanislaw.gruszka@linux.intel.com
2023-09-27 07:40:17 +02:00
Stanislaw Gruszka
b0873eead1 accel/ivpu: Do not use wait event interruptible
If we receive signal when waiting for IPC message response in
ivpu_ipc_receive() we return error and continue to operate.
Then the driver can send another IPC messages and re-use occupied
slot of the message still processed by the firmware. This can result
in corrupting firmware memory and following FW crash with messages:

[ 3698.569719] intel_vpu 0000:00:0b.0: [drm] ivpu_ipc_send_receive_internal(): IPC receive failed: type 0x1103, ret -512
[ 3698.569747] intel_vpu 0000:00:0b.0: [drm] ivpu_jsm_unregister_db(): Failed to unregister doorbell 3: -512
[ 3698.569756] intel_vpu 0000:00:0b.0: [drm] ivpu_ipc_tx_prepare(): IPC message vpu:0x88980000 not released by firmware
[ 3698.569763] intel_vpu 0000:00:0b.0: [drm] ivpu_ipc_tx_prepare(): JSM message vpu:0x88980040 not released by firmware
[ 3698.570234] intel_vpu 0000:00:0b.0: [drm] ivpu_ipc_send_receive_internal(): IPC receive failed: type 0x110e, ret -512
[ 3698.570318] intel_vpu 0000:00:0b.0: [drm] *ERROR* ivpu_mmu_dump_event(): MMU EVTQ: 0x10 (Translation fault) SSID: 0 SID: 3, e[2] 00000000, e[3] 00000208, in addr: 0x88988000, fetch addr: 0x0

To fix the issue don't use interruptible variant of wait event to
allow firmware to finish IPC processing.

Fixes: 5d7422cfb498 ("accel/ivpu: Add IPC driver and JSM messages")
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230925121137.872158-2-stanislaw.gruszka@linux.intel.com
2023-09-27 07:39:46 +02:00
Stanislaw Gruszka
9c1b2429c1 accel/ivpu: Add Arrow Lake pci id
Enable VPU on Arrow Lake CPUs.

Reviewed-by: Krystian Pradzynski <krystian.pradzynski@linux.intel.com>
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230922132206.812817-1-stanislaw.gruszka@linux.intel.com
2023-09-25 08:49:28 +02:00
Pranjal Ramajor Asha Kanojiya
217b812364 accel/qaic: Add QAIC_DETACH_SLICE_BO IOCTL
Once a BO is attached with slicing configuration that BO can only be used
for that particular setting. With this new feature user can detach slicing
configuration off an already sliced BO and attach new slicing configuration
using QAIC_ATTACH_SLICE_BO.

This will support BO recycling.

detach_slice_bo() detaches slicing configuration from a BO. This new
helper function can also be used in release_dbc() as we are doing the
exact same thing.

Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
[jhugo: add documentation for new ioctl]
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230901172247.11410-8-quic_jhugo@quicinc.com
2023-09-22 09:58:49 -06:00
Pranjal Ramajor Asha Kanojiya
b2576f6cf6 accel/qaic: Create a function to initialize BO
This makes sure that we have a single place to initialize and
re-initialize BO.

Use this new API to cleanup release_dbc()

We will need this for next patch to detach slicing to a BO.

Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230901172247.11410-7-quic_jhugo@quicinc.com
2023-09-22 09:58:13 -06:00
Pranjal Ramajor Asha Kanojiya
0a9ee93b82 accel/qaic: Clean up BO during flushing of transfer list
Variables that are set while adding the corresponding BO in transfer list
should be cleaned when flushing them out of transfer list prematurely.

After this patch we do not need some of the cleanup done in release_dbc()

This patch would also pave the way to have a central location to clean BO,
during an undesired situation.

Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230901172247.11410-6-quic_jhugo@quicinc.com
2023-09-22 09:57:46 -06:00
Pranjal Ramajor Asha Kanojiya
b3107e75a9 accel/qaic: Undo slicing setup done in qaic_attach_slicing_bo()
qaic_attach_slicing_bo() updates slicing config on BO. Use the existing
function qaic_free_slices_bo() to remove slicing config done in
qaic_attach_slicing_bo().

Use qaic_free_slices_bo() to cleanup release_dbc()

This would be helpful when we introduce a new IOCTL to detach slicing
configuration onto a BO.

Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230901172247.11410-5-quic_jhugo@quicinc.com
2023-09-22 09:56:54 -06:00
Pranjal Ramajor Asha Kanojiya
77f71e153f accel/qaic: Declare BO 'sliced' after all the operations are complete
Once the BO is declared 'sliced' by setting bo->sliced to true we can
perform DMA (QAIC_EXECUTE_BO) operation on that BO. Hence we should
declare a BO sliced after completing all the operations.

Adding BO to its respective DBC list in qaic_attach_slicing_bo() seems
out of place as qaic_attach_slicing_bo() should just update BO with
slicing configuration.

Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230901172247.11410-4-quic_jhugo@quicinc.com
2023-09-22 09:56:23 -06:00
Pranjal Ramajor Asha Kanojiya
76d42aa951 accel/qaic: Update BO metadata in a central location
Update/Clean up BO metadata in a central location, this will help maintain
the code and looks cleaner.

Use qaic_unprepare_bo() to cleanup release_dbc()

Next few patches will be implementing detach IOCTL which will leverage
this patch.

Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230901172247.11410-3-quic_jhugo@quicinc.com
2023-09-22 09:55:52 -06:00
Pranjal Ramajor Asha Kanojiya
cb850f6f69 accel/qaic: Remove ->size field from struct qaic_bo
->size field in struct qaic_bo stores user requested buffer size for
allocate path or size of the dmabuf(PRIME). Now for allocate path driver
allocates a BO of size which is PAGE_SIZE aligned, this size is already
stored in base BO structure (struct drm_gem_object).

So difference is ->size of struct qaic_bo stores the raw value coming from
user and ->size in struct drm_gem_object stores the PAGE_SZIE aligned size.

Do not use ->size from struct qaic_bo for any validation or operation
instead use ->size from struct drm_gem_object since we already have
allocated that much memory then why not use it. Only validate if user
is trying to use more then the BO size. This make the driver more flexible.

After this change ->size field of struct qaic_bo becomes redundant. Remove
it.

Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230901172247.11410-2-quic_jhugo@quicinc.com
2023-09-22 09:55:12 -06:00
Dave Airlie
f107ff76a8 Merge tag 'drm-misc-next-2023-09-11-1' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v6.7-rc1:

UAPI Changes:
- Nouveau changed to not set NO_PREFETCH flag explicitly.

Cross-subsystem Changes:
- Update documentation of dma-buf intro and uapi.
- fbdev/sbus fixes.
- Use initializer macros in a lot of fbdev drivers.
- Add Boris Brezillon as Panfrost driver maintainer.
- Add Jessica Zhang as drm/panel reviewer.
- Make more fbdev drivers use fb_ops helpers for deferred io.
- Small hid trailing whitespace fix.
- Use fb_ops in hid/picolcd

Core Changes:
- Assorted small fixes to ttm tests, drm/mst.
- Documentation updates to bridge.
- Add kunit tests for some drm_fb functions.
- Rework drm_debugfs implementation.
- Update xe documentation to mark todos as completed.

Driver Changes:
- Add support to rockchip for rv1126 mipi-dsi and vop.
- Assorted small fixes to nouveau, bridge/samsung-dsim,
  bridge/lvds-codec, loongson, rockchip, panfrost, gma500, repaper,
  komeda, virtio, ssd130x.
- Add support for simple panels Mitsubishi AA084XE01,
  JDI LPM102A188A,
- Documentation updates to accel/ivpu.
- Some nouveau scheduling/fence fixes.
- Power management related fixes and other fixes to ivpu.
- Assorted bridge/it66121 fixes.
- Make platform drivers return void in remove() callback.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/3da6554b-3b47-fe7d-c4ea-21f4f819dbb6@linux.intel.com
2023-09-22 16:28:36 +10:00