IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
In Xe, the new Intel GPU driver, a choice has made to have a 1 to 1
mapping between a drm_gpu_scheduler and drm_sched_entity. At first this
seems a bit odd but let us explain the reasoning below.
1. In Xe the submission order from multiple drm_sched_entity is not
guaranteed to be the same completion even if targeting the same hardware
engine. This is because in Xe we have a firmware scheduler, the GuC,
which allowed to reorder, timeslice, and preempt submissions. If a using
shared drm_gpu_scheduler across multiple drm_sched_entity, the TDR falls
apart as the TDR expects submission order == completion order. Using a
dedicated drm_gpu_scheduler per drm_sched_entity solve this problem.
2. In Xe submissions are done via programming a ring buffer (circular
buffer), a drm_gpu_scheduler provides a limit on number of jobs, if the
limit of number jobs is set to RING_SIZE / MAX_SIZE_PER_JOB we get flow
control on the ring for free.
A problem with this design is currently a drm_gpu_scheduler uses a
kthread for submission / job cleanup. This doesn't scale if a large
number of drm_gpu_scheduler are used. To work around the scaling issue,
use a worker rather than kthread for submission / job cleanup.
v2:
- (Rob Clark) Fix msm build
- Pass in run work queue
v3:
- (Boris) don't have loop in worker
v4:
- (Tvrtko) break out submit ready, stop, start helpers into own patch
v5:
- (Boris) default to ordered work queue
v6:
- (Luben / checkpatch) fix alignment in msm_ringbuffer.c
- (Luben) s/drm_sched_submit_queue/drm_sched_wqueue_enqueue
- (Luben) Update comment for drm_sched_wqueue_enqueue
- (Luben) Positive check for submit_wq in drm_sched_init
- (Luben) s/alloc_submit_wq/own_submit_wq
v7:
- (Luben) s/drm_sched_wqueue_enqueue/drm_sched_run_job_queue
v8:
- (Luben) Adjust var names / comments
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://lore.kernel.org/r/20231031032439.1558703-3-matthew.brost@intel.com
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>
Add scheduler wqueue ready, stop, and start helpers to hide the
implementation details of the scheduler from the drivers.
v2:
- s/sched_wqueue/sched_wqueue (Luben)
- Remove the extra white line after the return-statement (Luben)
- update drm_sched_wqueue_ready comment (Luben)
Cc: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://lore.kernel.org/r/20231031032439.1558703-2-matthew.brost@intel.com
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>
When a fence signals there is a very small race window where the timestamp
isn't updated yet. sync_file solves this by busy waiting for the
timestamp to appear, but on other ocassions didn't handled this
correctly.
Provide a dma_fence_timestamp() helper function for this and use it in
all appropriate cases.
Another alternative would be to grab the spinlock when that happens.
v2 by teddy: add a wait parameter to wait for the timestamp to show up, in case
the accurate timestamp is needed and/or the timestamp is not based on
ktime (e.g. hw timestamp)
v3 chk: drop the parameter again for unified handling
Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 1774baa64f93 ("drm/scheduler: Change scheduled fence track v2")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
CC: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20230929104725.2358-1-christian.koenig@amd.com
VPU was rebranded as NPU (Neural Processing Unit) so user facing
strings have to be updated but the code remains as is and the module
is still called intel_vpu.ko.
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231028155936.1183342-9-stanislaw.gruszka@linux.intel.com
CMD_SYNC does not need any args as we poll for completion anyway.
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231028155936.1183342-8-stanislaw.gruszka@linux.intel.com
Previously using dma_alloc_wc() API we created cache coherent
(mapped as write-back) mappings.
Because we disable MMU600 snooping it was required to do costly
page walk and cache flushes after each page table modification.
With write-combined buffers it's possible to do a single write memory
barrier to flush write-combined buffer to memory which simplifies the
driver and significantly reduce time of map/unmap operations.
Mapping time of 255 MB is reduced from 2.5 ms to 500 us.
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231028155936.1183342-7-stanislaw.gruszka@linux.intel.com
Waking up process, which wait for particular condition, will go to
sleep again on wake_up() if the condition is not met. Add abort flag
to wake up IPC receivers, which will finish with -ECANCELED error.
This is only needed for reset, run time power management prevent to
suspend VPU when there is pending IPC processing or pending job.
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231028155936.1183342-5-stanislaw.gruszka@linux.intel.com
Stop job_done thread when going to suspend. Use kthread_park() instead
of kthread_stop() to avoid memory allocation and potential failure
on resume.
Use separate function as thread wake up condition. Use spin lock to assure
rx_msg_list is properly protected against concurrent access. This avoid
race condition when the rx_msg_list list is modified and read in
ivpu_ipc_recive() at the same time.
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231028155936.1183342-4-stanislaw.gruszka@linux.intel.com
We should not leave device half enabled if there is failure somewhere
it power up sequence. Fix device init and resume paths.
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231028155936.1183342-3-stanislaw.gruszka@linux.intel.com
Profiling freq is a debug firmware feature. It switches default clock
to higher resolution for fine-grained and more accurate firmware task
profiling. We already configure it during boot up of VPU4.
Add debugfs knob and helpers per HW generation that allow to change it.
For vpu37xx the implementation is empty as profiling frequency can only
be changed on VPU4 or newer.
Signed-off-by: Krystian Pradzynski <krystian.pradzynski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231028155936.1183342-2-stanislaw.gruszka@linux.intel.com
Currently, we are only warning the user if the BIN or RENDER jobs don't
finish before we unregister V3D. We must wait for all jobs to finish
before unregistering. Therefore, warn the user if TFU or CSD jobs
are not done by the time the driver is unregistered.
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Maíra Canal <mairacanal@riseup.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20231023105927.101502-1-mcanal@igalia.com
Currently the VPU firmware prepares for D0i3 every time the VPU
is entering D0i2 Idle state. This is not optimal as we might not
enter D0i3 every time we enter D0i2 Idle and this preparation
is quite costly.
This optimization moves D0i3 preparation to a dedicated
message sent from the host driver only when the driver is about
to enter D0i3 - this reduces power consumption and latency for
certain workloads, for example audio workloads that submit
inference every 10 ms.
The VPU needs non zero time to enter IDLE state after responding to
D0i3 entry message. If the driver does not wait for the VPU to enter
IDLE state it could cause warm boot failures.
Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231028133415.1169975-12-stanislaw.gruszka@linux.intel.com
Split ivpu_ipc_send_receive() implementation to have a version
that does not call pm_runtime_resume_and_get(). That implementation
can be invoked when device is up and runtime resume is prohibited
(for example at the end of boot sequence).
The new function will be used for D0i3 entry IPC message addition
in the separate change.
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231028133415.1169975-10-stanislaw.gruszka@linux.intel.com
The firmware needs to know the time spent in D0i3/D3 to
calculate telemetry data. The D0i3/D3 residency time is
calculated by the driver and passed to the firmware
in the boot parameters.
The driver also passes VPU perf counter value captured
right before entering D0i3 - this allows the VPU firmware
to generate monotonic timestamps for the logs.
Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231028133415.1169975-9-stanislaw.gruszka@linux.intel.com
The driver needs to capture the D0i3 entry timestamp to
calculate D0i3 residency time.
The D0i3 residency time and the VPU timestamp are passed
to the firmware at D0i3 exit (warm boot).
Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231028133415.1169975-8-stanislaw.gruszka@linux.intel.com
Change meaning of test_mode module parameter from integer value
to bitmask allowing setting different test features with corresponding
bits.
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231028133415.1169975-7-stanislaw.gruszka@linux.intel.com
Add test_mode = 3 that add VPU_JOB_FLAGS_NULL_SUBMISSION_MASK
flag to the job send to the VPU device. Then the VPU will process
the job but won't execute commands (except the command to signal
the fence).
This can be used to estimate job processing overhead in the host
software and VPU firmware.
Unlike the null hardware mode, the null submission mode will
still work even if UMD uses VPU fences to track job completion.
Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231028133415.1169975-6-stanislaw.gruszka@linux.intel.com
Setting a non-zero work point resets the IP hence IP_RESET
trigger is redundant.
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231028133415.1169975-5-stanislaw.gruszka@linux.intel.com
Add new debugfs file to set dvfs_mode FW boot parameter and restart
the FW to allow experimenting with DVFS (dynamic voltage & frequency
scaling).
Signed-off-by: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231028133415.1169975-4-stanislaw.gruszka@linux.intel.com
The usage count of struct dev_pm_info is an implementation detail that
is only available if CONFIG_PM is enabled, so printing it in a debug message
causes a build failure in configurations without PM:
In file included from include/linux/device.h:15,
from include/linux/pci.h:37,
from drivers/accel/ivpu/ivpu_pm.c:8:
drivers/accel/ivpu/ivpu_pm.c: In function 'ivpu_rpm_get_if_active':
drivers/accel/ivpu/ivpu_pm.c:254:51: error: 'struct dev_pm_info' has no member named 'usage_count'
254 | atomic_read(&vdev->drm.dev->power.usage_count));
| ^
include/linux/dev_printk.h:129:48: note: in definition of macro 'dev_printk'
129 | _dev_printk(level, dev, fmt, ##__VA_ARGS__); \
| ^~~~~~~~~~~
drivers/accel/ivpu/ivpu_drv.h:75:17: note: in expansion of macro 'dev_dbg'
75 | dev_dbg((vdev)->drm.dev, "[%s] " fmt, #type, ##args); \
| ^~~~~~~
drivers/accel/ivpu/ivpu_pm.c:253:9: note: in expansion of macro 'ivpu_dbg'
253 | ivpu_dbg(vdev, RPM, "rpm_get_if_active count %d\n",
| ^~~~~~~~
The print message does not seem essential, so the easiest workaround is
to just remove it.
Fixes: c39dc15191c4 ("accel/ivpu: Read clock rate only if device is up")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231027152633.528490-1-arnd@kernel.org
Use QAIC_TIMESYNC MHI channel to send UTC time to device in SBL
environment. Remove support for QAIC_TIMESYNC MHI channel in AMSS
environment as it is not used in that environment.
Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231016170114.5446-3-quic_jhugo@quicinc.com
Device and Host have a time synchronization mechanism that happens once
during boot when device is in SBL mode. After that, in mission-mode there
is no timesync. In an experiment after continuous operation, device time
drifted w.r.t. host by approximately 3 seconds per day. This drift leads
to mismatch in timestamp of device and Host logs. To correct this
implement periodic timesync in driver. This timesync is carried out via
QAIC_TIMESYNC_PERIODIC MHI channel.
Signed-off-by: Ajit Pal Singh <quic_ajitpals@quicinc.com>
Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com>
Reviewed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231016170114.5446-2-quic_jhugo@quicinc.com
Several virtualization use-cases either don't support 32 MultiMSIs
(Xen/VMware) or have significant drawbacks to their use (KVM's vIOMMU,
which is required to support 32 MSI, needs to allocate an alternate
system memory space for each device using vIOMMU (e.g. 8GB VM mem and
2 cards => 8 + 2 * 8 = 24GB host memory required)). Support these
cases by enabling a 1 MSI fallback mode.
Whenever all 32 MSIs requested are not available, a second request for
a single MSI is made. Its success is the initiator of single MSI mode.
This mode causes all interrupts generated by the device to be directed
to the 0th MSI (firmware >=v1.10 will do this as a response to the PCIe
MSI capability configuration). Likewise, all interrupt handlers for the
device are registered to the 0th MSI.
Since the DBC interrupt handler checks if the DBC is in use or if
there is any pending changes, the 'spurious' interrupts are
disregarded. If there is work to be done, the standard threaded IRQ
handler is dispatched.
On every interrupt, the MHI handler wakes up its threaded interrupt
handler, and attempts to wake any waiters for MHI state events.
Performance is within +-0.6% for test cases that typify real world
use. Larger differences ([-4,+132]%, avg +47%) exist for very simple
tasks (e.g. addition) compiled for single NSPs. It is assumed that the
small work and many interrupts typically cause contention (e.g. 16 NSPs
vs 4 CPUs), as evidenced by the standard deviation between runs also
decreasing (r=-0.48 between delta(Performace_test) and
delta(StdDev_test/Avg_test))
Signed-off-by: Carl Vanderlip <quic_carlv@quicinc.com>
Reviewed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231016170036.5409-1-quic_jhugo@quicinc.com
This is already uAPI, xserver parses it. It's useful to document
since user-space might want to lookup the parent connector.
Additionally, people (me included) have misunderstood the PATH
property for being stable across reboots, but since a KMS object
ID is baked in there that's not the case. So PATH shouldn't be
used as-is in config files and such.
Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20231023203629.198109-1-contact@emersion.fr
drm_mode_rmfb performs two operations: drop the FB from the
file_priv->fbs list, and make sure the FB is no longer used on a
plane.
In the next commit an IOCTL which only does so former will be
introduced, so let's split it into a separate function.
No functional change, only refactoring.
v2: no change
Signed-off-by: Simon Ser <contact@emersion.fr>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Dennis Filder <d.filder@web.de>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Pekka Paalanen <ppaalanen@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231020101926.145327-1-contact@emersion.fr
Avoid a possible uninitialized use of the crtc_state variable in function
ssd132x_primary_plane_atomic_check() and avoid the following Smatch warn:
drivers/gpu/drm/solomon/ssd130x.c:921 ssd132x_primary_plane_atomic_check()
error: uninitialized symbol 'crtc_state'.
Fixes: fdd591e00a9c ("drm/ssd130x: Add support for the SSD132x OLED controller family")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/dri-devel/7dd6ca45-8263-44fe-a318-2fd9d761425d@moroto.mountain/
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Jocelyn Falempe <jfalempe@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231020225338.1686974-1-javierm@redhat.com
Rather than doing this in the IP code for the SDMA paging
engine, move it up to the core device level init level.
This should fix the scheduler init ordering.
v2: drop extra parens
v3: drop SDMA helpers
v4: Added a Fixes tag because amdgpu dereferences an uninitialized
scheduler without this patch, and this patch fixes this. (Luben)
Tested-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20231025171928.3318505-1-alexander.deucher@amd.com
Acked-by: Christian König <christian.koenig@amd.com>
Fixes: 56e449603f0ac5 ("drm/sched: Convert the GPU scheduler to variable number of run-queues")
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>
Update the GPU Scheduler maintainer email.
Cc: Alex Deucher <Alexander.Deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dave Airlie <airlied@gmail.com>
Cc: AMD Graphics <amd-gfx@lists.freedesktop.org>
Cc: Direct Rendering Infrastructure - Development <dri-devel@lists.freedesktop.org>
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>
Acked-by: Alex Deucher <Alexander.Deucher@amd.com>
Link: https://lore.kernel.org/r/20231026174438.18427-2-ltuikov89@gmail.com
The GPU scheduler has now a variable number of run-queues, which are set up at
drm_sched_init() time. This way, each driver announces how many run-queues it
requires (supports) per each GPU scheduler it creates. Note, that run-queues
correspond to scheduler "priorities", thus if the number of run-queues is set
to 1 at drm_sched_init(), then that scheduler supports a single run-queue,
i.e. single "priority". If a driver further sets a single entity per
run-queue, then this creates a 1-to-1 correspondence between a scheduler and
a scheduled entity.
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Abhinav Kumar <quic_abhinavk@quicinc.com>
Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Emma Anholt <emma@anholt.net>
Cc: etnaviv@lists.freedesktop.org
Cc: lima@lists.freedesktop.org
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20231023032251.164775-1-luben.tuikov@amd.com
DRM CI keeps track of which tests are failing, flaking or being skipped
by the ci in the expectations files. Add entries for those files to the
corresponding driver maintainer, so they can be notified when they
change.
Signed-off-by: Helen Koike <helen.koike@collabora.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20230919182249.153499-1-helen.koike@collabora.com
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Clarify the procedure developer must follow to request privileges to
run tests on Freedesktop gitlab CI.
This measure was added to avoid untrusted people to misuse the
infrastructure.
Signed-off-by: Helen Koike <helen.koike@collabora.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Link: https://lore.kernel.org/r/20231024004525.169002-11-helen.koike@collabora.com
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Since the kernel doesn't use a bot like Mesa that requires tests to pass
in order to merge the patches, leave it to developers and/or maintainers
to manually retry.
Suggested-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Helen Koike <helen.koike@collabora.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Link: https://lore.kernel.org/r/20231024004525.169002-10-helen.koike@collabora.com
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Export the resultant kernel config, making it easier to verify if the
resultant config was correctly generated.
Suggested-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Helen Koike <helen.koike@collabora.com>
Acked-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Link: https://lore.kernel.org/r/20231024004525.169002-9-helen.koike@collabora.com
Signed-off-by: Maxime Ripard <mripard@kernel.org>
With the new sharding, the default job timeout is not enough for i915
and their jobs are failing before completing.
See below the current execution time:
🞋 job i915:tgl 8/8 has new status: success (37m3s)
🞋 job i915:tgl 7/8 has new status: success (19m43s)
🞋 job i915:tgl 6/8 has new status: success (21m47s)
🞋 job i915:tgl 5/8 has new status: success (18m16s)
🞋 job i915:tgl 4/8 has new status: success (21m43s)
🞋 job i915:tgl 3/8 has new status: success (17m59s)
🞋 job i915:tgl 2/8 has new status: success (22m15s)
🞋 job i915:tgl 1/8 has new status: success (18m52s)
🞋 job i915:cml 2/2 has new status: success (1h19m58s)
🞋 job i915:cml 1/2 has new status: success (55m45s)
🞋 job i915:whl 2/2 has new status: success (1h8m56s)
🞋 job i915:whl 1/2 has new status: success (54m3s)
🞋 job i915:kbl 3/3 has new status: success (37m43s)
🞋 job i915:kbl 2/3 has new status: success (36m37s)
🞋 job i915:kbl 1/3 has new status: success (34m52s)
🞋 job i915:amly 2/2 has new status: success (1h7m60s)
🞋 job i915:amly 1/2 has new status: success (59m18s)
🞋 job i915:glk 2/2 has new status: success (58m26s)
🞋 job i915:glk 1/2 has new status: success (50m23s)
🞋 job i915:apl 3/3 has new status: success (1h6m39s)
🞋 job i915:apl 2/3 has new status: success (1h4m45s)
🞋 job i915:apl 1/3 has new status: success (1h7m38s)
(generated with ci_run_n_monitor.py script)
The longest job is 1h19m58s, so adjust the timeout.
Signed-off-by: Helen Koike <helen.koike@collabora.com>
Link: https://lore.kernel.org/r/20231024004525.169002-8-helen.koike@collabora.com
Signed-off-by: Maxime Ripard <mripard@kernel.org>
The Collabora Lava farm added a tag called `subset-1-gfx` to half of
devices the graphics community use.
Lets use this tag so we don't occupy all the resources.
This is particular important because Mesa3D shares the resources with
DRM-CI and use them to do pre-merge tests, so it can block developers
from getting their patches merged.
Signed-off-by: Helen Koike <helen.koike@collabora.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Link: https://lore.kernel.org/r/20231024004525.169002-7-helen.koike@collabora.com
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Since the script that collected the list of the expectation files was
bogus and placing test to the flakes list incorrectly, restart the
expectation files with the correct script.
This reduces a lot the number of tests in the flakes list.
Signed-off-by: Helen Koike <helen.koike@collabora.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Link: https://lore.kernel.org/r/20231024004525.169002-6-helen.koike@collabora.com
Signed-off-by: Maxime Ripard <mripard@kernel.org>
IGT has recently merged a patch that makes code_getversion test to fails
if the driver isn't loaded or if it isn't the expected one defined in
variable IGT_FORCE_DRIVER.
Without this test, jobs were passing when the driver didn't load or
probe for some reason, giving the illusion that everything was ok.
Uprev IGT to include this modification and include core_getversion test
in all the shards.
Signed-off-by: Helen Koike <helen.koike@collabora.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Link: https://lore.kernel.org/r/20231024004525.169002-5-helen.koike@collabora.com
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Add helper script that given a gitlab pipeline url, analyse which are
the failures and flakes and update the xfails folder accordingly.
Example:
Trigger a pipeline in gitlab infrastructure, than re-try a few jobs more
than once (so we can have data if failures are consistent across jobs
with the same name or if they are flakes) and execute:
update-xfails.py https://gitlab.freedesktop.org/helen.fornazier/linux/-/pipelines/970661
git diff should show you that it updated files in xfails folder.
Signed-off-by: Helen Koike <helen.koike@collabora.com>
Tested-by: Vignesh Raman <vignesh.raman@collabora.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Link: https://lore.kernel.org/r/20231024004525.169002-4-helen.koike@collabora.com
Signed-off-by: Maxime Ripard <mripard@kernel.org>
amdgpu driver wasn't loading because amdgpu firmware wasn't being
installed in the rootfs due to the wrong DEBIAN_ARCH variable.
rename ARCH to DEBIAN_ARCH also, so we don't have the confusing
DEBIAN_ARCH, KERNEL_ARCH and ARCH.
Signed-off-by: Helen Koike <helen.koike@collabora.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Link: https://lore.kernel.org/r/20231024004525.169002-3-helen.koike@collabora.com
Signed-off-by: Maxime Ripard <mripard@kernel.org>