linux/drivers/misc
Yuri Nudelman 17ab47d2d6 habanalabs/gaudi: fix a race condition causing DMAR error
There is a rare race condition in the CB completion mechanism that can
occur under very high command submission pressure.
The preconditions for this to happen are:

 1. There should be enough command submissions for the pre-allocated
    patched CB pool to run out of commands. At this stage we start
    allocating new patched CBs as they arrive.
 2. CB size has to be exactly (128*n + 104)B for some n, i.e. 24B below
    a cache line end.

The flow:

 1. Two command buffers are completed on different streams at the
    same time. Denote them CB1 and CB2.
 2. Each command buffer is injected with two messages, 16B each - one
    for an HBW update of the completion queue, another to raise an
    interrupt.
 3. Assume CB1 updated the completion queue and raised the interrupt.
 4. Assume CB2 updated the completion queue but did not raise the
    interrupt yet.
 5. The host receives the interrupt. It goes over the completion queue,
    sees two completions - CB1 and CB2 - and releases them both.
 6. CB2 performs the last command. The problem is that the last command
    is split between 2 cache lines, so to read the last 8B of the last
    command, the device has to access the host again. But CB2 is
    already released, so this access causes a DMAR error.
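
The split in step 6 can be checked with simple offset arithmetic. The
sketch below is illustrative only and is not taken from the driver: the
function name is hypothetical, and it assumes the CB starts on a cache
line boundary, that cache lines are 128B, and that the two 16B messages
are appended directly after the CB contents.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 128u /* assumed from the (128*n + 104)B precondition */
#define MSG_SIZE        16u  /* each injected message is 16B */

/*
 * Hypothetical helper: returns true if the second (last) injected message
 * straddles a cache line boundary. Assumes the CB starts cache line aligned.
 */
static bool last_message_is_split(uint32_t cb_size)
{
    /* Guard against wrap-around when adding the two messages. */
    assert(cb_size <= UINT32_MAX - 2u * MSG_SIZE);

    uint32_t first_byte = cb_size + MSG_SIZE;       /* after the first message */
    uint32_t last_byte = first_byte + MSG_SIZE - 1; /* inclusive end */

    return first_byte / CACHE_LINE_SIZE != last_byte / CACHE_LINE_SIZE;
}
```

For a 104B CB the last message occupies bytes 120 to 135, so its final
8B fall in the next cache line, which matches the description above.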

The solution to this problem is simply to make sure the last two
commands in the CB are always in the same cache line, using NOP padding.
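
One way to size such padding is sketched below. This is an assumption
about how the calculation could look, not the code from this commit:
the names are hypothetical, the CB is assumed to start on a 128B cache
line boundary, and the result is a byte count to be filled with NOPs.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 128u /* assumed from the (128*n + 104)B precondition */
#define MSG_SIZE        16u  /* each injected message is 16B */
#define NUM_MSGS        2u   /* completion queue update + interrupt */

/*
 * Hypothetical helper: bytes of padding to place after the CB contents so
 * that both injected messages land in the same cache line.
 */
static uint32_t completion_padding(uint32_t cb_size)
{
    uint32_t tail = NUM_MSGS * MSG_SIZE;
    uint32_t offset = cb_size % CACHE_LINE_SIZE;

    /* The sketch only covers tails that fit in a single cache line. */
    assert(tail <= CACHE_LINE_SIZE);

    /* If the tail would cross the line end, pad up to the next line. */
    if (offset + tail > CACHE_LINE_SIZE)
        return CACHE_LINE_SIZE - offset;

    return 0;
}

/* Hypothetical helper: total bytes added after the CB contents. */
static uint32_t completion_extra_size(uint32_t cb_size)
{
    return completion_padding(cb_size) + NUM_MSGS * MSG_SIZE;
}
```

With these assumptions, a 104B CB gets 24B of padding, so both messages
start at offset 128 and sit in one cache line. A 96B CB needs no padding
because the messages end exactly at the cache line boundary.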

Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-07-12 09:09:24 +03:00
altera-stapl altera-stapl: Use swap() instead of open coding it 2022-05-09 15:39:45 +02:00
bcm-vk misc: bcm-vk: replace usage of found with dedicated list iterator variable 2022-04-24 17:30:27 +02:00
c2port
cardreader Merge 5.19-rc6 into char-misc-next 2022-07-11 08:32:58 +02:00
cb710 cb710: avoid NULL pointer subtraction 2021-10-05 15:50:05 +02:00
cxl cxl: drop unexpected word "the" in the comments 2022-06-27 16:15:38 +02:00
echo
eeprom Merge 5.19-rc6 into char-misc-next 2022-07-11 08:32:58 +02:00
genwqe Merge 5.15-rc3 into char-misc next 2021-09-27 15:39:40 +02:00
habanalabs habanalabs/gaudi: fix a race condition causing DMAR error 2022-07-12 09:09:24 +03:00
ibmasm
lis3lv02d spi: make remove callback a void function 2022-02-09 13:00:45 +00:00
lkdtm lkdtm: cfi: use NULL for a null pointer rather than zero 2022-06-27 16:16:07 +02:00
mei mei: me: add raptor lake point S DID 2022-06-10 15:39:24 +02:00
ocxl cxl/ocxl: Prepare cleanup of powerpc's asm/prom.h 2022-05-11 23:06:39 +10:00
pvpanic misc/pvpanic: Convert regular spinlock into trylock on panic path 2022-04-29 16:54:59 +02:00
sgi-gru misc: sgi-gru: grukservices: drop unexpected word "the" in the comments 2022-06-27 16:15:17 +02:00
sgi-xp sgi-xp: Use the bitmap API to allocate bitmaps 2022-07-08 15:41:39 +02:00
ti-st
uacce uacce: Handle parent device removal or parent driver module rmmod 2022-07-01 10:35:08 +02:00
vmw_vmci VMCI: Add support for ARM64 2022-04-24 17:32:14 +02:00
ad525x_dpot-i2c.c misc: ad525x_dpot: Make ad_dpot_remove() return void 2021-10-13 14:35:37 +02:00
ad525x_dpot-spi.c spi: make remove callback a void function 2022-02-09 13:00:45 +00:00
ad525x_dpot.c misc: ad525x_dpot: Make ad_dpot_remove() return void 2021-10-13 14:35:37 +02:00
ad525x_dpot.h misc: ad525x_dpot: Make ad_dpot_remove() return void 2021-10-13 14:35:37 +02:00
apds990x.c
apds9802als.c
atmel-ssc.c misc: atmel-ssc: Fix IRQ check in ssc_probe 2022-06-10 15:29:56 +02:00
bh1770glc.c
cs5535-mfgpt.c
ds1682.c
dummy-irq.c
dw-xdata-pcie.c
enclosure.c misc: enclosure: replace snprintf in show functions with sysfs_emit 2021-10-22 11:25:39 +02:00
fastrpc.c misc: fastrpc: fix list iterator in fastrpc_req_mem_unmap_impl 2022-05-19 18:57:20 +02:00
gehc-achc.c misc: gehc: Add SPI ID table 2021-10-05 15:47:18 +02:00
hi6421v600-irq.c misc: hi6421-spmi-pmic: Use generic_handle_irq_safe(). 2022-03-02 22:28:50 +01:00
hisi_hikey_usb.c misc: hisi_hikey_usb: change the DT schema 2021-09-14 10:57:31 +02:00
hmc6352.c
hpilo.c
hpilo.h
ibmvmc.c
ibmvmc.h
ics932s401.c
isl29003.c
isl29020.c
Kconfig misc: fastrpc: Add support to secure memory map 2022-03-18 14:11:00 +01:00
kgdbts.c kgdbts: fix return value of __setup handler 2022-03-18 14:17:56 +01:00
lattice-ecp3-config.c spi: make remove callback a void function 2022-02-09 13:00:45 +00:00
Makefile misc: open-dice: Add driver to expose DICE data to userspace 2022-02-04 16:45:39 +01:00
open-dice.c misc: open-dice: Add driver to expose DICE data to userspace 2022-02-04 16:45:39 +01:00
pch_phub.c
pci_endpoint_test.c misc: pci_endpoint_test: Terminate statement with semicolon 2022-01-11 10:19:59 -06:00
phantom.c
qcom-coincell.c
sram-exec.c
sram.c misc: sram: Add compatible string for Tegra234 SYSRAM 2021-12-08 15:16:05 +01:00
sram.h
tifm_7xx1.c tifm: Remove usage of the deprecated "pci-dma-compat.h" API 2021-09-21 17:33:31 +02:00
tifm_core.c tifm: Remove usage of the deprecated "pci-dma-compat.h" API 2021-09-21 17:33:31 +02:00
tsl2550.c
vmw_balloon.c vmw_balloon: Print errors on reset only once 2022-04-24 17:24:06 +02:00
xilinx_sdfec.c