linux/drivers/iommu
Alexander Lobakin ea01fa7031 iommu/dma: avoid expensive indirect calls for sync operations
When the IOMMU is on, actual synchronization happens in the same cases
as with direct DMA. Advertise %DMA_F_CAN_SKIP_SYNC in IOMMU DMA to
skip the (indirect) sync ops calls for non-SWIOTLB buffers.

perf profile before the patch:

    18.53%  [kernel]       [k] gq_rx_skb
    14.77%  [kernel]       [k] napi_reuse_skb
     8.95%  [kernel]       [k] skb_release_data
     5.42%  [kernel]       [k] dev_gro_receive
     5.37%  [kernel]       [k] memcpy
<*>  5.26%  [kernel]       [k] iommu_dma_sync_sg_for_cpu
     4.78%  [kernel]       [k] tcp_gro_receive
<*>  4.42%  [kernel]       [k] iommu_dma_sync_sg_for_device
     4.12%  [kernel]       [k] ipv6_gro_receive
     3.65%  [kernel]       [k] gq_pool_get
     3.25%  [kernel]       [k] skb_gro_receive
     2.07%  [kernel]       [k] napi_gro_frags
     1.98%  [kernel]       [k] tcp6_gro_receive
     1.27%  [kernel]       [k] gq_rx_prep_buffers
     1.18%  [kernel]       [k] gq_rx_napi_handler
     0.99%  [kernel]       [k] csum_partial
     0.74%  [kernel]       [k] csum_ipv6_magic
     0.72%  [kernel]       [k] free_pcp_prepare
     0.60%  [kernel]       [k] __napi_poll
     0.58%  [kernel]       [k] net_rx_action
     0.56%  [kernel]       [k] read_tsc
<*>  0.50%  [kernel]       [k] __x86_indirect_thunk_r11
     0.45%  [kernel]       [k] memset

After the patch, the lines marked <*> no longer show up, and overall
CPU usage looks much better (~60% instead of ~72%):

    25.56%  [kernel]       [k] gq_rx_skb
     9.90%  [kernel]       [k] napi_reuse_skb
     7.39%  [kernel]       [k] dev_gro_receive
     6.78%  [kernel]       [k] memcpy
     6.53%  [kernel]       [k] skb_release_data
     6.39%  [kernel]       [k] tcp_gro_receive
     5.71%  [kernel]       [k] ipv6_gro_receive
     4.35%  [kernel]       [k] napi_gro_frags
     4.34%  [kernel]       [k] skb_gro_receive
     3.50%  [kernel]       [k] gq_pool_get
     3.08%  [kernel]       [k] gq_rx_napi_handler
     2.35%  [kernel]       [k] tcp6_gro_receive
     2.06%  [kernel]       [k] gq_rx_prep_buffers
     1.32%  [kernel]       [k] csum_partial
     0.93%  [kernel]       [k] csum_ipv6_magic
     0.65%  [kernel]       [k] net_rx_action

iavf gains +10% in Mpps on Rx. This also unblocks batched allocations
of XSk buffers when the IOMMU is active.

Co-developed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2024-05-07 13:29:53 +02:00
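
The skip described in the commit message above can be illustrated with a
small userspace sketch. It is not the kernel code: every name here
(toy_dev, toy_dma_ops, toy_map, toy_sync_for_cpu, and so on) is
hypothetical, and the sketch assumes a per-device bit that starts out set
when the ops advertise the capability and is cleared once a bounce
(SWIOTLB-like) buffer is mapped. The point is only to show how a cheap
inline check can avoid an indirect call on the hot path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for kernel structures; not the real API. */
struct toy_dev;

struct toy_dma_ops {
	bool can_skip_sync; /* plays the role of advertising DMA_F_CAN_SKIP_SYNC */
	void (*sync_for_cpu)(struct toy_dev *dev, size_t len);
};

struct toy_dev {
	const struct toy_dma_ops *ops;
	bool skip_sync;          /* cached "sync is a no-op for this device" bit */
	unsigned indirect_calls; /* counts how often the callback actually ran */
};

/* Stand-in for the expensive sync callback reached through a function pointer. */
static void toy_iommu_sync_for_cpu(struct toy_dev *dev, size_t len)
{
	(void)len;
	dev->indirect_calls++;
}

const struct toy_dma_ops toy_iommu_ops = {
	.can_skip_sync = true,
	.sync_for_cpu = toy_iommu_sync_for_cpu,
};

void toy_dev_init(struct toy_dev *dev, const struct toy_dma_ops *ops)
{
	assert(dev && ops);
	dev->ops = ops;
	dev->skip_sync = ops->can_skip_sync;
	dev->indirect_calls = 0;
}

/* Assumption: mapping a bounced buffer means real syncs are needed from then on. */
void toy_map(struct toy_dev *dev, bool bounced)
{
	if (bounced)
		dev->skip_sync = false;
}

/* Hot path: a plain flag test first, the indirect call only when required. */
void toy_sync_for_cpu(struct toy_dev *dev, size_t len)
{
	if (dev->skip_sync)
		return;
	dev->ops->sync_for_cpu(dev, len);
}
```

In this sketch, a device that only ever maps non-bounced buffers never
reaches the function pointer in toy_sync_for_cpu(), which corresponds to
the <*> entries disappearing from the second profile.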
amd iommu/amd: Change log message severity 2024-04-12 12:21:46 +02:00
arm iommu/arm-smmu-v3: Fix access for STE.SHCFG 2024-03-26 10:47:39 +00:00
intel iommu/vt-d: Fix WARN_ON in iommu probe path 2024-04-12 12:06:24 +02:00
iommufd iommufd: Add missing IOMMUFD_DRIVER kconfig for the selftest 2024-04-14 13:52:08 -03:00
apple-dart.c iommu: constify of_phandle_args in xlate 2024-03-01 13:46:57 +01:00
dma-iommu.c iommu/dma: avoid expensive indirect calls for sync operations 2024-05-07 13:29:53 +02:00
dma-iommu.h iommu: Optimise PCI SAC address trick 2023-07-14 16:14:17 +02:00
exynos-iommu.c iommu: constify of_phandle_args in xlate 2024-03-01 13:46:57 +01:00
fsl_pamu_domain.c iommu/fsl_pamu: Implement a PLATFORM domain 2023-09-25 11:40:54 +02:00
fsl_pamu_domain.h
fsl_pamu.c iommu/fsl: fix all kernel-doc warnings in fsl_pamu.c 2023-03-22 14:50:15 +01:00
fsl_pamu.h
hyperv-iommu.c x86/vector: Rename send_cleanup_vector() to vector_schedule_cleanup() 2023-08-06 14:15:09 +02:00
io-pgfault.c iommu: Make iommu_report_device_fault() return void 2024-02-16 15:19:37 +01:00
io-pgtable-arm-v7s.c iommu/io-pgtable-arm-v7s: Remove map/unmap 2022-11-19 10:44:15 +01:00
io-pgtable-arm.c iommu: Extend LPAE page table format to support custom allocators 2023-11-27 11:10:12 +01:00
io-pgtable-arm.h
io-pgtable-dart.c
io-pgtable.c iommu: Allow passing custom allocators to pgtable drivers 2023-11-27 11:10:12 +01:00
iommu-debugfs.c
iommu-priv.h iommu: constify pointer to bus_type 2024-03-01 13:46:57 +01:00
iommu-sva.c Merge branches 'arm/mediatek', 'arm/renesas', 'arm/smmu', 'x86/vt-d', 'x86/amd' and 'core' into next 2024-03-08 09:05:59 +01:00
iommu-sysfs.c iommu: Do not export iommu_device_link/unlink() 2023-07-14 16:14:15 +02:00
iommu-traces.c iommu: Remove detach_dev callback 2023-01-13 16:39:18 +01:00
iommu.c iommu: Validate the PASID in iommu_attach_device_pasid() 2024-03-28 06:38:40 +01:00
iova.c iommu/iova: use named kmem_cache for iova magazines 2024-02-09 11:45:47 +01:00
ipmmu-vmsa.c Merge branches 'arm/mediatek', 'arm/renesas', 'arm/smmu', 'x86/vt-d', 'x86/amd' and 'core' into next 2024-03-08 09:05:59 +01:00
irq_remapping.c iommu: Fix compilation without CONFIG_IOMMU_INTEL 2024-03-08 09:03:18 +01:00
irq_remapping.h
Kconfig IOMMU Updates for Linux v6.9 2024-03-13 09:15:30 -07:00
Makefile iommu: Separate SVA and IOPF 2024-02-16 15:19:29 +01:00
msm_iommu_hw-8xxx.h
msm_iommu.c iommu: constify of_phandle_args in xlate 2024-03-01 13:46:57 +01:00
msm_iommu.h
mtk_iommu_v1.c iommu: mtk: fix module autoloading 2024-04-12 12:04:50 +02:00
mtk_iommu.c iommu: mtk: fix module autoloading 2024-04-12 12:04:50 +02:00
of_iommu.c iommu: re-use local fwnode variable in iommu_ops_from_fwnode() 2024-03-01 13:47:01 +01:00
omap-iommu-debug.c
omap-iommu.c iommu: Mark dev_iommu_priv_set() with a lockdep 2023-12-12 10:18:49 +01:00
omap-iommu.h iommu/omap: Convert to generic_single_device_group() 2023-09-25 11:52:08 +02:00
omap-iopgtable.h
rockchip-iommu.c iommu: constify of_phandle_args in xlate 2024-03-01 13:46:57 +01:00
s390-iommu.c iommu/dma: Allow a single FQ in addition to per-CPU FQs 2023-10-02 08:43:03 +02:00
sprd-iommu.c iommu: constify of_phandle_args in xlate 2024-03-01 13:46:57 +01:00
sun50i-iommu.c iommu: constify of_phandle_args in xlate 2024-03-01 13:46:57 +01:00
tegra-smmu.c iommu: constify of_phandle_args in xlate 2024-03-01 13:46:57 +01:00
virtio-iommu.c iommu: constify of_phandle_args in xlate 2024-03-01 13:46:57 +01:00