linux

iv/linux

Author	SHA1	Message	Date
James Gowans	8f4b589595	irqchip/gic-v3-its: Enable RESEND_WHEN_IN_PROGRESS for LPIs GICv3 LPIs are impacted by an architectural design issue: they do not have a global active state and as such a given LPI can be delivered to a new CPU after an affinity change while the previous instance of the same LPI handler has not yet completed on the original CPU. If LPIs had an active state, this second LPI would not be delivered until the first CPU deactivated the initial LPI, just like SPIs. To solve this issue, use the newly introduced IRQD_RESEND_WHEN_IN_PROGRESS flag, ensuring that we do not lose an LPI being delivered during that window by getting the GIC to resend it. This workaround gets enabled for all LPIs, including the VPE doorbells. Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: James Gowans <jgowans@amazon.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <maz@kernel.org> Cc: KarimAllah Raslan <karahmed@amazon.com> Cc: Yipeng Zou <zouyipeng@huawei.com> Cc: Zhang Jianhua <chris.zjh@huawei.com> [maz: massaged commit message] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230608120021.3273400-4-jgowans@amazon.com	2023-06-16 12:23:40 +01:00
Linus Torvalds	7fa8a8ee94	- Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of switching from a user process to a kernel thread. - More folio conversions from Kefeng Wang, Zhang Peng and Pankaj Raghav. - zsmalloc performance improvements from Sergey Senozhatsky. - Yue Zhao has found and fixed some data race issues around the alteration of memcg userspace tunables. - VFS rationalizations from Christoph Hellwig: - removal of most of the callers of write_one_page(). - make __filemap_get_folio()'s return value more useful - Luis Chamberlain has changed tmpfs so it no longer requires swap backing. Use `mount -o noswap'. - Qi Zheng has made the slab shrinkers operate locklessly, providing some scalability benefits. - Keith Busch has improved dmapool's performance, making part of its operations O(1) rather than O(n). - Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd, permitting userspace to wr-protect anon memory unpopulated ptes. - Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive rather than exclusive, and has fixed a bunch of errors which were caused by its unintuitive meaning. - Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature, which causes minor faults to install a write-protected pte. - Vlastimil Babka has done some maintenance work on vma_merge(): cleanups to the kernel code and improvements to our userspace test harness. - Cleanups to do_fault_around() by Lorenzo Stoakes. - Mike Rapoport has moved a lot of initialization code out of various mm/ files and into mm/mm_init.c. - Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for DRM, but DRM doesn't use it any more. - Lorenzo has also coverted read_kcore() and vread() to use iterators and has thereby removed the use of bounce buffers in some cases. - Lorenzo has also contributed further cleanups of vma_merge(). - Chaitanya Prakash provides some fixes to the mmap selftesting code. - Matthew Wilcox changes xfs and afs so they no longer take sleeping locks in ->map_page(), a step towards RCUification of pagefaults. - Suren Baghdasaryan has improved mmap_lock scalability by switching to per-VMA locking. - Frederic Weisbecker has reworked the percpu cache draining so that it no longer causes latency glitches on cpu isolated workloads. - Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig logic. - Liu Shixin has changed zswap's initialization so we no longer waste a chunk of memory if zswap is not being used. - Yosry Ahmed has improved the performance of memcg statistics flushing. - David Stevens has fixed several issues involving khugepaged, userfaultfd and shmem. - Christoph Hellwig has provided some cleanup work to zram's IO-related code paths. - David Hildenbrand has fixed up some issues in the selftest code's testing of our pte state changing. - Pankaj Raghav has made page_endio() unneeded and has removed it. - Peter Xu contributed some rationalizations of the userfaultfd selftests. - Yosry Ahmed has fixed an issue around memcg's page recalim accounting. - Chaitanya Prakash has fixed some arm-related issues in the selftests/mm code. - Longlong Xia has improved the way in which KSM handles hwpoisoned pages. - Peter Xu fixes a few issues with uffd-wp at fork() time. - Stefan Roesch has changed KSM so that it may now be used on a per-process and per-cgroup basis. -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZEr3zQAKCRDdBJ7gKXxA jlLoAP0fpQBipwFxED0Us4SKQfupV6z4caXNJGPeay7Aj11/kQD/aMRC2uPfgr96 eMG3kwn2pqkB9ST2QpkaRbxA//eMbQY= =J+Dj -----END PGP SIGNATURE----- Merge tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of switching from a user process to a kernel thread. - More folio conversions from Kefeng Wang, Zhang Peng and Pankaj Raghav. - zsmalloc performance improvements from Sergey Senozhatsky. - Yue Zhao has found and fixed some data race issues around the alteration of memcg userspace tunables. - VFS rationalizations from Christoph Hellwig: - removal of most of the callers of write_one_page() - make __filemap_get_folio()'s return value more useful - Luis Chamberlain has changed tmpfs so it no longer requires swap backing. Use `mount -o noswap'. - Qi Zheng has made the slab shrinkers operate locklessly, providing some scalability benefits. - Keith Busch has improved dmapool's performance, making part of its operations O(1) rather than O(n). - Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd, permitting userspace to wr-protect anon memory unpopulated ptes. - Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive rather than exclusive, and has fixed a bunch of errors which were caused by its unintuitive meaning. - Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature, which causes minor faults to install a write-protected pte. - Vlastimil Babka has done some maintenance work on vma_merge(): cleanups to the kernel code and improvements to our userspace test harness. - Cleanups to do_fault_around() by Lorenzo Stoakes. - Mike Rapoport has moved a lot of initialization code out of various mm/ files and into mm/mm_init.c. - Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for DRM, but DRM doesn't use it any more. - Lorenzo has also coverted read_kcore() and vread() to use iterators and has thereby removed the use of bounce buffers in some cases. - Lorenzo has also contributed further cleanups of vma_merge(). - Chaitanya Prakash provides some fixes to the mmap selftesting code. - Matthew Wilcox changes xfs and afs so they no longer take sleeping locks in ->map_page(), a step towards RCUification of pagefaults. - Suren Baghdasaryan has improved mmap_lock scalability by switching to per-VMA locking. - Frederic Weisbecker has reworked the percpu cache draining so that it no longer causes latency glitches on cpu isolated workloads. - Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig logic. - Liu Shixin has changed zswap's initialization so we no longer waste a chunk of memory if zswap is not being used. - Yosry Ahmed has improved the performance of memcg statistics flushing. - David Stevens has fixed several issues involving khugepaged, userfaultfd and shmem. - Christoph Hellwig has provided some cleanup work to zram's IO-related code paths. - David Hildenbrand has fixed up some issues in the selftest code's testing of our pte state changing. - Pankaj Raghav has made page_endio() unneeded and has removed it. - Peter Xu contributed some rationalizations of the userfaultfd selftests. - Yosry Ahmed has fixed an issue around memcg's page recalim accounting. - Chaitanya Prakash has fixed some arm-related issues in the selftests/mm code. - Longlong Xia has improved the way in which KSM handles hwpoisoned pages. - Peter Xu fixes a few issues with uffd-wp at fork() time. - Stefan Roesch has changed KSM so that it may now be used on a per-process and per-cgroup basis. * tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits) mm,unmap: avoid flushing TLB in batch if PTE is inaccessible shmem: restrict noswap option to initial user namespace mm/khugepaged: fix conflicting mods to collapse_file() sparse: remove unnecessary 0 values from rc mm: move 'mmap_min_addr' logic from callers into vm_unmapped_area() hugetlb: pte_alloc_huge() to replace huge pte_alloc_map() maple_tree: fix allocation in mas_sparse_area() mm: do not increment pgfault stats when page fault handler retries zsmalloc: allow only one active pool compaction context selftests/mm: add new selftests for KSM mm: add new KSM process and sysfs knobs mm: add new api to enable ksm per process mm: shrinkers: fix debugfs file permissions mm: don't check VMA write permissions if the PTE/PMD indicates write permissions migrate_pages_batch: fix statistics for longterm pin retry userfaultfd: use helper function range_in_vma() lib/show_mem.c: use for_each_populated_zone() simplify code mm: correct arg in reclaim_pages()/reclaim_clean_pages_from_list() fs/buffer: convert create_page_buffers to folio_create_buffers fs/buffer: add folio_create_empty_buffers helper ...	2023-04-27 19:42:02 -07:00
Sebastian Reichel	a8707f5538	irqchip/gic-v3: Add Rockchip 3588001 erratum workaround Rockchip RK3588/RK3588s GIC600 integration does not support the sharability feature. Rockchip assigned Erratum ID #3588001 for this issue. Note, that the 0x0201743b ID is not Rockchip specific and thus there is an extra of_machine_is_compatible() check. The flags are named FORCE_NON_SHAREABLE to be vendor agnostic, since apparently similar integration design errors exist in other platforms and they can reuse the same flag. Co-developed-by: XiaoDong Huang <derrick.huang@rock-chips.com> Signed-off-by: XiaoDong Huang <derrick.huang@rock-chips.com> Co-developed-by: Kever Yang <kever.yang@rock-chips.com> Signed-off-by: Kever Yang <kever.yang@rock-chips.com> Co-developed-by: Lucas Tanure <lucas.tanure@collabora.com> Signed-off-by: Lucas Tanure <lucas.tanure@collabora.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230418142109.49762-2-sebastian.reichel@collabora.com	2023-04-18 17:31:17 +01:00
Kirill A. Shutemov	23baf831a3	mm, treewide: redefine MAX_ORDER sanely MAX_ORDER currently defined as number of orders page allocator supports: user can ask buddy allocator for page order between 0 and MAX_ORDER-1. This definition is counter-intuitive and lead to number of bugs all over the kernel. Change the definition of MAX_ORDER to be inclusive: the range of orders user can ask from buddy allocator is 0..MAX_ORDER now. [kirill@shutemov.name: fix min() warning] Link: https://lkml.kernel.org/r/20230315153800.32wib3n5rickolvh@box [akpm@linux-foundation.org: fix another min_t warning] [kirill@shutemov.name: fixups per Zi Yan] Link: https://lkml.kernel.org/r/20230316232144.b7ic4cif4kjiabws@box.shutemov.name [akpm@linux-foundation.org: fix underlining in docs] Link: https://lore.kernel.org/oe-kbuild-all/202303191025.VRCTk6mP-lkp@intel.com/ Link: https://lkml.kernel.org/r/20230315113133.11326-11-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-05 19:42:46 -07:00
Linus Torvalds	143c7bc649	iommufd for 6.3 Some polishing and small fixes for iommufd: - Remove IOMMU_CAP_INTR_REMAP, instead rely on the interrupt subsystem - Use GFP_KERNEL_ACCOUNT inside the iommu_domains - Support VFIO_NOIOMMU mode with iommufd - Various typos - A list corruption bug if HWPTs are used for attach -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCY/TgzQAKCRCFwuHvBreF Ya3AAP4/WxTJIbDvtTyH3Fae3NxTdO8j8gsUvU1vrRYG83zdnAEAxd1yii7GEO8D crkeq9D4FUiPAkFnJ64Exw2FHb060Qg= =RABK -----END PGP SIGNATURE----- Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd Pull iommufd updates from Jason Gunthorpe: "Some polishing and small fixes for iommufd: - Remove IOMMU_CAP_INTR_REMAP, instead rely on the interrupt subsystem - Use GFP_KERNEL_ACCOUNT inside the iommu_domains - Support VFIO_NOIOMMU mode with iommufd - Various typos - A list corruption bug if HWPTs are used for attach" * tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: iommufd: Do not add the same hwpt to the ioas->hwpt_list twice iommufd: Make sure to zero vfio_iommu_type1_info before copying to user vfio: Support VFIO_NOIOMMU with iommufd iommufd: Add three missing structures in ucmd_buffer selftests: iommu: Fix test_cmd_destroy_access() call in user_copy iommu: Remove IOMMU_CAP_INTR_REMAP irq/s390: Add arch_is_isolated_msi() for s390 iommu/x86: Replace IOMMU_CAP_INTR_REMAP with IRQ_DOMAIN_FLAG_ISOLATED_MSI genirq/msi: Rename IRQ_DOMAIN_MSI_REMAP to IRQ_DOMAIN_ISOLATED_MSI genirq/irqdomain: Remove unused irq_domain_check_msi_remap() code iommufd: Convert to msi_device_has_isolated_msi() vfio/type1: Convert to iommu_group_has_isolated_msi() iommu: Add iommu_group_has_isolated_msi() genirq/msi: Add msi_device_has_isolated_msi()	2023-02-24 14:34:12 -08:00
Johan Hovold	1e46e040de	irqchip/gic-v3-its: Use irq_domain_create_hierarchy() Use the irq_domain_create_hierarchy() helper to create the hierarchical domain, which both serves as documentation and avoids poking at irqdomain internals. Note that the domain host_data was first set to the struct its_node during allocation only to immediately be overwritten with the struct msi_domain_info. Tested-by: Hsin-Yi Wang <hsinyi@chromium.org> Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230213104302.17307-17-johan+linaro@kernel.org	2023-02-13 19:31:25 +00:00
Jason Gunthorpe	dcb83f6ec1	genirq/msi: Rename IRQ_DOMAIN_MSI_REMAP to IRQ_DOMAIN_ISOLATED_MSI What x86 calls "interrupt remapping" is one way to achieve isolated MSI, make it clear this is talking about isolated MSI, no matter how it is achieved. This matches the new driver facing API name of msi_device_has_isolated_msi() No functional change. Link: https://lore.kernel.org/r/6-v3-3313bb5dd3a3+10f11-secure_msi_jgg@nvidia.com Tested-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2023-01-11 16:27:23 -04:00
Linus Torvalds	f23cdfcd04	IOMMU Updates for Linux v6.1: Including: - Removal of the bus_set_iommu() interface which became unnecesary because of IOMMU per-device probing - Make the dma-iommu.h header private - Intel VT-d changes from Lu Baolu: - Decouple PASID and PRI from SVA - Add ESRTPS & ESIRTPS capability check - Cleanups - Apple DART support for the M1 Pro/MAX SOCs - Support for AMD IOMMUv2 page-tables for the DMA-API layer. The v2 page-tables are compatible with the x86 CPU page-tables. Using them for DMA-API prepares support for hardware-assisted IOMMU virtualization - Support for MT6795 Helio X10 M4Us in the Mediatek IOMMU driver - Some smaller fixes and cleanups -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmNEC5oACgkQK/BELZcB GuNcOQ/6A5SXmcvDRLYZW1ENM5Z6xsZ1LabSZkjhYSpmbJyu8Uny/Z2aRWqxPMLJ hJeHTsWSLhrTq1VfjFhELHB3kgT2DRr7H3LXXaMNC6qz690EcavX1wKX2AxH0m22 8YrktkyAmFQ3BG6rsQLdlMMasLph/x06ix/xO9opQZVFdj/fV0Jx7ekX1JK+U3hx MI96i5W3G5PBVHBypAvjxSlmA4saj9Fhk7l3IZL7py9AOKz7NypuwWRs+86PMBiO EzLt5aF4g8pmKChF/c9BsoIbjBYvTG/s3NbycIng0ACc2SOvf+EvtoVZQclWifbT lwti9PLdsoVUnPOZHLYOTx4xSf/UyoLVzaLxJ52aoXnNYe2qaX5DANXhT2mWIY/Y z1mzOkShmK7WF7a8arRyqJeLJ4SvDx8GrbvLiom3DAzmqVHzzFGadHtt5fvGYN4F Jet/JIN3HjECQbamqtPBpWquBFhLmgusPksIiyMFscRvYdZqkaVkTkElcF3WqAMm QkeecfoTQ9Vdtdz44ZVLRjKpS77yRZmHshp1r/rfSI+9Ok8uRI+xmmcyrAI6ElqH DH14tLHPzw694rTHF+bTCd+pPMGOoFLi0xAfUXAeGWm1uzC1JIRrVu5JeQNOUOSD 5SQDXB7dPrhXngaws5Fx2u3amCO3688mslcGgM7q54kC+LyVo0E= =h0sT -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: - remove the bus_set_iommu() interface which became unnecesary because of IOMMU per-device probing - make the dma-iommu.h header private - Intel VT-d changes from Lu Baolu: - Decouple PASID and PRI from SVA - Add ESRTPS & ESIRTPS capability check - Cleanups - Apple DART support for the M1 Pro/MAX SOCs - support for AMD IOMMUv2 page-tables for the DMA-API layer. The v2 page-tables are compatible with the x86 CPU page-tables. Using them for DMA-API prepares support for hardware-assisted IOMMU virtualization - support for MT6795 Helio X10 M4Us in the Mediatek IOMMU driver - some smaller fixes and cleanups * tag 'iommu-updates-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (59 commits) iommu/vt-d: Avoid unnecessary global DMA cache invalidation iommu/vt-d: Avoid unnecessary global IRTE cache invalidation iommu/vt-d: Rename cap_5lp_support to cap_fl5lp_support iommu/vt-d: Remove pasid_set_eafe() iommu/vt-d: Decouple PASID & PRI enabling from SVA iommu/vt-d: Remove unnecessary SVA data accesses in page fault path dt-bindings: iommu: arm,smmu-v3: Relax order of interrupt names iommu: dart: Support t6000 variant iommu/io-pgtable-dart: Add DART PTE support for t6000 iommu/io-pgtable: Add DART subpage protection support iommu/io-pgtable: Move Apple DART support to its own file iommu/mediatek: Add support for MT6795 Helio X10 M4Us iommu/mediatek: Introduce new flag TF_PORT_TO_ADDR_MT8173 dt-bindings: mediatek: Add bindings for MT6795 M4U iommu/iova: Fix module config properly iommu/amd: Fix sparse warning iommu/amd: Remove outdated comment iommu/amd: Free domain ID after domain_flush_pages iommu/amd: Free domain id in error path iommu/virtio: Fix compile error with viommu_capable() ...	2022-10-10 13:20:53 -07:00
Pierre Gondois	f55a9b59e8	irqchip/gic-v3-its: Remove cpumask_var_t allocation Running a PREEMPT_RT kernel based on v5.19-rc3-rt4 on an Ampere Altra triggers: [ 22.616229] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46 [ 22.616239] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1884, name: kworker/80:1 [ 22.616243] preempt_count: 3, expected: 0 [ 22.616244] RCU nest depth: 0, expected: 0 [...] [ 22.616250] hardirqs last enabled at (33): _raw_spin_unlock_irq (/home/piegon01/linux/./arch/arm64/include/asm/irqflags.h:35) [ 22.616273] hardirqs last disabled at (34): __schedule (/home/piegon01/linux/kernel/sched/core.c:6432 (discriminator 1)) [ 22.616283] softirqs last enabled at (0): copy_process (/home/piegon01/linux/./include/linux/lockdep.h:191) [ 22.616297] softirqs last disabled at (0): 0x0 [ 22.616305] Preemption disabled at: [ 22.616307] __setup_irq (/home/piegon01/linux/kernel/irq/manage.c:1612) [ 22.616322] CPU: 80 PID: 1884 Comm: kworker/80:1 Tainted: G W [...] [ 22.616328] Hardware name: WIWYNN Mt.Jade Server System B81.03001.0005/Mt.Jade Motherboard, BIOS 1.08.20220218 (SCP: 1.08.20220218) 2022/02/18 [ 22.616333] Workqueue: events work_for_cpu_fn [ 22.616344] Call trace: [...] [ 22.616403] alloc_cpumask_var_node (/home/piegon01/linux/lib/cpumask.c:115) [ 22.616414] alloc_cpumask_var (/home/piegon01/linux/lib/cpumask.c:147) [ 22.616417] its_select_cpu (/home/piegon01/linux/drivers/irqchip/irq-gic-v3-its.c:1580) [ 22.616428] its_set_affinity (/home/piegon01/linux/drivers/irqchip/irq-gic-v3-its.c:1659) [ 22.616431] msi_domain_set_affinity (/home/piegon01/linux/kernel/irq/msi.c:501) [ 22.616440] irq_do_set_affinity (/home/piegon01/linux/kernel/irq/manage.c:276) [ 22.616443] irq_setup_affinity (/home/piegon01/linux/kernel/irq/manage.c:633) [ 22.616447] irq_startup (/home/piegon01/linux/kernel/irq/chip.c:280) [ 22.616453] __setup_irq (/home/piegon01/linux/kernel/irq/manage.c:1777) Follow the pattern established in commit cba4235e6031e ("genirq: Remove mask argument from setup_affinity()") and co to overcome this issue by defining a static struct cpumask and protecting it by a raw spinlock. Since its_select_cpu() can be executed with IRQs enabled or disabled, enforce that the cpumask computation is done with interrupts disabled. Signed-off-by: Pierre Gondois <pierre.gondois@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220912141857.1391343-1-pierre.gondois@arm.com	2022-09-12 16:37:43 +01:00
Robin Murphy	fa49364cd5	iommu/dma: Move public interfaces to linux/iommu.h The iommu-dma layer is now mostly encapsulated by iommu_dma_ops, with only a couple more public interfaces left pertaining to MSI integration. Since these depend on the main IOMMU API header anyway, move their declarations there, taking the opportunity to update the half-baked comments to proper kerneldoc along the way. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/9cd99738f52094e6bed44bfee03fa4f288d20695.1660668998.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2022-09-07 14:47:00 +02:00
Thomas Gleixner	cdb4913293	irqchip updates for 5.19: - Add new infrastructure to stop gpiolib from rewriting irq_chip structures behind our back. Convert a few of them, but this will obviously be a long effort. - A bunch of GICv3 improvements, such as using MMIO-based invalidations when possible, and reducing the amount of polling we perform when reconfiguring interrupts. - Another set of GICv3 improvements for the Pseudo-NMI functionality, with a nice cleanup making it easy to reason about the various states we can be in when an NMI fires. - The usual bunch of misc fixes and minor improvements. -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmKGcX8PHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpD7kYP/1sbxyRoq7iWqtTDK7ENWvqXh5wu/YZe0pnw jr0hPrJTdQKUbsBA+pusklEnTHvRgnLOmFpfR3X7apGg/If7mPRZGQcZz3fXKwDA 53u74IzZhYa+fx9H0L1qtBUHJtTP4/IexkzL/84R19u2/ewIhzDyhpvGxA/yAFj+ Gi6bgz93NGMOt/tdNtXZvj5zdr+5BayC6JBpnyzliyxS1xD3YeA0T05fHDYfjrcM 51gUeA/9tA3EWiRzsdZGq6uDaUfBW5aspWu0bZx/WBUWNBvAAjWzhIgNWDW/xKJP N3t6UQ6+uNYJXvdaCJlBLc6TiXBzGXINgr4oMljg8nJRYLt+xVsadkTnFxlnqoY/ FNeEiOUQqjZ1qcvHJoIceGHgTq//o3VaZ+AnuAESqeNPGavz+LMOCNo7Su+k2+Tk H3x09+p+SbrzJvRVyboLVk+v74NtzEz1fGrjEzQk2eHw+dc18yz1v+D1EX1REkhM gjzjSIAgZoq1M3GZL8tyrov44vhG3mUm3jAO01u9fRTHqEee6WIKt0aijSe/sCRr chTf+S9n8xPsr6AHUPQImV/fSismK4erCJeAiSp+P3hZjqyK8iPsHgiM5YLj50Cl ry9dACxv6CYf7lMKmKPC/atV1IlJSEZpguc6FLQ2tv9IBWqNMQXve0012acFMr6B ZpncbECV =nQxd -----END PGP SIGNATURE----- Merge tag 'irqchip-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core Pull irqchip updates from Marc Zyngier: - Add new infrastructure to stop gpiolib from rewriting irq_chip structures behind our back. Convert a few of them, but this will obviously be a long effort. - A bunch of GICv3 improvements, such as using MMIO-based invalidations when possible, and reducing the amount of polling we perform when reconfiguring interrupts. - Another set of GICv3 improvements for the Pseudo-NMI functionality, with a nice cleanup making it easy to reason about the various states we can be in when an NMI fires. - The usual bunch of misc fixes and minor improvements. Link: https://lore.kernel.org/all/20220519165308.998315-1-maz@kernel.org	2022-05-20 18:48:54 +02:00
Marc Zyngier	3f893a5962	irqchip/gic-v3: Always trust the managed affinity provided by the core code Now that the core code has been fixed to always give us an affinity that only includes online CPUs, directly use this affinity when computing a target CPU. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20220405185040.206297-4-maz@kernel.org	2022-04-10 21:06:30 +02:00
Marc Zyngier	af27e41612	irqchip/gic-v4: Wait for GICR_VPENDBASER.Dirty to clear before descheduling The way KVM drives GICv4.{0,1} is as follows: - vcpu_load() makes the VPE resident, instructing the RD to start scanning for interrupts - just before entering the guest, we check that the RD has finished scanning and that we can start running the vcpu - on preemption, we deschedule the VPE by making it invalid on the RD However, we are preemptible between the first two steps. If it so happens and that the RD was still scanning, we nonetheless write to the GICR_VPENDBASER register while Dirty is set, and bad things happen (we're in UNPRED land). This affects both the 4.0 and 4.1 implementations. Make sure Dirty is cleared before performing the deschedule, meaning that its_clear_vpend_valid() becomes a sort of full VPE residency barrier. Reported-by: Jingyi Wang <wangjingyi11@huawei.com> Tested-by: Nianyao Tang <tangnianyao@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Fixes: 57e3cebd022f ("KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty bit") Link: https://lore.kernel.org/r/4aae10ba-b39a-5f84-754b-69c2eb0a2c03@huawei.com	2022-04-05 16:33:13 +01:00
Marc Zyngier	eba1e44bee	irqchip/gic-v3-its: Skip HP notifier when no ITS is registered We have some systems out there that have both LPI support and an ITS, but that don't expose the ITS in their firmware tables (either because it is broken or because they run under a hypervisor that hides it...). Is such a configuration, we still register the HP notifier to free the allocated tables if needed, resulting in a warning as there is no memory to free (nothing was allocated the first place). Fix it by keying the HP notifier on the presence of at least one sucessfully probed ITS. Fixes: d23bc2bc1d63 ("irqchip/gic-v3-its: Postpone LPI pending table freeing and memreserve") Reported-by: Steev Klimaszewski <steev@kali.org> Tested-by: Steev Klimaszewski <steev@kali.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: Valentin Schneider <valentin.schneider@arm.com> Link: https://lore.kernel.org/r/20220202103454.2480465-1-maz@kernel.org	2022-02-02 10:43:10 +00:00
Thomas Gleixner	243d308037	irqchip fixes for 5.17, take #1 - Drop an unused private data field in the AIC driver - Various fixes to the realtek-rtl driver - Make the GICv3 ITS driver compile again in !SMP configurations - Force reset of the GICv3 ITSs at probe time to avoid issues during kexec - Yet another kfree/bitmap_free conversion - Various DT updates (Renesas, SiFive) -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmH0KYkPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpD0JMQAMAQfj4UwC0Ha6wcfbdzkNRWRIfMno2/vSDi wJ/RAQi9i9+XeTRluq3aUDmMo+uOCumhJnBsfVF9YGzQXEtWN7OHiSyYtEP5M4gQ pk7SHJ2Ro0W8yGGQvwVvf9g8USDD1NfDefjPSr4SZe27hEEush0ngmVWhJeUZzbO UnTLb3ej4/tlxc0JJBKt8qdBqhfGVqeqppAAFKWm1PoJOtg7HPYQUlaHyTWWfDRf jqqdtKVUysgqG+ok0cO5fX1mnNfA/pC1UkCJavK0mceThtgLf3J3FRT2R8dMBaxP kK0bwf1rj7z9pPKaHmWvoplaJFNs6dtvtNdRQ1DGzsT4NqV6YpOmU8vZvYPsv1VE jA2YvjuRN2DHfH437RIdIKu6RozMdJlnh/cp5CwHXmyJJY51gWj2WwBjQ/oOKmKZ Yd11Wi9pQ06AoTRzmVwS7adHk2VPdygW/P7Xatem/UUJoS15BLxDLGXI7mK2xzNo 911DAYA0JvgmUZ3zYAkT3V4Lf62hBPUnJnE4J/0eWjPIun2b0UzyOaz/IhJFcShU wwZNz/HErd0qkzdFxDxjcEHCA1pHpZO/g171LVaH3b5E7WNMknt0r/YLS+7ktFkq we69SFg7iz5P9QJONqvlIqP2HRTer9ILQNaeLdM+Ag2YusYLq03W1eiYOfPYv9up sYcoIf2f =0mrS -----END PGP SIGNATURE----- Merge tag 'irqchip-fixes-5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent Pull irqchip fixes from Marc Zyngier: - Drop an unused private data field in the AIC driver - Various fixes to the realtek-rtl driver - Make the GICv3 ITS driver compile again in !SMP configurations - Force reset of the GICv3 ITSs at probe time to avoid issues during kexec - Yet another kfree/bitmap_free conversion - Various DT updates (Renesas, SiFive) Link: https://lore.kernel.org/r/20220128174217.517041-1-maz@kernel.org	2022-01-29 21:03:20 +01:00
Marc Zyngier	c733ebb7cb	irqchip/gic-v3-its: Reset each ITS's BASERn register before probe A recent bug report outlined that the way GICv4.1 is handled across kexec is pretty bad. We can end-up in a situation where ITSs share memory (this is the case when SVPET==1) and reprogram the base registers, creating a situation where ITSs that are part of a given affinity group see different pointers. Which is illegal. Boo. In order to restore some sanity, reset the BASERn registers to 0 before probing any ITS. Although this isn't optimised at all, this is only a once-per-boot cost, which shouldn't show up on anyone's radar. Cc: Jay Chen <jkchen@linux.alibaba.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Link: https://lore.kernel.org/r/20211216190315.GA14220@lpieralisi Link: https://lore.kernel.org/r/20220124133809.1291195-1-maz@kernel.org	2022-01-26 11:10:28 +00:00
Ard Biesheuvel	16436f70ab	irqchip/gic-v3-its: Fix build for !SMP Commit 835f442fdbce ("irqchip/gic-v3-its: Limit memreserve cpuhp state lifetime") added a reference to cpus_booted_once_mask, which does not exist on !SMP builds, breaking the build for such configurations. Given the intent of the check, short circuit it to always pass. Cc: Valentin Schneider <valentin.schneider@arm.com> Fixes: 835f442fdbce ("irqchip/gic-v3-its: Limit memreserve cpuhp state lifetime") Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220122151614.133766-1-ardb@kernel.org	2022-01-22 15:42:49 +00:00
Thomas Gleixner	67d50b5f91	irqchip updates for 5.17 - Fix GICv3 redistributor table reservation with RT across kexec - Fix GICv4.1 redistributor view of the VPE table across kexec - Add support for extra interrupts on spear-shirq - Make obtaining some interrupts optional for the Renesas drivers - Various cleanups and bug fixes -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmHZitYPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpDH4UP/3hsBH9KGWFakokvJJqXb8OS9LW2K8bEDm1B 9FaDAL6KamZVCGQmBUrzxuBSw1YSQszFZ752ozQpioEQ5IyTUcVbocxNznUOIOFc F38f3jOS7KmjqTIMi7AM+lZPqrBH17cnMpRCorNF4CWVM+iHUUrxdYfV7AGifHyX zpl9okkNpgdpO4gPBbPA0BBhIT+a9lmzqfFrfo+sMin6bgmk1mq3tJ4pVJV5K8KL PXwa123eEtnr2P7JVnp+ChoECdv4QEFS0gFHw9CgE0XsKa5NjDoJsEkhl5lnNkTV q387HGFyERsplknPzxLbF26IJQcUuJTX3PQFvuvv43ZNOqA/QI42956yIZ4KU6Er cDWle9uj7xeHSbU48yz7wIGddpDY6abxlq4C8897itWiepR2iswW0hmEebFfDQ9n A2imKdxZ6sUwwJ/lYiapdu6L5R9v1yx/4cmHfGyE1/FO5qKGzMOSKBJFt6eu+PdH Lb04N+3IyhhI0REzZ/q803Gr9MsZ8SHl2x14BO6olLOvvVDn4p4QoL2mvsidpO3/ /SIKvElW9/jZSmmaW4pOchXbm6RX0cuiu2PtKk5srh7MoX2zoiRh5hsVzpCBNwxX w5xbFCYX7s+KstG2kgnbRYrfsrgPl+6gX8M4bftYl3K9/bJ8ULSYEHAvI4emYlN2 6u89hrQL =MDM3 -----END PGP SIGNATURE----- Merge tag 'irqchip-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core Pull irqchip updates from Marc Zyngier: - Fix GICv3 redistributor table reservation with RT across kexec - Fix GICv4.1 redistributor view of the VPE table across kexec - Add support for extra interrupts on spear-shirq - Make obtaining some interrupts optional for the Renesas drivers - Various cleanups and bug fixes Link: https://lore.kernel.org/lkml/20220108130807.4109738-1-maz@kernel.org	2022-01-10 13:55:41 +01:00
Valentin Schneider	835f442fdb	irqchip/gic-v3-its: Limit memreserve cpuhp state lifetime The new memreserve cpuhp callback only needs to survive up until a point where every CPU in the system has booted once. Beyond that, it becomes a no-op and can be put in the bin. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211027151506.2085066-4-valentin.schneider@arm.com	2021-12-16 13:21:12 +00:00
Valentin Schneider	d23bc2bc1d	irqchip/gic-v3-its: Postpone LPI pending table freeing and memreserve Memory used by the LPI tables have to be made persistent for kexec to have a chance to work, as explained in [1]. If they have been made persistent and we are booting into a kexec'd kernel, we also need to free the pages that were preemptively allocated by the new kernel for those tables. Both of those operations currently happen during its_cpu_init(), which happens in a _STARTING (IOW atomic) cpuhp callback for secondary CPUs. efi_mem_reserve_iomem() issues a GFP_ATOMIC allocation, which unfortunately doesn't work under PREEMPT_RT (this ends up grabbing a non-raw spinlock, which can sleep under PREEMPT_RT). Similarly, freeing the pages ends up grabbing a sleepable spinlock. Since the memreserve is only required by kexec, it doesn't have to be done so early in the secondary boot process. Issue the reservation in a new CPUHP_AP_ONLINE_DYN cpuhp callback, and piggy-back the page freeing on top of it. A CPU gets to run the body of this new callback exactly once. As kexec issues a machine_shutdown() prior to machine_kexec(), it will be serialized vs a CPU being plugged to life by the hotplug machinery - either the CPU will have been brought up and have had its redistributor's pending table memreserved, or it never went online and will have its table allocated by the new kernel. [1]: https://lore.kernel.org/lkml/20180921195954.21574-1-marc.zyngier@arm.com/ Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211027151506.2085066-3-valentin.schneider@arm.com	2021-12-16 13:21:11 +00:00
Valentin Schneider	c0cdc89072	irqchip/gic-v3-its: Give the percpu rdist struct its own flags field Later patches will require tracking some per-rdist status. Reuse the bytes "lost" to padding within the __percpu rdist struct as a flags field, and re-encode ->lpi_enabled within said flags. No change in functionality intended. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211027151506.2085066-2-valentin.schneider@arm.com	2021-12-16 13:21:11 +00:00
Wudi Wang	b383a42ca5	irqchip/irq-gic-v3-its.c: Force synchronisation when issuing INVALL INVALL CMD specifies that the ITS must ensure any caching associated with the interrupt collection defined by ICID is consistent with the LPI configuration tables held in memory for all Redistributors. SYNC is required to ensure that INVALL is executed. Currently, LPI configuration data may be inconsistent with that in the memory within a short period of time after the INVALL command is executed. Signed-off-by: Wudi Wang <wangwudi@hisilicon.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Fixes: cc2d3216f53c ("irqchip: GICv3: ITS command queue") Link: https://lore.kernel.org/r/20211208015429.5007-1-zhangshaokun@hisilicon.com	2021-12-08 11:13:18 +00:00
Kaige Fu	280bef5129	irqchip/gic-v3-its: Fix potential VPE leak on error In its_vpe_irq_domain_alloc, when its_vpe_init() returns an error, there is an off-by-one in the number of VPEs to be freed. Fix it by simply passing the number of VPEs allocated, which is the index of the loop iterating over the VPEs. Fixes: 7d75bbb4bc1a ("irqchip/gic-v3-its: Add VPE irq domain allocation/teardown") Signed-off-by: Kaige Fu <kaige.fu@linux.alibaba.com> [maz: fixed commit message] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/d9e36dee512e63670287ed9eff884a5d8d6d27f2.1631672311.git.kaige.fu@linux.alibaba.com	2021-09-22 14:37:04 +01:00
Andy Shevchenko	ff5fe8867a	irqchip/gic-v3: Switch to bitmap_zalloc() Switch to bitmap_zalloc() to show clearly what we are allocating. Besides that it returns pointer of bitmap type instead of opaque void *. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210618151657.65305-4-andriy.shevchenko@linux.intel.com	2021-07-26 18:04:01 +01:00
Zhen Lei	944a1a17d3	irqchip/gic-v3-its: Remove unnecessary oom message Fixes scripts/checkpatch.pl warning: WARNING: Possible unnecessary 'out of memory' message Remove it can help us save a bit of memory. Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210609140643.14531-1-thunder.leizhen@huawei.com	2021-06-11 14:19:39 +01:00
Linus Torvalds	152d32aa84	ARM: - Stage-2 isolation for the host kernel when running in protected mode - Guest SVE support when running in nVHE mode - Force W^X hypervisor mappings in nVHE mode - ITS save/restore for guests using direct injection with GICv4.1 - nVHE panics now produce readable backtraces - Guest support for PTP using the ptp_kvm driver - Performance improvements in the S2 fault handler x86: - Optimizations and cleanup of nested SVM code - AMD: Support for virtual SPEC_CTRL - Optimizations of the new MMU code: fast invalidation, zap under read lock, enable/disably dirty page logging under read lock - /dev/kvm API for AMD SEV live migration (guest API coming soon) - support SEV virtual machines sharing the same encryption context - support SGX in virtual machines - add a few more statistics - improved directed yield heuristics - Lots and lots of cleanups Generic: - Rework of MMU notifier interface, simplifying and optimizing the architecture-specific code - Some selftests improvements -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmCJ13kUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroM1HAgAqzPxEtiTPTFeFJV5cnPPJ3dFoFDK y/juZJUQ1AOtvuWzzwuf175ewkv9vfmtG6rVohpNSkUlJYeoc6tw7n8BTTzCVC1b c/4Dnrjeycr6cskYlzaPyV6MSgjSv5gfyj1LA5UEM16LDyekmaynosVWY5wJhju+ Bnyid8l8Utgz+TLLYogfQJQECCrsU0Wm//n+8TWQgLf1uuiwshU5JJe7b43diJrY +2DX+8p9yWXCTz62sCeDWNahUv8AbXpMeJ8uqZPYcN1P0gSEUGu8xKmLOFf9kR7b M4U1Gyz8QQbjd2lqnwiWIkvRLX6gyGVbq2zH0QbhUe5gg3qGUX7JjrhdDQ== =AXUi -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "This is a large update by KVM standards, including AMD PSP (Platform Security Processor, aka "AMD Secure Technology") and ARM CoreSight (debug and trace) changes. ARM: - CoreSight: Add support for ETE and TRBE - Stage-2 isolation for the host kernel when running in protected mode - Guest SVE support when running in nVHE mode - Force W^X hypervisor mappings in nVHE mode - ITS save/restore for guests using direct injection with GICv4.1 - nVHE panics now produce readable backtraces - Guest support for PTP using the ptp_kvm driver - Performance improvements in the S2 fault handler x86: - AMD PSP driver changes - Optimizations and cleanup of nested SVM code - AMD: Support for virtual SPEC_CTRL - Optimizations of the new MMU code: fast invalidation, zap under read lock, enable/disably dirty page logging under read lock - /dev/kvm API for AMD SEV live migration (guest API coming soon) - support SEV virtual machines sharing the same encryption context - support SGX in virtual machines - add a few more statistics - improved directed yield heuristics - Lots and lots of cleanups Generic: - Rework of MMU notifier interface, simplifying and optimizing the architecture-specific code - a handful of "Get rid of oprofile leftovers" patches - Some selftests improvements" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (379 commits) KVM: selftests: Speed up set_memory_region_test selftests: kvm: Fix the check of return value KVM: x86: Take advantage of kvm_arch_dy_has_pending_interrupt() KVM: SVM: Skip SEV cache flush if no ASIDs have been used KVM: SVM: Remove an unnecessary prototype declaration of sev_flush_asids() KVM: SVM: Drop redundant svm_sev_enabled() helper KVM: SVM: Move SEV VMCB tracking allocation to sev.c KVM: SVM: Explicitly check max SEV ASID during sev_hardware_setup() KVM: SVM: Unconditionally invoke sev_hardware_teardown() KVM: SVM: Enable SEV/SEV-ES functionality by default (when supported) KVM: SVM: Condition sev_enabled and sev_es_enabled on CONFIG_KVM_AMD_SEV=y KVM: SVM: Append "_enabled" to module-scoped SEV/SEV-ES control variables KVM: SEV: Mask CPUID[0x8000001F].eax according to supported features KVM: SVM: Move SEV module params/variables to sev.c KVM: SVM: Disable SEV/SEV-ES if NPT is disabled KVM: SVM: Free sev_asid_bitmap during init if SEV setup fails KVM: SVM: Zero out the VMCB array used to track SEV ASID association x86/sev: Drop redundant and potentially misleading 'sev_enabled' KVM: x86: Move reverse CPUID helpers to separate header file KVM: x86: Rename GPR accessors to make mode-aware variants the defaults ...	2021-05-01 10:14:08 -07:00
Shenming Lu	c21bc068cd	irqchip/gic-v3-its: Drop the setting of PTZ altogether GICv4.1 gives a way to get the VLPI state, which needs to map the vPE first, and after the state read, we may remap the vPE back while the VPT is not empty. So we can't assume that the VPT is empty at the first map. Besides, the optimization of PTZ is probably limited since the HW should be fairly efficient to parse the empty VPT. Let's drop the setting of PTZ altogether. Signed-off-by: Shenming Lu <lushenming@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210322060158.1584-3-lushenming@huawei.com	2021-03-24 18:12:20 +00:00
Marc Zyngier	301beaf197	irqchip/gic-v3-its: Add a cache invalidation right after vPE unmapping In order to be able to manipulate the VPT once a vPE has been unmapped, perform the required CMO to invalidate the CPU view of the VPT. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Shenming Lu <lushenming@huawei.com> Link: https://lore.kernel.org/r/20210322060158.1584-2-lushenming@huawei.com	2021-03-24 18:05:52 +00:00
Ingo Molnar	a359f75796	irq: Fix typos in comments Fix ~36 single-word typos in the IRQ, irqchip and irqdomain code comments. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <maz@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2021-03-22 04:23:14 +01:00
Linus Torvalds	6a447b0e31	ARM: * PSCI relay at EL2 when "protected KVM" is enabled * New exception injection code * Simplification of AArch32 system register handling * Fix PMU accesses when no PMU is enabled * Expose CSV3 on non-Meltdown hosts * Cache hierarchy discovery fixes * PV steal-time cleanups * Allow function pointers at EL2 * Various host EL2 entry cleanups * Simplification of the EL2 vector allocation s390: * memcg accouting for s390 specific parts of kvm and gmap * selftest for diag318 * new kvm_stat for when async_pf falls back to sync x86: * Tracepoints for the new pagetable code from 5.10 * Catch VFIO and KVM irqfd events before userspace * Reporting dirty pages to userspace with a ring buffer * SEV-ES host support * Nested VMX support for wait-for-SIPI activity state * New feature flag (AVX512 FP16) * New system ioctl to report Hyper-V-compatible paravirtualization features Generic: * Selftest improvements -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl/bdL4UHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroNgQQgAnTH6rhXa++Zd5F0EM2NwXwz3iEGb lOq1DZSGjs6Eekjn8AnrWbmVQr+CBCuGU9MrxpSSzNDK/awryo3NwepOWAZw9eqk BBCVwGBbJQx5YrdgkGC0pDq2sNzcpW/VVB3vFsmOxd9eHblnuKSIxEsCCXTtyqIt XrLpQ1UhvI4yu102fDNhuFw2EfpzXm+K0Lc0x6idSkdM/p7SyeOxiv8hD4aMr6+G bGUQuMl4edKZFOWFigzr8NovQAvDHZGrwfihu2cLRYKLhV97QuWVmafv/yYfXcz2 drr+wQCDNzDOXyANnssmviazrhOX0QmTAhbIXGGX/kTxYKcfPi83ZLoI3A== =ISud -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM updates from Paolo Bonzini: "Much x86 work was pushed out to 5.12, but ARM more than made up for it. ARM: - PSCI relay at EL2 when "protected KVM" is enabled - New exception injection code - Simplification of AArch32 system register handling - Fix PMU accesses when no PMU is enabled - Expose CSV3 on non-Meltdown hosts - Cache hierarchy discovery fixes - PV steal-time cleanups - Allow function pointers at EL2 - Various host EL2 entry cleanups - Simplification of the EL2 vector allocation s390: - memcg accouting for s390 specific parts of kvm and gmap - selftest for diag318 - new kvm_stat for when async_pf falls back to sync x86: - Tracepoints for the new pagetable code from 5.10 - Catch VFIO and KVM irqfd events before userspace - Reporting dirty pages to userspace with a ring buffer - SEV-ES host support - Nested VMX support for wait-for-SIPI activity state - New feature flag (AVX512 FP16) - New system ioctl to report Hyper-V-compatible paravirtualization features Generic: - Selftest improvements" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (171 commits) KVM: SVM: fix 32-bit compilation KVM: SVM: Add AP_JUMP_TABLE support in prep for AP booting KVM: SVM: Provide support to launch and run an SEV-ES guest KVM: SVM: Provide an updated VMRUN invocation for SEV-ES guests KVM: SVM: Provide support for SEV-ES vCPU loading KVM: SVM: Provide support for SEV-ES vCPU creation/loading KVM: SVM: Update ASID allocation to support SEV-ES guests KVM: SVM: Set the encryption mask for the SVM host save area KVM: SVM: Add NMI support for an SEV-ES guest KVM: SVM: Guest FPU state save/restore not needed for SEV-ES guest KVM: SVM: Do not report support for SMM for an SEV-ES guest KVM: x86: Update __get_sregs() / __set_sregs() to support SEV-ES KVM: SVM: Add support for CR8 write traps for an SEV-ES guest KVM: SVM: Add support for CR4 write traps for an SEV-ES guest KVM: SVM: Add support for CR0 write traps for an SEV-ES guest KVM: SVM: Add support for EFER write traps for an SEV-ES guest KVM: SVM: Support string IO operations for an SEV-ES guest KVM: SVM: Support MMIO for an SEV-ES guest KVM: SVM: Create trace events for VMGEXIT MSR protocol processing KVM: SVM: Create trace events for VMGEXIT processing ...	2020-12-20 10:44:05 -08:00
Marc Zyngier	5fe71d271d	irqchip/gic-v3-its: Tag ITS device as shared if allocating for a proxy device The ITS already has some notion of "shared" devices. Let's map the MSI_ALLOC_FLAGS_PROXY_DEVICE flag onto this internal property. Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/20201129135208.680293-3-maz@kernel.org	2020-12-11 14:47:50 +00:00
Shenming Lu	0b39498230	irqchip/gic-v4.1: Reduce the delay when polling GICR_VPENDBASER.Dirty The 10us delay of the poll on the GICR_VPENDBASER.Dirty bit is too high, which might greatly affect the total scheduling latency of a vCPU in our measurement. So we reduce it to 1 to lessen the impact. Signed-off-by: Shenming Lu <lushenming@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201128141857.983-2-lushenming@huawei.com	2020-12-11 14:47:10 +00:00
Shenming Lu	57e3cebd02	KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty bit In order to reduce the impact of the VPT parsing happening on the GIC, we can split the vcpu reseidency in two phases: - programming GICR_VPENDBASER: this still happens in vcpu_load() - checking for the VPT parsing to be complete: this can happen on vcpu entry (in kvm_vgic_flush_hwstate()) This allows the GIC and the CPU to work in parallel, rewmoving some of the entry overhead. Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Shenming Lu <lushenming@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201128141857.983-3-lushenming@huawei.com	2020-11-30 11:18:29 +00:00
Thomas Gleixner	7032908cd5	irqchip fixes for Linux 5.10, take #2 - Fix Exiu driver trigger type when using ACPI - Fix GICv3 ITS suspend/resume to use the in-kernel path at all times, sidestepping braindead firmware support -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl+6Yf8PHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpD/+QP/R6DppA5eVK98xAt/qXia0eUbady8pib/a+8 UGKFJ8d5CPhaP/EuuUk1zcN4ry2WnRSXVxesVkEQMLu8nKLsgG8lEgojJAel+w92 gEbLuVnWJz6kiW/P7NUR/B1mYCacbi5IY+v3hFcoa6XzHbNHulzidUIkAiVcq9E/ aqr08O62BFcBWzxRaaK6om2gTI3oZvD+v/oHTfMzIA59oAQGmgUYnc0CUOjVR5Bl aGYsdq2RqkvSdQbb2oVMPW6Tb22cfUQT1TnY1JR+Q5a5vEqTu67bTs6rpviR2Fj8 CepcmoYUFC1tDW/tfbj+PgdDoyE2rbFUYyYiPX74caMi6rgLvSf8t1zK+sh/Q6x7 Lnriw6xRbCMtkIczM5A3OrMQ/PWdcxHkyNC/3x2+qkkZVOBbZJ+cSWsJm1d3yky+ IuaOTP2UYnToc+/uGHLysqUAehnMsSZnOvuYUVE4RVgMfbJvOX7rZdV1m+pic1GE Rm4TudcWni9b+14rFae/lOXXGDmY725898TI3Nud8bjKvyYcTMDO+R201VNFgNcB m/cWvPhbYP5OLR2cnvbWx0bg4WeBOY3m4RFIPHb2gCiPcWECLSCBaDfm7BG0eQvN DMLFIDePsj74vXejQbe5Txo8+dCe5vZsuzLRdFYElq6pOXVMjBtxF4s7In7RnqNB NPNENkqY =IjDX -----END PGP SIGNATURE----- Merge tag 'irqchip-fixes-5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent Pull irqchip fixes from Marc Zyngier: - Fix Exiu driver trigger type when using ACPI - Fix GICv3 ITS suspend/resume to use the in-kernel path at all times, sidestepping braindead firmware support Link: https://lore.kernel.org/r/20201122184752.553990-1-maz@kernel.org	2020-11-25 00:56:28 +01:00
Xu Qiang	74cde1a533	irqchip/gic-v3-its: Unconditionally save/restore the ITS state on suspend On systems without HW-based collections (i.e. anything except GIC-500), we rely on firmware to perform the ITS save/restore. This doesn't really work, as although FW can properly save everything, it cannot fully restore the state of the command queue (the read-side is reset to the head of the queue). This results in the ITS consuming previously processed commands, potentially corrupting the state. Instead, let's always save the ITS state on suspend, disabling it in the process, and restore the full state on resume. This saves us from broken FW as long as it doesn't enable the ITS by itself (for which we can't do anything). This amounts to simply dropping the ITS_FLAGS_SAVE_SUSPEND_STATE. Signed-off-by: Xu Qiang <xuqiang36@huawei.com> [maz: added warning on resume, rewrote commit message] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201107104226.14282-1-xuqiang36@huawei.com	2020-11-22 12:58:35 +00:00
Linus Torvalds	cf1d2b44f6	ACPI updates for 5.10-rc1 - Add support for generic initiator-only proximity domains to the ACPI NUMA code and the architectures using it (Jonathan Cameron). - Clean up some non-ACPICA code referring to debug facilities from ACPICA that are not actually used in there (Hanjun Guo). - Add new DPTF driver for the PCH FIVR participant (Srinivas Pandruvada). - Reduce overhead related to accessing GPE registers in ACPICA and the OS interface layer and make it possible to access GPE registers using logical addresses if they are memory-mapped (Rafael Wysocki). - Update the ACPICA code in the kernel to upstream revision 20200925 including changes as follows: * Add predefined names from the SMBus sepcification (Bob Moore). * Update acpi_help UUID list (Bob Moore). * Return exceptions for string-to-integer conversions in iASL (Bob Moore). * Add a new "ALL <NameSeg>" debugger command (Bob Moore). * Add support for 64 bit risc-v compilation (Colin Ian King). * Do assorted cleanups (Bob Moore, Colin Ian King, Randy Dunlap). - Add new ACPI backlight whitelist entry for HP 635 Notebook (Alex Hung). - Move TPS68470 OpRegion driver to drivers/acpi/pmic/ and split out Kconfig and Makefile specific for ACPI PMIC (Andy Shevchenko). - Clean up the ACPI SoC driver for AMD SoCs (Hanjun Guo). - Add missing config_item_put() to fix refcount leak (Hanjun Guo). - Drop lefrover field from struct acpi_memory_device (Hanjun Guo). - Make the ACPI extlog driver check for RDMSR failures (Ben Hutchings). - Fix handling of lid state changes in the ACPI button driver when input device is closed (Dmitry Torokhov). - Fix several assorted build issues (Barnabás Pőcze, John Garry, Nathan Chancellor, Tian Tao). - Drop unused inline functions and reduce code duplication by using kobj_to_dev() in the NFIT parsing code (YueHaibing, Wang Qing). - Serialize tools/power/acpi Makefile (Thomas Renninger). -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl+F4IkSHHJqd0Byand5 c29ja2kubmV0AAoJEILEb/54YlRx1gIQAIZrt09fquEIZhYulGZAkuYhSX2U/DZt poow5+TiGk36JNHlbZS19kZ3F0tJ1wA6CKSfF/bYyULxL+gYaUjdLXzv2kArTSAj nzDXQ2CystpySZI/sEkl4QjsMg0xuZlBhlnCfNHzJw049TgdsJHnxMkJXb8T90A+ l2JKm2OpBkNvQGNpwd3djLg8xSDnHUmuevsWZPHDp92/fLMF9DUBk8dVuEwa0ndF hAUpWm+EL1tJQnhNwtfV/Akd9Ypqgk/7ROFWFHGDtHMZGnBjpyXZw68vHMX7SL6N Ej90GWGPHSJs/7Fsg4Hiaxxcph9WFNLPcpck5lVAMIrNHMKANjqQzCsmHavV/WTG STC9/qwJauA1EOjovlmlCFHctjKE/ya6Hm299WTlfBqB+Lu1L3oMR2CC+Uj0YfyG sv3264rJCsaSw610iwQOG807qHENopASO2q5DuKG0E9JpcaBUwn1N4qP5svvQciq 4aA8Ma6xM/QHCO4CS0Se9C0+WSVtxWwOUichRqQmU4E6u1sXvKJxTeWo79rV7PAh L6BwoOxBLabEiyzpi6HPGs6DoKj/N6tOQenBh4ibdwpAwMtq7hIlBFa0bp19c2wT vx8F2Raa8vbQ2zZ1QEiPZnPLJUoy2DgaCtKJ6E0FTDXNs3VFlWgyhIUlIRqk5BS9 OnAwVAUrTMkJ =feLU -----END PGP SIGNATURE----- Merge tag 'acpi-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "These add support for generic initiator-only proximity domains to the ACPI NUMA code and the architectures using it, clean up some non-ACPICA code referring to debug facilities from ACPICA, reduce the overhead related to accessing GPE registers, add a new DPTF (Dynamic Power and Thermal Framework) participant driver, update the ACPICA code in the kernel to upstream revision 20200925, add a new ACPI backlight whitelist entry, fix a few assorted issues and clean up some code. Specifics: - Add support for generic initiator-only proximity domains to the ACPI NUMA code and the architectures using it (Jonathan Cameron) - Clean up some non-ACPICA code referring to debug facilities from ACPICA that are not actually used in there (Hanjun Guo) - Add new DPTF driver for the PCH FIVR participant (Srinivas Pandruvada) - Reduce overhead related to accessing GPE registers in ACPICA and the OS interface layer and make it possible to access GPE registers using logical addresses if they are memory-mapped (Rafael Wysocki) - Update the ACPICA code in the kernel to upstream revision 20200925 including changes as follows: + Add predefined names from the SMBus sepcification (Bob Moore) + Update acpi_help UUID list (Bob Moore) + Return exceptions for string-to-integer conversions in iASL (Bob Moore) + Add a new "ALL <NameSeg>" debugger command (Bob Moore) + Add support for 64 bit risc-v compilation (Colin Ian King) + Do assorted cleanups (Bob Moore, Colin Ian King, Randy Dunlap) - Add new ACPI backlight whitelist entry for HP 635 Notebook (Alex Hung) - Move TPS68470 OpRegion driver to drivers/acpi/pmic/ and split out Kconfig and Makefile specific for ACPI PMIC (Andy Shevchenko) - Clean up the ACPI SoC driver for AMD SoCs (Hanjun Guo) - Add missing config_item_put() to fix refcount leak (Hanjun Guo) - Drop lefrover field from struct acpi_memory_device (Hanjun Guo) - Make the ACPI extlog driver check for RDMSR failures (Ben Hutchings) - Fix handling of lid state changes in the ACPI button driver when input device is closed (Dmitry Torokhov) - Fix several assorted build issues (Barnabás Pőcze, John Garry, Nathan Chancellor, Tian Tao) - Drop unused inline functions and reduce code duplication by using kobj_to_dev() in the NFIT parsing code (YueHaibing, Wang Qing) - Serialize tools/power/acpi Makefile (Thomas Renninger)" * tag 'acpi-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (64 commits) ACPICA: Update version to 20200925 Version 20200925 ACPICA: Remove unnecessary semicolon ACPICA: Debugger: Add a new command: "ALL <NameSeg>" ACPICA: iASL: Return exceptions for string-to-integer conversions ACPICA: acpi_help: Update UUID list ACPICA: Add predefined names found in the SMBus sepcification ACPICA: Tree-wide: fix various typos and spelling mistakes ACPICA: Drop the repeated word "an" in a comment ACPICA: Add support for 64 bit risc-v compilation ACPI: button: fix handling lid state changes when input device closed tools/power/acpi: Serialize Makefile ACPI: scan: Replace ACPI_DEBUG_PRINT() with pr_debug() ACPI: memhotplug: Remove 'state' from struct acpi_memory_device ACPI / extlog: Check for RDMSR failure ACPI: Make acpi_evaluate_dsm() prototype consistent docs: mm: numaperf.rst Add brief description for access class 1. node: Add access1 class to represent CPU to memory characteristics ACPI: HMAT: Fix handling of changes from ACPI 6.2 to ACPI 6.3 ACPI: Let ACPI know we support Generic Initiator Affinity Structures x86: Support Generic Initiator only proximity domains ...	2020-10-14 11:42:04 -07:00
Mike Rapoport	9f3d5eaa3c	memblock: implement for_each_reserved_mem_region() using __next_mem_region() Iteration over memblock.reserved with for_each_reserved_mem_region() used __next_reserved_mem_region() that implemented a subset of __next_mem_region(). Use __for_each_mem_range() and, essentially, __next_mem_region() with appropriate parameters to reduce code duplication. While on it, rename for_each_reserved_mem_region() to for_each_reserved_mem_range() for consistency. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> [.clang-format] Cc: Andy Lutomirski <luto@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Daniel Axtens <dja@axtens.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Emil Renner Berthing <kernel@esmil.dk> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20200818151634.14343-17-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-10-13 18:38:35 -07:00
Jonathan Cameron	95ac5bf4e4	irq-chip/gic-v3-its: Fix crash if ITS is in a proximity domain without processor or memory Note this crash is present before any of the patches in this series, but as explained below it is highly unlikely anyone is shipping a firmware that causes it. Tests were done using an overriden SRAT. On ARM64, the gic-v3 driver directly parses SRAT to locate GIC Interrupt Translation Service (ITS) Affinity Structures. This is done much later in the boot than the parses of SRAT which identify proximity domains. As a result, an ITS placed in a proximity domain that is not defined by another SRAT structure will result in a NUMA node that is not completely configured and a crash. ITS [mem 0x202100000-0x20211ffff] ITS@0x0000000202100000: Using ITS number 0 Unable to handle kernel paging request at virtual address 0000000000001a08 ... Call trace: __alloc_pages_nodemask+0xe8/0x338 alloc_pages_node.constprop.0+0x34/0x40 its_probe_one+0x2f8/0xb18 gic_acpi_parse_madt_its+0x108/0x150 acpi_table_parse_entries_array+0x17c/0x264 acpi_table_parse_entries+0x48/0x6c acpi_table_parse_madt+0x30/0x3c its_init+0x1c4/0x644 gic_init_bases+0x4b8/0x4ec gic_acpi_init+0x134/0x264 acpi_match_madt+0x4c/0x84 acpi_table_parse_entries_array+0x17c/0x264 acpi_table_parse_entries+0x48/0x6c acpi_table_parse_madt+0x30/0x3c __acpi_probe_device_table+0x8c/0xe8 irqchip_init+0x3c/0x48 init_IRQ+0xcc/0x100 start_kernel+0x33c/0x548 ACPI 6.3 allows any set of Affinity Structures in SRAT to define a proximity domain. However, as we do not see this crash, we can conclude that no firmware is currently placing an ITS in a node that is separate from those containing memory and / or processors. We could modify the SRAT parsing behavior to identify the existence of Proximity Domains unique to the ITS structures, and handle them as a special case of a generic initiator (once support for those merges). This patch avoids the complexity that would be needed to handle this corner case, by not allowing the ITS entry parsing code to instantiate new NUMA Nodes. If one is encountered that does not already exist, then NO_NUMA_NODE is assigned and a warning printed just as if the value had been greater than allowed NUMA Nodes. "SRAT: Invalid NUMA node -1 in ITS affinity" Whilst this does not provide the full flexibility allowed by ACPI, it does fix the problem. We can revisit a more sophisticated solution if needed by future platforms. Change is simply to replace acpi_map_pxm_to_node with pxm_to_node reflecting the fact a new mapping is not created. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-09-24 12:57:38 +02:00
Marc Zyngier	eff65bd439	Merge remote-tracking branch 'origin/irq/gic-retrigger' into irq/irqchip-next Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-09-17 16:50:02 +01:00
Marc Zyngier	5f774f5e12	irqchip/git-v3-its: Implement irq_retrigger callback for device-triggered LPIs It is pretty easy to provide a retrigger callback for the ITS, as it we already have the required support in terms of irq_set_irqchip_state(). Note that this only works for device-generated LPIs, and not the GICv4 doorbells, which should never have to be retriggered anyway. Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-09-06 18:26:13 +01:00
Gustavo A. R. Silva	df561f6688	treewide: Use fallthrough pseudo-keyword Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>	2020-08-23 17:36:59 -05:00
Linus Torvalds	f8b036a7fc	The usual boring updates from the interrupt subsystem: - Infrastructure to allow building irqchip drivers as modules - Consolidation of irqchip ACPI probing - Removal of the EOI-preflow interrupt handler which was required for SPARC support and became obsolete after SPARC was converted to use sparse interrupts. - Cleanups, fixes and improvements all over the place -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl8pDL0THHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoRTFEACYvH2LnSu1GlXB0XtL3+XyV8bWN3Yr Qfcp9JbIibx65YkJjcyvfBNA6GjXoogMr9vOHeRVnPtOwzl/7n/lnh/43d6+YPot 7UvIjGtpH3E/lF0kJKfuEsM8CX8DcVhn6dV/T+dJ00m69dAVQHNRsVqAi1/iWEeT 9vBBELoJL79BU2g83NQZ7V0UrqiA5QlPYLpbSffliE6UWjG6XTH2CPM5XucuySNQ es3szxQ55rtPEzqCHVL0YW75vV39bmKZPqoApA/XQDJrp3bgftjdldoTe7YPQfSG MXAvB+6axPD+mdeag7/XZFC1DcMx8CnistZSJKpdYZe7mQ7iunfeJRhkEzb+DrO1 WdcDcYOm0rLHhPrUZItJdACjuPNmN9pMaK1PbabsivnHVWzMYYKmMwbW+AEsygGW nnlsZP1Nr61Mo7O8+EKmxDdox4Qjk3lmQl4SdQgUKNKsI5yFYjvt2CfCjWLQJNBa w7YiLnL9IChXwrvdGqMIoEueUi0pC3gGbZ/bjDbxI4NJxJgEEav49m/prxM2A2Pl gfNdwlM1xgNydIBgt/jij/a8Lmv555RuZmvDV7QV7fFwaIqt3Qb5cs0Roq+GlzZR e0wuikGl0r/Bdow62rle7EysbBBGosAYf6K/kaGhd8v/kx2ByDnPPWzOqtxc+K+i Iw/daEQRsSnWuw== =KA8b -----END PGP SIGNATURE----- Merge tag 'irq-core-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "The usual boring updates from the interrupt subsystem: - Infrastructure to allow building irqchip drivers as modules - Consolidation of irqchip ACPI probing - Removal of the EOI-preflow interrupt handler which was required for SPARC support and became obsolete after SPARC was converted to use sparse interrupts. - Cleanups, fixes and improvements all over the place" * tag 'irq-core-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits) irqchip/loongson-pch-pic: Fix the misused irq flow handler irqchip/loongson-htvec: Support 8 groups of HT vectors irqchip/loongson-liointc: Fix misuse of gc->mask_cache dt-bindings: interrupt-controller: Update Loongson HTVEC description irqchip/imx-intmux: Fix irqdata regs save in imx_intmux_runtime_suspend() irqchip/imx-intmux: Implement intmux runtime power management irqchip/gic-v4.1: Use GFP_ATOMIC flag in allocate_vpe_l1_table() irqchip: Fix IRQCHIP_PLATFORM_DRIVER_* compilation by including module.h irqchip/stm32-exti: Map direct event to irq parent irqchip/mtk-cirq: Convert to a platform driver irqchip/mtk-sysirq: Convert to a platform driver irqchip/qcom-pdc: Switch to using IRQCHIP_PLATFORM_DRIVER helper macros irqchip: Add IRQCHIP_PLATFORM_DRIVER_BEGIN/END and IRQCHIP_MATCH helper macros irqchip: irq-bcm2836.h: drop a duplicated word irqchip/gic-v4.1: Ensure accessing the correct RD when writing INVALLR irqchip/irq-bcm7038-l1: Guard uses of cpu_logical_map irqchip/gic-v3: Remove unused register definition irqchip/qcom-pdc: Allow QCOM_PDC to be loadable as a permanent module genirq: Export irq_chip_retrigger_hierarchy and irq_chip_set_vcpu_affinity_parent irqdomain: Export irq_domain_update_bus_token ...	2020-08-04 18:11:58 -07:00
Thomas Gleixner	f0c7baca18	genirq/affinity: Make affinity setting if activated opt-in John reported that on a RK3288 system the perf per CPU interrupts are all affine to CPU0 and provided the analysis: "It looks like what happens is that because the interrupts are not per-CPU in the hardware, armpmu_request_irq() calls irq_force_affinity() while the interrupt is deactivated and then request_irq() with IRQF_PERCPU \| IRQF_NOBALANCING. Now when irq_startup() runs with IRQ_STARTUP_NORMAL, it calls irq_setup_affinity() which returns early because IRQF_PERCPU and IRQF_NOBALANCING are set, leaving the interrupt on its original CPU." This was broken by the recent commit which blocked interrupt affinity setting in hardware before activation of the interrupt. While this works in general, it does not work for this particular case. As contrary to the initial analysis not all interrupt chip drivers implement an activate callback, the safe cure is to make the deferred interrupt affinity setting at activation time opt-in. Implement the necessary core logic and make the two irqchip implementations for which this is required opt-in. In hindsight this would have been the right thing to do, but ... Fixes: baedb87d1b53 ("genirq/affinity: Handle affinity setting on inactive interrupts correctly") Reported-by: John Keeping <john@metanate.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Marc Zyngier <maz@kernel.org> Acked-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/87blk4tzgm.fsf@nanos.tec.linutronix.de	2020-07-27 16:20:40 +02:00
Zenghui Yu	d1bd7e0ba5	irqchip/gic-v4.1: Use GFP_ATOMIC flag in allocate_vpe_l1_table() Booting the latest kernel with DEBUG_ATOMIC_SLEEP=y on a GICv4.1 enabled box, I get the following kernel splat: [ 0.053766] BUG: sleeping function called from invalid context at mm/slab.h:567 [ 0.053767] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/1 [ 0.053769] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.8.0-rc3+ #23 [ 0.053770] Call trace: [ 0.053774] dump_backtrace+0x0/0x218 [ 0.053775] show_stack+0x2c/0x38 [ 0.053777] dump_stack+0xc4/0x10c [ 0.053779] ___might_sleep+0xfc/0x140 [ 0.053780] __might_sleep+0x58/0x90 [ 0.053782] slab_pre_alloc_hook+0x7c/0x90 [ 0.053783] kmem_cache_alloc_trace+0x60/0x2f0 [ 0.053785] its_cpu_init+0x6f4/0xe40 [ 0.053786] gic_starting_cpu+0x24/0x38 [ 0.053788] cpuhp_invoke_callback+0xa0/0x710 [ 0.053789] notify_cpu_starting+0xcc/0xd8 [ 0.053790] secondary_start_kernel+0x148/0x200 # ./scripts/faddr2line vmlinux its_cpu_init+0x6f4/0xe40 its_cpu_init+0x6f4/0xe40: allocate_vpe_l1_table at drivers/irqchip/irq-gic-v3-its.c:2818 (inlined by) its_cpu_init_lpis at drivers/irqchip/irq-gic-v3-its.c:3138 (inlined by) its_cpu_init at drivers/irqchip/irq-gic-v3-its.c:5166 It turned out that we're allocating memory using GFP_KERNEL (may sleep) within the CPU hotplug notifier, which is indeed an atomic context. Bad thing may happen if we're playing on a system with more than a single CommonLPIAff group. Avoid it by turning this into an atomic allocation. Fixes: 5e5168461c22 ("irqchip/gic-v4.1: VPE table (aka GICR_VPROPBASER) allocation") Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200630133746.816-1-yuzenghui@huawei.com	2020-07-27 08:55:04 +01:00
Zenghui Yu	3af9571cd5	irqchip/gic-v4.1: Ensure accessing the correct RD when writing INVALLR The GICv4.1 spec tells us that it's CONSTRAINED UNPREDICTABLE to issue a register-based invalidation operation for a vPEID not mapped to that RD, or another RD within the same CommonLPIAff group. To follow this rule, commit f3a059219bc7 ("irqchip/gic-v4.1: Ensure mutual exclusion between vPE affinity change and RD access") tried to address the race between the RD accesses and the vPE affinity change, but somehow forgot to take GICR_INVALLR into account. Let's take the vpe_lock before evaluating vpe->col_idx to fix it. Fixes: f3a059219bc7 ("irqchip/gic-v4.1: Ensure mutual exclusion between vPE affinity change and RD access") Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200720092328.708-1-yuzenghui@huawei.com	2020-07-27 08:55:03 +01:00
Linus Torvalds	bfe91da29b	Bugfixes and a one-liner patch to silence sparse. -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl8DWosUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroO8cAf/UskNg8qoLGG17rQwhFpmigSllbiJ TAyi3tpb1Y0Z2MfYeGkeiEb1L34bS28Cxl929DoqI3hrXy1wDCmsHPB5c3URXrzd aswvr7pJtQV9iH1ykaS2woFJnOUovMFsFYMhj46yUPoAvdKOZKvuqcduxbogYHFw YeRhS+1lGfiP2A0j3O/nnNJ0wq+FxKO46G3CgWeqG75+FSL6y/tl0bZJUMKKajQZ GNaOv/CYCHAfUdvgy0ZitRD8lV8yxng3dYGjm+a52Kmn2ZWiFlxNrnxzHySk16Rn Lq6MfFOqgrYpoZv7SnsFYnRE05U5bEFQ8BGr22fImQ+ktKDgq+9gv6cKwA== =+DN/ -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "Bugfixes and a one-liner patch to silence a sparse warning" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: arm64: Stop clobbering x0 for HVC_SOFT_RESTART KVM: arm64: PMU: Fix per-CPU access in preemptible context KVM: VMX: Use KVM_POSSIBLE_CR*_GUEST_BITS to initialize guest/host masks KVM: x86: Mark CR4.TSD as being possibly owned by the guest KVM: x86: Inject #GP if guest attempts to toggle CR4.LA57 in 64-bit mode kvm: use more precise cast and do not drop __user KVM: x86: bit 8 of non-leaf PDPEs is not reserved KVM: X86: Fix async pf caused null-ptr-deref KVM: arm64: vgic-v4: Plug race between non-residency and v4.1 doorbell KVM: arm64: pvtime: Ensure task delay accounting is enabled KVM: arm64: Fix kvm_reset_vcpu() return code being incorrect with SVE KVM: arm64: Annotate hyp NMI-related functions as __always_inline KVM: s390: reduce number of IO pins to 1	2020-07-06 12:48:04 -07:00
Marc Zyngier	a3f574cd65	KVM: arm64: vgic-v4: Plug race between non-residency and v4.1 doorbell When making a vPE non-resident because it has hit a blocking WFI, the doorbell can fire at any time after the write to the RD. Crucially, it can fire right between the write to GICR_VPENDBASER and the write to the pending_last field in the its_vpe structure. This means that we would overwrite pending_last with stale data, and potentially not wakeup until some unrelated event (such as a timer interrupt) puts the vPE back on the CPU. GICv4 isn't affected by this as we actively mask the doorbell on entering the guest, while GICv4.1 automatically manages doorbell delivery without any hypervisor-driven masking. Use the vpe_lock to synchronize such update, which solves the problem altogether. Fixes: ae699ad348cdc ("irqchip/gic-v4.1: Move doorbell management to the GICv4 abstraction layer") Reported-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-06-23 11:24:39 +01:00
Zenghui Yu	31dbb6b1d0	irqchip/gic-v4.1: Use readx_poll_timeout_atomic() to fix sleep in atomic readx_poll_timeout() can sleep if @sleep_us is specified by the caller, and is therefore unsafe to be used inside the atomic context, which is this case when we use it to poll the GICR_VPENDBASER.Dirty bit in irq_set_vcpu_affinity() callback. Let's convert to its atomic version instead which helps to get the v4.1 board back to life! Fixes: 96806229ca03 ("irqchip/gic-v4.1: Add support for VPENDBASER's Dirty+Valid signaling") Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200605052345.1494-1-yuzenghui@huawei.com	2020-06-21 15:13:11 +01:00
Marc Zyngier	c5d6082d35	irqchip/gic-v3-its: Balance initial LPI affinity across CPUs When mapping a LPI, the ITS driver picks the first possible affinity, which is in most cases CPU0, assuming that if that's not suitable, someone will come and set the affinity to something more interesting. It apparently isn't the case, and people complain of poor performance when many interrupts are glued to the same CPU. So let's place the interrupts by finding the "least loaded" CPU (that is, the one that has the fewer LPIs mapped to it). So called 'managed' interrupts are an interesting case where the affinity is actually dictated by the kernel itself, and we should honor this. Reported-by: John Garry <john.garry@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1575642904-58295-1-git-send-email-john.garry@huawei.com Link: https://lore.kernel.org/r/20200515165752.121296-3-maz@kernel.org	2020-05-20 11:00:00 +01:00
Marc Zyngier	2f13ff1d1d	irqchip/gic-v3-its: Track LPI distribution on a per CPU basis In order to improve the distribution of LPIs among CPUs, let start by tracking the number of LPIs assigned to CPUs, both for managed and non-managed interrupts (as separate counters). Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/20200515165752.121296-2-maz@kernel.org	2020-05-18 10:30:42 +01:00

1 2 3 4 5 ...

300 Commits