IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Remove gnttab_query_foreign_access(), as it is unused and unsafe to
use.
All previous use cases assumed a grant would not be in use after
gnttab_query_foreign_access() returned 0. This information is useless
in best case, as it only refers to a situation in the past, which could
have changed already.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Using gnttab_query_foreign_access() is unsafe, as it is racy by design.
The use case in the gntalloc driver is not needed at all. While at it
replace the call of gnttab_end_foreign_access_ref() with a call of
gnttab_end_foreign_access(), which is what is really wanted there. In
case the grant wasn't used due to an allocation failure, just free the
grant via gnttab_free_grant_reference().
This is CVE-2022-23039 / part of XSA-396.
Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
V3:
- fix __del_gref() (Jan Beulich)
Add a new grant table function gnttab_try_end_foreign_access(), which
will remove and free a grant if it is not in use.
Its main use case is to either free a grant if it is no longer in use,
or to take some other action if it is still in use. This other action
can be an error exit, or (e.g. in the case of blkfront persistent grant
feature) some special handling.
This is CVE-2022-23036, CVE-2022-23038 / part of XSA-396.
Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
V2:
- new patch
V4:
- add comments to header (Jan Beulich)
Letting xenbus_grant_ring() tear down grants in the error case is
problematic, as the other side could already have used these grants.
Calling gnttab_end_foreign_access_ref() without checking success is
resulting in an unclear situation for any caller of xenbus_grant_ring()
as in the error case the memory pages of the ring page might be
partially mapped. Freeing them would risk unwanted foreign access to
them, while not freeing them would leak memory.
In order to remove the need to undo any gnttab_grant_foreign_access()
calls, use gnttab_alloc_grant_references() to make sure no further
error can occur in the loop granting access to the ring pages.
It should be noted that this way of handling removes leaking of
grant entries in the error case, too.
This is CVE-2022-23040 / part of XSA-396.
Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
This patch implements arch_xen_unpopulated_init() on Arm where
the extended regions (if any) are gathered from DT and inserted
into specific Xen resource to be used as unused address space
for Xen scratch pages by unpopulated-alloc code.
The extended region (safe range) is a region of guest physical
address space which is unused and could be safely used to create
grant/foreign mappings instead of wasting real RAM pages from
the domain memory for establishing these mappings.
The extended regions are chosen by the hypervisor at the domain
creation time and advertised to it via "reg" property under
hypervisor node in the guest device-tree. As region 0 is reserved
for grant table space (always present), the indexes for extended
regions are 1...N.
If arch_xen_unpopulated_init() fails for some reason the default
behaviour will be restored (allocate xenballooned pages).
This patch also removes XEN_UNPOPULATED_ALLOC dependency on x86.
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/1639080336-26573-6-git-send-email-olekstysh@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
The main reason of this change is that unpopulated-alloc
code cannot be used in its current form on Arm, but there
is a desire to reuse it to avoid wasting real RAM pages
for the grant/foreign mappings.
The problem is that system "iomem_resource" is used for
the address space allocation, but the really unallocated
space can't be figured out precisely by the domain on Arm
without hypervisor involvement. For example, not all device
I/O regions are known by the time domain starts creating
grant/foreign mappings. And following the advise from
"iomem_resource" we might end up reusing these regions by
a mistake. So, the hypervisor which maintains the P2M for
the domain is in the best position to provide unused regions
of guest physical address space which could be safely used
to create grant/foreign mappings.
Introduce new helper arch_xen_unpopulated_init() which purpose
is to create specific Xen resource based on the memory regions
provided by the hypervisor to be used as unused space for Xen
scratch pages. If arch doesn't define arch_xen_unpopulated_init()
the default "iomem_resource" will be used.
Update the arguments list of allocate_resource() in fill_list()
to always allocate a region from the hotpluggable range
(maximum possible addressable physical memory range for which
the linear mapping could be created). If arch doesn't define
arch_get_mappable_range() the default range (0,-1) will be used.
The behaviour on x86 won't be changed by current patch as both
arch_xen_unpopulated_init() and arch_get_mappable_range()
are not implemented for it.
Also fallback to allocate xenballooned pages (balloon out RAM
pages) if we do not have any suitable resource to work with
(target_resource is invalid) and as the result we won't be able
to provide unpopulated pages on a request.
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/1639080336-26573-5-git-send-email-olekstysh@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
This patch rolls back some of the changes introduced by commit
121f2faca2 "xen/balloon: rename alloc/free_xenballooned_pages"
in order to make possible to still allocate xenballooned pages
if CONFIG_XEN_UNPOPULATED_ALLOC is enabled.
On Arm the unpopulated pages will be allocated on top of extended
regions provided by Xen via device-tree (the subsequent patches
will add required bits to support unpopulated-alloc feature on Arm).
The problem is that extended regions feature has been introduced
into Xen quite recently (during 4.16 release cycle). So this
effectively means that Linux must only use unpopulated-alloc on Arm
if it is running on "new Xen" which advertises these regions.
But, it will only be known after parsing the "hypervisor" node
at boot time, so before doing that we cannot assume anything.
In order to keep working if CONFIG_XEN_UNPOPULATED_ALLOC is enabled
and the extended regions are not advertised (Linux is running on
"old Xen", etc) we need the fallback to alloc_xenballooned_pages().
This way we wouldn't reduce the amount of memory usable (wasting
RAM pages) for any of the external mappings anymore (and eliminate
XSA-300) with "new Xen", but would be still functional ballooning
out RAM pages with "old Xen".
Also rename alloc(free)_xenballooned_pages to xen_alloc(free)_ballooned_pages
and make xen_alloc(free)_unpopulated_pages static inline in xen.h
if CONFIG_XEN_UNPOPULATED_ALLOC is disabled.
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/1639080336-26573-4-git-send-email-olekstysh@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
If memremap_pages() succeeds the range is guaranteed to have proper page
table, there is no need for an additional virt_addr_valid() check.
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/1639080336-26573-2-git-send-email-olekstysh@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
While working with Xen's libxenvchan library I have faced an issue with
unmap notifications sent in wrong order if both UNMAP_NOTIFY_SEND_EVENT
and UNMAP_NOTIFY_CLEAR_BYTE were requested: first we send an event channel
notification and then clear the notification byte which renders in the below
inconsistency (cli_live is the byte which was requested to be cleared on unmap):
[ 444.514243] gntdev_put_map UNMAP_NOTIFY_SEND_EVENT map->notify.event 6
libxenvchan_is_open cli_live 1
[ 444.515239] __unmap_grant_pages UNMAP_NOTIFY_CLEAR_BYTE at 14
Thus it is not possible to reliably implement the checks like
- wait for the notification (UNMAP_NOTIFY_SEND_EVENT)
- check the variable (UNMAP_NOTIFY_CLEAR_BYTE)
because it is possible that the variable gets checked before it is cleared
by the kernel.
To fix that we need to re-order the notifications, so the variable is first
gets cleared and then the event channel notification is sent.
With this fix I can see the correct order of execution:
[ 54.522611] __unmap_grant_pages UNMAP_NOTIFY_CLEAR_BYTE at 14
[ 54.537966] gntdev_put_map UNMAP_NOTIFY_SEND_EVENT map->notify.event 6
libxenvchan_is_open cli_live 0
Cc: stable@vger.kernel.org
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20211210092817.580718-1-andr2000@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
The Xen console driver is still vulnerable for an attack via excessive
number of events sent by the backend. Fix that by using a lateeoi event
channel.
For the normal domU initial console this requires the introduction of
bind_evtchn_to_irq_lateeoi() as there is no xenbus device available
at the time the event channel is bound to the irq.
As the decision whether an interrupt was spurious or not requires to
test for bytes having been read from the backend, move sending the
event into the if statement, as sending an event without having found
any bytes to be read is making no sense at all.
This is part of XSA-391
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
V2:
- slightly adapt spurious irq detection (Jan Beulich)
V3:
- fix spurious irq detection (Jan Beulich)
If the xenstore page hasn't been allocated properly, reading the value
of the related hvm_param (HVM_PARAM_STORE_PFN) won't actually return
error. Instead, it will succeed and return zero. Instead of attempting
to xen_remap a bad guest physical address, detect this condition and
return early.
Note that although a guest physical address of zero for
HVM_PARAM_STORE_PFN is theoretically possible, it is not a good choice
and zero has never been validly used in that capacity.
Also recognize all bits set as an invalid value.
For 32-bit Linux, any pfn above ULONG_MAX would get truncated. Pfns
above ULONG_MAX should never be passed by the Xen tools to HVM guests
anyway, so check for this condition and return early.
Cc: stable@vger.kernel.org
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Link: https://lore.kernel.org/r/20211123210748.1910236-1-sstabellini@kernel.org
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
The Xen pvcalls device is not essential for booting. Set the respective
flag.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20211022064800.14978-5-jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
When booting the xenbus driver will wait for PV devices to have
connected to their backends before continuing. The timeout is different
between essential and non-essential devices.
Non-essential devices are identified by their nodenames directly in the
xenbus driver, which requires to update this list in case a new device
type being non-essential is added (this was missed for several types
in the past).
In order to avoid this problem, add a "not_essential" flag to struct
xenbus_driver which can be set to "true" by the respective frontend.
Set this flag for the frontends currently regarded to be not essential
(vkbs and vfb) and use it for testing in the xenbus driver.
Signed-off-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20211022064800.14978-2-jgross@suse.com
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
In case of errors in xenbus_init (e.g. missing xen_store_gfn parameter),
we goto out_error but we forget to reset xen_store_domain_type to
XS_UNKNOWN. As a consequence xenbus_probe_initcall and other initcalls
will still try to initialize xenstore resulting into a crash at boot.
[ 2.479830] Call trace:
[ 2.482314] xb_init_comms+0x18/0x150
[ 2.486354] xs_init+0x34/0x138
[ 2.489786] xenbus_probe+0x4c/0x70
[ 2.498432] xenbus_probe_initcall+0x2c/0x7c
[ 2.503944] do_one_initcall+0x54/0x1b8
[ 2.507358] kernel_init_freeable+0x1ac/0x210
[ 2.511617] kernel_init+0x28/0x130
[ 2.516112] ret_from_fork+0x10/0x20
Cc: <Stable@vger.kernel.org>
Cc: jbeulich@suse.com
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Link: https://lore.kernel.org/r/20211115222719.2558207-1-sstabellini@kernel.org
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
This configuration option provides a misc device as an API to userspace.
Make this API usable without having to select the module as a transitive
dependency.
This also fixes an issue where localyesconfig would select
CONFIG_XEN_PRIVCMD=m because it was not visible and defaulted to
building as module.
[boris: clarified help message per Jan's suggestion]
Based-on-patch-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20211116143323.18866-1-jgross@suse.com
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYYp8HgAKCRCAXGG7T9hj
vmuVAP4whjbyIi4IxYEOnE6On0aD0AgUMiFa7QXrDZi6NXUQIwEAnggLFe+rEG5C
Fwi/cEXSHrRgveqrgD4GYEr6l0GTxwM=
=/fMa
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.16b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
- a series to speed up the boot of Xen PV guests
- some cleanups in Xen related code
- replacement of license texts with the appropriate SPDX headers and
fixing of wrong SPDX headers in Xen header files
- a small series making paravirtualized interrupt masking much simpler
and at the same time removing complaints of objtool
- a fix for Xen ballooning hogging workqueues for too long
- enablement of the Xen pciback driver for Arm
- some further small fixes/enhancements
* tag 'for-linus-5.16b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (22 commits)
xen/balloon: fix unused-variable warning
xen/balloon: rename alloc/free_xenballooned_pages
xen/balloon: add late_initcall_sync() for initial ballooning done
x86/xen: remove 32-bit awareness from startup_xen
xen: remove highmem remnants
xen: allow pv-only hypercalls only with CONFIG_XEN_PV
x86/xen: remove 32-bit pv leftovers
xen-pciback: allow compiling on other archs than x86
x86/xen: switch initial pvops IRQ functions to dummy ones
x86/xen: remove xen_have_vcpu_info_placement flag
x86/pvh: add prototype for xen_pvh_init()
xen: Fix implicit type conversion
xen: fix wrong SPDX headers of Xen related headers
xen/pvcalls-back: Remove redundant 'flush_workqueue()' calls
x86/xen: Remove redundant irq_enter/exit() invocations
xen-pciback: Fix return in pm_ctrl_init()
xen/x86: restrict PV Dom0 identity mapping
xen/x86: there's no highmem anymore in PV mode
xen/x86: adjust handling of the L3 user vsyscall special page table
xen/x86: adjust xen_set_fixmap()
...
In configurations with CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=n
and CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=y, gcc warns about an
unused variable:
drivers/xen/balloon.c:83:12: error: 'xen_hotplug_unpopulated' defined but not used [-Werror=unused-variable]
Since this is always zero when CONFIG_XEN_BALLOON_MEMORY_HOTPLUG
is disabled, turn it into a preprocessor constant in that case.
Fixes: 121f2faca2 ("xen/balloon: rename alloc/free_xenballooned_pages")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20211108111408.3940366-1-arnd@kernel.org
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Merge misc updates from Andrew Morton:
"257 patches.
Subsystems affected by this patch series: scripts, ocfs2, vfs, and
mm (slab-generic, slab, slub, kconfig, dax, kasan, debug, pagecache,
gup, swap, memcg, pagemap, mprotect, mremap, iomap, tracing, vmalloc,
pagealloc, memory-failure, hugetlb, userfaultfd, vmscan, tools,
memblock, oom-kill, hugetlbfs, migration, thp, readahead, nommu, ksm,
vmstat, madvise, memory-hotplug, rmap, zsmalloc, highmem, zram,
cleanups, kfence, and damon)"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (257 commits)
mm/damon: remove return value from before_terminate callback
mm/damon: fix a few spelling mistakes in comments and a pr_debug message
mm/damon: simplify stop mechanism
Docs/admin-guide/mm/pagemap: wordsmith page flags descriptions
Docs/admin-guide/mm/damon/start: simplify the content
Docs/admin-guide/mm/damon/start: fix a wrong link
Docs/admin-guide/mm/damon/start: fix wrong example commands
mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on
mm/damon: remove unnecessary variable initialization
Documentation/admin-guide/mm/damon: add a document for DAMON_RECLAIM
mm/damon: introduce DAMON-based Reclamation (DAMON_RECLAIM)
selftests/damon: support watermarks
mm/damon/dbgfs: support watermarks
mm/damon/schemes: activate schemes based on a watermarks mechanism
tools/selftests/damon: update for regions prioritization of schemes
mm/damon/dbgfs: support prioritization weights
mm/damon/vaddr,paddr: support pageout prioritization
mm/damon/schemes: prioritize regions within the quotas
mm/damon/selftests: support schemes quotas
mm/damon/dbgfs: support quotas of schemes
...
Rename memblock_free_ptr() to memblock_free() and use memblock_free()
when freeing a virtual pointer so that memblock_free() will be a
counterpart of memblock_alloc()
The callers are updated with the below semantic patch and manual
addition of (void *) casting to pointers that are represented by
unsigned long variables.
@@
identifier vaddr;
expression size;
@@
(
- memblock_phys_free(__pa(vaddr), size);
+ memblock_free(vaddr, size);
|
- memblock_free_ptr(vaddr, size);
+ memblock_free(vaddr, size);
)
[sfr@canb.auug.org.au: fixup]
Link: https://lkml.kernel.org/r/20211018192940.3d1d532f@canb.auug.org.au
Link: https://lkml.kernel.org/r/20210930185031.18648-7-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Juergen Gross <jgross@suse.com>
Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since memblock_free() operates on a physical range, make its name
reflect it and rename it to memblock_phys_free(), so it will be a
logical counterpart to memblock_phys_alloc().
The callers are updated with the below semantic patch:
@@
expression addr;
expression size;
@@
- memblock_free(addr, size);
+ memblock_phys_free(addr, size);
Link: https://lkml.kernel.org/r/20210930185031.18648-6-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Juergen Gross <jgross@suse.com>
Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Including:
- Intel IOMMU Updates fro Lu Baolu:
- Dump DMAR translation structure when DMA fault occurs
- An optimization in the page table manipulation code
- Use second level for GPA->HPA translation
- Various cleanups
- Arm SMMU Updates from Will
- Minor optimisations to SMMUv3 command creation and submission
- Numerous new compatible string for Qualcomm SMMUv2 implementations
- Fixes for the SWIOTLB based implemenation of dma-iommu code for
untrusted devices
- Add support for r8a779a0 to the Renesas IOMMU driver and DT matching
code for r8a77980
- A couple of cleanups and fixes for the Apple DART IOMMU driver
- Make use of generic report_iommu_fault() interface in the AMD IOMMU
driver
- Various smaller fixes and cleanups
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmGD6NQACgkQK/BELZcB
GuOSfg/9FKXl5ym86BP3tAS1fREKH7p59JRGZrrIR89NyHAcEUjtNG3YLPao+YxU
3CDgLkru+vlDpYY54QoyqcY5FgIHT3Cna/Cdk4zekRmSO/14gHp47jtZRheOUzLF
rvwfaplcbbtT8akpsVFzvw8YpQLGSDiDQSl7xL2+40Z9hiYX/gS9Af+PH98tAXsa
yZKZj6gU+JXM58VihO3M7umyE06tovyBaYgcsBZtbf66bGc0ySu+fe75UVWbueRt
Z8jwqa7TUfVXiYC8h+LqtGET6gtzNSsxAU3VllRe7Brf6K8i/yaRs/TO2Hp83d7/
q/fcK3vNQ5v3aDNci/DjBB8SEySzCmRz/9ocCOCx8ByuRp+5lwVRPPq3WcUMtsZY
QpYo9Fk7luFz2Gj5LObKAVBvOoeBZ5Km3oPs4HVmQ6epxn/rVckJDnJnVSLJuATq
tSZC2heRfFlg1dT6WFaynCTP2RI1LlNEdKhHirV6L368rSjmF0ZdQxdTpHULsHr1
yMjqL21OfcSkLW91rvfb3g68EsIwDbCPGTOlQWZLmAtwOWtHSCLPgwwEG7WefZbH
yaslpmlUTOurUnFmpxlfLicy5sqsBL2ASzGJkEKrgunw82Ke96zzkRzi+9j9HeS6
g0AyIWMi1cUAjONVUZtV4yjImXh63HIPiKx730a9teodusoxm+Q=
=waUR
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
- Intel IOMMU Updates fro Lu Baolu:
- Dump DMAR translation structure when DMA fault occurs
- An optimization in the page table manipulation code
- Use second level for GPA->HPA translation
- Various cleanups
- Arm SMMU Updates from Will
- Minor optimisations to SMMUv3 command creation and submission
- Numerous new compatible string for Qualcomm SMMUv2 implementations
- Fixes for the SWIOTLB based implemenation of dma-iommu code for
untrusted devices
- Add support for r8a779a0 to the Renesas IOMMU driver and DT matching
code for r8a77980
- A couple of cleanups and fixes for the Apple DART IOMMU driver
- Make use of generic report_iommu_fault() interface in the AMD IOMMU
driver
- Various smaller fixes and cleanups
* tag 'iommu-updates-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (35 commits)
iommu/dma: Fix incorrect error return on iommu deferred attach
iommu/dart: Initialize DART_STREAMS_ENABLE
iommu/dma: Use kvcalloc() instead of kvzalloc()
iommu/tegra-smmu: Use devm_bitmap_zalloc when applicable
iommu/dart: Use kmemdup instead of kzalloc and memcpy
iommu/vt-d: Avoid duplicate removing in __domain_mapping()
iommu/vt-d: Convert the return type of first_pte_in_page to bool
iommu/vt-d: Clean up unused PASID updating functions
iommu/vt-d: Delete dev_has_feat callback
iommu/vt-d: Use second level for GPA->HPA translation
iommu/vt-d: Check FL and SL capability sanity in scalable mode
iommu/vt-d: Remove duplicate identity domain flag
iommu/vt-d: Dump DMAR translation structure when DMA fault occurs
iommu/vt-d: Do not falsely log intel_iommu is unsupported kernel option
iommu/arm-smmu-qcom: Request direct mapping for modem device
iommu: arm-smmu-qcom: Add compatible for QCM2290
dt-bindings: arm-smmu: Add compatible for QCM2290 SoC
iommu/arm-smmu-qcom: Add SM6350 SMMU compatible
dt-bindings: arm-smmu: Add compatible for SM6350 SoC
iommu/arm-smmu-v3: Properly handle the return value of arm_smmu_cmdq_build_cmd()
...
alloc_xenballooned_pages() and free_xenballooned_pages() are used as
direct replacements of xen_alloc_unpopulated_pages() and
xen_free_unpopulated_pages() in case CONFIG_XEN_UNPOPULATED_ALLOC isn't
defined.
Guard both functions with !CONFIG_XEN_UNPOPULATED_ALLOC and rename them
to the xen_*() variants they are replacing. This allows to remove some
ifdeffery from the xen.h header file. Adapt the prototype of the
functions to match.
Signed-off-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20211102092234.17852-1-jgross@suse.com
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
When running as PVH or HVM guest with actual memory < max memory the
hypervisor is using "populate on demand" in order to allow the guest
to balloon down from its maximum memory size. For this to work
correctly the guest must not touch more memory pages than its target
memory size as otherwise the PoD cache will be exhausted and the guest
is crashed as a result of that.
In extreme cases ballooning down might not be finished today before
the init process is started, which can consume lots of memory.
In order to avoid random boot crashes in such cases, add a late init
call to wait for ballooning down having finished for PVH/HVM guests.
Warn on console if initial ballooning fails, panic() after stalling
for more than 3 minutes per default. Add a module parameter for
changing this timeout.
[boris: replaced pr_info() with pr_notice()]
Cc: <stable@vger.kernel.org>
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20211102091944.17487-1-jgross@suse.com
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
There are some remaining 32-bit pv-guest support leftovers in the Xen
hypercall interface. Remove them.
Signed-off-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20211028081221.2475-2-jgross@suse.com
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Xen-pciback driver was designed to be built for x86 only. But it
can also be used by other architectures, e.g. Arm.
Currently PCI backend implements multiple functionalities at a time,
such as:
1. It is used as a database for assignable PCI devices, e.g. xl
pci-assignable-{add|remove|list} manipulates that list. So, whenever
the toolstack needs to know which PCI devices can be passed through
it reads that from the relevant sysfs entries of the pciback.
2. It is used to hold the unbound PCI devices list, e.g. when passing
through a PCI device it needs to be unbound from the relevant device
driver and bound to pciback (strictly speaking it is not required
that the device is bound to pciback, but pciback is again used as a
database of the passed through PCI devices, so we can re-bind the
devices back to their original drivers when guest domain shuts down)
3. Device reset for the devices being passed through
4. Para-virtualised use-cases support
The para-virtualised part of the driver is not always needed as some
architectures, e.g. Arm or x86 PVH Dom0, are not using backend-frontend
model for PCI device passthrough.
For such use-cases make the very first step in splitting the
xen-pciback driver into two parts: Xen PCI stub and PCI PV backend
drivers.
For that add new configuration options CONFIG_XEN_PCI_STUB and
CONFIG_XEN_PCIDEV_STUB, so the driver can be limited in its
functionality, e.g. no support for para-virtualised scenario.
x86 platform will continue using CONFIG_XEN_PCIDEV_BACKEND for the
fully featured backend driver.
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20211028143620.144936-1-andr2000@gmail.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
The variable 'i' is defined as UINT. However, in the for_each_possible_cpu
its value is assigned to -1. That doesn't make sense and in the
cpumask_next() it is implicitly type converted to INT. It is universally
accepted that the implicit type conversion is terrible. Also, having the
good programming custom will set an example for others. Thus, it might be
better to change the definition of 'i' from UINT to INT.
[boris: fixed commit message formatting]
Fixes: 3fac10145b ("xen: Re-upload processor PM data to hypervisor after S3 resume (v2)")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Link: https://lore.kernel.org/r/1635233531-2437704-1-git-send-email-jiasheng@iscas.ac.cn
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jiamei Xie <jiamei.xie@arm.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
'destroy_workqueue()' already drains the queue before destroying it, so
there is no need to flush it explicitly.
Remove the redundant 'flush_workqueue()' calls.
This was generated with coccinelle:
@@
expression E;
@@
- flush_workqueue(E);
destroy_workqueue(E);
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/2d6c2e031e4aa2acf2ac4e0bbbc17cfdcc8dbee2.1634236560.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Return NULL instead of passing to ERR_PTR while err is zero,
this fix smatch warnings:
drivers/xen/xen-pciback/conf_space_capability.c:163
pm_ctrl_init() warn: passing zero to 'ERR_PTR'
Fixes: a92336a117 ("xen/pciback: Drop two backends, squash and cleanup some code.")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20211008074417.8260-1-yuehaibing@huawei.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
In order to better track where in the kernel the dma-buf code is used,
put the symbols in the namespace DMA_BUF and modify all users of the
symbols to properly import the namespace to not break the build at the
same time.
Now the output of modinfo shows the use of these symbols, making it
easier to watch for users over time:
$ modinfo drivers/misc/fastrpc.ko | grep import
import_ns: DMA_BUF
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: dri-devel@lists.freedesktop.org
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20211010124628.17691-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYWBSIwAKCRCAXGG7T9hj
vrXxAP9na1EqRJ+SpWyvxHY1jMaIrbg1bgnOc+GsnWxU5liW5AEA4h1HjHtVtrzL
3vweIS6u2fanrWlYML/daQ3r6EuLPQc=
=iXsP
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.15b-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- fix two minor issues in the Xen privcmd driver plus a cleanup patch
for that driver
- fix multiple issues related to running as PVH guest and some related
earlyprintk fixes for other Xen guest types
- fix an issue introduced in 5.15 the Xen balloon driver
* tag 'for-linus-5.15b-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/balloon: fix cancelled balloon action
xen/x86: adjust data placement
x86/PVH: adjust function/data placement
xen/x86: hook up xen_banner() also for PVH
xen/x86: generalize preferred console model from PV to PVH Dom0
xen/x86: make "earlyprintk=xen" work for HVM/PVH DomU
xen/x86: allow "earlyprintk=xen" to work for PV Dom0
xen/x86: make "earlyprintk=xen" work better for PVH Dom0
xen/x86: allow PVH Dom0 without XEN_PV=y
xen/x86: prevent PVH type from getting clobbered
xen/privcmd: drop "pages" parameter from xen_remap_pfn()
xen/privcmd: fix error handling in mmap-resource processing
xen/privcmd: replace kcalloc() by kvcalloc() when allocating empty pages
In case a ballooning action is cancelled the new kernel thread handling
the ballooning might end up in a busy loop.
Fix that by handling the cancelled action gracefully.
While at it introduce a short wait for the BP_WAIT case.
Cc: stable@vger.kernel.org
Fixes: 8480ed9c2b ("xen/balloon: use a kernel thread instead a workqueue")
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Jason Andryuk <jandryuk@gmail.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20211005133433.32008-1-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Decouple XEN_DOM0 from XEN_PV, converting some existing uses of XEN_DOM0
to a new XEN_PV_DOM0. (I'm not convinced all are really / should really
be PV-specific, but for starters I've tried to be conservative.)
For PVH Dom0 the hypervisor populates MADT with only x2APIC entries, so
without x2APIC support enabled in the kernel things aren't going to work
very well. (As opposed, DomU-s would only ever see LAPIC entries in MADT
as of now.) Note that this then requires PVH Dom0 to be 64-bit, as
X86_X2APIC depends on X86_64.
In the course of this xen_running_on_version_or_later() needs to be
available more broadly. Move it from a PV-specific to a generic file,
considering that what it does isn't really PV-specific at all anyway.
Note that xen/interface/version.h cannot be included on its own; in
enlighten.c, which uses SCHEDOP_* anyway, include xen/interface/sched.h
first to resolve the apparently sole missing type (xen_ulong_t).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/983bb72f-53df-b6af-14bd-5e088bd06a08@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
The function doesn't use it and all of its callers say in a comment that
their respective arguments are to be non-NULL only in auto-translated
mode. Since xen_remap_domain_mfn_array() isn't supposed to be used by
non-PV, drop the parameter there as well. It was bogusly passed as non-
NULL (PRIV_VMA_LOCKED) by its only caller anyway. For
xen_remap_domain_gfn_range(), otoh, it's not clear at all why this
wouldn't want / might not need to gain auto-translated support down the
road, so the parameter is retained there despite now remaining unused
(and the only caller passing NULL); correct a respective comment as
well.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/036ad8a2-46f9-ac3d-6219-bdc93ab9e10b@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
xen_pfn_t is the same size as int only on 32-bit builds (and not even
on Arm32). Hence pfns[] can't be used directly to read individual error
values returned from xen_remap_domain_mfn_array(); every other error
indicator would be skipped/ignored on 64-bit.
Fixes: 3ad0876554 ("xen/privcmd: add IOCTL_PRIVCMD_MMAP_RESOURCE")
Cc: stable@vger.kernel.org
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/aa6d6a67-6889-338a-a910-51e889f792d5@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Osstest has been suffering test failures for a little while from order-4
allocation failures, resulting from alloc_empty_pages() calling
kcalloc(). As there's no need for physically contiguous space here,
switch to kvcalloc().
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/6d698901-98a4-05be-c421-bcd0713f5335@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Add an argument to swiotlb_tbl_map_single that specifies the desired
alignment of the allocated buffer. This is used by dma-iommu to ensure
the buffer is aligned to the iova granule size when using swiotlb with
untrusted sub-granule mappings. This addresses an issue where adjacent
slots could be exposed to the untrusted device if IO_TLB_SIZE < iova
granule < PAGE_SIZE.
Signed-off-by: David Stevens <stevensd@chromium.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210929023300.335969-7-stevensd@google.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYU8xegAKCRCAXGG7T9hj
vvk2APwM85dFSMNHQo5+S35X+h+M8uKuqqvPYxVtKqQEwu3LXAEAg0cVgr1lWegI
X98f07/+M0rPJl24kgW/EIxAD9fsWAw=
=JILz
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.15b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
"Some minor cleanups and fixes of some theoretical bugs, as well as a
fix of a bug introduced in 5.15-rc1"
* tag 'for-linus-5.15b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/x86: fix PV trap handling on secondary processors
xen/balloon: fix balloon kthread freezing
swiotlb-xen: this is PV-only on x86
xen/pci-swiotlb: reduce visibility of symbols
PCI: only build xen-pcifront in PV-enabled environments
swiotlb-xen: ensure to issue well-formed XENMEM_exchange requests
Xen/gntdev: don't ignore kernel unmapping error
xen/x86: drop redundant zeroing from cpu_initialize_context()
Commit 8480ed9c2b ("xen/balloon: use a kernel thread instead a
workqueue") switched the Xen balloon driver to use a kernel thread.
Unfortunately the patch omitted to call try_to_freeze() or to use
wait_event_freezable_timeout(), causing a system suspend to fail.
Fixes: 8480ed9c2b ("xen/balloon: use a kernel thread instead a workqueue")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20210920100345.21939-1-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
The code is unreachable for HVM or PVH, and it also makes little sense
in auto-translated environments. On Arm, with
xen_{create,destroy}_contiguous_region() both being stubs, I have a hard
time seeing what good the Xen specific variant does - the generic one
ought to be fine for all purposes there. Still Arm code explicitly
references symbols here, so the code will continue to be included there.
Instead of making PCI_XEN's "select" conditional, simply drop it -
SWIOTLB_XEN will be available unconditionally in the PV case anyway, and
is - as explained above - dead code in non-PV environments.
This in turn allows dropping the stubs for
xen_{create,destroy}_contiguous_region(), the former of which was broken
anyway - it failed to set the DMA handle output.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/5947b8ae-fdc7-225c-4838-84712265fc1e@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
While the hypervisor hasn't been enforcing this, we would still better
avoid issuing requests with GFNs not aligned to the requested order.
Instead of altering the value also in the call to panic(), drop it
there for being static and hence easy to determine without being part
of the panic message.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/7b3998e3-1233-4e5a-89ec-d740e77eb166@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
While working on XSA-361 and its follow-ups, I failed to spot another
place where the kernel mapping part of an operation was not treated the
same as the user space part. Detect and propagate errors and add a 2nd
pr_debug().
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/c2513395-74dc-aea3-9192-fd265aa44e35@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYUR8DAAKCRCAXGG7T9hj
vjclAP0ekisZSabOsmzM0iz8rpuYj113/kQozokFeFD5eQlEiAD/b5LytLosic/1
5XKa5fbOyimAyRBwblWhpLLhrW6uiAg=
=WoAB
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.15b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- The first hunk of a Xen swiotlb fixup series fixing multiple minor
issues and doing some small cleanups
- Some further Xen related fixes avoiding WARN() splats when running as
Xen guests or dom0
- A Kconfig fix allowing the pvcalls frontend to be built as a module
* tag 'for-linus-5.15b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
swiotlb-xen: drop DEFAULT_NSLABS
swiotlb-xen: arrange to have buffer info logged
swiotlb-xen: drop leftover __ref
swiotlb-xen: limit init retries
swiotlb-xen: suppress certain init retries
swiotlb-xen: maintain slab count properly
swiotlb-xen: fix late init retry
swiotlb-xen: avoid double free
xen/pvcalls: backend can be a module
xen: fix usage of pmd_populate in mremap for pv guests
xen: reset legacy rtc flag for PV domU
PM: base: power: don't try to use non-existing RTC for storing data
xen/balloon: use a kernel thread instead a workqueue
It was introduced by 4035b43da6 ("xen-swiotlb: remove xen_set_nslabs")
and then not removed by 2d29960af0 ("swiotlb: dynamically allocate
io_tlb_default_mem").
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/15259326-209a-1d11-338c-5018dc38abe8@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
I consider it unhelpful that address and size of the buffer aren't put
in the log file; it makes diagnosing issues needlessly harder. The
majority of callers of swiotlb_init() also passes 1 for the "verbose"
parameter.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/2e3c8e68-36b2-4ae9-b829-bf7f75d39d47@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Commit a98f565462 ("xen-swiotlb: split xen_swiotlb_init") should not
only have added __init to the split off function, but also should have
dropped __ref from the one left.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/7cd163e1-fe13-270b-384c-2708e8273d34@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Due to the use of max(1024, ...) there's no point retrying (and issuing
bogus log messages) when the number of slabs is already no larger than
this minimum value.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/984fa426-2b7b-4b77-5ce8-766619575b7f@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Only on the 2nd of the paths leading to xen_swiotlb_init()'s "error"
label it is useful to retry the allocation; the first one did already
iterate through all possible order values.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/56477481-87da-4962-9661-5e1b277efde0@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Generic swiotlb code makes sure to keep the slab count a multiple of the
number of slabs per segment. Yet even without checking whether any such
assumption is made elsewhere, it is easy to see that xen_swiotlb_fixup()
might alter unrelated memory when calling xen_create_contiguous_region()
for the last segment, when that's not a full one - the function acts on
full order-N regions, not individual pages.
Align the slab count suitably when halving it for a retry. Add a build
time check and a runtime one. Replace the no longer useful local
variable "slabs" by an "order" one calculated just once, outside of the
loop. Re-use "order" for calculating "dma_bits", and change the type of
the latter as well as the one of "i" while touching this anyway.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/dc054cb0-bec4-4db0-fc06-c9fc957b6e66@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
The commit referenced below removed the assignment of "bytes" from
xen_swiotlb_init() without - like done for xen_swiotlb_init_early() -
adding an assignment on the retry path, thus leading to excessively
sized allocations upon retries.
Fixes: 2d29960af0 ("swiotlb: dynamically allocate io_tlb_default_mem")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: stable@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/778299d6-9cfd-1c13-026e-25ee5d14ecb3@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Of the two paths leading to the "error" label in xen_swiotlb_init() one
didn't allocate anything, while the other did already free what was
allocated.
Fixes: b827760053 ("xen/swiotlb: Use the swiotlb_late_init_with_tbl to init Xen-SWIOTLB late when PV PCI is used")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: stable@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/ce9c2adb-8a52-6293-982a-0d6ece943ac6@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
It's not clear to me why only the frontend has been tristate. Switch the
backend to be, too.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/54a6070c-92bb-36a3-2fc0-de9ccca438c5@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Today the Xen ballooning is done via delayed work in a workqueue. This
might result in workqueue hangups being reported in case of large
amounts of memory are being ballooned in one go (here 16GB):
BUG: workqueue lockup - pool cpus=6 node=0 flags=0x0 nice=0 stuck for 64s!
Showing busy workqueues and worker pools:
workqueue events: flags=0x0
pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=2/256 refcnt=3
in-flight: 229:balloon_process
pending: cache_reap
workqueue events_freezable_power_: flags=0x84
pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
pending: disk_events_workfn
workqueue mm_percpu_wq: flags=0x8
pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
pending: vmstat_update
pool 12: cpus=6 node=0 flags=0x0 nice=0 hung=64s workers=3 idle: 2222 43
This can easily be avoided by using a dedicated kernel thread for doing
the ballooning work.
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20210827123206.15429-1-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
- Add -s option (strict mode) to merge_config.sh to make it fail when
any symbol is redefined.
- Show a warning if a different compiler is used for building external
modules.
- Infer --target from ARCH for CC=clang to let you cross-compile the
kernel without CROSS_COMPILE.
- Make the integrated assembler default (LLVM_IAS=1) for CC=clang.
- Add <linux/stdarg.h> to the kernel source instead of borrowing
<stdarg.h> from the compiler.
- Add Nick Desaulniers as a Kbuild reviewer.
- Drop stale cc-option tests.
- Fix the combination of CONFIG_TRIM_UNUSED_KSYMS and CONFIG_LTO_CLANG
to handle symbols in inline assembly.
- Show a warning if 'FORCE' is missing for if_changed rules.
- Various cleanups
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmExXHoVHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsGAZwP/iHdEZzuQ4cz2uXUaV0fevj9jjPU
zJ8wrrNabAiT6f5x861DsARQSR4OSt3zN0tyBNgZwUdotbe7ED5GegrgIUBMWlML
QskhTEIZj7TexAX/20vx671gtzI3JzFg4c9BuriXCFRBvychSevdJPr65gMDOesL
vOJnXe+SGXG2+fPWi/PxrcOItNRcveqo2GiWHT3g0Cv/DJUulu81gEkz3hrufnMR
cjMeSkV0nJJcvI755OQBOUnEuigW64k4m2WxHPG24tU8cQOCqV6lqwOfNQBAn4+F
OoaCMyPQT9gvGYwGExQMCXGg0wbUt1qnxzOVoA2qFCwbo+MFhqjBvPXab6VJm7CE
mY3RrTtvxSqBdHI6EGcYeLjhycK9b+LLoJ1qc3S9FK8It6NoFFp4XV0R6ItPBls7
mWi9VSpyI6k0AwLq+bGXEHvaX/bnnf/vfqn8H+w6mRZdXjFV8EB2DiOSRX/OqjVG
RnvTtXzWWThLyXvWR3Jox4+7X6728oL7akLemoeZI6oTbJDm7dQgwpz5HbSyHXLh
d+gUF3Y/6lqxT5N9GSVDxpD1bEMh2I7nGQ4M7WGbGas/3yUemF8wbBqGQo4a+YeD
d9vGAUxDp2PQTtL2sjFo5Gd4PZEM9g7vwWzRvHe0o5NxKEXcBg25b8cD1hxrN9Y4
Y1AAnc0kLO+My3PC
=lw3M
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Add -s option (strict mode) to merge_config.sh to make it fail when
any symbol is redefined.
- Show a warning if a different compiler is used for building external
modules.
- Infer --target from ARCH for CC=clang to let you cross-compile the
kernel without CROSS_COMPILE.
- Make the integrated assembler default (LLVM_IAS=1) for CC=clang.
- Add <linux/stdarg.h> to the kernel source instead of borrowing
<stdarg.h> from the compiler.
- Add Nick Desaulniers as a Kbuild reviewer.
- Drop stale cc-option tests.
- Fix the combination of CONFIG_TRIM_UNUSED_KSYMS and CONFIG_LTO_CLANG
to handle symbols in inline assembly.
- Show a warning if 'FORCE' is missing for if_changed rules.
- Various cleanups
* tag 'kbuild-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (39 commits)
kbuild: redo fake deps at include/ksym/*.h
kbuild: clean up objtool_args slightly
modpost: get the *.mod file path more simply
checkkconfigsymbols.py: Fix the '--ignore' option
kbuild: merge vmlinux_link() between ARCH=um and other architectures
kbuild: do not remove 'linux' link in scripts/link-vmlinux.sh
kbuild: merge vmlinux_link() between the ordinary link and Clang LTO
kbuild: remove stale *.symversions
kbuild: remove unused quiet_cmd_update_lto_symversions
gen_compile_commands: extract compiler command from a series of commands
x86: remove cc-option-yn test for -mtune=
arc: replace cc-option-yn uses with cc-option
s390: replace cc-option-yn uses with cc-option
ia64: move core-y in arch/ia64/Makefile to arch/ia64/Kbuild
sparc: move the install rule to arch/sparc/Makefile
security: remove unneeded subdir-$(CONFIG_...)
kbuild: sh: remove unused install script
kbuild: Fix 'no symbols' warning when CONFIG_TRIM_UNUSD_KSYMS=y
kbuild: Switch to 'f' variants of integrated assembler flag
kbuild: Shuffle blank line to improve comment meaning
...
Pull swiotlb updates from Konrad Rzeszutek Wilk:
"A new feature called restricted DMA pools. It allows SWIOTLB to
utilize per-device (or per-platform) allocated memory pools instead of
using the global one.
The first big user of this is ARM Confidential Computing where the
memory for DMA operations can be set per platform"
* 'stable/for-linus-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb: (23 commits)
swiotlb: use depends on for DMA_RESTRICTED_POOL
of: restricted dma: Don't fail device probe on rmem init failure
of: Move of_dma_set_restricted_buffer() into device.c
powerpc/svm: Don't issue ultracalls if !mem_encrypt_active()
s390/pv: fix the forcing of the swiotlb
swiotlb: Free tbl memory in swiotlb_exit()
swiotlb: Emit diagnostic in swiotlb_exit()
swiotlb: Convert io_default_tlb_mem to static allocation
of: Return success from of_dma_set_restricted_buffer() when !OF_ADDRESS
swiotlb: add overflow checks to swiotlb_bounce
swiotlb: fix implicit debugfs declarations
of: Add plumbing for restricted DMA pool
dt-bindings: of: Add restricted DMA pool
swiotlb: Add restricted DMA pool initialization
swiotlb: Add restricted DMA alloc/free support
swiotlb: Refactor swiotlb_tbl_unmap_single
swiotlb: Move alloc_size to swiotlb_find_slots
swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing
swiotlb: Update is_swiotlb_active to add a struct device argument
swiotlb: Update is_swiotlb_buffer to add a struct device argument
...
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYTB6pAAKCRCAXGG7T9hj
vk9xAP9hUQtGdTb5eW9+3X3PHDON2jGbydAMtQ57WJLdqj0oFgD/cGV95PKv0jgg
zgw6Icf9Vp6APF/Ucd/dOB0B6qviJgw=
=PNRc
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.15-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
- some small cleanups
- a fix for a bug when running as Xen PV guest which could result in
not all memory being transferred in case of a migration of the guest
- a small series for getting rid of code for supporting very old Xen
hypervisor versions nobody should be using since many years now
- a series for hardening the Xen block frontend driver
- a fix for Xen PV boot code issuing warning messages due to a stray
preempt_disable() on the non-boot processors
* tag 'for-linus-5.15-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: remove stray preempt_disable() from PV AP startup code
xen/pcifront: Removed unnecessary __ref annotation
x86: xen: platform-pci-unplug: use pr_err() and pr_warn() instead of raw printk()
drivers/xen/xenbus/xenbus_client.c: fix bugon.cocci warnings
xen/blkfront: don't trust the backend response data blindly
xen/blkfront: don't take local copy of a request from the ring page
xen/blkfront: read response from backend only once
xen: assume XENFEAT_gnttab_map_avail_bits being set for pv guests
xen: assume XENFEAT_mmu_pt_update_preserve_ad being set for pv guests
xen: check required Xen features
xen: fix setting of max_pfn in shared_info
- fix debugfs initialization order (Anthony Iliopoulos)
- use memory_intersects() directly (Kefeng Wang)
- allow to return specific errors from ->map_sg
(Logan Gunthorpe, Martin Oliveira)
- turn the dma_map_sg return value into an unsigned int (me)
- provide a common global coherent pool іmplementation (me)
-----BEGIN PGP SIGNATURE-----
iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmEvY+8LHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYPaehAAsgnBzzzoLHO83pgs0KL92c+0DiMNHYmaMCJOvZXk
x2Irv+O74WikRJc4S7uQ26p2spjmUxjmiOjld+8+NN0liD4QO9BQ/SZpIp8emuKS
/yPG6Xh86xSl/OrPL1y7kGeHkRi5sm3mRhcTdILFQFPLcSReupe++GRfnvrpbOPk
tj3pBGXluD6iJH12BBt00ushUVzZ0F2xaF6xUDAs94RSZ3tlqsfx6c928Y1KxSZh
f89q/KuaokyogFG7Ujj/nYgIUETaIs2W6UmxBfRzdEMJFSffwomUMbw+M+qGJ7/d
2UjamFYRX16FReE8WNsndbX1E6k5JBW12E1qwV3dUwatlNLWEaRq3PNiWkF7zcFH
LDkpDYN6s5bIDPTfDp21XfPygoH8KQhnD9lVf0aB7n04uu8VJrGB9+10PpkCJVXD
0b2dcuSwCO7hAfTfNGVV8f3EI/1XPflr1hJvMgcVtY53CR96ldp+4QaElzWLXumN
MyptirmrVITNVyVwGzhGAblXBLWdarXD0EXudyiaF4Xbrj3AkIOSUCghEwKLpjQf
UwMFFwSE8yGxKTRK4HfU5gMzy6G751fU7TUe5lmxZLovDflQoSXMWgHE8e7r0Qel
o5v6lmUzoWz2fAISf3xjauo2ncgmfWMwYM6C7OJy5nG73QXLQId9J+ReXbmrgrrN
DgI=
=spje
-----END PGP SIGNATURE-----
Merge tag 'dma-mapping-5.15' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping updates from Christoph Hellwig:
- fix debugfs initialization order (Anthony Iliopoulos)
- use memory_intersects() directly (Kefeng Wang)
- allow to return specific errors from ->map_sg (Logan Gunthorpe,
Martin Oliveira)
- turn the dma_map_sg return value into an unsigned int (me)
- provide a common global coherent pool іmplementation (me)
* tag 'dma-mapping-5.15' of git://git.infradead.org/users/hch/dma-mapping: (31 commits)
hexagon: use the generic global coherent pool
dma-mapping: make the global coherent pool conditional
dma-mapping: add a dma_init_global_coherent helper
dma-mapping: simplify dma_init_coherent_memory
dma-mapping: allow using the global coherent pool for !ARM
ARM/nommu: use the generic dma-direct code for non-coherent devices
dma-direct: add support for dma_coherent_default_memory
dma-mapping: return an unsigned int from dma_map_sg{,_attrs}
dma-mapping: disallow .map_sg operations from returning zero on error
dma-mapping: return error code from dma_dummy_map_sg()
x86/amd_gart: don't set failed sg dma_address to DMA_MAPPING_ERROR
x86/amd_gart: return error code from gart_map_sg()
xen: swiotlb: return error code from xen_swiotlb_map_sg()
parisc: return error code from .map_sg() ops
sparc/iommu: don't set failed sg dma_address to DMA_MAPPING_ERROR
sparc/iommu: return error codes from .map_sg() ops
s390/pci: don't set failed sg dma_address to DMA_MAPPING_ERROR
s390/pci: return error code from s390_dma_map_sg()
powerpc/iommu: don't set failed sg dma_address to DMA_MAPPING_ERROR
powerpc/iommu: return error code from .map_sg() ops
...
Here is the big set of driver core patches for 5.15-rc1.
These do change a number of different things across different
subsystems, and because of that, there were 2 stable tags created that
might have already come into your tree from different pulls that did the
following
- changed the bus remove callback to return void
- sysfs iomem_get_mapping rework
The latter one will cause a tiny merge issue with your tree, as there
was a last-minute fix for this in 5.14 in your tree, but the fixup
should be "obvious". If you want me to provide a fixed merge for this,
please let me know.
Other than those two things, there's only a few small things in here:
- kernfs performance improvements for huge numbers of sysfs
users at once
- tiny api cleanups
- other minor changes
All of these have been in linux-next for a while with no reported
problems, other than the before-mentioned merge issue.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYS+FLQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylXuACfWECnysDtXNe66DdETCFs1a1RToYAoMokWeU5
s8VFP1NY2BjmxJbkebLL
=8kVu
-----END PGP SIGNATURE-----
Merge tag 'driver-core-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the big set of driver core patches for 5.15-rc1.
These do change a number of different things across different
subsystems, and because of that, there were 2 stable tags created that
might have already come into your tree from different pulls that did
the following
- changed the bus remove callback to return void
- sysfs iomem_get_mapping rework
Other than those two things, there's only a few small things in here:
- kernfs performance improvements for huge numbers of sysfs users at
once
- tiny api cleanups
- other minor changes
All of these have been in linux-next for a while with no reported
problems, other than the before-mentioned merge issue"
* tag 'driver-core-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (33 commits)
MAINTAINERS: Add dri-devel for component.[hc]
driver core: platform: Remove platform_device_add_properties()
ARM: tegra: paz00: Handle device properties with software node API
bitmap: extend comment to bitmap_print_bitmask/list_to_buf
drivers/base/node.c: use bin_attribute to break the size limitation of cpumap ABI
topology: use bin_attribute to break the size limitation of cpumap ABI
lib: test_bitmap: add bitmap_print_bitmask/list_to_buf test cases
cpumask: introduce cpumap_print_list/bitmask_to_buf to support large bitmask and list
sysfs: Rename struct bin_attribute member to f_mapping
sysfs: Invoke iomem_get_mapping() from the sysfs open callback
debugfs: Return error during {full/open}_proxy_open() on rmmod
zorro: Drop useless (and hardly used) .driver member in struct zorro_dev
zorro: Simplify remove callback
sh: superhyway: Simplify check in remove callback
nubus: Simplify check in remove callback
nubus: Make struct nubus_driver::remove return void
kernfs: dont call d_splice_alias() under kernfs node lock
kernfs: use i_lock to protect concurrent inode updates
kernfs: switch kernfs to use an rwsem
kernfs: use VFS negative dentry caching
...
Use BUG_ON instead of a if condition followed by BUG.
Generated by: scripts/coccinelle/misc/bugon.cocci
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Jing Yangyang <jing.yangyang@zte.com.cn>
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20210825062451.69998-1-deng.changcheng@zte.com.cn
Signed-off-by: Juergen Gross <jgross@suse.com>
XENFEAT_gnttab_map_avail_bits is always set in Xen 4.0 and newer.
Remove coding assuming it might be zero.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20210730071804.4302-4-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Linux kernel is not supported to run on Xen versions older than 4.0.
Add tests for required Xen features always being present in Xen 4.0
and newer.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20210730071804.4302-2-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Delete/fixup few includes in anticipation of global -isystem compile
option removal.
Note: crypto/aegis128-neon-inner.c keeps <stddef.h> due to redefinition
of uintptr_t error (one definition comes from <stddef.h>, another from
<linux/types.h>).
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYReBzAAKCRCAXGG7T9hj
vsjcAQD9tV9ThjOP9HXL3lGGoSIBCy+t92ZwpVkQuYITa9fHFAD/ViU+Iye5WHTf
jWxlveDJMoOegEmpKEQAoX56hDHPkww=
=lQOv
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.14-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
"A small cleanup patch and a fix of a rare race in the Xen evtchn
driver"
* tag 'for-linus-5.14-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/events: Fix race in set_evtchn_to_irq
xen/events: remove redundant initialization of variable irq
There is a TOCTOU issue in set_evtchn_to_irq. Rows in the evtchn_to_irq
mapping are lazily allocated in this function. The check whether the row
is already present and the row initialization is not synchronized. Two
threads can at the same time allocate a new row for evtchn_to_irq and
add the irq mapping to the their newly allocated row. One thread will
overwrite what the other has set for evtchn_to_irq[row] and therefore
the irq mapping is lost. This will trigger a BUG_ON later in
bind_evtchn_to_cpu:
INFO: pci 0000:1a:15.4: [1d0f:8061] type 00 class 0x010802
INFO: nvme 0000:1a:12.1: enabling device (0000 -> 0002)
INFO: nvme nvme77: 1/0/0 default/read/poll queues
CRIT: kernel BUG at drivers/xen/events/events_base.c:427!
WARN: invalid opcode: 0000 [#1] SMP NOPTI
WARN: Workqueue: nvme-reset-wq nvme_reset_work [nvme]
WARN: RIP: e030:bind_evtchn_to_cpu+0xc2/0xd0
WARN: Call Trace:
WARN: set_affinity_irq+0x121/0x150
WARN: irq_do_set_affinity+0x37/0xe0
WARN: irq_setup_affinity+0xf6/0x170
WARN: irq_startup+0x64/0xe0
WARN: __setup_irq+0x69e/0x740
WARN: ? request_threaded_irq+0xad/0x160
WARN: request_threaded_irq+0xf5/0x160
WARN: ? nvme_timeout+0x2f0/0x2f0 [nvme]
WARN: pci_request_irq+0xa9/0xf0
WARN: ? pci_alloc_irq_vectors_affinity+0xbb/0x130
WARN: queue_request_irq+0x4c/0x70 [nvme]
WARN: nvme_reset_work+0x82d/0x1550 [nvme]
WARN: ? check_preempt_wakeup+0x14f/0x230
WARN: ? check_preempt_curr+0x29/0x80
WARN: ? nvme_irq_check+0x30/0x30 [nvme]
WARN: process_one_work+0x18e/0x3c0
WARN: worker_thread+0x30/0x3a0
WARN: ? process_one_work+0x3c0/0x3c0
WARN: kthread+0x113/0x130
WARN: ? kthread_park+0x90/0x90
WARN: ret_from_fork+0x3a/0x50
This patch sets evtchn_to_irq rows via a cmpxchg operation so that they
will be set only once. The row is now cleared before writing it to
evtchn_to_irq in order to not create a race once the row is visible for
other threads.
While at it, do not require the page to be zeroed, because it will be
overwritten with -1's in clear_evtchn_to_irq_row anyway.
Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
Fixes: d0b075ffee ("xen/events: Refactor evtchn_to_irq array to be dynamically allocated")
Link: https://lore.kernel.org/r/20210812130930.127134-1-mheyne@amazon.de
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
The .map_sg() op now expects an error code instead of zero on failure.
xen_swiotlb_map_sg() may only fail if xen_swiotlb_map_page() fails, but
xen_swiotlb_map_page() only supports returning errors as
DMA_MAPPING_ERROR. So coalesce all errors into EIO per the documentation
for dma_map_sgtable().
Signed-off-by: Martin Oliveira <martin.oliveira@eideticom.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Since commit 69031f5008 ("swiotlb: Set dev->dma_io_tlb_mem to the
swiotlb pool used"), 'struct device' may hold a copy of the global
'io_default_tlb_mem' pointer if the device is using swiotlb for DMA. A
subsequent call to swiotlb_exit() will therefore leave dangling pointers
behind in these device structures, resulting in KASAN splats such as:
| BUG: KASAN: use-after-free in __iommu_dma_unmap_swiotlb+0x64/0xb0
| Read of size 8 at addr ffff8881d7830000 by task swapper/0/0
|
| CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.12.0-rc3-debug #1
| Hardware name: HP HP Desktop M01-F1xxx/87D6, BIOS F.12 12/17/2020
| Call Trace:
| <IRQ>
| dump_stack+0x9c/0xcf
| print_address_description.constprop.0+0x18/0x130
| kasan_report.cold+0x7f/0x111
| __iommu_dma_unmap_swiotlb+0x64/0xb0
| nvme_pci_complete_rq+0x73/0x130
| blk_complete_reqs+0x6f/0x80
| __do_softirq+0xfc/0x3be
Convert 'io_default_tlb_mem' to a static structure, so that the
per-device pointers remain valid after swiotlb_exit() has been invoked.
All users are updated to reference the static structure directly, using
the 'nslabs' field to determine whether swiotlb has been initialised.
The 'slots' array is still allocated dynamically and referenced via a
pointer rather than a flexible array member.
Cc: Claire Chang <tientzu@chromium.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Fixes: 69031f5008 ("swiotlb: Set dev->dma_io_tlb_mem to the swiotlb pool used")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Claire Chang <tientzu@chromium.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad@kernel.org>
The variable irq is being initialized with a value that is never
read, it is being updated later on. The assignment is redundant and
can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210721114010.108648-1-colin.king@canonical.com
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
The driver core ignores the return value of this callback because there
is only little it can do when a device disappears.
This is the final bit of a long lasting cleanup quest where several
buses were converted to also return void from their remove callback.
Additionally some resource leaks were fixed that were caused by drivers
returning an error code in the expectation that the driver won't go
away.
With struct bus_type::remove returning void it's prevented that newly
implemented buses return an ignored error code and so don't anticipate
wrong expectations for driver authors.
Reviewed-by: Tom Rix <trix@redhat.com> (For fpga)
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com> (For drivers/s390 and drivers/vfio)
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> (For ARM, Amba and related parts)
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Chen-Yu Tsai <wens@csie.org> (for sunxi-rsb)
Acked-by: Pali Rohár <pali@kernel.org>
Acked-by: Mauro Carvalho Chehab <mchehab@kernel.org> (for media)
Acked-by: Hans de Goede <hdegoede@redhat.com> (For drivers/platform)
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Acked-By: Vinod Koul <vkoul@kernel.org>
Acked-by: Juergen Gross <jgross@suse.com> (For xen)
Acked-by: Lee Jones <lee.jones@linaro.org> (For mfd)
Acked-by: Johannes Thumshirn <jth@kernel.org> (For mcb)
Acked-by: Johan Hovold <johan@kernel.org>
Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> (For slimbus)
Acked-by: Kirti Wankhede <kwankhede@nvidia.com> (For vfio)
Acked-by: Maximilian Luz <luzmaximilian@gmail.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> (For ulpi and typec)
Acked-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> (For ipack)
Acked-by: Geoff Levand <geoff@infradead.org> (For ps3)
Acked-by: Yehezkel Bernat <YehezkelShB@gmail.com> (For thunderbolt)
Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> (For intel_th)
Acked-by: Dominik Brodowski <linux@dominikbrodowski.net> (For pcmcia)
Acked-by: Rafael J. Wysocki <rafael@kernel.org> (For ACPI)
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org> (rpmsg and apr)
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> (For intel-ish-hid)
Acked-by: Dan Williams <dan.j.williams@intel.com> (For CXL, DAX, and NVDIMM)
Acked-by: William Breathitt Gray <vilhelm.gray@gmail.com> (For isa)
Acked-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (For firewire)
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> (For hid)
Acked-by: Thorsten Scherer <t.scherer@eckelmann.de> (For siox)
Acked-by: Sven Van Asbroeck <TheSven73@gmail.com> (For anybuss)
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> (For MMC)
Acked-by: Wolfram Sang <wsa@kernel.org> # for I2C
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Finn Thain <fthain@linux-m68k.org>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20210713193522.1770306-6-u.kleine-koenig@pengutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Propagate the swiotlb_force into io_tlb_default_mem->force_bounce and
use it to determine whether to bounce the data or not. This will be
useful later to allow for different pools.
Signed-off-by: Claire Chang <tientzu@chromium.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Stefano Stabellini <sstabellini@kernel.org>
Tested-by: Will Deacon <will@kernel.org>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[v2: Includes Will's fix]
Update is_swiotlb_buffer to add a struct device argument. This will be
useful later to allow for different pools.
Signed-off-by: Claire Chang <tientzu@chromium.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Stefano Stabellini <sstabellini@kernel.org>
Tested-by: Will Deacon <will@kernel.org>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYOVQVQAKCRCAXGG7T9hj
vnsfAQCLwcqSbno1iiUFDFGDzwnF0RZjIxpvEvdI/Zc0KMAO5gEA9tixIpfBYUO3
0yAd9g/VRumPS0wRsfwUrnLYs8TZ4gY=
=pPbL
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.14-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
"Only two minor patches this time: one cleanup patch and one patch
refreshing a Xen header"
* tag 'for-linus-5.14-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: sync include/xen/interface/io/ring.h with Xen's newest version
xen: Use DEVICE_ATTR_*() macro
Use DEVICE_ATTR_*() helper instead of plain DEVICE_ATTR(),
which makes the code a bit shorter and easier to read.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20210526141019.13752-1-yuehaibing@huawei.com
Signed-off-by: Juergen Gross <jgross@suse.com>
This series consists of the usual driver updates (ufs, ibmvfc,
megaraid_sas, lpfc, elx, mpi3mr, qedi, iscsi, storvsc, mpt3sas) with
elx and mpi3mr being new drivers. The major core change is a rework
to drop the status byte handling macros and the old bit shifted
definitions and the rest of the updates are minor fixes.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYN7I6iYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishXpRAQCkngYZ
35yQrqOxgOk2pfrysE95tHrV1MfJm2U49NFTwAEAuZutEvBUTfBF+sbcJ06r6q7i
H0hkJN/Io7enFs5v3WA=
=zwIa
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This series consists of the usual driver updates (ufs, ibmvfc,
megaraid_sas, lpfc, elx, mpi3mr, qedi, iscsi, storvsc, mpt3sas) with
elx and mpi3mr being new drivers.
The major core change is a rework to drop the status byte handling
macros and the old bit shifted definitions and the rest of the updates
are minor fixes"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (287 commits)
scsi: aha1740: Avoid over-read of sense buffer
scsi: arcmsr: Avoid over-read of sense buffer
scsi: ips: Avoid over-read of sense buffer
scsi: ufs: ufs-mediatek: Add missing of_node_put() in ufs_mtk_probe()
scsi: elx: libefc: Fix IRQ restore in efc_domain_dispatch_frame()
scsi: elx: libefc: Fix less than zero comparison of a unsigned int
scsi: elx: efct: Fix pointer error checking in debugfs init
scsi: elx: efct: Fix is_originator return code type
scsi: elx: efct: Fix link error for _bad_cmpxchg
scsi: elx: efct: Eliminate unnecessary boolean check in efct_hw_command_cancel()
scsi: elx: efct: Do not use id uninitialized in efct_lio_setup_session()
scsi: elx: efct: Fix error handling in efct_hw_init()
scsi: elx: efct: Remove redundant initialization of variable lun
scsi: elx: efct: Fix spelling mistake "Unexected" -> "Unexpected"
scsi: lpfc: Fix build error in lpfc_scsi.c
scsi: target: iscsi: Remove redundant continue statement
scsi: qla4xxx: Remove redundant continue statement
scsi: ppa: Switch to use module_parport_driver()
scsi: imm: Switch to use module_parport_driver()
scsi: mpt3sas: Fix error return value in _scsih_expander_add()
...
In order to avoid a race condition for user events when changing
cpu affinity reset the active flag only when EOI-ing the event.
This is working fine as all user events are lateeoi events. Note that
lateeoi_ack_mask_dynirq() is not modified as there is no explicit call
to xen_irq_lateeoi() expected later.
Cc: stable@vger.kernel.org
Reported-by: Julien Grall <julien@xen.org>
Fixes: b6622798bc ("xen/events: avoid handling the same event on two cpus at the same time")
Tested-by: Julien Grall <julien@xen.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrvsky@oracle.com>
Link: https://lore.kernel.org/r/20210623130913.9405-1-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Originally the SCSI subsystem has been using 'special' SCSI status codes,
which were the SAM-specified ones but shifted by 1. As most drivers have
now been modified to use the SAM-specified ones, having two nearly
identical sets of definitions only causes confusion.
The Linux-specifed SCSI status codes have been marked obsolete for several
years so drop them and use the SAM-specified status codes throughout.
Link: https://lore.kernel.org/r/20210427083046.31620-41-hare@suse.de
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The message byte is now unused, so we can drop the helper to set the
message byte and the check for message bytes during error recovery.
Link: https://lore.kernel.org/r/20210427083046.31620-38-hare@suse.de
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver_byte field in the result is now unused, so we can drop the
definitions.
Link: https://lore.kernel.org/r/20210427083046.31620-15-hare@suse.de
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
DRIVER_ERROR was supposed to signal an error generated by the driver, which
xen-scsiback arguably isn't. Also the driver bytes don't have a detailed
error recovery, so we should rather return DID_ERROR instead of
DRIVER_ERROR.
Link: https://lore.kernel.org/r/20210427083046.31620-13-hare@suse.de
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Replace the check for DRIVER_SENSE with a check for
scsi_status_is_check_condition().
Audit all callsites to ensure the SAM status is set correctly. For
backwards compability move the DRIVER_SENSE definition to sg.h, and update
sg, bsg, and scsi_ioctl to set the DRIVER_SENSE driver_status whenever
SAM_STAT_CHECK_CONDITION is present.
[mkp: fix zeroday srp warning]
Link: https://lore.kernel.org/r/20210427083046.31620-10-hare@suse.de
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
fix
When multiple PCI devices get assigned to a guest right at boot, libxl
incrementally populates the backend tree. The writes for the first of
the devices trigger the backend watch. In turn xen_pcibk_setup_backend()
will set the XenBus state to Initialised, at which point no further
reconfigures would happen unless a device got hotplugged. Arrange for
reconfigure to also get triggered from the backend watch handler.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: stable@vger.kernel.org
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/2337cbd6-94b9-4187-9862-c03ea12e0c61@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
The commit referenced below was incomplete: It merely affected what
would get written to the vdev-<N> xenstore node. The guest would still
find the function at the original function number as long as
__xen_pcibk_get_pci_dev() wouldn't be in sync. The same goes for AER wrt
__xen_pcibk_get_pcifront_dev().
Undo overriding the function to zero and instead make sure that VFs at
function zero remain alone in their slot. This has the added benefit of
improving overall capacity, considering that there's only a total of 32
slots available right now (PCI segment and bus can both only ever be
zero at present).
Fixes: 8a5248fe10 ("xen PV passthru: assign SR-IOV virtual functions to separate virtual slots")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: stable@vger.kernel.org
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/8def783b-404c-3452-196d-3f3fd4d72c9e@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
xen_swiotlb_init calls swiotlb_late_init_with_tbl, which fails with
-ENOMEM if the swiotlb has already been initialized.
Add an explicit check io_tlb_default_mem != NULL at the beginning of
xen_swiotlb_init. If the swiotlb is already initialized print a warning
and return -EEXIST.
On x86, the error propagates.
On ARM, we don't actually need a special swiotlb buffer (yet), any
buffer would do. So ignore the error and continue.
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Boris Ostrovsky <boris.ostrvsky@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210512201823.1963-3-sstabellini@kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Fix to return a negative error code from the error handling case instead
of 0, as done elsewhere in this function.
Fixes: a4574f63ed ("mm/memremap_pages: convert to 'struct range'")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20210508021913.1727-1-thunder.leizhen@huawei.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Commit d3eeb1d77c ("xen/gntdev: use mmu_interval_notifier_insert")
introduced an error in gntdev_mmap(): in case the call of
mmu_interval_notifier_insert_locked() fails the exit path should not
call mmu_interval_notifier_remove(), as this might result in NULL
dereferences.
One reason for failure is e.g. a signal pending for the running
process.
Fixes: d3eeb1d77c ("xen/gntdev: use mmu_interval_notifier_insert")
Cc: stable@vger.kernel.org
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Link: https://lore.kernel.org/r/20210423054038.26696-1-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Pull swiotlb updates from Konrad Rzeszutek Wilk:
"Christoph Hellwig has taken a cleaver and trimmed off the not-needed
code and nicely folded duplicate code in the generic framework.
This lays the groundwork for more work to add extra DMA-backend-ish in
the future. Along with that some bug-fixes to make this a nice working
package"
* 'stable/for-linus-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
swiotlb: don't override user specified size in swiotlb_adjust_size
swiotlb: Fix the type of index
swiotlb: Make SWIOTLB_NO_FORCE perform no allocation
ARM: Qualify enabling of swiotlb_init()
swiotlb: remove swiotlb_nr_tbl
swiotlb: dynamically allocate io_tlb_default_mem
swiotlb: move global variables into a new io_tlb_mem structure
xen-swiotlb: remove the unused size argument from xen_swiotlb_fixup
xen-swiotlb: split xen_swiotlb_init
swiotlb: lift the double initialization protection from xen-swiotlb
xen-swiotlb: remove xen_io_tlb_start and xen_io_tlb_nslabs
xen-swiotlb: remove xen_set_nslabs
xen-swiotlb: use io_tlb_end in xen_swiotlb_dma_supported
xen-swiotlb: use is_swiotlb_buffer in is_xen_swiotlb_buffer
swiotlb: split swiotlb_tbl_sync_single
swiotlb: move orig addr and size validation into swiotlb_bounce
swiotlb: remove the alloc_size parameter to swiotlb_tbl_unmap_single
powerpc/svm: stop using io_tlb_start
This series consists of the usual driver updates (ufs, target, tcmu,
smartpqi, lpfc, zfcp, qla2xxx, mpt3sas, pm80xx). The major core
change is using a sbitmap instead of an atomic for queue tracking.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYInvqCYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishYh2AP0SgqqL
WYZRT2oiyBOKD28v+ceOSiXvgjPlqABwVMC0BAEAn29/wNCxyvzZ1k/b0iPJ4M+S
klkSxLzXKQLzJBgdK5w=
=p5B/
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This consists of the usual driver updates (ufs, target, tcmu,
smartpqi, lpfc, zfcp, qla2xxx, mpt3sas, pm80xx).
The major core change is using a sbitmap instead of an atomic for
queue tracking"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (412 commits)
scsi: target: tcm_fc: Fix a kernel-doc header
scsi: target: Shorten ALUA error messages
scsi: target: Fix two format specifiers
scsi: target: Compare explicitly with SAM_STAT_GOOD
scsi: sd: Introduce a new local variable in sd_check_events()
scsi: dc395x: Open-code status_byte(u8) calls
scsi: 53c700: Open-code status_byte(u8) calls
scsi: smartpqi: Remove unused functions
scsi: qla4xxx: Remove an unused function
scsi: myrs: Remove unused functions
scsi: myrb: Remove unused functions
scsi: mpt3sas: Fix two kernel-doc headers
scsi: fcoe: Suppress a compiler warning
scsi: libfc: Fix a format specifier
scsi: aacraid: Remove an unused function
scsi: core: Introduce enum scsi_disposition
scsi: core: Modify the scsi_send_eh_cmnd() return value for the SDEV_BLOCK case
scsi: core: Rename scsi_softirq_done() into scsi_complete()
scsi: core: Remove an incorrect comment
scsi: core: Make the scsi_alloc_sgtables() documentation more accurate
...
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYIZZdwAKCRCAXGG7T9hj
vtaDAQDplyo+1T1Mv5DepYe0DnGYicOsCxzYzqMvhYkb+eubyAD/SFcsof/PtJAW
2zropoo2NTnf+zQVuC638pdXVSK9VAc=
=n1Tz
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.13-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
- remove some PV ACPI cpu/memory hotplug code which has been broken for
a long time
- support direct mapped guests (other than dom0) on Arm
- several small fixes and cleanups
* tag 'for-linus-5.13-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/arm: introduce XENFEAT_direct_mapped and XENFEAT_not_direct_mapped
xen-pciback: simplify vpci's find hook
xen-blkfront: Fix 'physical' typos
xen-blkback: fix compatibility bug with single page rings
xen: Remove support for PV ACPI cpu/memory hotplug
xen/pciback: Fix incorrect type warnings
eliminate custom code patching. For that, the alternatives infra is
extended to accomodate paravirt's needs and, as a result, a lot of
paravirt patching code goes away, leading to a sizeable cleanup and
simplification. Work by Juergen Gross.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmCGiXQACgkQEsHwGGHe
VUocbw/+OkFzphK6zlNA8O3RJ24u2csXUWWUtpGlZ2220Nn/Bgyso2+fyg/NEeQg
EmEttaY3JG/riCDfHk5Xm2saeVtsbPXN4f0sJm/Io/djF7Cm03WS0eS0aA2Rnuca
MhmvvkrzYqZXAYVaxKkIH6sNlPgyXX7vDNPbTd/0ZCOb3ZKIyXwL+SaLatMCtE5o
ou7e8Bj8xPSwcaCyK6sqjrT6jdpPjoTrxxrwENW8AlRu5lCU1pIY03GGhARPVoEm
fWkZsIPn7DxhpyIqzJtEMX8EK1xN96E+NGkNuSAtJGP9HRb+3j5f4s3IUAfXiLXq
r7NecFw8zHhPKl9J0pPCiW7JvMrCMU5xGwyeUmmhKyK2BxwvvAC173ohgMlCfB2Q
FPIsQWemat17tSue8LIA8SmlSDQz6R+tTdUFT+vqmNV34PxOIEeSdV7HG8rs87Ec
dYB9ENUgXqI+h2t7atE68CpTLpWXzNDcq2olEsaEUXenky2hvsi+VxNkWpmlKQ3I
NOMU/AyH8oUzn5O0o3oxdPhDLmK5ItEFxjYjwrgLfKFQ+Y8vIMMq3LrKQGwOj+ZU
n9qC7JjOwDKZGjd3YqNNRhnXp+w0IJvUHbyr3vIAcp8ohQwEKgpUvpZzf/BKUvHh
nJgJSJ53GFJBbVOJMfgVq+JcFr+WO8MDKHaw6zWeCkivFZdSs4g=
=h+km
-----END PGP SIGNATURE-----
Merge tag 'x86_alternatives_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 alternatives/paravirt updates from Borislav Petkov:
"First big cleanup to the paravirt infra to use alternatives and thus
eliminate custom code patching.
For that, the alternatives infrastructure is extended to accomodate
paravirt's needs and, as a result, a lot of paravirt patching code
goes away, leading to a sizeable cleanup and simplification.
Work by Juergen Gross"
* tag 'x86_alternatives_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/paravirt: Have only one paravirt patch function
x86/paravirt: Switch functions with custom code to ALTERNATIVE
x86/paravirt: Add new PVOP_ALT* macros to support pvops in ALTERNATIVEs
x86/paravirt: Switch iret pvops to ALTERNATIVE
x86/paravirt: Simplify paravirt macros
x86/paravirt: Remove no longer needed 32-bit pvops cruft
x86/paravirt: Add new features for paravirt patching
x86/alternative: Use ALTERNATIVE_TERNARY() in _static_cpu_has()
x86/alternative: Support ALTERNATIVE_TERNARY
x86/alternative: Support not-feature
x86/paravirt: Switch time pvops functions to use static_call()
static_call: Add function to query current function
static_call: Move struct static_call_key definition to static_call_types.h
x86/alternative: Merge include files
x86/alternative: Drop unused feature parameter from ALTINSTR_REPLACEMENT()
There's no point in comparing SBDF - we can simply compare the struct
pci_dev pointers. If they weren't the same for a given device, we'd have
bigger problems from having stored a stale pointer.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/158273a2-d1b9-3545-b25d-affca867376c@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Commit 76fc253723 ("xen/acpi-stub: Disable it b/c the acpi_processor_add
is no longer called.") declared as BROKEN support for Xen ACPI stub (which
is required for xen-acpi-{cpu|memory}-hotplug) and suggested that this
is temporary and will be soon fixed. This was in March 2013.
Further, commit cfafae9403 ("xen: rename dom0_op to platform_op")
renamed an interface used by memory hotplug code without updating that
code (as it was BROKEN and therefore not compiled). This was
in November 2015 and has gone unnoticed for over 5 year.
It is now clear that this code is of no interest to anyone and therefore
should be removed.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/1618336344-3162-1-git-send-email-boris.ostrovsky@oracle.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Correct enum pci_channel_io_normal should be used instead of putting
integer value 1.
Fix following smatch warnings:
drivers/xen/xen-pciback/pci_stub.c:805:40: warning: incorrect type in argument 2 (different base types)
drivers/xen/xen-pciback/pci_stub.c:805:40: expected restricted pci_channel_state_t [usertype] state
drivers/xen/xen-pciback/pci_stub.c:805:40: got int
drivers/xen/xen-pciback/pci_stub.c:862:40: warning: incorrect type in argument 2 (different base types)
drivers/xen/xen-pciback/pci_stub.c:862:40: expected restricted pci_channel_state_t [usertype] state
drivers/xen/xen-pciback/pci_stub.c:862:40: got int
drivers/xen/xen-pciback/pci_stub.c:973:31: warning: incorrect type in argument 2 (different base types)
drivers/xen/xen-pciback/pci_stub.c:973:31: expected restricted pci_channel_state_t [usertype] state
drivers/xen/xen-pciback/pci_stub.c:973:31: got int
Signed-off-by: Muhammad Usama Anjum <musamaanjum@gmail.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20210326181442.GA1735905@LEGION
Signed-off-by: Juergen Gross <jgross@suse.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYHB+3wAKCRCAXGG7T9hj
vktXAQCV+BbqlrLdQwsoWqFCrPB9PLMOsMRw5q9BykUDihscEgD/RwnJlIGGwgvE
MzAcCyXWukL8epStIWbBxTBHM12eJw8=
=7jyu
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.12b-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fix from Juergen Gross:
"A single fix of a 5.12 patch for the rather uncommon problem of
running as a Xen guest with a real time kernel config"
* tag 'for-linus-5.12b-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/evtchn: Change irq_info lock to raw_spinlock_t
Unmask operation must be called with interrupt disabled,
on preempt_rt spin_lock_irqsave/spin_unlock_irqrestore
don't disable/enable interrupts, so use raw_* implementation
and change lock variable in struct irq_info from spinlock_t
to raw_spinlock_t
Cc: stable@vger.kernel.org
Fixes: 25da4618af ("xen/events: don't unmask an event channel when an eoi is pending")
Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Wei Liu <wei.liu@kernel.org>
Link: https://lore.kernel.org/r/20210406105105.10141-1-luca.fancellu@arm.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYF37OgAKCRCAXGG7T9hj
vp8hAP4h7mvjfkntbFXagrJK9pi2xVC9d/YO5nfa8/K3LcGVnQD/fKcU9ggPN9vI
GLnhyprGLcCA4aTL6Ogb37o9fDd4Yws=
=joIg
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.12b-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
"This contains a small series with a more elegant fix of a problem
which was originally fixed in rc2"
* tag 'for-linus-5.12b-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
Revert "xen: fix p2m size in dom0 for disabled memory hotplug case"
xen/x86: make XEN_BALLOON_MEMORY_HOTPLUG_LIMIT depend on MEMORY_HOTPLUG
The Xen memory hotplug limit should depend on the memory hotplug
generic option, rather than the Xen balloon configuration. It's
possible to have a kernel with generic memory hotplug enabled, but
without Xen balloon enabled, at which point memory hotplug won't work
correctly due to the size limitation of the p2m.
Rename the option to XEN_MEMORY_HOTPLUG_LIMIT since it's no longer
tied to ballooning.
Fixes: 9e2369c06c ("xen: add helpers to allocate unpopulated memory")
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20210324122424.58685-2-roger.pau@citrix.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Instead of allocating ->list and ->orig_addr separately just do one
dynamic allocation for the actual io_tlb_mem structure. This simplifies
a lot of the initialization code, and also allows to just check
io_tlb_default_mem to see if swiotlb is in use.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Added a new struct, io_tlb_mem, as the IO TLB memory pool descriptor and
moved relevant global variables into that struct.
This will be useful later to allow for restricted DMA pool.
Signed-off-by: Claire Chang <tientzu@chromium.org>
[hch: rebased]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Split xen_swiotlb_init into a normal an an early case. That makes both
much simpler and more readable, and also allows marking the early
code as __init and x86-only.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Lift the double initialization protection from xen-swiotlb to the core
code to avoid exposing too many swiotlb internals. Also upgrade the
check to a warning as it should not happen.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The xen_io_tlb_start and xen_io_tlb_nslabs variables are now only used in
xen_swiotlb_init, so replace them with local variables.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The xen_set_nslabs function is a little weird, as it has just one
caller, that caller passes a global variable as the argument,
which is then overriden in the function and a derivative of it
returned. Just add a cpp symbol for the default size using a readable
constant and open code the remaining three lines in the caller.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Use the existing variable that holds the physical address for
xen_io_tlb_end to simplify xen_swiotlb_dma_supported a bit, and remove
the otherwise unused xen_io_tlb_end variable and the xen_virt_to_bus
helper.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Use the is_swiotlb_buffer to check if a physical address is
a swiotlb buffer. This works because xen-swiotlb does use the
same buffer as the main swiotlb code, and xen_io_tlb_{start,end}
are just the addresses for it that went through phys_to_virt.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Split swiotlb_tbl_sync_single into two separate funtions for the to device
and to cpu synchronization.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Now that swiotlb remembers the allocation size there is no need to pass
it back to swiotlb_tbl_unmap_single.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYEsmRgAKCRCAXGG7T9hj
vsQ9AP9oN1PKbTGn9U6FR/yJtMuD2XuX8a86PnMI8iM/bnox5QEA/kLIOBknM/nF
bPDfBcb72BERKX+83qtd5153zcbhww4=
=a/rf
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.12b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
"Two fix series and a single cleanup:
- a small cleanup patch to remove unneeded symbol exports
- a series to cleanup Xen grant handling (avoiding allocations in
some cases, and using common defines for "invalid" values)
- a series to address a race issue in Xen event channel handling"
* tag 'for-linus-5.12b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
Xen/gntdev: don't needlessly use kvcalloc()
Xen/gnttab: introduce common INVALID_GRANT_{HANDLE,REF}
Xen/gntdev: don't needlessly allocate k{,un}map_ops[]
Xen: drop exports of {set,clear}_foreign_p2m_mapping()
xen/events: avoid handling the same event on two cpus at the same time
xen/events: don't unmask an event channel when an eoi is pending
xen/events: reset affinity of 2-level event when tearing it down
The time pvops functions are the only ones left which might be
used in 32-bit mode and which return a 64-bit value.
Switch them to use the static_call() mechanism instead of pvops, as
this allows quite some simplification of the pvops implementation.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210311142319.4723-5-jgross@suse.com
Requesting zeroed memory when all of it will be overwritten subsequently
by all ones is a waste of processing bandwidth. In fact, rather than
recording zeroed ->grants[], fill that array too with more appropriate
"invalid" indicators.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/9a726be2-4893-8ffe-0ef1-b70dd1c229b1@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
It's not helpful if every driver has to cook its own. Generalize
xenbus'es INVALID_GRANT_HANDLE and pcifront's INVALID_GRANT_REF (which
shouldn't have expanded to zero to begin with). Use the constants in
p2m.c and gntdev.c right away, and update field types where necessary so
they would match with the constants' types (albeit without touching
struct ioctl_gntdev_grant_ref's ref field, as that's part of the public
interface of the kernel and would require introducing a dependency on
Xen's grant_table.h public header).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/db7c38a5-0d75-d5d1-19de-e5fe9f0b9c48@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
They're needed only in the not-auto-translate (i.e. PV) case; there's no
point in allocating memory that's never going to get accessed.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/180d50cb-5531-8952-4bf0-d65c554638ed@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
When changing the cpu affinity of an event it can happen today that
(with some unlucky timing) the same event will be handled on the old
and the new cpu at the same time.
Avoid that by adding an "event active" flag to the per-event data and
call the handler only if this flag isn't set.
Cc: stable@vger.kernel.org
Reported-by: Julien Grall <julien@xen.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Link: https://lore.kernel.org/r/20210306161833.4552-4-jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
An event channel should be kept masked when an eoi is pending for it.
When being migrated to another cpu it might be unmasked, though.
In order to avoid this keep three different flags for each event channel
to be able to distinguish "normal" masking/unmasking from eoi related
masking/unmasking and temporary masking. The event channel should only
be able to generate an interrupt if all flags are cleared.
Cc: stable@vger.kernel.org
Fixes: 54c9de8989 ("xen/events: add a new "late EOI" evtchn framework")
Reported-by: Julien Grall <julien@xen.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Tested-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Link: https://lore.kernel.org/r/20210306161833.4552-3-jgross@suse.com
[boris -- corrected Fixed tag format]
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
When creating a new event channel with 2-level events the affinity
needs to be reset initially in order to avoid using an old affinity
from earlier usage of the event channel port. So when tearing an event
channel down reset all affinity bits.
The same applies to the affinity when onlining a vcpu: all old
affinity settings for this vcpu must be reset. As percpu events get
initialized before the percpu event channel hook is called,
resetting of the affinities happens after offlining a vcpu (this is
working, as initial percpu memory is zeroed out).
Cc: stable@vger.kernel.org
Reported-by: Julien Grall <julien@xen.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Link: https://lore.kernel.org/r/20210306161833.4552-2-jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
tcm_loop could be used like a normal block device, so we can't use
GFP_KERNEL and should use GFP_NOIO. This adds a gfp_t arg to
target_cmd_init_cdb() and converts the users. For every driver but loop
GFP_KERNEL is kept.
This will also be useful in subsequent patches where loop needs to do
target_submit_prep() from interrupt context to get a ref to the se_device,
and so it will need to use GFP_ATOMIC.
Link: https://lore.kernel.org/r/20210227170006.5077-16-michael.christie@oracle.com
Tested-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
target_submit_cmd_map_sgls() is being removed, so convert xen to the new
submission API. This has it use target_init_cmd(), target_submit_prep(), or
target_submit() because we need to have LIO core map sgls which is now done
in target_submit_prep(). target_init_cmd() will never fail for xen because
it does its own sync during session shutdown, so we can remove that code.
Note: xen never calls target_stop_session() so target_submit_cmd_map_sgls()
never failed (in the new API target_init_cmd() handles
target_stop_session() being called when cmds are being submitted). If it
were to have used target_stop_session() and got an error, we would have hit
a refcount bug like xen and usb, because it does:
if (rc < 0) {
transport_send_check_condition_and_sense(se_cmd,
TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE, 0);
transport_generic_free_cmd(se_cmd, 0);
}
transport_send_check_condition_and_sense() calls queue_status which calls
scsiback_cmd_done->target_put_sess_cmd. We do an extra
transport_generic_free_cmd() call above which would have dropped the
refcount to -1 and the refcount code would spit out errors.
Link: https://lore.kernel.org/r/20210227170006.5077-13-michael.christie@oracle.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYDjnxwAKCRCAXGG7T9hj
vlq2AP9AljV5qAlSBkwtfi9awxs8GXpIpyu/NgC/TKkq4NZQcAD/Xpa6qrB7mwPD
K810NcXsaPmvJiUtkKDIQ5vGHqyaKAE=
=E+GX
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.12b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull more xen updates from Juergen Gross:
- A small series for Xen event channels adding some sysfs nodes for per
pv-device settings and statistics, and two fixes of theoretical
problems.
- two minor fixes (one for an unlikely error path, one for a comment).
* tag 'for-linus-5.12b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen-front-pgdir-shbuf: don't record wrong grant handle upon error
xen: Replace lkml.org links with lore
xen/evtchn: use READ/WRITE_ONCE() for accessing ring indices
xen/evtchn: use smp barriers for user event ring
xen/events: add per-xenbus device event statistics and settings
Let's make "MEMHP_MERGE_RESOURCE" consistent with "MHP_NONE", "mhp_t" and
"mhp_flags". As discussed recently [1], "mhp" is our internal acronym for
memory hotplug now.
[1] https://lore.kernel.org/linux-mm/c37de2d0-28a1-4f7d-f944-cfd7d81c334d@redhat.com/
Link: https://lkml.kernel.org/r/20210126115829.10909-1-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In order for subsequent unmapping to not mistakenly unmap handle 0,
record a perceived always-invalid one instead.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Link: https://lore.kernel.org/r/82414b0f-1b63-5509-7c1d-5bcc8239a3de@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
As started by commit 05a5f51ca5 ("Documentation: Replace lkml.org
links with lore"), replace lkml.org links with lore to better use a
single source that's more likely to stay available long-term.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20210210234618.2734785-1-keescook@chromium.org
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
For avoiding read- and write-tearing by the compiler use READ_ONCE()
and WRITE_ONCE() for accessing the ring indices in evtchn.c.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20210219154030.10892-9-jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
The ring buffer for user events is local to the given kernel instance,
so smp barriers are fine for ensuring consistency.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Link: https://lore.kernel.org/r/20210219154030.10892-8-jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Add syfs nodes for each xenbus device showing event statistics (number
of events and spurious events, number of associated event channels)
and for setting a spurious event threshold in case a frontend is
sending too many events without being rogue on purpose.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20210219154030.10892-7-jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYCu8dgAKCRCAXGG7T9hj
vuxTAP0S1iJ6DR5Y2pdSy2dfxn/gItNqUlR7vbFdxgf/mBSNxAD/fxbtVWM1GuTs
3Fwz0T60BcxsHZXhDcPAA2cjoqORbQs=
=2b0M
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.12-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
"A series of Xen related security fixes, all related to limited error
handling in Xen backend drivers"
* tag 'for-linus-5.12-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen-blkback: fix error handling in xen_blkbk_map()
xen-scsiback: don't "handle" error by BUG()
xen-netback: don't "handle" error by BUG()
xen-blkback: don't "handle" error by BUG()
xen/arm: don't ignore return errors from set_phys_to_machine
Xen/gntdev: correct error checking in gntdev_map_grant_pages()
Xen/gntdev: correct dev_bus_addr handling in gntdev_map_grant_pages()
Xen/x86: also check kernel mapping in set_foreign_p2m_mapping()
Xen/x86: don't bail early from clear_foreign_p2m_mapping()
Pull networking updates from David Miller:
"Here is what we have this merge window:
1) Support SW steering for mlx5 Connect-X6Dx, from Yevgeny Kliteynik.
2) Add RSS multi group support to octeontx2-pf driver, from Geetha
Sowjanya.
3) Add support for KS8851 PHY. From Marek Vasut.
4) Add support for GarfieldPeak bluetooth controller from Kiran K.
5) Add support for half-duplex tcan4x5x can controllers.
6) Add batch skb rx processing to bcrm63xx_enet, from Sieng Piaw
Liew.
7) Rework RX port offload infrastructure, particularly wrt, UDP
tunneling, from Jakub Kicinski.
8) Add BCM72116 PHY support, from Florian Fainelli.
9) Remove Dsa specific notifiers, they are unnecessary. From Vladimir
Oltean.
10) Add support for picosecond rx delay in dwmac-meson8b chips. From
Martin Blumenstingl.
11) Support TSO on xfrm interfaces from Eyal Birger.
12) Add support for MP_PRIO to mptcp stack, from Geliang Tang.
13) Support BCM4908 integrated switch, from Rafał Miłecki.
14) Support for directly accessing kernel module variables via module
BTF info, from Andrii Naryiko.
15) Add DASH (esktop and mobile Architecture for System Hardware)
support to r8169 driver, from Heiner Kallweit.
16) Add rx vlan filtering to dpaa2-eth, from Ionut-robert Aron.
17) Add support for 100 base0x SFP devices, from Bjarni Jonasson.
18) Support link aggregation in DSA, from Tobias Waldekranz.
19) Support for bitwidse atomics in bpf, from Brendan Jackman.
20) SmartEEE support in at803x driver, from Russell King.
21) Add support for flow based tunneling to GTP, from Pravin B Shelar.
22) Allow arbitrary number of interconnrcts in ipa, from Alex Elder.
23) TLS RX offload for bonding, from Tariq Toukan.
24) RX decap offklload support in mac80211, from Felix Fietkou.
25) devlink health saupport in octeontx2-af, from George Cherian.
26) Add TTL attr to SCM_TIMESTAMP_OPT_STATS, from Yousuk Seung
27) Delegated actionss support in mptcp, from Paolo Abeni.
28) Support receive timestamping when doin zerocopy tcp receive. From
Arjun Ray.
29) HTB offload support for mlx5, from Maxim Mikityanskiy.
30) UDP GRO forwarding, from Maxim Mikityanskiy.
31) TAPRIO offloading in dsa hellcreek driver, from Kurt Kanzenbach.
32) Weighted random twos choice algorithm for ipvs, from Darby Payne.
33) Fix netdev registration deadlock, from Johannes Berg.
34) Various conversions to new tasklet api, from EmilRenner Berthing.
35) Bulk skb allocations in veth, from Lorenzo Bianconi.
36) New ethtool interface for lane setting, from Danielle Ratson.
37) Offload failiure notifications for routes, from Amit Cohen.
38) BCM4908 support, from Rafał Miłecki.
39) Support several new iwlwifi chips, from Ihab Zhaika.
40) Flow drector support for ipv6 in i40e, from Przemyslaw Patynowski.
41) Support for mhi prrotocols, from Loic Poulain.
42) Optimize bpf program stats.
43) Implement RFC6056, for better port randomization, from Eric
Dumazet.
44) hsr tag offloading support from George McCollister.
45) Netpoll support in qede, from Bhaskar Upadhaya.
46) 2005/400g speed support in bonding 3ad mode, from Nikolay
Aleksandrov.
47) Netlink event support in mptcp, from Florian Westphal.
48) Better skbuff caching, from Alexander Lobakin.
49) MRP (Media Redundancy Protocol) offloading in DSA and a few
drivers, from Horatiu Vultur.
50) mqprio saupport in mvneta, from Maxime Chevallier.
51) Remove of_phy_attach, no longer needed, from Florian Fainelli"
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1766 commits)
octeontx2-pf: Fix otx2_get_fecparam()
cteontx2-pf: cn10k: Prevent harmless double shift bugs
net: stmmac: Add PCI bus info to ethtool driver query output
ptp: ptp_clockmatrix: clean-up - parenthesis around a == b are unnecessary
ptp: ptp_clockmatrix: Simplify code - remove unnecessary `err` variable.
ptp: ptp_clockmatrix: Coding style - tighten vertical spacing.
ptp: ptp_clockmatrix: Clean-up dev_*() messages.
ptp: ptp_clockmatrix: Remove unused header declarations.
ptp: ptp_clockmatrix: Add alignment of 1 PPS to idtcm_perout_enable.
ptp: ptp_clockmatrix: Add wait_for_sys_apll_dpll_lock.
net: stmmac: dwmac-sun8i: Add a shutdown callback
net: stmmac: dwmac-sun8i: Minor probe function cleanup
net: stmmac: dwmac-sun8i: Use reset_control_reset
net: stmmac: dwmac-sun8i: Remove unnecessary PHY power check
net: stmmac: dwmac-sun8i: Return void from PHY unpower
r8169: use macro pm_ptr
net: mdio: Remove of_phy_attach()
net: mscc: ocelot: select PACKING in the Kconfig
net: re-solve some conflicts after net -> net-next merge
net: dsa: tag_rtl4_a: Support also egress tags
...
In particular -ENOMEM may come back here, from set_foreign_p2m_mapping().
Don't make problems worse, the more that handling elsewhere (together
with map's status fields now indicating whether a mapping wasn't even
attempted, and hence has to be considered failed) doesn't require this
odd way of dealing with errors.
This is part of XSA-362.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: stable@vger.kernel.org
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Failure of the kernel part of the mapping operation should also be
indicated as an error to the caller, or else it may assume the
respective kernel VA is okay to access.
Furthermore gnttab_map_refs() failing still requires recording
successfully mapped handles, so they can be unmapped subsequently. This
in turn requires there to be a way to tell full hypercall failure from
partial success - preset map_op status fields such that they won't
"happen" to look as if the operation succeeded.
Also again use GNTST_okay instead of implying its value (zero).
This is part of XSA-361.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: stable@vger.kernel.org
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
We may not skip setting the field in the unmap structure when
GNTMAP_device_map is in use - such an unmap would fail to release the
respective resources (a page ref in the hypervisor). Otoh the field
doesn't need setting at all when GNTMAP_device_map is not in use.
To record the value for unmapping, we also better don't use our local
p2m: In particular after a subsequent change it may not have got updated
for all the batch elements. Instead it can simply be taken from the
respective map's results.
We can additionally avoid playing this game altogether for the kernel
part of the mappings in (x86) PV mode.
This is part of XSA-361.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: stable@vger.kernel.org
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYCUrYAAKCRCAXGG7T9hj
voF2AP9x3wKo+kYAqlRQrrDBb/AoMezoJ3pFiqJ+1Lg4yK4N6AD+IQxHuIODBfgj
bK12hr5oE6S3e7tTaVWweWzys5UmWA4=
=zleO
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.11-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fix from Juergen Gross:
"A single fix for an issue introduced this development cycle: when
running as a Xen guest on Arm systems the kernel will hang during
boot"
* tag 'for-linus-5.11-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
arm/xen: Don't probe xenbus as part of an early initcall
In order to support the possibility of per-device event channel
settings (e.g. lateeoi spurious event thresholds) add a xenbus device
pointer to struct irq_info() and modify the related event channel
binding interfaces to take the pointer to the xenbus device as a
parameter instead of the domain id of the other side.
While at it remove the stale prototype of bind_evtchn_to_irq_lateeoi().
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
After Commit 3499ba8198 ("xen: Fix event channel callback via
INTX/GSI"), xenbus_probe() will be called too early on Arm. This will
recent to a guest hang during boot.
If the hang wasn't there, we would have ended up to call
xenbus_probe() twice (the second time is in xenbus_probe_initcall()).
We don't need to initialize xenbus_probe() early for Arm guest.
Therefore, the call in xen_guest_init() is now removed.
After this change, there is no more external caller for xenbus_probe().
So the function is turned to a static one. Interestingly there were two
prototypes for it.
Cc: stable@vger.kernel.org
Fixes: 3499ba8198 ("xen: Fix event channel callback via INTX/GSI")
Reported-by: Ian Jackson <iwj@xenproject.org>
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/20210210170654.5377-1-julien@xen.org
Signed-off-by: Juergen Gross <jgross@suse.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYBLX8QAKCRCAXGG7T9hj
vr2TAP4ylwxPVbf1l1V8zYCtCoNg087+Ubolr5kXXJkesG/nkgD6A2ix2oN1sC0Z
kbFBeZHqgP4AbVl7IhBALVFa1GPxWQg=
=NPGM
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.11-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- A fix for a regression introduced in 5.11 resulting in Xen dom0
having problems to correctly initialize Xenstore.
- A fix for avoiding WARN splats when booting as Xen dom0 with
CONFIG_AMD_MEM_ENCRYPT enabled due to a missing trap handler for the
#VC exception (even if the handler should never be called).
- A fix for the Xen bklfront driver adapting to the correct but
unexpected behavior of new qemu.
* tag 'for-linus-5.11-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: avoid warning in Xen pv guest with CONFIG_AMD_MEM_ENCRYPT enabled
xen: Fix XenStore initialisation for XS_LOCAL
xen-blkfront: allow discard-* nodes to be optional
In commit 3499ba8198 ("xen: Fix event channel callback via INTX/GSI")
I reworked the triggering of xenbus_probe().
I tried to simplify things by taking out the workqueue based startup
triggered from wake_waiting(); the somewhat poorly named xenbus IRQ
handler.
I missed the fact that in the XS_LOCAL case (Dom0 starting its own
xenstored or xenstore-stubdom, which happens after the kernel is booted
completely), that IRQ-based trigger is still actually needed.
So... put it back, except more cleanly. By just spawning a xenbus_probe
thread which waits on xb_waitq and runs the probe the first time it
gets woken, just as the workqueue-based hack did.
This is actually a nicer approach for *all* the back ends with different
interrupt methods, and we can switch them all over to that without the
complex conditions for when to trigger it. But not in -rc6. This is
the minimal fix for the regression, although it's a step in the right
direction instead of doing a partial revert and actually putting the
workqueue back. It's also simpler than the workqueue.
Fixes: 3499ba8198 ("xen: Fix event channel callback via INTX/GSI")
Reported-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/4c9af052a6e0f6485d1de43f2c38b1461996db99.camel@infradead.org
Signed-off-by: Juergen Gross <jgross@suse.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYAGllQAKCRCAXGG7T9hj
vqEtAP9uws/W/JPcnsohK76hMcFAVxZCVdX7C3HvfW5tp6hqMgEAg9ic8sYiuHhn
6FouRu/ZXHJEg3PpS5W66yKNIYPvGgw=
=rR+L
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.11-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- A series to fix a regression when running as a fully virtualized
guest on an old Xen hypervisor not supporting PV interrupt callbacks
for HVM guests.
- A patch to add support to query Xen resource sizes (setting was
possible already) from user mode.
* tag 'for-linus-5.11-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: Fix xen_hvm_smp_init() when vector callback not available
x86/xen: Don't register Xen IPIs when they aren't going to be used
x86/xen: Add xen_no_vector_callback option to test PCI INTX delivery
xen: Set platform PCI device INTX affinity to CPU0
xen: Fix event channel callback via INTX/GSI
xen/privcmd: allow fetching resource sizes
With INTX or GSI delivery, Xen uses the event channel structures of CPU0.
If the interrupt gets handled by Linux on a different CPU, then no events
are seen as pending. Rather than introducing locking to allow other CPUs
to process CPU0's events, just ensure that the PCI interrupts happens
only on CPU0.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20210106153958.584169-3-dwmw2@infradead.org
Signed-off-by: Juergen Gross <jgross@suse.com>
For a while, event channel notification via the PCI platform device
has been broken, because we attempt to communicate with xenstore before
we even have notifications working, with the xs_reset_watches() call
in xs_init().
We tend to get away with this on Xen versions below 4.0 because we avoid
calling xs_reset_watches() anyway, because xenstore might not cope with
reading a non-existent key. And newer Xen *does* have the vector
callback support, so we rarely fall back to INTX/GSI delivery.
To fix it, clean up a bit of the mess of xs_init() and xenbus_probe()
startup. Call xs_init() directly from xenbus_init() only in the !XS_HVM
case, deferring it to be called from xenbus_probe() in the XS_HVM case
instead.
Then fix up the invocation of xenbus_probe() to happen either from its
device_initcall if the callback is available early enough, or when the
callback is finally set up. This means that the hack of calling
xenbus_probe() from a workqueue after the first interrupt, or directly
from the PCI platform device setup, is no longer needed.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20210113132606.422794-2-dwmw2@infradead.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Allow issuing an IOCTL_PRIVCMD_MMAP_RESOURCE ioctl with num = 0 and
addr = 0 in order to fetch the size of a specific resource.
Add a shortcut to the default map resource path, since fetching the
size requires no address to be passed in, and thus no VMA to setup.
This is missing from the initial implementation, and causes issues
when mapping resources that don't have fixed or known sizes.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: stable@vger.kernel.org # >= 4.18
Link: https://lore.kernel.org/r/20210112115358.23346-1-roger.pau@citrix.com
Signed-off-by: Juergen Gross <jgross@suse.com>
accesses, inefficient and disfunctional code. The goal is to remove the
export of irq_to_desc() to prevent these things from creeping up again.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/ifgsTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoYm6EACAo8sObkuY3oWLagtGj1KHxon53oGZ
VfDw2LYKM+rgJjDWdiyocyxQU5gtm6loWCrIHjH2adRQ4EisB5r8hfI8NZHxNMyq
8khUi822NRBfFN6SCpO8eW9o95euscNQwCzqi7gV9/U/BAKoDoSEYzS4y0YmJlup
mhoikkrFiBuFXplWI0gbP4ihb8S/to2+kTL6o7eBoJY9+fSXIFR3erZ6f3fLjYZG
CQUUysTywdDhLeDkC9vaesXwgdl2XnaPRwcQqmK8Ez0QYNYpawyILUHLD75cIHDu
bHdK2ZoDv/wtad/3BoGTK3+wChz20a/4/IAnBIUVgmnSLsPtW8zNEOPWNNc0aGg+
rtafi5bvJ1lMoSZhkjLWQDOGU6vFaXl9NkC2fpF+dg1skFMT2CyLC8LD/ekmocon
zHAPBva9j3m2A80hI3dUH9azo/IOl1GHG8ccM6SCxY3S/9vWSQChNhQDLe25xBEO
VtKZS7DYFCRiL8mIy9GgwZWof8Vy2iMua2ML+W9a3mC9u3CqSLbCFmLMT/dDoXl1
oHnMdAHk1DRatA8pJAz83C75RxbAS2riGEqtqLEQ6OaNXn6h0oXCanJX9jdKYDBh
z6ijWayPSRMVktN6FDINsVNFe95N4GwYcGPfagIMqyMMhmJDic6apEzEo7iA76lk
cko28MDqTIK4UQ==
=BXv+
-----END PGP SIGNATURE-----
Merge tag 'irq-core-2020-12-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"This is the second attempt after the first one failed miserably and
got zapped to unblock the rest of the interrupt related patches.
A treewide cleanup of interrupt descriptor (ab)use with all sorts of
racy accesses, inefficient and disfunctional code. The goal is to
remove the export of irq_to_desc() to prevent these things from
creeping up again"
* tag 'irq-core-2020-12-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits)
genirq: Restrict export of irq_to_desc()
xen/events: Implement irq distribution
xen/events: Reduce irq_info:: Spurious_cnt storage size
xen/events: Only force affinity mask for percpu interrupts
xen/events: Use immediate affinity setting
xen/events: Remove disfunct affinity spreading
xen/events: Remove unused bind_evtchn_to_irq_lateeoi()
net/mlx5: Use effective interrupt affinity
net/mlx5: Replace irq_to_desc() abuse
net/mlx4: Use effective interrupt affinity
net/mlx4: Replace irq_to_desc() abuse
PCI: mobiveil: Use irq_data_get_irq_chip_data()
PCI: xilinx-nwl: Use irq_data_get_irq_chip_data()
NTB/msi: Use irq_has_action()
mfd: ab8500-debugfs: Remove the racy fiddling with irq_desc
pinctrl: nomadik: Use irq_has_action()
drm/i915/pmu: Replace open coded kstat_irqs() copy
drm/i915/lpe_audio: Remove pointless irq_to_desc() usage
s390/irq: Use irq_desc_kstat_cpu() in show_msi_interrupt()
parisc/irq: Use irq_desc_kstat_cpu() in show_interrupts()
...
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCX92dmQAKCRCAXGG7T9hj
vqfiAQCWM8eVftAiTxuQGzdYHYBl5p57UtOc3n0Ap8cHbl9SOwEAqlLmQEopnbHT
P4WzH/PegsCU5BopSb+NgXkBesMtPgk=
=Yjsy
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.11-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull more xen updates from Juergen Gross:
"Some minor cleanup patches and a small series disentangling some Xen
related Kconfig options"
* tag 'for-linus-5.11-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: Kconfig: remove X86_64 depends from XEN_512GB
xen/manage: Fix fall-through warnings for Clang
xen-blkfront: Fix fall-through warnings for Clang
xen: remove trailing semicolon in macro definition
xen: Kconfig: nest Xen guest options
xen: Remove Xen PVH/PVHVM dependency on PCI
x86/xen: Convert to DEFINE_SHOW_ATTRIBUTE
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCX9iqjQAKCRCAXGG7T9hj
vvNXAQCNirQi0LTN7LGZvL2eDTg1SDcq7XII8YomcJl8UFBwvQD9HtRo5CwGzVKc
PbDqGdvZdl9+VyVyeEvZTv/baaollAo=
=MPh4
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.11-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
"Fixes for security issues just having been disclosed:
- a five patch series for fixing of XSA-349 (DoS via resource
depletion in Xen dom0)
- a patch fixing XSA-350 (access of stale pointer in a Xen dom0)"
* tag 'for-linus-5.11-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen-blkback: set ring->xenblkd to NULL after kthread_stop()
xenbus/xenbus_backend: Disallow pending watch messages
xen/xenbus: Count pending messages for each watch
xen/xenbus/xen_bus_type: Support will_handle watch callback
xen/xenbus: Add 'will_handle' callback support in xenbus_watch_path()
xen/xenbus: Allow watches discard events before queueing
A Xen PVH domain doesn't have a PCI bus or devices, so it doesn't need
PCI support built in. Currently, XEN_PVH depends on XEN_PVHVM which
depends on PCI.
Introduce XEN_PVHVM_GUEST as a toplevel item and change XEN_PVHVM to a
hidden variable. This allows XEN_PVH to depend on XEN_PVHVM without PCI
while XEN_PVHVM_GUEST depends on PCI.
In drivers/xen, compile platform-pci depending on XEN_PVHVM_GUEST since
that pulls in the PCI dependency for linking.
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20201014175342.152712-2-jandryuk@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Keep track of the assignments of event channels to CPUs and select the
online CPU with the least assigned channels in the affinity mask which is
handed to irq_chip::irq_set_affinity() from the core code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/20201210194045.457218278@linutronix.de
To prepare for interrupt spreading reduce the storage size of
irq_info::spurious_cnt to u8 so the required flag for the spreading logic
will not increase the storage size.
Protect the usage site against overruns.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/20201210194045.360198201@linutronix.de
All event channel setups bind the interrupt on CPU0 or the target CPU for
percpu interrupts and overwrite the affinity mask with the corresponding
cpumask. That does not make sense.
The XEN implementation of irqchip::irq_set_affinity() already picks a
single target CPU out of the affinity mask and the actual target is stored
in the effective CPU mask, so destroying the user chosen affinity mask
which might contain more than one CPU is wrong.
Change the implementation so that the channel is bound to CPU0 at the XEN
level and leave the affinity mask alone. At startup of the interrupt
affinity will be assigned out of the affinity mask and the XEN binding will
be updated. Only keep the enforcement for real percpu interrupts.
On resume the overwrite is not required either because info->cpu and the
affinity mask are still the same as at the time of suspend. Same for
rebind_evtchn_irq().
This also prepares for proper interrupt spreading.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/20201210194045.250321315@linutronix.de
There is absolutely no reason to mimic the x86 deferred affinity
setting. This mechanism is required to handle the hardware induced issues
of IO/APIC and MSI and is not in use when the interrupts are remapped.
XEN does not need this and can simply change the affinity from the calling
context. The core code invokes this with the interrupt descriptor lock held
so it is fully serialized against any other operation.
Mark the interrupts with IRQ_MOVE_PCNTXT to disable the deferred affinity
setting. The conditional mask/unmask operation is already handled in
xen_rebind_evtchn_to_cpu().
This makes XEN on x86 use the same mechanics as on e.g. ARM64 where
deferred affinity setting is not required and not implemented and the code
path in the ack functions is compiled out.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/20201210194045.157601122@linutronix.de
This function can only ever work when the event channels:
- are already established
- interrupts assigned to them
- the affinity has been set by user space already
because any newly set up event channel is forced to be bound to CPU0 and
the affinity mask of the interrupt is forced to contain cpumask_of(0).
As the CPU0 enforcement was in place _before_ this was implemented it's
entirely unclear how that can ever have worked at all.
Remove it as preparation for doing it proper.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/20201210194045.065115500@linutronix.de