IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Using MMUEXT_TLB_FLUSH_MULTI doesn't buy us much since the hypervisor
will likely perform same IPIs as would have the guest.
More importantly, using MMUEXT_INVLPG_MULTI may not to invalidate the
guest's address on remote CPU (when, for example, VCPU from another guest
is running there).
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Resuming PMU currently triggers a warning from ___might_sleep() (assuming
CONFIG_DEBUG_ATOMIC_SLEEP is set) when xen_pmu_init() allocates GFP_KERNEL
page because we are in state resembling atomic context.
Move resuming PMU to xen_arch_resume() which is called in regular context.
For symmetry move suspending PMU to xen_arch_suspend() as well.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: <stable@vger.kernel.org> # 4.3
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
After commit 8c058b0b9c ("x86/irq: Probe for PIC presence before
allocating descs for legacy IRQs") early_irq_init() will no longer
preallocate descriptors for legacy interrupts if PIC does not
exist, which is the case for Xen PV guests.
Therefore we may need to allocate those descriptors ourselves.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
On some NUMA system, after dom0 up, we see below warning even if there are
enough pfn ranges that could be used for remapping:
"Unable to find available pfn range, not remapping identity pages"
Fix it to avoid getting a memory region of zero size in xen_find_pfn_range.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Mapping a large range of foreign GFNs can take a long time, add a
reschedule point after each batch of 16 GFNs.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Build cpu_hotplug for ARM and ARM64 guests.
Rename arch_(un)register_cpu to xen_(un)register_cpu and provide an
empty implementation on ARM and ARM64. On x86 just call
arch_(un)register_cpu as we are already doing.
Initialize cpu_hotplug on ARM.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Swiotlb is used on ARM64 to support DMA on platform where devices are
not protected by an SMMU. Furthermore it's only enabled for DOM0.
While Xen is always using 4KB page granularity in the stage-2 page table,
Linux ARM64 may either use 4KB or 64KB. This means that a Linux page
can be spanned accross multiple Xen page.
The Swiotlb code has to validate that the buffer used for DMA is
physically contiguous in the memory. As a Linux page can't be shared
between local memory and foreign page by design (the balloon code always
removing entirely a Linux page), the changes in the code are very
minimal because we only need to check the first Xen PFN.
Note that it may be possible to optimize the function
check_page_physically_contiguous to avoid looping over every Xen PFN
for local memory. Although I will let this optimization for a follow-up.
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
With 64KB page granularity support, the frame number will be different.
It will be easier to modify the behavior in a single place rather than
in each caller.
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
The hypercall interface is always using 4KB page granularity. This is
requiring to use xen page definition macro when we deal with hypercall.
Note that pfn_to_gfn is working with a Xen pfn (i.e 4KB). We may want to
rename pfn_gfn to make this explicit.
We also allocate a 64KB page for the shared page even though only the
first 4KB is used. I don't think this is really important for now as it
helps to have the pointer 4KB aligned (XENMEM_add_to_physmap is taking a
Xen PFN).
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
The Xen interface is using 4KB page granularity. This means that each
grant is 4KB.
The current implementation allocates a Linux page per grant. On Linux
using 64KB page granularity, only the first 4KB of the page will be
used.
We could decrease the memory wasted by sharing the page with multiple
grant. It will require some care with the {Set,Clear}ForeignPage macro.
Note that no changes has been made in the x86 code because both Linux
and Xen will only use 4KB page granularity.
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Currently, a grant is always based on the Xen page granularity (i.e
4KB). When Linux is using a different page granularity, a single page
will be split between multiple grants.
The new helpers will be in charge of splitting the Linux page into grants
and call a function given by the caller on each grant.
Also provide an helper to count the number of grants within a given
contiguous region.
Note that the x86/include/asm/xen/page.h is now including
xen/interface/grant_table.h rather than xen/grant_table.h. It's
necessary because xen/grant_table.h depends on asm/xen/page.h and will
break the compilation. Furthermore, only definition in
interface/grant_table.h is required.
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
They are not used in common code expect in one place in balloon.c which is
only compiled when Linux is using PV MMU. It's not the case on ARM.
Rather than worrying how to handle the 64KB case, drop them.
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Rename alloc_p2m() to xen_alloc_p2m_entry() and export it.
This is useful for ensuring that a p2m entry is allocated (i.e., not a
shared missing or identity entry) so that subsequent set_phys_to_machine()
calls will require no further allocations.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
---
v3:
- Make xen_alloc_p2m_entry() a nop on auto-xlate guests.
All users of alloc_xenballoon_pages() wanted low memory pages, so
remove the option for high memory.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
During setup, discard RAM regions that are above the maximum
reservation (instead of marking them as E820_UNUSABLE). This allows
hotplug memory to be placed at these addresses.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Pull powerpc fixes from Michael Ellerman:
- Re-enable CONFIG_SCSI_DH in our defconfigs
- Remove unused os_area_db_id_video_mode
- cxl: fix leak of IRQ names in cxl_free_afu_irqs() from Andrew
- cxl: fix leak of ctx->irq_bitmap when releasing context via kernel API from Andrew
- cxl: fix leak of ctx->mapping when releasing kernel API contexts from Andrew
- cxl: Workaround malformed pcie packets on some cards from Philippe
- cxl: Fix number of allocated pages in SPA from Christophe Lombard
- Fix checkstop in native_hpte_clear() with lockdep from Cyril
- Panic on unhandled Machine Check on powernv from Daniel
- selftests/powerpc: Fix build failure of load_unaligned_zeropad test
* tag 'powerpc-4.3-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
selftests/powerpc: Fix build failure of load_unaligned_zeropad test
powerpc/powernv: Panic on unhandled Machine Check
powerpc: Fix checkstop in native_hpte_clear() with lockdep
cxl: Fix number of allocated pages in SPA
cxl: Workaround malformed pcie packets on some cards
cxl: fix leak of ctx->mapping when releasing kernel API contexts
cxl: fix leak of ctx->irq_bitmap when releasing context via kernel API
cxl: fix leak of IRQ names in cxl_free_afu_irqs()
powerpc/ps3: Remove unused os_area_db_id_video_mode
powerpc/configs: Re-enable CONFIG_SCSI_DH
Merge misc fixes from Andrew Morton:
"6 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
sh: add copy_user_page() alias for __copy_user()
lib/Kconfig: ZLIB_DEFLATE must select BITREVERSE
mm, dax: fix DAX deadlocks
memcg: convert threshold to bytes
builddeb: remove debian/files before build
mm, fs: obey gfp_mapping for add_to_page_cache()
copy_user_page() is needed by DAX. Without this we get a compile error
for DAX on SH:
fs/dax.c:280:2: error: implicit declaration of function `copy_user_page' [-Werror=implicit-function-declaration]
copy_user_page(vto, (void __force *)vfrom, vaddr, to);
^
This was done with a random config that happened to include DAX support.
This patch has only been compile tested.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull arm64 fixes from Will Deacon:
"Here are a few more arm64 fixes for 4.3. Again, nothing too
significant, but worth having nonetheless. The MINSIGSTKSZ update is
a bit grotty, but the value we currently have is wrong (too small), so
anybody using that will have issues already. It has Arnd's ack for
the asm-generic change.
Summary:
- Fix module CFLAGS setting in workaround for erratum #843419
- Update MINSIGSTKSZ and SIGSTKSZ to match glibc
- Wire up some new compat syscalls"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: compat: wire up new syscalls
arm64: Fix MINSIGSTKSZ and SIGSTKSZ
arm64: errata: use KBUILD_CFLAGS_MODULE for erratum #843419
Pull KVM fixes from Paolo Bonzini:
"Bug fixes for system management mode emulation.
The first two patches fix SMM emulation on Nehalem processors. The
others fix some cases that became apparent as work progressed on the
firmware side"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: fix RSM into 64-bit protected mode
KVM: x86: fix previous commit for 32-bit
KVM: x86: fix SMI to halted VCPU
KVM: x86: clean up kvm_arch_vcpu_runnable
KVM: x86: map/unmap private slots in __x86_set_memory_region
KVM: x86: build kvm_userspace_memory_region in x86_set_memory_region
In order to get into 64-bit protected mode, you need to enable
paging while EFER.LMA=1. For this to work, CS.L must be 0.
Currently, we load the segments before CR0 and CR4, which means
that if RSM returns into 64-bit protected mode CS.L is already 1
and everything breaks.
Luckily, CS.L=0 is always the case when executing RSM, because it
is forbidden to execute RSM from 64-bit protected mode. Hence it
is enough to load CR0 and CR4 first, and only then the segments.
Fixes: 660a5d517a
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 208473c1f3 ("ARM: wire up new syscalls") hooked up the new
userfaultfd and membarrier syscalls for ARM, so do the same for our
compat syscall table in arm64.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Pull crypto fixes from Herbert Xu:
"This fixes the following issues:
- Fix AVX detection to prevent use of non-existent AESNI.
- Some SPARC ciphers did not set their IV size which may lead to
memory corruption"
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: ahash - ensure statesize is non-zero
crypto: camellia_aesni_avx - Fix CPU feature checks
crypto: sparc - initialize blkcipher.ivsize
Otherwise, two copies (one of them never populated and thus bogus)
are allocated for the regular and SMM address spaces. This breaks
SMM with EPT but without unrestricted guest support, because the
SMM copy of the identity page map is all zeros.
By moving the allocation to the caller we also remove the last
vestiges of kernel-allocated memory regions (not accessible anymore
in userspace since commit b74a07beed, "KVM: Remove kernel-allocated
memory regions", 2010-06-21); that is a nice bonus.
Reported-by: Alexandre DERUMIER <aderumier@odiso.com>
Cc: stable@vger.kernel.org
Fixes: 9da0e4d5ac
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The next patch will make x86_set_memory_region fill the
userspace_addr. Since the struct is not used untouched
anymore, it makes sense to build it in x86_set_memory_region
directly; it also simplifies the callers.
Reported-by: Alexandre DERUMIER <aderumier@odiso.com>
Cc: stable@vger.kernel.org
Fixes: 9da0e4d5ac
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
MINSIGSTKSZ and SIGSTKSZ for ARM64 are not correctly set in latest kernel.
This patch fixes this issue.
This issue is reported in LTP (testcase: sigaltstack02.c).
Testcase failed when sigaltstack() called with stack size "MINSIGSTKSZ - 1"
Since in Glibc-2.22, MINSIGSTKSZ is set to 5120 but in kernel
it is set to 2048 so testcase gets failed.
Testcase Output:
sigaltstack02 1 TPASS : stgaltstack() fails, Invalid Flag value,errno:22
sigaltstack02 2 TFAIL : sigaltstack() returned 0, expected -1,errno:12
Reported Issue in Glibc Bugzilla:
Bugfix in Glibc-2.22: [Bug 16850]
https://sourceware.org/bugzilla/show_bug.cgi?id=16850
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Akhilesh Kumar <akhilesh.k@samsung.com>
Signed-off-by: Manjeet Pawar <manjeet.p@samsung.com>
Signed-off-by: Rohit Thapliyal <r.thapliyal@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Commit df057cc7b4 ("arm64: errata: add module build workaround for
erratum #843419") sets CFLAGS_MODULE to ensure that the large memory
model is used by the compiler when building kernel modules.
However, CFLAGS_MODULE is an environment variable and intended to be
overridden on the command line, which appears to be the case with the
Ubuntu kernel packaging system, so use KBUILD_CFLAGS_MODULE instead.
Cc: <stable@vger.kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Fixes: df057cc7b4 ("arm64: errata: add module build workaround for erratum #843419")
Reported-by: Dann Frazier <dann.frazier@canonical.com>
Tested-by: Dann Frazier <dann.frazier@canonical.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Pull MIPS fixes from Ralf Baechle:
- MIPS didn't define the new ioremap_uc. Defined it as an alias for
ioremap_uncached.
- Replace workaround for MIPS16 build issue with a correct one.
* git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Define ioremap_uc
MIPS: UAPI: Ignore __arch_swab{16,32,64} when using MIPS16
Revert "MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16."
Pull swiotlb fixlet from Konrad Rzeszutek Wilk:
"Enable the SWIOTLB under 32-bit PAE kernels.
Nowadays most distros enable this due to CONFIG_HYPERVISOR|XEN=y which
select SWIOTLB. But for those that are not interested in
virtualization and wanting to use 32-bit PAE kernels and wanting to
have working DMA operations - this configures it for them"
* 'stable/for-linus-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
swiotlb: Enable it under x86 PAE
Pull strscpy powerpc fix from Chris Metcalf.
Fix powerpc big-endian build.
* 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
arch/powerpc: provide zero_bytemask() for big-endian
Pull ARM SoC fixes from Arnd Bergmann:
"The fixes for this week include one small patch that was years in the
making and that finally fixes using all eight CPUs on exynos542x.
The rest are lots of minor changes for sunxi, imx, exynos and shmobile
- fixing the minimum voltage for Allwinner A20
- thermal boot issue on SMDK5250.
- invalid clock used for FIMD IOMMU.
- audio on Renesas r8a7790/r8a7791
- invalid clock used for FIMD IOMMU
- LEDs on exynos5422-odroidxu3-common
- usb pin control for imx-rex
- imx53: fix PMIC interrupt level
- a Makefile typo"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: dts: Fix wrong clock binding for sysmmu_fimd1_1 on exynos5420
ARM: dts: Fix bootup thermal issue on smdk5250
ARM: shmobile: r8a7791 dtsi: Add CPG/MSTP Clock Domain for sound
ARM: shmobile: r8a7790 dtsi: Add CPG/MSTP Clock Domain for sound
arm-cci500: Don't enable PMU driver by default
ARM: dts: fix usb pin control for imx-rex dts
ARM: imx53: qsrb: fix PMIC interrupt level
ARM: imx53: include IRQ dt-bindings header
ARM: dts: add suspend opp to exynos4412
ARM: dts: Fix LEDs on exynos5422-odroidxu3
ARM: EXYNOS: reset Little cores when cpu is up
ARM: dts: Fix Makefile target for sun4i-a10-itead-iteaduino-plus
ARM: dts: sunxi: Raise minimum CPU voltage for sun7i-a20 to meet SoC specifications
All unrecovered machine check errors on PowerNV should cause an
immediate panic. There are 2 reasons that this is the right policy:
it's not safe to continue, and we're already trying to reboot.
Firstly, if we go through the recovery process and do not successfully
recover, we can't be sure about the state of the machine, and it is
not safe to recover and proceed.
Linux knows about the following sources of Machine Check Errors:
- Uncorrectable Errors (UE)
- Effective - Real Address Translation (ERAT)
- Segment Lookaside Buffer (SLB)
- Translation Lookaside Buffer (TLB)
- Unknown/Unrecognised
In the SLB, TLB and ERAT cases, we can further categorise these as
parity errors, multihit errors or unknown/unrecognised.
We can handle SLB errors by flushing and reloading the SLB. We can
handle TLB and ERAT multihit errors by flushing the TLB. (It appears
we may not handle TLB and ERAT parity errors: I will investigate
further and send a followup patch if appropriate.)
This leaves us with uncorrectable errors. Uncorrectable errors are
usually the result of ECC memory detecting an error that it cannot
correct, but they also crop up in the context of PCI cards failing
during DMA writes, and during CAPI error events.
There are several types of UE, and there are 3 places a UE can occur:
Skiboot, the kernel, and userspace. For Skiboot errors, we have the
facility to make some recoverable. For userspace, we can simply kill
(SIGBUS) the affected process. We have no meaningful way to deal with
UEs in kernel space or in unrecoverable sections of Skiboot.
Currently, these unrecovered UEs fall through to
machine_check_expection() in traps.c, which calls die(), which OOPSes
and sends SIGBUS to the process. This sometimes allows us to stumble
onwards. For example we've seen UEs kill the kernel eehd and
khugepaged. However, the process killed could have held a lock, or it
could have been a more important process, etc: we can no longer make
any assertions about the state of the machine. Similarly if we see a
UE in skiboot (and again we've seen this happen), we're not in a
position where we can make any assertions about the state of the
machine.
Likewise, for unknown or unrecognised errors, we're not able to say
anything about the state of the machine.
Therefore, if we have an unrecovered MCE, the most appropriate thing
to do is to panic.
The second reason is that since e784b6499d ("powerpc/powernv: Invoke
opal_cec_reboot2() on unrecoverable machine check errors."), we
attempt a special OPAL reboot on an unhandled MCE. This is so the
hardware can record error data for later debugging.
The comments in that commit assert that we are heading down the panic
path anyway. At the moment this is not always true. With UEs in kernel
space, for instance, they are marked as recoverable by the hardware,
so if the attempt to reboot failed (e.g. old Skiboot), we wouldn't
panic() but would simply die() and OOPS. It doesn't make sense to be
staggering on if we've just tried to reboot: we should panic().
Explicitly panic() on unrecovered MCEs on PowerNV.
Update the comments appropriately.
This fixes some hangs following EEH events on cxlflash setups.
Signed-off-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
native_hpte_clear() is called in real mode from two places:
- Early in boot during htab initialisation if firmware assisted dump is
active.
- Late in the kexec path.
In both contexts there is no need to disable interrupts are they are
already disabled. Furthermore, locking around the tlbie() is only required
for pre POWER5 hardware.
On POWER5 or newer hardware concurrent tlbie()s work as expected and on pre
POWER5 hardware concurrent tlbie()s could result in deadlock. This code
would only be executed at crashdump time, during which all bets are off,
concurrent tlbie()s are unlikely and taking locks is unsafe therefore the
best course of action is to simply do nothing. Concurrent tlbie()s are not
possible in the first case as secondary CPUs have not come up yet.
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
For some reason, only the little-endian flavor of
powerpc provided the zero_bytemask() implementation.
Reported-by: Michal Sojka <sojkam1@fel.cvut.cz>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
We need to explicitly check the AVX and AES CPU features, as we can't
infer them from the related XSAVE feature flags. For example, the
Core i3 2310M passes the XSAVE feature test but does not implement
AES-NI.
Reported-and-tested-by: Stéphane Glondu <glondu@debian.org>
References: https://bugs.debian.org/800934
Fixes: ce4f5f9b65 ("x86/fpu, crypto x86/camellia_aesni_avx: Simplify...")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: stable <stable@vger.kernel.org> # 4.2
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Most distributions end up enabling SWIOTLB already with 32-bit
kernels due to the combination of CONFIG_HYPERVISOR_GUEST|CONFIG_XEN=y
as those end up requiring the SWIOTLB.
However for those that are not interested in virtualization and
run in 32-bit they will discover that: "32-bit PAE 4.2.0 kernel
(no IOMMU code) would hang when writing to my USB disk. The kernel
spews million(-ish messages per sec) to syslog, effectively
"hanging" userspace with my kernel.
Oct 2 14:33:06 voodoochild kernel: [ 223.287447] nommu_map_sg:
overflow 25dcac000+1024 of device mask ffffffff
Oct 2 14:33:06 voodoochild kernel: [ 223.287448] nommu_map_sg:
overflow 25dcac000+1024 of device mask ffffffff
Oct 2 14:33:06 voodoochild kernel: [ 223.287449] nommu_map_sg:
overflow 25dcac000+1024 of device mask ffffffff
... etc ..."
Enabling it makes the problem go away.
N.B. With a6dfa128ce
"config: Enable NEED_DMA_MAP_STATE by default when SWIOTLB is selected"
we also have the important part of the SG macros enabled to make this
work properly - in case anybody wants to backport this patch.
Reported-and-Tested-by: Christian Melki <christian.melki@t2data.com>
Signed-off-by: Christian Melki <christian.melki@t2data.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Pull arm64 fixes from Will Deacon:
"This addresses a couple of issues found with RT, a broken initrd
message in the console log and a simple performance fix for some MMC
workloads.
Summary:
- A couple of locking fixes for RT kernels
- Avoid printing bogus initrd warnings when initrd isn't present
- Performance fix for random mmap file readahead
- Typo fix"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: replace read_lock to rcu lock in call_break_hook
arm64: Don't relocate non-existent initrd
arm64: convert patch_lock to raw lock
arm64: readahead: fault retry breaks mmap file read random detection
arm64: debug: Fix typo in debug-monitors.c
Pull strscpy fixes from Chris Metcalf :
"This patch series fixes up a couple of architecture issues where
strscpy wasn't configured correctly (missing on h8300, duplicating
local and asm-generic copies on powerpc and tile).
It also adds a use of zero_bytemask() to the final store for strscpy
to avoid writing uninitialized data to the destination. However, to
make this work we had to add support for zero_bytemask() to the two
architectures that didn't have it (alpha and tile), because they were
providing their own local copies, but didn't provide the
zero_bytemask() that was previously only required when building with
CONFIG_DCACHE_WORD_ACCESS"
[ Side note: there is still no actual users of strscpy except for the
one preexisting use in arch/tile that predates the generic version.
So this is all about fixing the infrastructure so that we eventually
can start using it. - Linus ]
* 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
strscpy: zero any trailing garbage bytes in the destination
word-at-a-time.h: support zero_bytemask() on alpha and tile
word-at-a-time.h: fix some Kbuild files
arch/tile added word-at-a-time.h after the patch that added generic-y
entries; the generic-y entry is now stale.
arch/h8300 is newer than the generic-y patch for word-at-a-time.h,
and needs a generic-y entry.
arch/powerpc seems to have gotten a generic-y entry by mistake in
the first patch; this change removes it.
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
When booting a kernel without an initrd, the kernel reports that it
moves -1 bytes worth, having gone through the motions with initrd_start
equal to initrd_end:
Moving initrd from [4080000000-407fffffff] to [9fff49000-9fff48fff]
Prevent this by bailing out early when the initrd size is zero (i.e. we
have no initrd), avoiding the confusing message and other associated
work.
Fixes: 1570f0d7ab ("arm64: support initrd outside kernel linear map")
Cc: Mark Salter <msalter@redhat.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Pull xen bug fixes from David Vrabel:
- Fix VM save performance regression with x86 PV guests
- Make kexec work in x86 PVHVM guests (if Xen has the soft-reset ABI)
- Other minor fixes.
* tag 'for-linus-4.3b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen/p2m: hint at the last populated P2M entry
x86/xen: Do not clip xen_e820_map to xen_e820_map_entries when sanitizing map
x86/xen: Support kexec/kdump in HVM guests by doing a soft reset
xen/x86: Don't try to write syscall-related MSRs for PV guests
xen: use correct type for HYPERVISOR_memory_op()
Pull s390 fixes from Martin Schwidefsky:
"Three bug fixes and an update to the default configuration"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/defconfig: set SCSI_DH=y
s390/vtime: correct scaled cputime of partially idle CPUs
s390/boot/decompression: disable floating point in decompressor
s390/numa: use correct type for node_to_cpumask_map