IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
If the kernel detects that the s390 hardware supports the vector
facility, it is enabled by default at an early stage. To force
it off, use the novx kernel parameter. Note that there is a small
time window, where the vector facility is enabled before it is
forced to be off.
With enabling the vector facility by default, the FPU save and
restore functions can be improved. They do not longer require
to manage expensive control register updates to enable or disable
the vector enablement control for particular processes.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The call to pgste_set_key in ptep_set_access_flags can be avoided
if the old pte is found to be valid at the time the new access
rights are set. The function that created the old, valid pte already
completed the required storage key operation.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The principles of operation states reads are in order, writes are in
order, writes can be reordered after reads, but no reads can be
reordered after writes.
The atomic and bitops variantes for z196 use the interlocked-access
facility instructions with a memory barrier before and after the
instruction. Because of the memory ordering the first barrier is
unnecessary and can be removed.
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
To be able to analyse problems in regard to hypervisor overhead
add a tracepoing for diagnose calls. It reports the number of
the diagnose issued, e.g.
sshd-1385 [002] .... 42.701431: diagnose: nr=0x9c
<idle>-0 [001] ..s. 43.587528: diagnose: nr=0x9c
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Introduce /sys/debug/kernel/diag_stat with a statistic how many diagnose
calls have been done by each CPU in the system.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The generic implementation for test_and_set_bit_lock in include/asm-generic
uses the standard test_and_set_bit operation. This is done with either a
'csg' or a 'loag' instruction. For both version the cache line is fetched
exclusively, even if the bit is already set. The result is an increase in
cache traffic, for a contented lock this is a bad idea.
Acked-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use bit 2**1 of the pte and bit 2**14 of the pmd for the soft dirty
bit. The fault mechanism to do dirty tracking is already in place.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We often need to correlate an 8 bit path mask with the position
in a channel path array. Introduce and use pathmask_to_pos for
that task.
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Hold the device_lock during [de]activation of the channel measurement
block to synchronize concurrent usage of these functions.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The <linux/memblock.h> already provides for_each_mem_range() macro that
iterates through memblock areas from type_a and not included in type_b.
We can remove custom for_each_dump_mem_range() macro and use the
for_each_mem_range() instead.
Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
The principles of operation says:
The storage-operand fetch references of one instruction
occur after those of all preceding instructions and
before those of subsequent instructions, as observed
by other CPUs and by channel programs.
[...]
The CPU may fetch the operands of instructions before the
instructions are executed.
[...]
The CPU may delay placing results in storage.
[...]
the results of one instruction are placed in storage after
the results of all preceding instructions have been placed
in storage and before any results of the succeeding
instructions are stored, as observed by other CPUs and by
the channel subsystem.
which boils down to:
- reads are in order
- writes are in order
- reads can happen earlier
- writes can happen later
By definition (see memory-barrier.txt) read barriers orders
reads vs reads and write barriers orders writes agains writes.
but neither of these orders reads vs. writes.
That means we can implement smp_wmb,smp_rmb,wmb and rmb as
simple compiler barriers. To avoid reviewing all driver code
for correct barrier usage we keep dma_[rw]mb as serialization
for now.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
By definition smp_wmb only orders writes against writes. (Finish all
previous writes, and do not start any future write). To protect the
vdso init code against early reads on other CPUs, let's use a full
smp_mb at the end of vdso init. As right now smp_wmb is implemented
as full serialization, this needs no stable backport, but this change
will be necessary if we reimplement smp_wmb.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
_raw_write_lock_wait first sets the high order bit to indicate a
pending writer and then waits for the reader to drop to zero.
smp_rmb by definition only orders reads against reads. Let's use
a full smp_mb instead. As right now smp_rmb is implemented
as full serialization, this needs no stable backport, but this
patch will be necessary if we reimplement smp_rmb.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Let's factor this out and always use get_tod_clock_fast() when
reading the guest TOD.
STORE CLOCK FAST does not do serialization and, therefore, might
result in some fuzziness between different processors in a way
that subsequent calls on different CPUs might have time stamps that
are earlier. This semantics is fine though for all KVM use cases.
To make it obvious that the new function has STORE CLOCK FAST
semantics we name it kvm_s390_get_tod_clock_fast.
With this patch, we only have a handful of places were we
have to care about STP sync (using preempt_disable() logic).
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Let's move that whole logic into one function. We now always use unsigned
values when calculating the epoch (to avoid over/underflow defined).
Also, we always have to get all VCPUs out of SIE before doing the update
to avoid running differing VCPUs with different TODs.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Nobody except early.c makes use of store_tod_clock() to handle the
cc. So if we would get a cc != 0, we would be in more trouble.
Let's replace all users with get_tod_clock(). Returning a cc
on an ioctl sounded strange either way.
We can now also easily move the get_tod_clock() call into the
preempt_disable() section. This is in fact necessary to make the
STP sync work as expected. Otherwise the host TOD could change
and we would end up with a wrong epoch calculation.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
PER events can always co-exist with other program interrupts.
For now, we always overwrite all program interrupt parameters when
injecting any type of program interrupt.
Let's handle that correctly by only overwriting the relevant portion of
the program interrupt parameters. Therefore we can now inject PER events
and ordinary program interrupts concurrently, resulting in no loss of
program interrupts. This will especially by helpful when manually detecting
PER events later - as both types might be triggered during one SIE exit.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
The main reason to keep program injection in kernel separated until now
was that we were able to do some checking, if really only the owning
thread injects program interrupts (via waitqueue_active(li->wq)).
This BUG_ON was never triggered and the chances of really hitting it, if
another thread injected a program irq to another vcpu, were very small.
Let's drop this check and turn kvm_s390_inject_program_int() and
kvm_s390_inject_prog_irq() into simple inline functions that makes use of
kvm_s390_inject_vcpu().
__must_check can be dropped as they are implicitely given by
kvm_s390_inject_vcpu(), to avoid ugly long function prototypes.
Reviewed-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Let's get rid of the local variable and exit directly if we found
any pending interrupt. This is not only faster, but also better
readable.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
We can remove that double check.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
No need to separate pending and floating irqs when setting interception
requests. Let's do it for all equally.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
We don't care about program event recording irqs (synchronous
program irqs) but asynchronous irqs when checking for disabled
wait. Machine checks were missing.
Let's directly switch to the functions we have for that purpose
instead of testing once again for magic bits.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
the float int structure is no longer used in __inject_vm.
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Pull s390 fixes from Martin Schwidefsky:
"Three bug fixes and an update to the default configuration"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/defconfig: set SCSI_DH=y
s390/vtime: correct scaled cputime of partially idle CPUs
s390/boot/decompression: disable floating point in decompressor
s390/numa: use correct type for node_to_cpumask_map
This adds an IOMMU API implementation for s390 PCI devices.
Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Pull strscpy string copy function implementation from Chris Metcalf.
Chris sent this during the merge window, but I waffled back and forth on
the pull request, which is why it's going in only now.
The new "strscpy()" function is definitely easier to use and more secure
than either strncpy() or strlcpy(), both of which are horrible nasty
interfaces that have serious and irredeemable problems.
strncpy() has a useless return value, and doesn't NUL-terminate an
overlong result. To make matters worse, it pads a short result with
zeroes, which is a performance disaster if you have big buffers.
strlcpy(), by contrast, is a mis-designed "fix" for strlcpy(), lacking
the insane NUL padding, but having a differently broken return value
which returns the original length of the source string. Which means
that it will read characters past the count from the source buffer, and
you have to trust the source to be properly terminated. It also makes
error handling fragile, since the test for overflow is unnecessarily
subtle.
strscpy() avoids both these problems, guaranteeing the NUL termination
(but not excessive padding) if the destination size wasn't zero, and
making the overflow condition very obvious by returning -E2BIG. It also
doesn't read past the size of the source, and can thus be used for
untrusted source data too.
So why did I waffle about this for so long?
Every time we introduce a new-and-improved interface, people start doing
these interminable series of trivial conversion patches.
And every time that happens, somebody does some silly mistake, and the
conversion patch to the improved interface actually makes things worse.
Because the patch is mindnumbing and trivial, nobody has the attention
span to look at it carefully, and it's usually done over large swatches
of source code which means that not every conversion gets tested.
So I'm pulling the strscpy() support because it *is* a better interface.
But I will refuse to pull mindless conversion patches. Use this in
places where it makes sense, but don't do trivial patches to fix things
that aren't actually known to be broken.
* 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
tile: use global strscpy() rather than private copy
string: provide strscpy()
Make asm/word-at-a-time.h available on all architectures
As we need to add further flags to the bpf_prog structure, lets migrate
both bools to a bitfield representation. The size of the base structure
(excluding insns) remains unchanged at 40 bytes.
Add also tags for the kmemchecker, so that it doesn't throw false
positives. Even in case gcc would generate suboptimal code, it's not
being accessed in performance critical paths.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix this warning:
arch/s390/configs/performance_defconfig:380:warning: symbol value 'm' invalid for SCSI_DH
Introduced via 086b91d052ebe4ead5d28021afe3bdfd70af15bf
(scsi_dh: integrate into the core SCSI code)
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The calculation for the SMT scaling factor for a hardware thread
which has been partially idle needs to disregard the cycles spent
by the other threads of the core while the thread is idle.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
my gcc 5.1 used an ldgr instruction with a register != 0,2,4,6 for
spilling/filling into a floating point register in our decompressor.
This will cause an AFP-register data exception as the decompressor
did not setup the additional floating point registers via cr0.
That causes a program check loop that looked like a hang with
one "Uncompressing Linux... " message (directly booted via kvm)
or a loop of "Uncompressing Linux... " messages (when booted via
zipl boot loader).
The offending code in my build was
48e400: e3 c0 af ff ff 71 lay %r12,-1(%r10)
-->48e406: b3 c1 00 1c ldgr %f1,%r12
48e40a: ec 6c 01 22 02 7f clij %r6,2,12,0x48e64e
but gcc could do spilling into an fpr at any function. We can
simply disable floating point support at that early stage.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: stable@vger.kernel.org
and a few PPC bug fixes too.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJWBSn7AAoJEL/70l94x66Dxd4H/RT6kWWj9x4grEYUkcJUDyK2
AXm7XcKQm04auwAic8Otr+ts/Qix/50kWmBe/TU0QLgqb8rj5Dj3yGFK6Z1y6mAz
KvaxqMJd4tZGTqN0DDvC2ItEdzjfAdeJZo/FHXqPHVspG0G14T7STLna02LTBBEJ
tNzY9qor8nFhg2fT2szqKaudUNgTqkCTpo57o2BrHE96SHG+m0WdpQCV1F5hPVpg
Te0Pb7qX9xng5n3sQ7IV/t3QYbrza1ACwNQS9XJa0Yu6iEz7JdmVmzHQASK9ynn6
hUHhsNYGx4IsPjPtfJk2GroNaRDZL+VMzw07tfcOvPx8xkS9hS63pwzmSBqfLrM=
=Ywqn
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"AMD fixes for bugs introduced in the 4.2 merge window, and a few PPC
bug fixes too"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: disable halt_poll_ns as default for s390x
KVM: x86: fix off-by-one in reserved bits check
KVM: x86: use correct page table format to check nested page table reserved bits
KVM: svm: do not call kvm_set_cr0 from init_vmcb
KVM: x86: trap AMD MSRs for the TSeg base and mask
KVM: PPC: Book3S: Take the kvm->srcu lock in kvmppc_h_logical_ci_load/store()
KVM: PPC: Book3S HV: Pass the correct trap argument to kvmhv_commence_exit
KVM: PPC: Book3S HV: Fix handling of interrupted VCPUs
kvm: svm: reset mmu on VCPU reset
We observed some performance degradation on s390x with dynamic
halt polling. Until we can provide a proper fix, let's enable
halt_poll_ns as default only for supported architectures.
Architectures are now free to set their own halt_poll_ns
default value.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
With CONFIG_CPUMASK_OFFSTACK=y cpumask_var_t is a pointer to a CPU mask.
Replace the incorrect type for node_to_cpumask_map with cpumask_t.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pull s390 fixes from Martin Schwidefsky:
"A couple of system call updates. The two new system calls userfaultfd
and membarrier have been added, as well as the 17 direct calls for the
multiplexed socket system calls.
In addition the system call compat wrappers have been flagged as
notrace functions and a few wrappers could be removed.
And bug fixes for the vector register handling, cpu_mf, suspend/resume,
compat signals, SMT cputime accounting and the zfcp dumper"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: wire up separate socketcalls system calls
s390/compat: remove superfluous compat wrappers
s390/compat: do not trace compat wrapper functions
s390/s390x: allocate sys_membarrier system call number
s390/configs//zfcpdump_defconfig: Remove CONFIG_MEMSTICK
s390: wire up userfaultfd system call
s390/vtime: correct scaled cputime for SMT
s390/cpum_cf: Corrected return code for unauthorized counter sets
s390/compat: correct uc_sigmask of the compat signal frame
s390: fix floating point register corruption
s390/hibernate: fix save and restore of vector registers
As discussed on linux-arch all architectures should wire up the separate
system calls that are hidden behind the socketcall multiplexer system call.
It's just a couple more system calls and gives us a very small performance
improvement.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
A couple of compat wrapper functions are simply trampolines to the real
system call. This happened because the compat wrapper defines will only
sign and zero extend system call parameters which are of different size
on s390/s390x (longs and pointers).
All other parameters will be correctly sign and zero extended by the
normal system call wrappers.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add notrace to the compat wrapper define to disable tracing of compat
wrapper functions. These are supposed to be very small and more or less
just a trampoline to the real system call.
Also fix indentation.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This config option is completely irrelevant for zfcpdump and
unfortunately causes a kernel panic on recent kernels in
"mspro_block_init()/driver_register()".
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The scaled cputime is supposed to be derived from the normal per-thread
cputime by dividing it with the average thread density in the last interval.
The calculation of the scaling values for the average thread density is
incorrect. The current, incorrect calculation:
Ci = cycle count with i active threads
T = unscaled cputime, sT = scaled cputime
sT = T * (C1 + C2 + ... + Cn) / (1*C1 + 2*C2 + ... + n*Cn)
The calculation happens to yield the correct numbers for the simple cases
with only one Ci value not zero. But for cases with multiple Ci values not
zero it fails. E.g. on a SMT-2 system with one thread active half the time
and two threads active for the other half of the time it fails, the scaling
factor should be 3/4 but the formula gives 2/3.
The correct formula is
sT = T * (C1/1 + C2/2 + ... + Cn/n) / (C1 + C2 + ... + Cn)
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Previously, the cpum_cf PMU returned -EPERM if a counter is requested and
the counter set to which the counter belongs is not authorized. According
to the perf_event_open() system call manual, an error code of EPERM indicates
an unsupported exclude setting or CAP_SYS_ADMIN is missing.
Use ENOENT to indicate that particular counters are not available when the
counter set which contains the counter is not authorized. For generic events,
this might trigger a fall back, for example, to a software event.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The uc_sigmask in the ucontext structure is an array of words to keep
the 64 signal bits (or 1024 if you ask glibc but the kernel sigset_t
only has 64 bits).
For 64 bit the sigset_t contains a single 8 byte word, but for 31 bit
there are two 4 byte words. The compat signal handler code uses a
simple copy of the 64 bit sigset_t to the 31 bit compat_sigset_t.
As s390 is a big-endian architecture this is incorrect, the two words
in the 31 bit sigset_t array need to be swapped.
Cc: <stable@vger.kernel.org>
Reported-by: Stefan Liebler <stli@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The critical section cleanup code misses to add the offset of the
thread_struct to the task address.
Therefore, if the critical section code gets executed, it may corrupt
the task struct or restore the contents of the floating point registers
from the wrong memory location.
Fixes d0164ee20d "s390/kernel: remove save_fpu_regs() parameter and use
__LC_CURRENT instead".
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The swsusp_arch_suspend()/swsusp_arch_resume() functions currently only
save and restore the floating point registers. If the task that started
the hibernation process is using vector registers they can get lost.
To fix this just call save_fpu_regs in swsusp_arch_suspend(), the restore
will happen automatically on return to user space.
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The offending commit accidentally replaces an atomic_clear with an
atomic_or instead of an atomic_andnot in kvm_s390_vcpu_request_handled.
The symptom is that kvm guests on s390 hang on startup.
This patch simply replaces the incorrect atomic_or with atomic_andnot
Fixes: 805de8f43c20 (atomic: Replace atomic_{set,clear}_mask() usage)
Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This new statistic can help diagnosing VCPUs that, for any reason,
trigger bad behavior of halt_poll_ns autotuning.
For example, say halt_poll_ns = 480000, and wakeups are spaced exactly
like 479us, 481us, 479us, 481us. Then KVM always fails polling and wastes
10+20+40+80+160+320+480 = 1110 microseconds out of every
479+481+479+481+479+481+479 = 3359 microseconds. The VCPU then
is consuming about 30% more CPU than it would use without
polling. This would show as an abnormally high number of
attempted polling compared to the successful polls.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com<
Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>