5690 Commits

Author SHA1 Message Date
Martin Schwidefsky
60f07c8ec5 s390/mm: fix race on mm->context.flush_mm
The order in __tlb_flush_mm_lazy is to flush TLB first and then clear
the mm->context.flush_mm bit. This can lead to missed flushes as the
bit can be set anytime, the order needs to be the other way aronud.

But this leads to a different race, __tlb_flush_mm_lazy may be called
on two CPUs concurrently. If mm->context.flush_mm is cleared first then
another CPU can bypass __tlb_flush_mm_lazy although the first CPU has
not done the flush yet. In a virtualized environment the time until the
flush is finally completed can be arbitrarily long.

Add a spinlock to serialize __tlb_flush_mm_lazy and use the function
in finish_arch_post_lock_switch as well.

Cc: <stable@vger.kernel.org>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-09-06 09:24:42 +02:00
Martin Schwidefsky
b3e5dc45fd s390/mm: fix local TLB flushing vs. detach of an mm address space
The local TLB flushing code keeps an additional mask in the mm.context,
the cpu_attach_mask. At the time a global flush of an address space is
done the cpu_attach_mask is copied to the mm_cpumask in order to avoid
future global flushes in case the mm is used by a single CPU only after
the flush.

Trouble is that the reset of the mm_cpumask is racy against the detach
of an mm address space by switch_mm. The current order is first the
global TLB flush and then the copy of the cpu_attach_mask to the
mm_cpumask. The order needs to be the other way around.

Cc: <stable@vger.kernel.org>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-09-06 09:24:42 +02:00
Harald Freudenberger
46fde9a9d2 s390/zcrypt: externalize AP queue interrupt control
KVM has a need to control the interrupts on real and virtualized
AP queue devices. This fix provides a new function to control
the interrupt facilities of an AP queue device.

Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-09-06 09:24:42 +02:00
Harald Freudenberger
050349b5b7 s390/zcrypt: externalize AP config info query
KVM has a need to fetch the crypto configuration information
as it is returned by the PQAP(QCI) instruction. This patch
introduces a new API ap_query_configuration() which provides
this info in a handy way for the caller.

Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-09-06 09:24:42 +02:00
Tony Krowiak
e7fc5146cf s390/zcrypt: externalize test AP queue
Under certain specified conditions, the Test AP Queue (TAPQ)
subfunction of the Process Adjunct Processor Queue (PQAP) instruction
will be intercepted by a guest VM. The guest VM must have a means for
executing the intercepted instruction.

The vfio_ap driver will provide an interface to execute the
PQAP(TAPQ) instruction subfunction on behalf of a guest VM.
The code for executing the AP instructions currently resides in the
AP bus. This patch refactors the AP bus code to externalize access
to the PQAP(TAPQ) instruction subfunction to make it available to
the vfio_ap driver.

Signed-off-by: Tony Krowiak <akrowiak@linux.vnet.ibm.com>
Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-09-06 09:24:42 +02:00
Martin Schwidefsky
2fc4876ea8 s390/mm: use VM_BUG_ON in crst_table_[upgrade|downgrade]
The BUG_ON in crst_table_[upgrade|downgrade] is a debugging aid,
replace it with VM_BUG_ON.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-09-06 09:24:41 +02:00
Linus Torvalds
9e85ae6af6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
 "The first part of the s390 updates for 4.14:

   - Add machine type 0x3906 for IBM z14

   - Add IBM z14 TLB flushing improvements for KVM guests

   - Exploit the TOD clock epoch extension to provide a continuous TOD
     clock afer 2042/09/17

   - Add NIAI spinlock hints for IBM z14

   - Rework the vmcp driver and use CMA for the respone buffer of z/VM
     CP commands

   - Drop some s390 specific asm headers and use the generic version

   - Add block discard for DASD-FBA devices under z/VM

   - Add average request times to DASD statistics

   - A few of those constify patches which seem to be in vogue right now

   - Cleanup and bug fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (50 commits)
  s390/mm: avoid empty zero pages for KVM guests to avoid postcopy hangs
  s390/dasd: Add discard support for FBA devices
  s390/zcrypt: make CPRBX const
  s390/uaccess: avoid mvcos jump label
  s390/mm: use generic mm_hooks
  s390/facilities: fix typo
  s390/vmcp: simplify vmcp_response_free()
  s390/topology: Remove the unused parent_node() macro
  s390/dasd: Change unsigned long long to unsigned long
  s390/smp: convert cpuhp_setup_state() return code to zero on success
  s390: fix 'novx' early parameter handling
  s390/dasd: add average request times to dasd statistics
  s390/scm: use common completion path
  s390/pci: log changes to uid checking
  s390/vmcp: simplify vmcp_ioctl()
  s390/vmcp: return -ENOTTY for unknown ioctl commands
  s390/vmcp: split vmcp header file and move to uapi
  s390/vmcp: make use of contiguous memory allocator
  s390/cpcmd,vmcp: avoid GFP_DMA allocations
  s390/vmcp: fix uaccess check and avoid undefined behavior
  ...
2017-09-05 09:45:46 -07:00
Linus Torvalds
5f82e71a00 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:

 - Add 'cross-release' support to lockdep, which allows APIs like
   completions, where it's not the 'owner' who releases the lock, to be
   tracked. It's all activated automatically under
   CONFIG_PROVE_LOCKING=y.

 - Clean up (restructure) the x86 atomics op implementation to be more
   readable, in preparation of KASAN annotations. (Dmitry Vyukov)

 - Fix static keys (Paolo Bonzini)

 - Add killable versions of down_read() et al (Kirill Tkhai)

 - Rework and fix jump_label locking (Marc Zyngier, Paolo Bonzini)

 - Rework (and fix) tlb_flush_pending() barriers (Peter Zijlstra)

 - Remove smp_mb__before_spinlock() and convert its usages, introduce
   smp_mb__after_spinlock() (Peter Zijlstra)

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (56 commits)
  locking/lockdep/selftests: Fix mixed read-write ABBA tests
  sched/completion: Avoid unnecessary stack allocation for COMPLETION_INITIALIZER_ONSTACK()
  acpi/nfit: Fix COMPLETION_INITIALIZER_ONSTACK() abuse
  locking/pvqspinlock: Relax cmpxchg's to improve performance on some architectures
  smp: Avoid using two cache lines for struct call_single_data
  locking/lockdep: Untangle xhlock history save/restore from task independence
  locking/refcounts, x86/asm: Disable CONFIG_ARCH_HAS_REFCOUNT for the time being
  futex: Remove duplicated code and fix undefined behaviour
  Documentation/locking/atomic: Finish the document...
  locking/lockdep: Fix workqueue crossrelease annotation
  workqueue/lockdep: 'Fix' flush_work() annotation
  locking/lockdep/selftests: Add mixed read-write ABBA tests
  mm, locking/barriers: Clarify tlb_flush_pending() barriers
  locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS truly non-interactive
  locking/lockdep: Explicitly initialize wq_barrier::done::map
  locking/lockdep: Rename CONFIG_LOCKDEP_COMPLETE to CONFIG_LOCKDEP_COMPLETIONS
  locking/lockdep: Reword title of LOCKDEP_CROSSRELEASE config
  locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE part of CONFIG_PROVE_LOCKING
  locking/refcounts, x86/asm: Implement fast refcount overflow protection
  locking/lockdep: Fix the rollback and overwrite detection logic in crossrelease
  ...
2017-09-04 11:52:29 -07:00
Linus Torvalds
0081a0ce80 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnad:
 "The main RCU related changes in this cycle were:

   - Removal of spin_unlock_wait()
   - SRCU updates
   - RCU torture-test updates
   - RCU Documentation updates
   - Extend the sys_membarrier() ABI with the MEMBARRIER_CMD_PRIVATE_EXPEDITED variant
   - Miscellaneous RCU fixes
   - CPU-hotplug fixes"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits)
  arch: Remove spin_unlock_wait() arch-specific definitions
  locking: Remove spin_unlock_wait() generic definitions
  drivers/ata: Replace spin_unlock_wait() with lock/unlock pair
  ipc: Replace spin_unlock_wait() with lock/unlock pair
  exit: Replace spin_unlock_wait() with lock/unlock pair
  completion: Replace spin_unlock_wait() with lock/unlock pair
  doc: Set down RCU's scheduling-clock-interrupt needs
  doc: No longer allowed to use rcu_dereference on non-pointers
  doc: Add RCU files to docbook-generation files
  doc: Update memory-barriers.txt for read-to-write dependencies
  doc: Update RCU documentation
  membarrier: Provide expedited private command
  rcu: Remove exports from rcu_idle_exit() and rcu_idle_enter()
  rcu: Add warning to rcu_idle_enter() for irqs enabled
  rcu: Make rcu_idle_enter() rely on callers disabling irqs
  rcu: Add assertions verifying blocked-tasks list
  rcu/tracing: Set disable_rcu_irq_enter on rcu_eqs_exit()
  rcu: Add TPS() protection for _rcu_barrier_trace strings
  rcu: Use idle versions of swait to make idle-hack clear
  swait: Add idle variants which don't contribute to load average
  ...
2017-09-04 08:13:52 -07:00
Ingo Molnar
edc2988c54 Merge branch 'linus' into locking/core, to fix up conflicts
Conflicts:
	mm/page_alloc.c

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-04 11:01:18 +02:00
Linus Torvalds
69c0067aa3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc fixes from Al Viro:
 "Loose ends and regressions from the last merge window.

  Strictly speaking, only binfmt_flat thing is a build regression per
  se - the rest is 'only sparse cares about that' stuff"

[ This came in before the 4.13 release and could have gone there, but it
  was late in the release and nothing seemed critical enough to care, so
  I'm pulling it in the 4.14 merge window instead  - Linus ]

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  binfmt_flat: fix arch/m32r and arch/microblaze flat_put_addr_at_rp()
  compat_hdio_ioctl: Fix a declaration
  <linux/uaccess.h>: Fix copy_in_user() declaration
  annotate RWF_... flags
  teach SYSCALL_DEFINE/COMPAT_SYSCALL_DEFINE to handle __bitwise arguments
2017-09-03 16:09:03 -07:00
David S. Miller
6026e043d0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Three cases of simple overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-01 17:42:05 -07:00
Joerg Roedel
47b59d8e40 Merge branches 'arm/exynos', 'arm/renesas', 'arm/rockchip', 'arm/omap', 'arm/mediatek', 'arm/tegra', 'arm/qcom', 'arm/smmu', 'ppc/pamu', 'x86/vt-d', 'x86/amd', 's390' and 'core' into next 2017-09-01 11:31:42 +02:00
Al Viro
4f59c71852 teach SYSCALL_DEFINE/COMPAT_SYSCALL_DEFINE to handle __bitwise arguments
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-08-31 17:32:37 -04:00
Martin Schwidefsky
8ab867cb08 s390/mm: fix BUG_ON in crst_table_upgrade
A 31-bit compat process can force a BUG_ON in crst_table_upgrade
with specific, invalid mmap calls, e.g.

   mmap((void*) 0x7fff8000, 0x10000, 3, 32, -1, 0)

The arch_get_unmapped_area[_topdown] functions miss an if condition
in the decision to do a page table upgrade.

Fixes: 9b11c7912d00 ("s390/mm: simplify arch_get_unmapped_area[_topdown]")
Cc: <stable@vger.kernel.org>  # v4.12+
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-08-31 14:03:21 +02:00
Martin Schwidefsky
0b89ede629 s390/mm: fork vs. 5 level page tabel
The mm->context.asce field of a new process is not set up correctly
in case of a fork with a 5 level page table.
Add the missing case to init_new_context().

Fixes: 1aea9b3f9210 ("s390/mm: implement 5 level pages tables")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-08-31 14:03:21 +02:00
David Hildenbrand
c95c895303 KVM: s390: vsie: cleanup mcck reinjection
The machine check information is part of the vsie_page.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170830160603.5452-4-david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-08-31 13:49:40 +02:00
David Hildenbrand
3dbf0205b1 KVM: s390: use WARN_ON_ONCE only for checking
Move the real logic that always has to be executed out of the
WARN_ON_ONCE.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170830160603.5452-3-david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-08-31 13:49:39 +02:00
David Hildenbrand
8149fc0772 KVM: s390: guestdbg: fix range check
Looks like the "overflowing" range check is wrong.

|=======b-------a=======|

addr >= a || addr <= b

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170830160603.5452-2-david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-08-31 13:49:39 +02:00
David Hildenbrand
1935222dc2 KVM: s390: we are always in czam mode
Independent of the underlying hardware, kvm will now always handle
SIGP SET ARCHITECTURE as if czam were enabled. Therefore, let's not
only forward that bit but always set it.

While at it, add a comment regarding STHYI.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170829143108.14703-1-david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-08-29 16:50:20 +02:00
Christian Borntraeger
fa41ba0d08 s390/mm: avoid empty zero pages for KVM guests to avoid postcopy hangs
Right now there is a potential hang situation for postcopy migrations,
if the guest is enabling storage keys on the target system during the
postcopy process.

For storage key virtualization, we have to forbid the empty zero page as
the storage key is a property of the physical page frame.  As we enable
storage key handling lazily we then drop all mappings for empty zero
pages for lazy refaulting later on.

This does not work with the postcopy migration, which relies on the
empty zero page never triggering a fault again in the future. The reason
is that postcopy migration will simply read a page on the target system
if that page is a known zero page to fault in an empty zero page.  At
the same time postcopy remembers that this page was already transferred
- so any future userfault on that page will NOT be retransmitted again
to avoid races.

If now the guest enters the storage key mode while in postcopy, we will
break this assumption of postcopy.

The solution is to disable the empty zero page for KVM guests early on
and not during storage key enablement. With this change, the postcopy
migration process is guaranteed to start after no zero pages are left.

As guest pages are very likely not empty zero pages anyway the memory
overhead is also pretty small.

While at it this also adds proper page table locking to the zero page
removal.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-08-29 16:31:27 +02:00
Jan Höppner
28b841b3a7 s390/dasd: Add discard support for FBA devices
The z/VM hypervisor provides virtual disks (VDISK) which are backed by
main memory of the hypervisor. Those devices are seen as DASD FBA disks
within the Linux guest.

Whenever data is written to such a device, memory is allocated
on-the-fly by z/VM accordingly. This memory, however, is not being freed
if data on the device is deleted by the guest OS.

In order to make memory usable after deletion again, add discard support
to the FBA discipline.

While at it, update comments regarding the DASD_FEATURE_* flags.

Reviewed-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Signed-off-by: Jan Höppner <hoeppner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-08-29 16:31:26 +02:00
Martin Schwidefsky
d66bf801e0 s390/uaccess: avoid mvcos jump label
If the kernel is compiled for z10 or later machines the uaccess
code inlines the mvcos instruction. The facility bit 27 which
indicates the availability of MVCOS has to be set. The have_mvcos
jump label will always be true.

Make the generation of the have_mvcos jump label conditional on
!CONFIG_HAVE_MARCH_Z10_FEATURES.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-08-29 16:29:10 +02:00
Martin Schwidefsky
8e58ab5c65 s390/mm: use generic mm_hooks
With git commit 3446c13b268af86391d06611327006b059b8bab1
"s390/mm: four page table levels vs. fork"
s390 dropped its architecture specific version of arch_dup_mmap.

Now all functions defined by include/asm-generic/mm_hooks.h are
identical to the s390 versions. Use the generic header.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-08-29 16:29:09 +02:00
Heiko Carstens
ba7944114a s390/facilities: fix typo
There really is no "general extension" facility available. Use the
correct name "general instructions extension" facility instead.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-08-29 16:29:04 +02:00
Claudio Imbrenda
1bab1c02af KVM: s390: expose no-DAT to guest and migration support
The STFLE bit 147 indicates whether the ESSA no-DAT operation code is
valid, the bit is not normally provided to the host; the host is
instead provided with an SCLP bit that indicates whether guests can
support the feature.

This patch:
* enables the STFLE bit in the guest if the corresponding SCLP bit is
  present in the host.
* adds support for migrating the no-DAT bit in the PGSTEs
* fixes the software interpretation of the ESSA instruction that is
  used when migrating, both for the new operation code and for the old
  "set stable", as per specifications.

Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-08-29 15:15:56 +02:00
Heiko Carstens
631aebfee8 KVM: s390: sthyi: remove invalid guest write access
handle_sthyi() always writes to guest memory if the sthyi function
code is zero in order to fault in the page that later is written to.

However a function code of zero does not necessarily mean that a write
to guest memory happens: if the KVM host is running as a second level
guest under z/VM 6.2 the sthyi instruction is indicated to be
available to the KVM host, however if the instruction is executed it
will always return with a return code that indicates "unsupported
function code".

In such a case handle_sthyi() must not write to guest memory. This
means that the prior write access to fault in the guest page may
result in invalid guest exceptions, and/or invalid data modification.

In order to be architecture compliant simply remove the write_guest()
call.

Given that the guest assumed a write access anyway, this fix does not
qualify for -stable. This just makes sure the sthyi handler is
architecture compliant.

Fixes: 95ca2cb57985 ("KVM: s390: Add sthyi emulation")
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-08-29 15:15:56 +02:00
Collin L. Walling
8fa1696ea7 KVM: s390: Multiple Epoch Facility support
Allow for the enablement of MEF and the support for the extended
epoch in SIE and VSIE for the extended guest TOD-Clock.

A new interface is used for getting/setting a guest's extended TOD-Clock
that uses a single ioctl invocation, KVM_S390_VM_TOD_EXT.  Since the
host time is a moving target that might see an epoch switch or STP sync
checks we need an atomic ioctl and cannot use the exisiting two
interfaces. The old method of getting and setting the guest TOD-Clock is
still retained and is used when the old ioctls are called.

Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-08-29 15:15:54 +02:00
Jason J. Herne
b697e435ae KVM: s390: Support Configuration z/Architecture Mode
kvm has always supported the concept of starting in z/Arch mode so let's
reflect the feature bit to the guest.

Also, we change sigp set architecture to reject any request to change
architecture modes.

Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-08-28 16:25:13 +02:00
Christian Borntraeger
ccc59f47b6 KVM: s390: two fixes for sthyi emulation
- missing inline assembly constraint
 - wrong exception handling
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJZmuHdAAoJEBF7vIC1phx8dWEP/ilr5tP0+NZ57kmBGn/7/BpB
 toAqU20MRDaXj2mGfHBy5VCF5LU1acZGIor9M51KXaruXOcI/qIubze86tLvv5xI
 ts0RDwaQLrSysFrTAsn1b1DXeOzy65ZN7gvWlWab9Wq9gCdPjDerPsZ4gM3Q+vyH
 NrPnwAVvlkC9KeiqQWBSXBdEGd7saL/FU1h2r1wnFk05/b5h8ktl73JGN1uq6WBc
 c0RdElRQ+sa24NX1rI2qScuwfwoZMUiIuIzEvWXURaKSDN9hLR8RXTE1tf4BU+Ix
 jxyWej1MbA/kjGzW83xaeWansPyT6AgLZjdLsFnC+JDo1Qa0fnPntlBcCkrdkbd7
 oRYbj/VUJs4vkng0L/rlL3RbPt2K5UvaaoRDmaNqQAUJHopo+yQv4WsrOXA9nuuB
 cf74CLUpCwZiiGLonVAnVKlVRhUsj/DuzRdi6AB0DgCgf4iX36rHjBaFVzLuUWTU
 DasqpaGIX8vPJ4CiYTnJP4wAXvDwY19vrBA3CIgjlkF3JkDuBOcYLc7KXdif5azr
 +puBufI//NoV9xVX2cAxtn/ouKdH6LUwXhTkVjARQuz+nugiJ+FyHodWK/TCO0kz
 D51/+oR5BvwWxxXmxW7jwQvxenNbv95HNo60Z+Nzg2UbpqE1AYYt9PppJ0QU0P4k
 Z6+PYE2fcY127lOliLqy
 =Vvs+
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-master-4.13-2' into kvms390/next

Additional fixes on top of these two
 - missing inline assembly constraint
 - wrong exception handling

are necessary
2017-08-28 13:46:32 +02:00
Jiri Slaby
30d6e0a419 futex: Remove duplicated code and fix undefined behaviour
There is code duplicated over all architecture's headers for
futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr,
and comparison of the result.

Remove this duplication and leave up to the arches only the needed
assembly which is now in arch_futex_atomic_op_inuser.

This effectively distributes the Will Deacon's arm64 fix for undefined
behaviour reported by UBSAN to all architectures. The fix was done in
commit 5f16a046f8e1 (arm64: futex: Fix undefined behaviour with
FUTEX_OP_OPARG_SHIFT usage). Look there for an example dump.

And as suggested by Thomas, check for negative oparg too, because it was
also reported to cause undefined behaviour report.

Note that s390 removed access_ok check in d12a29703 ("s390/uaccess:
remove pointless access_ok() checks") as access_ok there returns true.
We introduce it back to the helper for the sake of simplicity (it gets
optimized away anyway).

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> [s390]
Acked-by: Chris Metcalf <cmetcalf@mellanox.com> [for tile]
Reviewed-by: Darren Hart (VMware) <dvhart@infradead.org>
Reviewed-by: Will Deacon <will.deacon@arm.com> [core/arm64]
Cc: linux-mips@linux-mips.org
Cc: Rich Felker <dalias@libc.org>
Cc: linux-ia64@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: peterz@infradead.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: sparclinux@vger.kernel.org
Cc: Jonas Bonn <jonas@southpole.se>
Cc: linux-s390@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: linux-hexagon@vger.kernel.org
Cc: Helge Deller <deller@gmx.de>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: linux-snps-arc@lists.infradead.org
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linux-xtensa@linux-xtensa.org
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: openrisc@lists.librecores.org
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Stafford Horne <shorne@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Richard Henderson <rth@twiddle.net>
Cc: Chris Zankel <chris@zankel.net>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-parisc@vger.kernel.org
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: linux-alpha@vger.kernel.org
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: "David S. Miller" <davem@davemloft.net>
Link: http://lkml.kernel.org/r/20170824073105.3901-1-jslaby@suse.cz
2017-08-25 22:49:59 +02:00
Dou Liyang
41b0dbfac0 s390/topology: Remove the unused parent_node() macro
Commit a7be6e5a7f8d ("mm: drop useless local parameters of
__register_one_node()") removes the last user of parent_node().

The parent_node() macro in S390 platform is unnecessary.

Remove it for cleanup.

Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-08-23 13:31:51 +02:00
Jan Höppner
7bf76f0169 s390/dasd: Change unsigned long long to unsigned long
Unsigned long long and unsigned long were different in size for 31-bit.
For 64-bit the size for both datatypes is 8 Bytes and since the support
for 31-bit is long gone we can clean up a little and change everything
to unsigned long.
Change get_phys_clock() along the way to accept unsigned long as well so
that the DASD code can be consistent.

Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Jan Höppner <hoeppner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-08-23 13:31:51 +02:00
Heiko Carstens
e1108e8f0d s390/smp: convert cpuhp_setup_state() return code to zero on success
cpuhp_setup_state() returns a state number on CPUHP_AP_ONLINE_DYN if
no error occurred. Therefore convert the return code to zero before
using it as exit code for s390_smp_init().

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-08-23 13:31:49 +02:00
Heiko Carstens
673cfddf6e s390: fix 'novx' early parameter handling
Specifying the 'novx' kernel parameter always results in a warning:

Malformed early option 'novx'

The reason for this is that the novx early parameter handling function
always returns a non-zero value which means that an error occurred.
Fix this and return the correct zero value instead.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-08-23 13:31:47 +02:00
Radim Krčmář
f2d8421b88 KVM: s390: two fixes for sthyi emulation
- missing inline assembly constraint
 - wrong exception handling
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJZmuHdAAoJEBF7vIC1phx8dWEP/ilr5tP0+NZ57kmBGn/7/BpB
 toAqU20MRDaXj2mGfHBy5VCF5LU1acZGIor9M51KXaruXOcI/qIubze86tLvv5xI
 ts0RDwaQLrSysFrTAsn1b1DXeOzy65ZN7gvWlWab9Wq9gCdPjDerPsZ4gM3Q+vyH
 NrPnwAVvlkC9KeiqQWBSXBdEGd7saL/FU1h2r1wnFk05/b5h8ktl73JGN1uq6WBc
 c0RdElRQ+sa24NX1rI2qScuwfwoZMUiIuIzEvWXURaKSDN9hLR8RXTE1tf4BU+Ix
 jxyWej1MbA/kjGzW83xaeWansPyT6AgLZjdLsFnC+JDo1Qa0fnPntlBcCkrdkbd7
 oRYbj/VUJs4vkng0L/rlL3RbPt2K5UvaaoRDmaNqQAUJHopo+yQv4WsrOXA9nuuB
 cf74CLUpCwZiiGLonVAnVKlVRhUsj/DuzRdi6AB0DgCgf4iX36rHjBaFVzLuUWTU
 DasqpaGIX8vPJ4CiYTnJP4wAXvDwY19vrBA3CIgjlkF3JkDuBOcYLc7KXdif5azr
 +puBufI//NoV9xVX2cAxtn/ouKdH6LUwXhTkVjARQuz+nugiJ+FyHodWK/TCO0kz
 D51/+oR5BvwWxxXmxW7jwQvxenNbv95HNo60Z+Nzg2UbpqE1AYYt9PppJ0QU0P4k
 Z6+PYE2fcY127lOliLqy
 =Vvs+
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-master-4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux

KVM: s390: two fixes for sthyi emulation

- missing inline assembly constraint
- wrong exception handling
2017-08-21 18:21:30 +02:00
Heiko Carstens
857b8de967 KVM: s390: sthyi: fix specification exception detection
sthyi should only generate a specification exception if the function
code is zero and the response buffer is not on a 4k boundary.

The current code would also test for unknown function codes if the
response buffer, that is currently only defined for function code 0,
is not on a 4k boundary and incorrectly inject a specification
exception instead of returning with condition code 3 and return code 4
(unsupported function code).

Fix this by moving the boundary check.

Fixes: 95ca2cb57985 ("KVM: s390: Add sthyi emulation")
Cc: <stable@vger.kernel.org> # 4.8+
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-08-21 15:34:54 +02:00
Heiko Carstens
4a4eefcd0e KVM: s390: sthyi: fix sthyi inline assembly
The sthyi inline assembly misses register r3 within the clobber
list. The sthyi instruction will always write a return code to
register "R2+1", which in this case would be r3. Due to that we may
have register corruption and see host crashes or data corruption
depending on how gcc decided to allocate and use registers during
compile time.

Fixes: 95ca2cb57985 ("KVM: s390: Add sthyi emulation")
Cc: <stable@vger.kernel.org> # 4.8+
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-08-21 15:34:36 +02:00
Ingo Molnar
94edf6f3c2 Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU updates from Paul E. McKenney:

 - Removal of spin_unlock_wait()
 - SRCU updates
 - Torture-test updates
 - Documentation updates
 - Miscellaneous fixes
 - CPU-hotplug fixes
 - Miscellaneous non-RCU fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-21 09:45:19 +02:00
Paul E. McKenney
952111d7db arch: Remove spin_unlock_wait() arch-specific definitions
There is no agreed-upon definition of spin_unlock_wait()'s semantics,
and it appears that all callers could do just as well with a lock/unlock
pair.  This commit therefore removes the underlying arch-specific
arch_spin_unlock_wait() for all architectures providing them.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: <linux-arch@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Boqun Feng <boqun.feng@gmail.com>
2017-08-17 08:08:59 -07:00
David S. Miller
463910e2df Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-08-15 20:23:23 -07:00
Joerg Roedel
f42c223514 iommu/s390: Add support for iommu_device handling
Add support for the iommu_device_register interface to make
the s390 hardware iommus visible to the iommu core and in
sysfs.

Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-15 18:22:45 +02:00
Minchan Kim
99baac21e4 mm: fix MADV_[FREE|DONTNEED] TLB flush miss problem
Nadav reported parallel MADV_DONTNEED on same range has a stale TLB
problem and Mel fixed it[1] and found same problem on MADV_FREE[2].

Quote from Mel Gorman:
 "The race in question is CPU 0 running madv_free and updating some PTEs
  while CPU 1 is also running madv_free and looking at the same PTEs.
  CPU 1 may have writable TLB entries for a page but fail the pte_dirty
  check (because CPU 0 has updated it already) and potentially fail to
  flush.

  Hence, when madv_free on CPU 1 returns, there are still potentially
  writable TLB entries and the underlying PTE is still present so that a
  subsequent write does not necessarily propagate the dirty bit to the
  underlying PTE any more. Reclaim at some unknown time at the future
  may then see that the PTE is still clean and discard the page even
  though a write has happened in the meantime. I think this is possible
  but I could have missed some protection in madv_free that prevents it
  happening."

This patch aims for solving both problems all at once and is ready for
other problem with KSM, MADV_FREE and soft-dirty story[3].

TLB batch API(tlb_[gather|finish]_mmu] uses [inc|dec]_tlb_flush_pending
and mmu_tlb_flush_pending so that when tlb_finish_mmu is called, we can
catch there are parallel threads going on.  In that case, forcefully,
flush TLB to prevent for user to access memory via stale TLB entry
although it fail to gather page table entry.

I confirmed this patch works with [4] test program Nadav gave so this
patch supersedes "mm: Always flush VMA ranges affected by zap_page_range
v2" in current mmotm.

NOTE:

This patch modifies arch-specific TLB gathering interface(x86, ia64,
s390, sh, um).  It seems most of architecture are straightforward but
s390 need to be careful because tlb_flush_mmu works only if
mm->context.flush_mm is set to non-zero which happens only a pte entry
really is cleared by ptep_get_and_clear and friends.  However, this
problem never changes the pte entries but need to flush to prevent
memory access from stale tlb.

[1] http://lkml.kernel.org/r/20170725101230.5v7gvnjmcnkzzql3@techsingularity.net
[2] http://lkml.kernel.org/r/20170725100722.2dxnmgypmwnrfawp@suse.de
[3] http://lkml.kernel.org/r/BD3A0EBE-ECF4-41D4-87FA-C755EA9AB6BD@gmail.com
[4] https://patchwork.kernel.org/patch/9861621/

[minchan@kernel.org: decrease tlb flush pending count in tlb_finish_mmu]
  Link: http://lkml.kernel.org/r/20170808080821.GA31730@bbox
Link: http://lkml.kernel.org/r/20170802000818.4760-7-namit@vmware.com
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Nadav Amit <namit@vmware.com>
Reported-by: Nadav Amit <namit@vmware.com>
Reported-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-10 15:54:07 -07:00
Minchan Kim
56236a5955 mm: refactor TLB gathering API
This patch is a preparatory patch for solving race problems caused by
TLB batch.  For that, we will increase/decrease TLB flush pending count
of mm_struct whenever tlb_[gather|finish]_mmu is called.

Before making it simple, this patch separates architecture specific part
and rename it to arch_tlb_[gather|finish]_mmu and generic part just
calls it.

It shouldn't change any behavior.

Link: http://lkml.kernel.org/r/20170802000818.4760-5-namit@vmware.com
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Nadav Amit <namit@vmware.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-10 15:54:07 -07:00
Daniel Borkmann
3b497806f6 bpf, s390x: implement jiting of BPF_J{LT, LE, SLT, SLE}
This work implements jiting of BPF_J{LT,LE,SLT,SLE} instructions
with BPF_X/BPF_K variants for the s390x eBPF JIT.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 16:53:57 -07:00
David S. Miller
3118e6e19d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
The UDP offload conflict is dealt with by simply taking what is
in net-next where we have removed all of the UFO handling code
entirely.

The TCP conflict was a case of local variables in a function
being removed from both net and net-next.

In netvsc we had an assignment right next to where a missing
set of u64 stats sync object inits were added.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 16:28:45 -07:00
Sebastian Ott
5db2317999 s390/pci: log changes to uid checking
Some hypervisors allow to toggle uid checking at runtime. Log
changes to uid checking in s390dbf.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-08-09 09:09:41 -04:00
Heiko Carstens
ef267938f0 s390/vmcp: split vmcp header file and move to uapi
Split the vmcp header file and move the device driver internal
structure to the C file, and move the ioctl definitions to the uapi
directory.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-08-09 09:09:36 -04:00
Heiko Carstens
3f4298427a s390/vmcp: make use of contiguous memory allocator
If memory is fragmented it is unlikely that large order memory
allocations succeed. This has been an issue with the vmcp device
driver since a long time, since it requires large physical contiguous
memory ares for large responses.

To hopefully resolve this issue make use of the contiguous memory
allocator (cma). This patch adds a vmcp specific vmcp cma area with a
default size of 4MB. The size can be changed either via the
VMCP_CMA_SIZE config option at compile time or with the "vmcp_cma"
kernel parameter (e.g. "vmcp_cma=16m").

For any vmcp response buffers larger than 16k memory from the cma area
will be allocated. If such an allocation fails, there is a fallback to
the buddy allocator.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-08-09 09:09:35 -04:00
Heiko Carstens
cd4386a931 s390/cpcmd,vmcp: avoid GFP_DMA allocations
According to the CP Programming Services manual Diagnose Code 8
"Virtual Console Function" can be used in all addressing modes. Also
the input and output buffers do not have a limitation which specifies
they need to be below the 2GB line.

This is true at least since z/VM 5.4.

Therefore remove the sam31/64 instructions and allow for simple
GFP_KERNEL allocations. This makes it easier to allocate a 1MB page
if the user requested such a large return buffer.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-08-09 09:09:35 -04:00