linux/arch
Kirill A. Shutemov 09ef493985 x86: add missed pgtable_pmd_page_ctor/dtor calls for preallocated pmds
In the split page table lock case, we embed spinlock_t into struct page.
For obvious reasons, we don't want to increase the size of struct page if
spinlock_t is too big, as with DEBUG_SPINLOCK or DEBUG_LOCK_ALLOC or
on the -rt kernel.  So we disable the split page table lock if spinlock_t
is too big.

This patchset allows the lock to be allocated dynamically if spinlock_t is
big.  In this case page->ptl stores a pointer to the spinlock instead of
the spinlock itself.  It costs an additional cache line for the indirect
access, but fixes page fault scalability for multi-threaded applications.
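
The indirection described above can be modelled in userspace.  This is only
a rough sketch, not the kernel code: struct page_sketch, struct
big_spinlock_sketch and every *_sketch function are hypothetical names, and
a C11 atomic_flag stands in for spinlock_t.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical "big" lock: the padding models extra debug state. */
struct big_spinlock_sketch {
	atomic_flag locked;
	char debug_state[64];
};

/* Hypothetical stand-in for struct page: only the lock slot is modelled. */
struct page_sketch {
	struct big_spinlock_sketch *ptl;	/* pointer to the lock, not the lock */
};

/* Allocate and initialise the lock; returns false if allocation fails. */
static bool ptl_alloc_sketch(struct page_sketch *page)
{
	struct big_spinlock_sketch *lock = malloc(sizeof(*lock));

	if (!lock)
		return false;
	atomic_flag_clear(&lock->locked);
	page->ptl = lock;
	return true;
}

static void ptl_free_sketch(struct page_sketch *page)
{
	free(page->ptl);
	page->ptl = NULL;
}

/* Each lock operation pays one extra pointer dereference. */
static void ptl_lock_sketch(struct page_sketch *page)
{
	assert(page->ptl != NULL);	/* the lock must have been allocated */
	while (atomic_flag_test_and_set_explicit(&page->ptl->locked,
						 memory_order_acquire))
		;	/* spin until the holder releases the lock */
}

static void ptl_unlock_sketch(struct page_sketch *page)
{
	atomic_flag_clear_explicit(&page->ptl->locked, memory_order_release);
}
```

The page itself grows by one pointer no matter how large the lock becomes.
The price is the extra dereference on every lock and unlock, and an
allocation that can fail.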

LOCK_STAT depends on DEBUG_SPINLOCK, so on the current kernel enabling
LOCK_STAT to analyse scalability issues breaks scalability.  ;)

The patchset mostly fixes this.  Results for ./thp_memscale -c 80 -b 512M
on a 4-socket machine:

baseline, no CONFIG_LOCK_STAT:	9.115460703 seconds time elapsed
baseline, CONFIG_LOCK_STAT=y:	53.890567123 seconds time elapsed
patched, no CONFIG_LOCK_STAT:	8.852250368 seconds time elapsed
patched, CONFIG_LOCK_STAT=y:	11.069770759 seconds time elapsed

The patch count is scary, but most of the patches are trivial.  Overview:

 Patches 1-4	A few bug fixes. No dependencies on other patches.
		Probably should be applied as soon as possible.

 Patch 5	Changes signature of pgtable_page_ctor(). We will use it
		for dynamic lock allocation, so it can fail.

 Patches 6-8	Add missing constructor/destructor calls on a few archs.
		This fixes NR_PAGETABLE accounting and prepares for using
		split ptl.

 Patches 9-33	Add pgtable_page_ctor() failure handling to all archs.

 Patch 34	Finally adds support for dynamically-allocated page->ptl.
		Also contains documentation for split page table lock.
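
Once the constructor allocates the lock, it can fail, and every caller has
to undo its own page allocation in that case.  A minimal userspace sketch
of that pattern follows.  It assumes a hypothetical struct pt_page_sketch
and *_sketch helpers; the fail_next_lock_alloc_sketch knob exists only to
simulate an allocation failure and is not part of any kernel interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical page table page: only the lock pointer is modelled. */
struct pt_page_sketch {
	void *ptl;
};

/* Test knob: makes the next constructor call fail. */
static bool fail_next_lock_alloc_sketch;

/* Constructor that reports failure instead of returning void. */
static bool pt_page_ctor_sketch(struct pt_page_sketch *page)
{
	if (fail_next_lock_alloc_sketch) {
		fail_next_lock_alloc_sketch = false;
		return false;
	}
	page->ptl = malloc(64);	/* stands in for the lock allocation */
	return page->ptl != NULL;
}

static void pt_page_dtor_sketch(struct pt_page_sketch *page)
{
	assert(page->ptl != NULL);	/* the constructor must have run */
	free(page->ptl);
	page->ptl = NULL;
}

/* Caller side: on constructor failure, release the page and bail out. */
static struct pt_page_sketch *pt_page_alloc_sketch(void)
{
	struct pt_page_sketch *page = calloc(1, sizeof(*page));

	if (!page)
		return NULL;
	if (!pt_page_ctor_sketch(page)) {
		free(page);
		return NULL;
	}
	return page;
}

static void pt_page_free_sketch(struct pt_page_sketch *page)
{
	pt_page_dtor_sketch(page);
	free(page);
}
```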

This patch (of 34):

I missed that we preallocate a few pmds in pgd_alloc() if X86_PAE is
enabled.  Let's add the missed constructor/destructor calls.

I didn't notice it during testing since prep_new_page() clears
page->mapping and therefore page->ptl.  That's effectively equivalent to
spin_lock_init(&page->ptl).
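
For illustration, here is a userspace sketch of pairing constructor and
destructor calls across a batch of preallocated tables, including the
unwind when one constructor fails partway through.  It is a model under
stated assumptions, not the x86 code: NR_PREALLOC_SKETCH, struct
pmd_page_sketch and all *_sketch functions and counters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NR_PREALLOC_SKETCH 4	/* hypothetical batch size */

struct pmd_page_sketch {
	bool constructed;
};

static int live_ctors_sketch;		/* constructed and not yet destroyed */
static int ctor_calls_sketch;		/* total constructor calls so far */
static int fail_ctor_at_sketch = -1;	/* test knob: call index that fails */

static bool pmd_ctor_sketch(struct pmd_page_sketch *page)
{
	if (ctor_calls_sketch++ == fail_ctor_at_sketch)
		return false;
	page->constructed = true;
	live_ctors_sketch++;
	return true;
}

static void pmd_dtor_sketch(struct pmd_page_sketch *page)
{
	assert(page->constructed);	/* every dtor must match a ctor */
	page->constructed = false;
	live_ctors_sketch--;
}

/* Destroy and free every table that was successfully set up. */
static void free_pmds_sketch(struct pmd_page_sketch *pmds[])
{
	for (int i = 0; i < NR_PREALLOC_SKETCH; i++) {
		if (!pmds[i])
			continue;
		pmd_dtor_sketch(pmds[i]);
		free(pmds[i]);
		pmds[i] = NULL;
	}
}

/* Returns 0 on success, -1 if any allocation or constructor failed. */
static int preallocate_pmds_sketch(struct pmd_page_sketch *pmds[])
{
	bool failed = false;

	for (int i = 0; i < NR_PREALLOC_SKETCH; i++) {
		struct pmd_page_sketch *pmd = calloc(1, sizeof(*pmd));

		if (!pmd)
			failed = true;
		if (pmd && !pmd_ctor_sketch(pmd)) {
			free(pmd);	/* never constructed, so no dtor */
			pmd = NULL;
			failed = true;
		}
		pmds[i] = pmd;
	}
	if (failed) {
		free_pmds_sketch(pmds);
		return -1;
	}
	return 0;
}
```

Recording NULL for a slot whose constructor failed lets the cleanup loop
skip it, so a destructor only ever runs on a table that was constructed.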

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Howells <dhowells@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15 09:32:15 +09:00
..
alpha Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2013-11-13 17:40:34 +09:00
arc DeviceTree updates for 3.13. This is a bit larger pull request than 2013-11-12 16:52:17 +09:00
arm mm: rename USE_SPLIT_PTLOCKS to USE_SPLIT_PTE_PTLOCKS 2013-11-15 09:32:14 +09:00
arm64 Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm 2013-11-14 08:51:29 +09:00
avr32 fbdev changes for 3.13 2013-11-14 14:44:20 +09:00
blackfin ACPI and power management updates for 3.13-rc1 2013-11-14 13:41:48 +09:00
c6x DeviceTree updates for 3.13. This is a bit larger pull request than 2013-11-12 16:52:17 +09:00
cris PCI changes for the v3.13 merge window: 2013-11-14 14:02:00 +09:00
frv PCI changes for the v3.13 merge window: 2013-11-14 14:02:00 +09:00
hexagon DeviceTree updates for 3.13. This is a bit larger pull request than 2013-11-12 16:52:17 +09:00
ia64 ACPI and power management updates for 3.13-rc1 2013-11-14 13:41:48 +09:00
m32r Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2013-11-13 17:40:34 +09:00
m68k Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-11-12 10:20:12 +09:00
metag mm: use pgdat_end_pfn() to simplify the code in arch 2013-11-13 12:09:03 +09:00
microblaze mm/arch: use __free_reserved_page() to simplify the code 2013-11-13 12:09:03 +09:00
mips Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2013-11-13 17:40:34 +09:00
mn10300 PCI changes for the v3.13 merge window: 2013-11-14 14:02:00 +09:00
openrisc DeviceTree updates for 3.13. This is a bit larger pull request than 2013-11-12 16:52:17 +09:00
parisc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2013-11-13 17:40:34 +09:00
powerpc PCI changes for the v3.13 merge window: 2013-11-14 14:02:00 +09:00
s390 mm, thp: do not access mm->pmd_huge_pte directly 2013-11-15 09:32:14 +09:00
score Linux 3.12-rc4 2013-10-09 12:36:13 +02:00
sh sh: move fpu_counter into ARCH specific thread_struct 2013-11-13 12:09:13 +09:00
sparc mm, thp: do not access mm->pmd_huge_pte directly 2013-11-15 09:32:14 +09:00
tile PCI changes for the v3.13 merge window: 2013-11-14 14:02:00 +09:00
um Merge branch 'linus' into sched/core 2013-11-01 08:24:41 +01:00
unicore32 sched, arch: Create asm/preempt.h 2013-09-25 14:07:50 +02:00
x86 x86: add missed pgtable_pmd_page_ctor/dtor calls for preallocated pmds 2013-11-15 09:32:15 +09:00
xtensa Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2013-11-13 17:40:34 +09:00
.gitignore
Kconfig Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-11-12 10:36:00 +09:00