620329 Commits

Author SHA1 Message Date
Linus Torvalds
2161a2a644 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fix from Dmitry Torokhov:
 "One small change to make joydev (which is used by older games) to bind
  to devices that export Z axis but not X or Y (such as TRC rudder)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: joydev - recognize devices with Z axis as joysticks
2016-09-30 21:25:09 -07:00
Linus Torvalds
dbd8805b0a Merge branch 'akpm' (patches from Andrew)
Merge more fixes from Andrew Morton:
 "Three fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  include/linux/property.h: fix typo/compile error
  ocfs2: fix deadlock on mmapped page in ocfs2_write_begin_nolock()
  mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page()
2016-09-30 15:51:10 -07:00
John Youn
37aa7271d9 include/linux/property.h: fix typo/compile error
This fixes commit d76eebfa175e ("include/linux/property.h: fix build
issues with gcc-4.4.4").

With that commit we get the following compile error when using the
PROPERTY_ENTRY_INTEGER_ARRAY macro.

 include/linux/property.h:201:39: error: `u32_data' undeclared (first
                 use in this function)
  PROPERTY_ENTRY_INTEGER_ARRAY(_name_, u32, _val_)
                                       ^
 include/linux/property.h:193:17: note: in definition of macro
                 `PROPERTY_ENTRY_INTEGER_ARRAY'
  { .pointer = { _type_##_data = _val_ } },  \
                 ^

This needs a '.' to reference the union member.  It seems this was just
overlooked here since it is done correctly in similar constructs in
other parts of the original commit.

This fix is in preparation of upcoming commits that will use this macro.

Fixes: commit d76eebfa175e ("include/linux/property.h: fix build issues with gcc-4.4.4")
Link: http://lkml.kernel.org/r/2de3b929290d88a723ed829a3e3cbd02044714df.1475114627.git.johnyoun@synopsys.com
Signed-off-by: John Youn <johnyoun@synopsys.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-30 15:26:52 -07:00
Eric Ren
c33f0785bf ocfs2: fix deadlock on mmapped page in ocfs2_write_begin_nolock()
The testcase "mmaptruncate" of ocfs2-test deadlocks occasionally.

In this testcase, we create a 2*CLUSTER_SIZE file and mmap() on it;
there are 2 process repeatedly performing the following operations
respectively: one is doing memset(mmaped_addr + 2*CLUSTER_SIZE - 1, 'a',
1), while the another is playing ftruncate(fd, 2*CLUSTER_SIZE) and then
ftruncate(fd, CLUSTER_SIZE) again and again.

This is the backtrace when the deadlock happens:

   __wait_on_bit_lock+0x50/0xa0
   __lock_page+0xb7/0xc0
   ocfs2_write_begin_nolock+0x163f/0x1790 [ocfs2]
   ocfs2_page_mkwrite+0x1c7/0x2a0 [ocfs2]
   do_page_mkwrite+0x66/0xc0
   handle_mm_fault+0x685/0x1350
   __do_page_fault+0x1d8/0x4d0
   trace_do_page_fault+0x37/0xf0
   do_async_page_fault+0x19/0x70
   async_page_fault+0x28/0x30

In ocfs2_write_begin_nolock(), we first grab the pages and then allocate
disk space for this write; ocfs2_try_to_free_truncate_log() will be
called if -ENOSPC is returned; if we're lucky to get enough clusters,
which is usually the case, we start over again.

But in ocfs2_free_write_ctxt() the target page isn't unlocked, so we
will deadlock when trying to grab the target page again.

Also, -ENOMEM might be returned in ocfs2_grab_pages_for_write().
Another deadlock will happen in __do_page_mkwrite() if
ocfs2_page_mkwrite() returns non-VM_FAULT_LOCKED, and along with a
locked target page.

These two errors fail on the same path, so fix them by unlocking the
target page manually before ocfs2_free_write_ctxt().

Jan Kara helps me clear out the JBD2 part, and suggest the hint for root
cause.

Changes since v1:
1. Also put ENOMEM error case into consideration.

Link: http://lkml.kernel.org/r/1474173902-32075-1-git-send-email-zren@suse.com
Signed-off-by: Eric Ren <zren@suse.com>
Reviewed-by: He Gang <ghe@suse.com>
Acked-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-30 15:26:52 -07:00
Johannes Weiner
22f2ac51b6 mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page()
Antonio reports the following crash when using fuse under memory pressure:

  kernel BUG at /build/linux-a2WvEb/linux-4.4.0/mm/workingset.c:346!
  invalid opcode: 0000 [#1] SMP
  Modules linked in: all of them
  CPU: 2 PID: 63 Comm: kswapd0 Not tainted 4.4.0-36-generic #55-Ubuntu
  Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
  task: ffff88040cae6040 ti: ffff880407488000 task.ti: ffff880407488000
  RIP: shadow_lru_isolate+0x181/0x190
  Call Trace:
    __list_lru_walk_one.isra.3+0x8f/0x130
    list_lru_walk_one+0x23/0x30
    scan_shadow_nodes+0x34/0x50
    shrink_slab.part.40+0x1ed/0x3d0
    shrink_zone+0x2ca/0x2e0
    kswapd+0x51e/0x990
    kthread+0xd8/0xf0
    ret_from_fork+0x3f/0x70

which corresponds to the following sanity check in the shadow node
tracking:

  BUG_ON(node->count & RADIX_TREE_COUNT_MASK);

The workingset code tracks radix tree nodes that exclusively contain
shadow entries of evicted pages in them, and this (somewhat obscure)
line checks whether there are real pages left that would interfere with
reclaim of the radix tree node under memory pressure.

While discussing ways how fuse might sneak pages into the radix tree
past the workingset code, Miklos pointed to replace_page_cache_page(),
and indeed there is a problem there: it properly accounts for the old
page being removed - __delete_from_page_cache() does that - but then
does a raw raw radix_tree_insert(), not accounting for the replacement
page.  Eventually the page count bits in node->count underflow while
leaving the node incorrectly linked to the shadow node LRU.

To address this, make sure replace_page_cache_page() uses the tracked
page insertion code, page_cache_tree_insert().  This fixes the page
accounting and makes sure page-containing nodes are properly unlinked
from the shadow node LRU again.

Also, make the sanity checks a bit less obscure by using the helpers for
checking the number of pages and shadows in a radix tree node.

Fixes: 449dd6984d0e ("mm: keep page cache radix tree nodes in check")
Link: http://lkml.kernel.org/r/20160919155822.29498-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Antonio SJ Musumeci <trapexit@spawn.link>
Debugged-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org>	[3.15+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-30 15:26:52 -07:00
Vineet Gupta
ef25bacbb0 ARC: [plat*] enables MODULE*
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-09-30 14:59:48 -07:00
Vineet Gupta
cd5d38b052 ARCv2: fix local_save_flags
Commit d9676fa152c83b ("ARCv2: Enable LOCKDEP"), changed
local_save_flags() to not return raw STATUS32 but encoded in the form
such that it could be fed directly to CLRI/SETI instructions.
However the STATUS32.E[] was not captured correctly as it corresponds to
bits [4:1] in the register and not [3:0]

Fixes: d9676fa152c83b ("ARCv2: Enable LOCKDEP")
Cc: Evgeny Voevodin <evgeny.voevodin@intel.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-09-30 14:48:25 -07:00
Noam Camus
3528f84f75 ARC: CONFIG_NODES_SHIFT fix default values
Seem like values assigned as absolute number and not and
shift value, i.e. should be 0 for one node (2^0) and 1 for
couple of nodes (2^1)

Signed-off-by: Noam Camus <noamca@mellanox.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-09-30 14:48:24 -07:00
Yuriy Kolerov
bc0c7ece61 ARCv2: intc: Use kflag if STATUS32.IE must be reset
In the end of "arc_init_IRQ" STATUS32.IE flag is going to be affected by
"flag" instruction but "flag" never touches IE flag on ARCv2. So "kflag"
instruction must be used instead of "flag".

Signed-off-by: Yuriy Kolerov <yuriy.kolerov@synopsys.com>
Cc: stable@vger.kernel.org #4.2+
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-09-30 14:48:24 -07:00
Vineet Gupta
99a2ca65d5 ARC: .exit.* sections can be discarded in .eh_frame regime
We used to keep the .exit.* sections as linker would fail in final link
due to references from .debug_frame which itself could not be discardrd
due to the forced "write,alloc" attributes for it.

|   LD      init/built-in.o
| `.exit.text' referenced in section `.debug_frame' of arch/arc/built-in.o: defined in discarded section `.exit.text' of arch/arc/built-in.o
| Makefile:949: recipe for target 'vmlinux' failed

With .debug_frame now retired, this hack is no longer needed.
kernel binary is now a little bit smaller as well.

closes STAR 9000549913

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-09-30 14:48:23 -07:00
Vineet Gupta
86effd0dc6 ARC: dw2 unwind: enable cfi pseudo ops in string lib
This uses a new set of annoations viz. ENTRY_CFI/END_CFI to enabel cfi
ops generation.

Note that we didn't change the normal ENTRY/EXIT as we don't actually
want unwind info in the trap/exception/interrutp handlers which use
these, as unwinder then gets confused (it keeps recursing vs. stopping).
Semantically these are leaf routines and unwinding should stop when it
hits those routines.

Before
------

    28.52%     1.19%          9929  hackbench  libuClibc-1.0.17.so   [.] __write_nocancel
            |
            ---__write_nocancel
               |--8.95%--EV_Trap
               |           --8.25%--sys_write
               |                     |--3.93%--sock_write_iter
     ...
               |--2.62%--memset   <==== [LEAF entry as no unwind info]
                         ^^^^^^

After
-----

    29.46%     1.24%         13622  hackbench  libuClibc-1.0.17.so   [.] __write_nocancel
            |
            ---__write_nocancel
               |--9.31%--EV_Trap
               |           --8.62%--sys_write
               |                     |--4.17%--sock_write_iter
     ...
               |--6.19%--sys_write
               |           --6.19%--sock_write_iter
               |                     unix_stream_sendmsg
               |                     |--1.62%--sock_alloc_send_pskb
               |                     |--0.89%--sock_def_readable
               |                     |--0.88%--_raw_spin_unlock_irqrestore
               |                     |--0.69%--memset
               |                     |         ^^^^^^     <==== [now in proper callframe]
               |                     |
               |                      --0.52%--skb_copy_datagram_from_iter

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-09-30 14:48:22 -07:00
Vineet Gupta
5a205a32ff ARC: dw2 unwind: add infrastructure for adding cfi pseudo ops to asm
1. detect whether binutils supports the cfi pseudo ops
2. define conditional macros to generate the ops
3. define new ENTRY_CFI/END_CFI to annotate hand asm code.
   - Needed because we don't want to emit dwarf info in general ENTRY/END
     used by lowest level trap/exception/interrutp handlers as unwinder
     gets confused trying to unwind out of them. We want unwinder to
     instead stop when it hits onfo those routines
   - These provide minimal start/end cfi ops assuming routine doesn't
     touch stack memory/regs

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-09-30 14:48:22 -07:00
Vineet Gupta
2dad1122d9 ARC: entry: make ret_from_system_call local label
This essentially removes ENTRY() assembler annotation for this symbol
since it didn't have a pairing END()

This in ahead of introducing cfi pseudo ops in ENTRY/END which expects
paired cfi_startproc/cfi_endproc

| ../arch/arc/kernel/entry.S: Assembler messages:
| ../arch/arc/kernel/entry.S:270: Error: previous CFI entry not closed (missing .cfi_endproc)
| ../scripts/Makefile.build:326: recipe for target 'arch/arc/kernel/entry-arcv2.o' failed
| make[4]: *** [arch/arc/kernel/entry-arcv2.o] Error 1

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-09-30 14:48:21 -07:00
Vineet Gupta
2d04864247 ARC: dw2 unwind: don't force dwarf 2
In .debug_frame based unwinding regime, we used to force -gdwarf-2 since
kernel unwinder only claimed to handle dwarf 2. This changed since commit
6d0d506012c93d ("ARC: dw2 unwind: Don't bail for CIE.version != 1")
which added some support beyond dwarf 2, atleast to handle CIE != 1

The ill-effect of -gdwarf-2 is that it forces generation of .debug_*
sections, which bloats loadable modules .ko files. For the curious, this
doesn't affect vmlinx binary since linker script discards .debug_* but
same discard is not yet implemented for modules.

So it seems we can drop the -gdwarf-2 toggle, which should not be needed
anyways given that we now use .eh_frame based unwinding.

I've verified using GNU 2016.09-engo10 that the actual unwind info is
not different with or w/o this toggle - but the debug_* sections are
gone for good.

before
-----
arc-linux-readelf -S q_proc.ko-unwinding-1-eh_frame-switch | grep debug
  [15] .debug_info       PROGBITS        00000000 000300 00d08d 00 	0   0  1
  [16] .rela.debug_info  RELA            00000000 0162a0 008844 0c   I 29  15  4
  [17] .debug_abbrev     PROGBITS        00000000 00d38d 0005f8 00 	0   0  1
  [18] .debug_loc        PROGBITS        00000000 00d985 000070 00 	0   0  1
  [19] .rela.debug_loc   RELA            00000000 01eae4 0000c0 0c   I 29  18  4
  [20] .debug_aranges    PROGBITS        00000000 00d9f5 000040 00 	0   0  1
  [21] .rela.debug_arang RELA            00000000 01eba4 000030 0c   I 29  20  4
  [22] .debug_ranges     PROGBITS        00000000 00da35 000018 00 	0   0  1
  [23] .rela.debug_range RELA            00000000 01ebd4 000030 0c   I 29  22  4
  [24] .debug_line       PROGBITS        00000000 00da4d 000b5b 00 	0   0  1
  [25] .rela.debug_line  RELA            00000000 01ec04 0000cc 0c   I 29  24  4
  [26] .debug_str        PROGBITS        00000000 00e5a8 007831 01   MS 0   0  1

after
----

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-09-30 14:48:21 -07:00
Vineet Gupta
6716dbbdef ARC: dw2 unwind: switch to .eh_frame based unwinding
So finally after almost 8 years of dealing with .debug_frame, we are
finally switching to .eh_frame. The reason being stripped kernel
binaries had non-functional unwinder as .debug_frame was gone.
Also, in general .eh_frame seems more common way of doing unwinding.

This also folds a revert of f52e126cc747 ("ARC: unwind: ensure that
.debug_frame is generated (vs. .eh_frame)") to ensure that we start
getting .eh_frame

Reported-by: Daniel Mentz <danielmentz@google.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-09-30 14:48:20 -07:00
Vineet Gupta
d040876b4a ARC: dw2 unwind: factor CIE specifics for .eh_frame/.debug_frame
This paves way for switching to .eh_frame based unwindiing

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-09-30 14:48:19 -07:00
Vineet Gupta
94f4fb0841 ARC: module: support R_ARC_32_PCREL relocation
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-09-30 14:48:19 -07:00
Alexey Brodkin
e0d5321fac arc: perf: Enable generic "cache-references" and "cache-misses" events
We used to live with PERF_COUNT_HW_CACHE_REFERENCES and
PERF_COUNT_HW_CACHE_REFERENCES not specified on ARC.

Those events are actually aliases to 2 cache events that we do support
and so this change sets "cache-reference" and "cache-misses" events
in the same way as "L1-dcache-loads" and L1-dcache-load-misses.

And while at it adding debug info for cache events as well as doing a
subtle fix in HW events debug info - config value is much better
represented by hex so we may see not only event index but as well other
control bits set (if they exist).

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-09-30 14:48:18 -07:00
Noam Camus
ce0f493240 ARC: [plat-eznps] add missing atomic_fetch_xxx operations
Build brekeage since last changes to generic atomic operations.
Added couple of missing macros which are now mandatory

Signed-off-by: Noam Camus <noamca@mellanox.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-09-30 14:48:18 -07:00
Vineet Gupta
ce6365270e ARCv2: Implement atomic64 based on LLOCKD/SCONDD instructions
ARCv2 ISA provides 64-bit exclusive load/stores so use them to implement
the 64-bit atomics and elide the spinlock based generic 64-bit atomics

boot tested with atomic64 self-test (and GOD bless the person who wrote
them, I realized my inline assmebly is sloppy as hell)

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-09-30 14:48:17 -07:00
Vineet Gupta
26c01c49d5 ARCv2: Support dynamic peripheral address space in HS38 rel 3.0 cores
HS release 3.0 provides for even more flexibility in specifying the
volatile address space for mapping peripherals.

With HS 2.1 @start was made flexible / programmable - with HS 3.0 even
@end can be setup (vs. fixed to 0xFFFF_FFFF before).

So add code to reflect that and while at it remove an unused struct
defintion

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-09-30 14:48:17 -07:00
Vineet Gupta
f507684637 ARCv2: identify HS38 rel 3.0 cores
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-09-30 14:48:16 -07:00
Vineet Gupta
9efac6798b ARCv2: Add support for ZeBu Emulation platform for HS cores
The cool thing is that same kernel image can run on
 - nsim OSCI simulation platform
 - SDPlite FPGA setups

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-09-30 14:48:15 -07:00
Alexey Brodkin
618a9cd06d arc: Add "model" properly in device tree description of all boards
As it was discussed quite some time ago (see
https://lkml.org/lkml/2015/11/5/862) it's a good practice to add
"model" property in .dts. Moreover as per ePAPR "model" property is
required and should look like "manufacturer,model" so we do here.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Jonas Gorski <jonas.gorski@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Rob Herring <robh@kernel.org>
Cc: Christian Ruppert <christian.ruppert@alitech.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-09-30 14:48:15 -07:00
Javi Merino
9a2172a8d5 MAINTAINERS: Switch to kernel.org email address for Javi Merino
Change my email address to my kernel.org account instead of the ARM one.

Signed-off-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-30 10:45:11 -07:00
Mark Brown
2ce0468433 Merge remote-tracking branches 'spi/topic/ti-qspi', 'spi/topic/tools', 'spi/topic/txx9' and 'spi/topic/xlp' into spi-next 2016-09-30 09:14:22 -07:00
Mark Brown
3424ff29a0 Merge remote-tracking branches 'spi/topic/rspi', 'spi/topic/sc18is602', 'spi/topic/sh-msiof', 'spi/topic/spidev-test' and 'spi/topic/st-ssc4' into spi-next 2016-09-30 09:14:18 -07:00
Mark Brown
66b5a337d0 Merge remote-tracking branches 'spi/topic/octeon', 'spi/topic/pic32-sqi', 'spi/topic/pxa2xx' and 'spi/topic/qup' into spi-next 2016-09-30 09:14:14 -07:00
Mark Brown
e2df04ed3b Merge remote-tracking branches 'spi/topic/fsl-espi', 'spi/topic/imx', 'spi/topic/jcore', 'spi/topic/loopback' and 'spi/topic/meson' into spi-next 2016-09-30 09:14:10 -07:00
Mark Brown
a931a189d4 Merge remote-tracking branches 'spi/topic/bcm', 'spi/topic/dw' and 'spi/topic/fsl-dspi' into spi-next 2016-09-30 09:14:08 -07:00
Mark Brown
c5aee51b7a Merge remote-tracking branch 'spi/topic/dma' into spi-next 2016-09-30 09:14:07 -07:00
Mark Brown
1a8dabf88d Merge remote-tracking branch 'spi/topic/core' into spi-next 2016-09-30 09:14:06 -07:00
Mark Brown
07216b5503 Merge remote-tracking branch 'spi/fix/spidev' into spi-linus 2016-09-30 09:14:04 -07:00
Mark Brown
2ed89d577c Merge remote-tracking branch 'regulator/topic/tps80031' into regulator-next 2016-09-30 09:14:02 -07:00
Mark Brown
2dfcb921da Merge remote-tracking branches 'regulator/topic/of', 'regulator/topic/pv88080', 'regulator/topic/rk808', 'regulator/topic/set-voltage' and 'regulator/topic/tps65218' into regulator-next 2016-09-30 09:13:58 -07:00
Mark Brown
81c383c9ba Merge remote-tracking branches 'regulator/topic/bulk', 'regulator/topic/dbx500', 'regulator/topic/hi6421', 'regulator/topic/load' and 'regulator/topic/ltc3676' into regulator-next 2016-09-30 09:13:55 -07:00
Mark Brown
ec09f1c575 Merge remote-tracking branch 'regulator/fix/tps65910' into regulator-linus 2016-09-30 09:13:52 -07:00
Wanpeng Li
2fa5f04f85 x86/entry/64: Fix context tracking state warning when load_gs_index fails
This warning:

 WARNING: CPU: 0 PID: 3331 at arch/x86/entry/common.c:45 enter_from_user_mode+0x32/0x50
 CPU: 0 PID: 3331 Comm: ldt_gdt_64 Not tainted 4.8.0-rc7+ #13
 Call Trace:
  dump_stack+0x99/0xd0
  __warn+0xd1/0xf0
  warn_slowpath_null+0x1d/0x20
  enter_from_user_mode+0x32/0x50
  error_entry+0x6d/0xc0
  ? general_protection+0x12/0x30
  ? native_load_gs_index+0xd/0x20
  ? do_set_thread_area+0x19c/0x1f0
  SyS_set_thread_area+0x24/0x30
  do_int80_syscall_32+0x7c/0x220
  entry_INT80_compat+0x38/0x50

... can be reproduced by running the GS testcase of the ldt_gdt test unit in
the x86 selftests.

do_int80_syscall_32() will call enter_form_user_mode() to convert context
tracking state from user state to kernel state. The load_gs_index() call
can fail with user gsbase, gsbase will be fixed up and proceed if this
happen.

However, enter_from_user_mode() will be called again in the fixed up path
though it is context tracking kernel state currently.

This patch fixes it by just fixing up gsbase and telling lockdep that IRQs
are off once load_gs_index() failed with user gsbase.

Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1475197266-3440-1-git-send-email-wanpeng.li@hotmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-30 13:53:12 +02:00
Andy Lutomirski
05fb3c199b x86/boot: Initialize FPU and X86_FEATURE_ALWAYS even if we don't have CPUID
Otherwise arch_task_struct_size == 0 and we die.  While we're at it,
set X86_FEATURE_ALWAYS, too.

Reported-by: David Saggiorato <david@saggiorato.net>
Tested-by: David Saggiorato <david@saggiorato.net>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Fixes: aaeb5c01c5b ("x86/fpu, sched: Introduce CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT and use it on x86")
Link: http://lkml.kernel.org/r/8de723afbf0811071185039f9088733188b606c9.1475103911.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-30 13:53:04 +02:00
Andy Lutomirski
1ef55be16e x86/asm: Get rid of __read_cr4_safe()
We use __read_cr4() vs __read_cr4_safe() inconsistently.  On
CR4-less CPUs, all CR4 bits are effectively clear, so we can make
the code simpler and more robust by making __read_cr4() always fix
up faults on 32-bit kernels.

This may fix some bugs on old 486-like CPUs, but I don't have any
easy way to test that.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: david@saggiorato.net
Link: http://lkml.kernel.org/r/ea647033d357d9ce2ad2bbde5a631045f5052fb6.1475178370.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-30 12:40:12 +02:00
Thomas Gleixner
d7e25c66c9 Merge branch 'x86/urgent' into x86/asm
Get the cr4 fixes so we can apply the final cleanup
2016-09-30 12:38:28 +02:00
Segher Boessenkool
e4aad64597 x86/vdso: Fix building on big endian host
We need to call GET_LE to read hdr->e_type.

Fixes: 57f90c3dfc75 ("x86/vdso: Error out if the vDSO isn't a valid DSO")
Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: linux-next@vger.kernel.org
Link: http://lkml.kernel.org/r/20160929193442.GA16617@gate.crashing.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-30 12:37:40 +02:00
Andy Lutomirski
192d1dccbf x86/boot: Fix another __read_cr4() case on 486
The condition for reading CR4 was wrong: there are some CPUs with
CPUID but not CR4.  Rather than trying to make the condition exact,
use __read_cr4_safe().

Fixes: 18bc7bd523e0 ("x86/boot: Synchronize trampoline_cr4_features and mmu_cr4_features directly")
Reported-by: david@saggiorato.net
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Link: http://lkml.kernel.org/r/8c453a61c4f44ab6ff43c29780ba04835234d2e5.1475178369.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-30 12:37:40 +02:00
Frederic Weisbecker
447976ef4f sched/irqtime: Consolidate irqtime flushing code
The code performing irqtime nsecs stats flushing to kcpustat is roughly
the same for hardirq and softirq. So lets consolidate that common code.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1474849761-12678-6-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-30 11:46:41 +02:00
Frederic Weisbecker
19d23dbfeb sched/irqtime: Consolidate accounting synchronization with u64_stats API
The irqtime accounting currently implement its own ad hoc implementation
of u64_stats API. Lets rather consolidate it with the appropriate
library.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1474849761-12678-5-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-30 11:46:40 +02:00
Frederic Weisbecker
68107df5f2 u64_stats: Introduce IRQs disabled helpers
Introduce light versions of u64_stats helpers for context where
either preempt or IRQs are disabled. This way we can make this library
usable by scheduler irqtime accounting which currenty implement its
ad-hoc version.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1474849761-12678-4-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-30 11:46:40 +02:00
Frederic Weisbecker
2810f611f9 sched/irqtime: Remove needless IRQs disablement on kcpustat update
The callers of the functions performing irqtime kcpustat updates have
IRQS disabled, no need to disable them again.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1474849761-12678-3-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-30 11:46:39 +02:00
Frederic Weisbecker
f9094a6575 sched/irqtime: No need for preempt-safe accessors
We can safely use the preempt-unsafe accessors for irqtime when we
flush its counters to kcpustat as IRQs are disabled at this time.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1474849761-12678-2-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-30 11:46:38 +02:00
Peter Zijlstra
b60205c7c5 sched/fair: Fix min_vruntime tracking
While going through enqueue/dequeue to review the movement of
set_curr_task() I noticed that the (2nd) update_min_vruntime() call in
dequeue_entity() is suspect.

It turns out, its actually wrong because it will consider
cfs_rq->curr, which could be the entry we just normalized. This mixes
different vruntime forms and leads to fail.

The purpose of the second update_min_vruntime() is to move
min_vruntime forward if the entity we just removed is the one that was
holding it back; _except_ for the DEQUEUE_SAVE case, because then we
know its a temporary removal and it will come back.

However, since we do put_prev_task() _after_ dequeue(), cfs_rq->curr
will still be set (and per the above, can be tranformed into a
different unit), so update_min_vruntime() should also consider
curr->on_rq. This also fixes another corner case where the enqueue
(which also does update_curr()->update_min_vruntime()) happens on the
rq->lock break in schedule(), between dequeue and put_prev_task.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Fixes: 1e876231785d ("sched: Fix ->min_vruntime calculation in dequeue_entity()")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-30 11:03:29 +02:00
Peter Zijlstra
9148a3a10e sched/debug: Add SCHED_WARN_ON()
Provide SCHED_WARN_ON as wrapper for WARN_ON_ONCE() to avoid
CONFIG_SCHED_DEBUG wrappery.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-30 11:03:29 +02:00