640078 Commits

Author SHA1 Message Date
Arnd Bergmann
affd159b23 isofs: fix timestamps beyond 2027
commit 34be4dbf87fc3e474a842305394534216d428f5d upstream.

isofs uses a 'char' variable to load the number of years since
1900 for an inode timestamp. On architectures that use a signed
char type by default, this results in an invalid date for
anything beyond 2027.

This changes the function argument to a 'u8' array, which
is defined the same way on all architectures, and unambiguously
lets us use years until 2155.

This should be backported to all kernels that might still be
in use by that date.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:04 +00:00
Coly Li
770e10817e bcache: check ca->alloc_thread initialized before wake up it
commit 91af8300d9c1d7c6b6a2fd754109e08d4798b8d8 upstream.

In bcache code, sysfs entries are created before all resources get
allocated, e.g. allocation thread of a cache set.

There is posibility for NULL pointer deference if a resource is accessed
but which is not initialized yet. Indeed Jorg Bornschein catches one on
cache set allocation thread and gets a kernel oops.

The reason for this bug is, when bch_bucket_alloc() is called during
cache set registration and attaching, ca->alloc_thread is not properly
allocated and initialized yet, call wake_up_process() on ca->alloc_thread
triggers NULL pointer deference failure. A simple and fast fix is, before
waking up ca->alloc_thread, checking whether it is allocated, and only
wake up ca->alloc_thread when it is not NULL.

Signed-off-by: Coly Li <colyli@suse.de>
Reported-by: Jorg Bornschein <jb@capsec.org>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:03 +00:00
Eric Biggers
a1e25420a4 libceph: don't WARN() if user tries to add invalid key
commit b11270853fa3654f08d4a6a03b23ddb220512d8d upstream.

The WARN_ON(!key->len) in set_secret() in net/ceph/crypto.c is hit if a
user tries to add a key of type "ceph" with an invalid payload as
follows (assuming CONFIG_CEPH_LIB=y):

    echo -e -n '\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' \
	| keyctl padd ceph desc @s

This can be hit by fuzzers.  As this is merely bad input and not a
kernel bug, replace the WARN_ON() with return -EINVAL.

Fixes: 7af3ea189a9a ("libceph: stop allocating a new cipher on every crypto request")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:03 +00:00
Dan Carpenter
7d00fdbc49 eCryptfs: use after free in ecryptfs_release_messaging()
commit db86be3a12d0b6e5c5b51c2ab2a48f06329cb590 upstream.

We're freeing the list iterator so we should be using the _safe()
version of hlist_for_each_entry().

Fixes: 88b4a07e6610 ("[PATCH] eCryptfs: Public key transport mechanism")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:03 +00:00
Andreas Rohner
45a99f1f78 nilfs2: fix race condition that causes file system corruption
commit 31ccb1f7ba3cfe29631587d451cf5bb8ab593550 upstream.

There is a race condition between nilfs_dirty_inode() and
nilfs_set_file_dirty().

When a file is opened, nilfs_dirty_inode() is called to update the
access timestamp in the inode.  It calls __nilfs_mark_inode_dirty() in a
separate transaction.  __nilfs_mark_inode_dirty() caches the ifile
buffer_head in the i_bh field of the inode info structure and marks it
as dirty.

After some data was written to the file in another transaction, the
function nilfs_set_file_dirty() is called, which adds the inode to the
ns_dirty_files list.

Then the segment construction calls nilfs_segctor_collect_dirty_files(),
which goes through the ns_dirty_files list and checks the i_bh field.
If there is a cached buffer_head in i_bh it is not marked as dirty
again.

Since nilfs_dirty_inode() and nilfs_set_file_dirty() use separate
transactions, it is possible that a segment construction that writes out
the ifile occurs in-between the two.  If this happens the inode is not
on the ns_dirty_files list, but its ifile block is still marked as dirty
and written out.

In the next segment construction, the data for the file is written out
and nilfs_bmap_propagate() updates the b-tree.  Eventually the bmap root
is written into the i_bh block, which is not dirty, because it was
written out in another segment construction.

As a result the bmap update can be lost, which leads to file system
corruption.  Either the virtual block address points to an unallocated
DAT block, or the DAT entry will be reused for something different.

The error can remain undetected for a long time.  A typical error
message would be one of the "bad btree" errors or a warning that a DAT
entry could not be found.

This bug can be reproduced reliably by a simple benchmark that creates
and overwrites millions of 4k files.

Link: http://lkml.kernel.org/r/1509367935-3086-2-git-send-email-konishi.ryusuke@lab.ntt.co.jp
Signed-off-by: Andreas Rohner <andreas.rohner@gmx.net>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: Andreas Rohner <andreas.rohner@gmx.net>
Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:03 +00:00
NeilBrown
ab96d9c222 autofs: don't fail mount for transient error
commit ecc0c469f27765ed1e2b967be0aa17cee1a60b76 upstream.

Currently if the autofs kernel module gets an error when writing to the
pipe which links to the daemon, then it marks the whole moutpoint as
catatonic, and it will stop working.

It is possible that the error is transient.  This can happen if the
daemon is slow and more than 16 requests queue up.  If a subsequent
process tries to queue a request, and is then signalled, the write to
the pipe will return -ERESTARTSYS and autofs will take that as total
failure.

So change the code to assess -ERESTARTSYS and -ENOMEM as transient
failures which only abort the current request, not the whole mountpoint.

It isn't a crash or a data corruption, but having autofs mountpoints
suddenly stop working is rather inconvenient.

Ian said:

: And given the problems with a half dozen (or so) user space applications
: consuming large amounts of CPU under heavy mount and umount activity this
: could happen more easily than we expect.

Link: http://lkml.kernel.org/r/87y3norvgp.fsf@notabene.neil.brown.name
Signed-off-by: NeilBrown <neilb@suse.com>
Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:03 +00:00
Stanislaw Gruszka
f2c97052dc rt2x00usb: mark device removed when get ENOENT usb error
commit bfa62a52cad93686bb8d8171ea5288813248a7c6 upstream.

ENOENT usb error mean "specified interface or endpoint does not exist or
is not enabled". Mark device not present when we encounter this error
similar like we do with ENODEV error.

Otherwise we can have infinite loop in rt2x00usb_work_rxdone(), because
we remove and put again RX entries to the queue infinitely.

We can have similar situation when submit urb will fail all the time
with other error, so we need consider to limit number of entries
processed by rxdone work. But for now, since the patch fixes
reproducible soft lockup issue on single processor systems
and taken ENOENT error meaning, let apply this fix.

Patch adds additional ENOENT check not only in rx kick routine, but
also on other places where we check for ENODEV error.

Reported-by: Richard Genoud <richard.genoud@gmail.com>
Debugged-by: Richard Genoud <richard.genoud@gmail.com>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Tested-by: Richard Genoud <richard.genoud@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:03 +00:00
Mirko Parthey
3f6e2910c4 MIPS: BCM47XX: Fix LED inversion for WRT54GSv1
commit 56a46acf62af5ba44fca2f3f1c7c25a2d5385b19 upstream.

The WLAN LED on the Linksys WRT54GSv1 is active low, but the software
treats it as active high. Fix the inverted logic.

Fixes: 7bb26b169116 ("MIPS: BCM47xx: Fix LEDs on WRT54GS V1.0")
Signed-off-by: Mirko Parthey <mirko.parthey@web.de>
Looks-ok-by: Rafał Miłecki <zajec5@gmail.com>
Cc: Hauke Mehrtens <hauke@hauke-m.de>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16071/
Signed-off-by: James Hogan <jhogan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:03 +00:00
Maciej W. Rozycki
219f386eae MIPS: Fix an n32 core file generation regset support regression
commit 547da673173de51f73887377eb275304775064ad upstream.

Fix a commit 7aeb753b5353 ("MIPS: Implement task_user_regset_view.")
regression, then activated by commit 6a9c001b7ec3 ("MIPS: Switch ELF
core dumper to use regsets.)", that caused n32 processes to dump o32
core files by failing to set the EF_MIPS_ABI2 flag in the ELF core file
header's `e_flags' member:

$ file tls-core
tls-core: ELF 32-bit MSB executable, MIPS, N32 MIPS64 rel2 version 1 (SYSV), [...]
$ ./tls-core
Aborted (core dumped)
$ file core
core: ELF 32-bit MSB core file MIPS, MIPS-I version 1 (SYSV), SVR4-style
$

Previously the flag was set as the result of a:

statement placed in arch/mips/kernel/binfmt_elfn32.c, however in the
regset case, i.e. when CORE_DUMP_USE_REGSET is set, ELF_CORE_EFLAGS is
no longer used by `fill_note_info' in fs/binfmt_elf.c, and instead the
`->e_flags' member of the regset view chosen is.  We have the views
defined in arch/mips/kernel/ptrace.c, however only an o32 and an n64
one, and the latter is used for n32 as well.  Consequently an o32 core
file is incorrectly dumped from n32 processes (the ELF32 vs ELF64 class
is chosen elsewhere, and the 32-bit one is correctly selected for n32).

Correct the issue then by defining an n32 regset view and using it as
appropriate.  Issue discovered in GDB testing.

Fixes: 7aeb753b5353 ("MIPS: Implement task_user_regset_view.")
Signed-off-by: Maciej W. Rozycki <macro@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Djordje Todorovic <djordje.todorovic@rt-rk.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/17617/
Signed-off-by: James Hogan <jhogan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:03 +00:00
Masahiro Yamada
3a63c9b9ef MIPS: dts: remove bogus bcm96358nb4ser.dtb from dtb-y entry
commit 3cad14d56adbf7d621fc5a35db42f3acc0a2d6e8 upstream.

arch/mips/boot/dts/brcm/bcm96358nb4ser.dts does not exist, so
we cannot build bcm96358nb4ser.dtb .

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Fixes: 695835511f96 ("MIPS: BMIPS: rename bcm96358nb4ser to bcm6358-neufbox4-sercom")
Acked-by: James Hogan <jhogan@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:03 +00:00
James Hogan
e92114860f MIPS: Fix odd fp register warnings with MIPS64r2
commit c7fd89a6407ea3a44a2a2fa12d290162c42499c4 upstream.

Building 32-bit MIPS64r2 kernels produces warnings like the following
on certain toolchains (such as GNU assembler 2.24.90, but not GNU
assembler 2.28.51) since commit 22b8ba765a72 ("MIPS: Fix MIPS64 FP
save/restore on 32-bit kernels"), due to the exposure of fpu_save_16odd
from fpu_save_double and fpu_restore_16odd from fpu_restore_double:

arch/mips/kernel/r4k_fpu.S:47: Warning: float register should be even, was 1
...
arch/mips/kernel/r4k_fpu.S:59: Warning: float register should be even, was 1
...

This appears to be because .set mips64r2 does not change the FPU ABI to
64-bit when -march=mips64r2 (or e.g. -march=xlp) is provided on the
command line on that toolchain, from the default FPU ABI of 32-bit due
to the -mabi=32. This makes access to the odd FPU registers invalid.

Fix by explicitly changing the FPU ABI with .set fp=64 directives in
fpu_save_16odd and fpu_restore_16odd, and moving the undefine of fp up
in asmmacro.h so fp doesn't turn into $30.

Fixes: 22b8ba765a72 ("MIPS: Fix MIPS64 FP save/restore on 32-bit kernels")
Signed-off-by: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/17656/
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:02 +00:00
Hou Tao
1cd9686e0a dm: fix race between dm_get_from_kobject() and __dm_destroy()
commit b9a41d21dceadf8104812626ef85dc56ee8a60ed upstream.

The following BUG_ON was hit when testing repeat creation and removal of
DM devices:

    kernel BUG at drivers/md/dm.c:2919!
    CPU: 7 PID: 750 Comm: systemd-udevd Not tainted 4.1.44
    Call Trace:
     [<ffffffff81649e8b>] dm_get_from_kobject+0x34/0x3a
     [<ffffffff81650ef1>] dm_attr_show+0x2b/0x5e
     [<ffffffff817b46d1>] ? mutex_lock+0x26/0x44
     [<ffffffff811df7f5>] sysfs_kf_seq_show+0x83/0xcf
     [<ffffffff811de257>] kernfs_seq_show+0x23/0x25
     [<ffffffff81199118>] seq_read+0x16f/0x325
     [<ffffffff811de994>] kernfs_fop_read+0x3a/0x13f
     [<ffffffff8117b625>] __vfs_read+0x26/0x9d
     [<ffffffff8130eb59>] ? security_file_permission+0x3c/0x44
     [<ffffffff8117bdb8>] ? rw_verify_area+0x83/0xd9
     [<ffffffff8117be9d>] vfs_read+0x8f/0xcf
     [<ffffffff81193e34>] ? __fdget_pos+0x12/0x41
     [<ffffffff8117c686>] SyS_read+0x4b/0x76
     [<ffffffff817b606e>] system_call_fastpath+0x12/0x71

The bug can be easily triggered, if an extra delay (e.g. 10ms) is added
between the test of DMF_FREEING & DMF_DELETING and dm_get() in
dm_get_from_kobject().

To fix it, we need to ensure the test of DMF_FREEING & DMF_DELETING and
dm_get() are done in an atomic way, so _minor_lock is used.

The other callers of dm_get() have also been checked to be OK: some
callers invoke dm_get() under _minor_lock, some callers invoke it under
_hash_lock, and dm_start_request() invoke it after increasing
md->open_count.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:02 +00:00
John Crispin
822c308a15 MIPS: pci: Remove KERN_WARN instance inside the mt7620 driver
commit 8593b18ad348733b5d5ddfa0c79dcabf51dff308 upstream.

Switch the printk() call to the prefered pr_warn() api.

Fixes: 7e5873d3755c ("MIPS: pci: Add MT7620a PCIE driver")
Signed-off-by: John Crispin <john@phrozen.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15321/
Signed-off-by: James Hogan <jhogan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:02 +00:00
Mikulas Patocka
67246fb944 dm: allocate struct mapped_device with kvzalloc
commit 856eb0916d181da6d043cc33e03f54d5c5bbe54a upstream.

The structure srcu_struct can be very big, its size is proportional to the
value CONFIG_NR_CPUS. The Fedora kernel has CONFIG_NR_CPUS 8192, the field
io_barrier in the struct mapped_device has 84kB in the debugging kernel
and 50kB in the non-debugging kernel. The large size may result in failure
of the function kzalloc_node.

In order to avoid the allocation failure, we use the function
kvzalloc_node, this function falls back to vmalloc if a large contiguous
chunk of memory is not available. This patch also moves the field
io_barrier to the last position of struct mapped_device - the reason is
that on many processor architectures, short memory offsets result in
smaller code than long memory offsets - on x86-64 it reduces code size by
320 bytes.

Note to stable kernel maintainers - the kernels 4.11 and older don't have
the function kvzalloc_node, you can use the function vzalloc_node instead.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:02 +00:00
Eric Biggers
6609a3cdfc dm bufio: fix integer overflow when limiting maximum cache size
commit 74d4108d9e681dbbe4a2940ed8fdff1f6868184c upstream.

The default max_cache_size_bytes for dm-bufio is meant to be the lesser
of 25% of the size of the vmalloc area and 2% of the size of lowmem.
However, on 32-bit systems the intermediate result in the expression

    (VMALLOC_END - VMALLOC_START) * DM_BUFIO_VMALLOC_PERCENT / 100

overflows, causing the wrong result to be computed.  For example, on a
32-bit system where the vmalloc area is 520093696 bytes, the result is
1174405 rather than the expected 130023424, which makes the maximum
cache size much too small (far less than 2% of lowmem).  This causes
severe performance problems for dm-verity users on affected systems.

Fix this by using mult_frac() to correctly multiply by a percentage.  Do
this for all places in dm-bufio that multiply by a percentage.  Also
replace (VMALLOC_END - VMALLOC_START) with VMALLOC_TOTAL, which contrary
to the comment is now defined in include/linux/vmalloc.h.

Depends-on: 9993bc635 ("sched/x86: Fix overflow in cyc2ns_offset")
Fixes: 95d402f057f2 ("dm: add bufio")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:02 +00:00
Vijendar Mukunda
b0c6e0e936 ALSA: hda: Add Raven PCI ID
commit 9ceace3c9c18c67676e75141032a65a8e01f9a7a upstream.

This commit adds PCI ID for Raven platform

Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:02 +00:00
Vadim Lomovtsev
3194d87568 PCI: Set Cavium ACS capability quirk flags to assert RR/CR/SV/UF
commit 7f342678634f16795892677204366e835e450dda upstream.

The Cavium ThunderX (CN8XXX) family of PCIe Root Ports does not advertise
an ACS capability.  However, the RTL internally implements similar
protection as if ACS had Request Redirection, Completion Redirection,
Source Validation, and Upstream Forwarding features enabled.

Change Cavium ACS capabilities quirk flags accordingly.

Fixes: b404bcfbf035 ("PCI: Add ACS quirk for all Cavium devices")
Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com>
[bhelgaas: tidy changelog, comment, stable tag]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:02 +00:00
Mathias Kresin
49ad11b022 MIPS: ralink: Fix typo in mt7628 pinmux function
commit 05a67cc258e75ac9758e6f13d26337b8be51162a upstream.

There is a typo inside the pinmux setup code. The function is called
refclk and not reclk.

Fixes: 53263a1c6852 ("MIPS: ralink: add mt7628an support")
Signed-off-by: Mathias Kresin <dev@kresin.me>
Acked-by: John Crispin <john@phrozen.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16047/
Signed-off-by: James Hogan <jhogan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:02 +00:00
Mathias Kresin
351ece3a77 MIPS: ralink: Fix MT7628 pinmux
commit 8ef4b43cd3794d63052d85898e42424fd3b14d24 upstream.

According to the datasheet the REFCLK pin is shared with GPIO#37 and
the PERST pin is shared with GPIO#36.

Fixes: 53263a1c6852 ("MIPS: ralink: add mt7628an support")
Signed-off-by: Mathias Kresin <dev@kresin.me>
Acked-by: John Crispin <john@phrozen.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16046/
Signed-off-by: James Hogan <jhogan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:02 +00:00
Philip Derrin
d72cdeaf1c ARM: 8721/1: mm: dump: check hardware RO bit for LPAE
commit 3b0c0c922ff4be275a8beb87ce5657d16f355b54 upstream.

When CONFIG_ARM_LPAE is set, the PMD dump relies on the software
read-only bit to determine whether a page is writable. This
concealed a bug which left the kernel text section writable
(AP2=0) while marked read-only in the software bit.

In a kernel with the AP2 bug, the dump looks like this:

    ---[ Kernel Mapping ]---
    0xc0000000-0xc0200000           2M RW NX SHD
    0xc0200000-0xc0600000           4M ro x  SHD
    0xc0600000-0xc0800000           2M ro NX SHD
    0xc0800000-0xc4800000          64M RW NX SHD

The fix is to check that the software and hardware bits are both
set before displaying "ro". The dump then shows the true perms:

    ---[ Kernel Mapping ]---
    0xc0000000-0xc0200000           2M RW NX SHD
    0xc0200000-0xc0600000           4M RW x  SHD
    0xc0600000-0xc0800000           2M RW NX SHD
    0xc0800000-0xc4800000          64M RW NX SHD

Fixes: ded947798469 ("ARM: 8109/1: mm: Modify pte_write and pmd_write logic for LPAE")
Signed-off-by: Philip Derrin <philip@cog.systems>
Tested-by: Neil Dick <neil@cog.systems>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:02 +00:00
Philip Derrin
eb5ede8b71 ARM: 8722/1: mm: make STRICT_KERNEL_RWX effective for LPAE
commit 400eeffaffc7232c0ae1134fe04e14ae4fb48d8c upstream.

Currently, for ARM kernels with CONFIG_ARM_LPAE and
CONFIG_STRICT_KERNEL_RWX enabled, the 2MiB pages mapping the
kernel code and rodata are writable. They are marked read-only in
a software bit (L_PMD_SECT_RDONLY) but the hardware read-only bit
is not set (PMD_SECT_AP2).

For user mappings, the logic that propagates the software bit
to the hardware bit is in set_pmd_at(); but for the kernel,
section_update() writes the PMDs directly, skipping this logic.

The fix is to set PMD_SECT_AP2 for read-only sections in
section_update(), at the same time as L_PMD_SECT_RDONLY.

Fixes: 1e3479225acb ("ARM: 8275/1: mm: fix PMD_SECT_RDONLY undeclared compile error")
Signed-off-by: Philip Derrin <philip@cog.systems>
Reported-by: Neil Dick <neil@cog.systems>
Tested-by: Neil Dick <neil@cog.systems>
Tested-by: Laura Abbott <labbott@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:01 +00:00
Catalin Marinas
e5380004ee arm64: Implement arch-specific pte_access_permitted()
commit 6218f96c58dbf44a06aeaf767aab1f54fc397838 upstream.

The generic pte_access_permitted() implementation only checks for
pte_present() (together with the write permission where applicable).
However, for both kernel ptes and PROT_NONE mappings pte_present() also
returns true on arm64 even though such mappings are not user accessible.
Additionally, arm64 now supports execute-only user permission
(PROT_EXEC) which is implemented by clearing the PTE_USER bit.

With this patch the arm64 implementation of pte_access_permitted()
checks for the PTE_VALID and PTE_USER bits together with writable access
if applicable.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:01 +00:00
Andy Lutomirski
0d794d0d01 x86/entry/64: Add missing irqflags tracing to native_load_gs_index()
commit ca37e57bbe0cf1455ea3e84eb89ed04a132d59e1 upstream.

Running this code with IRQs enabled (where dummy_lock is a spinlock):

static void check_load_gs_index(void)
{
	/* This will fail. */
	load_gs_index(0xffff);

	spin_lock(&dummy_lock);
	spin_unlock(&dummy_lock);
}

Will generate a lockdep warning.  The issue is that the actual write
to %gs would cause an exception with IRQs disabled, and the exception
handler would, as an inadvertent side effect, update irqflag tracing
to reflect the IRQs-off status.  native_load_gs_index() would then
turn IRQs back on and return with irqflag tracing still thinking that
IRQs were off.  The dummy lock-and-unlock causes lockdep to notice the
error and warn.

Fix it by adding the missing tracing.

Apparently nothing did this in a context where it mattered.  I haven't
tried to find a code path that would actually exhibit the warning if
appropriately nasty user code were running.

I suspect that the security impact of this bug is very, very low --
production systems don't run with lockdep enabled, and the warning is
mostly harmless anyway.

Found during a quick audit of the entry code to try to track down an
unrelated bug that Ingo found in some still-in-development code.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/e1aeb0e6ba8dd430ec36c8a35e63b429698b4132.1511411918.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:01 +00:00
Masami Hiramatsu
2816c0455c x86/decoder: Add new TEST instruction pattern
commit 12a78d43de767eaf8fb272facb7a7b6f2dc6a9df upstream.

The kbuild test robot reported this build warning:

  Warning: arch/x86/tools/test_get_len found difference at <jump_table>:ffffffff8103dd2c

  Warning: ffffffff8103dd82: f6 09 d8 testb $0xd8,(%rcx)
  Warning: objdump says 3 bytes, but insn_get_length() says 2
  Warning: decoded and checked 1569014 instructions with 1 warnings

This sequence seems to be a new instruction not in the opcode map in the Intel SDM.

The instruction sequence is "F6 09 d8", means Group3(F6), MOD(00)REG(001)RM(001), and 0xd8.
Intel SDM vol2 A.4 Table A-6 said the table index in the group is "Encoding of Bits 5,4,3 of
the ModR/M Byte (bits 2,1,0 in parenthesis)"

In that table, opcodes listed by the index REG bits as:

  000         001       010 011  100        101        110         111
 TEST Ib/Iz,(undefined),NOT,NEG,MUL AL/rAX,IMUL AL/rAX,DIV AL/rAX,IDIV AL/rAX

So, it seems TEST Ib is assigned to 001.

Add the new pattern.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:01 +00:00
Eric Biggers
443d26a6f7 lib/mpi: call cond_resched() from mpi_powm() loop
commit 1d9ddde12e3c9bab7f3d3484eb9446315e3571ca upstream.

On a non-preemptible kernel, if KEYCTL_DH_COMPUTE is called with the
largest permitted inputs (16384 bits), the kernel spends 10+ seconds
doing modular exponentiation in mpi_powm() without rescheduling.  If all
threads do it, it locks up the system.  Moreover, it can cause
rcu_sched-stall warnings.

Notwithstanding the insanity of doing this calculation in kernel mode
rather than in userspace, fix it by calling cond_resched() as each bit
from the exponent is processed.  It's still noninterruptible, but at
least it's preemptible now.

Do the cond_resched() once per bit rather than once per MPI limb because
each limb might still easily take 100+ milliseconds on slow CPUs.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:01 +00:00
Paul E. McKenney
fb8bd56e35 sched: Make resched_cpu() unconditional
commit 7c2102e56a3f7d85b5d8f33efbd7aecc1f36fdd8 upstream.

The current implementation of synchronize_sched_expedited() incorrectly
assumes that resched_cpu() is unconditional, which it is not.  This means
that synchronize_sched_expedited() can hang when resched_cpu()'s trylock
fails as follows (analysis by Neeraj Upadhyay):

o	CPU1 is waiting for expedited wait to complete:

	sync_rcu_exp_select_cpus
	     rdp->exp_dynticks_snap & 0x1   // returns 1 for CPU5
	     IPI sent to CPU5

	synchronize_sched_expedited_wait
		 ret = swait_event_timeout(rsp->expedited_wq,
					   sync_rcu_preempt_exp_done(rnp_root),
					   jiffies_stall);

	expmask = 0x20, CPU 5 in idle path (in cpuidle_enter())

o	CPU5 handles IPI and fails to acquire rq lock.

	Handles IPI
	     sync_sched_exp_handler
		 resched_cpu
		     returns while failing to try lock acquire rq->lock
		 need_resched is not set

o	CPU5 calls  rcu_idle_enter() and as need_resched is not set, goes to
	idle (schedule() is not called).

o	CPU 1 reports RCU stall.

Given that resched_cpu() is now used only by RCU, this commit fixes the
assumption by making resched_cpu() unconditional.

Reported-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Suggested-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:01 +00:00
WANG Cong
6be6e48daa vsock: use new wait API for vsock_stream_sendmsg()
commit 499fde662f1957e3cb8d192a94a099ebe19c714b upstream.

As reported by Michal, vsock_stream_sendmsg() could still
sleep at vsock_stream_has_space() after prepare_to_wait():

  vsock_stream_has_space
    vmci_transport_stream_has_space
      vmci_qpair_produce_free_space
        qp_lock
          qp_acquire_queue_mutex
            mutex_lock

Just switch to the new wait API like we did for commit
d9dc8b0f8b4e ("net: fix sleeping for sk_wait_event()").

Reported-by: Michal Kubecek <mkubecek@suse.cz>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Jorgen Hansen <jhansen@vmware.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: "Jorgen S. Hansen" <jhansen@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:01 +00:00
WANG Cong
41e4fbdf24 ipv6: only call ip6_route_dev_notify() once for NETDEV_UNREGISTER
commit 76da0704507bbc51875013f6557877ab308cfd0a upstream.

In commit 242d3a49a2a1 ("ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notf")
I assumed NETDEV_REGISTER and NETDEV_UNREGISTER are paired,
unfortunately, as reported by jeffy, netdev_wait_allrefs()
could rebroadcast NETDEV_UNREGISTER event until all refs are
gone.

We have to add an additional check to avoid this corner case.
For netdev_wait_allrefs() dev->reg_state is NETREG_UNREGISTERED,
for dev_change_net_namespace(), dev->reg_state is
NETREG_REGISTERED. So check for dev->reg_state != NETREG_UNREGISTERED.

Fixes: 242d3a49a2a1 ("ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notf")
Reported-by: jeffy <jeffy.chen@rock-chips.com>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:01 +00:00
Vlastimil Babka
d0629c6bd5 x86/mm: fix use-after-free of vma during userfaultfd fault
commit cb0631fd3cf9e989cd48293fe631cbc402aec9a9 upstream.

Syzkaller with KASAN has reported a use-after-free of vma->vm_flags in
__do_page_fault() with the following reproducer:

  mmap(&(0x7f0000000000/0xfff000)=nil, 0xfff000, 0x3, 0x32, 0xffffffffffffffff, 0x0)
  mmap(&(0x7f0000011000/0x3000)=nil, 0x3000, 0x1, 0x32, 0xffffffffffffffff, 0x0)
  r0 = userfaultfd(0x0)
  ioctl$UFFDIO_API(r0, 0xc018aa3f, &(0x7f0000002000-0x18)={0xaa, 0x0, 0x0})
  ioctl$UFFDIO_REGISTER(r0, 0xc020aa00, &(0x7f0000019000)={{&(0x7f0000012000/0x2000)=nil, 0x2000}, 0x1, 0x0})
  r1 = gettid()
  syz_open_dev$evdev(&(0x7f0000013000-0x12)="2f6465762f696e7075742f6576656e742300", 0x0, 0x0)
  tkill(r1, 0x7)

The vma should be pinned by mmap_sem, but handle_userfault() might (in a
return to userspace scenario) release it and then acquire again, so when
we return to __do_page_fault() (with other result than VM_FAULT_RETRY),
the vma might be gone.

Specifically, per Andrea the scenario is
 "A return to userland to repeat the page fault later with a
  VM_FAULT_NOPAGE retval (potentially after handling any pending signal
  during the return to userland). The return to userland is identified
  whenever FAULT_FLAG_USER|FAULT_FLAG_KILLABLE are both set in
  vmf->flags"

However, since commit a3c4fb7c9c2e ("x86/mm: Fix fault error path using
unsafe vma pointer") there is a vma_pkey() read of vma->vm_flags after
that point, which can thus become use-after-free.  Fix this by moving
the read before calling handle_mm_fault().

Reported-by: syzbot <bot+6a5269ce759a7bb12754ed9622076dc93f65a1f6@syzkaller.appspotmail.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Suggested-by: Kirill A. Shutemov <kirill@shutemov.name>
Fixes: 3c4fb7c9c2e ("x86/mm: Fix fault error path using unsafe vma pointer")
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:01 +00:00
Lv Zheng
7003eb6399 ACPI / EC: Fix regression related to triggering source of EC event handling
commit 53c5eaabaea9a1b7a96f95ccc486d2ad721d95bb upstream.

Originally the Samsung quirks removed by commit 4c237371 can be covered
by commit e923e8e7 and ec_freeze_events=Y mode. But commit 9c40f956
changed ec_freeze_events=Y back to N, making this problem re-surface.

Actually, if commit e923e8e7 is robust enough, we can freely change
ec_freeze_events mode, so this patch fixes the issue by improving
commit e923e8e7.

Related commits listed in the merged order:

 Commit: e923e8e79e18fd6be9162f1be6b99a002e9df2cb
 Subject: ACPI / EC: Fix an issue that SCI_EVT cannot be detected
          after event is enabled

 Commit: 4c237371f290d1ed3b2071dd43554362137b1cce
 Subject: ACPI / EC: Remove old CLEAR_ON_RESUME quirk

 Commit: 9c40f956ce9b331493347d1b3cb7e384f7dc0581
 Subject: Revert "ACPI / EC: Enable event freeze mode..." to fix
          a regression

This patch not only fixes the reported post-resume EC event triggering
source issue, but also fixes an unreported similar issue related to the
driver bind by adding EC event triggering source in ec_install_handlers().

Fixes: e923e8e79e18 (ACPI / EC: Fix an issue that SCI_EVT cannot be detected after event is enabled)
Fixes: 4c237371f290 (ACPI / EC: Remove old CLEAR_ON_RESUME quirk)
Fixes: 9c40f956ce9b (Revert "ACPI / EC: Enable event freeze mode..." to fix a regression)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=196833
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Reported-by: Alistair Hamilton <ahpatent@gmail.com>
Tested-by: Alistair Hamilton <ahpatent@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:01 +00:00
Vasily Gorbik
7160a44752 s390/disassembler: increase show_code buffer size
commit b192571d1ae375e0bbe0aa3ccfa1a3c3704454b9 upstream.

Current buffer size of 64 is too small. objdump shows that there are
instructions which would require up to 75 bytes buffer (with current
formating). 128 bytes "ought to be enough for anybody".

Also replaces 8 spaces with a single tab to reduce the memory footprint.

Fixes the following KASAN finding:

BUG: KASAN: stack-out-of-bounds in number+0x3fe/0x538
Write of size 1 at addr 000000005a4a75a0 by task bash/1282

CPU: 1 PID: 1282 Comm: bash Not tainted 4.14.0+ #215
Hardware name: IBM 2964 N96 702 (z/VM 6.4.0)
Call Trace:
([<000000000011eeb6>] show_stack+0x56/0x88)
 [<0000000000e1ce1a>] dump_stack+0x15a/0x1b0
 [<00000000004e2994>] print_address_description+0xf4/0x288
 [<00000000004e2cf2>] kasan_report+0x13a/0x230
 [<0000000000e38ae6>] number+0x3fe/0x538
 [<0000000000e3dfe4>] vsnprintf+0x194/0x948
 [<0000000000e3ea42>] sprintf+0xa2/0xb8
 [<00000000001198dc>] print_insn+0x374/0x500
 [<0000000000119346>] show_code+0x4ee/0x538
 [<000000000011f234>] show_registers+0x34c/0x388
 [<000000000011f2ae>] show_regs+0x3e/0xa8
 [<000000000011f502>] die+0x1ea/0x2e8
 [<0000000000138f0e>] do_no_context+0x106/0x168
 [<0000000000139a1a>] do_protection_exception+0x4da/0x7d0
 [<0000000000e55914>] pgm_check_handler+0x16c/0x1c0
 [<000000000090639e>] sysrq_handle_crash+0x46/0x58
([<0000000000000007>] 0x7)
 [<00000000009073fa>] __handle_sysrq+0x102/0x218
 [<0000000000907c06>] write_sysrq_trigger+0xd6/0x100
 [<000000000061d67a>] proc_reg_write+0xb2/0x128
 [<0000000000520be6>] __vfs_write+0xee/0x368
 [<0000000000521222>] vfs_write+0x21a/0x278
 [<000000000052156a>] SyS_write+0xda/0x178
 [<0000000000e555cc>] system_call+0xc4/0x270

The buggy address belongs to the page:
page:000003d1016929c0 count:0 mapcount:0 mapping:          (null) index:0x0
flags: 0x0()
raw: 0000000000000000 0000000000000000 0000000000000000 ffffffff00000000
raw: 0000000000000100 0000000000000200 0000000000000000 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 000000005a4a7480: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
 000000005a4a7500: 00 00 00 00 00 00 00 00 f2 f2 f2 f2 00 00 00 00
>000000005a4a7580: 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00
                               ^
 000000005a4a7600: 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 f8 f8
 000000005a4a7680: f2 f2 f2 f2 f2 f2 f8 f8 f2 f2 f3 f3 f3 f3 00 00
==================================================================

Signed-off-by: Vasily Gorbik <gor@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:00 +00:00
Heiko Carstens
5380996086 s390/disassembler: add missing end marker for e7 table
commit 5c50538752af7968f53924b22dede8ed4ce4cb3b upstream.

The e7 opcode table does not have an end marker. Hence when trying to
find an unknown e7 instruction the code will access memory behind the
table until it finds something that matches the opcode, or the kernel
crashes, whatever comes first.

This affects not only the in-kernel disassembler but also uprobes and
kprobes which refuse to set a probe on unknown instructions, and
therefore search the opcode tables to figure out if instructions are
known or not.

Fixes: 3585cb0280654 ("s390/disassembler: add vector instructions")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:00 +00:00
Heiko Carstens
550435a1d1 s390/runtime instrumention: fix possible memory corruption
commit d6e646ad7cfa7034d280459b2b2546288f247144 upstream.

For PREEMPT enabled kernels the runtime instrumentation (RI) code
contains a possible use-after-free bug. If a task that makes use of RI
exits, it will execute do_exit() while still enabled for preemption.

That function will call exit_thread_runtime_instr() via
exit_thread(). If exit_thread_runtime_instr() gets preempted after the
RI control block of the task has been freed but before the pointer to
it is set to NULL, then save_ri_cb(), called from switch_to(), will
write to already freed memory.

Avoid this and simply disable preemption while freeing the control
block and setting the pointer to NULL.

Fixes: e4b8b3f33fca ("s390: add support for runtime instrumentation")
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:00 +00:00
Heiko Carstens
c9d0db615c s390: fix transactional execution control register handling
commit a1c5befc1c24eb9c1ee83f711e0f21ee79cbb556 upstream.

Dan Horák reported the following crash related to transactional execution:

User process fault: interruption code 0013 ilc:3 in libpthread-2.26.so[3ff93c00000+1b000]
CPU: 2 PID: 1 Comm: /init Not tainted 4.13.4-300.fc27.s390x #1
Hardware name: IBM 2827 H43 400 (z/VM 6.4.0)
task: 00000000fafc8000 task.stack: 00000000fafc4000
User PSW : 0705200180000000 000003ff93c14e70
           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:1 AS:0 CC:2 PM:0 RI:0 EA:3
User GPRS: 0000000000000077 000003ff00000000 000003ff93144d48 000003ff93144d5e
           0000000000000000 0000000000000002 0000000000000000 000003ff00000000
           0000000000000000 0000000000000418 0000000000000000 000003ffcc9fe770
           000003ff93d28f50 000003ff9310acf0 000003ff92b0319a 000003ffcc9fe6d0
User Code: 000003ff93c14e62: 60e0b030            std     %f14,48(%r11)
           000003ff93c14e66: 60f0b038            std     %f15,56(%r11)
          #000003ff93c14e6a: e5600000ff0e        tbegin  0,65294
          >000003ff93c14e70: a7740006            brc     7,3ff93c14e7c
           000003ff93c14e74: a7080000            lhi     %r0,0
           000003ff93c14e78: a7f40023            brc     15,3ff93c14ebe
           000003ff93c14e7c: b2220000            ipm     %r0
           000003ff93c14e80: 8800001c            srl     %r0,28

There are several bugs with control register handling with respect to
transactional execution:

- on task switch update_per_regs() is only called if the next task has
  an mm (is not a kernel thread). This however is incorrect. This
  breaks e.g. for user mode helper handling, where the kernel creates
  a kernel thread and then execve's a user space program. Control
  register contents related to transactional execution won't be
  updated on execve. If the previous task ran with transactional
  execution disabled then the new task will also run with
  transactional execution disabled, which is incorrect. Therefore call
  update_per_regs() unconditionally within switch_to().

- on startup the transactional execution facility is not enabled for
  the idle thread. This is not really a bug, but an inconsistency to
  other facilities. Therefore enable the facility if it is available.

- on fork the new thread's per_flags field is not cleared. This means
  that a child process inherits the PER_FLAG_NO_TE flag. This flag can
  be set with a ptrace request to disable transactional execution for
  the current process. It should not be inherited by new child
  processes in order to be consistent with the handling of all other
  PER related debugging options. Therefore clear the per_flags field in
  copy_thread_tls().

Reported-and-tested-by: Dan Horák <dan@danny.cz>
Fixes: d35339a42dd1 ("s390: add support for transactional memory")
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30 08:39:00 +00:00
Greg Kroah-Hartman
133e6ccf46 Linux 4.9.65 2017-11-24 08:33:43 +01:00
Jann Horn
ceaec6e8cd mm/pagewalk.c: report holes in hugetlb ranges
commit 373c4557d2aa362702c4c2d41288fb1e54990b7c upstream.

This matters at least for the mincore syscall, which will otherwise copy
uninitialized memory from the page allocator to userspace.  It is
probably also a correctness error for /proc/$pid/pagemap, but I haven't
tested that.

Removing the `walk->hugetlb_entry` condition in walk_hugetlb_range() has
no effect because the caller already checks for that.

This only reports holes in hugetlb ranges to callers who have specified
a hugetlb_entry callback.

This issue was found using an AFL-based fuzzer.

v2:
 - don't crash on ->pte_hole==NULL (Andrew Morton)
 - add Cc stable (Andrew Morton)

Changed for 4.4/4.9 stable backport:
 - fix up conflict in the huge_pte_offset() call

Fixes: 1e25a271c8ac ("mincore: apply page table walker on do_mincore()")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:33:42 +01:00
Jan Harkes
fae5947129 coda: fix 'kernel memory exposure attempt' in fsync
commit d337b66a4c52c7b04eec661d86c2ef6e168965a2 upstream.

When an application called fsync on a file in Coda a small request with
just the file identifier was allocated, but the declared length was set
to the size of union of all possible upcall requests.

This bug has been around for a very long time and is now caught by the
extra checking in usercopy that was introduced in Linux-4.8.

The exposure happens when the Coda cache manager process reads the fsync
upcall request at which point it is killed. As a result there is nobody
servicing any further upcalls, trapping any processes that try to access
the mounted Coda filesystem.

Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:33:42 +01:00
Pavel Tatashin
9980b82783 mm/page_alloc.c: broken deferred calculation
commit d135e5750205a21a212a19dbb05aeb339e2cbea7 upstream.

In reset_deferred_meminit() we determine number of pages that must not
be deferred.  We initialize pages for at least 2G of memory, but also
pages for reserved memory in this node.

The reserved memory is determined in this function:
memblock_reserved_memory_within(), which operates over physical
addresses, and returns size in bytes.  However, reset_deferred_meminit()
assumes that that this function operates with pfns, and returns page
count.

The result is that in the best case machine boots slower than expected
due to initializing more pages than needed in single thread, and in the
worst case panics because fewer than needed pages are initialized early.

Link: http://lkml.kernel.org/r/20171021011707.15191-1-pasha.tatashin@oracle.com
Fixes: 864b9a393dcb ("mm: consider memblock reservations for deferred memory initialization sizing")
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:33:42 +01:00
Corey Minyard
55b06b0fc0 ipmi: fix unsigned long underflow
commit 392a17b10ec4320d3c0e96e2a23ebaad1123b989 upstream.

When I set the timeout to a specific value such as 500ms, the timeout
event will not happen in time due to the overflow in function
check_msg_timeout:
...
	ent->timeout -= timeout_period;
	if (ent->timeout > 0)
		return;
...

The type of timeout_period is long, but ent->timeout is unsigned long.
This patch makes the type consistent.

Reported-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Tested-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:33:42 +01:00
alex chen
8af777385f ocfs2: should wait dio before inode lock in ocfs2_setattr()
commit 28f5a8a7c033cbf3e32277f4cc9c6afd74f05300 upstream.

we should wait dio requests to finish before inode lock in
ocfs2_setattr(), otherwise the following deadlock will happen:

process 1                  process 2                    process 3
truncate file 'A'          end_io of writing file 'A'   receiving the bast messages
ocfs2_setattr
 ocfs2_inode_lock_tracker
  ocfs2_inode_lock_full
 inode_dio_wait
  __inode_dio_wait
  -->waiting for all dio
  requests finish
                                                        dlm_proxy_ast_handler
                                                         dlm_do_local_bast
                                                          ocfs2_blocking_ast
                                                           ocfs2_generic_handle_bast
                                                            set OCFS2_LOCK_BLOCKED flag
                        dio_end_io
                         dio_bio_end_aio
                          dio_complete
                           ocfs2_dio_end_io
                            ocfs2_dio_end_io_write
                             ocfs2_inode_lock
                              __ocfs2_cluster_lock
                               ocfs2_wait_for_mask
                               -->waiting for OCFS2_LOCK_BLOCKED
                               flag to be cleared, that is waiting
                               for 'process 1' unlocking the inode lock
                           inode_dio_end
                           -->here dec the i_dio_count, but will never
                           be called, so a deadlock happened.

Link: http://lkml.kernel.org/r/59F81636.70508@huawei.com
Signed-off-by: Alex Chen <alex.chen@huawei.com>
Reviewed-by: Jun Piao <piaojun@huawei.com>
Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
Acked-by: Changwei Ge <ge.changwei@h3c.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:33:42 +01:00
Changwei Ge
a8356445ba ocfs2: fix cluster hang after a node dies
commit 1c01967116a678fed8e2c68a6ab82abc8effeddc upstream.

When a node dies, other live nodes have to choose a new master for an
existed lock resource mastered by the dead node.

As for ocfs2/dlm implementation, this is done by function -
dlm_move_lockres_to_recovery_list which marks those lock rsources as
DLM_LOCK_RES_RECOVERING and manages them via a list from which DLM
changes lock resource's master later.

So without invoking dlm_move_lockres_to_recovery_list, no master will be
choosed after dlm recovery accomplishment since no lock resource can be
found through ::resource list.

What's worse is that if DLM_LOCK_RES_RECOVERING is not marked for lock
resources mastered a dead node, it will break up synchronization among
nodes.

So invoke dlm_move_lockres_to_recovery_list again.

Fixs: 'commit ee8f7fcbe638 ("ocfs2/dlm: continue to purge recovery lockres when recovery master goes down")'
Link: http://lkml.kernel.org/r/63ADC13FD55D6546B7DECE290D39E373CED6E0F9@H3CMLB14-EX.srv.huawei-3com.com
Signed-off-by: Changwei Ge <ge.changwei@h3c.com>
Reported-by: Vitaly Mayatskih <v.mayatskih@gmail.com>
Tested-by: Vitaly Mayatskikh <v.mayatskih@gmail.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:33:42 +01:00
Adam Wallis
2bd38ece78 dmaengine: dmatest: warn user when dma test times out
commit a9df21e34b422f79d9a9fa5c3eff8c2a53491be6 upstream.

Commit adfa543e7314 ("dmatest: don't use set_freezable_with_signal()")
introduced a bug (that is in fact documented by the patch commit text)
that leaves behind a dangling pointer. Since the done_wait structure is
allocated on the stack, future invocations to the DMATEST can produce
undesirable results (e.g., corrupted spinlocks). Ideally, this would be
cleaned up in the thread handler, but at the very least, the kernel
is left in a very precarious scenario that can lead to some long debug
sessions when the crash comes later.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=197605
Signed-off-by: Adam Wallis <awallis@codeaurora.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:33:42 +01:00
Ji-Ze Hong (Peter Hong)
e6d4a078f0 serial: 8250_fintek: Fix finding base_port with activated SuperIO
commit fd97e66c5529046e989a0879c3bb58fddb592c71 upstream.

The SuperIO will be configured at boot time by BIOS, but some BIOS
will not deactivate the SuperIO when the end of configuration. It'll
lead to mismatch for pdata->base_port in probe_setup_port(). So we'll
deactivate all SuperIO before activate special base_port in
fintek_8250_enter_key().

Tested on iBASE MI802.

Tested-by: Ji-Ze Hong (Peter Hong) <hpeter+linux_kernel@gmail.com>
Signed-off-by: Ji-Ze Hong (Peter Hong) <hpeter+linux_kernel@gmail.com>
Reviewd-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:33:42 +01:00
Lukas Wunner
70eb4608bb serial: omap: Fix EFR write on RTS deassertion
commit 2a71de2f7366fb1aec632116d0549ec56d6a3940 upstream.

Commit 348f9bb31c56 ("serial: omap: Fix RTS handling") sought to enable
auto RTS upon manual RTS assertion and disable it on deassertion.
However it seems the latter was done incorrectly, it clears all bits in
the Extended Features Register *except* auto RTS.

Fixes: 348f9bb31c56 ("serial: omap: Fix RTS handling")
Cc: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:33:42 +01:00
Roberto Sassu
2cfbb32f6c ima: do not update security.ima if appraisal status is not INTEGRITY_PASS
commit 020aae3ee58c1af0e7ffc4e2cc9fe4dc630338cb upstream.

Commit b65a9cfc2c38 ("Untangling ima mess, part 2: deal with counters")
moved the call of ima_file_check() from may_open() to do_filp_open() at a
point where the file descriptor is already opened.

This breaks the assumption made by IMA that file descriptors being closed
belong to files whose access was granted by ima_file_check(). The
consequence is that security.ima and security.evm are updated with good
values, regardless of the current appraisal status.

For example, if a file does not have security.ima, IMA will create it after
opening the file for writing, even if access is denied. Access to the file
will be allowed afterwards.

Avoid this issue by checking the appraisal status before updating
security.ima.

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:33:41 +01:00
Eric Biggers
aa15fe4d6a crypto: dh - Fix double free of ctx->p
commit 12d41a023efb01b846457ccdbbcbe2b65a87d530 upstream.

When setting the secret with the software Diffie-Hellman implementation,
if allocating 'g' failed (e.g. if it was longer than
MAX_EXTERN_MPI_BITS), then 'p' was freed twice: once immediately, and
once later when the crypto_kpp tfm was destroyed.

Fix it by using dh_free_ctx() (renamed to dh_clear_ctx()) in the error
paths, as that correctly sets the pointers to NULL.

KASAN report:

    MPI: mpi too large (32760 bits)
    ==================================================================
    BUG: KASAN: use-after-free in mpi_free+0x131/0x170
    Read of size 4 at addr ffff88006c7cdf90 by task reproduce_doubl/367

    CPU: 1 PID: 367 Comm: reproduce_doubl Not tainted 4.14.0-rc7-00040-g05298abde6fe #7
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Call Trace:
     dump_stack+0xb3/0x10b
     ? mpi_free+0x131/0x170
     print_address_description+0x79/0x2a0
     ? mpi_free+0x131/0x170
     kasan_report+0x236/0x340
     ? akcipher_register_instance+0x90/0x90
     __asan_report_load4_noabort+0x14/0x20
     mpi_free+0x131/0x170
     ? akcipher_register_instance+0x90/0x90
     dh_exit_tfm+0x3d/0x140
     crypto_kpp_exit_tfm+0x52/0x70
     crypto_destroy_tfm+0xb3/0x250
     __keyctl_dh_compute+0x640/0xe90
     ? kasan_slab_free+0x12f/0x180
     ? dh_data_from_key+0x240/0x240
     ? key_create_or_update+0x1ee/0xb20
     ? key_instantiate_and_link+0x440/0x440
     ? lock_contended+0xee0/0xee0
     ? kfree+0xcf/0x210
     ? SyS_add_key+0x268/0x340
     keyctl_dh_compute+0xb3/0xf1
     ? __keyctl_dh_compute+0xe90/0xe90
     ? SyS_add_key+0x26d/0x340
     ? entry_SYSCALL_64_fastpath+0x5/0xbe
     ? trace_hardirqs_on_caller+0x3f4/0x560
     SyS_keyctl+0x72/0x2c0
     entry_SYSCALL_64_fastpath+0x1f/0xbe
    RIP: 0033:0x43ccf9
    RSP: 002b:00007ffeeec96158 EFLAGS: 00000246 ORIG_RAX: 00000000000000fa
    RAX: ffffffffffffffda RBX: 000000000248b9b9 RCX: 000000000043ccf9
    RDX: 00007ffeeec96170 RSI: 00007ffeeec96160 RDI: 0000000000000017
    RBP: 0000000000000046 R08: 0000000000000000 R09: 0248b9b9143dc936
    R10: 0000000000001000 R11: 0000000000000246 R12: 0000000000000000
    R13: 0000000000409670 R14: 0000000000409700 R15: 0000000000000000

    Allocated by task 367:
     save_stack_trace+0x16/0x20
     kasan_kmalloc+0xeb/0x180
     kmem_cache_alloc_trace+0x114/0x300
     mpi_alloc+0x4b/0x230
     mpi_read_raw_data+0xbe/0x360
     dh_set_secret+0x1dc/0x460
     __keyctl_dh_compute+0x623/0xe90
     keyctl_dh_compute+0xb3/0xf1
     SyS_keyctl+0x72/0x2c0
     entry_SYSCALL_64_fastpath+0x1f/0xbe

    Freed by task 367:
     save_stack_trace+0x16/0x20
     kasan_slab_free+0xab/0x180
     kfree+0xb5/0x210
     mpi_free+0xcb/0x170
     dh_set_secret+0x2d7/0x460
     __keyctl_dh_compute+0x623/0xe90
     keyctl_dh_compute+0xb3/0xf1
     SyS_keyctl+0x72/0x2c0
     entry_SYSCALL_64_fastpath+0x1f/0xbe

Fixes: 802c7f1c84e4 ("crypto: dh - Add DH software implementation")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:33:41 +01:00
Tudor-Dan Ambarus
4a7e023124 crypto: dh - fix memleak in setkey
commit ee34e2644a78e2561742bea8c4bdcf83cabf90a7 upstream.

setkey can be called multiple times during the existence
of the transformation object. In case of multiple setkey calls,
the old key was not freed and we leaked memory.
Free the old MPI key if any.

Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:33:41 +01:00
Eric W. Biederman
67b718fcf8 net/sctp: Always set scope_id in sctp_inet6_skb_msgname
[ Upstream commit 7c8a61d9ee1df0fb4747879fa67a99614eb62fec ]

Alexandar Potapenko while testing the kernel with KMSAN and syzkaller
discovered that in some configurations sctp would leak 4 bytes of
kernel stack.

Working with his reproducer I discovered that those 4 bytes that
are leaked is the scope id of an ipv6 address returned by recvmsg.

With a little code inspection and a shrewd guess I discovered that
sctp_inet6_skb_msgname only initializes the scope_id field for link
local ipv6 addresses to the interface index the link local address
pertains to instead of initializing the scope_id field for all ipv6
addresses.

That is almost reasonable as scope_id's are meaniningful only for link
local addresses.  Set the scope_id in all other cases to 0 which is
not a valid interface index to make it clear there is nothing useful
in the scope_id field.

There should be no danger of breaking userspace as the stack leak
guaranteed that previously meaningless random data was being returned.

Fixes: 372f525b495c ("SCTP:  Resync with LKSCTP tree.")
History-tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Reported-by: Alexander Potapenko <glider@google.com>
Tested-by: Alexander Potapenko <glider@google.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:33:41 +01:00
Huacai Chen
f0ae7a1b45 fealnx: Fix building error on MIPS
[ Upstream commit cc54c1d32e6a4bb3f116721abf900513173e4d02 ]

This patch try to fix the building error on MIPS. The reason is MIPS
has already defined the LONG macro, which conflicts with the LONG enum
in drivers/net/ethernet/fealnx.c.

Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:33:41 +01:00
Xin Long
362d2ce0f8 sctp: do not peel off an assoc from one netns to another one
[ Upstream commit df80cd9b28b9ebaa284a41df611dbf3a2d05ca74 ]

Now when peeling off an association to the sock in another netns, all
transports in this assoc are not to be rehashed and keep use the old
key in hashtable.

As a transport uses sk->net as the hash key to insert into hashtable,
it would miss removing these transports from hashtable due to the new
netns when closing the sock and all transports are being freeed, then
later an use-after-free issue could be caused when looking up an asoc
and dereferencing those transports.

This is a very old issue since very beginning, ChunYu found it with
syzkaller fuzz testing with this series:

  socket$inet6_sctp()
  bind$inet6()
  sendto$inet6()
  unshare(0x40000000)
  getsockopt$inet_sctp6_SCTP_GET_ASSOC_ID_LIST()
  getsockopt$inet_sctp6_SCTP_SOCKOPT_PEELOFF()

This patch is to block this call when peeling one assoc off from one
netns to another one, so that the netns of all transport would not
go out-sync with the key in hashtable.

Note that this patch didn't fix it by rehashing transports, as it's
difficult to handle the situation when the tuple is already in use
in the new netns. Besides, no one would like to peel off one assoc
to another netns, considering ipaddrs, ifaces, etc. are usually
different.

Reported-by: ChunYu Wang <chunwang@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:33:41 +01:00