645897 Commits

Author SHA1 Message Date
Vasily Averin
d450fcdb42 ext4: fix buffer leak in __ext4_read_dirblock() on error path
commit de59fae0043f07de5d25e02ca360f7d57bfa5866 upstream.

Fixes: dc6982ff4db1 ("ext4: refactor code to read directory blocks ...")
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org # 3.9
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:02 +01:00
Vasily Averin
82dfeb8d82 ext4: fix buffer leak in ext4_xattr_move_to_block() on error path
commit 6bdc9977fcdedf47118d2caf7270a19f4b6d8a8f upstream.

Fixes: 3f2571c1f91f ("ext4: factor out xattr moving")
Fixes: 6dd4ee7cab7e ("ext4: Expand extra_inodes space per ...")
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org # 2.6.23
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:02 +01:00
Vasily Averin
b66102a457 ext4: release bs.bh before re-using in ext4_xattr_block_find()
commit 45ae932d246f721e6584430017176cbcadfde610 upstream.

bs.bh was taken in previous ext4_xattr_block_find() call,
it should be released before re-using

Fixes: 7e01c8e5420b ("ext3/4: fix uninitialized bs in ...")
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org # 2.6.26
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:02 +01:00
Vasily Averin
6b436fb67e ext4: fix possible leak of s_journal_flag_rwsem in error path
commit af18e35bfd01e6d65a5e3ef84ffe8b252d1628c5 upstream.

Fixes: c8585c6fcaf2 ("ext4: fix races between changing inode journal ...")
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org # 4.7
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:02 +01:00
Theodore Ts'o
8547ff5d14 ext4: fix possible leak of sbi->s_group_desc_leak in error path
commit 9e463084cdb22e0b56b2dfbc50461020409a5fd3 upstream.

Fixes: bfe0a5f47ada ("ext4: add more mount time checks of the superblock")
Reported-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org # 4.18
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:02 +01:00
Theodore Ts'o
11a2eb02e6 ext4: avoid possible double brelse() in add_new_gdb() on error path
commit 4f32c38b4662312dd3c5f113d8bdd459887fb773 upstream.

Fixes: b40971426a83 ("ext4: add error checking to calls to ...")
Reported-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org # 2.6.38
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:02 +01:00
Vasily Averin
142e0172e3 ext4: fix missing cleanup if ext4_alloc_flex_bg_array() fails while resizing
commit f348e2241fb73515d65b5d77dd9c174128a7fbf2 upstream.

Fixes: 117fff10d7f1 ("ext4: grow the s_flex_groups array as needed ...")
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org # 3.7
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:02 +01:00
Vasily Averin
f30a52c9c5 ext4: avoid buffer leak in ext4_orphan_add() after prior errors
commit feaf264ce7f8d54582e2f66eb82dd9dd124c94f3 upstream.

Fixes: d745a8c20c1f ("ext4: reduce contention on s_orphan_lock")
Fixes: 6e3617e579e0 ("ext4: Handle non empty on-disk orphan link")
Cc: Dmitry Monakhov <dmonakhov@gmail.com>
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org # 2.6.34
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:01 +01:00
Vasily Averin
8a6a7dd744 ext4: fix possible inode leak in the retry loop of ext4_resize_fs()
commit db6aee62406d9fbb53315fcddd81f1dc271d49fa upstream.

Fixes: 1c6bd7173d66 ("ext4: convert file system to meta_bg if needed ...")
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org # 3.7
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:01 +01:00
Vasily Averin
05821678f2 ext4: avoid potential extra brelse in setup_new_flex_group_blocks()
commit 9e4028935cca3f9ef9b6a90df9da6f1f94853536 upstream.

Currently bh is set to NULL only during first iteration of for cycle,
then this pointer is not cleared after end of using.
Therefore rollback after errors can lead to extra brelse(bh) call,
decrements bh counter and later trigger an unexpected warning in __brelse()

Patch moves brelse() calls in body of cycle to exclude requirement of
brelse() call in rollback.

Fixes: 33afdcc5402d ("ext4: add a function which sets up group blocks ...")
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org # 3.3+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:01 +01:00
Vasily Averin
c484fd258c ext4: add missing brelse() add_new_gdb_meta_bg()'s error path
commit 61a9c11e5e7a0dab5381afa5d9d4dd5ebf18f7a0 upstream.

Fixes: 01f795f9e0d6 ("ext4: add online resizing support for meta_bg ...")
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org # 3.7
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:01 +01:00
Vasily Averin
771e8c73f2 ext4: add missing brelse() in set_flexbg_block_bitmap()'s error path
commit cea5794122125bf67559906a0762186cf417099c upstream.

Fixes: 33afdcc5402d ("ext4: add a function which sets up group blocks ...")
Cc: stable@kernel.org # 3.3
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:01 +01:00
Vasily Averin
b1341999bd ext4: add missing brelse() update_backups()'s error path
commit ea0abbb648452cdb6e1734b702b6330a7448fcf8 upstream.

Fixes: ac27a0ec112a ("ext4: initial copy of files from ext3")
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org # 2.6.19
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:01 +01:00
Michael Kelley
f6939dbd80 clockevents/drivers/i8253: Add support for PIT shutdown quirk
commit 35b69a420bfb56b7b74cb635ea903db05e357bec upstream.

Add support for platforms where pit_shutdown() doesn't work because of a
quirk in the PIT emulation. On these platforms setting the counter register
to zero causes the PIT to start running again, negating the shutdown.

Provide a global variable that controls whether the counter register is
zero'ed, which platform specific code can override.

Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>
Cc: "devel@linuxdriverproject.org" <devel@linuxdriverproject.org>
Cc: "daniel.lezcano@linaro.org" <daniel.lezcano@linaro.org>
Cc: "virtualization@lists.linux-foundation.org" <virtualization@lists.linux-foundation.org>
Cc: "jgross@suse.com" <jgross@suse.com>
Cc: "akataria@vmware.com" <akataria@vmware.com>
Cc: "olaf@aepfle.de" <olaf@aepfle.de>
Cc: "apw@canonical.com" <apw@canonical.com>
Cc: vkuznets <vkuznets@redhat.com>
Cc: "jasowang@redhat.com" <jasowang@redhat.com>
Cc: "marcelo.cerri@canonical.com" <marcelo.cerri@canonical.com>
Cc: KY Srinivasan <kys@microsoft.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1541303219-11142-2-git-send-email-mikelley@microsoft.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:01 +01:00
Filipe Manana
800d8112ee Btrfs: fix data corruption due to cloning of eof block
commit ac765f83f1397646c11092a032d4f62c3d478b81 upstream.

We currently allow cloning a range from a file which includes the last
block of the file even if the file's size is not aligned to the block
size. This is fine and useful when the destination file has the same size,
but when it does not and the range ends somewhere in the middle of the
destination file, it leads to corruption because the bytes between the EOF
and the end of the block have undefined data (when there is support for
discard/trimming they have a value of 0x00).

Example:

 $ mkfs.btrfs -f /dev/sdb
 $ mount /dev/sdb /mnt

 $ export foo_size=$((256 * 1024 + 100))
 $ xfs_io -f -c "pwrite -S 0x3c 0 $foo_size" /mnt/foo
 $ xfs_io -f -c "pwrite -S 0xb5 0 1M" /mnt/bar

 $ xfs_io -c "reflink /mnt/foo 0 512K $foo_size" /mnt/bar

 $ od -A d -t x1 /mnt/bar
 0000000 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5
 *
 0524288 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c
 *
 0786528 3c 3c 3c 3c 00 00 00 00 00 00 00 00 00 00 00 00
 0786544 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 *
 0790528 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5
 *
 1048576

The bytes in the range from 786532 (512Kb + 256Kb + 100 bytes) to 790527
(512Kb + 256Kb + 4Kb - 1) got corrupted, having now a value of 0x00 instead
of 0xb5.

This is similar to the problem we had for deduplication that got recently
fixed by commit de02b9f6bb65 ("Btrfs: fix data corruption when
deduplicating between different files").

Fix this by not allowing such operations to be performed and return the
errno -EINVAL to user space. This is what XFS is doing as well at the VFS
level. This change however now makes us return -EINVAL instead of
-EOPNOTSUPP for cases where the source range maps to an inline extent and
the destination range's end is smaller then the destination file's size,
since the detection of inline extents is done during the actual process of
dropping file extent items (at __btrfs_drop_extents()). Returning the
-EINVAL error is done early on and solely based on the input parameters
(offsets and length) and destination file's size. This makes us consistent
with XFS and anyone else supporting cloning since this case is now checked
at a higher level in the VFS and is where the -EINVAL will be returned
from starting with kernel 4.20 (the VFS changed was introduced in 4.20-rc1
by commit 07d19dc9fbe9 ("vfs: avoid problematic remapping requests into
partial EOF block"). So this change is more geared towards stable kernels,
as it's unlikely the new VFS checks get removed intentionally.

A test case for fstests follows soon, as well as an update to filter
existing tests that expect -EOPNOTSUPP to accept -EINVAL as well.

CC: <stable@vger.kernel.org> # 4.4+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:01 +01:00
Robbie Ko
3fe6b9aa5c Btrfs: fix cur_offset in the error case for nocow
commit 506481b20e818db40b6198815904ecd2d6daee64 upstream.

When the cow_file_range fails, the related resources are unlocked
according to the range [start..end), so the unlock cannot be repeated in
run_delalloc_nocow.

In some cases (e.g. cur_offset <= end && cow_start != -1), cur_offset is
not updated correctly, so move the cur_offset update before
cow_file_range.

  kernel BUG at mm/page-writeback.c:2663!
  Internal error: Oops - BUG: 0 [#1] SMP
  CPU: 3 PID: 31525 Comm: kworker/u8:7 Tainted: P O
  Hardware name: Realtek_RTD1296 (DT)
  Workqueue: writeback wb_workfn (flush-btrfs-1)
  task: ffffffc076db3380 ti: ffffffc02e9ac000 task.ti: ffffffc02e9ac000
  PC is at clear_page_dirty_for_io+0x1bc/0x1e8
  LR is at clear_page_dirty_for_io+0x14/0x1e8
  pc : [<ffffffc00033c91c>] lr : [<ffffffc00033c774>] pstate: 40000145
  sp : ffffffc02e9af4f0
  Process kworker/u8:7 (pid: 31525, stack limit = 0xffffffc02e9ac020)
  Call trace:
  [<ffffffc00033c91c>] clear_page_dirty_for_io+0x1bc/0x1e8
  [<ffffffbffc514674>] extent_clear_unlock_delalloc+0x1e4/0x210 [btrfs]
  [<ffffffbffc4fb168>] run_delalloc_nocow+0x3b8/0x948 [btrfs]
  [<ffffffbffc4fb948>] run_delalloc_range+0x250/0x3a8 [btrfs]
  [<ffffffbffc514c0c>] writepage_delalloc.isra.21+0xbc/0x1d8 [btrfs]
  [<ffffffbffc516048>] __extent_writepage+0xe8/0x248 [btrfs]
  [<ffffffbffc51630c>] extent_write_cache_pages.isra.17+0x164/0x378 [btrfs]
  [<ffffffbffc5185a8>] extent_writepages+0x48/0x68 [btrfs]
  [<ffffffbffc4f5828>] btrfs_writepages+0x20/0x30 [btrfs]
  [<ffffffc00033d758>] do_writepages+0x30/0x88
  [<ffffffc0003ba0f4>] __writeback_single_inode+0x34/0x198
  [<ffffffc0003ba6c4>] writeback_sb_inodes+0x184/0x3c0
  [<ffffffc0003ba96c>] __writeback_inodes_wb+0x6c/0xc0
  [<ffffffc0003bac20>] wb_writeback+0x1b8/0x1c0
  [<ffffffc0003bb0f0>] wb_workfn+0x150/0x250
  [<ffffffc0002b0014>] process_one_work+0x1dc/0x388
  [<ffffffc0002b02f0>] worker_thread+0x130/0x500
  [<ffffffc0002b6344>] kthread+0x10c/0x110
  [<ffffffc000284590>] ret_from_fork+0x10/0x40
  Code: d503201f a9025bb5 a90363b7 f90023b9 (d4210000)

CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Robbie Ko <robbieko@synology.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:01 +01:00
H. Peter Anvin (Intel)
53111ab29e arch/alpha, termios: implement BOTHER, IBSHIFT and termios2
commit d0ffb805b729322626639336986bc83fc2e60871 upstream.

Alpha has had c_ispeed and c_ospeed, but still set speeds in c_cflags
using arbitrary flags. Because BOTHER is not defined, the general
Linux code doesn't allow setting arbitrary baud rates, and because
CBAUDEX == 0, we can have an array overrun of the baud_rate[] table in
drivers/tty/tty_baudrate.c if (c_cflags & CBAUD) == 037.

Resolve both problems by #defining BOTHER to 037 on Alpha.

However, userspace still needs to know if setting BOTHER is actually
safe given legacy kernels (does anyone actually care about that on
Alpha anymore?), so enable the TCGETS2/TCSETS*2 ioctls on Alpha, even
though they use the same structure. Define struct termios2 just for
compatibility; it is the exact same structure as struct termios. In a
future patchset, this will be cleaned up so the uapi headers are
usable from libc.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Eugene Syromiatnikov <esyr@redhat.com>
Cc: <linux-alpha@vger.kernel.org>
Cc: <linux-serial@vger.kernel.org>
Cc: Johan Hovold <johan@kernel.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:00 +01:00
H. Peter Anvin
4d18bea3de termios, tty/tty_baudrate.c: fix buffer overrun
commit 991a25194097006ec1e0d2e0814ff920e59e3465 upstream.

On architectures with CBAUDEX == 0 (Alpha and PowerPC), the code in tty_baudrate.c does
not do any limit checking on the tty_baudrate[] array, and in fact a
buffer overrun is possible on both architectures. Add a limit check to
prevent that situation.

This will be followed by a much bigger cleanup/simplification patch.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Requested-by: Cc: Johan Hovold <johan@kernel.org>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Eugene Syromiatnikov <esyr@redhat.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:00 +01:00
John Garry
af86cb901c of, numa: Validate some distance map rules
commit 89c38422e072bb453e3045b8f1b962a344c3edea upstream.

Currently the NUMA distance map parsing does not validate the distance
table for the distance-matrix rules 1-2 in [1].

However the arch NUMA code may enforce some of these rules, but not all.
Such is the case for the arm64 port, which does not enforce the rule that
the distance between separates nodes cannot equal LOCAL_DISTANCE.

The patch adds the following rules validation:
- distance of node to self equals LOCAL_DISTANCE
- distance of separate nodes > LOCAL_DISTANCE

This change avoids a yet-unresolved crash reported in [2].

A note on dealing with symmetrical distances between nodes:

Validating symmetrical distances between nodes is difficult. If it were
mandated in the bindings that every distance must be recorded in the
table, then it would be easy. However, it isn't.

In addition to this, it is also possible to record [b, a] distance only
(and not [a, b]). So, when processing the table for [b, a], we cannot
assert that current distance of [a, b] != [b, a] as invalid, as [a, b]
distance may not be present in the table and current distance would be
default at REMOTE_DISTANCE.

As such, we maintain the policy that we overwrite distance [a, b] = [b, a]
for b > a. This policy is different to kernel ACPI SLIT validation, which
allows non-symmetrical distances (ACPI spec SLIT rules allow it). However,
the distance debug message is dropped as it may be misleading (for a distance
which is later overwritten).

Some final notes on semantics:

- It is implied that it is the responsibility of the arch NUMA code to
  reset the NUMA distance map for an error in distance map parsing.

- It is the responsibility of the FW NUMA topology parsing (whether OF or
  ACPI) to enforce NUMA distance rules, and not arch NUMA code.

[1] Documents/devicetree/bindings/numa.txt
[2] https://www.spinics.net/lists/arm-kernel/msg683304.html

Cc: stable@vger.kernel.org # 4.7
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:00 +01:00
Arnd Bergmann
d29a7b6f72 mtd: docg3: don't set conflicting BCH_CONST_PARAMS option
commit be2e1c9dcf76886a83fb1c433a316e26d4ca2550 upstream.

I noticed during the creation of another bugfix that the BCH_CONST_PARAMS
option that is set by DOCG3 breaks setting variable parameters for any
other users of the BCH library code.

The only other user we have today is the MTD_NAND software BCH
implementation (most flash controllers use hardware BCH these days
and are not affected). I considered removing BCH_CONST_PARAMS entirely
because of the inherent conflict, but according to the description in
lib/bch.c there is a significant performance benefit in keeping it.

To avoid the immediate problem of the conflict between MTD_NAND_BCH
and DOCG3, this only sets the constant parameters if MTD_NAND_BCH
is disabled, which should fix the problem for all cases that
are affected. This should also work for all stable kernels.

Note that there is only one machine that actually seems to use the
DOCG3 driver (arch/arm/mach-pxa/mioa701.c), so most users should have
the driver disabled, but it almost certainly shows up if we wanted
to test random kernels on machines that use software BCH in MTD.

Fixes: d13d19ece39f ("mtd: docg3: add ECC correction code")
Cc: stable@vger.kernel.org
Cc: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:00 +01:00
Vasily Khoruzhick
7e1e1956dc netfilter: conntrack: fix calculation of next bucket number in early_drop
commit f393808dc64149ccd0e5a8427505ba2974a59854 upstream.

If there's no entry to drop in bucket that corresponds to the hash,
early_drop() should look for it in other buckets. But since it increments
hash instead of bucket number, it actually looks in the same bucket 8
times: hsize is 16k by default (14 bits) and hash is 32-bit value, so
reciprocal_scale(hash, hsize) returns the same value for hash..hash+7 in
most cases.

Fix it by increasing bucket number instead of hash and rename _hash
to bucket to avoid future confusion.

Fixes: 3e86638e9a0b ("netfilter: conntrack: consider ct netns in early_drop logic")
Cc: <stable@vger.kernel.org> # v4.7+
Signed-off-by: Vasily Khoruzhick <vasilykh@arista.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:00 +01:00
Andrea Arcangeli
818e584636 mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings
commit ac5b2c18911ffe95c08d69273917f90212cf5659 upstream.

THP allocation might be really disruptive when allocated on NUMA system
with the local node full or hard to reclaim.  Stefan has posted an
allocation stall report on 4.12 based SLES kernel which suggests the
same issue:

  kvm: page allocation stalls for 194572ms, order:9, mode:0x4740ca(__GFP_HIGHMEM|__GFP_IO|__GFP_FS|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_THISNODE|__GFP_MOVABLE|__GFP_DIRECT_RECLAIM), nodemask=(null)
  kvm cpuset=/ mems_allowed=0-1
  CPU: 10 PID: 84752 Comm: kvm Tainted: G        W 4.12.0+98-ph <a href="/view.php?id=1" title="[geschlossen] Integration Ramdisk" class="resolved">0000001</a> SLE15 (unreleased)
  Hardware name: Supermicro SYS-1029P-WTRT/X11DDW-NT, BIOS 2.0 12/05/2017
  Call Trace:
   dump_stack+0x5c/0x84
   warn_alloc+0xe0/0x180
   __alloc_pages_slowpath+0x820/0xc90
   __alloc_pages_nodemask+0x1cc/0x210
   alloc_pages_vma+0x1e5/0x280
   do_huge_pmd_wp_page+0x83f/0xf00
   __handle_mm_fault+0x93d/0x1060
   handle_mm_fault+0xc6/0x1b0
   __do_page_fault+0x230/0x430
   do_page_fault+0x2a/0x70
   page_fault+0x7b/0x80
   [...]
  Mem-Info:
  active_anon:126315487 inactive_anon:1612476 isolated_anon:5
   active_file:60183 inactive_file:245285 isolated_file:0
   unevictable:15657 dirty:286 writeback:1 unstable:0
   slab_reclaimable:75543 slab_unreclaimable:2509111
   mapped:81814 shmem:31764 pagetables:370616 bounce:0
   free:32294031 free_pcp:6233 free_cma:0
  Node 0 active_anon:254680388kB inactive_anon:1112760kB active_file:240648kB inactive_file:981168kB unevictable:13368kB isolated(anon):0kB isolated(file):0kB mapped:280240kB dirty:1144kB writeback:0kB shmem:95832kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 81225728kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
  Node 1 active_anon:250583072kB inactive_anon:5337144kB active_file:84kB inactive_file:0kB unevictable:49260kB isolated(anon):20kB isolated(file):0kB mapped:47016kB dirty:0kB writeback:4kB shmem:31224kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 31897600kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no

The defrag mode is "madvise" and from the above report it is clear that
the THP has been allocated for MADV_HUGEPAGA vma.

Andrea has identified that the main source of the problem is
__GFP_THISNODE usage:

: The problem is that direct compaction combined with the NUMA
: __GFP_THISNODE logic in mempolicy.c is telling reclaim to swap very
: hard the local node, instead of failing the allocation if there's no
: THP available in the local node.
:
: Such logic was ok until __GFP_THISNODE was added to the THP allocation
: path even with MPOL_DEFAULT.
:
: The idea behind the __GFP_THISNODE addition, is that it is better to
: provide local memory in PAGE_SIZE units than to use remote NUMA THP
: backed memory. That largely depends on the remote latency though, on
: threadrippers for example the overhead is relatively low in my
: experience.
:
: The combination of __GFP_THISNODE and __GFP_DIRECT_RECLAIM results in
: extremely slow qemu startup with vfio, if the VM is larger than the
: size of one host NUMA node. This is because it will try very hard to
: unsuccessfully swapout get_user_pages pinned pages as result of the
: __GFP_THISNODE being set, instead of falling back to PAGE_SIZE
: allocations and instead of trying to allocate THP on other nodes (it
: would be even worse without vfio type1 GUP pins of course, except it'd
: be swapping heavily instead).

Fix this by removing __GFP_THISNODE for THP requests which are
requesting the direct reclaim.  This effectivelly reverts 5265047ac301
on the grounds that the zone/node reclaim was known to be disruptive due
to premature reclaim when there was memory free.  While it made sense at
the time for HPC workloads without NUMA awareness on rare machines, it
was ultimately harmful in the majority of cases.  The existing behaviour
is similar, if not as widespare as it applies to a corner case but
crucially, it cannot be tuned around like zone_reclaim_mode can.  The
default behaviour should always be to cause the least harm for the
common case.

If there are specialised use cases out there that want zone_reclaim_mode
in specific cases, then it can be built on top.  Longterm we should
consider a memory policy which allows for the node reclaim like behavior
for the specific memory ranges which would allow a

[1] http://lkml.kernel.org/r/20180820032204.9591-1-aarcange@redhat.com

Mel said:

: Both patches look correct to me but I'm responding to this one because
: it's the fix.  The change makes sense and moves further away from the
: severe stalling behaviour we used to see with both THP and zone reclaim
: mode.
:
: I put together a basic experiment with usemem configured to reference a
: buffer multiple times that is 80% the size of main memory on a 2-socket
: box with symmetric node sizes and defrag set to "always".  The defrag
: setting is not the default but it would be functionally similar to
: accessing a buffer with madvise(MADV_HUGEPAGE).  Usemem is configured to
: reference the buffer multiple times and while it's not an interesting
: workload, it would be expected to complete reasonably quickly as it fits
: within memory.  The results were;
:
: usemem
:                                   vanilla           noreclaim-v1
: Amean     Elapsd-1       42.78 (   0.00%)       26.87 (  37.18%)
: Amean     Elapsd-3       27.55 (   0.00%)        7.44 (  73.00%)
: Amean     Elapsd-4        5.72 (   0.00%)        5.69 (   0.45%)
:
: This shows the elapsed time in seconds for 1 thread, 3 threads and 4
: threads referencing buffers 80% the size of memory.  With the patches
: applied, it's 37.18% faster for the single thread and 73% faster with two
: threads.  Note that 4 threads showing little difference does not indicate
: the problem is related to thread counts.  It's simply the case that 4
: threads gets spread so their workload mostly fits in one node.
:
: The overall view from /proc/vmstats is more startling
:
:                          4.19.0-rc1  4.19.0-rc1
:                             vanillanoreclaim-v1r1
: Minor Faults               35593425      708164
: Major Faults                 484088          36
: Swap Ins                    3772837           0
: Swap Outs                   3932295           0
:
: Massive amounts of swap in/out without the patch
:
: Direct pages scanned        6013214           0
: Kswapd pages scanned              0           0
: Kswapd pages reclaimed            0           0
: Direct pages reclaimed      4033009           0
:
: Lots of reclaim activity without the patch
:
: Kswapd efficiency              100%        100%
: Kswapd velocity               0.000       0.000
: Direct efficiency               67%        100%
: Direct velocity           11191.956       0.000
:
: Mostly from direct reclaim context as you'd expect without the patch.
:
: Page writes by reclaim  3932314.000       0.000
: Page writes file                 19           0
: Page writes anon            3932295           0
: Page reclaim immediate        42336           0
:
: Writes from reclaim context is never good but the patch eliminates it.
:
: We should never have default behaviour to thrash the system for such a
: basic workload.  If zone reclaim mode behaviour is ever desired but on a
: single task instead of a global basis then the sensible option is to build
: a mempolicy that enforces that behaviour.

This was a severe regression compared to previous kernels that made
important workloads unusable and it starts when __GFP_THISNODE was
added to THP allocations under MADV_HUGEPAGE.  It is not a significant
risk to go to the previous behavior before __GFP_THISNODE was added, it
worked like that for years.

This was simply an optimization to some lucky workloads that can fit in
a single node, but it ended up breaking the VM for others that can't
possibly fit in a single node, so going back is safe.

[mhocko@suse.com: rewrote the changelog based on the one from Andrea]
Link: http://lkml.kernel.org/r/20180925120326.24392-2-mhocko@kernel.org
Fixes: 5265047ac301 ("mm, thp: really limit transparent hugepage allocation to local node")
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Debugged-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Mel Gorman <mgorman@techsingularity.net>
Tested-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: <stable@vger.kernel.org>	[4.1+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:00 +01:00
Changwei Ge
dd4c84ba2b ocfs2: fix a misuse a of brelse after failing ocfs2_check_dir_entry
commit 29aa30167a0a2e6045a0d6d2e89d8168132333d5 upstream.

Somehow, file system metadata was corrupted, which causes
ocfs2_check_dir_entry() to fail in function ocfs2_dir_foreach_blk_el().

According to the original design intention, if above happens we should
skip the problematic block and continue to retrieve dir entry.  But
there is obviouse misuse of brelse around related code.

After failure of ocfs2_check_dir_entry(), current code just moves to
next position and uses the problematic buffer head again and again
during which the problematic buffer head is released for multiple times.
I suppose, this a serious issue which is long-lived in ocfs2.  This may
cause other file systems which is also used in a the same host insane.

So we should also consider about bakcporting this patch into linux
-stable.

Link: http://lkml.kernel.org/r/HK2PR06MB045211675B43EED794E597B6D56E0@HK2PR06MB0452.apcprd06.prod.outlook.com
Signed-off-by: Changwei Ge <ge.changwei@h3c.com>
Suggested-by: Changkuo Shi <shi.changkuo@h3c.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:00 +01:00
Greg Edwards
acdc6a723e vhost/scsi: truncate T10 PI iov_iter to prot_bytes
commit 4542d623c7134bc1738f8a68ccb6dd546f1c264f upstream.

Commands with protection information included were not truncating the
protection iov_iter to the number of protection bytes in the command.
This resulted in vhost_scsi mis-calculating the size of the protection
SGL in vhost_scsi_calc_sgls(), and including both the protection and
data SG entries in the protection SGL.

Fixes: 09b13fa8c1a1 ("vhost/scsi: Add ANY_LAYOUT support in vhost_scsi_handle_vq")
Signed-off-by: Greg Edwards <gedwards@ddn.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fixes: 09b13fa8c1a1093e9458549ac8bb203a7c65c62a
Cc: stable@vger.kernel.org
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:00 +01:00
Gustavo A. R. Silva
e200279bb8 reset: hisilicon: fix potential NULL pointer dereference
commit e9a2310fb689151166df7fd9971093362d34bd79 upstream.

There is a potential execution path in which function
platform_get_resource() returns NULL. If this happens,
we will end up having a NULL pointer dereference.

Fix this by replacing devm_ioremap with devm_ioremap_resource,
which has the NULL check and the memory region request.

This code was detected with the help of Coccinelle.

Cc: stable@vger.kernel.org
Fixes: 97b7129cd2af ("reset: hisilicon: change the definition of hisi_reset_init")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:00 +01:00
Mikulas Patocka
7ecf24877e mach64: fix image corruption due to reading accelerator registers
commit c09bcc91bb94ed91f1391bffcbe294963d605732 upstream.

Reading the registers without waiting for engine idle returns
unpredictable values. These unpredictable values result in display
corruption - if atyfb_imageblit reads the content of DP_PIX_WIDTH with the
bit DP_HOST_TRIPLE_EN set (from previous invocation), the driver would
never ever clear the bit, resulting in display corruption.

We don't want to wait for idle because it would degrade performance, so
this patch modifies the driver so that it never reads accelerator
registers.

HOST_CNTL doesn't have to be read, we can just write it with
HOST_BYTE_ALIGN because no other part of the driver cares if
HOST_BYTE_ALIGN is set.

DP_PIX_WIDTH is written in the functions atyfb_copyarea and atyfb_fillrect
with the default value and in atyfb_imageblit with the value set according
to the source image data.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Ville Syrjälä <syrjala@sci.fi>
Cc: stable@vger.kernel.org
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:26:00 +01:00
Mikulas Patocka
6bd9b55243 mach64: fix display corruption on big endian machines
commit 3c6c6a7878d00a3ac997a779c5b9861ff25dfcc8 upstream.

The code for manual bit triple is not endian-clean. It builds the variable
"hostdword" using byte accesses, therefore we must read the variable with
"le32_to_cpu".

The patch also enables (hardware or software) bit triple only if the image
is monochrome (image->depth). If we want to blit full-color image, we
shouldn't use the triple code.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Ville Syrjälä <syrjala@sci.fi>
Cc: stable@vger.kernel.org
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:25:59 +01:00
Yan, Zheng
43e38372f9 Revert "ceph: fix dentry leak in splice_dentry()"
commit efe328230dc01aa0b1269aad0b5fae73eea4677a upstream.

This reverts commit 8b8f53af1ed9df88a4c0fbfdf3db58f62060edf3.

splice_dentry() is used by three places. For two places, req->r_dentry
is passed to splice_dentry(). In the case of error, req->r_dentry does
not get updated. So splice_dentry() should not drop reference.

Cc: stable@vger.kernel.org # 4.18+
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:25:59 +01:00
Ilya Dryomov
9efe04470a libceph: bump CEPH_MSG_MAX_DATA_LEN
commit 94e6992bb560be8bffb47f287194adf070b57695 upstream.

If the read is large enough, we end up spinning in the messenger:

  libceph: osd0 192.168.122.1:6801 io error
  libceph: osd0 192.168.122.1:6801 io error
  libceph: osd0 192.168.122.1:6801 io error

This is a receive side limit, so only reads were affected.

Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:25:59 +01:00
Enric Balletbo i Serra
ee27421c80 clk: rockchip: Fix static checker warning in rockchip_ddrclk_get_parent call
commit 665636b2940d0897c4130253467f5e8c42eea392 upstream.

Fixes the signedness bug returning '(-22)' on the return type by removing the
sanity checker in rockchip_ddrclk_get_parent(). The function should return
and unsigned value only and it's safe to remove the sanity checker as the
core functions that call get_parent like clk_core_get_parent_by_index already
ensures the validity of the clk index returned (index >= core->num_parents).

Fixes: a4f182bf81f18 ("clk: rockchip: add new clock-type for the ddrclk")
Cc: stable@vger.kernel.org
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Reviewed-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:25:59 +01:00
Ronald Wahl
79a3e118b0 clk: at91: Fix division by zero in PLL recalc_rate()
commit 0f5cb0e6225cae2f029944cb8c74617aab6ddd49 upstream.

Commit a982e45dc150 ("clk: at91: PLL recalc_rate() now using cached MUL
and DIV values") removed a check that prevents a division by zero. This
now causes a stacktrace when booting the kernel on a at91 platform if
the PLL DIV register contains zero. This commit reintroduces this check.

Fixes: a982e45dc150 ("clk: at91: PLL recalc_rate() now using cached...")
Cc: <stable@vger.kernel.org>
Signed-off-by: Ronald Wahl <rwahl@gmx.de>
Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:25:59 +01:00
Krzysztof Kozlowski
0a468686fa clk: s2mps11: Fix matching when built as module and DT node contains compatible
commit 8985167ecf57f97061599a155bb9652c84ea4913 upstream.

When driver is built as module and DT node contains clocks compatible
(e.g. "samsung,s2mps11-clk"), the module will not be autoloaded because
module aliases won't match.

The modalias from uevent: of:NclocksT<NULL>Csamsung,s2mps11-clk
The modalias from driver: platform:s2mps11-clk

The devices are instantiated by parent's MFD.  However both Device Tree
bindings and parent define the compatible for clocks devices.  In case
of module matching this DT compatible will be used.

The issue will not happen if this is a built-in (no need for module
matching) or when clocks DT node does not contain compatible (not
correct from bindings perspective but working for driver).

Note when backporting to stable kernels: adjust the list of device ID
entries.

Cc: <stable@vger.kernel.org>
Fixes: 53c31b3437a6 ("mfd: sec-core: Add of_compatible strings for clock MFD cells")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:25:59 +01:00
Max Filippov
0dac028156 xtensa: fix boot parameters address translation
commit 40dc948f234b73497c3278875eb08a01d5854d3f upstream.

The bootloader may pass physical address of the boot parameters structure
to the MMUv3 kernel in the register a2. Code in the _SetupMMU block in
the arch/xtensa/kernel/head.S is supposed to map that physical address to
the virtual address in the configured virtual memory layout.

This code haven't been updated when additional 256+256 and 512+512
memory layouts were introduced and it may produce wrong addresses when
used with these layouts.

Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:25:59 +01:00
Max Filippov
a10aa331aa xtensa: make sure bFLT stack is 16 byte aligned
commit 0773495b1f5f1c5e23551843f87b5ff37e7af8f7 upstream.

Xtensa ABI requires stack alignment to be at least 16. In noMMU
configuration ARCH_SLAB_MINALIGN is used to align stack. Make it at
least 16.

This fixes the following runtime error in noMMU configuration, caused by
interaction between insufficiently aligned stack and alloca function,
that results in corruption of on-stack variable in the libc function
glob:

 Caught unhandled exception in 'sh' (pid = 47, pc = 0x02d05d65)
  - should not happen
  EXCCAUSE is 15

Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:25:59 +01:00
Max Filippov
c9226141c2 xtensa: add NOTES section to the linker script
commit 4119ba211bc4f1bf638f41e50b7a0f329f58aa16 upstream.

This section collects all source .note.* sections together in the
vmlinux image. Without it .note.Linux section may be placed at address
0, while the rest of the kernel is at its normal address, resulting in a
huge vmlinux.bin image that may not be linked into the xtensa Image.elf.

Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:25:59 +01:00
Huacai Chen
fca93d34f3 MIPS: Loongson-3: Fix BRIDGE irq delivery problem
[ Upstream commit 360fe725f8849aaddc53475fef5d4a0c439b05ae ]

After commit e509bd7da149dc349160 ("genirq: Allow migration of chained
interrupts by installing default action") Loongson-3 fails at here:

setup_irq(LOONGSON_HT1_IRQ, &cascade_irqaction);

This is because both chained_action and cascade_irqaction don't have
IRQF_SHARED flag. This will cause Loongson-3 resume fails because HPET
timer interrupt can't be delivered during S3. So we set the irqchip of
the chained irq to loongson_irq_chip which doesn't disable the chained
irq in CP0.Status.

Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/20434/
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@linux-mips.org
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Cc: Huacai Chen <chenhuacai@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-11-21 09:25:59 +01:00
Huacai Chen
83babf64c1 MIPS: Loongson-3: Fix CPU UART irq delivery problem
[ Upstream commit d06f8a2f1befb5a3d0aa660ab1c05e9b744456ea ]

Masking/unmasking the CPU UART irq in CP0_Status (and redirecting it to
other CPUs) may cause interrupts be lost, especially in multi-package
machines (Package-0's UART irq cannot be delivered to others). So make
mask_loongson_irq() and unmask_loongson_irq() be no-ops.

The original problem (UART IRQ may deliver to any core) is also because
of masking/unmasking the CPU UART irq in CP0_Status. So it is safe to
remove all of the stuff.

Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/20433/
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@linux-mips.org
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Cc: Huacai Chen <chenhuacai@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-11-21 09:25:58 +01:00
Helge Deller
b7706a8242 parisc: Fix exported address of os_hpmc handler
[ Upstream commit 99a3ae51d557d8e38a7aece65678a31f9db215ee ]

In the C-code we need to put the physical address of the hpmc handler in
the interrupt vector table (IVA) in order to get HPMCs working.  Since
on parisc64 function pointers are indirect (in fact they are function
descriptors) we instead export the address as variable and not as
function.

This reverts a small part of commit f39cce654f9a ("parisc: Add
cfi_startproc and cfi_endproc to assembly code").

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org>    [4.9+]
Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-11-21 09:25:58 +01:00
Helge Deller
5a41c65213 parisc: Fix HPMC handler by increasing size to multiple of 16 bytes
[ Upstream commit d5654e156bc4d68a87bbaa6d7e020baceddf6e68 ]

Make sure that the HPMC (High Priority Machine Check) handler is 16-byte
aligned and that it's length in the IVT is a multiple of 16 bytes.
Otherwise PDC may decide not to call the HPMC crash handler.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-11-21 09:25:58 +01:00
Helge Deller
4d01ed8336 parisc: Align os_hpmc_size on word boundary
[ Upstream commit 0ed9d3de5f8f97e6efd5ca0e3377cab5f0451ead ]

The os_hpmc_size variable sometimes wasn't aligned at word boundary and thus
triggered the unaligned fault handler at startup.
Fix it by aligning it properly.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v4.14+
Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-11-21 09:25:58 +01:00
Kees Cook
3d67d62931 bna: ethtool: Avoid reading past end of buffer
[ Upstream commit 4dc69c1c1fff2f587f8e737e70b4a4e7565a5c94 ]

Using memcpy() from a string that is shorter than the length copied means
the destination buffer is being filled with arbitrary data from the kernel
rodata segment. Instead, use strncpy() which will fill the trailing bytes
with zeros.

This was found with the future CONFIG_FORTIFY_SOURCE feature.

Cc: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-11-21 09:25:58 +01:00
Vincenzo Maffione
e029b8379e e1000: fix race condition between e1000_down() and e1000_watchdog
[ Upstream commit 44c445c3d1b4eacff23141fa7977c3b2ec3a45c9 ]

This patch fixes a race condition that can result into the interface being
up and carrier on, but with transmits disabled in the hardware.
The bug may show up by repeatedly IFF_DOWN+IFF_UP the interface, which
allows e1000_watchdog() interleave with e1000_down().

    CPU x                           CPU y
    --------------------------------------------------------------------
    e1000_down():
        netif_carrier_off()
                                    e1000_watchdog():
                                        if (carrier == off) {
                                            netif_carrier_on();
                                            enable_hw_transmit();
                                        }
        disable_hw_transmit();
                                    e1000_watchdog():
                                        /* carrier on, do nothing */

Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-11-21 09:25:58 +01:00
Colin Ian King
55edbf1ce1 e1000: avoid null pointer dereference on invalid stat type
[ Upstream commit 5983587c8c5ef00d6886477544ad67d495bc5479 ]

Currently if the stat type is invalid then data[i] is being set
either by dereferencing a null pointer p, or it is reading from
an incorrect previous location if we had a valid stat type
previously.  Fix this by skipping over the read of p on an invalid
stat type.

Detected by CoverityScan, CID#113385 ("Explicit null dereferenced")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-11-21 09:25:58 +01:00
Michal Hocko
1a55a71ed2 mm: do not bug_on on incorrect length in __mm_populate()
commit bb177a732c4369bb58a1fe1df8f552b6f0f7db5f upstream.

syzbot has noticed that a specially crafted library can easily hit
VM_BUG_ON in __mm_populate

  kernel BUG at mm/gup.c:1242!
  invalid opcode: 0000 [#1] SMP
  CPU: 2 PID: 9667 Comm: a.out Not tainted 4.18.0-rc3 #644
  Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
  RIP: 0010:__mm_populate+0x1e2/0x1f0
  Code: 55 d0 65 48 33 14 25 28 00 00 00 89 d8 75 21 48 83 c4 20 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 75 18 f1 ff 0f 0b e8 6e 18 f1 ff <0f> 0b 31 db eb c9 e8 93 06 e0 ff 0f 1f 00 55 48 89 e5 53 48 89 fb
  Call Trace:
     vm_brk_flags+0xc3/0x100
     vm_brk+0x1f/0x30
     load_elf_library+0x281/0x2e0
     __ia32_sys_uselib+0x170/0x1e0
     do_fast_syscall_32+0xca/0x420
     entry_SYSENTER_compat+0x70/0x7f

The reason is that the length of the new brk is not page aligned when we
try to populate the it.  There is no reason to bug on that though.
do_brk_flags already aligns the length properly so the mapping is
expanded as it should.  All we need is to tell mm_populate about it.
Besides that there is absolutely no reason to to bug_on in the first
place.  The worst thing that could happen is that the last page wouldn't
get populated and that is far from putting system into an inconsistent
state.

Fix the issue by moving the length sanitization code from do_brk_flags
up to vm_brk_flags.  The only other caller of do_brk_flags is brk
syscall entry and it makes sure to provide the proper length so t here
is no need for sanitation and so we can use do_brk_flags without it.

Also remove the bogus BUG_ONs.

[osalvador@techadventures.net: fix up vm_brk_flags s@request@len@]
Link: http://lkml.kernel.org/r/20180706090217.GI32658@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: syzbot <syzbot+5dcb560fe12aa5091c06@syzkaller.appspotmail.com>
Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 4.9:
 - There is no do_brk_flags() function; update do_brk()
 - Adjust context]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-11-21 09:25:58 +01:00
Miklos Szeredi
c31ccf1fe8 fuse: set FR_SENT while locked
commit 4c316f2f3ff315cb48efb7435621e5bfb81df96d upstream.

Otherwise fuse_dev_do_write() could come in and finish off the request, and
the set_bit(FR_SENT, ...) could trigger the WARN_ON(test_bit(FR_SENT, ...))
in request_end().

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Reported-by: syzbot+ef054c4d3f64cd7f7cec@syzkaller.appspotmai
Fixes: 46c34a348b0a ("fuse: no fc->lock for pqueue parts")
Cc: <stable@vger.kernel.org> # v4.2
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:25:58 +01:00
Miklos Szeredi
7238fcbd68 fuse: fix blocked_waitq wakeup
commit 908a572b80f6e9577b45e81b3dfe2e22111286b8 upstream.

Using waitqueue_active() is racy.  Make sure we issue a wake_up()
unconditionally after storing into fc->blocked.  After that it's okay to
optimize with waitqueue_active() since the first wake up provides the
necessary barrier for all waiters, not the just the woken one.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 3c18ef8117f0 ("fuse: optimize wake_up")
Cc: <stable@vger.kernel.org> # v3.10
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:25:58 +01:00
Kirill Tkhai
0799a93d30 fuse: Fix use-after-free in fuse_dev_do_write()
commit d2d2d4fb1f54eff0f3faa9762d84f6446a4bc5d0 upstream.

After we found req in request_find() and released the lock,
everything may happen with the req in parallel:

cpu0                              cpu1
fuse_dev_do_write()               fuse_dev_do_write()
  req = request_find(fpq, ...)    ...
  spin_unlock(&fpq->lock)         ...
  ...                             req = request_find(fpq, oh.unique)
  ...                             spin_unlock(&fpq->lock)
  queue_interrupt(&fc->iq, req);   ...
  ...                              ...
  ...                              ...
  request_end(fc, req);
    fuse_put_request(fc, req);
  ...                              queue_interrupt(&fc->iq, req);


Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 46c34a348b0a ("fuse: no fc->lock for pqueue parts")
Cc: <stable@vger.kernel.org> # v4.2
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:25:58 +01:00
Kirill Tkhai
7996b1c0ea fuse: Fix use-after-free in fuse_dev_do_read()
commit bc78abbd55dd28e2287ec6d6502b842321a17c87 upstream.

We may pick freed req in this way:

[cpu0]                                  [cpu1]
fuse_dev_do_read()                      fuse_dev_do_write()
   list_move_tail(&req->list, ...);     ...
   spin_unlock(&fpq->lock);             ...
   ...                                  request_end(fc, req);
   ...                                    fuse_put_request(fc, req);
   if (test_bit(FR_INTERRUPTED, ...))
         queue_interrupt(fiq, req);

Fix that by keeping req alive until we finish all manipulations.

Reported-by: syzbot+4e975615ca01f2277bdd@syzkaller.appspotmail.com
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 46c34a348b0a ("fuse: no fc->lock for pqueue parts")
Cc: <stable@vger.kernel.org> # v4.2
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:25:57 +01:00
Quinn Tran
0dd45a4df3 scsi: qla2xxx: shutdown chip if reset fail
commit 1e4ac5d6fe0a4af17e4b6251b884485832bf75a3 upstream.

If chip unable to fully initialize, use full shutdown sequence to clear out
any stale FW state.

Fixes: e315cd28b9ef ("[SCSI] qla2xxx: Code changes for qla data structure refactoring")
Cc: stable@vger.kernel.org  #4.10
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:25:57 +01:00
Himanshu Madhani
0b6393523e scsi: qla2xxx: Fix incorrect port speed being set for FC adapters
commit 4c1458df9635c7e3ced155f594d2e7dfd7254e21 upstream.

Fixes: 6246b8a1d26c7c ("[SCSI] qla2xxx: Enhancements to support ISP83xx.")
Fixes: 1bb395485160d2 ("qla2xxx: Correct iiDMA-update calling conventions.")
Cc: <stable@vger.kernel.org>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21 09:25:57 +01:00