753462 Commits

Author SHA1 Message Date
Sebastian Andrzej Siewior
2075b16e32 rbtree: include rcu.h
Since commit c1adf20052 ("Introduce rb_replace_node_rcu()")
rbtree_augmented.h uses RCU related data structures but does not include
the header file.  It works as long as it gets somehow included before
that and fails otherwise.

Link: http://lkml.kernel.org/r/20180504103159.19938-1-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-11 17:28:45 -07:00
Changbin Du
78eb0c6356 scripts/faddr2line: fix error when addr2line output contains discriminator
When addr2line output contains a discriminator, the current awk script
cannot parse it.  Fix this by extracting the key words with a regex,
which is more reliable.

  $ scripts/faddr2line vmlinux tlb_flush_mmu_free+0x26
  tlb_flush_mmu_free+0x26/0x50:
  tlb_flush_mmu_free at mm/memory.c:258 (discriminator 3)
  scripts/faddr2line: eval: line 173: unexpected EOF while looking for matching `)'

Link: http://lkml.kernel.org/r/1525323379-25193-1-git-send-email-changbin.du@intel.com
Fixes: 6870c0165f ("scripts/faddr2line: show the code context")
Signed-off-by: Changbin Du <changbin.du@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: NeilBrown <neilb@suse.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-11 17:28:45 -07:00
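
The parsing problem described in the faddr2line entry above can be sketched in a few lines. This is a hypothetical Python illustration, not the awk code from the patch: the function name `parse_addr2line` and the regular expression are assumptions made for the example. It shows how a regex can pick out the function, file and line while tolerating an optional "(discriminator N)" suffix.

```python
import re

# Hypothetical pattern, not taken from scripts/faddr2line:
# "<function> at <file>:<line>", optionally followed by " (discriminator N)".
LINE_RE = re.compile(
    r"^(?P<func>\S+) at (?P<file>[^:]+):(?P<line>\d+)"
    r"(?: \(discriminator (?P<disc>\d+)\))?$"
)

def parse_addr2line(text):
    """Return (function, file, line), or None if the text does not match."""
    m = LINE_RE.match(text.strip())
    if m is None:
        return None
    return m.group("func"), m.group("file"), int(m.group("line"))
```

Because the discriminator suffix is matched explicitly and never evaluated, the unbalanced parenthesis that tripped up `eval` in the report above cannot occur in this sketch.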
Ashish Samant
e438302920 ocfs2: take inode cluster lock before moving reflinked inode from orphan dir
While reflinking an inode, we create a new inode in orphan directory,
then take EX lock on it, reflink the original inode to orphan inode and
release EX lock.  Once the lock is released another node could request
it in EX mode from ocfs2_recover_orphans() which causes downconvert of
the lock, on this node, to NL mode.

Later we attempt to initialize security acl for the orphan inode and
move it to the reflink destination.  However, while doing this we don't
take EX lock on the inode.  This could potentially cause problems
because we could be starting transaction, accessing journal and
modifying metadata of the inode while holding NL lock and with another
node holding EX lock on the inode.

Fix this by taking orphan inode cluster lock in EX mode before
initializing security and moving orphan inode to reflink destination.
Use the __tracker variant while taking inode lock to avoid recursive
locking in the ocfs2_init_security_and_acl() call chain.

Link: http://lkml.kernel.org/r/1523475107-7639-1-git-send-email-ashish.samant@oracle.com
Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Acked-by: Jun Piao <piaojun@huawei.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-11 17:28:45 -07:00
David Rientjes
27ae357fa8 mm, oom: fix concurrent munlock and oom reaper unmap, v3
Since exit_mmap() is done without the protection of mm->mmap_sem, it is
possible for the oom reaper to concurrently operate on an mm until
MMF_OOM_SKIP is set.

This allows munlock_vma_pages_all() to concurrently run while the oom
reaper is operating on a vma.  Since munlock_vma_pages_range() depends
on clearing VM_LOCKED from vm_flags before actually doing the munlock to
determine if any other vmas are locking the same memory, the check for
VM_LOCKED in the oom reaper is racy.

This is especially noticeable on architectures such as powerpc where
clearing a huge pmd requires serialize_against_pte_lookup().  If the pmd
is zapped by the oom reaper during follow_page_mask() after the check
for pmd_none() is bypassed, this ends up dereferencing a NULL ptl or a
kernel oops.

Fix this by manually freeing all possible memory from the mm before
doing the munlock and then setting MMF_OOM_SKIP.  The oom reaper can not
run on the mm anymore so the munlock is safe to do in exit_mmap().  It
also matches the logic that the oom reaper currently uses for
determining when to set MMF_OOM_SKIP itself, so there's no new risk of
excessive oom killing.

This fixes CVE-2018-1000200.

Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1804241526320.238665@chino.kir.corp.google.com
Fixes: 2129258024 ("mm: oom: let oom_reap_task and exit_mmap run concurrently")
Signed-off-by: David Rientjes <rientjes@google.com>
Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>	[4.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-11 17:28:45 -07:00
Naoya Horiguchi
013567be19 mm: migrate: fix double call of radix_tree_replace_slot()
radix_tree_replace_slot() is called twice for the head page, which is
obviously a bug.  Let's fix it.

Link: http://lkml.kernel.org/r/20180423072101.GA12157@hori1.linux.bs1.fc.nec.co.jp
Fixes: e71769ae52 ("mm: enable thp migration for shmem thp")
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reported-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Zi Yan <zi.yan@sent.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-11 17:28:45 -07:00
Laura Abbott
3955333df9 proc/kcore: don't bounds check against address 0
The existing kcore code checks for bad addresses against __va(0) with
the assumption that this is the lowest address on the system.  This may
not hold true on some systems (e.g.  arm64) and produce overflows and
crashes.  Switch to using other functions to validate the address range.

It's currently only seen on arm64 and it's not clear if anyone wants to
use that particular combination on a stable release.  So this is not
urgent for stable.

Link: http://lkml.kernel.org/r/20180501201143.15121-1-labbott@redhat.com
Signed-off-by: Laura Abbott <labbott@redhat.com>
Tested-by: Dave Anderson <anderson@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-11 17:28:45 -07:00
Roman Gushchin
7aaf772723 mm: don't show nr_indirectly_reclaimable in /proc/vmstat
Don't show nr_indirectly_reclaimable in /proc/vmstat, because there is
no need to export this vm counter to userspace, and some changes are
expected in reclaimable object accounting, which can alter this counter.

Link: http://lkml.kernel.org/r/20180425191422.9159-1-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-11 17:28:45 -07:00
Pavel Tatashin
27227c7338 mm: sections are not offlined during memory hotremove
Memory hotplug and hotremove operate with per-block granularity.  If the
machine has a large amount of memory (more than 64G), the size of a
memory block can span multiple sections.  By mistake, during hotremove
we set only the first section to offline state.

The bug was discovered because a kernel selftest started to fail after
commit "mm/memory_hotplug: optimize probe routine":
  https://lkml.kernel.org/r/20180423011247.GK5563@yexl-desktop

However, the bug is older than that commit.  That optimization also added
a check for sections to be in a proper state during hotplug operation.

Link: http://lkml.kernel.org/r/20180427145257.15222-1-pasha.tatashin@oracle.com
Fixes: 2d070eab2e ("mm: consider zone which is not fully populated to have holes")
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-11 17:28:45 -07:00
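
The hotremove bug described in the entry above comes down to a loop that has to cover every section of a memory block. The following is a toy Python model under stated assumptions: the list of section states and the names `offline_first_only` and `offline_block` are hypothetical and do not correspond to kernel functions.

```python
def offline_first_only(section_states, first_section, sections_per_block):
    """Buggy behaviour: only the first section of the block goes offline."""
    section_states[first_section] = "offline"
    return section_states

def offline_block(section_states, first_section, sections_per_block):
    """Fixed behaviour: every section spanned by the block goes offline."""
    for nr in range(first_section, first_section + sections_per_block):
        section_states[nr] = "offline"
    return section_states
```

With one section per block the two functions behave the same, which is consistent with the entry's note that the problem needs blocks spanning multiple sections.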
Vitaly Wool
6098d7e136 z3fold: fix reclaim lock-ups
Do not try to optimize in-page object layout while the page is under
reclaim.  This fixes lock-ups on reclaim and improves reclaim
performance at the same time.

[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20180430125800.444cae9706489f412ad12621@gmail.com
Signed-off-by: Vitaly Wool <vitaly.vul@sony.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Cc: <Oleksiy.Avramchenko@sony.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-11 17:28:45 -07:00
Jeffrey Hugo
ae646f0b9c init: fix false positives in W+X checking
load_module() creates W+X mappings via __vmalloc_node_range() (from
layout_and_allocate()->move_module()->module_alloc()) by using
PAGE_KERNEL_EXEC.  These mappings are later cleaned up via
"call_rcu_sched(&freeinit->rcu, do_free_init)" from do_init_module().

This is a problem because call_rcu_sched() queues work, which can be run
after debug_checkwx() is run, resulting in a race condition.  If hit,
the race results in a nasty splat about insecure W+X mappings, which
results in a poor user experience as these are not the mappings that
debug_checkwx() is intended to catch.

This issue is observed on multiple arm64 platforms, and has been
artificially triggered on an x86 platform.

Address the race by flushing the queued work before running the
arch-defined mark_rodata_ro() which then calls debug_checkwx().

Link: http://lkml.kernel.org/r/1525103946-29526-1-git-send-email-jhugo@codeaurora.org
Fixes: e1a58320a3 ("x86/mm: Warn on W^X mappings")
Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
Reported-by: Timur Tabi <timur@codeaurora.org>
Reported-by: Jan Glauber <jan.glauber@caviumnetworks.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Laura Abbott <labbott@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-11 17:28:45 -07:00
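
The ordering problem in the W+X entry above can be illustrated with a small Python model. Everything here is a stand-in: the `Kernel` class, its `deferred` queue (playing the role of queued RCU callbacks) and the method names are assumptions for the sketch, not kernel interfaces.

```python
from collections import deque

class Kernel:
    def __init__(self):
        self.wx_mappings = set()
        self.deferred = deque()      # stand-in for queued cleanup callbacks

    def load_module(self, name):
        self.wx_mappings.add(name)   # init mapping starts out W+X
        # The cleanup is queued here, not run immediately.
        self.deferred.append(lambda: self.wx_mappings.discard(name))

    def flush_deferred(self):
        """Run all queued cleanups; the fix does this before the check."""
        while self.deferred:
            self.deferred.popleft()()

    def check_wx(self):
        """Return True when no W+X mapping remains."""
        return not self.wx_mappings
```

Calling `check_wx()` before `flush_deferred()` reports a leftover mapping that is about to be freed anyway, which mirrors the false positive described above; flushing first removes the race in this model.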
Yury Norov
4ba281d5bd lib/find_bit_benchmark.c: avoid soft lockup in test_find_first_bit()
test_find_first_bit() is intentionally sub-optimal, and its long run time
may cause a soft lockup on some systems.  Decrease the length of the
bitmap to traverse to avoid the lockup.

With this change, test execution time doesn't exceed 0.2 seconds on my
testing system.

Link: http://lkml.kernel.org/r/20180420171949.15710-1-ynorov@caviumnetworks.com
Fixes: 4441fca0a2 ("lib: test module for find_*_bit() functions")
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-11 17:28:45 -07:00
Dmitry Vyukov
c9cf87ea6a KASAN: prohibit KASAN+STRUCTLEAK combination
Currently STRUCTLEAK inserts initialization out of live scope of variables
from KASAN point of view.  This leads to KASAN false positive reports.
Prohibit this combination for now.

Link: http://lkml.kernel.org/r/20180419172451.104700-1-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Dennis Zhou <dennisszhou@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-11 17:28:45 -07:00
Shuah Khan (Samsung OSG)
1d1c8e5f0d MAINTAINERS: update Shuah's email address
Update email address in MAINTAINERS file due to IT infrastructure changes
at Samsung.

Link: http://lkml.kernel.org/r/20180501212815.25911-1-shuah@kernel.org
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-11 17:28:45 -07:00
Linus Torvalds
4bc871984f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Verify lengths of keys provided by the user in AF_KEY, from Kevin
    Easton.

 2) Add device ID for BCM89610 PHY. Thanks to Bhadram Varka.

 3) Add Spectre guards to some ATM code, courtesy of Gustavo A. R.
    Silva.

 4) Fix infinite loop in NSH protocol code. To Eric Dumazet we are most
    grateful for this fix.

 5) Line up /proc/net/netlink headers properly. This fix from YU Bo, we
    do appreciate.

 6) Use after free in TLS code. Once again we are blessed by the
    honorable Eric Dumazet with this fix.

 7) Fix regression in TLS code causing stalls on partial TLS records.
    This fix is bestowed upon us by Andrew Tomt.

 8) Deal with too small MTUs properly in LLC code, another great gift
    from Eric Dumazet.

 9) Handle cached route flushing properly wrt. MTU locking in ipv4, to
    Hangbin Liu we give thanks for this.

10) Fix regression in SO_BINDTODEVICE handling wrt. UDP socket demux.
    Paolo Abeni, he gave us this.

11) Range check coalescing parameters in mlx4 driver, thank you Moshe
    Shemesh.

12) Some ipv6 ICMP error handling fixes in rxrpc, from our good brother
    David Howells.

13) Fix kexec on mlx5 by freeing IRQs in shutdown path. Daniel Juergens,
    you're the best!

14) Don't send bonding RLB updates to invalid MAC addresses. Debabrata
    Benerjee saved us!

15) Uh oh, we were leaking in udp_sendmsg and ping_v4_sendmsg. The ship
    is now water tight, thanks to Andrey Ignatov.

16) IPSEC memory leak in ixgbe from Colin Ian King, man we've got holes
    everywhere!

17) Fix error path in tcf_proto_create, Jiri Pirko what would we do
    without you!

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (92 commits)
  net sched actions: fix refcnt leak in skbmod
  net: sched: fix error path in tcf_proto_create() when modules are not configured
  net sched actions: fix invalid pointer dereferencing if skbedit flags missing
  ixgbe: fix memory leak on ipsec allocation
  ixgbevf: fix ixgbevf_xmit_frame()'s return type
  ixgbe: return error on unsupported SFP module when resetting
  ice: Set rq_last_status when cleaning rq
  ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg
  mlxsw: core: Fix an error handling path in 'mlxsw_core_bus_device_register()'
  bonding: send learning packets for vlans on slave
  bonding: do not allow rlb updates to invalid mac
  net/mlx5e: Err if asked to offload TC match on frag being first
  net/mlx5: E-Switch, Include VF RDMA stats in vport statistics
  net/mlx5: Free IRQs in shutdown path
  rxrpc: Trace UDP transmission failure
  rxrpc: Add a tracepoint to log ICMP/ICMP6 and error messages
  rxrpc: Fix the min security level for kernel calls
  rxrpc: Fix error reception on AF_INET6 sockets
  rxrpc: Fix missing start of call timeout
  qed: fix spelling mistake: "taskelt" -> "tasklet"
  ...
2018-05-11 14:14:46 -07:00
Linus Torvalds
a1f45efbb9 NFS client fixes for Linux 4.17-rc4
Bugfixes:
 - Fix a possible NFSoRDMA list corruption during recovery
 - Fix sunrpc tracepoint crashes
 
 Other change:
 - Update Trond's email in the MAINTAINERS file

Merge tag 'nfs-for-4.17-2' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client fixes from Anna Schumaker:
 "These patches fix both a possible corruption during NFSoRDMA MR
  recovery, and a sunrpc tracepoint crash.

  Additionally, Trond has a new email address to put in the MAINTAINERS
  file"

* tag 'nfs-for-4.17-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  Change Trond's email address in MAINTAINERS
  sunrpc: Fix latency trace point crashes
  xprtrdma: Fix list corruption / DMAR errors during MR recovery
2018-05-11 13:56:43 -07:00
Roman Mashak
a52956dfc5 net sched actions: fix refcnt leak in skbmod
When an application fails to pass flags in the netlink TLV when replacing
an existing skbmod action, the kernel leaks a refcnt:

$ tc actions get action skbmod index 1
total acts 0

        action order 0: skbmod pipe set smac 00:11:22:33:44:55
         index 1 ref 1 bind 0

For example, at this point a buggy application replaces the action with
index 1 with new smac 00:aa:22:33:44:55, it fails because of zero flags,
however refcnt gets bumped:

$ tc actions get actions skbmod index 1
total acts 0

        action order 0: skbmod pipe set smac 00:11:22:33:44:55
         index 1 ref 2 bind 0
$

The patch fixes this by calling tcf_idr_release() on existing actions.

Fixes: 86da71b573 ("net_sched: Introduce skbmod action")
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-11 16:37:03 -04:00
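
The refcnt leak in the skbmod entry above follows a common pattern: a lookup takes a reference, and an early error return forgets to drop it. Below is a toy Python model of that pattern. The `Action` class and `replace_action` function are hypothetical and only mimic the shape of the problem, not the kernel's actual code paths.

```python
class Action:
    def __init__(self, index):
        self.index = index
        self.refcnt = 1

def replace_action(table, index, flags):
    """Toy model of replacing an existing action.

    Looking the action up takes a reference; every exit path must drop it.
    """
    action = table[index]
    action.refcnt += 1           # reference taken by the lookup
    if not flags:                # missing flags: reject the request
        action.refcnt -= 1       # the fix: release before returning the error
        return False
    # ... new parameters would be applied here ...
    action.refcnt -= 1           # lookup reference dropped on success too
    return True
```

Without the release on the error path, each failed replace would raise `refcnt` by one, matching the "ref 1" to "ref 2" change shown in the `tc` output above.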
Linus Torvalds
ac42803695 These patches fix two long-standing bugs in the DIO code path, one of
which is a crash trivially triggerable with splice().

Merge tag 'ceph-for-4.17-rc5' of git://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
 "These patches fix two long-standing bugs in the DIO code path, one of
  which is a crash trivially triggerable with splice()"

* tag 'ceph-for-4.17-rc5' of git://github.com/ceph/ceph-client:
  ceph: fix iov_iter issues in ceph_direct_read_write()
  libceph: add osd_req_op_extent_osd_data_bvecs()
  ceph: fix rsize/wsize capping in ceph_direct_read_write()
2018-05-11 13:36:06 -07:00
Jiri Pirko
d68d75fdc3 net: sched: fix error path in tcf_proto_create() when modules are not configured
In case modules are not configured, error out when tp->ops is null
and prevent later null pointer dereference.

Fixes: 33a48927c1 ("sched: push TC filter protocol creation into a separate function")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-11 16:34:38 -04:00
Linus Torvalds
3f5f8596ed Fixes for critical regressions and a build failure.

Merge tag 'sh-for-4.17-fixes' of git://git.libc.org/linux-sh

Pull arch/sh fixes from Rich Felker:
 "Fixes for critical regressions and a build failure.

  The regressions were introduced in 4.15 and 4.17-rc1 and prevented
  booting on affected systems"

* tag 'sh-for-4.17-fixes' of git://git.libc.org/linux-sh:
  sh: switch to NO_BOOTMEM
  sh: mm: Fix unprotected access to struct device
  sh: fix build failure for J2 cpu with SMP disabled
2018-05-11 13:14:24 -07:00
Linus Torvalds
7404bc2773 arm64 fixes:
- Mitigate Spectre-v2 for NVIDIA Denver CPUs
 
 - Free memblocks corresponding to freed initrd area

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "There's a small memblock accounting problem when freeing the initrd
  and a Spectre-v2 mitigation for NVIDIA Denver CPUs which just requires
  a match on the CPU ID register.

  Summary:

   - Mitigate Spectre-v2 for NVIDIA Denver CPUs

   - Free memblocks corresponding to freed initrd area"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: capabilities: Add NVIDIA Denver CPU to bp_harden list
  arm64: Add MIDR encoding for NVIDIA CPUs
  arm64: To remove initrd reserved area entry from memblock
2018-05-11 13:09:04 -07:00
Linus Torvalds
5c6b54600d powerpc fixes for 4.17 #5
One fix for an actual regression, the change to the SYSCALL_DEFINE wrapper broke
 FTRACE_SYSCALLS for us due to a name mismatch. There's also another commit to
 the same code to make sure we match all our syscalls with various prefixes.
 
 And then just one minor build fix, and the removal of an unused variable that
 was removed and then snuck back in due to some rebasing.
 
 Thanks to:
   Naveen N. Rao.

Merge tag 'powerpc-4.17-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "One fix for an actual regression, the change to the SYSCALL_DEFINE
  wrapper broke FTRACE_SYSCALLS for us due to a name mismatch. There's
  also another commit to the same code to make sure we match all our
  syscalls with various prefixes.

  And then just one minor build fix, and the removal of an unused
  variable that was removed and then snuck back in due to some rebasing.

  Thanks to: Naveen N. Rao"

* tag 'powerpc-4.17-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/pseries: Fix CONFIG_NUMA=n build
  powerpc/trace/syscalls: Update syscall name matching logic to account for ppc_ prefix
  powerpc/trace/syscalls: Update syscall name matching logic
  powerpc/64: Remove unused paca->soft_enabled
2018-05-11 13:07:22 -07:00
Linus Torvalds
c110a8b792 Working on some new updates to trace filtering, I noticed that the
regex_match_front() test was updated to be limited to the size
 of the pattern instead of the full test string. But as the test string
 is not guaranteed to be nul terminated, it still needs to consider
 the size of the test string.

Merge tag 'trace-v4.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
 "Working on some new updates to trace filtering, I noticed that the
  regex_match_front() test was updated to be limited to the size of the
  pattern instead of the full test string.

  But as the test string is not guaranteed to be nul terminated, it
  still needs to consider the size of the test string"

* tag 'trace-v4.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Fix regex_match_front() to not over compare the test string
2018-05-11 13:04:35 -07:00
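
The constraint in the tracing entry above is that a front match has to respect both lengths: the pattern's and the test string's. A minimal Python sketch follows, assuming a buffer whose valid region is given by an explicit length; the name `match_front` and its signature are illustrative and are not the kernel's `regex_match_front()`.

```python
def match_front(buf, buf_len, pattern):
    """Return True if the first buf_len bytes of buf start with pattern.

    buf may hold unrelated bytes past buf_len (no terminator is assumed),
    so the comparison must never depend on anything beyond buf_len.
    """
    if buf_len < len(pattern):
        return False             # the valid region is too short to match
    return buf[:len(pattern)] == pattern
```

The length check is the important part: comparing only up to the pattern length would let bytes beyond the valid region decide the result.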
David S. Miller
f4d641a228 Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2018-05-11

This series contains fixes to the ice, ixgbe and ixgbevf drivers.

Jeff Shaw provides a fix to ensure rq_last_status gets set, whether or
not the hardware responds with an error in the ice driver.

Emil adds a check for unsupported module during the reset routine for
ixgbe.

Luc Van Oostenryck fixes ixgbevf_xmit_frame() where it was not using the
correct return value (int).

Colin Ian King fixes a potential resource leak in ixgbe, where we were
not freeing ipsec in our cleanup path.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-11 15:57:23 -04:00
David S. Miller
f01008916f RxRPC fixes

Merge tag 'rxrpc-fixes-20180510' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
rxrpc: Fixes

Here are three fixes for AF_RXRPC and two tracepoints that were useful for
finding them:

 (1) Fix missing start of expect-Rx-by timeout on initial packet
     transmission so that calls will time out if the peer doesn't respond.

 (2) Fix error reception on AF_INET6 sockets by using the correct family of
     sockopts on the UDP transport socket.

 (3) Fix setting the minimum security level on kernel calls so that they
     can be encrypted.

 (4) Add a tracepoint to log ICMP/ICMP6 and other error reports from the
     transport socket.

 (5) Add a tracepoint to log UDP sendmsg failure so that we can find out if
     transmission failure occurred on the UDP socket.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-11 15:55:57 -04:00
Roman Mashak
af5d01842f net sched actions: fix invalid pointer dereferencing if skbedit flags missing
When an application fails to pass flags in the netlink TLV for a new skbedit
action, the kernel oopses as follows:

[    8.307732] BUG: unable to handle kernel paging request at 0000000000021130
[    8.309167] PGD 80000000193d1067 P4D 80000000193d1067 PUD 180e0067 PMD 0
[    8.310595] Oops: 0000 [#1] SMP PTI
[    8.311334] Modules linked in: kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper serio_raw
[    8.314190] CPU: 1 PID: 397 Comm: tc Not tainted 4.17.0-rc3+ #357
[    8.315252] RIP: 0010:__tcf_idr_release+0x33/0x140
[    8.316203] RSP: 0018:ffffa0718038f840 EFLAGS: 00010246
[    8.317123] RAX: 0000000000000001 RBX: 0000000000021100 RCX: 0000000000000000
[    8.319831] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000021100
[    8.321181] RBP: 0000000000000000 R08: 000000000004adf8 R09: 0000000000000122
[    8.322645] R10: 0000000000000000 R11: ffffffff9e5b01ed R12: 0000000000000000
[    8.324157] R13: ffffffff9e0d3cc0 R14: 0000000000000000 R15: 0000000000000000
[    8.325590] FS:  00007f591292e700(0000) GS:ffff8fcf5bc40000(0000) knlGS:0000000000000000
[    8.327001] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    8.327987] CR2: 0000000000021130 CR3: 00000000180e6004 CR4: 00000000001606a0
[    8.329289] Call Trace:
[    8.329735]  tcf_skbedit_init+0xa7/0xb0
[    8.330423]  tcf_action_init_1+0x362/0x410
[    8.331139]  ? try_to_wake_up+0x44/0x430
[    8.331817]  tcf_action_init+0x103/0x190
[    8.332511]  tc_ctl_action+0x11a/0x220
[    8.333174]  rtnetlink_rcv_msg+0x23d/0x2e0
[    8.333902]  ? _cond_resched+0x16/0x40
[    8.334569]  ? __kmalloc_node_track_caller+0x5b/0x2c0
[    8.335440]  ? rtnl_calcit.isra.31+0xf0/0xf0
[    8.336178]  netlink_rcv_skb+0xdb/0x110
[    8.336855]  netlink_unicast+0x167/0x220
[    8.337550]  netlink_sendmsg+0x2a7/0x390
[    8.338258]  sock_sendmsg+0x30/0x40
[    8.338865]  ___sys_sendmsg+0x2c5/0x2e0
[    8.339531]  ? pagecache_get_page+0x27/0x210
[    8.340271]  ? filemap_fault+0xa2/0x630
[    8.340943]  ? page_add_file_rmap+0x108/0x200
[    8.341732]  ? alloc_set_pte+0x2aa/0x530
[    8.342573]  ? finish_fault+0x4e/0x70
[    8.343332]  ? __handle_mm_fault+0xbc1/0x10d0
[    8.344337]  ? __sys_sendmsg+0x53/0x80
[    8.345040]  __sys_sendmsg+0x53/0x80
[    8.345678]  do_syscall_64+0x4f/0x100
[    8.346339]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[    8.347206] RIP: 0033:0x7f591191da67
[    8.347831] RSP: 002b:00007fff745abd48 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[    8.349179] RAX: ffffffffffffffda RBX: 00007fff745abe70 RCX: 00007f591191da67
[    8.350431] RDX: 0000000000000000 RSI: 00007fff745abdc0 RDI: 0000000000000003
[    8.351659] RBP: 000000005af35251 R08: 0000000000000001 R09: 0000000000000000
[    8.352922] R10: 00000000000005f1 R11: 0000000000000246 R12: 0000000000000000
[    8.354183] R13: 00007fff745afed0 R14: 0000000000000001 R15: 00000000006767c0
[    8.355400] Code: 41 89 d4 53 89 f5 48 89 fb e8 aa 20 fd ff 85 c0 0f 84 ed 00
00 00 48 85 db 0f 84 cf 00 00 00 40 84 ed 0f 85 cd 00 00 00 45 84 e4 <8b> 53 30
74 0d 85 d2 b8 ff ff ff ff 0f 8f b3 00 00 00 8b 43 2c
[    8.358699] RIP: __tcf_idr_release+0x33/0x140 RSP: ffffa0718038f840
[    8.359770] CR2: 0000000000021130
[    8.360438] ---[ end trace 60c66be45dfc14f0 ]---

The caller calls action's ->init() and passes pointer to "struct tc_action *a",
which later may be initialized to point at the existing action, otherwise
"struct tc_action *a" is still invalid, and therefore dereferencing it is an
error as happens in tcf_idr_release, where refcnt is decremented.

So in case of missing flags tcf_idr_release must be called only for
existing actions.
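
The rule can be modelled in a few lines of user-space C. This is only a sketch: `demo_action`, `demo_release()` and `demo_init()` are hypothetical stand-ins, not the kernel's act_skbedit code, and a NULL `looked_up` is assumed to mean that no action with the requested index exists.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct tc_action; only a reference count. */
struct demo_action {
	int refcnt;
};

/* Drops one reference; must only be called on a valid action. */
static void demo_release(struct demo_action *a)
{
	a->refcnt--;
}

/*
 * looked_up is NULL when no action with the requested index exists.
 * In that case *a is left untouched and must not be dereferenced.
 */
static int demo_init(struct demo_action *looked_up, unsigned int flags,
		     struct demo_action **a)
{
	int exists = looked_up != NULL;

	if (exists) {
		looked_up->refcnt++;	/* the lookup took a reference */
		*a = looked_up;
	}

	if (!flags) {
		/* Missing flags: undo the lookup only if it found something. */
		if (exists)
			demo_release(*a);
		return -EINVAL;
	}

	return 0;
}
```

Calling `demo_init(NULL, 0, &a)` returns `-EINVAL` without touching `a`, which is the case that crashed in the trace above.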

v2:
    - prepare patch for net tree

Fixes: 5e1567aeb7 ("net sched: skbedit action fix late binding")
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-11 15:52:43 -04:00
Jens Axboe
9abd68ef45 nvme: add quirk to force medium priority for SQ creation
Some P3100 drives have a bug where they think WRRU (weighted round robin)
is always enabled, even though the host doesn't set it. Since they think
it's enabled, they also look at the submission queue creation priority. We
used to set that to MEDIUM by default, but that was removed in commit
81c1cd9835. This causes various issues on that drive. Add a quirk to
still set MEDIUM priority for that controller.

Fixes: 81c1cd9835 ("nvme/pci: Don't set reserved SQ create flags")
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Keith Busch <keith.busch@intel.com>
2018-05-11 13:37:14 -06:00
Linus Torvalds
84c3a0979c xen: fix for 4.17-rc5

Merge tag 'for-linus-4.17-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fix from Juergen Gross:
 "One fix for the kernel running as a fully virtualized guest using PV
  drivers on old Xen hypervisor versions"

* tag 'for-linus-4.17-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/xen: Reset VCPU0 info pointer after shared_info remap
2018-05-11 12:30:34 -07:00
Colin Ian King
c89ebb968f ixgbe: fix memory leak on ipsec allocation
The error clean up path kfree's adapter->ipsec and should be
instead kfree'ing ipsec. Fix this.  Also, the err1 error exit path
does not need to kfree ipsec because this failure path was for
the failed allocation of ipsec.

Detected by CoverityScan, CID#146424 ("Resource Leak")

Fixes: 63a67fe229 ("ixgbe: add ipsec offload add and remove SA")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-11 12:22:22 -07:00
Luc Van Oostenryck
cf12aab67a ixgbevf: fix ixgbevf_xmit_frame()'s return type
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, but the implementation in this
driver returns an 'int'.

Fix this by returning 'netdev_tx_t' in this driver too.
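
As a rough illustration of the type agreement being restored, the sketch below declares a callback slot with an enum typedef and a handler with the matching return type. All `demo_` names are hypothetical; only the two numeric values are taken from the kernel's NETDEV_TX_OK and NETDEV_TX_BUSY.

```c
/* Local stand-in for netdev_tx_t; values mirror NETDEV_TX_OK/NETDEV_TX_BUSY. */
typedef enum demo_netdev_tx {
	DEMO_TX_OK   = 0x00,
	DEMO_TX_BUSY = 0x10
} demo_netdev_tx_t;

/* Hypothetical ops table with a single transmit hook. */
struct demo_net_device_ops {
	demo_netdev_tx_t (*ndo_start_xmit)(int ring_full);
};

/* The handler returns the typedef, not a plain int, so the types agree. */
static demo_netdev_tx_t demo_xmit_frame(int ring_full)
{
	return ring_full ? DEMO_TX_BUSY : DEMO_TX_OK;
}

static const struct demo_net_device_ops demo_ops = {
	.ndo_start_xmit = demo_xmit_frame,
};
```

With the handler declared as returning `int` instead, the initializer would pair two different function types, which is what static checkers complain about.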

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-11 12:18:35 -07:00
Emil Tantilov
bbb2707623 ixgbe: return error on unsupported SFP module when resetting
Add check for unsupported module and return the error code.
This fixes a Coverity hit due to unused return status from setup_sfp.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-11 12:16:58 -07:00
Jeff Shaw
ea3beca422 ice: Set rq_last_status when cleaning rq
Prior to this commit, the rq_last_status was only set when hardware
responded with an error. This leads to rq_last_status being invalid
in the future when hardware eventually responds without error. This
commit resolves the issue by unconditionally setting rq_last_status
with the value returned in the descriptor.
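
The difference between the two behaviours can be sketched as follows. The struct and both functions are hypothetical simplifications, not the ice driver's code; `desc_retval` stands for the status value carried in the descriptor.

```c
/* Hypothetical, reduced control queue state. */
struct demo_ctrl_queue {
	int rq_last_status;
};

/* Old behaviour: the status is recorded only when it is an error. */
static void demo_clean_rq_old(struct demo_ctrl_queue *cq, int desc_retval)
{
	if (desc_retval)
		cq->rq_last_status = desc_retval;
}

/* New behaviour: the status is recorded unconditionally. */
static void demo_clean_rq_new(struct demo_ctrl_queue *cq, int desc_retval)
{
	cq->rq_last_status = desc_retval;
}
```

After an error followed by a success, the old variant still reports the stale error, while the new variant reports 0.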

Fixes: 940b61af02 ("ice: Initialize PF and setup miscellaneous interrupt")

Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-11 11:43:17 -07:00
Trond Myklebust
04ac6fdba1 Change Trond's email address in MAINTAINERS
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-05-11 14:39:06 -04:00
Rob Herring
ac21fc2dcb sh: switch to NO_BOOTMEM
Commit 0fa1c57934 ("of/fdt: use memblock_virt_alloc for early alloc")
inadvertently switched the DT unflattening allocations from memblock to
bootmem which doesn't work because the unflattening happens before
bootmem is initialized. Swapping the order of bootmem init and
unflattening could also fix this, but removing bootmem is desired. So
enable NO_BOOTMEM on SH like other architectures have done.

Fixes: 0fa1c57934 ("of/fdt: use memblock_virt_alloc for early alloc")
Reported-by: Rich Felker <dalias@libc.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Rich Felker <dalias@libc.org>
2018-05-11 13:35:46 -04:00
Linus Torvalds
be83bbf806 mmap: introduce sane default mmap limits
The internal VM "mmap()" interfaces are based on the mmap target doing
everything using page indexes rather than byte offsets, because
traditionally (ie 32-bit) we had the situation that the byte offset
didn't fit in a register.  So while the mmap virtual address was limited
by the word size of the architecture, the backing store was not.

So we're basically passing "pgoff" around as a page index, in order to
be able to describe backing store locations that are much bigger than
the word size (think files larger than 4GB etc).

But while this all makes a ton of sense conceptually, we've been dogged
by various drivers that don't really understand this, and internally
work with byte offsets, and then try to work with the page index by
turning it into a byte offset with "pgoff << PAGE_SHIFT".

Which obviously can overflow.

Adding the size of the mapping to it to get the byte offset of the end
of the backing store just exacerbates the problem, and if you then use
this overflow-prone value to check various limits of your device driver
mmap capability, you're just setting yourself up for problems.

The correct thing for drivers to do is to do their limit math in page
indices, the way the interface is designed.  Because the generic mmap
code _does_ test that the index doesn't overflow, since that's what the
mmap code really cares about.
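
A small user-space sketch makes the overflow concrete. It is not kernel code: the function names, the fixed 64-bit types and the 12-bit page shift are assumptions made for the example. The first helper converts the page index to bytes before checking, the second keeps the whole check in page units.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: 4 KiB pages. */
#define DEMO_PAGE_SHIFT 12

/*
 * Overflow-prone: turns the page index into a byte offset first.
 * For a large pgoff the shift drops the high bits, so "end" can wrap
 * to a small value and slip under the limit.
 */
static bool range_ok_bytes(uint64_t pgoff, uint64_t len_bytes,
			   uint64_t dev_size_bytes)
{
	uint64_t start = pgoff << DEMO_PAGE_SHIFT;	/* may wrap */
	uint64_t end = start + len_bytes;		/* may wrap again */

	return end <= dev_size_bytes;
}

/*
 * Limit math in page indices: compare first, then subtract, so no
 * intermediate value can wrap.
 */
static bool range_ok_pages(uint64_t pgoff, uint64_t len_pages,
			   uint64_t dev_size_pages)
{
	if (pgoff > dev_size_pages)
		return false;

	return len_pages <= dev_size_pages - pgoff;
}
```

With `pgoff = 1ULL << 52`, the byte-based check accepts a one-page mapping on a 1 MiB device because the shifted offset wraps to 0, while the page-based check rejects it.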

HOWEVER.

Finding and fixing various random drivers is a sisyphean task, so let's
just see if we can just make the core mmap() code do the limiting for
us.  Realistically, the only "big" backing stores we need to care about
are regular files and block devices, both of which are known to do this
properly, and which have nice well-defined limits for how much data they
can access.

So let's special-case just those two known cases, and then limit other
random mmap users to a backing store that still fits in "unsigned long".
Realistically, that's not much of a limit at all on 64-bit, and on
32-bit architectures the only worry might be the GPU drivers, which can
have big physical address spaces.

To make it possible for drivers like that to say that they are 64-bit
clean, this patch does repurpose the "FMODE_UNSIGNED_OFFSET" bit in the
file flags to allow drivers to mark their file descriptors as safe in
the full 64-bit mmap address space.

[ The timing for doing this is less than optimal, and this should really
  go in a merge window. But realistically, this needs wide testing more
  than it needs anything else, and being main-line is the only way to do
  that.

  So the earlier the better, even if it's outside the proper development
  cycle        - Linus ]

Cc: Kees Cook <keescook@chromium.org>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-11 09:52:01 -07:00
Charles Machalow
4e50d9ebae nvme: Fix sync controller reset return
If a controller reset is requested while the device has no namespaces,
we were incorrectly returning ENETRESET. This patch adds the check for
ADMIN_ONLY controller state to indicate a successful reset.

Fixes: 8000d1fdb0 ("nvme-rdma: fix sysfs invoked reset_ctrl error flow")
Cc: <stable@vger.kernel.org>
Signed-off-by: Charles Machalow <charles.machalow@intel.com>
[changelog]
Signed-off-by: Keith Busch <keith.busch@intel.com>
2018-05-11 10:51:45 -06:00
Linus Torvalds
41e3e10823 Power management fixes for 4.17-rc5

Merge tag 'pm-4.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These fix two PCI power management regressions from the 4.13 cycle and
  one cpufreq schedutil governor bug introduced during the 4.12 cycle,
  drop a stale comment from the schedutil code and fix two mistakes in
  docs.

  Specifics:

   - Restore device_may_wakeup() check in pci_enable_wake() removed
     inadvertently during the 4.13 cycle to prevent systems from drawing
     excessive power when suspended or off, among other things (Rafael
     Wysocki).

   - Fix pci_dev_run_wake() to properly handle devices that only can
     signal PME# when in the D3cold power state (Kai Heng Feng).

   - Fix the schedutil cpufreq governor to avoid using UINT_MAX as the
     new CPU frequency in some cases due to a missing check (Rafael
     Wysocki).

   - Remove a stale comment regarding worker kthreads from the schedutil
     cpufreq governor (Juri Lelli).

   - Fix a copy-paste mistake in the intel_pstate driver documentation
     (Juri Lelli).

   - Fix a typo in the system sleep states documentation (Jonathan
     Neuschäfer)"

* tag 'pm-4.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PCI / PM: Check device_may_wakeup() in pci_enable_wake()
  PCI / PM: Always check PME wakeup capability for runtime wakeup support
  cpufreq: schedutil: Avoid using invalid next_freq
  cpufreq: schedutil: remove stale comment
  PM: docs: intel_pstate: fix Active Mode w/o HWP paragraph
  PM: docs: sleep-states: Fix a typo ("includig")
2018-05-11 09:49:02 -07:00
Linus Torvalds
e03dc5d3d4 NAND fixes:

Merge tag 'mtd/fixes-for-4.17-rc5' of git://git.infradead.org/linux-mtd

Pull mtd fixes from Boris Brezillon:

 - make nand_soft_waitrdy() wait tWB before polling the status REG

 - fix BCH write in the Marvell NAND controller driver

 - fix wrong picosec to msec conversion in the Marvell NAND controller
   driver

 - fix DMA handling in the TI OneNAND controller driver

* tag 'mtd/fixes-for-4.17-rc5' of git://git.infradead.org/linux-mtd:
  mtd: rawnand: Make sure we wait tWB before polling the STATUS reg
  mtd: rawnand: marvell: fix command xtype in BCH write hook
  mtd: rawnand: marvell: pass ms delay to wait_op
  mtd: onenand: omap2: Disable DMA for HIGHMEM buffers
2018-05-11 09:46:14 -07:00
David S. Miller
5ae4bbf769 mlx5-fixes-2018-05-10

Merge tag 'mlx5-fixes-2018-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
Mellanox, mlx5 fixes 2018-05-10

the following series includes some fixes for mlx5 core driver.
Please pull and let me know if there's any problem.

For -stable v4.5
("net/mlx5: E-Switch, Include VF RDMA stats in vport statistics")

For -stable v4.10
("net/mlx5e: Err if asked to offload TC match on frag being first")
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-11 12:26:29 -04:00
Linus Torvalds
ca30093dd7 nouveau, amdgpu, i915, vc4, omap, exynos and atomic fixes

Merge tag 'drm-fixes-for-v4.17-rc5' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "nouveau, amdgpu, i915, vc4, omap, exynos and atomic fixes.

  As last week seemed a bit slow, we got a few more fixes this week.

  The main stuff is two weeks of fixes for amdgpu, some missing bits of
  vega12 atom firmware support were added, and some power management
  fixes.

  Nouveau got two regression fixes: one for a DP MST deadlock and one
  for a random oops.

  i915 got an LVDS panel timeout fix and 2 WARN fixes.

  exynos fixed a pagefault issue in the mixer driver.

  vc4 has an oops fix.

  omap had a bunch of uninit var and error-checking fixes. Two atomic
  modesetting state fixes.

  One minor agp cleanup patch"

* tag 'drm-fixes-for-v4.17-rc5' of git://people.freedesktop.org/~airlied/linux: (30 commits)
  drm/amd/pp: Fix performance drop on Fiji
  drm/nouveau: Fix deadlock in nv50_mstm_register_connector()
  drm/nouveau/ttm: don't dereference nvbo::cli, it can outlive client
  agp: uninorth: make two functions static
  drm/amd/pp: Refine the output of pp_power_profile_mode on VI
  drm/amdgpu: Switch to interruptable wait to recover from ring hang.
  drm/ttm: Use GFP_TRANSHUGE_LIGHT for allocating huge pages
  drm/amd/display: Use kvzalloc for potentially large allocations
  drm/amd/display: Don't return ddc result and read_bytes in same return value
  drm/amd/display: Add get_firmware_info_v3_2 for VG12
  drm/amd: Add BIOS smu_info v3_3 required struct def.
  drm/amd/display: Add VG12 ASIC IDs
  drm/vc4: Fix scaling of uni-planar formats
  drm/exynos: hdmi: avoid duplicating drm_bridge_attach
  drm/i915: Fix drm:intel_enable_lvds ERROR message in kernel log
  drm/i915: Correctly populate user mode h/vdisplay with pipe src size during readout
  drm/i915: Adjust eDP's logical vco in a reliable place.
  drm/bridge/sii8620: add Kconfig dependency on extcon
  drm/omap: handle alloc failures in omap_connector
  drm/omap: add missing linefeeds to prints
  ...
2018-05-11 09:18:02 -07:00
Andrey Ignatov
1b97013bfb ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg
Fix more memory leaks in ip_cmsg_send() callers. Some of them were fixed
earlier in 919483096b.

* udp_sendmsg one was there since the beginning when linux sources were
  first added to git;
* ping_v4_sendmsg one was copy/pasted in c319b4d76b.

Whenever return happens in udp_sendmsg() or ping_v4_sendmsg() IP options
have to be freed if they were allocated previously.

Add label so that future callers (if any) can use it instead of kfree()
before return that is easy to forget.

Fixes: c319b4d76b (net: ipv4: add IPPROTO_ICMP socket kind)
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-11 12:00:58 -04:00
Christophe JAILLET
8ccc113172 mlxsw: core: Fix an error handling path in 'mlxsw_core_bus_device_register()'
Resources are not freed in the reverse order of the allocation.
Labels are also mixed-up.

Fix it and reorder code and labels in the error handling path of
'mlxsw_core_bus_device_register()'
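
The general pattern, releasing resources in the reverse order of acquisition with one label per step, can be sketched in user-space C. Everything here is hypothetical (the three resources A, B and C, the log buffer, the `fail_at` switch); it is not the mlxsw code.

```c
#include <stddef.h>
#include <string.h>

/* Records acquire (upper case) and release (lower case) events. */
static char demo_log[16];
static size_t demo_log_len;

static void demo_note(char c)
{
	if (demo_log_len + 1 < sizeof(demo_log))
		demo_log[demo_log_len++] = c;
}

static int demo_acquire(char name, int fail)
{
	if (fail)
		return -1;
	demo_note(name);
	return 0;
}

static void demo_release(char name)
{
	demo_note((char)(name - 'A' + 'a'));
}

/* fail_at selects which acquisition fails (1..3); 0 means none. */
static int demo_register(int fail_at)
{
	int err;

	memset(demo_log, 0, sizeof(demo_log));
	demo_log_len = 0;

	err = demo_acquire('A', fail_at == 1);
	if (err)
		return err;
	err = demo_acquire('B', fail_at == 2);
	if (err)
		goto err_release_a;
	err = demo_acquire('C', fail_at == 3);
	if (err)
		goto err_release_b;
	return 0;

	/* Labels are named after what they undo and run in reverse order. */
err_release_b:
	demo_release('B');
err_release_a:
	demo_release('A');
	return err;
}
```

Each label undoes exactly the steps that succeeded before the failure, so a failure at the third step releases B and then A.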

Fixes: ef3116e540 ("mlxsw: spectrum: Register KVD resources with devlink")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-11 11:56:05 -04:00
David S. Miller
89dd2e752c Merge branch 'bonding-bug-fixes-and-regressions'
Debabrata Banerjee says:

====================
bonding: bug fixes and regressions

Fixes to bonding driver for balance-alb mode, suitable for stable.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-11 11:50:41 -04:00
Debabrata Banerjee
21706ee8a4 bonding: send learning packets for vlans on slave
There was a regression at some point from the intended functionality of
commit f60c3704e8 ("bonding: Fix alb mode to only use first level
vlans.")

Given the return value of vlan_get_encap_level() we need to store the nest
level of the bond device, and then compare the vlan's encap level to
this. Without this, this check always fails and learning packets are
never sent.

In addition, this same commit caused a regression in the behavior of
balance_alb, which requires learning packets be sent for all interfaces
using the slave's mac in order to load balance properly. For vlans
that have not set a user mac, we can send after checking one bit.
Otherwise we need to send the set mac, albeit defeating rx load balancing
for that vlan.

Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-11 11:50:41 -04:00
Debabrata Banerjee
4fa8667ca3 bonding: do not allow rlb updates to invalid mac
Make sure multicast, broadcast, and zero mac's cannot be the output of rlb
updates, which should all be directed arps. Receive load balancing will be
collapsed if any of these happen, as the switch will broadcast.
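
A check of this kind can be sketched in user-space C. The helper names are hypothetical, and whether the patch uses the kernel's own is_valid_ether_addr() helper is not stated here. Broadcast needs no separate test because ff:ff:ff:ff:ff:ff also has the multicast bit set.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_ETH_ALEN 6

/* Multicast (and broadcast) addresses have the low bit of byte 0 set. */
static bool demo_mac_is_multicast(const uint8_t *mac)
{
	return (mac[0] & 0x01) != 0;
}

/* True when all six bytes are zero. */
static bool demo_mac_is_zero(const uint8_t *mac)
{
	int i;

	for (i = 0; i < DEMO_ETH_ALEN; i++) {
		if (mac[i] != 0)
			return false;
	}
	return true;
}

/* Only a non-zero unicast address is an acceptable rlb update target. */
static bool demo_mac_ok_for_rlb(const uint8_t *mac)
{
	return !demo_mac_is_multicast(mac) && !demo_mac_is_zero(mac);
}
```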

Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-11 11:50:41 -04:00
Steven Rostedt (VMware)
dc432c3d7f tracing: Fix regex_match_front() to not over compare the test string
The regex match function regex_match_front() in the tracing filter logic
was changed to compare only the pattern length instead of the entire test
string. That is, it went from strncmp(str, r->pattern, len) to
strncmp(str, r->pattern, r->len).

The issue is that str is not guaranteed to be nul terminated, and if r->len
is greater than the length of str, it can access more memory than is
allocated.

The solution is to add a simple test if (len < r->len) return 0.
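
A user-space sketch of the guarded comparison, assuming a reduced `demo_regex` struct in place of the kernel's struct regex (the names are hypothetical). The point is that the length test runs before strncmp() ever reads from str.

```c
#include <string.h>

/* Hypothetical stand-in for the kernel's struct regex. */
struct demo_regex {
	const char *pattern;
	int len;		/* length of pattern */
};

/*
 * Returns 1 if the first r->len bytes of str equal the pattern.
 * str holds len valid bytes and need not be NUL-terminated, so the
 * length test must come first to keep strncmp() inside the buffer.
 */
static int demo_match_front(const char *str, const struct demo_regex *r,
			    int len)
{
	if (len < r->len)
		return 0;

	return strncmp(str, r->pattern, (size_t)r->len) == 0;
}
```

A three-byte, unterminated buffer tested against a five-byte pattern is rejected by the length check alone, so no byte beyond the buffer is read.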

Cc: stable@vger.kernel.org
Fixes: 285caad415 ("tracing/filters: Fix MATCH_FRONT_ONLY filter matching")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-05-11 10:56:42 -04:00
Rafael J. Wysocki
ef050374e1 Merge branches 'pm-pci' and 'pm-docs'
* pm-pci:
  PCI / PM: Check device_may_wakeup() in pci_enable_wake()
  PCI / PM: Always check PME wakeup capability for runtime wakeup support

* pm-docs:
  PM: docs: intel_pstate: fix Active Mode w/o HWP paragraph
  PM: docs: sleep-states: Fix a typo ("includig")
2018-05-11 15:17:18 +02:00
Haneen Mohammed
7f6df440b8 drm: Match sysfs name in link removal to link creation
This patch matches the sysfs name used in the unlinking with the
linking function. Otherwise, remove_compat_control_link() fails to remove
sysfs created by create_compat_control_link() in drm_dev_register().

Fixes: 6449b088dd ("drm: Add fake controlD* symlinks for backwards compat")
Cc: Dave Airlie <airlied@gmail.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: David Herrmann <dh.herrmann@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: David Airlie <airlied@linux.ie>
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v4.10+
Signed-off-by: Haneen Mohammed <hamohammed.sa@gmail.com>
[seanpaul added Fixes and Cc tags]
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20180511041542.GA4253@haneen-vb
2018-05-11 09:06:05 -04:00
Sean Christopherson
64f7a11586 KVM: vmx: update sec exec controls for UMIP iff emulating UMIP
Update SECONDARY_EXEC_DESC for UMIP emulation if and only if UMIP
is actually being emulated.  Skipping the VMCS update eliminates
unnecessary VMREAD/VMWRITE when UMIP is supported in hardware,
and on platforms that don't have SECONDARY_VM_EXEC_CONTROL.  The
latter case resolves a bug where KVM would fill the kernel log
with warnings due to failed VMWRITEs on older platforms.

Fixes: 0367f205a3 ("KVM: vmx: add support for emulating UMIP")
Cc: stable@vger.kernel.org #4.16
Reported-by: Paolo Zeppegno <pzeppegno@gmail.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-05-11 11:21:13 +02:00
Junaid Shahid
c19986fea8 kvm: x86: Suppress CR3_PCID_INVD bit only when PCIDs are enabled
If the PCIDE bit is not set in CR4, then the MSb of CR3 is a reserved
bit. If the guest tries to set it, that should cause a #GP fault. So
mask out the bit only when the PCIDE bit is set.
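
The conditional masking can be sketched as below. The macro and function names are local to this example; the bit positions (CR4.PCIDE at bit 17, the CR3 bit at position 63) follow the x86 architecture. The real reserved-bit check in KVM also depends on the guest's physical address width, which this sketch leaves out.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_CR4_PCIDE		(1ULL << 17)	/* CR4.PCIDE */
#define DEMO_CR3_PCID_INVD	(1ULL << 63)	/* MSb of CR3 */

/* Strip the MSb only when PCIDs are enabled; otherwise leave it in place. */
static uint64_t demo_effective_cr3(uint64_t cr3, uint64_t cr4)
{
	if (cr4 & DEMO_CR4_PCIDE)
		cr3 &= ~DEMO_CR3_PCID_INVD;

	return cr3;
}

/* With PCIDE clear, a set MSb survives and can be reported as reserved. */
static bool demo_cr3_msb_is_reserved_violation(uint64_t cr3, uint64_t cr4)
{
	return (demo_effective_cr3(cr3, cr4) & DEMO_CR3_PCID_INVD) != 0;
}
```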

Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-05-11 11:21:12 +02:00
Paolo Bonzini
bcb2b94ae0 KVM: selftests: exit with 0 status code when tests cannot be run
Right now, skipped tests are returning a failure exit code if /dev/kvm does
not exist.  Consistently return a zero status code so that various scripts
over the interwebs do not complain.  Also return a zero status code if
the KVM_CAP_SYNC_REGS capability is not present, and hardcode in the
test the register kinds that are covered (rather than just using whatever
value of KVM_SYNC_X86_VALID_FIELDS is provided by the kernel headers).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-05-11 11:21:12 +02:00