Commit Graph

737131 Commits

Author SHA1 Message Date
Dan Williams
3968523f85 mpls, nospec: Sanitize array index in mpls_label_ok()
mpls_label_ok() validates that the 'platform_label' array index from a
userspace netlink message payload is valid. Under speculation the
mpls_label_ok() result may not resolve in the CPU pipeline until after
the index is used to access an array element. Sanitize the index to zero
to prevent userspace-controlled arbitrary out-of-bounds speculation, a
precursor for a speculative execution side channel vulnerability.

Cc: <stable@vger.kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-08 15:24:12 -05:00
Sowmini Varadhan
ebeeb1ad9b rds: tcp: use rds_destroy_pending() to synchronize netns/module teardown and rds connection/workq management
An rds_connection can get added during netns deletion between lines 528
and 529 of

  506 static void rds_tcp_kill_sock(struct net *net)
  :
  /* code to pull out all the rds_connections that should be destroyed */
  :
  528         spin_unlock_irq(&rds_tcp_conn_lock);
  529         list_for_each_entry_safe(tc, _tc, &tmp_list, t_tcp_node)
  530                 rds_conn_destroy(tc->t_cpath->cp_conn);

Such an rds_connection would miss out the rds_conn_destroy()
loop (that cancels all pending work) and (if it was scheduled
after netns deletion) could trigger the use-after-free.

A similar race-window exists for the module unload path
in rds_tcp_exit -> rds_tcp_destroy_conns

Concurrency with netns deletion (rds_tcp_kill_sock()) must be handled
by checking check_net() before enqueuing new work or adding new
connections.

Concurrency with module-unload is handled by maintaining a module
specific flag that is set at the start of the module exit function,
and must be checked before enqueuing new work or adding new connections.

This commit refactors existing RDS_DESTROY_PENDING checks added by
commit 3db6e0d172 ("rds: use RCU to synchronize work-enqueue with
connection teardown") and consolidates all the concurrency checks
listed above into the function rds_destroy_pending().

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-08 15:23:52 -05:00
Linus Torvalds
a0f79386a4 Mostly cleanups, but three bug fixes:
1. don't pass garbage return codes back up the call chain (Mike Marshall)
 
  2. fix stale inode test (Martin Brandenburg)
 
  3. fix off-by-one errors (Xiongfeng Wang)
 
 Also: add Martin as a reviewer in the Maintainers file.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJaejneAAoJEM9EDqnrzg2+XhoQAIDF112mOwLwqDPmr4ty0g6/
 gBcoHOrRFlYWPlS5aubjoZ3jFX2fAeNuHzYS4LIuqVKUdsC+oTKQ2URJ7KKpvLiK
 6zOaz2Y4GLns2sa1ZUKli6nEBbPi6uwoF54FNbwt3b+97wpmJwlnXm9ztyt5REKA
 zOHvLgJAcfGNZEJ7gyB1zjwllu4JeD0A4MoN4vJCtkKLAaNClywu4+V0jwZB+SSN
 8QjDXNqkcD31ahWhQ/CaU4zXlxOOV+4ZR7/p5IKT693hEhV+ikTvmXy8g0+bksxj
 L+FHmQMTO+GqCS5FxuBQd3v1IP5FkoHEmAwvr3C5aMlRAaVJ9eVVIZaC9CpOJBRB
 S/CiaG2Mw8vx8VGOm8O93Z+xDi9tCYP8x4i7b5r62h0T9wSyHJSkSIUd6VIkCV9Q
 c92bX/N3wHBvCPT+RC898plni5HsFpzs3vSs8hiaAICgp64sC8pIqVlZOAdMtJd8
 RL4la/Fited/T+3BpaCTkmnvNk8Ktax7wHYsCt4gSyHN8WRvkzowgC5kV6S30Qlh
 zfoXG0K50FcU8T5r3i8slvUHmsiyYxYwJIk/z1iDgXI7y4IIR6FGDxQmw5TxgNS7
 +veTo6FCxon6QshtpAOeELCau7qNXhtlDdGqqm4+gDfMWoCn0Jem/LzdA2gPXCOr
 iCDwHLiu6WXt7ZHTrgln
 =xrih
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux

Pull orangefs updates from Mike Marshall:
 "Mostly cleanups, but three bug fixes:

   - don't pass garbage return codes back up the call chain (Mike
     Marshall)

   - fix stale inode test (Martin Brandenburg)

   - fix off-by-one errors (Xiongfeng Wang)

  Also add Martin as a reviewer in the Maintainers file"

* tag 'for-linus-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
  orangefs: reverse sense of is-inode-stale test in d_revalidate
  orangefs: simplify orangefs_inode_is_stale
  Orangefs: don't propogate whacky error codes
  orangefs: use correct string length
  orangefs: make orangefs_make_bad_inode static
  orangefs: remove ORANGEFS_KERNEL_DEBUG
  orangefs: remove gossip_ldebug and gossip_lerr
  orangefs: make orangefs_client_debug_init static
  MAINTAINERS: update orangefs list and add myself as reviewer
2018-02-08 12:20:41 -08:00
Kees Cook
79a8a642bf net: Whitelist the skbuff_head_cache "cb" field
Most callers of put_cmsg() use a "sizeof(foo)" for the length argument.
Within put_cmsg(), a copy_to_user() call is made with a dynamic size, as a
result of the cmsg header calculations. This means that hardened usercopy
will examine the copy, even though it was technically a fixed size and
should be implicitly whitelisted. All the put_cmsg() calls being built
from values in skbuff_head_cache are coming out of the protocol-defined
"cb" field, so whitelist this field entirely instead of creating per-use
bounce buffers, for which there are concerns about performance.

Original report was:

Bad or missing usercopy whitelist? Kernel memory exposure attempt detected from SLAB object 'skbuff_head_cache' (offset 64, size 16)!
WARNING: CPU: 0 PID: 3663 at mm/usercopy.c:81 usercopy_warn+0xdb/0x100 mm/usercopy.c:76
...
 __check_heap_object+0x89/0xc0 mm/slab.c:4426
 check_heap_object mm/usercopy.c:236 [inline]
 __check_object_size+0x272/0x530 mm/usercopy.c:259
 check_object_size include/linux/thread_info.h:112 [inline]
 check_copy_size include/linux/thread_info.h:143 [inline]
 copy_to_user include/linux/uaccess.h:154 [inline]
 put_cmsg+0x233/0x3f0 net/core/scm.c:242
 sock_recv_errqueue+0x200/0x3e0 net/core/sock.c:2913
 packet_recvmsg+0xb2e/0x17a0 net/packet/af_packet.c:3296
 sock_recvmsg_nosec net/socket.c:803 [inline]
 sock_recvmsg+0xc9/0x110 net/socket.c:810
 ___sys_recvmsg+0x2a4/0x640 net/socket.c:2179
 __sys_recvmmsg+0x2a9/0xaf0 net/socket.c:2287
 SYSC_recvmmsg net/socket.c:2368 [inline]
 SyS_recvmmsg+0xc4/0x160 net/socket.c:2352
 entry_SYSCALL_64_fastpath+0x29/0xa0

Reported-by: syzbot+e2d6cfb305e9f3911dea@syzkaller.appspotmail.com
Fixes: 6d07d1cd30 ("usercopy: Restrict non-usercopy caches to size 0")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-08 15:15:48 -05:00
Mathieu Malaterre
e728789c52 net: Extra '_get' in declaration of arch_get_platform_mac_address
In commit c7f5d10549 ("net: Add eth_platform_get_mac_address() helper."),
two declarations were added:

  int eth_platform_get_mac_address(struct device *dev, u8 *mac_addr);
  unsigned char *arch_get_platform_get_mac_address(void);

An extra '_get' was introduced in arch_get_platform_get_mac_address, remove
it. Fix compile warning using W=1:

  CC      net/ethernet/eth.o
net/ethernet/eth.c:523:24: warning: no previous prototype for ‘arch_get_platform_mac_address’ [-Wmissing-prototypes]
 unsigned char * __weak arch_get_platform_mac_address(void)
                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  AR      net/ethernet/built-in.o

Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-08 15:13:30 -05:00
Linus Torvalds
81153336eb AFS development
-----BEGIN PGP SIGNATURE-----
 
 iQIVAwUAWnx0Mvu3V2unywtrAQI3ng//Xdv2rxVjv4znzekb/EkE9QIakH3ET3wt
 hBewQjaGkOWhZKgyE7DnhCMh7y6OrX/oVNtjPU8H7EEHDHVs+nyoGoDu282jlppr
 qO7yMbxZwDtpja7O9hVtIViFZSqlEey/RCq1KKRUl/HDmyyOmAvOZHCpyowUqcYD
 KqJs9Z2/onkP43rwmoKIQPEeKHxRfAs6pTiAG7fUPYC4d6aSskiN5K65N0g4dx4F
 G6pDC/mIJWx2qeeI//CzSxnqhzWAhkozOs9UtvquSrIoNcYMSOQRHGne50n7OqkK
 rZCttm4gSlrEU11cPDNExjKU4z8UM3tmVdudntC8wbng5PFCHTR7JB5nZu1bEjqw
 TpIjb302QnUefzu1AGge03ZnysqDKKBAxKKwD1gYBHaj7Y2CrqP4lo+6QA4ePYTv
 qD7nRZCiQ8rF3PJOYJ7xe944Jziktf6PhnOXyxOSNCv3IT90YD7meOR3MldMjny/
 hM2ahYqfWXjLAjH20Q+B8z7ab9GDdVsBTl06w/ZX+RMrg5CNdDaYe0nfG/tS7H3A
 oD7xIjUwWjqxMBqtXNUe/3GAOnU+ilEiKjq8gmNkBSjRlpO6SMxi02jOp66HwnRs
 tD5qG3Bn2F3hdvEtwcKcS0cVWX511lLF5vkhlBhSbs/XkS+BXULr3vDsl5XclwAw
 /07q8HsHlnM=
 =fSB4
 -----END PGP SIGNATURE-----

Merge tag 'afs-next-20180208' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull afs updates from David Howells:
 "Four fixes:

   - add a missing put

   - two fixes to reset the address iteration cursor correctly

   - fix setting up the fileserver iteration cursor.

  Two cleanups:

   - remove some dead code

   - rearrange a function to be more logically laid out

  And one new feature:

   - Support AFS dynamic root.

     With this one should be able to do, say:

        mkdir /afs
        mount -t afs none /afs -o dyn

     to create a dynamic root and then, provided you have keyutils
     installed, do:

        ls /afs/grand.central.org

     and:

        ls /afs/umich.edu

     to list the root volumes of both those organisations' AFS cells
     without requiring any other setup (the kernel upcall to a program
     in the keyutils package to do DNS access as does NFS)"

* tag 'afs-next-20180208' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  afs: Support the AFS dynamic root
  afs: Rearrange afs_select_fileserver() a little
  afs: Remove unused code
  afs: Fix server list handling
  afs: Need to clear responded flag in addr cursor
  afs: Fix missing cursor clearance
  afs: Add missing afs_put_cell()
2018-02-08 12:12:04 -08:00
Nathan Fontenot
ec95dffa40 ibmvnic: queue reset when CRQ gets closed during reset
While handling a driver reset we get a H_CLOSED return trying
to send a CRQ event. When this occurs we need to queue up another
reset attempt. Without doing this we see instances where the driver
is left in a closed state because the reset failed and there is no
further attempts to reset the driver.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-08 15:11:15 -05:00
Gustavo A. R. Silva
583133b35e atm: he: use 64-bit arithmetic instead of 32-bit
Add suffix ULL to constants 272, 204, 136 and 68 in order to give the
compiler complete information about the proper arithmetic to use.
Notice that these constants are used in contexts that expect
expressions of type unsigned long long (64 bits, unsigned).

The following expressions are currently being evaluated using 32-bit
arithmetic:

272 * mult
204 * mult
136 * mult
68 * mult

Addresses-Coverity-ID: 201058
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-08 15:05:16 -05:00
Linus Torvalds
ef9417e8a9 IOMMU Updates for Linux v4.16
Including:
 
 	- 5-level page-table support for the Intel IOMMU.
 
 	- Error reporting improvements for the AMD IOMMU driver
 
 	- Additional DT bindings for ipmmu-vmsa (Renesas)
 
 	- Smaller fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJafGLMAAoJECvwRC2XARrjPTUP/0g/n8H5j35DevM56G62MrNq
 fNweMxPm7AqZQR/dnIkPnlH5NWfP1z5PZ47H/nAMAqd7cKHVOfUmzoufiUSGP92V
 eweFF4ufjqA+V5fluGcnt0UNxgbEGs+cEgf9jbEkUlpmFisV7BwOCGIJbVdHMrxG
 jkrr/L17iX82uqIru9JmfB2K0pEPBtBHQSZpooGHAyGsR4xU6nX1X64mV/a9Oh/2
 qzfzRsAbF5ZtAszktVz9j2AMfp40BrrAcHzmvepjS5yTjlH9t5J8UdM48GHWU+Zp
 ptmlJ3fJybe0yUI6GDfG9M6+/RX0T/xMvV1QcSJW6KP0q/i9p4hrIQufoOzstMYM
 uCsFPlhMLFSDcQy6CZ3M6VEsU5mdJ0KMn0xAN8rBLAok1ScGKrlP5qWpXJLeUJRp
 Ie7R4WVT+Ly/SLppoiLagiTW3ZD/gQh+YPNgYwXptMdDmiqSRdXm0nF6bzTiKk1Z
 8h8oEj2ittwBTC+fXuP+1C/wOKYL6KJUGnykLcHBDO+/wkEWOP0KM6939+T7IjHt
 zkiUapRegRvWyDOq1HFVl0tBCRLo1dqwG/3PFpqHUkj6Iyqyhd8y/V5IM3GTSI+d
 1tHBz6dXin62N/xYu/ScpmPMerpjP/AtMqd3dvx7Q+9vgNIAVSPMKFqeXhQ3P2ph
 +p1CdWvPYPb7wUhTvcja
 =+LFh
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU updates from Joerg Roedel:
 "This time there are not a lot of changes coming from the IOMMU side.

  That is partly because I returned from my parental leave late in the
  development process and probably partly because everyone was busy with
  Spectre and Meltdown mitigation work and didn't find the time for
  IOMMU work. So here are the few changes that queued up for this merge
  window:

   - 5-level page-table support for the Intel IOMMU.

   - error reporting improvements for the AMD IOMMU driver

   - additional DT bindings for ipmmu-vmsa (Renesas)

   - small fixes and cleanups"

* tag 'iommu-updates-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu: Clean up of_iommu_init_fn
  iommu/ipmmu-vmsa: Remove redundant of_iommu_init_fn hook
  iommu/msm: Claim bus ops on probe
  iommu/vt-d: Enable 5-level paging mode in the PASID entry
  iommu/vt-d: Add a check for 5-level paging support
  iommu/vt-d: Add a check for 1GB page support
  iommu/vt-d: Enable upto 57 bits of domain address width
  iommu/vt-d: Use domain instead of cache fetching
  iommu/exynos: Don't unconditionally steal bus ops
  iommu/omap: Fix debugfs_create_*() usage
  iommu/vt-d: clean up pr_irq if request_threaded_irq fails
  iommu: Check the result of iommu_group_get() for NULL
  iommu/ipmmu-vmsa: Add r8a779(70|95) DT bindings
  iommu/ipmmu-vmsa: Add r8a7796 DT binding
  iommu/amd: Set the device table entry PPR bit for IOMMU V2 devices
  iommu/amd - Record more information about unknown events
2018-02-08 12:03:54 -08:00
Linus Torvalds
605dc7761d Merge branch 'pcmcia' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia
Pull pcmcia updates from Dominik Brodowski:
 "The linux-pcmcia mailing list was shut down, so offer an alternative
  path for patches in MAINTAINERS.

  Also, throw in two odd fixes for the pcmcia subsystem"

* 'pcmcia' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia:
  pcmcia: soc_common: Handle return value of clk_prepare_enable
  pcmcia: use proper printk format for resource
  pcmcia: remove mailing list, update MAINTAINERS
2018-02-08 11:48:49 -08:00
Linus Torvalds
fe26adf431 nouveau features, i915 + amdgpu fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJae8OFAAoJEAx081l5xIa+mIUP/0leefSxgD4GTAAO5nQDIwTX
 TLnFP52i0/wrQ1T1CKkBCTnc8yRo4OSH3KMqnwHppBRGinYVRlz404pEckw3yUYq
 kTFS6ZKlfjZRgo7UIia49UlbDWse6aK6VUFwtyyb9et62rlTE0nmLXLHdKHuTnzi
 DxxMvmdDKWn9q/he5nHKg3d9H3ICc/EWINEqlxKIrX4Zgk/ymq/95rZNY0tOvuFa
 1WSFAl0IuCR330trgpN4kOLuCno/W0MuQFVJ4ymgeMW8ZhjM4UTjOANAm/8wZfmo
 Dau16psa18iE/kdz+iobdC1nzAS1VdMYXLv7HepLouYXByd6o2Xc6TMvBO0d9NxV
 JiLpntzdnmGHE0y/5GgMPJ5+8CCNzaI0ASqPbNvKVSB08cZB0hvYiVQdLSGAMLoY
 DiNwsgT+Pk+OXddvR+i8WdAUfU9aOKhl01bFlPWheXyZdAkGwvbBb4xQ6A11U5C2
 HUW1ZKPE0M4yGblnQpAulw7wcYEGHs0xMIfG8RwLGR0FazSsW2Rk8GKbMapEvhUx
 Ge3pvB51u70L/q1X1POy/q9+ITs82KXr5T+cjpdo+yOxq1JbfgQWdSlCIXH4Ptlf
 h53HWbJOu5JUWjI2FiePHwmjhxwxT01ManUThrlYJ4OR+5LyWbA1y0m5c1FV2zFd
 p82ux/7cSmaE6hN8LsdF
 =857C
 -----END PGP SIGNATURE-----

Merge tag 'drm-for-v4.16-part2-fixes' of git://people.freedesktop.org/~airlied/linux

Pull more drm updates from Dave Airlie:
 "Ben missed sending his nouveau tree, but he really didn't have much
  stuff in it:

   - GP108 acceleration support is enabled by "secure boot" support

   - some clockgating work on Kepler, and bunch of fixes

   - the bulk of the diff is regenerated firmware files, the change to
     them really isn't that large.

  Otherwise this contains regular Intel and AMDGPU fixes"

* tag 'drm-for-v4.16-part2-fixes' of git://people.freedesktop.org/~airlied/linux: (59 commits)
  drm/i915/bios: add DP max link rate to VBT child device struct
  drm/i915/cnp: Properly handle VBT ddc pin out of bounds.
  drm/i915/cnp: Ignore VBT request for know invalid DDC pin.
  drm/i915/cmdparser: Do not check past the cmd length.
  drm/i915/cmdparser: Check reg_table_count before derefencing.
  drm/i915/bxt, glk: Increase PCODE timeouts during CDCLK freq changing
  drm/i915/gvt: Use KVM r/w to access guest opregion
  drm/i915/gvt: Fix aperture read/write emulation when enable x-no-mmap=on
  drm/i915/gvt: only reset execlist state of one engine during VM engine reset
  drm/i915/gvt: refine intel_vgpu_submission_ops as per engine ops
  drm/amdgpu: re-enable CGCG on CZ and disable on ST
  drm/nouveau/clk: fix gcc-7 -Wint-in-bool-context warning
  drm/nouveau/mmu: Fix trailing semicolon
  drm/nouveau: Introduce NvPmEnableGating option
  drm/nouveau: Add support for SLCG for Kepler2
  drm/nouveau: Add support for BLCG on Kepler2
  drm/nouveau: Add support for BLCG on Kepler1
  drm/nouveau: Add support for basic clockgating on Kepler1
  drm/nouveau/kms/nv50: fix handling of gamma since atomic conversion
  drm/nouveau/kms/nv50: use INTERPOLATE_257_UNITY_RANGE LUT on newer chipsets
  ...
2018-02-08 11:42:05 -08:00
Linus Torvalds
9e95dae76b Things have been very quiet on the rbd side, as work continues on the
big ticket items slated for the next merge window.
 
 On the CephFS side we have a large number of cap handling improvements,
 a fix for our long-standing abuse of ->journal_info in ceph_readpages()
 and yet another dentry pointer management patch.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJafGqnAAoJEEp/3jgCEfOLjNcH/R6G/xyytDMfxaN+D8DBqCPF
 IaQM7RtgYJeRzDIXYYCkDEBPYqLcD2fjHLzFotFNLcgLdeUcSOyfg7NuCOWWq7o2
 t4z6Ekyish3GWZLUmlSdPcToQ+xIlMRshU8ZmzCHTCzx8XjO+CAnCADp5dh8OKZx
 mCpRX16sXdc6ozE1hsGKIkUoNrkdj8d3+HseZ2Uxb/4FZBNgH3cmmg7c5y6M+sp6
 wT4NEES3baqq2v5cVfw7T+d4MNgRm4/JC1aBy1JBkQlmVFNGteQTT7yzo0X1AfJ+
 +kcR10ddg0gD4WGYhL+iZlQCfwyMp7vouHQbgTOgt+rDCitjDy5r1BAamtxnZjM=
 =ctaD
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-4.16-rc1' of git://github.com/ceph/ceph-client

Pull ceph updates from Ilya Dryomov:
 "Things have been very quiet on the rbd side, as work continues on the
  big ticket items slated for the next merge window.

  On the CephFS side we have a large number of cap handling
  improvements, a fix for our long-standing abuse of ->journal_info in
  ceph_readpages() and yet another dentry pointer management patch"

* tag 'ceph-for-4.16-rc1' of git://github.com/ceph/ceph-client:
  ceph: improving efficiency of syncfs
  libceph: check kstrndup() return value
  ceph: try to allocate enough memory for reserved caps
  ceph: fix race of queuing delayed caps
  ceph: delete unreachable code in ceph_check_caps()
  ceph: limit rate of cap import/export error messages
  ceph: fix incorrect snaprealm when adding caps
  ceph: fix un-balanced fsc->writeback_count update
  ceph: track read contexts in ceph_file_info
  ceph: avoid dereferencing invalid pointer during cached readdir
  ceph: use atomic_t for ceph_inode_info::i_shared_gen
  ceph: cleanup traceless reply handling for rename
  ceph: voluntarily drop Fx cap for readdir request
  ceph: properly drop caps for setattr request
  ceph: voluntarily drop Lx cap for link/rename requests
  ceph: voluntarily drop Ax cap for requests that create new inode
  rbd: whitelist RBD_FEATURE_OPERATIONS feature bit
  rbd: don't NULL out ->obj_request in rbd_img_obj_parent_read_full()
  rbd: use kmem_cache_zalloc() in rbd_img_request_create()
  rbd: obj_request->completion is unused
2018-02-08 11:38:59 -08:00
Nicolas Pitre
a8c6db00bf cramfs: better MTD dependency expression
Commit b9f5fb1800 ("cramfs: fix MTD dependency") did what it says.

Since commit 9059a3493e ("kconfig: fix relational operators for bool
and tristate symbols") it is possible to do it slightly better though.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-08 11:37:31 -08:00
Christian Brauner
4ff66cae7f rtnetlink: require unique netns identifier
Since we've added support for IFLA_IF_NETNSID for RTM_{DEL,GET,SET,NEW}LINK
it is possible for userspace to send us requests with three different
properties to identify a target network namespace. This affects at least
RTM_{NEW,SET}LINK. Each of them could potentially refer to a different
network namespace which is confusing. For legacy reasons the kernel will
pick the IFLA_NET_NS_PID property first and then look for the
IFLA_NET_NS_FD property but there is no reason to extend this type of
behavior to network namespace ids. The regression potential is quite
minimal since the rtnetlink requests in question either won't allow
IFLA_IF_NETNSID requests before 4.16 is out (RTM_{NEW,SET}LINK) or don't
support IFLA_NET_NS_{PID,FD} (RTM_{DEL,GET}LINK) in the first place.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-08 14:33:20 -05:00
Jason Wang
762c330d67 tuntap: add missing xdp flush
When using devmap to redirect packets between interfaces,
xdp_do_flush() is usually a must to flush any batched
packets. Unfortunately this is missed in current tuntap
implementation.

Unlike most hardware driver which did XDP inside NAPI loop and call
xdp_do_flush() at then end of each round of poll. TAP did it in the
context of process e.g tun_get_user(). So fix this by count the
pending redirected packets and flush when it exceeds NAPI_POLL_WEIGHT
or MSG_MORE was cleared by sendmsg() caller.

With this fix, xdp_redirect_map works again between two TAPs.

Fixes: 761876c857 ("tap: XDP support")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-08 14:10:30 -05:00
Masahiro Yamada
9e3e10c725 kconfig: send error messages to stderr
These messages should be directed to stderr.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-02-09 04:10:10 +09:00
Masahiro Yamada
f3ff6fb5db kconfig: echo stdin to stdout if either is redirected
If stdio is not tty, conf_askvalue() puts additional new line to
prevent prompts from being concatenated into a single line.  This
care is missing in conf_choice(), so a 'choice' prompt and the next
prompt are shown in the same line.

Move the code into xfgets() to cater to all cases.  To improve this
more, let's echo stdin to stdout.  This clarifies what keys were
input from stdio and the stdout looks like as if it were from tty.

I removed the isatty(2) check since stderr is unrelated here.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-02-09 04:10:10 +09:00
Masahiro Yamada
d2a04648a5 kconfig: remove check_stdin()
Except silentoldconfig, valid_stdin is 1, so check_stdin() is no-op.

oldconfig and silentoldconfig work almost in the same way except that
the latter generates additional files under include/.  Both ask users
for input for new symbols.

I do not know why only silentoldconfig requires stdio be tty.

  $ rm -f .config; touch .config
  $ yes "" | make oldconfig > stdout
  $ rm -f .config; touch .config
  $ yes "" | make silentoldconfig > stdout
  make[1]: *** [silentoldconfig] Error 1
  make: *** [silentoldconfig] Error 2
  $ tail -n 4 stdout
  Console input/output is redirected. Run 'make oldconfig' to update configuration.

  scripts/kconfig/Makefile:40: recipe for target 'silentoldconfig' failed
  Makefile:507: recipe for target 'silentoldconfig' failed

Redirection is useful, for example, for testing where we want to give
particular key inputs from a test file, then check the result.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-02-09 04:10:10 +09:00
Masahiro Yamada
cd58a91def kconfig: remove 'config*' pattern from .gitignnore
I could not figure out why this pattern should be ignored.
Checking commit 1e65174a33 ("Add some basic .gitignore files")
did not help.

Let's remove this pattern, then see if it is really needed.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-02-09 04:10:09 +09:00
Masahiro Yamada
4f208f3921 kconfig: show '?' prompt even if no help text is available
'make config', 'make oldconfig', etc. always receive '?' as a valid
input and show useful information even if no help text is available.

------------------------>8------------------------
foo (FOO) [N/y] (NEW) ?

There is no help available for this option.
Symbol: FOO [=n]
Type  : bool
Prompt: foo
  Defined at Kconfig:1
------------------------>8------------------------

However, '?' is not shown in the prompt if its help text is missing.
Let's show '?' all the time so that the prompt and the behavior match.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-02-09 04:10:09 +09:00
Masahiro Yamada
cb67ab2cd2 kconfig: do not write choice values when their dependency becomes n
"# CONFIG_... is not set" for choice values are wrongly written into
the .config file if they are once visible, then become invisible later.

  Test case
  ---------

---------------------------(Kconfig)----------------------------
config A
	bool "A"

choice
	prompt "Choice ?"
	depends on A

config CHOICE_B
	bool "Choice B"

config CHOICE_C
	bool "Choice C"

endchoice
----------------------------------------------------------------

---------------------------(.config)----------------------------
CONFIG_A=y
----------------------------------------------------------------

With the Kconfig and .config above,

  $ make config
  scripts/kconfig/conf  --oldaskconfig Kconfig
  *
  * Linux Kernel Configuration
  *
  A (A) [Y/n] n
  #
  # configuration written to .config
  #
  $ cat .config
  #
  # Automatically generated file; DO NOT EDIT.
  # Linux Kernel Configuration
  #
  # CONFIG_A is not set
  # CONFIG_CHOICE_B is not set
  # CONFIG_CHOICE_C is not set

Here,

  # CONFIG_CHOICE_B is not set
  # CONFIG_CHOICE_C is not set

should not be written into the .config file because their dependency
"depends on A" is unmet.

Currently, there is no code that clears SYMBOL_WRITE of choice values.

Clear SYMBOL_WRITE for all symbols in sym_calc_value(), then set it
again after calculating visibility.  To simplify the logic, set the
flag if they have non-n visibility, regardless of types, and regardless
of whether they are choice values or not.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-02-09 04:08:05 +09:00
Nicolas Dichtel
cb9f7a9a5c netlink: ensure to loop over all netns in genlmsg_multicast_allns()
Nowadays, nlmsg_multicast() returns only 0 or -ESRCH but this was not the
case when commit 134e63756d was pushed.
However, there was no reason to stop the loop if a netns does not have
listeners.
Returns -ESRCH only if there was no listeners in all netns.

To avoid having the same problem in the future, I didn't take the
assumption that nlmsg_multicast() returns only 0 or -ESRCH.

Fixes: 134e63756d ("genetlink: make netns aware")
CC: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-08 14:03:18 -05:00
David Howells
8c2f826dc3 rxrpc: Don't put crypto buffers on the stack
Don't put buffers of data to be handed to crypto on the stack as this may
cause an assertion failure in the kernel (see below).  Fix this by using an
kmalloc'd buffer instead.

kernel BUG at ./include/linux/scatterlist.h:147!
...
RIP: 0010:rxkad_encrypt_response.isra.6+0x191/0x1b0 [rxrpc]
RSP: 0018:ffffbe2fc06cfca8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff989277d59900 RCX: 0000000000000028
RDX: 0000259dc06cfd88 RSI: 0000000000000025 RDI: ffffbe30406cfd88
RBP: ffffbe2fc06cfd60 R08: ffffbe2fc06cfd08 R09: ffffbe2fc06cfd08
R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff7c5f80d9f95
R13: ffffbe2fc06cfd88 R14: ffff98927a3f7aa0 R15: ffffbe2fc06cfd08
FS:  0000000000000000(0000) GS:ffff98927fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055b1ff28f0f8 CR3: 000000001b412003 CR4: 00000000003606f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 rxkad_respond_to_challenge+0x297/0x330 [rxrpc]
 rxrpc_process_connection+0xd1/0x690 [rxrpc]
 ? process_one_work+0x1c3/0x680
 ? __lock_is_held+0x59/0xa0
 process_one_work+0x249/0x680
 worker_thread+0x3a/0x390
 ? process_one_work+0x680/0x680
 kthread+0x121/0x140
 ? kthread_create_worker_on_cpu+0x70/0x70
 ret_from_fork+0x3a/0x50

Reported-by: Jonathan Billings <jsbillings@jsbillings.org>
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Jonathan Billings <jsbillings@jsbillings.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-08 13:48:29 -05:00
Linus Torvalds
c013632192 2nd set of arm64 updates for 4.16:
Spectre v1 mitigation:
 - back-end version of array_index_mask_nospec()
 - masking of the syscall number to restrict speculation through the
   syscall table
 - masking of __user pointers prior to deference in uaccess routines
 
 Spectre v2 mitigation update:
 - using the new firmware SMC calling convention specification update
 - removing the current PSCI GET_VERSION firmware call mitigation as
   vendors are deploying new SMCCC-capable firmware
 - additional branch predictor hardening for synchronous exceptions and
   interrupts while in user mode
 
 Meltdown v3 mitigation update for Cavium Thunder X: unaffected but
 hardware erratum gets in the way. The kernel now starts with the page
 tables mapped as global and switches to non-global if kpti needs to be
 enabled.
 
 Other:
 - Theoretical trylock bug fixed
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAlp8lqcACgkQa9axLQDI
 XvH2lxAAnsYqthpGQ11MtDJB+/UiBAFkg9QWPDkwrBDvNhgpll+J0VQuCN1QJ2GX
 qQ8rkv8uV+y4Fqr8hORGJy5At+0aI63ZCJ72RGkZTzJAtbFbFGIDHP7RhAEIGJBS
 Lk9kDZ7k39wLEx30UXIFYTTVzyHar397TdI7vkTcngiTzZ8MdFATfN/hiKO906q3
 14pYnU9Um4aHUdcJ+FocL3dxvdgniuuMBWoNiYXyOCZXjmbQOnDNU2UrICroV8lS
 mB+IHNEhX1Gl35QzNBtC0ET+aySfHBMJmM5oln+uVUljIGx6En1WLj6mrHYcx8U2
 rIBm5qO/X/4iuzYPGkxwQtpjq3wPYxsSUnMdKJrsUZqAfy2QeIhFx6XUtJsZPB2J
 /lgls5xSXMOS7oiOQtmVjcDLBURDmYXGwljXR4n4jLm4CT1V9qSLcKHu1gdFU9Mq
 VuMUdPOnQub1vqKndi154IoYDTo21jAib2ktbcxpJfSJnDYoit4Gtnv7eWY+M3Pd
 Toaxi8htM2HSRwbvslHYGW8ZcVpI79Jit+ti7CsFg7m9Lvgs0zxcnNui4uPYDymT
 jh2JYxuirIJbX9aGGhnmkNhq9REaeZJg9LA2JM8S77FCHN3bnlSdaG6wy899J6EI
 lK4anCuPQKKKhUia/dc1MeKwrmmC18EfPyGUkOzywg/jGwGCmZM=
 =Y0TT
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull more arm64 updates from Catalin Marinas:
 "As I mentioned in the last pull request, there's a second batch of
  security updates for arm64 with mitigations for Spectre/v1 and an
  improved one for Spectre/v2 (via a newly defined firmware interface
  API).

  Spectre v1 mitigation:

   - back-end version of array_index_mask_nospec()

   - masking of the syscall number to restrict speculation through the
     syscall table

   - masking of __user pointers prior to deference in uaccess routines

  Spectre v2 mitigation update:

   - using the new firmware SMC calling convention specification update

   - removing the current PSCI GET_VERSION firmware call mitigation as
     vendors are deploying new SMCCC-capable firmware

   - additional branch predictor hardening for synchronous exceptions
     and interrupts while in user mode

  Meltdown v3 mitigation update:

    - Cavium Thunder X is unaffected but a hardware erratum gets in the
      way. The kernel now starts with the page tables mapped as global
      and switches to non-global if kpti needs to be enabled.

  Other:

   - Theoretical trylock bug fixed"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (38 commits)
  arm64: Kill PSCI_GET_VERSION as a variant-2 workaround
  arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP hardening support
  arm/arm64: smccc: Implement SMCCC v1.1 inline primitive
  arm/arm64: smccc: Make function identifiers an unsigned quantity
  firmware/psci: Expose SMCCC version through psci_ops
  firmware/psci: Expose PSCI conduit
  arm64: KVM: Add SMCCC_ARCH_WORKAROUND_1 fast handling
  arm64: KVM: Report SMCCC_ARCH_WORKAROUND_1 BP hardening support
  arm/arm64: KVM: Turn kvm_psci_version into a static inline
  arm/arm64: KVM: Advertise SMCCC v1.1
  arm/arm64: KVM: Implement PSCI 1.0 support
  arm/arm64: KVM: Add smccc accessors to PSCI code
  arm/arm64: KVM: Add PSCI_VERSION helper
  arm/arm64: KVM: Consolidate the PSCI include files
  arm64: KVM: Increment PC after handling an SMC trap
  arm: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls
  arm64: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls
  arm64: entry: Apply BP hardening for suspicious interrupts from EL0
  arm64: entry: Apply BP hardening for high-priority synchronous exceptions
  arm64: futex: Mask __user pointers prior to dereference
  ...
2018-02-08 10:44:25 -08:00
Linus Torvalds
846ade7dd2 virtio, vhost: fixes, cleanups, features
This includes the disk/cache memory stats for for the virtio balloon,
 as well as multiple fixes and cleanups.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJaempgAAoJECgfDbjSjVRpGOIIAIWiarIFrjjcE+hxcFdKkKvC
 T8YQzfvxHTuBqD8m1jd/9R/U0RHYRM4MX+cg6tVz9J2VVhQ2Hjfrs7HExqoZKul8
 sOOzk0d+ii0iQvlgnmmXamlceizP4IuNZcia7FDAZK0hWfHE84dPrG3/hhWOGruN
 NDR1v7k8GBIMS+7lExQwzmy6gs4zGJftJUF9Fnb4CVT26wWbOKYS2exC8UJerZAE
 2puWx71Fd/C27x/iOsxZOME0tOmzU7MIRksSjDcT7YQLesIuJSbjpQd5yfvXW0qC
 7iEQzueZuYP4VroHvpDbKDxvFsIxhjFGKR4/sSG+a0Zso1ejS9fyuDL9a627PJI=
 =/qQR
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio/vhost updates from Michael Tsirkin:
 "virtio, vhost: fixes, cleanups, features

  This includes the disk/cache memory stats for for the virtio balloon,
  as well as multiple fixes and cleanups"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vhost: don't hold onto file pointer for VHOST_SET_LOG_FD
  vhost: don't hold onto file pointer for VHOST_SET_VRING_ERR
  vhost: don't hold onto file pointer for VHOST_SET_VRING_CALL
  ringtest: ring.c malloc & memset to calloc
  virtio_vop: don't kfree device on register failure
  virtio_pci: don't kfree device on register failure
  virtio: split device_register into device_initialize and device_add
  vhost: remove unused lock check flag in vhost_dev_cleanup()
  vhost: Remove the unused variable.
  virtio_blk: print capacity at probe time
  virtio: make VIRTIO a menuconfig to ease disabling it all
  virtio/ringtest: virtio_ring: fix up need_event math
  virtio/ringtest: fix up need_event math
  virtio: virtio_mmio: make of_device_ids const.
  firmware: Use PTR_ERR_OR_ZERO()
  virtio-mmio: Use PTR_ERR_OR_ZERO()
  vhost/scsi: Improve a size determination in four functions
  virtio_balloon: include disk/file caches memory statistics
2018-02-08 10:41:00 -08:00
Chuck Lever
175e03101d svcrdma: Fix Read chunk round-up
A single NFSv4 WRITE compound can often have three operations:
PUTFH, WRITE, then GETATTR.

When the WRITE payload is sent in a Read chunk, the client places
the GETATTR in the inline part of the RPC/RDMA message, just after
the WRITE operation (sans payload). The position value in the Read
chunk enables the receiver to insert the Read chunk at the correct
place in the received XDR stream; that is between the WRITE and
GETATTR.

According to RFC 8166, an NFS/RDMA client does not have to add XDR
round-up to the Read chunk that carries the WRITE payload. The
receiver adds XDR round-up padding if it is absent and the
receiver's XDR decoder requires it to be present.

Commit 193bcb7b37 ("svcrdma: Populate tail iovec when receiving")
attempted to add support for receiving such a compound so that just
the WRITE payload appears in rq_arg's page list, and the trailing
GETATTR is placed in rq_arg's tail iovec. (TCP just strings the
whole compound into the head iovec and page list, without regard
to the alignment of the WRITE payload).

The server transport logic also had to accommodate the optional XDR
round-up of the Read chunk, which it did simply by lengthening the
tail iovec when round-up was needed. This approach is adequate for
the NFSv2 and NFSv3 WRITE decoders.

Unfortunately it is not sufficient for nfsd4_decode_write. When the
Read chunk length is a couple of bytes less than PAGE_SIZE, the
computation at the end of nfsd4_decode_write allows argp->pagelen to
go negative, which breaks the logic in read_buf that looks for the
tail iovec.

The result is that a WRITE operation whose payload length is just
less than a multiple of a page succeeds, but the subsequent GETATTR
in the same compound fails with NFS4ERR_OP_ILLEGAL because the XDR
decoder can't find it. Clients ignore the error, but they must
update their attribute cache via a separate round trip.

As nfsd4_decode_write appears to expect the payload itself to always
have appropriate XDR round-up, have svc_rdma_build_normal_read_chunk
add the Read chunk XDR round-up to the page_len rather than
lengthening the tail iovec.

Reported-by: Olga Kornievskaia <kolga@netapp.com>
Fixes: 193bcb7b37 ("svcrdma: Populate tail iovec when receiving")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-02-08 13:40:17 -05:00
Arnd Bergmann
2285ae760d NFSD: hide unused svcxdr_dupstr()
There is now only one caller left for svcxdr_dupstr() and this is inside
of an #ifdef, so we can get a warning when the option is disabled:

fs/nfsd/nfs4xdr.c:241:1: error: 'svcxdr_dupstr' defined but not used [-Werror=unused-function]

This changes the remaining caller to use a nicer IS_ENABLED() check,
which lets the compiler drop the unused code silently.

Fixes: e40d99e6183e ("NFSD: Clean up symlink argument XDR decoders")
Suggested-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-02-08 13:40:17 -05:00
Amir Goldstein
39ca1bf624 nfsd: store stat times in fill_pre_wcc() instead of inode times
The time values in stat and inode may differ for overlayfs and stat time
values are the correct ones to use. This is also consistent with the fact
that fill_post_wcc() also stores stat time values.

This means introducing a stat call that could fail, where previously we
were just copying values out of the inode.  To be conservative about
changing behavior, we fall back to copying values out of the inode in
the error case.  It might be better just to clear fh_pre_saved (though
note the BUG_ON in set_change_info).

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-02-08 13:40:17 -05:00
Amir Goldstein
76c479480b nfsd: encode stat->mtime for getattr instead of inode->i_mtime
The values of stat->mtime and inode->i_mtime may differ for overlayfs
and stat->mtime is the correct value to use when encoding getattr.
This is also consistent with the fact that other attr times are also
encoded from stat values.

Both callers of lease_get_mtime() already have the value of stat->mtime,
so the only needed change is that lease_get_mtime() will not overwrite
this value with inode->i_mtime in case the inode does not have an
exclusive lease.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-02-08 13:40:16 -05:00
J. Bruce Fields
0078117c6d nfsd: return RESOURCE not GARBAGE_ARGS on too many ops
A client that sends more than a hundred ops in a single compound
currently gets an rpc-level GARBAGE_ARGS error.

It would be more helpful to return NFS4ERR_RESOURCE, since that gives
the client a better idea how to recover (for example by splitting up the
compound into smaller compounds).

This is all a bit academic since we've never actually seen a reason for
clients to send such long compounds, but we may as well fix it.

While we're there, just use NFSD4_MAX_OPS_PER_COMPOUND == 16, the
constant we already use in the 4.1 case, instead of hard-coding 100.
Chances anyone actually uses even 16 ops per compound are small enough
that I think there's a neglible risk or any regression.

This fixes pynfs test COMP6.

Reported-by: "Lu, Xinyu" <luxy.fnst@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-02-08 13:40:16 -05:00
Linus Torvalds
977e41524d CRIS port changes for 4.16
Includes only a small fix for some conflicting symbols, aligning CRIS
 with other platforms.
 -----BEGIN PGP SIGNATURE-----
 
 iEYEABECAAYFAlp5frcACgkQ31LbvUHyf1e9rACbBugJIN2v+wHZ2+XmQOW27hDp
 V/IAn1HkpGdwyxxCZ6JZnr/XAtCi1RhJ
 =EP69
 -----END PGP SIGNATURE-----
mergetag object 6e0377212c
 type commit
 tag cris-for-4.16-urgent
 tagger Jesper Nilsson <jesper@jni.nu> 1518084841 +0100
 
 CRIS urgent breakage fix for 4.16
 
 The main Makefile for the CRIS port was
 overzealously scrubbed in 4.15-rc3,
 breaking the build for all CRIS SoCs.
 -----BEGIN PGP SIGNATURE-----
 
 iEYEABECAAYFAlp8I9IACgkQ31LbvUHyf1dIiACeP/H/3asKo7JgidYmA1gkEk4A
 oiwAn0QXaFm5ljxuBSd88FIr4E5vfwdD
 =fjIf
 -----END PGP SIGNATURE-----

Merge tags 'cris-for-4.16' and 'cris-for-4.16-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/jesper/cris

Pull CRIS updates and fixes from Jesper Nilsson:

 - a small fix for some conflicting symbols, aligning CRIS with other
   platforms.

 - fix build breakage regression for all CRIS SoCs. The main Makefile
   for the CRIS port was overzealously scrubbed in 4.15-rc3.

* tag 'cris-for-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/jesper/cris:
  cris: Fix conflicting types for _etext, _edata, _end

* tag 'cris-for-4.16-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/jesper/cris:
  CRIS: Restore mistakenly cleared kernel Makefile
2018-02-08 10:36:05 -08:00
Kalle Valo
99ffd198f0 Merge ath-current from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git
ath.git fixes for 4.16. Major changes:

ath10k

* correct firmware RAM dump length for QCA6174/QCA9377

* add new QCA988X device id

* fix a kernel panic during pci probe

* revert a recent commit which broke ath10k firmware metadata parsing

ath9k

* fix a noise floor regression introduced during the merge window

* add new device id
2018-02-08 19:28:49 +02:00
Steven Rostedt (VMware)
878cb3fb06 selftests/ftrace: Add more tests for removing of function probes
Al Viro discovered a bug in the removing of function probes where if it had
a '*' at the beginning, it would fail to find any matches. That is, because
it reset the glob search string to the the initial string with a "MATCH_END"
type, instead of skipping the wildcard "*" it included it, where it would
not match any functions because "*" was being treated as a normal character
and not a wildcard one.

Link: http://lkml.kernel.org/r/20180127031706.GE13338@ZenIV.linux.org.uk

Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-02-08 10:31:50 -05:00
Steven Rostedt (VMware)
97fe22adf3 selftests/ftrace: Add some missing glob checks
Al Viro discovered a bug in the glob ftrace filtering code where "*a*b" is
treated the same as "a*b", and functions that would be selected by "*a*b"
but not "a*b" are not selected with "*a*b".

Add tests for patterns "*a*b" and "a*b*" to the glob selftest.

Link: http://lkml.kernel.org/r/20180127170748.GF13338@ZenIV.linux.org.uk

Cc: Shuah Khan <shuah@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-02-08 10:13:28 -05:00
Steven Rostedt (VMware)
0787ce3360 selftests/ftrace: Have reset_ftrace_filter handle multiple instances
If a probe is attached to a static function that is in multiple files with
the same name, removing it by name will remove all instances:

 # grep jump_label_unlock set_ftrace_filter
jump_label_unlock:traceoff:unlimited
jump_label_unlock:traceoff:unlimited

 # echo '!jump_label_unlock:traceoff' >> set_ftrace_filter
 # grep jump_label_unlock set_ftrace_filter
 #

But the loop in reset_ftrace_filter will try to remove multiple instances
multiple times. If this happens the second time will error and cause the
test to fail.

At each iteration of the loop, check to see if the probe being removed still
exists.

Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-02-08 10:13:17 -05:00
Steven Rostedt (VMware)
df9d36d9e8 selftests/ftrace: Have reset_ftrace_filter handle modules
If a function probe in set_ftrace_filter belongs to a module, it will
contain the module name. Like:

 wmi_query_block [wmi]:traceoff:unlimited

But writing:

 '!wmi_query_block [wmi]:traceoff' > set_ftrace_filter

will cause an error. We still need to write:

 '!wmi_query_block:traceoff' > set_ftrace_filter

Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-02-08 10:13:09 -05:00
Steven Rostedt (VMware)
0723402141 tracing: Fix parsing of globs with a wildcard at the beginning
Al Viro reported:

    For substring - sure, but what about something like "*a*b" and "a*b"?
    AFAICS, filter_parse_regex() ends up with identical results in both
    cases - MATCH_GLOB and *search = "a*b".  And no way for the caller
    to tell one from another.

Testing this with the following:

 # cd /sys/kernel/tracing
 # echo '*raw*lock' > set_ftrace_filter
 bash: echo: write error: Invalid argument

With this patch:

 # echo '*raw*lock' > set_ftrace_filter
 # cat set_ftrace_filter
_raw_read_trylock
_raw_write_trylock
_raw_read_unlock
_raw_spin_unlock
_raw_write_unlock
_raw_spin_trylock
_raw_spin_lock
_raw_write_lock
_raw_read_lock

Al recommended not setting the search buffer to skip the first '*' unless we
know we are not using MATCH_GLOB. This implements his suggested logic.

Link: http://lkml.kernel.org/r/20180127170748.GF13338@ZenIV.linux.org.uk

Cc: stable@vger.kernel.org
Fixes: 60f1d5e3ba ("ftrace: Support full glob matching")
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Suggsted-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-02-08 10:11:47 -05:00
Steven Rostedt (VMware)
7b65865627 ftrace: Remove incorrect setting of glob search field
__unregister_ftrace_function_probe() will incorrectly parse the glob filter
because it resets the search variable that was setup by filter_parse_regex().

Al Viro reported this:

    After that call of filter_parse_regex() we could have func_g.search not
    equal to glob only if glob started with '!' or '*'.  In the former case
    we would've buggered off with -EINVAL (not = 1).  In the latter we
    would've set func_g.search equal to glob + 1, calculated the length of
    that thing in func_g.len and proceeded to reset func_g.search back to
    glob.

    Suppose the glob is e.g. *foo*.  We end up with
	    func_g.type = MATCH_MIDDLE_ONLY;
	    func_g.len = 3;
	    func_g.search = "*foo";
    Feeding that to ftrace_match_record() will not do anything sane - we
    will be looking for names containing "*foo" (->len is ignored for that
    one).

Link: http://lkml.kernel.org/r/20180127031706.GE13338@ZenIV.linux.org.uk

Cc: stable@vger.kernel.org
Fixes: 3ba0092971 ("ftrace: Introduce ftrace_glob structure")
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-02-08 10:11:11 -05:00
David S. Miller
c702558681 Merge branch 'nfp-fix-disabling-TC-offloads-in-flower-max-TSO-segs-and-module-version'
Jakub Kicinski says:

====================
nfp: fix disabling TC offloads in flower, max TSO segs and module version

This set corrects the way nfp deals with the NETIF_F_HW_TC flag.
It has slipped the review that flower offload does not currently
refuse disabling this flag when filter offload is active.

nfp's flower offload does not actually keep track of how many filters
for each port are offloaded.  The accounting of the number of filters
is added to the nfp core structures, and BPF moved to use these
structures as well.

If users are allowed to disable TC offloads while filters are active,
not only is it incorrect behaviour, but actually the NFP will never
be told to remove the flows, leading to use-after-free when stats
arrive.

Fourth patch makes sure we declare the max number of TSO segments.
FW should drop longer packets cleanly (otherwise this would be a
security problem for untrusted VFs) but dropping longer TSO frames
is not nice and driver should prevent them from being generated.

Last small addition populates MODULE_VERSION with kernel version.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-08 10:01:28 -05:00
Jakub Kicinski
1a5e8e3500 nfp: populate MODULE_VERSION
DKMS and similar out-of-tree module replacement services use
module version to make sure the out-of-tree software is not
older than the module shipped with the kernel.  We use the
kernel version in ethtool -i output, put it into MODULE_VERSION
as well.

Reported-by: Jan Gutter <jan.gutter@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-08 10:01:27 -05:00
Jakub Kicinski
0d592e52fb nfp: limit the number of TSO segments
Most FWs limit the number of TSO segments a frame can produce
to 64.  This is for fairness and efficiency (of FW datapath)
reasons.  If a frame with larger number of segments is submitted
the FW will drop it.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-08 10:01:27 -05:00
Jakub Kicinski
d692403e5c nfp: forbid disabling hw-tc-offload on representors while offload active
All netdevs which can accept TC offloads must implement
.ndo_set_features().  nfp_reprs currently do not do that, which
means hw-tc-offload can be turned on and off even when offloads
are active.

Whether the offloads are active is really a question to nfp_ports,
so remove the per-app tc_busy callback indirection thing, and
simply count the number of offloaded items in nfp_port structure.

Fixes: 8a2768732a ("nfp: provide infrastructure for offloading flower based TC filters")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Tested-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-08 10:01:27 -05:00
Jakub Kicinski
0b9de4ca85 nfp: don't advertise hw-tc-offload on non-port netdevs
nfp_port is a structure which represents an ASIC port, both
PCIe vNIC (on a PF or a VF) or the external MAC port.  vNIC
netdev (struct nfp_net) and pure representor netdev (struct
nfp_repr) both have a pointer to this structure.  nfp_reprs
always have a port associated.  nfp_nets, however, only represent
a device port in legacy mode, where they are considered the
MAC port. In switchdev mode they are just the CPU's side of
the PCIe link.

By definition TC offloads only apply to device ports.  Don't
set the flag on vNICs without a port (i.e. in switchdev mode).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Tested-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-08 10:01:27 -05:00
Jakub Kicinski
e3ac6c0737 nfp: bpf: require ETH table
Upcoming changes will require all netdevs supporting TC offloads
to have a full struct nfp_port.  Require those for BPF offload.
The operation without management FW reporting information about
Ethernet ports is something we only support for very old and very
basic NIC firmwares anyway.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Tested-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-08 10:01:27 -05:00
Ryan Hsu
9ce8b24aa9 Revert "ath10k: add sanity check to ie_len before parsing fw/board ie"
This reverts commit 9ed4f91628.

The commit introduced a regression that over read the ie with
the padding.

- the expected IE information

ath10k_pci 0000:03:00.0: found firmware features ie (1 B)
ath10k_pci 0000:03:00.0: Enabling feature bit: 6
ath10k_pci 0000:03:00.0: Enabling feature bit: 7
ath10k_pci 0000:03:00.0: features
ath10k_pci 0000:03:00.0: 00000000: c0 00 00 00 00 00 00 00

- the wrong IE with padding is read (0x77)

ath10k_pci 0000:03:00.0: found firmware features ie (4 B)
ath10k_pci 0000:03:00.0: Enabling feature bit: 6
ath10k_pci 0000:03:00.0: Enabling feature bit: 7
ath10k_pci 0000:03:00.0: Enabling feature bit: 8
ath10k_pci 0000:03:00.0: Enabling feature bit: 9
ath10k_pci 0000:03:00.0: Enabling feature bit: 10
ath10k_pci 0000:03:00.0: Enabling feature bit: 12
ath10k_pci 0000:03:00.0: Enabling feature bit: 13
ath10k_pci 0000:03:00.0: Enabling feature bit: 14
ath10k_pci 0000:03:00.0: Enabling feature bit: 16
ath10k_pci 0000:03:00.0: Enabling feature bit: 17
ath10k_pci 0000:03:00.0: Enabling feature bit: 18
ath10k_pci 0000:03:00.0: features
ath10k_pci 0000:03:00.0: 00000000: c0 77 07 00 00 00 00 00

Tested-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Ryan Hsu <ryanhsu@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-02-08 14:34:18 +02:00
Daniel Borkmann
69fe98edee Merge branch 'bpf-misc-nfp-bpftool-doc-fixes'
Jakub Kicinski says:

====================
First patch in this series fixes applying the relocation to immediate
load instructions in the NFP JIT.

The remaining patches come from Quentin.  Small addition to libbpf
makes sure it recognizes all standard section names.  Makefile in
bpftool/Documentation is improved to explicitly check for rst2man
being installed on the system, otherwise we risk installing empty
files.  Man page for bpftool-map is corrected to include program
as a potential value for map of programs.

Last two patches are slightly longer, those update bash completions to
include this release cycle's additions from Roman.  Maybe the use of
Fixes tags is slightly frivolous there, but having bash completions
which don't cover all commands and options could be disruptive to work
flow for users.
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-08 11:59:51 +01:00
Quentin Monnet
a827a164b9 tools: bpftool: add bash completion for cgroup commands
Add bash completion for "bpftool cgroup" command family. While at it,
also fix the formatting of some keywords in the man page for cgroups.

Fixes: 5ccda64d38 ("bpftool: implement cgroup bpf operations")
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-08 11:59:50 +01:00
Quentin Monnet
55f3538c49 tools: bpftool: add bash completion for bpftool prog load
Add bash completion for bpftool command `prog load`. Completion for this
command is easy, as it only takes existing file paths as arguments.

Fixes: 49a086c201 ("bpftool: implement prog load command")
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-08 11:59:50 +01:00
Quentin Monnet
2148481d5d tools: bpftool: make syntax for program map update explicit in man page
Specify in the documentation that when using bpftool to update a map of
type BPF_MAP_TYPE_PROG_ARRAY, the syntax for the program used as a value
should use the "id|tag|pinned" keywords convention, as used with
"bpftool prog" commands.

Fixes: ff69c21a85 ("tools: bpftool: add documentation")
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-08 11:59:50 +01:00
Quentin Monnet
92426820f7 tools: bpftool: exit doc Makefile early if rst2man is not available
If rst2man is not available on the system, running `make doc` from the
bpftool directory fails with an error message. However, it creates empty
manual pages (.8 files in this case). A subsequent call to `make
doc-install` would then succeed and install those empty man pages on the
system.

To prevent this, raise a Makefile error and exit immediately if rst2man
is not available before generating the pages from the rst documentation.

Fixes: ff69c21a85 ("tools: bpftool: add documentation")
Reported-by: Jason van Aaardt <jason.vanaardt@netronome.com>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-08 11:59:50 +01:00