57111 Commits

Author SHA1 Message Date
Yan, Zheng
8a2ac3a8e9 ceph: don't request excl caps when mount is readonly
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-12-26 16:08:36 +01:00
Yan, Zheng
3c1392d4c4 ceph: don't update importing cap's mseq when handing cap export
Updating mseq makes client think importer mds has accepted all prior
cap messages and importer mds knows what caps client wants. Actually
some cap messages may have been dropped because of mseq mismatch.

If mseq is left untouched, importing cap's mds_wanted later will get
reset by cap import message.

Cc: stable@vger.kernel.org
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-12-26 16:08:25 +01:00
Chengguang Xu
0cab9f33d9 ceph: remove redundant assignment
There is redundant assighment of variable i in
ceph_mdsmap_get_random_mds(), just remvoe it.

Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-12-26 15:56:04 +01:00
Yan, Zheng
2bf996ac48 ceph: cleanup splice_dentry()
splice_dentry() may drop the original dentry and return other
dentry. It relies on its caller to update pointer that points
to the dropped dentry. This is error-prone.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-12-26 15:56:03 +01:00
Linus Torvalds
eaa7649971 SPI NOR Changes
Core changes:
   - Parse the 4BAIT SFDP section
   - Add a bunch of SPI NOR entries to the flash_info table
   - Add the concept of SFDP fixups and use it to fix a bug on MX25L25635F
   - A bunch of minor cleanups/comestic changes
 
 NAND changes:
   NAND core changes:
   - kernel-doc miscellaneous fixes.
   - Third batch of fixes/cleanup to the raw NAND core impacting various
     controller drivers (ams-delta, marvell, fsmc, denali, tegra, vf610):
     * Stopping to pass mtd_info objects to internal functions
     * Reorganizing code to avoid forward declarations
     * Dropping useless test in nand_legacy_set_defaults()
     * Moving nand_exec_op() to internal.h
     * Adding nand_[de]select_target() helpers
     * Passing the CS line to be selected in struct nand_operation
     * Making ->select_chip() optional when ->exec_op() is implemented
     * Deprecating the ->select_chip() hook
     * Moving the ->exec_op() method to nand_controller_ops
     * Moving ->setup_data_interface() to nand_controller_ops
     * Deprecating the dummy_controller field
     * Fixing JEDEC detection
     * Providing a helper for polling GPIO R/B pin
 
   Raw NAND chip drivers changes:
   - Macronix:
     * Flagging 1.8V AC chips with a broken GET_FEATURES(TIMINGS)
 
   Raw NAND controllers drivers changes:
   - Ams-delta:
     * Fixing the error path
     * SPDX tag added
     * May be compiled with COMPILE_TEST=y
     * Conversion to ->exec_op() interface
     * Dropping .IOADDR_R/W use
     * Use GPIO API for data I/O
   - Denali:
     * Removing denali_reset_banks()
     * Removing ->dev_ready() hook
     * Including <linux/bits.h> instead of <linux/bitops.h>
     * Changes to comply with the above fixes/cleanup done in the core.
   - FSMC:
     * Adding an SPDX tag to replace the license text
     * Making conversion from chip to fsmc consistent
     * Fixing unchecked return value in fsmc_read_page_hwecc
     * Changes to comply with the above fixes/cleanup done in the core.
   - Marvell:
     * Preventing timeouts on a loaded machine (fix)
     * Changes to comply with the above fixes/cleanup done in the core.
   - OMAP2:
     * Pass the parent of pdev to dma_request_chan() (fix)
   - R852:
     * Use generic DMA API
   - sh_flctl:
     * Converting to SPDX identifiers
   - Sunxi:
     * Write pageprog related opcodes to the right register: WCMD_SET (fix)
   - Tegra:
     * Stop implementing ->select_chip()
   - VF610:
     * Adding an SPDX tag to replace the license text
     * Changes to comply with the above fixes/cleanup done in the core.
   - Various trivial/spelling/coding style fixes.
 
   SPI-NAND drivers changes:
   - Removing the depreacated mt29f_spinand driver from staging.
   - Adding support for:
     * Toshiba TC58CVG2S0H
     * GigaDevice GD5FxGQ4xA
     * Winbond W25N01GV
 
 JFFS2 changes:
 - Fix a lockdep issue
 
 MTD changes:
 - Rework the physmap driver to merge gpio-addr-flash and physmap_of
   in it
 - Add a new compatible for RedBoot partitions
 - Make sub-partitions RW if the parent partition was RO because of a
   mis-alignment
 - Add pinctrl support to the
 - Addition of /* fall-through */ comments where appropriate
 - Various minor fixes and cleanups
 
 Other changes:
 - Update my email address
 -----BEGIN PGP SIGNATURE-----
 
 iQJQBAABCgA6FiEEKmCqpbOU668PNA69Ze02AX4ItwAFAlwZRoocHGJvcmlzLmJy
 ZXppbGxvbkBib290bGluLmNvbQAKCRBl7TYBfgi3ALhmEACGoyJLVxRsE0yXOex7
 JdPhZxnOkwnzaxQPQBQ7pFrWvRinV1Th8JenCCLiyM6DBcIDRo/87etC6cnjfnH/
 eJ8e7xPLiESCsAB8xixP1YvLaZQzjBKOH/+qNR6anp505ZNKhgsuUDATLSFfaEvz
 bZv3f1V70dPXdEK/3QXDZakqVtfvVBMSdyJMWdKSdVn70yA72wCPP4igcGdfYfoZ
 KKaanhS1EdxA++WFrRqocQay9rtjnFGONHLfrefop9YKSevLp1UDZSAYSk4CHcEf
 yaMFD+qlxc0wHlOk4XyDY3vREcdO50r2vXfN/Hxf0D6JeC/6RT0Hrm0YyY8X/u7l
 xLhSv+DKGyft3SnQ4MDdwg57bvDSOkPryI3cxBul3S6I00pCdo8l5zQskezDbk0E
 CkUuB+f7Wn3lmV7W8ZNbYHx0ljVJEXMxEQ6m/6ZLizIr/aC7m5ncgv8a3mL7QQQA
 KXuLamw9pPlf/tAOQxB1PTiE1n8ECYqcInJXzoxaRzUdn8TlIYjxUb7GIDoOFpzm
 6a3oCx6tUZq4j/GAoyYgo12NhH0aYW3X/V8N7SfPjmIofvFiUL1y1VQ2ghNDLCqH
 2cz6WsFGoA0oSVnxl/TSSM6/ZXbvnLgM4zq6+wZITyCFNwyC956MddZrjL6lVkMT
 s6kvh/h3wBWN+GxwlDeY+KZMlQ==
 =myqK
 -----END PGP SIGNATURE-----

Merge tag 'mtd/for-4.21' of git://git.infradead.org/linux-mtd

Pull mtd updates from Boris Brezillon:
 "SPI NOR Core changes:
   - Parse the 4BAIT SFDP section
   - Add a bunch of SPI NOR entries to the flash_info table
   - Add the concept of SFDP fixups and use it to fix a bug on MX25L25635F
   - A bunch of minor cleanups/comestic changes

  NAND core changes:
   - kernel-doc miscellaneous fixes.
   - Third batch of fixes/cleanup to the raw NAND core impacting various
     controller drivers (ams-delta, marvell, fsmc, denali, tegra,
     vf610):
      * Stop to pass mtd_info objects to internal functions
      * Reorganize code to avoid forward declarations
      * Drop useless test in nand_legacy_set_defaults()
      * Move nand_exec_op() to internal.h
      * Add nand_[de]select_target() helpers
      * Pass the CS line to be selected in struct nand_operation
      * Make ->select_chip() optional when ->exec_op() is implemented
      * Deprecate the ->select_chip() hook
      * Move the ->exec_op() method to nand_controller_ops
      * Move ->setup_data_interface() to nand_controller_ops
      * Deprecate the dummy_controller field
      * Fix JEDEC detection
      * Provide a helper for polling GPIO R/B pin

  Raw NAND chip drivers changes:
   - Macronix:
      * Flag 1.8V AC chips with a broken GET_FEATURES(TIMINGS)

  Raw NAND controllers drivers changes:
   - Ams-delta:
      * Fix the error path
      * SPDX tag added
      * May be compiled with COMPILE_TEST=y
      * Conversion to ->exec_op() interface
      * Drop .IOADDR_R/W use
      * Use GPIO API for data I/O
   - Denali:
      * Remove denali_reset_banks()
      * Remove ->dev_ready() hook
      * Include <linux/bits.h> instead of <linux/bitops.h>
      * Changes to comply with the above fixes/cleanup done in the core.
   - FSMC:
      * Add an SPDX tag to replace the license text
      * Make conversion from chip to fsmc consistent
      * Fix unchecked return value in fsmc_read_page_hwecc
      * Changes to comply with the above fixes/cleanup done in the core.
   - Marvell:
      * Prevent timeouts on a loaded machine (fix)
      * Changes to comply with the above fixes/cleanup done in the core.
   - OMAP2:
      * Pass the parent of pdev to dma_request_chan() (fix)
   - R852:
      * Use generic DMA API
   - sh_flctl:
      * Convert to SPDX identifiers
   - Sunxi:
      * Write pageprog related opcodes to the right register: WCMD_SET (fix)
   - Tegra:
      * Stop implementing ->select_chip()
   - VF610:
      * Add an SPDX tag to replace the license text
      * Changes to comply with the above fixes/cleanup done in the core.
   - Various trivial/spelling/coding style fixes.

  SPI-NAND drivers changes:
   - Remove the depreacated mt29f_spinand driver from staging.
   - Add support for:
      * Toshiba TC58CVG2S0H
      * GigaDevice GD5FxGQ4xA
      * Winbond W25N01GV

  JFFS2 changes:
   - Fix a lockdep issue

  MTD changes:
   - Rework the physmap driver to merge gpio-addr-flash and physmap_of
     in it
   - Add a new compatible for RedBoot partitions
   - Make sub-partitions RW if the parent partition was RO because of a
     mis-alignment
   - Add pinctrl support to the
   - Addition of /* fall-through */ comments where appropriate
   - Various minor fixes and cleanups

  Other changes:
   - Update my email address"

* tag 'mtd/for-4.21' of git://git.infradead.org/linux-mtd: (108 commits)
  mtd: rawnand: sunxi: Write pageprog related opcodes to WCMD_SET
  MAINTAINERS: Update my email address
  mtd: rawnand: marvell: prevent timeouts on a loaded machine
  mtd: rawnand: omap2: Pass the parent of pdev to dma_request_chan()
  mtd: rawnand: Fix JEDEC detection
  mtd: spi-nor: Add support for is25lp016d
  mtd: spi-nor: parse SFDP 4-byte Address Instruction Table
  mtd: spi-nor: Add 4B_OPCODES flag to is25lp256
  mtd: spi-nor: Add an SPDX tag to spi-nor.{c,h}
  mtd: spi-nor: Make the enable argument passed to set_byte() a bool
  mtd: spi-nor: Stop passing flash_info around
  mtd: spi-nor: Avoid forward declaration of internal functions
  mtd: spi-nor: Drop inline on all internal helpers
  mtd: spi-nor: Add a post BFPT fixup for MX25L25635E
  mtd: spi-nor: Add a post BFPT parsing fixup hook
  mtd: spi-nor: Add the SNOR_F_4B_OPCODES flag
  mtd: spi-nor: cast to u64 to avoid uint overflows
  mtd: spi-nor: Add support for IS25LP032/064
  mtd: spi-nor: add entry for mt35xu512aba flash
  mtd: spi-nor: add macros related to MICRON flash
  ...
2018-12-25 12:49:46 -08:00
Linus Torvalds
4971f090aa drm pull request for 4.21-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJcExwOAAoJEAx081l5xIa+euIP/1NZZvSB+bsCtOwDG8I6uWsS
 OU5JUZ8q2dqyyFagRxzlkeSt3uWJqKp5NyNwuc9z/5u6AGF+3/97D0J1lG6Os/st
 4abF6NadivYJ4cXhJ1ddIHOFMVDcAsyMWNDb93NwPwncCsQ0jt5FFOsrCyj6BGY+
 ihHFlHrIyDrbBGDHz+u1E/EO5WkNnaLDoC+/k2fTRWCNI3bQL3O+orsYTI6S2uvU
 lQJnRfYAllgLD2p1k/rrBHcHXBv50roR0e8uhGmbdhGdp5bEW30UGBLHXxQjjSVy
 fQCwFwTO8X6zoxU53Zbbk+MVrp+jkTHcGKViHRuLkaHzE5mX26UXDwlXdN32ZUbK
 yHOJp+uDaWXX7MIz0LsB9Iqj2+eIUoFaIJMoZTMGVTNvqnTxKnoHnjAtbTH2u258
 teFgmy4BIgPgo2kwEnBEZjCapou0Eivyut2wq8bTAB2Fe8LwURJpr3cioTtMLlUO
 L5/PoD27eFvBCAeFrQIwF3b2XiQEnBpXocmilEwP1xDMPgoyeePAfIF2iEpDvi0U
 jce3rLd2yVvo92xYUgoHkVTD8si/pKKnZ1D0U3+RI6pxK6s0HJEHjcNEMdvdm+2S
 4qgvBQV3wlWFkXEK8PR5BHPoLntg18tKon/BTLBjgGkN9E1o9fWs1/s6KQGY4xdo
 l3Vvfx2LTdkgEoBssSwB
 =wh4W
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2018-12-14' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "Core:
   - shared fencing staging removal
   - drop transactional atomic helpers and move helpers to new location
   - DP/MST atomic cleanup
   - Leasing cleanups and drop EXPORT_SYMBOL
   - Convert drivers to atomic helpers and generic fbdev.
   - removed deprecated obj_ref/unref in favour of get/put
   - Improve dumb callback documentation
   - MODESET_LOCK_BEGIN/END helpers

  panels:
   - CDTech panels, Banana Pi Panel, DLC1010GIG,
   - Olimex LCD-O-LinuXino, Samsung S6D16D0, Truly NT35597 WQXGA,
   - Himax HX8357D, simulated RTSM AEMv8.
   - GPD Win2 panel
   - AUO G101EVN010

  vgem:
   - render node support

  ttm:
   - move global init out of drivers
   - fix LRU handling for ghost objects
   - Support for simultaneous submissions to multiple engines

  scheduler:
   - timeout/fault handling changes to help GPU recovery
   - helpers for hw with preemption support

  i915:
   - Scaler/Watermark fixes
   - DP MST + powerwell fixes
   - PSR fixes
   - Break long get/put shmemfs pages
   - Icelake fixes
   - Icelake DSI video mode enablement
   - Engine workaround improvements

  amdgpu:
   - freesync support
   - GPU reset enabled on CI, VI, SOC15 dGPUs
   - ABM support in DC
   - KFD support for vega12/polaris12
   - SDMA paging queue on vega
   - More amdkfd code sharing
   - DCC scanout on GFX9
   - DC kerneldoc
   - Updated SMU firmware for GFX8 chips
   - XGMI PSP + hive reset support
   - GPU reset
   - DC trace support
   - Powerplay updates for newer Polaris
   - Cursor plane update fast path
   - kfd dma-buf support

  virtio-gpu:
   - add EDID support

  vmwgfx:
   - pageflip with damage support

  nouveau:
   - Initial Turing TU104/TU106 modesetting support

  msm:
   - a2xx gpu support for apq8060 and imx5
   - a2xx gpummu support
   - mdp4 display support for apq8060
   - DPU fixes and cleanups
   - enhanced profiling support
   - debug object naming interface
   - get_iova/page pinning decoupling

  tegra:
   - Tegra194 host1x, VIC and display support enabled
   - Audio over HDMI for Tegra186 and Tegra194

  exynos:
   - DMA/IOMMU refactoring
   - plane alpha + blend mode support
   - Color format fixes for mixer driver

  rcar-du:
   - R8A7744 and R8A77470 support
   - R8A77965 LVDS support

  imx:
   - fbdev emulation fix
   - multi-tiled scalling fixes
   - SPDX identifiers

  rockchip
   - dw_hdmi support
   - dw-mipi-dsi + dual dsi support
   - mailbox read size fix

  qxl:
   - fix cursor pinning

  vc4:
   - YUV support (scaling + cursor)

  v3d:
   - enable TFU (Texture Formatting Unit)

  mali-dp:
   - add support for linear tiled formats

  sun4i:
   - Display Engine 3 support
   - H6 DE3 mixer 0 support
   - H6 display engine support
   - dw-hdmi support
   - H6 HDMI phy support
   - implicit fence waiting
   - BGRX8888 support

  meson:
   - Overlay plane support
   - implicit fence waiting
   - HDMI 1.4 4k modes

  bridge:
   - i2c fixes for sii902x"

* tag 'drm-next-2018-12-14' of git://anongit.freedesktop.org/drm/drm: (1403 commits)
  drm/amd/display: Add fast path for cursor plane updates
  drm/amdgpu: Enable GPU recovery by default for CI
  drm/amd/display: Fix duplicating scaling/underscan connector state
  drm/amd/display: Fix unintialized max_bpc state values
  Revert "drm/amd/display: Set RMX_ASPECT as default"
  drm/amdgpu: Fix stub function name
  drm/msm/dpu: Fix clock issue after bind failure
  drm/msm/dpu: Clean up dpu_media_info.h static inline functions
  drm/msm/dpu: Further cleanups for static inline functions
  drm/msm/dpu: Cleanup the debugfs functions
  drm/msm/dpu: Remove dpu_irq and unused functions
  drm/msm: Make irq_postinstall optional
  drm/msm/dpu: Cleanup callers of dpu_hw_blk_init
  drm/msm/dpu: Remove unused functions
  drm/msm/dpu: Remove dpu_crtc_is_enabled()
  drm/msm/dpu: Remove dpu_crtc_get_mixer_height
  drm/msm/dpu: Remove dpu_dbg
  drm/msm: dpu: Remove crtc_lock
  drm/msm: dpu: Remove vblank_requested flag from dpu_crtc
  drm/msm: dpu: Separate crtc assignment from vblank enable
  ...
2018-12-25 11:48:26 -08:00
Theodore Ts'o
2b08b1f12c ext4: fix a potential fiemap/page fault deadlock w/ inline_data
The ext4_inline_data_fiemap() function calls fiemap_fill_next_extent()
while still holding the xattr semaphore.  This is not necessary and it
triggers a circular lockdep warning.  This is because
fiemap_fill_next_extent() could trigger a page fault when it writes
into page which triggers a page fault.  If that page is mmaped from
the inline file in question, this could very well result in a
deadlock.

This problem can be reproduced using generic/519 with a file system
configuration which has the inline_data feature enabled.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2018-12-25 00:56:33 -05:00
Theodore Ts'o
812c0cab2c ext4: make sure enough credits are reserved for dioread_nolock writes
There are enough credits reserved for most dioread_nolock writes;
however, if the extent tree is sufficiently deep, and/or quota is
enabled, the code was not allowing for all eventualities when
reserving journal credits for the unwritten extent conversion.

This problem can be seen using xfstests ext4/034:

   WARNING: CPU: 1 PID: 257 at fs/ext4/ext4_jbd2.c:271 __ext4_handle_dirty_metadata+0x10c/0x180
   Workqueue: ext4-rsv-conversion ext4_end_io_rsv_work
   RIP: 0010:__ext4_handle_dirty_metadata+0x10c/0x180
   	...
   EXT4-fs: ext4_free_blocks:4938: aborting transaction: error 28 in __ext4_handle_dirty_metadata
   EXT4: jbd2_journal_dirty_metadata failed: handle type 11 started at line 4921, credits 4/0, errcode -28
   EXT4-fs error (device dm-1) in ext4_free_blocks:4950: error 28

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2018-12-24 20:27:08 -05:00
Paulo Alcantara
e7b602f437 cifs: Save TTL value when parsing DFS referrals
This will be needed by DFS cache.

Signed-off-by: Paulo Alcantara <palcantara@suse.de>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23 23:49:00 -06:00
Aurelien Aptel
5fc7fcd054 cifs: auto disable 'serverino' in dfs mounts
Different servers have different set of file ids.

After failover, unique IDs will be different so we can't validate
them.

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Reviewed-by: Paulo Alcantara <palcantara@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23 23:05:11 -06:00
Paulo Alcantara
d9345e0ae7 cifs: Make devname param optional in cifs_compose_mount_options()
If we only want to get the mount options strings, do not return the
devname.

For DFS failover, we'll be passing the DFS full path down to
cifs_mount() rather than the devname.

Signed-off-by: Paulo Alcantara <palcantara@suse.de>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23 23:05:08 -06:00
Paulo Alcantara
c34fea5a63 cifs: Skip any trailing backslashes from UNC
When extracting hostname from UNC, check for leading backslashes
before trying to remove them.

Signed-off-by: Paulo Alcantara <palcantara@suse.de>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23 23:05:05 -06:00
Paulo Alcantara
56c762eb9b cifs: Refactor out cifs_mount()
* Split and refactor the very large function cifs_mount() in multiple
  functions:

- tcp, ses and tcon setup to mount_get_conns()
- tcp, ses and tcon cleanup in mount_put_conns()
- tcon tlink setup to mount_setup_tlink()
- remote path checking to is_path_remote()

* Implement 2 version of cifs_mount() for DFS-enabled builds and
  non-DFS-enabled builds (CONFIG_CIFS_DFS_UPCALL).

In preparation for DFS failover support.

Signed-off-by: Paulo Alcantara <palcantara@suse.de>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23 23:00:38 -06:00
Georgy A Bystrenin
9a596f5b39 CIFS: Fix error mapping for SMB2_LOCK command which caused OFD lock problem
While resolving a bug with locks on samba shares found a strange behavior.
When a file locked by one node and we trying to lock it from another node
it fail with errno 5 (EIO) but in that case errno must be set to
(EACCES | EAGAIN).
This isn't happening when we try to lock file second time on same node.
In this case it returns EACCES as expected.
Also this issue not reproduces when we use SMB1 protocol (vers=1.0 in
mount options).

Further investigation showed that the mapping from status_to_posix_error
is different for SMB1 and SMB2+ implementations.
For SMB1 mapping is [NT_STATUS_LOCK_NOT_GRANTED to ERRlock]
(See fs/cifs/netmisc.c line 66)
but for SMB2+ mapping is [STATUS_LOCK_NOT_GRANTED to -EIO]
(see fs/cifs/smb2maperror.c line 383)

Quick changes in SMB2+ mapping from EIO to EACCES has fixed issue.

BUG: https://bugzilla.kernel.org/show_bug.cgi?id=201971

Signed-off-by: Georgy A Bystrenin <gkot@altlinux.org>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23 22:42:56 -06:00
Long Li
54e94ff94e CIFS: return correct errors when pinning memory failed for direct I/O
When pinning memory failed, we should return the correct error code and
rewind the SMB credits.

Reported-by: Murphy Zhou <jencce.kernel@gmail.com>
Signed-off-by: Long Li <longli@microsoft.com>
Cc: stable@vger.kernel.org
Cc: Murphy Zhou <jencce.kernel@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23 22:42:48 -06:00
Long Li
b6bc8a7b99 CIFS: use the correct length when pinning memory for direct I/O for write
The current code attempts to pin memory using the largest possible wsize
based on the currect SMB credits. This doesn't cause kernel oops but this
is not optimal as we may pin more pages then actually needed.

Fix this by only pinning what are needed for doing this write I/O.

Signed-off-by: Long Li <longli@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Joey Pabalinas <joeypabalinas@gmail.com>
2018-12-23 22:42:19 -06:00
Ronnie Sahlberg
59a63e479c cifs: check ntwrk_buf_start for NULL before dereferencing it
RHBZ: 1021460

There is an issue where when multiple threads open/close the same directory
ntwrk_buf_start might end up being NULL, causing the call to smbCalcSize
later to oops with a NULL deref.

The real bug is why this happens and why this can become NULL for an
open cfile, which should not be allowed.
This patch tries to avoid a oops until the time when we fix the underlying
issue.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23 22:41:31 -06:00
Ronnie Sahlberg
52baa51d30 cifs: remove coverity warning in calc_lanman_hash
password_with_pad is a fixed size buffer of 16 bytes, it contains a
password string, to be padded with \0 if shorter than 16 bytes
but is just truncated if longer.
It is not, and we do not depend on it to be, nul terminated.

As such, do not use strncpy() to populate this buffer since
the str* prefix suggests that this is a string, which it is not,
and it also confuses coverity causing a false warning.

Detected by CoverityScan CID#113743 ("Buffer not null terminated")

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23 22:41:26 -06:00
YueHaibing
0f57451eeb cifs: remove set but not used variable 'smb_buf'
Fixes gcc '-Wunused-but-set-variable' warning:

fs/cifs/sess.c: In function '_sess_auth_rawntlmssp_assemble_req':
fs/cifs/sess.c:1157:18: warning:
 variable 'smb_buf' set but not used [-Wunused-but-set-variable]

It never used since commit cc87c47d9d7a ("cifs: Separate rawntlmssp auth
from CIFS_SessSetup()")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23 22:41:20 -06:00
Gustavo A. R. Silva
07fa6010ff cifs: suppress some implicit-fallthrough warnings
To avoid the warning:

     warning: this statement may fall through [-Wimplicit-fallthrough=]

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23 22:41:11 -06:00
Ronnie Sahlberg
f9793b6fcc cifs: change smb2_query_eas to use the compound query-info helper
Reducing the number of network roundtrips improves the performance
of query xattrs

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23 22:40:17 -06:00
Kenneth D'souza
4a3b38aec5 Add vers=3.0.2 as a valid option for SMBv3.0.2
Technically 3.02 is not the dialect name although that is more familiar to
many, so we should also accept the official dialect name (3.0.2 vs. 3.02)
in vers=

Signed-off-by: Kenneth D'souza <kdsouza@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23 22:39:29 -06:00
Ronnie Sahlberg
07d3b2e426 cifs: create a helper function for compound query_info
and convert statfs to use it.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23 22:38:17 -06:00
Steve French
97aa495a89 cifs: address trivial coverity warning
This is not actually a bug but as Coverity points out we shouldn't
be doing an "|=" on a value which hasn't been set (although technically
it was memset to zero so isn't a bug) and so might as well change
"|=" to "=" in this line

Detected by CoverityScan, CID#728535 ("Unitialized scalar variable")

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2018-12-23 22:38:14 -06:00
Steve French
f5942db5ef cifs: smb2 commands can not be negative, remove confusing check
As Coverity points out le16_to_cpu(midEntry->Command) can not be
less than zero.

Detected by CoverityScan, CID#1438650 ("Macro compares unsigned to 0")

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2018-12-23 22:37:23 -06:00
Ronnie Sahlberg
0967e54579 cifs: use a compound for setting an xattr
Improve performance by reducing number of network round trips
for set xattr.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23 22:36:24 -06:00
Colin Ian King
5890255b83 cifs: clean up indentation, replace spaces with tab
Trivial fix to clean up indentation, replace spaces with tab

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23 22:36:09 -06:00
Linus Torvalds
3c730b1041 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
 "A couple of fixes - no common topic ;-)"

[ The aio spectre patch also came in from Jens, so now we have that
  doubly fixed .. ]

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  proc/sysctl: don't return ENOMEM on lookup when a table is unregistering
  aio: fix spectre gadget in lookup_ioctx
2018-12-23 10:40:41 -08:00
Christian Brauner
94f82008ce Revert "vfs: Allow userns root to call mknod on owned filesystems."
This reverts commit 55956b59df336f6738da916dbb520b6e37df9fbd.

commit 55956b59df33 ("vfs: Allow userns root to call mknod on owned filesystems.")
enabled mknod() in user namespaces for userns root if CAP_MKNOD is
available. However, these device nodes are useless since any filesystem
mounted from a non-initial user namespace will set the SB_I_NODEV flag on
the filesystem. Now, when a device node s created in a non-initial user
namespace a call to open() on said device node will fail due to:

bool may_open_dev(const struct path *path)
{
        return !(path->mnt->mnt_flags & MNT_NODEV) &&
                !(path->mnt->mnt_sb->s_iflags & SB_I_NODEV);
}

The problem with this is that as of the aforementioned commit mknod()
creates partially functional device nodes in non-initial user namespaces.
In particular, it has the consequence that as of the aforementioned commit
open() will be more privileged with respect to device nodes than mknod().
Before it was the other way around. Specifically, if mknod() succeeded
then it was transparent for any userspace application that a fatal error
must have occured when open() failed.

All of this breaks multiple userspace workloads and a widespread assumption
about how to handle mknod(). Basically, all container runtimes and systemd
live by the slogan "ask for forgiveness not permission" when running user
namespace workloads. For mknod() the assumption is that if the syscall
succeeds the device nodes are useable irrespective of whether it succeeds
in a non-initial user namespace or not. This logic was chosen explicitly
to allow for the glorious day when mknod() will actually be able to create
fully functional device nodes in user namespaces.
A specific problem people are already running into when running 4.18 rc
kernels are failing systemd services. For any distro that is run in a
container systemd services started with the PrivateDevices= property set
will fail to start since the device nodes in question cannot be
opened (cf. the arguments in [1]).

Full disclosure, Seth made the very sound argument that it is already
possible to end up with partially functional device nodes. Any filesystem
mounted with MS_NODEV set will allow mknod() to succeed but will not allow
open() to succeed. The difference to the case here is that the MS_NODEV
case is transparent to userspace since it is an explicitly set mount option
while the SB_I_NODEV case is an implicit property enforced by the kernel
and hence opaque to userspace.

[1]: https://github.com/systemd/systemd/pull/9483

Signed-off-by: Christian Brauner <christian@brauner.io>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Seth Forshee <seth.forshee@canonical.com>
Cc: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-22 14:18:34 -08:00
Omar Sandoval
65eed012d1 xfs: reallocate realtime summary cache on growfs
At mount time, we allocate m_rsum_cache with the number of realtime
bitmap blocks. However, xfs_growfs_rt() can increase the number of
realtime bitmap blocks. Using the cache after this happens may access
out of the bounds of the cache. Fix it by reallocating the cache in this
case.

Fixes: 355e3532132b ("xfs: cache minimum realtime summary level")
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-12-21 18:45:18 -08:00
Dan Williams
d8a706414a dax: Use non-exclusive wait in wait_entry_unlocked()
get_unlocked_entry() uses an exclusive wait because it is guaranteed to
eventually obtain the lock and follow on with an unlock+wakeup cycle.
The wait_entry_unlocked() path does not have the same guarantee. Rather
than open-code an extra wakeup, just switch to a non-exclusive wait.

Cc: Jan Kara <jack@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-12-21 11:35:53 -08:00
Chris Perl
594d1644cd NFS: nfs_compare_mount_options always compare auth flavors.
This patch removes the check from nfs_compare_mount_options to see if a
`sec' option was passed for the current mount before comparing auth
flavors and instead just always compares auth flavors.

Consider the following scenario:

You have a server with the address 192.168.1.1 and two exports /export/a
and /export/b.  The first export supports `sys' and `krb5' security, the
second just `sys'.

Assume you start with no mounts from the server.

The following results in EIOs being returned as the kernel nfs client
incorrectly thinks it can share the underlying `struct nfs_server's:

$ mkdir /tmp/{a,b}
$ sudo mount -t nfs -o vers=3,sec=krb5 192.168.1.1:/export/a /tmp/a
$ sudo mount -t nfs -o vers=3          192.168.1.1:/export/b /tmp/b
$ df >/dev/null
df: ‘/tmp/b’: Input/output error

Signed-off-by: Chris Perl <cperl@janestreet.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-21 13:38:45 -05:00
Linus Torvalds
783619556a an important smb3 fix for an regression to some servers introduced by compounding optimization to rmdir
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAlwcW2IACgkQiiy9cAdy
 T1FaLAv/Vs0QhYATaJHkvmjk7EdFoXTZST2wYxPcPSrOfWbCaMjsKREwPVUjOCiW
 W//zL/Qbi0FSVMWdoskJP0KQC5rAhUSxlvtDxrpzqszfeJprIs0oL5IRlOwxNdtT
 F9/i8+s/AuYq0TTn15UtqZHS6wZdWcerdttWV1V/97hEwcO5Xg0pEtyCmLPf7k8W
 wNdxCAQFZ9j2pDVyuJO3a0+Tas34dc2t/cac12h0qeXbrE6e88bQWa6bpEKSvCKr
 cMU94pkajCKeelZOhq+ga7cCmlBJs6gt4sgsKEsoDn72tQCsWVH6p1N4+AxmLsZU
 bR65XodusR1WHMesSth7QraUk0pIQ4ZzMRPZJCkh9bSjaa+fxX1Up/sjn74q1prf
 DHJ/52rQrWK3hvETUZD2B6N9AEDN0swbqeCJRlUYlzG5OEdfit9qSgfTaRzYxVnX
 +tct7j+8mjzk+rsGuTXrQupPXUndPTcpUrKFp5db9Sejcx/Gw/atKhW6mVgL3LiR
 8kVIraSV
 =+8yd
 -----END PGP SIGNATURE-----

Merge tag '4.20-rc7-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb3 fix from Steve French:
 "An important smb3 fix for an regression to some servers introduced by
  compounding optimization to rmdir.

  This fix has been tested by multiple developers (including me) with
  the usual private xfstesting, but also by the new cifs/smb3 "buildbot"
  xfstest VMs (thank you Ronnie and Aurelien for good work on this
  automation). The automated testing has been updated so that it will
  catch problems like this in the future.

  Note that Pavel discovered (very recently) some unrelated but
  extremely important bugs in credit handling (smb3 flow control problem
  that can lead to disconnects/reconnects) when compounding, that I
  would have liked to send in ASAP but the complete testing of those two
  fixes may not be done in time and have to wait for 4.21"

* tag '4.20-rc7-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb3: Fix rmdir compounding regression to strict servers
2018-12-21 08:56:31 -08:00
Al Viro
718c43038f mount_fs: suppress MAC on MS_SUBMOUNT as well as MS_KERNMOUNT
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-21 11:51:23 -05:00
Al Viro
757cbe597f LSM: new method: ->sb_add_mnt_opt()
Adding options to growing mnt_opts.  NFS kludge with passing
context= down into non-text-options mount switched to it, and
with that the last use of ->sb_parse_opts_str() is gone.

Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-21 11:50:02 -05:00
Al Viro
204cc0ccf1 LSM: hide struct security_mnt_opts from any generic code
Keep void * instead, allocate on demand (in parse_str_opts, at the
moment).  Eventually both selinux and smack will be better off
with private structures with several strings in those, rather than
this "counter and two pointers to dynamically allocated arrays"
ugliness.  This commit allows to do that at leisure, without
disrupting anything outside of given module.

Changes:
	* instead of struct security_mnt_opt use an opaque pointer
initialized to NULL.
	* security_sb_eat_lsm_opts(), security_sb_parse_opts_str() and
security_free_mnt_opts() take it as var argument (i.e. as void **);
call sites are unchanged.
	* security_sb_set_mnt_opts() and security_sb_remount() take
it by value (i.e. as void *).
	* new method: ->sb_free_mnt_opts().  Takes void *, does
whatever freeing that needs to be done.
	* ->sb_set_mnt_opts() and ->sb_remount() might get NULL as
mnt_opts argument, meaning "empty".

Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-21 11:48:34 -05:00
Al Viro
6a0440e5b7 nfs_remount(): don't leak, don't ignore LSM options quietly
* if mount(2) passes something like "context=foo" with MS_REMOUNT
in flags (/sbin/mount.nfs will _not_ do that - you need to issue
the syscall manually), you'll get leaked copies for LSM options.
The reason is that instead of nfs_{alloc,free}_parsed_mount_data()
nfs_remount() uses kzalloc/kfree, which lacks the needed cleanup.

* selinux options are not changed on remount (as for any other
fs), but in case of NFS the failure is quiet - they are not compared
to what we used to have, with complaint in case of attempted changes.
Trivially fixed by converting to use of security_sb_remount().

Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-21 11:47:19 -05:00
Al Viro
a65001e8a4 btrfs: sanitize security_mnt_opts use
1) keeping a copy in btrfs_fs_info is completely pointless - we never
use it for anything.  Getting rid of that allows for simpler calling
conventions for setup_security_options() (caller is responsible for
freeing mnt_opts in all cases).

2) on remount we want to use ->sb_remount(), not ->sb_set_mnt_opts(),
same as we would if not for FS_BINARY_MOUNTDATA.  Behaviours *are*
close (in fact, selinux sb_set_mnt_opts() ought to punt to
sb_remount() in "already initialized" case), but let's handle
that uniformly.  And the only reason why the original btrfs changes
didn't go for security_sb_remount() in btrfs_remount() case is that
it hadn't been exported.  Let's export it for a while - it'll be
going away soon anyway.

Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-21 11:47:08 -05:00
Al Viro
a10d7c22b3 LSM: split ->sb_set_mnt_opts() out of ->sb_kern_mount()
... leaving the "is it kernel-internal" logics in the caller.

Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-21 11:46:42 -05:00
Al Viro
f5c0c26d90 new helper: security_sb_eat_lsm_opts()
combination of alloc_secdata(), security_sb_copy_data(),
security_sb_parse_opt_str() and free_secdata().

Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-21 11:46:00 -05:00
Al Viro
c039bc3c24 LSM: lift extracting and parsing LSM options into the caller of ->sb_remount()
This paves the way for retaining the LSM options from a common filesystem
mount context during a mount parameter parsing phase to be instituted prior
to actual mount/reconfiguration actions.

Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-21 11:45:41 -05:00
Al Viro
6be8750b4c LSM: lift parsing LSM options into the caller of ->sb_kern_mount()
This paves the way for retaining the LSM options from a common filesystem
mount context during a mount parameter parsing phase to be instituted prior
to actual mount/reconfiguration actions.

Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-21 11:45:30 -05:00
Eric Sandeen
3cc31fa65d iomap: don't search past page end in iomap_is_partially_uptodate
iomap_is_partially_uptodate() is intended to check wither blocks within
the selected range of a not-uptodate page are uptodate; if the range we
care about is up to date, it's an optimization.

However, the iomap implementation continues to check all blocks up to
from+count, which is beyond the page, and can even be well beyond the
iop->uptodate bitmap.

I think the worst that will happen is that we may eventually find a zero
bit and return "not partially uptodate" when it would have otherwise
returned true, and skip the optimization.  Still, it's clearly an invalid
memory access that must be fixed.

So: fix this by limiting the search to within the page as is done in the
non-iomap variant, block_is_partially_uptodate().

Zorro noticed thiswhen KASAN went off for 512 byte blocks on a 64k
page system:

 BUG: KASAN: slab-out-of-bounds in iomap_is_partially_uptodate+0x1a0/0x1e0
 Read of size 8 at addr ffff800120c3a318 by task fsstress/22337

Reported-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-12-21 08:42:50 -08:00
Linus Torvalds
f57b620a89 This pull request contains fixes for both UBI and UBIFS:
- Kconfig dependency fixes for our new auth feature
 - Fix for selecting the right compressor when creating a fs
 - Bugfix for a bug in UBIFS's O_TMPFILE implementation
 - Refcounting fixes for UBI
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCAA0FiEEdgfidid8lnn52cLTZvlZhesYu8EFAlwb+MAWHHJpY2hhcmRA
 c2lnbWEtc3Rhci5hdAAKCRBm+VmF6xi7wSIjD/9zyAQUaMXJzI/ct2SXXikyTgRi
 hmLbUhuIEZD/c3tDQMgb3XqL7yYkOt/QnfeMXpi/Fy9wAdKcOU9QCcL6TPfR9B+B
 v22VdDcsU5pp2Hcinm2Yz6v2OyYwchb2YShfMkiZSf0CKAhcsKZhaeqhug/Zq+w5
 xZ6fMPvu8HhinBavq9OgSpPx892vlF1ypXHNeQNxwslz/gWKso47jtmXpz4LDX7+
 s4+p+lagVJ0MHQ3fVhBvEplWr9zPKHvX8uGohADLsDp5bAqRyGNxBwqgGm6J6H7B
 UHuqUjUf19hUZEauwKugmPTM3Y0TkZjV472qsp9omx8w/XozVqfb4YgMBBE3Qwuy
 ZS7GcaGxf54WJbOJFQv7pS3DUmfOjQBnU7YAR7phMBDeOUfxRXWkuzIB+RkED+0d
 7JjTGWQItdbTfXl2QWfSr/1rj45LNKmVPMAI2pKUBnC9dD9pktCcB90vTcrg2vjE
 3U9GbqOVORZvHh2bcR0Y8uDUBOp4v7ybCc+OOTNVWuWQpckp79E/7dkVLbpoI12R
 xEgMEaCSXlwPUxQ/FrxKBWhOZprPXNMo4xw547SDBxMq0IKUniA+YoUxtN3FxhQ8
 OiLkfcQ+5bove9qYfARaRC/hk5lpgMWebAGou/CkfTIPJsw3ZdlmL6HH6ag5i6fS
 fBHsVBdtv0ADSLXylg==
 =nZML
 -----END PGP SIGNATURE-----

Merge tag 'upstream-4.20-rc7' of git://git.infradead.org/linux-ubifs

Pull UBI/UBIFS fixes from Richard Weinberger:

 - Kconfig dependency fixes for our new auth feature

 - Fix for selecting the right compressor when creating a fs

 - Bugfix for a bug in UBIFS's O_TMPFILE implementation

 - Refcounting fixes for UBI

* tag 'upstream-4.20-rc7' of git://git.infradead.org/linux-ubifs:
  ubifs: Handle re-linking of inodes correctly while recovery
  ubi: Do not drop UBI device reference before using
  ubi: Put MTD device after it is not used
  ubifs: Fix default compression selection in ubifs
  ubifs: Fix memory leak on error condition
  ubifs: auth: Add CONFIG_KEYS dependency
  ubifs: CONFIG_UBIFS_FS_AUTHENTICATION should depend on UBIFS_FS
  ubifs: replay: Fix high stack usage
2018-12-20 14:17:24 -08:00
David Howells
43f5e655ef vfs: Separate changing mount flags full remount
Separate just the changing of mount flags (MS_REMOUNT|MS_BIND) from full
remount because the mount data will get parsed with the new fs_context
stuff prior to doing a remount - and this causes the syscall to fail under
some circumstances.

To quote Eric's explanation:

  [...] mount(..., MS_REMOUNT|MS_BIND, ...) now validates the mount options
  string, which breaks systemd unit files with ProtectControlGroups=yes
  (e.g.  systemd-networkd.service) when systemd does the following to
  change a cgroup (v1) mount to read-only:

    mount(NULL, "/run/systemd/unit-root/sys/fs/cgroup/systemd", NULL,
	  MS_RDONLY|MS_NOSUID|MS_NODEV|MS_NOEXEC|MS_REMOUNT|MS_BIND, NULL)

  ... when the kernel has CONFIG_CGROUPS=y but no cgroup subsystems
  enabled, since in that case the error "cgroup1: Need name or subsystem
  set" is hit when the mount options string is empty.

  Probably it doesn't make sense to validate the mount options string at
  all in the MS_REMOUNT|MS_BIND case, though maybe you had something else
  in mind.

This is also worthwhile doing because we will need to add a mount_setattr()
syscall to take over the remount-bind function.

Reported-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: David Howells <dhowells@redhat.com>
2018-12-20 16:32:56 +00:00
David Howells
e262e32d6b vfs: Suppress MS_* flag defs within the kernel unless explicitly enabled
Only the mount namespace code that implements mount(2) should be using the
MS_* flags.  Suppress them inside the kernel unless uapi/linux/mount.h is
included.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: David Howells <dhowells@redhat.com>
2018-12-20 16:32:56 +00:00
Dave Chinner
a837eca241 iomap: Revert "fs/iomap.c: get/put the page in iomap_page_create/release()"
This reverts commit 61c6de667263184125d5ca75e894fcad632b0dd3.

The reverted commit added page reference counting to iomap page
structures that are used to track block size < page size state. This
was supposed to align the code with page migration page accounting
assumptions, but what it has done instead is break XFS filesystems.
Every fstests run I've done on sub-page block size XFS filesystems
has since picking up this commit 2 days ago has failed with bad page
state errors such as:

# ./run_check.sh "-m rmapbt=1,reflink=1 -i sparse=1 -b size=1k" "generic/038"
....
SECTION       -- xfs
FSTYP         -- xfs (debug)
PLATFORM      -- Linux/x86_64 test1 4.20.0-rc6-dgc+
MKFS_OPTIONS  -- -f -m rmapbt=1,reflink=1 -i sparse=1 -b size=1k /dev/sdc
MOUNT_OPTIONS -- /dev/sdc /mnt/scratch

generic/038 454s ...
 run fstests generic/038 at 2018-12-20 18:43:05
 XFS (sdc): Unmounting Filesystem
 XFS (sdc): Mounting V5 Filesystem
 XFS (sdc): Ending clean mount
 BUG: Bad page state in process kswapd0  pfn:3a7fa
 page:ffffea0000ccbeb0 count:0 mapcount:0 mapping:ffff88800d9b6360 index:0x1
 flags: 0xfffffc0000000()
 raw: 000fffffc0000000 dead000000000100 dead000000000200 ffff88800d9b6360
 raw: 0000000000000001 0000000000000000 00000000ffffffff
 page dumped because: non-NULL mapping
 CPU: 0 PID: 676 Comm: kswapd0 Not tainted 4.20.0-rc6-dgc+ #915
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014
 Call Trace:
  dump_stack+0x67/0x90
  bad_page.cold.116+0x8a/0xbd
  free_pcppages_bulk+0x4bf/0x6a0
  free_unref_page_list+0x10f/0x1f0
  shrink_page_list+0x49d/0xf50
  shrink_inactive_list+0x19d/0x3b0
  shrink_node_memcg.constprop.77+0x398/0x690
  ? shrink_slab.constprop.81+0x278/0x3f0
  shrink_node+0x7a/0x2f0
  kswapd+0x34b/0x6d0
  ? node_reclaim+0x240/0x240
  kthread+0x11f/0x140
  ? __kthread_bind_mask+0x60/0x60
  ret_from_fork+0x24/0x30
 Disabling lock debugging due to kernel taint
....

The failures are from anyway that frees pages and empties the
per-cpu page magazines, so it's not a predictable failure or an easy
to debug failure.

generic/038 is a reliable reproducer of this problem - it has a 9 in
10 failure rate on one of my test machines. Failure on other
machines have been at random points in fstests runs but every run
has ended up tripping this problem. Hence generic/038 was used to
bisect the failure because it was the most reliable failure.

It is too close to the 4.20 release (not to mention holidays) to
try to diagnose, fix and test the underlying cause of the problem,
so reverting the commit is the only option we have right now. The
revert has been tested against a current tot 4.20-rc7+ kernel across
multiple machines running sub-page block size XFs filesystems and
none of the bad page state failures have been seen.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Cc: Piotr Jaroszynski <pjaroszynski@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Brian Foster <bfoster@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-20 07:22:51 -08:00
Darrick J. Wong
86d163dbfe xfs: stringify scrub types in ftrace output
Use __print_symbolic to print the scrub type in ftrace output.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
2018-12-19 14:02:01 -08:00
Darrick J. Wong
c494213f30 xfs: stringify btree cursor types in ftrace output
Use __print_symbolic to print the btree type in ftrace output.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
2018-12-19 14:02:01 -08:00
Darrick J. Wong
0357d21a6c xfs: move XFS_INODE_FORMAT_STR mappings to libxfs
Move XFS_INODE_FORMAT_STR to libxfs so that we don't forget to keep it
updated, and add necessary TRACE_DEFINE_ENUM.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
2018-12-19 14:02:01 -08:00