52592 Commits

Author SHA1 Message Date
Tyson Nottingham
ceec03764a ext4: omit init_itable=n in procfs when disabled
Don't show init_itable=n in /proc/fs/ext4/<dev>/options when filesystem
is mounted with noinit_itable.

Signed-off-by: Tyson Nottingham <tgnottingham@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-03-30 00:53:33 -04:00
Tyson Nottingham
68afa7e083 ext4: show more binary mount options in procfs
Previously, /proc/fs/ext4/<dev>/options would only show binary options
if they were set (1 in the options bit mask). E.g. it would show "grpid"
if it was set, but it would not show "nogrpid" if grpid was not set.

This seems sensible, but when an option is absent from the file, it can
be hard for the unfamiliar to know what is being used. E.g. if there
isn't a (no)grpid entry, nogrpid is in effect. But if there isn't a
(no)auto_da_alloc entry, auto_da_alloc is in effect. If there isn't a
(minixdf|bsddf) entry, it turns out bsddf is in effect. It all depends
on how the option is implemented.

It's clearer to be explicit, so print the corresponding option
regardless of whether it means a 1 or a 0 in the bit mask.

Note that options which do not have an explicit disable option aren't
indicated as being disabled even with this change (e.g. dax).

Signed-off-by: Tyson Nottingham <tgnottingham@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-03-30 00:51:10 -04:00
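The behaviour described in the commit above (68afa7e083) can be sketched in a few lines of Python. This is only an illustrative model, not the kernel code: the bit values, the `BINARY_OPTIONS` table, and the helper name `format_binary_options` are assumptions; only the option names come from the commit message.

```python
# Hypothetical bit assignments; the real ext4 mount-option masks differ.
GRPID = 0x1
MINIX_DF = 0x2
NO_AUTO_DA_ALLOC = 0x4

# (bit, name printed when the bit is 1, name printed when the bit is 0)
BINARY_OPTIONS = [
    (GRPID, "grpid", "nogrpid"),
    (MINIX_DF, "minixdf", "bsddf"),
    (NO_AUTO_DA_ALLOC, "noauto_da_alloc", "auto_da_alloc"),
]


def format_binary_options(mask, explicit=True):
    """Return the option names to print for a given options bit mask.

    explicit=False models the old behaviour (only bits that are set are
    shown); explicit=True models the new behaviour (every option is shown,
    whether its bit is 1 or 0).
    """
    names = []
    for bit, when_set, when_clear in BINARY_OPTIONS:
        if mask & bit:
            names.append(when_set)
        elif explicit:
            names.append(when_clear)
    return names
```

With `explicit=False` an empty mask prints nothing, which is the ambiguity the commit message describes; with `explicit=True` the reader sees `nogrpid`, `bsddf`, and `auto_da_alloc` spelled out.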
Tyson Nottingham
bc1420ae56 ext4: simplify kobject usage
Replace kset with generic kobject provided by kobject_create_and_add(),
since the latter is sufficient.

Signed-off-by: Tyson Nottingham <tgnottingham@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-03-30 00:41:34 -04:00
Tyson Nottingham
6ca06829fb ext4: remove unused parameters in sysfs code
Signed-off-by: Tyson Nottingham <tgnottingham@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-03-30 00:13:10 -04:00
Tyson Nottingham
c2e5df7626 ext4: null out kobject* during sysfs cleanup
Make cleanup of ext4_feat kobject consistent with similar objects.

Signed-off-by: Tyson Nottingham <tgnottingham@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-03-30 00:03:38 -04:00
Theodore Ts'o
18db4b4e6f ext4: don't allow r/w mounts if metadata blocks overlap the superblock
If some metadata block, such as an allocation bitmap, overlaps the
superblock, it's very likely that if the file system is mounted
read/write, the results will not be pretty.  So disallow r/w mounts
for file systems corrupted in this particular way.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2018-03-29 22:10:35 -04:00
Theodore Ts'o
a45403b515 ext4: always initialize the crc32c checksum driver
The extended attribute code now uses the crc32c checksum for hashing
purposes, so we should just always initialize it.  We also want
to prevent NULL pointer dereferences if one of the metadata checksum
features is enabled after the file system is originally mounted.

This issue has been assigned CVE-2018-1094.

https://bugzilla.kernel.org/show_bug.cgi?id=199183
https://bugzilla.redhat.com/show_bug.cgi?id=1560788

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2018-03-29 22:10:31 -04:00
Theodore Ts'o
8e4b5eae5d ext4: fail ext4_iget for root directory if unallocated
If the root directory has an i_links_count of zero, then when the file
system is mounted, ext4_fill_super() notices the problem and tries to
call iput() on the root directory in the error return path.
ext4_evict_inode() will then try to free the inode on disk before all of
the file system structures are set up, and this will result in an OOPS
caused by a NULL pointer dereference.

This issue has been assigned CVE-2018-1092.

https://bugzilla.kernel.org/show_bug.cgi?id=199179
https://bugzilla.redhat.com/show_bug.cgi?id=1560777

Reported-by: Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2018-03-29 21:56:09 -04:00
Eric Biggers
ce3fd194fc ext4: limit xattr size to INT_MAX
ext4 isn't validating the sizes of xattrs where the value of the xattr
is stored in an external inode.  This is problematic because
->e_value_size is a u32, but ext4_xattr_get() returns an int.  A very
large size is misinterpreted as an error code, which ext4_get_acl()
translates into a bogus ERR_PTR() for which IS_ERR() returns false,
causing a crash.

Fix this by validating that all xattrs are <= INT_MAX bytes.

This issue has been assigned CVE-2018-1095.

https://bugzilla.kernel.org/show_bug.cgi?id=199185
https://bugzilla.redhat.com/show_bug.cgi?id=1560793

Reported-by: Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Fixes: e50e5129f384 ("ext4: xattr-in-inode support")
2018-03-29 14:31:42 -04:00
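The type mismatch described in the commit above (ce3fd194fc) can be illustrated with a small Python model. This is a sketch under stated assumptions, not the kernel implementation: `as_c_int` and `xattr_size_is_valid` are hypothetical helper names, and the model only shows how an unsigned 32-bit size turns negative when returned through a signed 32-bit int.

```python
INT_MAX = 2**31 - 1


def as_c_int(value_u32):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit C int."""
    value_u32 &= 0xFFFFFFFF
    return value_u32 - 2**32 if value_u32 > INT_MAX else value_u32


def xattr_size_is_valid(e_value_size):
    """Model of the added check: sizes above INT_MAX are rejected."""
    return 0 <= e_value_size <= INT_MAX
```

A size such as 0xFFFFFFFE comes back from `as_c_int` as a negative number, which a caller that treats negative return values as error codes would misread. Rejecting anything above INT_MAX up front removes that case.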
Theodore Ts'o
7dac4a1726 ext4: add validity checks for bitmap block numbers
A privileged attacker can cause a crash by mounting a crafted ext4
image which triggers an out-of-bounds read in the function
ext4_valid_block_bitmap() in fs/ext4/balloc.c.

This issue has been assigned CVE-2018-1093.

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=199181
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1560782
Reported-by: Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2018-03-26 23:54:10 -04:00
zhenwei.pi
dcae058a8d ext4: fix comments in ext4_swap_extents()
"mark_unwritten" in the comment and "unwritten" in the function
arguments are mismatched.

Signed-off-by: zhenwei.pi <zhenwei.pi@youruncloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-03-26 01:44:03 -04:00
Goldwyn Rodrigues
043d20d159 ext4: use generic_writepages instead of __writepage/write_cache_pages
Code cleanup. Instead of writing an internal static function, use the
available generic_writepages().

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-03-26 01:32:50 -04:00
Eric Sandeen
0d9366d67b ext4: don't complain about incorrect features when probing
If mount is auto-probing for filesystem type, it will try various
filesystems in order, with the MS_SILENT flag set.  We get
that flag as the silent arg to ext4_fill_super.

If we're probing (silent==1) then don't complain about feature
incompatibilities that are found if it looks like it's actually
a different valid extN type - failed probes should be silent
in this case.

If the on-disk features are unknown even to ext4, then complain.

Reported-by: Joakim Tjernlund <Joakim.Tjernlund@infinera.com>
Tested-by: Joakim Tjernlund <Joakim.Tjernlund@infinera.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
2018-03-22 11:59:00 -04:00
Nikolay Borisov
1d39834fba ext4: remove EXT4_STATE_DIOREAD_LOCK flag
Commit 16c54688592c ("ext4: Allow parallel DIO reads") reworked the way
locking happens around parallel dio reads. This resulted in obviating
the need for EXT4_STATE_DIOREAD_LOCK flag and accompanying logic.
Currently this amounts to dead code, so let's remove it. No functional
changes.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
2018-03-22 11:52:10 -04:00
Jiri Slaby
fe23cb65c2 ext4: fix offset overflow on 32-bit archs in ext4_iomap_begin()
ext4_iomap_begin() has a bug where the offset returned in the iomap
structure is truncated to unsigned long size. On 64-bit
architectures this is fine, but on 32-bit architectures it is not.
Not many places actually use the offset stored in the iomap structure,
but one visible failure is in the SEEK_HOLE / SEEK_DATA implementation.
If we create a file like:

dd if=/dev/urandom of=file bs=1k seek=8m count=1

then

lseek64("file", 0x100000000ULL, SEEK_DATA)

wrongly returns 0x100000000 on an unfixed kernel, while it should return
0x200000000. Avoid the overflow with a proper type cast.

Fixes: 545052e9e35a ("ext4: Switch to iomap for SEEK_HOLE / SEEK_DATA")
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org # v4.15
2018-03-22 11:50:26 -04:00
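The truncation described in the commit above (fe23cb65c2) can be modelled in Python. This is a hedged sketch, not the kernel code: the function and parameter names are illustrative, and a 4 KiB block size (blkbits = 12) is assumed. It only shows why a block-to-byte shift done in 32 bits loses the high bits of the offset written by the dd example (8 GiB, i.e. 0x200000000).

```python
U32_MASK = 0xFFFFFFFF
U64_MASK = 0xFFFFFFFFFFFFFFFF


def byte_offset_32bit(first_block, blkbits):
    """Shift performed in a 32-bit unsigned long: high bits are lost."""
    return (first_block << blkbits) & U32_MASK


def byte_offset_64bit(first_block, blkbits):
    """Shift performed after widening to 64 bits: the offset is preserved."""
    return (first_block << blkbits) & U64_MASK


# dd ... bs=1k seek=8m writes at 8 GiB; with assumed 4 KiB blocks that is
# block 0x200000.
FIRST_BLOCK = 0x200000000 >> 12
```

Under these assumptions the 32-bit computation yields 0 for that block, while the widened computation yields 0x200000000.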
Eryu Guan
45d8ec4d9f ext4: update i_disksize if direct write past ondisk size
Currently, in the ext4 direct write path, we update i_disksize only when
the new eof is greater than i_size, and don't update it when the new
eof is greater than i_disksize but less than i_size. This doesn't
work well with delalloc buffer writes, which update i_size and
i_disksize only when delalloc blocks are resolved (at writeback
time). The i_disksize from a direct write can be lost if a previous
buffer write succeeded at write time but failed at writeback time,
which results in a corrupted on-disk inode size.

Consider this case: first, a buffer write puts 4k of data into a new
file at offset 16k with delayed allocation; then a direct write puts 4k
of data into the same file at offset 4k before the delalloc blocks are
resolved. The direct write doesn't update i_disksize because it writes
within i_size (20k), but the extent tree metadata has been committed to
the journal. Then writeback of the delalloc blocks fails (due to a
device error, etc.), and the i_size/i_disksize from the buffer write
can't be written to disk (it is still zero). A subsequent umount/mount
cycle recovers the journal and writes the extent tree metadata from the
direct write to disk, but with i_disksize being zero.

Fix it by updating i_disksize too in direct write path when new eof
is greater than i_disksize but less than i_size, so i_disksize is
always consistent with direct write.

This fixes occasional i_size corruption in fstests generic/475.

Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-03-22 11:44:59 -04:00
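The condition change described in the commit above (45d8ec4d9f) can be sketched as two small Python functions. This is a simplified model for illustration only: the function names are hypothetical, locking and journalling are ignored, and only the comparison that decides whether i_disksize is updated is represented.

```python
K = 1024


def disksize_after_direct_write_old(i_size, i_disksize, new_eof):
    """Old rule: i_disksize moves only when the write extends past i_size."""
    if new_eof > i_size:
        return new_eof
    return i_disksize


def disksize_after_direct_write_fixed(i_size, i_disksize, new_eof):
    """Fixed rule: i_disksize also moves when new_eof exceeds i_disksize,
    even if new_eof is still below i_size."""
    if new_eof > i_disksize:
        return new_eof
    return i_disksize
```

In the scenario from the commit message, i_size is 20k (from the pending delalloc write), i_disksize is still 0, and the direct write ends at 8k. The old rule leaves i_disksize at 0, while the fixed rule moves it to 8k.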
Eryu Guan
73fdad00b2 ext4: protect i_disksize update by i_data_sem in direct write path
i_disksize updates should be protected by i_data_sem, either by taking
the lock explicitly or by using the ext4_update_i_disksize() helper. But
the i_disksize updates in ext4_direct_IO_write() are not protected at
all, so they may race with i_disksize updates in the writeback path of
delalloc buffer writes.

This was found by code inspection, and I didn't hit any i_disksize
corruption due to this bug. Thanks to Jan Kara for catching this bug and
suggesting the fix!

Reported-by: Jan Kara <jack@suse.cz>
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2018-03-22 11:41:25 -04:00
Theodore Ts'o
044e6e3d74 ext4: don't update checksum of new initialized bitmaps
When reading the inode or block allocation bitmap, if the bitmap needs
to be initialized, do not update the checksum in the block group
descriptor.  That's because we're not set up to journal those changes.
Instead, just set the verified bit on the bitmap block, so that it's
not necessary to validate the checksum.

When a block or inode allocation actually happens, at that point the
checksum will be calculated, and update of the bg descriptor block
will be properly journalled.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2018-02-19 14:16:47 -05:00
Theodore Ts'o
85e0c4e89c jbd2: if the journal is aborted then don't allow update of the log tail
This updates the jbd2 superblock unnecessarily, and on an abort we
shouldn't truncate the log.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2018-02-19 12:22:53 -05:00
Theodore Ts'o
fb7c02445c ext4: pass -ESHUTDOWN code to jbd2 layer
Previously the jbd2 layer assumed that a file system check would be
required after a journal abort.  In the case of the deliberate file
system shutdown, this should not be necessary.  Allow the jbd2 layer
to distinguish between these two cases by using the ESHUTDOWN errno.

Also add proper locking to __journal_abort_soft().

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2018-02-18 23:45:18 -05:00
Theodore Ts'o
a6d9946bb9 ext4: eliminate sleep from shutdown ioctl
The msleep() when processing EXT4_GOING_FLAGS_NOLOGFLUSH was a hack to
avoid some races (that are now fixed), but in fact it introduced its
own race.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2018-02-18 23:16:28 -05:00
Theodore Ts'o
576d18ed60 ext4: shutdown should not prevent get_write_access
The ext4 forced shutdown flag needs to prevent new handles from being
started, but it needs to allow existing handles to complete.  So the
forced shutdown flag should not force ext4_journal_get_write_access to
fail.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2018-02-18 22:07:36 -05:00
Theodore Ts'o
ed65b00f8d jbd2: clarify bad journal block checksum message
There were two error messages emitted by jbd2, one for a bad checksum
for a jbd2 descriptor block, and one for a bad checksum for a jbd2
data block.  Change the data block checksum error so that the two can
be disambiguated.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-02-18 21:33:13 -05:00
Theodore Ts'o
ccf0f32acd ext4: add tracepoints for shutdown and file system errors
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-02-18 20:53:23 -05:00
Linus Torvalds
a9a08845e9 vfs: do bulk POLL* -> EPOLL* replacement
This is the mindless scripted replacement of kernel use of POLL*
variables as described by Al, done by this script:

    for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
        L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
        for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
    done

with de-mangling cleanups yet to come.

NOTE! On almost all architectures, the EPOLL* constants have the same
values as the POLL* constants do.  But the keyword here is "almost".
For various bad reasons they aren't the same, and epoll() doesn't
actually work quite correctly in some cases due to this on Sparc et al.

The next patch from Al will sort out the final differences, and we
should be all done.

Scripted-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-11 14:34:03 -08:00
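For readers who want to see what the sed expression in the commit above (a9a08845e9) does to a single line, here is a hedged Python approximation. It covers only the substitution, not the `git grep` file selection, and the function name `rename_poll_constants` is an assumption. Python's `\b` stands in for sed's `\<` and `\>` word-boundary markers.

```python
import re

POLL_SUFFIXES = ["IN", "OUT", "PRI", "ERR", "RDNORM", "RDBAND",
                 "WRNORM", "WRBAND", "HUP", "RDHUP", "NVAL", "MSG"]


def rename_poll_constants(line):
    """Apply one anchored substitution per POLL* name, as the script does.

    The pattern only matches a POLL* word that appears before the first
    double quote on the line, so names inside string literals are left alone.
    """
    for suffix in POLL_SUFFIXES:
        pattern = r'^([^"]*)(\bPOLL' + suffix + r'\b)'
        line = re.sub(pattern, r'\1E\2', line, count=1)
    return line
```

For example, `mask |= POLLIN | POLLRDNORM;` becomes `mask |= EPOLLIN | EPOLLRDNORM;`, while a line that mentions POLLIN only inside a quoted string is unchanged.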
Linus Torvalds
ee5daa1361 Merge branch 'work.poll2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more poll annotation updates from Al Viro:
 "This is preparation to solving the problems you've mentioned in the
  original poll series.

  After this series, the kernel is ready for running

      for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
            L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
            for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
      done

  as a bulk search-and-replace.

  After that, the kernel is ready to apply the patch to unify
  {de,}mangle_poll(), and then get rid of kernel-side POLL... uses
  entirely, and we should be all done with that stuff.

  Basically, that's what you suggested wrt KPOLL..., except that we can
  use EPOLL... instead - they already are arch-independent (and equal to
  what is currently kernel-side POLL...).

  After the preparations (in this series) switch to returning EPOLL...
  from ->poll() instances is completely mechanical and kernel-side
  POLL... can go away. The last step (killing kernel-side POLL... and
  unifying {de,}mangle_poll()) has to be done after the
  search-and-replace job, since we need userland-side POLL... for
  unified {de,}mangle_poll(), thus the cherry-pick at the last step.

  After that we will have:

   - POLL{IN,OUT,...} *not* in __poll_t, so any stray instances of
     ->poll() still using those will be caught by sparse.

   - eventpoll.c and select.c warning-free wrt __poll_t

   - no more kernel-side definitions of POLL... - userland ones are
     visible through the entire kernel (and used pretty much only for
     mangle/demangle)

   - same behavior as after the first series (i.e. sparc et al. epoll(2)
     working correctly)"

* 'work.poll2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  annotate ep_scan_ready_list()
  ep_send_events_proc(): return result via esed->res
  preparation to switching ->poll() to returning EPOLL...
  add EPOLLNVAL, annotate EPOLL... and event_poll->event
  use linux/poll.h instead of asm/poll.h
  xen: fix poll misannotation
  smc: missing poll annotations
2018-02-11 13:57:19 -08:00
Linus Torvalds
878e66d06f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs fixes from Al Viro.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  seq_file: fix incomplete reset on read from zero offset
  kernfs: fix regression in kernfs_fop_write caused by wrong type
2018-02-09 19:22:17 -08:00
Linus Torvalds
a28348322f 4.16 minor SMB3 fixes

Merge tag '4.16-minor-rc-SMB3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "There are a couple additional security fixes that are still being
  tested that are not in this set."

* tag '4.16-minor-rc-SMB3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  Add missing structs and defines from recent SMB3.1.1 documentation
  address lock imbalance warnings in smbdirect.c
  cifs: silence compiler warnings showing up with gcc-8.0.0
  Add some missing debug fields in server and tcon structs
2018-02-09 14:42:57 -08:00
Linus Torvalds
f1517df870 This request is late, apologies.
But it's also a fairly small update this time around.  Some cleanup,
 RDMA fixes, overlayfs fixes, and a fix for an NFSv4 state bug.
 
 The bigger deal for nfsd this time around is Jeff Layton's
 already-merged i_version patches.  This series has a minor conflict with
 that one, and the resolution should be obvious.  (Stephen Rothwell has
 been carrying it in linux-next for what it's worth.)

Merge tag 'nfsd-4.16' of git://linux-nfs.org/~bfields/linux

Pull nfsd update from Bruce Fields:
 "A fairly small update this time around. Some cleanup, RDMA fixes,
  overlayfs fixes, and a fix for an NFSv4 state bug.

  The bigger deal for nfsd this time around was Jeff Layton's
  already-merged i_version patches"

* tag 'nfsd-4.16' of git://linux-nfs.org/~bfields/linux:
  svcrdma: Fix Read chunk round-up
  NFSD: hide unused svcxdr_dupstr()
  nfsd: store stat times in fill_pre_wcc() instead of inode times
  nfsd: encode stat->mtime for getattr instead of inode->i_mtime
  nfsd: return RESOURCE not GARBAGE_ARGS on too many ops
  nfsd4: don't set lock stateid's sc_type to CLOSED
  nfsd: Detect unhashed stids in nfsd4_verify_open_stid()
  sunrpc: remove dead code in svc_sock_setbufsize
  svcrdma: Post Receives in the Receive completion handler
  nfsd4: permit layoutget of executable-only files
  lockd: convert nlm_rqst.a_count from atomic_t to refcount_t
  lockd: convert nlm_lockowner.count from atomic_t to refcount_t
  lockd: convert nsm_handle.sm_count from atomic_t to refcount_t
2018-02-08 15:18:32 -08:00
Linus Torvalds
a0f79386a4 Mostly cleanups, but three bug fixes:
1. don't pass garbage return codes back up the call chain (Mike Marshall)
 
  2. fix stale inode test (Martin Brandenburg)
 
  3. fix off-by-one errors (Xiongfeng Wang)
 
 Also: add Martin as a reviewer in the Maintainers file.

Merge tag 'for-linus-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux

Pull orangefs updates from Mike Marshall:
 "Mostly cleanups, but three bug fixes:

   - don't pass garbage return codes back up the call chain (Mike
     Marshall)

   - fix stale inode test (Martin Brandenburg)

   - fix off-by-one errors (Xiongfeng Wang)

  Also add Martin as a reviewer in the Maintainers file"

* tag 'for-linus-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
  orangefs: reverse sense of is-inode-stale test in d_revalidate
  orangefs: simplify orangefs_inode_is_stale
  Orangefs: don't propogate whacky error codes
  orangefs: use correct string length
  orangefs: make orangefs_make_bad_inode static
  orangefs: remove ORANGEFS_KERNEL_DEBUG
  orangefs: remove gossip_ldebug and gossip_lerr
  orangefs: make orangefs_client_debug_init static
  MAINTAINERS: update orangefs list and add myself as reviewer
2018-02-08 12:20:41 -08:00
Linus Torvalds
81153336eb AFS development

Merge tag 'afs-next-20180208' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull afs updates from David Howells:
 "Four fixes:

   - add a missing put

   - two fixes to reset the address iteration cursor correctly

   - fix setting up the fileserver iteration cursor.

  Two cleanups:

   - remove some dead code

   - rearrange a function to be more logically laid out

  And one new feature:

   - Support AFS dynamic root.

     With this one should be able to do, say:

        mkdir /afs
        mount -t afs none /afs -o dyn

     to create a dynamic root and then, provided you have keyutils
     installed, do:

        ls /afs/grand.central.org

     and:

        ls /afs/umich.edu

     to list the root volumes of both those organisations' AFS cells
     without requiring any other setup (the kernel upcalls to a program
     in the keyutils package to do DNS access, as NFS does)"

* tag 'afs-next-20180208' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  afs: Support the AFS dynamic root
  afs: Rearrange afs_select_fileserver() a little
  afs: Remove unused code
  afs: Fix server list handling
  afs: Need to clear responded flag in addr cursor
  afs: Fix missing cursor clearance
  afs: Add missing afs_put_cell()
2018-02-08 12:12:04 -08:00
Linus Torvalds
9e95dae76b Things have been very quiet on the rbd side, as work continues on the
big ticket items slated for the next merge window.
 
 On the CephFS side we have a large number of cap handling improvements,
 a fix for our long-standing abuse of ->journal_info in ceph_readpages()
 and yet another dentry pointer management patch.

Merge tag 'ceph-for-4.16-rc1' of git://github.com/ceph/ceph-client

Pull ceph updates from Ilya Dryomov:
 "Things have been very quiet on the rbd side, as work continues on the
  big ticket items slated for the next merge window.

  On the CephFS side we have a large number of cap handling
  improvements, a fix for our long-standing abuse of ->journal_info in
  ceph_readpages() and yet another dentry pointer management patch"

* tag 'ceph-for-4.16-rc1' of git://github.com/ceph/ceph-client:
  ceph: improving efficiency of syncfs
  libceph: check kstrndup() return value
  ceph: try to allocate enough memory for reserved caps
  ceph: fix race of queuing delayed caps
  ceph: delete unreachable code in ceph_check_caps()
  ceph: limit rate of cap import/export error messages
  ceph: fix incorrect snaprealm when adding caps
  ceph: fix un-balanced fsc->writeback_count update
  ceph: track read contexts in ceph_file_info
  ceph: avoid dereferencing invalid pointer during cached readdir
  ceph: use atomic_t for ceph_inode_info::i_shared_gen
  ceph: cleanup traceless reply handling for rename
  ceph: voluntarily drop Fx cap for readdir request
  ceph: properly drop caps for setattr request
  ceph: voluntarily drop Lx cap for link/rename requests
  ceph: voluntarily drop Ax cap for requests that create new inode
  rbd: whitelist RBD_FEATURE_OPERATIONS feature bit
  rbd: don't NULL out ->obj_request in rbd_img_obj_parent_read_full()
  rbd: use kmem_cache_zalloc() in rbd_img_request_create()
  rbd: obj_request->completion is unused
2018-02-08 11:38:59 -08:00
Nicolas Pitre
a8c6db00bf cramfs: better MTD dependency expression
Commit b9f5fb1800d8 ("cramfs: fix MTD dependency") did what it says.

Since commit 9059a3493efe ("kconfig: fix relational operators for bool
and tristate symbols") it is possible to do it slightly better though.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-08 11:37:31 -08:00
Arnd Bergmann
2285ae760d NFSD: hide unused svcxdr_dupstr()
There is now only one caller left for svcxdr_dupstr() and this is inside
of an #ifdef, so we can get a warning when the option is disabled:

fs/nfsd/nfs4xdr.c:241:1: error: 'svcxdr_dupstr' defined but not used [-Werror=unused-function]

This changes the remaining caller to use a nicer IS_ENABLED() check,
which lets the compiler drop the unused code silently.

Fixes: e40d99e6183e ("NFSD: Clean up symlink argument XDR decoders")
Suggested-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-02-08 13:40:17 -05:00
Amir Goldstein
39ca1bf624 nfsd: store stat times in fill_pre_wcc() instead of inode times
The time values in stat and inode may differ for overlayfs and stat time
values are the correct ones to use. This is also consistent with the fact
that fill_post_wcc() also stores stat time values.

This means introducing a stat call that could fail, where previously we
were just copying values out of the inode.  To be conservative about
changing behavior, we fall back to copying values out of the inode in
the error case.  It might be better just to clear fh_pre_saved (though
note the BUG_ON in set_change_info).

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-02-08 13:40:17 -05:00
Amir Goldstein
76c479480b nfsd: encode stat->mtime for getattr instead of inode->i_mtime
The values of stat->mtime and inode->i_mtime may differ for overlayfs
and stat->mtime is the correct value to use when encoding getattr.
This is also consistent with the fact that other attr times are also
encoded from stat values.

Both callers of lease_get_mtime() already have the value of stat->mtime,
so the only needed change is that lease_get_mtime() will not overwrite
this value with inode->i_mtime in case the inode does not have an
exclusive lease.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-02-08 13:40:16 -05:00
J. Bruce Fields
0078117c6d nfsd: return RESOURCE not GARBAGE_ARGS on too many ops
A client that sends more than a hundred ops in a single compound
currently gets an rpc-level GARBAGE_ARGS error.

It would be more helpful to return NFS4ERR_RESOURCE, since that gives
the client a better idea how to recover (for example by splitting up the
compound into smaller compounds).

This is all a bit academic since we've never actually seen a reason for
clients to send such long compounds, but we may as well fix it.

While we're there, just use NFSD4_MAX_OPS_PER_COMPOUND == 16, the
constant we already use in the 4.1 case, instead of hard-coding 100.
Chances anyone actually uses even 16 ops per compound are small enough
that I think there's a negligible risk of any regression.

This fixes pynfs test COMP6.

Reported-by: "Lu, Xinyu" <luxy.fnst@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-02-08 13:40:16 -05:00
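The server-side check and the client-side recovery mentioned in the commit above (0078117c6d) can be sketched in Python. This is an illustrative model, not nfsd code: the function names are hypothetical, the strict "greater than" comparison is an assumption, the limit of 16 comes from the commit message, and 10018 is the NFS4ERR_RESOURCE status value from the NFSv4 specification.

```python
NFSD4_MAX_OPS_PER_COMPOUND = 16  # value stated in the commit message
NFS4_OK = 0
NFS4ERR_RESOURCE = 10018  # NFSv4 status code


def check_compound_length(op_count):
    """Model of the check: an over-long compound gets NFS4ERR_RESOURCE."""
    if op_count > NFSD4_MAX_OPS_PER_COMPOUND:
        return NFS4ERR_RESOURCE
    return NFS4_OK


def split_compound(ops, limit=NFSD4_MAX_OPS_PER_COMPOUND):
    """Client-side recovery: split a long op list into smaller compounds."""
    return [ops[i:i + limit] for i in range(0, len(ops), limit)]
```

A client that receives NFS4ERR_RESOURCE for a 40-op compound could, under this model, resend it as three compounds of 16, 16, and 8 ops.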
Linus Torvalds
6fbac201f9 iversion.h related cleanup for v4.16

Merge tag 'iversion-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux

Pull inode->i_version cleanup from Jeff Layton:
 "Goffredo went ahead and sent a patch to rename this function, and
  reverse its sense, as we discussed last week.

  The patch is very straightforward and I figure it's probably best to
  go ahead and merge this to get the API as settled as possible"

* tag 'iversion-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
  iversion: Rename inode_cmp_iversion{+raw} to inode_eq_iversion{+raw}
2018-02-07 14:25:22 -08:00
Linus Torvalds
fe803f8628 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull UDF and ext2 fixlets from Jan Kara:
 "A UDF fix and an ext2 cleanup"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  ext2: drop unneeded newline
  udf: Sanitize nanoseconds for time stamps
2018-02-07 14:23:06 -08:00
Steve French
5f60a56494 Add missing structs and defines from recent SMB3.1.1 documentation
The last two updates to MS-SMB2 protocol documentation added various
flags and structs (especially relating to SMB3.1.1 tree connect).
Add missing defines and structs to smb2pdu.h

Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2018-02-07 09:36:46 -06:00
Steve French
f9de151bf2 address lock imbalance warnings in smbdirect.c
Although at least one of these was an overly strict sparse warning
in the new smbdirect code, it is cleaner to fix - so no warnings.

Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2018-02-07 09:36:43 -06:00
Arnd Bergmann
ade7db991b cifs: silence compiler warnings showing up with gcc-8.0.0
This bug was fixed before, but came up again with the latest
compiler in another function:

fs/cifs/cifssmb.c: In function 'CIFSSMBSetEA':
fs/cifs/cifssmb.c:6362:3: error: 'strncpy' offset 8 is out of the bounds [0, 4] [-Werror=array-bounds]
   strncpy(parm_data->list[0].name, ea_name, name_len);

Let's apply the same fix that was used for the other instances.

Fixes: b2a3ad9ca502 ("cifs: silence compiler warnings showing up with gcc-4.7.0")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Steve French <smfrench@gmail.com>
2018-02-07 09:36:41 -06:00
Steve French
ede2e520a1 Add some missing debug fields in server and tcon structs
Allow dumping out debug information on dialect, signing, unix extensions
and encryption

Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2018-02-07 09:36:38 -06:00
Linus Torvalds
a2e5790d84 Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:

 - kasan updates

 - procfs

 - lib/bitmap updates

 - other lib/ updates

 - checkpatch tweaks

 - rapidio

 - ubsan

 - pipe fixes and cleanups

 - lots of other misc bits

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (114 commits)
  Documentation/sysctl/user.txt: fix typo
  MAINTAINERS: update ARM/QUALCOMM SUPPORT patterns
  MAINTAINERS: update various PALM patterns
  MAINTAINERS: update "ARM/OXNAS platform support" patterns
  MAINTAINERS: update Cortina/Gemini patterns
  MAINTAINERS: remove ARM/CLKDEV SUPPORT file pattern
  MAINTAINERS: remove ANDROID ION pattern
  mm: docs: add blank lines to silence sphinx "Unexpected indentation" errors
  mm: docs: fix parameter names mismatch
  mm: docs: fixup punctuation
  pipe: read buffer limits atomically
  pipe: simplify round_pipe_size()
  pipe: reject F_SETPIPE_SZ with size over UINT_MAX
  pipe: fix off-by-one error when checking buffer limits
  pipe: actually allow root to exceed the pipe buffer limits
  pipe, sysctl: remove pipe_proc_fn()
  pipe, sysctl: drop 'min' parameter from pipe-max-size converter
  kasan: rework Kconfig settings
  crash_dump: is_kdump_kernel can be boolean
  kernel/mutex: mutex_is_locked can be boolean
  ...
2018-02-06 22:15:42 -08:00
Eric Biggers
f734076181 pipe: read buffer limits atomically
The pipe buffer limits are accessed without any locking, and may be
changed at any time by the sysctl handlers.  In theory this could cause
problems for expressions like the following:

    pipe_user_pages_hard && user_bufs > pipe_user_pages_hard

...  since the assembly code might reference the 'pipe_user_pages_hard'
memory location multiple times, and if the admin removes the limit by
setting it to 0, there is a very brief window where processes could
incorrectly observe the limit to be exceeded.

Fix this by loading the limits with READ_ONCE() prior to use.
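
As a rough userspace illustration of the pattern (not the kernel code
itself), the sketch below contrasts the racy expression with a version
that snapshots the limit once.  The READ_ONCE() macro here is a
simplified stand-in built on a volatile access and the GNU __typeof__
extension, and both function names are hypothetical:

```c
#include <stdbool.h>

/* Simplified stand-in for the kernel's READ_ONCE(): the volatile access
 * makes the compiler perform exactly one load of x.  Assumes a compiler
 * that supports __typeof__ (GCC or Clang). */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Limit that another thread (the sysctl handler, in the kernel) may
 * change at any time; 0 means "no limit". */
unsigned long pipe_user_pages_hard;

/* Racy form: the limit may be loaded twice, so a concurrent write of 0
 * can land between the non-zero test and the comparison. */
bool too_many_buffers_racy(unsigned long user_bufs)
{
	return pipe_user_pages_hard && user_bufs > pipe_user_pages_hard;
}

/* Snapshot form: load the limit once, then test only the local copy. */
bool too_many_buffers_snapshot(unsigned long user_bufs)
{
	unsigned long hard_limit = READ_ONCE(pipe_user_pages_hard);

	return hard_limit && user_bufs > hard_limit;
}
```

With the snapshot, both halves of the condition see the same value, so a
limit that has just been cleared can no longer be reported as exceeded.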

Link: http://lkml.kernel.org/r/20180111052902.14409-8-ebiggers3@gmail.com
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: "Luis R . Rodriguez" <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:48 -08:00
Eric Biggers
c4fed5a91f pipe: simplify round_pipe_size()
round_pipe_size() calculates the number of pages the requested size
corresponds to, then rounds the page count up to the next power of 2.

However, it also rounds everything < PAGE_SIZE up to PAGE_SIZE.
Therefore, there's no need to actually translate the size into a page
count; we just need to round the size up to the next power of 2.

We do need to verify the size isn't greater than (1 << 31), since on
32-bit systems roundup_pow_of_two() would be undefined in that case.  But
that can just be combined with the UINT_MAX check which we need anyway
now.

Finally, update pipe_set_size() to not redundantly check the return value
of round_pipe_size() for the "invalid size" case twice.
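
A minimal userspace sketch of the simplified rounding described above,
under stated assumptions: SKETCH_PAGE_SIZE is an assumed 4096-byte page,
the helper is a portable loop rather than the kernel's
roundup_pow_of_two(), and a return value of 0 stands for "invalid size":

```c
/* Assumed page size for illustration; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096UL

/* Portable stand-in for roundup_pow_of_two(); only called with
 * 1 <= n <= (1UL << 31), so the shift cannot overflow. */
static unsigned long roundup_pow_of_two_sketch(unsigned long n)
{
	unsigned long p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Round a requested pipe size up to a power of two, with a one-page
 * minimum.  Returns 0 if the size is too large to round safely. */
unsigned int round_pipe_size_sketch(unsigned long size)
{
	/* Covers both the UINT_MAX check and the 32-bit rounding limit. */
	if (size > (1UL << 31))
		return 0;

	/* Everything below one page rounds up to one page. */
	if (size < SKETCH_PAGE_SIZE)
		return (unsigned int)SKETCH_PAGE_SIZE;

	return (unsigned int)roundup_pow_of_two_sketch(size);
}
```

Because the minimum is applied to the byte count directly, no
intermediate page count is needed.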

Link: http://lkml.kernel.org/r/20180111052902.14409-7-ebiggers3@gmail.com
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Luis R . Rodriguez" <mcgrof@kernel.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:48 -08:00
Eric Biggers
96e99be40e pipe: reject F_SETPIPE_SZ with size over UINT_MAX
A pipe's size is represented as an 'unsigned int'.  As expected, writing a
value greater than UINT_MAX to /proc/sys/fs/pipe-max-size fails with
EINVAL.  However, the F_SETPIPE_SZ fcntl silently truncates such values to
32 bits, rather than failing with EINVAL as expected.  (It *does* fail
with EINVAL for values above (1 << 31) but <= UINT_MAX.)

Fix this by moving the check against UINT_MAX into round_pipe_size() which
is called in both cases.
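
The truncation can be illustrated outside the kernel with the sketch
below.  It assumes a 32-bit 'unsigned int', models the fcntl argument as
a uint64_t, and uses hypothetical function names; it is not the actual
fcntl path:

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/* Old behaviour, in miniature: the narrowing conversion silently keeps
 * only the low bits of the argument. */
unsigned int pipe_size_unchecked(uint64_t arg)
{
	return (unsigned int)arg;
}

/* Checked form: reject anything that does not fit before narrowing.
 * Returns 0 and stores the size on success, -EINVAL otherwise. */
int pipe_size_checked(uint64_t arg, unsigned int *size)
{
	if (arg > UINT_MAX)
		return -EINVAL;

	*size = (unsigned int)arg;
	return 0;
}
```

For example, an argument of (1 << 32) + 4096 comes out of the unchecked
form as 4096, whereas the checked form rejects it.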

Link: http://lkml.kernel.org/r/20180111052902.14409-6-ebiggers3@gmail.com
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Luis R . Rodriguez" <mcgrof@kernel.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:47 -08:00
Eric Biggers
9903a91c76 pipe: fix off-by-one error when checking buffer limits
With pipe-user-pages-hard set to 'N', users were actually only allowed up
to 'N - 1' buffers; and likewise for pipe-user-pages-soft.

Fix this to allow up to 'N' buffers, as would be expected.
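
The off-by-one can be pictured as the difference between '>=' and '>'
in the limit test.  The sketch below is an illustration with
hypothetical names; it assumes 'user_bufs' is the user's total after the
requested buffers have been counted, and that a limit of 0 means "no
limit":

```c
#include <stdbool.h>

/* Before: reaching the limit exactly already counts as too many, so
 * only 'limit - 1' buffers are usable. */
bool too_many_buffers_before(unsigned long limit, unsigned long user_bufs)
{
	return limit && user_bufs >= limit;
}

/* After: exactly 'limit' buffers are allowed; only going past the
 * limit is rejected. */
bool too_many_buffers_after(unsigned long limit, unsigned long user_bufs)
{
	return limit && user_bufs > limit;
}
```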

Link: http://lkml.kernel.org/r/20180111052902.14409-5-ebiggers3@gmail.com
Fixes: b0b91d18e2e9 ("pipe: fix limit checking in pipe_set_size()")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Willy Tarreau <w@1wt.eu>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Luis R . Rodriguez" <mcgrof@kernel.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:47 -08:00
Eric Biggers
85c2dd5473 pipe: actually allow root to exceed the pipe buffer limits
pipe-user-pages-hard and pipe-user-pages-soft are only supposed to apply
to unprivileged users, as documented in both Documentation/sysctl/fs.txt
and the pipe(7) man page.

However, the capabilities are actually only checked when increasing a
pipe's size using F_SETPIPE_SZ, not when creating a new pipe.  Therefore,
if pipe-user-pages-hard has been set, the root user can run into it and be
unable to create pipes.  Similarly, if pipe-user-pages-soft has been set,
the root user can run into it and have their pipes limited to 1 page each.

Fix this by allowing the privileged override in both cases.
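
A simplified model of pipe creation with the override applied at both
limits is sketched below.  The struct, the function name and the
'privileged' flag are hypothetical; in the kernel the privilege test is
a capability check, and 'user_bufs' here is assumed to be the user's
total after the default allocation has been counted:

```c
#include <stdbool.h>

/* Hypothetical container for the two sysctl limits; 0 means "unset". */
struct pipe_limits {
	unsigned long soft;
	unsigned long hard;
};

static bool over_limit(unsigned long limit, unsigned long user_bufs)
{
	return limit && user_bufs > limit;
}

/* Returns the number of buffers a new pipe gets, or 0 if creation is
 * refused.  Privileged callers bypass both limits. */
unsigned long new_pipe_buffers(const struct pipe_limits *lim,
			       unsigned long user_bufs,
			       unsigned long default_bufs,
			       bool privileged)
{
	unsigned long pipe_bufs = default_bufs;

	/* Soft limit: shrink the new pipe to a single buffer. */
	if (over_limit(lim->soft, user_bufs) && !privileged) {
		user_bufs = user_bufs - default_bufs + 1;
		pipe_bufs = 1;
	}

	/* Hard limit: refuse to create the pipe at all. */
	if (over_limit(lim->hard, user_bufs) && !privileged)
		return 0;

	return pipe_bufs;
}
```

The point of the sketch is that the privilege test guards both the soft
and the hard branch, not just the resize path.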

Link: http://lkml.kernel.org/r/20180111052902.14409-4-ebiggers3@gmail.com
Fixes: 759c01142a5d ("pipe: limit the per-user amount of pages allocated in pipes")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Luis R . Rodriguez" <mcgrof@kernel.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:47 -08:00
Eric Biggers
319e0a21bb pipe, sysctl: remove pipe_proc_fn()
pipe_proc_fn() is no longer needed, as it only calls through to
proc_dopipe_max_size().  Just put proc_dopipe_max_size() in the ctl_table
entry directly, and remove the unneeded EXPORT_SYMBOL() and the ENOSYS
stub for it.

(The reason the ENOSYS stub isn't needed is that the pipe-max-size
ctl_table entry is located directly in 'kern_table' rather than being
registered separately.  Therefore, the entry is already only defined when
the kernel is built with sysctl support.)

Link: http://lkml.kernel.org/r/20180111052902.14409-3-ebiggers3@gmail.com
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Luis R . Rodriguez" <mcgrof@kernel.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:47 -08:00