73545 Commits

Author SHA1 Message Date
Shyam Prasad N
46bb1b9484 cifs: do not duplicate fscache cookie for secondary channels
We allocate index cookies for each connection from the client.
However, we don't need this index for each channel in case of
multichannel. So making sure that we avoid creating duplicate
cookies by instantiating only for primary channel.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-12 23:29:08 -06:00
Shyam Prasad N
0f2b305af9 cifs: connect individual channel servers to primary channel server
Today, we don't have any way to get the smb session for any
of the secondary channels. Introducing a pointer to the primary
server from server struct of any secondary channel. The value will
be NULL for the server of the primary channel. This will enable us
to get the smb session for any channel.

This will be needed for some of the changes that I'm planning
to make soon.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-12 20:27:06 -06:00
Shyam Prasad N
724244cdb3 cifs: protect session channel fields with chan_lock
Introducing a new spin lock to protect all the channel related
fields in a cifs_ses struct. This lock should be taken
whenever dealing with the channel fields, and should be held
only for very short intervals which will not sleep.

Currently, all channel related fields in cifs_ses structure
are protected by session_mutex. However, this mutex is held for
long periods (sometimes while waiting for a reply from server).
This makes the codepath quite tricky to change.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-12 16:22:20 -06:00
Shyam Prasad N
8e07757bec cifs: do not negotiate session if session already exists
In cifs_get_smb_ses, if we find an existing matching session,
we should not send a negotiate request for the session if a
session reconnect is not necessary.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-12 16:21:03 -06:00
Steve French
02102744d3 smb3: do not setup the fscache_super_cookie until fsinfo initialized
We were calling cifs_fscache_get_super_cookie after tcon but before
we queried the info (QFS_Info) we need to initialize the cookie
properly.  Also includes an additional check suggested by Paulo
to make sure we don't initialize super cookie twice.

Suggested-by: David Howells <dhowells@redhat.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-12 13:10:33 -06:00
Paulo Alcantara
7f28af9cf5 cifs: fix potential use-after-free bugs
Ensure that share and prefix variables are set to NULL after kfree()
when looping through DFS targets in __tree_connect_dfs_target().

Also, get rid of @ref in __tree_connect_dfs_target() and just pass a
boolean to indicate whether we're handling link targets or not.

Fixes: c88f7dcd6d64 ("cifs: support nested dfs links over reconnect")
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-12 12:59:59 -06:00
Paulo Alcantara
869da64d07 cifs: fix memory leak of smb3_fs_context_dup::server_hostname
Fix memory leak of smb3_fs_context_dup::server_hostname when parsing
and duplicating fs contexts during mount(2) as reported by kmemleak:

  unreferenced object 0xffff888125715c90 (size 16):
    comm "mount.cifs", pid 3832, jiffies 4304535868 (age 190.094s)
    hex dump (first 16 bytes):
      7a 65 6c 64 61 2e 74 65 73 74 00 6b 6b 6b 6b a5  zelda.test.kkkk.
    backtrace:
      [<ffffffff8168106e>] kstrdup+0x2e/0x60
      [<ffffffffa027a362>] smb3_fs_context_dup+0x392/0x8d0 [cifs]
      [<ffffffffa0136353>] cifs_smb3_do_mount+0x143/0x1700 [cifs]
      [<ffffffffa02795e8>] smb3_get_tree+0x2e8/0x520 [cifs]
      [<ffffffff817a19aa>] vfs_get_tree+0x8a/0x2d0
      [<ffffffff8181e3e3>] path_mount+0x423/0x1a10
      [<ffffffff8181fbca>] __x64_sys_mount+0x1fa/0x270
      [<ffffffff83ae364b>] do_syscall_64+0x3b/0x90
      [<ffffffff83c0007c>] entry_SYSCALL_64_after_hwframe+0x44/0xae
  unreferenced object 0xffff888111deed20 (size 32):
    comm "mount.cifs", pid 3832, jiffies 4304536044 (age 189.918s)
    hex dump (first 32 bytes):
      44 46 53 52 4f 4f 54 31 2e 5a 45 4c 44 41 2e 54  DFSROOT1.ZELDA.T
      45 53 54 00 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5  EST.kkkkkkkkkkk.
    backtrace:
      [<ffffffff8168118d>] kstrndup+0x2d/0x90
      [<ffffffffa027ab2e>] smb3_parse_devname+0x9e/0x360 [cifs]
      [<ffffffffa01870c8>] cifs_setup_volume_info+0xa8/0x470 [cifs]
      [<ffffffffa018c469>] connect_dfs_target+0x309/0xc80 [cifs]
      [<ffffffffa018d6cb>] cifs_mount+0x8eb/0x17f0 [cifs]
      [<ffffffffa0136475>] cifs_smb3_do_mount+0x265/0x1700 [cifs]
      [<ffffffffa02795e8>] smb3_get_tree+0x2e8/0x520 [cifs]
      [<ffffffff817a19aa>] vfs_get_tree+0x8a/0x2d0
      [<ffffffff8181e3e3>] path_mount+0x423/0x1a10
      [<ffffffff8181fbca>] __x64_sys_mount+0x1fa/0x270
      [<ffffffff83ae364b>] do_syscall_64+0x3b/0x90
      [<ffffffff83c0007c>] entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: 7be3248f3139 ("cifs: To match file servers, make sure the server hostname matches")
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-12 12:59:56 -06:00
Steve French
ca780da5fd smb3: add additional null check in SMB311_posix_mkdir
Although unlikely for it to be possible for rsp to be null here,
the check is safer to add, and quiets a Coverity warning.

Addresses-Coverity: 1437501 ("Explicit Null dereference")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-12 12:59:54 -06:00
Steve French
9e7ffa77b2 cifs: release lock earlier in dequeue_mid error case
In dequeue_mid we can log an error while holding a spinlock,
GlobalMid_Lock.  Coverity notes that the error logging
also grabs a lock so it is cleaner (and a bit safer) to
release the GlobalMid_Lock before logging the warning.

Addresses-Coverity: 1507573 ("Thread deadlock")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-12 12:59:51 -06:00
Steve French
bac35395d2 smb3: add additional null check in SMB2_tcon
Although unlikely to be possible for rsp to be null here,
the check is safer to add, and quiets a Coverity warning.

Addresses-Coverity: 1420428 ("Explicit null dereferenced")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-12 10:22:05 -06:00
Steve French
6b7895182c smb3: add additional null check in SMB2_open
Although unlikely to be possible for rsp to be null here,
the check is safer to add, and quiets a Coverity warning.

Addresses-Coverity: 1418458 ("Explicit null dereferenced")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-12 10:21:51 -06:00
Steve French
4d9beec22f smb3: add additional null check in SMB2_ioctl
Although unlikely for it to be possible for rsp to be null here,
the check is safer to add, and quiets a Coverity warning.

Addresses-Coverity: 1443909 ("Explicit Null dereference")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-11 14:49:24 -06:00
Steve French
0e62904836 smb3: remove trivial dfs compile warning
Fix warning caused by recent changes to the dfs code:

symbol 'tree_connect_dfs_target' was not declared. Should it be static?

Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-10 16:30:23 -06:00
Paulo Alcantara
c88f7dcd6d cifs: support nested dfs links over reconnect
Mounting a dfs link that has nested links was already supported at
mount(2), so make it work over reconnect as well.

Make the following case work:

* mount //root/dfs/link /mnt -o ...
  - final share: /server/share

* in server settings
  - change target folder of /root/dfs/link3 to /server/share2
  - change target folder of /root/dfs/link2 to /root/dfs/link3
  - change target folder of /root/dfs/link to /root/dfs/link2

* mount -o remount,... /mnt
 - refresh all dfs referrals
 - mark current connection for failover
 - cifs_reconnect() reconnects to root server
 - tree_connect()
   * checks that /root/dfs/link2 is a link, then chase it
   * checks that root/dfs/link3 is a link, then chase it
   * finally tree connect to /server/share2

If the mounted share is no longer accessible and a reconnect had been
triggered, the client will retry it from both last referral
path (/root/dfs/link3) and original referral path (/root/dfs/link).

Any new referral paths found while chasing dfs links over reconnect,
it will be updated to TCP_Server_Info::leaf_fullpath, accordingly.

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-10 16:30:13 -06:00
Steve French
71e6864eac smb3: do not error on fsync when readonly
Linux allows doing a flush/fsync on a file open for read-only,
but the protocol does not allow that.  If the file passed in
on the flush is read-only try to find a writeable handle for
the same inode, if that is not possible skip sending the
fsync call to the server to avoid breaking the apps.

Reported-by: Julian Sikorski <belegdol@gmail.com>
Tested-by: Julian Sikorski <belegdol@gmail.com>
Suggested-by: Jeremy Allison <jra@samba.org>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-10 16:28:27 -06:00
Shyam Prasad N
b6f2a0f89d cifs: for compound requests, use open handle if possible
For smb2_compound_op, it is possible to pass a ref to
an already open file. We should be passing it whenever possible.
i.e. if a matching handle is already kept open.

If we don't do that, we will end up breaking leases for files
kept open on the same client.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-10 11:05:20 -06:00
Paulo Alcantara
4ac0536f88 cifs: set a minimum of 120s for next dns resolution
With commit 506c1da44fee ("cifs: use the expiry output of dns_query to
schedule next resolution") and after triggering the first reconnect,
the next async dns resolution of tcp server's hostname would be
scheduled based on dns_resolver's key expiry default, which happens to
default to 5s on most systems that use key.dns_resolver for upcall.

As per key.dns_resolver.conf(5):

       default_ttl=<number>
              The  number  of  seconds  to  set  as the expiration on a cached
              record.  This will be overridden if the program manages  to  re-
              trieve  TTL  information along with the addresses (if, for exam-
              ple, it accesses the DNS directly).  The default is  5  seconds.
              The value must be in the range 1 to INT_MAX.

Make the next async dns resolution no shorter than 120s as we do not
want to be upcalling too often.

Cc: stable@vger.kernel.org
Fixes: 506c1da44fee ("cifs: use the expiry output of dns_query to schedule next resolution")
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-09 23:03:08 -06:00
Paulo Alcantara
bbcce36804 cifs: split out dfs code from cifs_reconnect()
Make two separate functions that handle dfs and non-dfs reconnect
logics since cifs_reconnect() became way too complex to handle both.
While at it, add some documentation.

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-09 23:01:55 -06:00
Paulo Alcantara
ae0abb4dac cifs: convert list_for_each to entry variant
Convert list_for_each{,_safe} to list_for_each_entry{,_safe} in
cifs_mark_tcp_ses_conns_for_reconnect() function.

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-09 20:46:36 -06:00
Paulo Alcantara
43b459aa5e cifs: introduce new helper for cifs_reconnect()
Create cifs_mark_tcp_ses_conns_for_reconnect() helper to mark all
sessions and tcons for reconnect when reconnecting tcp server.

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-09 20:46:08 -06:00
Paulo Alcantara
efb21d7b0f cifs: fix print of hdr_flags in dfscache_proc_show()
Reorder the parameters in seq_printf() to correctly print header
flags.

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-09 20:44:07 -06:00
Shyam Prasad N
49bd49f983 cifs: send workstation name during ntlmssp session setup
During the ntlmssp session setup (authenticate phases)
send the client workstation info. This can make debugging easier on
servers.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-08 13:07:56 -06:00
Shyam Prasad N
c9f1c19cf7 cifs: nosharesock should not share socket with future sessions
Today, when a new mount is done with nosharesock, we ensure
that we don't select an existing matching session. However,
we don't mark the connection as nosharesock, which means that
those could be shared with future sessions.

Fixed it with this commit. Also printing this info in DebugData.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-07 00:09:37 -05:00
Linus Torvalds
b5013d084e 7 cifs/smb3 fixes, mostly refactoring, but also a mount fix, a debugging improvement, and a reconnect fix for stable
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmGF8yAACgkQiiy9cAdy
 T1F2gAv+PLwQmJVKBx7i8xExUX1SRPFLPBEC6RXWsBBgElSbE3RJdaRLbH2OBBty
 PSlK+hGMP65JevcZO2+bF0nHGiLfh7+MumF+xKkvJxXjoPz/+zTMPnlQP9SNW8Dl
 VhgcQDTdSxQ8lzv2d9Z16b749WAPLuMncZCz1IfY+Dsd7/Zagv12QdPdi2knzAxU
 B+qx3dNPzxTFyCtasUEMATHoxpsOc+MywqDPT8p5/NLpF7h7K2w9qwKezc7hKiI6
 iruZKfjJO+g0QAldT3fp3LzfmUr2V8Z85D0VZn18mQNBxinjtk0+uacZzwoXAxqU
 5EicdIhlMEQqtRJNoDUVRMst0h3UP45AhN63Jjh8VdJRUJeJ14zMlSf3ze9KgTIJ
 Sts3WU/7LPjHk6sMg2lr73y+VRSg2jtfEPpCdoo/g0Cv5h6IsdX5NUhNI98onQQ7
 R350i/A+raiRO5lYkzLcDabXDTesiFfENm8YYLlEK6DiQtZ6PhU/L46dgHPYt+hf
 7/RsKCz5
 =ibss
 -----END PGP SIGNATURE-----

Merge tag '5.16-rc-part1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs updates from Steve French:

 - reconnect fix for stable

 - minor mount option fix

 - debugging improvement for (TCP) connection issues

 - refactoring of common code to help ksmbd

* tag '5.16-rc-part1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb3: add dynamic trace points for socket connection
  cifs: Move SMB2_Create definitions to the shared area
  cifs: Move more definitions into the shared area
  cifs: move NEGOTIATE_PROTOCOL definitions out into the common area
  cifs: Create a new shared file holding smb2 pdu definitions
  cifs: add mount parameter tcpnodelay
  cifs: To match file servers, make sure the server hostname matches
2021-11-06 16:47:53 -07:00
Linus Torvalds
2acda7549e \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmGFN6IACgkQnJ2qBz9k
 QNkfYwgA1w5x/CsN2IMZdx6FTuZFgbOvQpBMTry8iuOPKK3UyIkZaUirTVLKR0cm
 k3QbBR9/vTfQTNg5weuFJcbPZZaCXKEvlPGvDh+pumMbfTkMwL3FADweNBoZ3PzO
 EiRrV45AbRgSMOzsfURzCz1T53Gd8fYM3pXxmNXG+bnE7+Ea+heKgor8/jFc4U3w
 kAKZTfyCiheo7KxVhFGnkGI3ZhIbnbZne4seY/CE4qtv7/bmBE7bhGpmv8LT5FUn
 h/JBDLjFU0fzJpplXE6n/VHXeGaUwb8adnYpzojWQ0lLYFrMIZFQ0KkDK6PNwmJF
 MKWGqRxDkf54oeWuEAJ9t4/OorqM9A==
 =ltE7
 -----END PGP SIGNATURE-----

Merge tag 'fsnotify_for_v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull fsnotify updates from Jan Kara:
 "Support for reporting filesystem errors through fanotify so that
  system health monitoring daemons can watch for these and act instead
  of scraping system logs"

* tag 'fsnotify_for_v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (34 commits)
  samples: remove duplicate include in fs-monitor.c
  samples: Fix warning in fsnotify sample
  docs: Fix formatting of literal sections in fanotify docs
  samples: Make fs-monitor depend on libc and headers
  docs: Document the FAN_FS_ERROR event
  samples: Add fs error monitoring example
  ext4: Send notifications on error
  fanotify: Allow users to request FAN_FS_ERROR events
  fanotify: Emit generic error info for error event
  fanotify: Report fid info for file related file system errors
  fanotify: WARN_ON against too large file handles
  fanotify: Add helpers to decide whether to report FID/DFID
  fanotify: Wrap object_fh inline space in a creator macro
  fanotify: Support merging of error events
  fanotify: Support enqueueing of error events
  fanotify: Pre-allocate pool of error events
  fanotify: Reserve UAPI bits for FAN_FS_ERROR
  fsnotify: Support FS_ERROR event type
  fanotify: Require fid_mode for any non-fd event
  fanotify: Encode empty file handle when no inode is provided
  ...
2021-11-06 16:43:20 -07:00
Linus Torvalds
d8b4e5bd48 \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmGFNuMACgkQnJ2qBz9k
 QNnIdAgA6T5QBAAfzKj5l+NsNbhmFBIOSvDW+65l8B1ioJaTNivc9Q9sfAaYdICs
 6bGA59FDvnlFe+RyX0Jysphp9Nc4tx7of1fsIhC+gR5U1PwJ/2KvIpZlrz7y4LuP
 yZ4YV9Q2R+4e+68KjzAhAs3izEVmI9L+2LC4/4w18EtIM8NfqIqVYg/nRD/DI1Oz
 /GC21YOYvE6BAtS721DBCONamvp5g3duX7SMjjYuZDeALnfikIdArklNXYUj0dsQ
 a3K+9w0vL+6zCZ7YsLiamoh/PCbNhbU0FseCcKnzELglNQw69KfNk6J7XV64VLqL
 1NW0x4TupUW7TU+mIvmK3jqbXJWe/g==
 =vHTT
 -----END PGP SIGNATURE-----

Merge tag 'fs_for_v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull quota, isofs, and reiserfs updates from Jan Kara:
 "Fixes for handling of corrupted quota files, fix for handling of
  corrupted isofs filesystem, and a small cleanup for reiserfs"

* tag 'fs_for_v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fs: reiserfs: remove useless new_opts in reiserfs_remount
  isofs: Fix out of bound access for corrupted isofs image
  quota: correct error number in free_dqentry()
  quota: check block number when reading the block in quota file
2021-11-06 16:40:48 -07:00
Linus Torvalds
512b7931ad Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:
 "257 patches.

  Subsystems affected by this patch series: scripts, ocfs2, vfs, and
  mm (slab-generic, slab, slub, kconfig, dax, kasan, debug, pagecache,
  gup, swap, memcg, pagemap, mprotect, mremap, iomap, tracing, vmalloc,
  pagealloc, memory-failure, hugetlb, userfaultfd, vmscan, tools,
  memblock, oom-kill, hugetlbfs, migration, thp, readahead, nommu, ksm,
  vmstat, madvise, memory-hotplug, rmap, zsmalloc, highmem, zram,
  cleanups, kfence, and damon)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (257 commits)
  mm/damon: remove return value from before_terminate callback
  mm/damon: fix a few spelling mistakes in comments and a pr_debug message
  mm/damon: simplify stop mechanism
  Docs/admin-guide/mm/pagemap: wordsmith page flags descriptions
  Docs/admin-guide/mm/damon/start: simplify the content
  Docs/admin-guide/mm/damon/start: fix a wrong link
  Docs/admin-guide/mm/damon/start: fix wrong example commands
  mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on
  mm/damon: remove unnecessary variable initialization
  Documentation/admin-guide/mm/damon: add a document for DAMON_RECLAIM
  mm/damon: introduce DAMON-based Reclamation (DAMON_RECLAIM)
  selftests/damon: support watermarks
  mm/damon/dbgfs: support watermarks
  mm/damon/schemes: activate schemes based on a watermarks mechanism
  tools/selftests/damon: update for regions prioritization of schemes
  mm/damon/dbgfs: support prioritization weights
  mm/damon/vaddr,paddr: support pageout prioritization
  mm/damon/schemes: prioritize regions within the quotas
  mm/damon/selftests: support schemes quotas
  mm/damon/dbgfs: support quotas of schemes
  ...
2021-11-06 14:08:17 -07:00
Rongwei Wang
8468e937df mm, thp: fix incorrect unmap behavior for private pages
When truncating pagecache on file THP, the private pages of a process
should not be unmapped mapping.  This incorrect behavior on a dynamic
shared libraries which will cause related processes to happen core dump.

A simple test for a DSO (Prerequisite is the DSO mapped in file THP):

    int main(int argc, char *argv[])
    {
	int fd;

	fd = open(argv[1], O_WRONLY);
	if (fd < 0) {
		perror("open");
	}

	close(fd);
	return 0;
    }

The test only to open a target DSO, and do nothing.  But this operation
will lead one or more process to happen core dump.  This patch mainly to
fix this bug.

Link: https://lkml.kernel.org/r/20211025092134.18562-3-rongwei.wang@linux.alibaba.com
Fixes: eb6ecbed0aa2 ("mm, thp: relax the VM_DENYWRITE constraint on file-backed THPs")
Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Tested-by: Xu Yu <xuyu@linux.alibaba.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Collin Fijalkovich <cfijalkovich@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:41 -07:00
Rongwei Wang
55fc0d9174 mm, thp: lock filemap when truncating page cache
Patch series "fix two bugs for file THP".

This patch (of 2):

Transparent huge page has supported read-only non-shmem files.  The
file- backed THP is collapsed by khugepaged and truncated when written
(for shared libraries).

However, there is a race when multiple writers truncate the same page
cache concurrently.

In that case, subpage(s) of file THP can be revealed by find_get_entry
in truncate_inode_pages_range, which will trigger PageTail BUG_ON in
truncate_inode_page, as follows:

    page:000000009e420ff2 refcount:1 mapcount:0 mapping:0000000000000000 index:0x7ff pfn:0x50c3ff
    head:0000000075ff816d order:9 compound_mapcount:0 compound_pincount:0
    flags: 0x37fffe0000010815(locked|uptodate|lru|arch_1|head)
    raw: 37fffe0000000000 fffffe0013108001 dead000000000122 dead000000000400
    raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
    head: 37fffe0000010815 fffffe001066bd48 ffff000404183c20 0000000000000000
    head: 0000000000000600 0000000000000000 00000001ffffffff ffff000c0345a000
    page dumped because: VM_BUG_ON_PAGE(PageTail(page))
    ------------[ cut here ]------------
    kernel BUG at mm/truncate.c:213!
    Internal error: Oops - BUG: 0 [#1] SMP
    Modules linked in: xfs(E) libcrc32c(E) rfkill(E) ...
    CPU: 14 PID: 11394 Comm: check_madvise_d Kdump: ...
    Hardware name: ECS, BIOS 0.0.0 02/06/2015
    pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--)
    Call trace:
     truncate_inode_page+0x64/0x70
     truncate_inode_pages_range+0x550/0x7e4
     truncate_pagecache+0x58/0x80
     do_dentry_open+0x1e4/0x3c0
     vfs_open+0x38/0x44
     do_open+0x1f0/0x310
     path_openat+0x114/0x1dc
     do_filp_open+0x84/0x134
     do_sys_openat2+0xbc/0x164
     __arm64_sys_openat+0x74/0xc0
     el0_svc_common.constprop.0+0x88/0x220
     do_el0_svc+0x30/0xa0
     el0_svc+0x20/0x30
     el0_sync_handler+0x1a4/0x1b0
     el0_sync+0x180/0x1c0
    Code: aa0103e0 900061e1 910ec021 9400d300 (d4210000)

This patch mainly to lock filemap when one enter truncate_pagecache(),
avoiding truncating the same page cache concurrently.

Link: https://lkml.kernel.org/r/20211025092134.18562-1-rongwei.wang@linux.alibaba.com
Link: https://lkml.kernel.org/r/20211025092134.18562-2-rongwei.wang@linux.alibaba.com
Fixes: eb6ecbed0aa2 ("mm, thp: relax the VM_DENYWRITE constraint on file-backed THPs")
Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: Song Liu <song@kernel.org>
Cc: Collin Fijalkovich <cfijalkovich@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:41 -07:00
Christoph Hellwig
0b3ea0926a fs: explicitly unregister per-superblock BDIs
Add a new SB_I_ flag to mark superblocks that have an ephemeral bdi
associated with them, and unregister it when the superblock is shut
down.

Link: https://lkml.kernel.org/r/20211021124441.668816-4-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:34 -07:00
Peter Xu
2301003215 mm/smaps: simplify shmem handling of pte holes
Firstly, check_shmem_swap variable is actually not necessary, because
it's always set with pte_hole hook; checking each would work.

Meanwhile, the check within smaps_pte_entry is not easy to follow.
E.g., pte_none() check is not needed as "!pte_present && !is_swap_pte"
is the same.  Since at it, use the pte_hole() helper rather than dup the
page cache lookup.

Still keep the CONFIG_SHMEM part so the code can be optimized to nop for
!SHMEM.

There will be a very slight functional change in smaps_pte_entry(), that
for !SHMEM we'll return early for pte_none (before checking page==NULL),
but that's even nicer.

Link: https://lkml.kernel.org/r/20210917164756.8586-4-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:33 -07:00
Peter Xu
10c848c8b4 mm/smaps: fix shmem pte hole swap calculation
Patch series "mm/smaps: Fixes and optimizations on shmem swap handling".

This patch (of 3):

The shmem swap calculation on the privately writable mappings are using
wrong parameters as spotted by Vlastimil.  Fix them.  This was
introduced in commit 48131e03ca4e ("mm, proc: reduce cost of
/proc/pid/smaps for unpopulated shmem mappings"), when shmem_swap_usage
was reworked to shmem_partial_swap_usage.

Test program:

  void main(void)
  {
      char *buffer, *p;
      int i, fd;

      fd = memfd_create("test", 0);
      assert(fd > 0);

      /* isize==2M*3, fill in pages, swap them out */
      ftruncate(fd, SIZE_2M * 3);
      buffer = mmap(NULL, SIZE_2M * 3, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
      assert(buffer);
      for (i = 0, p = buffer; i < SIZE_2M * 3 / 4096; i++) {
          *p = 1;
          p += 4096;
      }
      madvise(buffer, SIZE_2M * 3, MADV_PAGEOUT);
      munmap(buffer, SIZE_2M * 3);

      /*
       * Remap with private+writtable mappings on partial of the inode (<= 2M*3),
       * while the size must also be >= 2M*2 to make sure there's a none pmd so
       * smaps_pte_hole will be triggered.
       */
      buffer = mmap(NULL, SIZE_2M * 2, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
      printf("pid=%d, buffer=%p\n", getpid(), buffer);

      /* Check /proc/$PID/smap_rollup, should see 4MB swap */
      sleep(1000000);
  }

Before the patch, smaps_rollup shows <4MB swap and the number will be
random depending on the alignment of the buffer of mmap() allocated.
After this patch, it'll show 4MB.

Link: https://lkml.kernel.org/r/20210917164756.8586-1-peterx@redhat.com
Link: https://lkml.kernel.org/r/20210917164756.8586-2-peterx@redhat.com
Fixes: 48131e03ca4e ("mm, proc: reduce cost of /proc/pid/smaps for unpopulated shmem mappings")
Signed-off-by: Peter Xu <peterx@redhat.com>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:33 -07:00
Jia He
d41b60359f d_path: fix Kernel doc validator complaining
Kernel doc validator complains:
  Function parameter or member 'p' not described in 'prepend_name'
  Excess function parameter 'buffer' description in 'prepend_name'

Link: https://lkml.kernel.org/r/20211011005614.26189-1-justin.he@arm.com
Fixes: ad08ae586586 ("d_path: introduce struct prepend_buffer")
Signed-off-by: Jia He <justin.he@arm.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:32 -07:00
Arnd Bergmann
d1cef29adc fs/posix_acl.c: avoid -Wempty-body warning
The fallthrough comment for an ignored cmpxchg() return value produces a
harmless warning with 'make W=1':

  fs/posix_acl.c: In function 'get_acl':
  fs/posix_acl.c:127:36: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body]
    127 |                 /* fall through */ ;
        |                                    ^

Simplify it as a step towards a clean W=1 build.  As all architectures
define cmpxchg() as a statement expression these days, it is no longer
necessary to evaluate its return code, and the if() can just be droped.

Link: https://lkml.kernel.org/r/20210927102410.1863853-1-arnd@kernel.org
Link: https://lore.kernel.org/all/20210322132103.qiun2rjilnlgztxe@wittgenstein/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: James Morris <jamorris@linux.microsoft.com>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:32 -07:00
Jan Kara
c7c14a369d ocfs2: do not zero pages beyond i_size
ocfs2_zero_range_for_truncate() can try to zero pages beyond current
inode size despite the fact that underlying blocks should be already
zeroed out and writeback will skip writing such pages anyway.  Avoid the
pointless work.

Link: https://lkml.kernel.org/r/20211025151332.11301-2-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:32 -07:00
Jan Kara
839b63860e ocfs2: fix data corruption on truncate
Patch series "ocfs2: Truncate data corruption fix".

As further testing has shown, commit 5314454ea3f ("ocfs2: fix data
corruption after conversion from inline format") didn't fix all the data
corruption issues the customer started observing after 6dbf7bb55598
("fs: Don't invalidate page buffers in block_write_full_page()") This
time I have tracked them down to two bugs in ocfs2 truncation code.

One bug (truncating page cache before clearing tail cluster and setting
i_size) could cause data corruption even before 6dbf7bb55598, but before
that commit it needed a race with page fault, after 6dbf7bb55598 it
started to be pretty deterministic.

Another bug (zeroing pages beyond old i_size) used to be harmless
inefficiency before commit 6dbf7bb55598.  But after commit 6dbf7bb55598
in combination with the first bug it resulted in deterministic data
corruption.

Although fixing only the first problem is needed to stop data
corruption, I've fixed both issues to make the code more robust.

This patch (of 2):

ocfs2_truncate_file() did unmap invalidate page cache pages before
zeroing partial tail cluster and setting i_size.  Thus some pages could
be left (and likely have left if the cluster zeroing happened) in the
page cache beyond i_size after truncate finished letting user possibly
see stale data once the file was extended again.  Also the tail cluster
zeroing was not guaranteed to finish before truncate finished causing
possible stale data exposure.  The problem started to be particularly
easy to hit after commit 6dbf7bb55598 "fs: Don't invalidate page buffers
in block_write_full_page()" stopped invalidation of pages beyond i_size
from page writeback path.

Fix these problems by unmapping and invalidating pages in the page cache
after the i_size is reduced and tail cluster is zeroed out.

Link: https://lkml.kernel.org/r/20211025150008.29002-1-jack@suse.cz
Link: https://lkml.kernel.org/r/20211025151332.11301-1-jack@suse.cz
Fixes: ccd979bdbce9 ("[PATCH] OCFS2: The Second Oracle Cluster Filesystem")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:32 -07:00
Colin Ian King
848be75d15 ocfs2/dlm: remove redundant assignment of variable ret
The variable ret is being assigned a value that is never read, it is
updated later on with a different value.  The assignment is redundant
and can be removed.

Addresses-Coverity: ("Unused value")
Link: https://lkml.kernel.org/r/20211007233452.30815-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:32 -07:00
Valentin Vidic
da5e7c8782 ocfs2: cleanup journal init and shutdown
Allocate and free struct ocfs2_journal in ocfs2_journal_init and
ocfs2_journal_shutdown.  Init and release of system inodes references
the journal so reorder calls to make sure they work correctly.

Link: https://lkml.kernel.org/r/20211009145006.3478-1-vvidic@valentin-vidic.from.hr
Signed-off-by: Valentin Vidic <vvidic@valentin-vidic.from.hr>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:32 -07:00
Chenyuan Mi
ae3fab5bcc ocfs2: fix handle refcount leak in two exception handling paths
The reference counting issue happens in two exception handling paths of
ocfs2_replay_truncate_records().  When executing these two exception
handling paths, the function forgets to decrease the refcount of handle
increased by ocfs2_start_trans(), causing a refcount leak.

Fix this issue by using ocfs2_commit_trans() to decrease the refcount of
handle in two handling paths.

Link: https://lkml.kernel.org/r/20210908102055.10168-1-cymi20@fudan.edu.cn
Signed-off-by: Chenyuan Mi <cymi20@fudan.edu.cn>
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Wengang Wang <wen.gang.wang@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:32 -07:00
Steve French
d7171cd1ac smb3: add dynamic trace points for socket connection
In debugging user problems with ip address/DNS issues with
smb3 mounts, we sometimes needed additional info on the hostname
and ip address.

Add two tracepoints, one to show socket connection success
and one for failures to connect to the socket.

Sample output:
      mount.cifs-14551   [005] .....  7636.547906: smb3_connect_done: conn_id=0x1 server=localhost addr=127.0.0.1:445
      mount.cifs-14558   [004] .....  7642.405413: smb3_connect_done: conn_id=0x2 server=smfrench.file.core.windows.net addr=52.239.158.232:445
      mount.cifs-14741   [005] .....  7818.490716: smb3_connect_done: conn_id=0x3 server=::1 addr=[::1]:445/0%0
      mount.cifs-14810   [000] .....  7966.380337: smb3_connect_err: rc=-101 conn_id=0x4 server=::2 addr=[::2]:445/0%0
      mount.cifs-14810   [000] .....  7966.380356: smb3_connect_err: rc=-101 conn_id=0x4 server=::2 addr=[::2]:139/0%0
      mount.cifs-14818   [003] .....  7986.771992: smb3_connect_done: conn_id=0x5 server=127.0.0.9 addr=127.0.0.9:445
      mount.cifs-14825   [008] .....  8008.178109: smb3_connect_err: rc=-115 conn_id=0x6 server=124.23.0.9 addr=124.23.0.9:445
      mount.cifs-14825   [008] .....  8013.298085: smb3_connect_err: rc=-115 conn_id=0x6 server=124.23.0.9 addr=124.23.0.9:139
           cifsd-14553   [006] .....  8036.735615: smb3_reconnect: conn_id=0x1 server=localhost current_mid=32
           cifsd-14743   [010] .....  8036.735644: smb3_reconnect: conn_id=0x3 server=::1 current_mid=29
           cifsd-14743   [010] .....  8039.921740: smb3_connect_err: rc=-111 conn_id=0x3 server=::1 addr=[::1]:445/0%0
           cifsd-14553   [008] .....  8042.993894: smb3_connect_err: rc=-111 conn_id=0x1 server=localhost addr=127.0.0.1:445
           cifsd-14743   [010] .....  8042.993894: smb3_connect_err: rc=-111 conn_id=0x3 server=::1 addr=[::1]:445/0%0
           cifsd-14553   [008] .....  8046.065824: smb3_connect_err: rc=-111 conn_id=0x1 server=localhost addr=127.0.0.1:445
           cifsd-14743   [010] .....  8046.065824: smb3_connect_err: rc=-111 conn_id=0x3 server=::1 addr=[::1]:445/0%0
           cifsd-14553   [008] .....  8049.137796: smb3_connect_done: conn_id=0x1 server=localhost addr=127.0.0.1:445
           cifsd-14743   [010] .....  8049.137796: smb3_connect_done: conn_id=0x3 server=::1 addr=[::1]:445/0%0

Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-05 16:20:24 -05:00
Ronnie Sahlberg
c462870bf8 cifs: Move SMB2_Create definitions to the shared area
Move all SMB2_Create definitions (except contexts) into the shared area.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-05 09:55:36 -05:00
Ronnie Sahlberg
d8d9de532d cifs: Move more definitions into the shared area
Move SMB2_SessionSetup, SMB2_Close, SMB2_Read, SMB2_Write and
SMB2_ChangeNotify commands into smbfs_common/smb2pdu.h

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-05 09:55:21 -05:00
Ronnie Sahlberg
fc0b384469 cifs: move NEGOTIATE_PROTOCOL definitions out into the common area
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-05 09:53:54 -05:00
Ronnie Sahlberg
0d35e382e4 cifs: Create a new shared file holding smb2 pdu definitions
This file will contain all the definitions we need for SMB2 packets
and will follow the naming convention of MS-SMB2.PDF as closely
as possible to make it easier to cross-reference beween the definitions
and the standard.

The content of this file will mostly consist of migration of existing
definitions in the cifs/smb2.pdu.h and ksmbd/smb2pdu.h files
with some additional tweaks as the two files have diverged.

This patch introduces the new smbfs_common/smb2pdu.h file
and migrates the SMB2 header as well as TREE_CONNECT and TREE_DISCONNECT
to the shared file.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-05 09:50:57 -05:00
Linus Torvalds
95faf6ba65 Driver core changes for 5.16-rc1
Here is the big set of driver core changes for 5.16-rc1.
 
 All of these have been in linux-next for a while now with no reported
 problems.
 
 Included in here are:
 	- big update and cleanup of the sysfs abi documentation files
 	  and scripts from Mauro.  We are almost at the place where we
 	  can properly check that the running kernel's sysfs abi is
 	  documented fully.
 	- firmware loader updates
 	- dyndbg updates
 	- kernfs cleanups and fixes from Christoph
 	- device property updates
 	- component fix
 	- other minor driver core cleanups and fixes
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYYPbjQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ync9gCfXKMUI1GAnCfJWAwTdTcd18q5akoAoMw32/AH
 0yh5TjAWFyFd7xz5d7qs
 =itsC
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the big set of driver core changes for 5.16-rc1.

  All of these have been in linux-next for a while now with no reported
  problems.

  Included in here are:

   - big update and cleanup of the sysfs abi documentation files and
     scripts from Mauro. We are almost at the place where we can
     properly check that the running kernel's sysfs abi is documented
     fully.

   - firmware loader updates

   - dyndbg updates

   - kernfs cleanups and fixes from Christoph

   - device property updates

   - component fix

   - other minor driver core cleanups and fixes"

* tag 'driver-core-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (122 commits)
  device property: Drop redundant NULL checks
  x86/build: Tuck away built-in firmware under FW_LOADER
  vmlinux.lds.h: wrap built-in firmware support under FW_LOADER
  firmware_loader: move struct builtin_fw to the only place used
  x86/microcode: Use the firmware_loader built-in API
  firmware_loader: remove old DECLARE_BUILTIN_FIRMWARE()
  firmware_loader: formalize built-in firmware API
  component: do not leave master devres group open after bind
  dyndbg: refine verbosity 1-4 summary-detail
  gpiolib: acpi: Replace custom code with device_match_acpi_handle()
  i2c: acpi: Replace custom function with device_match_acpi_handle()
  driver core: Provide device_match_acpi_handle() helper
  dyndbg: fix spurious vNpr_info change
  dyndbg: no vpr-info on empty queries
  dyndbg: vpr-info on remove-module complete, not starting
  device property: Add missed header in fwnode.h
  Documentation: dyndbg: Improve cli param examples
  dyndbg: Remove support for ddebug_query param
  dyndbg: make dyndbg a known cli param
  dyndbg: show module in vpr-info in dd-exec-queries
  ...
2021-11-04 08:32:38 -07:00
Linus Torvalds
a602285ac1 Merge branch 'per_signal_struct_coredumps-for-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull per signal_struct coredumps from Eric Biederman:
 "Current coredumps are mixed up with the exit code, the signal handling
  code, and the ptrace code making coredumps much more complicated than
  necessary and difficult to follow.

  This series of changes starts with ptrace_stop and cleans it up,
  making it easier to follow what is happening in ptrace_stop. Then
  cleans up the exec interactions with coredumps. Then cleans up the
  coredump interactions with exit. Finally the coredump interactions
  with the signal handling code is cleaned up.

  The first and last changes are bug fixes for minor bugs.

  I believe the fact that vfork followed by execve can kill the process
  the called vfork if exec fails is sufficient justification to change
  the userspace visible behavior.

  In previous discussions some of these changes were organized
  differently and individually appeared to make the code base worse. As
  currently written I believe they all stand on their own as cleanups
  and bug fixes.

  Which means that even if the worst should happen and the last change
  needs to be reverted for some unimaginable reason, the code base will
  still be improved.

  If the worst does not happen there are a more cleanups that can be
  made. Signals that generate coredumps can easily become eligible for
  short circuit delivery in complete_signal. The entire rendezvous for
  generating a coredump can move into get_signal. The function
  force_sig_info_to_task be written in a way that does not modify the
  signal handling state of the target task (because coredumps are
  eligible for short circuit delivery). Many of these future cleanups
  can be done another way but nothing so cleanly as if coredumps become
  per signal_struct"

* 'per_signal_struct_coredumps-for-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  coredump: Limit coredumps to a single thread group
  coredump:  Don't perform any cleanups before dumping core
  exit: Factor coredump_exit_mm out of exit_mm
  exec: Check for a pending fatal signal instead of core_state
  ptrace: Remove the unnecessary arguments from arch_ptrace_stop
  signal: Remove the bogus sigkill_pending in ptrace_stop
2021-11-03 12:15:29 -07:00
Linus Torvalds
655fedaad3 Just one JFS patch
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEIodevzQLVs53l6BhNqiEXrVAjGQFAmGCnuoACgkQNqiEXrVA
 jGS68RAAm/sHFB4Kha351GHiOzLTqdbecYR6g9FrGA1rjeum1H0KeCKxTZ8D+PAP
 pxUieR/n52UC+2qqnZWgJccRgVRdyMSqkVwRFJAGourv9cd8CILb0cZUaoCLaSpU
 GbHn2GPaOBbOOPGiCLY6AJ3IhhDY+2qhgC6M4eGGRW0uVl/vyuW/CuAUH0PUFL+y
 Xksf0qOLcr+nPCsqb+kIbY95ZRD67udtbtn09VW/T2wSCQeN1KJEBcs2zC0BB6Nz
 OwccEZrd0spyCh/15G4pzo2y5Pq4yu+ymeuxFEyCGRTzgT+CNFvS7URSW5fwOLUw
 Q20mi+Y0xu/4fVgGhkpIQXySaA1/4JTYfXDk65q5f1VpI5/7RAVOj8jRUWQKS+ra
 B0ihP1yJf91sh7W2ykxjQtqsw/+UukYJijiD4Jwk8LAOHd00bpsT6BnbF05dDFaH
 a7aWIzaon6qeJ5KDkgqhn9bRRjE5o3i2SqWu3W6QBEkaFYIIurej1m/FNYpgND1W
 c+z6VRO09N18g2vCVEG7nZ6i2ob7xjuB9aNS0r+btf6ZoKZ5pfxDpm8YcFcUlkS7
 pHSt2VnbRZBoYF5/9Hw+2MzEj8Gum3J/CiEfR4nLCjqXUfCdIHYHk+vZFwybeNXy
 YRVRYpTMX60re+vyoRcnVBG0Ua9kvanEVj4HxiZpEukVRi9bzhE=
 =0HJn
 -----END PGP SIGNATURE-----

Merge tag 'jfs-5.16' of git://github.com/kleikamp/linux-shaggy

Pull jfs fix from David Kleikamp:
 "Just one JFS patch"

* tag 'jfs-5.16' of git://github.com/kleikamp/linux-shaggy:
  JFS: fix memleak in jfs_mount
2021-11-03 09:23:25 -07:00
Linus Torvalds
bba7d68227 New code for 5.16:
* Bug fixes and cleanups for kernel memory allocation usage, this time
    without touching the mm code.
  * Refactor the log recovery mechanism that preserves held resources
    across a transaction roll so that it uses the exact same mechanism
    that we use for that during regular runtime.
  * Fix bugs and tighten checking around btree heights.
  * Remove more old typedefs.
  * Fix perag reference leaks when racing with growfs.
  * Remove unused fields from xfs_btree_cur.
  * Allocate various scrub structures on the heap to reduce stack usage.
  * Pack xfs_btree_cur fields and rearrange to support arbitrary heights.
  * Compute maximum possible heights for each btree height, and use that
    to set up slab caches for each btree type.
  * Finally remove kmem_zone_t, since these have always been struct
    kmem_cache on Linux.
  * Compact the structures used to coordinate work intent items.
  * Set up slab caches for each work intent item type.
  * Rename the "bmap_add_free" function to "free_extent_later", which
    more accurately describes what it does.
  * Fix corruption warning on unmount when a CoW preallocation covers a
    data fork delalloc reservation but then the CoW fails.
  * Add some more minor code improvements.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAmF9d8IACgkQ+H93GTRK
 tOug0w/+N/RqL2Yo3R67ByI6+Los7JCWixTZJJjDzwvnYAAmT/z04JWD0fSO0iQX
 2UlxsONrWxLswX4qYdzJNhfDfxMI+aso5D9IHNIchOa95JLq4FSnJ7Elw+AaYJW3
 EvSm3LLFwwZwreNk6EXCOfRnPK2EXTFkq4iCPMu8bW+wXlxth2fo2AiZzNzl4y6N
 tBc8lYhtMiu7n223wM+qqAnh6onB4iSu9HtyMcuIAL3FvQU4FYlHy+IurtwnJX+p
 fHhPUG6YaIxvRFDMMEfEd2a8CSRtBoOmAgDNbKKq+l/2NrXRW3ysrohRAKtUrCfC
 15RFKvoZu9a2JxOfXxdr78zFGykuMFesEOaK86c58C35Rp5fTZTmB0kntLrUvIsK
 6gdSslUM0Dsw8v6l6tbIHBMjg7mn4fatK8IM74U7omL+mBHnV2HEBNtaOqG4Fjd+
 YSG3g/F7AkAYDUbvYKLiedZC+spbt28thg25ED6ZWGA1qJ1I5OZ8G9O80v3i0Jcf
 8G16nXeBtxumvz/Ga01S/c7jdCrXm46+qzubf3nET2di+5o+THDorSeg3nySfOz5
 L7uPnm3CV5jYFQN/kVg2an4mO5uLipHzbZSYN1l44lTPt8IUDDEu4DIbpjKgdFGT
 jZE0uRSscsLsGmrui1Bg65KO2cG1rz18Gp+RVeJoNZn7WRW+tjM=
 =zRzv
 -----END PGP SIGNATURE-----

Merge tag 'xfs-5.16-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs updates from Darrick Wong:
 "This cycle we've worked on fixing bugs and improving XFS' memory
  footprint.

  The most notable fixes include: fixing a corruption warning (and free
  space accounting skew) if copy on write fails; fixing slab cache
  misuse if SLOB is enabled, which apparently was broken for years
  without anybody noticing; and fixing a potential race with online
  shrinkfs.

  Otherwise, the bulk of the changes here involve setting up separate
  slab caches for frequently used items such as btree cursors and log
  intent items, and compacting the structures to reduce memory usage of
  those items substantially. This also sets us up to support larger
  btrees in future kernels. We also switch parts of online fsck to
  allocate scrub context information from the heap instead of using
  stack space.

  Summary:

   - Bug fixes and cleanups for kernel memory allocation usage, this
     time without touching the mm code.

   - Refactor the log recovery mechanism that preserves held resources
     across a transaction roll so that it uses the exact same mechanism
     that we use for that during regular runtime.

   - Fix bugs and tighten checking around btree heights.

   - Remove more old typedefs.

   - Fix perag reference leaks when racing with growfs.

   - Remove unused fields from xfs_btree_cur.

   - Allocate various scrub structures on the heap to reduce stack
     usage.

   - Pack xfs_btree_cur fields and rearrange to support arbitrary
     heights.

   - Compute maximum possible heights for each btree height, and use
     that to set up slab caches for each btree type.

   - Finally remove kmem_zone_t, since these have always been struct
     kmem_cache on Linux.

   - Compact the structures used to coordinate work intent items.

   - Set up slab caches for each work intent item type.

   - Rename the "bmap_add_free" function to "free_extent_later", which
     more accurately describes what it does.

   - Fix corruption warning on unmount when a CoW preallocation covers a
     data fork delalloc reservation but then the CoW fails.

   - Add some more minor code improvements"

* tag 'xfs-5.16-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (45 commits)
  xfs: use swap() to make code cleaner
  xfs: Remove duplicated include in xfs_super
  xfs: punch out data fork delalloc blocks on COW writeback failure
  xfs: remove unused parameter from refcount code
  xfs: reduce the size of struct xfs_extent_free_item
  xfs: rename xfs_bmap_add_free to xfs_free_extent_later
  xfs: create slab caches for frequently-used deferred items
  xfs: compact deferred intent item structures
  xfs: rename _zone variables to _cache
  xfs: remove kmem_zone typedef
  xfs: use separate btree cursor cache for each btree type
  xfs: compute absolute maximum nlevels for each btree type
  xfs: kill XFS_BTREE_MAXLEVELS
  xfs: compute the maximum height of the rmap btree when reflink enabled
  xfs: clean up xfs_btree_{calc_size,compute_maxlevels}
  xfs: compute maximum AG btree height for critical reservation calculation
  xfs: rename m_ag_maxlevels to m_allocbt_maxlevels
  xfs: dynamically allocate cursors based on maxlevels
  xfs: encode the max btree height in the cursor
  xfs: refactor btree cursor allocation function
  ...
2021-11-02 12:42:56 -07:00
Linus Torvalds
a64a325bf6 AFS changes
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAmGBFjQACgkQ+7dXa6fL
 C2vuFhAAiJcYj/qiY9P6OMt593ppDlmrHCt7JzXVtnrVdX5RX0ZgvaG93w3DhPKx
 /edBe/Jz+0FEGk3T+6fLyG9NZRmn8xGW9i+fhA88Vy6x1b1+pv2fG7lr/H5F6fg2
 JC8dwa4nZdpkDzlfZImwseRJYyA5io/wjMva1BzJXXBXX7Y5chCQmFChrkyRzrX0
 Gbj1SMGamDfbu/5OmxK04HDCPaQJ9irWxh/L9/cUcFXYeTYacDF/GwoNTV1pji3Y
 Mg6qAiYd/P8c4zHG4ZjhtW1+d2wzuyu+KKKo32kPVM/Z1QWepJ48PcsVmnt67l0R
 l18lzOu7i/S0vyL9sS+CXIgKcApyofYsm8X/M5zDC2NQDXa+wNSS/gEsf+waApxr
 UtiDEjxDTm0uccHB1VhR4QMx7DrnYp0FBFMp1jUlIu2mMTdAaillzyAxcnGZUwTW
 OMTZZxaLpKmvYUpZ9liQLU0Rcv2eQ2y+eI6RifQ4UpOEosSru5CcyxSdfKB/veHk
 AkmqqqGRg+obSJoZxBMeTZD6SF0u/BrfBJAsS4d2UaNGkNlPXNi8GUUHsOurSPrh
 Z9JqLdCf8u5/hpKQwNPStbjIS6M0KYvkdNuXnLrxbNuk2rgMg8/AraIV0kTTASSt
 WV3L+WhNF+iJfVzQEfPQ8UJ7VQfgofdHhF5X1IdiznTe32DxojU=
 =ax6b
 -----END PGP SIGNATURE-----

Merge tag 'afs-next-20211102' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull AFS updates from David Howells:

 - Split the readpage handler for symlinks from the one for files. The
   symlink readpage isn't given a file pointer, so the handling has to
   be special-cased.

   This has been posted as part of a patchset to foliate netfs, afs,
   etc.[1] but I've moved it to this one as it's not actually doing
   foliation but is more of a pre-cleanup.

 - Fix file creation to set the mtime from the client's clock to keep
   make happy if the server's clock isn't quite in sync.[2]

Link: https://lore.kernel.org/r/163005742570.2472992.7800423440314043178.stgit@warthog.procyon.org.uk/ [1]
Link: http://lists.infradead.org/pipermail/linux-afs/2021-October/004395.html [2]

* tag 'afs-next-20211102' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  afs: Set mtime from the client for yfs create operations
  afs: Sort out symlink reading
2021-11-02 12:39:57 -07:00
Linus Torvalds
78805cbe5d Changes in gfs2:
* Fix a locking order inversion between the inode and iopen glocks in
   gfs2_inode_lookup.
 
 * Implement proper queuing of glock holders for glocks that require
   instantiation (like reading an inode or bitmap blocks from disk).
   Before, multiple glock holders could race with each other and
   half-initialized objects could be exposed; the GL_SKIP flag further
   exacerbated this problem.
 
 * Fix a rare deadlock between inode lookup / creation and remote delete
   work.
 
 * Fix a rare scheduling-while-atomic bug in dlm during glock hash table
   walks.
 
 * Various other minor fixes and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEEJZs3krPW0xkhLMTc1b+f6wMTZToFAmGBThoUHGFncnVlbmJh
 QHJlZGhhdC5jb20ACgkQ1b+f6wMTZTp4vg/9HQbZYtl1R7qWlzaEAOHaZzudcRIS
 wnCGGxqdySdWogXhBCICwyH7nJneFG1XsYvNF8HmH10ALjZkjWM3MV0J9O00eCWk
 VXKqcioRbWtyQiye3A3icT6qhVGd65LK+p6GqtH7pQXOYc1ein/Gi/IWHFN6wCTM
 FXulFAo6i3Uep4lfz9+WPy8iGDVVLUWt0uhmY8O+W8edsDJdX7Kr89mQU/2dUsMp
 BxImDvKcchd8SlWOHNJ2WrbfPVFLd3mgmouojxn7/0pspqtJA6tgOWpAmN0uKw+V
 Qaqb9g0KjrAnH39w1wSzlN9XCItOvT3EGg1HEkl1kx5UDi4S/B9yF0wWbzfI2BUL
 9T0dyAbmIcamcHua+rQatTfBQnEOScNfZQKd9MHE4etLhVyE2fHjmx2Ya4xcmiy/
 /onUEcfjcQvzVY+69hD9cqKwwgTz5G2xyLzD7WdWJD10qU4z22arOSIedmC2c1Eq
 62cbaxcHXrDkG8FYHb/ukNeZ8Fw54niWnBCXDBfRxpx/UbJDzvNxjF2BPNJEBWON
 UFO8cbMVMojrsLkkWKTInecCLjhROwgWXUo8A1CmYQNlMUhObWts2jGoRlUYBUcH
 8Ah+DADcz9H8VV2koURo6ZkJyJBztgJOZ+ysFZSdPQ2HQnijwWqi8Zn33UfTZH6d
 BxxPK+wVfPNZSWQ=
 =sAlR
 -----END PGP SIGNATURE-----

Merge tag 'gfs2-v5.15-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 updates from Andreas Gruenbacher:

 - Fix a locking order inversion between the inode and iopen glocks in
   gfs2_inode_lookup.

 - Implement proper queuing of glock holders for glocks that require
   instantiation (like reading an inode or bitmap blocks from disk).
   Before, multiple glock holders could race with each other and
   half-initialized objects could be exposed; the GL_SKIP flag further
   exacerbated this problem.

 - Fix a rare deadlock between inode lookup / creation and remote delete
   work.

 - Fix a rare scheduling-while-atomic bug in dlm during glock hash table
   walks.

 - Various other minor fixes and cleanups.

* tag 'gfs2-v5.15-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: (21 commits)
  gfs2: Fix unused value warning in do_gfs2_set_flags()
  gfs2: check context in gfs2_glock_put
  gfs2: Fix glock_hash_walk bugs
  gfs2: Cancel remote delete work asynchronously
  gfs2: set glock object after nq
  gfs2: remove RDF_UPTODATE flag
  gfs2: Eliminate GIF_INVALID flag
  gfs2: fix GL_SKIP node_scope problems
  gfs2: split glock instantiation off from do_promote
  gfs2: further simplify do_promote
  gfs2: re-factor function do_promote
  gfs2: Remove 'first' trace_gfs2_promote argument
  gfs2: change go_lock to go_instantiate
  gfs2: dump glocks from gfs2_consist_OBJ_i
  gfs2: dequeue iopen holder in gfs2_inode_lookup error
  gfs2: Save ip from gfs2_glock_nq_init
  gfs2: Allow append and immutable bits to coexist
  gfs2: Switch some BUG_ON to GLOCK_BUG_ON for debug
  gfs2: move GL_SKIP check from glops to do_promote
  gfs2: Add GL_SKIP holder flag to dump_holder
  ...
2021-11-02 12:35:04 -07:00