2021-06-18 08:31:49 +03:00
/* SPDX-License-Identifier: LGPL-2.1 */
2005-04-17 02:20:36 +04:00
/*
*
2008-01-25 13:12:41 +03:00
* Copyright (C) International Business Machines Corp., 2002, 2008
2005-04-17 02:20:36 +04:00
* Author(s): Steve French (sfrench@us.ibm.com)
2006-08-03 01:56:33 +04:00
* Jeremy Allison (jra@samba.org)
2005-04-17 02:20:36 +04:00
*
*/
2010-06-22 19:22:50 +04:00
# ifndef _CIFS_GLOB_H
# define _CIFS_GLOB_H
2005-04-17 02:20:36 +04:00
# include <linux/in.h>
# include <linux/in6.h>
2021-02-21 04:24:11 +03:00
# include <linux/inet.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the following.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 11:04:11 +03:00
# include <linux/slab.h>
cifs: fix oops during encryption
When running xfstests against Azure the following oops occurred on an
arm64 system
Unable to handle kernel write to read-only memory at virtual address
ffff0001221cf000
Mem abort info:
ESR = 0x9600004f
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x0f: level 3 permission fault
Data abort info:
ISV = 0, ISS = 0x0000004f
CM = 0, WnR = 1
swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000294f3000
[ffff0001221cf000] pgd=18000001ffff8003, p4d=18000001ffff8003,
pud=18000001ff82e003, pmd=18000001ff71d003, pte=00600001221cf787
Internal error: Oops: 9600004f [#1] PREEMPT SMP
...
pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
pc : __memcpy+0x40/0x230
lr : scatterwalk_copychunks+0xe0/0x200
sp : ffff800014e92de0
x29: ffff800014e92de0 x28: ffff000114f9de80 x27: 0000000000000008
x26: 0000000000000008 x25: ffff800014e92e78 x24: 0000000000000008
x23: 0000000000000001 x22: 0000040000000000 x21: ffff000000000000
x20: 0000000000000001 x19: ffff0001037c4488 x18: 0000000000000014
x17: 235e1c0d6efa9661 x16: a435f9576b6edd6c x15: 0000000000000058
x14: 0000000000000001 x13: 0000000000000008 x12: ffff000114f2e590
x11: ffffffffffffffff x10: 0000040000000000 x9 : ffff8000105c3580
x8 : 2e9413b10000001a x7 : 534b4410fb86b005 x6 : 534b4410fb86b005
x5 : ffff0001221cf008 x4 : ffff0001037c4490 x3 : 0000000000000001
x2 : 0000000000000008 x1 : ffff0001037c4488 x0 : ffff0001221cf000
Call trace:
__memcpy+0x40/0x230
scatterwalk_map_and_copy+0x98/0x100
crypto_ccm_encrypt+0x150/0x180
crypto_aead_encrypt+0x2c/0x40
crypt_message+0x750/0x880
smb3_init_transform_rq+0x298/0x340
smb_send_rqst.part.11+0xd8/0x180
smb_send_rqst+0x3c/0x100
compound_send_recv+0x534/0xbc0
smb2_query_info_compound+0x32c/0x440
smb2_set_ea+0x438/0x4c0
cifs_xattr_set+0x5d4/0x7c0
This is because in scatterwalk_copychunks(), we attempted to write to
a buffer (@sign) that was allocated in the stack (vmalloc area) by
crypt_message() and thus accessing its remaining 8 (x2) bytes ended up
crossing a page boundary.
To simply fix it, we could just pass @sign kmalloc'd from
crypt_message() and then we're done. Luckily, we don't seem to pass
any other vmalloc'd buffers in smb_rqst::rq_iov...
Instead, let's map the correct pages and offsets from vmalloc buffers
as well in cifs_sg_set_buf() and thus avoid such oopses.
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
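The fix described above amounts to walking a buffer one page at a time, so that no single scatterlist entry crosses a page boundary. The following is a minimal userspace sketch of that splitting arithmetic only, not the kernel's cifs_sg_set_buf(): the names split_by_page, struct chunk and SKETCH_PAGE_SIZE are hypothetical, and 4 KiB pages are assumed (matching the "4k pages" line in the oops).

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* hypothetical stand-in for one scatterlist entry */
struct chunk {
	uintptr_t page;		/* page-aligned base address */
	unsigned int off;	/* offset of the data within that page */
	unsigned int len;	/* bytes of the buffer that live in that page */
};

/*
 * Split [addr, addr + buflen) into per-page chunks.
 * Writes at most max entries and returns how many were written.
 */
static size_t split_by_page(uintptr_t addr, unsigned int buflen,
			    struct chunk *out, size_t max)
{
	size_t n = 0;
	unsigned int off = (unsigned int)(addr & (SKETCH_PAGE_SIZE - 1));
	uintptr_t page = addr - off;

	while (buflen && n < max) {
		unsigned int room = SKETCH_PAGE_SIZE - off;
		unsigned int len = buflen < room ? buflen : room;

		out[n].page = page;
		out[n].off = off;
		out[n].len = len;
		n++;

		/* every page after the first starts at offset 0 */
		off = 0;
		page += SKETCH_PAGE_SIZE;
		buflen -= len;
	}
	return n;
}
```

With a 16-byte buffer that starts 8 bytes before a page boundary, the sketch yields two 8-byte chunks, one per page, which is the situation the oops describes for the remaining 8 bytes of @sign.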
2022-12-12 00:18:55 +03:00
# include <linux/scatterlist.h>
# include <linux/mm.h>
2011-12-26 22:53:34 +04:00
# include <linux/mempool.h>
2010-07-21 00:09:02 +04:00
# include <linux/workqueue.h>
2021-11-05 22:03:57 +03:00
# include <linux/utsname.h>
2022-06-01 08:03:18 +03:00
# include <linux/sched/mm.h>
2021-06-30 00:37:05 +03:00
# include <linux/netfs.h>
2005-04-17 02:20:36 +04:00
# include "cifs_fs_sb.h"
2007-09-25 00:25:46 +04:00
# include "cifsacl.h"
2010-10-21 23:25:08 +04:00
# include <crypto/internal/hash.h>
2013-08-09 16:47:17 +04:00
# include <uapi/linux/cifs/cifs_mount.h>
2021-11-05 02:39:01 +03:00
# include "../smbfs_common/smb2pdu.h"
2012-05-28 15:19:39 +04:00
# include "smb2pdu.h"
2010-10-21 23:25:08 +04:00
2018-10-16 22:47:58 +03:00
# define SMB_PATH_MAX 260
2018-06-14 16:43:16 +03:00
# define CIFS_PORT 445
# define RFC1001_PORT 139
2005-04-17 02:20:36 +04:00
/*
* The sizes of various internal tables and strings
*/
# define MAX_UID_INFO 16
# define MAX_SES_INFO 2
# define MAX_TCON_INFO 4
2013-08-09 16:47:19 +04:00
# define MAX_TREE_SIZE (2 + CIFS_NI_MAXHOST + 1 + CIFS_MAX_SHARE_LEN + 1)
2005-04-17 02:20:36 +04:00
# define CIFS_MIN_RCV_POOL 4
2012-05-21 18:20:12 +04:00
# define MAX_REOPEN_ATT 5 /* maximum number of attempts to reopen a file */
2010-12-01 12:12:28 +03:00
/*
* default attribute cache timeout (jiffies)
*/
# define CIFS_DEF_ACTIMEO (1 * HZ)
/*
* max attribute cache timeout (jiffies) - 2^30
*/
# define CIFS_MAX_ACTIMEO (1 << 30)
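The two limits above are expressed in jiffies, so a timeout given in seconds has to be scaled by HZ before it is compared against the maximum. A minimal userspace sketch of that conversion follows; HZ is assumed to be 100 here, and actimeo_to_jiffies is a hypothetical helper, not a function from this header.

```c
#define HZ 100				/* assumption: tick rate for this sketch */
#define CIFS_DEF_ACTIMEO (1 * HZ)
#define CIFS_MAX_ACTIMEO (1 << 30)

/*
 * hypothetical helper: convert seconds to jiffies,
 * returning -1 when the result would exceed CIFS_MAX_ACTIMEO
 */
static long actimeo_to_jiffies(unsigned long secs)
{
	if (secs > (unsigned long)CIFS_MAX_ACTIMEO / HZ)
		return -1;
	return (long)(secs * HZ);
}
```

Checking the bound before multiplying avoids overflowing the product for very large inputs.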
2019-03-30 00:31:07 +03:00
/*
* Max persistent and resilient handle timeout (milliseconds).
* Windows durable max was 960000 (16 minutes)
*/
# define SMB3_MAX_HANDLE_TIMEOUT 960000
2005-04-17 02:20:36 +04:00
/*
* MAX_REQ is the maximum number of requests that WE will send
2012-03-20 13:55:09 +04:00
* on one socket concurrently.
2005-04-17 02:20:36 +04:00
*/
2012-03-20 13:55:09 +04:00
# define CIFS_MAX_REQ 32767
2005-04-17 02:20:36 +04:00
2008-12-01 23:23:50 +03:00
# define RFC1001_NAME_LEN 15
# define RFC1001_NAME_LEN_WITH_NULL (RFC1001_NAME_LEN + 1)
2018-01-24 15:46:10 +03:00
/* maximum length of ip addr as a string (including ipv6 and sctp) */
# define SERVER_NAME_LENGTH 80
2005-04-17 02:20:36 +04:00
# define SERVER_NAME_LEN_WITH_NULL (SERVER_NAME_LENGTH + 1)
2015-12-18 21:31:36 +03:00
/* echo interval in seconds */
# define SMB_ECHO_INTERVAL_MIN 1
# define SMB_ECHO_INTERVAL_MAX 600
# define SMB_ECHO_INTERVAL_DEFAULT 60
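The three echo interval constants above define an inclusive range and a default inside it. A minimal sketch of a range check is shown below; echo_interval_valid is a hypothetical name, and how the real mount option parser reports an out-of-range value is not covered here.

```c
#define SMB_ECHO_INTERVAL_MIN 1
#define SMB_ECHO_INTERVAL_MAX 600
#define SMB_ECHO_INTERVAL_DEFAULT 60

/* hypothetical helper: nonzero when secs lies within [MIN, MAX] */
static int echo_interval_valid(unsigned int secs)
{
	return secs >= SMB_ECHO_INTERVAL_MIN &&
	       secs <= SMB_ECHO_INTERVAL_MAX;
}
```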
2012-07-12 18:30:44 +04:00
2022-06-06 12:17:56 +03:00
/* smb multichannel query server interfaces interval in seconds */
# define SMB_INTERFACE_POLL_INTERVAL 600
2018-08-08 08:07:45 +03:00
/* maximum number of PDUs in one compound */
# define MAX_COMPOUND 5
2016-09-23 08:44:16 +03:00
/*
* Default number of credits to keep available for SMB3.
* This value is chosen somewhat arbitrarily. The Windows client
* defaults to 128 credits, the Windows server allows clients up to
* 512 credits (or 8K for later versions), and the NetApp server
* does not limit clients at all. Choose a high enough default value
* such that the client shouldn't limit performance, but allow mount
* to override (until you approach 64K, where we limit credits to 65000
* to reduce possibility of seeing more server credit overflow bugs).
*/
# define SMB2_MAX_CREDITS_AVAILABLE 32000
2005-04-17 02:20:36 +04:00
# include "cifspdu.h"
# ifndef XATTR_DOS_ATTRIB
# define XATTR_DOS_ATTRIB "user.DOSATTRIB"
# endif
2021-11-05 22:03:57 +03:00
# define CIFS_MAX_WORKSTATION_LEN (__NEW_UTS_LEN + 1) /* reasonable max for client */
2022-12-13 07:23:16 +03:00
# define CIFS_DFS_ROOT_SES(ses) ((ses)->dfs_root_ses ?: (ses))
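CIFS_DFS_ROOT_SES relies on the conditional operator with an omitted middle operand, which is a GNU C extension accepted by GCC and Clang: `a ?: b` yields `a` when it is non-null and `b` otherwise, evaluating `a` only once. A minimal sketch, using a hypothetical one-field stand-in (struct ses_sketch) for struct cifs_ses and a hypothetical portable equivalent:

```c
#include <stddef.h>

/* hypothetical stand-in: only the field the macro touches */
struct ses_sketch {
	struct ses_sketch *dfs_root_ses;
};

/* same shape as the macro in this header (GNU C extension) */
#define CIFS_DFS_ROOT_SES(ses) ((ses)->dfs_root_ses ?: (ses))

/* portable equivalent, written out with the full ternary */
static struct ses_sketch *dfs_root_ses_portable(struct ses_sketch *ses)
{
	return ses->dfs_root_ses ? ses->dfs_root_ses : ses;
}
```

A session with no DFS root resolves to itself, so callers can use the result without a separate null check.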
2005-04-17 02:20:36 +04:00
/*
* CIFS vfs client Status information (based on what we know.)
*/
2022-04-07 16:15:49 +03:00
/* associated with each connection */
2005-04-17 02:20:36 +04:00
enum statusEnum {
CifsNew = 0 ,
CifsGood ,
CifsExiting ,
2011-04-12 05:01:14 +04:00
CifsNeedReconnect ,
2021-07-19 20:37:52 +03:00
CifsNeedNegotiate ,
CifsInNegotiate ,
2022-04-07 16:15:49 +03:00
} ;
/* associated with each smb session */
enum ses_status_enum {
SES_NEW = 0 ,
SES_GOOD ,
SES_EXITING ,
SES_NEED_RECON ,
SES_IN_SETUP
smb3: cleanup and clarify status of tree connections
Currently the way the tid (tree connection) status is tracked
is confusing. The same enum is used for structs cifs_tcon
and cifs_ses and TCP_Server_info, but each of these three has
different states that they transition among. The current
code also unnecessarily uses camelCase.
Convert from use of statusEnum to a new tid_status_enum for
tree connections. The valid states for a tid are:
TID_NEW = 0,
TID_GOOD,
TID_EXITING,
TID_NEED_RECON,
TID_NEED_TCON,
TID_IN_TCON,
TID_NEED_FILES_INVALIDATE, /* unused, considering removing in future */
TID_IN_FILES_INVALIDATE
It also removes CifsNeedTcon, CifsInTcon, CifsNeedFilesInvalidate and
CifsInFilesInvalidate from the statusEnum used for session and
TCP_Server_Info since they are not relevant for those.
A follow on patch will fix the places where we use the
tcon->need_reconnect flag to be more consistent with the tid->status.
Also fixes a bug that was:
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-03-28 00:07:30 +03:00
} ;
/* associated with each tree connection to the server */
enum tid_status_enum {
TID_NEW = 0 ,
TID_GOOD ,
TID_EXITING ,
TID_NEED_RECON ,
TID_NEED_TCON ,
TID_IN_TCON ,
TID_NEED_FILES_INVALIDATE , /* currently unused */
TID_IN_FILES_INVALIDATE
2005-04-17 02:20:36 +04:00
} ;
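When debugging, it can help to print these states by name rather than by number. The following is a minimal standalone sketch that repeats the enum so it compiles on its own; tid_status_name is a hypothetical helper and is not part of this header.

```c
enum tid_status_enum {
	TID_NEW = 0,
	TID_GOOD,
	TID_EXITING,
	TID_NEED_RECON,
	TID_NEED_TCON,
	TID_IN_TCON,
	TID_NEED_FILES_INVALIDATE,	/* currently unused */
	TID_IN_FILES_INVALIDATE
};

/* hypothetical helper: map a tree connection status to its name */
static const char *tid_status_name(enum tid_status_enum status)
{
	switch (status) {
	case TID_NEW:			return "TID_NEW";
	case TID_GOOD:			return "TID_GOOD";
	case TID_EXITING:		return "TID_EXITING";
	case TID_NEED_RECON:		return "TID_NEED_RECON";
	case TID_NEED_TCON:		return "TID_NEED_TCON";
	case TID_IN_TCON:		return "TID_IN_TCON";
	case TID_NEED_FILES_INVALIDATE:	return "TID_NEED_FILES_INVALIDATE";
	case TID_IN_FILES_INVALIDATE:	return "TID_IN_FILES_INVALIDATE";
	}
	return "unknown";
}
```

Using a switch without a default case lets the compiler warn if a new state is added to the enum but not to the helper.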
enum securityEnum {
2013-05-26 15:00:58 +04:00
Unspecified = 0 , /* not specified */
2005-04-17 02:20:36 +04:00
NTLMv2 , /* Legacy NTLM auth with NTLMv2 hash */
2009-05-06 08:16:04 +04:00
RawNTLMSSP , /* NTLMSSP without SPNEGO, NTLMv2 hash */
2008-08-19 23:35:33 +04:00
Kerberos , /* Kerberos via SPNEGO */
2005-04-17 02:20:36 +04:00
} ;
2010-09-19 07:01:58 +04:00
struct session_key {
2007-07-09 11:55:14 +04:00
unsigned int len ;
2010-10-21 15:42:55 +04:00
char * response ;
2007-07-09 11:55:14 +04:00
} ;
2010-10-27 03:10:24 +04:00
/* crypto hashing related structure/fields, not specific to a sec mech */
2010-10-21 23:25:08 +04:00
struct cifs_secmech {
2022-09-29 23:36:50 +03:00
struct shash_desc * hmacmd5 ; /* hmacmd5 hash function, for NTLMv2/CR1 hashes */
struct shash_desc * md5 ; /* md5 hash function, for CIFS/SMB1 signatures */
struct shash_desc * hmacsha256 ; /* hmac-sha256 hash function, for SMB2 signatures */
struct shash_desc * sha512 ; /* sha512 hash function, for SMB3.1.1 preauth hash */
struct shash_desc * aes_cmac ; /* block-cipher based MAC function, for SMB3 signatures */
struct crypto_aead * enc ; /* smb3 encryption AEAD TFM (AES-CCM and AES-GCM) */
struct crypto_aead * dec ; /* smb3 decryption AEAD TFM (AES-CCM and AES-GCM) */
2010-10-21 23:25:08 +04:00
} ;
2010-10-28 18:53:07 +04:00
/* per smb session structure/fields */
2010-10-21 23:25:08 +04:00
struct ntlmssp_auth {
2013-08-29 17:35:10 +04:00
bool sesskey_per_smbsess ; /* whether session key is per smb session */
2010-10-21 23:25:08 +04:00
__u32 client_flags ; /* sent by client in type 1 ntlmssp exchange */
__u32 server_flags ; /* sent by server in type 2 ntlmssp exchange */
unsigned char ciphertext [ CIFS_CPHTXT_SIZE ] ; /* sent to server */
2010-10-28 18:53:07 +04:00
char cryptkey [ CIFS_CRYPTO_KEY_SIZE ] ; /* used by ntlmssp */
2010-10-21 23:25:08 +04:00
} ;
2007-09-25 00:25:46 +04:00
struct cifs_cred {
int uid ;
int gid ;
int mode ;
int cecount ;
struct cifs_sid osid ;
struct cifs_sid gsid ;
struct cifs_ntace * ntaces ;
struct cifs_ace * aces ;
} ;
2022-10-04 00:43:50 +03:00
struct cifs_open_info_data {
char * symlink_target ;
union {
struct smb2_file_all_info fi ;
struct smb311_posix_qinfo posix_fi ;
} ;
} ;
static inline void cifs_free_open_info(struct cifs_open_info_data *data)
{
	kfree(data->symlink_target);
}
2005-04-17 02:20:36 +04:00
/*
*****************************************************************
* Except the CIFS PDUs themselves all the
* globally interesting structs should go here
*****************************************************************
*/
2012-09-19 03:20:34 +04:00
/*
* A smb_rqst represents a complete request to be issued to a server. It's
* formed by a kvec array, followed by an array of pages. Page data is assumed
* to start at the beginning of the first page.
*/
struct smb_rqst {
struct kvec * rq_iov ; /* array of kvecs */
unsigned int rq_nvec ; /* number of kvecs in array */
cifs: Change the I/O paths to use an iterator rather than a page list
Currently, the cifs I/O paths hand lists of pages from the VM interface
routines at the top all the way through the intervening layers to the
socket interface at the bottom.
This is a problem, however, for interfacing with netfslib which passes an
iterator through to the ->issue_read() method (and will pass an iterator
through to the ->issue_write() method in future). Netfslib takes over
bounce buffering for direct I/O, async I/O and encrypted content, so cifs
doesn't need to do that. Netfslib also converts IOVEC-type iterators into
BVEC-type iterators if necessary.
Further, cifs needs foliating - and folios may come in a variety of sizes,
so a page list pointing to an array of heterogeneous pages may cause
problems in places such as where crypto is done.
Change the cifs I/O paths to hand iov_iter iterators all the way through
instead.
Notes:
(1) Some old routines are #if'd out to be removed in a follow up patch so
as to avoid confusing diff, thereby making the diff output easier to
follow. I've removed functions that don't overlap with anything
added.
(2) struct smb_rqst loses rq_pages, rq_offset, rq_npages, rq_pagesz and
rq_tailsz which describe the pages forming the buffer; instead there's
an rq_iter describing the source buffer and an rq_buffer which is used
to hold the buffer for encryption.
(3) struct cifs_readdata and cifs_writedata are similarly modified to
smb_rqst. The ->read_into_pages() and ->copy_into_pages() are then
replaced with passing the iterator directly to the socket.
The iterators are stored in these structs so that they are persistent
and don't get deallocated when the function returns (unlike if they
were stack variables).
(4) Buffered writeback is overhauled, borrowing the code from the afs
filesystem to gather up contiguous runs of folios. The XARRAY-type
iterator is then used to refer directly to the pagecache and can be
passed to the socket to transmit data directly from there.
This includes:
cifs_extend_writeback()
cifs_write_back_from_locked_folio()
cifs_writepages_region()
cifs_writepages()
(5) Pages are converted to folios.
(6) Direct I/O uses netfs_extract_user_iter() to create a BVEC-type
iterator from an IOBUF/UBUF-type source iterator.
(7) smb2_get_aead_req() uses netfs_extract_iter_to_sg() to extract page
fragments from the iterator into the scatterlists that the crypto
layer prefers.
(8) smb2_init_transform_rq() attached pages to smb_rqst::rq_buffer, an
xarray, to use as a bounce buffer for encryption. An XARRAY-type
iterator can then be used to pass the bounce buffer to lower layers.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <sfrench@samba.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Paulo Alcantara <pc@cjr.nz>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
Link: https://lore.kernel.org/r/164311907995.2806745.400147335497304099.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/164928620163.457102.11602306234438271112.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165211420279.3154751.15923591172438186144.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165348880385.2106726.3220789453472800240.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165364827111.3334034.934805882842932881.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/166126396180.708021.271013668175370826.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/166697259595.61150.5982032408321852414.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/166732031756.3186319.12528413619888902872.stgit@warthog.procyon.org.uk/ # rfc
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-01-25 00:13:24 +03:00
size_t rq_iter_size ; /* Amount of data in ->rq_iter */
struct iov_iter rq_iter ; /* Data iterator */
struct xarray rq_buffer ; /* Page buffer for encryption */
2012-09-19 03:20:34 +04:00
} ;
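A request described this way has two parts: the kvec array and the data covered by rq_iter. The sketch below adds up both to get a total byte count. It is a userspace illustration with hypothetical stand-in types (struct kvec_sketch, struct smb_rqst_sketch) and a hypothetical helper (rqst_total_len); the kernel's own length calculation may treat individual kvecs differently depending on the protocol version.

```c
#include <stddef.h>

/* hypothetical stand-in for struct kvec */
struct kvec_sketch {
	void *iov_base;
	size_t iov_len;
};

/* hypothetical stand-in holding only the fields summed below */
struct smb_rqst_sketch {
	struct kvec_sketch *rq_iov;	/* array of kvecs */
	unsigned int rq_nvec;		/* number of kvecs in array */
	size_t rq_iter_size;		/* amount of data in the iterator */
};

/* hypothetical helper: bytes in all kvecs plus the iterator payload */
static size_t rqst_total_len(const struct smb_rqst_sketch *rqst)
{
	size_t len = rqst->rq_iter_size;
	unsigned int i;

	for (i = 0; i < rqst->rq_nvec; i++)
		len += rqst->rq_iov[i].iov_len;
	return len;
}
```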
2012-05-15 20:21:10 +04:00
struct mid_q_entry ;
struct TCP_Server_Info ;
2012-02-28 15:04:17 +04:00
struct cifsFileInfo ;
2012-05-17 12:18:21 +04:00
struct cifs_ses ;
2012-05-25 11:11:39 +04:00
struct cifs_tcon ;
2012-05-27 20:21:53 +04:00
struct dfs_info3_param ;
2012-05-27 17:34:43 +04:00
struct cifs_fattr ;
2020-12-10 08:07:12 +03:00
struct smb3_fs_context ;
2012-09-19 03:20:26 +04:00
struct cifs_fid ;
2012-09-19 03:20:28 +04:00
struct cifs_readdata ;
2012-09-19 03:20:29 +04:00
struct cifs_writedata ;
2012-09-19 03:20:29 +04:00
struct cifs_io_parms ;
2012-09-19 03:20:32 +04:00
struct cifs_search_info ;
2012-09-19 03:20:33 +04:00
struct cifsInodeInfo ;
2013-07-05 12:00:30 +04:00
struct cifs_open_parms ;
2019-01-16 22:12:41 +03:00
struct cifs_credits ;
2012-05-15 20:21:10 +04:00
2012-05-15 20:20:51 +04:00
struct smb_version_operations {
2016-11-24 02:08:14 +03:00
int ( * send_cancel ) ( struct TCP_Server_Info * , struct smb_rqst * ,
2012-05-15 20:21:10 +04:00
struct mid_q_entry * ) ;
2012-02-28 15:04:17 +04:00
bool ( * compare_fids ) ( struct cifsFileInfo * , struct cifsFileInfo * ) ;
2012-05-17 12:18:21 +04:00
/* setup request: allocate mid, sign message */
2012-09-19 03:20:35 +04:00
struct mid_q_entry * ( * setup_request ) ( struct cifs_ses * ,
2019-09-20 07:08:34 +03:00
struct TCP_Server_Info * ,
struct smb_rqst * ) ;
2012-06-01 14:26:18 +04:00
/* setup async request: allocate mid, sign message */
2012-09-19 03:20:35 +04:00
struct mid_q_entry * ( * setup_async_request ) ( struct TCP_Server_Info * ,
struct smb_rqst * ) ;
2012-05-17 12:18:21 +04:00
/* check response: verify signature, map error */
int ( * check_receive ) ( struct mid_q_entry * , struct TCP_Server_Info * ,
bool ) ;
2019-01-16 22:12:41 +03:00
void ( * add_credits ) ( struct TCP_Server_Info * server ,
const struct cifs_credits * credits ,
const int optype ) ;
2012-05-17 17:53:29 +04:00
void ( * set_credits ) ( struct TCP_Server_Info * , const int ) ;
2012-05-23 16:14:34 +04:00
int * ( * get_credits_field ) ( struct TCP_Server_Info * , const int ) ;
unsigned int ( * get_credits ) ( struct mid_q_entry * ) ;
2012-05-23 14:01:59 +04:00
__u64 ( * get_next_mid ) ( struct TCP_Server_Info * ) ;
2019-03-05 01:02:50 +03:00
void ( * revert_current_mid ) ( struct TCP_Server_Info * server ,
const unsigned int val ) ;
2012-05-17 13:02:51 +04:00
/* data offset from read response message */
unsigned int ( * read_data_offset ) ( char * ) ;
2017-11-23 03:38:46 +03:00
/*
* Data length from read response message
* When in_remaining is true, the returned data length is in
* message field DataRemaining for out-of-band data read (e.g. through
* Memory Registration RDMA write in SMBD).
* Otherwise, the returned data length is in message field DataLength.
*/
unsigned int ( * read_data_length ) ( char * , bool in_remaining ) ;
2012-05-17 13:02:51 +04:00
/* map smb to linux error */
int ( * map_error ) ( char * , bool ) ;
2012-05-17 13:25:35 +04:00
/* find mid corresponding to the response message */
struct mid_q_entry * ( * find_mid ) ( struct TCP_Server_Info * , char * ) ;
2018-04-22 23:45:53 +03:00
void ( * dump_detail ) ( void * buf , struct TCP_Server_Info * ptcp_info ) ;
2012-05-28 14:16:31 +04:00
void ( * clear_stats ) ( struct cifs_tcon * ) ;
void ( * print_stats ) ( struct seq_file * m , struct cifs_tcon * ) ;
2013-06-19 23:15:30 +04:00
void ( * dump_share_caps ) ( struct seq_file * , struct cifs_tcon * ) ;
2012-05-17 13:25:35 +04:00
/* verify the message */
2015-12-18 22:05:30 +03:00
int ( * check_message ) ( char * , unsigned int , struct TCP_Server_Info * ) ;
2012-05-17 13:25:35 +04:00
bool ( * is_oplock_break ) ( char * , struct TCP_Server_Info * ) ;
2021-03-08 18:00:50 +03:00
int ( * handle_cancelled_mid ) ( struct mid_q_entry * , struct TCP_Server_Info * ) ;
2019-10-30 02:51:19 +03:00
void ( * downgrade_oplock ) ( struct TCP_Server_Info * server ,
struct cifsInodeInfo * cinode , __u32 oplock ,
unsigned int epoch , bool * purge_cache ) ;
2012-05-23 14:31:03 +04:00
/* process transaction2 response */
bool ( * check_trans2 ) ( struct mid_q_entry * , struct TCP_Server_Info * ,
char * , int ) ;
2012-05-25 10:43:58 +04:00
/* check if we need to negotiate */
bool ( * need_neg ) ( struct TCP_Server_Info * ) ;
/* negotiate to the server */
2021-07-19 16:54:16 +03:00
int ( * negotiate ) ( const unsigned int xid ,
struct cifs_ses * ses ,
struct TCP_Server_Info * server ) ;
2012-09-19 03:20:28 +04:00
/* set negotiated write size */
2020-12-10 08:07:12 +03:00
unsigned int ( * negotiate_wsize ) ( struct cifs_tcon * tcon , struct smb3_fs_context * ctx ) ;
2012-09-19 03:20:28 +04:00
/* set negotiated read size */
2020-12-10 08:07:12 +03:00
unsigned int ( * negotiate_rsize ) ( struct cifs_tcon * tcon , struct smb3_fs_context * ctx ) ;
2012-05-25 10:54:49 +04:00
/* setup smb session */
int ( * sess_setup ) ( const unsigned int , struct cifs_ses * ,
2021-07-19 16:54:16 +03:00
struct TCP_Server_Info * server ,
2012-05-25 10:54:49 +04:00
const struct nls_table * ) ;
/* close smb session */
int ( * logoff ) ( const unsigned int , struct cifs_ses * ) ;
2012-05-25 11:11:39 +04:00
/* connect to a server share */
int ( * tree_connect ) ( const unsigned int , struct cifs_ses * , const char * ,
struct cifs_tcon * , const struct nls_table * ) ;
/* close tree connection */
int ( * tree_disconnect ) ( const unsigned int , struct cifs_tcon * ) ;
2012-05-27 20:21:53 +04:00
/* get DFS referrals */
int ( * get_dfs_refer ) ( const unsigned int , struct cifs_ses * ,
const char * , struct dfs_info3_param * * ,
unsigned int * , const struct nls_table * , int ) ;
2012-05-27 20:48:35 +04:00
/* informational QFS call */
2020-02-03 22:46:43 +03:00
void ( * qfs_tcon ) ( const unsigned int , struct cifs_tcon * ,
struct cifs_sb_info * ) ;
2012-05-25 14:40:22 +04:00
/* check if a path is accessible or not */
int ( * is_path_accessible ) ( const unsigned int , struct cifs_tcon * ,
struct cifs_sb_info * , const char * ) ;
2012-05-27 17:34:43 +04:00
/* query path data from the server */
2022-10-04 00:43:50 +03:00
int ( * query_path_info ) ( const unsigned int xid , struct cifs_tcon * tcon ,
struct cifs_sb_info * cifs_sb , const char * full_path ,
struct cifs_open_info_data * data , bool * adjust_tz , bool * reparse ) ;
2012-09-19 03:20:26 +04:00
/* query file data from the server */
2022-10-04 00:43:50 +03:00
int ( * query_file_info ) ( const unsigned int xid , struct cifs_tcon * tcon ,
struct cifsFileInfo * cfile , struct cifs_open_info_data * data ) ;
2020-10-23 06:03:14 +03:00
/* query reparse tag from srv to determine which type of special file */
int ( * query_reparse_tag ) ( const unsigned int xid , struct cifs_tcon * tcon ,
struct cifs_sb_info * cifs_sb , const char * path ,
__u32 * reparse_tag ) ;
2012-05-27 17:34:43 +04:00
/* get server index number */
2022-10-04 00:43:50 +03:00
int ( * get_srv_inum ) ( const unsigned int xid , struct cifs_tcon * tcon ,
struct cifs_sb_info * cifs_sb , const char * full_path , u64 * uniqueid ,
struct cifs_open_info_data * data ) ;
2012-09-19 03:20:31 +04:00
/* set size by path */
int ( * set_path_size ) ( const unsigned int , struct cifs_tcon * ,
const char * , __u64 , struct cifs_sb_info * , bool ) ;
/* set size by file handle */
int ( * set_file_size ) ( const unsigned int , struct cifs_tcon * ,
struct cifsFileInfo * , __u64 , bool ) ;
2012-09-19 03:20:32 +04:00
/* set attributes */
int ( * set_file_info ) ( struct inode * , const char * , FILE_BASIC_INFO * ,
const unsigned int ) ;
2013-10-15 00:31:32 +04:00
int ( * set_compression ) ( const unsigned int , struct cifs_tcon * ,
struct cifsFileInfo * ) ;
2012-05-25 14:47:16 +04:00
/* check if we can send an echo or not */
bool ( * can_echo ) ( struct TCP_Server_Info * ) ;
/* send echo request */
int ( * echo ) ( struct TCP_Server_Info * ) ;
2012-03-17 12:41:12 +04:00
/* create directory */
2018-06-15 05:56:32 +03:00
int ( * posix_mkdir ) ( const unsigned int xid , struct inode * inode ,
umode_t mode , struct cifs_tcon * tcon ,
const char * full_path ,
struct cifs_sb_info * cifs_sb ) ;
2019-09-25 08:32:13 +03:00
int ( * mkdir ) ( const unsigned int xid , struct inode * inode , umode_t mode ,
struct cifs_tcon * tcon , const char * name ,
struct cifs_sb_info * sb ) ;
2012-03-17 12:41:12 +04:00
/* set info on created directory */
void ( * mkdir_setinfo ) ( struct inode * , const char * ,
struct cifs_sb_info * , struct cifs_tcon * ,
const unsigned int ) ;
2012-07-10 16:14:18 +04:00
/* remove directory */
int ( * rmdir ) ( const unsigned int , struct cifs_tcon * , const char * ,
struct cifs_sb_info * ) ;
2012-09-19 03:20:25 +04:00
/* unlink file */
int ( * unlink ) ( const unsigned int , struct cifs_tcon * , const char * ,
struct cifs_sb_info * ) ;
/* open, rename and delete file */
int ( * rename_pending_delete ) ( const char * , struct dentry * ,
const unsigned int ) ;
2012-09-19 03:20:30 +04:00
/* send rename request */
int ( * rename ) ( const unsigned int , struct cifs_tcon * , const char * ,
const char * , struct cifs_sb_info * ) ;
2012-09-19 03:20:31 +04:00
/* send create hardlink request */
int ( * create_hardlink ) ( const unsigned int , struct cifs_tcon * ,
const char * , const char * ,
struct cifs_sb_info * ) ;
2013-08-14 19:25:21 +04:00
/* query symlink target */
int ( * query_symlink ) ( const unsigned int , struct cifs_tcon * ,
2019-04-10 01:44:46 +03:00
struct cifs_sb_info * , const char * ,
char * * , bool ) ;
2012-09-19 03:20:26 +04:00
/* open a file for non-posix mounts */
2022-10-04 00:43:50 +03:00
int ( * open ) ( const unsigned int xid , struct cifs_open_parms * oparms , __u32 * oplock ,
void * buf ) ;
2012-09-19 03:20:26 +04:00
/* set fid protocol-specific info */
void ( * set_fid ) ( struct cifsFileInfo * , struct cifs_fid * , __u32 ) ;
2012-09-19 03:20:26 +04:00
/* close a file */
2012-09-25 11:00:07 +04:00
void ( * close ) ( const unsigned int , struct cifs_tcon * ,
struct cifs_fid * ) ;
2019-12-03 06:46:54 +03:00
/* close a file, returning file attributes and timestamps */
void ( * close_getattr ) ( const unsigned int xid , struct cifs_tcon * tcon ,
struct cifsFileInfo * pfile_info ) ;
2012-09-19 03:20:27 +04:00
/* send a flush request to the server */
int ( * flush ) ( const unsigned int , struct cifs_tcon * , struct cifs_fid * ) ;
2012-09-19 03:20:28 +04:00
/* async read from the server */
int ( * async_readv ) ( struct cifs_readdata * ) ;
2012-09-19 03:20:29 +04:00
/* async write to the server */
2014-02-08 06:45:12 +04:00
int ( * async_writev ) ( struct cifs_writedata * ,
void ( * release ) ( struct kref * ) ) ;
2012-09-19 03:20:29 +04:00
/* sync read from the server */
2014-09-22 14:13:55 +04:00
int ( * sync_read ) ( const unsigned int , struct cifs_fid * ,
2012-09-19 03:20:29 +04:00
struct cifs_io_parms * , unsigned int * , char * * ,
int * ) ;
2012-09-19 03:20:30 +04:00
/* sync write to the server */
2014-09-22 14:13:55 +04:00
int ( * sync_write ) ( const unsigned int , struct cifs_fid * ,
2012-09-19 03:20:30 +04:00
struct cifs_io_parms * , unsigned int * , struct kvec * ,
unsigned long ) ;
2012-09-19 03:20:32 +04:00
/* open dir, start readdir */
int ( * query_dir_first ) ( const unsigned int , struct cifs_tcon * ,
const char * , struct cifs_sb_info * ,
struct cifs_fid * , __u16 ,
struct cifs_search_info * ) ;
/* continue readdir */
int ( * query_dir_next ) ( const unsigned int , struct cifs_tcon * ,
struct cifs_fid * ,
__u16 , struct cifs_search_info * srch_inf ) ;
/* close dir */
int ( * close_dir ) ( const unsigned int , struct cifs_tcon * ,
struct cifs_fid * ) ;
/* calculate the size of an SMB message */
2022-08-17 20:14:02 +03:00
unsigned int ( * calc_smb_size ) ( void * buf ) ;
2019-01-24 04:11:16 +03:00
/* check for STATUS_PENDING and process the response if yes */
bool ( * is_status_pending ) ( char * buf , struct TCP_Server_Info * server ) ;
2017-07-09 00:32:00 +03:00
/* check for STATUS_NETWORK_SESSION_EXPIRED */
bool ( * is_session_expired ) ( char * ) ;
2012-09-19 03:20:33 +04:00
/* send oplock break response */
int ( * oplock_response ) ( struct cifs_tcon * , struct cifs_fid * ,
struct cifsInodeInfo * ) ;
2012-09-19 03:20:33 +04:00
/* query remote filesystem */
int ( * queryfs ) ( const unsigned int , struct cifs_tcon * ,
2020-02-03 22:46:43 +03:00
struct cifs_sb_info * , struct kstatfs * ) ;
2012-09-19 17:22:43 +04:00
/* send mandatory brlock to the server */
int ( * mand_lock ) ( const unsigned int , struct cifsFileInfo * , __u64 ,
__u64 , __u32 , int , int , bool ) ;
/* unlock range of mandatory locks */
int ( * mand_unlock_range ) ( struct cifsFileInfo * , struct file_lock * ,
const unsigned int ) ;
/* push brlocks from the cache to the server */
int ( * push_mand_locks ) ( struct cifsFileInfo * ) ;
2012-09-19 17:22:44 +04:00
/* get lease key of the inode */
2013-09-04 13:07:41 +04:00
void ( * get_lease_key ) ( struct inode * , struct cifs_fid * ) ;
2012-09-19 17:22:44 +04:00
/* set lease key of the inode */
2013-09-04 13:07:41 +04:00
void ( * set_lease_key ) ( struct inode * , struct cifs_fid * ) ;
2012-09-19 17:22:44 +04:00
/* generate new lease key */
2013-09-04 13:07:41 +04:00
void ( * new_lease_key ) ( struct cifs_fid * ) ;
2021-07-19 16:54:16 +03:00
int ( * generate_signingkey ) ( struct cifs_ses * ses ,
struct TCP_Server_Info * server ) ;
2020-04-01 02:21:43 +03:00
int ( * calc_signature ) ( struct smb_rqst * , struct TCP_Server_Info * ,
bool allocate_crypto ) ;
2015-06-24 11:17:02 +03:00
int ( * set_integrity ) ( const unsigned int , struct cifs_tcon * tcon ,
struct cifsFileInfo * src_file ) ;
2016-10-01 05:14:26 +03:00
int ( * enum_snapshots ) ( const unsigned int xid , struct cifs_tcon * tcon ,
struct cifsFileInfo * src_file , void __user * ) ;
2020-02-06 15:00:14 +03:00
int ( * notify ) ( const unsigned int xid , struct file * pfile ,
2022-10-15 08:43:22 +03:00
void __user * pbuf , bool return_changes ) ;
2013-11-25 21:09:49 +04:00
int ( * query_mf_symlink ) ( unsigned int , struct cifs_tcon * ,
struct cifs_sb_info * , const unsigned char * ,
char * , unsigned int * ) ;
2013-11-25 21:09:52 +04:00
int ( * create_mf_symlink ) ( unsigned int , struct cifs_tcon * ,
struct cifs_sb_info * , const unsigned char * ,
char * , unsigned int * ) ;
2013-09-05 16:11:28 +04:00
/* if we can do cache read operations */
bool ( * is_read_op ) ( __u32 ) ;
/* set oplock level for the inode */
2013-09-05 21:30:16 +04:00
void ( * set_oplock_level ) ( struct cifsInodeInfo * , __u32 , unsigned int ,
bool * ) ;
2013-09-04 13:07:41 +04:00
/* create lease context buffer for CREATE request */
2018-07-05 16:10:02 +03:00
char * ( * create_lease_buf ) ( u8 * lease_key , u8 oplock ) ;
2013-09-05 21:30:16 +04:00
/* parse lease context buffer and return oplock/epoch info */
2018-04-26 17:10:18 +03:00
__u8 ( * parse_lease_buf ) ( void * buf , unsigned int * epoch , char * lkey ) ;
2017-02-10 13:33:51 +03:00
ssize_t ( * copychunk_range ) ( const unsigned int ,
2017-04-04 10:12:04 +03:00
struct cifsFileInfo * src_file ,
2017-02-10 13:33:51 +03:00
struct cifsFileInfo * target_file ,
u64 src_off , u64 len , u64 dest_off ) ;
2015-06-28 07:18:36 +03:00
int ( * duplicate_extents ) ( const unsigned int , struct cifsFileInfo * src ,
struct cifsFileInfo * target_file , u64 src_off , u64 len ,
u64 dest_off ) ;
2013-11-20 09:44:46 +04:00
int ( * validate_negotiate ) ( const unsigned int , struct cifs_tcon * ) ;
2014-01-27 09:53:43 +04:00
ssize_t ( * query_all_EAs ) ( const unsigned int , struct cifs_tcon * ,
const unsigned char * , const unsigned char * , char * ,
2017-05-13 04:59:10 +03:00
size_t , struct cifs_sb_info * ) ;
2014-01-27 09:53:43 +04:00
int ( * set_EA ) ( const unsigned int , struct cifs_tcon * , const char * ,
const char * , const void * , const __u16 ,
2017-08-24 04:24:56 +03:00
const struct nls_table * , struct cifs_sb_info * ) ;
2014-02-03 09:31:47 +04:00
struct cifs_ntsd * ( * get_acl ) ( struct cifs_sb_info * , struct inode * ,
2020-12-18 20:30:12 +03:00
const char * , u32 * , u32 ) ;
2014-02-11 00:08:16 +04:00
struct cifs_ntsd * ( * get_acl_by_fid ) ( struct cifs_sb_info * ,
2020-12-18 20:30:12 +03:00
const struct cifs_fid * , u32 * , u32 ) ;
2014-02-03 09:31:47 +04:00
int ( * set_acl ) ( struct cifs_ntsd * , __u32 , struct inode * , const char * ,
int ) ;
2014-06-22 11:03:22 +04:00
/* writepages retry size */
unsigned int ( * wp_retry_size ) ( struct inode * ) ;
2014-06-05 19:03:27 +04:00
/* get mtu credits */
int ( * wait_mtu_credits ) ( struct TCP_Server_Info * , unsigned int ,
2019-01-16 22:12:41 +03:00
unsigned int * , struct cifs_credits * ) ;
2019-01-24 05:15:52 +03:00
/* adjust previously taken mtu credits to request size */
int ( * adjust_credits ) ( struct TCP_Server_Info * server ,
struct cifs_credits * credits ,
const unsigned int payload_size ) ;
2014-08-18 20:49:57 +04:00
/* check if we need to issue closedir */
bool ( * dir_needs_close ) ( struct cifsFileInfo * ) ;
2014-08-17 17:38:47 +04:00
long ( * fallocate ) ( struct file * , struct cifs_tcon * , int , loff_t ,
loff_t ) ;
2016-10-31 23:49:30 +03:00
/* init transform request - used for encryption for now */
2018-08-01 02:26:11 +03:00
int ( * init_transform_rq ) ( struct TCP_Server_Info * , int num_rqst ,
struct smb_rqst * , struct smb_rqst * ) ;
2016-11-18 02:24:34 +03:00
int ( * is_transform_hdr ) ( void * buf ) ;
int ( * receive_transform ) ( struct TCP_Server_Info * ,
2018-08-08 08:07:45 +03:00
struct mid_q_entry * * , char * * , int * ) ;
2017-01-18 13:05:57 +03:00
enum securityEnum ( * select_sectype ) ( struct TCP_Server_Info * ,
enum securityEnum ) ;
2018-06-01 03:53:08 +03:00
int ( * next_header ) ( char * ) ;
2018-10-08 03:19:58 +03:00
/* ioctl passthrough for query_info */
int ( * ioctl_query_info ) ( const unsigned int xid ,
2018-10-16 22:47:58 +03:00
struct cifs_tcon * tcon ,
2020-02-03 22:46:43 +03:00
struct cifs_sb_info * cifs_sb ,
2018-10-16 22:47:58 +03:00
__le16 * path , int is_dir ,
2018-10-08 03:19:58 +03:00
unsigned long p ) ;
2019-03-14 08:29:17 +03:00
/* make unix special files (block, char, fifo, socket) */
int ( * make_node ) ( unsigned int xid ,
struct inode * inode ,
struct dentry * dentry ,
struct cifs_tcon * tcon ,
2021-03-18 08:38:53 +03:00
const char * full_path ,
2019-03-14 08:29:17 +03:00
umode_t mode ,
dev_t device_number ) ;
2019-04-25 09:45:29 +03:00
/* version specific fiemap implementation */
int ( * fiemap ) ( struct cifs_tcon * tcon , struct cifsFileInfo * ,
struct fiemap_extent_info * , u64 , u64 ) ;
2019-05-15 00:17:02 +03:00
/* version specific llseek implementation */
loff_t ( * llseek ) ( struct file * , struct cifs_tcon * , loff_t , int ) ;
2020-09-18 08:37:28 +03:00
/* Check for STATUS_IO_TIMEOUT */
bool ( * is_status_io_timeout ) ( char * buf ) ;
2021-02-16 13:40:45 +03:00
/* Check for STATUS_NETWORK_NAME_DELETED */
void ( * is_network_name_deleted ) ( char * buf , struct TCP_Server_Info * srv ) ;
2012-05-15 20:20:51 +04:00
} ;
struct smb_version_values {
char * version_string ;
2012-10-01 21:26:22 +04:00
__u16 protocol_id ;
__u32 req_capabilities ;
2012-02-28 15:23:34 +04:00
__u32 large_lock_type ;
__u32 exclusive_lock_type ;
__u32 shared_lock_type ;
__u32 unlock_lock_type ;
2018-03-31 03:45:31 +03:00
size_t header_preamble_size ;
2012-05-17 12:45:31 +04:00
size_t header_size ;
size_t max_header_size ;
2012-05-17 13:02:51 +04:00
size_t read_rsp_size ;
2011-12-26 22:53:34 +04:00
__le16 lock_cmd ;
2012-07-13 13:58:14 +04:00
unsigned int cap_unix ;
unsigned int cap_nt_find ;
unsigned int cap_large_files ;
2013-06-27 20:45:00 +04:00
__u16 signing_enabled ;
__u16 signing_required ;
2013-09-04 13:07:41 +04:00
size_t create_lease_size ;
2012-05-15 20:20:51 +04:00
} ;
2012-05-17 12:45:31 +04:00
# define HEADER_SIZE(server) (server->vals->header_size)
# define MAX_HEADER_SIZE(server) (server->vals->max_header_size)
2022-08-23 15:52:00 +03:00
# define HEADER_PREAMBLE_SIZE(server) (server->vals->header_preamble_size)
2022-08-23 15:52:01 +03:00
# define MID_HEADER_SIZE(server) (HEADER_SIZE(server) - 1 - HEADER_PREAMBLE_SIZE(server))
2012-05-17 12:45:31 +04:00
2019-04-02 14:00:33 +03:00
/**
* CIFS superblock mount flags ( mnt_cifs_flags ) to consider when
* trying to reuse existing superblock for a new mount
*/
2011-05-26 23:35:47 +04:00
# define CIFS_MOUNT_MASK (CIFS_MOUNT_NO_PERM | CIFS_MOUNT_SET_UID | \
CIFS_MOUNT_SERVER_INUM | CIFS_MOUNT_DIRECT_IO | \
CIFS_MOUNT_NO_XATTR | CIFS_MOUNT_MAP_SPECIAL_CHR | \
2014-09-27 11:19:01 +04:00
CIFS_MOUNT_MAP_SFM_CHR | \
2011-05-26 23:35:47 +04:00
CIFS_MOUNT_UNX_EMUL | CIFS_MOUNT_NO_BRL | \
CIFS_MOUNT_CIFS_ACL | CIFS_MOUNT_OVERR_UID | \
CIFS_MOUNT_OVERR_GID | CIFS_MOUNT_DYNPERM | \
CIFS_MOUNT_NOPOSIXBRL | CIFS_MOUNT_NOSSYNC | \
CIFS_MOUNT_FSCACHE | CIFS_MOUNT_MF_SYMLINKS | \
2011-09-26 18:56:44 +04:00
CIFS_MOUNT_MULTIUSER | CIFS_MOUNT_STRICT_IO | \
2019-04-02 14:00:33 +03:00
CIFS_MOUNT_CIFS_BACKUPUID | CIFS_MOUNT_CIFS_BACKUPGID | \
2019-06-24 09:19:52 +03:00
CIFS_MOUNT_UID_FROM_ACL | CIFS_MOUNT_NO_HANDLE_CACHE | \
2019-08-28 07:58:54 +03:00
CIFS_MOUNT_NO_DFS | CIFS_MOUNT_MODE_FROM_SID | \
2019-08-30 10:12:41 +03:00
CIFS_MOUNT_RO_CACHE | CIFS_MOUNT_RW_CACHE )
2011-05-26 23:35:47 +04:00
2019-04-02 14:00:33 +03:00
/**
* Generic VFS superblock mount flags ( s_flags ) to consider when
* trying to reuse existing superblock for a new mount
*/
2017-11-28 00:05:09 +03:00
# define CIFS_MS_MASK (SB_RDONLY | SB_MANDLOCK | SB_NOEXEC | SB_NOSUID | \
SB_NODEV | SB_SYNCHRONOUS )
2011-05-26 23:35:47 +04:00
struct cifs_mnt_data {
struct cifs_sb_info * cifs_sb ;
2020-12-10 08:07:12 +03:00
struct smb3_fs_context * ctx ;
2011-05-26 23:35:47 +04:00
int flags ;
} ;
2012-03-23 22:28:02 +04:00
static inline unsigned int
get_rfc1002_length ( void * buf )
{
2014-02-23 04:35:38 +04:00
return be32_to_cpu ( * ( ( __be32 * ) buf ) ) & 0xffffff ;
2012-03-23 22:28:02 +04:00
}
2011-12-27 16:12:43 +04:00
static inline void
inc_rfc1001_len ( void * buf , int count )
{
be32_add_cpu ( ( __be32 * ) buf , count ) ;
}
2005-04-17 02:20:36 +04:00
struct TCP_Server_Info {
2008-11-13 22:45:32 +03:00
struct list_head tcp_ses_list ;
struct list_head smb_ses_list ;
2022-07-27 22:49:56 +03:00
spinlock_t srv_lock ; /* protect anything here that is not protected */
2021-02-04 10:20:46 +03:00
__u64 conn_id ; /* connection identifier (useful for debugging) */
2008-11-14 21:44:38 +03:00
int srv_count ; /* reference counter */
2005-08-23 08:38:31 +04:00
/* 15 character server name + 0x20 16th byte indicating type = srv */
2008-12-01 23:23:50 +03:00
char server_RFC1001_name [ RFC1001_NAME_LEN_WITH_NULL ] ;
2012-05-15 20:20:51 +04:00
struct smb_version_operations * ops ;
struct smb_version_values * vals ;
2021-07-19 20:05:53 +03:00
/* updates to tcpStatus protected by cifs_tcp_ses_lock */
cifs: TCP_Server_Info diet
Remove fields that are completely unused, and rearrange struct
according to recommendations by "pahole".
Before:
/* size: 1112, cachelines: 18, members: 49 */
/* sum members: 1086, holes: 8, sum holes: 26 */
/* bit holes: 1, sum bit holes: 7 bits */
/* last cacheline: 24 bytes */
After:
/* size: 1072, cachelines: 17, members: 42 */
/* sum members: 1065, holes: 3, sum holes: 7 */
/* last cacheline: 48 bytes */
...savings of 40 bytes per struct on x86_64. 21 bytes by field removal,
and 19 by reorganizing to eliminate holes.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-01-20 21:36:50 +03:00
enum statusEnum tcpStatus ; /* what we think the status is */
2007-11-17 01:22:06 +03:00
char * hostname ; /* hostname portion of UNC string */
2005-04-17 02:20:36 +04:00
struct socket * ssocket ;
2010-12-13 19:08:35 +03:00
struct sockaddr_storage dstaddr ;
2010-09-02 04:06:02 +04:00
struct sockaddr_storage srcaddr ; /* locally bind to this IP */
Make CIFS mount work in a container.
Teach cifs about network namespaces, so mounting uses addresses/routing
visible from the container rather than from init context.
A container is a chroot on steroids that changes more than just the root
filesystem the new processes see. One thing containers can isolate is
"network namespaces", meaning each container can have its own set of
ethernet interfaces, each with its own IP address and routing to the
outside world. And if you open a socket in _userspace_ from processes
within such a container, this works fine.
But sockets opened from within the kernel still use a single global
networking context in a lot of places, meaning the new socket's address
and routing are correct for PID 1 on the host, but are _not_ what
userspace processes in the container get to use.
So when you mount a network filesystem from within a container, the
mount code in the CIFS driver uses the host's networking context and not
the container's networking context, so it gets the wrong address, uses
the wrong routing, and may even try to go out an interface that the
container can't even access... Bad stuff.
This patch copies the mount process's network context into the CIFS
structure that stores the rest of the server information for that mount
point, and changes the socket open code to use the saved network context
instead of the global network context. I.E. "when you attempt to use
these addresses, do so relative to THIS set of network interfaces and
routing rules, not the old global context from back before we supported
containers".
The big long HOWTO sets up a test environment on the assumption you've
never used containers before. It basically says:
1) configure and build a new kernel that has container support
2) build a new root filesystem that includes the userspace container
control package (LXC)
3) package/run them under KVM (so you don't have to mess up your host
system in order to play with containers).
4) set up some containers under the KVM system
5) set up contradictory routing in the KVM system and the container so
that the host and the container see different things for the same address
6) try to mount a CIFS share from both contexts so you can both force it
to work and force it to fail.
For a long drawn out test reproduction sequence, see:
http://landley.livejournal.com/47024.html
http://landley.livejournal.com/47205.html
http://landley.livejournal.com/47476.html
Signed-off-by: Rob Landley <rlandley@parallels.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-01-23 00:44:05 +03:00
# ifdef CONFIG_NET_NS
struct net * net ;
# endif
2007-06-28 23:44:13 +04:00
wait_queue_head_t response_q ;
2005-04-17 02:20:36 +04:00
wait_queue_head_t request_q ; /* if more than maxmpx to srvr must block*/
2022-07-27 22:49:56 +03:00
spinlock_t mid_lock ; /* protect mid queue and its entries */
2005-04-17 02:20:36 +04:00
struct list_head pending_mid_q ;
2008-10-29 03:47:57 +03:00
bool noblocksnd ; /* use non-blocking sendmsg */
bool noautotune ; /* do not autotune send buf sizes */
2021-11-06 14:31:53 +03:00
bool nosharesock ;
2010-01-01 04:28:43 +03:00
bool tcp_nodelay ;
2016-09-23 08:44:16 +03:00
unsigned int credits ; /* send no more requests at once */
unsigned int max_credits ; /* can override large 32000 default at mnt */
2012-02-17 18:09:12 +04:00
unsigned int in_flight ; /* number of requests on the wire to server */
2019-09-10 06:57:11 +03:00
unsigned int max_in_flight ; /* max number of requests that were on wire */
2012-02-06 15:59:18 +04:00
spinlock_t req_lock ; /* protect the two values above */
2022-06-01 08:03:18 +03:00
struct mutex _srv_mutex ;
unsigned int nofs_flag ;
2005-04-17 02:20:36 +04:00
struct task_struct * tsk ;
char server_GUID [ 16 ] ;
2012-05-25 10:43:58 +04:00
__u16 sec_mode ;
2013-05-26 15:01:00 +04:00
bool sign ; /* is signing enabled on this connection? */
2019-09-04 05:18:49 +03:00
bool ignore_signature : 1 ; /* skip validation of signatures in SMB2/3 rsp */
2011-01-20 21:36:50 +03:00
bool session_estab ; /* mark when very first sess is established */
2012-05-23 16:18:00 +04:00
int echo_credits ; /* echo reserved slots */
int oplock_credits ; /* oplock break reserved slots */
bool echoes : 1 ; /* enable echoes */
2014-05-13 03:48:12 +04:00
__u8 client_guid [ SMB2_CLIENT_GUID_SIZE ] ; /* Client GUID */
2011-01-20 21:36:50 +03:00
u16 dialect ; /* dialect index that server chose */
2012-03-20 13:55:09 +04:00
bool oplocks : 1 ; /* enable oplocks */
2005-04-17 02:20:36 +04:00
unsigned int maxReq ; /* Clients should submit no more */
/* than maxReq distinct unanswered SMBs to the server when using */
2018-05-24 10:09:20 +03:00
/* multiplexed reads or writes (for SMB1/CIFS only, not SMB2/SMB3) */
2005-04-17 02:20:36 +04:00
unsigned int maxBuf ; /* maxBuf specifies the maximum */
/* message size the server can send or receive for non-raw SMBs */
2011-02-09 02:52:32 +03:00
/* maxBuf is returned by SMB NegotiateProtocol so maxBuf is only 0 */
/* when socket is setup (and during reconnect) before NegProt sent */
[CIFS] Fix multiuser mounts so server does not invalidate earlier security contexts
When two different users mount the same Windows 2003 Server share using CIFS,
the first session mounted can be invalidated. Some servers invalidate the first
smb session when a second similar user (e.g. two users who get mapped by server to "guest")
authenticates an smb session from the same client.
By making sure that we set the 2nd and subsequent vc numbers to nonzero values,
this ensures that we will not have this problem.
Fixes Samba bug 6004, problem description follows:
How to reproduce:
- configure an "open share" (full permissions to Guest user) on Windows 2003
Server (I couldn't reproduce the problem with Samba server or Windows older
than 2003)
- mount the share twice with different users who will be authenticated as guest.
noacl,noperm,user=john,dir_mode=0700,domain=DOMAIN,rw
noacl,noperm,user=jeff,dir_mode=0700,domain=DOMAIN,rw
Result:
- just the mount point mounted last is accessible:
Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-02-20 08:43:09 +03:00
unsigned int max_rw ; /* max_rw specifies the maximum */
2005-04-17 02:20:36 +04:00
/* message size the server can send or receive for */
/* SMB_COM_WRITE_RAW or SMB_COM_READ_RAW. */
2012-07-13 13:58:14 +04:00
unsigned int capabilities ; /* selective disabling of caps by smb sess */
2006-09-30 17:25:52 +04:00
int timeAdj ; /* Adjust for difference in server time zone in sec */
2021-06-25 21:54:32 +03:00
__u64 CurrentMid ; /* multiplex id - rotating counter, protected by GlobalMid_Lock */
2010-10-28 00:20:36 +04:00
char cryptkey [ CIFS_CRYPTO_KEY_SIZE ] ; /* used by ntlm, ntlmv2 etc */
2005-08-23 08:38:31 +04:00
/* 16th byte of RFC1001 workstation name is always null */
2008-12-01 23:23:50 +03:00
char workstation_RFC1001_name [ RFC1001_NAME_LEN_WITH_NULL ] ;
2011-01-07 19:30:28 +03:00
__u32 sequence_number ; /* for signing, protected by srv_mutex */
2018-09-19 10:38:17 +03:00
__u32 reconnect_instance ; /* incremented on each reconnect */
2010-09-19 07:01:58 +04:00
struct session_key session_key ;
2006-07-15 02:37:11 +04:00
unsigned long lstrp ; /* when we got last response from this server */
2010-10-21 23:25:08 +04:00
struct cifs_secmech secmech ; /* crypto sec mech functs, descriptors */
2013-05-26 15:00:59 +04:00
# define CIFS_NEGFLAVOR_UNENCAP 1 /* wct == 17, but no ext_sec */
# define CIFS_NEGFLAVOR_EXTENDED 2 /* wct == 17, ext_sec bit set */
char negflavor ; /* NEGOTIATE response flavor */
2010-04-24 15:57:49 +04:00
/* extended security flavors that server supports */
2011-01-20 21:36:50 +03:00
bool sec_ntlmssp ; /* supports NTLMSSP */
bool sec_kerberosu2u ; /* supports U2U Kerberos */
2010-04-24 15:57:49 +04:00
bool sec_kerberos ; /* supports plain Kerberos */
bool sec_mskerberos ; /* supports legacy MS Kerberos */
2011-10-19 23:29:23 +04:00
bool large_buf ; /* is current buffer large? */
2017-11-07 11:54:55 +03:00
/* use SMBD connection instead of socket */
bool rdma ;
/* point to the SMBD connection if RDMA is used instead of socket */
struct smbd_connection * smbd_conn ;
2011-01-11 15:24:23 +03:00
struct delayed_work echo ; /* echo ping workqueue job */
2011-10-19 23:29:23 +04:00
char * smallbuf ; /* pointer to current "small" buffer */
char * bigbuf ; /* pointer to current "big" buffer */
2018-04-09 11:06:26 +03:00
/* Total size of this PDU. Only valid from cifs_demultiplex_thread */
unsigned int pdu_size ;
2011-10-19 23:29:23 +04:00
unsigned int total_read ; /* total amount of data read in this pass */
2019-11-22 02:26:35 +03:00
atomic_t in_send ; /* requests trying to send */
atomic_t num_waiters ; /* blocked waiting to get in sendrecv */
2011-01-20 21:36:50 +03:00
# ifdef CONFIG_CIFS_STATS2
2019-03-26 21:53:21 +03:00
atomic_t num_cmds [ NUMBER_OF_SMB2_COMMANDS ] ; /* total requests by cmd */
2018-08-04 13:24:34 +03:00
atomic_t smb2slowcmd [ NUMBER_OF_SMB2_COMMANDS ] ; /* count resps > 1 sec */
2019-03-26 21:53:21 +03:00
__u64 time_per_cmd [ NUMBER_OF_SMB2_COMMANDS ] ; /* total time per cmd */
__u32 slowest_cmd [ NUMBER_OF_SMB2_COMMANDS ] ;
__u32 fastest_cmd [ NUMBER_OF_SMB2_COMMANDS ] ;
2018-08-04 13:24:34 +03:00
# endif /* STATS2 */
2011-12-27 16:12:43 +04:00
unsigned int max_read ;
unsigned int max_write ;
2019-09-09 07:22:02 +03:00
unsigned int min_offload ;
2019-04-27 06:36:08 +03:00
__le16 compress_algorithm ;
2021-07-05 23:05:39 +03:00
__u16 signing_algorithm ;
2018-04-09 18:47:14 +03:00
__le16 cipher_type ;
2018-02-16 21:19:29 +03:00
/* save initial negprot hash */
__u8 preauth_sha_hash [ SMB2_PREAUTH_HASH_SIZE ] ;
2021-07-05 23:05:39 +03:00
bool signing_negotiated ; /* true if valid signing context rcvd from server */
2018-05-20 04:45:27 +03:00
bool posix_ext_supported ;
2016-11-04 21:50:31 +03:00
struct delayed_work reconnect ; /* reconnect workqueue job */
struct mutex reconnect_mutex ; /* prevent simultaneous reconnects */
2015-12-18 21:31:36 +03:00
unsigned long echo_interval ;
2018-11-14 22:13:25 +03:00
/*
* Number of targets available for reconnect . The more targets
* the more tasks have to wait to let the demultiplex thread
* reconnect .
*/
int nr_targets ;
2019-07-17 01:04:50 +03:00
bool noblockcnt ; /* use non-blocking connect() */
2021-07-19 14:26:24 +03:00
/*
* If this is a session channel ,
* primary_server holds the ref - counted
* pointer to primary channel connection for the session .
*/
# define CIFS_SERVER_IS_CHAN(server) (!!(server)->primary_server)
struct TCP_Server_Info * primary_server ;
2020-11-30 21:02:56 +03:00
# ifdef CONFIG_CIFS_SWN_UPCALL
bool use_swn_dstaddr ;
struct sockaddr_storage swn_dstaddr ;
# endif
2021-11-03 19:53:29 +03:00
struct mutex refpath_lock ; /* protects leaf_fullpath */
/*
* Canonical DFS full paths that were used to chase referrals in mount and reconnect .
*
* origin_fullpath : first or original referral path
* leaf_fullpath : last referral path ( might be changed due to nested links in reconnect )
*
* current_fullpath : pointer to either origin_fullpath or leaf_fullpath
* NOTE : cannot be accessed outside cifs_reconnect ( ) and smb2_reconnect ( )
*
* format : \ \ HOST \ SHARE \ [ OPTIONAL PATH ]
*/
char * origin_fullpath , * leaf_fullpath , * current_fullpath ;
2005-04-17 02:20:36 +04:00
} ;
2022-08-23 15:52:02 +03:00
static inline bool is_smb1 ( struct TCP_Server_Info * server )
{
return HEADER_PREAMBLE_SIZE ( server ) ! = 0 ;
}
2022-06-01 08:03:18 +03:00
static inline void cifs_server_lock ( struct TCP_Server_Info * server )
{
unsigned int nofs_flag = memalloc_nofs_save ( ) ;
mutex_lock ( & server - > _srv_mutex ) ;
server - > nofs_flag = nofs_flag ;
}
static inline void cifs_server_unlock ( struct TCP_Server_Info * server )
{
unsigned int nofs_flag = server - > nofs_flag ;
mutex_unlock ( & server - > _srv_mutex ) ;
memalloc_nofs_restore ( nofs_flag ) ;
}
2019-01-16 22:12:41 +03:00
struct cifs_credits {
unsigned int value ;
unsigned int instance ;
} ;
2012-02-17 18:09:12 +04:00
static inline unsigned int
in_flight ( struct TCP_Server_Info * server )
{
unsigned int num ;
2022-12-09 01:11:00 +03:00
2012-02-17 18:09:12 +04:00
spin_lock ( & server - > req_lock ) ;
num = server - > in_flight ;
spin_unlock ( & server - > req_lock ) ;
return num ;
}
2012-02-06 15:59:18 +04:00
static inline bool
2019-03-08 05:58:20 +03:00
has_credits ( struct TCP_Server_Info * server , int * credits , int num_credits )
2012-02-17 18:09:12 +04:00
{
2012-02-06 15:59:18 +04:00
int num ;
2022-12-09 01:11:00 +03:00
2012-02-17 18:09:12 +04:00
spin_lock ( & server - > req_lock ) ;
2012-03-15 14:22:27 +04:00
num = * credits ;
2012-02-17 18:09:12 +04:00
spin_unlock ( & server - > req_lock ) ;
2019-03-08 05:58:20 +03:00
return num > = num_credits ;
2012-02-17 18:09:12 +04:00
}
2012-05-17 17:53:29 +04:00
static inline void
2019-01-16 22:22:29 +03:00
add_credits ( struct TCP_Server_Info * server , const struct cifs_credits * credits ,
2012-05-23 16:14:34 +04:00
const int optype )
2012-05-17 17:53:29 +04:00
{
2019-01-16 22:22:29 +03:00
server - > ops - > add_credits ( server , credits , optype ) ;
2012-05-17 17:53:29 +04:00
}
2014-06-05 19:03:27 +04:00
static inline void
2019-01-16 22:12:41 +03:00
add_credits_and_wake_if ( struct TCP_Server_Info * server ,
const struct cifs_credits * credits , const int optype )
2014-06-05 19:03:27 +04:00
{
2019-01-16 22:12:41 +03:00
if ( credits - > value ) {
server - > ops - > add_credits ( server , credits , optype ) ;
2014-06-05 19:03:27 +04:00
wake_up ( & server - > request_q ) ;
}
}
2012-05-17 17:53:29 +04:00
static inline void
set_credits ( struct TCP_Server_Info * server , const int val )
{
server - > ops - > set_credits ( server , val ) ;
}
2019-01-24 05:15:52 +03:00
static inline int
adjust_credits ( struct TCP_Server_Info * server , struct cifs_credits * credits ,
const unsigned int payload_size )
{
return server - > ops - > adjust_credits ?
server - > ops - > adjust_credits ( server , credits , payload_size ) : 0 ;
}
2014-12-09 20:37:00 +03:00
static inline __le64
2013-11-02 21:50:34 +04:00
get_next_mid64 ( struct TCP_Server_Info * server )
2012-05-23 14:01:59 +04:00
{
2014-12-09 20:37:00 +03:00
return cpu_to_le64 ( server - > ops - > get_next_mid ( server ) ) ;
2012-05-23 14:01:59 +04:00
}
2013-11-02 21:50:34 +04:00
static inline __le16
get_next_mid ( struct TCP_Server_Info * server )
{
2014-12-09 20:37:00 +03:00
__u16 mid = server - > ops - > get_next_mid ( server ) ;
2013-11-02 21:50:34 +04:00
/*
* The value in the SMB header should be little endian for easy
* on - the - wire decoding .
*/
return cpu_to_le16 ( mid ) ;
}
2019-03-05 01:02:50 +03:00
static inline void
revert_current_mid ( struct TCP_Server_Info * server , const unsigned int val )
{
if ( server - > ops - > revert_current_mid )
server - > ops - > revert_current_mid ( server , val ) ;
}
static inline void
revert_current_mid_from_hdr ( struct TCP_Server_Info * server ,
2021-11-05 02:39:01 +03:00
const struct smb2_hdr * shdr )
2019-03-05 01:02:50 +03:00
{
unsigned int num = le16_to_cpu ( shdr - > CreditCharge ) ;
revert_current_mid ( server , num > 0 ? num : 1 ) ;
}
2013-11-02 21:50:34 +04:00
static inline __u16
get_mid ( const struct smb_hdr * smb )
{
return le16_to_cpu ( smb - > Mid ) ;
}
static inline bool
compare_mid ( __u16 mid , const struct smb_hdr * smb )
{
return mid = = le16_to_cpu ( smb - > Mid ) ;
}
2012-09-19 03:20:28 +04:00
/*
* When the server supports very large reads and writes via POSIX extensions ,
* we can allow up to 2 ^ 24 - 1 , minus the size of a READ / WRITE_AND_X header , not
* including the RFC1001 length .
*
* Note that this might make for " interesting " allocation problems during
* writeback however as we have to allocate an array of pointers for the
2016-04-01 15:29:48 +03:00
* pages . A 16 M write means ~ 32 kb page array with PAGE_SIZE = = 4096.
2012-09-19 03:20:28 +04:00
*
* For reads , there is a similar problem as we need to allocate an array
* of kvecs to handle the receive , though that should only need to be done
* once .
*/
# define CIFS_MAX_WSIZE ((1<<24) - 1 - sizeof(WRITE_REQ) + 4)
# define CIFS_MAX_RSIZE ((1<<24) - sizeof(READ_RSP) + 4)
/*
* When the server doesn ' t allow large posix writes , only allow a rsize / wsize
* of 2 ^ 17 - 1 minus the size of the call header . That allows for a read or
* write up to the maximum size described by RFC1002 .
*/
# define CIFS_MAX_RFC1002_WSIZE ((1<<17) - 1 - sizeof(WRITE_REQ) + 4)
# define CIFS_MAX_RFC1002_RSIZE ((1<<17) - 1 - sizeof(READ_RSP) + 4)
# define CIFS_DEFAULT_IOSIZE (1024 * 1024)
/*
* Windows only supports a max of 60 kb reads and 65535 byte writes . Default to
* those values when posix extensions aren ' t in force . In actuality here , we
* use 65536 to allow for a write that is a multiple of 4 k . Most servers seem
* to be ok with the extra byte even though Windows doesn ' t send writes that
* are that large .
*
* Citation :
*
2020-06-27 13:31:25 +03:00
* https : //blogs.msdn.com/b/openspecification/archive/2009/04/10/smb-maximum-transmit-buffer-size-and-performance-tuning.aspx
2012-09-19 03:20:28 +04:00
*/
# define CIFS_DEFAULT_NON_POSIX_RSIZE (60 * 1024)
# define CIFS_DEFAULT_NON_POSIX_WSIZE (65536)
Make CIFS mount work in a container.
Teach cifs about network namespaces, so mounting uses adresses/routing
visible from the container rather than from init context.
A container is a chroot on steroids that changes more than just the root
filesystem the new processes see. One thing containers can isolate is
"network namespaces", meaning each container can have its own set of
ethernet interfaces, each with its own own IP address and routing to the
outside world. And if you open a socket in _userspace_ from processes
within such a container, this works fine.
But sockets opened from within the kernel still use a single global
networking context in a lot of places, meaning the new socket's address
and routing are correct for PID 1 on the host, but are _not_ what
userspace processes in the container get to use.
So when you mount a network filesystem from within a container, the
mount code in the CIFS driver uses the host's networking context and not
the container's networking context, so it gets the wrong address, uses
the wrong routing, and may even try to go out an interface that the
container can't even access... Bad stuff.
This patch copies the mount process's network context into the CIFS
structure that stores the rest of the server information for that mount
point, and changes the socket open code to use the saved network context
instead of the global network context. I.E. "when you attempt to use
these addresses, do so relative to THIS set of network interfaces and
routing rules, not the old global context from back before we supported
containers".
The big long HOWTO sets up a test environment on the assumption you've
never used containers before. It basically says:
1) configure and build a new kernel that has container support
2) build a new root filesystem that includes the userspace container
control package (LXC)
3) package/run them under KVM (so you don't have to mess up your host
system in order to play with containers).
4) set up some containers under the KVM system
5) set up contradictory routing in the KVM system and the container so
that the host and the container see different things for the same address
6) try to mount a CIFS share from both contexts so you can both force it
to work and force it to fail.
For a long drawn out test reproduction sequence, see:
http://landley.livejournal.com/47024.html
http://landley.livejournal.com/47205.html
http://landley.livejournal.com/47476.html
Signed-off-by: Rob Landley <rlandley@parallels.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-01-23 00:44:05 +03:00
/*
* Macros to allow the TCP_Server_Info - > net field and related code to drop out
* when CONFIG_NET_NS isn ' t set .
*/
# ifdef CONFIG_NET_NS
static inline struct net * cifs_net_ns ( struct TCP_Server_Info * srv )
{
return srv - > net ;
}
static inline void cifs_set_net_ns ( struct TCP_Server_Info * srv , struct net * net )
{
srv - > net = net ;
}
# else
static inline struct net * cifs_net_ns ( struct TCP_Server_Info * srv )
{
return & init_net ;
}
static inline void cifs_set_net_ns ( struct TCP_Server_Info * srv , struct net * net )
{
}
# endif
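The accessor pair above hides a field that only exists under one build configuration. A minimal userspace sketch of the same pattern follows; every name in it (SKETCH_NET_NS, net_sketch, server_sketch and the two functions) is hypothetical and only stands in for the kernel types.

```c
#include <assert.h>
#include <stddef.h>

/* Define to mimic CONFIG_NET_NS=y; remove the define to get the fallback. */
#define SKETCH_NET_NS 1

struct net_sketch {
	int id;
};

/* Stand-in for the global init_net used when namespaces are compiled out. */
static struct net_sketch init_net_sketch = { 0 };

struct server_sketch {
#ifdef SKETCH_NET_NS
	struct net_sketch *net;	/* field only exists when namespaces are on */
#endif
	int placeholder;
};

#ifdef SKETCH_NET_NS
static struct net_sketch *sketch_net_ns(struct server_sketch *srv)
{
	assert(srv != NULL);
	return srv->net;
}

static void sketch_set_net_ns(struct server_sketch *srv, struct net_sketch *net)
{
	assert(srv != NULL);
	srv->net = net;
}
#else
static struct net_sketch *sketch_net_ns(struct server_sketch *srv)
{
	(void)srv;
	return &init_net_sketch;	/* always the global context */
}

static void sketch_set_net_ns(struct server_sketch *srv, struct net_sketch *net)
{
	(void)srv;
	(void)net;			/* nothing to store */
}
#endif
```

Callers use the two functions unconditionally, so no #ifdef is needed at the call sites.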
2018-06-14 16:43:18 +03:00
struct cifs_server_iface {
2022-01-01 15:50:21 +03:00
struct list_head iface_head ;
struct kref refcount ;
2018-06-14 16:43:18 +03:00
size_t speed ;
unsigned int rdma_capable : 1 ;
unsigned int rss_capable : 1 ;
2022-01-01 15:50:21 +03:00
unsigned int is_active : 1 ; /* unset if non existent */
2018-06-14 16:43:18 +03:00
struct sockaddr_storage sockaddr ;
} ;
2022-01-01 15:50:21 +03:00
/* release iface when last ref is dropped */
static inline void
release_iface ( struct kref * ref )
{
struct cifs_server_iface * iface = container_of ( ref ,
struct cifs_server_iface ,
refcount ) ;
list_del_init ( & iface - > iface_head ) ;
kfree ( iface ) ;
}
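/*
* compare two interfaces a and b
* fields are compared in order : speed , rdma_capable , rss_capable , then the
* raw sockaddr bytes ( memcmp ) .
* return 0 if everything matches .
* return 1 if a ranks higher on the first field that differs
* return - 1 otherwise .
*/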
static inline int
iface_cmp ( struct cifs_server_iface * a , struct cifs_server_iface * b )
{
int cmp_ret = 0 ;
WARN_ON ( ! a | | ! b ) ;
if ( a - > speed = = b - > speed ) {
if ( a - > rdma_capable = = b - > rdma_capable ) {
if ( a - > rss_capable = = b - > rss_capable ) {
cmp_ret = memcmp ( & a - > sockaddr , & b - > sockaddr ,
sizeof ( a - > sockaddr ) ) ;
if ( ! cmp_ret )
return 0 ;
else if ( cmp_ret > 0 )
return 1 ;
else
return - 1 ;
} else if ( a - > rss_capable > b - > rss_capable )
return 1 ;
else
return - 1 ;
} else if ( a - > rdma_capable > b - > rdma_capable )
return 1 ;
else
return - 1 ;
} else if ( a - > speed > b - > speed )
return 1 ;
else
return - 1 ;
}
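The nested comparison in iface_cmp() amounts to a lexicographic ordering over four keys. The sketch below restates it as a flat chain of early returns in userspace C. The struct is a simplified stand-in (a 16-byte array replaces struct sockaddr_storage), the _sketch names are hypothetical, and assert() takes the place of WARN_ON().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for struct cifs_server_iface. */
struct iface_sketch {
	size_t speed;
	unsigned int rdma_capable : 1;
	unsigned int rss_capable : 1;
	unsigned char addr[16];	/* assumed placeholder for the sockaddr */
};

/*
 * Compare in the same key order as iface_cmp(): speed, RDMA, RSS,
 * then the raw address bytes. Returns 1, -1, or 0.
 */
static int iface_cmp_sketch(const struct iface_sketch *a,
			    const struct iface_sketch *b)
{
	int ret;

	assert(a && b);

	if (a->speed != b->speed)
		return a->speed > b->speed ? 1 : -1;
	if (a->rdma_capable != b->rdma_capable)
		return a->rdma_capable > b->rdma_capable ? 1 : -1;
	if (a->rss_capable != b->rss_capable)
		return a->rss_capable > b->rss_capable ? 1 : -1;

	/* All capability fields tie, so the address bytes decide. */
	ret = memcmp(a->addr, b->addr, sizeof(a->addr));
	if (ret == 0)
		return 0;
	return ret > 0 ? 1 : -1;
}
```

The flat form returns the same results as the nested one for every input; it only changes how the branches are laid out.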
2019-09-20 05:32:20 +03:00
struct cifs_chan {
2022-04-08 16:31:37 +03:00
unsigned int in_reconnect : 1 ; /* if session setup in progress for this channel */
2019-09-20 05:32:20 +03:00
struct TCP_Server_Info * server ;
2022-01-01 15:50:21 +03:00
struct cifs_server_iface * iface ; /* interface in use */
2019-09-20 05:32:20 +03:00
__u8 signkey [ SMB3_SIGN_KEY_SIZE ] ;
} ;
2005-04-17 02:20:36 +04:00
/*
* Session structure . One of these for each uid session with a particular host
*/
2011-05-27 08:34:02 +04:00
struct cifs_ses {
2008-11-14 21:53:46 +03:00
struct list_head smb_ses_list ;
2021-10-30 07:51:35 +03:00
struct list_head rlist ; /* reconnect list */
2008-11-13 22:45:32 +03:00
struct list_head tcon_list ;
2018-01-24 15:46:10 +03:00
struct cifs_tcon * tcon_ipc ;
2022-07-27 22:49:56 +03:00
spinlock_t ses_lock ; /* protect anything here that is not protected */
2010-02-25 08:36:46 +03:00
struct mutex session_mutex ;
2005-04-17 02:20:36 +04:00
struct TCP_Server_Info * server ; /* pointer to server info */
2008-11-14 21:53:46 +03:00
int ses_count ; /* reference counter */
2022-04-07 16:15:49 +03:00
enum ses_status_enum ses_status ; /* updates protected by cifs_tcp_ses_lock */
2022-12-09 01:11:00 +03:00
unsigned int overrideSecFlg ; /* if non-zero override global sec flags */
2005-04-29 09:41:05 +04:00
char * serverOS ; /* name of operating system underlying server */
char * serverNOS ; /* name of network operating system of server */
2005-04-17 02:20:36 +04:00
char * serverDomain ; /* security realm of server */
2012-05-25 10:43:58 +04:00
__u64 Suid ; /* remote smb uid */
2013-02-06 14:30:39 +04:00
kuid_t linux_uid ; /* overriding owner of files on the mount */
kuid_t cred_uid ; /* owner of credentials */
2012-07-13 13:58:14 +04:00
unsigned int capabilities ;
2021-02-21 04:24:11 +03:00
char ip_addr [ INET6_ADDRSTRLEN + 1 ] ; /* Max ipv6 (or v4) addr string len */
2011-03-01 08:02:57 +03:00
char * user_name ; /* must not be null except during init of sess
and after mount option parsing we fill it */
2007-06-28 23:44:13 +04:00
char * domainName ;
char * password ;
2022-05-25 15:37:04 +03:00
char workstation_name [ CIFS_MAX_WORKSTATION_LEN ] ;
2010-10-14 03:15:00 +04:00
struct session_key auth_key ;
2010-10-28 18:53:07 +04:00
struct ntlmssp_auth * ntlmssp ; /* ciphertext, flags, server challenge */
2013-05-26 15:01:00 +04:00
enum securityEnum sectype ; /* what security flavor was specified? */
bool sign ; /* is signing required? */
Fix default behaviour for empty domains and add domainauto option
With commit 2b149f119 many things have been fixed/introduced.
However, the default behaviour for RawNTLMSSP authentication
seems to be wrong in case the domain is not passed on the command line.
The main points (see below) of the patch are:
- It aligns behaviour with Windows clients
- It fixes backward compatibility
- It fixes UPN
I compared this behaviour with the one from a Windows 10 command line
client. When no domains are specified on the command line, I traced
the packets and observed that the client does send an empty
domain to the server.
In the linux kernel case, the empty domain is replaced by the
primary domain communicated by the SMB server.
This means that, if the credentials are valid against the local server
but that server is part of a domain, then the kernel module will
ask to authenticate against that domain and we will get LOGON failure.
I compared the packet trace from the smbclient when no domain is passed
and, in that case, a default domain from the client smb.conf is taken.
Apparently, connection succeeds anyway, because when the domain passed
is not valid (in my case WORKGROUP), then the local one is tried and
authentication succeeds. I tried with any kind of invalid domain and
the result was always a connection.
So, trying to interpret what to do and picking a valid domain if none
is passed, seems the wrong thing to do.
To this end, a new option "domainauto" has been added in case the
user wants a mechanism for guessing.
Without this patch, backward compatibility also is broken.
With kernel 3.10, the default auth mechanism was NTLM.
One of our testing servers accepted NTLM and, because no
domains are passed, authentication was local.
Moving to RawNTLMSSP forced us to change our command line
to add a fake domain to prevent this mechanism from kicking in.
For the same reasons, UPN is broken because the domain is specified
in the username.
The SMB server will work out the domain from the UPN and authenticate
against the right server.
Without the patch, though, given the domain is empty, it gets replaced
with another domain that could be the wrong one for the authentication.
Signed-off-by: Germano Percossi <germano.percossi@citrix.com>
Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2016-12-15 10:01:18 +03:00
bool domainAuto : 1 ;
2011-12-27 16:22:00 +04:00
__u16 session_flags ;
2015-12-18 22:05:30 +03:00
__u8 smb3signingkey [ SMB3_SIGN_KEY_SIZE ] ;
2021-03-25 15:34:54 +03:00
__u8 smb3encryptionkey [ SMB3_ENC_DEC_KEY_SIZE ] ;
__u8 smb3decryptionkey [ SMB3_ENC_DEC_KEY_SIZE ] ;
2018-02-16 21:19:29 +03:00
__u8 preauth_sha_hash [ SMB2_PREAUTH_HASH_SIZE ] ;
2018-06-14 16:43:18 +03:00
/*
* Network interfaces available on the server this session is
* connected to .
*
* Other channels can be opened by connecting and binding this
* session to interfaces from this list .
*
* iface_lock should be taken when accessing any of these fields
*/
spinlock_t iface_lock ;
2021-07-19 13:54:46 +03:00
/* ========= begin: protected by iface_lock ======== */
2022-01-01 15:50:21 +03:00
struct list_head iface_list ;
2018-06-14 16:43:18 +03:00
size_t iface_count ;
unsigned long iface_last_update ; /* jiffies */
2021-07-19 13:54:46 +03:00
/* ========= end: protected by iface_lock ======== */
2019-09-20 05:32:20 +03:00
2021-07-19 13:54:46 +03:00
spinlock_t chan_lock ;
/* ========= begin: protected by chan_lock ======== */
2019-09-20 05:32:20 +03:00
# define CIFS_MAX_CHANNELS 16
2021-07-19 15:46:53 +03:00
# define CIFS_ALL_CHANNELS_SET(ses) \
( ( 1UL < < ( ses ) - > chan_count ) - 1 )
2022-04-08 16:31:37 +03:00
# define CIFS_ALL_CHANS_GOOD(ses) \
( ! ( ses ) - > chans_need_reconnect )
2021-07-19 15:46:53 +03:00
# define CIFS_ALL_CHANS_NEED_RECONNECT(ses) \
( ( ses ) - > chans_need_reconnect = = CIFS_ALL_CHANNELS_SET ( ses ) )
2021-07-19 17:14:46 +03:00
# define CIFS_SET_ALL_CHANS_NEED_RECONNECT(ses) \
( ( ses ) - > chans_need_reconnect = CIFS_ALL_CHANNELS_SET ( ses ) )
2021-07-19 15:46:53 +03:00
# define CIFS_CHAN_NEEDS_RECONNECT(ses, index) \
test_bit ( ( index ) , & ( ses ) - > chans_need_reconnect )
2022-04-08 16:31:37 +03:00
# define CIFS_CHAN_IN_RECONNECT(ses, index) \
( ( ses ) - > chans [ ( index ) ] . in_reconnect )
2021-07-19 15:46:53 +03:00
2019-09-20 05:32:20 +03:00
struct cifs_chan chans [ CIFS_MAX_CHANNELS ] ;
size_t chan_count ;
size_t chan_max ;
atomic_t chan_seq ; /* round robin state */
2021-07-19 15:46:53 +03:00
/*
* chans_need_reconnect is a bitmap indicating which of the channels
* under this smb session needs to be reconnected .
* If not multichannel session , only one bit will be used .
2021-07-19 16:54:16 +03:00
*
* We will ask for sess and tcon reconnection only if all the
* channels are marked for needing reconnection . This will
* enable the sessions on top to stay alive as long as any
* of the channels below is active .
2021-07-19 15:46:53 +03:00
*/
unsigned long chans_need_reconnect ;
2021-07-19 13:54:46 +03:00
/* ========= end: protected by chan_lock ======== */
2022-12-13 07:23:16 +03:00
struct cifs_ses * dfs_root_ses ;
2005-04-17 02:20:36 +04:00
} ;
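The channel macros inside struct cifs_ses treat chans_need_reconnect as one bit per channel. The userspace sketch below shows how the mask and the per-channel tests relate. It is not the kernel code: ses_sketch and all function names are hypothetical, plain shifts replace test_bit(), and the chan_lock that protects these fields in the real structure is omitted.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_CHANNELS 16

/* Only the two fields the macros touch. */
struct ses_sketch {
	size_t chan_count;
	unsigned long chans_need_reconnect;
};

/* Same expression as CIFS_ALL_CHANNELS_SET: the low chan_count bits set. */
static unsigned long all_channels_set(const struct ses_sketch *ses)
{
	assert(ses->chan_count <= SKETCH_MAX_CHANNELS);
	return (1UL << ses->chan_count) - 1;
}

/* Stand-in for CIFS_CHAN_NEEDS_RECONNECT (test_bit in the kernel). */
static int chan_needs_reconnect(const struct ses_sketch *ses, size_t index)
{
	return (int)((ses->chans_need_reconnect >> index) & 1UL);
}

/* Hypothetical helper: mark one channel as needing reconnect. */
static void mark_chan_needs_reconnect(struct ses_sketch *ses, size_t index)
{
	assert(index < ses->chan_count);
	ses->chans_need_reconnect |= 1UL << index;
}

/* Stand-in for CIFS_ALL_CHANS_GOOD: no bit set at all. */
static int all_chans_good(const struct ses_sketch *ses)
{
	return !ses->chans_need_reconnect;
}

/* Stand-in for CIFS_ALL_CHANS_NEED_RECONNECT: every channel bit set. */
static int all_chans_need_reconnect(const struct ses_sketch *ses)
{
	return ses->chans_need_reconnect == all_channels_set(ses);
}
```

With three channels the full mask is 0x7. According to the comment in the struct, session and tcon reconnection is requested only once all three bits are set.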
2012-09-19 17:22:45 +04:00
2012-07-13 13:58:14 +04:00
static inline bool
cap_unix ( struct cifs_ses * ses )
{
return ses - > server - > vals - > cap_unix & ses - > capabilities ;
}
2022-05-10 02:42:04 +03:00
/*
* common struct for holding inode info when searching for or updating an
* inode with new info
*/
# define CIFS_FATTR_DFS_REFERRAL 0x1
# define CIFS_FATTR_DELETE_PENDING 0x2
# define CIFS_FATTR_NEED_REVAL 0x4
# define CIFS_FATTR_INO_COLLISION 0x8
# define CIFS_FATTR_UNKNOWN_NLINK 0x10
# define CIFS_FATTR_FAKE_ROOT_INO 0x20
struct cifs_fattr {
u32 cf_flags ;
u32 cf_cifsattrs ;
u64 cf_uniqueid ;
u64 cf_eof ;
u64 cf_bytes ;
u64 cf_createtime ;
kuid_t cf_uid ;
kgid_t cf_gid ;
umode_t cf_mode ;
dev_t cf_rdev ;
unsigned int cf_nlink ;
unsigned int cf_dtype ;
struct timespec64 cf_atime ;
struct timespec64 cf_mtime ;
struct timespec64 cf_ctime ;
u32 cf_cifstag ;
2022-10-04 00:43:50 +03:00
char * cf_symlink_target ;
2022-05-10 02:42:04 +03:00
} ;
2005-04-17 02:20:36 +04:00
/*
* there is one of these for each connection to a resource on a particular
2007-06-28 23:44:13 +04:00
* session
2005-04-17 02:20:36 +04:00
*/
2011-05-27 08:34:02 +04:00
struct cifs_tcon {
2008-11-15 19:12:47 +03:00
struct list_head tcon_list ;
int tc_count ;
2016-11-04 21:50:31 +03:00
struct list_head rlist ; /* reconnect list */
2022-07-27 22:49:56 +03:00
spinlock_t tc_lock ; /* protect anything here that is not protected */
2018-10-20 01:14:32 +03:00
atomic_t num_local_opens ; /* num of all opens including disconnected */
atomic_t num_remote_opens ; /* num of all network opens on server */
2005-04-17 02:20:36 +04:00
struct list_head openFileList ;
2016-09-23 02:58:16 +03:00
spinlock_t open_file_lock ; /* protects list above */
2011-05-27 08:34:02 +04:00
struct cifs_ses * ses ; /* pointer to session associated with */
2022-09-21 22:05:53 +03:00
char tree_name [ MAX_TREE_SIZE + 1 ] ; /* UNC name of resource in ASCII */
2005-04-17 02:20:36 +04:00
char * nativeFileSystem ;
2008-12-06 04:41:21 +03:00
char * password ; /* for share-level security */
2011-12-27 16:04:00 +04:00
__u32 tid ; /* The 4 byte tree id */
2005-04-17 02:20:36 +04:00
__u16 Flags ; /* optional support bits */
smb3: cleanup and clarify status of tree connections
Currently the way the tid (tree connection) status is tracked
is confusing. The same enum is used for structs cifs_tcon
and cifs_ses and TCP_Server_info, but each of these three has
different states that they transition among. The current
code also unnecessarily uses camelCase.
Convert from use of statusEnum to a new tid_status_enum for
tree connections. The valid states for a tid are:
TID_NEW = 0,
TID_GOOD,
TID_EXITING,
TID_NEED_RECON,
TID_NEED_TCON,
TID_IN_TCON,
TID_NEED_FILES_INVALIDATE, /* unused, considering removing in future */
TID_IN_FILES_INVALIDATE
It also removes CifsNeedTcon, CifsInTcon, CifsNeedFilesInvalidate and
CifsInFilesInvalidate from the statusEnum used for session and
TCP_Server_Info since they are not relevant for those.
A follow on patch will fix the places where we use the
tcon->need_reconnect flag to be more consistent with the tid->status.
Also fixes a bug that was:
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-03-28 00:07:30 +03:00
enum tid_status_enum status ;
2005-04-17 02:20:36 +04:00
atomic_t num_smbs_sent ;
2012-05-28 14:16:31 +04:00
union {
struct {
atomic_t num_writes ;
atomic_t num_reads ;
atomic_t num_flushes ;
atomic_t num_oplock_brks ;
atomic_t num_opens ;
atomic_t num_closes ;
atomic_t num_deletes ;
atomic_t num_mkdirs ;
atomic_t num_posixopens ;
atomic_t num_posixmkdirs ;
atomic_t num_rmdirs ;
atomic_t num_renames ;
atomic_t num_t2renames ;
atomic_t num_ffirst ;
atomic_t num_fnext ;
atomic_t num_fclose ;
atomic_t num_hardlinks ;
atomic_t num_symlinks ;
atomic_t num_locks ;
atomic_t num_acl_get ;
atomic_t num_acl_set ;
} cifs_stats ;
2012-05-28 15:19:39 +04:00
struct {
atomic_t smb2_com_sent [ NUMBER_OF_SMB2_COMMANDS ] ;
atomic_t smb2_com_failed [ NUMBER_OF_SMB2_COMMANDS ] ;
} smb2_stats ;
2012-05-28 14:16:31 +04:00
} stats ;
2005-04-17 02:20:36 +04:00
__u64 bytes_read ;
__u64 bytes_written ;
2016-09-23 02:58:16 +03:00
spinlock_t stat_lock ; /* protects the two fields above */
2005-04-17 02:20:36 +04:00
FILE_SYSTEM_DEVICE_INFO fsDevInfo ;
2006-06-04 09:53:15 +04:00
FILE_SYSTEM_ATTRIBUTE_INFO fsAttrInfo ; /* ok if fs name truncated */
2005-04-17 02:20:36 +04:00
FILE_SYSTEM_UNIX_INFO fsUnixInfo ;
2018-01-24 15:46:10 +03:00
bool ipc : 1 ; /* set if connection to IPC$ share (always also pipe) */
bool pipe : 1 ; /* set if connection to pipe share */
bool print : 1 ; /* set if connection to printer share */
2008-04-29 04:06:05 +04:00
bool retry : 1 ;
bool nocase : 1 ;
2018-04-26 06:19:09 +03:00
bool nohandlecache : 1 ; /* if strange server resource prob can turn off */
2020-05-19 11:06:57 +03:00
bool nodelete : 1 ;
2008-05-15 20:44:38 +04:00
bool seal : 1 ; /* transport encryption for this mounted share */
2008-04-29 04:06:05 +04:00
bool unix_ext : 1 ; /* if false disable Linux extensions to CIFS protocol
2007-07-19 03:21:09 +04:00
for this mount even if server would support */
2018-05-21 07:41:10 +03:00
bool posix_extensions ; /* if true SMB3.11 posix extensions enabled */
2008-10-23 08:42:37 +04:00
bool local_lease : 1 ; /* check leases (only) on local system not remote */
2009-03-04 22:54:08 +03:00
bool broken_posix_open ; /* e.g. Samba server versions < 3.3.2, 3.2.9 */
2014-08-12 06:05:25 +04:00
bool broken_sparse_sup ; /* if server or share does not support sparse */
2008-11-13 22:45:32 +03:00
bool need_reconnect : 1 ; /* connection reset, tid now invalid */
2016-11-29 22:31:23 +03:00
bool need_reopen_files : 1 ; /* need to reopen tcon file handles */
2015-11-03 19:08:53 +03:00
bool use_resilient : 1 ; /* use resilient instead of durable handles */
2015-11-03 18:15:03 +03:00
bool use_persistent : 1 ; /* use persistent instead of durable handles */
2019-09-12 05:46:20 +03:00
bool no_lease : 1 ; /* Do not request leases on files or directories */
2021-04-09 17:31:37 +03:00
bool use_witness : 1 ; /* use witness protocol */
2013-06-19 23:15:30 +04:00
__le32 capabilities ;
2011-12-27 16:04:00 +04:00
__u32 share_flags ;
__u32 maximal_access ;
__u32 vol_serial_number ;
__le64 vol_create_time ;
2016-11-12 07:36:20 +03:00
__u64 snapshot_time ; /* for timewarp tokens - timestamp of snapshot */
2019-03-30 00:31:07 +03:00
__u32 handle_timeout ; /* persistent and durable handle timeout in ms */
2013-10-10 05:55:53 +04:00
__u32 ss_flags ; /* sector size flags */
__u32 perf_sector_size ; /* best sector size for perf */
2013-11-15 21:26:24 +04:00
__u32 max_chunks ;
__u32 max_bytes_chunk ;
__u32 max_bytes_copy ;
2010-07-05 16:42:27 +04:00
# ifdef CONFIG_CIFS_FSCACHE
u64 resource_id ; /* server resource id */
cifs: Support fscache indexing rewrite
Change the cifs filesystem to take account of the changes to fscache's
indexing rewrite and reenable caching in cifs.
The following changes have been made:
(1) The fscache_netfs struct is no more, and there's no need to register
the filesystem as a whole.
(2) The session cookie is now an fscache_volume cookie, allocated with
fscache_acquire_volume(). That takes three parameters: a string
representing the "volume" in the index, a string naming the cache to
use (or NULL) and a u64 that conveys coherency metadata for the
volume.
For cifs, I've made it render the volume name string as:
"cifs,<ipaddress>,<sharename>"
where the sharename has '/' characters replaced with ';'.
This probably needs rethinking a bit as the total name could exceed
the maximum filename component length.
Further, the coherency data is currently just set to 0. It needs
something else doing with it - I wonder if it would suffice simply to
sum the resource_id, vol_create_time and vol_serial_number or maybe
hash them.
(3) The fscache_cookie_def is no more and needed information is passed
directly to fscache_acquire_cookie(). The cache no longer calls back
into the filesystem, but rather metadata changes are indicated at
other times.
fscache_acquire_cookie() is passed the same keying and coherency
information as before.
(4) The functions to set/reset cookies are removed and
fscache_use_cookie() and fscache_unuse_cookie() are used instead.
fscache_use_cookie() is passed a flag to indicate if the cookie is
opened for writing. fscache_unuse_cookie() is passed updates for the
metadata if we changed it (ie. if the file was opened for writing).
These are called when the file is opened or closed.
(5) cifs_setattr_*() are made to call fscache_resize() to change the size
of the cache object.
(6) The functions to read and write data are stubbed out pending a
conversion to use netfslib.
Changes
=======
ver #8:
- Abstract cache invalidation into a helper function.
- Fix some checkpatch warnings[3].
ver #7:
- Removed the accidentally added-back call to get the super cookie in
cifs_root_iget().
- Fixed the right call to cifs_fscache_get_super_cookie() to take account
of the "-o fsc" mount flag.
ver #6:
- Moved the change of gfpflags_allow_blocking() to current_is_kswapd() for
cifs here.
- Fixed one of the error paths in cifs_atomic_open() to jump around the
call to use the cookie.
- Fixed an additional successful return in the middle of cifs_open() to
use the cookie on the way out.
- Only get a volume cookie (and thus inode cookies) when "-o fsc" is
supplied to mount.
ver #5:
- Fixed a couple of bits of cookie handling[2]:
- The cookie should be released in cifs_evict_inode(), not
cifsFileInfo_put_final(). The cookie needs to persist beyond file
closure so that writepages will be able to write to it.
- fscache_use_cookie() needs to be called in cifs_atomic_open() as it is
for cifs_open().
ver #4:
- Fixed the use of sizeof with memset.
- tcon->vol_create_time is __le64 so doesn't need cpu_to_le64().
ver #3:
- Canonicalise the cifs coherency data to make the cache portable.
- Set volume coherency data.
ver #2:
- Use gfpflags_allow_blocking() rather than using flag directly.
- Upgraded to -rc4 to allow for upstream changes[1].
- fscache_acquire_volume() now returns errors.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
cc: Steve French <smfrench@gmail.com>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: linux-cifs@vger.kernel.org
cc: linux-cachefs@redhat.com
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=23b55d673d7527b093cd97b7c217c82e70cd1af0 [1]
Link: https://lore.kernel.org/r/3419813.1641592362@warthog.procyon.org.uk/ [2]
Link: https://lore.kernel.org/r/CAH2r5muTanw9pJqzAHd01d9A8keeChkzGsCEH6=0rHutVLAF-A@mail.gmail.com/ [3]
Link: https://lore.kernel.org/r/163819671009.215744.11230627184193298714.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/163906982979.143852.10672081929614953210.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/163967187187.1823006.247415138444991444.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/164021579335.640689.2681324337038770579.stgit@warthog.procyon.org.uk/ # v4
Link: https://lore.kernel.org/r/3462849.1641593783@warthog.procyon.org.uk/ # v5
Link: https://lore.kernel.org/r/1318953.1642024578@warthog.procyon.org.uk/ # v6
Signed-off-by: Steve French <stfrench@microsoft.com>
2020-11-17 18:56:59 +03:00
struct fscache_volume * fscache ; /* cookie for share */
2010-07-05 16:42:27 +04:00
# endif
2012-09-19 17:22:45 +04:00
struct list_head pending_opens ; /* list of incomplete opens */
2022-08-31 05:49:42 +03:00
struct cached_fids * cfids ;
2007-07-19 03:21:09 +04:00
/* BB add field for back pointer to sb struct(s)? */
2018-11-14 21:01:21 +03:00
# ifdef CONFIG_CIFS_DFS_UPCALL
struct list_head ulist ; /* cache update list */
# endif
2022-06-06 12:17:56 +03:00
struct delayed_work query_interfaces ; /* query interfaces workqueue job */
2005-04-17 02:20:36 +04:00
} ;
2010-09-30 03:51:11 +04:00
/*
* This is a refcounted and timestamped container for a tcon pointer . The
* container holds a tcon reference . It is considered safe to free one of
* these when the tl_count goes to 0. The tl_time is the time of the last
* " get " on the container .
*/
struct tcon_link {
2010-10-28 19:16:44 +04:00
struct rb_node tl_rbnode ;
2013-02-06 13:48:56 +04:00
kuid_t tl_uid ;
2010-10-07 03:51:11 +04:00
unsigned long tl_flags ;
# define TCON_LINK_MASTER 0
# define TCON_LINK_PENDING 1
# define TCON_LINK_IN_TREE 2
unsigned long tl_time ;
atomic_t tl_count ;
2011-05-27 08:34:02 +04:00
struct cifs_tcon * tl_tcon ;
2010-09-30 03:51:11 +04:00
} ;
2010-10-07 03:51:11 +04:00
extern struct tcon_link * cifs_sb_tlink ( struct cifs_sb_info * cifs_sb ) ;
2018-08-01 02:26:11 +03:00
extern void smb3_free_compound_rqst ( int num_rqst , struct smb_rqst * rqst ) ;
2010-09-30 03:51:11 +04:00
2011-05-27 08:34:02 +04:00
static inline struct cifs_tcon *
2010-09-30 03:51:11 +04:00
tlink_tcon ( struct tcon_link * tlink )
{
2010-10-07 03:51:11 +04:00
return tlink - > tl_tcon ;
2010-09-30 03:51:11 +04:00
}
2018-06-04 23:29:35 +03:00
static inline struct tcon_link *
cifs_sb_master_tlink ( struct cifs_sb_info * cifs_sb )
{
return cifs_sb - > master_tlink ;
}
2010-10-07 03:51:11 +04:00
extern void cifs_put_tlink ( struct tcon_link * tlink ) ;
2010-09-30 03:51:11 +04:00
2010-09-30 03:51:11 +04:00
static inline struct tcon_link *
cifs_get_tlink ( struct tcon_link * tlink )
{
2010-10-07 03:51:11 +04:00
if ( tlink & & ! IS_ERR ( tlink ) )
atomic_inc ( & tlink - > tl_count ) ;
2010-09-30 03:51:11 +04:00
return tlink ;
}
2010-09-30 03:51:11 +04:00
/* This function is always expected to succeed */
2011-05-27 08:34:02 +04:00
extern struct cifs_tcon * cifs_sb_master_tcon ( struct cifs_sb_info * cifs_sb ) ;
2010-09-30 03:51:11 +04:00
2012-09-19 17:22:45 +04:00
# define CIFS_OPLOCK_NO_CHANGE 0xfe
struct cifs_pending_open {
struct list_head olist ;
struct tcon_link * tlink ;
__u8 lease_key [ 16 ] ;
__u32 oplock ;
} ;
2021-04-13 08:26:42 +03:00
struct cifs_deferred_close {
struct list_head dlist ;
struct tcon_link * tlink ;
__u16 netfid ;
__u64 persistent_fid ;
__u64 volatile_fid ;
} ;
2005-04-17 02:20:36 +04:00
/*
2006-08-03 01:56:33 +04:00
* This info hangs off the cifsFileInfo structure , pointed to by llist .
* This is used to track byte stream locks on the file
2005-04-17 02:20:36 +04:00
*/
struct cifsLockInfo {
2006-08-03 01:56:33 +04:00
struct list_head llist ; /* pointer to next cifsLockInfo */
2011-10-22 15:33:29 +04:00
struct list_head blist ; /* pointer to locks blocked on this */
wait_queue_head_t block_q ;
2006-08-03 01:56:33 +04:00
__u64 offset ;
__u64 length ;
2010-08-17 11:26:00 +04:00
__u32 pid ;
2018-10-04 02:24:38 +03:00
__u16 type ;
__u16 flags ;
2005-04-17 02:20:36 +04:00
} ;
/*
* One of these for each open instance of a file
*/
struct cifs_search_info {
loff_t index_of_last_entry ;
__u16 entries_in_buffer ;
__u16 info_level ;
__u32 resume_key ;
2007-06-28 23:44:13 +04:00
char * ntwrk_buf_start ;
char * srch_entries_start ;
2008-10-08 00:03:33 +04:00
char * last_entry ;
2011-07-16 23:24:37 +04:00
const char * presume_name ;
2005-04-17 02:20:36 +04:00
unsigned int resume_name_len ;
2008-04-29 04:06:05 +04:00
bool endOfSearch : 1 ;
bool emptyDir : 1 ;
bool unicode : 1 ;
bool smallBuf : 1 ; /* so we know which buf_release function to call */
2005-04-17 02:20:36 +04:00
} ;
2019-10-05 18:53:58 +03:00
# define ACL_NO_MODE ((umode_t)(-1))
2013-07-05 12:00:30 +04:00
struct cifs_open_parms {
struct cifs_tcon * tcon ;
struct cifs_sb_info * cifs_sb ;
int disposition ;
int desired_access ;
int create_options ;
const char * path ;
struct cifs_fid * fid ;
2018-06-01 03:16:54 +03:00
umode_t mode ;
2013-07-09 18:40:58 +04:00
bool reconnect : 1 ;
2013-07-05 12:00:30 +04:00
} ;
2012-09-19 03:20:26 +04:00
struct cifs_fid {
__u16 netfid ;
2012-09-19 03:20:26 +04:00
__u64 persistent_fid ; /* persist file id for smb2 */
__u64 volatile_fid ; /* volatile file id for smb2 */
2012-09-19 17:22:44 +04:00
__u8 lease_key [ SMB2_LEASE_KEY_SIZE ] ; /* lease key for smb2 */
2015-11-03 18:26:27 +03:00
__u8 create_guid [ 16 ] ;
cifs: fix rename() by ensuring source handle opened with DELETE bit
To rename a file in SMB2 we open it with the DELETE access and do a
special SetInfo on it. If the handle is missing the DELETE bit the
server will fail the SetInfo with STATUS_ACCESS_DENIED.
We currently try to reuse any existing opened handle we have with
cifs_get_writable_path(). That function looks for handles with WRITE
access but doesn't check for DELETE, making rename() fail if it finds
a handle to reuse. Simple reproducer below.
To select handles with the DELETE bit, this patch adds a flag argument
to cifs_get_writable_path() and find_writable_file() and the existing
'bool fsuid_only' argument is converted to a flag.
The cifsFileInfo struct only stores the UNIX open mode but not the
original SMB access flags. Since the DELETE bit is not mapped in that
mode, this patch stores the access mask in cifs_fid on file open,
which is accessible from cifsFileInfo.
Simple reproducer:
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#define E(s) perror(s), exit(1)
int main(int argc, char *argv[])
{
int fd, ret;
if (argc != 3) {
fprintf(stderr, "Usage: %s A B\n"
"create&open A in write mode, "
"rename A to B, close A\n", argv[0]);
return 0;
}
fd = openat(AT_FDCWD, argv[1], O_WRONLY|O_CREAT|O_SYNC, 0666);
if (fd == -1) E("openat()");
ret = rename(argv[1], argv[2]);
if (ret) E("rename()");
ret = close(fd);
if (ret) E("close()");
return ret;
}
$ gcc -o bugrename bugrename.c
$ ./bugrename /mnt/a /mnt/b
rename(): Permission denied
Fixes: 8de9e86c67ba ("cifs: create a helper to find a writeable handle by path name")
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
2020-02-21 13:19:06 +03:00
__u32 access ;
2012-09-19 17:22:45 +04:00
struct cifs_pending_open * pending_open ;
2013-09-05 21:30:16 +04:00
unsigned int epoch ;
2018-10-31 03:50:31 +03:00
# ifdef CONFIG_CIFS_DEBUG2
__u64 mid ;
# endif /* CONFIG_CIFS_DEBUG2 */
2013-09-05 21:30:16 +04:00
bool purge_cache ;
2012-09-19 03:20:26 +04:00
} ;
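The commit message embedded in struct cifs_fid explains why the access mask is stored: a handle reused for rename must have been opened with the DELETE bit. The userspace sketch below shows that selection idea. It is a simplification, not the kernel's find_writable_file(): handle_sketch, the flag name and the function are hypothetical, the sketch tests write permission on the access mask rather than on the UNIX open mode, and the two bit values follow the usual Windows access-mask layout and are included for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Access-mask bits, assumed to follow the Windows layout. */
#define SKETCH_FILE_WRITE_DATA	0x00000002u
#define SKETCH_DELETE		0x00010000u

/* Hypothetical lookup flag: the caller also needs DELETE access. */
#define SKETCH_FIND_WITH_DELETE	0x1u

struct handle_sketch {
	uint32_t access;	/* mask recorded when the file was opened */
	int invalid;		/* nonzero if the handle can no longer be used */
};

/*
 * Return the first usable writable handle, or NULL. With
 * SKETCH_FIND_WITH_DELETE set, handles lacking DELETE are skipped.
 */
static const struct handle_sketch *
find_writable_sketch(const struct handle_sketch *handles, size_t n,
		     unsigned int flags)
{
	size_t i;

	assert(handles != NULL || n == 0);

	for (i = 0; i < n; i++) {
		const struct handle_sketch *h = &handles[i];

		if (h->invalid)
			continue;
		if (!(h->access & SKETCH_FILE_WRITE_DATA))
			continue;
		if ((flags & SKETCH_FIND_WITH_DELETE) &&
		    !(h->access & SKETCH_DELETE))
			continue;
		return h;
	}
	return NULL;
}
```

A rename path would pass the DELETE flag and open a fresh handle when the lookup returns NULL, while ordinary writers can pass 0 and reuse any writable handle.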
2012-09-19 17:22:43 +04:00
struct cifs_fid_locks {
struct list_head llist ;
struct cifsFileInfo * cfile ; /* fid that owns locks */
struct list_head locks ; /* locks held by fid above */
} ;
2005-04-17 02:20:36 +04:00
struct cifsFileInfo {
2016-09-23 02:58:16 +03:00
/* following two lists are protected by tcon->open_file_lock */
2005-04-17 02:20:36 +04:00
struct list_head tlist ; /* pointer to next fid owned by tcon */
struct list_head flist ; /* next fid (file instance) for this inode */
2016-09-23 02:58:16 +03:00
/* lock list below protected by cifsi->lock_sem */
2012-09-19 17:22:43 +04:00
struct cifs_fid_locks * llist ; /* brlocks held by this fid */
2013-02-06 14:23:02 +04:00
kuid_t uid ; /* allows finding which FileInfo structure */
2005-04-17 02:20:36 +04:00
__u32 pid ; /* process id who opened file */
2012-09-19 03:20:26 +04:00
struct cifs_fid fid ; /* file id from remote */
2016-10-08 03:26:36 +03:00
struct list_head rlist ; /* reconnect list */
2022-12-09 01:11:00 +03:00
/* BB add lock scope info here if needed */
2005-04-17 02:20:36 +04:00
/* lock scope id (0 if none) */
2010-10-11 23:07:18 +04:00
struct dentry * dentry ;
2010-09-30 03:51:11 +04:00
struct tcon_link * tlink ;
2016-09-23 02:58:16 +03:00
unsigned int f_flags ;
2008-04-29 04:06:05 +04:00
bool invalidHandle : 1 ; /* file closed via session abend */
2020-04-10 05:42:18 +03:00
bool swapfile : 1 ;
2009-09-21 14:47:50 +04:00
bool oplock_break_cancelled : 1 ;
2019-10-30 02:51:19 +03:00
unsigned int oplock_epoch ; /* epoch from the lease break */
__u32 oplock_level ; /* oplock/lease level from the lease break */
2016-09-23 02:58:16 +03:00
int count ;
spinlock_t file_info_lock ; /* protects four flag/count fields above */
2009-04-09 05:14:32 +04:00
struct mutex fh_mutex ; /* prevents reopen race after dead ses*/
2005-04-17 02:20:36 +04:00
struct cifs_search_info srch_inf ;
2010-07-21 00:09:02 +04:00
struct work_struct oplock_break ; /* work for oplock breaks */
cifs: move cifsFileInfo_put logic into a work-queue
This patch moves the final part of the cifsFileInfo_put() logic where we
need a write lock on lock_sem to be processed in a separate thread that
holds no other locks.
This is to prevent deadlocks like the one below:
> there are 6 processes looping to while trying to down_write
> cinode->lock_sem, 5 of them from _cifsFileInfo_put, and one from
> cifs_new_fileinfo
>
> and there are 5 other processes which are blocked, several of them
> waiting on either PG_writeback or PG_locked (which are both set), all
> for the same page of the file
>
> 2 inode_lock() (inode->i_rwsem) for the file
> 1 wait_on_page_writeback() for the page
> 1 down_read(inode->i_rwsem) for the inode of the directory
> 1 inode_lock()(inode->i_rwsem) for the inode of the directory
> 1 __lock_page
>
>
> so processes are blocked waiting on:
> page flags PG_locked and PG_writeback for one specific page
> inode->i_rwsem for the directory
> inode->i_rwsem for the file
> cifsInodeInflock_sem
>
>
>
> here are the more gory details (let me know if I need to provide
> anything more/better):
>
> [0 00:48:22.765] [UN] PID: 8863 TASK: ffff8c691547c5c0 CPU: 3
> COMMAND: "reopen_file"
> #0 [ffff9965007e3ba8] __schedule at ffffffff9b6e6095
> #1 [ffff9965007e3c38] schedule at ffffffff9b6e64df
> #2 [ffff9965007e3c48] rwsem_down_write_slowpath at ffffffff9af283d7
> #3 [ffff9965007e3cb8] legitimize_path at ffffffff9b0f975d
> #4 [ffff9965007e3d08] path_openat at ffffffff9b0fe55d
> #5 [ffff9965007e3dd8] do_filp_open at ffffffff9b100a33
> #6 [ffff9965007e3ee0] do_sys_open at ffffffff9b0eb2d6
> #7 [ffff9965007e3f38] do_syscall_64 at ffffffff9ae04315
> * (I think legitimize_path is bogus)
>
> in path_openat
> } else {
> const char *s = path_init(nd, flags);
> while (!(error = link_path_walk(s, nd)) &&
> (error = do_last(nd, file, op)) > 0) { <<<<
>
> do_last:
> if (open_flag & O_CREAT)
> inode_lock(dir->d_inode); <<<<
> else
> so it's trying to take inode->i_rwsem for the directory
>
> DENTRY INODE SUPERBLK TYPE PATH
> ffff8c68bb8e79c0 ffff8c691158ef20 ffff8c6915bf9000 DIR /mnt/vm1_smb/
> inode.i_rwsem is ffff8c691158efc0
>
> <struct rw_semaphore 0xffff8c691158efc0>:
> owner: <struct task_struct 0xffff8c6914275d00> (UN - 8856 -
> reopen_file), counter: 0x0000000000000003
> waitlist: 2
> 0xffff9965007e3c90 8863 reopen_file UN 0 1:29:22.926
> RWSEM_WAITING_FOR_WRITE
> 0xffff996500393e00 9802 ls UN 0 1:17:26.700
> RWSEM_WAITING_FOR_READ
>
>
> the owner of the inode.i_rwsem of the directory is:
>
> [0 00:00:00.109] [UN] PID: 8856 TASK: ffff8c6914275d00 CPU: 3
> COMMAND: "reopen_file"
> #0 [ffff99650065b828] __schedule at ffffffff9b6e6095
> #1 [ffff99650065b8b8] schedule at ffffffff9b6e64df
> #2 [ffff99650065b8c8] schedule_timeout at ffffffff9b6e9f89
> #3 [ffff99650065b940] msleep at ffffffff9af573a9
> #4 [ffff99650065b948] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs]
> #5 [ffff99650065ba38] cifs_writepage_locked at ffffffffc0a0b8f3 [cifs]
> #6 [ffff99650065bab0] cifs_launder_page at ffffffffc0a0bb72 [cifs]
> #7 [ffff99650065bb30] invalidate_inode_pages2_range at ffffffff9b04d4bd
> #8 [ffff99650065bcb8] cifs_invalidate_mapping at ffffffffc0a11339 [cifs]
> #9 [ffff99650065bcd0] cifs_revalidate_mapping at ffffffffc0a1139a [cifs]
> #10 [ffff99650065bcf0] cifs_d_revalidate at ffffffffc0a014f6 [cifs]
> #11 [ffff99650065bd08] path_openat at ffffffff9b0fe7f7
> #12 [ffff99650065bdd8] do_filp_open at ffffffff9b100a33
> #13 [ffff99650065bee0] do_sys_open at ffffffff9b0eb2d6
> #14 [ffff99650065bf38] do_syscall_64 at ffffffff9ae04315
>
> cifs_launder_page is for page 0xffffd1e2c07d2480
>
> crash> page.index,mapping,flags 0xffffd1e2c07d2480
> index = 0x8
> mapping = 0xffff8c68f3cd0db0
> flags = 0xfffffc0008095
>
> PAGE-FLAG BIT VALUE
> PG_locked 0 0000001
> PG_uptodate 2 0000004
> PG_lru 4 0000010
> PG_waiters 7 0000080
> PG_writeback 15 0008000
>
>
> inode is ffff8c68f3cd0c40
> inode.i_rwsem is ffff8c68f3cd0ce0
> DENTRY INODE SUPERBLK TYPE PATH
> ffff8c68a1f1b480 ffff8c68f3cd0c40 ffff8c6915bf9000 REG
> /mnt/vm1_smb/testfile.8853
>
>
> this process holds the inode->i_rwsem for the parent directory, is
> laundering a page attached to the inode of the file it's opening, and in
> _cifsFileInfo_put is trying to down_write the cifsInodeInflock_sem
> for the file itself.
>
>
> <struct rw_semaphore 0xffff8c68f3cd0ce0>:
> owner: <struct task_struct 0xffff8c6914272e80> (UN - 8854 -
> reopen_file), counter: 0x0000000000000003
> waitlist: 1
> 0xffff9965005dfd80 8855 reopen_file UN 0 1:29:22.912
> RWSEM_WAITING_FOR_WRITE
>
> this is the inode.i_rwsem for the file
>
> the owner:
>
> [0 00:48:22.739] [UN] PID: 8854 TASK: ffff8c6914272e80 CPU: 2
> COMMAND: "reopen_file"
> #0 [ffff99650054fb38] __schedule at ffffffff9b6e6095
> #1 [ffff99650054fbc8] schedule at ffffffff9b6e64df
> #2 [ffff99650054fbd8] io_schedule at ffffffff9b6e68e2
> #3 [ffff99650054fbe8] __lock_page at ffffffff9b03c56f
> #4 [ffff99650054fc80] pagecache_get_page at ffffffff9b03dcdf
> #5 [ffff99650054fcc0] grab_cache_page_write_begin at ffffffff9b03ef4c
> #6 [ffff99650054fcd0] cifs_write_begin at ffffffffc0a064ec [cifs]
> #7 [ffff99650054fd30] generic_perform_write at ffffffff9b03bba4
> #8 [ffff99650054fda8] __generic_file_write_iter at ffffffff9b04060a
> #9 [ffff99650054fdf0] cifs_strict_writev.cold.70 at ffffffffc0a4469b [cifs]
> #10 [ffff99650054fe48] new_sync_write at ffffffff9b0ec1dd
> #11 [ffff99650054fed0] vfs_write at ffffffff9b0eed35
> #12 [ffff99650054ff00] ksys_write at ffffffff9b0eefd9
> #13 [ffff99650054ff38] do_syscall_64 at ffffffff9ae04315
>
> the process holds the inode->i_rwsem for the file to which it's writing,
> and is trying to __lock_page for the same page as in the other processes
>
>
> the other tasks:
> [0 00:00:00.028] [UN] PID: 8859 TASK: ffff8c6915479740 CPU: 2
> COMMAND: "reopen_file"
> #0 [ffff9965007b39d8] __schedule at ffffffff9b6e6095
> #1 [ffff9965007b3a68] schedule at ffffffff9b6e64df
> #2 [ffff9965007b3a78] schedule_timeout at ffffffff9b6e9f89
> #3 [ffff9965007b3af0] msleep at ffffffff9af573a9
> #4 [ffff9965007b3af8] cifs_new_fileinfo.cold.61 at ffffffffc0a42a07 [cifs]
> #5 [ffff9965007b3b78] cifs_open at ffffffffc0a0709d [cifs]
> #6 [ffff9965007b3cd8] do_dentry_open at ffffffff9b0e9b7a
> #7 [ffff9965007b3d08] path_openat at ffffffff9b0fe34f
> #8 [ffff9965007b3dd8] do_filp_open at ffffffff9b100a33
> #9 [ffff9965007b3ee0] do_sys_open at ffffffff9b0eb2d6
> #10 [ffff9965007b3f38] do_syscall_64 at ffffffff9ae04315
>
> this is opening the file, and is trying to down_write cinode->lock_sem
>
>
> [0 00:00:00.041] [UN] PID: 8860 TASK: ffff8c691547ae80 CPU: 2
> COMMAND: "reopen_file"
> [0 00:00:00.057] [UN] PID: 8861 TASK: ffff8c6915478000 CPU: 3
> COMMAND: "reopen_file"
> [0 00:00:00.059] [UN] PID: 8858 TASK: ffff8c6914271740 CPU: 2
> COMMAND: "reopen_file"
> [0 00:00:00.109] [UN] PID: 8862 TASK: ffff8c691547dd00 CPU: 6
> COMMAND: "reopen_file"
> #0 [ffff9965007c3c78] __schedule at ffffffff9b6e6095
> #1 [ffff9965007c3d08] schedule at ffffffff9b6e64df
> #2 [ffff9965007c3d18] schedule_timeout at ffffffff9b6e9f89
> #3 [ffff9965007c3d90] msleep at ffffffff9af573a9
> #4 [ffff9965007c3d98] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs]
> #5 [ffff9965007c3e88] cifs_close at ffffffffc0a07aaf [cifs]
> #6 [ffff9965007c3ea0] __fput at ffffffff9b0efa6e
> #7 [ffff9965007c3ee8] task_work_run at ffffffff9aef1614
> #8 [ffff9965007c3f20] exit_to_usermode_loop at ffffffff9ae03d6f
> #9 [ffff9965007c3f38] do_syscall_64 at ffffffff9ae0444c
>
> closing the file, and trying to down_write cifsi->lock_sem
>
>
> [0 00:48:22.839] [UN] PID: 8857 TASK: ffff8c6914270000 CPU: 7
> COMMAND: "reopen_file"
> #0 [ffff9965006a7cc8] __schedule at ffffffff9b6e6095
> #1 [ffff9965006a7d58] schedule at ffffffff9b6e64df
> #2 [ffff9965006a7d68] io_schedule at ffffffff9b6e68e2
> #3 [ffff9965006a7d78] wait_on_page_bit at ffffffff9b03cac6
> #4 [ffff9965006a7e10] __filemap_fdatawait_range at ffffffff9b03b028
> #5 [ffff9965006a7ed8] filemap_write_and_wait at ffffffff9b040165
> #6 [ffff9965006a7ef0] cifs_flush at ffffffffc0a0c2fa [cifs]
> #7 [ffff9965006a7f10] filp_close at ffffffff9b0e93f1
> #8 [ffff9965006a7f30] __x64_sys_close at ffffffff9b0e9a0e
> #9 [ffff9965006a7f38] do_syscall_64 at ffffffff9ae04315
>
> in __filemap_fdatawait_range
> wait_on_page_writeback(page);
> for the same page of the file
>
>
>
> [0 00:48:22.718] [UN] PID: 8855 TASK: ffff8c69142745c0 CPU: 7
> COMMAND: "reopen_file"
> #0 [ffff9965005dfc98] __schedule at ffffffff9b6e6095
> #1 [ffff9965005dfd28] schedule at ffffffff9b6e64df
> #2 [ffff9965005dfd38] rwsem_down_write_slowpath at ffffffff9af283d7
> #3 [ffff9965005dfdf0] cifs_strict_writev at ffffffffc0a0c40a [cifs]
> #4 [ffff9965005dfe48] new_sync_write at ffffffff9b0ec1dd
> #5 [ffff9965005dfed0] vfs_write at ffffffff9b0eed35
> #6 [ffff9965005dff00] ksys_write at ffffffff9b0eefd9
> #7 [ffff9965005dff38] do_syscall_64 at ffffffff9ae04315
>
> inode_lock(inode);
>
>
> and one 'ls' later on, to see whether the rest of the mount is available
> (the test file is in the root, so we get blocked up on the directory
> ->i_rwsem), so the entire mount is unavailable
>
> [0 00:36:26.473] [UN] PID: 9802 TASK: ffff8c691436ae80 CPU: 4
> COMMAND: "ls"
> #0 [ffff996500393d28] __schedule at ffffffff9b6e6095
> #1 [ffff996500393db8] schedule at ffffffff9b6e64df
> #2 [ffff996500393dc8] rwsem_down_read_slowpath at ffffffff9b6e9421
> #3 [ffff996500393e78] down_read_killable at ffffffff9b6e95e2
> #4 [ffff996500393e88] iterate_dir at ffffffff9b103c56
> #5 [ffff996500393ec8] ksys_getdents64 at ffffffff9b104b0c
> #6 [ffff996500393f30] __x64_sys_getdents64 at ffffffff9b104bb6
> #7 [ffff996500393f38] do_syscall_64 at ffffffff9ae04315
>
> in iterate_dir:
> if (shared)
> res = down_read_killable(&inode->i_rwsem); <<<<
> else
> res = down_write_killable(&inode->i_rwsem);
>
Reported-by: Frank Sorenson <sorenson@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-03 06:06:37 +03:00
struct work_struct put ; /* work for the final part of _put */
2021-04-13 08:26:42 +03:00
struct delayed_work deferred ;
2021-05-05 13:56:47 +03:00
bool deferred_close_scheduled ; /* Flag to indicate close is scheduled */
2022-10-04 00:43:50 +03:00
char * symlink_target ;
2005-04-17 02:20:36 +04:00
} ;
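The commit message above describes deferring the last step of a put to a context that holds no other locks. A minimal user-space sketch of that pattern follows, assuming a single-threaded stand-in for the kernel workqueue. Every name in it (`struct work`, `queue_work_sketch`, `run_pending_work`, `struct deferred_file` and its helpers) is hypothetical and is not the cifs or workqueue API, and the locking around the counter is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel work item. */
struct work {
	void (*fn)(struct work *w);
	struct work *next;
};

static struct work *pending_head;	/* simple LIFO of queued work items */

/* Queue a work item; nothing runs until run_pending_work() is called. */
static void queue_work_sketch(struct work *w)
{
	w->next = pending_head;
	pending_head = w;
}

/* Stand-in for the worker thread: runs every queued item, returns the count. */
static int run_pending_work(void)
{
	int ran = 0;

	while (pending_head) {
		struct work *w = pending_head;

		pending_head = w->next;
		w->fn(w);
		ran++;
	}
	return ran;
}

/* Hypothetical object with an embedded work item, like the 'put' field. */
struct deferred_file {
	int count;		/* reference count (locking omitted here) */
	int torn_down;		/* set once the final teardown has run */
	struct work put;	/* work for the final part of the put */
};

/* Final teardown; runs from the worker, so the caller's locks are not held. */
static void deferred_file_put_final(struct work *w)
{
	struct deferred_file *f = (struct deferred_file *)
		((char *)w - offsetof(struct deferred_file, put));

	f->torn_down = 1;
}

static void deferred_file_init(struct deferred_file *f)
{
	f->count = 1;
	f->torn_down = 0;
	f->put.fn = deferred_file_put_final;
	f->put.next = NULL;
}

static void deferred_file_get(struct deferred_file *f)
{
	++f->count;
}

/* Drop a reference; the last put only queues the teardown. */
static void deferred_file_put(struct deferred_file *f)
{
	if (--f->count > 0)
		return;
	queue_work_sketch(&f->put);
}
```

The point of the split is that `deferred_file_put()` can be called from any context, including one that already holds inode or page locks, because the teardown that needs further locks only runs later from the worker.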
2011-05-26 10:01:59 +04:00
struct cifs_io_parms {
__u16 netfid ;
2012-09-19 03:20:29 +04:00
__u64 persistent_fid ; /* persist file id for smb2 */
__u64 volatile_fid ; /* volatile file id for smb2 */
2011-05-26 10:01:59 +04:00
__u32 pid ;
__u64 offset ;
unsigned int length ;
2011-05-27 08:34:02 +04:00
struct cifs_tcon * tcon ;
2020-05-31 20:38:22 +03:00
struct TCP_Server_Info * server ;
2011-05-26 10:01:59 +04:00
} ;
2017-04-25 21:52:29 +03:00
struct cifs_aio_ctx {
struct kref refcount ;
struct list_head list ;
struct mutex aio_mutex ;
struct completion done ;
struct iov_iter iter ;
struct kiocb * iocb ;
struct cifsFileInfo * cfile ;
struct bio_vec * bv ;
2017-04-25 21:52:31 +03:00
loff_t pos ;
cifs: Change the I/O paths to use an iterator rather than a page list
Currently, the cifs I/O paths hand lists of pages from the VM interface
routines at the top all the way through the intervening layers to the
socket interface at the bottom.
This is a problem, however, for interfacing with netfslib which passes an
iterator through to the ->issue_read() method (and will pass an iterator
through to the ->issue_write() method in future). Netfslib takes over
bounce buffering for direct I/O, async I/O and encrypted content, so cifs
doesn't need to do that. Netfslib also converts IOVEC-type iterators into
BVEC-type iterators if necessary.
Further, cifs needs foliating - and folios may come in a variety of sizes,
so a page list pointing to an array of heterogeneous pages may cause
problems in places such as where crypto is done.
Change the cifs I/O paths to hand iov_iter iterators all the way through
instead.
Notes:
(1) Some old routines are #if'd out to be removed in a follow up patch so
as to avoid confusing diff, thereby making the diff output easier to
follow. I've removed functions that don't overlap with anything
added.
(2) struct smb_rqst loses rq_pages, rq_offset, rq_npages, rq_pagesz and
rq_tailsz which describe the pages forming the buffer; instead there's
an rq_iter describing the source buffer and an rq_buffer which is used
to hold the buffer for encryption.
(3) struct cifs_readdata and cifs_writedata are similarly modified to
smb_rqst. The ->read_into_pages() and ->copy_into_pages() are then
replaced with passing the iterator directly to the socket.
The iterators are stored in these structs so that they are persistent
and don't get deallocated when the function returns (unlike if they
were stack variables).
(4) Buffered writeback is overhauled, borrowing the code from the afs
filesystem to gather up contiguous runs of folios. The XARRAY-type
iterator is then used to refer directly to the pagecache and can be
passed to the socket to transmit data directly from there.
This includes:
cifs_extend_writeback()
cifs_write_back_from_locked_folio()
cifs_writepages_region()
cifs_writepages()
(5) Pages are converted to folios.
(6) Direct I/O uses netfs_extract_user_iter() to create a BVEC-type
iterator from an IOBUF/UBUF-type source iterator.
(7) smb2_get_aead_req() uses netfs_extract_iter_to_sg() to extract page
fragments from the iterator into the scatterlists that the crypto
layer prefers.
(8) smb2_init_transform_rq() attached pages to smb_rqst::rq_buffer, an
xarray, to use as a bounce buffer for encryption. An XARRAY-type
iterator can then be used to pass the bounce buffer to lower layers.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <sfrench@samba.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Paulo Alcantara <pc@cjr.nz>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
Link: https://lore.kernel.org/r/164311907995.2806745.400147335497304099.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/164928620163.457102.11602306234438271112.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165211420279.3154751.15923591172438186144.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165348880385.2106726.3220789453472800240.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165364827111.3334034.934805882842932881.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/166126396180.708021.271013668175370826.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/166697259595.61150.5982032408321852414.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/166732031756.3186319.12528413619888902872.stgit@warthog.procyon.org.uk/ # rfc
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-01-25 00:13:24 +03:00
unsigned int nr_pinned_pages ;
2017-04-25 21:52:29 +03:00
ssize_t rc ;
unsigned int len ;
unsigned int total_len ;
2022-01-25 00:13:24 +03:00
unsigned int bv_need_unpin ; /* If ->bv[] needs unpinning */
2017-04-25 21:52:29 +03:00
bool should_dirty ;
2018-11-01 01:13:09 +03:00
/*
* Indicates whether this aio_ctx is for direct_io.
* If so, iter is a copy of the user-passed iov_iter.
*/
bool direct_io ;
2017-04-25 21:52:29 +03:00
} ;
2012-09-19 03:20:29 +04:00
/* asynchronous read support */
struct cifs_readdata {
struct kref refcount ;
struct list_head list ;
struct completion done ;
struct cifsFileInfo * cfile ;
struct address_space * mapping ;
2017-04-25 21:52:30 +03:00
struct cifs_aio_ctx * ctx ;
2012-09-19 03:20:29 +04:00
__u64 offset ;
2022-01-25 00:13:24 +03:00
ssize_t got_bytes ;
2012-09-19 03:20:29 +04:00
unsigned int bytes ;
pid_t pid ;
int result ;
struct work_struct work ;
2022-01-25 00:13:24 +03:00
struct iov_iter iter ;
2016-11-24 02:14:57 +03:00
struct kvec iov [ 2 ] ;
2020-05-31 20:38:22 +03:00
struct TCP_Server_Info * server ;
2017-11-23 03:38:46 +03:00
# ifdef CONFIG_CIFS_SMB_DIRECT
struct smbd_mr * mr ;
# endif
2019-01-16 22:12:41 +03:00
struct cifs_credits credits ;
2012-09-19 03:20:29 +04:00
} ;
2012-09-19 03:20:29 +04:00
/* asynchronous write support */
struct cifs_writedata {
struct kref refcount ;
struct list_head list ;
struct completion done ;
enum writeback_sync_modes sync_mode ;
struct work_struct work ;
struct cifsFileInfo * cfile ;
2017-04-25 21:52:31 +03:00
struct cifs_aio_ctx * ctx ;
2022-01-25 00:13:24 +03:00
struct iov_iter iter ;
struct bio_vec * bv ;
2012-09-19 03:20:29 +04:00
__u64 offset ;
pid_t pid ;
unsigned int bytes ;
int result ;
2020-05-31 20:38:22 +03:00
struct TCP_Server_Info * server ;
2017-11-23 03:38:45 +03:00
# ifdef CONFIG_CIFS_SMB_DIRECT
struct smbd_mr * mr ;
# endif
2019-01-16 22:12:41 +03:00
struct cifs_credits credits ;
2012-09-19 03:20:29 +04:00
} ;
2010-10-15 23:34:06 +04:00
/*
 * Take a reference on the file private data. Must be called with
2016-09-23 02:58:16 +03:00
 * cfile->file_info_lock held.
2010-10-15 23:34:06 +04:00
 */
2012-07-25 22:59:54 +04:00
static inline void
cifsFileInfo_get_locked(struct cifsFileInfo *cifs_file)
2009-08-31 19:07:12 +04:00
{
2010-10-15 23:34:06 +04:00
	++cifs_file->count;
2009-08-31 19:07:12 +04:00
}
2012-07-25 22:59:54 +04:00
struct cifsFileInfo * cifsFileInfo_get ( struct cifsFileInfo * cifs_file ) ;
cifs: move cifsFileInfo_put logic into a work-queue
This patch moves the final part of the cifsFileInfo_put() logic where we
need a write lock on lock_sem to be processed in a separate thread that
holds no other locks.
This is to prevent deadlocks like the one below:
> there are 6 processes looping to while trying to down_write
> cinode->lock_sem, 5 of them from _cifsFileInfo_put, and one from
> cifs_new_fileinfo
>
> and there are 5 other processes which are blocked, several of them
> waiting on either PG_writeback or PG_locked (which are both set), all
> for the same page of the file
>
> 2 inode_lock() (inode->i_rwsem) for the file
> 1 wait_on_page_writeback() for the page
> 1 down_read(inode->i_rwsem) for the inode of the directory
> 1 inode_lock()(inode->i_rwsem) for the inode of the directory
> 1 __lock_page
>
>
> so processes are blocked waiting on:
> page flags PG_locked and PG_writeback for one specific page
> inode->i_rwsem for the directory
> inode->i_rwsem for the file
> cifsInodeInfo->lock_sem
>
>
>
> here are the more gory details (let me know if I need to provide
> anything more/better):
>
> [0 00:48:22.765] [UN] PID: 8863 TASK: ffff8c691547c5c0 CPU: 3
> COMMAND: "reopen_file"
> #0 [ffff9965007e3ba8] __schedule at ffffffff9b6e6095
> #1 [ffff9965007e3c38] schedule at ffffffff9b6e64df
> #2 [ffff9965007e3c48] rwsem_down_write_slowpath at ffffffff9af283d7
> #3 [ffff9965007e3cb8] legitimize_path at ffffffff9b0f975d
> #4 [ffff9965007e3d08] path_openat at ffffffff9b0fe55d
> #5 [ffff9965007e3dd8] do_filp_open at ffffffff9b100a33
> #6 [ffff9965007e3ee0] do_sys_open at ffffffff9b0eb2d6
> #7 [ffff9965007e3f38] do_syscall_64 at ffffffff9ae04315
> * (I think legitimize_path is bogus)
>
> in path_openat
> } else {
> const char *s = path_init(nd, flags);
> while (!(error = link_path_walk(s, nd)) &&
> (error = do_last(nd, file, op)) > 0) { <<<<
>
> do_last:
> if (open_flag & O_CREAT)
> inode_lock(dir->d_inode); <<<<
> else
> so it's trying to take inode->i_rwsem for the directory
>
> DENTRY INODE SUPERBLK TYPE PATH
> ffff8c68bb8e79c0 ffff8c691158ef20 ffff8c6915bf9000 DIR /mnt/vm1_smb/
> inode.i_rwsem is ffff8c691158efc0
>
> <struct rw_semaphore 0xffff8c691158efc0>:
> owner: <struct task_struct 0xffff8c6914275d00> (UN - 8856 -
> reopen_file), counter: 0x0000000000000003
> waitlist: 2
> 0xffff9965007e3c90 8863 reopen_file UN 0 1:29:22.926
> RWSEM_WAITING_FOR_WRITE
> 0xffff996500393e00 9802 ls UN 0 1:17:26.700
> RWSEM_WAITING_FOR_READ
>
>
> the owner of the inode.i_rwsem of the directory is:
>
> [0 00:00:00.109] [UN] PID: 8856 TASK: ffff8c6914275d00 CPU: 3
> COMMAND: "reopen_file"
> #0 [ffff99650065b828] __schedule at ffffffff9b6e6095
> #1 [ffff99650065b8b8] schedule at ffffffff9b6e64df
> #2 [ffff99650065b8c8] schedule_timeout at ffffffff9b6e9f89
> #3 [ffff99650065b940] msleep at ffffffff9af573a9
> #4 [ffff99650065b948] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs]
> #5 [ffff99650065ba38] cifs_writepage_locked at ffffffffc0a0b8f3 [cifs]
> #6 [ffff99650065bab0] cifs_launder_page at ffffffffc0a0bb72 [cifs]
> #7 [ffff99650065bb30] invalidate_inode_pages2_range at ffffffff9b04d4bd
> #8 [ffff99650065bcb8] cifs_invalidate_mapping at ffffffffc0a11339 [cifs]
> #9 [ffff99650065bcd0] cifs_revalidate_mapping at ffffffffc0a1139a [cifs]
> #10 [ffff99650065bcf0] cifs_d_revalidate at ffffffffc0a014f6 [cifs]
> #11 [ffff99650065bd08] path_openat at ffffffff9b0fe7f7
> #12 [ffff99650065bdd8] do_filp_open at ffffffff9b100a33
> #13 [ffff99650065bee0] do_sys_open at ffffffff9b0eb2d6
> #14 [ffff99650065bf38] do_syscall_64 at ffffffff9ae04315
>
> cifs_launder_page is for page 0xffffd1e2c07d2480
>
> crash> page.index,mapping,flags 0xffffd1e2c07d2480
> index = 0x8
> mapping = 0xffff8c68f3cd0db0
> flags = 0xfffffc0008095
>
> PAGE-FLAG BIT VALUE
> PG_locked 0 0000001
> PG_uptodate 2 0000004
> PG_lru 4 0000010
> PG_waiters 7 0000080
> PG_writeback 15 0008000
>
>
> inode is ffff8c68f3cd0c40
> inode.i_rwsem is ffff8c68f3cd0ce0
> DENTRY INODE SUPERBLK TYPE PATH
> ffff8c68a1f1b480 ffff8c68f3cd0c40 ffff8c6915bf9000 REG
> /mnt/vm1_smb/testfile.8853
>
>
> this process holds the inode->i_rwsem for the parent directory, is
> laundering a page attached to the inode of the file it's opening, and in
> _cifsFileInfo_put is trying to down_write the cifsInodeInfo->lock_sem
> for the file itself.
>
>
> <struct rw_semaphore 0xffff8c68f3cd0ce0>:
> owner: <struct task_struct 0xffff8c6914272e80> (UN - 8854 -
> reopen_file), counter: 0x0000000000000003
> waitlist: 1
> 0xffff9965005dfd80 8855 reopen_file UN 0 1:29:22.912
> RWSEM_WAITING_FOR_WRITE
>
> this is the inode.i_rwsem for the file
>
> the owner:
>
> [0 00:48:22.739] [UN] PID: 8854 TASK: ffff8c6914272e80 CPU: 2
> COMMAND: "reopen_file"
> #0 [ffff99650054fb38] __schedule at ffffffff9b6e6095
> #1 [ffff99650054fbc8] schedule at ffffffff9b6e64df
> #2 [ffff99650054fbd8] io_schedule at ffffffff9b6e68e2
> #3 [ffff99650054fbe8] __lock_page at ffffffff9b03c56f
> #4 [ffff99650054fc80] pagecache_get_page at ffffffff9b03dcdf
> #5 [ffff99650054fcc0] grab_cache_page_write_begin at ffffffff9b03ef4c
> #6 [ffff99650054fcd0] cifs_write_begin at ffffffffc0a064ec [cifs]
> #7 [ffff99650054fd30] generic_perform_write at ffffffff9b03bba4
> #8 [ffff99650054fda8] __generic_file_write_iter at ffffffff9b04060a
> #9 [ffff99650054fdf0] cifs_strict_writev.cold.70 at ffffffffc0a4469b [cifs]
> #10 [ffff99650054fe48] new_sync_write at ffffffff9b0ec1dd
> #11 [ffff99650054fed0] vfs_write at ffffffff9b0eed35
> #12 [ffff99650054ff00] ksys_write at ffffffff9b0eefd9
> #13 [ffff99650054ff38] do_syscall_64 at ffffffff9ae04315
>
> the process holds the inode->i_rwsem for the file to which it's writing,
> and is trying to __lock_page for the same page as in the other processes
>
>
> the other tasks:
> [0 00:00:00.028] [UN] PID: 8859 TASK: ffff8c6915479740 CPU: 2
> COMMAND: "reopen_file"
> #0 [ffff9965007b39d8] __schedule at ffffffff9b6e6095
> #1 [ffff9965007b3a68] schedule at ffffffff9b6e64df
> #2 [ffff9965007b3a78] schedule_timeout at ffffffff9b6e9f89
> #3 [ffff9965007b3af0] msleep at ffffffff9af573a9
> #4 [ffff9965007b3af8] cifs_new_fileinfo.cold.61 at ffffffffc0a42a07 [cifs]
> #5 [ffff9965007b3b78] cifs_open at ffffffffc0a0709d [cifs]
> #6 [ffff9965007b3cd8] do_dentry_open at ffffffff9b0e9b7a
> #7 [ffff9965007b3d08] path_openat at ffffffff9b0fe34f
> #8 [ffff9965007b3dd8] do_filp_open at ffffffff9b100a33
> #9 [ffff9965007b3ee0] do_sys_open at ffffffff9b0eb2d6
> #10 [ffff9965007b3f38] do_syscall_64 at ffffffff9ae04315
>
> this is opening the file, and is trying to down_write cinode->lock_sem
>
>
> [0 00:00:00.041] [UN] PID: 8860 TASK: ffff8c691547ae80 CPU: 2
> COMMAND: "reopen_file"
> [0 00:00:00.057] [UN] PID: 8861 TASK: ffff8c6915478000 CPU: 3
> COMMAND: "reopen_file"
> [0 00:00:00.059] [UN] PID: 8858 TASK: ffff8c6914271740 CPU: 2
> COMMAND: "reopen_file"
> [0 00:00:00.109] [UN] PID: 8862 TASK: ffff8c691547dd00 CPU: 6
> COMMAND: "reopen_file"
> #0 [ffff9965007c3c78] __schedule at ffffffff9b6e6095
> #1 [ffff9965007c3d08] schedule at ffffffff9b6e64df
> #2 [ffff9965007c3d18] schedule_timeout at ffffffff9b6e9f89
> #3 [ffff9965007c3d90] msleep at ffffffff9af573a9
> #4 [ffff9965007c3d98] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs]
> #5 [ffff9965007c3e88] cifs_close at ffffffffc0a07aaf [cifs]
> #6 [ffff9965007c3ea0] __fput at ffffffff9b0efa6e
> #7 [ffff9965007c3ee8] task_work_run at ffffffff9aef1614
> #8 [ffff9965007c3f20] exit_to_usermode_loop at ffffffff9ae03d6f
> #9 [ffff9965007c3f38] do_syscall_64 at ffffffff9ae0444c
>
> closing the file, and trying to down_write cifsi->lock_sem
>
>
> [0 00:48:22.839] [UN] PID: 8857 TASK: ffff8c6914270000 CPU: 7
> COMMAND: "reopen_file"
> #0 [ffff9965006a7cc8] __schedule at ffffffff9b6e6095
> #1 [ffff9965006a7d58] schedule at ffffffff9b6e64df
> #2 [ffff9965006a7d68] io_schedule at ffffffff9b6e68e2
> #3 [ffff9965006a7d78] wait_on_page_bit at ffffffff9b03cac6
> #4 [ffff9965006a7e10] __filemap_fdatawait_range at ffffffff9b03b028
> #5 [ffff9965006a7ed8] filemap_write_and_wait at ffffffff9b040165
> #6 [ffff9965006a7ef0] cifs_flush at ffffffffc0a0c2fa [cifs]
> #7 [ffff9965006a7f10] filp_close at ffffffff9b0e93f1
> #8 [ffff9965006a7f30] __x64_sys_close at ffffffff9b0e9a0e
> #9 [ffff9965006a7f38] do_syscall_64 at ffffffff9ae04315
>
> in __filemap_fdatawait_range
> wait_on_page_writeback(page);
> for the same page of the file
>
>
>
> [0 00:48:22.718] [UN] PID: 8855 TASK: ffff8c69142745c0 CPU: 7
> COMMAND: "reopen_file"
> #0 [ffff9965005dfc98] __schedule at ffffffff9b6e6095
> #1 [ffff9965005dfd28] schedule at ffffffff9b6e64df
> #2 [ffff9965005dfd38] rwsem_down_write_slowpath at ffffffff9af283d7
> #3 [ffff9965005dfdf0] cifs_strict_writev at ffffffffc0a0c40a [cifs]
> #4 [ffff9965005dfe48] new_sync_write at ffffffff9b0ec1dd
> #5 [ffff9965005dfed0] vfs_write at ffffffff9b0eed35
> #6 [ffff9965005dff00] ksys_write at ffffffff9b0eefd9
> #7 [ffff9965005dff38] do_syscall_64 at ffffffff9ae04315
>
> inode_lock(inode);
>
>
> and one 'ls' later on, to see whether the rest of the mount is available
> (the test file is in the root, so we get blocked up on the directory
> ->i_rwsem), so the entire mount is unavailable
>
> [0 00:36:26.473] [UN] PID: 9802 TASK: ffff8c691436ae80 CPU: 4
> COMMAND: "ls"
> #0 [ffff996500393d28] __schedule at ffffffff9b6e6095
> #1 [ffff996500393db8] schedule at ffffffff9b6e64df
> #2 [ffff996500393dc8] rwsem_down_read_slowpath at ffffffff9b6e9421
> #3 [ffff996500393e78] down_read_killable at ffffffff9b6e95e2
> #4 [ffff996500393e88] iterate_dir at ffffffff9b103c56
> #5 [ffff996500393ec8] ksys_getdents64 at ffffffff9b104b0c
> #6 [ffff996500393f30] __x64_sys_getdents64 at ffffffff9b104bb6
> #7 [ffff996500393f38] do_syscall_64 at ffffffff9ae04315
>
> in iterate_dir:
> if (shared)
> res = down_read_killable(&inode->i_rwsem); <<<<
> else
> res = down_write_killable(&inode->i_rwsem);
>
Reported-by: Frank Sorenson <sorenson@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-03 06:06:37 +03:00
void _cifsFileInfo_put(struct cifsFileInfo *cifs_file, bool wait_oplock_hdlr,
		       bool offload);
2010-10-15 23:34:04 +04:00
void cifsFileInfo_put(struct cifsFileInfo *cifs_file);
2009-08-31 19:07:12 +04:00
2013-09-05 13:01:06 +04:00
# define CIFS_CACHE_READ_FLG 1
# define CIFS_CACHE_HANDLE_FLG 2
2013-09-05 21:30:16 +04:00
# define CIFS_CACHE_RH_FLG (CIFS_CACHE_READ_FLG | CIFS_CACHE_HANDLE_FLG)
2013-09-05 13:01:06 +04:00
# define CIFS_CACHE_WRITE_FLG 4
2013-09-05 21:30:16 +04:00
# define CIFS_CACHE_RW_FLG (CIFS_CACHE_READ_FLG | CIFS_CACHE_WRITE_FLG)
# define CIFS_CACHE_RHW_FLG (CIFS_CACHE_RW_FLG | CIFS_CACHE_HANDLE_FLG)
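The combined flags above are plain bitwise ORs of the three base bits, so a caching check reduces to a mask test. The following is a minimal user-space sketch of that idea. The `demo_caches_*` helpers are hypothetical names introduced only for illustration, and the sketch ignores the mount-flag overrides that the real CIFS_CACHE_READ() and CIFS_CACHE_WRITE() macros also consult.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the defines above. */
#define CIFS_CACHE_READ_FLG	1
#define CIFS_CACHE_HANDLE_FLG	2
#define CIFS_CACHE_RH_FLG	(CIFS_CACHE_READ_FLG | CIFS_CACHE_HANDLE_FLG)
#define CIFS_CACHE_WRITE_FLG	4
#define CIFS_CACHE_RW_FLG	(CIFS_CACHE_READ_FLG | CIFS_CACHE_WRITE_FLG)
#define CIFS_CACHE_RHW_FLG	(CIFS_CACHE_RW_FLG | CIFS_CACHE_HANDLE_FLG)

/* The widest level must set all three base bits. */
static_assert(CIFS_CACHE_RHW_FLG == 7, "RHW must set all three bits");

/* Hypothetical helpers: test one base bit in an oplock/lease level. */
static inline bool demo_caches_read(unsigned int oplock)
{
	return (oplock & CIFS_CACHE_READ_FLG) != 0;
}

static inline bool demo_caches_handle(unsigned int oplock)
{
	return (oplock & CIFS_CACHE_HANDLE_FLG) != 0;
}

static inline bool demo_caches_write(unsigned int oplock)
{
	return (oplock & CIFS_CACHE_WRITE_FLG) != 0;
}
```

Under these assumptions, a level of CIFS_CACHE_RH_FLG permits read and handle caching but not write caching.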
2013-09-05 13:01:06 +04:00
netfs: Fix gcc-12 warning by embedding vfs inode in netfs_i_context
While randstruct was satisfied with using an open-coded "void *" offset
cast for the netfs_i_context <-> inode casting, __builtin_object_size() as
used by FORTIFY_SOURCE was not as easily fooled. This was causing the
following complaint[1] from gcc v12:
In file included from include/linux/string.h:253,
from include/linux/ceph/ceph_debug.h:7,
from fs/ceph/inode.c:2:
In function 'fortify_memset_chk',
inlined from 'netfs_i_context_init' at include/linux/netfs.h:326:2,
inlined from 'ceph_alloc_inode' at fs/ceph/inode.c:463:2:
include/linux/fortify-string.h:242:25: warning: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning]
242 | __write_overflow_field(p_size_field, size);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fix this by embedding a struct inode into struct netfs_i_context (which
should perhaps be renamed to struct netfs_inode). The struct inode
vfs_inode fields are then removed from the 9p, afs, ceph and cifs inode
structs and vfs_inode is then simply changed to "netfs.inode" in those
filesystems.
Further, rename netfs_i_context to netfs_inode, get rid of the
netfs_inode() function that converted a netfs_i_context pointer to an
inode pointer (that can now be done with &ctx->inode) and rename the
netfs_i_context() function to netfs_inode() (which is now a wrapper
around container_of()).
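The embedding described above can be illustrated with a small user-space sketch. All `demo_*` names are hypothetical stand-ins, not the kernel's types, and `demo_container_of()` is a simplified version of the kernel macro without its type checking. The point is only that, once the VFS inode is a real member of the wrapper, the wrapper can be recovered from an inode pointer by subtracting a compiler-known offset rather than by an open-coded "void *" cast.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct inode. */
struct demo_inode {
	unsigned long i_ino;
};

/* Hypothetical stand-in for struct netfs_inode: embeds the VFS inode. */
struct demo_netfs_inode {
	struct demo_inode inode;
	int cache_enabled;
};

/* Hypothetical per-filesystem wrapper, analogous to cifsInodeInfo. */
struct demo_fs_inode {
	int fs_private;			/* members may precede the wrapper */
	struct demo_netfs_inode netfs;
};

/* Simplified container_of(); the kernel version adds type checking. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Map an embedded inode back to its filesystem wrapper. */
static inline struct demo_fs_inode *DEMO_I(struct demo_inode *inode)
{
	assert(inode != NULL);
	return demo_container_of(inode, struct demo_fs_inode, netfs.inode);
}
```

Because the offset is computed by the compiler, the mapping stays correct even if the members of the wrapper are reordered.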
Most of the changes were done with:
perl -p -i -e 's/vfs_inode/netfs.inode/'g \
`git grep -l 'vfs_inode' -- fs/{9p,afs,ceph,cifs}/*.[ch]`
Kees suggested doing it with a pair structure[2] and a special
declarator to insert that into the network filesystem's inode
wrapper[3], but I think it's cleaner to embed it - and then it doesn't
matter if struct randomisation reorders things.
Dave Chinner suggested using a filesystem-specific VFS_I() function in
each filesystem to convert that filesystem's own inode wrapper struct
into the VFS inode struct[4].
Version #2:
- Fix a couple of missed name changes due to a disabled cifs option.
- Rename nfs_i_context to nfs_inode
- Use "netfs" instead of "nic" as the member name in per-fs inode wrapper
structs.
[ This also undoes commit 507160f46c55 ("netfs: gcc-12: temporarily
disable '-Wattribute-warning' for now") that is no longer needed ]
Fixes: bc899ee1c898 ("netfs: Add a netfs inode context")
Reported-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
cc: Jonathan Corbet <corbet@lwn.net>
cc: Eric Van Hensbergen <ericvh@gmail.com>
cc: Latchesar Ionkov <lucho@ionkov.net>
cc: Dominique Martinet <asmadeus@codewreck.org>
cc: Christian Schoenebeck <linux_oss@crudebyte.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Steve French <smfrench@gmail.com>
cc: William Kucharski <william.kucharski@oracle.com>
cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
cc: Dave Chinner <david@fromorbit.com>
cc: linux-doc@vger.kernel.org
cc: v9fs-developer@lists.sourceforge.net
cc: linux-afs@lists.infradead.org
cc: ceph-devel@vger.kernel.org
cc: linux-cifs@vger.kernel.org
cc: samba-technical@lists.samba.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-hardening@vger.kernel.org
Link: https://lore.kernel.org/r/d2ad3a3d7bdd794c6efb562d2f2b655fb67756b9.camel@kernel.org/ [1]
Link: https://lore.kernel.org/r/20220517210230.864239-1-keescook@chromium.org/ [2]
Link: https://lore.kernel.org/r/20220518202212.2322058-1-keescook@chromium.org/ [3]
Link: https://lore.kernel.org/r/20220524101205.GI2306852@dread.disaster.area/ [4]
Link: https://lore.kernel.org/r/165296786831.3591209.12111293034669289733.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165305805651.4094995.7763502506786714216.stgit@warthog.procyon.org.uk # v2
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-09 23:46:04 +03:00
# define CIFS_CACHE_READ(cinode) ((cinode->oplock & CIFS_CACHE_READ_FLG) || (CIFS_SB(cinode->netfs.inode.i_sb)->mnt_cifs_flags & CIFS_MOUNT_RO_CACHE))
2013-09-05 16:11:28 +04:00
# define CIFS_CACHE_HANDLE(cinode) (cinode->oplock & CIFS_CACHE_HANDLE_FLG)
2022-06-09 23:46:04 +03:00
# define CIFS_CACHE_WRITE(cinode) ((cinode->oplock & CIFS_CACHE_WRITE_FLG) || (CIFS_SB(cinode->netfs.inode.i_sb)->mnt_cifs_flags & CIFS_MOUNT_RW_CACHE))
2013-09-05 13:01:06 +04:00
2005-04-17 02:20:36 +04:00
/*
* One of these for each file inode
*/
struct cifsInodeInfo {
2022-06-09 23:46:04 +03:00
struct netfs_inode netfs ; /* Netfslib context and vfs inode */
2011-09-22 09:53:59 +04:00
bool can_cache_brlcks ;
2012-09-19 17:22:43 +04:00
struct list_head llist ; /* locks held by this inode */
2019-10-23 12:02:33 +03:00
/*
 * NOTE: Some code paths call down_read(lock_sem) twice, so
2020-07-20 03:13:16 +03:00
 * we must always use cifs_down_write() instead of down_write()
2019-10-23 12:02:33 +03:00
 * for this semaphore to avoid deadlocks.
 */
2012-09-19 17:22:44 +04:00
struct rw_semaphore lock_sem ; /* protect the fields above */
2007-06-28 23:44:13 +04:00
/* BB add in lists for dirty pages i.e. write caching info for oplock */
2005-04-17 02:20:36 +04:00
struct list_head openFileList ;
2019-06-05 03:38:38 +03:00
spinlock_t open_file_lock ; /* protects openFileList */
2005-04-17 02:20:36 +04:00
__u32 cifsAttrs ; /* e.g. DOS archive bit, sparse, compressed, system */
2013-09-05 13:01:06 +04:00
unsigned int oplock ; /* oplock/lease level we have */
2013-09-05 21:30:16 +04:00
unsigned int epoch ; /* used to track lease state changes */
2014-03-11 20:11:47 +04:00
# define CIFS_INODE_PENDING_OPLOCK_BREAK (0) /* oplock break in progress */
# define CIFS_INODE_PENDING_WRITERS (1) /* Writes in progress */
2019-10-30 02:51:19 +03:00
# define CIFS_INODE_FLAG_UNUSED (2) /* Unused flag */
2014-04-30 17:31:45 +04:00
# define CIFS_INO_DELETE_PENDING (3) /* delete pending on server */
# define CIFS_INO_INVALID_MAPPING (4) /* pagecache is invalid */
2014-04-30 17:31:47 +04:00
# define CIFS_INO_LOCK (5) /* lock bit for synchronization */
2021-04-13 08:26:42 +03:00
# define CIFS_INO_MODIFIED_ATTR (6) /* Indicate change in mtime/ctime */
2021-09-18 00:50:40 +03:00
# define CIFS_INO_CLOSE_ON_LOCK (7) /* Not to defer the close when lock is set */
2014-04-30 17:31:45 +04:00
unsigned long flags ;
2014-03-11 20:11:47 +04:00
spinlock_t writers_lock ;
unsigned int writers ; /* Number of writers on this inode */
2011-01-20 21:36:50 +03:00
unsigned long time ; /* jiffies of last update of inode */
2012-03-23 22:40:56 +04:00
u64 server_eof ; /* current file size on server -- protected by i_lock */
2009-06-25 08:56:52 +04:00
u64 uniqueid ; /* server inode number */
2011-01-07 19:30:27 +03:00
u64 createtime ; /* creation time on server */
2012-09-19 17:22:44 +04:00
__u8 lease_key [ SMB2_LEASE_KEY_SIZE ] ; /* lease key for this inode */
2021-04-13 08:26:42 +03:00
struct list_head deferred_closes ; /* list of deferred closes */
spinlock_t deferred_lock ; /* protection on deferred list */
2021-05-17 14:28:34 +03:00
bool lease_granted ; /* Flag to indicate whether lease or oplock is granted. */
2022-10-04 00:43:50 +03:00
char * symlink_target ;
2005-04-17 02:20:36 +04:00
} ;
static inline struct cifsInodeInfo *
CIFS_I(struct inode *inode)
{
2022-06-09 23:46:04 +03:00
	return container_of(inode, struct cifsInodeInfo, netfs.inode);
2005-04-17 02:20:36 +04:00
}
static inline struct cifs_sb_info *
CIFS_SB(struct super_block *sb)
{
	return sb->s_fs_info;
}
2014-10-22 08:25:12 +04:00
static inline struct cifs_sb_info *
CIFS_FILE_SB(struct file *file)
{
	return CIFS_SB(file_inode(file)->i_sb);
}
2005-09-16 07:44:50 +04:00
static inline char CIFS_DIR_SEP(const struct cifs_sb_info *cifs_sb)
2005-06-23 04:26:35 +04:00
{
	if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_POSIX_PATHS)
		return '/';
	else
		return '\\';
}
2005-04-17 02:20:36 +04:00
2011-05-27 07:50:55 +04:00
static inline void
convert_delimiter ( char * path , char delim )
{
2012-11-30 04:07:51 +04:00
char old_delim , * pos ;
2011-05-27 07:50:55 +04:00
if ( delim = = ' / ' )
old_delim = ' \\ ' ;
else
old_delim = ' / ' ;
2012-11-30 04:07:51 +04:00
pos = path ;
while ( ( pos = strchr ( pos , old_delim ) ) )
* pos = delim ;
2011-05-27 07:50:55 +04:00
}
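The in-place separator rewrite above can be tried outside the kernel. The following is a minimal userspace sketch, not the kernel code itself; the name convert_delimiter_demo is hypothetical and chosen only for illustration.

```c
#include <string.h>

/*
 * Userspace sketch of the loop above (hypothetical name, illustration only).
 * Rewrites every occurrence of the "other" separator in place.
 */
static void convert_delimiter_demo(char *path, char delim)
{
	/* The separator to replace is whichever one was not requested. */
	char old_delim = (delim == '/') ? '\\' : '/';
	char *pos = path;

	/*
	 * strchr() resumes at the byte that was just rewritten; that byte now
	 * holds delim, so the search moves on to the next old_delim.
	 */
	while ((pos = strchr(pos, old_delim)))
		*pos = delim;
}
```

Calling it with '/' on a buffer holding a backslash-separated UNC path leaves a forward-slash path in the same buffer; the string length never changes, so no reallocation is needed.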
2005-08-25 00:59:35 +04:00
# define cifs_stats_inc atomic_inc
2011-05-27 08:34:02 +04:00
static inline void cifs_stats_bytes_written ( struct cifs_tcon * tcon ,
2005-08-25 00:59:35 +04:00
unsigned int bytes )
{
if ( bytes ) {
spin_lock ( & tcon - > stat_lock ) ;
tcon - > bytes_written + = bytes ;
spin_unlock ( & tcon - > stat_lock ) ;
}
}
2011-05-27 08:34:02 +04:00
static inline void cifs_stats_bytes_read ( struct cifs_tcon * tcon ,
2005-08-25 00:59:35 +04:00
unsigned int bytes )
{
spin_lock ( & tcon - > stat_lock ) ;
tcon - > bytes_read + = bytes ;
spin_unlock ( & tcon - > stat_lock ) ;
}
2011-01-11 15:24:21 +03:00
/*
2011-10-19 23:29:49 +04:00
* This is the prototype for the mid receive function . This function is for
* receiving the rest of the SMB frame , starting with the WordCount ( which is
* just after the MID in struct smb_hdr ) . Note :
*
* - This will be called by cifsd , with no locks held .
* - The mid will still be on the pending_mid_q .
* - mid - > resp_buf will point to the current buffer .
*
* Returns zero on a successful receive , or an error . The receive state in
* the TCP_Server_Info will also be updated .
*/
typedef int ( mid_receive_t ) ( struct TCP_Server_Info * server ,
struct mid_q_entry * mid ) ;
/*
* This is the prototype for the mid callback function . This is called once the
* mid has been received off of the socket . When creating one , take special
* care to avoid deadlocks . Things to bear in mind :
2011-01-11 15:24:21 +03:00
*
2011-05-22 15:09:13 +04:00
* - it will be called by cifsd , with no locks held
* - the mid will be removed from any lists
2011-01-11 15:24:21 +03:00
*/
typedef void ( mid_callback_t ) ( struct mid_q_entry * mid ) ;
2016-11-17 01:06:17 +03:00
/*
* This is the prototype for the mid handle function . This is called once the mid
* has been recognized after decryption of the message .
*/
typedef int ( mid_handle_t ) ( struct TCP_Server_Info * server ,
struct mid_q_entry * mid ) ;
2005-04-17 02:20:36 +04:00
/* one of these for every pending CIFS request to the server */
struct mid_q_entry {
struct list_head qhead ; /* mids waiting on reply from this server */
2018-06-25 15:05:25 +03:00
struct kref refcount ;
2011-12-26 22:53:34 +04:00
struct TCP_Server_Info * server ; /* server corresponding to this mid */
2012-03-23 22:28:03 +04:00
__u64 mid ; /* multiplex id */
2019-03-05 01:02:50 +03:00
__u16 credits ; /* number of credits consumed by this mid */
2019-11-21 22:35:13 +03:00
__u16 credits_received ; /* number of credits from the response */
2012-03-23 22:28:03 +04:00
__u32 pid ; /* process id */
2005-04-17 02:20:36 +04:00
__u32 sequence_number ; /* for CIFS signing */
2005-10-12 06:58:06 +04:00
unsigned long when_alloc ; /* when mid was created */
# ifdef CONFIG_CIFS_STATS2
unsigned long when_sent ; /* time when smb send finished */
unsigned long when_received ; /* when demux complete (taken off wire) */
# endif
2011-10-19 23:29:49 +04:00
mid_receive_t * receive ; /* call receive callback */
2011-01-11 15:24:21 +03:00
mid_callback_t * callback ; /* call completion callback */
2016-11-17 01:06:17 +03:00
mid_handle_t * handle ; /* call handle mid callback */
2011-01-11 15:24:21 +03:00
void * callback_data ; /* general purpose pointer for callback */
CIFS: Fix task struct use-after-free on reconnect
The task which created the MID may be gone by the time cifsd attempts to
call the callbacks on MIDs from cifs_reconnect().
This leads to a use-after-free of the task struct in cifs_wake_up_task:
==================================================================
BUG: KASAN: use-after-free in __lock_acquire+0x31a0/0x3270
Read of size 8 at addr ffff8880103e3a68 by task cifsd/630
CPU: 0 PID: 630 Comm: cifsd Not tainted 5.5.0-rc6+ #119
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Call Trace:
dump_stack+0x8e/0xcb
print_address_description.constprop.5+0x1d3/0x3c0
? __lock_acquire+0x31a0/0x3270
__kasan_report+0x152/0x1aa
? __lock_acquire+0x31a0/0x3270
? __lock_acquire+0x31a0/0x3270
kasan_report+0xe/0x20
__lock_acquire+0x31a0/0x3270
? __wake_up_common+0x1dc/0x630
? _raw_spin_unlock_irqrestore+0x4c/0x60
? mark_held_locks+0xf0/0xf0
? _raw_spin_unlock_irqrestore+0x39/0x60
? __wake_up_common_lock+0xd5/0x130
? __wake_up_common+0x630/0x630
lock_acquire+0x13f/0x330
? try_to_wake_up+0xa3/0x19e0
_raw_spin_lock_irqsave+0x38/0x50
? try_to_wake_up+0xa3/0x19e0
try_to_wake_up+0xa3/0x19e0
? cifs_compound_callback+0x178/0x210
? set_cpus_allowed_ptr+0x10/0x10
cifs_reconnect+0xa1c/0x15d0
? generic_ip_connect+0x1860/0x1860
? rwlock_bug.part.0+0x90/0x90
cifs_readv_from_socket+0x479/0x690
cifs_read_from_socket+0x9d/0xe0
? cifs_readv_from_socket+0x690/0x690
? mempool_resize+0x690/0x690
? rwlock_bug.part.0+0x90/0x90
? memset+0x1f/0x40
? allocate_buffers+0xff/0x340
cifs_demultiplex_thread+0x388/0x2a50
? cifs_handle_standard+0x610/0x610
? rcu_read_lock_held_common+0x120/0x120
? mark_lock+0x11b/0xc00
? __lock_acquire+0x14ed/0x3270
? __kthread_parkme+0x78/0x100
? lockdep_hardirqs_on+0x3e8/0x560
? lock_downgrade+0x6a0/0x6a0
? lockdep_hardirqs_on+0x3e8/0x560
? _raw_spin_unlock_irqrestore+0x39/0x60
? cifs_handle_standard+0x610/0x610
kthread+0x2bb/0x3a0
? kthread_create_worker_on_cpu+0xc0/0xc0
ret_from_fork+0x3a/0x50
Allocated by task 649:
save_stack+0x19/0x70
__kasan_kmalloc.constprop.5+0xa6/0xf0
kmem_cache_alloc+0x107/0x320
copy_process+0x17bc/0x5370
_do_fork+0x103/0xbf0
__x64_sys_clone+0x168/0x1e0
do_syscall_64+0x9b/0xec0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 0:
save_stack+0x19/0x70
__kasan_slab_free+0x11d/0x160
kmem_cache_free+0xb5/0x3d0
rcu_core+0x52f/0x1230
__do_softirq+0x24d/0x962
The buggy address belongs to the object at ffff8880103e32c0
which belongs to the cache task_struct of size 6016
The buggy address is located 1960 bytes inside of
6016-byte region [ffff8880103e32c0, ffff8880103e4a40)
The buggy address belongs to the page:
page:ffffea000040f800 refcount:1 mapcount:0 mapping:ffff8880108da5c0
index:0xffff8880103e4c00 compound_mapcount: 0
raw: 4000000000010200 ffffea00001f2208 ffffea00001e3408 ffff8880108da5c0
raw: ffff8880103e4c00 0000000000050003 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff8880103e3900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880103e3980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff8880103e3a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff8880103e3a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880103e3b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
This can be reliably reproduced by adding the below delay to
cifs_reconnect(), running find(1) on the mount, restarting the samba
server while find is running, and killing find during the delay:
spin_unlock(&GlobalMid_Lock);
mutex_unlock(&server->srv_mutex);
+ msleep(10000);
+
cifs_dbg(FYI, "%s: issuing mid callbacks\n", __func__);
list_for_each_safe(tmp, tmp2, &retry_list) {
mid_entry = list_entry(tmp, struct mid_q_entry, qhead);
Fix this by holding a reference to the task struct until the MID is
freed.
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2020-01-23 19:09:06 +03:00
struct task_struct * creator ;
2012-03-23 22:28:02 +04:00
void * resp_buf ; /* pointer to received SMB header */
2018-04-09 11:06:28 +03:00
unsigned int resp_buf_size ;
2012-03-23 22:28:03 +04:00
int mid_state ; /* wish this were enum but can not pass to wait_event */
2017-03-04 02:41:38 +03:00
unsigned int mid_flags ;
2012-03-23 22:28:03 +04:00
__le16 command ; /* smb command code */
2019-01-04 03:45:27 +03:00
unsigned int optype ; /* operation type */
2012-03-23 22:28:03 +04:00
bool large_buf : 1 ; /* if valid response, is pointer to large buf */
2008-04-29 04:06:05 +04:00
bool multiRsp : 1 ; /* multiple trans2 responses for one request */
bool multiEnd : 1 ; /* both received */
2016-11-18 02:24:46 +03:00
bool decrypted : 1 ; /* decrypted entry */
2005-04-17 02:20:36 +04:00
} ;
2017-03-04 02:41:38 +03:00
struct close_cancelled_open {
struct cifs_fid fid ;
struct cifs_tcon * tcon ;
struct work_struct work ;
2019-11-14 21:32:12 +03:00
__u64 mid ;
__u16 cmd ;
2017-03-04 02:41:38 +03:00
} ;
2011-08-09 22:44:44 +04:00
/* Make code in transport.c a little cleaner by moving
update of optional stats into function below */
static inline void cifs_in_send_inc ( struct TCP_Server_Info * server )
{
atomic_inc ( & server - > in_send ) ;
}
static inline void cifs_in_send_dec ( struct TCP_Server_Info * server )
{
atomic_dec ( & server - > in_send ) ;
}
static inline void cifs_num_waiters_inc ( struct TCP_Server_Info * server )
{
atomic_inc ( & server - > num_waiters ) ;
}
static inline void cifs_num_waiters_dec ( struct TCP_Server_Info * server )
{
atomic_dec ( & server - > num_waiters ) ;
}
2019-11-22 02:26:35 +03:00
# ifdef CONFIG_CIFS_STATS2
2011-08-09 22:44:44 +04:00
static inline void cifs_save_when_sent ( struct mid_q_entry * mid )
{
mid - > when_sent = jiffies ;
}
# else
static inline void cifs_save_when_sent ( struct mid_q_entry * mid )
{
}
# endif
2005-04-17 02:20:36 +04:00
2005-08-25 04:10:36 +04:00
/* for pending dnotify requests */
struct dir_notify_req {
2010-10-07 22:46:32 +04:00
struct list_head lhead ;
__le16 Pid ;
__le16 PidHigh ;
__u16 Mid ;
__u16 Tid ;
__u16 Uid ;
__u16 netfid ;
__u32 filter ; /* CompletionFilter (for multishot) */
int multishot ;
struct file * pfile ;
2005-08-25 04:10:36 +04:00
} ;
2008-01-25 13:12:41 +03:00
struct dfs_info3_param {
int flags ; /* DFSREF_REFERRAL_SERVER, DFSREF_STORAGE_SERVER*/
2008-02-15 21:21:49 +03:00
int path_consumed ;
2008-01-25 13:12:41 +03:00
int server_type ;
int ref_flag ;
char * path_name ;
char * node_name ;
2018-11-14 20:38:51 +03:00
int ttl ;
2008-01-25 13:12:41 +03:00
} ;
2021-08-09 12:32:46 +03:00
struct file_list {
struct list_head list ;
struct cifsFileInfo * cfile ;
} ;
2022-10-12 00:16:07 +03:00
struct cifs_mount_ctx {
struct cifs_sb_info * cifs_sb ;
struct smb3_fs_context * fs_ctx ;
unsigned int xid ;
struct TCP_Server_Info * server ;
struct cifs_ses * ses ;
struct cifs_tcon * tcon ;
struct cifs_ses * root_ses ;
uuid_t mount_id ;
char * origin_fullpath , * leaf_fullpath ;
} ;
2008-01-25 13:12:41 +03:00
static inline void free_dfs_info_param ( struct dfs_info3_param * param )
{
if ( param ) {
kfree ( param - > path_name ) ;
kfree ( param - > node_name ) ;
}
}
static inline void free_dfs_info_array ( struct dfs_info3_param * param ,
int number_of_items )
{
int i ;
2022-12-09 01:11:00 +03:00
2008-01-25 13:12:41 +03:00
if ( ( number_of_items = = 0 ) | | ( param = = NULL ) )
return ;
for ( i = 0 ; i < number_of_items ; i + + ) {
kfree ( param [ i ] . path_name ) ;
kfree ( param [ i ] . node_name ) ;
}
kfree ( param ) ;
}
2019-01-08 22:15:28 +03:00
static inline bool is_interrupt_error ( int error )
{
switch ( error ) {
case - EINTR :
case - ERESTARTSYS :
case - ERESTARTNOHAND :
case - ERESTARTNOINTR :
return true ;
}
return false ;
}
static inline bool is_retryable_error ( int error )
{
if ( is_interrupt_error ( error ) | | error = = - EAGAIN )
return true ;
return false ;
}
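As a rough illustration of how the two predicates above might be consumed, here is a userspace sketch that sorts a return code into "done", "try again later" or "hard failure". The enum, the classify_result() helper and the _demo suffixes are hypothetical, and the ERESTART* values are defined locally on the assumption that they match the kernel-internal values, since userspace errno.h does not export them.

```c
#include <errno.h>
#include <stdbool.h>

/* Kernel-internal restart codes; defined here only for the sketch. */
#ifndef ERESTARTSYS
#define ERESTARTSYS	512
#endif
#ifndef ERESTARTNOINTR
#define ERESTARTNOINTR	513
#endif
#ifndef ERESTARTNOHAND
#define ERESTARTNOHAND	514
#endif

/* Same classification as is_interrupt_error() above. */
static bool is_interrupt_error_demo(int error)
{
	switch (error) {
	case -EINTR:
	case -ERESTARTSYS:
	case -ERESTARTNOHAND:
	case -ERESTARTNOINTR:
		return true;
	}
	return false;
}

/* Same classification as is_retryable_error() above. */
static bool is_retryable_error_demo(int error)
{
	return is_interrupt_error_demo(error) || error == -EAGAIN;
}

/* Hypothetical outcome type for a caller that may requeue work. */
enum demo_outcome { DEMO_DONE, DEMO_REQUEUE, DEMO_FAILED };

static enum demo_outcome classify_result(int rc)
{
	if (rc == 0)
		return DEMO_DONE;
	/* Transient condition: keep the work around and retry later. */
	if (is_retryable_error_demo(rc))
		return DEMO_REQUEUE;
	/* Anything else is treated as a permanent error. */
	return DEMO_FAILED;
}
```

The point of splitting the predicates is that a caller can treat signal interruption and -EAGAIN alike when deciding whether to keep work queued, while still being able to test for interruption alone.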
cifs: fix rename() by ensuring source handle opened with DELETE bit
To rename a file in SMB2 we open it with the DELETE access and do a
special SetInfo on it. If the handle is missing the DELETE bit the
server will fail the SetInfo with STATUS_ACCESS_DENIED.
We currently try to reuse any existing opened handle we have with
cifs_get_writable_path(). That function looks for handles with WRITE
access but doesn't check for DELETE, making rename() fail if it finds
a handle to reuse. Simple reproducer below.
To select handles with the DELETE bit, this patch adds a flag argument
to cifs_get_writable_path() and find_writable_file() and the existing
'bool fsuid_only' argument is converted to a flag.
The cifsFileInfo struct only stores the UNIX open mode but not the
original SMB access flags. Since the DELETE bit is not mapped in that
mode, this patch stores the access mask in cifs_fid on file open,
which is accessible from cifsFileInfo.
Simple reproducer:
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#define E(s) perror(s), exit(1)
int main(int argc, char *argv[])
{
int fd, ret;
if (argc != 3) {
fprintf(stderr, "Usage: %s A B\n"
"create&open A in write mode, "
"rename A to B, close A\n", argv[0]);
return 0;
}
fd = openat(AT_FDCWD, argv[1], O_WRONLY|O_CREAT|O_SYNC, 0666);
if (fd == -1) E("openat()");
ret = rename(argv[1], argv[2]);
if (ret) E("rename()");
ret = close(fd);
if (ret) E("close()");
return ret;
}
$ gcc -o bugrename bugrename.c
$ ./bugrename /mnt/a /mnt/b
rename(): Permission denied
Fixes: 8de9e86c67ba ("cifs: create a helper to find a writeable handle by path name")
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
2020-02-21 13:19:06 +03:00
/* cifs_get_writable_file() flags */
# define FIND_WR_ANY 0
# define FIND_WR_FSUID_ONLY 1
# define FIND_WR_WITH_DELETE 2
2005-04-17 02:20:36 +04:00
# define MID_FREE 0
# define MID_REQUEST_ALLOCATED 1
# define MID_REQUEST_SUBMITTED 2
# define MID_RESPONSE_RECEIVED 4
# define MID_RETRY_NEEDED 8 /* session closed while this request out */
2011-02-10 16:03:50 +03:00
# define MID_RESPONSE_MALFORMED 0x10
2011-05-22 15:09:13 +04:00
# define MID_SHUTDOWN 0x20
2005-12-13 07:53:18 +03:00
2017-03-04 02:41:38 +03:00
/* Flags */
# define MID_WAIT_CANCELLED 1 /* Cancelled while waiting for response */
2018-08-30 03:12:59 +03:00
# define MID_DELETED 2 /* Mid has been dequeued/deleted */
2017-03-04 02:41:38 +03:00
2005-12-13 07:53:18 +03:00
/* Types of response buffer returned from SendReceive2 */
# define CIFS_NO_BUFFER 0 /* Response buffer not returned */
# define CIFS_SMALL_BUFFER 1
# define CIFS_LARGE_BUFFER 2
# define CIFS_IOVEC 4 /* array of response buffers */
2005-04-17 02:20:36 +04:00
2007-11-14 01:41:37 +03:00
/* Type of Request to SendReceive2 */
2011-01-11 15:24:23 +03:00
# define CIFS_BLOCKING_OP 1 /* operation can block */
2019-05-06 03:00:02 +03:00
# define CIFS_NON_BLOCKING 2 /* do not block waiting for credits */
2011-01-11 15:24:23 +03:00
# define CIFS_TIMEOUT_MASK 0x003 /* only one of above set in req */
2007-11-14 01:41:37 +03:00
# define CIFS_LOG_ERROR 0x010 /* log NT STATUS if non-zero */
# define CIFS_LARGE_BUF_OP 0x020 /* large request buffer */
2019-05-06 03:00:02 +03:00
# define CIFS_NO_RSP_BUF 0x040 /* no response buffer required */
2007-11-14 01:41:37 +03:00
2012-05-23 16:14:34 +04:00
/* Type of request operation */
2021-03-08 18:00:50 +03:00
# define CIFS_ECHO_OP 0x080 /* echo request */
# define CIFS_OBREAK_OP 0x0100 /* oplock break request */
# define CIFS_NEG_OP 0x0200 /* negotiate request */
# define CIFS_CP_CREATE_CLOSE_OP 0x0400 /* compound create+close request */
2021-02-04 09:49:52 +03:00
/* Lower bitmask values are reserved by others below. */
2021-03-08 18:00:50 +03:00
# define CIFS_SESS_OP 0x2000 /* session setup request */
# define CIFS_OP_MASK 0x2780 /* mask request type */
2016-10-31 23:49:30 +03:00
2021-03-08 18:00:50 +03:00
# define CIFS_HAS_CREDITS 0x0400 /* already has credits */
# define CIFS_TRANSFORM_REQ 0x0800 /* transform request before sending */
# define CIFS_NO_SRV_RSP 0x1000 /* there is no server response */
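The request flags above share one integer: the low two bits select the timeout behaviour and CIFS_OP_MASK isolates the operation type. A small userspace sketch of that decoding follows; the constants are copied from the definitions above, while the helper names optype_of() and timeout_of() are hypothetical.

```c
/* Values copied from the definitions above. */
#define CIFS_BLOCKING_OP	1
#define CIFS_NON_BLOCKING	2
#define CIFS_TIMEOUT_MASK	0x003
#define CIFS_LOG_ERROR		0x010
#define CIFS_ECHO_OP		0x080
#define CIFS_OBREAK_OP		0x0100
#define CIFS_NEG_OP		0x0200
#define CIFS_CP_CREATE_CLOSE_OP	0x0400
#define CIFS_SESS_OP		0x2000
#define CIFS_OP_MASK		0x2780

/* Hypothetical helper: keep only the operation-type bits. */
static unsigned int optype_of(unsigned int flags)
{
	return flags & CIFS_OP_MASK;
}

/* Hypothetical helper: keep only the blocking/non-blocking selector. */
static unsigned int timeout_of(unsigned int flags)
{
	return flags & CIFS_TIMEOUT_MASK;
}
```

With these definitions, CIFS_OP_MASK is exactly the bitwise OR of the five operation-type values, so unrelated bits such as CIFS_LOG_ERROR are dropped when the mask is applied.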
2012-05-23 16:14:34 +04:00
2006-06-01 02:40:51 +04:00
/* Security Flags: indicate type of session setup needed */
# define CIFSSEC_MAY_SIGN 0x00001
# define CIFSSEC_MAY_NTLMV2 0x00004
# define CIFSSEC_MAY_KRB5 0x00008
# define CIFSSEC_MAY_SEAL 0x00040 /* not supported yet */
2009-05-06 08:16:04 +04:00
# define CIFSSEC_MAY_NTLMSSP 0x00080 /* raw ntlmssp with ntlmv2 */
2006-06-01 02:40:51 +04:00
# define CIFSSEC_MUST_SIGN 0x01001
/* note that only one of the following can be set so the
result of setting MUST flags more than once will be to
require use of the stronger protocol */
# define CIFSSEC_MUST_NTLMV2 0x04004
# define CIFSSEC_MUST_KRB5 0x08008
2007-10-16 21:32:19 +04:00
# ifdef CONFIG_CIFS_UPCALL
2009-05-06 08:16:04 +04:00
# define CIFSSEC_MASK 0x8F08F /* flags supported if no weak allowed */
2007-06-28 23:44:13 +04:00
# else
2009-05-06 08:16:04 +04:00
# define CIFSSEC_MASK 0x87087 /* flags supported if no weak allowed */
2007-10-16 22:10:10 +04:00
# endif /* UPCALL */
2006-06-01 02:40:51 +04:00
# define CIFSSEC_MUST_SEAL 0x40040 /* not supported yet */
2009-05-06 08:16:04 +04:00
# define CIFSSEC_MUST_NTLMSSP 0x80080 /* raw ntlmssp with ntlmv2 */
2006-06-01 02:40:51 +04:00
2013-05-26 15:01:01 +04:00
# define CIFSSEC_DEF (CIFSSEC_MAY_SIGN | CIFSSEC_MAY_NTLMV2 | CIFSSEC_MAY_NTLMSSP)
2021-08-19 13:34:58 +03:00
# define CIFSSEC_MAX (CIFSSEC_MUST_NTLMV2)
# define CIFSSEC_AUTH_MASK (CIFSSEC_MAY_NTLMV2 | CIFSSEC_MAY_KRB5 | CIFSSEC_MAY_NTLMSSP)
2005-04-17 02:20:36 +04:00
/*
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* All constants go here
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
*/
# define UID_HASH (16)
/*
* Note that ONE module should define DECLARE_GLOBALS_HERE to cause the
* following to be declared .
*/
/****************************************************************************
2022-07-27 22:49:56 +03:00
* Here are all the locks ( spinlock , mutex , semaphore ) in cifs . ko , arranged according
* to the locking order . i . e . if two locks are to be held together , the lock that
* appears higher in this list needs to be taken before the other .
2005-04-17 02:20:36 +04:00
*
2022-07-27 22:49:56 +03:00
* If you hold a lock that is lower in this list , and you need to take a higher lock
* ( or if you think that one of the functions that you ' re calling may need to ) , first
* drop the lock you hold , pick up the higher lock , then the lower one . This will
* ensure that locks are picked up only in one direction in the below table
* ( top to bottom ) .
2005-04-17 02:20:36 +04:00
*
2022-07-27 22:49:56 +03:00
* Also , if you expect a function to be called with a lock held , explicitly document
* this in the comments on top of your function definition .
2019-06-05 03:38:38 +03:00
*
2022-07-27 22:49:56 +03:00
* Also , try to keep critical sections ( lock hold time ) as short as
* possible . Blocking or calling other functions with a lock held always increases
* the risk of a deadlock .
2005-04-17 02:20:36 +04:00
*
2022-07-27 22:49:56 +03:00
* Following these rules avoids unnecessary deadlocks , which can be really hard to
* debug .
*
* Whenever you introduce a new lock , or notice an existing lock that is missing
* here , add it to this list at the position that matches the expected locking
* order .
*
* = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
* Lock Protects Initialization fn
* = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
* vol_list_lock
* vol_info - > ctx_lock vol_info - > ctx
* cifs_sb_info - > tlink_tree_lock cifs_sb_info - > tlink_tree cifs_setup_cifs_sb
* TCP_Server_Info - > TCP_Server_Info cifs_get_tcp_session
* reconnect_mutex
* TCP_Server_Info - > srv_mutex TCP_Server_Info cifs_get_tcp_session
* cifs_ses - > session_mutex cifs_ses sesInfoAlloc
* cifs_tcon
* cifs_tcon - > open_file_lock cifs_tcon - > openFileList tconInfoAlloc
* cifs_tcon - > pending_opens
* cifs_tcon - > stat_lock cifs_tcon - > bytes_read tconInfoAlloc
* cifs_tcon - > bytes_written
* cifs_tcp_ses_lock cifs_tcp_ses_list sesInfoAlloc
* GlobalMid_Lock GlobalMaxActiveXid init_cifs
* GlobalCurrentXid
* GlobalTotalActiveXid
* TCP_Server_Info - > srv_lock ( anything in struct not protected by another lock and can change )
* TCP_Server_Info - > mid_lock TCP_Server_Info - > pending_mid_q cifs_get_tcp_session
* - > CurrentMid
* ( any changes in mid_q_entry fields )
* TCP_Server_Info - > req_lock TCP_Server_Info - > in_flight cifs_get_tcp_session
* - > credits
* - > echo_credits
* - > oplock_credits
* - > reconnect_instance
* cifs_ses - > ses_lock ( anything that is not protected by another lock and can change )
* cifs_ses - > iface_lock cifs_ses - > iface_list sesInfoAlloc
* - > iface_count
* - > iface_last_update
* cifs_ses - > chan_lock cifs_ses - > chans
* - > chans_need_reconnect
* - > chans_in_reconnect
* cifs_tcon - > tc_lock ( anything that is not protected by another lock and can change )
* cifsInodeInfo - > open_file_lock cifsInodeInfo - > openFileList cifs_alloc_inode
* cifsInodeInfo - > writers_lock cifsInodeInfo - > writers cifsInodeInfo_alloc
* cifsInodeInfo - > lock_sem cifsInodeInfo - > llist cifs_init_once
* - > can_cache_brlcks
* cifsInodeInfo - > deferred_lock cifsInodeInfo - > deferred_closes cifsInodeInfo_alloc
* cached_fid - > fid_mutex cifs_tcon - > crfid tconInfoAlloc
* cifsFileInfo - > fh_mutex cifsFileInfo cifs_new_fileinfo
* cifsFileInfo - > file_info_lock cifsFileInfo - > count cifs_new_fileinfo
* - > invalidHandle initiate_cifs_search
* - > oplock_break_cancelled
* cifs_aio_ctx - > aio_mutex cifs_aio_ctx cifs_aio_ctx_alloc
2005-04-17 02:20:36 +04:00
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */
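The "drop the lower lock, take the higher lock, then retake the lower one" rule from the comment above can be sketched in userspace with POSIX mutexes. This is only an illustration under assumed names (struct demo_locks and update_both_from_lower() are hypothetical); the kernel uses its own spinlock and mutex primitives.

```c
#include <pthread.h>

/* Two locks in a fixed order: "higher" must be taken before "lower". */
struct demo_locks {
	pthread_mutex_t higher;
	pthread_mutex_t lower;
	int higher_state;	/* guarded by higher */
	int lower_state;	/* guarded by lower */
};

/*
 * Called with d->lower held, returns with d->lower held.
 * Needs the higher lock as well, so it backs off and reacquires both locks
 * in table order instead of taking them bottom-up.
 */
static void update_both_from_lower(struct demo_locks *d)
{
	pthread_mutex_unlock(&d->lower);	/* drop the lower lock first */
	pthread_mutex_lock(&d->higher);		/* then take the locks top to bottom */
	pthread_mutex_lock(&d->lower);

	/*
	 * Anything read under "lower" before the drop may be stale now and
	 * would have to be revalidated here in real code.
	 */
	d->higher_state++;
	d->lower_state++;

	pthread_mutex_unlock(&d->higher);	/* caller still expects lower held */
}
```

The cost of this pattern is the window in which the lower lock is not held, which is why the comment also asks for lock expectations to be documented on each function.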
# ifdef DECLARE_GLOBALS_HERE
# define GLOBAL_EXTERN
# else
# define GLOBAL_EXTERN extern
# endif
2008-11-14 21:44:38 +03:00
/*
* the list of TCP_Server_Info structures , ie each of the sockets
2008-11-13 23:04:07 +03:00
* connecting our client to a distinct server ( ip address ) , is
2008-11-14 21:44:38 +03:00
* chained together by cifs_tcp_ses_list . The list of all our SMB
2008-11-13 23:04:07 +03:00
* sessions ( and from that the tree connections ) can be found
2008-11-14 21:44:38 +03:00
* by iterating over cifs_tcp_ses_list
*/
2022-07-16 07:57:08 +03:00
extern struct list_head cifs_tcp_ses_list ;
2008-11-14 21:44:38 +03:00
2008-11-15 19:12:47 +03:00
/*
* This lock protects the cifs_tcp_ses_list , the list of smb sessions per
* tcp session , and the list of tcon ' s per smb session . It also protects
2022-07-27 22:49:56 +03:00
* the reference counters for the server , smb session , and tcon .
2016-09-23 02:58:16 +03:00
* Generally , the locks should be taken in this order : cifs_tcp_ses_lock before
* tcon - > open_file_lock , and that before file - > file_info_lock , since the
* structure order is cifs_socket - - > cifs_ses - - > cifs_tcon - - > cifs_file
2008-11-15 19:12:47 +03:00
*/
2022-07-16 07:57:08 +03:00
extern spinlock_t cifs_tcp_ses_lock ;
2008-11-20 23:00:44 +03:00
2005-04-17 02:20:36 +04:00
/*
* Global transaction id ( XID ) information
*/
2022-07-25 06:47:59 +03:00
extern unsigned int GlobalCurrentXid ; /* protected by GlobalMid_Lock */
extern unsigned int GlobalTotalActiveXid ; /* protected by GlobalMid_Lock */
extern unsigned int GlobalMaxActiveXid ; /* protected by GlobalMid_Lock */
extern spinlock_t GlobalMid_Lock ; /* protects above & list operations on midQ entries */
2005-04-17 02:20:36 +04:00
/*
* Global counters , updated atomically
*/
2022-07-25 06:47:59 +03:00
extern atomic_t sesInfoAllocCount ;
extern atomic_t tconInfoAllocCount ;
extern atomic_t tcpSesNextId ;
extern atomic_t tcpSesAllocCount ;
extern atomic_t tcpSesReconnectCount ;
extern atomic_t tconInfoReconnectCount ;
2005-04-17 02:20:36 +04:00
2008-05-23 21:38:32 +04:00
/* Various Debug counters */
2022-07-16 07:45:45 +03:00
extern atomic_t buf_alloc_count ; /* current number allocated */
extern atomic_t small_buf_alloc_count ;
2005-12-04 00:58:57 +03:00
# ifdef CONFIG_CIFS_STATS2
2022-07-16 07:45:45 +03:00
extern atomic_t total_buf_alloc_count ; /* total allocated over all time */
extern atomic_t total_small_buf_alloc_count ;
2018-09-18 22:05:18 +03:00
extern unsigned int slow_rsp_threshold ; /* number of secs before logging */
2005-12-04 00:58:57 +03:00
# endif
2005-04-17 02:20:36 +04:00
/* Misc globals */
2018-05-24 12:11:07 +03:00
extern bool enable_oplocks ; /* enable or disable oplocks */
extern bool lookupCacheEnabled ;
extern unsigned int global_secflags ; /* if on, session setup sent
2005-04-17 02:20:36 +04:00
with more secure ntlmssp2 challenge / resp */
2018-05-24 12:11:07 +03:00
extern unsigned int sign_CIFS_PDUs ; /* enable smb packet signing */
2020-10-15 04:24:09 +03:00
extern bool enable_gcm_256 ; /* allow optional negotiate of strongest encryption (aes-gcm-256) */
2020-09-12 00:19:28 +03:00
extern bool require_gcm_256 ; /* require use of strongest encryption (aes-gcm-256) */
2021-07-05 23:05:39 +03:00
extern bool enable_negotiate_signing ; /* request use of faster (GMAC) signing if available */
2018-05-24 12:11:07 +03:00
extern bool linuxExtEnabled ; /*enable Linux/Unix CIFS extensions*/
extern unsigned int CIFSMaxBufSize ; /* max size not including hdr */
extern unsigned int cifs_min_rcv ; /* min size of big ntwrk buf pool */
extern unsigned int cifs_min_small ; /* min size of small buf pool */
extern unsigned int cifs_max_pending ; /* MAX requests at once to server*/
extern bool disable_legacy_dialects ; /* forbid vers=1.0 and vers=2.0 mounts */
2022-07-16 07:45:45 +03:00
extern atomic_t mid_count ;
2005-04-17 02:20:36 +04:00
2010-07-21 00:09:02 +04:00
void cifs_oplock_break ( struct work_struct * work ) ;
2019-03-29 12:49:12 +03:00
void cifs_queue_oplock_break ( struct cifsFileInfo * cfile ) ;
2021-04-13 08:26:42 +03:00
void smb2_deferred_work_close ( struct work_struct * work ) ;
2010-08-07 23:42:58 +04:00
2021-04-13 08:26:42 +03:00
extern const struct slow_work_ops cifs_oplock_break_ops ;
2012-03-23 22:40:53 +04:00
extern struct workqueue_struct * cifsiod_wq ;
2019-09-07 09:09:49 +03:00
extern struct workqueue_struct * decrypt_wq ;
cifs: move cifsFileInfo_put logic into a work-queue
This patch moves the final part of the cifsFileInfo_put() logic where we
need a write lock on lock_sem to be processed in a separate thread that
holds no other locks.
This is to prevent deadlocks like the one below:
> there are 6 processes looping while trying to down_write
> cinode->lock_sem, 5 of them from _cifsFileInfo_put, and one from
> cifs_new_fileinfo
>
> and there are 5 other processes which are blocked, several of them
> waiting on either PG_writeback or PG_locked (which are both set), all
> for the same page of the file
>
> 2 inode_lock() (inode->i_rwsem) for the file
> 1 wait_on_page_writeback() for the page
> 1 down_read(inode->i_rwsem) for the inode of the directory
> 1 inode_lock()(inode->i_rwsem) for the inode of the directory
> 1 __lock_page
>
>
> so processes are blocked waiting on:
> page flags PG_locked and PG_writeback for one specific page
> inode->i_rwsem for the directory
> inode->i_rwsem for the file
> cifsInodeInfo->lock_sem
>
>
>
> here are the more gory details (let me know if I need to provide
> anything more/better):
>
> [0 00:48:22.765] [UN] PID: 8863 TASK: ffff8c691547c5c0 CPU: 3
> COMMAND: "reopen_file"
> #0 [ffff9965007e3ba8] __schedule at ffffffff9b6e6095
> #1 [ffff9965007e3c38] schedule at ffffffff9b6e64df
> #2 [ffff9965007e3c48] rwsem_down_write_slowpath at ffffffff9af283d7
> #3 [ffff9965007e3cb8] legitimize_path at ffffffff9b0f975d
> #4 [ffff9965007e3d08] path_openat at ffffffff9b0fe55d
> #5 [ffff9965007e3dd8] do_filp_open at ffffffff9b100a33
> #6 [ffff9965007e3ee0] do_sys_open at ffffffff9b0eb2d6
> #7 [ffff9965007e3f38] do_syscall_64 at ffffffff9ae04315
> * (I think legitimize_path is bogus)
>
> in path_openat
> } else {
> const char *s = path_init(nd, flags);
> while (!(error = link_path_walk(s, nd)) &&
> (error = do_last(nd, file, op)) > 0) { <<<<
>
> do_last:
> if (open_flag & O_CREAT)
> inode_lock(dir->d_inode); <<<<
> else
> so it's trying to take inode->i_rwsem for the directory
>
> DENTRY INODE SUPERBLK TYPE PATH
> ffff8c68bb8e79c0 ffff8c691158ef20 ffff8c6915bf9000 DIR /mnt/vm1_smb/
> inode.i_rwsem is ffff8c691158efc0
>
> <struct rw_semaphore 0xffff8c691158efc0>:
> owner: <struct task_struct 0xffff8c6914275d00> (UN - 8856 -
> reopen_file), counter: 0x0000000000000003
> waitlist: 2
> 0xffff9965007e3c90 8863 reopen_file UN 0 1:29:22.926
> RWSEM_WAITING_FOR_WRITE
> 0xffff996500393e00 9802 ls UN 0 1:17:26.700
> RWSEM_WAITING_FOR_READ
>
>
> the owner of the inode.i_rwsem of the directory is:
>
> [0 00:00:00.109] [UN] PID: 8856 TASK: ffff8c6914275d00 CPU: 3
> COMMAND: "reopen_file"
> #0 [ffff99650065b828] __schedule at ffffffff9b6e6095
> #1 [ffff99650065b8b8] schedule at ffffffff9b6e64df
> #2 [ffff99650065b8c8] schedule_timeout at ffffffff9b6e9f89
> #3 [ffff99650065b940] msleep at ffffffff9af573a9
> #4 [ffff99650065b948] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs]
> #5 [ffff99650065ba38] cifs_writepage_locked at ffffffffc0a0b8f3 [cifs]
> #6 [ffff99650065bab0] cifs_launder_page at ffffffffc0a0bb72 [cifs]
> #7 [ffff99650065bb30] invalidate_inode_pages2_range at ffffffff9b04d4bd
> #8 [ffff99650065bcb8] cifs_invalidate_mapping at ffffffffc0a11339 [cifs]
> #9 [ffff99650065bcd0] cifs_revalidate_mapping at ffffffffc0a1139a [cifs]
> #10 [ffff99650065bcf0] cifs_d_revalidate at ffffffffc0a014f6 [cifs]
> #11 [ffff99650065bd08] path_openat at ffffffff9b0fe7f7
> #12 [ffff99650065bdd8] do_filp_open at ffffffff9b100a33
> #13 [ffff99650065bee0] do_sys_open at ffffffff9b0eb2d6
> #14 [ffff99650065bf38] do_syscall_64 at ffffffff9ae04315
>
> cifs_launder_page is for page 0xffffd1e2c07d2480
>
> crash> page.index,mapping,flags 0xffffd1e2c07d2480
> index = 0x8
> mapping = 0xffff8c68f3cd0db0
> flags = 0xfffffc0008095
>
> PAGE-FLAG BIT VALUE
> PG_locked 0 0000001
> PG_uptodate 2 0000004
> PG_lru 4 0000010
> PG_waiters 7 0000080
> PG_writeback 15 0008000
>
>
> inode is ffff8c68f3cd0c40
> inode.i_rwsem is ffff8c68f3cd0ce0
> DENTRY INODE SUPERBLK TYPE PATH
> ffff8c68a1f1b480 ffff8c68f3cd0c40 ffff8c6915bf9000 REG
> /mnt/vm1_smb/testfile.8853
>
>
> this process holds the inode->i_rwsem for the parent directory, is
> laundering a page attached to the inode of the file it's opening, and in
> _cifsFileInfo_put is trying to down_write the cifsInodeInfo->lock_sem
> for the file itself.
>
>
> <struct rw_semaphore 0xffff8c68f3cd0ce0>:
> owner: <struct task_struct 0xffff8c6914272e80> (UN - 8854 -
> reopen_file), counter: 0x0000000000000003
> waitlist: 1
> 0xffff9965005dfd80 8855 reopen_file UN 0 1:29:22.912
> RWSEM_WAITING_FOR_WRITE
>
> this is the inode.i_rwsem for the file
>
> the owner:
>
> [0 00:48:22.739] [UN] PID: 8854 TASK: ffff8c6914272e80 CPU: 2
> COMMAND: "reopen_file"
> #0 [ffff99650054fb38] __schedule at ffffffff9b6e6095
> #1 [ffff99650054fbc8] schedule at ffffffff9b6e64df
> #2 [ffff99650054fbd8] io_schedule at ffffffff9b6e68e2
> #3 [ffff99650054fbe8] __lock_page at ffffffff9b03c56f
> #4 [ffff99650054fc80] pagecache_get_page at ffffffff9b03dcdf
> #5 [ffff99650054fcc0] grab_cache_page_write_begin at ffffffff9b03ef4c
> #6 [ffff99650054fcd0] cifs_write_begin at ffffffffc0a064ec [cifs]
> #7 [ffff99650054fd30] generic_perform_write at ffffffff9b03bba4
> #8 [ffff99650054fda8] __generic_file_write_iter at ffffffff9b04060a
> #9 [ffff99650054fdf0] cifs_strict_writev.cold.70 at ffffffffc0a4469b [cifs]
> #10 [ffff99650054fe48] new_sync_write at ffffffff9b0ec1dd
> #11 [ffff99650054fed0] vfs_write at ffffffff9b0eed35
> #12 [ffff99650054ff00] ksys_write at ffffffff9b0eefd9
> #13 [ffff99650054ff38] do_syscall_64 at ffffffff9ae04315
>
> the process holds the inode->i_rwsem for the file to which it's writing,
> and is trying to __lock_page for the same page as in the other processes
>
>
> the other tasks:
> [0 00:00:00.028] [UN] PID: 8859 TASK: ffff8c6915479740 CPU: 2
> COMMAND: "reopen_file"
> #0 [ffff9965007b39d8] __schedule at ffffffff9b6e6095
> #1 [ffff9965007b3a68] schedule at ffffffff9b6e64df
> #2 [ffff9965007b3a78] schedule_timeout at ffffffff9b6e9f89
> #3 [ffff9965007b3af0] msleep at ffffffff9af573a9
> #4 [ffff9965007b3af8] cifs_new_fileinfo.cold.61 at ffffffffc0a42a07 [cifs]
> #5 [ffff9965007b3b78] cifs_open at ffffffffc0a0709d [cifs]
> #6 [ffff9965007b3cd8] do_dentry_open at ffffffff9b0e9b7a
> #7 [ffff9965007b3d08] path_openat at ffffffff9b0fe34f
> #8 [ffff9965007b3dd8] do_filp_open at ffffffff9b100a33
> #9 [ffff9965007b3ee0] do_sys_open at ffffffff9b0eb2d6
> #10 [ffff9965007b3f38] do_syscall_64 at ffffffff9ae04315
>
> this is opening the file, and is trying to down_write cinode->lock_sem
>
>
> [0 00:00:00.041] [UN] PID: 8860 TASK: ffff8c691547ae80 CPU: 2
> COMMAND: "reopen_file"
> [0 00:00:00.057] [UN] PID: 8861 TASK: ffff8c6915478000 CPU: 3
> COMMAND: "reopen_file"
> [0 00:00:00.059] [UN] PID: 8858 TASK: ffff8c6914271740 CPU: 2
> COMMAND: "reopen_file"
> [0 00:00:00.109] [UN] PID: 8862 TASK: ffff8c691547dd00 CPU: 6
> COMMAND: "reopen_file"
> #0 [ffff9965007c3c78] __schedule at ffffffff9b6e6095
> #1 [ffff9965007c3d08] schedule at ffffffff9b6e64df
> #2 [ffff9965007c3d18] schedule_timeout at ffffffff9b6e9f89
> #3 [ffff9965007c3d90] msleep at ffffffff9af573a9
> #4 [ffff9965007c3d98] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs]
> #5 [ffff9965007c3e88] cifs_close at ffffffffc0a07aaf [cifs]
> #6 [ffff9965007c3ea0] __fput at ffffffff9b0efa6e
> #7 [ffff9965007c3ee8] task_work_run at ffffffff9aef1614
> #8 [ffff9965007c3f20] exit_to_usermode_loop at ffffffff9ae03d6f
> #9 [ffff9965007c3f38] do_syscall_64 at ffffffff9ae0444c
>
> closing the file, and trying to down_write cifsi->lock_sem
>
>
> [0 00:48:22.839] [UN] PID: 8857 TASK: ffff8c6914270000 CPU: 7
> COMMAND: "reopen_file"
> #0 [ffff9965006a7cc8] __schedule at ffffffff9b6e6095
> #1 [ffff9965006a7d58] schedule at ffffffff9b6e64df
> #2 [ffff9965006a7d68] io_schedule at ffffffff9b6e68e2
> #3 [ffff9965006a7d78] wait_on_page_bit at ffffffff9b03cac6
> #4 [ffff9965006a7e10] __filemap_fdatawait_range at ffffffff9b03b028
> #5 [ffff9965006a7ed8] filemap_write_and_wait at ffffffff9b040165
> #6 [ffff9965006a7ef0] cifs_flush at ffffffffc0a0c2fa [cifs]
> #7 [ffff9965006a7f10] filp_close at ffffffff9b0e93f1
> #8 [ffff9965006a7f30] __x64_sys_close at ffffffff9b0e9a0e
> #9 [ffff9965006a7f38] do_syscall_64 at ffffffff9ae04315
>
> in __filemap_fdatawait_range
> wait_on_page_writeback(page);
> for the same page of the file
>
>
>
> [0 00:48:22.718] [UN] PID: 8855 TASK: ffff8c69142745c0 CPU: 7
> COMMAND: "reopen_file"
> #0 [ffff9965005dfc98] __schedule at ffffffff9b6e6095
> #1 [ffff9965005dfd28] schedule at ffffffff9b6e64df
> #2 [ffff9965005dfd38] rwsem_down_write_slowpath at ffffffff9af283d7
> #3 [ffff9965005dfdf0] cifs_strict_writev at ffffffffc0a0c40a [cifs]
> #4 [ffff9965005dfe48] new_sync_write at ffffffff9b0ec1dd
> #5 [ffff9965005dfed0] vfs_write at ffffffff9b0eed35
> #6 [ffff9965005dff00] ksys_write at ffffffff9b0eefd9
> #7 [ffff9965005dff38] do_syscall_64 at ffffffff9ae04315
>
> inode_lock(inode);
>
>
> and one 'ls' later on, to see whether the rest of the mount is available
> (the test file is in the root, so we get blocked up on the directory
> ->i_rwsem), so the entire mount is unavailable
>
> [0 00:36:26.473] [UN] PID: 9802 TASK: ffff8c691436ae80 CPU: 4
> COMMAND: "ls"
> #0 [ffff996500393d28] __schedule at ffffffff9b6e6095
> #1 [ffff996500393db8] schedule at ffffffff9b6e64df
> #2 [ffff996500393dc8] rwsem_down_read_slowpath at ffffffff9b6e9421
> #3 [ffff996500393e78] down_read_killable at ffffffff9b6e95e2
> #4 [ffff996500393e88] iterate_dir at ffffffff9b103c56
> #5 [ffff996500393ec8] ksys_getdents64 at ffffffff9b104b0c
> #6 [ffff996500393f30] __x64_sys_getdents64 at ffffffff9b104bb6
> #7 [ffff996500393f38] do_syscall_64 at ffffffff9ae04315
>
> in iterate_dir:
> if (shared)
> res = down_read_killable(&inode->i_rwsem); <<<<
> else
> res = down_write_killable(&inode->i_rwsem);
>
Reported-by: Frank Sorenson <sorenson@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
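The `flags` word quoted in the crash output above can be decoded by hand. The following is a minimal user-space sketch of that decoding, not kernel code: the bit positions are copied from the PAGE-FLAG table in the report and are only assumed to hold for that particular kernel build, and the names `reported_flags`, `page_flag_is_set` and `print_set_flags` are hypothetical helpers introduced for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Bit positions copied from the PAGE-FLAG table in the report.
 * They are specific to that kernel build (assumption); other builds
 * may number the flags differently.
 */
struct page_flag_name {
	unsigned int bit;
	const char *name;
};

static const struct page_flag_name reported_flags[] = {
	{ 0,  "PG_locked" },
	{ 2,  "PG_uptodate" },
	{ 4,  "PG_lru" },
	{ 7,  "PG_waiters" },
	{ 15, "PG_writeback" },
};

/* Return 1 if the given bit is set in the flags word, 0 otherwise. */
static int page_flag_is_set(unsigned long long flags, unsigned int bit)
{
	return (int)((flags >> bit) & 1ULL);
}

/* Print every flag from the table that is set; return how many were set. */
static size_t print_set_flags(unsigned long long flags)
{
	size_t i, count = 0;

	for (i = 0; i < sizeof(reported_flags) / sizeof(reported_flags[0]); i++) {
		if (page_flag_is_set(flags, reported_flags[i].bit)) {
			printf("%-12s bit %2u\n", reported_flags[i].name,
			       reported_flags[i].bit);
			count++;
		}
	}
	return count;
}
```

Calling `print_set_flags(0xfffffc0008095ULL)` lists the five flags from the report. A page that is both locked and under writeback is what the blocked `__lock_page` and `wait_on_page_bit` callers in the traces are waiting on.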
2019-11-03 06:06:37 +03:00
extern struct workqueue_struct *fileinfo_put_wq;
2017-05-03 18:54:01 +03:00
extern struct workqueue_struct *cifsoplockd_wq;
2021-04-13 08:26:42 +03:00
extern struct workqueue_struct *deferredclose_wq;
2016-05-24 13:27:44 +03:00
extern __u32 cifs_lock_secret;
2010-06-22 19:22:50 +04:00
2011-12-26 22:53:34 +04:00
extern mempool_t *cifs_mid_poolp;
2012-05-15 20:20:51 +04:00
/* Operations for different SMB versions */
#define SMB1_VERSION_STRING "1.0"
2022-06-02 06:08:46 +03:00
#define SMB20_VERSION_STRING "2.0"
#ifdef CONFIG_CIFS_ALLOW_INSECURE_LEGACY
2012-05-15 20:20:51 +04:00
extern struct smb_version_operations smb1_operations;
extern struct smb_version_values smb1_values;
2013-09-05 16:11:28 +04:00
extern struct smb_version_operations smb20_operations;
2012-10-01 21:26:22 +04:00
extern struct smb_version_values smb20_values;
2022-06-02 06:08:46 +03:00
#endif /* CIFS_ALLOW_INSECURE_LEGACY */
2011-02-24 21:07:19 +03:00
#define SMB21_VERSION_STRING "2.1"
extern struct smb_version_operations smb21_operations;
extern struct smb_version_values smb21_values;
2017-09-17 18:41:35 +03:00
#define SMBDEFAULT_VERSION_STRING "default"
extern struct smb_version_values smbdefault_values;
#define SMB3ANY_VERSION_STRING "3"
extern struct smb_version_values smb3any_values;
2012-10-01 21:26:22 +04:00
#define SMB30_VERSION_STRING "3.0"
2012-12-09 08:08:06 +04:00
extern struct smb_version_operations smb30_operations;
2012-10-01 21:26:22 +04:00
extern struct smb_version_values smb30_values;
2013-06-13 07:48:41 +04:00
#define SMB302_VERSION_STRING "3.02"
2018-11-17 08:03:30 +03:00
#define ALT_SMB302_VERSION_STRING "3.0.2"
2013-06-13 07:48:41 +04:00
/*extern struct smb_version_operations smb302_operations;*/ /* not needed yet */
extern struct smb_version_values smb302_values;
2014-12-18 07:52:58 +03:00
#define SMB311_VERSION_STRING "3.1.1"
2015-06-24 07:37:11 +03:00
#define ALT_SMB311_VERSION_STRING "3.11"
extern struct smb_version_operations smb311_operations;
2014-12-18 07:52:58 +03:00
extern struct smb_version_values smb311_values;
2019-11-18 23:04:08 +03:00
2020-06-09 07:42:10 +03:00
static inline char *get_security_type_str(enum securityEnum sectype)
{
	switch (sectype) {
	case RawNTLMSSP:
		return "RawNTLMSSP";
	case Kerberos:
		return "Kerberos";
	case NTLMv2:
		return "NTLMv2";
	default:
		return "Unknown";
	}
}
2019-11-18 23:04:08 +03:00
static inline bool is_smb1_server(struct TCP_Server_Info *server)
{
	return strcmp(server->vals->version_string, SMB1_VERSION_STRING) == 0;
}
2020-08-27 17:20:19 +03:00
static inline bool is_tcon_dfs(struct cifs_tcon *tcon)
{
	/*
	 * For SMB1, see MS-CIFS 2.4.55 SMB_COM_TREE_CONNECT_ANDX (0x75) and MS-CIFS 3.3.4.4 DFS
	 * Subsystem Notifies That a Share Is a DFS Share.
	 *
	 * For SMB2+, see MS-SMB2 2.2.10 SMB2 TREE_CONNECT Response and MS-SMB2 3.3.4.14 Server
	 * Application Updates a Share.
	 */
	if (!tcon || !tcon->ses || !tcon->ses->server)
		return false;
	return is_smb1_server(tcon->ses->server) ? tcon->Flags & SMB_SHARE_IS_IN_DFS :
		tcon->share_flags & (SHI1005_FLAGS_DFS | SHI1005_FLAGS_DFS_ROOT);
}
2021-11-03 19:53:29 +03:00
static inline bool cifs_is_referral_server(struct cifs_tcon *tcon,
					   const struct dfs_info3_param *ref)
{
	/*
	 * Check if all targets are capable of handling DFS referrals as per
	 * MS-DFSC 2.2.4 RESP_GET_DFS_REFERRAL.
	 */
	return is_tcon_dfs(tcon) || (ref && (ref->flags & DFSREF_REFERRAL_SERVER));
}
cifs: fix lock length calculation
The lock length was wrongly set to 0 when fl_end == OFFSET_MAX, thus
failing to lock the whole file when l_start=0 and l_len=0.
This fixes test 2 from cthon04.
Before patch:
$ ./cthon04/lock/tlocklfs -t 2 /mnt
Creating parent/child synchronization pipes.
Test #1 - Test regions of an unlocked file.
Parent: 1.1 - F_TEST [ 0, 1] PASSED.
Parent: 1.2 - F_TEST [ 0, ENDING] PASSED.
Parent: 1.3 - F_TEST [ 0,7fffffffffffffff] PASSED.
Parent: 1.4 - F_TEST [ 1, 1] PASSED.
Parent: 1.5 - F_TEST [ 1, ENDING] PASSED.
Parent: 1.6 - F_TEST [ 1,7fffffffffffffff] PASSED.
Parent: 1.7 - F_TEST [7fffffffffffffff, 1] PASSED.
Parent: 1.8 - F_TEST [7fffffffffffffff, ENDING] PASSED.
Parent: 1.9 - F_TEST [7fffffffffffffff,7fffffffffffffff] PASSED.
Test #2 - Try to lock the whole file.
Parent: 2.0 - F_TLOCK [ 0, ENDING] PASSED.
Child: 2.1 - F_TEST [ 0, 1] FAILED!
Child: **** Expected EACCES, returned success...
Child: **** Probably implementation error.
** CHILD pass 1 results: 0/0 pass, 0/0 warn, 1/1 fail (pass/total).
Parent: Child died
** PARENT pass 1 results: 10/10 pass, 0/0 warn, 0/0 fail (pass/total).
After patch:
$ ./cthon04/lock/tlocklfs -t 2 /mnt
Creating parent/child synchronization pipes.
Test #2 - Try to lock the whole file.
Parent: 2.0 - F_TLOCK [ 0, ENDING] PASSED.
Child: 2.1 - F_TEST [ 0, 1] PASSED.
Child: 2.2 - F_TEST [ 0, ENDING] PASSED.
Child: 2.3 - F_TEST [ 0,7fffffffffffffff] PASSED.
Child: 2.4 - F_TEST [ 1, 1] PASSED.
Child: 2.5 - F_TEST [ 1, ENDING] PASSED.
Child: 2.6 - F_TEST [ 1,7fffffffffffffff] PASSED.
Child: 2.7 - F_TEST [7fffffffffffffff, 1] PASSED.
Child: 2.8 - F_TEST [7fffffffffffffff, ENDING] PASSED.
Child: 2.9 - F_TEST [7fffffffffffffff,7fffffffffffffff] PASSED.
Parent: 2.10 - F_ULOCK [ 0, ENDING] PASSED.
** PARENT pass 1 results: 2/2 pass, 0/0 warn, 0/0 fail (pass/total).
** CHILD pass 1 results: 9/9 pass, 0/0 warn, 0/0 fail (pass/total).
Fixes: d80c69846ddf ("cifs: fix signed integer overflow when fl_end is OFFSET_MAX")
Reported-by: Xiaoli Feng <xifeng@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
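The arithmetic behind this fix can be checked in user space. The sketch below is an illustration under stated assumptions, not the kernel helper itself: `SKETCH_OFFSET_MAX` stands in for the kernel's `OFFSET_MAX`, and `sketch_flock_len` is a hypothetical name. It shows that casting the end offset to an unsigned 64-bit type before adding 1 gives a non-zero length for a whole-file lock instead of overflowing a signed type.

```c
#include <stdint.h>

/* Stand-in for the kernel's OFFSET_MAX (assumed: largest signed 64-bit offset). */
#define SKETCH_OFFSET_MAX INT64_MAX

/*
 * Length of the inclusive byte range [start, end].
 * Converting to an unsigned type first keeps the "+ 1" from overflowing
 * a signed type when end == SKETCH_OFFSET_MAX.
 */
static uint64_t sketch_flock_len(int64_t start, int64_t end)
{
	return (uint64_t)end - (uint64_t)start + 1;
}
```

With `start = 0` and `end = SKETCH_OFFSET_MAX` the result is `0x8000000000000000`, so a request made with l_start=0 and l_len=0 covers the whole file rather than a zero-length range.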
2022-08-08 19:41:18 +03:00
static inline u64 cifs_flock_len(const struct file_lock *fl)
2022-05-19 18:18:37 +03:00
{
2022-08-08 19:41:18 +03:00
	return (u64)fl->fl_end - fl->fl_start + 1;
2022-05-19 18:18:37 +03:00
}
2022-05-25 15:37:04 +03:00
static inline size_t ntlmssp_workstation_name_size(const struct cifs_ses *ses)
{
	if (WARN_ON_ONCE(!ses || !ses->server))
		return 0;
	/*
	 * Make workstation name no more than 15 chars when using insecure dialects as some legacy
	 * servers do require it during NTLMSSP.
	 */
	if (ses->server->dialect <= SMB20_PROT_ID)
		return min_t(size_t, sizeof(ses->workstation_name), RFC1001_NAME_LEN_WITH_NULL);
	return sizeof(ses->workstation_name);
}
2022-10-04 00:43:50 +03:00
static inline void move_cifs_info_to_smb2(struct smb2_file_all_info *dst, const FILE_ALL_INFO *src)
{
	memcpy(dst, src, (size_t)((u8 *)&src->AccessFlags - (u8 *)src));
	dst->AccessFlags = src->AccessFlags;
	dst->CurrentByteOffset = src->CurrentByteOffset;
	dst->Mode = src->Mode;
	dst->AlignmentRequirement = src->AlignmentRequirement;
	dst->FileNameLength = src->FileNameLength;
}
cifs: Change the I/O paths to use an iterator rather than a page list
Currently, the cifs I/O paths hand lists of pages from the VM interface
routines at the top all the way through the intervening layers to the
socket interface at the bottom.
This is a problem, however, for interfacing with netfslib which passes an
iterator through to the ->issue_read() method (and will pass an iterator
through to the ->issue_write() method in future). Netfslib takes over
bounce buffering for direct I/O, async I/O and encrypted content, so cifs
doesn't need to do that. Netfslib also converts IOVEC-type iterators into
BVEC-type iterators if necessary.
Further, cifs needs foliating - and folios may come in a variety of sizes,
so a page list pointing to an array of heterogeneous pages may cause
problems in places such as where crypto is done.
Change the cifs I/O paths to hand iov_iter iterators all the way through
instead.
Notes:
(1) Some old routines are #if'd out, to be removed in a follow-up patch,
so that the diff output is easier to follow. I've removed functions
that don't overlap with anything added.
(2) struct smb_rqst loses rq_pages, rq_offset, rq_npages, rq_pagesz and
rq_tailsz which describe the pages forming the buffer; instead there's
an rq_iter describing the source buffer and an rq_buffer which is used
to hold the buffer for encryption.
(3) struct cifs_readdata and cifs_writedata are similarly modified to
smb_rqst. The ->read_into_pages() and ->copy_into_pages() are then
replaced with passing the iterator directly to the socket.
The iterators are stored in these structs so that they are persistent
and don't get deallocated when the function returns (unlike if they
were stack variables).
(4) Buffered writeback is overhauled, borrowing the code from the afs
filesystem to gather up contiguous runs of folios. The XARRAY-type
iterator is then used to refer directly to the pagecache and can be
passed to the socket to transmit data directly from there.
This includes:
cifs_extend_writeback()
cifs_write_back_from_locked_folio()
cifs_writepages_region()
cifs_writepages()
(5) Pages are converted to folios.
(6) Direct I/O uses netfs_extract_user_iter() to create a BVEC-type
iterator from an IOBUF/UBUF-type source iterator.
(7) smb2_get_aead_req() uses netfs_extract_iter_to_sg() to extract page
fragments from the iterator into the scatterlists that the crypto
layer prefers.
(8) smb2_init_transform_rq() attached pages to smb_rqst::rq_buffer, an
xarray, to use as a bounce buffer for encryption. An XARRAY-type
iterator can then be used to pass the bounce buffer to lower layers.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <sfrench@samba.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Paulo Alcantara <pc@cjr.nz>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
Link: https://lore.kernel.org/r/164311907995.2806745.400147335497304099.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/164928620163.457102.11602306234438271112.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165211420279.3154751.15923591172438186144.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165348880385.2106726.3220789453472800240.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165364827111.3334034.934805882842932881.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/166126396180.708021.271013668175370826.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/166697259595.61150.5982032408321852414.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/166732031756.3186319.12528413619888902872.stgit@warthog.procyon.org.uk/ # rfc
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-01-25 00:13:24 +03:00
static inline int cifs_get_num_sgs(const struct smb_rqst *rqst,
				   int num_rqst,
				   const u8 *sig)
cifs: fix oops during encryption
When running xfstests against Azure the following oops occurred on an
arm64 system
Unable to handle kernel write to read-only memory at virtual address
ffff0001221cf000
Mem abort info:
ESR = 0x9600004f
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x0f: level 3 permission fault
Data abort info:
ISV = 0, ISS = 0x0000004f
CM = 0, WnR = 1
swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000294f3000
[ffff0001221cf000] pgd=18000001ffff8003, p4d=18000001ffff8003,
pud=18000001ff82e003, pmd=18000001ff71d003, pte=00600001221cf787
Internal error: Oops: 9600004f [#1] PREEMPT SMP
...
pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
pc : __memcpy+0x40/0x230
lr : scatterwalk_copychunks+0xe0/0x200
sp : ffff800014e92de0
x29: ffff800014e92de0 x28: ffff000114f9de80 x27: 0000000000000008
x26: 0000000000000008 x25: ffff800014e92e78 x24: 0000000000000008
x23: 0000000000000001 x22: 0000040000000000 x21: ffff000000000000
x20: 0000000000000001 x19: ffff0001037c4488 x18: 0000000000000014
x17: 235e1c0d6efa9661 x16: a435f9576b6edd6c x15: 0000000000000058
x14: 0000000000000001 x13: 0000000000000008 x12: ffff000114f2e590
x11: ffffffffffffffff x10: 0000040000000000 x9 : ffff8000105c3580
x8 : 2e9413b10000001a x7 : 534b4410fb86b005 x6 : 534b4410fb86b005
x5 : ffff0001221cf008 x4 : ffff0001037c4490 x3 : 0000000000000001
x2 : 0000000000000008 x1 : ffff0001037c4488 x0 : ffff0001221cf000
Call trace:
__memcpy+0x40/0x230
scatterwalk_map_and_copy+0x98/0x100
crypto_ccm_encrypt+0x150/0x180
crypto_aead_encrypt+0x2c/0x40
crypt_message+0x750/0x880
smb3_init_transform_rq+0x298/0x340
smb_send_rqst.part.11+0xd8/0x180
smb_send_rqst+0x3c/0x100
compound_send_recv+0x534/0xbc0
smb2_query_info_compound+0x32c/0x440
smb2_set_ea+0x438/0x4c0
cifs_xattr_set+0x5d4/0x7c0
This is because in scatterwalk_copychunks(), we attempted to write to
a buffer (@sign) that was allocated in the stack (vmalloc area) by
crypt_message() and thus accessing its remaining 8 (x2) bytes ended up
crossing a page boundary.
To simply fix it, we could just pass @sign kmalloc'd from
crypt_message() and then we're done. Luckily, we don't seem to pass
any other vmalloc'd buffers in smb_rqst::rq_iov...
Instead, let's map the correct pages and offsets from vmalloc buffers
as well in cifs_sg_set_buf(), thereby avoiding such oopses.
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
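The page-boundary problem described above comes down to counting how many pages a buffer touches. The following user-space sketch illustrates that count using the same round-up formula as the vmalloc branch of cifs_get_num_sgs(). It assumes a 4096-byte page (the oops reports 4k pages); `SKETCH_PAGE_SIZE`, `sketch_offset_in_page` and `sketch_pages_spanned` are hypothetical names, not kernel APIs.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumed page size, for illustration only */

/* Offset of an address within its page. */
static unsigned long sketch_offset_in_page(uintptr_t addr)
{
	return (unsigned long)(addr & (SKETCH_PAGE_SIZE - 1));
}

/*
 * Number of pages touched by the byte range [addr, addr + len).
 * A buffer that is not physically contiguous needs one scatterlist
 * entry per page it touches, so this is the entry count for it.
 */
static unsigned long sketch_pages_spanned(uintptr_t addr, size_t len)
{
	return (sketch_offset_in_page(addr) + len + SKETCH_PAGE_SIZE - 1) /
	       SKETCH_PAGE_SIZE;
}
```

For example, a 16-byte buffer that starts 8 bytes before the end of a page spans two pages, so treating it as a single contiguous entry would run past the first page, which matches the failure mode in the report.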
2022-12-12 00:18:55 +03:00
{
	unsigned int len, skip;
	unsigned int nents = 0;
	unsigned long addr;
	int i, j;
2023-01-31 19:22:07 +03:00
	/*
	 * The first rqst has a transform header where the first 20 bytes are
	 * not part of the encrypted blob.
	 */
	skip = 20;
2022-12-12 00:18:55 +03:00
	/* Assumes the first rqst has a transform header as the first iov.
	 * I.e.
	 * rqst[0].rq_iov[0] is transform header
	 * rqst[0].rq_iov[1+] data to be encrypted/decrypted
	 * rqst[1+].rq_iov[0+] data to be encrypted/decrypted
	 */
	for (i = 0; i < num_rqst; i++) {
2022-01-25 00:13:24 +03:00
		/* We really don't want a mixture of pinned and unpinned pages
		 * in the sglist. It's hard to keep track of which is what.
		 * Instead, we convert to a BVEC-type iterator higher up.
		 */
		if (WARN_ON_ONCE(user_backed_iter(&rqst[i].rq_iter)))
			return -EIO;
		/* We also don't want to have any extra refs or pins to clean
		 * up in the sglist.
		 */
		if (WARN_ON_ONCE(iov_iter_extract_will_pin(&rqst[i].rq_iter)))
			return -EIO;
2022-12-12 00:18:55 +03:00
		for (j = 0; j < rqst[i].rq_nvec; j++) {
			struct kvec *iov = &rqst[i].rq_iov[j];
			addr = (unsigned long)iov->iov_base + skip;
			if (unlikely(is_vmalloc_addr((void *)addr))) {
				len = iov->iov_len - skip;
				nents += DIV_ROUND_UP(offset_in_page(addr) + len,
						      PAGE_SIZE);
			} else {
				nents++;
			}
2023-01-31 19:22:07 +03:00
			skip = 0;
cifs: fix oops during encryption
When running xfstests against Azure the following oops occurred on an
arm64 system
Unable to handle kernel write to read-only memory at virtual address
ffff0001221cf000
Mem abort info:
ESR = 0x9600004f
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x0f: level 3 permission fault
Data abort info:
ISV = 0, ISS = 0x0000004f
CM = 0, WnR = 1
swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000294f3000
[ffff0001221cf000] pgd=18000001ffff8003, p4d=18000001ffff8003,
pud=18000001ff82e003, pmd=18000001ff71d003, pte=00600001221cf787
Internal error: Oops: 9600004f [#1] PREEMPT SMP
...
pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
pc : __memcpy+0x40/0x230
lr : scatterwalk_copychunks+0xe0/0x200
sp : ffff800014e92de0
x29: ffff800014e92de0 x28: ffff000114f9de80 x27: 0000000000000008
x26: 0000000000000008 x25: ffff800014e92e78 x24: 0000000000000008
x23: 0000000000000001 x22: 0000040000000000 x21: ffff000000000000
x20: 0000000000000001 x19: ffff0001037c4488 x18: 0000000000000014
x17: 235e1c0d6efa9661 x16: a435f9576b6edd6c x15: 0000000000000058
x14: 0000000000000001 x13: 0000000000000008 x12: ffff000114f2e590
x11: ffffffffffffffff x10: 0000040000000000 x9 : ffff8000105c3580
x8 : 2e9413b10000001a x7 : 534b4410fb86b005 x6 : 534b4410fb86b005
x5 : ffff0001221cf008 x4 : ffff0001037c4490 x3 : 0000000000000001
x2 : 0000000000000008 x1 : ffff0001037c4488 x0 : ffff0001221cf000
Call trace:
__memcpy+0x40/0x230
scatterwalk_map_and_copy+0x98/0x100
crypto_ccm_encrypt+0x150/0x180
crypto_aead_encrypt+0x2c/0x40
crypt_message+0x750/0x880
smb3_init_transform_rq+0x298/0x340
smb_send_rqst.part.11+0xd8/0x180
smb_send_rqst+0x3c/0x100
compound_send_recv+0x534/0xbc0
smb2_query_info_compound+0x32c/0x440
smb2_set_ea+0x438/0x4c0
cifs_xattr_set+0x5d4/0x7c0
This is because in scatterwalk_copychunks(), we attempted to write to
a buffer (@sign) that was allocated in the stack (vmalloc area) by
crypt_message() and thus accessing its remaining 8 (x2) bytes ended up
crossing a page boundary.
To simply fix it, we could just pass @sign kmalloc'd from
crypt_message() and then we're done. Luckily, we don't seem to pass
any other vmalloc'd buffers in smb_rqst::rq_iov...
Instead, let's also map the correct pages and offsets from vmalloc
buffers in cifs_sg_set_buf() and thus avoid such oopses.
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-12-12 00:18:55 +03:00
		}
cifs: Change the I/O paths to use an iterator rather than a page list
Currently, the cifs I/O paths hand lists of pages from the VM interface
routines at the top all the way through the intervening layers to the
socket interface at the bottom.
This is a problem, however, for interfacing with netfslib which passes an
iterator through to the ->issue_read() method (and will pass an iterator
through to the ->issue_write() method in future). Netfslib takes over
bounce buffering for direct I/O, async I/O and encrypted content, so cifs
doesn't need to do that. Netfslib also converts IOVEC-type iterators into
BVEC-type iterators if necessary.
Further, cifs needs foliating - and folios may come in a variety of sizes,
so a page list pointing to an array of heterogeneous pages may cause
problems in places such as where crypto is done.
Change the cifs I/O paths to hand iov_iter iterators all the way through
instead.
Notes:
(1) Some old routines are #if'd out to be removed in a follow up patch so
as to avoid confusing diff, thereby making the diff output easier to
follow. I've removed functions that don't overlap with anything
added.
(2) struct smb_rqst loses rq_pages, rq_offset, rq_npages, rq_pagesz and
rq_tailsz which describe the pages forming the buffer; instead there's
an rq_iter describing the source buffer and an rq_buffer which is used
to hold the buffer for encryption.
(3) struct cifs_readdata and cifs_writedata are similarly modified to
smb_rqst. The ->read_into_pages() and ->copy_into_pages() are then
replaced with passing the iterator directly to the socket.
The iterators are stored in these structs so that they are persistent
and don't get deallocated when the function returns (unlike if they
were stack variables).
(4) Buffered writeback is overhauled, borrowing the code from the afs
filesystem to gather up contiguous runs of folios. The XARRAY-type
iterator is then used to refer directly to the pagecache and can be
passed to the socket to transmit data directly from there.
This includes:
cifs_extend_writeback()
cifs_write_back_from_locked_folio()
cifs_writepages_region()
cifs_writepages()
(5) Pages are converted to folios.
(6) Direct I/O uses netfs_extract_user_iter() to create a BVEC-type
iterator from an IOBUF/UBUF-type source iterator.
(7) smb2_get_aead_req() uses netfs_extract_iter_to_sg() to extract page
fragments from the iterator into the scatterlists that the crypto
layer prefers.
(8) smb2_init_transform_rq() attaches pages to smb_rqst::rq_buffer, an
xarray, to use as a bounce buffer for encryption. An XARRAY-type
iterator can then be used to pass the bounce buffer to lower layers.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <sfrench@samba.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Paulo Alcantara <pc@cjr.nz>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
Link: https://lore.kernel.org/r/164311907995.2806745.400147335497304099.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/164928620163.457102.11602306234438271112.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165211420279.3154751.15923591172438186144.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165348880385.2106726.3220789453472800240.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165364827111.3334034.934805882842932881.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/166126396180.708021.271013668175370826.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/166697259595.61150.5982032408321852414.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/166732031756.3186319.12528413619888902872.stgit@warthog.procyon.org.uk/ # rfc
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-01-25 00:13:24 +03:00
		nents += iov_iter_npages(&rqst[i].rq_iter, INT_MAX);
	}
	nents += DIV_ROUND_UP(offset_in_page(sig) + SMB2_SIGNATURE_SIZE, PAGE_SIZE);
	return nents;
}
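The entry count above depends on how many pages a buffer touches, which is what the `DIV_ROUND_UP(offset_in_page(addr) + len, PAGE_SIZE)` expression computes for vmalloc'd buffers. The following is a minimal userspace sketch of that arithmetic, not the kernel code itself: `DEMO_PAGE_SIZE`, `demo_offset_in_page()` and `demo_pages_spanned()` are hypothetical stand-ins, and a 4 KiB page size is assumed to match the "4k pages" line in the oops report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as in the oops report quoted above. */
#define DEMO_PAGE_SIZE 4096UL

/* Hypothetical userspace stand-in for the kernel's offset_in_page(). */
static unsigned long demo_offset_in_page(uintptr_t addr)
{
	return addr & (DEMO_PAGE_SIZE - 1);
}

/*
 * Number of pages touched by the byte range [addr, addr + len).
 * Same arithmetic as DIV_ROUND_UP(offset_in_page(addr) + len, PAGE_SIZE).
 */
static unsigned int demo_pages_spanned(uintptr_t addr, size_t len)
{
	assert(len > 0);
	return (unsigned int)((demo_offset_in_page(addr) + len +
			       DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE);
}
```

Under these assumptions, a 16-byte buffer that starts 8 bytes before a page boundary spans two pages and therefore needs two scatterlist entries, while the same buffer starting at a page boundary needs only one. This appears to be the situation the oops describes, where the last 8 bytes of the signature fell on the next page.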
/* We cannot use the normal sg_set_buf() as we will sometimes pass a
 * stack object as buf, which may live in the vmalloc area.
 */
static inline void cifs_sg_set_buf(struct sg_table *sgtable,
				   const void *buf,
				   unsigned int buflen)
{
	unsigned long addr = (unsigned long)buf;
	unsigned int off = offset_in_page(addr);
	addr &= PAGE_MASK;
	if (unlikely(is_vmalloc_addr((void *)addr))) {
		do {
			unsigned int len = min_t(unsigned int, buflen, PAGE_SIZE - off);
			sg_set_page(&sgtable->sgl[sgtable->nents++],
				    vmalloc_to_page((void *)addr), len, off);
			off = 0;
			addr += PAGE_SIZE;
			buflen -= len;
		} while (buflen);
	} else {
		sg_set_page(&sgtable->sgl[sgtable->nents++],
			    virt_to_page(addr), buflen, off);
	}
}
2010-06-22 19:22:50 +04:00
#endif /* _CIFS_GLOB_H */
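The vmalloc branch of cifs_sg_set_buf() walks the buffer one page at a time because virtually contiguous vmalloc pages need not be physically contiguous, so each page gets its own scatterlist entry. The following userspace sketch mirrors only the offset and length bookkeeping of that loop. Everything in it is hypothetical illustration: `struct demo_chunk`, `demo_split_by_page()` and `DEMO_PAGE_SIZE` do not exist in the kernel, the page lookup done by vmalloc_to_page() is replaced by recording the page-aligned address, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define DEMO_PAGE_SIZE 4096UL

/* Hypothetical record of one per-page piece of a buffer. */
struct demo_chunk {
	uintptr_t page_base;	/* page-aligned address of the piece */
	unsigned int offset;	/* offset of the piece inside that page */
	unsigned int len;	/* number of bytes in the piece */
};

/*
 * Split [addr, addr + buflen) into pieces that never cross a page
 * boundary. Returns the number of pieces written to out[].
 */
static size_t demo_split_by_page(uintptr_t addr, unsigned int buflen,
				 struct demo_chunk *out, size_t max_out)
{
	unsigned int off = (unsigned int)(addr & (DEMO_PAGE_SIZE - 1));
	size_t n = 0;

	assert(buflen > 0);
	addr &= ~(uintptr_t)(DEMO_PAGE_SIZE - 1);
	do {
		unsigned int room = (unsigned int)(DEMO_PAGE_SIZE - off);
		unsigned int len = buflen < room ? buflen : room;

		assert(n < max_out);
		out[n].page_base = addr;
		out[n].offset = off;
		out[n].len = len;
		n++;

		/* Only the first piece can start mid-page. */
		off = 0;
		addr += DEMO_PAGE_SIZE;
		buflen -= len;
	} while (buflen);

	return n;
}
```

Only the first piece carries a non-zero offset, and every later piece starts at the beginning of its page. This is why the loop resets `off` to 0 after the first iteration.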