1089068 Commits

Author SHA1 Message Date
Dave Chinner
463260d767 Merge branch 'guilt/xlog-write-rework' into xfs-5.19-for-next 2022-04-21 16:45:52 +10:00
Dave Chinner
898a768f54 Merge branch 'guilt/xfs-unsigned-flags-5.18' into xfs-5.19-for-next 2022-04-21 16:45:03 +10:00
Dave Chinner
1499b8a3a3 Merge branch 'guilt/5.19-miscellaneous' into xfs-5.19-for-next 2022-04-21 11:40:17 +10:00
Dave Chinner
c60d13ea65 xfs: convert log ticket and iclog flags to unsigned.
5.18 w/ std=gnu11 compiled with gcc-5 wants flags stored in unsigned
fields to be unsigned.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:48:01 +10:00
Dave Chinner
2eb7550d2c xfs: convert shutdown reasons to unsigned.
5.18 w/ std=gnu11 compiled with gcc-5 wants flags stored in unsigned
fields to be unsigned.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:47:38 +10:00
Dave Chinner
b9f3082eee xfs: convert quota options flags to unsigned.
5.18 w/ std=gnu11 compiled with gcc-5 wants flags stored in unsigned
fields to be unsigned.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:47:32 +10:00
Dave Chinner
90215d7498 xfs: convert ptag flags to unsigned.
5.18 w/ std=gnu11 compiled with gcc-5 wants flags stored in unsigned
fields to be unsigned.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:47:25 +10:00
Dave Chinner
a103375307 xfs: convert inode lock flags to unsigned.
5.18 w/ std=gnu11 compiled with gcc-5 wants flags stored in unsigned
fields to be unsigned.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:47:16 +10:00
Dave Chinner
22d53f480c xfs: convert log item tracepoint flags to unsigned.
5.18 w/ std=gnu11 compiled with gcc-5 wants flags stored in unsigned
fields to be unsigned.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:47:07 +10:00
Dave Chinner
1005dd019c xfs: convert dquot flags to unsigned.
5.18 w/ std=gnu11 compiled with gcc-5 wants flags stored in unsigned
fields to be unsigned.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:46:55 +10:00
Dave Chinner
3402d93157 xfs: convert da btree operations flags to unsigned.
5.18 w/ std=gnu11 compiled with gcc-5 wants flags stored in unsigned
fields to be unsigned.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:46:47 +10:00
Dave Chinner
581b448447 xfs: convert buffer log item flags to unsigned.
5.18 w/ std=gnu11 compiled with gcc-5 wants flags stored in unsigned
fields to be unsigned.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:46:40 +10:00
Dave Chinner
722db70fb2 xfs: convert btree buffer log flags to unsigned.
5.18 w/ std=gnu11 compiled with gcc-5 wants flags stored in unsigned
fields to be unsigned.

We also pass the fields to log to xfs_btree_offsets() as a uint32_t
all cases now. I have no idea why we made that parameter a int64_t
in the first place, but while we are fixing this up change it to
a uint32_t field, too.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:46:33 +10:00
Dave Chinner
0d1b976966 xfs: convert AGI log flags to unsigned.
5.18 w/ std=gnu11 compiled with gcc-5 wants flags stored in unsigned
fields to be unsigned.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:46:24 +10:00
Dave Chinner
f53dde11b4 xfs: convert AGF log flags to unsigned.
5.18 w/ std=gnu11 compiled with gcc-5 wants flags stored in unsigned
fields to be unsigned.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:46:16 +10:00
Dave Chinner
e7d410ac33 xfs: convert bmapi flags to unsigned.
5.18 w/ std=gnu11 compiled with gcc-5 wants flags stored in unsigned
fields to be unsigned.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:46:09 +10:00
Dave Chinner
0e5b8e4522 xfs: convert bmap extent type flags to unsigned.
5.18 w/ std=gnu11 compiled with gcc-5 wants flags stored in unsigned
fields to be unsigned.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:46:01 +10:00
Dave Chinner
79539c7c76 xfs: convert scrub type flags to unsigned.
5.18 w/ std=gnu11 compiled with gcc-5 wants flags stored in unsigned
fields to be unsigned.

This touches xfs_fs.h so affects the user API, but the user API
fields are also unsigned so the flags should really be unsigned,
too.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:45:52 +10:00
Dave Chinner
a4d98629c9 xfs: convert attr type flags to unsigned.
5.18 w/ std=gnu11 compiled with gcc-5 wants flags stored in unsigned
fields to be unsigned.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:45:41 +10:00
Dave Chinner
593e34391f xfs: CIL context doesn't need to count iovecs
Now that we account for log opheaders in the log item formatting
code, we don't actually use the aggregated count of log iovecs in
the CIL for anything. Remove it and the tracking code that
calculates it.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:36:56 +10:00
Dave Chinner
14b07ecd5c xfs: xlog_write() doesn't need optype anymore
So remove it from the interface and callers.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:36:48 +10:00
Dave Chinner
be8ddda5f7 xfs: xlog_write() no longer needs contwr state
The rework of xlog_write() no longer requires xlog_get_iclog_state()
to tell it about internal iclog space reservation state to direct it
on what to do. Remove this parameter.

$ size fs/xfs/xfs_log.o.*
   text	   data	    bss	    dec	    hex	filename
  26520	    560	      8	  27088	   69d0	fs/xfs/xfs_log.o.orig
  26384	    560	      8	  26952	   6948	fs/xfs/xfs_log.o.patched

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:36:37 +10:00
Christoph Hellwig
1236bbe86b xfs: remove xlog_verify_dest_ptr
Just check that the offset in xlog_write_vec is smaller than the iclog
size and remove the expensive cycling through all iclogs.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:36:27 +10:00
Dave Chinner
ad3e369318 xfs: introduce xlog_write_partial()
Re-implement writing of a log vector that does not fit into the
current iclog. The iclog will already be in XLOG_STATE_WANT_SYNC
because xlog_get_iclog_space() will have reserved all the remaining
iclog space for us, hence we can simply iterate over the iovecs in
the log vector getting more iclog space until the entire log vector
is written.

Handling this partial write case separately means we do need to pass
unnecessary state around for the common, fast path case when the log
vector fits entirely within the current iclog. It isolates the
complexity and allows us to modify and improve the partial write
case without impacting the simple fast path.

This change includes several improvements incorporated from patches
written by Christoph Hellwig.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:36:15 +10:00
Dave Chinner
db357078b0 xfs: introduce xlog_write_full()
Introduce an optimised version of xlog_write() that is used when the
entire write will fit in a single iclog. This greatly simplifies the
implementation of writing a log vector chain into an iclog, and sets
the ground work for a much more understandable xlog_write()
implementation.

This incorporates some factoring and simplifications proposed by
Christoph Hellwig.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:36:05 +10:00
Christoph Hellwig
decb545fc0 xfs: change the type of ic_datap
Turn ic_datap from a char into a void pointer given that it points
to arbitrary data.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[dgc: also remove (char *) cast in xlog_alloc_log()]
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:35:53 +10:00
Dave Chinner
d80fc2914f xfs: pass lv chain length into xlog_write()
The caller of xlog_write() usually has a close accounting of the
aggregated vector length contained in the log vector chain passed to
xlog_write(). There is no need to iterate the chain to calculate he
length of the data in xlog_write_calculate_len() if the caller is
already iterating that chain to build it.

Passing in the vector length avoids doing an extra chain iteration,
which can be a significant amount of work given that large CIL
commits can have hundreds of thousands of vectors attached to the
chain.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:35:19 +10:00
Dave Chinner
c5141320c4 xfs: log ticket region debug is largely useless
xlog_tic_add_region() is used to trace the regions being added to a
log ticket to provide information in the situation where a ticket
reservation overrun occurs. The information gathered is stored int
the ticket, and dumped if xlog_print_tic_res() is called.

For a front end struct xfs_trans overrun, the ticket only contains
reservation tracking information - the ticket is never handed to the
log so has no regions attached to it. The overrun debug information in this
case comes from xlog_print_trans(), which walks the items attached
to the transaction and dumps their attached formatted log vectors
directly. It also dumps the ticket state, but that only contains
reservation accounting and nothing else. Hence xlog_print_tic_res()
never dumps region or overrun information from this path.

xlog_tic_add_region() is actually called from xlog_write(), which
means it is being used to track the regions seen in a
CIL checkpoint log vector chain. In looking at CIL behaviour
recently, I've seen 32MB checkpoints regularly exceed 250,000 
regions in the LV chain. The log ticket debug code can track *15*
regions. IOWs, if there is a ticket overrun in the CIL code, the
ticket region tracking code is going to be completely useless for
determining what went wrong. The only thing it can tell us is how
much of an overrun occurred, and we really don't need extra debug
information in the log ticket to tell us that.

Indeed, the main place we call xlog_tic_add_region() is also adding
up the number of regions and the space used so that xlog_write()
knows how much will be written to the log. This is exactly the same
information that log ticket is storing once we take away the useless
region tracking array. Hence xlog_tic_add_region() is not useful,
but can be called 250,000 times a CIL push...

Just strip all that debug "information" out of the of the log ticket
and only have it report reservation space information when an
overrun occurs. This also reduces the size of a log ticket down by
about 150 bytes...

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:35:09 +10:00
Dave Chinner
8d547cf9d2 xfs: reserve space and initialise xlog_op_header in item formatting
Current xlog_write() adds op headers to the log manually for every
log item region that is in the vector passed to it. While
xlog_write() needs to stamp the transaction ID into the ophdr, we
already know it's length, flags, clientid, etc at CIL commit time.

This means the only time that xlog write really needs to format and
reserve space for a new ophdr is when a region is split across two
iclogs. Adding the opheader and accounting for it as part of the
normal formatted item region means we simplify the accounting
of space used by a transaction and we don't have to special case
reserving of space in for the ophdrs in xlog_write(). It also means
we can largely initialise the ophdr in transaction commit instead
of xlog_write, making the xlog_write formatting inner loop much
tighter.

xlog_prepare_iovec() is now too large to stay as an inline function,
so we move it out of line and into xfs_log.c.

Object sizes:
text	   data	    bss	    dec	    hex	filename
1125934	 305951	    484	1432369	 15db31 fs/xfs/built-in.a.before
1123360	 305951	    484	1429795	 15d123 fs/xfs/built-in.a.after

So the code is a roughly 2.5kB smaller with xlog_prepare_iovec() now
out of line, even though it grew in size itself.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:34:59 +10:00
Dave Chinner
3c352bef83 xfs: move log iovec alignment to preparation function
To include log op headers directly into the log iovec regions that
the ophdrs wrap, we need to move the buffer alignment code from
xlog_finish_iovec() to xlog_prepare_iovec(). This is because the
xlog_op_header is only 12 bytes long, and we need the buffer that
the caller formats their data into to be 8 byte aligned.

Hence once we start prepending the ophdr in xlog_prepare_iovec(), we
are going to need to manage the padding directly to ensure that the
buffer pointer returned is correctly aligned.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:34:49 +10:00
Dave Chinner
c7610dceed xfs: log tickets don't need log client id
We currently set the log ticket client ID when we reserve a
transaction. This client ID is only ever written to the log by
a CIL checkpoint or unmount records, and so anything using a high
level transaction allocated through xfs_trans_alloc() does not need
a log ticket client ID to be set.

For the CIL checkpoint, the client ID written to the journal is
always XFS_TRANSACTION, and for the unmount record it is always
XFS_LOG, and nothing else writes to the log. All of these operations
tell xlog_write() exactly what they need to write to the log (the
optype) and build their own opheaders for start, commit and unmount
records. Hence we no longer need to set the client id in either the
log ticket or the xfs_trans.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:34:33 +10:00
Dave Chinner
54021b6242 xfs: embed the xlog_op_header in the commit record
Remove the final case where xlog_write() has to prepend an opheader
to a log transaction. Similar to the start record, the commit record
is just an empty opheader with a XLOG_COMMIT_TRANS type, so we can
just make this the payload for the region being passed to
xlog_write() and remove the special handling in xlog_write() for
the commit record.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:34:15 +10:00
Dave Chinner
ffa04c1f2c xfs: embed the xlog_op_header in the unmount record
Remove another case where xlog_write() has to prepend an opheader to
a log transaction. The unmount record + ophdr is smaller than the
minimum amount of space guaranteed to be free in an iclog (2 *
sizeof(ophdr)) and so we don't have to care about an unmount record
being split across 2 iclogs.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:34:04 +10:00
Dave Chinner
6eaed95e21 xfs: only CIL pushes require a start record
So move the one-off start record writing in xlog_write() out into
the static header that the CIL push builds to write into the log
initially. This simplifes the xlog_write() logic a lot.

pahole on x86-64 confirms that the xlog_cil_trans_hdr is correctly
32 bit aligned and packed for copying the log op and transaction
headers directly into the log as a single log region copy.

struct xlog_cil_trans_hdr {
        struct xlog_op_header      oph[2];               /*     0    24 */
        struct xfs_trans_header    thdr;                 /*    24    16 */
        struct xfs_log_iovec       lhdr[2];              /*    40    32 */

        /* size: 72, cachelines: 2, members: 3 */
        /* last cacheline: 8 bytes */
};

A wart is needed to handle the fact that length of the region the
opheader points to doesn't include the opheader length. hence if
we embed the opheader, we have to substract the opheader length from
the length written into the opheader by the generic copying code.
This will eventually go away when everything is converted to
embedded opheaders.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:33:48 +10:00
Dave Chinner
735fbf67df xfs: factor out the CIL transaction header building
It is static code deep in the middle of the CIL push logic. Factor
it out into a helper so that it is clear and easy to modify
separately.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 10:33:23 +10:00
Kaixu Xia
2d9ac4319b xfs: simplify local variable assignment in file write code
Get the struct inode pointer from iocb->ki_filp->f_mapping->host
directly and the other variables are unnecessary, so simplify the
local variables assignment.

Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 08:47:54 +10:00
Dave Chinner
9a5280b312 xfs: reorder iunlink remove operation in xfs_ifree
The O_TMPFILE creation implementation creates a specific order of
operations for inode allocation/freeing and unlinked list
modification. Currently both are serialised by the AGI, so the order
doesn't strictly matter as long as the are both in the same
transaction.

However, if we want to move the unlinked list insertions largely out
from under the AGI lock, then we have to be concerned about the
order in which we do unlinked list modification operations.
O_TMPFILE creation tells us this order is inode allocation/free,
then unlinked list modification.

Change xfs_ifree() to use this same ordering on unlinked list
removal. This way we always guarantee that when we enter the
iunlinked list removal code from this path, we already have the AGI
locked and we don't have to worry about lock nesting AGI reads
inside unlink list locks because it's already locked and attached to
the transaction.

We can do this safely as the inode freeing and unlinked list removal
are done in the same transaction and hence are atomic operations
with respect to log recovery.


Reported-by: Frank Hofmann <fhofmann@cloudflare.com>
Fixes: 298f7bec503f ("xfs: pin inode backing buffer to the inode log item")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 08:45:16 +10:00
Tiezhu Yang
d65a92de43 MAINTAINERS: update IOMAP FILESYSTEM LIBRARY and XFS FILESYSTEM
In IOMAP FILESYSTEM LIBRARY and XFS FILESYSTEM, the M(ail): entry is
redundant with the L(ist): entry, remove the redundant M(ail): entry.

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 08:45:14 +10:00
Dave Chinner
b9b3fe152e xfs: convert buffer flags to unsigned.
5.18 w/ std=gnu11 compiled with gcc-5 wants flags stored in unsigned
fields to be unsigned. This manifests as a compiler error such as:

/kisskb/src/fs/xfs/./xfs_trace.h:432:2: note: in expansion of macro 'TP_printk'
  TP_printk("dev %d:%d daddr 0x%llx bbcount 0x%x hold %d pincount %d "
  ^
/kisskb/src/fs/xfs/./xfs_trace.h:440:5: note: in expansion of macro '__print_flags'
     __print_flags(__entry->flags, "|", XFS_BUF_FLAGS),
     ^
/kisskb/src/fs/xfs/xfs_buf.h:67:4: note: in expansion of macro 'XBF_UNMAPPED'
  { XBF_UNMAPPED,  "UNMAPPED" }
    ^
/kisskb/src/fs/xfs/./xfs_trace.h:440:40: note: in expansion of macro 'XFS_BUF_FLAGS'
     __print_flags(__entry->flags, "|", XFS_BUF_FLAGS),
                                        ^
/kisskb/src/fs/xfs/./xfs_trace.h: In function 'trace_raw_output_xfs_buf_flags_class':
/kisskb/src/fs/xfs/xfs_buf.h:46:23: error: initializer element is not constant
 #define XBF_UNMAPPED  (1 << 31)/* do not map the buffer */

as __print_flags assigns XFS_BUF_FLAGS to a structure that uses an
unsigned long for the flag. Since this results in the value of
XBF_UNMAPPED causing a signed integer overflow, the result is
technically undefined behavior, which gcc-5 does not accept as an
integer constant.

This is based on a patch from Arnd Bergman <arnd@arndb.de>.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-21 08:44:59 +10:00
Darrick J. Wong
2229276c52 xfs: use a separate frextents counter for rt extent reservations
As mentioned in the previous commit, the kernel misuses sb_frextents in
the incore mount to reflect both incore reservations made by running
transactions as well as the actual count of free rt extents on disk.
This results in the superblock being written to the log with an
underestimate of the number of rt extents that are marked free in the
rtbitmap.

Teaching XFS to recompute frextents after log recovery avoids
operational problems in the current mount, but it doesn't solve the
problem of us writing undercounted frextents which are then recovered by
an older kernel that doesn't have that fix.

Create an incore percpu counter to mirror the ondisk frextents.  This
new counter will track transaction reservations and the only time we
will touch the incore super counter (i.e the one that gets logged) is
when those transactions commit updates to the rt bitmap.  This is in
contrast to the lazysbcount counters (e.g. fdblocks), where we know that
log recovery will always fix any incorrect counter that we log.
As a bonus, we only take m_sb_lock at transaction commit time.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-12 06:49:42 +10:00
Darrick J. Wong
5a605fd6cb xfs: recalculate free rt extents after log recovery
I've been observing periodic corruption reports from xfs_scrub involving
the free rt extent counter (frextents) while running xfs/141.  That test
uses an error injection knob to induce a torn write to the log, and an
arbitrary number of recovery mounts, frextents will count fewer free rt
extents than can be found the rtbitmap.

The root cause of the problem is a combination of the misuse of
sb_frextents in the incore mount to reflect both incore reservations
made by running transactions as well as the actual count of free rt
extents on disk.  The following sequence can reproduce the undercount:

Thread 1			Thread 2
xfs_trans_alloc(rtextents=3)
xfs_mod_frextents(-3)
<blocks>
				xfs_attr_set()
				xfs_bmap_attr_addfork()
				xfs_add_attr2()
				xfs_log_sb()
				xfs_sb_to_disk()
				xfs_trans_commit()
<log flushed to disk>
<log goes down>

Note that thread 1 subtracts 3 from sb_frextents even though it never
commits to using that space.  Thread 2 writes the undercounted value to
the ondisk superblock and logs it to the xattr transaction, which is
then flushed to disk.  At next mount, log recovery will find the logged
superblock and write that back into the filesystem.  At the end of log
recovery, we reread the superblock and install the recovered
undercounted frextents value into the incore superblock.  From that
point on, we've effectively leaked thread 1's transaction reservation.

The correct fix for this is to separate the incore reservation from the
ondisk usage, but that's a matter for the next patch.  Because the
kernel has been logging superblocks with undercounted frextents for a
very long time and we don't demand that sysadmins run xfs_repair after a
crash, fix the undercount by recomputing frextents after log recovery.

Gating this on log recovery is a reasonable balance (I think) between
correcting the problem and slowing down every mount attempt.  Note that
xfs_repair will fix undercounted frextents.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-12 06:49:42 +10:00
Darrick J. Wong
f34061f554 xfs: pass explicit mount pointer to rtalloc query functions
Pass an explicit xfs_mount pointer to the rtalloc query functions so
that they can support transactionless queries.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-12 06:49:41 +10:00
Matthew Wilcox (Oracle)
f3bf67c6c6 xfs: Use generic_file_open()
Remove the open-coded check of O_LARGEFILE.  This changes the errno
to be the same as other filesystems; it was changed generically in
2.6.24 but that fix skipped XFS.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-12 06:49:40 +10:00
Jonathan Lassoff
e270356944 xfs: Add XFS messages to printk index
In order for end users to quickly react to new issues that come up in
production, it is proving useful to leverage the printk indexing system.

This printk index enables kernel developers to use calls to printk()
with changeable format strings (as they always have; no change of
expectations), while enabling end users to examine format strings to
detect changes.
Since end users are using regular expressions to match messages printed
through printk(), being able to detect changes in chosen format strings
from release to release provides a useful signal to review
printk()-matching regular expressions for any necessary updates.

So that detailed XFS messages are captures by this printk index, this
patch wraps the xfs_<level> and xfs_alert_tag functions.

Signed-off-by: Jonathan Lassoff <jof@thejof.com>
Reviewed-by: Chris Down <chris@chrisdown.name>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-11 13:06:39 +10:00
Jonathan Lassoff
e60aa787f4 xfs: Simplify XFS logging methods.
Rather than have a constructor to define many nearly-identical
functions, use preprocessor macros to pass down a kernel logging level
to a common function.

Signed-off-by: Jonathan Lassoff <jof@thejof.com>
Reviewed-by: Chris Down <chris@chrisdown.name>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-04-11 13:06:28 +10:00
Linus Torvalds
ce522ba9ef Linux 5.18-rc2 v5.18-rc2 2022-04-10 14:21:36 -10:00
Linus Torvalds
8b57b30461 Serial driver fix for 5.18-rc2
This is a single serial driver fix for a build issue that showed up due
 to changes that came in through the tty tree in 5.18-rc1 that were
 missed previously.  It resolves a build error with the mpc52xx_uart
 driver.
 
 It has been in linux-next this week with no reported problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYlLSOw8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykUDQCgpgjEqSAUVXHS3NMjIppMSF8RfD4AoNYjH7Hl
 oVs5nzWNDcbZPvvh+TFw
 =F6pL
 -----END PGP SIGNATURE-----

Merge tag 'tty-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull serial driver fix from Greg KH:
 "This is a single serial driver fix for a build issue that showed up
  due to changes that came in through the tty tree in 5.18-rc1 that were
  missed previously. It resolves a build error with the mpc52xx_uart
  driver.

  It has been in linux-next this week with no reported problems"

* tag 'tty-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  tty: serial: mpc52xx_uart: make rx/tx hooks return unsigned, part II.
2022-04-10 10:08:50 -10:00
Linus Torvalds
95aa17c36d Staging driver fix for 5.18-rc2
Here is a single staging driver fix for 5.18-rc2 that resolves an endian
 issue for the r8188eu driver.  It has been in linux-next all this week
 with no reported problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYlLRlw8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylqdgCcCNVlN2uYRqAYjVyZmYSNKumgzN8AoMm1/DVT
 bT68T73BB/g1TFMNjuwy
 =Ho+t
 -----END PGP SIGNATURE-----

Merge tag 'staging-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging driver fix from Greg KH:
 "Here is a single staging driver fix for 5.18-rc2 that resolves an
  endian issue for the r8188eu driver. It has been in linux-next all
  this week with no reported problems"

* tag 'staging-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: r8188eu: Fix PPPoE tag insertion on little endian systems
2022-04-10 10:04:30 -10:00
Linus Torvalds
33563138ac Driver core changes for 5.18-rc2
Here are 2 small driver core changes for 5.18-rc2.
 
 They are the final bits in the removal of the default_attrs field in
 struct kobj_type.  I had to wait until after 5.18-rc1 for all of the
 changes to do this came in through different development trees, and then
 one new user snuck in.  So this series has 2 changes:
 	- removal of the default_attrs field in the powerpc/pseries/vas
 	  code.  Change has been acked by the PPC maintainers to come
 	  through this tree
 	- removal of default_attrs from struct kobj_type now that all
 	  in-kernel users are removed.  This cleans up the kobject code
 	  a little bit and removes some duplicated functionality that
 	  confused people (now there is only one way to do default
 	  groups.)
 
 All of these have been in linux-next for all of this week with no
 reported problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYlLRHg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yn+9gCfXN0OvKmw5QD55z8YGp/jIycK0ToAnifJ/OX+
 sU2V8ZQfNbV8xw7iXfc2
 =L+Uc
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here are two small driver core changes for 5.18-rc2.

  They are the final bits in the removal of the default_attrs field in
  struct kobj_type. I had to wait until after 5.18-rc1 for all of the
  changes to do this came in through different development trees, and
  then one new user snuck in. So this series has two changes:

   - removal of the default_attrs field in the powerpc/pseries/vas code.

     The change has been acked by the PPC maintainers to come through
     this tree

   - removal of default_attrs from struct kobj_type now that all
     in-kernel users are removed.

     This cleans up the kobject code a little bit and removes some
     duplicated functionality that confused people (now there is only
     one way to do default groups)

  Both of these have been in linux-next for all of this week with no
  reported problems"

* tag 'driver-core-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  kobject: kobj_type: remove default_attrs
  powerpc/pseries/vas: use default_groups in kobj_type
2022-04-10 09:55:09 -10:00
Linus Torvalds
f58d3410c5 Char/Misc driver fix for 5.18-rc2
Here is a single driver fix for 5.18-rc2.  It resolves the build warning
 issue on 32bit systems in the habannalabs driver that came in during the
 5.18-rc1 merge cycle.
 
 It has been in linux-next for all this week with no reported problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYlK+5Q8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykTKQCgoOU9/+9EiS3crSlFzo24SdomjKAAoL1nQoN9
 2s4KLX25ynnincGifSr5
 =VVFs
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver fix from Greg KH:
 "A single driver fix. It resolves the build warning issue on 32bit
  systems in the habannalabs driver that came in during the 5.18-rc1
  merge cycle.

  It has been in linux-next for all this week with no reported problems"

* tag 'char-misc-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  habanalabs: Fix test build failures
2022-04-10 09:52:46 -10:00