Go to file
Dave Chinner 82842fee6e xfs: fix AGF vs inode cluster buffer deadlock
Lock order in XFS is AGI -> AGF, hence for operations involving
inode unlinked list operations we always lock the AGI first. Inode
unlinked list operations operate on the inode cluster buffer,
so the lock order there is AGI -> inode cluster buffer.

For O_TMPFILE operations, this now means the lock order set down in
xfs_rename and xfs_link is AGI -> inode cluster buffer -> AGF as the
unlinked ops are done before the directory modifications that may
allocate space and lock the AGF.

Unfortunately, we also now lock the inode cluster buffer when
logging an inode so that we can attach the inode to the cluster
buffer and pin it in memory. This creates a lock order of AGF ->
inode cluster buffer in directory operations as we have to log the
inode after we've allocated new space for it.

This creates a lock inversion between the AGF and the inode cluster
buffer. Because the inode cluster buffer is shared across multiple
inodes, the inversion is not specific to individual inodes but can
occur when inodes in the same cluster buffer are accessed in
different orders.

To fix this we need move all the inode log item cluster buffer
interactions to the end of the current transaction. Unfortunately,
xfs_trans_log_inode() calls are littered throughout the transactions
with no thought to ordering against other items or locking. This
makes it difficult to do anything that involves changing the call
sites of xfs_trans_log_inode() to change locking orders.

However, we do now have a mechanism that allows is to postpone dirty
item processing to just before we commit the transaction: the
->iop_precommit method. This will be called after all the
modifications are done and high level objects like AGI and AGF
buffers have been locked and modified, thereby providing a mechanism
that guarantees we don't lock the inode cluster buffer before those
high level objects are locked.

This change is largely moving the guts of xfs_trans_log_inode() to
xfs_inode_item_precommit() and providing an extra flag context in
the inode log item to track the dirty state of the inode in the
current transaction. This also means we do a lot less repeated work
in xfs_trans_log_inode() by only doing it once per transaction when
all the work is done.

Fixes: 298f7bec50 ("xfs: pin inode backing buffer to the inode log item")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-06-05 04:08:27 +10:00
arch ARM: 2023-06-04 07:16:53 -04:00
block block: fix revalidate performance regression 2023-05-29 08:40:32 -06:00
certs KEYS: Add missing function documentation 2023-04-24 16:15:52 +03:00
crypto This push fixes the following problems: 2023-05-07 10:57:14 -07:00
Documentation Char/Misc driver fixes for 6.4-rc5 2023-06-04 08:32:30 -04:00
drivers - Fix open firmware quirks validation so that they don't get applied 2023-06-04 11:57:38 -04:00
fs xfs: fix AGF vs inode cluster buffer deadlock 2023-06-05 04:08:27 +10:00
include media fixes for v6.4-rc5 2023-06-04 09:10:43 -04:00
init Objtool changes for v6.4: 2023-04-28 14:02:54 -07:00
io_uring io_uring: undeprecate epoll_ctl support 2023-05-26 20:22:41 -06:00
ipc Merge branch 'work.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2023-02-24 19:20:07 -08:00
kernel Probes fixes for 6.4-rc4: 2023-06-03 08:23:16 -04:00
lib test_firmware: fix the memory leak of the allocated firmware buffer 2023-05-31 20:31:07 +01:00
LICENSES LICENSES: Add the copyleft-next-0.3.1 license 2022-11-08 15:44:01 +01:00
mm mm: page_table_check: Ensure user pages are not slab pages 2023-05-29 16:14:28 +01:00
net nfsd-6.4 fixes: 2023-06-02 13:38:55 -04:00
rust Rust changes for v6.4 2023-04-30 11:20:22 -07:00
samples samples/bpf: Drop unnecessary fallthrough 2023-05-16 19:44:05 +02:00
scripts Locking changes in v6.4: 2023-05-05 12:56:55 -07:00
security selinux: don't use make's grouped targets feature yet 2023-06-01 13:56:13 -04:00
sound ALSA: hda/realtek: Enable headset onLenovo M70/M90 2023-05-24 14:18:59 +02:00
tools ARM: 2023-06-04 07:16:53 -04:00
usr initramfs: Check negative timestamp to prevent broken cpio archive 2023-04-16 17:37:01 +09:00
virt KVM: Fix vcpu_array[0] races 2023-05-19 13:56:26 -04:00
.clang-format cxl for v6.4 2023-04-30 11:51:51 -07:00
.cocciconfig
.get_maintainer.ignore get_maintainer: add Alan to .get_maintainer.ignore 2022-08-20 15:17:44 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for *.dtso files 2023-02-26 15:28:23 +09:00
.gitignore linux-kselftest-kunit-6.4-rc1 2023-04-24 12:31:32 -07:00
.mailmap mailmap: add entries for Nikolay Aleksandrov 2023-05-17 09:35:05 +01:00
.rustfmt.toml rust: add .rustfmt.toml 2022-09-28 09:02:20 +02:00
COPYING
CREDITS MAINTAINERS: sctp: move Neil to CREDITS 2023-05-12 08:51:32 +01:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS Char/Misc driver fixes for 6.4-rc5 2023-06-04 08:32:30 -04:00
Makefile Linux 6.4-rc5 2023-06-04 14:04:27 -04:00
README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.