linux/fs/xfs
Dave Chinner b79f4a1c68 xfs: inode recovery readahead can race with inode buffer creation
When we do inode readahead in log recovery, we do can do the
readahead before we've replayed the icreate transaction that stamps
the buffer with inode cores. The inode readahead verifier catches
this and marks the buffer as !done to indicate that it doesn't yet
contain valid inodes.

In adding buffer error notification  (i.e. setting b_error = -EIO at
the same time as as we clear the done flag) to such a readahead
verifier failure, we can then get subsequent inode recovery failing
with this error:

XFS (dm-0): metadata I/O error: block 0xa00060 ("xlog_recover_do..(read#2)") error 5 numblks 32

This occurs when readahead completion races with icreate item replay
such as:

	inode readahead
		find buffer
		lock buffer
		submit RA io
	....
	icreate recovery
	    xfs_trans_get_buffer
		find buffer
		lock buffer
		<blocks on RA completion>
	.....
	<ra completion>
		fails verifier
		clear XBF_DONE
		set bp->b_error = -EIO
		release and unlock buffer
	<icreate gains lock>
	icreate initialises buffer
	marks buffer as done
	adds buffer to delayed write queue
	releases buffer

At this point, we have an initialised inode buffer that is up to
date but has an -EIO state registered against it. When we finally
get to recovering an inode in that buffer:

	inode item recovery
	    xfs_trans_read_buffer
		find buffer
		lock buffer
		sees XBF_DONE is set, returns buffer
	    sees bp->b_error is set
		fail log recovery!

Essentially, we need xfs_trans_get_buf_map() to clear the error status of
the buffer when doing a lookup. This function returns uninitialised
buffers, so the buffer returned can not be in an error state and
none of the code that uses this function expects b_error to be set
on return. Indeed, there is an ASSERT(!bp->b_error); in the
transaction case in xfs_trans_get_buf_map() that would have caught
this if log recovery used transactions....

This patch firstly changes the inode readahead failure to set -EIO
on the buffer, and secondly changes xfs_buf_get_map() to never
return a buffer with an error state set so this first change doesn't
cause unexpected log recovery failures.

cc: <stable@vger.kernel.org> # 3.12 - current
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-01-12 07:03:44 +11:00
..
libxfs xfs: inode recovery readahead can race with inode buffer creation 2016-01-12 07:03:44 +11:00
Kconfig
kmem.c xfs: more info from kmem deadlocks and high-level error msgs 2015-10-12 16:04:45 +11:00
kmem.h xfs: change kmem_free to use generic kvfree() 2015-02-02 09:54:18 +11:00
Makefile xfs: stats are no longer dependent on CONFIG_PROC_FS 2015-10-19 08:42:46 +11:00
mrlock.h
uuid.c
uuid.h
xfs_acl.c xfs: Fix error path in xfs_get_acl 2015-11-10 10:09:45 +11:00
xfs_acl.h xfs: Fix error path in xfs_get_acl 2015-11-10 10:09:45 +11:00
xfs_aops.c xfs: add tracepoints to readpage calls 2016-01-08 11:28:35 +11:00
xfs_aops.h xfs: DAX does not use IO completion callbacks 2015-11-03 12:37:02 +11:00
xfs_attr_inactive.c Merge branch 'xfs-misc-fixes-for-4.2-3' into for-next 2015-06-23 08:49:01 +10:00
xfs_attr_list.c xfs: per-filesystem stats counter implementation 2015-10-12 18:21:22 +11:00
xfs_attr.h
xfs_bmap_util.c xfs: eliminate committed arg from xfs_bmap_finish 2016-01-11 11:34:01 +11:00
xfs_bmap_util.h xfs: Add support FALLOC_FL_INSERT_RANGE for fallocate 2015-03-25 15:08:56 +11:00
xfs_buf_item.c Merge branch 'xfs-misc-fixes-for-4.3-3' into for-next 2015-08-25 10:13:35 +10:00
xfs_buf_item.h xfs: fix non-debug build warnings 2015-08-25 10:05:13 +10:00
xfs_buf.c xfs: inode recovery readahead can race with inode buffer creation 2016-01-12 07:03:44 +11:00
xfs_buf.h xfs: print name of verifier if it fails 2016-01-04 16:10:19 +11:00
xfs_dir2_readdir.c xfs: per-filesystem stats counter implementation 2015-10-12 18:21:22 +11:00
xfs_discard.c xfs: pass mp to XFS_WANT_CORRUPTED_GOTO 2015-02-23 22:39:08 +11:00
xfs_discard.h
xfs_dquot_item.c xfs: move most of xfs_sb.h to xfs_format.h 2014-11-28 14:27:09 +11:00
xfs_dquot_item.h
xfs_dquot.c xfs: eliminate committed arg from xfs_bmap_finish 2016-01-11 11:34:01 +11:00
xfs_dquot.h xfs: fix implicit bool to int conversion 2015-01-09 10:48:58 +11:00
xfs_error.c xfs: print name of verifier if it fails 2016-01-04 16:10:19 +11:00
xfs_error.h xfs: remove inst_t 2015-06-22 09:44:02 +10:00
xfs_export.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
xfs_export.h
xfs_extent_busy.c xfs: merge xfs_ag.h into xfs_format.h 2014-11-28 14:25:04 +11:00
xfs_extent_busy.h
xfs_extfree_item.c xfs: add helper to conditionally remove items from the AIL 2015-08-19 10:01:08 +10:00
xfs_extfree_item.h xfs: fix efi/efd error handling to avoid fs shutdown hangs 2015-08-19 09:51:16 +10:00
xfs_file.c xfs: updates for 4.4-rc1 2015-11-11 20:18:48 -08:00
xfs_filestream.c xfs: clean up XFS_MIN_FREELIST macros 2015-06-22 10:13:30 +10:00
xfs_filestream.h
xfs_fsops.c xfs: growfs not aware of sb_meta_uuid 2015-08-19 10:31:41 +10:00
xfs_fsops.h
xfs_globals.c
xfs_icache.c xfs: per-filesystem stats counter implementation 2015-10-12 18:21:22 +11:00
xfs_icache.h xfs: merge xfs_ag.h into xfs_format.h 2014-11-28 14:25:04 +11:00
xfs_icreate_item.c xfs: move most of xfs_sb.h to xfs_format.h 2014-11-28 14:27:09 +11:00
xfs_icreate_item.h
xfs_inode_item.c xfs: optimise away log forces on timestamp updates for fdatasync 2015-11-03 13:14:59 +11:00
xfs_inode_item.h xfs: optimise away log forces on timestamp updates for fdatasync 2015-11-03 13:14:59 +11:00
xfs_inode.c xfs: eliminate committed arg from xfs_bmap_finish 2016-01-11 11:34:01 +11:00
xfs_inode.h xfs: clean up inode lockdep annotations 2015-08-19 10:32:49 +10:00
xfs_ioctl32.c xfs: prefix XATTR_LIST_MAX with XFS_ 2015-10-12 16:02:56 +11:00
xfs_ioctl32.h
xfs_ioctl.c Merge branch 'xfs-misc-fixes-for-4.4-2' into for-next 2015-11-03 13:27:58 +11:00
xfs_ioctl.h
xfs_iomap.c xfs: eliminate committed arg from xfs_bmap_finish 2016-01-11 11:34:01 +11:00
xfs_iomap.h xfs: pass a 64-bit count argument to xfs_iomap_write_unwritten 2015-01-09 10:48:12 +11:00
xfs_iops.c xfs: per-filesystem stats counter implementation 2015-10-12 18:21:22 +11:00
xfs_iops.h xfs: inodes are new until the dentry cache is set up 2015-02-23 22:38:08 +11:00
xfs_itable.c xfs: fix btree cursor error cleanups 2015-08-19 10:00:53 +10:00
xfs_itable.h
xfs_linux.h xfs: pass xfsstats structures to handlers and macros 2015-10-12 05:19:45 +11:00
xfs_log_cil.c xfs: close xc_cil list_empty() races with cil commit sequence 2015-07-29 11:51:01 +10:00
xfs_log_priv.h xfs: validate metadata LSNs against log on v5 superblocks 2015-10-12 15:59:25 +11:00
xfs_log_recover.c Merge branch 'xfs-misc-fixes-for-4.4-3' into for-next 2015-11-10 10:20:48 +11:00
xfs_log.c xfs: fix log ticket type printing 2016-01-04 16:11:42 +11:00
xfs_log.h xfs: validate metadata LSNs against log on v5 superblocks 2015-10-12 15:59:25 +11:00
xfs_message.c xfs: more info from kmem deadlocks and high-level error msgs 2015-10-12 16:04:45 +11:00
xfs_message.h
xfs_mount.c Merge branch 'xfs-misc-fixes-for-4.4-2' into for-next 2015-11-03 13:27:58 +11:00
xfs_mount.h Merge branch 'xfs-dax-updates' into for-next 2015-11-03 13:28:41 +11:00
xfs_mru_cache.c xfs: xfs_mru_cache_insert() should use GFP_NOFS 2015-03-25 14:57:53 +11:00
xfs_mru_cache.h
xfs_pnfs.c xfs: add missing ilock around dio write last extent alignment 2015-10-12 15:34:20 +11:00
xfs_pnfs.h xfs: unlock i_mutex in xfs_break_layouts 2015-04-13 11:38:29 +10:00
xfs_qm_bhv.c xfs: move most of xfs_sb.h to xfs_format.h 2014-11-28 14:27:09 +11:00
xfs_qm_syscalls.c xfs: saner xfs_trans_commit interface 2015-06-04 13:48:08 +10:00
xfs_qm.c xfs: updates for 4.4-rc1 2015-11-11 20:18:48 -08:00
xfs_qm.h xfs: Convert to using ->get_state callback 2015-03-04 16:06:36 +01:00
xfs_quota.h xfs: fix quota block reservation leak when tp allocates and frees blocks 2015-06-01 07:15:37 +10:00
xfs_quotaops.c xfs: Add support for Q_SETINFO 2015-03-04 16:06:38 +01:00
xfs_rtalloc.c xfs: eliminate committed arg from xfs_bmap_finish 2016-01-11 11:34:01 +11:00
xfs_rtalloc.h
xfs_stats.c xfs: stats are no longer dependent on CONFIG_PROC_FS 2015-10-19 08:42:46 +11:00
xfs_stats.h xfs: per-filesystem stats counter implementation 2015-10-12 18:21:22 +11:00
xfs_super.c XFS: Use a signed return type for suffix_kstrtoint() 2016-01-04 16:13:21 +11:00
xfs_super.h xfs: Remove icsb infrastructure 2015-02-23 21:22:31 +11:00
xfs_symlink.c xfs: eliminate committed arg from xfs_bmap_finish 2016-01-11 11:34:01 +11:00
xfs_symlink.h
xfs_sysctl.c xfs: pass xfsstats structures to handlers and macros 2015-10-12 05:19:45 +11:00
xfs_sysctl.h
xfs_sysfs.c xfs: pass xfsstats structures to handlers and macros 2015-10-12 05:19:45 +11:00
xfs_sysfs.h xfs: create global stats and stats_clear in sysfs 2015-10-12 05:15:45 +11:00
xfs_trace.c xfs: move most of xfs_sb.h to xfs_format.h 2014-11-28 14:27:09 +11:00
xfs_trace.h xfs: add tracepoints to readpage calls 2016-01-08 11:28:35 +11:00
xfs_trans_ail.c Merge branch 'xfs-misc-fixes-for-4.4-2' into for-next 2015-11-03 13:27:58 +11:00
xfs_trans_buf.c xfs: only trace buffer items if they exist 2015-02-10 09:23:40 +11:00
xfs_trans_dquot.c xfs: send warning of project quota to userspace via netlink 2016-01-04 16:10:42 +11:00
xfs_trans_extfree.c xfs: ensure EFD trans aborts on log recovery extent free failure 2015-08-19 09:51:43 +10:00
xfs_trans_inode.c xfs: optimise away log forces on timestamp updates for fdatasync 2015-11-03 13:14:59 +11:00
xfs_trans_priv.h xfs: add helper to conditionally remove items from the AIL 2015-08-19 10:01:08 +10:00
xfs_trans.c xfs: per-filesystem stats counter implementation 2015-10-12 18:21:22 +11:00
xfs_trans.h xfs: ensure EFD trans aborts on log recovery extent free failure 2015-08-19 09:51:43 +10:00
xfs_xattr.c Merge branch 'for-linus-3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-11-13 18:02:30 -08:00
xfs.h