59f6ab40fd
When lazysbcount is enabled, fsstress and loop mount/unmount test report
the following problems:
XFS (loop0): SB summary counter sanity check failed
XFS (loop0): Metadata corruption detected at xfs_sb_write_verify+0x13b/0x460,
xfs_sb block 0x0
XFS (loop0): Unmount and run xfs_repair
XFS (loop0): First 128 bytes of corrupted metadata buffer:
00000000: 58 46 53 42 00 00 10 00 00 00 00 00 00 28 00 00 XFSB.........(..
00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000020: 69 fb 7c cd 5f dc 44 af 85 74 e0 cc d4 e3 34 5a i.|._.D..t....4Z
00000030: 00 00 00 00 00 20 00 06 00 00 00 00 00 00 00 80 ..... ..........
00000040: 00 00 00 00 00 00 00 81 00 00 00 00 00 00 00 82 ................
00000050: 00 00 00 01 00 0a 00 00 00 00 00 04 00 00 00 00 ................
00000060: 00 00 0a 00 b4 b5 02 00 02 00 00 08 00 00 00 00 ................
00000070: 00 00 00 00 00 00 00 00 0c 09 09 03 14 00 00 19 ................
XFS (loop0): Corruption of in-memory data (0x8) detected at _xfs_buf_ioapply
+0xe1e/0x10e0 (fs/xfs/xfs_buf.c:1580). Shutting down filesystem.
XFS (loop0): Please unmount the filesystem and rectify the problem(s)
XFS (loop0): log mount/recovery failed: error -117
XFS (loop0): log mount failed
This corruption will shutdown the file system and the file system will
no longer be mountable. The following script can reproduce the problem,
but it may take a long time.
#!/bin/bash
device=/dev/sda
testdir=/mnt/test
round=0
function fail()
{
echo "$*"
exit 1
}
mkdir -p $testdir
while [ $round -lt 10000 ]
do
echo "******* round $round ********"
mkfs.xfs -f $device
mount $device $testdir || fail "mount failed!"
fsstress -d $testdir -l 0 -n 10000 -p 4 >/dev/null &
sleep 4
killall -w fsstress
umount $testdir
xfs_repair -e $device > /dev/null
if [ $? -eq 2 ];then
echo "ERR CODE 2: Dirty log exception during repair."
exit 1
fi
round=$(($round+1))
done
With lazysbcount is enabled, There is no additional lock protection for
reading m_ifree and m_icount in xfs_log_sb(), if other cpu modifies the
m_ifree, this will make the m_ifree greater than m_icount. For example,
consider the following sequence and ifreedelta is postive:
CPU0 CPU1
xfs_log_sb xfs_trans_unreserve_and_mod_sb
---------- ------------------------------
percpu_counter_sum(&mp->m_icount)
percpu_counter_add_batch(&mp->m_icount,
idelta, XFS_ICOUNT_BATCH)
percpu_counter_add(&mp->m_ifree, ifreedelta);
percpu_counter_sum(&mp->m_ifree)
After this, incorrect inode count (sb_ifree > sb_icount) will be writen to
the log. In the subsequent writing of sb, incorrect inode count (sb_ifree >
sb_icount) will fail to pass the boundary check in xfs_validate_sb_write()
that cause the file system shutdown.
When lazysbcount is enabled, we don't need to guarantee that Lazy sb
counters are completely correct, but we do need to guarantee that sb_ifree
<= sb_icount. On the other hand, the constraint that m_ifree <= m_icount
must be satisfied any time that there /cannot/ be other threads allocating
or freeing inode chunks. If the constraint is violated under these
circumstances, sb_i{count,free} (the ondisk superblock inode counters)
maybe incorrect and need to be marked sick at unmount, the count will
be rebuilt on the next mount.
Fixes:
|
||
---|---|---|
.. | ||
libxfs | ||
scrub | ||
Kconfig | ||
kmem.c | ||
kmem.h | ||
Makefile | ||
mrlock.h | ||
xfs_acl.c | ||
xfs_acl.h | ||
xfs_aops.c | ||
xfs_aops.h | ||
xfs_attr_inactive.c | ||
xfs_attr_item.c | ||
xfs_attr_item.h | ||
xfs_attr_list.c | ||
xfs_bio_io.c | ||
xfs_bmap_item.c | ||
xfs_bmap_item.h | ||
xfs_bmap_util.c | ||
xfs_bmap_util.h | ||
xfs_buf_item_recover.c | ||
xfs_buf_item.c | ||
xfs_buf_item.h | ||
xfs_buf.c | ||
xfs_buf.h | ||
xfs_dir2_readdir.c | ||
xfs_discard.c | ||
xfs_discard.h | ||
xfs_dquot_item_recover.c | ||
xfs_dquot_item.c | ||
xfs_dquot_item.h | ||
xfs_dquot.c | ||
xfs_dquot.h | ||
xfs_error.c | ||
xfs_error.h | ||
xfs_export.c | ||
xfs_export.h | ||
xfs_extent_busy.c | ||
xfs_extent_busy.h | ||
xfs_extfree_item.c | ||
xfs_extfree_item.h | ||
xfs_file.c | ||
xfs_filestream.c | ||
xfs_filestream.h | ||
xfs_fsmap.c | ||
xfs_fsmap.h | ||
xfs_fsops.c | ||
xfs_fsops.h | ||
xfs_globals.c | ||
xfs_health.c | ||
xfs_icache.c | ||
xfs_icache.h | ||
xfs_icreate_item.c | ||
xfs_icreate_item.h | ||
xfs_inode_item_recover.c | ||
xfs_inode_item.c | ||
xfs_inode_item.h | ||
xfs_inode.c | ||
xfs_inode.h | ||
xfs_ioctl32.c | ||
xfs_ioctl32.h | ||
xfs_ioctl.c | ||
xfs_ioctl.h | ||
xfs_iomap.c | ||
xfs_iomap.h | ||
xfs_iops.c | ||
xfs_iops.h | ||
xfs_itable.c | ||
xfs_itable.h | ||
xfs_iunlink_item.c | ||
xfs_iunlink_item.h | ||
xfs_iwalk.c | ||
xfs_iwalk.h | ||
xfs_linux.h | ||
xfs_log_cil.c | ||
xfs_log_priv.h | ||
xfs_log_recover.c | ||
xfs_log.c | ||
xfs_log.h | ||
xfs_message.c | ||
xfs_message.h | ||
xfs_mount.c | ||
xfs_mount.h | ||
xfs_mru_cache.c | ||
xfs_mru_cache.h | ||
xfs_notify_failure.c | ||
xfs_ondisk.h | ||
xfs_pnfs.c | ||
xfs_pnfs.h | ||
xfs_pwork.c | ||
xfs_pwork.h | ||
xfs_qm_bhv.c | ||
xfs_qm_syscalls.c | ||
xfs_qm.c | ||
xfs_qm.h | ||
xfs_quota.h | ||
xfs_quotaops.c | ||
xfs_refcount_item.c | ||
xfs_refcount_item.h | ||
xfs_reflink.c | ||
xfs_reflink.h | ||
xfs_rmap_item.c | ||
xfs_rmap_item.h | ||
xfs_rtalloc.c | ||
xfs_rtalloc.h | ||
xfs_stats.c | ||
xfs_stats.h | ||
xfs_super.c | ||
xfs_super.h | ||
xfs_symlink.c | ||
xfs_symlink.h | ||
xfs_sysctl.c | ||
xfs_sysctl.h | ||
xfs_sysfs.c | ||
xfs_sysfs.h | ||
xfs_trace.c | ||
xfs_trace.h | ||
xfs_trans_ail.c | ||
xfs_trans_buf.c | ||
xfs_trans_dquot.c | ||
xfs_trans_priv.h | ||
xfs_trans.c | ||
xfs_trans.h | ||
xfs_xattr.c | ||
xfs_xattr.h | ||
xfs.h |