tmpfs: fix shmem_evict_inode() warnings on i_blocks
Dmitry Vyukov provides a little program, autogenerated by syzkaller, which races a fault on a mapping of a sparse memfd object, against truncation of that object below the fault address: run repeatedly for a few minutes, it reliably generates shmem_evict_inode()'s WARN_ON(inode->i_blocks). (But there's nothing specific to memfd here, nor to the fstat which it happened to use to generate the fault: though that looked suspicious, since a shmem_recalc_inode() had been added there recently. The same problem can be reproduced with open+unlink in place of memfd_create, and with fstatfs in place of fstat.) v3.7 commit 0f3c42f522dc ("tmpfs: change final i_blocks BUG to WARNING") explains one cause of such a warning (a race with shmem_writepage to swap), and possible solutions; but we never took it further, and this syzkaller incident turns out to have a different cause. shmem_getpage_gfp()'s error recovery, when a freshly allocated page is then found to be beyond eof, looks plausible - decrementing the alloced count that was just before incremented - but in fact can go wrong, if a racing thread (the truncator, for example) gets its shmem_recalc_inode() in just after our delete_from_page_cache(). delete_from_page_cache() decrements nrpages, that shmem_recalc_inode() will balance the books by decrementing alloced itself, then our decrement of alloced take it one too low: leading to the WARNING when the object is finally evicted. Once the new page has been exposed in the page cache, shmem_getpage_gfp() must leave it to shmem_recalc_inode() itself to get the accounting right in all cases (and not fall through from "trunc:" to "decused:"). Adjust that error recovery block; and the reinitialization of info and sbinfo can be removed too. While we're here, fix shmem_writepage() to avoid the original issue: it will be safe against a racing shmem_recalc_inode(), if it merely increments swapped before the shmem_delete_from_page_cache() which decrements nrpages (but it must then do its own shmem_recalc_inode() before that, while still in balance, instead of after). (Aside: why do we shmem_recalc_inode() here in the swap path? Because its raison d'etre is to cope with clean sparse shmem pages being reclaimed behind our back: so here when swapping is a good place to look for that case.) But I've not now managed to reproduce this bug, even without the patch. I don't see why I didn't do that earlier: perhaps inhibited by the preference to eliminate shmem_recalc_inode() altogether. Driven by this incident, I do now have a patch to do so at last; but still want to sit on it for a bit, there's a couple of questions yet to be resolved. Signed-off-by: Hugh Dickins <hughd@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This commit is contained in:
parent
dbe409e4f5
commit
267a4c76bb
36
mm/shmem.c
36
mm/shmem.c
@ -843,14 +843,14 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
|
||||
list_add_tail(&info->swaplist, &shmem_swaplist);
|
||||
|
||||
if (add_to_swap_cache(page, swap, GFP_ATOMIC) == 0) {
|
||||
spin_lock(&info->lock);
|
||||
shmem_recalc_inode(inode);
|
||||
info->swapped++;
|
||||
spin_unlock(&info->lock);
|
||||
|
||||
swap_shmem_alloc(swap);
|
||||
shmem_delete_from_page_cache(page, swp_to_radix_entry(swap));
|
||||
|
||||
spin_lock(&info->lock);
|
||||
info->swapped++;
|
||||
shmem_recalc_inode(inode);
|
||||
spin_unlock(&info->lock);
|
||||
|
||||
mutex_unlock(&shmem_swaplist_mutex);
|
||||
BUG_ON(page_mapped(page));
|
||||
swap_writepage(page, wbc);
|
||||
@ -1078,7 +1078,7 @@ repeat:
|
||||
if (sgp != SGP_WRITE && sgp != SGP_FALLOC &&
|
||||
((loff_t)index << PAGE_CACHE_SHIFT) >= i_size_read(inode)) {
|
||||
error = -EINVAL;
|
||||
goto failed;
|
||||
goto unlock;
|
||||
}
|
||||
|
||||
if (page && sgp == SGP_WRITE)
|
||||
@ -1246,11 +1246,15 @@ clear:
|
||||
/* Perhaps the file has been truncated since we checked */
|
||||
if (sgp != SGP_WRITE && sgp != SGP_FALLOC &&
|
||||
((loff_t)index << PAGE_CACHE_SHIFT) >= i_size_read(inode)) {
|
||||
if (alloced) {
|
||||
ClearPageDirty(page);
|
||||
delete_from_page_cache(page);
|
||||
spin_lock(&info->lock);
|
||||
shmem_recalc_inode(inode);
|
||||
spin_unlock(&info->lock);
|
||||
}
|
||||
error = -EINVAL;
|
||||
if (alloced)
|
||||
goto trunc;
|
||||
else
|
||||
goto failed;
|
||||
goto unlock;
|
||||
}
|
||||
*pagep = page;
|
||||
return 0;
|
||||
@ -1258,23 +1262,13 @@ clear:
|
||||
/*
|
||||
* Error recovery.
|
||||
*/
|
||||
trunc:
|
||||
info = SHMEM_I(inode);
|
||||
ClearPageDirty(page);
|
||||
delete_from_page_cache(page);
|
||||
spin_lock(&info->lock);
|
||||
info->alloced--;
|
||||
inode->i_blocks -= BLOCKS_PER_PAGE;
|
||||
spin_unlock(&info->lock);
|
||||
decused:
|
||||
sbinfo = SHMEM_SB(inode->i_sb);
|
||||
if (sbinfo->max_blocks)
|
||||
percpu_counter_add(&sbinfo->used_blocks, -1);
|
||||
unacct:
|
||||
shmem_unacct_blocks(info->flags, 1);
|
||||
failed:
|
||||
if (swap.val && error != -EINVAL &&
|
||||
!shmem_confirm_swap(mapping, index, swap))
|
||||
if (swap.val && !shmem_confirm_swap(mapping, index, swap))
|
||||
error = -EEXIST;
|
||||
unlock:
|
||||
if (page) {
|
||||
|
Loading…
x
Reference in New Issue
Block a user