ext3: Avoid filesystem corruption after a crash under heavy delete load
It can happen that ext3_free_branches() calls ext3_forget() for an indirect block in an earlier transaction than the transaction in which we clear the pointer to this indirect block. Thus if we crash before the transaction clearing the block pointer is committed, we will see an indirect block pointing to already freed blocks and complain during orphan list cleanup.

The fix is simple: make sure ext3_forget() is called in the transaction doing the block pointer clearing.

This is a backport of an ext4 fix by Amir G. <amir73il@users.sourceforge.net>

Signed-off-by: Jan Kara <jack@suse.cz>
parent 4c4d390122
commit f25f624263
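The ordering problem described in the commit message can be illustrated with a toy model. This is only a sketch, not kernel code: `struct disk_state`, `replay_after_crash()`, `state_is_consistent()` and `plan_survives_any_crash()` are hypothetical names, and the model assumes that a crash leaves exactly the transactions up to some id applied on disk.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy on-disk state for one indirect block and the pointer to it. */
struct disk_state {
	bool parent_points_to_block;	/* pointer in the parent is still set */
	bool block_forgotten;		/* indirect block was forgotten/freed */
};

/*
 * Apply every transaction with id <= last_committed_tid, as a crash
 * followed by journal replay would in this simplified model.
 */
struct disk_state replay_after_crash(int forget_tid, int clear_ptr_tid,
				     int last_committed_tid)
{
	struct disk_state st = { true, false };

	assert(forget_tid > 0 && clear_ptr_tid > 0);
	if (forget_tid <= last_committed_tid)
		st.block_forgotten = true;
	if (clear_ptr_tid <= last_committed_tid)
		st.parent_points_to_block = false;
	return st;
}

/* Inconsistent: a live pointer references an already forgotten block. */
bool state_is_consistent(struct disk_state st)
{
	return !(st.parent_points_to_block && st.block_forgotten);
}

/* Try every crash point from "nothing committed" up to max_tid. */
bool plan_survives_any_crash(int forget_tid, int clear_ptr_tid, int max_tid)
{
	for (int tid = 0; tid <= max_tid; tid++)
		if (!state_is_consistent(replay_after_crash(forget_tid,
							    clear_ptr_tid,
							    tid)))
			return false;
	return true;
}
```

In this model, `plan_survives_any_crash(1, 2, 3)` stands for the old behaviour (forget in an earlier transaction than the pointer clearing) and reports an unsafe crash point, while `plan_survives_any_crash(2, 2, 3)` stands for the fixed behaviour, where both updates share a transaction.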
@@ -2269,27 +2269,6 @@ static void ext3_free_branches(handle_t *handle, struct inode *inode,
 					   (__le32*)bh->b_data + addr_per_block,
 					   depth);
 
-			/*
-			 * We've probably journalled the indirect block several
-			 * times during the truncate. But it's no longer
-			 * needed and we now drop it from the transaction via
-			 * journal_revoke().
-			 *
-			 * That's easy if it's exclusively part of this
-			 * transaction. But if it's part of the committing
-			 * transaction then journal_forget() will simply
-			 * brelse() it. That means that if the underlying
-			 * block is reallocated in ext3_get_block(),
-			 * unmap_underlying_metadata() will find this block
-			 * and will try to get rid of it. damn, damn.
-			 *
-			 * If this block has already been committed to the
-			 * journal, a revoke record will be written. And
-			 * revoke records must be emitted *before* clearing
-			 * this block's bit in the bitmaps.
-			 */
-			ext3_forget(handle, 1, inode, bh, bh->b_blocknr);
-
 			/*
 			 * Everything below this this pointer has been
 			 * released. Now let this top-of-subtree go.
@@ -2313,6 +2292,31 @@ static void ext3_free_branches(handle_t *handle, struct inode *inode,
 				truncate_restart_transaction(handle, inode);
 			}
 
+			/*
+			 * We've probably journalled the indirect block several
+			 * times during the truncate. But it's no longer
+			 * needed and we now drop it from the transaction via
+			 * journal_revoke().
+			 *
+			 * That's easy if it's exclusively part of this
+			 * transaction. But if it's part of the committing
+			 * transaction then journal_forget() will simply
+			 * brelse() it. That means that if the underlying
+			 * block is reallocated in ext3_get_block(),
+			 * unmap_underlying_metadata() will find this block
+			 * and will try to get rid of it. damn, damn. Thus
+			 * we don't allow a block to be reallocated until
+			 * a transaction freeing it has fully committed.
+			 *
+			 * We also have to make sure journal replay after a
+			 * crash does not overwrite non-journaled data blocks
+			 * with old metadata when the block got reallocated for
+			 * data. Thus we have to store a revoke record for a
+			 * block in the same transaction in which we free the
+			 * block.
+			 */
+			ext3_forget(handle, 1, inode, bh, bh->b_blocknr);
+
 			ext3_free_blocks(handle, inode, nr, 1);
 
 			if (parent_bh) {
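The new comment's point about revoke records and journal replay can be sketched in the same spirit. This is a simplified model under stated assumptions, not the JBD implementation: `struct journal_rec`, `is_revoked()` and `replay_writes_to()` are hypothetical names, and the model assumes that replay skips a logged metadata block whenever a revoke record for the same block number exists in the same or a later transaction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum rec_type { REC_METADATA, REC_REVOKE };

/* One record in a toy journal. */
struct journal_rec {
	unsigned int tid;	/* transaction the record belongs to */
	enum rec_type type;	/* logged metadata copy or revoke record */
	unsigned long blocknr;	/* block the record refers to */
};

/*
 * Assumed rule: a metadata copy logged in transaction tid is skipped if
 * a revoke for the same block exists in transaction tid or later.
 */
bool is_revoked(const struct journal_rec *log, size_t n,
		unsigned long blocknr, unsigned int tid)
{
	for (size_t i = 0; i < n; i++)
		if (log[i].type == REC_REVOKE &&
		    log[i].blocknr == blocknr && log[i].tid >= tid)
			return true;
	return false;
}

/* Count how many times replay would write old metadata over blocknr. */
size_t replay_writes_to(const struct journal_rec *log, size_t n,
			unsigned long blocknr)
{
	size_t writes = 0;

	assert(log != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (log[i].type == REC_METADATA &&
		    log[i].blocknr == blocknr &&
		    !is_revoked(log, n, blocknr, log[i].tid))
			writes++;
	return writes;
}
```

If a block is logged as metadata in one transaction and freed in a later one without a revoke record, replay in this model still writes the stale metadata copy, which would clobber the block if it had since been reallocated for non-journaled data. Adding the revoke record in the freeing transaction suppresses that write.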