IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Commit aaf9607516ed38825268515ef4d773289a44f429 ("f2fs: check node page
contents all the time") pointed out that "sometimes it was reported that
its contents was missing", so it checks the page's mapping and contents.
When "nid != nid_of_node(page)", ERR_PTR(-EIO) will be returned to the
caller. However, commit e1c51b9f1df2f9efc2ec11488717e40cd12015f9 ("f2fs:
clean up node page updating flow") moves "nid != nid_of_node(page)" test
to "f2fs_bug_on(sbi, nid != nid_of_node(page))", this will return a
wrong page to the caller when F2FS_CHECK_FS is off when "sometimes it
was reported that its contents was missing" happens.
This patch restores to check node page contents all the time, and
returns the errno to make the caller known something is wrong and avoid
to use the page. This patch also moves f2fs_bug_on to its proper location.
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
If there is no cold page, we don't need to do a loop to flush dirty
data pages.
On /dev/pmem0,
1. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048 conv=fsync
Before : 1.1 GB/s
After : 1.2 GB/s
2. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048
Before : 2.2 GB/s
After : 2.3 GB/s
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
For data pages, let's try to flush as much as possible in background.
On /dev/pmem0,
1. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048 conv=fsync
Before : 800 MB/s
After : 1.1 GB/s
2. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048
Before : 1.3 GB/s
After : 2.2 GB/s
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
If we get ENOMEM or EIO in f2fs_find_entry, we should stop right away.
Otherwise, for example, we can get duplicate directory entry by ->chash and
->clevel.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch removes writepages lock.
We can improve multi-threading performance.
tiobench, 32 threads, 4KB write per fsync on SSD
Before: 25.88 MB/s
After: 28.03 MB/s
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
If flush commands do not incur any congestion, we don't need to throw that to
dispatching queue which causes unnecessary latency.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch reduces to call them across the whole tree.
- sync_inode_page()
- update_inode_page()
- update_inode()
- f2fs_write_inode()
Instead, checkpoint will flush all the dirty inode metadata before syncing
node pages.
Note that, this is doable, since we call mark_inode_dirty_sync() for all
inode's field change which needs to update on-disk inode as well.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch calls mark_inode_dirty_sync() for the following on-disk inode
changes.
-> largest
-> ctime/mtime/atime
-> i_current_depth
-> i_xattr_nid
-> i_pino
-> i_advise
-> i_flags
-> i_mode
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch introduces f2fs_i_links_write() to call mark_inode_dirty_sync() when
changing inode->i_links.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch introduces f2fs_i_blocks_write() to call mark_inode_dirty_sync() when
changing inode->i_blocks.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
it's not needed for file_operations of inodes located on fs defined
in the hosting module and for file_operations that go into procfs.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
- fs-specific prefix for fscrypto
- fault injection facility
- expose validity bitmaps for user to be aware of fragmentation
- fallocate/rm/preallocation speed up
- use percpu counters
Bug fixes
- some inline_dentry/inline_data bugs
- error handling for atomic/volatile/orphan inodes
- recover broken superblock
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXQPu4AAoJEEAUqH6CSFDSILgP/1dj6fmtytr8c+55EBqXUGpo
M7rS93JTxlmU5BduIo9psJsEquTQoVEmxB/Gjd+ZnI5R6Rp1c/REaP0ba374rEhZ
ecMQh5QqzM1gRNFXrQhWFEL/KtfRqt3T80zebQP7pxFUm/m9NGMLWT43RzQ8AAhr
Y3P0NLdvxA4HAnipKptkPJcGZQlWnL9W/MR+LgsXLXqLDwJHkVu61GcF0y2ibcJM
lEtIRmyH5tg7hP5c5LTw9pKQFHkIZt5cHFLjrJ1x8FSm2TXOcJPbjOrThvcb+NKK
e0O+6R0meH2eMpak+BTkZp2YbPPyXOb1N00j//lmbPjCoJPd4ZuiJ+oRoHUlTxtU
FhO67t0brlDbMFQVRFrtv8VA8M6by+DTAAP3Ffx62I/TJkphKANCSoyQRhlWtxxO
kRU69N7ipnRNxO4WCv40FjaQjSIElCKysP1POazRmAOQm7UFTGT9Nj37+eqUcEPJ
HZ7O61DEHNemb0SMlJ8WSClstt0yUU+2cjRfTPAr2Wd3V8gYbRs0QUg5M2GLgywR
EmiJfpkXse3f/nR8W6g1hganSOXA0AZX+EUibed6VkV3oYemdFbm8OymeEmLmWpM
y2F3D7dPLW7MCoTXJqtwFWdoDwI+zkH4rJaPGTq5TVBRWVU/njX8OvoB47pOvKV1
kccL7zv2PekE1hSDO5WF
=6MSp
-----END PGP SIGNATURE-----
Merge tag 'for-f2fs-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"In this round, as Ted pointed out, fscrypto allows one more key prefix
given by filesystem to resolve backward compatibility issues. Other
than that, we've fixed several error handling cases by introducing
a fault injection facility. We've also achieved performance
improvement in some workloads as well as a bunch of bug fixes.
Summary:
Enhancements:
- fs-specific prefix for fscrypto
- fault injection facility
- expose validity bitmaps for user to be aware of fragmentation
- fallocate/rm/preallocation speed up
- use percpu counters
Bug fixes:
- some inline_dentry/inline_data bugs
- error handling for atomic/volatile/orphan inodes
- recover broken superblock"
* tag 'for-f2fs-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (73 commits)
f2fs: fix to update dirty page count correctly
f2fs: flush pending bios right away when error occurs
f2fs: avoid ENOSPC fault in the recovery process
f2fs: make exit_f2fs_fs more clear
f2fs: use percpu_counter for total_valid_inode_count
f2fs: use percpu_counter for alloc_valid_block_count
f2fs: use percpu_counter for # of dirty pages in inode
f2fs: use percpu_counter for page counters
f2fs: use bio count instead of F2FS_WRITEBACK page count
f2fs: manipulate dirty file inodes when DATA_FLUSH is set
f2fs: add fault injection to sysfs
f2fs: no need inc dirty pages under inode lock
f2fs: fix incorrect error path handling in f2fs_move_rehashed_dirents
f2fs: fix i_current_depth during inline dentry conversion
f2fs: correct return value type of f2fs_fill_super
f2fs: fix deadlock when flush inline data
f2fs: avoid f2fs_bug_on during recovery
f2fs: show # of orphan inodes
f2fs: support in batch fzero in dnode page
f2fs: support in batch multi blocks preallocation
...
init_f2fs_fs does:
1) f2fs_build_trace_ios
2) init_inodecache
3) create_node_manager_caches
4) create_segment_manager_caches
5) create_checkpoint_caches
6) create_extent_cache
7) kset_create_and_add
8) kobject_init_and_add
9) register_shrinker
10) register_filesystem
11) f2fs_create_root_stats
12) proc_mkdir
exit_f2fs_fs should do cleanup in the reverse order
to make the code more clear.
Signed-off-by: Tiezhu Yang <kernelpatch@126.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Pull vfs cleanups from Al Viro:
"More cleanups from Christoph"
* 'work.preadv2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
nfsd: use RWF_SYNC
fs: add RWF_DSYNC aand RWF_SYNC
ceph: use generic_write_sync
fs: simplify the generic_write_sync prototype
fs: add IOCB_SYNC and IOCB_DSYNC
direct-io: remove the offset argument to dio_complete
direct-io: eliminate the offset argument to ->direct_IO
xfs: eliminate the pos variable in xfs_file_dio_aio_write
filemap: remove the pos argument to generic_file_direct_write
filemap: remove pos variables in generic_file_read_iter
Backmerge to resolve a conflict in ovl_lookup_real();
"ovl_lookup_real(): use lookup_one_len_unlocked()" instead,
but it was too late in the cycle to rebase.
This patch introduces a new struct f2fs_fault_info and a global f2fs_fault
to save fault injection status. Fault injection entries are created in
/sys/fs/f2fs/fault_injection/ during initializing f2fs module.
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Fix two bugs in error path of f2fs_move_rehashed_dirents:
- release dir's inode page if fail to call kmalloc
- recover i_current_depth if fail to converting
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
With below steps, we will see that dentry page becoming unaccessable later.
This is because we forget updating i_current_depth in inode during inline
dentry conversion, after that, once we failed at somewhere, it will leave
i_current_depth as 0 in non-inline directory. Then, during ->lookup, the
current_depth value makes all dentry pages in first level invisible. Fix
it.
1) mount f2fs with inline_dentry option
2) mkdir dir
3) touch 180 files named [0-179] in dir
4) touch 180 in dir (fail after inline dir conversion)
5) ll dir
ls: cannot access /mnt/f2fs/dir/0: No such file or directory
ls: cannot access /mnt/f2fs/dir/1: No such file or directory
ls: cannot access /mnt/f2fs/dir/2: No such file or directory
ls: cannot access /mnt/f2fs/dir/3: No such file or directory
ls: cannot access /mnt/f2fs/dir/4: No such file or directory
drwxr-xr-x 2 root root 4096 may 13 21:47 ./
drwxr-xr-x 3 root root 4096 may 13 21:46 ../
-????????? ? ? ? ? ? 0
-????????? ? ? ? ? ? 1
-????????? ? ? ? ? ? 10
-????????? ? ? ? ? ? 100
-????????? ? ? ? ? ? 101
-????????? ? ? ? ? ? 102
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
We don't need to use f2fs_bug_on() to treat with any error case when allocating
a block during recovery.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch tries to speedup fzero_range by making space preallocation and
address removal of blocks in one dnode page as in batch operation.
In virtual machine, with zram driver:
dd if=/dev/zero of=/mnt/f2fs/file bs=1M count=4096
time xfs_io -f /mnt/f2fs/file -c "fzero 0 4096M"
Before:
real 0m3.276s
user 0m0.008s
sys 0m3.260s
After:
real 0m1.568s
user 0m0.000s
sys 0m1.564s
Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: consider ENOSPC case]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch introduces reserve_new_blocks to make preallocation of multi
blocks as in batch operation, so it can avoid lots of redundant
operation, result in better performance.
In virtual machine, with rotational device:
time fallocate -l 32G /mnt/f2fs/file
Before:
real 0m4.584s
user 0m0.000s
sys 0m4.580s
After:
real 0m0.292s
user 0m0.000s
sys 0m0.272s
In x86, with SSD:
time fallocate -l 500G $MNT/testfile
Before : 24.758 s
After : 1.604 s
Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: fix bugs and add performance numbers measured in x86.]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
atomic/volatile ioctl interfaces are exposed to user like other file
operation interface, it needs to make them getting exclusion against
to each other to avoid potential conflict among these operations
in concurrent scenario.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
In interfaces of ioctl, mnt_{want,drop}_write_file should be used for:
- get exclusion against file system freezing which may used by lvm
snapshot.
- do telling filesystem that a write is about to be performed on it, and
make sure that the writes are permitted.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Previously f2fs_preallocate_blocks() tries to allocate unaligned blocks.
In f2fs_write_begin(), however, prepare_write_begin() does not skip its
allocation due to (len != 4KB).
So, it needs locking node page twice unexpectedly.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch enables reading node blocks in advance when truncating large
data blocks.
> time rm $MNT/testfile (500GB) after drop_cachees
Before : 9.422 s
After : 4.821 s
Reported-by: Stephen Bates <stephen.bates@microsemi.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch is to improve the expand_inode speed in fallocate by allocating
data blocks as many as possible in single locked node page.
In SSD,
# time fallocate -l 500G $MNT/testfile
Before : 1m 33.410 s
After : 24.758 s
Reported-by: Stephen Bates <stephen.bates@microsemi.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>