commit 1f0d5c911b64165c9754139a26c8c2fad352c132 upstream.
We expect a 64-bit result from the statement below; however, on a 32-bit
machine, the left shift on a pgoff_t variable can overflow. Fix it by
forcing a type cast.
page->index << PAGE_SHIFT;
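A minimal sketch of the cast, assuming loff_t is the intended 64-bit type:

	loff_t offset = (loff_t)page->index << PAGE_SHIFT;	/* 64-bit shift, no truncation */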
Fixes: 26de9b117130 ("f2fs: avoid unnecessary updating inode during fsync")
Fixes: 0a2aa8fbb969 ("f2fs: refactor __exchange_data_block for speed up")
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2baf07818549c8bb8d7b3437e889b86eab56d38e ]
We need to drop the PG_checked flag on a page as well when we clear its
PG_uptodate flag, in order to avoid treating the page later as one that is
being GCed.
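A minimal sketch of the pairing (the exact call site is not shown and is an
assumption here):

	ClearPageUptodate(page);
	ClearPageChecked(page);	/* don't treat the page as one being GCed later */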
Signed-off-by: Weichao Guo <guoweichao@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit e1da7872f6eda977bd812346bf588c35e4495a1e upstream.
This patch introduces verify_blkaddr to check that a meta/data block address
falls within the valid range, so bugs are detected earlier.
In addition, once we encounter an invalid blkaddr, notify the user to run
fsck to fix it, and let the kernel panic.
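Roughly, the helper complains and then triggers f2fs_bug_on(); a sketch,
assuming the range check itself lives in a predicate like is_valid_blkaddr()
(the exact names in this backport are assumptions):

	static inline void verify_blkaddr(struct f2fs_sb_info *sbi,
					  block_t blkaddr, int type)
	{
		if (!is_valid_blkaddr(sbi, blkaddr, type)) {
			f2fs_msg(sbi->sb, KERN_ERR,
				 "invalid blkaddr: %u, type: %d, run fsck to fix.",
				 blkaddr, type);
			f2fs_bug_on(sbi, 1);
		}
	}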
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
[bwh: Backported to 4.9:
- I skipped an earlier renaming of is_valid_meta_blkaddr() to
f2fs_is_valid_meta_blkaddr()
- Adjust context]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7b525dd01365c6764018e374d391c92466be1b7a upstream.
- rename is_valid_blkaddr() to is_valid_meta_blkaddr() for readability.
- introduce is_valid_blkaddr() for cleanup.
No logic change in this patch.
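The new generic helper is tiny; roughly (a sketch, the exact body in this
backport may differ):

	static inline bool is_valid_blkaddr(block_t blkaddr)
	{
		if (blkaddr == NEW_ADDR || blkaddr == NULL_ADDR)
			return false;
		return true;
	}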
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
[bwh: Backported to 4.9: adjust context]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0833721ec3658a4e9d5e58b6fa82cf9edc431e59 upstream.
This patch checks blkaddr more accurately before issuing a
write or read bio.
Signed-off-by: Yunlei He <heyunlei@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
[bwh: Backported to 4.9: adjust context]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b86e33075ed1909d8002745b56ecf73b833db143 upstream.
A dead loop can be triggered in f2fs_fiemap() using the test case
as below:
...
fd = open();
fallocate(fd, 0, 0, 4294967296);
ioctl(fd, FS_IOC_FIEMAP, fiemap_buf);
...
It's caused by an overflow in __get_data_block():
...
bh->b_size = map.m_len << inode->i_blkbits;
...
map.m_len is an unsigned int, while bh->b_size is a size_t, which is 64 bits
wide on a 64-bit architecture; the shift is evaluated as an unsigned int
before the conversion to size_t, so the result can overflow.
In the above-mentioned case, bh->b_size becomes zero, and f2fs_fiemap()
calls get_data_block() at block 0 again and again.
Fix this by adding a forced cast before the left shift.
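Roughly, that is:

	bh->b_size = (u64)map.m_len << inode->i_blkbits;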
Signed-off-by: Wei Fang <fangwei1@huawei.com>
Acked-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 230436b3ef3fd7d4a1da19edf5e87bb2d74e0fc2 upstream.
gcc is unsure about the use of last_ofs_in_node, which might happen
without a prior initialization:
fs/f2fs/data.c: In function ‘f2fs_map_blocks’:
fs/f2fs/data.c:799:54: warning: ‘last_ofs_in_node’ may be used uninitialized in this function [-Wmaybe-uninitialized]
if (prealloc && dn.ofs_in_node != last_ofs_in_node + 1) {
As pointed out by Chao Yu, the code is actually correct: 'prealloc' is only
set after last_ofs_in_node has been set, and the two are always updated
together.
This initializes last_ofs_in_node to dn.ofs_in_node for each
new dnode at the start of the 'next_block' loop, which at that
point is a correct initialization as well. I assume that compilers
that correctly track the contents of the variables and do not
warn about the condition also figure out that they can eliminate
the extra assignment here.
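The added initialization is a one-liner at the point where each new dnode is
set up (the surrounding context is an assumption here):

	prealloc = 0;
	last_ofs_in_node = ofs_in_node = dn.ofs_in_node;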
Fixes: 46008c6d4232 ("f2fs: support in batch multi blocks preallocation")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The mapping_set_error() helper sets the correct AS_ flag for the mapping
so there is no reason to open code it. Use the helper directly.
[akpm@linux-foundation.org: be honest about conversion from -ENXIO to -EIO]
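A sketch of the conversion (the exact f2fs call site is an assumption);
mapping_set_error() maps any error other than -ENOSPC, including -ENXIO,
to AS_EIO:

	/* open-coded */
	set_bit(AS_EIO, &page->mapping->flags);

	/* with the helper */
	mapping_set_error(page->mapping, -EIO);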
Link: http://lkml.kernel.org/r/20160912111608.2588-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When we call ->writepages, there are two cases:
a. we didn't write out any dirty pages, since they were written back by
another thread concurrently.
b. we wrote out dirty pages and have already submitted bios to the block
layer.
In these cases, we don't need to do additional bio flushing; doing so
unnecessarily may split a cached bio into smaller ones.
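A sketch of the idea (the nwritten counter of pages written by this thread
is an assumption):

	/* flush the merged bio cache only if this thread submitted pages itself */
	if (nwritten)
		f2fs_submit_merged_bio(sbi, DATA, WRITE);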
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Previously, we only supported a global fault injection configuration, so
that when we configured the type/rate of fault injection through sysfs or a
mount option, it affected every f2fs partition in use.
This does not make sense, since it is inconvenient if a developer wants to
test separate partitions with different fault injection rates/types
simultaneously, and it is impossible to enable fault injection on one
partition while disabling it on another.
From now on, we move the module-wide fault injection configuration into the
per-superblock structure, so injection testing can be more flexible.
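Per-superblock configuration roughly means a small struct embedded in
f2fs_sb_info; a sketch (field names follow the f2fs fault-injection code but
should be treated as assumptions here):

	struct f2fs_fault_info {
		atomic_t inject_ops;		/* operations seen since the last injected fault */
		unsigned int inject_rate;	/* inject a fault once inject_ops reaches this */
		unsigned int inject_type;	/* bitmap of enabled fault types */
	};

	/* embedded in struct f2fs_sb_info as: struct f2fs_fault_info fault_info; */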
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch improves the migration of dirty pages and allows migrating the
atomically written pages that F2FS keeps in the page cache. Compared with
the fallback page-releasing path, it provides better performance for memory
compaction, CMA, and other users of page migration. For dirty pages, there
is no need to write them back first when migrating. For an atomically
written page that has not been committed yet, we can migrate the page and
update the related 'inmem_pages' list at the same time.
Signed-off-by: Weichao Guo <guoweichao@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: fix some coding style]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch allows preallocating data blocks for buffered AIO writes
to an encrypted file.
Signed-off-by: Yunlei He <heyunlei@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: fix to avoid BUG_ON]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch adds support for IO error injection, for testing the IO error
tolerance of f2fs.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Previously, f2fs_write_begin set PageUptodate all the time. But when the
user tries to update the entire page (i.e., len == PAGE_SIZE), we need to
consider that the copy may end up being partial. In that case, we would lose
the remaining region of the page.
This patch fixes this by setting PageUptodate in f2fs_write_end according to
the copied result. In the short-copy case, it returns zero to let
generic_perform_write retry copying the user data.
As a result, f2fs_write_end() works as follows:
   PageUptodate   len    copied   return   retry
1.     no         4096    4096     4096    false  -> return 4096
2.     no         4096    1024        0    true   -> goto #1 case
3.     yes        2048    2048     2048    false  -> return 2048
4.     yes        2048    1024     1024    false  -> return 1024
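A sketch of the corresponding logic in f2fs_write_end() (error paths
omitted; the exact surrounding code is an assumption):

	if (!PageUptodate(page)) {
		if (unlikely(copied != len))
			copied = 0;	/* short copy: generic_perform_write will retry */
		else
			SetPageUptodate(page);
	}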
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
We don't need to zero out data beyond i_size, since we already wrote that
region through the NEW_ADDR case.
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
In write_begin(), we skip checking the dnode block for preallocation when
the whole block is to be updated, since we already preallocated its block in
f2fs_preallocate_blocks. For a partially updated block, we still lock its
node and do the preallocation in write_begin(), so f2fs_preallocate_blocks
should not preallocate its block.
But previously, the calculation of the number of blocks to preallocate was
incorrect; fix it.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: fix a bug]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Fixes the following sparse warning:
fs/f2fs/data.c:969:12: warning:
symbol 'f2fs_grab_bio' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
If we preallocate blocks with f2fs_reserve_blocks in f2fs_map_blocks, we
should call f2fs_balance_fs to check for and reclaim space; fix it.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This reverts commit a2ee0a300344a6da76186129b078113354fe13d2.
When testing with generic/032 of the xfstests suite, the following failure
is reported:
generic/032 8s ... [failed, exit status 1] - output mismatch (see results/generic/032.out.bad)
--- tests/generic/032.out 2015-01-11 16:52:27.643681072 +0800
+++ results/generic/032.out.bad 2016-08-06 13:44:43.861330500 +0800
@@ -1,5 +1,5 @@
QA output created by 032
-100 iterations
-0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd
-*
-0100000
+1: [768..775]: unwritten
+Unwritten extents found!
...
(Run 'diff -u tests/generic/032.out results/generic/032.out.bad' to see the entire diff)
Ran: generic/032
Failures: generic/032
Failed 1 of 1 tests
In write_end(), we should update the inode's i_size before unlocking the
page; otherwise, we can lose newly updated data in the following race
condition.
Thread A                        Thread B
- write_end
 - unlock page
                                - writepages
                                 - lock_page
                                  - writepage
                                    if the page is beyond the file size,
                                    we will skip writing the page
 - update i_size
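So the ordering in f2fs_write_end() becomes, roughly (using
f2fs_i_size_write() as the i_size updater is an assumption):

	if (pos + copied > i_size_read(inode))
		f2fs_i_size_write(inode, pos + copied);

	f2fs_put_page(page, 1);		/* unlocks the page */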
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Merge 4fc29c1aa375 included this extra line, but it's not needed (or
useful) since we'll bio_set_op_attrs() right after to properly set
the op and flags for the bio.
Signed-off-by: Jens Axboe <axboe@fb.com>
Merge tag 'for-f2fs-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"The major change in this version is mitigating cpu overheads on write
paths by replacing redundant inode page updates with mark_inode_dirty
calls. And we tried to reduce lock contentions as well to improve
filesystem scalability. Other feature is setting F2FS automatically
when detecting host-managed SMR.
Enhancements:
- ioctl to move a range of data between files
- inject orphan inode errors
- avoid flush commands congestion
- support lazytime
Bug fixes:
- return proper results for some dentry operations
- fix deadlock in add_link failure
- disable extent_cache for fcollapse/finsert"
* tag 'for-f2fs-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (68 commits)
f2fs: clean up coding style and redundancy
f2fs: get victim segment again after new cp
f2fs: handle error case with f2fs_bug_on
f2fs: avoid data race when deciding checkpoin in f2fs_sync_file
f2fs: support an ioctl to move a range of data blocks
f2fs: fix to report error number of f2fs_find_entry
f2fs: avoid memory allocation failure due to a long length
f2fs: reset default idle interval value
f2fs: use blk_plug in all the possible paths
f2fs: fix to avoid data update racing between GC and DIO
f2fs: add maximum prefree segments
f2fs: disable extent_cache for fcollapse/finsert inodes
f2fs: refactor __exchange_data_block for speed up
f2fs: fix ERR_PTR returned by bio
f2fs: avoid mark_inode_dirty
f2fs: move i_size_write in f2fs_write_end
f2fs: fix to avoid redundant discard during fstrim
f2fs: avoid mismatching block range for discard
f2fs: fix incorrect f_bfree calculation in ->statfs
f2fs: use percpu_rw_semaphore
...
Merge updates from Andrew Morton:
- a few misc bits
- ocfs2
- most(?) of MM
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (125 commits)
thp: fix comments of __pmd_trans_huge_lock()
cgroup: remove unnecessary 0 check from css_from_id()
cgroup: fix idr leak for the first cgroup root
mm: memcontrol: fix documentation for compound parameter
mm: memcontrol: remove BUG_ON in uncharge_list
mm: fix build warnings in <linux/compaction.h>
mm, thp: convert from optimistic swapin collapsing to conservative
mm, thp: fix comment inconsistency for swapin readahead functions
thp: update Documentation/{vm/transhuge,filesystems/proc}.txt
shmem: split huge pages beyond i_size under memory pressure
thp: introduce CONFIG_TRANSPARENT_HUGE_PAGECACHE
khugepaged: add support of collapse for tmpfs/shmem pages
shmem: make shmem_inode_info::lock irq-safe
khugepaged: move up_read(mmap_sem) out of khugepaged_alloc_page()
thp: extract khugepaged from mm/huge_memory.c
shmem, thp: respect MADV_{NO,}HUGEPAGE for file mappings
shmem: add huge pages support
shmem: get_unmapped_area align huge page
shmem: prepare huge= mount option and sysfs knob
mm, rmap: account shmem thp pages
...
Vladimir has noticed that we might declare memcg oom even during
readahead because read_pages only uses GFP_KERNEL (with mapping_gfp
restriction) while __do_page_cache_readahead uses
page_cache_alloc_readahead which adds __GFP_NORETRY to prevent from
OOMs. This gfp mask discrepancy is really unfortunate and easily
fixable. Drop page_cache_alloc_readahead() which only has one user and
outsource the gfp_mask logic into readahead_gfp_mask and propagate this
mask from __do_page_cache_readahead down to read_pages.
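The new helper is essentially the old readahead allocation mask factored
out; roughly:

	static inline gfp_t readahead_gfp_mask(struct address_space *x)
	{
		return mapping_gfp_mask(x) |
			__GFP_COLD | __GFP_NORETRY | __GFP_NOWARN;
	}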
This alone would have only very limited impact as most filesystems are
implementing ->readpages and the common implementation mpage_readpages
does GFP_KERNEL (with mapping_gfp restriction) again. We can tell it to
use readahead_gfp_mask instead as this function is called only during
readahead as well. The same applies to read_cache_pages.
ext4 has its own ext4_mpage_readpages but the path which has pages !=
NULL can use the same gfp mask. Btrfs, cifs, f2fs and orangefs are
doing a very similar pattern to mpage_readpages so the same can be
applied to them as well.
[akpm@linux-foundation.org: coding-style fixes]
[mhocko@suse.com: restrict gfp mask in mpage_alloc]
Link: http://lkml.kernel.org/r/20160610074223.GC32285@dhcp22.suse.cz
Link: http://lkml.kernel.org/r/1465301556-26431-1-git-send-email-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: Chris Mason <clm@fb.com>
Cc: Steve French <sfrench@samba.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Cc: Mike Marshall <hubcap@omnibond.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Changman Lee <cm224.lee@samsung.com>
Cc: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch reverts 19a5f5e2ef37 (f2fs: drop any block plugging),
and adds blk_plug in write paths additionally.
The main reason is that blk_start_plug can be used to wake up from low-power
mode before submitting further bios.
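The plugging pattern around the write paths is the usual one:

	struct blk_plug plug;

	blk_start_plug(&plug);
	/* write out data/node pages and submit the merged bios */
	blk_finish_plug(&plug);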
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Data in a file can be operated on by GC and DIO simultaneously, so we can
hit the race cases below:
For the write case:
Thread A                                Thread B
- generic_file_direct_write
 - invalidate_inode_pages2_range
 - f2fs_direct_IO
  - do_blockdev_direct_IO
   - do_direct_IO
    - get_more_blocks
                                        - f2fs_gc
                                         - do_garbage_collect
                                          - gc_data_segment
                                           - move_data_page
                                            - do_write_data_page
                                              migrate data block to new
                                              block address
   - dio_bio_submit
     update user data to old block address
For the read case:
Thread A                                Thread B
- generic_file_direct_write
 - invalidate_inode_pages2_range
 - f2fs_direct_IO
  - do_blockdev_direct_IO
   - do_direct_IO
    - get_more_blocks
                                        - f2fs_balance_fs
                                         - f2fs_gc
                                          - do_garbage_collect
                                           - gc_data_segment
                                            - move_data_page
                                             - do_write_data_page
                                               migrate data block to new
                                               block address
                                        - write_checkpoint
                                         - do_checkpoint
                                          - clear_prefree_segments
                                           - f2fs_issue_discard
                                             discard old block address
   - dio_bio_submit
     update user buffer from obsolete block address
In order to fix this, for a given file, we should make DIO and GC mutually
exclusive.
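A sketch of the exclusion with a per-inode rw_semaphore (the dio_rwsem name
and placement are assumptions here): the DIO path takes it shared, while
GC's data-page move takes it exclusively.

	/* DIO path, e.g. in f2fs_direct_IO() */
	down_read(&F2FS_I(inode)->dio_rwsem[rw]);
	/* ... do_blockdev_direct_IO() ... */
	up_read(&F2FS_I(inode)->dio_rwsem[rw]);

	/* GC path, e.g. in move_data_page() */
	down_write(&F2FS_I(inode)->dio_rwsem[WRITE]);
	/* ... migrate the data block ... */
	up_write(&F2FS_I(inode)->dio_rwsem[WRITE]);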
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This is to fix wrong error pointer handling flow reported by Dan.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch adds f2fs_set_page_dirty_nobuffer(), copied from
__set_page_dirty_nobuffers(). When appending 4KB blocks in f2fs on pmem with
multiple cores, this improves the overall performance.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
In the synchronous read path, after sending out the read request, the reader
locks the page to wait for the device to finish the read and unlock the
page. Meanwhile, truncation can race with the reader, so after the reader
gets the page lock, it should check the page's mapping to detect whether
someone has truncated the page in the meantime; the reader then has a chance
to retry if the truncation happened. Otherwise, the read can fail due to the
previous condition check.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
For an encrypted inode, if the user overwrites the inode's data, f2fs reads
the encrypted data into the page cache and then decrypts it.
However, a reader can race with the overwriter and see encrypted data that
has not yet been decrypted by the overwriter. Fix it by moving the
decryption work to the background and keeping the page non-uptodate until
the data is decrypted.
Thread A                                Thread B
- f2fs_file_write_iter
 - __generic_file_write_iter
  - generic_perform_write
   - f2fs_write_begin
    - f2fs_submit_page_bio
                                        - generic_file_read_iter
                                         - do_generic_file_read
                                          - lock_page_killable
                                          - unlock_page
                                          - copy_page_to_iter
                                            hit the encrypted data in updated page
   - lock_page
   - fscrypt_decrypt_page
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
f2fs_map_blocks is closely tied to performance, so let's avoid any latency
when reading ahead node pages.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This mount option forcefully enables the original log-structured filesystem
behavior, so there should be no random writes to the main area.
In particular, this supports host-managed SMR devices.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
In f2fs, we don't need to keep block plugging for NODE and DATA writes, since
we already merged bios as much as possible.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Separate the op from the rq_flag_bits and have f2fs
set/get the bio using bio_set_op_attrs/bio_op.
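In practice that means setting and querying the op through accessors rather
than stuffing everything into bi_rw; a sketch:

	bio_set_op_attrs(bio, REQ_OP_WRITE, REQ_SYNC);	/* was: bio->bi_rw = WRITE_SYNC */
	WARN_ON(bio_op(bio) != REQ_OP_WRITE);		/* bio_op() reads the op back */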
Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
This has callers of submit_bio/submit_bio_wait set the bio->bi_rw
instead of passing it in. This makes that use the same as
generic_make_request and how we set the other bio fields.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Fixed up fs/ext4/crypto.c
Signed-off-by: Jens Axboe <axboe@fb.com>
Previously, f2fs_write_data_pages() calls __f2fs_writepage(), which calls
f2fs_write_data_page().
If f2fs_write_data_page() returns AOP_WRITEPAGE_ACTIVATE, __f2fs_writepage()
calls mapping_set_error(). But this should not happen every time, since
f2fs_write_data_page() sometimes skips writing pages without an error.
For example, volatile_write() gives EIO all the time, as Shuoran Liu pointed
out.
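One way to express the idea in __f2fs_writepage() (a sketch only; whether
the actual fix takes exactly this form is an assumption):

	ret = mapping->a_ops->writepage(page, wbc);
	/* AOP_WRITEPAGE_ACTIVATE means the page was skipped, not an I/O error */
	if (ret && ret != AOP_WRITEPAGE_ACTIVATE)
		mapping_set_error(mapping, ret);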
Reported-by: Shuoran Liu <liushuoran@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
If there is no cold page, we don't need to do a loop to flush dirty
data pages.
On /dev/pmem0,
1. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048 conv=fsync
Before : 1.1 GB/s
After : 1.2 GB/s
2. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048
Before : 2.2 GB/s
After : 2.3 GB/s
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
For data pages, let's try to flush as much as possible in background.
On /dev/pmem0,
1. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048 conv=fsync
Before : 800 MB/s
After : 1.1 GB/s
2. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048
Before : 1.3 GB/s
After : 2.2 GB/s
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch removes the writepages lock. This improves multi-threading
performance.
tiobench, 32 threads, 4KB write per fsync on SSD
Before: 25.88 MB/s
After: 28.03 MB/s
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch reduces calls to the following functions across the whole tree:
- sync_inode_page()
- update_inode_page()
- update_inode()
- f2fs_write_inode()
Instead, checkpoint will flush all the dirty inode metadata before syncing
node pages.
Note that, this is doable, since we call mark_inode_dirty_sync() for all
inode's field change which needs to update on-disk inode as well.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch introduces f2fs_i_blocks_write() to call mark_inode_dirty_sync() when
changing inode->i_blocks.
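The helper is small; roughly:

	static inline void f2fs_i_blocks_write(struct inode *inode,
					       blkcnt_t diff, bool add)
	{
		inode->i_blocks = add ? inode->i_blocks + diff :
					inode->i_blocks - diff;
		mark_inode_dirty_sync(inode);
	}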
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>