linux

iv/linux

Author	SHA1	Message	Date
Chao Yu	b3582c6892	f2fs: reduce competition among node page writes We do not need to block on ->node_write among different node page writers e.g. fsync/flush, unless we have a node page writer from write_checkpoint. So it's better use rw_semaphore instead of mutex type for ->node_write to promote performance. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-07-30 23:28:37 -07:00
Jaegeuk Kim	cf2271e781	f2fs: avoid retrying wrong recovery routine when error was occurred This patch eliminates the propagation of recovery errors to the next mount. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-07-30 14:13:35 -07:00
Jaegeuk Kim	01229f5e1b	f2fs: fix wrong condition for unlikely This patch fixes the wrongly used unlikely condition. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-07-30 14:13:31 -07:00
Jaegeuk Kim	fff04f90c1	f2fs: add info of appended or updated data writes This patch introduces a inode number list in which represents inodes having appended data writes or updated data writes after last checkpoint. This will be used at fsync to determine whether the recovery information should be written or not. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-07-29 07:46:11 -07:00
Jaegeuk Kim	39efac41fb	f2fs: use radix_tree for ino management For better ino management, this patch replaces the data structure from list to radix tree. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-07-29 07:46:11 -07:00
Jaegeuk Kim	6451e041c8	f2fs: add infra for ino management This patch changes the naming of orphan-related data structures to use as inode numbers managed globally. Later, we can use this facility for managing any inode number lists. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-07-29 07:45:54 -07:00
Jaegeuk Kim	953e6cc6bc	f2fs: punch the core function for inode management This patch punches out the core functions to manage the inode numbers. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-07-29 07:40:25 -07:00
Linus Torvalds	64b2d1fbbf	f2fs updates for v3.16 This patch-set includes the following major enhancement patches. o enhance wait_on_page_writeback o support SEEK_DATA and SEEK_HOLE o enhance readahead flows o enhance IO flushes o support fiemap o add some tracepoints The other bug fixes are as follows. o fix to support a large volume > 2TB correctly o recovery bug fix wrt fallocated space o fix recursive lock on xattr operations o fix some cases on the remount flow And, there are a bunch of cleanups. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJTleLYAAoJEEAUqH6CSFDSdhEP/iY5hTZ02jH4ZiFPP/Nee4hS l0BHeZvrMjoccaWUDu0ZvIPC8BiJ7kLOgK+/VWS7LAfY1PY11ALEtYQOrW+RM47+ jMfULegTod/F8WS2B6dk31QMhAZldtnsYvA5PS1VV3S0rht+qbOz+PDejZFgSMc3 VVQ7OZzq30gMmtsw7oh3FHfeTu4xe/bxygdMRXgljQQD2MvorJiOb4MA+ovEDd9z CZMMTQvRKjc0d8LPf0gOkZEvG63GR6klCgFMuiappUsua0O8IPIEhCyEGFrE66vS fObVKpqNAsR2ABhS2grgn6Q7UUvn4xrF6jOwDH5tuw2yzmkQiMAWINwBdgnbEy1c D5S89PQ8TkQK9KBSoU0v8WKWC4HzWZF4ZEd6eq9VxVTS8iT0w8DtNHXK518FVC34 82iqrLc0EhrcGNiW/7Nrc6WzHrWqorylCFN7atB3ruhVqeMh3MZsDU4Gq0WgB2oh pF9XVBEpJQpV35DYSAPzLkm+2+mwHVNqwdY3HcvUs7IP90+wZlrWSRG2FEfFt/e8 6nwvbORrHMTI5VfdYlEPgpjuesFmYPZqEvRGdaDa1YrHqhvvgStEPT9qiq2qVn9+ tr0HjpNRDD/etkaE6ujYOYqdxuk3mm6RY68h+KSbNcY1/VTvt2bN2telwWuDsxIq 8yOsxV2x3JB/euDAJsSU =xqsO -----END PGP SIGNATURE----- Merge tag 'for-f2fs-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this round, there is no special interesting feature, but we've investigated a couple of tuning points with respect to the I/O flow. Several major bug fixes and a bunch of clean-ups also have been made. This patch-set includes the following major enhancement patches: - enhance wait_on_page_writeback - support SEEK_DATA and SEEK_HOLE - enhance readahead flows - enhance IO flushes - support fiemap - add some tracepoints The other bug fixes are as follows: - fix to support a large volume > 2TB correctly - recovery bug fix wrt fallocated space - fix recursive lock on xattr operations - fix some cases on the remount flow And, there are a bunch of cleanups" * tag 'for-f2fs-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (52 commits) f2fs: support f2fs_fiemap f2fs: avoid not to call remove_dirty_inode f2fs: recover fallocated space f2fs: fix to recover data written by dio f2fs: large volume support f2fs: avoid crash when trace f2fs_submit_page_mbio event in ra_sum_pages f2fs: avoid overflow when large directory feathure is enabled f2fs: fix recursive lock by f2fs_setxattr MAINTAINERS: add a co-maintainer from samsung for F2FS MAINTAINERS: change the email address for f2fs f2fs: use inode_init_owner() to simplify codes f2fs: avoid to use slab memory in f2fs_issue_flush for efficiency f2fs: add a tracepoint for f2fs_read_data_page f2fs: add a tracepoint for f2fs_write_{meta,node,data}_pages f2fs: add a tracepoint for f2fs_write_{meta,node,data}_page f2fs: add a tracepoint for f2fs_write_end f2fs: add a tracepoint for f2fs_write_begin f2fs: fix checkpatch warning f2fs: deactivate inode page if the inode is evicted f2fs: decrease the lock granularity during write_begin ...	2014-06-09 19:11:44 -07:00
Mel Gorman	2457aec637	mm: non-atomically mark page accessed during page cache allocation where possible aops->write_begin may allocate a new page and make it visible only to have mark_page_accessed called almost immediately after. Once the page is visible the atomic operations are necessary which is noticable overhead when writing to an in-memory filesystem like tmpfs but should also be noticable with fast storage. The objective of the patch is to initialse the accessed information with non-atomic operations before the page is visible. The bulk of filesystems directly or indirectly use grab_cache_page_write_begin or find_or_create_page for the initial allocation of a page cache page. This patch adds an init_page_accessed() helper which behaves like the first call to mark_page_accessed() but may called before the page is visible and can be done non-atomically. The primary APIs of concern in this care are the following and are used by most filesystems. find_get_page find_lock_page find_or_create_page grab_cache_page_nowait grab_cache_page_write_begin All of them are very similar in detail to the patch creates a core helper pagecache_get_page() which takes a flags parameter that affects its behavior such as whether the page should be marked accessed or not. Then old API is preserved but is basically a thin wrapper around this core function. Each of the filesystems are then updated to avoid calling mark_page_accessed when it is known that the VM interfaces have already done the job. There is a slight snag in that the timing of the mark_page_accessed() has now changed so in rare cases it's possible a page gets to the end of the LRU as PageReferenced where as previously it might have been repromoted. This is expected to be rare but it's worth the filesystem people thinking about it in case they see a problem with the timing change. It is also the case that some filesystems may be marking pages accessed that previously did not but it makes sense that filesystems have consistent behaviour in this regard. The test case used to evaulate this is a simple dd of a large file done multiple times with the file deleted on each iterations. The size of the file is 1/10th physical memory to avoid dirty page balancing. In the async case it will be possible that the workload completes without even hitting the disk and will have variable results but highlight the impact of mark_page_accessed for async IO. The sync results are expected to be more stable. The exception is tmpfs where the normal case is for the "IO" to not hit the disk. The test machine was single socket and UMA to avoid any scheduling or NUMA artifacts. Throughput and wall times are presented for sync IO, only wall times are shown for async as the granularity reported by dd and the variability is unsuitable for comparison. As async results were variable do to writback timings, I'm only reporting the maximum figures. The sync results were stable enough to make the mean and stddev uninteresting. The performance results are reported based on a run with no profiling. Profile data is based on a separate run with oprofile running. async dd 3.15.0-rc3 3.15.0-rc3 vanilla accessed-v2 ext3 Max elapsed 13.9900 ( 0.00%) 11.5900 ( 17.16%) tmpfs Max elapsed 0.5100 ( 0.00%) 0.4900 ( 3.92%) btrfs Max elapsed 12.8100 ( 0.00%) 12.7800 ( 0.23%) ext4 Max elapsed 18.6000 ( 0.00%) 13.3400 ( 28.28%) xfs Max elapsed 12.5600 ( 0.00%) 2.0900 ( 83.36%) The XFS figure is a bit strange as it managed to avoid a worst case by sheer luck but the average figures looked reasonable. samples percentage ext3 86107 0.9783 vmlinux-3.15.0-rc4-vanilla mark_page_accessed ext3 23833 0.2710 vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed ext3 5036 0.0573 vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed ext4 64566 0.8961 vmlinux-3.15.0-rc4-vanilla mark_page_accessed ext4 5322 0.0713 vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed ext4 2869 0.0384 vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed xfs 62126 1.7675 vmlinux-3.15.0-rc4-vanilla mark_page_accessed xfs 1904 0.0554 vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed xfs 103 0.0030 vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed btrfs 10655 0.1338 vmlinux-3.15.0-rc4-vanilla mark_page_accessed btrfs 2020 0.0273 vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed btrfs 587 0.0079 vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed tmpfs 59562 3.2628 vmlinux-3.15.0-rc4-vanilla mark_page_accessed tmpfs 1210 0.0696 vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed tmpfs 94 0.0054 vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed [akpm@linux-foundation.org: don't run init_page_accessed() against an uninitialised pointer] Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Michal Hocko <mhocko@suse.cz> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Tested-by: Prabhakar Lad <prabhakar.csengg@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-06-04 16:54:10 -07:00
Changman Lee	1dbe415216	f2fs: large volume support f2fs's cp has one page which consists of struct f2fs_checkpoint and version bitmap of sit and nat. To support lots of segments, we need more blocks for sit bitmap. So let's arrange sit bitmap as following: +-----------------+------------+ \| f2fs_checkpoint \| sit bitmap \| \| + nat bitmap \| \| +-----------------+------------+ 0 4k N blocks Signed-off-by: Changman Lee <cm224.lee@samsung.com> [Jaegeuk Kim: simple code change for readability] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-06-04 13:34:30 +09:00
Chao Yu	e574843438	f2fs: add a tracepoint for f2fs_write_{meta,node,data}_pages This patch adds a tracepoint for f2fs_write_{meta,node,data}_pages to trace when pages are fsyncing/flushing. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-05-07 10:21:59 +09:00
Chao Yu	ecda0de343	f2fs: add a tracepoint for f2fs_write_{meta,node,data}_page This patch adds a tracepoint for f2fs_write_{meta,node,data}_page to trace when page is writting out. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-05-07 10:21:59 +09:00
Jaegeuk Kim	bde446866c	f2fs: no need to wait on page writebck to meta pages This patch removes grab_cache_page_write_begin for meta pages. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-05-07 10:21:58 +09:00
Fabian Frederick	b49ad51e6d	f2fs: add static to get_max_meta_blks inline get_max_meta_blks is only used in checkpoint.c Use standard static inline format. Cc: Jaegeuk Kim <jaegeuk.kim@samsung.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-05-07 10:21:55 +09:00
Jaegeuk Kim	ed57c27f73	f2fs: remove costly dirty_dir_inode operations This patch removes list opeations in handling dirty dir inodes. Previously, F2FS traverses whole the list of dirty dir inodes to check whether there is an existing inode or not, resulting in heavy CPU overheads. So this patch removes such the traverse operations by adding FI_DIRTY_DIR to indicate the inode lies on the list or not. Through this simple flag, we can remove redundant operations gracefully. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-05-07 10:21:54 +09:00
Jaegeuk Kim	76f60268e7	f2fs: call redirty_page_for_writepage This patch replace some general codes with redirty_page_for_writepage, which can be enabled after consideration on additional procedure like counting dirty pages appropriately. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-05-07 10:21:54 +09:00
Jaegeuk Kim	1e87a78d95	f2fs: avoid to conduct roll-forward due to the remained garbage blocks The f2fs always scans the next chain of direct node blocks. But some garbage blocks are able to be remained due to no discard support or SSR triggers. This occasionally wreaks recovering wrong inodes that were used or BUG_ONs due to reallocating node ids as follows. When mount this f2fs image: http://linuxtesting.org/downloads/f2fs_fault_image.zip BUG_ON is triggered in f2fs driver (messages below are generated on kernel 3.13.2; for other kernels output is similar): kernel BUG at fs/f2fs/node.c:215! Call Trace: [<ffffffffa032ebad>] recover_inode_page+0x1fd/0x3e0 [f2fs] [<ffffffff811446e7>] ? __lock_page+0x67/0x70 [<ffffffff81089990>] ? autoremove_wake_function+0x50/0x50 [<ffffffffa0337788>] recover_fsync_data+0x1398/0x15d0 [f2fs] [<ffffffff812b9e5c>] ? selinux_d_instantiate+0x1c/0x20 [<ffffffff811cb20b>] ? d_instantiate+0x5b/0x80 [<ffffffffa0321044>] f2fs_fill_super+0xb04/0xbf0 [f2fs] [<ffffffff811b861e>] ? mount_bdev+0x7e/0x210 [<ffffffff811b8769>] mount_bdev+0x1c9/0x210 [<ffffffffa0320540>] ? validate_superblock+0x210/0x210 [f2fs] [<ffffffffa031cf8d>] f2fs_mount+0x1d/0x30 [f2fs] [<ffffffff811b9497>] mount_fs+0x47/0x1c0 [<ffffffff81166e00>] ? __alloc_percpu+0x10/0x20 [<ffffffff811d4032>] vfs_kern_mount+0x72/0x110 [<ffffffff811d6763>] do_mount+0x493/0x910 [<ffffffff811615cb>] ? strndup_user+0x5b/0x80 [<ffffffff811d6c70>] SyS_mount+0x90/0xe0 [<ffffffff8166f8d9>] system_call_fastpath+0x16/0x1b Found by Linux File System Verification project (linuxtesting.org). Reported-by: Andrey Tsyvarev <tsyvarev@ispras.ru> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-05-07 10:21:54 +09:00
Chao Yu	2d7b822ad9	f2fs: use list_for_each_entry{_safe} for simplyfying code This patch use list_for_each_entry{_safe} instead of list_for_each{_safe} for simplfying code. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-04-02 09:56:27 +09:00
Chao Yu	cf0ee0f09b	f2fs: avoid free slab cache under spinlock Move kmem_cache_free out of spinlock protection region for better performance. Change log from v1: o remove spinlock protection for kmem_cache_free in destroy_node_manager suggested by Jaegeuk Kim. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-04-02 09:56:12 +09:00
Jaegeuk Kim	3cb5ad152b	f2fs: call f2fs_wait_on_page_writeback instead of native function If a page is on writeback, f2fs can face with deadlock due to under writepages. This is caused by merging IOs inside f2fs, so if it comes to detect, let's throw merged IOs, which is implemented by f2fs_wait_on_page_writeback. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-03-20 22:10:04 +09:00
Jaegeuk Kim	50c8cdb35a	f2fs: introduce nr_pages_to_write for segment alignment This patch introduces nr_pages_to_write to align page writes to the segment or other operational unit size, which can be tuned according to the system environment. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-03-18 16:37:53 +09:00
Jaegeuk Kim	d3baf95da5	f2fs: increase pages_skipped when skipping writepages This patch increases pages_skipped when skipping writepages. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-03-18 16:37:16 +09:00
Jaegeuk Kim	87d6f89094	f2fs: avoid small data writes by skipping writepages This patch introduces nr_pages_to_skip(sbi, type) to determine writepages can be skipped. The dentry, node, and meta pages can be conrolled by F2FS without breaking the FS consistency. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-03-18 13:58:59 +09:00
Jaegeuk Kim	f8b2c1f940	f2fs: introduce get_dirty_dents for readability The get_dirty_dents gives us the number of dirty dentry pages. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-03-18 12:34:30 +09:00
Gu Zheng	e8512d2e0c	f2fs: remove the unused ctor argument of f2fs_kmem_cache_create() Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-03-10 18:45:14 +09:00
Chao Yu	9cf3c3898a	f2fs: fix dirty page accounting when redirty We should de-account dirty counters for page when redirty in ->writepage(). Wu Fengguang described in 'commit 971767caf632190f77a40b4011c19948232eed75': "writeback: fix dirtied pages accounting on redirty De-account the accumulative dirty counters on page redirty. Page redirties (very common in ext4) will introduce mismatch between counters (a) and (b) a) NR_DIRTIED, BDI_DIRTIED, tsk->nr_dirtied b) NR_WRITTEN, BDI_WRITTEN This will introduce systematic errors in balanced_rate and result in dirty page position errors (ie. the dirty pages are no longer balanced around the global/bdi setpoints)." Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-02-28 13:09:08 +09:00
Chao Yu	81c1a0f13e	f2fs: readahead contiguous SSA blocks for f2fs_gc If there are multi segments in one section, we will read those SSA blocks which have contiguous address one by one in f2fs_gc. It may lost performance, let's read ahead SSA blocks by merge multi read request. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-02-27 20:40:36 +09:00
Changman Lee	942e0be621	f2fs: show counts of checkpoint in status This patch shows the counts of checkpoint in f2fs' status. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-02-17 14:58:53 +09:00
Chao Yu	662befda25	f2fs: introduce ra_meta_pages to readahead CP/NAT/SIT pages This patch help us to cleanup the readahead code by merging ra_{sit,nat}_pages function into ra_meta_pages. Additionally the new function is used to readahead cp block in recover_orphan_inodes. Change log from v1: o fix a deadloop bug pointed by Jaegeuk Kim. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-02-17 14:58:53 +09:00
Jaegeuk Kim	1fe54f9dd3	f2fs: clean up redundant function call This patch integrates inode_[inc\|dec]_dirty_dents with inc_page_count to remove redundant calls. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-02-17 14:58:53 +09:00
Jaegeuk Kim	203681f65b	f2fs: fix f2fs_write_meta_page at no checkpoint status If f2fs entered errorneous checkpoint status, it should skip writing meta pages instead of redirtying the pages out. Otherwise, it cannot unmount the partition even though f2fs is under read-only status. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-02-17 14:58:53 +09:00
Jaegeuk Kim	4ef51a8fcc	f2fs: introduce NODE_MAPPING for code consistency This patch adds NODE_MAPPING which is similar as META_MAPPING introduced by Gu Zheng. Cc: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-01-22 18:41:08 +09:00
Gu Zheng	63f5384c9a	f2fs: remove the orphan block page array As the orphan_blocks may be max to 504, so it is not security and rigorous to store such a large array in the kernel stack as Dan Carpenter said. In fact, grab_meta_page has locked the page in the page cache, and we can use find_get_page() to fetch the page safely in the downstream, so we can remove the page array directly. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-01-22 18:41:08 +09:00
Gu Zheng	9df27d982d	f2fs: add help function META_MAPPING Introduce help function META_MAPPING() to get the cache meta blocks' address space. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-01-22 18:41:07 +09:00
Gu Zheng	17b692f60e	f2fs: use spinlock rather than mutex for better speed With the 2 previous changes, all the long time operations are moved out of the protection region, so here we can use spinlock rather than mutex (orphan_inode_mutex) for lower overhead. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-01-14 18:12:05 +09:00
Gu Zheng	c1ef372572	f2fs: move alloc new orphan node out of lock protection region Move alloc new orphan node out of lock protection region. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-01-14 18:12:04 +09:00
Gu Zheng	4531929e39	f2fs: move grabing orphan pages out of protection region Move grabing orphan block page out of protection region, and grab all the orphan block pages ahead. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Reviewed-by: Chao Yu <chao2.yu@samsung.com> [Jaegeuk Kim: remove unnecessary code pointed by Chao Yu] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-01-14 18:11:20 +09:00
Gu Zheng	0d47c1adc2	f2fs: convert max_orphans to a field of f2fs_sb_info Previously, we need to calculate the max orphan num when we try to acquire an orphan inode, but it's a stable value since the super block was inited. So converting it to a field of f2fs_sb_info and use it directly when needed seems a better choose. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-12-26 20:37:52 +09:00
Jaegeuk Kim	5459aa9770	f2fs: write dirty meta pages collectively This patch enhances writing dirty meta pages collectively in background. During the file data writes, it'd better avoid to write small dirty meta pages frequently. So let's give a chance to collect a number of dirty meta pages for a while. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-12-23 10:18:07 +09:00
Jaegeuk Kim	458e6197c3	f2fs: refactor bio->rw handling This patch introduces f2fs_io_info to mitigate the complex parameter list. struct f2fs_io_info { enum page_type type; /* contains DATA/NODE/META/META_FLUSH / int rw; / contains R/RS/W/WS / int rw_flag; / contains REQ_META/REQ_PRIO / } 1. f2fs_write_data_pages - DATA - WRITE_SYNC is set when wbc->WB_SYNC_ALL. 2. sync_node_pages - NODE - WRITE_SYNC all the time 3. sync_meta_pages - META - WRITE_SYNC all the time - REQ_META \| REQ_PRIO all the time * f2fs_submit_merged_bio() handles META_FLUSH. 4. ra_nat_pages, ra_sit_pages, ra_sum_pages - META - READ_SYNC Cc: Fan Li <fanofcode.li@samsung.com> Cc: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-12-23 10:18:06 +09:00
Jaegeuk Kim	6bacf52fb5	f2fs: add unlikely() macro for compiler more aggressively This patch adds unlikely() macro into the most of codes. The basic rule is to add that when: - checking unusual errors, - checking page mappings, - and the other unlikely conditions. Change log from v1: - Don't add unlikely for the NULL test and error test: advised by Andi Kleen. Cc: Chao Yu <chao2.yu@samsung.com> Cc: Andi Kleen <andi@firstfloor.org> Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-12-23 10:18:06 +09:00
Chao Yu	cfb271d485	f2fs: add unlikely() macro for compiler optimization As we know, some of our branch condition will rarely be true. So we could add 'unlikely' to let compiler optimize these code, by this way we could drop unneeded 'jump' assemble code to improve performance. change log: o add unlikely as many as possible across the whole source files at once suggested by Jaegeuk Kim. Suggested-by: Jaegeuk Kim <jaegeuk.kim@samsung.com> Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-12-23 10:18:06 +09:00
Jaegeuk Kim	93dfe2ac51	f2fs: refactor bio-related operations This patch integrates redundant bio operations on read and write IOs. 1. Move bio-related codes to the top of data.c. 2. Replace f2fs_submit_bio with f2fs_submit_merged_bio, which handles read bios additionally. 3. Introduce __submit_merged_bio to submit the merged bio. 4. Change f2fs_readpage to f2fs_submit_page_bio. 5. Introduce f2fs_submit_page_mbio to integrate previous submit_read_page and submit_write_page. Reviewed-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Reviewed-by: Chao Yu <chao2.yu@samsung.com > Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-12-23 10:18:05 +09:00
Chao Yu	8f99a946f3	f2fs: convert recover_orphan_inodes to void The recover_orphan_inodes() returns no error all the time, so we don't need to check its errors. Signed-off-by: Chao Yu <chao2.yu@samsung.com> [Jaegeuk Kim: add description] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-12-23 10:18:05 +09:00
Chao Yu	6947eea957	f2fs: avoid to calculate incorrect max orphan number Because we will write node summaries when do_checkpoint with umount flag, our number of max orphan blocks should minus NR_CURSEG_NODE_TYPE additional. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Shu Tan <shu.tan@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-12-23 10:18:04 +09:00
Jaegeuk Kim	f9a4e6df52	f2fs: bug fix on bit overflow from 32bits to 64bits This patch fixes some bit overflows by the shift operations. Dan Carpenter reported potential bugs on bit overflows as follows. fs/f2fs/segment.c:910 submit_write_page() warn: should 'blk_addr << ((sbi)->log_blocksize - 9)' be a 64 bit type? fs/f2fs/checkpoint.c:429 get_valid_checkpoint() warn: should '1 << ()' be a 64 bit type? fs/f2fs/data.c:408 f2fs_readpage() warn: should 'blk_addr << ((sbi)->log_blocksize - 9)' be a 64 bit type? fs/f2fs/data.c:457 submit_read_page() warn: should 'blk_addr << ((sbi)->log_blocksize - 9)' be a 64 bit type? fs/f2fs/data.c:525 get_data_block_ro() warn: should 'i << blkbits' be a 64 bit type? Bug-Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-12-23 10:18:04 +09:00
Gu Zheng	3679556794	f2fs: fix a potential out of range issue Fix a potential out of range issue introduced by commit: 22fb72225a f2fs: simplify write_orphan_inodes for better readable Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-12-23 10:18:04 +09:00
Changman Lee	03232305ff	f2fs: send REQ_META or REQ_PRIO when reading meta area Let's send REQ_META or REQ_PRIO when reading meta area such as NAT/SIT etc. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-12-23 10:18:03 +09:00
Gu Zheng	ce3b7d80ed	f2fs: move the list_head initialization into the lock protection region Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-12-23 10:18:02 +09:00
Gu Zheng	502c6e0bcd	f2fs: simplify write_orphan_inodes for better readable Simplify write_orphan_inodes for better readable. Because we hold the orphan_inode_mutex, so it's safe to use list_for_each_entry instead of list_for_each_safe. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-12-23 10:18:01 +09:00

1 2

84 Commits