
Merge tag 'f2fs-for-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "In this cycle, f2fs has some performance improvements for Android
  workloads such as using read-unfair rwsems and adding some sysfs
  entries to control GCs and discard commands in more details. In
  addtiion, it has some tunings to improve the recovery speed after
  sudden power-cut.

  Enhancement:
   - add reader-unfair rwsems with F2FS_UNFAIR_RWSEM (to be replaced
     with generic API support later)
   - adjust to make the readahead/recovery flow more efficient
   - sysfs entries to control issue speeds of GCs and Discard commands
   - enable idmapped mounts

  Bug fix:
   - correct wrong error handling routines
   - fix missing conditions in quota
   - fix a potential deadlock between writeback and block plug routines
   - fix a deadlock between freezefs and evict_inode

  We've added some boundary checks to avoid kernel panics on corrupted
  images, and several minor code clean-ups"

* tag 'f2fs-for-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (27 commits)
  f2fs: fix to do sanity check on .cp_pack_total_block_count
  f2fs: make gc_urgent and gc_segment_mode sysfs node readable
  f2fs: use aggressive GC policy during f2fs_disable_checkpoint()
  f2fs: fix compressed file start atomic write may cause data corruption
  f2fs: initialize sbi->gc_mode explicitly
  f2fs: introduce gc_urgent_mid mode
  f2fs: compress: fix to print raw data size in error path of lz4 decompression
  f2fs: remove redundant parameter judgment
  f2fs: use spin_lock to avoid hang
  f2fs: don't get FREEZE lock in f2fs_evict_inode in frozen fs
  f2fs: remove unnecessary read for F2FS_FITS_IN_INODE
  f2fs: introduce F2FS_UNFAIR_RWSEM to support unfair rwsem
  f2fs: avoid an infinite loop in f2fs_sync_dirty_inodes
  f2fs: fix to do sanity check on curseg->alloc_type
  f2fs: fix to avoid potential deadlock
  f2fs: quota: fix loop condition at f2fs_quota_sync()
  f2fs: Restore rwsem lockdep support
  f2fs: fix missing free nid in f2fs_handle_failed_inode
  f2fs: support idmapped mounts
  f2fs: add a way to limit roll forward recovery time
  ...
Merged by: Linus Torvalds, 2022-03-22 10:00:31 -07:00
commit ef510682af
23 changed files with 699 additions and 391 deletions

Documentation/ABI/testing/sysfs-fs-f2fs

@@ -55,8 +55,9 @@ Description: Controls the in-place-update policy.
 		0x04  F2FS_IPU_UTIL
 		0x08  F2FS_IPU_SSR_UTIL
 		0x10  F2FS_IPU_FSYNC
-		0x20  F2FS_IPU_ASYNC,
+		0x20  F2FS_IPU_ASYNC
 		0x40  F2FS_IPU_NOCACHE
+		0x80  F2FS_IPU_HONOR_OPU_WRITE
 		====  =================
 		Refer segment.h for details.
@@ -98,6 +99,33 @@ Description: Controls the issue rate of discard commands that consist of small
 		checkpoint is triggered, and issued during the checkpoint.
 		By default, it is disabled with 0.
 
+What:		/sys/fs/f2fs/<disk>/max_discard_request
+Date:		December 2021
+Contact:	"Konstantin Vyshetsky" <vkon@google.com>
+Description:	Controls the number of discards a thread will issue at a time.
+		Higher number will allow the discard thread to finish its work
+		faster, at the cost of higher latency for incoming I/O.
+
+What:		/sys/fs/f2fs/<disk>/min_discard_issue_time
+Date:		December 2021
+Contact:	"Konstantin Vyshetsky" <vkon@google.com>
+Description:	Controls the interval the discard thread will wait between
+		issuing discard requests when there are discards to be issued
+		and no I/O aware interruptions occur.
+
+What:		/sys/fs/f2fs/<disk>/mid_discard_issue_time
+Date:		December 2021
+Contact:	"Konstantin Vyshetsky" <vkon@google.com>
+Description:	Controls the interval the discard thread will wait between
+		issuing discard requests when there are discards to be issued
+		and an I/O aware interruption occurs.
+
+What:		/sys/fs/f2fs/<disk>/max_discard_issue_time
+Date:		December 2021
+Contact:	"Konstantin Vyshetsky" <vkon@google.com>
+Description:	Controls the interval the discard thread will wait when there
+		are no discard operations to be issued.
+
 What:		/sys/fs/f2fs/<disk>/discard_granularity
 Date:		July 2017
 Contact:	"Chao Yu" <yuchao0@huawei.com>
@@ -269,11 +297,16 @@ Description: Shows current reserved blocks in system, it may be temporarily
 What:		/sys/fs/f2fs/<disk>/gc_urgent
 Date:		August 2017
 Contact:	"Jaegeuk Kim" <jaegeuk@kernel.org>
-Description:	Do background GC aggressively when set. When gc_urgent = 1,
-		background thread starts to do GC by given gc_urgent_sleep_time
-		interval. When gc_urgent = 2, F2FS will lower the bar of
-		checking idle in order to process outstanding discard commands
-		and GC a little bit aggressively. It is set to 0 by default.
+Description:	Do background GC aggressively when set. Set to 0 by default.
+		gc urgent high(1): does GC forcibly in a period of given
+		gc_urgent_sleep_time and ignores I/O idling check. uses greedy
+		GC approach and turns SSR mode on.
+		gc urgent low(2): lowers the bar of checking I/O idling in
+		order to process outstanding discard commands and GC a
+		little bit aggressively. uses cost benefit GC approach.
+		gc urgent mid(3): does GC forcibly in a period of given
+		gc_urgent_sleep_time and executes a mid level of I/O idling check.
+		uses cost benefit GC approach.
 
 What:		/sys/fs/f2fs/<disk>/gc_urgent_sleep_time
 Date:		August 2017
@@ -430,6 +463,7 @@ Description: Show status of f2fs superblock in real time.
 		0x800  SBI_QUOTA_SKIP_FLUSH   skip flushing quota in current CP
 		0x1000 SBI_QUOTA_NEED_REPAIR  quota file may be corrupted
 		0x2000 SBI_IS_RESIZEFS        resizefs is in process
+		0x4000 SBI_IS_FREEZING        freefs is in process
 		====== ===================== =================================
 
 What:		/sys/fs/f2fs/<disk>/ckpt_thread_ioprio
@@ -503,7 +537,7 @@ Date: July 2021
 Contact:	"Daeho Jeong" <daehojeong@google.com>
 Description:	Show how many segments have been reclaimed by GC during a specific
 		GC mode (0: GC normal, 1: GC idle CB, 2: GC idle greedy,
-		3: GC idle AT, 4: GC urgent high, 5: GC urgent low)
+		3: GC idle AT, 4: GC urgent high, 5: GC urgent low 6: GC urgent mid)
 		You can re-initialize this value to "0".
 
 What:		/sys/fs/f2fs/<disk>/gc_segment_mode
@@ -540,3 +574,9 @@ Contact: "Daeho Jeong" <daehojeong@google.com>
 Description:	You can set the trial count limit for GC urgent high mode with this value.
 		If GC thread gets to the limit, the mode will turn back to GC normal mode.
 		By default, the value is zero, which means there is no limit like before.
+
+What:		/sys/fs/f2fs/<disk>/max_roll_forward_node_blocks
+Date:		January 2022
+Contact:	"Jaegeuk Kim" <jaegeuk@kernel.org>
+Description:	Controls max # of node block writes to be used for roll forward
+		recovery. This can limit the roll forward recovery time.

fs/f2fs/Kconfig

@@ -143,3 +143,10 @@ config F2FS_IOSTAT
 	  Support getting IO statistics through sysfs and printing out periodic
 	  IO statistics tracepoint events. You have to turn on "iostat_enable"
 	  sysfs node to enable this feature.
+
+config F2FS_UNFAIR_RWSEM
+	bool "F2FS unfair rw_semaphore"
+	depends on F2FS_FS && BLK_CGROUP
+	help
+	  Use unfair rw_semaphore, if system configured IO priority by block
+	  cgroup.

fs/f2fs/acl.c

@@ -204,8 +204,9 @@ struct posix_acl *f2fs_get_acl(struct inode *inode, int type, bool rcu)
 	return __f2fs_get_acl(inode, type, NULL);
 }
 
-static int f2fs_acl_update_mode(struct inode *inode, umode_t *mode_p,
-			struct posix_acl **acl)
+static int f2fs_acl_update_mode(struct user_namespace *mnt_userns,
+			struct inode *inode, umode_t *mode_p,
+			struct posix_acl **acl)
 {
 	umode_t mode = inode->i_mode;
 	int error;
@@ -218,14 +219,15 @@ static int f2fs_acl_update_mode(struct inode *inode, umode_t *mode_p,
 		return error;
 	if (error == 0)
 		*acl = NULL;
-	if (!in_group_p(i_gid_into_mnt(&init_user_ns, inode)) &&
-	    !capable_wrt_inode_uidgid(&init_user_ns, inode, CAP_FSETID))
+	if (!in_group_p(i_gid_into_mnt(mnt_userns, inode)) &&
+	    !capable_wrt_inode_uidgid(mnt_userns, inode, CAP_FSETID))
 		mode &= ~S_ISGID;
 	*mode_p = mode;
 	return 0;
 }
 
-static int __f2fs_set_acl(struct inode *inode, int type,
+static int __f2fs_set_acl(struct user_namespace *mnt_userns,
+			struct inode *inode, int type,
 			struct posix_acl *acl, struct page *ipage)
 {
 	int name_index;
@@ -238,7 +240,8 @@ static int __f2fs_set_acl(struct inode *inode, int type,
 	case ACL_TYPE_ACCESS:
 		name_index = F2FS_XATTR_INDEX_POSIX_ACL_ACCESS;
 		if (acl && !ipage) {
-			error = f2fs_acl_update_mode(inode, &mode, &acl);
+			error = f2fs_acl_update_mode(mnt_userns, inode,
+							&mode, &acl);
 			if (error)
 				return error;
 			set_acl_inode(inode, mode);
@@ -279,7 +282,7 @@ int f2fs_set_acl(struct user_namespace *mnt_userns, struct inode *inode,
 	if (unlikely(f2fs_cp_error(F2FS_I_SB(inode))))
 		return -EIO;
 
-	return __f2fs_set_acl(inode, type, acl, NULL);
+	return __f2fs_set_acl(mnt_userns, inode, type, acl, NULL);
 }
 
 /*
@@ -419,7 +422,7 @@ int f2fs_init_acl(struct inode *inode, struct inode *dir, struct page *ipage,
 	f2fs_mark_inode_dirty_sync(inode, true);
 
 	if (default_acl) {
-		error = __f2fs_set_acl(inode, ACL_TYPE_DEFAULT, default_acl,
+		error = __f2fs_set_acl(NULL, inode, ACL_TYPE_DEFAULT, default_acl,
				       ipage);
 		posix_acl_release(default_acl);
 	} else {
@@ -427,7 +430,7 @@ int f2fs_init_acl(struct inode *inode, struct inode *dir, struct page *ipage,
 	}
 	if (acl) {
 		if (!error)
-			error = __f2fs_set_acl(inode, ACL_TYPE_ACCESS, acl,
+			error = __f2fs_set_acl(NULL, inode, ACL_TYPE_ACCESS, acl,
					       ipage);
 		posix_acl_release(acl);
 	} else {

fs/f2fs/checkpoint.c

@@ -98,6 +98,13 @@ repeat:
 	}
 
 	if (unlikely(!PageUptodate(page))) {
+		if (page->index == sbi->metapage_eio_ofs &&
+		    sbi->metapage_eio_cnt++ == MAX_RETRY_META_PAGE_EIO) {
+			set_ckpt_flags(sbi, CP_ERROR_FLAG);
+		} else {
+			sbi->metapage_eio_ofs = page->index;
+			sbi->metapage_eio_cnt = 0;
+		}
 		f2fs_put_page(page, 1);
 		return ERR_PTR(-EIO);
 	}
@@ -282,18 +289,22 @@ out:
 	return blkno - start;
 }
 
-void f2fs_ra_meta_pages_cond(struct f2fs_sb_info *sbi, pgoff_t index)
+void f2fs_ra_meta_pages_cond(struct f2fs_sb_info *sbi, pgoff_t index,
+							unsigned int ra_blocks)
 {
 	struct page *page;
 	bool readahead = false;
 
+	if (ra_blocks == RECOVERY_MIN_RA_BLOCKS)
+		return;
+
 	page = find_get_page(META_MAPPING(sbi), index);
 	if (!page || !PageUptodate(page))
 		readahead = true;
 	f2fs_put_page(page, 0);
 
 	if (readahead)
-		f2fs_ra_meta_pages(sbi, index, BIO_MAX_VECS, META_POR, true);
+		f2fs_ra_meta_pages(sbi, index, ra_blocks, META_POR, true);
 }
 
 static int __f2fs_write_meta_page(struct page *page,
@@ -351,13 +362,13 @@ static int f2fs_write_meta_pages(struct address_space *mapping,
 		goto skip_write;
 
 	/* if locked failed, cp will flush dirty pages instead */
-	if (!down_write_trylock(&sbi->cp_global_sem))
+	if (!f2fs_down_write_trylock(&sbi->cp_global_sem))
 		goto skip_write;
 
 	trace_f2fs_writepages(mapping->host, wbc, META);
 	diff = nr_pages_to_write(sbi, META, wbc);
 	written = f2fs_sync_meta_pages(sbi, META, wbc->nr_to_write, FS_META_IO);
-	up_write(&sbi->cp_global_sem);
+	f2fs_up_write(&sbi->cp_global_sem);
 	wbc->nr_to_write = max((long)0, wbc->nr_to_write - written - diff);
 	return 0;
 
@@ -864,6 +875,7 @@ static struct page *validate_checkpoint(struct f2fs_sb_info *sbi,
 	struct page *cp_page_1 = NULL, *cp_page_2 = NULL;
 	struct f2fs_checkpoint *cp_block = NULL;
 	unsigned long long cur_version = 0, pre_version = 0;
+	unsigned int cp_blocks;
 	int err;
 
 	err = get_checkpoint_version(sbi, cp_addr, &cp_block,
@@ -871,15 +883,16 @@ static struct page *validate_checkpoint(struct f2fs_sb_info *sbi,
 	if (err)
 		return NULL;
 
-	if (le32_to_cpu(cp_block->cp_pack_total_block_count) >
-					sbi->blocks_per_seg) {
+	cp_blocks = le32_to_cpu(cp_block->cp_pack_total_block_count);
+
+	if (cp_blocks > sbi->blocks_per_seg || cp_blocks <= F2FS_CP_PACKS) {
 		f2fs_warn(sbi, "invalid cp_pack_total_block_count:%u",
 			  le32_to_cpu(cp_block->cp_pack_total_block_count));
 		goto invalid_cp;
 	}
 	pre_version = *version;
 
-	cp_addr += le32_to_cpu(cp_block->cp_pack_total_block_count) - 1;
+	cp_addr += cp_blocks - 1;
 	err = get_checkpoint_version(sbi, cp_addr, &cp_block,
 				&cp_page_2, version);
 	if (err)
@@ -1159,7 +1172,7 @@ static bool __need_flush_quota(struct f2fs_sb_info *sbi)
 	if (!is_journalled_quota(sbi))
 		return false;
 
-	if (!down_write_trylock(&sbi->quota_sem))
+	if (!f2fs_down_write_trylock(&sbi->quota_sem))
 		return true;
 	if (is_sbi_flag_set(sbi, SBI_QUOTA_SKIP_FLUSH)) {
 		ret = false;
@@ -1171,7 +1184,7 @@ static bool __need_flush_quota(struct f2fs_sb_info *sbi)
 	} else if (get_pages(sbi, F2FS_DIRTY_QDATA)) {
 		ret = true;
 	}
-	up_write(&sbi->quota_sem);
+	f2fs_up_write(&sbi->quota_sem);
 	return ret;
 }
 
@@ -1228,10 +1241,10 @@ retry_flush_dents:
 	 * POR: we should ensure that there are no dirty node pages
 	 * until finishing nat/sit flush. inode->i_blocks can be updated.
 	 */
-	down_write(&sbi->node_change);
+	f2fs_down_write(&sbi->node_change);
 
 	if (get_pages(sbi, F2FS_DIRTY_IMETA)) {
-		up_write(&sbi->node_change);
+		f2fs_up_write(&sbi->node_change);
 		f2fs_unlock_all(sbi);
 		err = f2fs_sync_inode_meta(sbi);
 		if (err)
@@ -1241,15 +1254,15 @@ retry_flush_dents:
 	}
 
 retry_flush_nodes:
-	down_write(&sbi->node_write);
+	f2fs_down_write(&sbi->node_write);
 
 	if (get_pages(sbi, F2FS_DIRTY_NODES)) {
-		up_write(&sbi->node_write);
+		f2fs_up_write(&sbi->node_write);
 		atomic_inc(&sbi->wb_sync_req[NODE]);
 		err = f2fs_sync_node_pages(sbi, &wbc, false, FS_CP_NODE_IO);
 		atomic_dec(&sbi->wb_sync_req[NODE]);
 		if (err) {
-			up_write(&sbi->node_change);
+			f2fs_up_write(&sbi->node_change);
 			f2fs_unlock_all(sbi);
 			return err;
 		}
@@ -1262,13 +1275,13 @@ retry_flush_nodes:
 	 * dirty node blocks and some checkpoint values by block allocation.
 	 */
 	__prepare_cp_block(sbi);
-	up_write(&sbi->node_change);
+	f2fs_up_write(&sbi->node_change);
 	return err;
 }
 
 static void unblock_operations(struct f2fs_sb_info *sbi)
 {
-	up_write(&sbi->node_write);
+	f2fs_up_write(&sbi->node_write);
 	f2fs_unlock_all(sbi);
 }
 
@@ -1543,6 +1556,7 @@ static int do_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 	/* update user_block_counts */
 	sbi->last_valid_block_count = sbi->total_valid_block_count;
 	percpu_counter_set(&sbi->alloc_valid_block_count, 0);
+	percpu_counter_set(&sbi->rf_node_block_count, 0);
 
 	/* Here, we have one bio having CP pack except cp pack 2 page */
 	f2fs_sync_meta_pages(sbi, META, LONG_MAX, FS_CP_META_IO);
@@ -1612,7 +1626,7 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 		f2fs_warn(sbi, "Start checkpoint disabled!");
 	}
 	if (cpc->reason != CP_RESIZE)
-		down_write(&sbi->cp_global_sem);
+		f2fs_down_write(&sbi->cp_global_sem);
 
 	if (!is_sbi_flag_set(sbi, SBI_IS_DIRTY) &&
 		((cpc->reason & CP_FASTBOOT) || (cpc->reason & CP_SYNC) ||
@@ -1693,7 +1707,7 @@ stop:
 	trace_f2fs_write_checkpoint(sbi->sb, cpc->reason, "finish checkpoint");
 out:
 	if (cpc->reason != CP_RESIZE)
-		up_write(&sbi->cp_global_sem);
+		f2fs_up_write(&sbi->cp_global_sem);
 	return err;
 }
 
@@ -1741,9 +1755,9 @@ static int __write_checkpoint_sync(struct f2fs_sb_info *sbi)
 	struct cp_control cpc = { .reason = CP_SYNC, };
 	int err;
 
-	down_write(&sbi->gc_lock);
+	f2fs_down_write(&sbi->gc_lock);
 	err = f2fs_write_checkpoint(sbi, &cpc);
-	up_write(&sbi->gc_lock);
+	f2fs_up_write(&sbi->gc_lock);
 
 	return err;
 }
 
@@ -1831,9 +1845,9 @@ int f2fs_issue_checkpoint(struct f2fs_sb_info *sbi)
 	if (!test_opt(sbi, MERGE_CHECKPOINT) || cpc.reason != CP_SYNC) {
 		int ret;
 
-		down_write(&sbi->gc_lock);
+		f2fs_down_write(&sbi->gc_lock);
 		ret = f2fs_write_checkpoint(sbi, &cpc);
-		up_write(&sbi->gc_lock);
+		f2fs_up_write(&sbi->gc_lock);
 
 		return ret;
 	}

fs/f2fs/compress.c

@@ -314,10 +314,9 @@ static int lz4_decompress_pages(struct decompress_io_ctx *dic)
 	}
 
 	if (ret != PAGE_SIZE << dic->log_cluster_size) {
-		printk_ratelimited("%sF2FS-fs (%s): lz4 invalid rlen:%zu, "
+		printk_ratelimited("%sF2FS-fs (%s): lz4 invalid ret:%d, "
 					"expected:%lu\n", KERN_ERR,
-					F2FS_I_SB(dic->inode)->sb->s_id,
-					dic->rlen,
+					F2FS_I_SB(dic->inode)->sb->s_id, ret,
 					PAGE_SIZE << dic->log_cluster_size);
 		return -EIO;
 	}
@@ -1267,7 +1266,7 @@ static int f2fs_write_compressed_pages(struct compress_ctx *cc,
 		 * checkpoint. This can only happen to quota writes which can cause
 		 * the below discard race condition.
 		 */
-		down_read(&sbi->node_write);
+		f2fs_down_read(&sbi->node_write);
 	} else if (!f2fs_trylock_op(sbi)) {
 		goto out_free;
 	}
@@ -1384,7 +1383,7 @@ unlock_continue:
 	f2fs_put_dnode(&dn);
 
 	if (IS_NOQUOTA(inode))
-		up_read(&sbi->node_write);
+		f2fs_up_read(&sbi->node_write);
 	else
 		f2fs_unlock_op(sbi);
 
@@ -1410,7 +1409,7 @@ out_put_dnode:
 	f2fs_put_dnode(&dn);
 out_unlock_op:
 	if (IS_NOQUOTA(inode))
-		up_read(&sbi->node_write);
+		f2fs_up_read(&sbi->node_write);
 	else
 		f2fs_unlock_op(sbi);
 out_free:

fs/f2fs/data.c

@@ -584,7 +584,7 @@ static void __f2fs_submit_merged_write(struct f2fs_sb_info *sbi,
 	enum page_type btype = PAGE_TYPE_OF_BIO(type);
 	struct f2fs_bio_info *io = sbi->write_io[btype] + temp;
 
-	down_write(&io->io_rwsem);
+	f2fs_down_write(&io->io_rwsem);
 
 	/* change META to META_FLUSH in the checkpoint procedure */
 	if (type >= META_FLUSH) {
@@ -594,7 +594,7 @@ static void __f2fs_submit_merged_write(struct f2fs_sb_info *sbi,
 			io->bio->bi_opf |= REQ_PREFLUSH | REQ_FUA;
 	}
 	__submit_merged_bio(io);
-	up_write(&io->io_rwsem);
+	f2fs_up_write(&io->io_rwsem);
 }
 
 static void __submit_merged_write_cond(struct f2fs_sb_info *sbi,
@@ -609,9 +609,9 @@ static void __submit_merged_write_cond(struct f2fs_sb_info *sbi,
 		enum page_type btype = PAGE_TYPE_OF_BIO(type);
 		struct f2fs_bio_info *io = sbi->write_io[btype] + temp;
 
-		down_read(&io->io_rwsem);
+		f2fs_down_read(&io->io_rwsem);
 		ret = __has_merged_page(io->bio, inode, page, ino);
-		up_read(&io->io_rwsem);
+		f2fs_up_read(&io->io_rwsem);
 	}
 	if (ret)
 		__f2fs_submit_merged_write(sbi, type, temp);
@@ -732,9 +732,9 @@ static void add_bio_entry(struct f2fs_sb_info *sbi, struct bio *bio,
 	if (bio_add_page(bio, page, PAGE_SIZE, 0) != PAGE_SIZE)
 		f2fs_bug_on(sbi, 1);
 
-	down_write(&io->bio_list_lock);
+	f2fs_down_write(&io->bio_list_lock);
 	list_add_tail(&be->list, &io->bio_list);
-	up_write(&io->bio_list_lock);
+	f2fs_up_write(&io->bio_list_lock);
 }
 
 static void del_bio_entry(struct bio_entry *be)
@@ -756,7 +756,7 @@ static int add_ipu_page(struct f2fs_io_info *fio, struct bio **bio,
 		struct list_head *head = &io->bio_list;
 		struct bio_entry *be;
 
-		down_write(&io->bio_list_lock);
+		f2fs_down_write(&io->bio_list_lock);
 		list_for_each_entry(be, head, list) {
 			if (be->bio != *bio)
 				continue;
@@ -780,7 +780,7 @@ static int add_ipu_page(struct f2fs_io_info *fio, struct bio **bio,
 			__submit_bio(sbi, *bio, DATA);
 			break;
 		}
-		up_write(&io->bio_list_lock);
+		f2fs_up_write(&io->bio_list_lock);
 	}
 
 	if (ret) {
@@ -806,7 +806,7 @@ void f2fs_submit_merged_ipu_write(struct f2fs_sb_info *sbi,
 		if (list_empty(head))
 			continue;
 
-		down_read(&io->bio_list_lock);
+		f2fs_down_read(&io->bio_list_lock);
 		list_for_each_entry(be, head, list) {
 			if (target)
 				found = (target == be->bio);
@@ -816,14 +816,14 @@ void f2fs_submit_merged_ipu_write(struct f2fs_sb_info *sbi,
 			if (found)
 				break;
 		}
-		up_read(&io->bio_list_lock);
+		f2fs_up_read(&io->bio_list_lock);
 
 		if (!found)
 			continue;
 
 		found = false;
 
-		down_write(&io->bio_list_lock);
+		f2fs_down_write(&io->bio_list_lock);
 		list_for_each_entry(be, head, list) {
 			if (target)
 				found = (target == be->bio);
@@ -836,7 +836,7 @@ void f2fs_submit_merged_ipu_write(struct f2fs_sb_info *sbi,
 				break;
 			}
 		}
-		up_write(&io->bio_list_lock);
+		f2fs_up_write(&io->bio_list_lock);
 	}
 
 	if (found)
@@ -894,7 +894,7 @@ void f2fs_submit_page_write(struct f2fs_io_info *fio)
 
 	f2fs_bug_on(sbi, is_read_io(fio->op));
 
-	down_write(&io->io_rwsem);
+	f2fs_down_write(&io->io_rwsem);
 next:
 	if (fio->in_list) {
 		spin_lock(&io->io_lock);
@@ -961,7 +961,7 @@ out:
 	if (is_sbi_flag_set(sbi, SBI_IS_SHUTDOWN) ||
 				!f2fs_is_checkpoint_ready(sbi))
 		__submit_merged_bio(io);
-	up_write(&io->io_rwsem);
+	f2fs_up_write(&io->io_rwsem);
 }
 
 static struct bio *f2fs_grab_read_bio(struct inode *inode, block_t blkaddr,
@@ -1371,9 +1371,9 @@ void f2fs_do_map_lock(struct f2fs_sb_info *sbi, int flag, bool lock)
 {
 	if (flag == F2FS_GET_BLOCK_PRE_AIO) {
 		if (lock)
-			down_read(&sbi->node_change);
+			f2fs_down_read(&sbi->node_change);
 		else
-			up_read(&sbi->node_change);
+			f2fs_up_read(&sbi->node_change);
 	} else {
 		if (lock)
 			f2fs_lock_op(sbi);
@@ -2448,6 +2448,9 @@ static inline bool check_inplace_update_policy(struct inode *inode,
 	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
 	unsigned int policy = SM_I(sbi)->ipu_policy;
 
+	if (policy & (0x1 << F2FS_IPU_HONOR_OPU_WRITE) &&
+			is_inode_flag_set(inode, FI_OPU_WRITE))
+		return false;
 	if (policy & (0x1 << F2FS_IPU_FORCE))
 		return true;
 	if (policy & (0x1 << F2FS_IPU_SSR) && f2fs_need_SSR(sbi))
@@ -2518,6 +2521,9 @@ bool f2fs_should_update_outplace(struct inode *inode, struct f2fs_io_info *fio)
 	if (is_inode_flag_set(inode, FI_ALIGNED_WRITE))
 		return true;
 
+	if (is_inode_flag_set(inode, FI_OPU_WRITE))
+		return true;
+
 	if (fio) {
 		if (page_private_gcing(fio->page))
 			return true;
@@ -2737,13 +2743,13 @@ write:
 		 * the below discard race condition.
 		 */
 		if (IS_NOQUOTA(inode))
-			down_read(&sbi->node_write);
+			f2fs_down_read(&sbi->node_write);
 
 		fio.need_lock = LOCK_DONE;
 		err = f2fs_do_write_data_page(&fio);
 
 		if (IS_NOQUOTA(inode))
-			up_read(&sbi->node_write);
+			f2fs_up_read(&sbi->node_write);
 
 		goto done;
 	}
@@ -3142,8 +3148,8 @@ static int __f2fs_write_data_pages(struct address_space *mapping,
 			f2fs_available_free_memory(sbi, DIRTY_DENTS))
 		goto skip_write;
 
-	/* skip writing during file defragment */
-	if (is_inode_flag_set(inode, FI_DO_DEFRAG))
+	/* skip writing in file defragment preparing stage */
+	if (is_inode_flag_set(inode, FI_SKIP_WRITES))
 		goto skip_write;
 
 	trace_f2fs_writepages(mapping->host, wbc, DATA);
@@ -3151,8 +3157,12 @@ static int __f2fs_write_data_pages(struct address_space *mapping,
 	/* to avoid spliting IOs due to mixed WB_SYNC_ALL and WB_SYNC_NONE */
 	if (wbc->sync_mode == WB_SYNC_ALL)
 		atomic_inc(&sbi->wb_sync_req[DATA]);
-	else if (atomic_read(&sbi->wb_sync_req[DATA]))
+	else if (atomic_read(&sbi->wb_sync_req[DATA])) {
+		/* to avoid potential deadlock */
+		if (current->plug)
+			blk_finish_plug(current->plug);
 		goto skip_write;
+	}
 
 	if (__should_serialize_io(inode, wbc)) {
 		mutex_lock(&sbi->writepages);
@@ -3201,14 +3211,14 @@ void f2fs_write_failed(struct inode *inode, loff_t to)
 
 	/* In the fs-verity case, f2fs_end_enable_verity() does the truncate */
 	if (to > i_size && !f2fs_verity_in_progress(inode)) {
-		down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
+		f2fs_down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
 		filemap_invalidate_lock(inode->i_mapping);
 
 		truncate_pagecache(inode, i_size);
 		f2fs_truncate_blocks(inode, i_size, true);
 
 		filemap_invalidate_unlock(inode->i_mapping);
-		up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
+		f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
 	}
 }
 
@@ -3341,7 +3351,7 @@ static int f2fs_write_begin(struct file *file, struct address_space *mapping,
 
 		*fsdata = NULL;
 
-		if (len == PAGE_SIZE)
+		if (len == PAGE_SIZE && !(f2fs_is_atomic_file(inode)))
 			goto repeat;
 
 		ret = f2fs_prepare_compress_overwrite(inode, pagep,
@@ -3709,19 +3719,20 @@ static int f2fs_migrate_blocks(struct inode *inode, block_t start_blk,
 	unsigned int end_sec = secidx + blkcnt / blk_per_sec;
 	int ret = 0;
 
-	down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
+	f2fs_down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
 	filemap_invalidate_lock(inode->i_mapping);
 
 	set_inode_flag(inode, FI_ALIGNED_WRITE);
+	set_inode_flag(inode, FI_OPU_WRITE);
 
 	for (; secidx < end_sec; secidx++) {
-		down_write(&sbi->pin_sem);
+		f2fs_down_write(&sbi->pin_sem);
 
 		f2fs_lock_op(sbi);
 		f2fs_allocate_new_section(sbi, CURSEG_COLD_DATA_PINNED, false);
 		f2fs_unlock_op(sbi);
 
-		set_inode_flag(inode, FI_DO_DEFRAG);
+		set_inode_flag(inode, FI_SKIP_WRITES);
 
 		for (blkofs = 0; blkofs < blk_per_sec; blkofs++) {
 			struct page *page;
@@ -3729,7 +3740,7 @@ static int f2fs_migrate_blocks(struct inode *inode, block_t start_blk,
 			page = f2fs_get_lock_data_page(inode, blkidx, true);
 			if (IS_ERR(page)) {
-				up_write(&sbi->pin_sem);
+				f2fs_up_write(&sbi->pin_sem);
 				ret = PTR_ERR(page);
 				goto done;
 			}
@@ -3738,22 +3749,23 @@ static int f2fs_migrate_blocks(struct inode *inode, block_t start_blk,
 			f2fs_put_page(page, 1);
 		}
 
-		clear_inode_flag(inode, FI_DO_DEFRAG);
+		clear_inode_flag(inode, FI_SKIP_WRITES);
 
 		ret = filemap_fdatawrite(inode->i_mapping);
 
-		up_write(&sbi->pin_sem);
+		f2fs_up_write(&sbi->pin_sem);
 
 		if (ret)
 			break;
 	}
 
 done:
-	clear_inode_flag(inode, FI_DO_DEFRAG);
+	clear_inode_flag(inode, FI_SKIP_WRITES);
clear_inode_flag(inode, FI_OPU_WRITE);
clear_inode_flag(inode, FI_ALIGNED_WRITE); clear_inode_flag(inode, FI_ALIGNED_WRITE);
filemap_invalidate_unlock(inode->i_mapping); filemap_invalidate_unlock(inode->i_mapping);
up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]); f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
return ret; return ret;
} }

diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c

@@ -21,7 +21,7 @@
 #include "gc.h"
 
 static LIST_HEAD(f2fs_stat_list);
-static DEFINE_MUTEX(f2fs_stat_mutex);
+static DEFINE_RAW_SPINLOCK(f2fs_stat_lock);
 #ifdef CONFIG_DEBUG_FS
 static struct dentry *f2fs_debugfs_root;
 #endif
@@ -338,14 +338,16 @@ static char *s_flag[] = {
 	[SBI_QUOTA_SKIP_FLUSH] = " quota_skip_flush",
 	[SBI_QUOTA_NEED_REPAIR] = " quota_need_repair",
 	[SBI_IS_RESIZEFS] = " resizefs",
+	[SBI_IS_FREEZING] = " freezefs",
 };
 
 static int stat_show(struct seq_file *s, void *v)
 {
 	struct f2fs_stat_info *si;
 	int i = 0, j = 0;
+	unsigned long flags;
 
-	mutex_lock(&f2fs_stat_mutex);
+	raw_spin_lock_irqsave(&f2fs_stat_lock, flags);
 	list_for_each_entry(si, &f2fs_stat_list, stat_list) {
 		update_general_status(si->sbi);
@@ -474,12 +476,14 @@ static int stat_show(struct seq_file *s, void *v)
 			   si->node_segs, si->bg_node_segs);
 		seq_printf(s, "  - Reclaimed segs : Normal (%d), Idle CB (%d), "
 				"Idle Greedy (%d), Idle AT (%d), "
-				"Urgent High (%d), Urgent Low (%d)\n",
+				"Urgent High (%d), Urgent Mid (%d), "
+				"Urgent Low (%d)\n",
 			   si->sbi->gc_reclaimed_segs[GC_NORMAL],
 			   si->sbi->gc_reclaimed_segs[GC_IDLE_CB],
 			   si->sbi->gc_reclaimed_segs[GC_IDLE_GREEDY],
 			   si->sbi->gc_reclaimed_segs[GC_IDLE_AT],
 			   si->sbi->gc_reclaimed_segs[GC_URGENT_HIGH],
+			   si->sbi->gc_reclaimed_segs[GC_URGENT_MID],
 			   si->sbi->gc_reclaimed_segs[GC_URGENT_LOW]);
 		seq_printf(s, "Try to move %d blocks (BG: %d)\n", si->tot_blks,
 				si->bg_data_blks + si->bg_node_blks);
@@ -532,6 +536,9 @@ static int stat_show(struct seq_file *s, void *v)
 			   si->ndirty_meta, si->meta_pages);
 		seq_printf(s, "  - imeta: %4d\n",
 			   si->ndirty_imeta);
+		seq_printf(s, "  - fsync mark: %4lld\n",
+			   percpu_counter_sum_positive(
+					&si->sbi->rf_node_block_count));
 		seq_printf(s, "  - NATs: %9d/%9d\n  - SITs: %9d/%9d\n",
 			   si->dirty_nats, si->nats, si->dirty_sits, si->sits);
 		seq_printf(s, "  - free_nids: %9d/%9d\n  - alloc_nids: %9d\n",
@@ -573,7 +580,7 @@ static int stat_show(struct seq_file *s, void *v)
 			seq_printf(s, "  - paged : %llu KB\n",
 				   si->page_mem >> 10);
 	}
-	mutex_unlock(&f2fs_stat_mutex);
+	raw_spin_unlock_irqrestore(&f2fs_stat_lock, flags);
 	return 0;
 }
@@ -584,6 +591,7 @@ int f2fs_build_stats(struct f2fs_sb_info *sbi)
 {
 	struct f2fs_super_block *raw_super = F2FS_RAW_SUPER(sbi);
 	struct f2fs_stat_info *si;
+	unsigned long flags;
 	int i;
 
 	si = f2fs_kzalloc(sbi, sizeof(struct f2fs_stat_info), GFP_KERNEL);
@@ -619,9 +627,9 @@ int f2fs_build_stats(struct f2fs_sb_info *sbi)
 	atomic_set(&sbi->max_aw_cnt, 0);
 	atomic_set(&sbi->max_vw_cnt, 0);
 
-	mutex_lock(&f2fs_stat_mutex);
+	raw_spin_lock_irqsave(&f2fs_stat_lock, flags);
 	list_add_tail(&si->stat_list, &f2fs_stat_list);
-	mutex_unlock(&f2fs_stat_mutex);
+	raw_spin_unlock_irqrestore(&f2fs_stat_lock, flags);
 
 	return 0;
 }
@@ -629,10 +637,11 @@ int f2fs_build_stats(struct f2fs_sb_info *sbi)
 void f2fs_destroy_stats(struct f2fs_sb_info *sbi)
 {
 	struct f2fs_stat_info *si = F2FS_STAT(sbi);
+	unsigned long flags;
 
-	mutex_lock(&f2fs_stat_mutex);
+	raw_spin_lock_irqsave(&f2fs_stat_lock, flags);
 	list_del(&si->stat_list);
-	mutex_unlock(&f2fs_stat_mutex);
+	raw_spin_unlock_irqrestore(&f2fs_stat_lock, flags);
 
 	kfree(si);
 }

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c

@@ -766,7 +766,7 @@ add_dentry:
 	f2fs_wait_on_page_writeback(dentry_page, DATA, true, true);
 
 	if (inode) {
-		down_write(&F2FS_I(inode)->i_sem);
+		f2fs_down_write(&F2FS_I(inode)->i_sem);
 		page = f2fs_init_inode_metadata(inode, dir, fname, NULL);
 		if (IS_ERR(page)) {
 			err = PTR_ERR(page);
@@ -793,7 +793,7 @@ add_dentry:
 	f2fs_update_parent_metadata(dir, inode, current_depth);
 fail:
 	if (inode)
-		up_write(&F2FS_I(inode)->i_sem);
+		f2fs_up_write(&F2FS_I(inode)->i_sem);
 
 	f2fs_put_page(dentry_page, 1);
@@ -858,7 +858,7 @@ int f2fs_do_tmpfile(struct inode *inode, struct inode *dir)
 	struct page *page;
 	int err = 0;
 
-	down_write(&F2FS_I(inode)->i_sem);
+	f2fs_down_write(&F2FS_I(inode)->i_sem);
 	page = f2fs_init_inode_metadata(inode, dir, NULL, NULL);
 	if (IS_ERR(page)) {
 		err = PTR_ERR(page);
@@ -869,7 +869,7 @@ int f2fs_do_tmpfile(struct inode *inode, struct inode *dir)
 	clear_inode_flag(inode, FI_NEW_INODE);
 
 	f2fs_update_time(F2FS_I_SB(inode), REQ_TIME);
 fail:
-	up_write(&F2FS_I(inode)->i_sem);
+	f2fs_up_write(&F2FS_I(inode)->i_sem);
 	return err;
 }
@@ -877,7 +877,7 @@ void f2fs_drop_nlink(struct inode *dir, struct inode *inode)
 {
 	struct f2fs_sb_info *sbi = F2FS_I_SB(dir);
 
-	down_write(&F2FS_I(inode)->i_sem);
+	f2fs_down_write(&F2FS_I(inode)->i_sem);
 
 	if (S_ISDIR(inode->i_mode))
 		f2fs_i_links_write(dir, false);
@@ -888,7 +888,7 @@ void f2fs_drop_nlink(struct inode *dir, struct inode *inode)
 		f2fs_i_links_write(inode, false);
 		f2fs_i_size_write(inode, 0);
 	}
-	up_write(&F2FS_I(inode)->i_sem);
+	f2fs_up_write(&F2FS_I(inode)->i_sem);
 
 	if (inode->i_nlink == 0)
 		f2fs_add_orphan_inode(inode);

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h

@@ -123,6 +123,20 @@ typedef u32 nid_t;
 #define COMPRESS_EXT_NUM		16
 
+/*
+ * An implementation of an rwsem that is explicitly unfair to readers. This
+ * prevents priority inversion when a low-priority reader acquires the read lock
+ * while sleeping on the write lock but the write lock is needed by
+ * higher-priority clients.
+ */
+struct f2fs_rwsem {
+	struct rw_semaphore internal_rwsem;
+#ifdef CONFIG_F2FS_UNFAIR_RWSEM
+	wait_queue_head_t read_waiters;
+#endif
+};
+
 struct f2fs_mount_info {
 	unsigned int opt;
 	int write_io_size_bits;		/* Write IO size bits */
@@ -386,6 +400,10 @@ struct discard_cmd_control {
 	struct mutex cmd_lock;
 	unsigned int nr_discards;		/* # of discards in the list */
 	unsigned int max_discards;		/* max. discards to be issued */
+	unsigned int max_discard_request;	/* max. discard request per round */
+	unsigned int min_discard_issue_time;	/* min. interval between discard issue */
+	unsigned int mid_discard_issue_time;	/* mid. interval between discard issue */
+	unsigned int max_discard_issue_time;	/* max. interval between discard issue */
 	unsigned int discard_granularity;	/* discard granularity */
 	unsigned int undiscard_blks;		/* # of undiscard blocks */
 	unsigned int next_pos;			/* next discard position */
@@ -561,6 +579,9 @@ enum {
 /* maximum retry quota flush count */
 #define DEFAULT_RETRY_QUOTA_FLUSH_COUNT		8
 
+/* maximum retry of EIO'ed meta page */
+#define MAX_RETRY_META_PAGE_EIO			100
+
 #define F2FS_LINK_MAX	0xffffffff	/* maximum link count per file */
 
 #define MAX_DIR_RA_PAGES	4	/* maximum ra pages of dir */
@@ -574,6 +595,9 @@ enum {
 /* number of extent info in extent cache we try to shrink */
 #define EXTENT_CACHE_SHRINK_NUMBER	128
 
+#define RECOVERY_MAX_RA_BLOCKS		BIO_MAX_VECS
+#define RECOVERY_MIN_RA_BLOCKS		1
+
 struct rb_entry {
 	struct rb_node rb_node;		/* rb node located in rb-tree */
 	union {
@@ -721,7 +745,8 @@ enum {
 	FI_DROP_CACHE,		/* drop dirty page cache */
 	FI_DATA_EXIST,		/* indicate data exists */
 	FI_INLINE_DOTS,		/* indicate inline dot dentries */
-	FI_DO_DEFRAG,		/* indicate defragment is running */
+	FI_SKIP_WRITES,		/* should skip data page writeback */
+	FI_OPU_WRITE,		/* used for opu per file */
 	FI_DIRTY_FILE,		/* indicate regular/symlink has dirty pages */
 	FI_PREALLOCATED_ALL,	/* all blocks for write were preallocated */
 	FI_HOT_DATA,		/* indicate file is hot */
@@ -752,7 +777,7 @@ struct f2fs_inode_info {
 	/* Use below internally in f2fs*/
 	unsigned long flags[BITS_TO_LONGS(FI_MAX)];	/* use to pass per-file flags */
-	struct rw_semaphore i_sem;	/* protect fi info */
+	struct f2fs_rwsem i_sem;	/* protect fi info */
 	atomic_t dirty_pages;		/* # of dirty pages */
 	f2fs_hash_t chash;		/* hash value of given file name */
 	unsigned int clevel;		/* maximum level of given file name */
@@ -777,8 +802,8 @@ struct f2fs_inode_info {
 	struct extent_tree *extent_tree;	/* cached extent_tree entry */
 
 	/* avoid racing between foreground op and gc */
-	struct rw_semaphore i_gc_rwsem[2];
-	struct rw_semaphore i_xattr_sem; /* avoid racing between reading and changing EAs */
+	struct f2fs_rwsem i_gc_rwsem[2];
+	struct f2fs_rwsem i_xattr_sem; /* avoid racing between reading and changing EAs */
 
 	int i_extra_isize;		/* size of extra space located in i_addr */
 	kprojid_t i_projid;		/* id for project quota */
@@ -897,6 +922,7 @@ struct f2fs_nm_info {
 	nid_t max_nid;			/* maximum possible node ids */
 	nid_t available_nids;		/* # of available node ids */
 	nid_t next_scan_nid;		/* the next nid to be scanned */
+	nid_t max_rf_node_blocks;	/* max # of nodes for recovery */
 	unsigned int ram_thresh;	/* control the memory footprint */
 	unsigned int ra_nid_pages;	/* # of nid pages to be readaheaded */
 	unsigned int dirty_nats_ratio;	/* control dirty nats ratio threshold */
@@ -904,7 +930,7 @@ struct f2fs_nm_info {
 	/* NAT cache management */
 	struct radix_tree_root nat_root;/* root of the nat entry cache */
 	struct radix_tree_root nat_set_root;/* root of the nat set cache */
-	struct rw_semaphore nat_tree_lock;	/* protect nat entry tree */
+	struct f2fs_rwsem nat_tree_lock;	/* protect nat entry tree */
 	struct list_head nat_entries;	/* cached nat entry list (clean) */
 	spinlock_t nat_list_lock;	/* protect clean nat entry list */
 	unsigned int nat_cnt[MAX_NAT_STATE]; /* the # of cached nat entries */
@@ -1017,7 +1043,7 @@ struct f2fs_sm_info {
 	struct dirty_seglist_info *dirty_info;	/* dirty segment information */
 	struct curseg_info *curseg_array;	/* active segment information */
 
-	struct rw_semaphore curseg_lock;	/* for preventing curseg change */
+	struct f2fs_rwsem curseg_lock;	/* for preventing curseg change */
 
 	block_t seg0_blkaddr;		/* block address of 0'th segment */
 	block_t main_blkaddr;		/* start block address of main area */
@@ -1201,11 +1227,11 @@ struct f2fs_bio_info {
 	struct bio *bio;		/* bios to merge */
 	sector_t last_block_in_bio;	/* last block number */
 	struct f2fs_io_info fio;	/* store buffered io info. */
-	struct rw_semaphore io_rwsem;	/* blocking op for bio */
+	struct f2fs_rwsem io_rwsem;	/* blocking op for bio */
 	spinlock_t io_lock;		/* serialize DATA/NODE IOs */
 	struct list_head io_list;	/* track fios */
 	struct list_head bio_list;	/* bio entry list head */
-	struct rw_semaphore bio_list_lock;	/* lock to protect bio entry list */
+	struct f2fs_rwsem bio_list_lock;	/* lock to protect bio entry list */
 };
 
 #define FDEV(i)				(sbi->devs[i])
@@ -1267,6 +1293,7 @@ enum {
 	SBI_QUOTA_SKIP_FLUSH,	/* skip flushing quota in current CP */
 	SBI_QUOTA_NEED_REPAIR,	/* quota file may be corrupted */
 	SBI_IS_RESIZEFS,	/* resizefs is in process */
+	SBI_IS_FREEZING,	/* freezefs is in process */
 };
 
 enum {
@@ -1286,6 +1313,7 @@ enum {
 	GC_IDLE_AT,
 	GC_URGENT_HIGH,
 	GC_URGENT_LOW,
+	GC_URGENT_MID,
 	MAX_GC_MODE,
 };
 
@@ -1571,7 +1599,7 @@ struct f2fs_sb_info {
 	struct super_block *sb;			/* pointer to VFS super block */
 	struct proc_dir_entry *s_proc;		/* proc entry */
 	struct f2fs_super_block *raw_super;	/* raw super block pointer */
-	struct rw_semaphore sb_lock;		/* lock for raw super block */
+	struct f2fs_rwsem sb_lock;		/* lock for raw super block */
 	int valid_super_block;			/* valid super block no */
 	unsigned long s_flag;			/* flags for sbi */
 	struct mutex writepages;		/* mutex for writepages() */
@@ -1591,18 +1619,20 @@ struct f2fs_sb_info {
 	/* for bio operations */
 	struct f2fs_bio_info *write_io[NR_PAGE_TYPE];	/* for write bios */
 	/* keep migration IO order for LFS mode */
-	struct rw_semaphore io_order_lock;
+	struct f2fs_rwsem io_order_lock;
 	mempool_t *write_io_dummy;		/* Dummy pages */
+	pgoff_t metapage_eio_ofs;		/* EIO page offset */
+	int metapage_eio_cnt;			/* EIO count */
 
 	/* for checkpoint */
 	struct f2fs_checkpoint *ckpt;		/* raw checkpoint pointer */
 	int cur_cp_pack;			/* remain current cp pack */
 	spinlock_t cp_lock;			/* for flag in ckpt */
 	struct inode *meta_inode;		/* cache meta blocks */
-	struct rw_semaphore cp_global_sem;	/* checkpoint procedure lock */
-	struct rw_semaphore cp_rwsem;		/* blocking FS operations */
-	struct rw_semaphore node_write;		/* locking node writes */
-	struct rw_semaphore node_change;	/* locking node change */
+	struct f2fs_rwsem cp_global_sem;	/* checkpoint procedure lock */
+	struct f2fs_rwsem cp_rwsem;		/* blocking FS operations */
+	struct f2fs_rwsem node_write;		/* locking node writes */
+	struct f2fs_rwsem node_change;		/* locking node change */
 	wait_queue_head_t cp_wait;
 	unsigned long last_time[MAX_TIME];	/* to store time in jiffies */
 	long interval_time[MAX_TIME];		/* to store thresholds */
@@ -1662,12 +1692,14 @@ struct f2fs_sb_info {
 	block_t unusable_block_count;		/* # of blocks saved by last cp */
 
 	unsigned int nquota_files;		/* # of quota sysfile */
-	struct rw_semaphore quota_sem;		/* blocking cp for flags */
+	struct f2fs_rwsem quota_sem;		/* blocking cp for flags */
 
 	/* # of pages, see count_type */
 	atomic_t nr_pages[NR_COUNT_TYPE];
 	/* # of allocated blocks */
 	struct percpu_counter alloc_valid_block_count;
+	/* # of node block writes as roll forward recovery */
+	struct percpu_counter rf_node_block_count;
 
 	/* writeback control */
 	atomic_t wb_sync_req[META];	/* count # of WB_SYNC threads */
@@ -1678,7 +1710,7 @@ struct f2fs_sb_info {
 	struct f2fs_mount_info mount_opt;	/* mount options */
 
 	/* for cleaning operations */
-	struct rw_semaphore gc_lock;		/*
+	struct f2fs_rwsem gc_lock;		/*
 						 * semaphore for GC, avoid
 						 * race between GC and GC or CP
 						 */
@@ -1698,7 +1730,7 @@ struct f2fs_sb_info {
 	/* threshold for gc trials on pinned files */
 	u64 gc_pin_file_threshold;
-	struct rw_semaphore pin_sem;
+	struct f2fs_rwsem pin_sem;
 
 	/* maximum # of trials to find a victim segment for SSR and GC */
 	unsigned int max_victim_search;
@@ -2092,9 +2124,81 @@ static inline void clear_ckpt_flags(struct f2fs_sb_info *sbi, unsigned int f)
 	spin_unlock_irqrestore(&sbi->cp_lock, flags);
 }
 
+#define init_f2fs_rwsem(sem)					\
+do {								\
+	static struct lock_class_key __key;			\
+								\
+	__init_f2fs_rwsem((sem), #sem, &__key);			\
+} while (0)
+
+static inline void __init_f2fs_rwsem(struct f2fs_rwsem *sem,
+		const char *sem_name, struct lock_class_key *key)
+{
+	__init_rwsem(&sem->internal_rwsem, sem_name, key);
+#ifdef CONFIG_F2FS_UNFAIR_RWSEM
+	init_waitqueue_head(&sem->read_waiters);
+#endif
+}
+
+static inline int f2fs_rwsem_is_locked(struct f2fs_rwsem *sem)
+{
+	return rwsem_is_locked(&sem->internal_rwsem);
+}
+
+static inline int f2fs_rwsem_is_contended(struct f2fs_rwsem *sem)
+{
+	return rwsem_is_contended(&sem->internal_rwsem);
+}
+
+static inline void f2fs_down_read(struct f2fs_rwsem *sem)
+{
+#ifdef CONFIG_F2FS_UNFAIR_RWSEM
+	wait_event(sem->read_waiters, down_read_trylock(&sem->internal_rwsem));
+#else
+	down_read(&sem->internal_rwsem);
+#endif
+}
+
+static inline int f2fs_down_read_trylock(struct f2fs_rwsem *sem)
+{
+	return down_read_trylock(&sem->internal_rwsem);
+}
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+static inline void f2fs_down_read_nested(struct f2fs_rwsem *sem, int subclass)
+{
+	down_read_nested(&sem->internal_rwsem, subclass);
+}
+#else
+#define f2fs_down_read_nested(sem, subclass) f2fs_down_read(sem)
+#endif
+
+static inline void f2fs_up_read(struct f2fs_rwsem *sem)
+{
+	up_read(&sem->internal_rwsem);
+}
+
+static inline void f2fs_down_write(struct f2fs_rwsem *sem)
+{
+	down_write(&sem->internal_rwsem);
+}
+
+static inline int f2fs_down_write_trylock(struct f2fs_rwsem *sem)
+{
+	return down_write_trylock(&sem->internal_rwsem);
+}
+
+static inline void f2fs_up_write(struct f2fs_rwsem *sem)
+{
+	up_write(&sem->internal_rwsem);
+#ifdef CONFIG_F2FS_UNFAIR_RWSEM
+	wake_up_all(&sem->read_waiters);
+#endif
+}
+
 static inline void f2fs_lock_op(struct f2fs_sb_info *sbi)
 {
-	down_read(&sbi->cp_rwsem);
+	f2fs_down_read(&sbi->cp_rwsem);
 }
 
 static inline int f2fs_trylock_op(struct f2fs_sb_info *sbi)
@@ -2103,22 +2207,22 @@ static inline int f2fs_trylock_op(struct f2fs_sb_info *sbi)
 		f2fs_show_injection_info(sbi, FAULT_LOCK_OP);
 		return 0;
 	}
-	return down_read_trylock(&sbi->cp_rwsem);
+	return f2fs_down_read_trylock(&sbi->cp_rwsem);
 }
 
 static inline void f2fs_unlock_op(struct f2fs_sb_info *sbi)
 {
-	up_read(&sbi->cp_rwsem);
+	f2fs_up_read(&sbi->cp_rwsem);
 }
 
 static inline void f2fs_lock_all(struct f2fs_sb_info *sbi)
 {
-	down_write(&sbi->cp_rwsem);
+	f2fs_down_write(&sbi->cp_rwsem);
}
 
 static inline void f2fs_unlock_all(struct f2fs_sb_info *sbi)
 {
-	up_write(&sbi->cp_rwsem);
+	f2fs_up_write(&sbi->cp_rwsem);
 }
 
 static inline int __get_cp_reason(struct f2fs_sb_info *sbi)
@@ -2681,6 +2785,9 @@ static inline bool is_idle(struct f2fs_sb_info *sbi, int type)
 	if (is_inflight_io(sbi, type))
 		return false;
 
+	if (sbi->gc_mode == GC_URGENT_MID)
+		return true;
+
 	if (sbi->gc_mode == GC_URGENT_LOW &&
 			(type == DISCARD_TIME || type == GC_TIME))
 		return true;
@@ -3579,7 +3686,8 @@ bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi,
 				block_t blkaddr, int type);
 int f2fs_ra_meta_pages(struct f2fs_sb_info *sbi, block_t start, int nrpages,
 			int type, bool sync);
-void f2fs_ra_meta_pages_cond(struct f2fs_sb_info *sbi, pgoff_t index);
+void f2fs_ra_meta_pages_cond(struct f2fs_sb_info *sbi, pgoff_t index,
+				unsigned int ra_blocks);
 long f2fs_sync_meta_pages(struct f2fs_sb_info *sbi, enum page_type type,
 			long nr_to_write, enum iostat_type io_type);
 void f2fs_add_ino_entry(struct f2fs_sb_info *sbi, nid_t ino, int type);
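The reader-unfair rwsem introduced above (readers loop on `down_read_trylock()` and park on `read_waiters`; a releasing writer wakes them all) can be sketched as a user-space model. This is an illustrative sketch only: the class and attribute names are made up, and a condition variable stands in for the kernel's rwsem plus wait queue.

```python
import threading

class F2fsRwsem:
    """Toy model of f2fs's reader-unfair rwsem (names are illustrative).

    Readers defer to any pending or active writer instead of queueing
    ahead of it, which is the source of the unfairness: a low-priority
    reader can never make a high-priority writer wait behind it."""

    def __init__(self):
        self._cv = threading.Condition()
        self._readers = 0
        self._writer = False
        self._write_waiters = 0

    def down_read(self):
        with self._cv:
            # like wait_event(read_waiters, down_read_trylock(...)):
            # back off whenever a writer holds or wants the lock
            while self._writer or self._write_waiters:
                self._cv.wait()
            self._readers += 1

    def up_read(self):
        with self._cv:
            self._readers -= 1
            if self._readers == 0:
                self._cv.notify_all()    # let a waiting writer in

    def down_write(self):
        with self._cv:
            self._write_waiters += 1
            while self._writer or self._readers:
                self._cv.wait()
            self._write_waiters -= 1
            self._writer = True

    def up_write(self):
        with self._cv:
            self._writer = False
            self._cv.notify_all()        # wake_up_all(&sem->read_waiters)
```

The kernel version keeps the real `rw_semaphore` for all the hard cases and only adds the trylock-plus-waitqueue loop on the read side, which is why the patch can fall back to plain `down_read()` when `CONFIG_F2FS_UNFAIR_RWSEM` is off.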

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c

@@ -237,13 +237,13 @@ static void try_to_fix_pino(struct inode *inode)
 	struct f2fs_inode_info *fi = F2FS_I(inode);
 	nid_t pino;
 
-	down_write(&fi->i_sem);
+	f2fs_down_write(&fi->i_sem);
 	if (file_wrong_pino(inode) && inode->i_nlink == 1 &&
 			get_parent_ino(inode, &pino)) {
 		f2fs_i_pino_write(inode, pino);
 		file_got_pino(inode);
 	}
-	up_write(&fi->i_sem);
+	f2fs_up_write(&fi->i_sem);
 }
 
 static int f2fs_do_sync_file(struct file *file, loff_t start, loff_t end,
@@ -318,9 +318,9 @@ go_write:
 	 * Both of fdatasync() and fsync() are able to be recovered from
 	 * sudden-power-off.
 	 */
-	down_read(&F2FS_I(inode)->i_sem);
+	f2fs_down_read(&F2FS_I(inode)->i_sem);
 	cp_reason = need_do_checkpoint(inode);
-	up_read(&F2FS_I(inode)->i_sem);
+	f2fs_up_read(&F2FS_I(inode)->i_sem);
 
 	if (cp_reason) {
 		/* all the dirty node pages should be flushed for POR */
@@ -812,7 +812,7 @@ int f2fs_getattr(struct user_namespace *mnt_userns, const struct path *path,
 {
 	struct inode *inode = d_inode(path->dentry);
 	struct f2fs_inode_info *fi = F2FS_I(inode);
-	struct f2fs_inode *ri;
+	struct f2fs_inode *ri = NULL;
 	unsigned int flags;
 
 	if (f2fs_has_extra_attr(inode) &&
@@ -844,7 +844,7 @@ int f2fs_getattr(struct user_namespace *mnt_userns, const struct path *path,
 				  STATX_ATTR_NODUMP |
 				  STATX_ATTR_VERITY);
 
-	generic_fillattr(&init_user_ns, inode, stat);
+	generic_fillattr(mnt_userns, inode, stat);
 
 	/* we need to show initial sectors used for inline_data/dentries */
 	if ((S_ISREG(inode->i_mode) && f2fs_has_inline_data(inode)) ||
@@ -904,7 +904,7 @@ int f2fs_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
 		!f2fs_is_compress_backend_ready(inode))
 		return -EOPNOTSUPP;
 
-	err = setattr_prepare(&init_user_ns, dentry, attr);
+	err = setattr_prepare(mnt_userns, dentry, attr);
 	if (err)
 		return err;
 
@@ -958,7 +958,7 @@ int f2fs_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
 				return err;
 		}
 
-		down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
+		f2fs_down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
 		filemap_invalidate_lock(inode->i_mapping);
 
 		truncate_setsize(inode, attr->ia_size);
@@ -970,7 +970,7 @@ int f2fs_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
 		 * larger than i_size.
 		 */
 		filemap_invalidate_unlock(inode->i_mapping);
-		up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
+		f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
 		if (err)
 			return err;
 
@@ -980,10 +980,10 @@ int f2fs_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
 		spin_unlock(&F2FS_I(inode)->i_size_lock);
 	}
 
-	__setattr_copy(&init_user_ns, inode, attr);
+	__setattr_copy(mnt_userns, inode, attr);
 
 	if (attr->ia_valid & ATTR_MODE) {
-		err = posix_acl_chmod(&init_user_ns, inode, f2fs_get_inode_mode(inode));
+		err = posix_acl_chmod(mnt_userns, inode, f2fs_get_inode_mode(inode));
 
 		if (is_inode_flag_set(inode, FI_ACL_MODE)) {
 			if (!err)
@@ -1112,7 +1112,7 @@ static int punch_hole(struct inode *inode, loff_t offset, loff_t len)
 			blk_start = (loff_t)pg_start << PAGE_SHIFT;
 			blk_end = (loff_t)pg_end << PAGE_SHIFT;
 
-			down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
+			f2fs_down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
 			filemap_invalidate_lock(inode->i_mapping);
 
 			truncate_pagecache_range(inode, blk_start, blk_end - 1);
@@ -1122,7 +1122,7 @@ static int punch_hole(struct inode *inode, loff_t offset, loff_t len)
 			f2fs_unlock_op(sbi);
 
 			filemap_invalidate_unlock(inode->i_mapping);
-			up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
+			f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
 		}
 	}
 
@@ -1355,7 +1355,7 @@ static int f2fs_do_collapse(struct inode *inode, loff_t offset, loff_t len)
 	f2fs_balance_fs(sbi, true);
 
 	/* avoid gc operation during block exchange */
-	down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
+	f2fs_down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
 	filemap_invalidate_lock(inode->i_mapping);
 
 	f2fs_lock_op(sbi);
@@ -1365,7 +1365,7 @@ static int f2fs_do_collapse(struct inode *inode, loff_t offset, loff_t len)
 	f2fs_unlock_op(sbi);
 
 	filemap_invalidate_unlock(inode->i_mapping);
-	up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
+	f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
 
 	return ret;
 }
@@ -1500,7 +1500,7 @@ static int f2fs_zero_range(struct inode *inode, loff_t offset, loff_t len,
 			unsigned int end_offset;
 			pgoff_t end;
 
-			down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
+			f2fs_down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
 			filemap_invalidate_lock(mapping);
 
 			truncate_pagecache_range(inode,
@@ -1514,7 +1514,7 @@ static int f2fs_zero_range(struct inode *inode, loff_t offset, loff_t len,
 			if (ret) {
 				f2fs_unlock_op(sbi);
 				filemap_invalidate_unlock(mapping);
-				up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
+				f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
 				goto out;
 			}
 
@@ -1526,7 +1526,7 @@ static int f2fs_zero_range(struct inode *inode, loff_t offset, loff_t len,
 			f2fs_unlock_op(sbi);
 
 			filemap_invalidate_unlock(mapping);
-			up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
+			f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
 
 			f2fs_balance_fs(sbi, dn.node_changed);
 
@@ -1600,7 +1600,7 @@ static int f2fs_insert_range(struct inode *inode, loff_t offset, loff_t len)
 	idx = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
 
 	/* avoid gc operation during block exchange */
-	down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
+	f2fs_down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
filemap_invalidate_lock(mapping); filemap_invalidate_lock(mapping);
truncate_pagecache(inode, offset); truncate_pagecache(inode, offset);
@ -1618,7 +1618,7 @@ static int f2fs_insert_range(struct inode *inode, loff_t offset, loff_t len)
f2fs_unlock_op(sbi); f2fs_unlock_op(sbi);
} }
filemap_invalidate_unlock(mapping); filemap_invalidate_unlock(mapping);
up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]); f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
/* write out all moved pages, if possible */ /* write out all moved pages, if possible */
filemap_invalidate_lock(mapping); filemap_invalidate_lock(mapping);
@ -1674,13 +1674,13 @@ static int expand_inode_data(struct inode *inode, loff_t offset,
next_alloc: next_alloc:
if (has_not_enough_free_secs(sbi, 0, if (has_not_enough_free_secs(sbi, 0,
GET_SEC_FROM_SEG(sbi, overprovision_segments(sbi)))) { GET_SEC_FROM_SEG(sbi, overprovision_segments(sbi)))) {
down_write(&sbi->gc_lock); f2fs_down_write(&sbi->gc_lock);
err = f2fs_gc(sbi, true, false, false, NULL_SEGNO); err = f2fs_gc(sbi, true, false, false, NULL_SEGNO);
if (err && err != -ENODATA && err != -EAGAIN) if (err && err != -ENODATA && err != -EAGAIN)
goto out_err; goto out_err;
} }
down_write(&sbi->pin_sem); f2fs_down_write(&sbi->pin_sem);
f2fs_lock_op(sbi); f2fs_lock_op(sbi);
f2fs_allocate_new_section(sbi, CURSEG_COLD_DATA_PINNED, false); f2fs_allocate_new_section(sbi, CURSEG_COLD_DATA_PINNED, false);
@ -1690,7 +1690,7 @@ next_alloc:
err = f2fs_map_blocks(inode, &map, 1, F2FS_GET_BLOCK_PRE_DIO); err = f2fs_map_blocks(inode, &map, 1, F2FS_GET_BLOCK_PRE_DIO);
file_dont_truncate(inode); file_dont_truncate(inode);
up_write(&sbi->pin_sem); f2fs_up_write(&sbi->pin_sem);
expanded += map.m_len; expanded += map.m_len;
sec_len -= map.m_len; sec_len -= map.m_len;
@ -1989,11 +1989,12 @@ static int f2fs_ioc_getversion(struct file *filp, unsigned long arg)
static int f2fs_ioc_start_atomic_write(struct file *filp) static int f2fs_ioc_start_atomic_write(struct file *filp)
{ {
struct inode *inode = file_inode(filp); struct inode *inode = file_inode(filp);
struct user_namespace *mnt_userns = file_mnt_user_ns(filp);
struct f2fs_inode_info *fi = F2FS_I(inode); struct f2fs_inode_info *fi = F2FS_I(inode);
struct f2fs_sb_info *sbi = F2FS_I_SB(inode); struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
int ret; int ret;
if (!inode_owner_or_capable(&init_user_ns, inode)) if (!inode_owner_or_capable(mnt_userns, inode))
return -EACCES; return -EACCES;
if (!S_ISREG(inode->i_mode)) if (!S_ISREG(inode->i_mode))
@ -2008,7 +2009,10 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
inode_lock(inode); inode_lock(inode);
f2fs_disable_compressed_file(inode); if (!f2fs_disable_compressed_file(inode)) {
ret = -EINVAL;
goto out;
}
if (f2fs_is_atomic_file(inode)) { if (f2fs_is_atomic_file(inode)) {
if (is_inode_flag_set(inode, FI_ATOMIC_REVOKE_REQUEST)) if (is_inode_flag_set(inode, FI_ATOMIC_REVOKE_REQUEST))
@ -2020,7 +2024,7 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
if (ret) if (ret)
goto out; goto out;
down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]); f2fs_down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
/* /*
* Should wait end_io to count F2FS_WB_CP_DATA correctly by * Should wait end_io to count F2FS_WB_CP_DATA correctly by
@ -2031,7 +2035,7 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
inode->i_ino, get_dirty_pages(inode)); inode->i_ino, get_dirty_pages(inode));
ret = filemap_write_and_wait_range(inode->i_mapping, 0, LLONG_MAX); ret = filemap_write_and_wait_range(inode->i_mapping, 0, LLONG_MAX);
if (ret) { if (ret) {
up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]); f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
goto out; goto out;
} }
@ -2044,7 +2048,7 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
/* add inode in inmem_list first and set atomic_file */ /* add inode in inmem_list first and set atomic_file */
set_inode_flag(inode, FI_ATOMIC_FILE); set_inode_flag(inode, FI_ATOMIC_FILE);
clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST); clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]); f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
f2fs_update_time(F2FS_I_SB(inode), REQ_TIME); f2fs_update_time(F2FS_I_SB(inode), REQ_TIME);
F2FS_I(inode)->inmem_task = current; F2FS_I(inode)->inmem_task = current;
@ -2058,9 +2062,10 @@ out:
static int f2fs_ioc_commit_atomic_write(struct file *filp) static int f2fs_ioc_commit_atomic_write(struct file *filp)
{ {
struct inode *inode = file_inode(filp); struct inode *inode = file_inode(filp);
struct user_namespace *mnt_userns = file_mnt_user_ns(filp);
int ret; int ret;
if (!inode_owner_or_capable(&init_user_ns, inode)) if (!inode_owner_or_capable(mnt_userns, inode))
return -EACCES; return -EACCES;
ret = mnt_want_write_file(filp); ret = mnt_want_write_file(filp);
@ -2100,9 +2105,10 @@ err_out:
static int f2fs_ioc_start_volatile_write(struct file *filp) static int f2fs_ioc_start_volatile_write(struct file *filp)
{ {
struct inode *inode = file_inode(filp); struct inode *inode = file_inode(filp);
struct user_namespace *mnt_userns = file_mnt_user_ns(filp);
int ret; int ret;
if (!inode_owner_or_capable(&init_user_ns, inode)) if (!inode_owner_or_capable(mnt_userns, inode))
return -EACCES; return -EACCES;
if (!S_ISREG(inode->i_mode)) if (!S_ISREG(inode->i_mode))
@ -2135,9 +2141,10 @@ out:
static int f2fs_ioc_release_volatile_write(struct file *filp) static int f2fs_ioc_release_volatile_write(struct file *filp)
{ {
struct inode *inode = file_inode(filp); struct inode *inode = file_inode(filp);
struct user_namespace *mnt_userns = file_mnt_user_ns(filp);
int ret; int ret;
if (!inode_owner_or_capable(&init_user_ns, inode)) if (!inode_owner_or_capable(mnt_userns, inode))
return -EACCES; return -EACCES;
ret = mnt_want_write_file(filp); ret = mnt_want_write_file(filp);
@ -2164,9 +2171,10 @@ out:
static int f2fs_ioc_abort_volatile_write(struct file *filp) static int f2fs_ioc_abort_volatile_write(struct file *filp)
{ {
struct inode *inode = file_inode(filp); struct inode *inode = file_inode(filp);
struct user_namespace *mnt_userns = file_mnt_user_ns(filp);
int ret; int ret;
if (!inode_owner_or_capable(&init_user_ns, inode)) if (!inode_owner_or_capable(mnt_userns, inode))
return -EACCES; return -EACCES;
ret = mnt_want_write_file(filp); ret = mnt_want_write_file(filp);
@ -2351,7 +2359,7 @@ static int f2fs_ioc_get_encryption_pwsalt(struct file *filp, unsigned long arg)
if (err) if (err)
return err; return err;
down_write(&sbi->sb_lock); f2fs_down_write(&sbi->sb_lock);
if (uuid_is_nonzero(sbi->raw_super->encrypt_pw_salt)) if (uuid_is_nonzero(sbi->raw_super->encrypt_pw_salt))
goto got_it; goto got_it;
@ -2370,7 +2378,7 @@ got_it:
16)) 16))
err = -EFAULT; err = -EFAULT;
out_err: out_err:
up_write(&sbi->sb_lock); f2fs_up_write(&sbi->sb_lock);
mnt_drop_write_file(filp); mnt_drop_write_file(filp);
return err; return err;
} }
@ -2447,12 +2455,12 @@ static int f2fs_ioc_gc(struct file *filp, unsigned long arg)
return ret; return ret;
if (!sync) { if (!sync) {
if (!down_write_trylock(&sbi->gc_lock)) { if (!f2fs_down_write_trylock(&sbi->gc_lock)) {
ret = -EBUSY; ret = -EBUSY;
goto out; goto out;
} }
} else { } else {
down_write(&sbi->gc_lock); f2fs_down_write(&sbi->gc_lock);
} }
ret = f2fs_gc(sbi, sync, true, false, NULL_SEGNO); ret = f2fs_gc(sbi, sync, true, false, NULL_SEGNO);
@ -2483,12 +2491,12 @@ static int __f2fs_ioc_gc_range(struct file *filp, struct f2fs_gc_range *range)
do_more: do_more:
if (!range->sync) { if (!range->sync) {
if (!down_write_trylock(&sbi->gc_lock)) { if (!f2fs_down_write_trylock(&sbi->gc_lock)) {
ret = -EBUSY; ret = -EBUSY;
goto out; goto out;
} }
} else { } else {
down_write(&sbi->gc_lock); f2fs_down_write(&sbi->gc_lock);
} }
ret = f2fs_gc(sbi, range->sync, true, false, ret = f2fs_gc(sbi, range->sync, true, false,
@ -2559,10 +2567,6 @@ static int f2fs_defragment_range(struct f2fs_sb_info *sbi,
bool fragmented = false; bool fragmented = false;
int err; int err;
/* if in-place-update policy is enabled, don't waste time here */
if (f2fs_should_update_inplace(inode, NULL))
return -EINVAL;
pg_start = range->start >> PAGE_SHIFT; pg_start = range->start >> PAGE_SHIFT;
pg_end = (range->start + range->len) >> PAGE_SHIFT; pg_end = (range->start + range->len) >> PAGE_SHIFT;
@ -2570,6 +2574,13 @@ static int f2fs_defragment_range(struct f2fs_sb_info *sbi,
inode_lock(inode); inode_lock(inode);
/* if in-place-update policy is enabled, don't waste time here */
set_inode_flag(inode, FI_OPU_WRITE);
if (f2fs_should_update_inplace(inode, NULL)) {
err = -EINVAL;
goto out;
}
/* writeback all dirty pages in the range */ /* writeback all dirty pages in the range */
err = filemap_write_and_wait_range(inode->i_mapping, range->start, err = filemap_write_and_wait_range(inode->i_mapping, range->start,
range->start + range->len - 1); range->start + range->len - 1);
@ -2651,7 +2662,7 @@ do_map:
goto check; goto check;
} }
set_inode_flag(inode, FI_DO_DEFRAG); set_inode_flag(inode, FI_SKIP_WRITES);
idx = map.m_lblk; idx = map.m_lblk;
while (idx < map.m_lblk + map.m_len && cnt < blk_per_seg) { while (idx < map.m_lblk + map.m_len && cnt < blk_per_seg) {
@ -2676,15 +2687,16 @@ check:
if (map.m_lblk < pg_end && cnt < blk_per_seg) if (map.m_lblk < pg_end && cnt < blk_per_seg)
goto do_map; goto do_map;
clear_inode_flag(inode, FI_DO_DEFRAG); clear_inode_flag(inode, FI_SKIP_WRITES);
err = filemap_fdatawrite(inode->i_mapping); err = filemap_fdatawrite(inode->i_mapping);
if (err) if (err)
goto out; goto out;
} }
clear_out: clear_out:
clear_inode_flag(inode, FI_DO_DEFRAG); clear_inode_flag(inode, FI_SKIP_WRITES);
out: out:
clear_inode_flag(inode, FI_OPU_WRITE);
inode_unlock(inode); inode_unlock(inode);
if (!err) if (!err)
range->len = (u64)total << PAGE_SHIFT; range->len = (u64)total << PAGE_SHIFT;
@ -2820,10 +2832,10 @@ static int f2fs_move_file_range(struct file *file_in, loff_t pos_in,
f2fs_balance_fs(sbi, true); f2fs_balance_fs(sbi, true);
down_write(&F2FS_I(src)->i_gc_rwsem[WRITE]); f2fs_down_write(&F2FS_I(src)->i_gc_rwsem[WRITE]);
if (src != dst) { if (src != dst) {
ret = -EBUSY; ret = -EBUSY;
if (!down_write_trylock(&F2FS_I(dst)->i_gc_rwsem[WRITE])) if (!f2fs_down_write_trylock(&F2FS_I(dst)->i_gc_rwsem[WRITE]))
goto out_src; goto out_src;
} }
@ -2841,9 +2853,9 @@ static int f2fs_move_file_range(struct file *file_in, loff_t pos_in,
f2fs_unlock_op(sbi); f2fs_unlock_op(sbi);
if (src != dst) if (src != dst)
up_write(&F2FS_I(dst)->i_gc_rwsem[WRITE]); f2fs_up_write(&F2FS_I(dst)->i_gc_rwsem[WRITE]);
out_src: out_src:
up_write(&F2FS_I(src)->i_gc_rwsem[WRITE]); f2fs_up_write(&F2FS_I(src)->i_gc_rwsem[WRITE]);
out_unlock: out_unlock:
if (src != dst) if (src != dst)
inode_unlock(dst); inode_unlock(dst);
@ -2938,7 +2950,7 @@ static int f2fs_ioc_flush_device(struct file *filp, unsigned long arg)
end_segno = min(start_segno + range.segments, dev_end_segno); end_segno = min(start_segno + range.segments, dev_end_segno);
while (start_segno < end_segno) { while (start_segno < end_segno) {
if (!down_write_trylock(&sbi->gc_lock)) { if (!f2fs_down_write_trylock(&sbi->gc_lock)) {
ret = -EBUSY; ret = -EBUSY;
goto out; goto out;
} }
@ -2990,7 +3002,7 @@ static int f2fs_ioc_setproject(struct inode *inode, __u32 projid)
{ {
struct f2fs_inode_info *fi = F2FS_I(inode); struct f2fs_inode_info *fi = F2FS_I(inode);
struct f2fs_sb_info *sbi = F2FS_I_SB(inode); struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
struct page *ipage; struct f2fs_inode *ri = NULL;
kprojid_t kprojid; kprojid_t kprojid;
int err; int err;
@ -3014,17 +3026,8 @@ static int f2fs_ioc_setproject(struct inode *inode, __u32 projid)
if (IS_NOQUOTA(inode)) if (IS_NOQUOTA(inode))
return err; return err;
ipage = f2fs_get_node_page(sbi, inode->i_ino); if (!F2FS_FITS_IN_INODE(ri, fi->i_extra_isize, i_projid))
if (IS_ERR(ipage)) return -EOVERFLOW;
return PTR_ERR(ipage);
if (!F2FS_FITS_IN_INODE(F2FS_INODE(ipage), fi->i_extra_isize,
i_projid)) {
err = -EOVERFLOW;
f2fs_put_page(ipage, 1);
return err;
}
f2fs_put_page(ipage, 1);
err = f2fs_dquot_initialize(inode); err = f2fs_dquot_initialize(inode);
if (err) if (err)
@ -3215,9 +3218,9 @@ int f2fs_precache_extents(struct inode *inode)
while (map.m_lblk < end) { while (map.m_lblk < end) {
map.m_len = end - map.m_lblk; map.m_len = end - map.m_lblk;
down_write(&fi->i_gc_rwsem[WRITE]); f2fs_down_write(&fi->i_gc_rwsem[WRITE]);
err = f2fs_map_blocks(inode, &map, 0, F2FS_GET_BLOCK_PRECACHE); err = f2fs_map_blocks(inode, &map, 0, F2FS_GET_BLOCK_PRECACHE);
up_write(&fi->i_gc_rwsem[WRITE]); f2fs_up_write(&fi->i_gc_rwsem[WRITE]);
if (err) if (err)
return err; return err;
@ -3294,11 +3297,11 @@ static int f2fs_ioc_getfslabel(struct file *filp, unsigned long arg)
if (!vbuf) if (!vbuf)
return -ENOMEM; return -ENOMEM;
down_read(&sbi->sb_lock); f2fs_down_read(&sbi->sb_lock);
count = utf16s_to_utf8s(sbi->raw_super->volume_name, count = utf16s_to_utf8s(sbi->raw_super->volume_name,
ARRAY_SIZE(sbi->raw_super->volume_name), ARRAY_SIZE(sbi->raw_super->volume_name),
UTF16_LITTLE_ENDIAN, vbuf, MAX_VOLUME_NAME); UTF16_LITTLE_ENDIAN, vbuf, MAX_VOLUME_NAME);
up_read(&sbi->sb_lock); f2fs_up_read(&sbi->sb_lock);
if (copy_to_user((char __user *)arg, vbuf, if (copy_to_user((char __user *)arg, vbuf,
min(FSLABEL_MAX, count))) min(FSLABEL_MAX, count)))
@ -3326,7 +3329,7 @@ static int f2fs_ioc_setfslabel(struct file *filp, unsigned long arg)
if (err) if (err)
goto out; goto out;
down_write(&sbi->sb_lock); f2fs_down_write(&sbi->sb_lock);
memset(sbi->raw_super->volume_name, 0, memset(sbi->raw_super->volume_name, 0,
sizeof(sbi->raw_super->volume_name)); sizeof(sbi->raw_super->volume_name));
@ -3336,7 +3339,7 @@ static int f2fs_ioc_setfslabel(struct file *filp, unsigned long arg)
err = f2fs_commit_super(sbi, false); err = f2fs_commit_super(sbi, false);
up_write(&sbi->sb_lock); f2fs_up_write(&sbi->sb_lock);
mnt_drop_write_file(filp); mnt_drop_write_file(filp);
out: out:
@ -3462,7 +3465,7 @@ static int f2fs_release_compress_blocks(struct file *filp, unsigned long arg)
if (!atomic_read(&F2FS_I(inode)->i_compr_blocks)) if (!atomic_read(&F2FS_I(inode)->i_compr_blocks))
goto out; goto out;
down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]); f2fs_down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
filemap_invalidate_lock(inode->i_mapping); filemap_invalidate_lock(inode->i_mapping);
last_idx = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE); last_idx = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
@ -3499,7 +3502,7 @@ static int f2fs_release_compress_blocks(struct file *filp, unsigned long arg)
} }
filemap_invalidate_unlock(inode->i_mapping); filemap_invalidate_unlock(inode->i_mapping);
up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]); f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
out: out:
inode_unlock(inode); inode_unlock(inode);
@ -3615,7 +3618,7 @@ static int f2fs_reserve_compress_blocks(struct file *filp, unsigned long arg)
goto unlock_inode; goto unlock_inode;
} }
down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]); f2fs_down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
filemap_invalidate_lock(inode->i_mapping); filemap_invalidate_lock(inode->i_mapping);
last_idx = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE); last_idx = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
@ -3652,7 +3655,7 @@ static int f2fs_reserve_compress_blocks(struct file *filp, unsigned long arg)
} }
filemap_invalidate_unlock(inode->i_mapping); filemap_invalidate_unlock(inode->i_mapping);
up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]); f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
if (ret >= 0) { if (ret >= 0) {
clear_inode_flag(inode, FI_COMPRESS_RELEASED); clear_inode_flag(inode, FI_COMPRESS_RELEASED);
@ -3770,7 +3773,7 @@ static int f2fs_sec_trim_file(struct file *filp, unsigned long arg)
if (ret) if (ret)
goto err; goto err;
down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]); f2fs_down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
filemap_invalidate_lock(mapping); filemap_invalidate_lock(mapping);
ret = filemap_write_and_wait_range(mapping, range.start, ret = filemap_write_and_wait_range(mapping, range.start,
@ -3859,7 +3862,7 @@ static int f2fs_sec_trim_file(struct file *filp, unsigned long arg)
prev_block, len, range.flags); prev_block, len, range.flags);
out: out:
filemap_invalidate_unlock(mapping); filemap_invalidate_unlock(mapping);
up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]); f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
err: err:
inode_unlock(inode); inode_unlock(inode);
file_end_write(filp); file_end_write(filp);
@ -4291,12 +4294,12 @@ static ssize_t f2fs_dio_read_iter(struct kiocb *iocb, struct iov_iter *to)
trace_f2fs_direct_IO_enter(inode, iocb, count, READ); trace_f2fs_direct_IO_enter(inode, iocb, count, READ);
if (iocb->ki_flags & IOCB_NOWAIT) { if (iocb->ki_flags & IOCB_NOWAIT) {
if (!down_read_trylock(&fi->i_gc_rwsem[READ])) { if (!f2fs_down_read_trylock(&fi->i_gc_rwsem[READ])) {
ret = -EAGAIN; ret = -EAGAIN;
goto out; goto out;
} }
} else { } else {
down_read(&fi->i_gc_rwsem[READ]); f2fs_down_read(&fi->i_gc_rwsem[READ]);
} }
/* /*
@ -4315,7 +4318,7 @@ static ssize_t f2fs_dio_read_iter(struct kiocb *iocb, struct iov_iter *to)
ret = iomap_dio_complete(dio); ret = iomap_dio_complete(dio);
} }
up_read(&fi->i_gc_rwsem[READ]); f2fs_up_read(&fi->i_gc_rwsem[READ]);
file_accessed(file); file_accessed(file);
out: out:
@ -4497,12 +4500,12 @@ static ssize_t f2fs_dio_write_iter(struct kiocb *iocb, struct iov_iter *from,
goto out; goto out;
} }
if (!down_read_trylock(&fi->i_gc_rwsem[WRITE])) { if (!f2fs_down_read_trylock(&fi->i_gc_rwsem[WRITE])) {
ret = -EAGAIN; ret = -EAGAIN;
goto out; goto out;
} }
if (do_opu && !down_read_trylock(&fi->i_gc_rwsem[READ])) { if (do_opu && !f2fs_down_read_trylock(&fi->i_gc_rwsem[READ])) {
up_read(&fi->i_gc_rwsem[WRITE]); f2fs_up_read(&fi->i_gc_rwsem[WRITE]);
ret = -EAGAIN; ret = -EAGAIN;
goto out; goto out;
} }
@ -4511,9 +4514,9 @@ static ssize_t f2fs_dio_write_iter(struct kiocb *iocb, struct iov_iter *from,
if (ret) if (ret)
goto out; goto out;
down_read(&fi->i_gc_rwsem[WRITE]); f2fs_down_read(&fi->i_gc_rwsem[WRITE]);
if (do_opu) if (do_opu)
down_read(&fi->i_gc_rwsem[READ]); f2fs_down_read(&fi->i_gc_rwsem[READ]);
} }
if (whint_mode == WHINT_MODE_OFF) if (whint_mode == WHINT_MODE_OFF)
iocb->ki_hint = WRITE_LIFE_NOT_SET; iocb->ki_hint = WRITE_LIFE_NOT_SET;
@ -4542,8 +4545,8 @@ static ssize_t f2fs_dio_write_iter(struct kiocb *iocb, struct iov_iter *from,
if (whint_mode == WHINT_MODE_OFF) if (whint_mode == WHINT_MODE_OFF)
iocb->ki_hint = hint; iocb->ki_hint = hint;
if (do_opu) if (do_opu)
up_read(&fi->i_gc_rwsem[READ]); f2fs_up_read(&fi->i_gc_rwsem[READ]);
up_read(&fi->i_gc_rwsem[WRITE]); f2fs_up_read(&fi->i_gc_rwsem[WRITE]);
if (ret < 0) if (ret < 0)
goto out; goto out;
@ -4644,12 +4647,12 @@ static ssize_t f2fs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
/* Don't leave any preallocated blocks around past i_size. */ /* Don't leave any preallocated blocks around past i_size. */
if (preallocated && i_size_read(inode) < target_size) { if (preallocated && i_size_read(inode) < target_size) {
down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]); f2fs_down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
filemap_invalidate_lock(inode->i_mapping); filemap_invalidate_lock(inode->i_mapping);
if (!f2fs_truncate(inode)) if (!f2fs_truncate(inode))
file_dont_truncate(inode); file_dont_truncate(inode);
filemap_invalidate_unlock(inode->i_mapping); filemap_invalidate_unlock(inode->i_mapping);
up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]); f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
} else { } else {
file_dont_truncate(inode); file_dont_truncate(inode);
} }

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c

@@ -103,23 +103,26 @@ static int gc_thread_func(void *data)
 				sbi->gc_urgent_high_remaining--;
 			}
 			spin_unlock(&sbi->gc_urgent_high_lock);
+		}
+		if (sbi->gc_mode == GC_URGENT_HIGH ||
+				sbi->gc_mode == GC_URGENT_MID) {
 			wait_ms = gc_th->urgent_sleep_time;
-			down_write(&sbi->gc_lock);
+			f2fs_down_write(&sbi->gc_lock);
 			goto do_gc;
 		}
 		if (foreground) {
-			down_write(&sbi->gc_lock);
+			f2fs_down_write(&sbi->gc_lock);
 			goto do_gc;
-		} else if (!down_write_trylock(&sbi->gc_lock)) {
+		} else if (!f2fs_down_write_trylock(&sbi->gc_lock)) {
 			stat_other_skip_bggc_count(sbi);
 			goto next;
 		}
 		if (!is_idle(sbi, GC_TIME)) {
 			increase_sleep_time(gc_th, &wait_ms);
-			up_write(&sbi->gc_lock);
+			f2fs_up_write(&sbi->gc_lock);
 			stat_io_skip_bggc_count(sbi);
 			goto next;
 		}
@@ -1038,8 +1041,10 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
 		set_sbi_flag(sbi, SBI_NEED_FSCK);
 	}
-	if (f2fs_check_nid_range(sbi, dni->ino))
+	if (f2fs_check_nid_range(sbi, dni->ino)) {
+		f2fs_put_page(node_page, 1);
 		return false;
+	}
 	*nofs = ofs_of_node(node_page);
 	source_blkaddr = data_blkaddr(NULL, node_page, ofs_in_node);
@@ -1230,7 +1235,7 @@ static int move_data_block(struct inode *inode, block_t bidx,
 	fio.new_blkaddr = fio.old_blkaddr = dn.data_blkaddr;
 	if (lfs_mode)
-		down_write(&fio.sbi->io_order_lock);
+		f2fs_down_write(&fio.sbi->io_order_lock);
 	mpage = f2fs_grab_cache_page(META_MAPPING(fio.sbi),
 					fio.old_blkaddr, false);
@@ -1316,7 +1321,7 @@ recover_block:
 							true, true, true);
up_out:
 	if (lfs_mode)
-		up_write(&fio.sbi->io_order_lock);
+		f2fs_up_write(&fio.sbi->io_order_lock);
put_out:
 	f2fs_put_dnode(&dn);
out:
@@ -1475,7 +1480,7 @@ next_step:
 					special_file(inode->i_mode))
 				continue;
-			if (!down_write_trylock(
+			if (!f2fs_down_write_trylock(
 					&F2FS_I(inode)->i_gc_rwsem[WRITE])) {
 				iput(inode);
 				sbi->skipped_gc_rwsem++;
@@ -1488,7 +1493,7 @@ next_step:
 			if (f2fs_post_read_required(inode)) {
 				int err = ra_data_block(inode, start_bidx);
-				up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
+				f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
 				if (err) {
 					iput(inode);
 					continue;
@@ -1499,7 +1504,7 @@ next_step:
 			data_page = f2fs_get_read_data_page(inode,
 						start_bidx, REQ_RAHEAD, true);
-			up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
+			f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
 			if (IS_ERR(data_page)) {
 				iput(inode);
 				continue;
@@ -1518,14 +1523,14 @@ next_step:
 			int err;
 			if (S_ISREG(inode->i_mode)) {
-				if (!down_write_trylock(&fi->i_gc_rwsem[READ])) {
+				if (!f2fs_down_write_trylock(&fi->i_gc_rwsem[READ])) {
 					sbi->skipped_gc_rwsem++;
 					continue;
 				}
-				if (!down_write_trylock(
+				if (!f2fs_down_write_trylock(
 						&fi->i_gc_rwsem[WRITE])) {
 					sbi->skipped_gc_rwsem++;
-					up_write(&fi->i_gc_rwsem[READ]);
+					f2fs_up_write(&fi->i_gc_rwsem[READ]);
 					continue;
 				}
 				locked = true;
@@ -1548,8 +1553,8 @@ next_step:
 				submitted++;
 			if (locked) {
-				up_write(&fi->i_gc_rwsem[WRITE]);
-				up_write(&fi->i_gc_rwsem[READ]);
+				f2fs_up_write(&fi->i_gc_rwsem[WRITE]);
+				f2fs_up_write(&fi->i_gc_rwsem[READ]);
 			}
 			stat_inc_data_blk_count(sbi, 1, gc_type);
@@ -1807,7 +1812,7 @@ stop:
 				reserved_segments(sbi),
 				prefree_segments(sbi));
-	up_write(&sbi->gc_lock);
+	f2fs_up_write(&sbi->gc_lock);
 	put_gc_inode(&gc_list);
@@ -1936,7 +1941,7 @@ static void update_sb_metadata(struct f2fs_sb_info *sbi, int secs)
 	long long block_count;
 	int segs = secs * sbi->segs_per_sec;
-	down_write(&sbi->sb_lock);
+	f2fs_down_write(&sbi->sb_lock);
 	section_count = le32_to_cpu(raw_sb->section_count);
 	segment_count = le32_to_cpu(raw_sb->segment_count);
@@ -1957,7 +1962,7 @@ static void update_sb_metadata(struct f2fs_sb_info *sbi, int secs)
 					cpu_to_le32(dev_segs + segs);
 	}
-	up_write(&sbi->sb_lock);
+	f2fs_up_write(&sbi->sb_lock);
 }
 static void update_fs_metadata(struct f2fs_sb_info *sbi, int secs)
@@ -2031,7 +2036,7 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
 	secs = div_u64(shrunk_blocks, BLKS_PER_SEC(sbi));
 	/* stop other GC */
-	if (!down_write_trylock(&sbi->gc_lock))
+	if (!f2fs_down_write_trylock(&sbi->gc_lock))
 		return -EAGAIN;
 	/* stop CP to protect MAIN_SEC in free_segment_range */
@@ -2051,15 +2056,15 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
out_unlock:
 	f2fs_unlock_op(sbi);
-	up_write(&sbi->gc_lock);
+	f2fs_up_write(&sbi->gc_lock);
 	if (err)
 		return err;
 	set_sbi_flag(sbi, SBI_IS_RESIZEFS);
 	freeze_super(sbi->sb);
-	down_write(&sbi->gc_lock);
-	down_write(&sbi->cp_global_sem);
+	f2fs_down_write(&sbi->gc_lock);
+	f2fs_down_write(&sbi->cp_global_sem);
 	spin_lock(&sbi->stat_lock);
 	if (shrunk_blocks + valid_user_blocks(sbi) +
@@ -2104,8 +2109,8 @@ recover_out:
 		spin_unlock(&sbi->stat_lock);
 	}
out_err:
-	up_write(&sbi->cp_global_sem);
-	up_write(&sbi->gc_lock);
+	f2fs_up_write(&sbi->cp_global_sem);
+	f2fs_up_write(&sbi->gc_lock);
 	thaw_super(sbi->sb);
 	clear_sbi_flag(sbi, SBI_IS_RESIZEFS);
 	return err;

diff --git a/fs/f2fs/inline.c b/fs/f2fs/inline.c

@@ -629,7 +629,7 @@ int f2fs_add_inline_entry(struct inode *dir, const struct f2fs_filename *fname,
 	}
 	if (inode) {
-		down_write(&F2FS_I(inode)->i_sem);
+		f2fs_down_write(&F2FS_I(inode)->i_sem);
 		page = f2fs_init_inode_metadata(inode, dir, fname, ipage);
 		if (IS_ERR(page)) {
 			err = PTR_ERR(page);
@@ -658,7 +658,7 @@ int f2fs_add_inline_entry(struct inode *dir, const struct f2fs_filename *fname,
 	f2fs_update_parent_metadata(dir, inode, 0);
fail:
 	if (inode)
-		up_write(&F2FS_I(inode)->i_sem);
+		f2fs_up_write(&F2FS_I(inode)->i_sem);
out:
 	f2fs_put_page(ipage, 1);
 	return err;

diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c

@@ -778,7 +778,8 @@ void f2fs_evict_inode(struct inode *inode)
 	f2fs_remove_ino_entry(sbi, inode->i_ino, UPDATE_INO);
 	f2fs_remove_ino_entry(sbi, inode->i_ino, FLUSH_INO);
-	sb_start_intwrite(inode->i_sb);
+	if (!is_sbi_flag_set(sbi, SBI_IS_FREEZING))
+		sb_start_intwrite(inode->i_sb);
 	set_inode_flag(inode, FI_NO_ALLOC);
 	i_size_write(inode, 0);
retry:
@@ -809,7 +810,8 @@ retry:
 		if (dquot_initialize_needed(inode))
 			set_sbi_flag(sbi, SBI_QUOTA_NEED_REPAIR);
 	}
-	sb_end_intwrite(inode->i_sb);
+	if (!is_sbi_flag_set(sbi, SBI_IS_FREEZING))
+		sb_end_intwrite(inode->i_sb);
no_delete:
 	dquot_drop(inode);
@@ -885,6 +887,7 @@ void f2fs_handle_failed_inode(struct inode *inode)
 	err = f2fs_get_node_info(sbi, inode->i_ino, &ni, false);
 	if (err) {
 		set_sbi_flag(sbi, SBI_NEED_FSCK);
+		set_inode_flag(inode, FI_FREE_NID);
 		f2fs_warn(sbi, "May loss orphan inode, run fsck to fix.");
 		goto out;
 	}

diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c

@@ -22,7 +22,8 @@
 #include "acl.h"
 #include <trace/events/f2fs.h>
-static struct inode *f2fs_new_inode(struct inode *dir, umode_t mode)
+static struct inode *f2fs_new_inode(struct user_namespace *mnt_userns,
+						struct inode *dir, umode_t mode)
 {
 	struct f2fs_sb_info *sbi = F2FS_I_SB(dir);
 	nid_t ino;
@@ -46,7 +47,7 @@ static struct inode *f2fs_new_inode(struct inode *dir, umode_t mode)
 	nid_free = true;
-	inode_init_owner(&init_user_ns, inode, dir, mode);
+	inode_init_owner(mnt_userns, inode, dir, mode);
 	inode->i_ino = ino;
 	inode->i_blocks = 0;
@@ -67,7 +68,7 @@ static struct inode *f2fs_new_inode(struct inode *dir, umode_t mode)
 			(F2FS_I(dir)->i_flags & F2FS_PROJINHERIT_FL))
 		F2FS_I(inode)->i_projid = F2FS_I(dir)->i_projid;
 	else
-		F2FS_I(inode)->i_projid = make_kprojid(&init_user_ns,
+		F2FS_I(inode)->i_projid = make_kprojid(mnt_userns,
 							F2FS_DEF_PROJID);
 	err = fscrypt_prepare_new_inode(dir, inode, &encrypt);
@@ -196,7 +197,7 @@ static inline void set_file_temperature(struct f2fs_sb_info *sbi, struct inode *
 	__u8 (*extlist)[F2FS_EXTENSION_LEN] = sbi->raw_super->extension_list;
 	int i, cold_count, hot_count;
-	down_read(&sbi->sb_lock);
+	f2fs_down_read(&sbi->sb_lock);
 	cold_count = le32_to_cpu(sbi->raw_super->extension_count);
 	hot_count = sbi->raw_super->hot_ext_count;
@@ -206,7 +207,7 @@ static inline void set_file_temperature(struct f2fs_sb_info *sbi, struct inode *
 			break;
 	}
-	up_read(&sbi->sb_lock);
+	f2fs_up_read(&sbi->sb_lock);
 	if (i == cold_count + hot_count)
 		return;
@@ -299,19 +300,19 @@ static void set_compress_inode(struct f2fs_sb_info *sbi, struct inode *inode,
 			(!ext_cnt && !noext_cnt))
 		return;
-	down_read(&sbi->sb_lock);
+	f2fs_down_read(&sbi->sb_lock);
 	cold_count = le32_to_cpu(sbi->raw_super->extension_count);
 	hot_count = sbi->raw_super->hot_ext_count;
 	for (i = cold_count; i < cold_count + hot_count; i++) {
 		if (is_extension_exist(name, extlist[i], false)) {
-			up_read(&sbi->sb_lock);
+			f2fs_up_read(&sbi->sb_lock);
 			return;
 		}
 	}
-	up_read(&sbi->sb_lock);
+	f2fs_up_read(&sbi->sb_lock);
 	for (i = 0; i < noext_cnt; i++) {
 		if (is_extension_exist(name, noext[i], false)) {
@@ -349,7 +350,7 @@ static int f2fs_create(struct user_namespace *mnt_userns, struct inode *dir,
 	if (err)
 		return err;
-	inode = f2fs_new_inode(dir, mode);
+	inode = f2fs_new_inode(mnt_userns, dir, mode);
 	if (IS_ERR(inode))
 		return PTR_ERR(inode);
@@ -679,7 +680,7 @@ static int f2fs_symlink(struct user_namespace *mnt_userns, struct inode *dir,
 	if (err)
 		return err;
-	inode = f2fs_new_inode(dir, S_IFLNK | S_IRWXUGO);
+	inode = f2fs_new_inode(mnt_userns, dir, S_IFLNK | S_IRWXUGO);
 	if (IS_ERR(inode))
 		return PTR_ERR(inode);
@@ -750,7 +751,7 @@ static int f2fs_mkdir(struct user_namespace *mnt_userns, struct inode *dir,
 	if (err)
 		return err;
-	inode = f2fs_new_inode(dir, S_IFDIR | mode);
+	inode = f2fs_new_inode(mnt_userns, dir, S_IFDIR | mode);
 	if (IS_ERR(inode))
 		return PTR_ERR(inode);
@@ -807,7 +808,7 @@ static int f2fs_mknod(struct user_namespace *mnt_userns, struct inode *dir,
 	if (err)
 		return err;
-	inode = f2fs_new_inode(dir, mode);
+	inode = f2fs_new_inode(mnt_userns, dir, mode);
 	if (IS_ERR(inode))
 		return PTR_ERR(inode);
@@ -834,8 +835,9 @@ out:
 	return err;
 }
-static int __f2fs_tmpfile(struct inode *dir, struct dentry *dentry,
-					umode_t mode, struct inode **whiteout)
+static int __f2fs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
+					struct dentry *dentry, umode_t mode,
+					struct inode **whiteout)
 {
 	struct f2fs_sb_info *sbi = F2FS_I_SB(dir);
 	struct inode *inode;
@@ -845,7 +847,7 @@ static int __f2fs_tmpfile(struct inode *dir, struct dentry *dentry,
 	if (err)
 		return err;
-	inode = f2fs_new_inode(dir, mode);
+	inode = f2fs_new_inode(mnt_userns, dir, mode);
 	if (IS_ERR(inode))
 		return PTR_ERR(inode);
@@ -909,20 +911,22 @@ static int f2fs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
 	if (!f2fs_is_checkpoint_ready(sbi))
 		return -ENOSPC;
-	return __f2fs_tmpfile(dir, dentry, mode, NULL);
+	return __f2fs_tmpfile(mnt_userns, dir, dentry, mode, NULL);
 }
-static int f2fs_create_whiteout(struct inode *dir, struct inode **whiteout)
+static int f2fs_create_whiteout(struct user_namespace *mnt_userns,
+					struct inode *dir, struct inode **whiteout)
 {
 	if (unlikely(f2fs_cp_error(F2FS_I_SB(dir))))
 		return -EIO;
-	return __f2fs_tmpfile(dir, NULL, S_IFCHR | WHITEOUT_MODE, whiteout);
+	return __f2fs_tmpfile(mnt_userns, dir, NULL,
+				S_IFCHR | WHITEOUT_MODE, whiteout);
 }
-static int f2fs_rename(struct inode *old_dir, struct dentry *old_dentry,
-			struct inode *new_dir, struct dentry *new_dentry,
-			unsigned int flags)
+static int f2fs_rename(struct user_namespace *mnt_userns, struct inode *old_dir,
+			struct dentry *old_dentry, struct inode *new_dir,
+			struct dentry *new_dentry, unsigned int flags)
 {
 	struct f2fs_sb_info *sbi = F2FS_I_SB(old_dir);
 	struct inode *old_inode = d_inode(old_dentry);
@@ -960,7 +964,7 @@ static int f2fs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	}
 	if (flags & RENAME_WHITEOUT) {
-		err = f2fs_create_whiteout(old_dir, &whiteout);
+		err = f2fs_create_whiteout(mnt_userns, old_dir, &whiteout);
 		if (err)
 			return err;
 	}
@@ -1023,11 +1027,11 @@ static int f2fs_rename(struct inode *old_dir, struct dentry *old_dentry,
 		new_page = NULL;
 		new_inode->i_ctime = current_time(new_inode);
-		down_write(&F2FS_I(new_inode)->i_sem);
+		f2fs_down_write(&F2FS_I(new_inode)->i_sem);
 		if (old_dir_entry)
 			f2fs_i_links_write(new_inode, false);
 		f2fs_i_links_write(new_inode, false);
-		up_write(&F2FS_I(new_inode)->i_sem);
+		f2fs_up_write(&F2FS_I(new_inode)->i_sem);
 		if (!new_inode->i_nlink)
 			f2fs_add_orphan_inode(new_inode);
@@ -1048,13 +1052,13 @@ static int f2fs_rename(struct inode *old_dir, struct dentry *old_dentry,
 			f2fs_i_links_write(new_dir, true);
 	}
-	down_write(&F2FS_I(old_inode)->i_sem);
+	f2fs_down_write(&F2FS_I(old_inode)->i_sem);
 	if (!old_dir_entry || whiteout)
 		file_lost_pino(old_inode);
 	else
 		/* adjust dir's i_pino to pass fsck check */
 		f2fs_i_pino_write(old_inode, new_dir->i_ino);
-	up_write(&F2FS_I(old_inode)->i_sem);
+	f2fs_up_write(&F2FS_I(old_inode)->i_sem);
 	old_inode->i_ctime = current_time(old_inode);
 	f2fs_mark_inode_dirty_sync(old_inode, false);
@@ -1107,8 +1111,7 @@ out_dir:
 out_old:
 	f2fs_put_page(old_page, 0);
 out:
-	if (whiteout)
-		iput(whiteout);
+	iput(whiteout);
 	return err;
 }
@@ -1214,38 +1217,38 @@ static int f2fs_cross_rename(struct inode *old_dir, struct dentry *old_dentry,
 	/* update directory entry info of old dir inode */
 	f2fs_set_link(old_dir, old_entry, old_page, new_inode);
-	down_write(&F2FS_I(old_inode)->i_sem);
+	f2fs_down_write(&F2FS_I(old_inode)->i_sem);
 	if (!old_dir_entry)
 		file_lost_pino(old_inode);
 	else
 		/* adjust dir's i_pino to pass fsck check */
 		f2fs_i_pino_write(old_inode, new_dir->i_ino);
-	up_write(&F2FS_I(old_inode)->i_sem);
+	f2fs_up_write(&F2FS_I(old_inode)->i_sem);
 	old_dir->i_ctime = current_time(old_dir);
 	if (old_nlink) {
-		down_write(&F2FS_I(old_dir)->i_sem);
+		f2fs_down_write(&F2FS_I(old_dir)->i_sem);
 		f2fs_i_links_write(old_dir, old_nlink > 0);
-		up_write(&F2FS_I(old_dir)->i_sem);
+		f2fs_up_write(&F2FS_I(old_dir)->i_sem);
 	}
 	f2fs_mark_inode_dirty_sync(old_dir, false);
 	/* update directory entry info of new dir inode */
 	f2fs_set_link(new_dir, new_entry, new_page, old_inode);
-	down_write(&F2FS_I(new_inode)->i_sem);
+	f2fs_down_write(&F2FS_I(new_inode)->i_sem);
 	if (!new_dir_entry)
 		file_lost_pino(new_inode);
 	else
 		/* adjust dir's i_pino to pass fsck check */
 		f2fs_i_pino_write(new_inode, old_dir->i_ino);
-	up_write(&F2FS_I(new_inode)->i_sem);
+	f2fs_up_write(&F2FS_I(new_inode)->i_sem);
 	new_dir->i_ctime = current_time(new_dir);
 	if (new_nlink) {
-		down_write(&F2FS_I(new_dir)->i_sem);
+		f2fs_down_write(&F2FS_I(new_dir)->i_sem);
 		f2fs_i_links_write(new_dir, new_nlink > 0);
-		up_write(&F2FS_I(new_dir)->i_sem);
+		f2fs_up_write(&F2FS_I(new_dir)->i_sem);
 	}
 	f2fs_mark_inode_dirty_sync(new_dir, false);
@@ -1300,7 +1303,8 @@ static int f2fs_rename2(struct user_namespace *mnt_userns,
 	 * VFS has already handled the new dentry existence case,
	 * here, we just deal with "RENAME_NOREPLACE" as regular rename.
	 */
-	return f2fs_rename(old_dir, old_dentry, new_dir, new_dentry, flags);
+	return f2fs_rename(mnt_userns, old_dir, old_dentry,
+					new_dir, new_dentry, flags);
 }
 static const char *f2fs_encrypted_get_link(struct dentry *dentry,


@@ -382,14 +382,14 @@ int f2fs_need_dentry_mark(struct f2fs_sb_info *sbi, nid_t nid)
 	struct nat_entry *e;
 	bool need = false;
-	down_read(&nm_i->nat_tree_lock);
+	f2fs_down_read(&nm_i->nat_tree_lock);
 	e = __lookup_nat_cache(nm_i, nid);
 	if (e) {
 		if (!get_nat_flag(e, IS_CHECKPOINTED) &&
 				!get_nat_flag(e, HAS_FSYNCED_INODE))
 			need = true;
 	}
-	up_read(&nm_i->nat_tree_lock);
+	f2fs_up_read(&nm_i->nat_tree_lock);
 	return need;
 }
@@ -399,11 +399,11 @@ bool f2fs_is_checkpointed_node(struct f2fs_sb_info *sbi, nid_t nid)
 	struct nat_entry *e;
 	bool is_cp = true;
-	down_read(&nm_i->nat_tree_lock);
+	f2fs_down_read(&nm_i->nat_tree_lock);
 	e = __lookup_nat_cache(nm_i, nid);
 	if (e && !get_nat_flag(e, IS_CHECKPOINTED))
 		is_cp = false;
-	up_read(&nm_i->nat_tree_lock);
+	f2fs_up_read(&nm_i->nat_tree_lock);
 	return is_cp;
 }
@@ -413,13 +413,13 @@ bool f2fs_need_inode_block_update(struct f2fs_sb_info *sbi, nid_t ino)
 	struct nat_entry *e;
 	bool need_update = true;
-	down_read(&nm_i->nat_tree_lock);
+	f2fs_down_read(&nm_i->nat_tree_lock);
 	e = __lookup_nat_cache(nm_i, ino);
 	if (e && get_nat_flag(e, HAS_LAST_FSYNC) &&
 			(get_nat_flag(e, IS_CHECKPOINTED) ||
 			 get_nat_flag(e, HAS_FSYNCED_INODE)))
 		need_update = false;
-	up_read(&nm_i->nat_tree_lock);
+	f2fs_up_read(&nm_i->nat_tree_lock);
 	return need_update;
 }
@@ -431,14 +431,14 @@ static void cache_nat_entry(struct f2fs_sb_info *sbi, nid_t nid,
 	struct nat_entry *new, *e;
 	/* Let's mitigate lock contention of nat_tree_lock during checkpoint */
-	if (rwsem_is_locked(&sbi->cp_global_sem))
+	if (f2fs_rwsem_is_locked(&sbi->cp_global_sem))
 		return;
 	new = __alloc_nat_entry(sbi, nid, false);
 	if (!new)
 		return;
-	down_write(&nm_i->nat_tree_lock);
+	f2fs_down_write(&nm_i->nat_tree_lock);
 	e = __lookup_nat_cache(nm_i, nid);
 	if (!e)
 		e = __init_nat_entry(nm_i, new, ne, false);
@@ -447,7 +447,7 @@ static void cache_nat_entry(struct f2fs_sb_info *sbi, nid_t nid,
 				nat_get_blkaddr(e) !=
 					le32_to_cpu(ne->block_addr) ||
 				nat_get_version(e) != ne->version);
-	up_write(&nm_i->nat_tree_lock);
+	f2fs_up_write(&nm_i->nat_tree_lock);
 	if (e != new)
 		__free_nat_entry(new);
 }
@@ -459,7 +459,7 @@ static void set_node_addr(struct f2fs_sb_info *sbi, struct node_info *ni,
 	struct nat_entry *e;
 	struct nat_entry *new = __alloc_nat_entry(sbi, ni->nid, true);
-	down_write(&nm_i->nat_tree_lock);
+	f2fs_down_write(&nm_i->nat_tree_lock);
 	e = __lookup_nat_cache(nm_i, ni->nid);
 	if (!e) {
 		e = __init_nat_entry(nm_i, new, NULL, true);
@@ -508,7 +508,7 @@ static void set_node_addr(struct f2fs_sb_info *sbi, struct node_info *ni,
 		set_nat_flag(e, HAS_FSYNCED_INODE, true);
 		set_nat_flag(e, HAS_LAST_FSYNC, fsync_done);
 	}
-	up_write(&nm_i->nat_tree_lock);
+	f2fs_up_write(&nm_i->nat_tree_lock);
 }
 int f2fs_try_to_free_nats(struct f2fs_sb_info *sbi, int nr_shrink)
@@ -516,7 +516,7 @@ int f2fs_try_to_free_nats(struct f2fs_sb_info *sbi, int nr_shrink)
 	struct f2fs_nm_info *nm_i = NM_I(sbi);
 	int nr = nr_shrink;
-	if (!down_write_trylock(&nm_i->nat_tree_lock))
+	if (!f2fs_down_write_trylock(&nm_i->nat_tree_lock))
 		return 0;
 	spin_lock(&nm_i->nat_list_lock);
@@ -538,7 +538,7 @@ int f2fs_try_to_free_nats(struct f2fs_sb_info *sbi, int nr_shrink)
 	}
 	spin_unlock(&nm_i->nat_list_lock);
-	up_write(&nm_i->nat_tree_lock);
+	f2fs_up_write(&nm_i->nat_tree_lock);
 	return nr - nr_shrink;
 }
@@ -560,13 +560,13 @@ int f2fs_get_node_info(struct f2fs_sb_info *sbi, nid_t nid,
 	ni->nid = nid;
 retry:
 	/* Check nat cache */
-	down_read(&nm_i->nat_tree_lock);
+	f2fs_down_read(&nm_i->nat_tree_lock);
 	e = __lookup_nat_cache(nm_i, nid);
 	if (e) {
 		ni->ino = nat_get_ino(e);
 		ni->blk_addr = nat_get_blkaddr(e);
 		ni->version = nat_get_version(e);
-		up_read(&nm_i->nat_tree_lock);
+		f2fs_up_read(&nm_i->nat_tree_lock);
 		return 0;
 	}
@@ -576,11 +576,11 @@ retry:
 	 * nat_tree_lock. Therefore, we should retry, if we failed to grab here
	 * while not bothering checkpoint.
	 */
-	if (!rwsem_is_locked(&sbi->cp_global_sem) || checkpoint_context) {
+	if (!f2fs_rwsem_is_locked(&sbi->cp_global_sem) || checkpoint_context) {
 		down_read(&curseg->journal_rwsem);
-	} else if (rwsem_is_contended(&nm_i->nat_tree_lock) ||
+	} else if (f2fs_rwsem_is_contended(&nm_i->nat_tree_lock) ||
 			!down_read_trylock(&curseg->journal_rwsem)) {
-		up_read(&nm_i->nat_tree_lock);
+		f2fs_up_read(&nm_i->nat_tree_lock);
 		goto retry;
 	}
@@ -589,15 +589,15 @@ retry:
 		ne = nat_in_journal(journal, i);
 		node_info_from_raw_nat(ni, &ne);
 	}
 	up_read(&curseg->journal_rwsem);
 	if (i >= 0) {
-		up_read(&nm_i->nat_tree_lock);
+		f2fs_up_read(&nm_i->nat_tree_lock);
 		goto cache;
 	}
 	/* Fill node_info from nat page */
 	index = current_nat_addr(sbi, nid);
-	up_read(&nm_i->nat_tree_lock);
+	f2fs_up_read(&nm_i->nat_tree_lock);
 	page = f2fs_get_meta_page(sbi, index);
 	if (IS_ERR(page))
@@ -1609,17 +1609,17 @@ static int __write_node_page(struct page *page, bool atomic, bool *submitted,
 		goto redirty_out;
 	if (wbc->for_reclaim) {
-		if (!down_read_trylock(&sbi->node_write))
+		if (!f2fs_down_read_trylock(&sbi->node_write))
 			goto redirty_out;
 	} else {
-		down_read(&sbi->node_write);
+		f2fs_down_read(&sbi->node_write);
 	}
 	/* This page is already truncated */
 	if (unlikely(ni.blk_addr == NULL_ADDR)) {
 		ClearPageUptodate(page);
 		dec_page_count(sbi, F2FS_DIRTY_NODES);
-		up_read(&sbi->node_write);
+		f2fs_up_read(&sbi->node_write);
 		unlock_page(page);
 		return 0;
 	}
@@ -1627,7 +1627,7 @@ static int __write_node_page(struct page *page, bool atomic, bool *submitted,
 	if (__is_valid_data_blkaddr(ni.blk_addr) &&
 		!f2fs_is_valid_blkaddr(sbi, ni.blk_addr,
 			DATA_GENERIC_ENHANCE)) {
-		up_read(&sbi->node_write);
+		f2fs_up_read(&sbi->node_write);
 		goto redirty_out;
 	}
@@ -1648,7 +1648,7 @@ static int __write_node_page(struct page *page, bool atomic, bool *submitted,
 	f2fs_do_write_node_page(nid, &fio);
 	set_node_addr(sbi, &ni, fio.new_blkaddr, is_fsync_dnode(page));
 	dec_page_count(sbi, F2FS_DIRTY_NODES);
-	up_read(&sbi->node_write);
+	f2fs_up_read(&sbi->node_write);
 	if (wbc->for_reclaim) {
 		f2fs_submit_merged_write_cond(sbi, NULL, page, 0, NODE);
@@ -1782,6 +1782,7 @@ continue_unlock:
 			if (!atomic || page == last_page) {
 				set_fsync_mark(page, 1);
+				percpu_counter_inc(&sbi->rf_node_block_count);
 				if (IS_INODE(page)) {
 					if (is_inode_flag_set(inode,
 								FI_DIRTY_INODE))
@@ -2111,8 +2112,12 @@ static int f2fs_write_node_pages(struct address_space *mapping,
 	if (wbc->sync_mode == WB_SYNC_ALL)
 		atomic_inc(&sbi->wb_sync_req[NODE]);
-	else if (atomic_read(&sbi->wb_sync_req[NODE]))
+	else if (atomic_read(&sbi->wb_sync_req[NODE])) {
+		/* to avoid potential deadlock */
+		if (current->plug)
+			blk_finish_plug(current->plug);
 		goto skip_write;
+	}
 	trace_f2fs_writepages(mapping->host, wbc, NODE);
@@ -2225,14 +2230,14 @@ bool f2fs_nat_bitmap_enabled(struct f2fs_sb_info *sbi)
 	unsigned int i;
 	bool ret = true;
-	down_read(&nm_i->nat_tree_lock);
+	f2fs_down_read(&nm_i->nat_tree_lock);
 	for (i = 0; i < nm_i->nat_blocks; i++) {
 		if (!test_bit_le(i, nm_i->nat_block_bitmap)) {
 			ret = false;
 			break;
 		}
 	}
-	up_read(&nm_i->nat_tree_lock);
+	f2fs_up_read(&nm_i->nat_tree_lock);
 	return ret;
 }
@@ -2415,7 +2420,7 @@ static void scan_free_nid_bits(struct f2fs_sb_info *sbi)
 	unsigned int i, idx;
 	nid_t nid;
-	down_read(&nm_i->nat_tree_lock);
+	f2fs_down_read(&nm_i->nat_tree_lock);
 	for (i = 0; i < nm_i->nat_blocks; i++) {
 		if (!test_bit_le(i, nm_i->nat_block_bitmap))
@@ -2438,7 +2443,7 @@ static void scan_free_nid_bits(struct f2fs_sb_info *sbi)
 out:
 	scan_curseg_cache(sbi);
-	up_read(&nm_i->nat_tree_lock);
+	f2fs_up_read(&nm_i->nat_tree_lock);
 }
 static int __f2fs_build_free_nids(struct f2fs_sb_info *sbi,
@@ -2473,7 +2478,7 @@ static int __f2fs_build_free_nids(struct f2fs_sb_info *sbi,
 	f2fs_ra_meta_pages(sbi, NAT_BLOCK_OFFSET(nid), FREE_NID_PAGES,
 							META_NAT, true);
-	down_read(&nm_i->nat_tree_lock);
+	f2fs_down_read(&nm_i->nat_tree_lock);
 	while (1) {
 		if (!test_bit_le(NAT_BLOCK_OFFSET(nid),
@@ -2488,7 +2493,7 @@ static int __f2fs_build_free_nids(struct f2fs_sb_info *sbi,
 		}
 		if (ret) {
-			up_read(&nm_i->nat_tree_lock);
+			f2fs_up_read(&nm_i->nat_tree_lock);
 			f2fs_err(sbi, "NAT is corrupt, run fsck to fix it");
 			return ret;
 		}
@@ -2508,7 +2513,7 @@ static int __f2fs_build_free_nids(struct f2fs_sb_info *sbi,
 	/* find free nids from current sum_pages */
 	scan_curseg_cache(sbi);
-	up_read(&nm_i->nat_tree_lock);
+	f2fs_up_read(&nm_i->nat_tree_lock);
 	f2fs_ra_meta_pages(sbi, NAT_BLOCK_OFFSET(nm_i->next_scan_nid),
 					nm_i->ra_nid_pages, META_NAT, false);
@@ -2953,7 +2958,7 @@ void f2fs_enable_nat_bits(struct f2fs_sb_info *sbi)
 	struct f2fs_nm_info *nm_i = NM_I(sbi);
 	unsigned int nat_ofs;
-	down_read(&nm_i->nat_tree_lock);
+	f2fs_down_read(&nm_i->nat_tree_lock);
 	for (nat_ofs = 0; nat_ofs < nm_i->nat_blocks; nat_ofs++) {
 		unsigned int valid = 0, nid_ofs = 0;
@@ -2973,7 +2978,7 @@ void f2fs_enable_nat_bits(struct f2fs_sb_info *sbi)
 		__update_nat_bits(nm_i, nat_ofs, valid);
 	}
-	up_read(&nm_i->nat_tree_lock);
+	f2fs_up_read(&nm_i->nat_tree_lock);
 }
 static int __flush_nat_entry_set(struct f2fs_sb_info *sbi,
@@ -3071,15 +3076,15 @@ int f2fs_flush_nat_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 	 * nat_cnt[DIRTY_NAT].
	 */
 	if (cpc->reason & CP_UMOUNT) {
-		down_write(&nm_i->nat_tree_lock);
+		f2fs_down_write(&nm_i->nat_tree_lock);
 		remove_nats_in_journal(sbi);
-		up_write(&nm_i->nat_tree_lock);
+		f2fs_up_write(&nm_i->nat_tree_lock);
 	}
 	if (!nm_i->nat_cnt[DIRTY_NAT])
 		return 0;
-	down_write(&nm_i->nat_tree_lock);
+	f2fs_down_write(&nm_i->nat_tree_lock);
 	/*
	 * if there are no enough space in journal to store dirty nat
@@ -3108,7 +3113,7 @@ int f2fs_flush_nat_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 			break;
 	}
-	up_write(&nm_i->nat_tree_lock);
+	f2fs_up_write(&nm_i->nat_tree_lock);
 	/* Allow dirty nats by node block allocation in write_begin */
 	return err;
@@ -3218,6 +3223,7 @@ static int init_node_manager(struct f2fs_sb_info *sbi)
 	nm_i->ram_thresh = DEF_RAM_THRESHOLD;
 	nm_i->ra_nid_pages = DEF_RA_NID_PAGES;
 	nm_i->dirty_nats_ratio = DEF_DIRTY_NAT_RATIO_THRESHOLD;
+	nm_i->max_rf_node_blocks = DEF_RF_NODE_BLOCKS;
 	INIT_RADIX_TREE(&nm_i->free_nid_root, GFP_ATOMIC);
 	INIT_LIST_HEAD(&nm_i->free_nid_list);
@@ -3228,7 +3234,7 @@ static int init_node_manager(struct f2fs_sb_info *sbi)
 	mutex_init(&nm_i->build_lock);
 	spin_lock_init(&nm_i->nid_list_lock);
-	init_rwsem(&nm_i->nat_tree_lock);
+	init_f2fs_rwsem(&nm_i->nat_tree_lock);
 	nm_i->next_scan_nid = le32_to_cpu(sbi->ckpt->next_free_nid);
 	nm_i->bitmap_size = __bitmap_size(sbi, NAT_BITMAP);
@@ -3334,7 +3340,7 @@ void f2fs_destroy_node_manager(struct f2fs_sb_info *sbi)
 	spin_unlock(&nm_i->nid_list_lock);
 	/* destroy nat cache */
-	down_write(&nm_i->nat_tree_lock);
+	f2fs_down_write(&nm_i->nat_tree_lock);
 	while ((found = __gang_lookup_nat_cache(nm_i,
 					nid, NATVEC_SIZE, natvec))) {
 		unsigned idx;
@@ -3364,7 +3370,7 @@ void f2fs_destroy_node_manager(struct f2fs_sb_info *sbi)
 			kmem_cache_free(nat_entry_set_slab, setvec[idx]);
 		}
 	}
-	up_write(&nm_i->nat_tree_lock);
+	f2fs_up_write(&nm_i->nat_tree_lock);
 	kvfree(nm_i->nat_block_bitmap);
 	if (nm_i->free_nid_bitmap) {


@@ -31,6 +31,9 @@
 /* control total # of nats */
 #define DEF_NAT_CACHE_THRESHOLD			100000
+/* control total # of node writes used for roll-fowrad recovery */
+#define DEF_RF_NODE_BLOCKS			0
 /* vector size for gang look-up from nat cache that consists of radix tree */
 #define NATVEC_SIZE	64
 #define SETVEC_SIZE	32


@@ -56,6 +56,10 @@ bool f2fs_space_for_roll_forward(struct f2fs_sb_info *sbi)
 	if (sbi->last_valid_block_count + nalloc > sbi->user_block_count)
 		return false;
+	if (NM_I(sbi)->max_rf_node_blocks &&
+		percpu_counter_sum_positive(&sbi->rf_node_block_count) >=
+		NM_I(sbi)->max_rf_node_blocks)
+		return false;
 	return true;
 }
@@ -343,6 +347,19 @@ static int recover_inode(struct inode *inode, struct page *page)
 	return 0;
 }
+static unsigned int adjust_por_ra_blocks(struct f2fs_sb_info *sbi,
+				unsigned int ra_blocks, unsigned int blkaddr,
+				unsigned int next_blkaddr)
+{
+	if (blkaddr + 1 == next_blkaddr)
+		ra_blocks = min_t(unsigned int, RECOVERY_MAX_RA_BLOCKS,
+							ra_blocks * 2);
+	else if (next_blkaddr % sbi->blocks_per_seg)
+		ra_blocks = max_t(unsigned int, RECOVERY_MIN_RA_BLOCKS,
+							ra_blocks / 2);
+	return ra_blocks;
+}
 static int find_fsync_dnodes(struct f2fs_sb_info *sbi, struct list_head *head,
 				bool check_only)
 {
@@ -350,6 +367,7 @@ static int find_fsync_dnodes(struct f2fs_sb_info *sbi, struct list_head *head,
 	struct page *page = NULL;
 	block_t blkaddr;
 	unsigned int loop_cnt = 0;
+	unsigned int ra_blocks = RECOVERY_MAX_RA_BLOCKS;
 	unsigned int free_blocks = MAIN_SEGS(sbi) * sbi->blocks_per_seg -
 						valid_user_blocks(sbi);
 	int err = 0;
@@ -424,11 +442,14 @@ next:
 			break;
 		}
+		ra_blocks = adjust_por_ra_blocks(sbi, ra_blocks, blkaddr,
+						next_blkaddr_of_node(page));
 		/* check next segment */
 		blkaddr = next_blkaddr_of_node(page);
 		f2fs_put_page(page, 1);
-		f2fs_ra_meta_pages_cond(sbi, blkaddr);
+		f2fs_ra_meta_pages_cond(sbi, blkaddr, ra_blocks);
 	}
 	return err;
 }
@@ -704,6 +725,7 @@ static int recover_data(struct f2fs_sb_info *sbi, struct list_head *inode_list,
 	struct page *page = NULL;
 	int err = 0;
 	block_t blkaddr;
+	unsigned int ra_blocks = RECOVERY_MAX_RA_BLOCKS;
 	/* get node pages in the current segment */
 	curseg = CURSEG_I(sbi, CURSEG_WARM_NODE);
@@ -715,8 +737,6 @@ static int recover_data(struct f2fs_sb_info *sbi, struct list_head *inode_list,
 		if (!f2fs_is_valid_blkaddr(sbi, blkaddr, META_POR))
 			break;
-		f2fs_ra_meta_pages_cond(sbi, blkaddr);
 		page = f2fs_get_tmp_page(sbi, blkaddr);
 		if (IS_ERR(page)) {
 			err = PTR_ERR(page);
@@ -759,9 +779,14 @@ static int recover_data(struct f2fs_sb_info *sbi, struct list_head *inode_list,
 		if (entry->blkaddr == blkaddr)
 			list_move_tail(&entry->list, tmp_inode_list);
next:
+		ra_blocks = adjust_por_ra_blocks(sbi, ra_blocks, blkaddr,
+						next_blkaddr_of_node(page));
 		/* check next segment */
 		blkaddr = next_blkaddr_of_node(page);
 		f2fs_put_page(page, 1);
+		f2fs_ra_meta_pages_cond(sbi, blkaddr, ra_blocks);
 	}
 	if (!err)
 		f2fs_allocate_new_segments(sbi);
@@ -796,7 +821,7 @@ int f2fs_recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only)
 	INIT_LIST_HEAD(&dir_list);
 	/* prevent checkpoint */
-	down_write(&sbi->cp_global_sem);
+	f2fs_down_write(&sbi->cp_global_sem);
 	/* step #1: find fsynced inode numbers */
 	err = find_fsync_dnodes(sbi, &inode_list, check_only);
@@ -845,7 +870,7 @@ skip:
 	if (!err)
 		clear_sbi_flag(sbi, SBI_POR_DOING);
-	up_write(&sbi->cp_global_sem);
+	f2fs_up_write(&sbi->cp_global_sem);
 	/* let's drop all the directory inodes for clean checkpoint */
 	destroy_fsync_dnodes(&dir_list, err);


@ -471,7 +471,7 @@ int f2fs_commit_inmem_pages(struct inode *inode)
f2fs_balance_fs(sbi, true); f2fs_balance_fs(sbi, true);
down_write(&fi->i_gc_rwsem[WRITE]); f2fs_down_write(&fi->i_gc_rwsem[WRITE]);
f2fs_lock_op(sbi); f2fs_lock_op(sbi);
set_inode_flag(inode, FI_ATOMIC_COMMIT); set_inode_flag(inode, FI_ATOMIC_COMMIT);
@ -483,7 +483,7 @@ int f2fs_commit_inmem_pages(struct inode *inode)
clear_inode_flag(inode, FI_ATOMIC_COMMIT); clear_inode_flag(inode, FI_ATOMIC_COMMIT);
f2fs_unlock_op(sbi); f2fs_unlock_op(sbi);
up_write(&fi->i_gc_rwsem[WRITE]); f2fs_up_write(&fi->i_gc_rwsem[WRITE]);
return err; return err;
} }
@@ -521,7 +521,7 @@ void f2fs_balance_fs(struct f2fs_sb_info *sbi, bool need)
 io_schedule();
 finish_wait(&sbi->gc_thread->fggc_wq, &wait);
 } else {
-down_write(&sbi->gc_lock);
+f2fs_down_write(&sbi->gc_lock);
 f2fs_gc(sbi, false, false, false, NULL_SEGNO);
 }
 }
@@ -529,7 +529,7 @@ void f2fs_balance_fs(struct f2fs_sb_info *sbi, bool need)
 static inline bool excess_dirty_threshold(struct f2fs_sb_info *sbi)
 {
-int factor = rwsem_is_locked(&sbi->cp_rwsem) ? 3 : 2;
+int factor = f2fs_rwsem_is_locked(&sbi->cp_rwsem) ? 3 : 2;
 unsigned int dents = get_pages(sbi, F2FS_DIRTY_DENTS);
 unsigned int qdata = get_pages(sbi, F2FS_DIRTY_QDATA);
 unsigned int nodes = get_pages(sbi, F2FS_DIRTY_NODES);
@@ -570,7 +570,7 @@ void f2fs_balance_fs_bg(struct f2fs_sb_info *sbi, bool from_bg)
 /* there is background inflight IO or foreground operation recently */
 if (is_inflight_io(sbi, REQ_TIME) ||
-(!f2fs_time_over(sbi, REQ_TIME) && rwsem_is_locked(&sbi->cp_rwsem)))
+(!f2fs_time_over(sbi, REQ_TIME) && f2fs_rwsem_is_locked(&sbi->cp_rwsem)))
 return;
 /* exceed periodical checkpoint timeout threshold */
@@ -1156,14 +1156,14 @@ static void __init_discard_policy(struct f2fs_sb_info *sbi,
 dpolicy->ordered = false;
 dpolicy->granularity = granularity;
-dpolicy->max_requests = DEF_MAX_DISCARD_REQUEST;
+dpolicy->max_requests = dcc->max_discard_request;
 dpolicy->io_aware_gran = MAX_PLIST_NUM;
 dpolicy->timeout = false;
 if (discard_type == DPOLICY_BG) {
-dpolicy->min_interval = DEF_MIN_DISCARD_ISSUE_TIME;
-dpolicy->mid_interval = DEF_MID_DISCARD_ISSUE_TIME;
-dpolicy->max_interval = DEF_MAX_DISCARD_ISSUE_TIME;
+dpolicy->min_interval = dcc->min_discard_issue_time;
+dpolicy->mid_interval = dcc->mid_discard_issue_time;
+dpolicy->max_interval = dcc->max_discard_issue_time;
 dpolicy->io_aware = true;
 dpolicy->sync = false;
 dpolicy->ordered = true;
@@ -1171,12 +1171,12 @@ static void __init_discard_policy(struct f2fs_sb_info *sbi,
 dpolicy->granularity = 1;
 if (atomic_read(&dcc->discard_cmd_cnt))
 dpolicy->max_interval =
-DEF_MIN_DISCARD_ISSUE_TIME;
+dcc->min_discard_issue_time;
 }
 } else if (discard_type == DPOLICY_FORCE) {
-dpolicy->min_interval = DEF_MIN_DISCARD_ISSUE_TIME;
-dpolicy->mid_interval = DEF_MID_DISCARD_ISSUE_TIME;
-dpolicy->max_interval = DEF_MAX_DISCARD_ISSUE_TIME;
+dpolicy->min_interval = dcc->min_discard_issue_time;
+dpolicy->mid_interval = dcc->mid_discard_issue_time;
+dpolicy->max_interval = dcc->max_discard_issue_time;
 dpolicy->io_aware = false;
 } else if (discard_type == DPOLICY_FSTRIM) {
 dpolicy->io_aware = false;
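The hunks above replace the compile-time `DEF_*` discard constants with per-filesystem tunables read from `struct discard_cmd_control`, which the new sysfs entries can adjust at runtime. A minimal userspace C paraphrase of the selection logic (struct layout and the `urgent` flag are simplifications; the `DEF_*` values are the kernel defaults as of this series):

```c
#include <assert.h>
#include <stdbool.h>

/* Kernel defaults (values as in fs/f2fs/segment.h at this point) */
#define DEF_MAX_DISCARD_REQUEST 8
#define DEF_MIN_DISCARD_ISSUE_TIME 50     /* ms */
#define DEF_MID_DISCARD_ISSUE_TIME 500    /* ms */
#define DEF_MAX_DISCARD_ISSUE_TIME 60000  /* ms */

struct dcc_tunables {
	unsigned max_discard_request;
	unsigned min_issue_time, mid_issue_time, max_issue_time;
};

struct dpolicy {
	unsigned max_requests;
	unsigned min_interval, mid_interval, max_interval;
	bool io_aware;
};

enum dtype { DPOLICY_BG, DPOLICY_FORCE, DPOLICY_FSTRIM };

/* Paraphrase of __init_discard_policy(): issue intervals now come from
 * the per-filesystem tunables, not compile-time constants. */
static void init_policy(struct dpolicy *p, enum dtype t,
			const struct dcc_tunables *dcc,
			bool urgent, int pending_cmds)
{
	p->max_requests = dcc->max_discard_request;
	p->io_aware = true;
	if (t == DPOLICY_BG) {
		p->min_interval = dcc->min_issue_time;
		p->mid_interval = dcc->mid_issue_time;
		p->max_interval = dcc->max_issue_time;
		if (urgent && pending_cmds)
			/* under urgency, reissue as fast as the min interval */
			p->max_interval = dcc->min_issue_time;
	} else if (t == DPOLICY_FORCE) {
		p->min_interval = dcc->min_issue_time;
		p->mid_interval = dcc->mid_issue_time;
		p->max_interval = dcc->max_issue_time;
		p->io_aware = false;
	} else if (t == DPOLICY_FSTRIM) {
		p->io_aware = false;
	}
}
```

The point of the change is visible here: tweaking the `dcc_tunables` fields (via sysfs in the real code) immediately changes how aggressively background and forced discards are issued.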
@@ -1781,7 +1781,7 @@ static int issue_discard_thread(void *data)
 struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
 wait_queue_head_t *q = &dcc->discard_wait_queue;
 struct discard_policy dpolicy;
-unsigned int wait_ms = DEF_MIN_DISCARD_ISSUE_TIME;
+unsigned int wait_ms = dcc->min_discard_issue_time;
 int issued;
 set_freezable();
@@ -2180,6 +2180,10 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi)
 atomic_set(&dcc->discard_cmd_cnt, 0);
 dcc->nr_discards = 0;
 dcc->max_discards = MAIN_SEGS(sbi) << sbi->log_blocks_per_seg;
+dcc->max_discard_request = DEF_MAX_DISCARD_REQUEST;
+dcc->min_discard_issue_time = DEF_MIN_DISCARD_ISSUE_TIME;
+dcc->mid_discard_issue_time = DEF_MID_DISCARD_ISSUE_TIME;
+dcc->max_discard_issue_time = DEF_MAX_DISCARD_ISSUE_TIME;
 dcc->undiscard_blks = 0;
 dcc->next_pos = 0;
 dcc->root = RB_ROOT_CACHED;
@@ -2821,7 +2825,7 @@ static void __f2fs_init_atgc_curseg(struct f2fs_sb_info *sbi)
 if (!sbi->am.atgc_enabled)
 return;
-down_read(&SM_I(sbi)->curseg_lock);
+f2fs_down_read(&SM_I(sbi)->curseg_lock);
 mutex_lock(&curseg->curseg_mutex);
 down_write(&SIT_I(sbi)->sentry_lock);
@@ -2831,7 +2835,7 @@ static void __f2fs_init_atgc_curseg(struct f2fs_sb_info *sbi)
 up_write(&SIT_I(sbi)->sentry_lock);
 mutex_unlock(&curseg->curseg_mutex);
-up_read(&SM_I(sbi)->curseg_lock);
+f2fs_up_read(&SM_I(sbi)->curseg_lock);
 }
 void f2fs_init_inmem_curseg(struct f2fs_sb_info *sbi)
@@ -2982,7 +2986,7 @@ void f2fs_allocate_segment_for_resize(struct f2fs_sb_info *sbi, int type,
 struct curseg_info *curseg = CURSEG_I(sbi, type);
 unsigned int segno;
-down_read(&SM_I(sbi)->curseg_lock);
+f2fs_down_read(&SM_I(sbi)->curseg_lock);
 mutex_lock(&curseg->curseg_mutex);
 down_write(&SIT_I(sbi)->sentry_lock);
@@ -3006,7 +3010,7 @@ unlock:
 type, segno, curseg->segno);
 mutex_unlock(&curseg->curseg_mutex);
-up_read(&SM_I(sbi)->curseg_lock);
+f2fs_up_read(&SM_I(sbi)->curseg_lock);
 }
 static void __allocate_new_segment(struct f2fs_sb_info *sbi, int type,
@@ -3038,23 +3042,23 @@ static void __allocate_new_section(struct f2fs_sb_info *sbi,
 void f2fs_allocate_new_section(struct f2fs_sb_info *sbi, int type, bool force)
 {
-down_read(&SM_I(sbi)->curseg_lock);
+f2fs_down_read(&SM_I(sbi)->curseg_lock);
 down_write(&SIT_I(sbi)->sentry_lock);
 __allocate_new_section(sbi, type, force);
 up_write(&SIT_I(sbi)->sentry_lock);
-up_read(&SM_I(sbi)->curseg_lock);
+f2fs_up_read(&SM_I(sbi)->curseg_lock);
 }
 void f2fs_allocate_new_segments(struct f2fs_sb_info *sbi)
 {
 int i;
-down_read(&SM_I(sbi)->curseg_lock);
+f2fs_down_read(&SM_I(sbi)->curseg_lock);
 down_write(&SIT_I(sbi)->sentry_lock);
 for (i = CURSEG_HOT_DATA; i <= CURSEG_COLD_DATA; i++)
 __allocate_new_segment(sbi, i, false, false);
 up_write(&SIT_I(sbi)->sentry_lock);
-up_read(&SM_I(sbi)->curseg_lock);
+f2fs_up_read(&SM_I(sbi)->curseg_lock);
 }
 static const struct segment_allocation default_salloc_ops = {
@@ -3192,9 +3196,9 @@ int f2fs_trim_fs(struct f2fs_sb_info *sbi, struct fstrim_range *range)
 if (sbi->discard_blks == 0)
 goto out;
-down_write(&sbi->gc_lock);
+f2fs_down_write(&sbi->gc_lock);
 err = f2fs_write_checkpoint(sbi, &cpc);
-up_write(&sbi->gc_lock);
+f2fs_up_write(&sbi->gc_lock);
 if (err)
 goto out;
@@ -3431,7 +3435,7 @@ void f2fs_allocate_data_block(struct f2fs_sb_info *sbi, struct page *page,
 bool from_gc = (type == CURSEG_ALL_DATA_ATGC);
 struct seg_entry *se = NULL;
-down_read(&SM_I(sbi)->curseg_lock);
+f2fs_down_read(&SM_I(sbi)->curseg_lock);
 mutex_lock(&curseg->curseg_mutex);
 down_write(&sit_i->sentry_lock);
@@ -3514,7 +3518,7 @@ void f2fs_allocate_data_block(struct f2fs_sb_info *sbi, struct page *page,
 mutex_unlock(&curseg->curseg_mutex);
-up_read(&SM_I(sbi)->curseg_lock);
+f2fs_up_read(&SM_I(sbi)->curseg_lock);
 }
 void f2fs_update_device_state(struct f2fs_sb_info *sbi, nid_t ino,
@@ -3550,7 +3554,7 @@ static void do_write_page(struct f2fs_summary *sum, struct f2fs_io_info *fio)
 bool keep_order = (f2fs_lfs_mode(fio->sbi) && type == CURSEG_COLD_DATA);
 if (keep_order)
-down_read(&fio->sbi->io_order_lock);
+f2fs_down_read(&fio->sbi->io_order_lock);
 reallocate:
 f2fs_allocate_data_block(fio->sbi, fio->page, fio->old_blkaddr,
 &fio->new_blkaddr, sum, type, fio);
@@ -3570,7 +3574,7 @@ reallocate:
 f2fs_update_device_state(fio->sbi, fio->ino, fio->new_blkaddr, 1);
 if (keep_order)
-up_read(&fio->sbi->io_order_lock);
+f2fs_up_read(&fio->sbi->io_order_lock);
 }
 void f2fs_do_write_meta_page(struct f2fs_sb_info *sbi, struct page *page,
@@ -3705,7 +3709,7 @@ void f2fs_do_replace_block(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
 se = get_seg_entry(sbi, segno);
 type = se->type;
-down_write(&SM_I(sbi)->curseg_lock);
+f2fs_down_write(&SM_I(sbi)->curseg_lock);
 if (!recover_curseg) {
 /* for recovery flow */
@@ -3774,7 +3778,7 @@ void f2fs_do_replace_block(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
 up_write(&sit_i->sentry_lock);
 mutex_unlock(&curseg->curseg_mutex);
-up_write(&SM_I(sbi)->curseg_lock);
+f2fs_up_write(&SM_I(sbi)->curseg_lock);
 }
 void f2fs_replace_block(struct f2fs_sb_info *sbi, struct dnode_of_data *dn,
@@ -4789,6 +4793,13 @@ static int sanity_check_curseg(struct f2fs_sb_info *sbi)
 sanity_check_seg_type(sbi, curseg->seg_type);
+if (curseg->alloc_type != LFS && curseg->alloc_type != SSR) {
+f2fs_err(sbi,
+"Current segment has invalid alloc_type:%d",
+curseg->alloc_type);
+return -EFSCORRUPTED;
+}
 if (f2fs_test_bit(blkofs, se->cur_valid_map))
 goto out;
@@ -5258,7 +5269,7 @@ int f2fs_build_segment_manager(struct f2fs_sb_info *sbi)
 INIT_LIST_HEAD(&sm_info->sit_entry_set);
-init_rwsem(&sm_info->curseg_lock);
+init_f2fs_rwsem(&sm_info->curseg_lock);
 if (!f2fs_readonly(sbi->sb)) {
 err = f2fs_create_flush_cmd_control(sbi);

--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -651,7 +651,9 @@ static inline int utilization(struct f2fs_sb_info *sbi)
 * pages over min_fsync_blocks. (=default option)
 * F2FS_IPU_ASYNC - do IPU given by asynchronous write requests.
 * F2FS_IPU_NOCACHE - disable IPU bio cache.
-* F2FS_IPUT_DISABLE - disable IPU. (=default option in LFS mode)
+* F2FS_IPU_HONOR_OPU_WRITE - use OPU write prior to IPU write if inode has
+* FI_OPU_WRITE flag.
+* F2FS_IPU_DISABLE - disable IPU. (=default option in LFS mode)
 */
 #define DEF_MIN_IPU_UTIL 70
 #define DEF_MIN_FSYNC_BLOCKS 8
@@ -667,6 +669,7 @@ enum {
 F2FS_IPU_FSYNC,
 F2FS_IPU_ASYNC,
 F2FS_IPU_NOCACHE,
+F2FS_IPU_HONOR_OPU_WRITE,
 };
 static inline unsigned int curseg_segno(struct f2fs_sb_info *sbi,

--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1355,16 +1355,16 @@ static struct inode *f2fs_alloc_inode(struct super_block *sb)
 /* Initialize f2fs-specific inode info */
 atomic_set(&fi->dirty_pages, 0);
 atomic_set(&fi->i_compr_blocks, 0);
-init_rwsem(&fi->i_sem);
+init_f2fs_rwsem(&fi->i_sem);
 spin_lock_init(&fi->i_size_lock);
 INIT_LIST_HEAD(&fi->dirty_list);
 INIT_LIST_HEAD(&fi->gdirty_list);
 INIT_LIST_HEAD(&fi->inmem_ilist);
 INIT_LIST_HEAD(&fi->inmem_pages);
 mutex_init(&fi->inmem_lock);
-init_rwsem(&fi->i_gc_rwsem[READ]);
-init_rwsem(&fi->i_gc_rwsem[WRITE]);
-init_rwsem(&fi->i_xattr_sem);
+init_f2fs_rwsem(&fi->i_gc_rwsem[READ]);
+init_f2fs_rwsem(&fi->i_gc_rwsem[WRITE]);
+init_f2fs_rwsem(&fi->i_xattr_sem);
 /* Will be used by directory only */
 fi->i_dir_level = F2FS_SB(sb)->dir_level;
@@ -1501,8 +1501,9 @@ static void f2fs_free_inode(struct inode *inode)
 static void destroy_percpu_info(struct f2fs_sb_info *sbi)
 {
-percpu_counter_destroy(&sbi->alloc_valid_block_count);
 percpu_counter_destroy(&sbi->total_valid_inode_count);
+percpu_counter_destroy(&sbi->rf_node_block_count);
+percpu_counter_destroy(&sbi->alloc_valid_block_count);
 }
 static void destroy_device_list(struct f2fs_sb_info *sbi)
@@ -1662,11 +1663,15 @@ static int f2fs_freeze(struct super_block *sb)
 /* ensure no checkpoint required */
 if (!llist_empty(&F2FS_SB(sb)->cprc_info.issue_list))
 return -EINVAL;
+/* to avoid deadlock on f2fs_evict_inode->SB_FREEZE_FS */
+set_sbi_flag(F2FS_SB(sb), SBI_IS_FREEZING);
 return 0;
 }
 static int f2fs_unfreeze(struct super_block *sb)
 {
+clear_sbi_flag(F2FS_SB(sb), SBI_IS_FREEZING);
 return 0;
 }
@@ -2075,6 +2080,7 @@ static int f2fs_disable_checkpoint(struct f2fs_sb_info *sbi)
 {
 unsigned int s_flags = sbi->sb->s_flags;
 struct cp_control cpc;
+unsigned int gc_mode;
 int err = 0;
 int ret;
 block_t unusable;
@@ -2087,8 +2093,11 @@ static int f2fs_disable_checkpoint(struct f2fs_sb_info *sbi)
 f2fs_update_time(sbi, DISABLE_TIME);
+gc_mode = sbi->gc_mode;
+sbi->gc_mode = GC_URGENT_HIGH;
 while (!f2fs_time_over(sbi, DISABLE_TIME)) {
-down_write(&sbi->gc_lock);
+f2fs_down_write(&sbi->gc_lock);
 err = f2fs_gc(sbi, true, false, false, NULL_SEGNO);
 if (err == -ENODATA) {
 err = 0;
@@ -2110,7 +2119,7 @@ static int f2fs_disable_checkpoint(struct f2fs_sb_info *sbi)
 goto restore_flag;
 }
-down_write(&sbi->gc_lock);
+f2fs_down_write(&sbi->gc_lock);
 cpc.reason = CP_PAUSE;
 set_sbi_flag(sbi, SBI_CP_DISABLED);
 err = f2fs_write_checkpoint(sbi, &cpc);
@@ -2122,8 +2131,9 @@ static int f2fs_disable_checkpoint(struct f2fs_sb_info *sbi)
 spin_unlock(&sbi->stat_lock);
 out_unlock:
-up_write(&sbi->gc_lock);
+f2fs_up_write(&sbi->gc_lock);
 restore_flag:
+sbi->gc_mode = gc_mode;
 sbi->sb->s_flags = s_flags; /* Restore SB_RDONLY status */
 return err;
 }
@@ -2142,12 +2152,12 @@ static void f2fs_enable_checkpoint(struct f2fs_sb_info *sbi)
 if (unlikely(retry < 0))
 f2fs_warn(sbi, "checkpoint=enable has some unwritten data.");
-down_write(&sbi->gc_lock);
+f2fs_down_write(&sbi->gc_lock);
 f2fs_dirty_to_prefree(sbi);
 clear_sbi_flag(sbi, SBI_CP_DISABLED);
 set_sbi_flag(sbi, SBI_IS_DIRTY);
-up_write(&sbi->gc_lock);
+f2fs_up_write(&sbi->gc_lock);
 f2fs_sync_fs(sbi->sb, 1);
 }
@@ -2688,7 +2698,7 @@ int f2fs_quota_sync(struct super_block *sb, int type)
 struct f2fs_sb_info *sbi = F2FS_SB(sb);
 struct quota_info *dqopt = sb_dqopt(sb);
 int cnt;
-int ret;
+int ret = 0;
 /*
 * Now when everything is written we can discard the pagecache so
@@ -2699,26 +2709,26 @@ int f2fs_quota_sync(struct super_block *sb, int type)
 if (type != -1 && cnt != type)
 continue;
-if (!sb_has_quota_active(sb, type))
-return 0;
+if (!sb_has_quota_active(sb, cnt))
+continue;
 inode_lock(dqopt->files[cnt]);
 /*
 * do_quotactl
 * f2fs_quota_sync
-* down_read(quota_sem)
+* f2fs_down_read(quota_sem)
 * dquot_writeback_dquots()
 * f2fs_dquot_commit
 * block_operation
-* down_read(quota_sem)
+* f2fs_down_read(quota_sem)
 */
 f2fs_lock_op(sbi);
-down_read(&sbi->quota_sem);
+f2fs_down_read(&sbi->quota_sem);
 ret = f2fs_quota_sync_file(sbi, cnt);
-up_read(&sbi->quota_sem);
+f2fs_up_read(&sbi->quota_sem);
 f2fs_unlock_op(sbi);
 inode_unlock(dqopt->files[cnt]);
@@ -2843,11 +2853,11 @@ static int f2fs_dquot_commit(struct dquot *dquot)
 struct f2fs_sb_info *sbi = F2FS_SB(dquot->dq_sb);
 int ret;
-down_read_nested(&sbi->quota_sem, SINGLE_DEPTH_NESTING);
+f2fs_down_read_nested(&sbi->quota_sem, SINGLE_DEPTH_NESTING);
 ret = dquot_commit(dquot);
 if (ret < 0)
 set_sbi_flag(sbi, SBI_QUOTA_NEED_REPAIR);
-up_read(&sbi->quota_sem);
+f2fs_up_read(&sbi->quota_sem);
 return ret;
 }
@@ -2856,11 +2866,11 @@ static int f2fs_dquot_acquire(struct dquot *dquot)
 struct f2fs_sb_info *sbi = F2FS_SB(dquot->dq_sb);
 int ret;
-down_read(&sbi->quota_sem);
+f2fs_down_read(&sbi->quota_sem);
 ret = dquot_acquire(dquot);
 if (ret < 0)
 set_sbi_flag(sbi, SBI_QUOTA_NEED_REPAIR);
-up_read(&sbi->quota_sem);
+f2fs_up_read(&sbi->quota_sem);
 return ret;
 }
@@ -3574,6 +3584,7 @@ static void init_sb_info(struct f2fs_sb_info *sbi)
 F2FS_NODE_INO(sbi) = le32_to_cpu(raw_super->node_ino);
 F2FS_META_INO(sbi) = le32_to_cpu(raw_super->meta_ino);
 sbi->cur_victim_sec = NULL_SECNO;
+sbi->gc_mode = GC_NORMAL;
 sbi->next_victim_seg[BG_GC] = NULL_SEGNO;
 sbi->next_victim_seg[FG_GC] = NULL_SEGNO;
 sbi->max_victim_search = DEF_MAX_VICTIM_SEARCH;
@@ -3601,14 +3612,14 @@ static void init_sb_info(struct f2fs_sb_info *sbi)
 INIT_LIST_HEAD(&sbi->s_list);
 mutex_init(&sbi->umount_mutex);
-init_rwsem(&sbi->io_order_lock);
+init_f2fs_rwsem(&sbi->io_order_lock);
 spin_lock_init(&sbi->cp_lock);
 sbi->dirty_device = 0;
 spin_lock_init(&sbi->dev_lock);
-init_rwsem(&sbi->sb_lock);
-init_rwsem(&sbi->pin_sem);
+init_f2fs_rwsem(&sbi->sb_lock);
+init_f2fs_rwsem(&sbi->pin_sem);
 }
 static int init_percpu_info(struct f2fs_sb_info *sbi)
@@ -3619,11 +3630,20 @@ static int init_percpu_info(struct f2fs_sb_info *sbi)
 if (err)
 return err;
+err = percpu_counter_init(&sbi->rf_node_block_count, 0, GFP_KERNEL);
+if (err)
+goto err_valid_block;
 err = percpu_counter_init(&sbi->total_valid_inode_count, 0,
 GFP_KERNEL);
 if (err)
-percpu_counter_destroy(&sbi->alloc_valid_block_count);
+goto err_node_block;
+return 0;
+err_node_block:
+percpu_counter_destroy(&sbi->rf_node_block_count);
+err_valid_block:
+percpu_counter_destroy(&sbi->alloc_valid_block_count);
 return err;
 }
@@ -3957,7 +3977,8 @@ static void f2fs_tuning_parameters(struct f2fs_sb_info *sbi)
 F2FS_OPTION(sbi).alloc_mode = ALLOC_MODE_REUSE;
 if (f2fs_block_unit_discard(sbi))
 sm_i->dcc_info->discard_granularity = 1;
-sm_i->ipu_policy = 1 << F2FS_IPU_FORCE;
+sm_i->ipu_policy = 1 << F2FS_IPU_FORCE |
+1 << F2FS_IPU_HONOR_OPU_WRITE;
 }
 sbi->readdir_ra = 1;
@@ -4067,11 +4088,11 @@ try_onemore:
 /* init f2fs-specific super block info */
 sbi->valid_super_block = valid_super_block;
-init_rwsem(&sbi->gc_lock);
+init_f2fs_rwsem(&sbi->gc_lock);
 mutex_init(&sbi->writepages);
-init_rwsem(&sbi->cp_global_sem);
-init_rwsem(&sbi->node_write);
-init_rwsem(&sbi->node_change);
+init_f2fs_rwsem(&sbi->cp_global_sem);
+init_f2fs_rwsem(&sbi->node_write);
+init_f2fs_rwsem(&sbi->node_change);
 /* disallow all the data/node/meta page writes */
 set_sbi_flag(sbi, SBI_POR_DOING);
@@ -4092,18 +4113,18 @@ try_onemore:
 }
 for (j = HOT; j < n; j++) {
-init_rwsem(&sbi->write_io[i][j].io_rwsem);
+init_f2fs_rwsem(&sbi->write_io[i][j].io_rwsem);
 sbi->write_io[i][j].sbi = sbi;
 sbi->write_io[i][j].bio = NULL;
 spin_lock_init(&sbi->write_io[i][j].io_lock);
 INIT_LIST_HEAD(&sbi->write_io[i][j].io_list);
 INIT_LIST_HEAD(&sbi->write_io[i][j].bio_list);
-init_rwsem(&sbi->write_io[i][j].bio_list_lock);
+init_f2fs_rwsem(&sbi->write_io[i][j].bio_list_lock);
 }
 }
-init_rwsem(&sbi->cp_rwsem);
-init_rwsem(&sbi->quota_sem);
+init_f2fs_rwsem(&sbi->cp_rwsem);
+init_f2fs_rwsem(&sbi->quota_sem);
 init_waitqueue_head(&sbi->cp_wait);
 init_sb_info(sbi);
@@ -4528,7 +4549,7 @@ static struct file_system_type f2fs_fs_type = {
 .name = "f2fs",
 .mount = f2fs_mount,
 .kill_sb = kill_f2fs_super,
-.fs_flags = FS_REQUIRES_DEV,
+.fs_flags = FS_REQUIRES_DEV | FS_ALLOW_IDMAP,
 };
 MODULE_ALIAS_FS("f2fs");

--- a/fs/f2fs/sysfs.c
+++ b/fs/f2fs/sysfs.c
@@ -41,6 +41,16 @@ enum {
 ATGC_INFO, /* struct atgc_management */
 };
+static const char *gc_mode_names[MAX_GC_MODE] = {
+"GC_NORMAL",
+"GC_IDLE_CB",
+"GC_IDLE_GREEDY",
+"GC_IDLE_AT",
+"GC_URGENT_HIGH",
+"GC_URGENT_LOW",
+"GC_URGENT_MID"
+};
 struct f2fs_attr {
 struct attribute attr;
 ssize_t (*show)(struct f2fs_attr *, struct f2fs_sb_info *, char *);
@@ -316,8 +326,13 @@ static ssize_t f2fs_sbi_show(struct f2fs_attr *a,
 return sysfs_emit(buf, "%u\n", sbi->compr_new_inode);
 #endif
+if (!strcmp(a->attr.name, "gc_urgent"))
+return sysfs_emit(buf, "%s\n",
+gc_mode_names[sbi->gc_mode]);
 if (!strcmp(a->attr.name, "gc_segment_mode"))
-return sysfs_emit(buf, "%u\n", sbi->gc_segment_mode);
+return sysfs_emit(buf, "%s\n",
+gc_mode_names[sbi->gc_segment_mode]);
 if (!strcmp(a->attr.name, "gc_reclaimed_segments")) {
 return sysfs_emit(buf, "%u\n",
@@ -363,7 +378,7 @@ static ssize_t __sbi_store(struct f2fs_attr *a,
 if (!strlen(name) || strlen(name) >= F2FS_EXTENSION_LEN)
 return -EINVAL;
-down_write(&sbi->sb_lock);
+f2fs_down_write(&sbi->sb_lock);
 ret = f2fs_update_extension_list(sbi, name, hot, set);
 if (ret)
@@ -373,7 +388,7 @@ static ssize_t __sbi_store(struct f2fs_attr *a,
 if (ret)
 f2fs_update_extension_list(sbi, name, hot, !set);
 out:
-up_write(&sbi->sb_lock);
+f2fs_up_write(&sbi->sb_lock);
 return ret ? ret : count;
 }
@@ -468,6 +483,13 @@ out:
 }
 } else if (t == 2) {
 sbi->gc_mode = GC_URGENT_LOW;
+} else if (t == 3) {
+sbi->gc_mode = GC_URGENT_MID;
+if (sbi->gc_thread) {
+sbi->gc_thread->gc_wake = 1;
+wake_up_interruptible_all(
+&sbi->gc_thread->gc_wait_queue_head);
+}
 } else {
 return -EINVAL;
 }
@@ -481,7 +503,7 @@ out:
 } else if (t == GC_IDLE_AT) {
 if (!sbi->am.atgc_enabled)
 return -EINVAL;
-sbi->gc_mode = GC_AT;
+sbi->gc_mode = GC_IDLE_AT;
 } else {
 sbi->gc_mode = GC_NORMAL;
 }
@@ -716,6 +738,10 @@ F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, gc_idle, gc_mode);
 F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, gc_urgent, gc_mode);
 F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, reclaim_segments, rec_prefree_segments);
 F2FS_RW_ATTR(DCC_INFO, discard_cmd_control, max_small_discards, max_discards);
+F2FS_RW_ATTR(DCC_INFO, discard_cmd_control, max_discard_request, max_discard_request);
+F2FS_RW_ATTR(DCC_INFO, discard_cmd_control, min_discard_issue_time, min_discard_issue_time);
+F2FS_RW_ATTR(DCC_INFO, discard_cmd_control, mid_discard_issue_time, mid_discard_issue_time);
+F2FS_RW_ATTR(DCC_INFO, discard_cmd_control, max_discard_issue_time, max_discard_issue_time);
 F2FS_RW_ATTR(DCC_INFO, discard_cmd_control, discard_granularity, discard_granularity);
 F2FS_RW_ATTR(RESERVED_BLOCKS, f2fs_sb_info, reserved_blocks, reserved_blocks);
 F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, batched_trim_sections, trim_sections);
@@ -728,6 +754,7 @@ F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, min_ssr_sections, min_ssr_sections);
 F2FS_RW_ATTR(NM_INFO, f2fs_nm_info, ram_thresh, ram_thresh);
 F2FS_RW_ATTR(NM_INFO, f2fs_nm_info, ra_nid_pages, ra_nid_pages);
 F2FS_RW_ATTR(NM_INFO, f2fs_nm_info, dirty_nats_ratio, dirty_nats_ratio);
+F2FS_RW_ATTR(NM_INFO, f2fs_nm_info, max_roll_forward_node_blocks, max_rf_node_blocks);
 F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, max_victim_search, max_victim_search);
 F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, migration_granularity, migration_granularity);
 F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, dir_level, dir_level);
@@ -832,6 +859,10 @@ static struct attribute *f2fs_attrs[] = {
 ATTR_LIST(reclaim_segments),
 ATTR_LIST(main_blkaddr),
 ATTR_LIST(max_small_discards),
+ATTR_LIST(max_discard_request),
+ATTR_LIST(min_discard_issue_time),
+ATTR_LIST(mid_discard_issue_time),
+ATTR_LIST(max_discard_issue_time),
 ATTR_LIST(discard_granularity),
 ATTR_LIST(pending_discard),
 ATTR_LIST(batched_trim_sections),
@@ -847,6 +878,7 @@ static struct attribute *f2fs_attrs[] = {
 ATTR_LIST(ram_thresh),
 ATTR_LIST(ra_nid_pages),
 ATTR_LIST(dirty_nats_ratio),
+ATTR_LIST(max_roll_forward_node_blocks),
 ATTR_LIST(cp_interval),
 ATTR_LIST(idle_interval),
 ATTR_LIST(discard_idle_interval),

--- a/fs/f2fs/verity.c
+++ b/fs/f2fs/verity.c
@@ -208,7 +208,7 @@ cleanup:
 * from re-instantiating cached pages we are truncating (since unlike
 * normal file accesses, garbage collection isn't limited by i_size).
 */
-down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
+f2fs_down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
 truncate_inode_pages(inode->i_mapping, inode->i_size);
 err2 = f2fs_truncate(inode);
 if (err2) {
@@ -216,7 +216,7 @@ cleanup:
 err2);
 set_sbi_flag(sbi, SBI_NEED_FSCK);
 }
-up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
+f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
 clear_inode_flag(inode, FI_VERITY_IN_PROGRESS);
 return err ?: err2;
 }

--- a/fs/f2fs/xattr.c
+++ b/fs/f2fs/xattr.c
@@ -525,10 +525,10 @@ int f2fs_getxattr(struct inode *inode, int index, const char *name,
 if (len > F2FS_NAME_LEN)
 return -ERANGE;
-down_read(&F2FS_I(inode)->i_xattr_sem);
+f2fs_down_read(&F2FS_I(inode)->i_xattr_sem);
 error = lookup_all_xattrs(inode, ipage, index, len, name,
 &entry, &base_addr, &base_size, &is_inline);
-up_read(&F2FS_I(inode)->i_xattr_sem);
+f2fs_up_read(&F2FS_I(inode)->i_xattr_sem);
 if (error)
 return error;
@@ -562,9 +562,9 @@ ssize_t f2fs_listxattr(struct dentry *dentry, char *buffer, size_t buffer_size)
 int error;
 size_t rest = buffer_size;
-down_read(&F2FS_I(inode)->i_xattr_sem);
+f2fs_down_read(&F2FS_I(inode)->i_xattr_sem);
 error = read_all_xattrs(inode, NULL, &base_addr);
-up_read(&F2FS_I(inode)->i_xattr_sem);
+f2fs_up_read(&F2FS_I(inode)->i_xattr_sem);
 if (error)
 return error;
@@ -786,9 +786,9 @@ int f2fs_setxattr(struct inode *inode, int index, const char *name,
 f2fs_balance_fs(sbi, true);
 f2fs_lock_op(sbi);
-down_write(&F2FS_I(inode)->i_xattr_sem);
+f2fs_down_write(&F2FS_I(inode)->i_xattr_sem);
 err = __f2fs_setxattr(inode, index, name, value, size, ipage, flags);
-up_write(&F2FS_I(inode)->i_xattr_sem);
+f2fs_up_write(&F2FS_I(inode)->i_xattr_sem);
 f2fs_unlock_op(sbi);
 f2fs_update_time(sbi, REQ_TIME);