IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
First: Ted, Jaegeuk, and I have decided to add me as a co-maintainer for
fscrypt, and we're now using a shared git tree. So we've updated
MAINTAINERS accordingly, and I'm doing the pull request this time.
The actual changes for v5.1 are:
- Remove the fs-specific kconfig options like CONFIG_EXT4_ENCRYPTION and
make fscrypt support for all fscrypt-capable filesystems be controlled
by CONFIG_FS_ENCRYPTION, similar to how CONFIG_QUOTA works.
- Improve error code for rename() and link() into encrypted directories.
- Various cleanups.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCXH2YDRQcZWJpZ2dlcnNA
Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK+SAAQCWYOTwYko8uE8Ze8i2fiUm0vr91NOg
zj5DGmK7Izxy/gEAsNDOVA7zWrDg/f5600/7aLpDQQTGHA38YVsgiyd7DgY=
=S3tT
-----END PGP SIGNATURE-----
Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt
Pull fscrypt updates from Eric Biggers:
"First: Ted, Jaegeuk, and I have decided to add me as a co-maintainer
for fscrypt, and we're now using a shared git tree. So we've updated
MAINTAINERS accordingly, and I'm doing the pull request this time.
The actual changes for v5.1 are:
- Remove the fs-specific kconfig options like CONFIG_EXT4_ENCRYPTION
and make fscrypt support for all fscrypt-capable filesystems be
controlled by CONFIG_FS_ENCRYPTION, similar to how CONFIG_QUOTA
works.
- Improve error code for rename() and link() into encrypted
directories.
- Various cleanups"
* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
MAINTAINERS: add Eric Biggers as an fscrypt maintainer
fscrypt: return -EXDEV for incompatible rename or link into encrypted dir
fscrypt: remove filesystem specific build config option
f2fs: use IS_ENCRYPTED() to check encryption status
ext4: use IS_ENCRYPTED() to check encryption status
fscrypt: remove CRYPTO_CTR dependency
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAlx63XIQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpp2vEACfrrQsap7R+Av28mmXpmXi2FPa3g5Tev1t
yYjK2qHvhlMZjPTYw3hCmbYdDDczlF7PEgSE2x2DjdcsYapb8Fy1lZ2X16c7ztBR
HD/t9b5AVSQsczZzKgv3RqsNtTnjzS5V0A8XH8FAP2QRgiwDMwSN6G0FP0JBLbE/
ZgxQrH1Iy1F33Wz4hI3Z7dEghKPZrH1IlegkZCEu47q9SlWS76qUetSy2GEtchOl
3Lgu54mQZyVdI5/QZf9DyMDLF6dIz3tYU2qhuo01AHjGRCC72v86p8sIiXcUr94Q
8pbegJhJ/g8KBol9Qhv3+pWG/QUAZwi/ZwasTkK+MJ4klRXfOrznxPubW1z6t9Vn
QRo39Po5SqqP0QWAscDxCFjESIQlWlKa+LZurJL7DJDCUGrSgzTpnVwFqKwc5zTP
HJa5MT2tEeL2TfUYRYCfh0ZV0elINdHA1y1klDBh38drh4EWr2gW8xdseGYXqRjh
fLgEpoF7VQ8kTvxKN+E4jZXkcZmoLmefp0ZyAbblS6IawpPVC7kXM9Fdn2OU8f2c
fjVjvSiqxfeN6dnpfeLDRbbN9894HwgP/LPropJOQ7KmjCorQq5zMDkAvoh3tElq
qwluRqdBJpWT/F05KweY+XVW8OawIycmUWqt6JrVNoIDAK31auHQv47kR0VA4OvE
DRVVhYpocw==
=VBaU
-----END PGP SIGNATURE-----
Merge tag 'for-5.1/block-20190302' of git://git.kernel.dk/linux-block
Pull block layer updates from Jens Axboe:
"Not a huge amount of changes in this round, the biggest one is that we
finally have Mings multi-page bvec support merged. Apart from that,
this pull request contains:
- Small series that avoids quiescing the queue for sysfs changes that
match what we currently have (Aleksei)
- Series of bcache fixes (via Coly)
- Series of lightnvm fixes (via Mathias)
- NVMe pull request from Christoph. Nothing major, just SPDX/license
cleanups, RR mp policy (Hannes), and little fixes (Bart,
Chaitanya).
- BFQ series (Paolo)
- Save blk-mq cpu -> hw queue mapping, removing a pointer indirection
for the fast path (Jianchao)
- fops->iopoll() added for async IO polling, this is a feature that
the upcoming io_uring interface will use (Christoph, me)
- Partition scan loop fixes (Dongli)
- mtip32xx conversion from managed resource API (Christoph)
- cdrom registration race fix (Guenter)
- MD pull from Song, two minor fixes.
- Various documentation fixes (Marcos)
- Multi-page bvec feature. This brings a lot of nice improvements
with it, like more efficient splitting, larger IOs can be supported
without growing the bvec table size, and so on. (Ming)
- Various little fixes to core and drivers"
* tag 'for-5.1/block-20190302' of git://git.kernel.dk/linux-block: (117 commits)
block: fix updating bio's front segment size
block: Replace function name in string with __func__
nbd: propagate genlmsg_reply return code
floppy: remove set but not used variable 'q'
null_blk: fix checking for REQ_FUA
block: fix NULL pointer dereference in register_disk
fs: fix guard_bio_eod to check for real EOD errors
blk-mq: use HCTX_TYPE_DEFAULT but not 0 to index blk_mq_tag_set->map
block: optimize bvec iteration in bvec_iter_advance
block: introduce mp_bvec_for_each_page() for iterating over page
block: optimize blk_bio_segment_split for single-page bvec
block: optimize __blk_segment_map_sg() for single-page bvec
block: introduce bvec_nth_page()
iomap: wire up the iopoll method
block: add bio_set_polled() helper
block: wire up block device iopoll method
fs: add an iopoll method to struct file_operations
loop: set GENHD_FL_NO_PART_SCAN after blkdev_reread_part()
loop: do not print warn message if partition scan is successful
block: bounce: make sure that bvec table is updated
...
This patch introduces one extra iterator variable to bio_for_each_segment_all(),
then we can allow bio_for_each_segment_all() to iterate over multi-page bvec.
Given it is just one mechannical & simple change on all bio_for_each_segment_all()
users, this patch does tree-wide change in one single patch, so that we can
avoid to use a temporary helper for this conversion.
Reviewed-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In order to have a common code base for fscrypt "post read" processing
for all filesystems which support encryption, this commit removes
filesystem specific build config option (e.g. CONFIG_EXT4_FS_ENCRYPTION)
and replaces it with a build option (i.e. CONFIG_FS_ENCRYPTION) whose
value affects all the filesystems making use of fscrypt.
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
This commit removes the f2fs specific f2fs_encrypted_inode() and makes
use of the generic IS_ENCRYPTED() macro to check for the encryption
status of an inode.
Acked-by: Chao Yu <yuchao0@huawei.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Cc: linux-f2fs-devel@lists.sourceforge.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In this round, we've focused on bug fixes since Pixel devices have been
shipping with f2fs. Some of them were related to hardware encryption support
which are actually not an issue in mainline, but would be better to merge
them in order to avoid potential bugs.
Enhancement:
- do GC sub-sections when the section is large
- add a flag in ioctl(SHUTDOWN) to trigger fsck for QA
- use kvmalloc() in order to give another chance to avoid ENOMEM
Bug fix:
- fix accessing memory boundaries in a malformed iamge
- GC gives stale unencrypted block
- GC counts in large sections
- detect idle time more precisely
- block allocation of DIO writes
- race conditions between write_begin and write_checkpoint
- allow GCs for node segments via ioctl()
There are various clean-ups and minor bug fixes as well.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAlwmkzcACgkQQBSofoJI
UNLIlw//RUd8rH8VMW/BcXQ3fwB8qFCk296cIVlWQu12BV7vmMKY886Sc+C9uAuR
9Lze4D6Iyg1k2KLkaWKhnE2E4ZWQ2O5zVf11r4amm1bN9glPJAKfLWbenJhMfeGa
IdWo1ksntxgjfqLim2QlGSv5cBUjEo8zf32D446BbQURQMD/mm/lndywItS5vTx1
2eGKJ3pSXX6geFzyf1pD/wmC/7j6roKaTKAIMBagIuINNukeHctT/gRfklmYmOFC
lRbXlsYhjIfGln6gktY6r4qyZbouOMad2fDl+dkp+UewQM4/2xLUaMqiN2eqd7Hr
cag9LGe80MGgWwesqS6VOJHQviOlNd78Ie//iO1+ZsaIxKrvgzocRSDdxZswv+Vq
reZNsEKH0s4/0ik9hiprW1NdQLvtg7e/UWPyrRvVRlDVitW/BXOWRBpGVgkdg01K
OmAxNrVcr2+Gxp8jZL7M+rQdpApL/ms31x9HE/8u5ABBZfmCy3zhpICTZ6Hdngp1
3iUi2PvKUg+Y7KgII1nK2HLDpmZZ5UkLm8NhxLI0ntBWExjQjOBasOxPJf4en/UR
TAtK3ozNKim3fQ01NCBy3B8WMKRtovxLK6PgqNPROJparlKJ3FEHnPIrYKglDUPg
sxWCm1gFSraRGP86TaeGeD+v+PP6erI9CTaf6F/U3pWFgIpHoPg=
=NEns
-----END PGP SIGNATURE-----
Merge tag 'f2fs-for-4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"In this round, we've focused on bug fixes since Pixel devices have
been shipping with f2fs. Some of them were related to hardware
encryption support which are actually not an issue in mainline, but
would be better to merge them in order to avoid potential bugs.
Enhancements:
- do GC sub-sections when the section is large
- add a flag in ioctl(SHUTDOWN) to trigger fsck for QA
- use kvmalloc() in order to give another chance to avoid ENOMEM
Bug fixes:
- fix accessing memory boundaries in a malformed iamge
- GC gives stale unencrypted block
- GC counts in large sections
- detect idle time more precisely
- block allocation of DIO writes
- race conditions between write_begin and write_checkpoint
- allow GCs for node segments via ioctl()
There are various clean-ups and minor bug fixes as well"
* tag 'f2fs-for-4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (43 commits)
f2fs: sanity check of xattr entry size
f2fs: fix use-after-free issue when accessing sbi->stat_info
f2fs: check PageWriteback flag for ordered case
f2fs: fix validation of the block count in sanity_check_raw_super
f2fs: fix missing unlock(sbi->gc_mutex)
f2fs: fix to dirty inode synchronously
f2fs: clean up structure extent_node
f2fs: fix block address for __check_sit_bitmap
f2fs: fix sbi->extent_list corruption issue
f2fs: clean up checkpoint flow
f2fs: flush stale issued discard candidates
f2fs: correct wrong spelling, issing_*
f2fs: use kvmalloc, if kmalloc is failed
f2fs: remove redundant comment of unused wio_mutex
f2fs: fix to reorder set_page_dirty and wait_on_page_writeback
f2fs: clear PG_writeback if IPU failed
f2fs: add an ioctl() to explicitly trigger fsck later
f2fs: avoid frequent costly fsck triggers
f2fs: fix m_may_create to make OPU DIO write correctly
f2fs: fix to update new block address correctly for OPU
...
All callers of migrate_page_move_mapping() now pass NULL for 'head'
argument. Drop it.
Link: http://lkml.kernel.org/r/20181211172143.7358-7-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There is a security report where f2fs_getxattr() has a hole to expose wrong
memory region when the image is malformed like this.
f2fs_getxattr: entry->e_name_len: 4, size: 12288, buffer_size: 16384, len: 4
Cc: <stable@vger.kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
iput() on sbi->node_inode can update sbi->stat_info
in the below context, if the f2fs_write_checkpoint()
has failed with error.
f2fs_balance_fs_bg+0x1ac/0x1ec
f2fs_write_node_pages+0x4c/0x260
do_writepages+0x80/0xbc
__writeback_single_inode+0xdc/0x4ac
writeback_single_inode+0x9c/0x144
write_inode_now+0xc4/0xec
iput+0x194/0x22c
f2fs_put_super+0x11c/0x1e8
generic_shutdown_super+0x70/0xf4
kill_block_super+0x2c/0x5c
kill_f2fs_super+0x44/0x50
deactivate_locked_super+0x60/0x8c
deactivate_super+0x68/0x74
cleanup_mnt+0x40/0x78
Fix this by moving f2fs_destroy_stats() further below iput() in
both f2fs_put_super() and f2fs_fill_super() paths.
Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
For all ordered cases in f2fs_wait_on_page_writeback(), we need to
check PageWriteback status, so let's clean up to relocate the check
into f2fs_wait_on_page_writeback().
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Treat "block_count" from struct f2fs_super_block as 64-bit little endian
value in sanity_check_raw_super() because struct f2fs_super_block
declares "block_count" as "__le64".
This fixes a bug where the superblock validation fails on big endian
devices with the following error:
F2FS-fs (sda1): Wrong segment_count / block_count (61439 > 0)
F2FS-fs (sda1): Can't find valid F2FS filesystem in 1th superblock
F2FS-fs (sda1): Wrong segment_count / block_count (61439 > 0)
F2FS-fs (sda1): Can't find valid F2FS filesystem in 2th superblock
As result of this the partition cannot be mounted.
With this patch applied the superblock validation works fine and the
partition can be mounted again:
F2FS-fs (sda1): Mounted with checkpoint version = 7c84
My little endian x86-64 hardware was able to mount the partition without
this fix.
To confirm that mounting f2fs filesystems works on big endian machines
again I tested this on a 32-bit MIPS big endian (lantiq) device.
Fixes: 0cfe75c5b01199 ("f2fs: enhance sanity_check_raw_super() to avoid potential overflows")
Cc: stable@vger.kernel.org
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
If user change inode's i_flags via ioctl, let's add it into global
dirty list, so that checkpoint can guarantee its persistence before
fsync, it can make checkpoint keeping strong consistency.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
The union in struct extent_node wass only to indicate below fields
struct rb_node rb_node;
union {
struct {
unsigned int fofs;
unsigned int len;
...
...
can be parsed as fields in struct rb_entry, but they were never be
used explicitly before, so let's remove them for cleanup.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Should use lstart (logical start address) instead of start (in dev) here.
This fixes a bug in multi-device scenarios.
Signed-off-by: Qiuyang Sun <sunqiuyang@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
When there is a failure in f2fs_fill_super() after/during
the recovery of fsync'd nodes, it frees the current sbi and
retries again. This time the mount is successful, but the files
that got recovered before retry, still holds the extent tree,
whose extent nodes list is corrupted since sbi and sbi->extent_list
is freed up. The list_del corruption issue is observed when the
file system is getting unmounted and when those recoverd files extent
node is being freed up in the below context.
list_del corruption. prev->next should be fffffff1e1ef5480, but was (null)
<...>
kernel BUG at kernel/msm-4.14/lib/list_debug.c:53!
lr : __list_del_entry_valid+0x94/0xb4
pc : __list_del_entry_valid+0x94/0xb4
<...>
Call trace:
__list_del_entry_valid+0x94/0xb4
__release_extent_node+0xb0/0x114
__free_extent_tree+0x58/0x7c
f2fs_shrink_extent_tree+0xdc/0x3b0
f2fs_leave_shrinker+0x28/0x7c
f2fs_put_super+0xfc/0x1e0
generic_shutdown_super+0x70/0xf4
kill_block_super+0x2c/0x5c
kill_f2fs_super+0x44/0x50
deactivate_locked_super+0x60/0x8c
deactivate_super+0x68/0x74
cleanup_mnt+0x40/0x78
__cleanup_mnt+0x1c/0x28
task_work_run+0x48/0xd0
do_notify_resume+0x678/0xe98
work_pending+0x8/0x14
Fix this by not creating extents for those recovered files if shrinker is
not registered yet. Once mount is successful and shrinker is registered,
those files can have extents again.
Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch cleans up checkpoint flow a bit:
- remove unneeded circulation of flushing meta pages.
- don't flush nat_bits pages in prior to other checkpoint pages.
- add bug_on to check remained meta pages after flushing.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Sometimes, I could observe # of issuing_discard to be 1 which blocks background
jobs due to is_idle()=false.
The only way to get out of it was to trigger gc_urgent. This patch avoids that
by checking any candidates as done in the list.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
One report says memalloc failure during mount.
(unwind_backtrace) from [<c010cd4c>] (show_stack+0x10/0x14)
(show_stack) from [<c049c6b8>] (dump_stack+0x8c/0xa0)
(dump_stack) from [<c024fcf0>] (warn_alloc+0xc4/0x160)
(warn_alloc) from [<c0250218>] (__alloc_pages_nodemask+0x3f4/0x10d0)
(__alloc_pages_nodemask) from [<c0270450>] (kmalloc_order_trace+0x2c/0x120)
(kmalloc_order_trace) from [<c03fa748>] (build_node_manager+0x35c/0x688)
(build_node_manager) from [<c03de494>] (f2fs_fill_super+0xf0c/0x16cc)
(f2fs_fill_super) from [<c02a5864>] (mount_bdev+0x15c/0x188)
(mount_bdev) from [<c03da624>] (f2fs_mount+0x18/0x20)
(f2fs_mount) from [<c02a68b8>] (mount_fs+0x158/0x19c)
(mount_fs) from [<c02c3c9c>] (vfs_kern_mount+0x78/0x134)
(vfs_kern_mount) from [<c02c76ac>] (do_mount+0x474/0xca4)
(do_mount) from [<c02c8264>] (SyS_mount+0x94/0xbc)
(SyS_mount) from [<c0108180>] (ret_fast_syscall+0x0/0x48)
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Commit 089842de ("f2fs: remove codes of unused wio_mutex") removes codes
of unused wio_mutex, but missing the comment, so delete it.
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch reorders flow from
- update page
- set_page_dirty
- wait_on_page_writeback
to
- wait_on_page_writeback
- update page
- set_page_dirty
The reason is:
- set_page_dirty will increase reference of dirty page, the reference
should be cleared before wait_on_page_writeback to keep its consistency.
- some devices need stable page during page writebacking, so we
should not change page's data.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
If IPU failed, nothing is commited, we should end page writeback.
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This adds an option in ioctl(F2FS_IOC_SHUTDOWN) in order to trigger fsck by
setting a NEED_FSCK flag.
Generally, shutdown is used for the test to validate filesystem consistency, and
setting NEED_FSCK flag can be used for Android to trigger fsck.f2fs at boot time
explicitly so that we could measure the elapsed time as well as force filesystem
check.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
If we want to re-enable nat_bits, we rely on fsck which requires full scan
of directory tree. Let's do that by regular fsck or unclean shutdown.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Previously, we added a parameter @map.m_may_create to trigger OPU
allocation and call f2fs_balance_fs() correctly.
But in get_more_blocks(), @create has been overwritten by below code.
So the function f2fs_map_blocks() will not allocate new block address
but directly go out. Meanwile,there are several functions calling
f2fs_map_blocks() directly and @map.m_may_create not initialized.
CODE:
create = dio->op == REQ_OP_WRITE;
if (dio->flags & DIO_SKIP_HOLES) {
if (fs_startblk <= ((i_size_read(dio->inode) - 1) >>
i_blkbits))
create = 0;
}
This patch fixes it.
Signed-off-by: Jia Zhu <zhujia13@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Previously, we allocated a new block address for OPU mode in direct_IO.
But the new address couldn't be assigned to @map->m_pblk correctly.
This patch fix it.
Cc: <stable@vger.kernel.org>
Fixes: 511f52d02f05 ("f2fs: allow out-place-update for direct IO in LFS mode")
Signed-off-by: Jia Zhu <zhujia13@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Adjust the trace print in f2fs_get_victim() to cover GC done by
F2FS_IOC_GARBAGE_COLLECT_RANGE.
Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Allow node type segments also to be GC'd via f2fs ioctl
F2FS_IOC_GARBAGE_COLLECT_RANGE.
Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
The function truncate_node frees the page with f2fs_put_page. However,
the page index is read after that. So, the patch reads the index before
freeing the page.
Fixes: bf39c00a9a7f ("f2fs: drop obsolete node page when it is truncated")
Cc: <stable@vger.kernel.org>
Signed-off-by: Pan Bian <bianpan2016@163.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
When call f2fs_acl_create_masq() failed, the caller f2fs_acl_create()
should return -EIO instead of -ENOMEM, this patch makes it consistent
with posix_acl_create() which has been fixed in commit beaf226b863a
("posix_acl: don't ignore return value of posix_acl_create_masq()").
Fixes: 83dfe53c185e ("f2fs: fix reference leaks in f2fs_acl_create")
Signed-off-by: Tiezhu Yang <kernelpatch@126.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
After merging the f2fs tree, today's linux-next build
(x86_64_allmodconfig) produced this warning:
In file included from fs/f2fs/dir.c:11:
fs/f2fs/f2fs.h: In function '__mark_inode_dirty_flag':
fs/f2fs/f2fs.h:2388:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
if (set)
^
fs/f2fs/f2fs.h:2390:2: note: here
case FI_DATA_EXIST:
^~~~
Exposed by my use of -Wimplicit-fallthrough
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
The following race could lead to inconsistent SIT bitmap:
Task A Task B
====== ======
f2fs_write_checkpoint
block_operations
f2fs_lock_all
down_write(node_change)
down_write(node_write)
... sync ...
up_write(node_change)
f2fs_file_write_iter
set_inode_flag(FI_NO_PREALLOC)
......
f2fs_write_begin(index=0, has inline data)
prepare_write_begin
__do_map_lock(AIO) => down_read(node_change)
f2fs_convert_inline_page => update SIT
__do_map_lock(AIO) => up_read(node_change)
f2fs_flush_sit_entries <= inconsistent SIT
finish write checkpoint
sudden-power-off
If SPO occurs after checkpoint is finished, SIT bitmap will be set
incorrectly.
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
If namelen is corrupted to have very long value, fill_dentries can copy
wrong memory area.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Previously, when f2fs finds which temp bio cache owns the target page,
it will flush all the three temp bio caches, but we only need to flush
one single bio cache indeed, which can help to keep bio merged.
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
In get_more_blocks(), we may override @create as below code:
create = dio->op == REQ_OP_WRITE;
if (dio->flags & DIO_SKIP_HOLES) {
if (fs_startblk <= ((i_size_read(dio->inode) - 1) >>
i_blkbits))
create = 0;
}
But in f2fs_map_blocks(), we only trigger f2fs_balance_fs() if @create
is 1, so in LFS mode, dio overwrite under LFS mode can easily run out
of free segments, result in below panic.
Call Trace:
allocate_segment_by_default+0xa8/0x270 [f2fs]
f2fs_allocate_data_block+0x1ea/0x5c0 [f2fs]
__allocate_data_block+0x306/0x480 [f2fs]
f2fs_map_blocks+0x6f6/0x920 [f2fs]
__get_data_block+0x4f/0xb0 [f2fs]
get_data_block_dio_write+0x50/0x60 [f2fs]
do_blockdev_direct_IO+0xcd5/0x21e0
__blockdev_direct_IO+0x3a/0x3c
f2fs_direct_IO+0x1ff/0x4a0 [f2fs]
generic_file_direct_write+0xd9/0x160
__generic_file_write_iter+0xbb/0x1e0
f2fs_file_write_iter+0xaf/0x220 [f2fs]
__vfs_write+0xd0/0x130
vfs_write+0xb2/0x1b0
SyS_pwrite64+0x69/0xa0
? vtime_user_exit+0x29/0x70
do_syscall_64+0x6e/0x160
entry_SYSCALL64_slow_path+0x25/0x25
RIP: new_curseg+0x36f/0x380 [f2fs] RSP: ffffac570393f7a8
So this patch introduces a parameter map.m_may_create to indicate that
f2fs_map_blocks() is called from write or read path, which can give the
right hint to let f2fs_map_blocks() trigger OPU allocation and call
f2fs_balanc_fs() correctly.
BTW, it disables physical address preallocation for direct IO in
f2fs_preallocate_blocks, which is redundant to OPU allocation of
f2fs_map_blocks.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch adds f2fs_dio_submit_bio() to hook submit_io/end_io functions
in direct IO path, in order to account DIO.
Later, we will add this count into is_idle() to let background GC/Discard
thread be aware of DIO.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch move dir data flush to write checkpoint process, by
doing this, it may reduce some time for dir fsync.
pre:
-f2fs_do_sync_file enter
-file_write_and_wait_range <- flush & wait
-write_checkpoint
-do_checkpoint <- wait all
-f2fs_do_sync_file exit
now:
-f2fs_do_sync_file enter
-write_checkpoint
-block_operations <- flush dir & no wait
-do_checkpoint <- wait all
-f2fs_do_sync_file exit
Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code.
Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
f2fs_ioc_gc_range skips blocks_per_seg each time, however, f2fs_gc moves
blocks of section each time, so fix it from segment to section.
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Add one sysfs entry to control migration granularity of GC in large
section f2fs, it can be tuned to mitigate heavy overhead of migrating
huge number of blocks in large section.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Section is minimal garbage collection unit of f2fs, in zoned block
device, or ancient block mapping flash device, in order to improve
GC efficiency, we can align GC unit to lower device erase unit,
normally, it consists of multiple of segments.
Once background or foreground GC triggers, it brings a large number
of IOs, which will impact user IO, and also occupy cpu/memory resource
intensively.
So, to reduce impact of GC on large size section, this patch supports
subsectional GC, in one cycle of GC, it only migrate partial segment{s}
in victim section. Currently, by default, we use sbi->segs_per_sec as
migration granularity.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Introduce a wrapper __is_large_section() to clean up codes.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
In F2FS_HAS_FEATURE(), we will use F2FS_SB(sb) to get sbi pointer to
access .raw_super field, to avoid unneeded pointer conversion, this
patch changes to F2FS_HAS_FEATURE() accept sbi parameter directly.
Just do cleanup, no logic change.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
When sbi->segs_per_sec > 1, and if some segno has 0 valid blocks before
gc starts, do_garbage_collect will skip counting seg_freed++, and this
will cause seg_freed < sbi->segs_per_sec and finally skip sec_freed++.
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Previously, we only account preflush command for flush_merge mode,
so for noflush_merge mode, we can not know in-flight preflush
command count, fix it.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>