Chao Yu
bb3af019e1
f2fs: fix memory leak of percpu counter in fill_super()
...
[ Upstream commit 4a70e255449c9a13eed7a6eeecc85a1ea63cef76 ]
In fill_super -> init_percpu_info, we should destroy percpu counter
in error path, otherwise memory allcoated for percpu counter will
leak.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-25 09:52:13 +01:00
Surbhi Palande
002d268018
f2fs: check all the data segments against all node ones
...
[ Upstream commit 1166c1f2f69117ad254189ca781287afa6e550b6 ]
As a part of the sanity checking while mounting, distinct segment number
assignment to data and node segments is verified. Fixing a small bug in
this verification between node and data segments. We need to check all
the data segments with all the node segments.
Fixes: 042be0f849e5f ("f2fs: fix to do sanity check with current segment number")
Signed-off-by: Surbhi Palande <csurbhi@gmail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05 12:30:10 +02:00
Chao Yu
d1a623e772
f2fs: fix to do sanity check with current segment number
...
[ Upstream commit 042be0f849e5fc24116d0afecfaf926eed5cac63 ]
https://bugzilla.kernel.org/show_bug.cgi?id=200219
Reproduction way:
- mount image
- run poc code
- umount image
F2FS-fs (loop1): Bitmap was wrongly set, blk:15364
------------[ cut here ]------------
kernel BUG at /home/yuchao/git/devf2fs/segment.c:2061!
invalid opcode: 0000 [#1 ] PREEMPT SMP
CPU: 2 PID: 17686 Comm: umount Tainted: G W O 4.18.0-rc2+ #39
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
EIP: update_sit_entry+0x459/0x4e0 [f2fs]
Code: e8 1c b5 fd ff 0f 0b 0f 0b 8b 45 e4 c7 44 24 08 9c 7a 6c f8 c7 44 24 04 bc 4a 6c f8 89 44 24 0c 8b 06 89 04 24 e8 f7 b4 fd ff <0f> 0b 8b 45 e4 0f b6 d2 89 54 24 10 c7 44 24 08 60 7a 6c f8 c7 44
EAX: 00000032 EBX: 000000f8 ECX: 00000002 EDX: 00000001
ESI: d7177000 EDI: f520fe68 EBP: d6477c6c ESP: d6477c34
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010282
CR0: 80050033 CR2: b7fbe000 CR3: 2a99b3c0 CR4: 000406f0
Call Trace:
f2fs_allocate_data_block+0x124/0x580 [f2fs]
do_write_page+0x78/0x150 [f2fs]
f2fs_do_write_node_page+0x25/0xa0 [f2fs]
__write_node_page+0x2bf/0x550 [f2fs]
f2fs_sync_node_pages+0x60e/0x6d0 [f2fs]
? sync_inode_metadata+0x2f/0x40
? f2fs_write_checkpoint+0x28f/0x7d0 [f2fs]
? up_write+0x1e/0x80
f2fs_write_checkpoint+0x2a9/0x7d0 [f2fs]
? mark_held_locks+0x5d/0x80
? _raw_spin_unlock_irq+0x27/0x50
kill_f2fs_super+0x68/0x90 [f2fs]
deactivate_locked_super+0x3d/0x70
deactivate_super+0x40/0x60
cleanup_mnt+0x39/0x70
__cleanup_mnt+0x10/0x20
task_work_run+0x81/0xa0
exit_to_usermode_loop+0x59/0xa7
do_fast_syscall_32+0x1f5/0x22c
entry_SYSENTER_32+0x53/0x86
EIP: 0xb7f95c51
Code: c1 1e f7 ff ff 89 e5 8b 55 08 85 d2 8b 81 64 cd ff ff 74 02 89 02 5d c3 8b 0c 24 c3 8b 1c 24 c3 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
EAX: 00000000 EBX: 0871ab90 ECX: bfb2cd00 EDX: 00000000
ESI: 00000000 EDI: 0871ab90 EBP: 0871ab90 ESP: bfb2cd7c
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000246
Modules linked in: f2fs(O) crc32_generic bnep rfcomm bluetooth ecdh_generic snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq pcbc joydev aesni_intel snd_seq_device aes_i586 snd_timer crypto_simd snd cryptd soundcore mac_hid serio_raw video i2c_piix4 parport_pc ppdev lp parport hid_generic psmouse usbhid hid e1000 [last unloaded: f2fs]
---[ end trace d423f83982cfcdc5 ]---
The reason is, different log headers using the same segment, once
one log's next block address is used by another log, it will cause
panic as above.
Main area: 24 segs, 24 secs 24 zones
- COLD data: 0, 0, 0
- WARM data: 1, 1, 1
- HOT data: 20, 20, 20
- Dir dnode: 22, 22, 22
- File dnode: 22, 22, 22
- Indir nodes: 21, 21, 21
So this patch adds sanity check to detect such condition to avoid
this issue.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-20 09:07:51 +02:00
Martin Blumenstingl
362e8cbc7c
f2fs: fix validation of the block count in sanity_check_raw_super
...
commit 88960068f25fcc3759455d85460234dcc9d43fef upstream.
Treat "block_count" from struct f2fs_super_block as 64-bit little endian
value in sanity_check_raw_super() because struct f2fs_super_block
declares "block_count" as "__le64".
This fixes a bug where the superblock validation fails on big endian
devices with the following error:
F2FS-fs (sda1): Wrong segment_count / block_count (61439 > 0)
F2FS-fs (sda1): Can't find valid F2FS filesystem in 1th superblock
F2FS-fs (sda1): Wrong segment_count / block_count (61439 > 0)
F2FS-fs (sda1): Can't find valid F2FS filesystem in 2th superblock
As result of this the partition cannot be mounted.
With this patch applied the superblock validation works fine and the
partition can be mounted again:
F2FS-fs (sda1): Mounted with checkpoint version = 7c84
My little endian x86-64 hardware was able to mount the partition without
this fix.
To confirm that mounting f2fs filesystems works on big endian machines
again I tested this on a 32-bit MIPS big endian (lantiq) device.
Fixes: 0cfe75c5b01199 ("f2fs: enhance sanity_check_raw_super() to avoid potential overflows")
Cc: stable@vger.kernel.org
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09 16:16:44 +01:00
Chao Yu
91fe514bed
f2fs: fix to do sanity check with cp_pack_start_sum
...
commit e494c2f995d6181d6e29c4927d68e0f295ecf75b upstream.
After fuzzing, cp_pack_start_sum could be corrupted, so current log's
summary info should be wrong due to loading incorrect summary block.
Then, if segment's type in current log is exceeded NR_CURSEG_TYPE, it
can lead accessing invalid dirty_i->dirty_segmap bitmap finally.
Add sanity check for cp_pack_start_sum to fix this issue.
https://bugzilla.kernel.org/show_bug.cgi?id=200419
- Reproduce
- Kernel message (f2fs-dev w/ KASAN)
[ 3117.578432] F2FS-fs (loop0): Invalid log blocks per segment (8)
[ 3117.578445] F2FS-fs (loop0): Can't find valid F2FS filesystem in 2th superblock
[ 3117.581364] F2FS-fs (loop0): invalid crc_offset: 30716
[ 3117.583564] WARNING: CPU: 1 PID: 1225 at fs/f2fs/checkpoint.c:90 __get_meta_page+0x448/0x4b0
[ 3117.583570] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer joydev input_leds serio_raw snd soundcore mac_hid i2c_piix4 ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi btrfs zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear 8139too qxl ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel psmouse aes_x86_64 8139cp crypto_simd cryptd mii glue_helper pata_acpi floppy
[ 3117.584014] CPU: 1 PID: 1225 Comm: mount Not tainted 4.17.0+ #1
[ 3117.584017] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 3117.584022] RIP: 0010:__get_meta_page+0x448/0x4b0
[ 3117.584023] Code: 00 49 8d bc 24 84 00 00 00 e8 74 54 da ff 41 83 8c 24 84 00 00 00 08 4c 89 f6 4c 89 ef e8 c0 d9 95 00 48 89 ef e8 18 e3 00 00 <0f> 0b f0 80 4d 48 04 e9 0f fe ff ff 0f 0b 48 89 c7 48 89 04 24 e8
[ 3117.584072] RSP: 0018:ffff88018eb678c0 EFLAGS: 00010286
[ 3117.584082] RAX: ffff88018f0a6a78 RBX: ffffea0007a46600 RCX: ffffffff9314d1b2
[ 3117.584085] RDX: ffffffff00000001 RSI: 0000000000000000 RDI: ffff88018f0a6a98
[ 3117.584087] RBP: ffff88018ebe9980 R08: 0000000000000002 R09: 0000000000000001
[ 3117.584090] R10: 0000000000000001 R11: ffffed00326e4450 R12: ffff880193722200
[ 3117.584092] R13: ffff88018ebe9afc R14: 0000000000000206 R15: ffff88018eb67900
[ 3117.584096] FS: 00007f5694636840(0000) GS:ffff8801f3b00000(0000) knlGS:0000000000000000
[ 3117.584098] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3117.584101] CR2: 00000000016f21b8 CR3: 0000000191c22000 CR4: 00000000000006e0
[ 3117.584112] Call Trace:
[ 3117.584121] ? f2fs_set_meta_page_dirty+0x150/0x150
[ 3117.584127] ? f2fs_build_segment_manager+0xbf9/0x3190
[ 3117.584133] ? f2fs_npages_for_summary_flush+0x75/0x120
[ 3117.584145] f2fs_build_segment_manager+0xda8/0x3190
[ 3117.584151] ? f2fs_get_valid_checkpoint+0x298/0xa00
[ 3117.584156] ? f2fs_flush_sit_entries+0x10e0/0x10e0
[ 3117.584184] ? map_id_range_down+0x17c/0x1b0
[ 3117.584188] ? __put_user_ns+0x30/0x30
[ 3117.584206] ? find_next_bit+0x53/0x90
[ 3117.584237] ? cpumask_next+0x16/0x20
[ 3117.584249] f2fs_fill_super+0x1948/0x2b40
[ 3117.584258] ? f2fs_commit_super+0x1a0/0x1a0
[ 3117.584279] ? sget_userns+0x65e/0x690
[ 3117.584296] ? set_blocksize+0x88/0x130
[ 3117.584302] ? f2fs_commit_super+0x1a0/0x1a0
[ 3117.584305] mount_bdev+0x1c0/0x200
[ 3117.584310] mount_fs+0x5c/0x190
[ 3117.584320] vfs_kern_mount+0x64/0x190
[ 3117.584330] do_mount+0x2e4/0x1450
[ 3117.584343] ? lockref_put_return+0x130/0x130
[ 3117.584347] ? copy_mount_string+0x20/0x20
[ 3117.584357] ? kasan_unpoison_shadow+0x31/0x40
[ 3117.584362] ? kasan_kmalloc+0xa6/0xd0
[ 3117.584373] ? memcg_kmem_put_cache+0x16/0x90
[ 3117.584377] ? __kmalloc_track_caller+0x196/0x210
[ 3117.584383] ? _copy_from_user+0x61/0x90
[ 3117.584396] ? memdup_user+0x3e/0x60
[ 3117.584401] ksys_mount+0x7e/0xd0
[ 3117.584405] __x64_sys_mount+0x62/0x70
[ 3117.584427] do_syscall_64+0x73/0x160
[ 3117.584440] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 3117.584455] RIP: 0033:0x7f5693f14b9a
[ 3117.584456] Code: 48 8b 0d 01 c3 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ce c2 2b 00 f7 d8 64 89 01 48
[ 3117.584505] RSP: 002b:00007fff27346488 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
[ 3117.584510] RAX: ffffffffffffffda RBX: 00000000016e2030 RCX: 00007f5693f14b9a
[ 3117.584512] RDX: 00000000016e2210 RSI: 00000000016e3f30 RDI: 00000000016ee040
[ 3117.584514] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013
[ 3117.584516] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 00000000016ee040
[ 3117.584519] R13: 00000000016e2210 R14: 0000000000000000 R15: 0000000000000003
[ 3117.584523] ---[ end trace a8e0d899985faf31 ]---
[ 3117.685663] F2FS-fs (loop0): f2fs_check_nid_range: out-of-range nid=2, run fsck to fix.
[ 3117.685673] F2FS-fs (loop0): recover_data: ino = 2 (i_size: recover) recovered = 1, err = 0
[ 3117.685707] ==================================================================
[ 3117.685955] BUG: KASAN: slab-out-of-bounds in __remove_dirty_segment+0xdd/0x1e0
[ 3117.686175] Read of size 8 at addr ffff88018f0a63d0 by task mount/1225
[ 3117.686477] CPU: 0 PID: 1225 Comm: mount Tainted: G W 4.17.0+ #1
[ 3117.686481] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 3117.686483] Call Trace:
[ 3117.686494] dump_stack+0x71/0xab
[ 3117.686512] print_address_description+0x6b/0x290
[ 3117.686517] kasan_report+0x28e/0x390
[ 3117.686522] ? __remove_dirty_segment+0xdd/0x1e0
[ 3117.686527] __remove_dirty_segment+0xdd/0x1e0
[ 3117.686532] locate_dirty_segment+0x189/0x190
[ 3117.686538] f2fs_allocate_new_segments+0xa9/0xe0
[ 3117.686543] recover_data+0x703/0x2c20
[ 3117.686547] ? f2fs_recover_fsync_data+0x48f/0xd50
[ 3117.686553] ? ksys_mount+0x7e/0xd0
[ 3117.686564] ? policy_nodemask+0x1a/0x90
[ 3117.686567] ? policy_node+0x56/0x70
[ 3117.686571] ? add_fsync_inode+0xf0/0xf0
[ 3117.686592] ? blk_finish_plug+0x44/0x60
[ 3117.686597] ? f2fs_ra_meta_pages+0x38b/0x5e0
[ 3117.686602] ? find_inode_fast+0xac/0xc0
[ 3117.686606] ? f2fs_is_valid_blkaddr+0x320/0x320
[ 3117.686618] ? __radix_tree_lookup+0x150/0x150
[ 3117.686633] ? dqget+0x670/0x670
[ 3117.686648] ? pagecache_get_page+0x29/0x410
[ 3117.686656] ? kmem_cache_alloc+0x176/0x1e0
[ 3117.686660] ? f2fs_is_valid_blkaddr+0x11d/0x320
[ 3117.686664] f2fs_recover_fsync_data+0xc23/0xd50
[ 3117.686670] ? f2fs_space_for_roll_forward+0x60/0x60
[ 3117.686674] ? rb_insert_color+0x323/0x3d0
[ 3117.686678] ? f2fs_recover_orphan_inodes+0xa5/0x700
[ 3117.686683] ? proc_register+0x153/0x1d0
[ 3117.686686] ? f2fs_remove_orphan_inode+0x10/0x10
[ 3117.686695] ? f2fs_attr_store+0x50/0x50
[ 3117.686700] ? proc_create_single_data+0x52/0x60
[ 3117.686707] f2fs_fill_super+0x1d06/0x2b40
[ 3117.686728] ? f2fs_commit_super+0x1a0/0x1a0
[ 3117.686735] ? sget_userns+0x65e/0x690
[ 3117.686740] ? set_blocksize+0x88/0x130
[ 3117.686745] ? f2fs_commit_super+0x1a0/0x1a0
[ 3117.686748] mount_bdev+0x1c0/0x200
[ 3117.686753] mount_fs+0x5c/0x190
[ 3117.686758] vfs_kern_mount+0x64/0x190
[ 3117.686762] do_mount+0x2e4/0x1450
[ 3117.686769] ? lockref_put_return+0x130/0x130
[ 3117.686773] ? copy_mount_string+0x20/0x20
[ 3117.686777] ? kasan_unpoison_shadow+0x31/0x40
[ 3117.686780] ? kasan_kmalloc+0xa6/0xd0
[ 3117.686786] ? memcg_kmem_put_cache+0x16/0x90
[ 3117.686790] ? __kmalloc_track_caller+0x196/0x210
[ 3117.686795] ? _copy_from_user+0x61/0x90
[ 3117.686801] ? memdup_user+0x3e/0x60
[ 3117.686804] ksys_mount+0x7e/0xd0
[ 3117.686809] __x64_sys_mount+0x62/0x70
[ 3117.686816] do_syscall_64+0x73/0x160
[ 3117.686824] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 3117.686829] RIP: 0033:0x7f5693f14b9a
[ 3117.686830] Code: 48 8b 0d 01 c3 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ce c2 2b 00 f7 d8 64 89 01 48
[ 3117.686887] RSP: 002b:00007fff27346488 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
[ 3117.686892] RAX: ffffffffffffffda RBX: 00000000016e2030 RCX: 00007f5693f14b9a
[ 3117.686894] RDX: 00000000016e2210 RSI: 00000000016e3f30 RDI: 00000000016ee040
[ 3117.686896] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013
[ 3117.686899] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 00000000016ee040
[ 3117.686901] R13: 00000000016e2210 R14: 0000000000000000 R15: 0000000000000003
[ 3117.687005] Allocated by task 1225:
[ 3117.687152] kasan_kmalloc+0xa6/0xd0
[ 3117.687157] kmem_cache_alloc_trace+0xfd/0x200
[ 3117.687161] f2fs_build_segment_manager+0x2d09/0x3190
[ 3117.687165] f2fs_fill_super+0x1948/0x2b40
[ 3117.687168] mount_bdev+0x1c0/0x200
[ 3117.687171] mount_fs+0x5c/0x190
[ 3117.687174] vfs_kern_mount+0x64/0x190
[ 3117.687177] do_mount+0x2e4/0x1450
[ 3117.687180] ksys_mount+0x7e/0xd0
[ 3117.687182] __x64_sys_mount+0x62/0x70
[ 3117.687186] do_syscall_64+0x73/0x160
[ 3117.687190] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 3117.687285] Freed by task 19:
[ 3117.687412] __kasan_slab_free+0x137/0x190
[ 3117.687416] kfree+0x8b/0x1b0
[ 3117.687460] ttm_bo_man_put_node+0x61/0x80 [ttm]
[ 3117.687476] ttm_bo_cleanup_refs+0x15f/0x250 [ttm]
[ 3117.687492] ttm_bo_delayed_delete+0x2f0/0x300 [ttm]
[ 3117.687507] ttm_bo_delayed_workqueue+0x17/0x50 [ttm]
[ 3117.687528] process_one_work+0x2f9/0x740
[ 3117.687531] worker_thread+0x78/0x6b0
[ 3117.687541] kthread+0x177/0x1c0
[ 3117.687545] ret_from_fork+0x35/0x40
[ 3117.687638] The buggy address belongs to the object at ffff88018f0a6300
which belongs to the cache kmalloc-192 of size 192
[ 3117.688014] The buggy address is located 16 bytes to the right of
192-byte region [ffff88018f0a6300, ffff88018f0a63c0)
[ 3117.688382] The buggy address belongs to the page:
[ 3117.688554] page:ffffea00063c2980 count:1 mapcount:0 mapping:ffff8801f3403180 index:0x0
[ 3117.688788] flags: 0x17fff8000000100(slab)
[ 3117.688944] raw: 017fff8000000100 ffffea00063c2840 0000000e0000000e ffff8801f3403180
[ 3117.689166] raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
[ 3117.689386] page dumped because: kasan: bad access detected
[ 3117.689653] Memory state around the buggy address:
[ 3117.689816] ffff88018f0a6280: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ 3117.690027] ffff88018f0a6300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 3117.690239] >ffff88018f0a6380: 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 3117.690448] ^
[ 3117.690644] ffff88018f0a6400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 3117.690868] ffff88018f0a6480: 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 3117.691077] ==================================================================
[ 3117.691290] Disabling lock debugging due to kernel taint
[ 3117.693893] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[ 3117.694120] PGD 80000001f01bc067 P4D 80000001f01bc067 PUD 1d9638067 PMD 0
[ 3117.694338] Oops: 0002 [#1 ] SMP KASAN PTI
[ 3117.694490] CPU: 1 PID: 1225 Comm: mount Tainted: G B W 4.17.0+ #1
[ 3117.694703] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 3117.695073] RIP: 0010:__remove_dirty_segment+0xe2/0x1e0
[ 3117.695246] Code: c4 48 89 c7 e8 cf bb d7 ff 45 0f b6 24 24 41 83 e4 3f 44 88 64 24 07 41 83 e4 3f 4a 8d 7c e3 08 e8 b3 bc d7 ff 4a 8b 4c e3 08 <f0> 4c 0f b3 29 0f 82 94 00 00 00 48 8d bd 20 04 00 00 e8 97 bb d7
[ 3117.695793] RSP: 0018:ffff88018eb67638 EFLAGS: 00010292
[ 3117.695969] RAX: 0000000000000000 RBX: ffff88018f0a6300 RCX: 0000000000000000
[ 3117.696182] RDX: 0000000000000000 RSI: 0000000000000297 RDI: 0000000000000297
[ 3117.696391] RBP: ffff88018ebe9980 R08: ffffed003e743ebb R09: ffffed003e743ebb
[ 3117.696604] R10: 0000000000000001 R11: ffffed003e743eba R12: 0000000000000019
[ 3117.696813] R13: 0000000000000014 R14: 0000000000000320 R15: ffff88018ebe99e0
[ 3117.697032] FS: 00007f5694636840(0000) GS:ffff8801f3b00000(0000) knlGS:0000000000000000
[ 3117.697280] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3117.702357] CR2: 00007fe89bb1a000 CR3: 0000000191c22000 CR4: 00000000000006e0
[ 3117.707235] Call Trace:
[ 3117.712077] locate_dirty_segment+0x189/0x190
[ 3117.716891] f2fs_allocate_new_segments+0xa9/0xe0
[ 3117.721617] recover_data+0x703/0x2c20
[ 3117.726316] ? f2fs_recover_fsync_data+0x48f/0xd50
[ 3117.730957] ? ksys_mount+0x7e/0xd0
[ 3117.735573] ? policy_nodemask+0x1a/0x90
[ 3117.740198] ? policy_node+0x56/0x70
[ 3117.744829] ? add_fsync_inode+0xf0/0xf0
[ 3117.749487] ? blk_finish_plug+0x44/0x60
[ 3117.754152] ? f2fs_ra_meta_pages+0x38b/0x5e0
[ 3117.758831] ? find_inode_fast+0xac/0xc0
[ 3117.763448] ? f2fs_is_valid_blkaddr+0x320/0x320
[ 3117.768046] ? __radix_tree_lookup+0x150/0x150
[ 3117.772603] ? dqget+0x670/0x670
[ 3117.777159] ? pagecache_get_page+0x29/0x410
[ 3117.781648] ? kmem_cache_alloc+0x176/0x1e0
[ 3117.786067] ? f2fs_is_valid_blkaddr+0x11d/0x320
[ 3117.790476] f2fs_recover_fsync_data+0xc23/0xd50
[ 3117.794790] ? f2fs_space_for_roll_forward+0x60/0x60
[ 3117.799086] ? rb_insert_color+0x323/0x3d0
[ 3117.803304] ? f2fs_recover_orphan_inodes+0xa5/0x700
[ 3117.807563] ? proc_register+0x153/0x1d0
[ 3117.811766] ? f2fs_remove_orphan_inode+0x10/0x10
[ 3117.815947] ? f2fs_attr_store+0x50/0x50
[ 3117.820087] ? proc_create_single_data+0x52/0x60
[ 3117.824262] f2fs_fill_super+0x1d06/0x2b40
[ 3117.828367] ? f2fs_commit_super+0x1a0/0x1a0
[ 3117.832432] ? sget_userns+0x65e/0x690
[ 3117.836500] ? set_blocksize+0x88/0x130
[ 3117.840501] ? f2fs_commit_super+0x1a0/0x1a0
[ 3117.844420] mount_bdev+0x1c0/0x200
[ 3117.848275] mount_fs+0x5c/0x190
[ 3117.852053] vfs_kern_mount+0x64/0x190
[ 3117.855810] do_mount+0x2e4/0x1450
[ 3117.859441] ? lockref_put_return+0x130/0x130
[ 3117.862996] ? copy_mount_string+0x20/0x20
[ 3117.866417] ? kasan_unpoison_shadow+0x31/0x40
[ 3117.869719] ? kasan_kmalloc+0xa6/0xd0
[ 3117.872948] ? memcg_kmem_put_cache+0x16/0x90
[ 3117.876121] ? __kmalloc_track_caller+0x196/0x210
[ 3117.879333] ? _copy_from_user+0x61/0x90
[ 3117.882467] ? memdup_user+0x3e/0x60
[ 3117.885604] ksys_mount+0x7e/0xd0
[ 3117.888700] __x64_sys_mount+0x62/0x70
[ 3117.891742] do_syscall_64+0x73/0x160
[ 3117.894692] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 3117.897669] RIP: 0033:0x7f5693f14b9a
[ 3117.900563] Code: 48 8b 0d 01 c3 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ce c2 2b 00 f7 d8 64 89 01 48
[ 3117.906922] RSP: 002b:00007fff27346488 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
[ 3117.910159] RAX: ffffffffffffffda RBX: 00000000016e2030 RCX: 00007f5693f14b9a
[ 3117.913469] RDX: 00000000016e2210 RSI: 00000000016e3f30 RDI: 00000000016ee040
[ 3117.916764] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013
[ 3117.920071] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 00000000016ee040
[ 3117.923393] R13: 00000000016e2210 R14: 0000000000000000 R15: 0000000000000003
[ 3117.926680] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer joydev input_leds serio_raw snd soundcore mac_hid i2c_piix4 ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi btrfs zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear 8139too qxl ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel psmouse aes_x86_64 8139cp crypto_simd cryptd mii glue_helper pata_acpi floppy
[ 3117.949979] CR2: 0000000000000000
[ 3117.954283] ---[ end trace a8e0d899985faf32 ]---
[ 3117.958575] RIP: 0010:__remove_dirty_segment+0xe2/0x1e0
[ 3117.962810] Code: c4 48 89 c7 e8 cf bb d7 ff 45 0f b6 24 24 41 83 e4 3f 44 88 64 24 07 41 83 e4 3f 4a 8d 7c e3 08 e8 b3 bc d7 ff 4a 8b 4c e3 08 <f0> 4c 0f b3 29 0f 82 94 00 00 00 48 8d bd 20 04 00 00 e8 97 bb d7
[ 3117.971789] RSP: 0018:ffff88018eb67638 EFLAGS: 00010292
[ 3117.976333] RAX: 0000000000000000 RBX: ffff88018f0a6300 RCX: 0000000000000000
[ 3117.980926] RDX: 0000000000000000 RSI: 0000000000000297 RDI: 0000000000000297
[ 3117.985497] RBP: ffff88018ebe9980 R08: ffffed003e743ebb R09: ffffed003e743ebb
[ 3117.990098] R10: 0000000000000001 R11: ffffed003e743eba R12: 0000000000000019
[ 3117.994761] R13: 0000000000000014 R14: 0000000000000320 R15: ffff88018ebe99e0
[ 3117.999392] FS: 00007f5694636840(0000) GS:ffff8801f3b00000(0000) knlGS:0000000000000000
[ 3118.004096] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3118.008816] CR2: 00007fe89bb1a000 CR3: 0000000191c22000 CR4: 00000000000006e0
- Location
https://elixir.bootlin.com/linux/v4.18-rc3/source/fs/f2fs/segment.c#L775
if (test_and_clear_bit(segno, dirty_i->dirty_segmap[t]))
dirty_i->nr_dirty[t]--;
Here dirty_i->dirty_segmap[t] can be NULL which leads to crash in test_and_clear_bit()
Reported-by Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
[bwh: Backported to 4.9: The function is called sanity_check_ckpt()]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-08 13:05:14 +01:00
Chao Yu
06e606aced
f2fs: fix to do sanity check with user_block_count
...
commit 9dc956b2c8523aed39d1e6508438be9fea28c8fc upstream.
This patch fixs to do sanity check with user_block_count.
- Overview
Divide zero in utilization when mount() a corrupted f2fs image
- Reproduce (4.18 upstream kernel)
- Kernel message
[ 564.099503] F2FS-fs (loop0): invalid crc value
[ 564.101991] divide error: 0000 [#1 ] SMP KASAN PTI
[ 564.103103] CPU: 1 PID: 1298 Comm: f2fs_discard-7: Not tainted 4.18.0-rc1+ #4
[ 564.104584] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 564.106624] RIP: 0010:issue_discard_thread+0x248/0x5c0
[ 564.107692] Code: ff ff 48 8b bd e8 fe ff ff 41 8b 9d 4c 04 00 00 e8 cd b8 ad ff 41 8b 85 50 04 00 00 31 d2 48 8d 04 80 48 8d 04 80 48 c1 e0 02 <48> f7 f3 83 f8 50 7e 16 41 c7 86 7c ff ff ff 01 00 00 00 41 c7 86
[ 564.111686] RSP: 0018:ffff8801f3117dc0 EFLAGS: 00010206
[ 564.112775] RAX: 0000000000000384 RBX: 0000000000000000 RCX: ffffffffb88c1e03
[ 564.114250] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffff8801e3aa4850
[ 564.115706] RBP: ffff8801f3117f00 R08: 1ffffffff751a1d0 R09: fffffbfff751a1d0
[ 564.117177] R10: 0000000000000001 R11: fffffbfff751a1d0 R12: 00000000fffffffc
[ 564.118634] R13: ffff8801e3aa4400 R14: ffff8801f3117ed8 R15: ffff8801e2050000
[ 564.120094] FS: 0000000000000000(0000) GS:ffff8801f6f00000(0000) knlGS:0000000000000000
[ 564.121748] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 564.122923] CR2: 000000000202b078 CR3: 00000001f11ac000 CR4: 00000000000006e0
[ 564.124383] Call Trace:
[ 564.124924] ? __issue_discard_cmd+0x480/0x480
[ 564.125882] ? __sched_text_start+0x8/0x8
[ 564.126756] ? __kthread_parkme+0xcb/0x100
[ 564.127620] ? kthread_blkcg+0x70/0x70
[ 564.128412] kthread+0x180/0x1d0
[ 564.129105] ? __issue_discard_cmd+0x480/0x480
[ 564.130029] ? kthread_associate_blkcg+0x150/0x150
[ 564.131033] ret_from_fork+0x35/0x40
[ 564.131794] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mac_hid i2c_piix4 soundcore ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear 8139too crct10dif_pclmul crc32_pclmul qxl drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops ttm drm aes_x86_64 crypto_simd cryptd 8139cp glue_helper mii pata_acpi floppy
[ 564.141798] ---[ end trace 4ce02f25ff7d3df5 ]---
[ 564.142773] RIP: 0010:issue_discard_thread+0x248/0x5c0
[ 564.143885] Code: ff ff 48 8b bd e8 fe ff ff 41 8b 9d 4c 04 00 00 e8 cd b8 ad ff 41 8b 85 50 04 00 00 31 d2 48 8d 04 80 48 8d 04 80 48 c1 e0 02 <48> f7 f3 83 f8 50 7e 16 41 c7 86 7c ff ff ff 01 00 00 00 41 c7 86
[ 564.147776] RSP: 0018:ffff8801f3117dc0 EFLAGS: 00010206
[ 564.148856] RAX: 0000000000000384 RBX: 0000000000000000 RCX: ffffffffb88c1e03
[ 564.150424] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffff8801e3aa4850
[ 564.151906] RBP: ffff8801f3117f00 R08: 1ffffffff751a1d0 R09: fffffbfff751a1d0
[ 564.153463] R10: 0000000000000001 R11: fffffbfff751a1d0 R12: 00000000fffffffc
[ 564.154915] R13: ffff8801e3aa4400 R14: ffff8801f3117ed8 R15: ffff8801e2050000
[ 564.156405] FS: 0000000000000000(0000) GS:ffff8801f6f00000(0000) knlGS:0000000000000000
[ 564.158070] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 564.159279] CR2: 000000000202b078 CR3: 00000001f11ac000 CR4: 00000000000006e0
[ 564.161043] ==================================================================
[ 564.162587] BUG: KASAN: stack-out-of-bounds in from_kuid_munged+0x1d/0x50
[ 564.163994] Read of size 4 at addr ffff8801f3117c84 by task f2fs_discard-7:/1298
[ 564.165852] CPU: 1 PID: 1298 Comm: f2fs_discard-7: Tainted: G D 4.18.0-rc1+ #4
[ 564.167593] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 564.169522] Call Trace:
[ 564.170057] dump_stack+0x7b/0xb5
[ 564.170778] print_address_description+0x70/0x290
[ 564.171765] kasan_report+0x291/0x390
[ 564.172540] ? from_kuid_munged+0x1d/0x50
[ 564.173408] __asan_load4+0x78/0x80
[ 564.174148] from_kuid_munged+0x1d/0x50
[ 564.174962] do_notify_parent+0x1f5/0x4f0
[ 564.175808] ? send_sigqueue+0x390/0x390
[ 564.176639] ? css_set_move_task+0x152/0x340
[ 564.184197] do_exit+0x1290/0x1390
[ 564.184950] ? __issue_discard_cmd+0x480/0x480
[ 564.185884] ? mm_update_next_owner+0x380/0x380
[ 564.186829] ? __sched_text_start+0x8/0x8
[ 564.187672] ? __kthread_parkme+0xcb/0x100
[ 564.188528] ? kthread_blkcg+0x70/0x70
[ 564.189333] ? kthread+0x180/0x1d0
[ 564.190052] ? __issue_discard_cmd+0x480/0x480
[ 564.190983] rewind_stack_do_exit+0x17/0x20
[ 564.192190] The buggy address belongs to the page:
[ 564.193213] page:ffffea0007cc45c0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
[ 564.194856] flags: 0x2ffff0000000000()
[ 564.195644] raw: 02ffff0000000000 0000000000000000 dead000000000200 0000000000000000
[ 564.197247] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[ 564.198826] page dumped because: kasan: bad access detected
[ 564.200299] Memory state around the buggy address:
[ 564.201306] ffff8801f3117b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 564.202779] ffff8801f3117c00: 00 00 00 00 00 00 00 00 00 00 00 f3 f3 f3 f3 f3
[ 564.204252] >ffff8801f3117c80: f3 f3 f3 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
[ 564.205742] ^
[ 564.206424] ffff8801f3117d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 564.207908] ffff8801f3117d80: f3 f3 f3 f3 f3 f3 f3 f3 00 00 00 00 00 00 00 00
[ 564.209389] ==================================================================
[ 564.231795] F2FS-fs (loop0): Mounted with checkpoint version = 2
- Location
https://elixir.bootlin.com/linux/v4.18-rc1/source/fs/f2fs/segment.h#L586
return div_u64((u64)valid_user_blocks(sbi) * 100,
sbi->user_block_count);
Missing checks on sbi->user_block_count.
Reported-by: Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-08 13:05:14 +01:00
Chao Yu
a3dccfacd3
f2fs: fix to do sanity check with secs_per_zone
...
commit 42bf546c1fe3f3654bdf914e977acbc2b80a5be5 upstream.
As Wen Xu reported in below link:
https://bugzilla.kernel.org/show_bug.cgi?id=200183
- Overview
Divide zero in reset_curseg() when mounting a crafted f2fs image
- Reproduce
- Kernel message
[ 588.281510] divide error: 0000 [#1 ] SMP KASAN PTI
[ 588.282701] CPU: 0 PID: 1293 Comm: mount Not tainted 4.18.0-rc1+ #4
[ 588.284000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 588.286178] RIP: 0010:reset_curseg+0x94/0x1a0
[ 588.298166] RSP: 0018:ffff8801e88d7940 EFLAGS: 00010246
[ 588.299360] RAX: 0000000000000014 RBX: ffff8801e1d46d00 RCX: ffffffffb88bf60b
[ 588.300809] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffff8801e1d46d64
[ 588.305272] R13: 0000000000000000 R14: 0000000000000014 R15: 0000000000000000
[ 588.306822] FS: 00007fad85008840(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000
[ 588.308456] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 588.309623] CR2: 0000000001705078 CR3: 00000001f30f8000 CR4: 00000000000006f0
[ 588.311085] Call Trace:
[ 588.311637] f2fs_build_segment_manager+0x103f/0x3410
[ 588.316136] ? f2fs_commit_super+0x1b0/0x1b0
[ 588.317031] ? set_blocksize+0x90/0x140
[ 588.319473] f2fs_mount+0x15/0x20
[ 588.320166] mount_fs+0x60/0x1a0
[ 588.320847] ? alloc_vfsmnt+0x309/0x360
[ 588.321647] vfs_kern_mount+0x6b/0x1a0
[ 588.322432] do_mount+0x34a/0x18c0
[ 588.323175] ? strndup_user+0x46/0x70
[ 588.323937] ? copy_mount_string+0x20/0x20
[ 588.324793] ? memcg_kmem_put_cache+0x1b/0xa0
[ 588.325702] ? kasan_check_write+0x14/0x20
[ 588.326562] ? _copy_from_user+0x6a/0x90
[ 588.327375] ? memdup_user+0x42/0x60
[ 588.328118] ksys_mount+0x83/0xd0
[ 588.328808] __x64_sys_mount+0x67/0x80
[ 588.329607] do_syscall_64+0x78/0x170
[ 588.330400] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 588.331461] RIP: 0033:0x7fad848e8b9a
[ 588.336022] RSP: 002b:00007ffd7c5b6be8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
[ 588.337547] RAX: ffffffffffffffda RBX: 00000000016f8030 RCX: 00007fad848e8b9a
[ 588.338999] RDX: 00000000016f8210 RSI: 00000000016f9f30 RDI: 0000000001700ec0
[ 588.340442] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013
[ 588.341887] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 0000000001700ec0
[ 588.343341] R13: 00000000016f8210 R14: 0000000000000000 R15: 0000000000000003
[ 588.354891] ---[ end trace 4ce02f25ff7d3df5 ]---
[ 588.355862] RIP: 0010:reset_curseg+0x94/0x1a0
[ 588.360742] RSP: 0018:ffff8801e88d7940 EFLAGS: 00010246
[ 588.361812] RAX: 0000000000000014 RBX: ffff8801e1d46d00 RCX: ffffffffb88bf60b
[ 588.363485] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffff8801e1d46d64
[ 588.365213] RBP: ffff8801e88d7968 R08: ffffed003c32266f R09: ffffed003c32266f
[ 588.366661] R10: 0000000000000001 R11: ffffed003c32266e R12: ffff8801f0337700
[ 588.368110] R13: 0000000000000000 R14: 0000000000000014 R15: 0000000000000000
[ 588.370057] FS: 00007fad85008840(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000
[ 588.372099] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 588.373291] CR2: 0000000001705078 CR3: 00000001f30f8000 CR4: 00000000000006f0
- Location
https://elixir.bootlin.com/linux/latest/source/fs/f2fs/segment.c#L2147
curseg->zone = GET_ZONE_FROM_SEG(sbi, curseg->segno);
If secs_per_zone is corrupted due to fuzzing test, it will cause divide
zero operation when using GET_ZONE_FROM_SEG macro, so we should do more
sanity check with secs_per_zone during mount to avoid this issue.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-08 13:05:14 +01:00
Jaegeuk Kim
65e1055a69
f2fs: enhance sanity_check_raw_super() to avoid potential overflow
...
commit 0cfe75c5b011994651a4ca6d74f20aa997bfc69a upstream.
In order to avoid the below overflow issue, we should have checked the
boundaries in superblock before reaching out to allocation. As Linus suggested,
the right place should be sanity_check_raw_super().
Dr Silvio Cesare of InfoSect reported:
There are integer overflows with using the cp_payload superblock field in the
f2fs filesystem potentially leading to memory corruption.
include/linux/f2fs_fs.h
struct f2fs_super_block {
...
__le32 cp_payload;
fs/f2fs/f2fs.h
typedef u32 block_t; /*
* should not change u32, since it is the on-disk block
* address format, __le32.
*/
...
static inline block_t __cp_payload(struct f2fs_sb_info *sbi)
{
return le32_to_cpu(F2FS_RAW_SUPER(sbi)->cp_payload);
}
fs/f2fs/checkpoint.c
block_t start_blk, orphan_blocks, i, j;
...
start_blk = __start_cp_addr(sbi) + 1 + __cp_payload(sbi);
orphan_blocks = __start_sum_addr(sbi) - 1 - __cp_payload(sbi);
+++ integer overflows
...
unsigned int cp_blks = 1 + __cp_payload(sbi);
...
sbi->ckpt = kzalloc(cp_blks * blk_size, GFP_KERNEL);
+++ integer overflow leading to incorrect heap allocation.
int cp_payload_blks = __cp_payload(sbi);
...
ckpt->cp_pack_start_sum = cpu_to_le32(1 + cp_payload_blks +
orphan_blocks);
+++ sign bug and integer overflow
...
for (i = 1; i < 1 + cp_payload_blks; i++)
+++ integer overflow
...
sbi->max_orphans = (sbi->blocks_per_seg - F2FS_CP_PACKS -
NR_CURSEG_TYPE - __cp_payload(sbi)) *
F2FS_ORPHANS_PER_BLOCK;
+++ integer overflow
Reported-by: Greg KH <greg@kroah.com>
Reported-by: Silvio Cesare <silvio.cesare@gmail.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
[bwh: Backported to 4.9: No hot file extension support]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-08 13:05:13 +01:00
Jaegeuk Kim
240ceff482
f2fs: detect wrong layout
...
commit 2040fce83fe17763b07c97c1f691da2bb85e4135 upstream.
Previous mkfs.f2fs allows small partition inappropriately, so f2fs should detect
that as well.
Refer this in f2fs-tools.
mkfs.f2fs: detect small partition by overprovision ratio and # of segments
Reported-and-Tested-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
[bwh: Backported to 4.9: adjust context]
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-08 13:05:13 +01:00
Chao Yu
a5ab2d3f81
f2fs: fix to do sanity check with {sit,nat}_ver_bitmap_bytesize
...
[ Upstream commit c77ec61ca0a49544ca81881cc5d5529858f7e196 ]
This patch adds to do sanity check with {sit,nat}_ver_bitmap_bytesize
during mount, in order to avoid accessing across cache boundary with
this abnormal bitmap size.
- Overview
buffer overrun in build_sit_info() when mounting a crafted f2fs image
- Reproduce
- Kernel message
[ 548.580867] F2FS-fs (loop0): Invalid log blocks per segment (8201)
[ 548.580877] F2FS-fs (loop0): Can't find valid F2FS filesystem in 1th superblock
[ 548.584979] ==================================================================
[ 548.586568] BUG: KASAN: use-after-free in kmemdup+0x36/0x50
[ 548.587715] Read of size 64 at addr ffff8801e9c265ff by task mount/1295
[ 548.589428] CPU: 1 PID: 1295 Comm: mount Not tainted 4.18.0-rc1+ #4
[ 548.589432] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 548.589438] Call Trace:
[ 548.589474] dump_stack+0x7b/0xb5
[ 548.589487] print_address_description+0x70/0x290
[ 548.589492] kasan_report+0x291/0x390
[ 548.589496] ? kmemdup+0x36/0x50
[ 548.589509] check_memory_region+0x139/0x190
[ 548.589514] memcpy+0x23/0x50
[ 548.589518] kmemdup+0x36/0x50
[ 548.589545] f2fs_build_segment_manager+0x8fa/0x3410
[ 548.589551] ? __asan_loadN+0xf/0x20
[ 548.589560] ? f2fs_sanity_check_ckpt+0x1be/0x240
[ 548.589566] ? f2fs_flush_sit_entries+0x10c0/0x10c0
[ 548.589587] ? __put_user_ns+0x40/0x40
[ 548.589604] ? find_next_bit+0x57/0x90
[ 548.589610] f2fs_fill_super+0x194b/0x2b40
[ 548.589617] ? f2fs_commit_super+0x1b0/0x1b0
[ 548.589637] ? set_blocksize+0x90/0x140
[ 548.589651] mount_bdev+0x1c5/0x210
[ 548.589655] ? f2fs_commit_super+0x1b0/0x1b0
[ 548.589667] f2fs_mount+0x15/0x20
[ 548.589672] mount_fs+0x60/0x1a0
[ 548.589683] ? alloc_vfsmnt+0x309/0x360
[ 548.589688] vfs_kern_mount+0x6b/0x1a0
[ 548.589699] do_mount+0x34a/0x18c0
[ 548.589710] ? lockref_put_or_lock+0xcf/0x160
[ 548.589716] ? copy_mount_string+0x20/0x20
[ 548.589728] ? memcg_kmem_put_cache+0x1b/0xa0
[ 548.589734] ? kasan_check_write+0x14/0x20
[ 548.589740] ? _copy_from_user+0x6a/0x90
[ 548.589744] ? memdup_user+0x42/0x60
[ 548.589750] ksys_mount+0x83/0xd0
[ 548.589755] __x64_sys_mount+0x67/0x80
[ 548.589781] do_syscall_64+0x78/0x170
[ 548.589797] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 548.589820] RIP: 0033:0x7f76fc331b9a
[ 548.589821] Code: 48 8b 0d 01 c3 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ce c2 2b 00 f7 d8 64 89 01 48
[ 548.589880] RSP: 002b:00007ffd4f0a0e48 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
[ 548.589890] RAX: ffffffffffffffda RBX: 000000000146c030 RCX: 00007f76fc331b9a
[ 548.589892] RDX: 000000000146c210 RSI: 000000000146df30 RDI: 0000000001474ec0
[ 548.589895] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013
[ 548.589897] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 0000000001474ec0
[ 548.589900] R13: 000000000146c210 R14: 0000000000000000 R15: 0000000000000003
[ 548.590242] The buggy address belongs to the page:
[ 548.591243] page:ffffea0007a70980 count:0 mapcount:0 mapping:0000000000000000 index:0x0
[ 548.592886] flags: 0x2ffff0000000000()
[ 548.593665] raw: 02ffff0000000000 dead000000000100 dead000000000200 0000000000000000
[ 548.595258] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[ 548.603713] page dumped because: kasan: bad access detected
[ 548.605203] Memory state around the buggy address:
[ 548.606198] ffff8801e9c26480: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 548.607676] ffff8801e9c26500: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 548.609157] >ffff8801e9c26580: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 548.610629] ^
[ 548.612088] ffff8801e9c26600: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 548.613674] ffff8801e9c26680: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 548.615141] ==================================================================
[ 548.616613] Disabling lock debugging due to kernel taint
[ 548.622871] WARNING: CPU: 1 PID: 1295 at mm/page_alloc.c:4065 __alloc_pages_slowpath+0xe4a/0x1420
[ 548.622878] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mac_hid i2c_piix4 soundcore ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear 8139too crct10dif_pclmul crc32_pclmul qxl drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops ttm drm aes_x86_64 crypto_simd cryptd 8139cp glue_helper mii pata_acpi floppy
[ 548.623217] CPU: 1 PID: 1295 Comm: mount Tainted: G B 4.18.0-rc1+ #4
[ 548.623219] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 548.623226] RIP: 0010:__alloc_pages_slowpath+0xe4a/0x1420
[ 548.623227] Code: ff ff 01 89 85 c8 fe ff ff e9 91 fc ff ff 41 89 c5 e9 5c fc ff ff 0f 0b 89 f8 25 ff ff f7 ff 89 85 8c fe ff ff e9 d5 f2 ff ff <0f> 0b e9 65 f2 ff ff 65 8b 05 38 81 d2 47 f6 c4 01 74 1c 65 48 8b
[ 548.623281] RSP: 0018:ffff8801f28c7678 EFLAGS: 00010246
[ 548.623284] RAX: 0000000000000000 RBX: 00000000006040c0 RCX: ffffffffb82f73b7
[ 548.623287] RDX: 1ffff1003e518eeb RSI: 000000000000000c RDI: 0000000000000000
[ 548.623290] RBP: ffff8801f28c7880 R08: 0000000000000000 R09: ffffed0047fff2c5
[ 548.623292] R10: 0000000000000001 R11: ffffed0047fff2c4 R12: ffff8801e88de040
[ 548.623295] R13: 00000000006040c0 R14: 000000000000000c R15: ffff8801f28c7938
[ 548.623299] FS: 00007f76fca51840(0000) GS:ffff8801f6f00000(0000) knlGS:0000000000000000
[ 548.623302] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 548.623304] CR2: 00007f19b9171760 CR3: 00000001ed952000 CR4: 00000000000006e0
[ 548.623317] Call Trace:
[ 548.623325] ? kasan_check_read+0x11/0x20
[ 548.623330] ? __zone_watermark_ok+0x92/0x240
[ 548.623336] ? get_page_from_freelist+0x1c3/0x1d90
[ 548.623347] ? _raw_spin_lock_irqsave+0x2a/0x60
[ 548.623353] ? warn_alloc+0x250/0x250
[ 548.623358] ? save_stack+0x46/0xd0
[ 548.623361] ? kasan_kmalloc+0xad/0xe0
[ 548.623366] ? __isolate_free_page+0x2a0/0x2a0
[ 548.623370] ? mount_fs+0x60/0x1a0
[ 548.623374] ? vfs_kern_mount+0x6b/0x1a0
[ 548.623378] ? do_mount+0x34a/0x18c0
[ 548.623383] ? ksys_mount+0x83/0xd0
[ 548.623387] ? __x64_sys_mount+0x67/0x80
[ 548.623391] ? do_syscall_64+0x78/0x170
[ 548.623396] ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 548.623401] __alloc_pages_nodemask+0x3c5/0x400
[ 548.623407] ? __alloc_pages_slowpath+0x1420/0x1420
[ 548.623412] ? __mutex_lock_slowpath+0x20/0x20
[ 548.623417] ? kvmalloc_node+0x31/0x80
[ 548.623424] alloc_pages_current+0x75/0x110
[ 548.623436] kmalloc_order+0x24/0x60
[ 548.623442] kmalloc_order_trace+0x24/0xb0
[ 548.623448] __kmalloc_track_caller+0x207/0x220
[ 548.623455] ? f2fs_build_node_manager+0x399/0xbb0
[ 548.623460] kmemdup+0x20/0x50
[ 548.623465] f2fs_build_node_manager+0x399/0xbb0
[ 548.623470] f2fs_fill_super+0x195e/0x2b40
[ 548.623477] ? f2fs_commit_super+0x1b0/0x1b0
[ 548.623481] ? set_blocksize+0x90/0x140
[ 548.623486] mount_bdev+0x1c5/0x210
[ 548.623489] ? f2fs_commit_super+0x1b0/0x1b0
[ 548.623495] f2fs_mount+0x15/0x20
[ 548.623498] mount_fs+0x60/0x1a0
[ 548.623503] ? alloc_vfsmnt+0x309/0x360
[ 548.623508] vfs_kern_mount+0x6b/0x1a0
[ 548.623513] do_mount+0x34a/0x18c0
[ 548.623518] ? lockref_put_or_lock+0xcf/0x160
[ 548.623523] ? copy_mount_string+0x20/0x20
[ 548.623528] ? memcg_kmem_put_cache+0x1b/0xa0
[ 548.623533] ? kasan_check_write+0x14/0x20
[ 548.623537] ? _copy_from_user+0x6a/0x90
[ 548.623542] ? memdup_user+0x42/0x60
[ 548.623547] ksys_mount+0x83/0xd0
[ 548.623552] __x64_sys_mount+0x67/0x80
[ 548.623557] do_syscall_64+0x78/0x170
[ 548.623562] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 548.623566] RIP: 0033:0x7f76fc331b9a
[ 548.623567] Code: 48 8b 0d 01 c3 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ce c2 2b 00 f7 d8 64 89 01 48
[ 548.623632] RSP: 002b:00007ffd4f0a0e48 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
[ 548.623636] RAX: ffffffffffffffda RBX: 000000000146c030 RCX: 00007f76fc331b9a
[ 548.623639] RDX: 000000000146c210 RSI: 000000000146df30 RDI: 0000000001474ec0
[ 548.623641] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013
[ 548.623643] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 0000000001474ec0
[ 548.623646] R13: 000000000146c210 R14: 0000000000000000 R15: 0000000000000003
[ 548.623650] ---[ end trace 4ce02f25ff7d3df5 ]---
[ 548.623656] F2FS-fs (loop0): Failed to initialize F2FS node manager
[ 548.627936] F2FS-fs (loop0): Invalid log blocks per segment (8201)
[ 548.627940] F2FS-fs (loop0): Can't find valid F2FS filesystem in 1th superblock
[ 548.635835] F2FS-fs (loop0): Failed to initialize F2FS node manager
- Location
https://elixir.bootlin.com/linux/v4.18-rc1/source/fs/f2fs/segment.c#L3578
sit_i->sit_bitmap = kmemdup(src_bitmap, bitmap_size, GFP_KERNEL);
Buffer overrun happens when doing memcpy. I suspect there is missing (inconsistent) checks on bitmap_size.
Reported by Wen Xu (wen.xu@gatech.edu ) from SSLab, Gatech.
Reported-by: Wen Xu <wen.xu@gatech.edu>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-19 22:47:16 +02:00
Anatoly Pugachev
c9ab0cefc5
disable loading f2fs module on PAGE_SIZE > 4KB
...
[ Upstream commit 4071e67cffcc5c2a007116a02437471351f550eb ]
The following patch disables loading of f2fs module on architectures
which have PAGE_SIZE > 4096 , since it is impossible to mount f2fs on
such architectures , log messages are:
mount: /mnt: wrong fs type, bad option, bad superblock on
/dev/vdiskb1, missing codepage or helper program, or other error.
/dev/vdiskb1: F2FS filesystem,
UUID=1d8b9ca4-2389-4910-af3b-10998969f09c, volume name ""
May 15 18:03:13 ttip kernel: F2FS-fs (vdiskb1): Invalid
page_cache_size (8192), supports only 4KB
May 15 18:03:13 ttip kernel: F2FS-fs (vdiskb1): Can't find valid F2FS
filesystem in 1th superblock
May 15 18:03:13 ttip kernel: F2FS-fs (vdiskb1): Invalid
page_cache_size (8192), supports only 4KB
May 15 18:03:13 ttip kernel: F2FS-fs (vdiskb1): Can't find valid F2FS
filesystem in 2th superblock
May 15 18:03:13 ttip kernel: F2FS-fs (vdiskb1): Invalid
page_cache_size (8192), supports only 4KB
which was introduced by git commit 5c9b469295fb6b10d98923eab5e79c4edb80ed20
tested on git kernel 4.17.0-rc6-00309-gec30dcf7f425
with patch applied:
modprobe: ERROR: could not insert 'f2fs': Invalid argument
May 28 01:40:28 v215 kernel: F2FS not supported on PAGE_SIZE(8192) != 4096
Signed-off-by: Anatoly Pugachev <matorola@gmail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-03 07:55:14 +02:00
Jin Qian
0f442c5b2e
f2fs: sanity check checkpoint segno and blkoff
...
commit 15d3042a937c13f5d9244241c7a9c8416ff6e82a upstream.
Make sure segno and blkoff read from raw image are valid.
Cc: stable@vger.kernel.org
Signed-off-by: Jin Qian <jinqian@google.com>
[Jaegeuk Kim: adjust minor coding style]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
[AmitP: Found in Android Security bulletin for Aug'17, fixes CVE-2017-10663]
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-11 08:49:31 -07:00
Jin Qian
93862955cb
f2fs: sanity check segment count
...
commit b9dd46188edc2f0d1f37328637860bb65a771124 upstream.
F2FS uses 4 bytes to represent block address. As a result, supported
size of disk is 16 TB and it equals to 16 * 1024 * 1024 / 2 segments.
Signed-off-by: Jin Qian <jinqian@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-14 14:00:22 +02:00
Jaegeuk Kim
725ba1a3eb
f2fs: remove percpu_count due to performance regression
...
commit 35782b233f37e48ecc469d9c7232f3f6a7fad41a upstream.
This patch removes percpu_count usage due to performance regression in iozone.
Fixes: 523be8a6b3 ("f2fs: use percpu_counter for page counters")
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-12 11:39:35 +01:00
Jaegeuk Kim
a43e1c459a
Revert "f2fs: use percpu_counter for # of dirty pages in inode"
...
commit 204706c7accfabb67b97eef9f9a28361b6201199 upstream.
This reverts commit 1beba1b3a953107c3ff5448ab4e4297db4619c76.
The perpcu_counter doesn't provide atomicity in single core and consume more
DRAM. That incurs fs_mark test failure due to ENOMEM.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-06 10:40:14 +01:00
Chao Yu
0f34802858
f2fs: support checkpoint error injection
...
This patch adds to support checkpoint error injection in f2fs for testing
fatal error tolerance, it will be useful that it can simulate abnormal
power off by f2fs itself instead of calling godown ioctl by running apps.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-30 17:34:35 -07:00
Chao Yu
2443b8b363
f2fs: fix to recover old fault injection config in ->remount_fs
...
In ->remount_fs, we didn't recover original fault injection config if
we encounter error, fix it.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-30 17:34:34 -07:00
Chao Yu
36dbd3287f
f2fs: do fault injection initialization in default_options
...
Do fault injection initialization in default_options to keep consistent
with other default option configurating.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-30 17:34:33 -07:00
Chao Yu
1ecc0c5c50
f2fs: support configuring fault injection per superblock
...
Previously, we only support global fault injection configuration, so that
when we configure type/rate of fault injection through sysfs, mount
option, it will influence all f2fs partition which is being used.
It is not make sence, since it will be not convenient if developer want
to test separated partitions with different fault injection rate/type
simultaneously, also it's not possible to enable fault injection in one
partition and disable fault injection in other one.
>From now on, we move global configuration of fault injection in module
into per-superblock, hence injection testing can be more flexible.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-30 17:34:31 -07:00
Chao Yu
d32853de50
f2fs: adjust display format of segment bit
...
Just adjust segment bit info printed in procfs.
Before:
1008 5|0 |0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1009 3|183|0 0 61 20 20 0 0 21 80 c0 2 e4 e 54 0 21 21 17 a 44 d0 28 e4 50 40 30 8 0 2d 32 0 5 b0 80 1 43 2 8e f8 7b 2 25 93 bf e0 73 8e 9a 19 44 60 ff e4 cc e6 8e bf f9 ff 5 3d 31 3d 13
1010 3|1 |0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 40 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
After:
1008 5|0 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
1009 4|434| ff 7d ff bf d9 3f ff e7 ff bf d7 bf ff bb be ff fb df f7 fb fa bf fb fe bb df dd ff fe ef ff fe ef e2 27 bf ab bf fb df fd bd bf fb db fc ff ff 3f ff ff bf ff 5f db 3f fb fb bf fb bf 4f ff ef
1010 4|422| ff bb fe ff ef d7 ee ff ff fc bf ef 7d eb ec fd fb 3f 97 7f ef ff af ff db ff ff 69 bf ff f6 e7 ff fb f7 7b fb df be ff ff ef f3 fe ff ff df fe f7 fa ff b7 77 be fe fb a9 7f 87 a2 ac c7 ff 75
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-30 17:34:30 -07:00
Jaegeuk Kim
bb5dada7d2
f2fs: remove dirty inode pages in error path
...
When getting EIO while handling orphan inodes, we can get some dirty node
pages. Then, f2fs_write_node_pages() called by iput(node_inode) will try
to flush node pages. But in this case, we should prevent to do that, since
we will try again from the start.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-30 17:34:29 -07:00
Jaegeuk Kim
d41065e204
f2fs: handle errors during recover_orphan_inodes
...
This patch fixes to handle EIO during recover_orphan_inode() given the below
panic.
F2FS-fs : inject IO error in f2fs_read_end_io+0xe6/0x100 [f2fs]
------------[ cut here ]------------
RIP: 0010:[<ffffffffc0b244e3>] [<ffffffffc0b244e3>] f2fs_evict_inode+0x433/0x470 [f2fs]
RSP: 0018:ffff92f8b7fb7c30 EFLAGS: 00010246
RAX: ffff92fb88a13500 RBX: ffff92f890566ea0 RCX: 00000000fd3c255c
RDX: 0000000000000001 RSI: ffff92fb88a13d90 RDI: ffff92fb8ee127e8
RBP: ffff92f8b7fb7c58 R08: 0000000000000001 R09: ffff92fb88a13d58
R10: 000000005a6a9373 R11: 0000000000000001 R12: 00000000fffffffb
R13: ffff92fb8ee12000 R14: 00000000000034ca R15: ffff92fb8ee12620
FS: 00007f1fefd8e880(0000) GS:ffff92fb95600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fc211d34cdb CR3: 000000012d43a000 CR4: 00000000001406e0
Stack:
ffff92f890566ea0 ffff92f890567078 ffffffffc0b5a0c0 ffff92f890566f28
ffff92fb888b2000 ffff92f8b7fb7c80 ffffffffbc27ff55 ffff92f890566ea0
ffff92fb8bf10000 ffffffffc0b5a0c0 ffff92f8b7fb7cb0 ffffffffbc28090d
Call Trace:
[<ffffffffbc27ff55>] evict+0xc5/0x1a0
[<ffffffffbc28090d>] iput+0x1ad/0x2c0
[<ffffffffc0b3304c>] recover_orphan_inodes+0x10c/0x2e0 [f2fs]
[<ffffffffc0b2e0f4>] f2fs_fill_super+0x884/0x1150 [f2fs]
[<ffffffffbc2644ac>] mount_bdev+0x18c/0x1c0
[<ffffffffc0b2d870>] ? f2fs_commit_super+0x100/0x100 [f2fs]
[<ffffffffc0b2a755>] f2fs_mount+0x15/0x20 [f2fs]
[<ffffffffbc264e49>] mount_fs+0x39/0x170
[<ffffffffbc28555b>] vfs_kern_mount+0x6b/0x160
[<ffffffffbc2881df>] do_mount+0x1cf/0xd00
[<ffffffffbc287f2c>] ? copy_mount_options+0xac/0x170
[<ffffffffbc289003>] SyS_mount+0x83/0xd0
[<ffffffffbc8ee880>] entry_SYSCALL_64_fastpath+0x23/0xc1
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-30 17:34:27 -07:00
Chao Yu
aaec2b1d18
f2fs: introduce cp_lock to protect updating of ckpt_flags
...
This patch introduces spinlock to protect updating process of ckpt_flags
field in struct f2fs_checkpoint, it avoids incorrectly updating in race
condition.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: add __is_set_ckpt_flags likewise __set_ckpt_flags]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-30 17:34:20 -07:00
Jaegeuk Kim
a468f0ef51
f2fs: use crc and cp version to determine roll-forward recovery
...
Previously, we used cp_version only to detect recoverable dnodes.
In order to avoid same garbage cp_version, we needed to truncate the next
dnode during checkpoint, resulting in additional discard or data write.
If we can distinguish this by using crc in addition to cp_version, we can
remove this overhead.
There is backward compatibility concern where it changes node_footer layout.
So, this patch introduces a new checkpoint flag, CP_CRC_RECOVERY_FLAG, to
detect new layout. New layout will be activated only when this flag is set.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-30 10:05:46 -07:00
Chao Yu
8b038c70df
f2fs: support IO error injection
...
This patch adds to support IO error injection for testing IO error
tolerance of f2fs.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-22 11:43:06 -07:00
Chao Yu
97c1794a5d
f2fs: enable inline_dentry by default and add noinline_dentry option
...
Make inline_dentry as default mount option to improve space usage and
IO performance in scenario of numerous small directory.
It adds noinline_dentry mount option, instead.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-08-29 18:31:17 -07:00
Jaegeuk Kim
b873b798af
Revert "f2fs: use percpu_rw_semaphore"
...
LKP reported -36.3% regression of fsmark.files_per_sec due to this patch.
I've confirmed that fxmark [1] has also slight regression for DWAL.
[1] https://github.com/sslab-gatech/fxmark
This reverts commit ec795418c41850056feb956534edf059dc1155d4.
2016-08-19 11:15:08 +09:00
Linus Torvalds
6784725ab0
Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
...
Pull vfs updates from Al Viro:
"Assorted cleanups and fixes.
Probably the most interesting part long-term is ->d_init() - that will
have a bunch of followups in (at least) ceph and lustre, but we'll
need to sort the barrier-related rules before it can get used for
really non-trivial stuff.
Another fun thing is the merge of ->d_iput() callers (dentry_iput()
and dentry_unlink_inode()) and a bunch of ->d_compare() ones (all
except the one in __d_lookup_lru())"
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits)
fs/dcache.c: avoid soft-lockup in dput()
vfs: new d_init method
vfs: Update lookup_dcache() comment
bdev: get rid of ->bd_inodes
Remove last traces of ->sync_page
new helper: d_same_name()
dentry_cmp(): use lockless_dereference() instead of smp_read_barrier_depends()
vfs: clean up documentation
vfs: document ->d_real()
vfs: merge .d_select_inode() into .d_real()
unify dentry_iput() and dentry_unlink_inode()
binfmt_misc: ->s_root is not going anywhere
drop redundant ->owner initializations
ufs: get rid of redundant checks
orangefs: constify inode_operations
missed comment updates from ->direct_IO() prototype change
file_inode(f)->i_mapping is f->f_mapping
trim fsnotify hooks a bit
9p: new helper - v9fs_parent_fid()
debugfs: ->d_parent is never NULL or negative
...
2016-07-28 12:59:05 -07:00
Chao Yu
82e0a5aa5d
f2fs: fix to avoid data update racing between GC and DIO
...
Datas in file can be operated by GC and DIO simultaneously, so we will
face race case as below:
For write case:
Thread A Thread B
- generic_file_direct_write
- invalidate_inode_pages2_range
- f2fs_direct_IO
- do_blockdev_direct_IO
- do_direct_IO
- get_more_blocks
- f2fs_gc
- do_garbage_collect
- gc_data_segment
- move_data_page
- do_write_data_page
migrate data block to new block address
- dio_bio_submit
update user data to old block address
For read case:
Thread A Thread B
- generic_file_direct_write
- invalidate_inode_pages2_range
- f2fs_direct_IO
- do_blockdev_direct_IO
- do_direct_IO
- get_more_blocks
- f2fs_balance_fs
- f2fs_gc
- do_garbage_collect
- gc_data_segment
- move_data_page
- do_write_data_page
migrate data block to new block address
- write_checkpoint
- do_checkpoint
- clear_prefree_segments
- f2fs_issue_discard
discard old block adress
- dio_bio_submit
update user buffer from obsolete block address
In order to fix this, for one file, we should let DIO and GC getting exclusion
against with each other.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-15 15:21:22 -07:00
Jaegeuk Kim
b56ab837a0
f2fs: avoid mark_inode_dirty
...
Let's check inode's dirtiness before calling mark_inode_dirty.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-08 10:34:09 -07:00
Chao Yu
3e6d0b4d9c
f2fs: fix incorrect f_bfree calculation in ->statfs
...
As manual described, f_bfree indicates total free blocks in fs, in f2fs, it
includes two parts: visible free blocks and over-provision blocks. This
patch corrrects the calculation.
fsblkcnt_t f_bfree; /* free blocks in fs */
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-08 10:33:32 -07:00
Jaegeuk Kim
ec795418c4
f2fs: use percpu_rw_semaphore
...
This patch replaces rw_semaphore with percpu_rw_semaphore for:
sbi->cp_rwsem
nm_i->nat_tree_lock
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-08 10:33:31 -07:00
Chao Yu
64058be9c8
f2fs: add nodiscard mount option
...
This patch adds 'nodiscard' mount option.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-08 10:33:26 -07:00
Jaegeuk Kim
52763a4b7a
f2fs: detect host-managed SMR by feature flag
...
If mkfs.f2fs gives a feature flag for host-managed SMR, we can set mode=lfs
by default.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-06 10:44:07 -07:00
Jaegeuk Kim
67c3758d22
f2fs: call update_inode_page for orphan inodes
...
Let's store orphan inode pages right away.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-06 10:44:07 -07:00
Jaegeuk Kim
36abef4e79
f2fs: introduce mode=lfs mount option
...
This mount option is to enable original log-structured filesystem forcefully.
So, there should be no random writes for main area.
Especially, this supports host-managed SMR device.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-13 11:55:21 -07:00
Jaegeuk Kim
7dfeaa3220
f2fs: avoid reverse IO order for NODE and DATA
...
There is a data race between allocate_data_block() and f2fs_sbumit_page_mbio(),
which incur unnecessary reversed bio submission.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-08 10:25:50 -07:00
Jaegeuk Kim
9a449e9c3b
f2fs: remove obsolete parameter in f2fs_truncate
...
We don't need lock parameter, which is always true.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-07 09:45:39 -07:00
Jaegeuk Kim
338bbfa086
f2fs: avoid wrong count on dirty inodes
...
The number should be covered by spin_lock. Otherwise we can see wrong count
in f2fs_stat.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-07 09:45:38 -07:00
Jaegeuk Kim
53aa6bbfda
f2fs: inject to produce some orphan inodes
...
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:19 -07:00
Jaegeuk Kim
b93f771286
f2fs: remove writepages lock
...
This patch removes writepages lock.
We can improve multi-threading performance.
tiobench, 32 threads, 4KB write per fsync on SSD
Before: 25.88 MB/s
After: 28.03 MB/s
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:17 -07:00
Jaegeuk Kim
69e9e42744
f2fs: set flush_merge by default
...
This patch sets flush_merge by default.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:16 -07:00
Jaegeuk Kim
6d94c74ab8
f2fs: add lazytime mount option
...
This patch adds lazytime support.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:14 -07:00
Jaegeuk Kim
26de9b1171
f2fs: avoid unnecessary updating inode during fsync
...
If roll-forward recovery can recover i_size, we don't need to update inode's
metadata during fsync.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:13 -07:00
Jaegeuk Kim
0f18b462b2
f2fs: flush inode metadata when checkpoint is doing
...
This patch registers all the inodes which have dirty metadata to sync when
checkpoint is doing.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:11 -07:00
Jaegeuk Kim
fc9581c809
f2fs: introduce f2fs_i_size_write with mark_inode_dirty_sync
...
This patch introduces f2fs_i_size_write() to call mark_inode_dirty_sync() with
i_size_write().
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:08 -07:00
Jaegeuk Kim
91942321e4
f2fs: use inode pointer for {set, clear}_inode_flag
...
This patch refactors to use inode pointer for set_inode_flag and
clear_inode_flag.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:07 -07:00
Al Viro
84c60b1388
drop redundant ->owner initializations
...
it's not needed for file_operations of inodes located on fs defined
in the hosting module and for file_operations that go into procfs.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-29 19:08:00 -04:00
Tiezhu Yang
b8bef79df7
f2fs: make exit_f2fs_fs more clear
...
init_f2fs_fs does:
1) f2fs_build_trace_ios
2) init_inodecache
3) create_node_manager_caches
4) create_segment_manager_caches
5) create_checkpoint_caches
6) create_extent_cache
7) kset_create_and_add
8) kobject_init_and_add
9) register_shrinker
10) register_filesystem
11) f2fs_create_root_stats
12) proc_mkdir
exit_f2fs_fs should do cleanup in the reverse order
to make the code more clear.
Signed-off-by: Tiezhu Yang <kernelpatch@126.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-05-18 13:57:31 -07:00
Jaegeuk Kim
513c5f3735
f2fs: use percpu_counter for total_valid_inode_count
...
This patch uses percpu_counter to avoid stat_lock.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-05-18 13:57:30 -07:00