- 26 Dec, 2018 7 commits
-
-
Qiuyang Sun authored
Should use lstart (logical start address) instead of start (in dev) here. This fixes a bug in multi-device scenarios. Signed-off-by: Qiuyang Sun <sunqiuyang@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Sahitya Tummala authored
When there is a failure in f2fs_fill_super() after/during the recovery of fsync'd nodes, it frees the current sbi and retries again. This time the mount is successful, but the files that got recovered before retry, still holds the extent tree, whose extent nodes list is corrupted since sbi and sbi->extent_list is freed up. The list_del corruption issue is observed when the file system is getting unmounted and when those recoverd files extent node is being freed up in the below context. list_del corruption. prev->next should be fffffff1e1ef5480, but was (null) <...> kernel BUG at kernel/msm-4.14/lib/list_debug.c:53! lr : __list_del_entry_valid+0x94/0xb4 pc : __list_del_entry_valid+0x94/0xb4 <...> Call trace: __list_del_entry_valid+0x94/0xb4 __release_extent_node+0xb0/0x114 __free_extent_tree+0x58/0x7c f2fs_shrink_extent_tree+0xdc/0x3b0 f2fs_leave_shrinker+0x28/0x7c f2fs_put_super+0xfc/0x1e0 generic_shutdown_super+0x70/0xf4 kill_block_super+0x2c/0x5c kill_f2fs_super+0x44/0x50 deactivate_locked_super+0x60/0x8c deactivate_super+0x68/0x74 cleanup_mnt+0x40/0x78 __cleanup_mnt+0x1c/0x28 task_work_run+0x48/0xd0 do_notify_resume+0x678/0xe98 work_pending+0x8/0x14 Fix this by not creating extents for those recovered files if shrinker is not registered yet. Once mount is successful and shrinker is registered, those files can have extents again. Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
This patch cleans up checkpoint flow a bit: - remove unneeded circulation of flushing meta pages. - don't flush nat_bits pages in prior to other checkpoint pages. - add bug_on to check remained meta pages after flushing. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
Sometimes, I could observe # of issuing_discard to be 1 which blocks background jobs due to is_idle()=false. The only way to get out of it was to trigger gc_urgent. This patch avoids that by checking any candidates as done in the list. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
Let's use "queued" instead of "issuing". Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
One report says memalloc failure during mount. (unwind_backtrace) from [<c010cd4c>] (show_stack+0x10/0x14) (show_stack) from [<c049c6b8>] (dump_stack+0x8c/0xa0) (dump_stack) from [<c024fcf0>] (warn_alloc+0xc4/0x160) (warn_alloc) from [<c0250218>] (__alloc_pages_nodemask+0x3f4/0x10d0) (__alloc_pages_nodemask) from [<c0270450>] (kmalloc_order_trace+0x2c/0x120) (kmalloc_order_trace) from [<c03fa748>] (build_node_manager+0x35c/0x688) (build_node_manager) from [<c03de494>] (f2fs_fill_super+0xf0c/0x16cc) (f2fs_fill_super) from [<c02a5864>] (mount_bdev+0x15c/0x188) (mount_bdev) from [<c03da624>] (f2fs_mount+0x18/0x20) (f2fs_mount) from [<c02a68b8>] (mount_fs+0x158/0x19c) (mount_fs) from [<c02c3c9c>] (vfs_kern_mount+0x78/0x134) (vfs_kern_mount) from [<c02c76ac>] (do_mount+0x474/0xca4) (do_mount) from [<c02c8264>] (SyS_mount+0x94/0xbc) (SyS_mount) from [<c0108180>] (ret_fast_syscall+0x0/0x48) Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Yunlong Song authored
Commit 089842de ("f2fs: remove codes of unused wio_mutex") removes codes of unused wio_mutex, but missing the comment, so delete it. Signed-off-by: Yunlong Song <yunlong.song@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
- 14 Dec, 2018 3 commits
-
-
Chao Yu authored
This patch reorders flow from - update page - set_page_dirty - wait_on_page_writeback to - wait_on_page_writeback - update page - set_page_dirty The reason is: - set_page_dirty will increase reference of dirty page, the reference should be cleared before wait_on_page_writeback to keep its consistency. - some devices need stable page during page writebacking, so we should not change page's data. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Sheng Yong authored
If IPU failed, nothing is commited, we should end page writeback. Signed-off-by: Sheng Yong <shengyong1@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
This adds an option in ioctl(F2FS_IOC_SHUTDOWN) in order to trigger fsck by setting a NEED_FSCK flag. Generally, shutdown is used for the test to validate filesystem consistency, and setting NEED_FSCK flag can be used for Android to trigger fsck.f2fs at boot time explicitly so that we could measure the elapsed time as well as force filesystem check. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
- 28 Nov, 2018 1 commit
-
-
Jaegeuk Kim authored
If we want to re-enable nat_bits, we rely on fsck which requires full scan of directory tree. Let's do that by regular fsck or unclean shutdown. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
- 27 Nov, 2018 4 commits
-
-
Jia Zhu authored
Previously, we added a parameter @map.m_may_create to trigger OPU allocation and call f2fs_balance_fs() correctly. But in get_more_blocks(), @create has been overwritten by below code. So the function f2fs_map_blocks() will not allocate new block address but directly go out. Meanwile,there are several functions calling f2fs_map_blocks() directly and @map.m_may_create not initialized. CODE: create = dio->op == REQ_OP_WRITE; if (dio->flags & DIO_SKIP_HOLES) { if (fs_startblk <= ((i_size_read(dio->inode) - 1) >> i_blkbits)) create = 0; } This patch fixes it. Signed-off-by: Jia Zhu <zhujia13@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Jia Zhu authored
Previously, we allocated a new block address for OPU mode in direct_IO. But the new address couldn't be assigned to @map->m_pblk correctly. This patch fix it. Cc: <stable@vger.kernel.org> Fixes: 511f52d02f05 ("f2fs: allow out-place-update for direct IO in LFS mode") Signed-off-by: Jia Zhu <zhujia13@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Sahitya Tummala authored
Adjust the trace print in f2fs_get_victim() to cover GC done by F2FS_IOC_GARBAGE_COLLECT_RANGE. Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Sahitya Tummala authored
Allow node type segments also to be GC'd via f2fs ioctl F2FS_IOC_GARBAGE_COLLECT_RANGE. Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
- 26 Nov, 2018 22 commits
-
-
Alexey Dobriyan authored
Those strings are immutable. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Pan Bian authored
The function truncate_node frees the page with f2fs_put_page. However, the page index is read after that. So, the patch reads the index before freeing the page. Fixes: bf39c00a ("f2fs: drop obsolete node page when it is truncated") Cc: <stable@vger.kernel.org> Signed-off-by: Pan Bian <bianpan2016@163.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Tiezhu Yang authored
When call f2fs_acl_create_masq() failed, the caller f2fs_acl_create() should return -EIO instead of -ENOMEM, this patch makes it consistent with posix_acl_create() which has been fixed in commit beaf226b ("posix_acl: don't ignore return value of posix_acl_create_masq()"). Fixes: 83dfe53c ("f2fs: fix reference leaks in f2fs_acl_create") Signed-off-by: Tiezhu Yang <kernelpatch@126.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
After merging the f2fs tree, today's linux-next build (x86_64_allmodconfig) produced this warning: In file included from fs/f2fs/dir.c:11: fs/f2fs/f2fs.h: In function '__mark_inode_dirty_flag': fs/f2fs/f2fs.h:2388:6: warning: this statement may fall through [-Wimplicit-fallthrough=] if (set) ^ fs/f2fs/f2fs.h:2390:2: note: here case FI_DATA_EXIST: ^~~~ Exposed by my use of -Wimplicit-fallthrough Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Sheng Yong authored
The following race could lead to inconsistent SIT bitmap: Task A Task B ====== ====== f2fs_write_checkpoint block_operations f2fs_lock_all down_write(node_change) down_write(node_write) ... sync ... up_write(node_change) f2fs_file_write_iter set_inode_flag(FI_NO_PREALLOC) ...... f2fs_write_begin(index=0, has inline data) prepare_write_begin __do_map_lock(AIO) => down_read(node_change) f2fs_convert_inline_page => update SIT __do_map_lock(AIO) => up_read(node_change) f2fs_flush_sit_entries <= inconsistent SIT finish write checkpoint sudden-power-off If SPO occurs after checkpoint is finished, SIT bitmap will be set incorrectly. Signed-off-by: Sheng Yong <shengyong1@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
If namelen is corrupted to have very long value, fill_dentries can copy wrong memory area. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Yunlong Song authored
Previously, when f2fs finds which temp bio cache owns the target page, it will flush all the three temp bio caches, but we only need to flush one single bio cache indeed, which can help to keep bio merged. Signed-off-by: Yunlong Song <yunlong.song@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
In get_more_blocks(), we may override @create as below code: create = dio->op == REQ_OP_WRITE; if (dio->flags & DIO_SKIP_HOLES) { if (fs_startblk <= ((i_size_read(dio->inode) - 1) >> i_blkbits)) create = 0; } But in f2fs_map_blocks(), we only trigger f2fs_balance_fs() if @create is 1, so in LFS mode, dio overwrite under LFS mode can easily run out of free segments, result in below panic. Call Trace: allocate_segment_by_default+0xa8/0x270 [f2fs] f2fs_allocate_data_block+0x1ea/0x5c0 [f2fs] __allocate_data_block+0x306/0x480 [f2fs] f2fs_map_blocks+0x6f6/0x920 [f2fs] __get_data_block+0x4f/0xb0 [f2fs] get_data_block_dio_write+0x50/0x60 [f2fs] do_blockdev_direct_IO+0xcd5/0x21e0 __blockdev_direct_IO+0x3a/0x3c f2fs_direct_IO+0x1ff/0x4a0 [f2fs] generic_file_direct_write+0xd9/0x160 __generic_file_write_iter+0xbb/0x1e0 f2fs_file_write_iter+0xaf/0x220 [f2fs] __vfs_write+0xd0/0x130 vfs_write+0xb2/0x1b0 SyS_pwrite64+0x69/0xa0 ? vtime_user_exit+0x29/0x70 do_syscall_64+0x6e/0x160 entry_SYSCALL64_slow_path+0x25/0x25 RIP: new_curseg+0x36f/0x380 [f2fs] RSP: ffffac570393f7a8 So this patch introduces a parameter map.m_may_create to indicate that f2fs_map_blocks() is called from write or read path, which can give the right hint to let f2fs_map_blocks() trigger OPU allocation and call f2fs_balanc_fs() correctly. BTW, it disables physical address preallocation for direct IO in f2fs_preallocate_blocks, which is redundant to OPU allocation of f2fs_map_blocks. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
This patch adds missing in-flight discard/preflush/dio command count check in is_idle(). Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
This patch adds f2fs_dio_submit_bio() to hook submit_io/end_io functions in direct IO path, in order to account DIO. Later, we will add this count into is_idle() to let background GC/Discard thread be aware of DIO. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Yunlei He authored
This patch move dir data flush to write checkpoint process, by doing this, it may reduce some time for dir fsync. pre: -f2fs_do_sync_file enter -file_write_and_wait_range <- flush & wait -write_checkpoint -do_checkpoint <- wait all -f2fs_do_sync_file exit now: -f2fs_do_sync_file enter -write_checkpoint -block_operations <- flush dir & no wait -do_checkpoint <- wait all -f2fs_do_sync_file exit Signed-off-by: Yunlei He <heyunlei@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Yangtao Li authored
Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Yunlong Song authored
f2fs_ioc_gc_range skips blocks_per_seg each time, however, f2fs_gc moves blocks of section each time, so fix it from segment to section. Signed-off-by: Yunlong Song <yunlong.song@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
Add one sysfs entry to control migration granularity of GC in large section f2fs, it can be tuned to mitigate heavy overhead of migrating huge number of blocks in large section. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
Section is minimal garbage collection unit of f2fs, in zoned block device, or ancient block mapping flash device, in order to improve GC efficiency, we can align GC unit to lower device erase unit, normally, it consists of multiple of segments. Once background or foreground GC triggers, it brings a large number of IOs, which will impact user IO, and also occupy cpu/memory resource intensively. So, to reduce impact of GC on large size section, this patch supports subsectional GC, in one cycle of GC, it only migrate partial segment{s} in victim section. Currently, by default, we use sbi->segs_per_sec as migration granularity. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
Introduce a wrapper __is_large_section() to clean up codes. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
In F2FS_HAS_FEATURE(), we will use F2FS_SB(sb) to get sbi pointer to access .raw_super field, to avoid unneeded pointer conversion, this patch changes to F2FS_HAS_FEATURE() accept sbi parameter directly. Just do cleanup, no logic change. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Yunlong Song authored
Signed-off-by: Yunlong Song <yunlong.song@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Yunlong Song authored
When sbi->segs_per_sec > 1, and if some segno has 0 valid blocks before gc starts, do_garbage_collect will skip counting seg_freed++, and this will cause seg_freed < sbi->segs_per_sec and finally skip sec_freed++. Signed-off-by: Yunlong Song <yunlong.song@huawei.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
Previously, we only account preflush command for flush_merge mode, so for noflush_merge mode, we can not know in-flight preflush command count, fix it. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Yunlong Song authored
The encrypted file may be corrupted by GC in following case: Time 1: | segment 1 blkaddr = A | GC -> | segment 2 blkaddr = B | Encrypted block 1 is moved from blkaddr A of segment 1 to blkaddr B of segment 2, Time 2: | segment 1 blkaddr = B | GC -> | segment 3 blkaddr = C | Before page 1 is written back and if segment 2 become a victim, then page 1 is moved from blkaddr B of segment 2 to blkaddr Cof segment 3, during the GC process of Time 2, f2fs should wait for page 1 written back before reading it, or move_data_block will read a garbage block from blkaddr B since page is not written back to blkaddr B yet. Commit 6aa58d8a ("f2fs: readahead encrypted block during GC") introduce ra_data_block to read encrypted block, but it forgets to add f2fs_wait_on_page_writeback to avoid racing between GC and flush. Signed-off-by: Yunlong Song <yunlong.song@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
-
Linus Torvalds authored
Merge tag 'hwmon-for-v4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: - fix temp4_type attribute permissions in w83795 driver - fix tacho fault detection in mlxreg-fan driver - fix current value calculations in ina2xx driver - fix initial notification/warning in raspberrypi driver - fix a NULL pointer access in ina2xx * tag 'hwmon-for-v4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (w83795) temp4_type has writable permission hwmon: (mlxreg-fan) Fix macros for tacho fault reading hwmon: (ina2xx) Fix current value calculation hwmon: (raspberrypi) Fix initial notify hwmon (ina2xx) Fix NULL id pointer in probe()
-
- 25 Nov, 2018 3 commits
-
-
Linus Torvalds authored
-
git://git.infradead.org/users/hch/dma-mappingLinus Torvalds authored
Pull dma-mapping fixes from Christoph Hellwig: "Two dma-direct / swiotlb regressions fixes: - zero is a valid physical address on some arm boards, we can't use it as the error value - don't try to cache flush the error return value (no matter what it is)" * tag 'dma-mapping-4.20-3' of git://git.infradead.org/users/hch/dma-mapping: swiotlb: Skip cache maintenance on map error dma-direct: Make DIRECT_MAPPING_ERROR viable for SWIOTLB
-
git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds authored
Pull NFS client bugfixes from Trond Myklebust: - Fix a NFSv4 state manager deadlock when returning a delegation - NFSv4.2 copy do not allocate memory under the lock - flexfiles: Use the correct stateid for IO in the tightly coupled case * tag 'nfs-for-4.20-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: flexfiles: use per-mirror specified stateid for IO NFSv4.2 copy do not allocate memory under the lock NFSv4: Fix a NFSv4 state manager deadlock
-