Commit 041fae9c authored by Linus Torvalds

Merge tag 'f2fs-for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "In this round, we've added two features: F2FS_IOC_START_ATOMIC_REPLACE
  and a per-block age-based extent cache.

  F2FS_IOC_START_ATOMIC_REPLACE is a variant of the previous atomic
  write feature that guarantees per-file atomicity. It should be more
  efficient than the AtomicFile implementation in the Android framework.

  The per-block age-based extent cache implements another type of
  in-memory extent cache that keeps the per-block age of a file, so
  that the block allocator can separate hot and cold data blocks more
  accurately.

  Enhancements:
   - introduce F2FS_IOC_START_ATOMIC_REPLACE
   - refactor extent_cache to support a new per-block-age-based extent cache
   - introduce discard_urgent_util, gc_mode, max_ordered_discard sysfs knobs
   - add proc entry to show discard_plist info
   - optimize iteration over sparse directories
   - add barrier mount option

  Bug fixes:
   - avoid victim selection from previous victim section
   - fix to enable compress for newly created file if extension matches
   - set zstd compress level correctly
   - initialize locks early in f2fs_fill_super() to fix bugs reported by syzbot
   - correct i_size change for atomic writes
   - allow to read node block after shutdown
   - allow to set compression for inlined file
   - fix gc mode when gc_urgent_high_remaining is 1
   - should put a page when checking the summary info

  Minor fixes and various clean-ups in GC, discard, debugfs, sysfs, and
  doc"
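
A minimal userspace sketch of the new ioctl (not part of this merge; error
handling abbreviated, and it assumes <linux/f2fs.h> exports the ioctl numbers
shown in the uapi hunk at the end of this diff):

#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/f2fs.h>	/* F2FS_IOC_START_ATOMIC_REPLACE et al. */

/* Replace path's contents with buf[0..len) so that either all of the
 * new data becomes visible or none of it does. */
int replace_file_atomically(const char *path, const void *buf, size_t len)
{
	int fd = open(path, O_RDWR);

	if (fd < 0)
		return -1;
	/* Truncates the file and opens an atomic write session. */
	if (ioctl(fd, F2FS_IOC_START_ATOMIC_REPLACE) < 0)
		goto fail;
	/* Sketch only: real callers should loop on short writes. */
	if (write(fd, buf, len) != (ssize_t)len)
		goto abort;
	if (ioctl(fd, F2FS_IOC_COMMIT_ATOMIC_WRITE) < 0)
		goto fail;
	close(fd);
	return 0;
abort:
	ioctl(fd, F2FS_IOC_ABORT_ATOMIC_WRITE);
fail:
	close(fd);
	return -1;
}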

* tag 'f2fs-for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (63 commits)
  f2fs: reset wait_ms to default if any of the victims have been selected
  f2fs: fix some format WARNING in debug.c and sysfs.c
  f2fs: don't call f2fs_issue_discard_timeout() when discard_cmd_cnt is 0 in f2fs_put_super()
  f2fs: fix iostat parameter for discard
  f2fs: Fix spelling mistake in label: free_bio_enrty_cache -> free_bio_entry_cache
  f2fs: add block_age-based extent cache
  f2fs: allocate the extent_cache by default
  f2fs: refactor extent_cache to support for read and more
  f2fs: remove unnecessary __init_extent_tree
  f2fs: move internal functions into extent_cache.c
  f2fs: specify extent cache for read explicitly
  f2fs: introduce f2fs_is_readonly() for readability
  f2fs: remove F2FS_SET_FEATURE() and F2FS_CLEAR_FEATURE() macro
  f2fs: do some cleanup for f2fs module init
  MAINTAINERS: Add f2fs bug tracker link
  f2fs: remove the unused flush argument to change_curseg
  f2fs: open code allocate_segment_by_default
  f2fs: remove struct segment_allocation default_salloc_ops
  f2fs: introduce discard_urgent_util sysfs node
  f2fs: define MIN_DISCARD_GRANULARITY macro
  ...
parents eb67d239 26a8057a
......@@ -99,6 +99,12 @@ Description: Controls the issue rate of discard commands that consist of small
checkpoint is triggered, and issued during the checkpoint.
By default, it is disabled with 0.
What: /sys/fs/f2fs/<disk>/max_ordered_discard
Date: October 2022
Contact: "Yangtao Li" <frank.li@vivo.com>
Description: Controls the maximum ordered discard; the unit size is one block (4KB).
The default value is 16.
What: /sys/fs/f2fs/<disk>/max_discard_request
Date: December 2021
Contact: "Konstantin Vyshetsky" <vkon@google.com>
......@@ -132,7 +138,8 @@ Contact: "Chao Yu" <yuchao0@huawei.com>
Description: Controls discard granularity of inner discard thread. Inner thread
will not issue discards with size that is smaller than granularity.
The unit size is one block (4KB); it currently supports configuring
in the range of [1, 512]. Default value is 4 (=16KB).
in the range of [1, 512]. Default value is 16.
For small devices, default value is 1.
What: /sys/fs/f2fs/<disk>/umount_discard_timeout
Date: January 2019
......@@ -235,7 +242,7 @@ Description: Shows total written kbytes issued to disk.
What: /sys/fs/f2fs/<disk>/features
Date: July 2017
Contact: "Jaegeuk Kim" <jaegeuk@kernel.org>
Description: <deprecated: should use /sys/fs/f2fs/<disk>/feature_list/
Description: <deprecated: should use /sys/fs/f2fs/<disk>/feature_list/>
Shows all enabled features in current device.
Supported features:
encryption, blkzoned, extra_attr, projquota, inode_checksum,
......@@ -592,10 +599,10 @@ Description: With "mode=fragment:block" mount options, we can scatter block allo
in the length of 1..<max_fragment_hole> by turns. This value can be set
between 1..512 and the default value is 4.
What: /sys/fs/f2fs/<disk>/gc_urgent_high_remaining
Date: December 2021
Contact: "Daeho Jeong" <daehojeong@google.com>
Description: You can set the trial count limit for GC urgent high mode with this value.
What: /sys/fs/f2fs/<disk>/gc_remaining_trials
Date: October 2022
Contact: "Yangtao Li" <frank.li@vivo.com>
Description: You can set the trial count limit for GC urgent and idle modes with this value.
If the GC thread reaches the limit, the mode turns back to GC normal mode.
By default, the value is zero, which means there is no limit, as before.
......@@ -634,3 +641,31 @@ Date: July 2022
Contact: "Daeho Jeong" <daehojeong@google.com>
Description: Show the accumulated total revoked atomic write block count after boot.
If you write "0" here, you can initialize to "0".
What: /sys/fs/f2fs/<disk>/gc_mode
Date: October 2022
Contact: "Yangtao Li" <frank.li@vivo.com>
Description: Show the current gc_mode as a string.
This is a read-only entry.
What: /sys/fs/f2fs/<disk>/discard_urgent_util
Date: November 2022
Contact: "Yangtao Li" <frank.li@vivo.com>
Description: When space utilization exceeds this threshold, do background DISCARD aggressively.
DISCARD is issued forcibly in every period of min_discard_issue_time while the
number of pending discards is not 0, and the discard granularity is set to 1.
Default: 80
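
A sketch of the intended check, assuming the helper name utilization() and
the threshold parameter (both may differ from the in-kernel code):

/* Once the main area is more than urgent_util percent full, the discard
 * thread switches to its most aggressive issue policy. */
static bool discard_is_urgent(struct f2fs_sb_info *sbi, unsigned int urgent_util)
{
	return utilization(sbi) > urgent_util;	/* default threshold: 80 */
}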
What: /sys/fs/f2fs/<disk>/hot_data_age_threshold
Date: November 2022
Contact: "Ping Xiong" <xiongping1@xiaomi.com>
Description: When DATA SEPARATION is on, it controls the age threshold used to indicate
data blocks as hot. By default it is initialized to 262144 blocks
(equal to 1GB).
What: /sys/fs/f2fs/<disk>/warm_data_age_threshold
Date: November 2022
Contact: "Ping Xiong" <xiongping1@xiaomi.com>
Description: When DATA SEPARATION is on, it controls the age threshold used to indicate
data blocks as warm. By default it is initialized to 2621440 blocks
(equal to 10GB).
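
A hypothetical helper showing how the two thresholds are meant to partition
block ages (names invented for illustration; the real decision is made in the
block allocator using ages recorded in the block-age extent cache):

enum data_temp { TEMP_HOT, TEMP_WARM, TEMP_COLD };

static enum data_temp temp_by_age(unsigned long long age,
				  unsigned long long hot_thresh,  /* 262144 blocks ~= 1GB  */
				  unsigned long long warm_thresh) /* 2621440 blocks ~= 10GB */
{
	if (age < hot_thresh)
		return TEMP_HOT;	/* updated recently: group with hot data */
	if (age < warm_thresh)
		return TEMP_WARM;
	return TEMP_COLD;		/* rarely updated: group with cold data */
}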
......@@ -25,10 +25,14 @@ a consistency checking tool (fsck.f2fs), and a debugging tool (dump.f2fs).
- git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs-tools.git
For reporting bugs and sending patches, please use the following mailing list:
For sending patches, please use the following mailing list:
- linux-f2fs-devel@lists.sourceforge.net
For reporting bugs, please use the following f2fs bug tracker link:
- https://bugzilla.kernel.org/enter_bug.cgi?product=File%20System&component=f2fs
Background and Design issues
============================
......@@ -154,6 +158,8 @@ nobarrier This option can be used if underlying storage guarantees
If this option is set, no cache_flush commands are issued
but f2fs still guarantees the write ordering of all the
data writes.
barrier If this option is set, cache_flush commands are allowed to be
issued.
fastboot This option is used when a system wants to reduce mount
time as much as possible, even though normal performance
can be sacrificed.
......@@ -199,6 +205,7 @@ fault_type=%d Support configuring fault injection type, should be
FAULT_SLAB_ALLOC 0x000008000
FAULT_DQUOT_INIT 0x000010000
FAULT_LOCK_OP 0x000020000
FAULT_BLKADDR 0x000040000
=================== ===========
mode=%s Control block allocation mode which supports "adaptive"
and "lfs". In "lfs" mode, there should be no random
......@@ -340,6 +347,10 @@ memory=%s Control memory mode. This supports "normal" and "low" modes.
Because of the nature of low memory devices, in this mode, f2fs
will try to save memory sometimes by sacrificing performance.
"normal" mode is the default mode and same as before.
age_extent_cache Enable an age extent cache based on an rb-tree. It records
the data block update frequency of each extent per inode, in
order to provide better temperature hints for data block
allocation.
======================== ============================================================
Debugfs Entries
......
......@@ -7889,6 +7889,7 @@ M: Chao Yu <chao@kernel.org>
L: linux-f2fs-devel@lists.sourceforge.net
S: Maintained
W: https://f2fs.wiki.kernel.org/
B: https://bugzilla.kernel.org/enter_bug.cgi?product=File%20System&component=f2fs
T: git git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git
F: Documentation/ABI/testing/sysfs-fs-f2fs
F: Documentation/filesystems/f2fs.rst
......
......@@ -171,6 +171,11 @@ static bool __is_bitmap_valid(struct f2fs_sb_info *sbi, block_t blkaddr,
bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi,
block_t blkaddr, int type)
{
if (time_to_inject(sbi, FAULT_BLKADDR)) {
f2fs_show_injection_info(sbi, FAULT_BLKADDR);
return false;
}
switch (type) {
case META_NAT:
break;
......@@ -1897,8 +1902,10 @@ int f2fs_start_ckpt_thread(struct f2fs_sb_info *sbi)
cprc->f2fs_issue_ckpt = kthread_run(issue_checkpoint_thread, sbi,
"f2fs_ckpt-%u:%u", MAJOR(dev), MINOR(dev));
if (IS_ERR(cprc->f2fs_issue_ckpt)) {
int err = PTR_ERR(cprc->f2fs_issue_ckpt);
cprc->f2fs_issue_ckpt = NULL;
return -ENOMEM;
return err;
}
set_task_ioprio(cprc->f2fs_issue_ckpt, cprc->ckpt_thread_ioprio);
......
......@@ -346,7 +346,7 @@ static int zstd_init_compress_ctx(struct compress_ctx *cc)
if (!level)
level = F2FS_ZSTD_DEFAULT_CLEVEL;
params = zstd_get_params(F2FS_ZSTD_DEFAULT_CLEVEL, cc->rlen);
params = zstd_get_params(level, cc->rlen);
workspace_size = zstd_cstream_workspace_bound(&params.cParams);
workspace = f2fs_kvmalloc(F2FS_I_SB(cc->inode),
......@@ -567,10 +567,7 @@ MODULE_PARM_DESC(num_compress_pages,
int f2fs_init_compress_mempool(void)
{
compress_page_pool = mempool_create_page_pool(num_compress_pages, 0);
if (!compress_page_pool)
return -ENOMEM;
return 0;
return compress_page_pool ? 0 : -ENOMEM;
}
void f2fs_destroy_compress_mempool(void)
......@@ -1981,9 +1978,7 @@ int f2fs_init_page_array_cache(struct f2fs_sb_info *sbi)
sbi->page_array_slab = f2fs_kmem_cache_create(slab_name,
sbi->page_array_slab_size);
if (!sbi->page_array_slab)
return -ENOMEM;
return 0;
return sbi->page_array_slab ? 0 : -ENOMEM;
}
void f2fs_destroy_page_array_cache(struct f2fs_sb_info *sbi)
......@@ -1991,53 +1986,24 @@ void f2fs_destroy_page_array_cache(struct f2fs_sb_info *sbi)
kmem_cache_destroy(sbi->page_array_slab);
}
static int __init f2fs_init_cic_cache(void)
int __init f2fs_init_compress_cache(void)
{
cic_entry_slab = f2fs_kmem_cache_create("f2fs_cic_entry",
sizeof(struct compress_io_ctx));
if (!cic_entry_slab)
return -ENOMEM;
return 0;
}
static void f2fs_destroy_cic_cache(void)
{
kmem_cache_destroy(cic_entry_slab);
}
static int __init f2fs_init_dic_cache(void)
{
dic_entry_slab = f2fs_kmem_cache_create("f2fs_dic_entry",
sizeof(struct decompress_io_ctx));
if (!dic_entry_slab)
return -ENOMEM;
return 0;
}
static void f2fs_destroy_dic_cache(void)
{
kmem_cache_destroy(dic_entry_slab);
}
int __init f2fs_init_compress_cache(void)
{
int err;
err = f2fs_init_cic_cache();
if (err)
goto out;
err = f2fs_init_dic_cache();
if (err)
goto free_cic;
return 0;
free_cic:
f2fs_destroy_cic_cache();
out:
kmem_cache_destroy(cic_entry_slab);
return -ENOMEM;
}
void f2fs_destroy_compress_cache(void)
{
f2fs_destroy_dic_cache();
f2fs_destroy_cic_cache();
kmem_cache_destroy(dic_entry_slab);
kmem_cache_destroy(cic_entry_slab);
}
......@@ -39,10 +39,8 @@ static struct bio_set f2fs_bioset;
int __init f2fs_init_bioset(void)
{
if (bioset_init(&f2fs_bioset, F2FS_BIO_POOL_SIZE,
0, BIOSET_NEED_BVECS))
return -ENOMEM;
return 0;
return bioset_init(&f2fs_bioset, F2FS_BIO_POOL_SIZE,
0, BIOSET_NEED_BVECS);
}
void f2fs_destroy_bioset(void)
......@@ -1145,7 +1143,7 @@ void f2fs_update_data_blkaddr(struct dnode_of_data *dn, block_t blkaddr)
{
dn->data_blkaddr = blkaddr;
f2fs_set_data_blkaddr(dn);
f2fs_update_extent_cache(dn);
f2fs_update_read_extent_cache(dn);
}
/* dn->ofs_in_node will be returned with up-to-date last block pointer */
......@@ -1214,7 +1212,7 @@ int f2fs_get_block(struct dnode_of_data *dn, pgoff_t index)
struct extent_info ei = {0, };
struct inode *inode = dn->inode;
if (f2fs_lookup_extent_cache(inode, index, &ei)) {
if (f2fs_lookup_read_extent_cache(inode, index, &ei)) {
dn->data_blkaddr = ei.blk + index - ei.fofs;
return 0;
}
......@@ -1223,7 +1221,8 @@ int f2fs_get_block(struct dnode_of_data *dn, pgoff_t index)
}
struct page *f2fs_get_read_data_page(struct inode *inode, pgoff_t index,
blk_opf_t op_flags, bool for_write)
blk_opf_t op_flags, bool for_write,
pgoff_t *next_pgofs)
{
struct address_space *mapping = inode->i_mapping;
struct dnode_of_data dn;
......@@ -1235,7 +1234,7 @@ struct page *f2fs_get_read_data_page(struct inode *inode, pgoff_t index,
if (!page)
return ERR_PTR(-ENOMEM);
if (f2fs_lookup_extent_cache(inode, index, &ei)) {
if (f2fs_lookup_read_extent_cache(inode, index, &ei)) {
dn.data_blkaddr = ei.blk + index - ei.fofs;
if (!f2fs_is_valid_blkaddr(F2FS_I_SB(inode), dn.data_blkaddr,
DATA_GENERIC_ENHANCE_READ)) {
......@@ -1249,12 +1248,17 @@ struct page *f2fs_get_read_data_page(struct inode *inode, pgoff_t index,
set_new_dnode(&dn, inode, NULL, NULL, 0);
err = f2fs_get_dnode_of_data(&dn, index, LOOKUP_NODE);
if (err)
if (err) {
if (err == -ENOENT && next_pgofs)
*next_pgofs = f2fs_get_next_page_offset(&dn, index);
goto put_err;
}
f2fs_put_dnode(&dn);
if (unlikely(dn.data_blkaddr == NULL_ADDR)) {
err = -ENOENT;
if (next_pgofs)
*next_pgofs = index + 1;
goto put_err;
}
if (dn.data_blkaddr != NEW_ADDR &&
......@@ -1298,7 +1302,8 @@ struct page *f2fs_get_read_data_page(struct inode *inode, pgoff_t index,
return ERR_PTR(err);
}
struct page *f2fs_find_data_page(struct inode *inode, pgoff_t index)
struct page *f2fs_find_data_page(struct inode *inode, pgoff_t index,
pgoff_t *next_pgofs)
{
struct address_space *mapping = inode->i_mapping;
struct page *page;
......@@ -1308,7 +1313,7 @@ struct page *f2fs_find_data_page(struct inode *inode, pgoff_t index)
return page;
f2fs_put_page(page, 0);
page = f2fs_get_read_data_page(inode, index, 0, false);
page = f2fs_get_read_data_page(inode, index, 0, false, next_pgofs);
if (IS_ERR(page))
return page;
......@@ -1334,7 +1339,7 @@ struct page *f2fs_get_lock_data_page(struct inode *inode, pgoff_t index,
struct address_space *mapping = inode->i_mapping;
struct page *page;
repeat:
page = f2fs_get_read_data_page(inode, index, 0, for_write);
page = f2fs_get_read_data_page(inode, index, 0, for_write, NULL);
if (IS_ERR(page))
return page;
......@@ -1497,7 +1502,7 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
pgofs = (pgoff_t)map->m_lblk;
end = pgofs + maxblocks;
if (!create && f2fs_lookup_extent_cache(inode, pgofs, &ei)) {
if (!create && f2fs_lookup_read_extent_cache(inode, pgofs, &ei)) {
if (f2fs_lfs_mode(sbi) && flag == F2FS_GET_BLOCK_DIO &&
map->m_may_create)
goto next_dnode;
......@@ -1707,7 +1712,7 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
if (map->m_flags & F2FS_MAP_MAPPED) {
unsigned int ofs = start_pgofs - map->m_lblk;
f2fs_update_extent_cache_range(&dn,
f2fs_update_read_extent_cache_range(&dn,
start_pgofs, map->m_pblk + ofs,
map->m_len - ofs);
}
......@@ -1752,7 +1757,7 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
if (map->m_flags & F2FS_MAP_MAPPED) {
unsigned int ofs = start_pgofs - map->m_lblk;
f2fs_update_extent_cache_range(&dn,
f2fs_update_read_extent_cache_range(&dn,
start_pgofs, map->m_pblk + ofs,
map->m_len - ofs);
}
......@@ -2212,7 +2217,7 @@ int f2fs_read_multi_pages(struct compress_ctx *cc, struct bio **bio_ret,
if (f2fs_cluster_is_empty(cc))
goto out;
if (f2fs_lookup_extent_cache(inode, start_idx, &ei))
if (f2fs_lookup_read_extent_cache(inode, start_idx, &ei))
from_dnode = false;
if (!from_dnode)
......@@ -2643,7 +2648,7 @@ int f2fs_do_write_data_page(struct f2fs_io_info *fio)
set_new_dnode(&dn, inode, NULL, NULL, 0);
if (need_inplace_update(fio) &&
f2fs_lookup_extent_cache(inode, page->index, &ei)) {
f2fs_lookup_read_extent_cache(inode, page->index, &ei)) {
fio->old_blkaddr = ei.blk + page->index - ei.fofs;
if (!f2fs_is_valid_blkaddr(fio->sbi, fio->old_blkaddr,
......@@ -3367,7 +3372,7 @@ static int prepare_write_begin(struct f2fs_sb_info *sbi,
} else if (locked) {
err = f2fs_get_block(&dn, index);
} else {
if (f2fs_lookup_extent_cache(inode, index, &ei)) {
if (f2fs_lookup_read_extent_cache(inode, index, &ei)) {
dn.data_blkaddr = ei.blk + index - ei.fofs;
} else {
/* hole case */
......@@ -3408,7 +3413,7 @@ static int __find_data_block(struct inode *inode, pgoff_t index,
set_new_dnode(&dn, inode, ipage, ipage, 0);
if (f2fs_lookup_extent_cache(inode, index, &ei)) {
if (f2fs_lookup_read_extent_cache(inode, index, &ei)) {
dn.data_blkaddr = ei.blk + index - ei.fofs;
} else {
/* hole case */
......@@ -3472,6 +3477,9 @@ static int prepare_atomic_write_begin(struct f2fs_sb_info *sbi,
else if (*blk_addr != NULL_ADDR)
return 0;
if (is_inode_flag_set(inode, FI_ATOMIC_REPLACE))
goto reserve_block;
/* Look for the block in the original inode */
err = __find_data_block(inode, index, &ori_blk_addr);
if (err)
......@@ -4093,9 +4101,7 @@ int f2fs_init_post_read_wq(struct f2fs_sb_info *sbi)
sbi->post_read_wq = alloc_workqueue("f2fs_post_read_wq",
WQ_UNBOUND | WQ_HIGHPRI,
num_online_cpus());
if (!sbi->post_read_wq)
return -ENOMEM;
return 0;
return sbi->post_read_wq ? 0 : -ENOMEM;
}
void f2fs_destroy_post_read_wq(struct f2fs_sb_info *sbi)
......@@ -4108,9 +4114,7 @@ int __init f2fs_init_bio_entry_cache(void)
{
bio_entry_slab = f2fs_kmem_cache_create("f2fs_bio_entry_slab",
sizeof(struct bio_entry));
if (!bio_entry_slab)
return -ENOMEM;
return 0;
return bio_entry_slab ? 0 : -ENOMEM;
}
void f2fs_destroy_bio_entry_cache(void)
......
......@@ -72,15 +72,26 @@ static void update_general_status(struct f2fs_sb_info *sbi)
si->main_area_zones = si->main_area_sections /
le32_to_cpu(raw_super->secs_per_zone);
/* validation check of the segment numbers */
/* general extent cache stats */
for (i = 0; i < NR_EXTENT_CACHES; i++) {
struct extent_tree_info *eti = &sbi->extent_tree[i];
si->hit_cached[i] = atomic64_read(&sbi->read_hit_cached[i]);
si->hit_rbtree[i] = atomic64_read(&sbi->read_hit_rbtree[i]);
si->total_ext[i] = atomic64_read(&sbi->total_hit_ext[i]);
si->hit_total[i] = si->hit_cached[i] + si->hit_rbtree[i];
si->ext_tree[i] = atomic_read(&eti->total_ext_tree);
si->zombie_tree[i] = atomic_read(&eti->total_zombie_tree);
si->ext_node[i] = atomic_read(&eti->total_ext_node);
}
/* read extent_cache only */
si->hit_largest = atomic64_read(&sbi->read_hit_largest);
si->hit_cached = atomic64_read(&sbi->read_hit_cached);
si->hit_rbtree = atomic64_read(&sbi->read_hit_rbtree);
si->hit_total = si->hit_largest + si->hit_cached + si->hit_rbtree;
si->total_ext = atomic64_read(&sbi->total_hit_ext);
si->ext_tree = atomic_read(&sbi->total_ext_tree);
si->zombie_tree = atomic_read(&sbi->total_zombie_tree);
si->ext_node = atomic_read(&sbi->total_ext_node);
si->hit_total[EX_READ] += si->hit_largest;
/* block age extent_cache only */
si->allocated_data_blocks = atomic64_read(&sbi->allocated_data_blocks);
/* validation check of the segment numbers */
si->ndirty_node = get_pages(sbi, F2FS_DIRTY_NODES);
si->ndirty_dent = get_pages(sbi, F2FS_DIRTY_DENTS);
si->ndirty_meta = get_pages(sbi, F2FS_DIRTY_META);
......@@ -294,25 +305,32 @@ static void update_mem_info(struct f2fs_sb_info *sbi)
sizeof(struct nat_entry_set);
for (i = 0; i < MAX_INO_ENTRY; i++)
si->cache_mem += sbi->im[i].ino_num * sizeof(struct ino_entry);
si->cache_mem += atomic_read(&sbi->total_ext_tree) *
for (i = 0; i < NR_EXTENT_CACHES; i++) {
struct extent_tree_info *eti = &sbi->extent_tree[i];
si->ext_mem[i] = atomic_read(&eti->total_ext_tree) *
sizeof(struct extent_tree);
si->cache_mem += atomic_read(&sbi->total_ext_node) *
si->ext_mem[i] += atomic_read(&eti->total_ext_node) *
sizeof(struct extent_node);
si->cache_mem += si->ext_mem[i];
}
si->page_mem = 0;
if (sbi->node_inode) {
unsigned npages = NODE_MAPPING(sbi)->nrpages;
unsigned long npages = NODE_MAPPING(sbi)->nrpages;
si->page_mem += (unsigned long long)npages << PAGE_SHIFT;
}
if (sbi->meta_inode) {
unsigned npages = META_MAPPING(sbi)->nrpages;
unsigned long npages = META_MAPPING(sbi)->nrpages;
si->page_mem += (unsigned long long)npages << PAGE_SHIFT;
}
#ifdef CONFIG_F2FS_FS_COMPRESSION
if (sbi->compress_inode) {
unsigned npages = COMPRESS_MAPPING(sbi)->nrpages;
unsigned long npages = COMPRESS_MAPPING(sbi)->nrpages;
si->page_mem += (unsigned long long)npages << PAGE_SHIFT;
}
#endif
......@@ -460,28 +478,28 @@ static int stat_show(struct seq_file *s, void *v)
si->meta_count[META_NAT]);
seq_printf(s, " - ssa blocks : %u\n",
si->meta_count[META_SSA]);
seq_printf(s, "CP merge (Queued: %4d, Issued: %4d, Total: %4d, "
"Cur time: %4d(ms), Peak time: %4d(ms))\n",
si->nr_queued_ckpt, si->nr_issued_ckpt,
si->nr_total_ckpt, si->cur_ckpt_time,
si->peak_ckpt_time);
seq_puts(s, "CP merge:\n");
seq_printf(s, " - Queued : %4d\n", si->nr_queued_ckpt);
seq_printf(s, " - Issued : %4d\n", si->nr_issued_ckpt);
seq_printf(s, " - Total : %4d\n", si->nr_total_ckpt);
seq_printf(s, " - Cur time : %4d(ms)\n", si->cur_ckpt_time);
seq_printf(s, " - Peak time : %4d(ms)\n", si->peak_ckpt_time);
seq_printf(s, "GC calls: %d (BG: %d)\n",
si->call_count, si->bg_gc);
seq_printf(s, " - data segments : %d (%d)\n",
si->data_segs, si->bg_data_segs);
seq_printf(s, " - node segments : %d (%d)\n",
si->node_segs, si->bg_node_segs);
seq_printf(s, " - Reclaimed segs : Normal (%d), Idle CB (%d), "
"Idle Greedy (%d), Idle AT (%d), "
"Urgent High (%d), Urgent Mid (%d), "
"Urgent Low (%d)\n",
si->sbi->gc_reclaimed_segs[GC_NORMAL],
si->sbi->gc_reclaimed_segs[GC_IDLE_CB],
si->sbi->gc_reclaimed_segs[GC_IDLE_GREEDY],
si->sbi->gc_reclaimed_segs[GC_IDLE_AT],
si->sbi->gc_reclaimed_segs[GC_URGENT_HIGH],
si->sbi->gc_reclaimed_segs[GC_URGENT_MID],
si->sbi->gc_reclaimed_segs[GC_URGENT_LOW]);
seq_puts(s, " - Reclaimed segs :\n");
seq_printf(s, " - Normal : %d\n", si->sbi->gc_reclaimed_segs[GC_NORMAL]);
seq_printf(s, " - Idle CB : %d\n", si->sbi->gc_reclaimed_segs[GC_IDLE_CB]);
seq_printf(s, " - Idle Greedy : %d\n",
si->sbi->gc_reclaimed_segs[GC_IDLE_GREEDY]);
seq_printf(s, " - Idle AT : %d\n", si->sbi->gc_reclaimed_segs[GC_IDLE_AT]);
seq_printf(s, " - Urgent High : %d\n",
si->sbi->gc_reclaimed_segs[GC_URGENT_HIGH]);
seq_printf(s, " - Urgent Mid : %d\n", si->sbi->gc_reclaimed_segs[GC_URGENT_MID]);
seq_printf(s, " - Urgent Low : %d\n", si->sbi->gc_reclaimed_segs[GC_URGENT_LOW]);
seq_printf(s, "Try to move %d blocks (BG: %d)\n", si->tot_blks,
si->bg_data_blks + si->bg_node_blks);
seq_printf(s, " - data blocks : %d (%d)\n", si->data_blks,
......@@ -490,26 +508,44 @@ static int stat_show(struct seq_file *s, void *v)
si->bg_node_blks);
seq_printf(s, "BG skip : IO: %u, Other: %u\n",
si->io_skip_bggc, si->other_skip_bggc);
seq_puts(s, "\nExtent Cache:\n");
seq_puts(s, "\nExtent Cache (Read):\n");
seq_printf(s, " - Hit Count: L1-1:%llu L1-2:%llu L2:%llu\n",
si->hit_largest, si->hit_cached,
si->hit_rbtree);
si->hit_largest, si->hit_cached[EX_READ],
si->hit_rbtree[EX_READ]);
seq_printf(s, " - Hit Ratio: %llu%% (%llu / %llu)\n",
!si->total_ext[EX_READ] ? 0 :
div64_u64(si->hit_total[EX_READ] * 100,
si->total_ext[EX_READ]),
si->hit_total[EX_READ], si->total_ext[EX_READ]);
seq_printf(s, " - Inner Struct Count: tree: %d(%d), node: %d\n",
si->ext_tree[EX_READ], si->zombie_tree[EX_READ],
si->ext_node[EX_READ]);
seq_puts(s, "\nExtent Cache (Block Age):\n");
seq_printf(s, " - Allocated Data Blocks: %llu\n",
si->allocated_data_blocks);
seq_printf(s, " - Hit Count: L1:%llu L2:%llu\n",
si->hit_cached[EX_BLOCK_AGE],
si->hit_rbtree[EX_BLOCK_AGE]);
seq_printf(s, " - Hit Ratio: %llu%% (%llu / %llu)\n",
!si->total_ext ? 0 :
div64_u64(si->hit_total * 100, si->total_ext),
si->hit_total, si->total_ext);
!si->total_ext[EX_BLOCK_AGE] ? 0 :
div64_u64(si->hit_total[EX_BLOCK_AGE] * 100,
si->total_ext[EX_BLOCK_AGE]),
si->hit_total[EX_BLOCK_AGE],
si->total_ext[EX_BLOCK_AGE]);
seq_printf(s, " - Inner Struct Count: tree: %d(%d), node: %d\n",
si->ext_tree, si->zombie_tree, si->ext_node);
si->ext_tree[EX_BLOCK_AGE],
si->zombie_tree[EX_BLOCK_AGE],
si->ext_node[EX_BLOCK_AGE]);
seq_puts(s, "\nBalancing F2FS Async:\n");
seq_printf(s, " - DIO (R: %4d, W: %4d)\n",
si->nr_dio_read, si->nr_dio_write);
seq_printf(s, " - IO_R (Data: %4d, Node: %4d, Meta: %4d\n",
si->nr_rd_data, si->nr_rd_node, si->nr_rd_meta);
seq_printf(s, " - IO_W (CP: %4d, Data: %4d, Flush: (%4d %4d %4d), "
"Discard: (%4d %4d)) cmd: %4d undiscard:%4u\n",
seq_printf(s, " - IO_W (CP: %4d, Data: %4d, Flush: (%4d %4d %4d), ",
si->nr_wb_cp_data, si->nr_wb_data,
si->nr_flushing, si->nr_flushed,
si->flush_list_empty,
si->flush_list_empty);
seq_printf(s, "Discard: (%4d %4d)) cmd: %4d undiscard:%4u\n",
si->nr_discarding, si->nr_discarded,
si->nr_discard_cmd, si->undiscard_blks);
seq_printf(s, " - atomic IO: %4d (Max. %4d)\n",
......@@ -566,8 +602,12 @@ static int stat_show(struct seq_file *s, void *v)
(si->base_mem + si->cache_mem + si->page_mem) >> 10);
seq_printf(s, " - static: %llu KB\n",
si->base_mem >> 10);
seq_printf(s, " - cached: %llu KB\n",
seq_printf(s, " - cached all: %llu KB\n",
si->cache_mem >> 10);
seq_printf(s, " - read extent cache: %llu KB\n",
si->ext_mem[EX_READ] >> 10);
seq_printf(s, " - block age extent cache: %llu KB\n",
si->ext_mem[EX_BLOCK_AGE] >> 10);
seq_printf(s, " - paged : %llu KB\n",
si->page_mem >> 10);
}
......@@ -600,10 +640,15 @@ int f2fs_build_stats(struct f2fs_sb_info *sbi)
si->sbi = sbi;
sbi->stat_info = si;
atomic64_set(&sbi->total_hit_ext, 0);
atomic64_set(&sbi->read_hit_rbtree, 0);
/* general extent cache stats */
for (i = 0; i < NR_EXTENT_CACHES; i++) {
atomic64_set(&sbi->total_hit_ext[i], 0);
atomic64_set(&sbi->read_hit_rbtree[i], 0);
atomic64_set(&sbi->read_hit_cached[i], 0);
}
/* read extent_cache only */
atomic64_set(&sbi->read_hit_largest, 0);
atomic64_set(&sbi->read_hit_cached, 0);
atomic_set(&sbi->inline_xattr, 0);
atomic_set(&sbi->inline_inode, 0);
......
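
For orientation, a minimal sketch of the refactored layout as inferred from
the debug.c hunks above (field set abridged; locks and LRU lists elided):

/* Two extent caches now coexist, indexed by type. */
enum extent_type { EX_READ, EX_BLOCK_AGE, NR_EXTENT_CACHES };

struct extent_tree_info {
	atomic_t total_ext_tree;	/* # of extent trees of this type */
	atomic_t total_zombie_tree;	/* trees kept after inode eviction */
	atomic_t total_ext_node;	/* # of cached extent nodes */
	/* radix tree of per-inode trees, list lock, zombie list, ... */
};

/* in struct f2fs_sb_info:    struct extent_tree_info extent_tree[NR_EXTENT_CACHES]; */
/* in struct f2fs_inode_info: struct extent_tree *extent_tree[NR_EXTENT_CACHES];     */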
......@@ -340,6 +340,7 @@ static struct f2fs_dir_entry *find_in_level(struct inode *dir,
unsigned int bidx, end_block;
struct page *dentry_page;
struct f2fs_dir_entry *de = NULL;
pgoff_t next_pgofs;
bool room = false;
int max_slots;
......@@ -350,12 +351,13 @@ static struct f2fs_dir_entry *find_in_level(struct inode *dir,
le32_to_cpu(fname->hash) % nbucket);
end_block = bidx + nblock;
for (; bidx < end_block; bidx++) {
while (bidx < end_block) {
/* no need to allocate new dentry pages to all the indices */
dentry_page = f2fs_find_data_page(dir, bidx);
dentry_page = f2fs_find_data_page(dir, bidx, &next_pgofs);
if (IS_ERR(dentry_page)) {
if (PTR_ERR(dentry_page) == -ENOENT) {
room = true;
bidx = next_pgofs;
continue;
} else {
*res_page = dentry_page;
......@@ -376,6 +378,8 @@ static struct f2fs_dir_entry *find_in_level(struct inode *dir,
if (max_slots >= s)
room = true;
f2fs_put_page(dentry_page, 0);
bidx++;
}
if (!de && room && F2FS_I(dir)->chash != fname->hash) {
......@@ -956,7 +960,7 @@ void f2fs_delete_entry(struct f2fs_dir_entry *dentry, struct page *page,
bool f2fs_empty_dir(struct inode *dir)
{
unsigned long bidx;
unsigned long bidx = 0;
struct page *dentry_page;
unsigned int bit_pos;
struct f2fs_dentry_block *dentry_blk;
......@@ -965,13 +969,17 @@ bool f2fs_empty_dir(struct inode *dir)
if (f2fs_has_inline_dentry(dir))
return f2fs_empty_inline_dir(dir);
for (bidx = 0; bidx < nblock; bidx++) {
dentry_page = f2fs_get_lock_data_page(dir, bidx, false);
while (bidx < nblock) {
pgoff_t next_pgofs;
dentry_page = f2fs_find_data_page(dir, bidx, &next_pgofs);
if (IS_ERR(dentry_page)) {
if (PTR_ERR(dentry_page) == -ENOENT)
if (PTR_ERR(dentry_page) == -ENOENT) {
bidx = next_pgofs;
continue;
else
} else {
return false;
}
}
dentry_blk = page_address(dentry_page);
......@@ -983,10 +991,12 @@ bool f2fs_empty_dir(struct inode *dir)
NR_DENTRY_IN_BLOCK,
bit_pos);
f2fs_put_page(dentry_page, 1);
f2fs_put_page(dentry_page, 0);
if (bit_pos < NR_DENTRY_IN_BLOCK)
return false;
bidx++;
}
return true;
}
......@@ -1000,7 +1010,7 @@ int f2fs_fill_dentries(struct dir_context *ctx, struct f2fs_dentry_ptr *d,
struct fscrypt_str de_name = FSTR_INIT(NULL, 0);
struct f2fs_sb_info *sbi = F2FS_I_SB(d->inode);
struct blk_plug plug;
bool readdir_ra = sbi->readdir_ra == 1;
bool readdir_ra = sbi->readdir_ra;
bool found_valid_dirent = false;
int err = 0;
......@@ -1104,7 +1114,8 @@ static int f2fs_readdir(struct file *file, struct dir_context *ctx)
goto out_free;
}
for (; n < npages; n++, ctx->pos = n * NR_DENTRY_IN_BLOCK) {
for (; n < npages; ctx->pos = n * NR_DENTRY_IN_BLOCK) {
pgoff_t next_pgofs;
/* allow readdir() to be interrupted */
if (fatal_signal_pending(current)) {
......@@ -1118,11 +1129,12 @@ static int f2fs_readdir(struct file *file, struct dir_context *ctx)
page_cache_sync_readahead(inode->i_mapping, ra, file, n,
min(npages - n, (pgoff_t)MAX_DIR_RA_PAGES));
dentry_page = f2fs_find_data_page(inode, n);
dentry_page = f2fs_find_data_page(inode, n, &next_pgofs);
if (IS_ERR(dentry_page)) {
err = PTR_ERR(dentry_page);
if (err == -ENOENT) {
err = 0;
n = next_pgofs;
continue;
} else {
goto out_free;
......@@ -1141,6 +1153,8 @@ static int f2fs_readdir(struct file *file, struct dir_context *ctx)
}
f2fs_put_page(dentry_page, 0);
n++;
}
out_free:
fscrypt_fname_free_buffer(&fstr);
......
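
The sparse-directory optimization above uses the same pattern in all three
call sites; a condensed sketch (processing body elided):

pgoff_t bidx = 0, next_pgofs;

while (bidx < nblock) {
	struct page *dentry_page = f2fs_find_data_page(dir, bidx, &next_pgofs);

	if (IS_ERR(dentry_page)) {
		if (PTR_ERR(dentry_page) == -ENOENT) {
			/* Hole: f2fs_find_data_page() reported the next
			 * allocated offset, so jump over the whole hole
			 * instead of probing every missing index. */
			bidx = next_pgofs;
			continue;
		}
		return PTR_ERR(dentry_page);	/* real error */
	}

	/* ... scan the dentry block ... */
	f2fs_put_page(dentry_page, 0);
	bidx++;
}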
......@@ -571,7 +571,7 @@ void f2fs_truncate_data_blocks_range(struct dnode_of_data *dn, int count)
raw_node = F2FS_NODE(dn->node_page);
addr = blkaddr_in_node(raw_node) + base + ofs;
/* Assumption: truncateion starts with cluster */
/* Assumption: truncation starts with cluster */
for (; count > 0; count--, addr++, dn->ofs_in_node++, cluster_index++) {
block_t blkaddr = le32_to_cpu(*addr);
......@@ -618,7 +618,8 @@ void f2fs_truncate_data_blocks_range(struct dnode_of_data *dn, int count)
*/
fofs = f2fs_start_bidx_of_node(ofs_of_node(dn->node_page),
dn->inode) + ofs;
f2fs_update_extent_cache_range(dn, fofs, 0, len);
f2fs_update_read_extent_cache_range(dn, fofs, 0, len);
f2fs_update_age_extent_cache_range(dn, fofs, nr_free);
dec_valid_block_count(sbi, dn->inode, nr_free);
}
dn->ofs_in_node = ofs;
......@@ -1496,7 +1497,7 @@ static int f2fs_do_zero_range(struct dnode_of_data *dn, pgoff_t start,
f2fs_set_data_blkaddr(dn);
}
f2fs_update_extent_cache_range(dn, start, 0, index - start);
f2fs_update_read_extent_cache_range(dn, start, 0, index - start);
return ret;
}
......@@ -1915,6 +1916,10 @@ static int f2fs_setflags_common(struct inode *inode, u32 iflags, u32 mask)
if (!f2fs_disable_compressed_file(inode))
return -EINVAL;
} else {
/* try to convert inline_data to support compression */
int err = f2fs_convert_inline_inode(inode);
if (err)
return err;
if (!f2fs_may_compress(inode))
return -EINVAL;
if (S_ISREG(inode->i_mode) && F2FS_HAS_BLOCKS(inode))
......@@ -2030,13 +2035,14 @@ static int f2fs_ioc_getversion(struct file *filp, unsigned long arg)
return put_user(inode->i_generation, (int __user *)arg);
}
static int f2fs_ioc_start_atomic_write(struct file *filp)
static int f2fs_ioc_start_atomic_write(struct file *filp, bool truncate)
{
struct inode *inode = file_inode(filp);
struct user_namespace *mnt_userns = file_mnt_user_ns(filp);
struct f2fs_inode_info *fi = F2FS_I(inode);
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
struct inode *pinode;
loff_t isize;
int ret;
if (!inode_owner_or_capable(mnt_userns, inode))
......@@ -2095,13 +2101,25 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
f2fs_up_write(&fi->i_gc_rwsem[WRITE]);
goto out;
}
f2fs_i_size_write(fi->cow_inode, i_size_read(inode));
f2fs_write_inode(inode, NULL);
stat_inc_atomic_inode(inode);
set_inode_flag(inode, FI_ATOMIC_FILE);
set_inode_flag(fi->cow_inode, FI_COW_FILE);
clear_inode_flag(fi->cow_inode, FI_INLINE_DATA);
isize = i_size_read(inode);
fi->original_i_size = isize;
if (truncate) {
set_inode_flag(inode, FI_ATOMIC_REPLACE);
truncate_inode_pages_final(inode->i_mapping);
f2fs_i_size_write(inode, 0);
isize = 0;
}
f2fs_i_size_write(fi->cow_inode, isize);
f2fs_up_write(&fi->i_gc_rwsem[WRITE]);
f2fs_update_time(sbi, REQ_TIME);
......@@ -2133,16 +2151,14 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
if (f2fs_is_atomic_file(inode)) {
ret = f2fs_commit_atomic_write(inode);
if (ret)
goto unlock_out;
ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
if (!ret)
f2fs_abort_atomic_write(inode, false);
ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
f2fs_abort_atomic_write(inode, ret);
} else {
ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
}
unlock_out:
inode_unlock(inode);
mnt_drop_write_file(filp);
return ret;
......@@ -2543,7 +2559,7 @@ static int f2fs_defragment_range(struct f2fs_sb_info *sbi,
struct f2fs_map_blocks map = { .m_next_extent = NULL,
.m_seg_type = NO_CHECK_TYPE,
.m_may_create = false };
struct extent_info ei = {0, 0, 0};
struct extent_info ei = {0, };
pgoff_t pg_start, pg_end, next_pgofs;
unsigned int blk_per_seg = sbi->blocks_per_seg;
unsigned int total = 0, sec_num;
......@@ -2575,7 +2591,7 @@ static int f2fs_defragment_range(struct f2fs_sb_info *sbi,
* lookup mapping info in extent cache, skip defragmenting if physical
* block addresses are continuous.
*/
if (f2fs_lookup_extent_cache(inode, pg_start, &ei)) {
if (f2fs_lookup_read_extent_cache(inode, pg_start, &ei)) {
if (ei.fofs + ei.len >= pg_end)
goto out;
}
......@@ -4131,7 +4147,9 @@ static long __f2fs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
case FS_IOC_GETVERSION:
return f2fs_ioc_getversion(filp, arg);
case F2FS_IOC_START_ATOMIC_WRITE:
return f2fs_ioc_start_atomic_write(filp);
return f2fs_ioc_start_atomic_write(filp, false);
case F2FS_IOC_START_ATOMIC_REPLACE:
return f2fs_ioc_start_atomic_write(filp, true);
case F2FS_IOC_COMMIT_ATOMIC_WRITE:
return f2fs_ioc_commit_atomic_write(filp);
case F2FS_IOC_ABORT_ATOMIC_WRITE:
......
......@@ -96,16 +96,6 @@ static int gc_thread_func(void *data)
* invalidated soon after by user update or deletion.
* So, I'd like to wait some time to collect dirty segments.
*/
if (sbi->gc_mode == GC_URGENT_HIGH) {
spin_lock(&sbi->gc_urgent_high_lock);
if (sbi->gc_urgent_high_remaining) {
sbi->gc_urgent_high_remaining--;
if (!sbi->gc_urgent_high_remaining)
sbi->gc_mode = GC_NORMAL;
}
spin_unlock(&sbi->gc_urgent_high_lock);
}
if (sbi->gc_mode == GC_URGENT_HIGH ||
sbi->gc_mode == GC_URGENT_MID) {
wait_ms = gc_th->urgent_sleep_time;
......@@ -151,6 +141,10 @@ static int gc_thread_func(void *data)
/* don't bother wait_ms by foreground gc */
if (!foreground)
wait_ms = gc_th->no_gc_sleep_time;
} else {
/* reset wait_ms to default sleep time */
if (wait_ms == gc_th->no_gc_sleep_time)
wait_ms = gc_th->min_sleep_time;
}
if (foreground)
......@@ -162,6 +156,15 @@ static int gc_thread_func(void *data)
/* balancing f2fs's metadata periodically */
f2fs_balance_fs_bg(sbi, true);
next:
if (sbi->gc_mode != GC_NORMAL) {
spin_lock(&sbi->gc_remaining_trials_lock);
if (sbi->gc_remaining_trials) {
sbi->gc_remaining_trials--;
if (!sbi->gc_remaining_trials)
sbi->gc_mode = GC_NORMAL;
}
spin_unlock(&sbi->gc_remaining_trials_lock);
}
sb_end_write(sbi->sb);
} while (!kthread_should_stop());
......@@ -172,13 +175,10 @@ int f2fs_start_gc_thread(struct f2fs_sb_info *sbi)
{
struct f2fs_gc_kthread *gc_th;
dev_t dev = sbi->sb->s_bdev->bd_dev;
int err = 0;
gc_th = f2fs_kmalloc(sbi, sizeof(struct f2fs_gc_kthread), GFP_KERNEL);
if (!gc_th) {
err = -ENOMEM;
goto out;
}
if (!gc_th)
return -ENOMEM;
gc_th->urgent_sleep_time = DEF_GC_THREAD_URGENT_SLEEP_TIME;
gc_th->min_sleep_time = DEF_GC_THREAD_MIN_SLEEP_TIME;
......@@ -193,12 +193,14 @@ int f2fs_start_gc_thread(struct f2fs_sb_info *sbi)
sbi->gc_thread->f2fs_gc_task = kthread_run(gc_thread_func, sbi,
"f2fs_gc-%u:%u", MAJOR(dev), MINOR(dev));
if (IS_ERR(gc_th->f2fs_gc_task)) {
err = PTR_ERR(gc_th->f2fs_gc_task);
int err = PTR_ERR(gc_th->f2fs_gc_task);
kfree(gc_th);
sbi->gc_thread = NULL;
return err;
}
out:
return err;
return 0;
}
void f2fs_stop_gc_thread(struct f2fs_sb_info *sbi)
......@@ -1079,7 +1081,7 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
{
struct page *node_page;
nid_t nid;
unsigned int ofs_in_node, max_addrs;
unsigned int ofs_in_node, max_addrs, base;
block_t source_blkaddr;
nid = le32_to_cpu(sum->nid);
......@@ -1105,11 +1107,18 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
return false;
}
max_addrs = IS_INODE(node_page) ? DEF_ADDRS_PER_INODE :
DEF_ADDRS_PER_BLOCK;
if (ofs_in_node >= max_addrs) {
f2fs_err(sbi, "Inconsistent ofs_in_node:%u in summary, ino:%u, nid:%u, max:%u",
ofs_in_node, dni->ino, dni->nid, max_addrs);
if (IS_INODE(node_page)) {
base = offset_in_addr(F2FS_INODE(node_page));
max_addrs = DEF_ADDRS_PER_INODE;
} else {
base = 0;
max_addrs = DEF_ADDRS_PER_BLOCK;
}
if (base + ofs_in_node >= max_addrs) {
f2fs_err(sbi, "Inconsistent blkaddr offset: base:%u, ofs_in_node:%u, max:%u, ino:%u, nid:%u",
base, ofs_in_node, max_addrs, dni->ino, dni->nid);
f2fs_put_page(node_page, 1);
return false;
}
......@@ -1141,7 +1150,7 @@ static int ra_data_block(struct inode *inode, pgoff_t index)
struct address_space *mapping = inode->i_mapping;
struct dnode_of_data dn;
struct page *page;
struct extent_info ei = {0, 0, 0};
struct extent_info ei = {0, };
struct f2fs_io_info fio = {
.sbi = sbi,
.ino = inode->i_ino,
......@@ -1159,7 +1168,7 @@ static int ra_data_block(struct inode *inode, pgoff_t index)
if (!page)
return -ENOMEM;
if (f2fs_lookup_extent_cache(inode, index, &ei)) {
if (f2fs_lookup_read_extent_cache(inode, index, &ei)) {
dn.data_blkaddr = ei.blk + index - ei.fofs;
if (unlikely(!f2fs_is_valid_blkaddr(sbi, dn.data_blkaddr,
DATA_GENERIC_ENHANCE_READ))) {
......@@ -1563,8 +1572,8 @@ static int gc_data_segment(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
continue;
}
data_page = f2fs_get_read_data_page(inode,
start_bidx, REQ_RAHEAD, true);
data_page = f2fs_get_read_data_page(inode, start_bidx,
REQ_RAHEAD, true, NULL);
f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
if (IS_ERR(data_page)) {
iput(inode);
......@@ -1744,8 +1753,9 @@ static int do_garbage_collect(struct f2fs_sb_info *sbi,
get_valid_blocks(sbi, segno, false) == 0)
seg_freed++;
if (__is_large_section(sbi) && segno + 1 < end_segno)
sbi->next_victim_seg[gc_type] = segno + 1;
if (__is_large_section(sbi))
sbi->next_victim_seg[gc_type] =
(segno + 1 < end_segno) ? segno + 1 : NULL_SEGNO;
skip:
f2fs_put_page(sum_page, 0);
}
......@@ -1898,9 +1908,7 @@ int __init f2fs_create_garbage_collection_cache(void)
{
victim_entry_slab = f2fs_kmem_cache_create("f2fs_victim_entry",
sizeof(struct victim_entry));
if (!victim_entry_slab)
return -ENOMEM;
return 0;
return victim_entry_slab ? 0 : -ENOMEM;
}
void f2fs_destroy_garbage_collection_cache(void)
......@@ -2133,8 +2141,6 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
if (err)
return err;
set_sbi_flag(sbi, SBI_IS_RESIZEFS);
freeze_super(sbi->sb);
f2fs_down_write(&sbi->gc_lock);
f2fs_down_write(&sbi->cp_global_sem);
......@@ -2150,6 +2156,7 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
if (err)
goto out_err;
set_sbi_flag(sbi, SBI_IS_RESIZEFS);
err = free_segment_range(sbi, secs, false);
if (err)
goto recover_out;
......@@ -2173,6 +2180,7 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
f2fs_commit_super(sbi, false);
}
recover_out:
clear_sbi_flag(sbi, SBI_IS_RESIZEFS);
if (err) {
set_sbi_flag(sbi, SBI_NEED_FSCK);
f2fs_err(sbi, "resize_fs failed, should run fsck to repair!");
......@@ -2185,6 +2193,5 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
f2fs_up_write(&sbi->cp_global_sem);
f2fs_up_write(&sbi->gc_lock);
thaw_super(sbi->sb);
clear_sbi_flag(sbi, SBI_IS_RESIZEFS);
return err;
}
......@@ -262,8 +262,8 @@ static bool sanity_check_inode(struct inode *inode, struct page *node_page)
return false;
}
if (fi->extent_tree) {
struct extent_info *ei = &fi->extent_tree->largest;
if (fi->extent_tree[EX_READ]) {
struct extent_info *ei = &fi->extent_tree[EX_READ]->largest;
if (ei->len &&
(!f2fs_is_valid_blkaddr(sbi, ei->blk,
......@@ -392,8 +392,6 @@ static int do_read_inode(struct inode *inode)
fi->i_pino = le32_to_cpu(ri->i_pino);
fi->i_dir_level = ri->i_dir_level;
f2fs_init_extent_tree(inode, node_page);
get_inline_info(inode, ri);
fi->i_extra_isize = f2fs_has_extra_attr(inode) ?
......@@ -479,6 +477,11 @@ static int do_read_inode(struct inode *inode)
}
init_idisk_time(inode);
/* Need all the flag bits */
f2fs_init_read_extent_tree(inode, node_page);
f2fs_init_age_extent_tree(inode);
f2fs_put_page(node_page, 1);
stat_inc_inline_xattr(inode);
......@@ -607,7 +610,7 @@ struct inode *f2fs_iget_retry(struct super_block *sb, unsigned long ino)
void f2fs_update_inode(struct inode *inode, struct page *node_page)
{
struct f2fs_inode *ri;
struct extent_tree *et = F2FS_I(inode)->extent_tree;
struct extent_tree *et = F2FS_I(inode)->extent_tree[EX_READ];
f2fs_wait_on_page_writeback(node_page, NODE, true, true);
set_page_dirty(node_page);
......@@ -621,12 +624,15 @@ void f2fs_update_inode(struct inode *inode, struct page *node_page)
ri->i_uid = cpu_to_le32(i_uid_read(inode));
ri->i_gid = cpu_to_le32(i_gid_read(inode));
ri->i_links = cpu_to_le32(inode->i_nlink);
ri->i_size = cpu_to_le64(i_size_read(inode));
ri->i_blocks = cpu_to_le64(SECTOR_TO_BLOCK(inode->i_blocks) + 1);
if (!f2fs_is_atomic_file(inode) ||
is_inode_flag_set(inode, FI_ATOMIC_COMMITTED))
ri->i_size = cpu_to_le64(i_size_read(inode));
if (et) {
read_lock(&et->lock);
set_raw_extent(&et->largest, &ri->i_ext);
set_raw_read_extent(&et->largest, &ri->i_ext);
read_unlock(&et->lock);
} else {
memset(&ri->i_ext, 0, sizeof(ri->i_ext));
......
......@@ -60,7 +60,7 @@ bool f2fs_available_free_memory(struct f2fs_sb_info *sbi, int type)
avail_ram = val.totalram - val.totalhigh;
/*
* give 25%, 25%, 50%, 50%, 50% memory for each components respectively
* give 25%, 25%, 50%, 50%, 25%, 25% memory for each components respectively
*/
if (type == FREE_NIDS) {
mem_size = (nm_i->nid_cnt[FREE_NID] *
......@@ -85,12 +85,16 @@ bool f2fs_available_free_memory(struct f2fs_sb_info *sbi, int type)
sizeof(struct ino_entry);
mem_size >>= PAGE_SHIFT;
res = mem_size < ((avail_ram * nm_i->ram_thresh / 100) >> 1);
} else if (type == EXTENT_CACHE) {
mem_size = (atomic_read(&sbi->total_ext_tree) *
} else if (type == READ_EXTENT_CACHE || type == AGE_EXTENT_CACHE) {
enum extent_type etype = type == READ_EXTENT_CACHE ?
EX_READ : EX_BLOCK_AGE;
struct extent_tree_info *eti = &sbi->extent_tree[etype];
mem_size = (atomic_read(&eti->total_ext_tree) *
sizeof(struct extent_tree) +
atomic_read(&sbi->total_ext_node) *
atomic_read(&eti->total_ext_node) *
sizeof(struct extent_node)) >> PAGE_SHIFT;
res = mem_size < ((avail_ram * nm_i->ram_thresh / 100) >> 1);
res = mem_size < ((avail_ram * nm_i->ram_thresh / 100) >> 2);
} else if (type == DISCARD_CACHE) {
mem_size = (atomic_read(&dcc->discard_cmd_cnt) *
sizeof(struct discard_cmd)) >> PAGE_SHIFT;
......@@ -859,7 +863,7 @@ int f2fs_get_dnode_of_data(struct dnode_of_data *dn, pgoff_t index, int mode)
blkaddr = data_blkaddr(dn->inode, dn->node_page,
dn->ofs_in_node + 1);
f2fs_update_extent_tree_range_compressed(dn->inode,
f2fs_update_read_extent_tree_range_compressed(dn->inode,
index, blkaddr,
F2FS_I(dn->inode)->i_cluster_size,
c_len);
......@@ -1360,8 +1364,7 @@ static int read_node_page(struct page *page, blk_opf_t op_flags)
return err;
/* NEW_ADDR can be seen, after cp_error drops some dirty node pages */
if (unlikely(ni.blk_addr == NULL_ADDR || ni.blk_addr == NEW_ADDR) ||
is_sbi_flag_set(sbi, SBI_IS_SHUTDOWN)) {
if (unlikely(ni.blk_addr == NULL_ADDR || ni.blk_addr == NEW_ADDR)) {
ClearPageUptodate(page);
return -ENOENT;
}
......
......@@ -146,7 +146,8 @@ enum mem_type {
NAT_ENTRIES, /* indicates the cached nat entry */
DIRTY_DENTS, /* indicates dirty dentry pages */
INO_ENTRIES, /* indicates inode entries */
EXTENT_CACHE, /* indicates extent cache */
READ_EXTENT_CACHE, /* indicates read extent cache */
AGE_EXTENT_CACHE, /* indicates age extent cache */
DISCARD_CACHE, /* indicates memory of cached discard cmds */
COMPRESS_PAGE, /* indicates memory of cached compressed pages */
BASE_CHECK, /* check kernel status */
......
......@@ -923,9 +923,7 @@ int __init f2fs_create_recovery_cache(void)
{
fsync_entry_slab = f2fs_kmem_cache_create("f2fs_fsync_inode_entry",
sizeof(struct fsync_inode_entry));
if (!fsync_entry_slab)
return -ENOMEM;
return 0;
return fsync_entry_slab ? 0 : -ENOMEM;
}
void f2fs_destroy_recovery_cache(void)
......
......@@ -222,10 +222,6 @@ struct sec_entry {
unsigned int valid_blocks; /* # of valid blocks in a section */
};
struct segment_allocation {
void (*allocate_segment)(struct f2fs_sb_info *, int, bool);
};
#define MAX_SKIP_GC_COUNT 16
struct revoke_entry {
......@@ -235,8 +231,6 @@ struct revoke_entry {
};
struct sit_info {
const struct segment_allocation *s_ops;
block_t sit_base_addr; /* start block address of SIT area */
block_t sit_blocks; /* # of blocks used by SIT area */
block_t written_valid_blocks; /* # of valid blocks in main area */
......
......@@ -28,10 +28,13 @@ static unsigned long __count_free_nids(struct f2fs_sb_info *sbi)
return count > 0 ? count : 0;
}
static unsigned long __count_extent_cache(struct f2fs_sb_info *sbi)
static unsigned long __count_extent_cache(struct f2fs_sb_info *sbi,
enum extent_type type)
{
return atomic_read(&sbi->total_zombie_tree) +
atomic_read(&sbi->total_ext_node);
struct extent_tree_info *eti = &sbi->extent_tree[type];
return atomic_read(&eti->total_zombie_tree) +
atomic_read(&eti->total_ext_node);
}
unsigned long f2fs_shrink_count(struct shrinker *shrink,
......@@ -53,8 +56,11 @@ unsigned long f2fs_shrink_count(struct shrinker *shrink,
}
spin_unlock(&f2fs_list_lock);
/* count extent cache entries */
count += __count_extent_cache(sbi);
/* count read extent cache entries */
count += __count_extent_cache(sbi, EX_READ);
/* count block age extent cache entries */
count += __count_extent_cache(sbi, EX_BLOCK_AGE);
/* count clean nat cache entries */
count += __count_nat_entries(sbi);
......@@ -100,7 +106,10 @@ unsigned long f2fs_shrink_scan(struct shrinker *shrink,
sbi->shrinker_run_no = run_no;
/* shrink extent cache entries */
freed += f2fs_shrink_extent_tree(sbi, nr >> 1);
freed += f2fs_shrink_age_extent_tree(sbi, nr >> 2);
/* shrink read extent cache entries */
freed += f2fs_shrink_read_extent_tree(sbi, nr >> 2);
/* shrink clean nat cache entries */
if (freed < nr)
......@@ -130,7 +139,9 @@ void f2fs_join_shrinker(struct f2fs_sb_info *sbi)
void f2fs_leave_shrinker(struct f2fs_sb_info *sbi)
{
f2fs_shrink_extent_tree(sbi, __count_extent_cache(sbi));
f2fs_shrink_read_extent_tree(sbi, __count_extent_cache(sbi, EX_READ));
f2fs_shrink_age_extent_tree(sbi,
__count_extent_cache(sbi, EX_BLOCK_AGE));
spin_lock(&f2fs_list_lock);
list_del_init(&sbi->s_list);
......
......@@ -48,6 +48,8 @@ TRACE_DEFINE_ENUM(CP_DISCARD);
TRACE_DEFINE_ENUM(CP_TRIMMED);
TRACE_DEFINE_ENUM(CP_PAUSE);
TRACE_DEFINE_ENUM(CP_RESIZE);
TRACE_DEFINE_ENUM(EX_READ);
TRACE_DEFINE_ENUM(EX_BLOCK_AGE);
#define show_block_type(type) \
__print_symbolic(type, \
......@@ -154,6 +156,11 @@ TRACE_DEFINE_ENUM(CP_RESIZE);
{ COMPRESS_ZSTD, "ZSTD" }, \
{ COMPRESS_LZORLE, "LZO-RLE" })
#define show_extent_type(type) \
__print_symbolic(type, \
{ EX_READ, "Read" }, \
{ EX_BLOCK_AGE, "Block Age" })
struct f2fs_sb_info;
struct f2fs_io_info;
struct extent_info;
......@@ -322,7 +329,7 @@ TRACE_EVENT(f2fs_unlink_enter,
__field(ino_t, ino)
__field(loff_t, size)
__field(blkcnt_t, blocks)
__field(const char *, name)
__string(name, dentry->d_name.name)
),
TP_fast_assign(
......@@ -330,7 +337,7 @@ TRACE_EVENT(f2fs_unlink_enter,
__entry->ino = dir->i_ino;
__entry->size = dir->i_size;
__entry->blocks = dir->i_blocks;
__entry->name = dentry->d_name.name;
__assign_str(name, dentry->d_name.name);
),
TP_printk("dev = (%d,%d), dir ino = %lu, i_size = %lld, "
......@@ -338,7 +345,7 @@ TRACE_EVENT(f2fs_unlink_enter,
show_dev_ino(__entry),
__entry->size,
(unsigned long long)__entry->blocks,
__entry->name)
__get_str(name))
);
DEFINE_EVENT(f2fs__inode_exit, f2fs_unlink_exit,
......@@ -940,25 +947,29 @@ TRACE_EVENT(f2fs_direct_IO_enter,
TP_STRUCT__entry(
__field(dev_t, dev)
__field(ino_t, ino)
__field(struct kiocb *, iocb)
__field(loff_t, ki_pos)
__field(int, ki_flags)
__field(u16, ki_ioprio)
__field(unsigned long, len)
__field(int, rw)
),
TP_fast_assign(
__entry->dev = inode->i_sb->s_dev;
__entry->ino = inode->i_ino;
__entry->iocb = iocb;
__entry->len = len;
__entry->rw = rw;
__entry->dev = inode->i_sb->s_dev;
__entry->ino = inode->i_ino;
__entry->ki_pos = iocb->ki_pos;
__entry->ki_flags = iocb->ki_flags;
__entry->ki_ioprio = iocb->ki_ioprio;
__entry->len = len;
__entry->rw = rw;
),
TP_printk("dev = (%d,%d), ino = %lu pos = %lld len = %lu ki_flags = %x ki_ioprio = %x rw = %d",
show_dev_ino(__entry),
__entry->iocb->ki_pos,
__entry->ki_pos,
__entry->len,
__entry->iocb->ki_flags,
__entry->iocb->ki_ioprio,
__entry->ki_flags,
__entry->ki_ioprio,
__entry->rw)
);
......@@ -1400,26 +1411,26 @@ TRACE_EVENT(f2fs_readpages,
TRACE_EVENT(f2fs_write_checkpoint,
TP_PROTO(struct super_block *sb, int reason, char *msg),
TP_PROTO(struct super_block *sb, int reason, const char *msg),
TP_ARGS(sb, reason, msg),
TP_STRUCT__entry(
__field(dev_t, dev)
__field(int, reason)
__field(char *, msg)
__string(dest_msg, msg)
),
TP_fast_assign(
__entry->dev = sb->s_dev;
__entry->reason = reason;
__entry->msg = msg;
__assign_str(dest_msg, msg);
),
TP_printk("dev = (%d,%d), checkpoint for %s, state = %s",
show_dev(__entry->dev),
show_cpreason(__entry->reason),
__entry->msg)
__get_str(dest_msg))
);
DECLARE_EVENT_CLASS(f2fs_discard,
......@@ -1518,28 +1529,31 @@ TRACE_EVENT(f2fs_issue_flush,
TRACE_EVENT(f2fs_lookup_extent_tree_start,
TP_PROTO(struct inode *inode, unsigned int pgofs),
TP_PROTO(struct inode *inode, unsigned int pgofs, enum extent_type type),
TP_ARGS(inode, pgofs),
TP_ARGS(inode, pgofs, type),
TP_STRUCT__entry(
__field(dev_t, dev)
__field(ino_t, ino)
__field(unsigned int, pgofs)
__field(enum extent_type, type)
),
TP_fast_assign(
__entry->dev = inode->i_sb->s_dev;
__entry->ino = inode->i_ino;
__entry->pgofs = pgofs;
__entry->type = type;
),
TP_printk("dev = (%d,%d), ino = %lu, pgofs = %u",
TP_printk("dev = (%d,%d), ino = %lu, pgofs = %u, type = %s",
show_dev_ino(__entry),
__entry->pgofs)
__entry->pgofs,
show_extent_type(__entry->type))
);
TRACE_EVENT_CONDITION(f2fs_lookup_extent_tree_end,
TRACE_EVENT_CONDITION(f2fs_lookup_read_extent_tree_end,
TP_PROTO(struct inode *inode, unsigned int pgofs,
struct extent_info *ei),
......@@ -1553,8 +1567,8 @@ TRACE_EVENT_CONDITION(f2fs_lookup_extent_tree_end,
__field(ino_t, ino)
__field(unsigned int, pgofs)
__field(unsigned int, fofs)
__field(u32, blk)
__field(unsigned int, len)
__field(u32, blk)
),
TP_fast_assign(
......@@ -1562,26 +1576,65 @@ TRACE_EVENT_CONDITION(f2fs_lookup_extent_tree_end,
__entry->ino = inode->i_ino;
__entry->pgofs = pgofs;
__entry->fofs = ei->fofs;
__entry->len = ei->len;
__entry->blk = ei->blk;
),
TP_printk("dev = (%d,%d), ino = %lu, pgofs = %u, "
"read_ext_info(fofs: %u, len: %u, blk: %u)",
show_dev_ino(__entry),
__entry->pgofs,
__entry->fofs,
__entry->len,
__entry->blk)
);
TRACE_EVENT_CONDITION(f2fs_lookup_age_extent_tree_end,
TP_PROTO(struct inode *inode, unsigned int pgofs,
struct extent_info *ei),
TP_ARGS(inode, pgofs, ei),
TP_CONDITION(ei),
TP_STRUCT__entry(
__field(dev_t, dev)
__field(ino_t, ino)
__field(unsigned int, pgofs)
__field(unsigned int, fofs)
__field(unsigned int, len)
__field(unsigned long long, age)
__field(unsigned long long, blocks)
),
TP_fast_assign(
__entry->dev = inode->i_sb->s_dev;
__entry->ino = inode->i_ino;
__entry->pgofs = pgofs;
__entry->fofs = ei->fofs;
__entry->len = ei->len;
__entry->age = ei->age;
__entry->blocks = ei->last_blocks;
),
TP_printk("dev = (%d,%d), ino = %lu, pgofs = %u, "
"ext_info(fofs: %u, blk: %u, len: %u)",
"age_ext_info(fofs: %u, len: %u, age: %llu, blocks: %llu)",
show_dev_ino(__entry),
__entry->pgofs,
__entry->fofs,
__entry->blk,
__entry->len)
__entry->len,
__entry->age,
__entry->blocks)
);
TRACE_EVENT(f2fs_update_extent_tree_range,
TRACE_EVENT(f2fs_update_read_extent_tree_range,
TP_PROTO(struct inode *inode, unsigned int pgofs, block_t blkaddr,
unsigned int len,
TP_PROTO(struct inode *inode, unsigned int pgofs, unsigned int len,
block_t blkaddr,
unsigned int c_len),
TP_ARGS(inode, pgofs, blkaddr, len, c_len),
TP_ARGS(inode, pgofs, len, blkaddr, c_len),
TP_STRUCT__entry(
__field(dev_t, dev)
......@@ -1596,67 +1649,108 @@ TRACE_EVENT(f2fs_update_extent_tree_range,
__entry->dev = inode->i_sb->s_dev;
__entry->ino = inode->i_ino;
__entry->pgofs = pgofs;
__entry->blk = blkaddr;
__entry->len = len;
__entry->blk = blkaddr;
__entry->c_len = c_len;
),
TP_printk("dev = (%d,%d), ino = %lu, pgofs = %u, "
"blkaddr = %u, len = %u, "
"c_len = %u",
"len = %u, blkaddr = %u, c_len = %u",
show_dev_ino(__entry),
__entry->pgofs,
__entry->blk,
__entry->len,
__entry->blk,
__entry->c_len)
);
TRACE_EVENT(f2fs_update_age_extent_tree_range,
TP_PROTO(struct inode *inode, unsigned int pgofs, unsigned int len,
unsigned long long age,
unsigned long long last_blks),
TP_ARGS(inode, pgofs, len, age, last_blks),
TP_STRUCT__entry(
__field(dev_t, dev)
__field(ino_t, ino)
__field(unsigned int, pgofs)
__field(unsigned int, len)
__field(unsigned long long, age)
__field(unsigned long long, blocks)
),
TP_fast_assign(
__entry->dev = inode->i_sb->s_dev;
__entry->ino = inode->i_ino;
__entry->pgofs = pgofs;
__entry->len = len;
__entry->age = age;
__entry->blocks = last_blks;
),
TP_printk("dev = (%d,%d), ino = %lu, pgofs = %u, "
"len = %u, age = %llu, blocks = %llu",
show_dev_ino(__entry),
__entry->pgofs,
__entry->len,
__entry->age,
__entry->blocks)
);
TRACE_EVENT(f2fs_shrink_extent_tree,
TP_PROTO(struct f2fs_sb_info *sbi, unsigned int node_cnt,
unsigned int tree_cnt),
unsigned int tree_cnt, enum extent_type type),
TP_ARGS(sbi, node_cnt, tree_cnt),
TP_ARGS(sbi, node_cnt, tree_cnt, type),
TP_STRUCT__entry(
__field(dev_t, dev)
__field(unsigned int, node_cnt)
__field(unsigned int, tree_cnt)
__field(enum extent_type, type)
),
TP_fast_assign(
__entry->dev = sbi->sb->s_dev;
__entry->node_cnt = node_cnt;
__entry->tree_cnt = tree_cnt;
__entry->type = type;
),
TP_printk("dev = (%d,%d), shrunk: node_cnt = %u, tree_cnt = %u",
TP_printk("dev = (%d,%d), shrunk: node_cnt = %u, tree_cnt = %u, type = %s",
show_dev(__entry->dev),
__entry->node_cnt,
__entry->tree_cnt)
__entry->tree_cnt,
show_extent_type(__entry->type))
);
TRACE_EVENT(f2fs_destroy_extent_tree,
TP_PROTO(struct inode *inode, unsigned int node_cnt),
TP_PROTO(struct inode *inode, unsigned int node_cnt,
enum extent_type type),
TP_ARGS(inode, node_cnt),
TP_ARGS(inode, node_cnt, type),
TP_STRUCT__entry(
__field(dev_t, dev)
__field(ino_t, ino)
__field(unsigned int, node_cnt)
__field(enum extent_type, type)
),
TP_fast_assign(
__entry->dev = inode->i_sb->s_dev;
__entry->ino = inode->i_ino;
__entry->node_cnt = node_cnt;
__entry->type = type;
),
TP_printk("dev = (%d,%d), ino = %lu, destroyed: node_cnt = %u",
TP_printk("dev = (%d,%d), ino = %lu, destroyed: node_cnt = %u, type = %s",
show_dev_ino(__entry),
__entry->node_cnt)
__entry->node_cnt,
show_extent_type(__entry->type))
);
DECLARE_EVENT_CLASS(f2fs_sync_dirty_inodes,
......
......@@ -42,6 +42,7 @@
struct f2fs_comp_option)
#define F2FS_IOC_DECOMPRESS_FILE _IO(F2FS_IOCTL_MAGIC, 23)
#define F2FS_IOC_COMPRESS_FILE _IO(F2FS_IOCTL_MAGIC, 24)
#define F2FS_IOC_START_ATOMIC_REPLACE _IO(F2FS_IOCTL_MAGIC, 25)
/*
* should be same as XFS_IOC_GOINGDOWN.
......