1. 12 Feb, 2021 1 commit
  2. 08 Feb, 2021 1 commit
    • f2fs: don't grab superblock freeze for flush/ckpt thread · d50dfc0c
      Jaegeuk Kim authored
      These threads are controlled by f2fs_freeze().
      
      This fixes xfstests/generic/068, which gets stuck at:
      
       task:f2fs_ckpt-252:3 state:D stack:    0 pid: 5761 ppid:     2 flags:0x00004000
       Call Trace:
        __schedule+0x44c/0x8a0
        schedule+0x4f/0xc0
        percpu_rwsem_wait+0xd8/0x140
        ? percpu_down_write+0xf0/0xf0
        __percpu_down_read+0x56/0x70
        issue_checkpoint_thread+0x12c/0x160 [f2fs]
        ? wait_woken+0x80/0x80
        kthread+0x114/0x150
        ? __checkpoint_and_complete_reqs+0x110/0x110 [f2fs]
        ? kthread_park+0x90/0x90
        ret_from_fork+0x22/0x30
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  3. 03 Feb, 2021 2 commits
    • f2fs: add ckpt_thread_ioprio sysfs node · e6592066
      Daeho Jeong authored
      Add a "ckpt_thread_ioprio" sysfs node to change the checkpoint merge
      daemon's I/O priority. Its default value is "be,3", which means the
      "BE" I/O class and I/O priority "3". We can select either the "rt" or
      the "be" class, and set the I/O priority within that class's valid range.
      A "," delimiter is required between the I/O class and the priority number.
      Signed-off-by: Daeho Jeong <daehojeong@google.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
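To make the "class,priority" format described above concrete, here is a minimal userspace sketch of how such a value could be split and validated. It is not the kernel's parser: the function name `parse_ckpt_ioprio` is hypothetical, and the 0-7 level range is an assumption based on the usual Linux I/O priority levels rather than something stated in the commit message.

```python
# Hypothetical helper for illustration only; not the f2fs implementation.
VALID_CLASSES = ("rt", "be")   # the two classes named in the commit message
MAX_LEVEL = 7                  # assumption: usual Linux I/O priority levels are 0-7


def parse_ckpt_ioprio(value):
    """Split a 'class,level' string into (class, level), validating both parts."""
    cls, sep, level_text = value.strip().partition(",")
    if sep != ",":
        raise ValueError("missing ',' delimiter between I/O class and priority")
    if cls not in VALID_CLASSES:
        raise ValueError(f"unsupported I/O class: {cls!r}")
    try:
        level = int(level_text)
    except ValueError:
        raise ValueError(f"priority is not a number: {level_text!r}") from None
    if not 0 <= level <= MAX_LEVEL:
        raise ValueError(f"priority out of range 0-{MAX_LEVEL}: {level}")
    return cls, level


if __name__ == "__main__":
    print(parse_ckpt_ioprio("be,3"))
```

A value without the comma, such as "be3", is rejected with `ValueError`, which mirrors the requirement that the delimiter be present.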
    • f2fs: introduce checkpoint_merge mount option · 261eeb9c
      Daeho Jeong authored
      We've added new mount options, "checkpoint_merge" and "nocheckpoint_merge".
      "checkpoint_merge" creates a kernel daemon that merges concurrent checkpoint
      requests as much as possible to eliminate redundant checkpoints. In addition,
      it avoids the sluggishness caused by a slow checkpoint operation
      when the checkpoint is done in the process context of a cgroup with a
      low I/O budget and few CPU shares. To improve this further, we set the
      default I/O priority of the kernel daemon to "3", one level higher
      than other kernel threads. The verification result below
      illustrates this.
      The basic idea comes from https://opensource.samsung.com.
      
      [Verification]
      Android Pixel Device (ARM64, 7GB RAM, 256GB UFS)
      Create two I/O cgroups (fg w/ weight 100, bg w/ weight 20)
      Set "strict_guarantees" to "1" in BFQ tunables
      
      In "fg" cgroup,
      - thread A => trigger 1000 checkpoint operations
        "for i in `seq 1 1000`; do touch test_dir1/file; fsync test_dir1;
         done"
      - thread B => generating async I/O
        "fio --rw=write --numjobs=1 --bs=128k --runtime=3600 --time_based=1
             --filename=test_img --name=test"
      
      In "bg" cgroup,
      - thread C => trigger repeated checkpoint operations
        "echo $$ > /dev/blkio/bg/tasks; while true; do touch test_dir2/file;
         fsync test_dir2; done"
      
      We've measured thread A's execution time.
      
      [ w/o patch ]
      Elapsed Time: Avg. 68 seconds
      [ w/  patch ]
      Elapsed Time: Avg. 48 seconds
      Reported-by: kernel test robot <lkp@intel.com>
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      [Jaegeuk Kim: fix the return value in f2fs_start_ckpt_thread, reported by Dan]
      Signed-off-by: Daeho Jeong <daehojeong@google.com>
      Signed-off-by: Sungjong Seo <sj1557.seo@samsung.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
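The merging idea in the commit above (one daemon serving many concurrent checkpoint requests with as few checkpoints as possible) can be sketched in userspace. This is only an illustration of request coalescing under stated assumptions: the class `CheckpointMerger` and the `do_checkpoint` callback are hypothetical names, and the sketch says nothing about how the f2fs daemon is actually implemented.

```python
import threading


class CheckpointMerger:
    """Hypothetical sketch: a worker thread that coalesces concurrent requests."""

    def __init__(self, do_checkpoint):
        self._do_checkpoint = do_checkpoint  # callback standing in for a checkpoint
        self._cond = threading.Condition()
        self._pending = []                   # completion events of waiting callers
        self._stopping = False
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def request(self):
        """Queue a checkpoint request and block until a checkpoint covers it."""
        done = threading.Event()
        with self._cond:
            self._pending.append(done)
            self._cond.notify()
        done.wait()

    def stop(self):
        """Ask the worker to exit once all queued requests are served."""
        with self._cond:
            self._stopping = True
            self._cond.notify()
        self._thread.join()

    def _run(self):
        while True:
            with self._cond:
                while not self._pending and not self._stopping:
                    self._cond.wait()
                if not self._pending:        # stopping and nothing left to serve
                    return
                # Take every request queued so far; one checkpoint serves them all.
                batch, self._pending = self._pending, []
            self._do_checkpoint()
            for done in batch:
                done.set()
```

Requests that arrive while a checkpoint is running are left for the next batch, so each caller is only released by a checkpoint that started after its request was queued.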
  4. 02 Feb, 2021 2 commits
  5. 01 Feb, 2021 3 commits
    • f2fs: remove unnecessary initialization in xattr.c · 2e0cd472
      Liu Song authored
      These variables are explicitly assigned before use,
      so there is no need to initialize them.
      Signed-off-by: Liu Song <liu.song11@zte.com.cn>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix to avoid inconsistent quota data · 25fb04db
      Yi Chen authored
      Occasionally, quota data may be corrupted, as detected by fsck:
      
      Info: checkpoint state = 45 :  crc compacted_summary unmount
      [QUOTA WARNING] Usage inconsistent for ID 0:actual (1543036928, 762) != expected (1543032832, 762)
      [ASSERT] (fsck_chk_quota_files:1986)  --> Quota file is missing or invalid quota file content found.
      [QUOTA WARNING] Usage inconsistent for ID 0:actual (1352478720, 344) != expected (1352474624, 344)
      [ASSERT] (fsck_chk_quota_files:1986)  --> Quota file is missing or invalid quota file content found.
      
      [FSCK] Unreachable nat entries                        [Ok..] [0x0]
      [FSCK] SIT valid block bitmap checking                [Ok..]
      [FSCK] Hard link checking for regular file            [Ok..] [0x0]
      [FSCK] valid_block_count matching with CP             [Ok..] [0xdf299]
      [FSCK] valid_node_count matcing with CP (de lookup)   [Ok..] [0x2b01]
      [FSCK] valid_node_count matcing with CP (nat lookup)  [Ok..] [0x2b01]
      [FSCK] valid_inode_count matched with CP              [Ok..] [0x2665]
      [FSCK] free segment_count matched with CP             [Ok..] [0xcb04]
      [FSCK] next block offset is free                      [Ok..]
      [FSCK] fixing SIT types
      [FSCK] other corrupted bugs                           [Fail]
      
      The root cause is:
      If we open a file with the read-only flag, disk quota info won't be
      initialized for this file. However, a following mmap() forces conversion
      of the inline inode via f2fs_convert_inline_inode(), which may increase
      block usage for this inode without updating quota data, causing
      inconsistent disk quota info.
      
      The issue happens in the following call stack:
      open(file, O_RDONLY)
      mmap(file)
      - f2fs_convert_inline_inode
       - f2fs_convert_inline_page
        - f2fs_reserve_block
         - f2fs_reserve_new_block
          - f2fs_reserve_new_blocks
           - f2fs_i_blocks_write
            - dquot_claim_block
      inode->i_blocks increases, but dqb_curspace keeps its old size because
      the dquots are NULL.
      
      To fix this issue, let's call dquot_initialize() unconditionally in both
      the f2fs_truncate() and f2fs_convert_inline_inode() functions to avoid
      potentially inconsistent quota data.
      
      Fixes: 0abd675e ("f2fs: support plain user/group quota")
      Signed-off-by: Daiyue Zhang <zhangdaiyue1@huawei.com>
      Signed-off-by: Dehe Gu <gudehe@huawei.com>
      Signed-off-by: Junchao Jiang <jiangjunchao1@huawei.com>
      Signed-off-by: Ge Qiu <qiuge@huawei.com>
      Signed-off-by: Yi Chen <chenyi77@huawei.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
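The userspace side of the sequence described in the commit above (a read-only open followed by mmap) looks roughly like the sketch below. The helper name `read_via_readonly_mmap` is hypothetical, and the sketch only performs the two calls; whether it actually reproduces the quota inconsistency depends on conditions it does not check, such as the file living on an f2fs mount with inline data and quota enabled on an unpatched kernel.

```python
import mmap
import os


def read_via_readonly_mmap(path):
    """Open `path` read-only, map the whole file, and return its bytes.

    Hypothetical helper mirroring the open(O_RDONLY) + mmap() sequence;
    the file must be non-empty, since a zero-length mapping is rejected.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # Length 0 maps the entire file; ACCESS_READ requests a read-only mapping.
        with mmap.mmap(fd, 0, access=mmap.ACCESS_READ) as mapping:
            return mapping[:]
    finally:
        os.close(fd)
```

Comparing the quota usage reported for the owner before and after running this against a small file is one way to observe the effect, assuming the preconditions listed above hold.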
    • f2fs: flush data when enabling checkpoint back · b0ff4fe7
      Jaegeuk Kim authored
      During the checkpoint=disable period, f2fs bypasses all synchronous I/Os such
      as sync and fsync. So, when enabling it back, we must flush all of them in order
      to keep the data persistent. Otherwise, a sudden power cut right after enabling
      checkpoint will cause data loss.
      
      Fixes: 4354994f ("f2fs: checkpoint disabling")
      Cc: stable@vger.kernel.org
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  6. 27 Jan, 2021 23 commits
  7. 26 Jan, 2021 7 commits
  8. 25 Jan, 2021 1 commit
    • KVM: x86: allow KVM_REQ_GET_NESTED_STATE_PAGES outside guest mode for VMX · 9a78e158
      Paolo Bonzini authored
      VMX also uses KVM_REQ_GET_NESTED_STATE_PAGES for the Hyper-V eVMCS,
      which may need to be loaded outside guest mode.  Therefore we cannot
      WARN in that case.
      
      However, that part of nested_get_vmcs12_pages is _not_ needed at
      vmentry time.  Split it out of KVM_REQ_GET_NESTED_STATE_PAGES handling,
      so that both vmentry and migration (and in the latter case, independent
      of is_guest_mode) do the parts that are needed.
      
      Cc: <stable@vger.kernel.org> # 5.10.x: f2c7ef3b: KVM: nSVM: cancel KVM_REQ_GET_NESTED_STATE_PAGES
      Cc: <stable@vger.kernel.org> # 5.10.x
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>