1. 20 Aug, 2015 6 commits
  2. 14 Aug, 2015 3 commits
  3. 11 Aug, 2015 2 commits
    • Chao Yu's avatar
      f2fs: remove inmem radix tree · decd36b6
      Chao Yu authored
      Previously, we use radix tree to index all registered page entries for
      atomic file, but now we only use radix tree to see whether current page
      is indexed or not, since the other user of radix tree is gone in commit
      042b7816 ("f2fs: remove unnecessary call to invalidate inmemory pages").
      
      So in this patch, we try to use one more efficient way:
      Introducing a macro ATOMIC_WRITTEN_PAGE, and setting it as page private
      value to indicate page indexing status. By using this way, we can save
      memory and lookup time.
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      decd36b6
    • Chao Yu's avatar
      f2fs: report EINVAL for unalignment direct IO · c15e8599
      Chao Yu authored
      We run ltp testcase with f2fs and obtain a TFAIL in diotest4, the result in
      detail is as fallow:
      
      dio04
      
      <<<test_start>>>
      tag=dio04 stime=1432278894
      cmdline="diotest4"
      contacts=""
      analysis=exit
      <<<test_output>>>
      diotest4    1  TPASS  :  Negative Offset
      diotest4    2  TPASS  :  removed
      diotest4    3  TFAIL  :  diotest4.c:129: write allows odd count.returns 1: Success
      diotest4    4  TFAIL  :  diotest4.c:183: Odd count of read and write
      diotest4    5  TPASS  :  Read beyond the file size
      ......
      
      the result of ext4 with same environment:
      
      dio04
      
      <<<test_start>>>
      tag=dio04 stime=1432259643
      cmdline="diotest4"
      contacts=""
      analysis=exit
      <<<test_output>>>
      diotest4    1  TPASS  :  Negative Offset
      diotest4    2  TPASS  :  removed
      diotest4    3  TPASS  :  Odd count of read and write
      diotest4    4  TPASS  :  Read beyond the file size
      ......
      
      The reason is that when triggering DIO in f2fs, we will return zero value
      in ->direct_IO if writer's buffer offset, file offset and transfer size is
      not alignment to block size of filesystem, resulting in falling back into
      buffered write instead of returning -EINVAL.
      
      This patch fixes that problem by returning correct error number for above
      case, and removing the judgement condition in check_direct_IO to make sure
      the verification will be enabled for direct reader too.
      
      Besides, Jaegeuk Kim pointed out that there is expectional cases we should
      always make direct-io falling back into buffered write, such as dio in
      encrypted file.
      Signed-off-by: default avatarYunlei He <heyunlei@huawei.com>
      [Chao Yu make small change and add detail description in commit message]
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      c15e8599
  4. 10 Aug, 2015 1 commit
  5. 06 Aug, 2015 2 commits
    • Chao Yu's avatar
      f2fs: recover invalid/reserved block address for fsynced file · 12a8343e
      Chao Yu authored
      When testing with generic/101 in xfstests, error message outputed as below:
      
          --- tests/generic/101.out
          +++ results//generic/101.out.bad
          @@ -10,10 +10,14 @@
           File foo content after log replay:
           0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
           *
          -0200000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
          +0200000 bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb
           *
           0372000
          ...
          (Run 'diff -u tests/generic/101.out results/generic/101.out.bad'  to see the entire diff)
      
      The test flow is like below:
      1. pwrite foo -S 0xaa 0 64K
      2. pwrite foo -S 0xbb 64K 61K
      3. sync
      4. truncate foo 64K
      5. truncate foo 125K
      6. fsync foo
      7. flakey drop writes
      8. umount
      
      After this test, we expect the data of recovered file will have the first
      64k of data filling with value 0xaa and the next 61k of data filling with
      value 0x00 because we have fsynced it before dropping writes in dm.
      
      In f2fs, during recovering, we will only recover the valid block address
      in direct node page if it is marked as a fsynced dnode, but block address
      which means invalid/reserved (with value NULL_ADDR/NEW_ADDR) will not be
      recovered. So, the file recovered shows its incorrect data 0xbb in range of
      [61k, 125k].
      
      In this patch, we fix to recover invalid/reserved block during recover flow.
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      12a8343e
    • Fan Li's avatar
      f2fs: use extent cache to optimize f2fs_reserve_block · 759af1c9
      Fan Li authored
      In some cases, we only need the block address when we call
      f2fs_reserve_block,
      other fields of struct dnode_of_data aren't necessary.
      We can try extent cache first for such cases in order to speed up the
      process.
      Signed-off-by: default avatarFan li <fanofcode.li@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      759af1c9
  6. 05 Aug, 2015 19 commits
  7. 04 Aug, 2015 7 commits
    • Chao Yu's avatar
      f2fs: expose f2fs_write_cache_pages · 8f46dcae
      Chao Yu authored
      If there are gced dirty pages and normal dirty pages in the mapping
      of one inode, we might writeback them alternately with discontinuous
      block address, resulting in low performance.
      
      This patch introduces f2fs_write_cache_pages with codes copied from
      write_cache_pages in mm/page-writeback.c.
      
      In this function, we refactor flow with two steps:
      1) writeback all cold type pages.
      2) writeback all non-cold type pages.
      
      By using this method, f2fs will writeback dirty pages with the same
      temperature in bunch mode, it makes writeouted block being with
      more continuous address, so they can be merged as much as possible
      in f2fs bio cache, and also it will reduce the chance of submiting
      small IO from block layer.
      
      Test environment: 8g nokia sd card (very old sd card, but it shows
      better effect when testing with this patch, and with a 32g kingston
      sd card, I didn't see much more improvement).
      
      Test step:
      1. touch testfile;
      2. truncate -s 512K testfile;
      3. write all pages with odd index;
      4. trigger gc by ioctl;
      5. write all pages with even index;
      6. time fsync testfile.
      
      before:
      real	0m0.402s
      user	0m0.000s
      sys	0m0.000s
      
      after:
      real	0m0.143s
      user	0m0.004s
      sys	0m0.004s
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      8f46dcae
    • Chao Yu's avatar
      f2fs: correct return value of ->setxattr · 037fe70c
      Chao Yu authored
      This patch fixes to return correct error number of ->setxattr, which
      is reported by xfstest tests/generic/026 as below:
      
      generic/026      - output mismatch
          --- tests/generic/026.out
          +++ results/generic/026.out.bad
          @@ -4,6 +4,6 @@
           1 below acl max
           acl max
           1 above acl max
          -chacl: cannot set access acl on "largeaclfile": Argument list too long
          +chacl: cannot set access acl on "largeaclfile": Numerical result out of range
           use 16 aces
           use 17 aces
          ...
      Ran: generic/026
      Failures: generic/026
      Failed 1 of 1 tests
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      037fe70c
    • Chao Yu's avatar
      f2fs: cleanup write_orphan_inodes · bd936f84
      Chao Yu authored
      Previously, since 'commit 4531929e ("f2fs: move grabing orphan
      pages out of protection region")' was committed, in write_orphan_inodes(),
      we will grab all meta page in a batch before we use them under spinlock,
      so that we can avoid large time delay of grabbing meta pages under
      spinlock.
      
      Now, 'commit d6c67a4f ("f2fs: revmove spin_lock for
      write_orphan_inodes")' remove the spinlock in write_orphan_inodes,
      so there is no issue we describe above, we'd better recover to move
      the grab operation to original place for readability.
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      bd936f84
    • Chao Yu's avatar
      f2fs: warm up cold page after mmaped write · 5b339124
      Chao Yu authored
      With cost-benifit method, background gc will consider old section with
      fewer valid blocks as candidate victim, these old blocks in section will
      be treated as cold data, and laterly will be moved into cold segment.
      
      But if the gcing page is attached by user through buffered or mmaped
      write, we should reset the page as non-cold one, because this page may
      have more opportunity for further updating.
      
      So fix to add clearing code for the missed 'mmap' case.
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      5b339124
    • Chao Yu's avatar
      f2fs: add new ioctl F2FS_IOC_GARBAGE_COLLECT · c1c1b583
      Chao Yu authored
      When background gc is off, the only way to trigger gc is executing
      a force gc in some operations who wants to grab space in disk.
      
      The executing condition is limited: to execute force gc, we should
      wait for the time when there is almost no more free section for LFS
      allocation. This seems not reasonable for our user who wants to
      control triggering gc by himself.
      
      This patch introduces F2FS_IOC_GARBAGE_COLLECT interface for
      triggering garbage collection by using ioctl. It provides our users
      one more option to trigger gc.
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      c1c1b583
    • Chao Yu's avatar
      f2fs: maintain extent cache in separated file · a28ef1f5
      Chao Yu authored
      This patch moves extent cache related code from data.c into extent_cache.c
      since extent cache is independent feature, and its codes are not relate to
      others in data.c, it's better for us to maintain them in separated place.
      
      There is no functionality change, but several small coding style fixes
      including:
      * rename __drop_largest_extent to f2fs_drop_largest_extent for exporting;
      * rename misspelled word 'untill' to 'until';
      * remove unneeded 'return' in the end of f2fs_destroy_extent_tree().
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      a28ef1f5
    • Fan Li's avatar
      f2fs: don't try to split extents shorter than F2FS_MIN_EXTENT_LEN · 3c7df87d
      Fan Li authored
      Since only parts of extents longer than F2FS_MIN_EXTENT_LEN will
      be kept in extent cache after split, extents already shorter than
      F2FS_MIN_EXTENT_LEN don't need to try split at all.
      Signed-off-by: default avatarFan Li <fanofcode.li@samsung.com>
      Reviewed-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      3c7df87d