1. 11 Sep, 2020 3 commits
    • f2fs: correct statistic of APP_DIRECT_IO/APP_DIRECT_READ_IO · 335cac8b
      Jack Qiu authored
      APP_DIRECT_IO/APP_DIRECT_READ_IO are not updated when receiving async DIO.
      For example: fio -filename=/data/test.0 -bs=1m -ioengine=libaio -direct=1
      		-name=fill -size=10m -numjobs=1 -iodepth=32 -rw=write
      Signed-off-by: Jack Qiu <jack.qiu@huawei.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: Simplify SEEK_DATA implementation · 4cb03fec
      Matthew Wilcox (Oracle) authored
      Instead of finding the first dirty page and then seeing if it matches
      the index of a block that is NEW_ADDR, delay the lookup of the dirty
      bit until we've actually found a block that's NEW_ADDR.
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: support age threshold based garbage collection · 093749e2
      Chao Yu authored
      There are several issues in the current background GC algorithm:
      - The number of valid blocks is one of the key factors in the cost
      calculation, so if a segment has few valid blocks, the CB algorithm
      will still choose it as a victim even if its age is young or it is
      located in a hot segment, which is not appropriate.
      - GCed data/node blocks go to existing logs regardless of whether the
      data already there has the same update frequency, so hot and cold
      data may be mixed again.
      - The GC allocator mainly uses LFS type segments, which consumes free
      segments more quickly.
      
      This patch introduces a new algorithm, age threshold based garbage
      collection, to address the issues above. It has three main steps:
      
      1. select a source victim:
      - set an age threshold and select candidates based on it, e.g.
       0 means youngest and 100 means oldest; if the age threshold is set
       to 80, dirty segments whose age is in the range [80, 100] are
       selected as candidates;
      - set a candidate_ratio threshold and select candidates based on the
      ratio, so that the candidates shrink to the oldest segments;
      - select the target segment with the fewest valid blocks in order to
      migrate blocks with minimum cost;
      
      2. select a target victim:
      - select candidates based on the age threshold;
      - set a candidate_radius threshold and search for candidates whose
      age is close to the source victim's; the search radius should be
      less than the radius threshold;
      - select the target segment with the most valid blocks in order to
      avoid migrating the current target segment.
      
      3. merge valid blocks from the source victim into the target victim
      with the SSR allocator.
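
The first two steps above can be sketched in user-space C. This is only an illustration under stated assumptions: `struct seg_info`, `pick_source()` and `pick_target()` are hypothetical names, ages are plain integers in [0, 100], and the candidate_ratio filter is left out. It is not the kernel implementation.

```c
#include <stddef.h>

/* Hypothetical per-segment summary; not the kernel's data structure. */
struct seg_info {
    unsigned int age;          /* 0 = youngest, 100 = oldest */
    unsigned int valid_blocks; /* valid blocks still in the segment */
};

/*
 * Step 1: among segments at least age_threshold old, pick the one with
 * the fewest valid blocks. Returns its index, or -1 if none qualifies.
 */
int pick_source(const struct seg_info *segs, size_t n,
                unsigned int age_threshold)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        if (segs[i].age < age_threshold)
            continue;
        if (best < 0 || segs[i].valid_blocks < segs[best].valid_blocks)
            best = (int)i;
    }
    return best;
}

/*
 * Step 2: among the other old-enough segments whose age differs from the
 * source's by less than radius, pick the one with the most valid blocks.
 */
int pick_target(const struct seg_info *segs, size_t n, int src,
                unsigned int age_threshold, unsigned int radius)
{
    int best = -1;

    if (src < 0)
        return -1;
    for (size_t i = 0; i < n; i++) {
        unsigned int a = segs[i].age, b = segs[src].age;
        unsigned int dist = a > b ? a - b : b - a;

        if ((int)i == src || a < age_threshold || dist >= radius)
            continue;
        if (best < 0 || segs[i].valid_blocks > segs[best].valid_blocks)
            best = (int)i;
    }
    return best;
}
```

With an age threshold of 80, a source is chosen only from segments aged 80 or more, and widening the radius lets a fuller but less similar segment win as the target.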
      
      Test steps:
      - create 160 dirty segments:
       * half of them have 128 valid blocks per segment
       * the rest have 384 valid blocks per segment
      - run background GC
      
      Benefit: both the GC count and the block movement count decrease
      significantly:
      
      - Before:
        - Valid: 86
        - Dirty: 1
        - Prefree: 11
        - Free: 6001 (6001)
      
      GC calls: 162 (BG: 220)
        - data segments : 160 (160)
        - node segments : 2 (2)
      Try to move 41454 blocks (BG: 41454)
        - data blocks : 40960 (40960)
        - node blocks : 494 (494)
      
      IPU: 0 blocks
      SSR: 0 blocks in 0 segments
      LFS: 41364 blocks in 81 segments
      
      - After:
      
        - Valid: 87
        - Dirty: 0
        - Prefree: 4
        - Free: 6008 (6008)
      
      GC calls: 75 (BG: 76)
        - data segments : 74 (74)
        - node segments : 1 (1)
      Try to move 12813 blocks (BG: 12813)
        - data blocks : 12544 (12544)
        - node blocks : 269 (269)
      
      IPU: 0 blocks
      SSR: 12032 blocks in 77 segments
      LFS: 855 blocks in 2 segments
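
The figures above can be cross-checked with a little integer arithmetic. The helpers below are hypothetical and only restate numbers from the test description and the two status dumps: 80 segments with 128 valid blocks plus 80 with 384 give the 40960 data blocks moved in the "Before" run, and the reductions are expressed in tenths of a percent.

```c
/* Total valid data blocks in the test setup: half/half split of segments. */
unsigned long setup_valid_blocks(unsigned long segments,
                                 unsigned long blocks_a,
                                 unsigned long blocks_b)
{
    return (segments / 2) * blocks_a + (segments / 2) * blocks_b;
}

/*
 * Decrease from before to after in tenths of a percent, rounded down.
 * Returns 0 when before is 0 or after is not smaller than before.
 */
unsigned long decrease_permille(unsigned long before, unsigned long after)
{
    if (before == 0 || after >= before)
        return 0;
    return (before - after) * 1000 / before;
}
```

Applied to the dumps, `decrease_permille(41454, 12813)` gives 690 (about 69% fewer blocks moved) and `decrease_permille(162, 75)` gives 537 (about 54% fewer GC calls).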
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      [Jaegeuk Kim: fix a bug along with pinfile in-mem segment & clean up]
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  2. 10 Sep, 2020 15 commits
  3. 09 Sep, 2020 4 commits
    • Merge tag 'nfs-for-5.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · ab29a807
      Linus Torvalds authored
      Pull NFS client bugfixes from Trond Myklebust:
      
       - Fix an NFS/RDMA resource leak
      
       - Fix the error handling during delegation recall
      
       - NFSv4.0 needs to return the delegation on a zero-stateid SETATTR
      
       - Stop printk reading past end of string
      
      * tag 'nfs-for-5.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        SUNRPC: stop printk reading past end of string
        NFS: Zero-stateid SETATTR should first return delegation
        NFSv4.1 handle ERR_DELAY error reclaiming locking state on delegation recall
        xprtrdma: Release in-flight MRs on disconnect
    • f2fs: Return EOF on unaligned end of file DIO read · 20d0a107
      Gabriel Krisman Bertazi authored
      Reading past end of file returns EOF for aligned reads but -EINVAL for
      unaligned reads on f2fs.  While documentation is not strict about this
      corner case, most filesystems return EOF in this case, like iomap
      filesystems.  This patch consolidates the behavior for f2fs, by making
      it return EOF(0).
      
      It can be verified by a read loop on a file that does a partial read
      before EOF (a file that doesn't end at an aligned address).  The
      following code fails on an unaligned file on f2fs, but not on
      btrfs, ext4, and xfs.
      
        while (done < total) {
          ssize_t delta = pread(fd, buf + done, total - done, off + done);
          if (!delta)
            break;
          ...
        }
      
      It is arguable whether filesystems should actually return EOF or
      -EINVAL, but since iomap filesystems support it, and so does the
      original DIO code, it seems reasonable to consolidate on that.
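
A self-contained variant of the read loop above, as a sketch only: `read_until_eof()` is a hypothetical helper, and it assumes the caller has already opened `fd` (with `O_DIRECT` and a suitably aligned buffer if the f2fs behavior is being tested). It treats a return value of 0 from `pread()` as EOF and reports any negative return as an error.

```c
#define _XOPEN_SOURCE 700
#include <sys/types.h>
#include <unistd.h>

/*
 * Read up to total bytes starting at offset off, stopping early at EOF.
 * Returns the number of bytes read, or -1 if pread() fails.
 */
ssize_t read_until_eof(int fd, char *buf, size_t total, off_t off)
{
    size_t done = 0;

    while (done < total) {
        ssize_t delta = pread(fd, buf + done, total - done,
                              off + (off_t)done);

        if (delta < 0)
            return -1;  /* e.g. -EINVAL surfaced as errno */
        if (delta == 0)
            break;      /* EOF: the behavior this patch consolidates on */
        done += (size_t)delta;
    }
    return (ssize_t)done;
}
```

On a filesystem that returns -EINVAL for the final unaligned read, this helper returns -1 instead of the short byte count.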
      Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix indefinite loop scanning for free nid · e2cab031
      Sahitya Tummala authored
      If the sbi->ckpt->next_free_nid is not NAT block aligned and if there
      are free nids in that NAT block between the start of the block and
      next_free_nid, then those free nids will not be scanned in scan_nat_page().
      This results in a mismatch between nm_i->available_nids and the sum of
      nm_i->free_nid_count of all NAT blocks scanned. And nm_i->available_nids
      will always be greater than the sum of free nids in all the blocks.
      Under this condition, if we use all the currently scanned free nids,
      then it will loop forever in f2fs_alloc_nid() as nm_i->available_nids
      is still not zero but nm_i->free_nid_count of that partially scanned
      NAT block is zero.
      
      Fix this by aligning nm_i->next_scan_nid to the first nid of the
      corresponding NAT block.
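
The alignment itself is simple modular arithmetic. The sketch below is not the patch: `nat_block_start_nid()` is a hypothetical helper, and the value 455 for `NAT_ENTRY_PER_BLOCK` is an assumption that holds for 4 KiB blocks with 9-byte NAT entries.

```c
#include <stdint.h>

/* Assumed: NAT entries per NAT block for 4 KiB blocks and 9-byte entries. */
#define NAT_ENTRY_PER_BLOCK 455u

/* Round a nid down to the first nid of the NAT block that contains it. */
uint32_t nat_block_start_nid(uint32_t nid)
{
    return nid - nid % NAT_ENTRY_PER_BLOCK;
}
```

Starting the scan at the aligned nid means the free nids between the start of the block and next_free_nid are counted as well, so the per-block counts and the global count stay consistent.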
      Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: Fix type of section block count variables · 123aaf77
      Shin'ichiro Kawasaki authored
      Commit da52f8ad ("f2fs: get the right gc victim section when section
      has several segments") added code to count blocks of each section using
      variables of type 'unsigned short', which is 2 bytes in size on many
      systems. However, the counts can exceed the 2-byte range, and the type
      conversion then yields wrong values. In particular, when an f2fs
      section has USHRT_MAX + 1 blocks, the count is handled as 0.
      This triggers an infinite loop in init_dirty_segmap() during the mount
      system call. Fix this by changing the type of the variables to block_t.
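
The truncation described above can be reproduced in a few lines of user-space C. The helper names are hypothetical, and `uint32_t` merely stands in for a wider counter type such as block_t; this is a demonstration of the conversion, not kernel code.

```c
#include <limits.h>
#include <stdint.h>

/* Store a block count in an unsigned short, as the buggy code did. */
unsigned short count_as_ushort(uint32_t blocks)
{
    return (unsigned short)blocks; /* high bits are silently dropped */
}

/* Store the same count in a 32-bit type, which keeps the full value. */
uint32_t count_as_u32(uint32_t blocks)
{
    return blocks;
}
```

Passing `USHRT_MAX + 1` to `count_as_ushort()` yields 0, while `count_as_u32()` returns the value unchanged.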
      
      Fixes: da52f8ad ("f2fs: get the right gc victim section when section has several segments")
      Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  4. 08 Sep, 2020 8 commits
  5. 07 Sep, 2020 1 commit
  6. 06 Sep, 2020 4 commits
    • Merge tag 'io_uring-5.9-2020-09-06' of git://git.kernel.dk/linux-block · a8205e31
      Linus Torvalds authored
      Pull more io_uring fixes from Jens Axboe:
       "Two followup fixes. One is fixing a regression from this merge window,
        the other is two commits fixing cancelation of deferred requests.
      
        Both have gone through full testing, and both spawned a few new
        regression test additions to liburing.
      
         - Don't play games with const, properly store the output iovec and
           assign it as needed.
      
         - Deferred request cancelation fix (Pavel)"
      
      * tag 'io_uring-5.9-2020-09-06' of git://git.kernel.dk/linux-block:
        io_uring: fix linked deferred ->files cancellation
        io_uring: fix cancel of deferred reqs with ->files
        io_uring: fix explicit async read/write mapping for large segments
    • Merge tag 'iommu-fixes-v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 2ccdd9f8
      Linus Torvalds authored
      Pull iommu fixes from Joerg Roedel:
      
       - three Intel VT-d fixes to fix address handling on 32bit, fix a NULL
         pointer dereference bug and serialize a hardware register access as
         required by the VT-d spec.
      
       - two patches for AMD IOMMU to force AMD GPUs into translation mode
         when memory encryption is active and disallow using IOMMUv2
         functionality.  This makes the AMDGPU driver work when memory
         encryption is active.
      
       - two more fixes for AMD IOMMU to fix updating the Interrupt Remapping
         Table Entries.
      
       - MAINTAINERS file update for the Qualcomm IOMMU driver.
      
      * tag 'iommu-fixes-v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu/vt-d: Handle 36bit addressing for x86-32
        iommu/amd: Do not use IOMMUv2 functionality when SME is active
        iommu/amd: Do not force direct mapping when SME is active
        iommu/amd: Use cmpxchg_double() when updating 128-bit IRTE
        iommu/amd: Restore IRTE.RemapEn bit after programming IRTE
        iommu/vt-d: Fix NULL pointer dereference in dev_iommu_priv_set()
        iommu/vt-d: Serialize IOMMU GCMD register modifications
        MAINTAINERS: Update QUALCOMM IOMMU after Arm SMMU drivers move
    • Merge tag 'x86-urgent-2020-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 015b3155
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
      
       - more generic entry code ABI fallout
      
       - debug register handling bugfixes
      
       - fix vmalloc mappings on 32-bit kernels
      
       - kprobes instrumentation output fix on 32-bit kernels
      
       - fix over-eager WARN_ON_ONCE() on !SMAP hardware
      
       - NUMA debugging fix
      
       - fix Clang related crash on !RETPOLINE kernels
      
      * tag 'x86-urgent-2020-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/entry: Unbreak 32bit fast syscall
        x86/debug: Allow a single level of #DB recursion
        x86/entry: Fix AC assertion
        tracing/kprobes, x86/ptrace: Fix regs argument order for i386
        x86, fakenuma: Fix invalid starting node ID
        x86/mm/32: Bring back vmalloc faulting on x86_32
        x86/cmdline: Disable jump tables for cmdline.c
    • Merge tag 'for-linus-5.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 68beef57
      Linus Torvalds authored
      Pull xen updates from Juergen Gross:
       "A small series for fixing a problem with Xen PVH guests when running
        as backends (e.g. as dom0).
      
        Mapping other guests' memory is now working via ZONE_DEVICE, thus not
        requiring to abuse the memory hotplug functionality for that purpose"
      
      * tag 'for-linus-5.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen: add helpers to allocate unpopulated memory
        memremap: rename MEMORY_DEVICE_DEVDAX to MEMORY_DEVICE_GENERIC
        xen/balloon: add header guard
  7. 05 Sep, 2020 5 commits
    • io_uring: fix linked deferred ->files cancellation · c127a2a1
      Pavel Begunkov authored
      While looking for ->files in ->defer_list, consider that requests there
      may actually be links.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: fix cancel of deferred reqs with ->files · b7ddce3c
      Pavel Begunkov authored
      While trying to cancel requests with ->files, it also should look for
      requests in ->defer_list, otherwise it might end up hanging a thread.
      
      Cancel all requests in ->defer_list up to the last request there with
      matching ->files; this is needed to follow drain ordering semantics.
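
The "cancel up to the last match" rule can be illustrated with an array standing in for ->defer_list. Everything here is hypothetical (`struct deferred_req`, `cancel_deferred()`, the `cancelled` flag); the real code walks a kernel list and handles locking and links, which this sketch ignores.

```c
#include <stddef.h>

/* Hypothetical stand-in for a deferred request. */
struct deferred_req {
    const void *files; /* owner identity, compared by pointer */
    int cancelled;     /* set to 1 once the request is cancelled */
};

/*
 * Cancel every entry from the head of the list through the last entry
 * whose files pointer matches. Entries after that one are untouched.
 * Returns the number of entries cancelled.
 */
size_t cancel_deferred(struct deferred_req *list, size_t n,
                       const void *files)
{
    size_t last = n; /* n means "no match found" */

    for (size_t i = 0; i < n; i++)
        if (list[i].files == files)
            last = i;
    if (last == n)
        return 0;
    for (size_t i = 0; i <= last; i++)
        list[i].cancelled = 1;
    return last + 1;
}
```

Non-matching requests queued ahead of the last match are cancelled too, since a drained request may not run before the ones queued in front of it.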
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • Merge tags 'auxdisplay-for-linus-v5.9-rc4', 'clang-format-for-linus-v5.9-rc4'... · dd9fb9bb
      Linus Torvalds authored
      Merge tags 'auxdisplay-for-linus-v5.9-rc4', 'clang-format-for-linus-v5.9-rc4' and 'compiler-attributes-for-linus-v5.9-rc4' of git://github.com/ojeda/linux
      
      Pull misc fixes from Miguel Ojeda:
       "A trivial patch for auxdisplay:
      
         - Replace HTTP links with HTTPS ones (Alexander A. Klimov)
      
        The usual clang-format trivial update:
      
         - Update with the latest for_each macro list (Miguel Ojeda)
      
        And Luc requested me to pick a sparse fix on my queue, so here it goes
        along with other two trivial Compiler Attributes ones (also from Luc).
      
         - sparse: use static inline for __chk_{user,io}_ptr() (Luc Van
           Oostenryck)
      
         - Compiler Attributes: fix comment concerning GCC 4.6 (Luc Van
           Oostenryck)
      
         - Compiler Attributes: remove comment about sparse not supporting
           __has_attribute (Luc Van Oostenryck)"
      
      * tag 'auxdisplay-for-linus-v5.9-rc4' of git://github.com/ojeda/linux:
        auxdisplay: Replace HTTP links with HTTPS ones
      
      * tag 'clang-format-for-linus-v5.9-rc4' of git://github.com/ojeda/linux:
        clang-format: Update with the latest for_each macro list
      
      * tag 'compiler-attributes-for-linus-v5.9-rc4' of git://github.com/ojeda/linux:
        sparse: use static inline for __chk_{user,io}_ptr()
        Compiler Attributes: fix comment concerning GCC 4.6
        Compiler Attributes: remove comment about sparse not supporting __has_attribute
    • Merge tag 'arc-5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc · 70187f77
      Linus Torvalds authored
      Pull ARC fixes from Vineet Gupta:
      
       - HSDK-4xd Dev system: perf driver updates for sampling interrupt
      
       - HSDK* Dev System: Ethernet broken [Evgeniy Didin]
      
       - HIGHMEM broken (2 memory banks) [Mike Rapoport]
      
       - show_regs() rewrite once and for all
      
       - Other minor fixes
      
      * tag 'arc-5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
        ARC: [plat-hsdk]: Switch ethernet phy-mode to rgmii-id
        arc: fix memory initialization for systems with two memory banks
        irqchip/eznps: Fix build error for !ARC700 builds
        ARC: show_regs: fix r12 printing and simplify
        ARC: HSDK: wireup perf irq
        ARC: perf: don't bail setup if pct irq missing in device-tree
        ARC: pgalloc.h: delete a duplicated word + other fixes
    • Merge branch 'akpm' (patches from Andrew) · 7514c036
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "19 patches.
      
        Subsystems affected by this patch series: MAINTAINERS, ipc, fork,
        checkpatch, lib, and mm (memcg, slub, pagemap, madvise, migration,
        hugetlb)"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        include/linux/log2.h: add missing () around n in roundup_pow_of_two()
        mm/khugepaged.c: fix khugepaged's request size in collapse_file
        mm/hugetlb: fix a race between hugetlb sysctl handlers
        mm/hugetlb: try preferred node first when alloc gigantic page from cma
        mm/migrate: preserve soft dirty in remove_migration_pte()
        mm/migrate: remove unnecessary is_zone_device_page() check
        mm/rmap: fixup copying of soft dirty and uffd ptes
        mm/migrate: fixup setting UFFD_WP flag
        mm: madvise: fix vma user-after-free
        checkpatch: fix the usage of capture group ( ... )
        fork: adjust sysctl_max_threads definition to match prototype
        ipc: adjust proc_ipc_sem_dointvec definition to match prototype
        mm: track page table modifications in __apply_to_page_range()
        MAINTAINERS: IA64: mark Status as Odd Fixes only
        MAINTAINERS: add LLVM maintainers
        MAINTAINERS: update Cavium/Marvell entries
        mm: slub: fix conversion of freelist_corrupted()
        mm: memcg: fix memcg reclaim soft lockup
        memcg: fix use-after-free in uncharge_batch