    iomap: Add per-block dirty state tracking to improve performance · 4ce02c67
    Ritesh Harjani (IBM) authored
    When the filesystem blocksize is smaller than the folio size (either with
    mapping_large_folio_support() or with blocksize < pagesize) and the
    folio is uptodate in the pagecache, even a single-byte write can cause
    the entire folio to be written to disk during writeback. This happens
    because we currently have no mechanism to track per-block dirty
    state within struct iomap_folio_state; we only track uptodate
    state.
    
    This patch implements support for tracking per-block dirty state in
    the iomap_folio_state->state bitmap. This should improve filesystem
    write performance and reduce write amplification.
    
    Performance testing of the fio workload below shows a ~16x performance
    improvement on NVMe with XFS (4k blocksize) on Power (64K pagesize):
    fio-reported write bandwidth improved from ~28 MBps to ~452 MBps.
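
    As a rough sanity check on these numbers (an illustration, not a claim
    from the measurements themselves): a 64K page holds 16 blocks of 4k, so
    writing back a whole page for a single dirty block is a worst-case write
    amplification of 16x, which is in line with the observed bandwidth ratio.

```latex
\[
\frac{64\,\mathrm{KiB}}{4\,\mathrm{KiB}} = 16,
\qquad
\frac{452\ \mathrm{MBps}}{28\ \mathrm{MBps}} \approx 16.1
\]
```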
    
    1. <test_randwrite.fio>
    [global]
    	ioengine=psync
    	rw=randwrite
    	overwrite=1
    	pre_read=1
    	direct=0
    	bs=4k
    	size=1G
    	dir=./
    	numjobs=8
    	fdatasync=1
    	runtime=60
    	iodepth=64
    	group_reporting=1
    
    [fio-run]
    
    2. Our internal performance team also reported that this patch improves
       their database workload performance by ~83% (with XFS on Power).

    Reported-by: Aravinda Herle <araherle@in.ibm.com>
    Reported-by: Brian Foster <bfoster@redhat.com>
    Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
aops.c 19.5 KB