    swap: allow swap readahead to be merged · 3fb5c298
    Christian Ehrhardt authored
    Swap readahead works fine, but the I/O to disk is almost always done in
    page size requests, despite the fact that readahead submits
    1<<page-cluster pages at a time.
    
    On older kernels the old per-device plugging behavior might have captured
    this and merged the requests, but currently it all comes down to many more
    I/Os than required.
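
    As an illustration of the effect described above, here is a toy model
    (not kernel code) of how 1<<page-cluster page-sized reads collapse into a
    single request once they are held back and merged. The function names and
    the "plugged" flag are hypothetical, and a 4 KiB page size is assumed.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def readahead_requests(start_page, page_cluster):
    """Return one (offset, length) read per page of a readahead window."""
    return [((start_page + i) * PAGE_SIZE, PAGE_SIZE)
            for i in range(1 << page_cluster)]


def merge_contiguous(requests):
    """Merge requests that sit back-to-back on the device."""
    merged = []
    for offset, length in sorted(requests):
        if merged and merged[-1][0] + merged[-1][1] == offset:
            # This request starts where the previous one ends: extend it.
            prev_offset, prev_length = merged[-1]
            merged[-1] = (prev_offset, prev_length + length)
        else:
            merged.append((offset, length))
    return merged


def dispatch(requests, plugged):
    """Without batching every request goes out alone; with it they merge."""
    return merge_contiguous(requests) if plugged else list(requests)


if __name__ == "__main__":
    reqs = readahead_requests(start_page=16, page_cluster=3)
    print(len(dispatch(reqs, plugged=False)), "requests without batching")
    print(len(dispatch(reqs, plugged=True)), "request(s) with batching")
```

    With page-cluster set to 3 the model issues eight page-sized reads in the
    unbatched case and one 32 KiB read in the batched case.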
    
    On a single device this might not be an issue, but as soon as a server
    runs on shared SAN resources, saving I/Os not only improves swapin
    throughput but also lowers resource utilization.
    
    With a load running KVM under heavy memory overcommitment (the hot memory
    is 1.5 times the host memory), swapping throughput improves significantly
    and the load feels more responsive as well as achieving more throughput.
    
    In a test setup with 16 swap disks, running blktrace on one of those disks
    shows the improved merging:
    Prior:
    Reads Queued:     560,888,    2,243MiB  Writes Queued:     226,242,  904,968KiB
    Read Dispatches:  544,701,    2,243MiB  Write Dispatches:  159,318,  904,968KiB
    Reads Requeued:         0               Writes Requeued:         0
    Reads Completed:  544,716,    2,243MiB  Writes Completed:  159,321,  904,980KiB
    Read Merges:       16,187,   64,748KiB  Write Merges:       61,744,  246,976KiB
    IO unplugs:       149,614               Timer unplugs:       2,940
    
    With the patch:
    Reads Queued:     734,315,    2,937MiB  Writes Queued:     300,188,    1,200MiB
    Read Dispatches:  214,972,    2,937MiB  Write Dispatches:  215,176,    1,200MiB
    Reads Requeued:         0               Writes Requeued:         0
    Reads Completed:  214,971,    2,937MiB  Writes Completed:  215,177,    1,200MiB
    Read Merges:      519,343,    2,077MiB  Write Merges:       73,325,  293,300KiB
    IO unplugs:       337,130               Timer unplugs:      11,184
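
    The read figures above can be cross-checked with a little arithmetic.
    The sketch below only reuses the numbers from the two summaries; the
    helper name merge_stats is hypothetical, and the MiB totals are the
    rounded values printed by the tool, so the averages are approximate.

```python
def merge_stats(queued, dispatched, merged, mib_transferred):
    """Derive merge ratio and average dispatch size from summary counters."""
    return {
        # Every queued request is either merged away or dispatched.
        "consistent": queued - merged == dispatched,
        "merge_ratio": merged / queued,
        "avg_kib_per_dispatch": mib_transferred * 1024 / dispatched,
    }


# Read-side counters copied from the "Prior" and "With the patch" summaries.
prior = merge_stats(560_888, 544_701, 16_187, 2_243)
patched = merge_stats(734_315, 214_972, 519_343, 2_937)

if __name__ == "__main__":
    for name, stats in (("prior", prior), ("patched", patched)):
        print(f"{name}: merge ratio {stats['merge_ratio']:.1%}, "
              f"avg read {stats['avg_kib_per_dispatch']:.1f} KiB")
```

    About 3% of queued reads were merged before the patch versus about 71%
    with it, and the average dispatched read grows from roughly 4 KiB to
    roughly 14 KiB.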
    
    I got ~10% to ~40% more throughput in my cases and at the same time much
    lower CPU consumption when broken down per transferred kilobyte (the
    majority of that due to saved interrupts and better cache handling).  In a
    shared SAN others might get an additional benefit as well, because this
    now causes less protocol overhead.
    Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
    Acked-by: Rik van Riel <riel@redhat.com>
    Acked-by: Jens Axboe <axboe@kernel.dk>
    Reviewed-by: Minchan Kim <minchan@kernel.org>
    Cc: Hugh Dickins <hughd@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
swap_state.c 10.3 KB