    block: loop: improve performance via blk-mq · b5dd2f60
    Ming Lei authored
    The conversion is fairly straightforward: a work queue is used to
    dispatch the loop device's requests. The one big change is that
    requests are now submitted to the backing file/device concurrently
    via the work queue, so throughput may improve considerably. Write
    requests to the same file are usually executed exclusively, so they
    are not handled concurrently; this avoids extra context switches,
    possible lock contention, and work scheduling cost. With blk-mq
    there is also an opportunity to merge loop I/O before it is
    submitted to the backing file/device.
    
    In the following test:
    	- base: v3.19-rc2-2041231
    	- loop over file in ext4 file system on SSD disk
    	- bs: 4k, libaio, io depth: 64, O_DIRECT, num of jobs: 1
    	- throughput: IOPS
    
    	------------------------------------------------------
    	|            | base      | base with loop-mq | delta |
    	------------------------------------------------------
    	| randread   | 1740      | 25318             | +1355%|
    	------------------------------------------------------
    	| read       | 42196     | 51771             | +22.6%|
    	------------------------------------------------------
    	| randwrite  | 35709     | 34624             | -3%   |
    	------------------------------------------------------
    	| write      | 39137     | 40326             | +3%   |
    	------------------------------------------------------
    
    So loop-mq improves throughput for both read and randread, while
    write and randwrite performance is essentially unaffected.
    
    Another benefit is that the loop driver code becomes much simpler
    after the blk-mq conversion, so the patch can also be regarded as a
    cleanup.
    
    Signed-off-by: Ming Lei <ming.lei@canonical.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
loop.h 2.2 KB