    [PATCH] Remove percpufication of in_flight counter in · 046dbb49
    Andrew Morton authored
    From: Ravikiran G Thirumalai <kiran@in.ibm.com>
    
    The routine disk_round_stats showed up prominently in oprofile under high
    disk I/O load (four processes running dd against different partitions of
    the same disk on a 4-way machine).
    
    This is because the in_flight counter, which is currently per-cpu, gets
    read every time disk_round_stats is called.  Per-cpu counters such as the
    disk statistics make writes fast, but reads are slow, since every cpu's
    local counter value has to be read and summed.  Because in_flight is
    modified right after disk_round_stats (which itself reads in_flight), it
    is better not to make this counter per-cpu.
    
    The following patch does just that.  Below is the profile comparison
    before and after the change.  This was on a 4-way PIII Xeon with 1 GB of
    RAM, running 2.6.0-test4-mm2.
    
    Before:
    c010aa60 2910109  92.2249     poll_idle
    c0275340 23208    0.73549     __copy_to_user_ll
    c02753b0 11191    0.354657    __copy_from_user_ll
    c0114aa0 7168     0.227163    mark_offset_tsc
    c011ad10 6767     0.214455    schedule
    c011a2b0 6741     0.213631    load_balance
    c0138890 6710     0.212648    __generic_file_aio_write_nolock
    c011d302 4683     0.14841     .text.lock.sched
    c02e4b50 4533     0.143656    ahc_linux_isr
    c029cec0 3582     0.113518    disk_round_stats
    c0119b40 3509     0.111205    try_to_wake_up
    c029d320 3306     0.104771    __make_request
    c01567d0 3300     0.104581    __block_write_full_page
    c0156c00 3299     0.104549    __block_prepare_write
    
    After:
    c010aa60 2777940  92.1302     poll_idle
    c0275340 23479    0.778679    __copy_to_user_ll
    c02753b0 10943    0.362924    __copy_from_user_ll
    c0114aa0 7022     0.232884    mark_offset_tsc
    c0138890 6988     0.231757    __generic_file_aio_write_nolock
    c011ad10 6607     0.219121    schedule
    c011d302 5771     0.191395    .text.lock.sched
    c02e4a60 4458     0.147849    ahc_linux_isr
    c011a2b0 3921     0.13004     load_balance
    c01567d0 3569     0.118366    __block_write_full_page
    c029d2a0 3540     0.117404    __make_request
    ...
    c029ceb0 311      0.0103143   disk_round_stats
    c011d5b0 299      0.00991631  remove_wait_queue
ll_rw_blk.c 72.4 KB