    dmaengine: mv_xor: optimize performance by using a subset of the XOR channels · 77757291
    Thomas Petazzoni authored
    Due to how async_tx behaves internally, having more XOR channels than
    CPUs actually hurts performance rather than improving it: memcpy
    requests get scheduled on a different channel than the XOR requests,
    but async_tx still waits for the memcpy requests to complete before
    scheduling the XOR requests.
    
    It is in fact more efficient to have at most one channel per CPU,
    which this patch implements by limiting the number of channels per
    engine, and the number of engines registered, depending on the number
    of available CPUs.
    
    Marvell platforms are currently available in one-CPU, two-CPU and
    four-CPU configurations:
    
     - in the configurations with one CPU, only one channel from one
       engine is used.
    
     - in the configurations with two CPUs, only one channel from each
       engine is used (there are two XOR engines).
    
     - in the configurations with four CPUs, both channels of both engines
       are used.
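
    The mapping above can be sketched as a small standalone C helper.
    This is only an illustration, not the driver code: the names
    xor_limits, xor_limits_for_cpus, XOR_MAX_ENGINES and XOR_MAX_CHANNELS
    are hypothetical, and the limits of two engines with two channels
    each are assumed from the description above. Only the one, two and
    four CPU cases listed above are covered by that description.

```c
/* Assumed hardware limits, taken from the description above:
 * two XOR engines, each providing two channels. */
#define XOR_MAX_ENGINES  2U
#define XOR_MAX_CHANNELS 2U

/* Hypothetical result type: how many engines to register and how
 * many channels to use on each registered engine. */
struct xor_limits {
	unsigned int engines;
	unsigned int channels_per_engine;
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Cap engines and channels so that, for 1, 2 and 4 CPUs, the total
 * number of channels in use equals the number of CPUs. */
static struct xor_limits xor_limits_for_cpus(unsigned int ncpus)
{
	struct xor_limits lim;

	/* Never register more engines than there are CPUs. */
	lim.engines = min_uint(XOR_MAX_ENGINES, ncpus);

	/* Spread the CPUs over the engines, rounding up. */
	lim.channels_per_engine = min_uint(XOR_MAX_CHANNELS, (ncpus + 1) / 2);

	return lim;
}
```

    With these assumptions, one CPU yields one engine with one channel,
    two CPUs yield two engines with one channel each, and four CPUs
    yield two engines with two channels each.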

    Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
    Signed-off-by: Vinod Koul <vinod.koul@intel.com>
mv_xor.c 33.2 KB