    [PATCH] ia64: sba_iommu perf tuning and new functionality · 6898da46
    Alex Williamson authored
       I've been doing some performance tuning and adding some functionality
    to sba_iommu for zx1/sx1000 chipsets.  This adds:
    
          * Long overdue consistent_dma_mask support
          * Long overdue ability to do large mappings in the iommu
          * Tightened spinlock usage for better performance/scalability
          * Added branch prediction hints for some of the performance paths
          * Added explicit data prefetching to some performance paths -
            perfmon shows roughly a 20% decrease in L3 misses in the bitmap
            search code
          * Increased delayed resource freeing depth and added a separate
            lock per ioc to avoid contention
          * Added code to free up queued pdir entries should we be unable to
            find space for new ones (not that I've ever seen the pdir
            anywhere close to full)
          * Finished cleaning out the hint support code; Grant is
            maintaining this separately for now
          * Added option to control bypass of sg mappings separately from
            single/coherent mappings
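
The branch prediction hints and explicit prefetching mentioned in the list can be illustrated with a small standalone sketch. This is not the sba_iommu code: `find_free_bit`, the 64-bit word layout, the prefetch distance of 8 words, and the choice of which branch to mark as likely are all assumptions made for illustration. It relies on the GCC/Clang builtins `__builtin_expect`, `__builtin_prefetch`, and `__builtin_ctzll`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Branch hint macros in the style used by the kernel (GCC/Clang builtins). */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Assumed prefetch distance, in 64-bit words (hypothetical tuning value). */
#define PREFETCH_AHEAD 8

/*
 * Hypothetical helper: scan a resource bitmap (set bit = in use) and
 * return the index of the first clear bit, or -1 if every bit is set.
 */
long find_free_bit(const uint64_t *map, size_t nwords)
{
    assert(map != NULL);

    for (size_t i = 0; i < nwords; i++) {
        /* Ask for a later word early so it is in cache when we get there. */
        if (i + PREFETCH_AHEAD < nwords)
            __builtin_prefetch(&map[i + PREFETCH_AHEAD], 0, 1);

        /* Assumption: the map is rarely full, so a free bit is the common case. */
        if (likely(map[i] != ~UINT64_C(0)))
            return (long)(i * 64 + (size_t)__builtin_ctzll(~map[i]));
    }

    /* Exhausted map: the uncommon path. */
    return -1;
}
```

The hints only steer code layout and speculation; the function returns the same result with or without them, so any gain has to be confirmed by measurement, as the perfmon figure above was.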
    
    Much like the swiotlb, sba_iommu allows devices capable of 64-bit
    addressing to bypass the iommu and DMA directly to/from memory.  Using a
    worst case scenario test (64-bit bypass disabled, all DMA mapped through
    the iommu), I saw a 60% increase in sequential block input throughput
    using bonnie++ on a large RAID0 MD array.  In fact, this patch provides
    the best bonnie++ performance with bypass disabled.  This is likely due
    to benefits seen from coalescing the scatterlist, allowing better disk
    streaming.  I assume that network performance will likely be limited by
    mapping latency, so I added the last bullet item to allow sg mappings to
    get the benefit of coalescing while keeping a low latency path for
    single and coherent mappings.  If anyone is set up for network
    benchmarks, I'd be interested in a before and after with this patch.
sba_iommu.c 52.6 KB