    swiotlb: reduce swiotlb pool lookups · 7296f230
    Michael Kelley authored
    With CONFIG_SWIOTLB_DYNAMIC enabled, each round-trip map/unmap pair
    in the swiotlb results in 6 calls to swiotlb_find_pool(). In multiple
    places, the pool is found and used in one function, and then must
    be found again in the next function that is called because only the
    tlb_addr is passed as an argument. These are the six call sites:
    
    dma_direct_map_page:
     1. swiotlb_map -> swiotlb_tbl_map_single -> swiotlb_bounce
    
    dma_direct_unmap_page:
     2. dma_direct_sync_single_for_cpu -> is_swiotlb_buffer
     3. dma_direct_sync_single_for_cpu -> swiotlb_sync_single_for_cpu ->
    	swiotlb_bounce
     4. is_swiotlb_buffer
     5. swiotlb_tbl_unmap_single -> swiotlb_del_transient
     6. swiotlb_tbl_unmap_single -> swiotlb_release_slots
    
    Reduce the number of calls by finding the pool at a higher level, and
    passing it as an argument instead of searching again. A key change is
    for is_swiotlb_buffer() to return a pool pointer instead of a boolean,
    and then pass this pool pointer to subsequent swiotlb functions.
    
    There are 9 occurrences of is_swiotlb_buffer() used to test if a buffer
    is a swiotlb buffer before calling a swiotlb function. To reduce code
    duplication in getting the pool pointer and passing it as an argument,
    introduce inline wrappers for this pattern. The generated code is
    essentially unchanged.
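The wrapper pattern described above can be sketched in plain userspace C. This is only an illustration, not the kernel code: `struct pool`, `find_pool()`, `do_sync_for_cpu()` and `sync_for_cpu()` are hypothetical stand-ins, and the `lookups` counter exists only to make the number of pool searches visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bounce-buffer pool; not the kernel struct. */
struct pool {
	uintptr_t start;
	uintptr_t end;		/* exclusive */
};

static struct pool pools[] = {
	{ 0x1000, 0x2000 },
	{ 0x8000, 0x9000 },
};

static unsigned int lookups;	/* number of pool searches performed */
static unsigned int syncs;	/* number of sync operations performed */

/* Return the pool containing addr, or NULL if addr is not a bounce buffer. */
static struct pool *find_pool(uintptr_t addr)
{
	lookups++;
	for (size_t i = 0; i < sizeof(pools) / sizeof(pools[0]); i++)
		if (addr >= pools[i].start && addr < pools[i].end)
			return &pools[i];
	return NULL;
}

/* Worker: receives the pool as an argument instead of searching again. */
static void do_sync_for_cpu(uintptr_t addr, struct pool *pool)
{
	assert(addr >= pool->start && addr < pool->end);
	syncs++;
}

/* Inline wrapper: one lookup, then pass the result to the worker. */
static inline void sync_for_cpu(uintptr_t addr)
{
	struct pool *pool = find_pool(addr);

	if (pool)
		do_sync_for_cpu(addr, pool);
}
```

Calling `sync_for_cpu()` on an address inside a pool performs one search and one sync; calling it on any other address performs one search and nothing else, so the "is this a bounce buffer?" test and the pool lookup are the same operation.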
    
    Since is_swiotlb_buffer() no longer returns a boolean, rename some
    functions to reflect the change:
    
     * swiotlb_find_pool() becomes __swiotlb_find_pool()
     * is_swiotlb_buffer() becomes swiotlb_find_pool()
     * is_xen_swiotlb_buffer() becomes xen_swiotlb_find_pool()
    
    With these changes, a round-trip map/unmap pair requires only 2 pool
    lookups (listed using the new names and wrappers):
    
    dma_direct_unmap_page:
     1. dma_direct_sync_single_for_cpu -> swiotlb_find_pool
     2. swiotlb_tbl_unmap_single -> swiotlb_find_pool
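To make the reduction on the unmap side concrete, the following userspace sketch counts pool searches for an "address only" call chain versus a "pass the pool" call chain. All names here are hypothetical mock-ups that loosely mirror the call sites listed in this message; the map-side lookup is not modeled, and the buffer is assumed to be non-transient so that both the del-transient and release-slots steps run.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical single-pool model; not the kernel data structures. */
struct pool {
	uintptr_t start, end;	/* end is exclusive */
	bool transient;
};

static struct pool the_pool = { 0x1000, 0x2000, false };
static unsigned int lookups;	/* number of pool searches performed */

static struct pool *find_pool(uintptr_t a)
{
	lookups++;
	return (a >= the_pool.start && a < the_pool.end) ? &the_pool : NULL;
}

/* Old style: each helper gets only the address and searches again. */
static bool old_is_buffer(uintptr_t a) { return find_pool(a) != NULL; }
static void old_bounce(uintptr_t a) { (void)find_pool(a); }
static bool old_del_transient(uintptr_t a)
{
	struct pool *p = find_pool(a);
	return p && p->transient;
}
static void old_release_slots(uintptr_t a) { (void)find_pool(a); }

static unsigned int count_old_unmap(uintptr_t a)
{
	lookups = 0;
	if (old_is_buffer(a))		/* test before sync */
		old_bounce(a);		/* sync for CPU */
	if (old_is_buffer(a)) {		/* test before unmap */
		if (!old_del_transient(a))
			old_release_slots(a);
	}
	return lookups;
}

/* New style: the pool found by the caller is passed down. */
static void new_bounce(uintptr_t a, struct pool *p) { (void)a; (void)p; }
static bool new_del_transient(struct pool *p) { return p->transient; }
static void new_release_slots(uintptr_t a, struct pool *p) { (void)a; (void)p; }

static unsigned int count_new_unmap(uintptr_t a)
{
	struct pool *p;

	lookups = 0;
	p = find_pool(a);		/* lookup before sync */
	if (p)
		new_bounce(a, p);
	p = find_pool(a);		/* lookup before unmap */
	if (p && !new_del_transient(p))
		new_release_slots(a, p);
	return lookups;
}
```

Under these assumptions, `count_old_unmap()` reports five searches for an address inside the pool (corresponding to call sites 2 through 6 in the earlier list), while `count_new_unmap()` reports two.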
    
    These changes come from noticing the inefficiencies in a code review,
    not from performance measurements. With CONFIG_SWIOTLB_DYNAMIC,
    __swiotlb_find_pool() is not trivial, and it uses an RCU read lock,
    so avoiding the redundant calls helps performance in a hot path.
    When CONFIG_SWIOTLB_DYNAMIC is *not* set, the code size reduction
    is minimal and the perf benefits are likely negligible, but no
    harm is done.
    
    No functional change is intended.
    
    Signed-off-by: Michael Kelley <mhklinux@outlook.com>
    Reviewed-by: Petr Tesarik <petr@tesarici.cz>
    Signed-off-by: Christoph Hellwig <hch@lst.de>