    iommu/iova: introduce per-cpu caching to iova allocation · 9257b4a2
    Omer Peleg authored
    IOVA allocation has two problems that impede high-throughput I/O.
    First, it can do a linear search over the allocated IOVA ranges.
    Second, the rbtree spinlock that serializes IOVA allocations becomes
    contended.
    
    Address these problems by creating an API for caching allocated IOVA
    ranges, so that the IOVA allocator isn't accessed frequently.  This
    patch adds a per-CPU cache, from which CPUs can alloc/free IOVAs
    without taking the rbtree spinlock.  The per-CPU caches are backed by
    a global cache, to avoid invoking the (linear-time) IOVA allocator
    without needing to make the per-CPU cache size excessive.  This design
    is based on magazines, as described in "Magazines and Vmem: Extending
    the Slab Allocator to Many CPUs and Arbitrary Resources" (currently
    available at https://www.usenix.org/legacy/event/usenix01/bonwick.html).
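
To make the magazine idea concrete, here is a minimal user-space sketch of a per-CPU cache backed by a shared depot. It is not the code from this patch: the names (`cache_alloc`, `cache_free`, `slow_alloc`, `struct depot`), the tiny sizes, and the single magazine per CPU are all assumptions chosen for illustration, and the locking a shared depot would need is left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAG_SIZE   4  /* hypothetical: entries per magazine */
#define DEPOT_MAGS 2  /* hypothetical: full magazines the depot can hold */

struct magazine {
    size_t count;
    unsigned long pfns[MAG_SIZE];
};

/* One per CPU: touched only by its owner, so no lock is needed. */
struct cpu_cache {
    struct magazine loaded;
};

/* Shared by all CPUs: a real implementation would guard this with a lock. */
struct depot {
    size_t nr_full;
    struct magazine full[DEPOT_MAGS];
};

static unsigned long slow_allocs;   /* counts trips to the slow path */
static unsigned long next_pfn = 1;

/* Stand-ins for the underlying (slow, lock-protected) allocator. */
static unsigned long slow_alloc(void) { slow_allocs++; return next_pfn++; }
static void slow_free(unsigned long pfn) { (void)pfn; }

/* Free into the CPU's magazine; spill a full magazine to the depot. */
static void cache_free(struct cpu_cache *cpu, struct depot *depot,
                       unsigned long pfn)
{
    struct magazine *mag = &cpu->loaded;

    if (mag->count == MAG_SIZE) {
        if (depot->nr_full < DEPOT_MAGS) {
            depot->full[depot->nr_full++] = *mag;
        } else {
            /* Depot is full too: hand the entries back to the allocator. */
            for (size_t i = 0; i < mag->count; i++)
                slow_free(mag->pfns[i]);
        }
        mag->count = 0;
    }
    mag->pfns[mag->count++] = pfn;
}

/* Allocate from the CPU's magazine, then the depot, then the slow path. */
static unsigned long cache_alloc(struct cpu_cache *cpu, struct depot *depot)
{
    struct magazine *mag = &cpu->loaded;

    if (mag->count == 0) {
        if (depot->nr_full == 0)
            return slow_alloc();
        *mag = depot->full[--depot->nr_full];
    }
    return mag->pfns[--mag->count];
}
```

In this sketch the slow allocator is reached only when both the CPU's magazine and the depot are empty, which is the property the design relies on to keep the common alloc/free path off the shared lock.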
    
    Adding caching on top of the existing rbtree allocator maintains the
    property that IOVAs are densely packed in the IO virtual address space,
    which is important for keeping IOMMU page table usage low.
    
    To keep the cache size reasonable, we bound the IOVA space a CPU can
    cache by 32 MiB (we cache a bounded number of IOVA ranges, and only
    ranges of size <= 128 KiB).  The shared global cache is bounded at
    4 MiB of IOVA space.
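
The size limit can be illustrated with a small helper that decides whether a range is eligible for caching. The commit message only states the 128 KiB cutoff; the 4 KiB page size, the power-of-two size classes, and the names `size_class` and `order_of` are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SHIFT_ASSUMED 12               /* assumption: 4 KiB pages */
#define MAX_CACHED_BYTES   (128UL * 1024)   /* limit stated in the message */

/* Smallest order such that (1 << order) >= pages, for pages >= 1. */
static unsigned int order_of(unsigned long pages)
{
    unsigned int order = 0;

    while ((1UL << order) < pages)
        order++;
    return order;
}

/*
 * Returns true and stores the size class in *order if a range of the
 * given size may be cached; returns false if it must bypass the cache.
 */
static bool size_class(unsigned long bytes, unsigned int *order)
{
    unsigned long page_size = 1UL << PAGE_SHIFT_ASSUMED;
    unsigned long pages;

    if (bytes == 0 || bytes > MAX_CACHED_BYTES)
        return false;

    pages = (bytes + page_size - 1) >> PAGE_SHIFT_ASSUMED;
    *order = order_of(pages);
    return true;
}
```

Under these assumptions a 128 KiB range is 32 pages and falls in the largest class (order 5), while anything larger goes straight to the underlying allocator.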
    
    Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
    [mad@cs.technion.ac.il: rebased, cleaned up and reworded the commit message]
    Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
    Reviewed-by: Shaohua Li <shli@fb.com>
    Reviewed-by: Ben Serebrin <serebrin@google.com>
    [dwmw2: split out VT-d part into a separate patch]
    Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
iova.c 23.4 KB