    x86, mpx: Cleanup unused bound tables · 1de4fa14
    Dave Hansen authored
    The previous patch allocates bounds tables on-demand.  As noted in
    an earlier description, these can add up to *HUGE* amounts of
    memory.  This has caused OOMs in practice when running tests.
    
    This patch adds support for freeing bounds tables when they are no
    longer in use.
    
    There are two types of mappings in play when unmapping tables:
     1. The mapping with the actual data, which userspace is
        munmap()ing or brk()ing away, etc...
 2. The mapping for the bounds table *backing* the data
    (tagged with VM_MPX; see the patch "add MPX specific
    mmap interface"), as sketched below.
    
If userspace has used the prctl() introduced earlier in this
patch set to enable kernel management of bounds tables, then
when it unmaps the first type of mapping (the one with the
actual data), the kernel needs to free the mapping for the
bounds table backing that data.
This patch hooks in at the very end of do_munmap() to do so.
    We look at the addresses being unmapped and find the bounds
    directory entries and tables which cover those addresses.  If
an entire table is unused, we clear the associated directory entry
    and free the table.
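
    A rough sketch of that hook's shape, in kernel C.  The names
    mpx_notify_unmap(), kernel_managing_mpx_tables() and
    mpx_free_unused_tables() follow the behavior described here and
    in the earlier prctl() patch, but may differ from the patch's
    exact identifiers:

        #include <linux/mm.h>

        /*
         * Called at the very end of do_munmap(), with mmap_sem
         * held for write.
         */
        void mpx_notify_unmap(struct mm_struct *mm,
                              unsigned long start, unsigned long end)
        {
                /* Only act if userspace enabled kernel management
                 * of bounds tables via the earlier prctl(). */
                if (!kernel_managing_mpx_tables(mm))
                        return;
                /* Hypothetical helper: walk the bounds directory
                 * entries covering [start, end) and free any table
                 * that is now entirely unused. */
                mpx_free_unused_tables(mm, start, end);
        }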
    
    Once we unmap the bounds table, we would have a bounds directory
    entry pointing at empty address space. That address space might
    now be allocated for some other (random) use, and the MPX
    hardware might now try to walk it as if it were a bounds table.
That would be bad.  So any unmapping of an entire bounds table
    has to be accompanied by a corresponding write to the bounds
    directory entry to invalidate it.  That write to the bounds
    directory can fault, which causes the following problem:
    
    Since we are doing the freeing from munmap() (and other paths
    like it), we hold mmap_sem for write. If we fault, the page
    fault handler will attempt to acquire mmap_sem for read and
    we will deadlock.  To avoid the deadlock, we pagefault_disable()
when touching the bounds directory entry and use
get_user_pages() to resolve the fault.
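
    A sketch of that pattern (the function name clear_bd_entry()
    is illustrative, and the get_user_pages() arguments shown
    assume the kernel API of this patch's era):

        #include <linux/mm.h>
        #include <linux/sched.h>
        #include <linux/uaccess.h>

        /* Invalidate one bounds directory entry without letting
         * the page fault handler run, since we hold mmap_sem for
         * write. */
        static int clear_bd_entry(struct mm_struct *mm,
                                  long __user *bd_entry)
        {
                int ret;

                for (;;) {
                        pagefault_disable();
                        ret = put_user(0, bd_entry);
                        pagefault_enable();
                        if (!ret)
                                return 0;       /* entry cleared */
                        if (ret != -EFAULT)
                                return ret;
                        /* The write faulted: fault the page in by
                         * hand and retry the write. */
                        if (get_user_pages(current, mm,
                                           (unsigned long)bd_entry,
                                           1, 1, 0, NULL, NULL) <= 0)
                                return -EFAULT;
                }
        }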
    
The unmapping of bounds tables happens under vm_munmap().  We
    also (indirectly) call vm_munmap() to _do_ the unmapping of the
    bounds tables.  We avoid unbounded recursion by disallowing
    freeing of bounds tables *for* bounds tables.  This would not
    occur normally, so should not have any practical impact.  Being
    strict about it here helps ensure that we do not have an
    exploitable stack overflow.
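
    A sketch of that guard (range_contains_mpx_vma() is an
    illustrative name; mmap_sem is assumed held so the VMA list
    is stable):

        #include <linux/mm.h>

        /* True if any VMA in [start, end) is itself bounds-table
         * backing (VM_MPX).  Freeing is skipped for such ranges,
         * so table freeing never recurses into more table
         * freeing. */
        static int range_contains_mpx_vma(struct mm_struct *mm,
                                          unsigned long start,
                                          unsigned long end)
        {
                struct vm_area_struct *vma = find_vma(mm, start);

                while (vma && vma->vm_start < end) {
                        if (vma->vm_flags & VM_MPX)
                                return 1;
                        vma = vma->vm_next;
                }
                return 0;
        }
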
Based-on-patch-by: Qiaowei Ren <qiaowei.ren@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: linux-mm@kvack.org
    Cc: linux-mips@linux-mips.org
    Cc: Dave Hansen <dave@sr71.net>
Link: http://lkml.kernel.org/r/20141114151831.E4531C4A@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>