    mm: Implement new pkey_mprotect() system call · 7d06d9c9
    Dave Hansen authored
    pkey_mprotect() is just like mprotect, except it also takes a
    protection key as an argument.  On systems that do not support
    protection keys, it still works, but requires that key=0.
    Otherwise it does exactly what mprotect does.
    
    I expect it to be used like this when you want to guarantee that
    a mapping you create can *never* be accessed without the right
    protection keys set up:
    
    	int real_prot = PROT_READ|PROT_WRITE;
    	int pkey = pkey_alloc(0, PKEY_DENY_ACCESS);
    	void *ptr = mmap(NULL, PAGE_SIZE, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
    	int ret = pkey_mprotect(ptr, PAGE_SIZE, real_prot, pkey);
    
    This way, there is *no* window where the mapping is accessible
    since it was always either PROT_NONE or had a protection key set
    that denied all access.
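
    The sequence above leaves out error handling. A minimal sketch of
    the same pattern with failures checked might look like the
    following. It assumes a Linux system whose headers define the
    SYS_pkey_* syscall numbers and calls them through syscall() rather
    than a libc wrapper. map_keyed_page() is a hypothetical helper name,
    and PKEY_DISABLE_ACCESS is the spelling used by current headers for
    what this message calls PKEY_DENY_ACCESS.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallback for older headers; the value 0x1 matches the mainline uapi. */
#ifndef PKEY_DISABLE_ACCESS
#define PKEY_DISABLE_ACCESS 0x1
#endif

/*
 * Hypothetical helper: map 'len' bytes that are never reachable without
 * the returned key.  Returns NULL with errno set when keys cannot be
 * used (no hardware or kernel support, or no free key).
 */
static void *map_keyed_page(size_t len, int *pkey_out)
{
#if defined(SYS_pkey_alloc) && defined(SYS_pkey_mprotect) && defined(SYS_pkey_free)
	/* Allocate a key whose initial rights deny all access. */
	long pkey = syscall(SYS_pkey_alloc, 0UL, (unsigned long)PKEY_DISABLE_ACCESS);
	if (pkey < 0)
		return NULL;

	/* PROT_NONE first, so the mapping is inaccessible from the start. */
	void *ptr = mmap(NULL, len, PROT_NONE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	if (ptr == MAP_FAILED) {
		int saved = errno;
		syscall(SYS_pkey_free, pkey);
		errno = saved;
		return NULL;
	}

	/* Attach the key while switching to the real protection bits. */
	if (syscall(SYS_pkey_mprotect, ptr, len,
		    (long)(PROT_READ | PROT_WRITE), pkey) != 0) {
		int saved = errno;
		munmap(ptr, len);
		syscall(SYS_pkey_free, pkey);
		errno = saved;
		return NULL;
	}

	*pkey_out = (int)pkey;
	return ptr;
#else
	(void)len;
	(void)pkey_out;
	errno = ENOSYS;
	return NULL;
#endif
}
```

    A caller would pass one page length and must not touch the returned
    memory until it has enabled access for the key in the current
    thread. A NULL return only means that keys are unusable here, and
    the caller has to decide whether plain mprotect() is an acceptable
    fallback.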
    
    We settled on 'unsigned long' for the type of the key here.  We
    only need 4 bits on x86 today, but I figured that other
    architectures might need some more space.
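
    As background for the "4 bits" figure: a 4-bit key gives 16 keys on
    x86, and the PKRU register holds two rights bits per key
    (access-disable and write-disable), 32 bits in total. The sketch
    below works through that arithmetic. The macro and function names
    are made up for illustration and are not kernel identifiers.

```c
#include <stdint.h>

#define X86_PKEY_BITS	4u			/* key width on x86, per the message */
#define X86_NR_PKEYS	(1u << X86_PKEY_BITS)	/* 16 distinct keys */
#define PKRU_AD		0x1u			/* access-disable bit of a key's pair */
#define PKRU_WD		0x2u			/* write-disable bit of a key's pair */

/* Does a key passed as 'unsigned long' fit the x86 key space? */
static int pkey_fits_x86(unsigned long pkey)
{
	return pkey < X86_NR_PKEYS;
}

/* PKRU bits that encode 'rights' for 'pkey'; 0 if the key is out of range. */
static uint32_t pkru_bits(unsigned long pkey, uint32_t rights)
{
	if (!pkey_fits_x86(pkey))
		return 0;
	return (rights & (PKRU_AD | PKRU_WD)) << (2 * pkey);
}
```

    For key 6, pkru_bits(6, PKRU_AD) selects bit 12 of PKRU. An
    architecture with a wider key field would only need different
    constants here, which is the headroom the 'unsigned long' argument
    leaves.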
    
    Semantically, we have a bit of a problem if we combine this
    syscall with our previously-introduced execute-only support:
    What do we do when we mix execute-only pkey use with
    pkey_mprotect() use?  For instance:
    
    	pkey_mprotect(ptr, PAGE_SIZE, PROT_WRITE, 6); // set pkey=6
    	mprotect(ptr, PAGE_SIZE, PROT_EXEC);  // set pkey=X_ONLY_PKEY?
    	mprotect(ptr, PAGE_SIZE, PROT_WRITE); // is pkey=6 again?
    
    To solve that, we make the plain-mprotect()-initiated execute-only
    support only apply to VMAs that have the default protection key (0)
    set on them.
    
    Proposed semantics:
    1. protection key 0 is special and represents the default,
       "unassigned" protection key.  It is always allocated.
    2. mprotect() never affects a mapping's pkey_mprotect()-assigned
       protection key. A protection key of 0 (even if set explicitly)
       represents an unassigned protection key.
       2a. mprotect(PROT_EXEC) on a mapping with an assigned protection
           key may or may not result in a mapping with execute-only
           properties.  pkey_mprotect() plus pkey_set() on all threads
           should be used to _guarantee_ execute-only semantics if this
           is not a strong enough semantic.
    3. mprotect(PROT_EXEC) may result in an "execute-only" mapping. The
       kernel will internally attempt to allocate and dedicate a
       protection key for the purpose of execute-only mappings.  This
       may not be possible when no free protection keys are available
       or, of course, when there is no hardware support for protection
       keys.
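
    The rules above can be sketched as a small userspace model. This is
    not kernel code: struct vma_model, struct mm_model and both
    functions are hypothetical, and the model tracks the kernel's
    dedicated execute-only key as a separate flag instead of a key
    number.

```c
#include <stdbool.h>
#include <sys/mman.h>

struct vma_model {
	int prot;
	int pkey;	/* 0 is the default, "unassigned" key (rule 1) */
	bool exec_only;	/* kernel applied its dedicated execute-only key */
};

struct mm_model {
	bool hw_pkeys;		/* hardware supports protection keys */
	bool xonly_key_free;	/* a key can be dedicated to execute-only use */
};

/* pkey_mprotect(): sets both the protection bits and the key. */
static void model_pkey_mprotect(struct vma_model *v, int prot, int pkey)
{
	v->prot = prot;
	v->pkey = pkey;
	v->exec_only = false;
}

/* mprotect(): never changes an assigned key (rule 2). */
static void model_mprotect(const struct mm_model *mm, struct vma_model *v, int prot)
{
	v->prot = prot;
	/*
	 * Rule 3: execute-only is attempted only for key 0, and only if
	 * a key is available.  With an assigned key (rule 2a) the model
	 * adds nothing; the result then depends on that key's rights.
	 */
	v->exec_only = prot == PROT_EXEC && v->pkey == 0 &&
		       mm->hw_pkeys && mm->xonly_key_free;
}
```

    Running the three-call example from earlier through this model
    keeps pkey=6 across both mprotect() calls and never marks the
    mapping execute-only. The same mprotect(PROT_EXEC) on a mapping
    that still has key 0 does, as long as a key is free.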

    Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
    Acked-by: Mel Gorman <mgorman@techsingularity.net>
    Cc: linux-arch@vger.kernel.org
    Cc: Dave Hansen <dave@sr71.net>
    Cc: arnd@arndb.de
    Cc: linux-api@vger.kernel.org
    Cc: linux-mm@kvack.org
    Cc: luto@kernel.org
    Cc: akpm@linux-foundation.org
    Cc: torvalds@linux-foundation.org
    Link: http://lkml.kernel.org/r/20160729163012.3DDD36C4@viggo.jf.intel.com
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>