1. 29 Sep, 2016 1 commit
  2. 28 Sep, 2016 4 commits
    • Atish Patra's avatar
      sparc64: Fix irq stack bootmem allocation. · ebb99a4c
      Atish Patra authored
      Currently, irq stack bootmem is allocated for all possible cpus
      before nr_cpus value changes the list of possible cpus. As a result,
      there is unnecessary wastage of bootmemory.
      
      Move the irq stack bootmem allocation so that it happens after
      possible cpu list is modified based on nr_cpus value.
      Signed-off-by: default avatarAtish Patra <atish.patra@oracle.com>
      Reviewed-by: default avatarBob Picco <bob.picco@oracle.com>
      Reviewed-by: default avatarVijay Kumar <vijay.ac.kumar@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ebb99a4c
    • Atish Patra's avatar
      sparc64: Fix cpu_possible_mask if nr_cpus is set · 9b2f753e
      Atish Patra authored
      If kernel boot parameter nr_cpus is set, it should define the number
      of CPUs that can ever be available in the system i.e.
      cpu_possible_mask. setup_nr_cpu_ids() overrides the nr_cpu_ids based
      on the cpu_possible_mask during kernel initialization. If
      cpu_possible_mask is not set based on the nr_cpus value, earlier part
      of the kernel would be initialized using nr_cpus value leading to a
      kernel crash.
      
      Set cpu_possible_mask based on nr_cpus value. Thus setup_nr_cpu_ids()
      becomes redundant and does not corrupt nr_cpu_ids value.
      Signed-off-by: default avatarAtish Patra <atish.patra@oracle.com>
      Reviewed-by: default avatarBob Picco <bob.picco@oracle.com>
      Reviewed-by: default avatarVijay Kumar <vijay.ac.kumar@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9b2f753e
    • Mike Kravetz's avatar
      sparc64 mm: Fix more TSB sizing issues · 1e953d84
      Mike Kravetz authored
      Commit af1b1a9b ("sparc64 mm: Fix base TSB sizing when hugetlb
      pages are used") addressed the difference between hugetlb and THP
      pages when computing TSB sizes.  The following additional issues
      were also discovered while working with the code.
      
      In order to save memory, THP makes use of a huge zero page.  This huge
      zero page does not count against a task's RSS, but it does consume TSB
      entries.  This is similar to hugetlb pages.  Therefore, count huge
      zero page entries in hugetlb_pte_count.
      
      Accounting of THP pages is done in the routine set_pmd_at().
      Unfortunately, this does not catch the case where a THP page is split.
      To handle this case, decrement the count in pmdp_invalidate().
      pmdp_invalidate is only called when splitting a THP.  However, 'sanity
      checks' are added in case it is ever called for other purposes.
      
      A more general issue exists with HPAGE_SIZE accounting.
      hugetlb_pte_count tracks the number of HPAGE_SIZE (8M) pages.  This
      value is used to size the TSB for HPAGE_SIZE pages.  However,
      each HPAGE_SIZE page consists of two REAL_HPAGE_SIZE (4M) pages.
      The TSB contains an entry for each REAL_HPAGE_SIZE page.  Therefore,
      the number of REAL_HPAGE_SIZE pages should be used to size the huge
      page TSB.  A new compile time constant REAL_HPAGE_PER_HPAGE is used
      to multiply hugetlb_pte_count before sizing the TSB.
      
      Changes from V1
      - Fixed build issue if hugetlb or THP not configured
      Signed-off-by: default avatarMike Kravetz <mike.kravetz@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1e953d84
    • Paul Gortmaker's avatar
      sparc64: fix section mismatch in find_numa_latencies_for_group · bdf2f59e
      Paul Gortmaker authored
      To fix:
      
        WARNING: vmlinux.o(.text.unlikely+0x580): Section mismatch in
        reference from the function find_numa_latencies_for_group() to the
        function .init.text:find_mlgroup()
      
        The function find_numa_latencies_for_group() references the
        function __init find_mlgroup().  This is often because
        find_numa_latencies_for_group lacks a __init annotation or the
        annotation of find_mlgroup is wrong.
      
      It turns out find_numa_latencies_for_group is only called from:
          static int __init numa_parse_mdesc(void)
      and hence we can tag find_numa_latencies_for_group with __init.
      
      In doing so we see that find_best_numa_node_for_mlgroup is only
      called from within __init and hence can also be marked with __init.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Nitin Gupta <nitin.m.gupta@oracle.com>
      Cc: Chris Hyser <chris.hyser@oracle.com>
      Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Cc: sparclinux@vger.kernel.org
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bdf2f59e
  3. 27 Sep, 2016 1 commit
  4. 26 Sep, 2016 3 commits
  5. 25 Sep, 2016 7 commits
    • Lorenzo Stoakes's avatar
      mm: check VMA flags to avoid invalid PROT_NONE NUMA balancing · 38e08854
      Lorenzo Stoakes authored
      The NUMA balancing logic uses an arch-specific PROT_NONE page table flag
      defined by pte_protnone() or pmd_protnone() to mark PTEs or huge page
      PMDs respectively as requiring balancing upon a subsequent page fault.
      User-defined PROT_NONE memory regions which also have this flag set will
      not normally invoke the NUMA balancing code as do_page_fault() will send
      a segfault to the process before handle_mm_fault() is even called.
      
      However if access_remote_vm() is invoked to access a PROT_NONE region of
      memory, handle_mm_fault() is called via faultin_page() and
      __get_user_pages() without any access checks being performed, meaning
      the NUMA balancing logic is incorrectly invoked on a non-NUMA memory
      region.
      
      A simple means of triggering this problem is to access PROT_NONE mmap'd
      memory using /proc/self/mem which reliably results in the NUMA handling
      functions being invoked when CONFIG_NUMA_BALANCING is set.
      
      This issue was reported in bugzilla (issue 99101) which includes some
      simple repro code.
      
      There are BUG_ON() checks in do_numa_page() and do_huge_pmd_numa_page()
      added at commit c0e7cad9 to avoid accidentally provoking strange
      behaviour by attempting to apply NUMA balancing to pages that are in
      fact PROT_NONE.  The BUG_ON()'s are consistently triggered by the repro.
      
      This patch moves the PROT_NONE check into mm/memory.c rather than
      invoking BUG_ON() as faulting in these pages via faultin_page() is a
      valid reason for reaching the NUMA check with the PROT_NONE page table
      flag set and is therefore not always a bug.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=99101Reported-by: default avatarTrevor Saunders <tbsaunde@tbsaunde.org>
      Signed-off-by: default avatarLorenzo Stoakes <lstoakes@gmail.com>
      Acked-by: default avatarRik van Riel <riel@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      38e08854
    • Linus Torvalds's avatar
      Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · 831e45d8
      Linus Torvalds authored
      Pull MIPS fixes from Ralf Baechle:
       "A round of 4.8 fixes:
      
        MIPS generic code:
         - Add a missing ".set pop" in an early commit
         - Fix memory regions reaching top of physical
         - MAAR: Fix address alignment
         - vDSO: Fix Malta EVA mapping to vDSO page structs
         - uprobes: fix incorrect uprobe brk handling
         - uprobes: select HAVE_REGS_AND_STACK_ACCESS_API
         - Avoid a BUG warning during PR_SET_FP_MODE prctl
         - SMP: Fix possibility of deadlock when bringing CPUs online
         - R6: Remove compact branch policy Kconfig entries
         - Fix size calc when avoiding IPIs for small icache flushes
         - Fix pre-r6 emulation FPU initialisation
         - Fix delay slot emulation count in debugfs
      
        ATH79:
         - Fix test for error return of clk_register_fixed_factor.
      
        Octeon:
         - Fix kernel header to work for VDSO build.
         - Fix initialization of platform device probing.
      
        paravirt:
         - Fix undefined reference to smp_bootstrap"
      
      * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
        MIPS: Fix delay slot emulation count in debugfs
        MIPS: SMP: Fix possibility of deadlock when bringing CPUs online
        MIPS: Fix pre-r6 emulation FPU initialisation
        MIPS: vDSO: Fix Malta EVA mapping to vDSO page structs
        MIPS: Select HAVE_REGS_AND_STACK_ACCESS_API
        MIPS: Octeon: Fix platform bus probing
        MIPS: Octeon: mangle-port: fix build failure with VDSO code
        MIPS: Avoid a BUG warning during prctl(PR_SET_FP_MODE, ...)
        MIPS: c-r4k: Fix size calc when avoiding IPIs for small icache flushes
        MIPS: Add a missing ".set pop" in an early commit
        MIPS: paravirt: Fix undefined reference to smp_bootstrap
        MIPS: Remove compact branch policy Kconfig entries
        MIPS: MAAR: Fix address alignment
        MIPS: Fix memory regions reaching top of physical
        MIPS: uprobes: fix incorrect uprobe brk handling
        MIPS: ath79: Fix test for error return of clk_register_fixed_factor().
      831e45d8
    • Linus Torvalds's avatar
      Merge tag 'powerpc-4.8-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 751b9a5d
      Linus Torvalds authored
      Pull one more powerpc fix from Michael Ellerman:
       "powernv/pci: Fix m64 checks for SR-IOV and window alignment from
        Russell Currey"
      
      * tag 'powerpc-4.8-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/powernv/pci: Fix m64 checks for SR-IOV and window alignment
      751b9a5d
    • Linus Torvalds's avatar
      radix tree: fix sibling entry handling in radix_tree_descend() · 8d2c0d36
      Linus Torvalds authored
      The fixes to the radix tree test suite show that the multi-order case is
      broken.  The basic reason is that the radix tree code uses tagged
      pointers with the "internal" bit in the low bits, and calculating the
      pointer indices was supposed to mask off those bits.  But gcc will
      notice that we then use the index to re-create the pointer, and will
      avoid doing the arithmetic and use the tagged pointer directly.
      
      This cleans the code up, using the existing is_sibling_entry() helper to
      validate the sibling pointer range (instead of open-coding it), and
      using entry_to_node() to mask off the low tag bit from the pointer.  And
      once you do that, you might as well just use the now cleaned-up pointer
      directly.
      
      [ Side note: the multi-order code isn't actually ever used in the kernel
        right now, and the only reason I didn't just delete all that code is
        that Kirill Shutemov piped up and said:
      
          "Well, my ext4-with-huge-pages patchset[1] uses multi-order entries.
           It also converts shmem-with-huge-pages and hugetlb to them.
      
           I'm okay with converting it to other mechanism, but I need
           something.  (I looked into Konstantin's RFC patchset[2].  It looks
           okay, but I don't feel myself qualified to review it as I don't
           know much about radix-tree internals.)"
      
        [1] http://lkml.kernel.org/r/20160915115523.29737-1-kirill.shutemov@linux.intel.com
        [2] http://lkml.kernel.org/r/147230727479.9957.1087787722571077339.stgit@zurg ]
      Reported-by: default avatarMatthew Wilcox <mawilcox@microsoft.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Cc: Cedric Blancher <cedric.blancher@gmail.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8d2c0d36
    • Matthew Wilcox's avatar
      radix tree test suite: Test radix_tree_replace_slot() for multiorder entries · 62fd5258
      Matthew Wilcox authored
      When we replace a multiorder entry, check that all indices reflect the
      new value.
      
      Also, compile the test suite with -O2, which shows other problems with
      the code due to some dodgy pointer operations in the radix tree code.
      Signed-off-by: default avatarMatthew Wilcox <mawilcox@microsoft.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      62fd5258
    • Al Viro's avatar
      fix memory leaks in tracing_buffers_splice_read() · 1ae2293d
      Al Viro authored
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      1ae2293d
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Move mutex to protect against resetting of seq data · 1245800c
      Steven Rostedt (Red Hat) authored
      The iter->seq can be reset outside the protection of the mutex. So can
      reading of user data. Move the mutex up to the beginning of the function.
      
      Fixes: d7350c3f ("tracing/core: make the read callbacks reentrants")
      Cc: stable@vger.kernel.org # 2.6.30+
      Reported-by: default avatarAl Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      1245800c
  6. 24 Sep, 2016 10 commits
  7. 23 Sep, 2016 14 commits