1. 20 Nov, 2016 3 commits
    • Wanpeng Li's avatar
      sched/cputime: Fix prev steal time accouting during CPU hotplug · 97220c6c
      Wanpeng Li authored
      commit 3d89e547 upstream.
      
      Commit:
      
        e9532e69 ("sched/cputime: Fix steal time accounting vs. CPU hotplug")
      
      ... set rq->prev_* to 0 after a CPU hotplug comes back, in order to
      fix the case where (after CPU hotplug) steal time is smaller than
      rq->prev_steal_time.
      
      However, this should never happen. Steal time was only smaller because of the
      KVM-specific bug fixed by the previous patch.  Worse, the previous patch
      triggers a bug on CPU hot-unplug/plug operation: because
      rq->prev_steal_time is cleared, all of the CPU's past steal time will be
      accounted again on hot-plug.
      
      Since the root cause has been fixed, we can just revert commit e9532e69.
      Signed-off-by: default avatarWanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 'commit e9532e69 ("sched/cputime: Fix steal time accounting vs. CPU hotplug")'
      Link: http://lkml.kernel.org/r/1465813966-3116-3-git-send-email-wanpeng.li@hotmail.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      [bwh: Backported to 3.2: adjust filename, context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      97220c6c
    • Bharata B Rao's avatar
      powerpc/numa: Fix multiple bugs in memory_hotplug_max() · 17cc521e
      Bharata B Rao authored
      commit 45b64ee6 upstream.
      
      memory_hotplug_max() uses hot_add_drconf_memory_max() to get maxmimum
      addressable memory by referring to ibm,dyanamic-memory property. There
      are three problems with the current approach:
      
      1 hot_add_drconf_memory_max() assumes that ibm,dynamic-memory includes
        all the LMBs of the guest, but that is not true for PowerKVM which
        populates only DR LMBs (LMBs that can be hotplugged/removed) in that
        property.
      2 hot_add_drconf_memory_max() multiplies lmb-size with lmb-count to arrive
        at the max possible address. Since ibm,dynamic-memory doesn't include
        RMA LMBs, the address thus obtained will be less than the actual max
        address. For example, if max possible memory size is 32G, with lmb-size
        of 256MB there can be 127 LMBs in ibm,dynamic-memory (1 LMB for RMA
        which won't be present here).  hot_add_drconf_memory_max() would then
        return the max addressable memory as 127 * 256MB = 31.75GB, the max
        address should have been 32G which is what ibm,lrdr-capacity shows.
      3 In PowerKVM, there can be a gap between the end of boot time RAM and
        beginning of hotplug RAM area. So just multiplying lmb-count with
        lmb-size will not provide the correct max possible address for PowerKVM.
      
      This patch fixes 1 by using ibm,lrdr-capacity property to return the max
      addressable memory whenever the property is present. Then it fixes 2 & 3
      by fetching the address of the last LMB in ibm,dynamic-memory property.
      
      Fixes: cd34206e ("powerpc: Add memory_hotplug_max()")
      Signed-off-by: default avatarBharata B Rao <bharata@linux.vnet.ibm.com>
      Reviewed-by: default avatarDavid Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      17cc521e
    • Paul Moore's avatar
      netlabel: add address family checks to netlbl_{sock,req}_delattr() · bc64d7bf
      Paul Moore authored
      commit 0e0e3677 upstream.
      
      It seems risky to always rely on the caller to ensure the socket's
      address family is correct before passing it to the NetLabel kAPI,
      especially since we see at least one LSM which didn't. Add address
      family checks to the *_delattr() functions to help prevent future
      problems.
      Reported-by: default avatarManinder Singh <maninder1.s@samsung.com>
      Signed-off-by: default avatarPaul Moore <paul@paul-moore.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      bc64d7bf
  2. 20 Oct, 2016 2 commits
    • Ben Hutchings's avatar
      Linux 3.2.83 · e0aed024
      Ben Hutchings authored
      e0aed024
    • Michal Hocko's avatar
      mm, gup: close FOLL MAP_PRIVATE race · 243f858d
      Michal Hocko authored
      commit 19be0eaf upstream.
      
      faultin_page drops FOLL_WRITE after the page fault handler did the CoW
      and then we retry follow_page_mask to get our CoWed page. This is racy,
      however because the page might have been unmapped by that time and so
      we would have to do a page fault again, this time without CoW. This
      would cause the page cache corruption for FOLL_FORCE on MAP_PRIVATE
      read only mappings with obvious consequences.
      
      This is an ancient bug that was actually already fixed once by Linus
      eleven years ago in commit 4ceb5db9 ("Fix get_user_pages() race
      for write access") but that was then undone due to problems on s390
      by commit f33ea7f4 ("fix get_user_pages bug") because s390 didn't
      have proper dirty pte tracking until abf09bed ("s390/mm: implement
      software dirty bits"). This wasn't a problem at the time as pointed out
      by Hugh Dickins because madvise relied on mmap_sem for write up until
      0a27a14a ("mm: madvise avoid exclusive mmap_sem") but since then we
      can race with madvise which can unmap the fresh COWed page or with KSM
      and corrupt the content of the shared page.
      
      This patch is based on the Linus' approach to not clear FOLL_WRITE after
      the CoW page fault (aka VM_FAULT_WRITE) but instead introduces FOLL_COW
      to note this fact. The flag is then rechecked during follow_pfn_pte to
      enforce the page fault again if we do not see the CoWed page. Linus was
      suggesting to check pte_dirty again as s390 is OK now. But that would
      make backporting to some old kernels harder. So instead let's just make
      sure that vm_normal_page sees a pure anonymous page.
      
      This would guarantee we are seeing a real CoW page. Introduce
      can_follow_write_pte which checks both pte_write and falls back to
      PageAnon on forced write faults which passed CoW already. Thanks to Hugh
      to point out that a special care has to be taken for KSM pages because
      our COWed page might have been merged with a KSM one and keep its
      PageAnon flag.
      
      Fixes: 0a27a14a ("mm: madvise avoid exclusive mmap_sem")
      Reported-by: default avatarPhil "not Paul" Oester <kernel@linuxace.com>
      Disclosed-by: default avatarAndy Lutomirski <luto@kernel.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarMichal Hocko <mhocko@suse.com>
      [bwh: Backported to 3.2:
       - Adjust filename, context, indentation
       - The 'no_page' exit path in follow_page() is different, so open-code the
         cleanup
       - Delete a now-unused label]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      243f858d
  3. 22 Aug, 2016 35 commits