1. 08 Mar, 2017 16 commits
  2. 03 Mar, 2017 4 commits
    • Stefan Bader's avatar
      UBUNTU: Ubuntu-4.4.0-66.87 · 05022128
      Stefan Bader authored
      Signed-off-by: default avatarStefan Bader <stefan.bader@canonical.com>
      05022128
    • Alexander Popov's avatar
      tty: n_hdlc: get rid of racy n_hdlc.tbuf · 579abe8d
      Alexander Popov authored
      Currently N_HDLC line discipline uses a self-made singly linked list for
      data buffers and has n_hdlc.tbuf pointer for buffer retransmitting after
      an error.
      
      The commit be10eb75
      ("tty: n_hdlc add buffer flushing") introduced racy access to n_hdlc.tbuf.
      After tx error concurrent flush_tx_queue() and n_hdlc_send_frames() can put
      one data buffer to tx_free_buf_list twice. That causes double free in
      n_hdlc_release().
      
      Let's use standard kernel linked list and get rid of n_hdlc.tbuf:
      in case of tx error put current data buffer after the head of tx_buf_list.
      Signed-off-by: default avatarAlexander Popov <alex.popov@linux.com>
      
      CVE-2017-2636
      Signed-off-by: default avatarStefan Bader <stefan.bader@canonical.com>
      579abe8d
    • Jiri Slaby's avatar
      TTY: n_hdlc, fix lockdep false positive · c1c6bc86
      Jiri Slaby authored
      The class of 4 n_hdls buf locks is the same because a single function
      n_hdlc_buf_list_init is used to init all the locks. But since
      flush_tx_queue takes n_hdlc->tx_buf_list.spinlock and then calls
      n_hdlc_buf_put which takes n_hdlc->tx_free_buf_list.spinlock, lockdep
      emits a warning:
      =============================================
      [ INFO: possible recursive locking detected ]
      4.3.0-25.g91e30a7-default #1 Not tainted
      ---------------------------------------------
      a.out/1248 is trying to acquire lock:
       (&(&list->spinlock)->rlock){......}, at: [<ffffffffa01fd020>] n_hdlc_buf_put+0x20/0x60 [n_hdlc]
      
      but task is already holding lock:
       (&(&list->spinlock)->rlock){......}, at: [<ffffffffa01fdc07>] n_hdlc_tty_ioctl+0x127/0x1d0 [n_hdlc]
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(&(&list->spinlock)->rlock);
        lock(&(&list->spinlock)->rlock);
      
       *** DEADLOCK ***
      
       May be due to missing lock nesting notation
      
      2 locks held by a.out/1248:
       #0:  (&tty->ldisc_sem){++++++}, at: [<ffffffff814c9eb0>] tty_ldisc_ref_wait+0x20/0x50
       #1:  (&(&list->spinlock)->rlock){......}, at: [<ffffffffa01fdc07>] n_hdlc_tty_ioctl+0x127/0x1d0 [n_hdlc]
      ...
      Call Trace:
      ...
       [<ffffffff81738fd0>] _raw_spin_lock_irqsave+0x50/0x70
       [<ffffffffa01fd020>] n_hdlc_buf_put+0x20/0x60 [n_hdlc]
       [<ffffffffa01fdc24>] n_hdlc_tty_ioctl+0x144/0x1d0 [n_hdlc]
       [<ffffffff814c25c1>] tty_ioctl+0x3f1/0xe40
      ...
      
      Fix it by initializing the spin_locks separately. This removes also
      reduntand memset of a freshly kzallocated space.
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      CVE-2017-2636
      
      (cherry-picked from e9b736d8 upstream)
      Signed-off-by: default avatarStefan Bader <stefan.bader@canonical.com>
      c1c6bc86
    • Stefan Bader's avatar
      UBUNTU: Start new release · 05eb5f2b
      Stefan Bader authored
      Ignore: yes
      Signed-off-by: default avatarStefan Bader <stefan.bader@canonical.com>
      05eb5f2b
  3. 20 Feb, 2017 3 commits
  4. 01 Feb, 2017 1 commit
  5. 31 Jan, 2017 5 commits
  6. 30 Jan, 2017 3 commits
  7. 26 Jan, 2017 8 commits
    • Michal Hocko's avatar
      mm, oom: prevent premature OOM killer invocation for high order request · 018a3c3c
      Michal Hocko authored
      BugLink: https://bugs.launchpad.net/bugs/1655842
      
      There have been several reports about pre-mature OOM killer invocation
      in 4.7 kernel when order-2 allocation request (for the kernel stack)
      invoked OOM killer even during basic workloads (light IO or even kernel
      compile on some filesystems).  In all reported cases the memory is
      fragmented and there are no order-2+ pages available.  There is usually
      a large amount of slab memory (usually dentries/inodes) and further
      debugging has shown that there are way too many unmovable blocks which
      are skipped during the compaction.  Multiple reporters have confirmed
      that the current linux-next which includes [1] and [2] helped and OOMs
      are not reproducible anymore.
      
      A simpler fix for the late rc and stable is to simply ignore the
      compaction feedback and retry as long as there is a reclaim progress and
      we are not getting OOM for order-0 pages.  We already do that for
      CONFING_COMPACTION=n so let's reuse the same code when compaction is
      enabled as well.
      
      [1] http://lkml.kernel.org/r/20160810091226.6709-1-vbabka@suse.cz
      [2] http://lkml.kernel.org/r/f7a9ea9d-bb88-bfd6-e340-3a933559305a@suse.cz
      
      Fixes: 0a0337e0 ("mm, oom: rework oom detection")
      Link: http://lkml.kernel.org/r/20160823074339.GB23577@dhcp22.suse.czSigned-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Tested-by: default avatarOlaf Hering <olaf@aepfle.de>
      Tested-by: default avatarRalf-Peter Rohbeck <Ralf-Peter.Rohbeck@quantum.com>
      Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
      Cc: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com>
      Cc: Ralf-Peter Rohbeck <Ralf-Peter.Rohbeck@quantum.com>
      Cc: Jiri Slaby <jslaby@suse.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Cc: David Rientjes <rientjes@google.com>
      Cc: <stable@vger.kernel.org>	[4.7.x]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      (cherry picked from commit 6b4e3181)
      Signed-off-by: default avatarThadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Acked-by: default avatarTim Gardner <tim.gardner@canonical.com>
      Acked-by: default avatarBrad Figg <brad.figg@canonical.com>
      018a3c3c
    • Michal Hocko's avatar
      mm, oom: protect !costly allocations some more for !CONFIG_COMPACTION · 3c2eabde
      Michal Hocko authored
      BugLink: https://bugs.launchpad.net/bugs/1655842
      
      Joonsoo has reported that he is able to trigger OOM for !costly high
      order requests (heavy fork() workload close the OOM) with the new oom
      detection rework.  This is because we rely only on should_reclaim_retry
      when the compaction is disabled and it only checks watermarks for the
      requested order and so we might trigger OOM when there is a lot of free
      memory.
      
      It is not very clear what are the usual workloads when the compaction is
      disabled.  Relying on high order allocations heavily without any
      mechanism to create those orders except for unbound amount of reclaim is
      certainly not a good idea.
      
      To prevent from potential regressions let's help this configuration
      some.  We have to sacrifice the determinsm though because there simply
      is none here possible.  should_compact_retry implementation for
      !CONFIG_COMPACTION, which was empty so far, will do watermark check for
      order-0 on all eligible zones.  This will cause retrying until either
      the reclaim cannot make any further progress or all the zones are
      depleted even for order-0 pages.  This means that the number of retries
      is basically unbounded for !costly orders but that was the case before
      the rework as well so this shouldn't regress.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Link: http://lkml.kernel.org/r/1463051677-29418-3-git-send-email-mhocko@kernel.orgReported-by: default avatarJoonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Acked-by: default avatarHillf Danton <hillf.zj@alibaba-inc.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      (backported from commit 31e49bfd)
      [cascardo: fixup ac_classzone_idx to ac->classzone_idx]
      Signed-off-by: default avatarThadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Acked-by: default avatarTim Gardner <tim.gardner@canonical.com>
      Acked-by: default avatarBrad Figg <brad.figg@canonical.com>
      3c2eabde
    • Michal Hocko's avatar
      mm, oom, compaction: prevent from should_compact_retry looping for ever for costly orders · 8f32a02e
      Michal Hocko authored
      BugLink: https://bugs.launchpad.net/bugs/1655842
      
      "mm: consider compaction feedback also for costly allocation" has
      removed the upper bound for the reclaim/compaction retries based on the
      number of reclaimed pages for costly orders.  While this is desirable
      the patch did miss a mis interaction between reclaim, compaction and the
      retry logic.  The direct reclaim tries to get zones over min watermark
      while compaction backs off and returns COMPACT_SKIPPED when all zones
      are below low watermark + 1<<order gap.  If we are getting really close
      to OOM then __compaction_suitable can keep returning COMPACT_SKIPPED a
      high order request (e.g.  hugetlb order-9) while the reclaim is not able
      to release enough pages to get us over low watermark.  The reclaim is
      still able to make some progress (usually trashing over few remaining
      pages) so we are not able to break out from the loop.
      
      I have seen this happening with the same test described in "mm: consider
      compaction feedback also for costly allocation" on a swapless system.
      The original problem got resolved by "vmscan: consider classzone_idx in
      compaction_ready" but it shows how things might go wrong when we
      approach the oom event horizont.
      
      The reason why compaction requires being over low rather than min
      watermark is not clear to me.  This check was there essentially since
      56de7263 ("mm: compaction: direct compact when a high-order
      allocation fails").  It is clearly an implementation detail though and
      we shouldn't pull it into the generic retry logic while we should be
      able to cope with such eventuality.  The only place in
      should_compact_retry where we retry without any upper bound is for
      compaction_withdrawn() case.
      
      Introduce compaction_zonelist_suitable function which checks the given
      zonelist and returns true only if there is at least one zone which would
      would unblock __compaction_suitable if more memory got reclaimed.  In
      this implementation it checks __compaction_suitable with NR_FREE_PAGES
      plus part of the reclaimable memory as the target for the watermark
      check.  The reclaimable memory is reduced linearly by the allocation
      order.  The idea is that we do not want to reclaim all the remaining
      memory for a single allocation request just unblock
      __compaction_suitable which doesn't guarantee we will make a further
      progress.
      
      The new helper is then used if compaction_withdrawn() feedback was
      provided so we do not retry if there is no outlook for a further
      progress.  !costly requests shouldn't be affected much - e.g.  order-2
      pages would require to have at least 64kB on the reclaimable LRUs while
      order-9 would need at least 32M which should be enough to not lock up.
      
      [vbabka@suse.cz: fix classzone_idx vs. high_zoneidx usage in compaction_zonelist_suitable]
      [akpm@linux-foundation.org: fix it for Mel's mm-page_alloc-remove-field-from-alloc_context.patch]
      Signed-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Acked-by: default avatarHillf Danton <hillf.zj@alibaba-inc.com>
      Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      (backported from commit 86a294a8)
      [cascardo: fixup ac_classzone_idx to ac->classzone_idx]
      Signed-off-by: default avatarThadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Acked-by: default avatarTim Gardner <tim.gardner@canonical.com>
      Acked-by: default avatarBrad Figg <brad.figg@canonical.com>
      8f32a02e
    • Michal Hocko's avatar
      mm: consider compaction feedback also for costly allocation · 51760132
      Michal Hocko authored
      BugLink: https://bugs.launchpad.net/bugs/1655842
      
      PAGE_ALLOC_COSTLY_ORDER retry logic is mostly handled inside
      should_reclaim_retry currently where we decide to not retry after at
      least order worth of pages were reclaimed or the watermark check for at
      least one zone would succeed after reclaiming all pages if the reclaim
      hasn't made any progress.  Compaction feedback is mostly ignored and we
      just try to make sure that the compaction did at least something before
      giving up.
      
      The first condition was added by a41f24ea ("page allocator: smarter
      retry of costly-order allocations) and it assumed that lumpy reclaim
      could have created a page of the sufficient order.  Lumpy reclaim, has
      been removed quite some time ago so the assumption doesn't hold anymore.
      Remove the check for the number of reclaimed pages and rely on the
      compaction feedback solely.  should_reclaim_retry now only makes sure
      that we keep retrying reclaim for high order pages only if they are
      hidden by watermaks so order-0 reclaim makes really sense.
      
      should_compact_retry now keeps retrying even for the costly allocations.
      The number of retries is reduced wrt.  !costly requests because they are
      less important and harder to grant and so their pressure shouldn't cause
      contention for other requests or cause an over reclaim.  We also do not
      reset no_progress_loops for costly request to make sure we do not keep
      reclaiming too agressively.
      
      This has been tested by running a process which fragments memory:
      	- compact memory
      	- mmap large portion of the memory (1920M on 2GRAM machine with 2G
      	  of swapspace)
      	- MADV_DONTNEED single page in PAGE_SIZE*((1UL<<MAX_ORDER)-1)
      	  steps until certain amount of memory is freed (250M in my test)
      	  and reduce the step to (step / 2) + 1 after reaching the end of
      	  the mapping
      	- then run a script which populates the page cache 2G (MemTotal)
      	  from /dev/zero to a new file
      And then tries to allocate
      nr_hugepages=$(awk '/MemAvailable/{printf "%d\n", $2/(2*1024)}' /proc/meminfo)
      huge pages.
      
      root@test1:~# echo 1 > /proc/sys/vm/overcommit_memory;echo 1 > /proc/sys/vm/compact_memory; ./fragment-mem-and-run /root/alloc_hugepages.sh 1920M 250M
      Node 0, zone      DMA     31     28     31     10      2      0      2      1      2      3      1
      Node 0, zone    DMA32    437    319    171     50     28     25     20     16     16     14    437
      
      * This is the /proc/buddyinfo after the compaction
      
      Done fragmenting. size=2013265920 freed=262144000
      Node 0, zone      DMA    165     48      3      1      2      0      2      2      2      2      0
      Node 0, zone    DMA32  35109  14575    185     51     41     12      6      0      0      0      0
      
      * /proc/buddyinfo after memory got fragmented
      
      Executing "/root/alloc_hugepages.sh"
      Eating some pagecache
      508623+0 records in
      508623+0 records out
      2083319808 bytes (2.1 GB) copied, 11.7292 s, 178 MB/s
      Node 0, zone      DMA      3      5      3      1      2      0      2      2      2      2      0
      Node 0, zone    DMA32    111    344    153     20     24     10      3      0      0      0      0
      
      * /proc/buddyinfo after page cache got eaten
      
      Trying to allocate 129
      129
      
      * 129 hugepages requested and all of them granted.
      
      Node 0, zone      DMA      3      5      3      1      2      0      2      2      2      2      0
      Node 0, zone    DMA32    127     97     30     99     11      6      2      1      4      0      0
      
      * /proc/buddyinfo after hugetlb allocation.
      
      10 runs will behave as follows:
      Trying to allocate 130
      130
      --
      Trying to allocate 129
      129
      --
      Trying to allocate 128
      128
      --
      Trying to allocate 129
      129
      --
      Trying to allocate 128
      128
      --
      Trying to allocate 129
      129
      --
      Trying to allocate 132
      132
      --
      Trying to allocate 129
      129
      --
      Trying to allocate 128
      128
      --
      Trying to allocate 129
      129
      
      So basically 100% success for all 10 attempts.
      Without the patch numbers looked much worse:
      Trying to allocate 128
      12
      --
      Trying to allocate 129
      14
      --
      Trying to allocate 129
      7
      --
      Trying to allocate 129
      16
      --
      Trying to allocate 129
      30
      --
      Trying to allocate 129
      38
      --
      Trying to allocate 129
      19
      --
      Trying to allocate 129
      37
      --
      Trying to allocate 129
      28
      --
      Trying to allocate 129
      37
      
      Just for completness the base kernel without oom detection rework looks
      as follows:
      Trying to allocate 127
      30
      --
      Trying to allocate 129
      12
      --
      Trying to allocate 129
      52
      --
      Trying to allocate 128
      32
      --
      Trying to allocate 129
      12
      --
      Trying to allocate 129
      10
      --
      Trying to allocate 129
      32
      --
      Trying to allocate 128
      14
      --
      Trying to allocate 128
      16
      --
      Trying to allocate 129
      8
      
      As we can see the success rate is much more volatile and smaller without
      this patch. So the patch not only makes the retry logic for costly
      requests more sensible the success rate is even higher.
      Signed-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Acked-by: default avatarHillf Danton <hillf.zj@alibaba-inc.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      (cherry picked from commit 7854ea6c)
      Signed-off-by: default avatarThadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Acked-by: default avatarTim Gardner <tim.gardner@canonical.com>
      Acked-by: default avatarBrad Figg <brad.figg@canonical.com>
      51760132
    • Michal Hocko's avatar
      mm, oom: protect !costly allocations some more · 989e78e9
      Michal Hocko authored
      BugLink: https://bugs.launchpad.net/bugs/1655842
      
      should_reclaim_retry will give up retries for higher order allocations
      if none of the eligible zones has any requested or higher order pages
      available even if we pass the watermak check for order-0.  This is done
      because there is no guarantee that the reclaimable and currently free
      pages will form the required order.
      
      This can, however, lead to situations where the high-order request (e.g.
      order-2 required for the stack allocation during fork) will trigger OOM
      too early - e.g.  after the first reclaim/compaction round.  Such a
      system would have to be highly fragmented and there is no guarantee
      further reclaim/compaction attempts would help but at least make sure
      that the compaction was active before we go OOM and keep retrying even
      if should_reclaim_retry tells us to oom if
      
      	- the last compaction round backed off or
      	- we haven't completed at least MAX_COMPACT_RETRIES active
      	  compaction rounds.
      
      The first rule ensures that the very last attempt for compaction was not
      ignored while the second guarantees that the compaction has done some
      work.  Multiple retries might be needed to prevent occasional pigggy
      backing of other contexts to steal the compacted pages before the
      current context manages to retry to allocate them.
      
      compaction_failed() is taken as a final word from the compaction that
      the retry doesn't make much sense.  We have to be careful though because
      the first compaction round is MIGRATE_ASYNC which is rather weak as it
      ignores pages under writeback and gives up too easily in other
      situations.  We therefore have to make sure that MIGRATE_SYNC_LIGHT mode
      has been used before we give up.  With this logic in place we do not
      have to increase the migration mode unconditionally and rather do it
      only if the compaction failed for the weaker mode.  A nice side effect
      is that the stronger migration mode is used only when really needed so
      this has a potential of smaller latencies in some cases.
      
      Please note that the compaction doesn't tell us much about how
      successful it was when returning compaction_made_progress so we just
      have to blindly trust that another retry is worthwhile and cap the
      number to something reasonable to guarantee a convergence.
      
      If the given number of successful retries is not sufficient for a
      reasonable workloads we should focus on the collected compaction
      tracepoints data and try to address the issue in the compaction code.
      If this is not feasible we can increase the retries limit.
      
      [mhocko@suse.com: fix warning]
        Link: http://lkml.kernel.org/r/20160512061636.GA4200@dhcp22.suse.czSigned-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Acked-by: default avatarHillf Danton <hillf.zj@alibaba-inc.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      (cherry picked from commit 33c2d214)
      Signed-off-by: default avatarThadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Acked-by: default avatarTim Gardner <tim.gardner@canonical.com>
      Acked-by: default avatarBrad Figg <brad.figg@canonical.com>
      989e78e9
    • Michal Hocko's avatar
      mm, compaction: abstract compaction feedback to helpers · bca70cc5
      Michal Hocko authored
      BugLink: https://bugs.launchpad.net/bugs/1655842
      
      Compaction can provide a wild variation of feedback to the caller.  Many
      of them are implementation specific and the caller of the compaction
      (especially the page allocator) shouldn't be bound to specifics of the
      current implementation.
      
      This patch abstracts the feedback into three basic types:
      	- compaction_made_progress - compaction was active and made some
      	  progress.
      	- compaction_failed - compaction failed and further attempts to
      	  invoke it would most probably fail and therefore it is not
      	  worth retrying
      	- compaction_withdrawn - compaction wasn't invoked for an
                implementation specific reasons. In the current implementation
                it means that the compaction was deferred, contended or the
                page scanners met too early without any progress. Retrying is
                still worthwhile.
      
      [vbabka@suse.cz: do not change thp back off behavior]
      [akpm@linux-foundation.org: fix typo in comment, per Hillf]
      Signed-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Acked-by: default avatarHillf Danton <hillf.zj@alibaba-inc.com>
      Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      (backported from commit cab1802b)
      [cascardo: fixup as we don't have kcompactd]
      Signed-off-by: default avatarThadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Acked-by: default avatarTim Gardner <tim.gardner@canonical.com>
      Acked-by: default avatarBrad Figg <brad.figg@canonical.com>
      bca70cc5
    • Michal Hocko's avatar
      mm, compaction: distinguish between full and partial COMPACT_COMPLETE · ff732276
      Michal Hocko authored
      BugLink: https://bugs.launchpad.net/bugs/1655842
      
      COMPACT_COMPLETE now means that compaction and free scanner met.  This
      is not very useful information if somebody just wants to use this
      feedback and make any decisions based on that.  The current caller might
      be a poor guy who just happened to scan tiny portion of the zone and
      that could be the reason no suitable pages were compacted.  Make sure we
      distinguish the full and partial zone walks.
      
      Consumers should treat COMPACT_PARTIAL_SKIPPED as a potential success
      and be optimistic in retrying.
      
      The existing users of COMPACT_COMPLETE are conservatively changed to use
      COMPACT_PARTIAL_SKIPPED as well but some of them should be probably
      reconsidered and only defer the compaction only for COMPACT_COMPLETE
      with the new semantic.
      
      This patch shouldn't introduce any functional changes.
      Signed-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Acked-by: default avatarHillf Danton <hillf.zj@alibaba-inc.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      (cherry picked from commit c8f7de0b)
      Signed-off-by: default avatarThadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Acked-by: default avatarTim Gardner <tim.gardner@canonical.com>
      Acked-by: default avatarBrad Figg <brad.figg@canonical.com>
      ff732276
    • Michal Hocko's avatar
      mm, compaction: simplify __alloc_pages_direct_compact feedback interface · 5d042d3c
      Michal Hocko authored
      BugLink: https://bugs.launchpad.net/bugs/1655842
      
      __alloc_pages_direct_compact communicates potential back off by two
      variables:
      	- deferred_compaction tells that the compaction returned
      	  COMPACT_DEFERRED
      	- contended_compaction is set when there is a contention on
      	  zone->lock resp. zone->lru_lock locks
      
      __alloc_pages_slowpath then backs of for THP allocation requests to
      prevent from long stalls. This is rather messy and it would be much
      cleaner to return a single compact result value and hide all the nasty
      details into __alloc_pages_direct_compact.
      
      This patch shouldn't introduce any functional changes.
      Signed-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Acked-by: default avatarHillf Danton <hillf.zj@alibaba-inc.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      (backported from commit c5d01d0d)
      [cascardo: small cleanup as we backported 0a0337e0 before]
      Signed-off-by: default avatarThadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Acked-by: default avatarTim Gardner <tim.gardner@canonical.com>
      Acked-by: default avatarBrad Figg <brad.figg@canonical.com>
      5d042d3c