  1. 24 Feb, 2024 2 commits
  2. 05 Jan, 2024 1 commit
  3. 18 Oct, 2023 1 commit
  4. 04 Oct, 2023 1 commit
      zsmalloc: dynamically allocate the mm-zspool shrinker · c19b548b
      Qi Zheng authored
      In preparation for implementing lockless slab shrink, use the new APIs to
      dynamically allocate the mm-zspool shrinker so that it can be freed
      asynchronously via RCU.  Releasing the struct zs_pool then no longer has
      to wait for an RCU read-side critical section.
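
      Schematically, the new allocation/registration path looks as follows.
      This is a minimal sketch assuming the shrinker_alloc()/shrinker_register()/
      shrinker_free() APIs introduced by this series; the function name and the
      zs_pool fields shown are illustrative rather than a verbatim diff:

        #include <linux/shrinker.h>

        /* Illustrative: pool->shrinker is now a struct shrinker pointer and the
         * count/scan callbacks get the pool from shrinker->private_data. */
        static int zs_register_shrinker(struct zs_pool *pool)
        {
                pool->shrinker = shrinker_alloc(0, "mm-zspool:%s", pool->name);
                if (!pool->shrinker)
                        return -ENOMEM;

                pool->shrinker->count_objects = zs_shrinker_count;
                pool->shrinker->scan_objects = zs_shrinker_scan;
                pool->shrinker->private_data = pool;

                shrinker_register(pool->shrinker);
                return 0;
        }

        static void zs_unregister_shrinker(struct zs_pool *pool)
        {
                /* Unregisters and frees the shrinker; with lockless shrink the
                 * final free happens after an RCU grace period, so destroying
                 * the zs_pool does not have to wait for in-flight readers. */
                shrinker_free(pool->shrinker);
        }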
      
      Link: https://lkml.kernel.org/r/20230911094444.68966-38-zhengqi.arch@bytedance.com
      
      Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
      Reviewed-by: Muchun Song <songmuchun@bytedance.com>
      Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Abhinav Kumar <quic_abhinavk@quicinc.com>
      Cc: Alasdair Kergon <agk@redhat.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: Andreas Gruenbacher <agruenba@redhat.com>
      Cc: Anna Schumaker <anna@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Bob Peterson <rpeterso@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Carlos Llamas <cmllamas@google.com>
      Cc: Chandan Babu R <chandan.babu@oracle.com>
      Cc: Chao Yu <chao@kernel.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: Christian Koenig <christian.koenig@amd.com>
      Cc: Chuck Lever <cel@kernel.org>
      Cc: Coly Li <colyli@suse.de>
      Cc: Dai Ngo <Dai.Ngo@oracle.com>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: "Darrick J. Wong" <djwong@kernel.org>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Airlie <airlied@gmail.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: David Sterba <dsterba@suse.com>
      Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
      Cc: Gao Xiang <hsiangkao@linux.alibaba.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Huang Rui <ray.huang@amd.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jaegeuk Kim <jaegeuk@kernel.org>
      Cc: Jani Nikula <jani.nikula@linux.intel.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jason Wang <jasowang@redhat.com>
      Cc: Jeff Layton <jlayton@kernel.org>
      Cc: Jeffle Xu <jefflexu@linux.alibaba.com>
      Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Josef Bacik <josef@toxicpanda.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kent Overstreet <kent.overstreet@gmail.com>
      Cc: Kirill Tkhai <tkhai@ya.ru>
      Cc: Marijn Suijten <marijn.suijten@somainline.org>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Mike Snitzer <snitzer@kernel.org>
      Cc: Muchun Song <muchun.song@linux.dev>
      Cc: Nadav Amit <namit@vmware.com>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
      Cc: Olga Kornievskaia <kolga@netapp.com>
      Cc: Paul E. McKenney <paulmck@kernel.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rob Clark <robdclark@gmail.com>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Sean Paul <sean@poorly.run>
      Cc: Song Liu <song@kernel.org>
      Cc: Stefano Stabellini <sstabellini@kernel.org>
      Cc: Steven Price <steven.price@arm.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
      Cc: Tom Talpey <tom@talpey.com>
      Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
      Cc: Yue Hu <huyue2@coolpad.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  5. 18 Aug, 2023 4 commits
  6. 04 Aug, 2023 1 commit
  7. 19 Jun, 2023 2 commits
  8. 09 Jun, 2023 1 commit
  9. 17 May, 2023 1 commit
      zsmalloc: move LRU update from zs_map_object() to zs_malloc() · d461aac9
      Nhat Pham authored
      Under memory pressure, we sometimes observe the following crash:
      
      [ 5694.832838] ------------[ cut here ]------------
      [ 5694.842093] list_del corruption, ffff888014b6a448->next is LIST_POISON1 (dead000000000100)
      [ 5694.858677] WARNING: CPU: 33 PID: 418824 at lib/list_debug.c:47 __list_del_entry_valid+0x42/0x80
      [ 5694.961820] CPU: 33 PID: 418824 Comm: fuse_counters.s Kdump: loaded Tainted: G S                5.19.0-0_fbk3_rc3_hoangnhatpzsdynshrv41_10870_g85a9558a25de #1
      [ 5694.990194] Hardware name: Wiwynn Twin Lakes MP/Twin Lakes Passive MP, BIOS YMM16 05/24/2021
      [ 5695.007072] RIP: 0010:__list_del_entry_valid+0x42/0x80
      [ 5695.017351] Code: 08 48 83 c2 22 48 39 d0 74 24 48 8b 10 48 39 f2 75 2c 48 8b 51 08 b0 01 48 39 f2 75 34 c3 48 c7 c7 55 d7 78 82 e8 4e 45 3b 00 <0f> 0b eb 31 48 c7 c7 27 a8 70 82 e8 3e 45 3b 00 0f 0b eb 21 48 c7
      [ 5695.054919] RSP: 0018:ffffc90027aef4f0 EFLAGS: 00010246
      [ 5695.065366] RAX: 41fe484987275300 RBX: ffff888008988180 RCX: 0000000000000000
      [ 5695.079636] RDX: ffff88886006c280 RSI: ffff888860060480 RDI: ffff888860060480
      [ 5695.093904] RBP: 0000000000000002 R08: 0000000000000000 R09: ffffc90027aef370
      [ 5695.108175] R10: 0000000000000000 R11: ffffffff82fdf1c0 R12: 0000000010000002
      [ 5695.122447] R13: ffff888014b6a448 R14: ffff888014b6a420 R15: 00000000138dc240
      [ 5695.136717] FS:  00007f23a7d3f740(0000) GS:ffff888860040000(0000) knlGS:0000000000000000
      [ 5695.152899] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 5695.164388] CR2: 0000560ceaab6ac0 CR3: 000000001c06c001 CR4: 00000000007706e0
      [ 5695.178659] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 5695.192927] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 5695.207197] PKRU: 55555554
      [ 5695.212602] Call Trace:
      [ 5695.217486]  <TASK>
      [ 5695.221674]  zs_map_object+0x91/0x270
      [ 5695.229000]  zswap_frontswap_store+0x33d/0x870
      [ 5695.237885]  ? do_raw_spin_lock+0x5d/0xa0
      [ 5695.245899]  __frontswap_store+0x51/0xb0
      [ 5695.253742]  swap_writepage+0x3c/0x60
      [ 5695.261063]  shrink_page_list+0x738/0x1230
      [ 5695.269255]  shrink_lruvec+0x5ec/0xcd0
      [ 5695.276749]  ? shrink_slab+0x187/0x5f0
      [ 5695.284240]  ? mem_cgroup_iter+0x6e/0x120
      [ 5695.292255]  shrink_node+0x293/0x7b0
      [ 5695.299402]  do_try_to_free_pages+0xea/0x550
      [ 5695.307940]  try_to_free_pages+0x19a/0x490
      [ 5695.316126]  __folio_alloc+0x19ff/0x3e40
      [ 5695.323971]  ? __filemap_get_folio+0x8a/0x4e0
      [ 5695.332681]  ? walk_component+0x2a8/0xb50
      [ 5695.340697]  ? generic_permission+0xda/0x2a0
      [ 5695.349231]  ? __filemap_get_folio+0x8a/0x4e0
      [ 5695.357940]  ? walk_component+0x2a8/0xb50
      [ 5695.365955]  vma_alloc_folio+0x10e/0x570
      [ 5695.373796]  ? walk_component+0x52/0xb50
      [ 5695.381634]  wp_page_copy+0x38c/0xc10
      [ 5695.388953]  ? filename_lookup+0x378/0xbc0
      [ 5695.397140]  handle_mm_fault+0x87f/0x1800
      [ 5695.405157]  do_user_addr_fault+0x1bd/0x570
      [ 5695.413520]  exc_page_fault+0x5d/0x110
      [ 5695.421017]  asm_exc_page_fault+0x22/0x30
      
      After some investigation, I have found the following issue: unlike other
      zswap backends, zsmalloc performs the LRU list update at the object
      mapping time, rather than when the slot for the object is allocated.
      This deviation was discussed and agreed upon during the review process
      of the zsmalloc writeback patch series:
      
      https://lore.kernel.org/lkml/Y3flcAXNxxrvy3ZH@cmpxchg.org/
      
      Unfortunately, this introduces a subtle bug that occurs when there is a
      concurrent store and reclaim, which interleave as follows:
      
      zswap_frontswap_store()            shrink_worker()
        zs_malloc()                        zs_zpool_shrink()
          spin_lock(&pool->lock)             zs_reclaim_page()
          zspage = find_get_zspage()
          spin_unlock(&pool->lock)
                                               spin_lock(&pool->lock)
                                               zspage = list_first_entry(&pool->lru)
                                               list_del(&zspage->lru)
                                                 zspage->lru.next = LIST_POISON1
                                                 zspage->lru.prev = LIST_POISON2
                                               spin_unlock(&pool->lock)
        zs_map_object()
          spin_lock(&pool->lock)
          if (!list_empty(&zspage->lru))
            list_del(&zspage->lru)
              CHECK_DATA_CORRUPTION(next == LIST_POISON1) /* BOOM */
      
      With the current upstream code, this issue rarely happens. zswap only
      triggers writeback when the pool is already full, at which point all
      further store attempts are short-circuited. This creates an implicit
      pseudo-serialization between reclaim and store. I am working on a new
      zswap shrinking mechanism, which makes interleaving reclaim and store
      more likely, exposing this bug.
      
      zbud and z3fold do not have this problem, because they perform the LRU
      list update in the alloc function, while still holding the pool's lock.
      This patch fixes the aforementioned bug by moving the LRU update back to
      zs_malloc(), analogous to zbud and z3fold.
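
      A hedged sketch of the resulting shape (the helper name is hypothetical;
      the point is that the LRU update now runs while zs_malloc() still holds
      pool->lock, the same lock zs_reclaim_page() holds around its list_del()):

        #ifdef CONFIG_ZPOOL
        /* Called from zs_malloc() with pool->lock held, instead of from
         * zs_map_object().  A concurrent zs_reclaim_page() therefore can no
         * longer hand us a zspage whose ->lru pointers are already poisoned. */
        static void zs_lru_touch(struct zs_pool *pool, struct zspage *zspage)
        {
                lockdep_assert_held(&pool->lock);

                /*
                 * Reclaim pulls victims from the head of pool->lru (see the
                 * list_first_entry() in the interleaving above), so zspages
                 * that just received an allocation move to the tail.
                 */
                if (list_empty(&zspage->lru))
                        list_add_tail(&zspage->lru, &pool->lru);
                else
                        list_move_tail(&zspage->lru, &pool->lru);
        }
        #endif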
      
      Link: https://lkml.kernel.org/r/20230505185054.2417128-1-nphamcs@gmail.com
      Fixes: 64f768c6 ("zsmalloc: add a LRU to zs_pool to keep track of zspages in LRU order")
      Signed-off-by: Nhat Pham <nphamcs@gmail.com>
      Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Seth Jennings <sjenning@redhat.com>
      Cc: Vitaly Wool <vitaly.wool@konsulko.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  10. 21 Apr, 2023 1 commit
  11. 18 Apr, 2023 1 commit
  12. 28 Mar, 2023 4 commits
      zsmalloc: show per fullness group class stats · e1807d5d
      Sergey Senozhatsky authored
      We keep the old fullness (3/4 threshold) reporting in
      zs_stats_size_show().  Switch from almost-full/almost-empty stats to
      fine-grained per-inuse-ratio (fullness group) reporting, which gives
      significantly more data on class fragmentation.
      
      Link: https://lkml.kernel.org/r/20230304034835.2082479-5-senozhatsky@chromium.org
      
      Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Cc: Yosry Ahmed <yosryahmed@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      zsmalloc: rework compaction algorithm · 5a845e9f
      Sergey Senozhatsky authored
      The zsmalloc compaction algorithm has the potential to waste some CPU
      cycles, particularly when compacting pages within the same fullness group.
      This is due to the way it selects the head page of the fullness list for
      source and destination pages, and how it reinserts those pages during each
      iteration.  The algorithm may first use a page as a migration destination
      and then as a migration source, leading to an unnecessary back-and-forth
      movement of objects.
      
      Consider the following fullness list:
      
      PageA PageB PageC PageD PageE
      
      During the first iteration, the compaction algorithm will select PageA as
      the source and PageB as the destination.  All of PageA's objects will be
      moved to PageB, and then PageA will be released while PageB is reinserted
      into the fullness list.
      
      PageB PageC PageD PageE
      
      During the next iteration, the compaction algorithm will again select the
      head of the list as the source and destination, meaning that PageB will
      now serve as the source and PageC as the destination.  This will result in
      the objects being moved away from PageB, the same objects that were just
      moved to PageB in the previous iteration.
      
      To prevent this avalanche effect, the compaction algorithm should not
      reinsert the destination page between iterations.  By doing so, the most
      optimal page will continue to be used and its usage ratio will increase,
      reducing internal fragmentation.  The destination page should only be
      reinserted into the fullness list if:
      - It becomes full
      - No source page is available.
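
      Schematically, the reworked per-class loop keeps reusing one destination
      zspage and only puts it back when it fills up or when no source is left
      (helper names and signatures here are illustrative, not the literal patch):

        static unsigned long compact_size_class(struct zs_pool *pool,
                                                struct size_class *class)
        {
                struct zspage *src, *dst = NULL;
                unsigned long pages_freed = 0;

                while (zs_can_compact(class)) {
                        if (!dst) {
                                dst = isolate_dst_zspage(class); /* highest inuse ratio */
                                if (!dst)
                                        break;
                        }

                        src = isolate_src_zspage(class);         /* lowest inuse ratio */
                        if (!src)
                                break;

                        migrate_zspage(pool, src, dst);          /* memcpy() objects */

                        if (zspage_empty(src)) {
                                free_zspage(pool, class, src);
                                pages_freed += class->pages_per_zspage;
                        } else {
                                putback_zspage(class, src);
                        }

                        /* Reinsert the destination only once it is full, so the
                         * next iteration keeps filling the same zspage. */
                        if (zspage_full(class, dst)) {
                                putback_zspage(class, dst);
                                dst = NULL;
                        }
                }

                if (dst)                                         /* no source left */
                        putback_zspage(class, dst);

                return pages_freed;
        }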
      
      TEST
      ====
      
      It's very challenging to reliably test this series.  I ended up developing
      my own synthetic test that has 100% reproducibility.  The test generates
      significant fragmentation (for each size class) and then performs
      compaction for each class individually, tracking the number of memcpy()
      calls in zs_object_copy() so that we can compare the amount of work
      compaction does on a per-class basis.
      
      Total amount of work (zram mm_stat objs_moved)
      ----------------------------------------------
      
      Old fullness grouping, old compaction algorithm:
      323977 memcpy() in zs_object_copy().
      
      Old fullness grouping, new compaction algorithm:
      262944 memcpy() in zs_object_copy().
      
      New fullness grouping, new compaction algorithm:
      213978 memcpy() in zs_object_copy().
      
      Per-class compaction memcpy() comparison (T-test)
      -------------------------------------------------
      
      x Old fullness grouping, old compaction algorithm
      + Old fullness grouping, new compaction algorithm
      
          N           Min           Max        Median           Avg        Stddev
      x 140           349          3513          2461     2314.1214     806.03271
      + 140           289          2778          2006     1878.1714     641.02073
      Difference at 95.0% confidence
              -435.95 +/- 170.595
              -18.8387% +/- 7.37193%
              (Student's t, pooled s = 728.216)
      
      x Old fullness grouping, old compaction algorithm
      + New fullness grouping, new compaction algorithm
      
          N           Min           Max        Median           Avg        Stddev
      x 140           349          3513          2461     2314.1214     806.03271
      + 140           226          2279          1644     1528.4143     524.85268
      Difference at 95.0% confidence
              -785.707 +/- 159.331
              -33.9527% +/- 6.88516%
              (Student's t, pooled s = 680.132)
      
      Link: https://lkml.kernel.org/r/20230304034835.2082479-4-senozhatsky@chromium.org
      
      Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Cc: Yosry Ahmed <yosryahmed@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      zsmalloc: fine-grained inuse ratio based fullness grouping · 4c7ac972
      Sergey Senozhatsky authored
      Each zspage maintains ->inuse counter which keeps track of the number of
      objects stored in the zspage.  The ->inuse counter also determines the
      zspage's "fullness group" which is calculated as the ratio of the "inuse"
      objects to the total number of objects the zspage can hold
      (objs_per_zspage).  The closer the ->inuse counter is to objs_per_zspage,
      the better.
      
      Each size class maintains several fullness lists, that keep track of
      zspages of particular "fullness".  Pages within each fullness list are
      stored in random order with regard to the ->inuse counter.  This is
      because sorting the zspages by ->inuse counter each time obj_malloc() or
      obj_free() is called would be too expensive.  However, the ->inuse counter
      is still a crucial factor in many situations.
      
      For the two major zsmalloc operations, zs_malloc() and zs_compact(), we
      typically select the head zspage from the corresponding fullness list as
      the best candidate zspage.  However, this assumption is not always
      accurate.
      
      For the zs_malloc() operation, the optimal candidate zspage should have
      the highest ->inuse counter.  This is because the goal is to maximize the
      number of ZS_FULL zspages and make full use of all allocated memory.
      
      For the zs_compact() operation, the optimal source zspage should have the
      lowest ->inuse counter.  This is because compaction needs to move objects
      in use to another page before it can release the zspage and return its
      physical pages to the buddy allocator.  The fewer objects in use, the
      quicker compaction can release the zspage.  Additionally, compaction is
      measured by the number of pages it releases.
      
      This patch reworks the fullness grouping mechanism.  Instead of having two
      groups - ZS_ALMOST_EMPTY (usage ratio below 3/4) and ZS_ALMOST_FULL (usage
      ratio above 3/4) - that result in too many zspages being included in the
      ALMOST_EMPTY group for specific classes, size classes maintain a larger
      number of fullness lists that give strict guarantees on the minimum and
      maximum ->inuse values within each group.  Each group represents a 10%
      change in the ->inuse ratio compared to neighboring groups.  In essence,
      there are groups for zspages with 0%, 10%, 20% usage ratios, and so on, up
      to 100%.
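
      A sketch of how the group can be derived from ->inuse (illustrative; it
      assumes consecutive list indexes ZS_INUSE_RATIO_0, _10, ... _90, _99,
      _100, where the 99% group catches not-quite-full zspages as in the
      printout below):

        static int get_fullness_group(struct size_class *class,
                                      struct zspage *zspage)
        {
                int inuse = get_zspage_inuse(zspage);
                int objs_per_zspage = class->objs_per_zspage;

                if (inuse == 0)
                        return ZS_INUSE_RATIO_0;
                if (inuse == objs_per_zspage)
                        return ZS_INUSE_RATIO_100;

                /*
                 * Round up so that a barely used zspage still lands in the
                 * "10%" group rather than in the empty one.  E.g. in the
                 * class-768 printout below, 8 or 9 objects out of 16 is a
                 * 50-56% ratio and therefore the "60%" group, while a 90-99%
                 * ratio maps to the dedicated "99%" group.
                 */
                return (100 * inuse / objs_per_zspage) / 10 + 1;
        }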
      
      This enhances the selection of candidate zspages for both zs_malloc() and
      zs_compact().  A printout of the ->inuse counters of the first 7 zspages
      per (random) class fullness group:
      
       class-768 objs_per_zspage 16:
         fullness 100%:  empty
         fullness  99%:  empty
         fullness  90%:  empty
         fullness  80%:  empty
         fullness  70%:  empty
         fullness  60%:  8  8  9  9  8  8  8
         fullness  50%:  empty
         fullness  40%:  5  5  6  5  5  5  5
         fullness  30%:  4  4  4  4  4  4  4
         fullness  20%:  2  3  2  3  3  2  2
         fullness  10%:  1  1  1  1  1  1  1
         fullness   0%:  empty
      
      The zs_malloc() function searches through the groups of pages starting
      with the one having the highest usage ratio.  This means that it always
      selects a zspage from the group with the least internal fragmentation
      (highest usage ratio) and makes it even less fragmented by increasing its
      usage ratio.
      
      The zs_compact() function, on the other hand, begins by scanning the group
      with the highest fragmentation (lowest usage ratio) to locate the source
      page.  The first available zspage is selected, and then the function moves
      downward to find a destination zspage in the group with the lowest
      internal fragmentation (highest usage ratio).
      
      Link: https://lkml.kernel.org/r/20230304034835.2082479-3-senozhatsky@chromium.org
      
      Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Cc: Yosry Ahmed <yosryahmed@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      zsmalloc: remove insert_zspage() ->inuse optimization · a40a71e8
      Sergey Senozhatsky authored
      Patch series "zsmalloc: fine-grained fullness and new compaction
      algorithm", v4.
      
      Existing zsmalloc page fullness grouping leads to suboptimal page
      selection for both zs_malloc() and zs_compact().  This patchset reworks
      zsmalloc fullness grouping/classification.
      
      Additionally, it implements a new compaction algorithm that is expected
      to use fewer CPU cycles (as it potentially does fewer memcpy() calls in
      zs_object_copy()).
      
      Test (synthetic) results can be seen in patch 0003.
      
      
      This patch (of 4):
      
      This optimization has no effect.  It only ensures that when a zspage was
      added to its corresponding fullness list, its "inuse" counter was higher
      or lower than the "inuse" counter of the zspage at the head of the list. 
      The intention was to keep busy zspages at the head, so they could be
      filled up and moved to the ZS_FULL fullness group more quickly.  However,
      this doesn't work as the "inuse" counter of a zspage can be modified by
      obj_free() but the zspage may still belong to the same fullness list.  So,
      fix_fullness_group() won't change the zspage's position in relation to the
      head's "inuse" counter, leading to a largely random order of zspages
      within the fullness list.
      
      For instance, consider a printout of the "inuse" counters of the first 10
      zspages in a class that holds 93 objects per zspage:
      
       ZS_ALMOST_EMPTY:  36  67  68  64  35  54  63  52
      
      As we can see the zspage with the lowest "inuse" counter
      is actually the head of the fullness list.
      
      Remove this pointless "optimisation".
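
      For reference, the dropped logic had roughly the following shape (a hedged
      reconstruction, not a verbatim diff); after this patch insert_zspage()
      simply adds the zspage at the head of the fullness list:

        static void insert_zspage(struct size_class *class, struct zspage *zspage,
                                  int fullness)
        {
                struct zspage *head;

                class_stat_inc(class, fullness, 1);

                /* Removed: order by ->inuse relative to the current head. */
                head = list_first_entry_or_null(&class->fullness_list[fullness],
                                                struct zspage, list);
                if (head && get_zspage_inuse(zspage) < get_zspage_inuse(head))
                        list_add_tail(&zspage->list, &head->list);
                else
                        list_add(&zspage->list, &class->fullness_list[fullness]);
        }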
      
      Link: https://lkml.kernel.org/r/20230304034835.2082479-1-senozhatsky@chromium.org
      Link: https://lkml.kernel.org/r/20230304034835.2082479-2-senozhatsky@chromium.org
      
      Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Cc: Yosry Ahmed <yosryahmed@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  13. 03 Feb, 2023 3 commits
  14. 01 Feb, 2023 1 commit
      zsmalloc: fix a race with deferred_handles storing · 85b32581
      Nhat Pham authored
      Currently, there is a race between zs_free() and zs_reclaim_page():
      zs_reclaim_page() finds a handle to an allocated object, but before the
      eviction happens, an independent zs_free() call to the same handle could
      come in and overwrite the object value stored at the handle with the last
      deferred handle.  When zs_reclaim_page() finally gets to call the eviction
      handler, it will see an invalid object value (i.e., the previous deferred
      handle instead of the original object value).
      
      This race happens quite infrequently.  We only managed to produce it with
      out-of-tree developmental code that triggers zsmalloc writeback with a
      much higher frequency than usual.
      
      This patch fixes this race by storing the deferred handle in the object
      header instead.  We differentiate the deferred handle from the other two
      cases (handle for allocated object, and linkage for free object) with a
      new tag.  If zspage reclamation succeeds, we will free these deferred
      handles by walking through the zspage objects.  On the other hand, if
      zspage reclamation fails, we reconstruct the zspage freelist (with the
      deferred handle tag and allocated tag) before trying again with the
      reclamation.
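
      The tagging idea, schematically (the macro and helper names here are
      purely illustrative and not the patch's actual identifiers): the low bits
      of the per-object metadata word now distinguish three cases instead of two.

        /* Illustrative tag encoding for the per-object metadata word. */
        #define OBJ_ALLOCATED_TAG        1UL  /* word holds a live handle      */
        #define OBJ_DEFERRED_HANDLE_TAG  2UL  /* word holds a deferred handle  */
        #define OBJ_TAG_MASK             3UL  /* untagged word = freelist link */

        static inline bool obj_allocated(unsigned long word)
        {
                return word & OBJ_ALLOCATED_TAG;
        }

        static inline bool obj_deferred(unsigned long word)
        {
                return (word & OBJ_TAG_MASK) == OBJ_DEFERRED_HANDLE_TAG;
        }

        /* zs_free() on an object that is being reclaimed stores the handle,
         * tagged, in the object header; a successful reclaim later walks the
         * zspage and frees these deferred handles, while a failed one rebuilds
         * the freelist around the allocated/deferred entries. */
        static inline unsigned long encode_deferred_handle(unsigned long handle)
        {
                return handle | OBJ_DEFERRED_HANDLE_TAG;
        }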
      
      [arnd@arndb.de: avoid unused-function warning]
        Link: https://lkml.kernel.org/r/20230117170507.2651972-1-arnd@kernel.org
      Link: https://lkml.kernel.org/r/20230110231701.326724-1-nphamcs@gmail.com
      Fixes: 9997bc01 ("zsmalloc: implement writeback mechanism for zsmalloc")
      Signed-off-by: Nhat Pham <nphamcs@gmail.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
      Cc: Seth Jennings <sjenning@redhat.com>
      Cc: Vitaly Wool <vitaly.wool@konsulko.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  15. 19 Jan, 2023 1 commit
  16. 12 Dec, 2022 4 commits
  17. 30 Nov, 2022 2 commits
  18. 21 Oct, 2022 1 commit
  19. 03 Oct, 2022 1 commit
  20. 12 Sep, 2022 3 commits
  21. 28 Aug, 2022 1 commit
  22. 02 Aug, 2022 1 commit
  23. 30 Jul, 2022 1 commit
      zsmalloc: zs_malloc: return ERR_PTR on failure · c7e6f17b
      Hui Zhu authored
      zs_malloc() returns 0 if it fails, and zs_zpool_malloc() then returns -1
      to its caller.  But -1 leaves the reason for the failure unclear.
      
      For example, when zswap_frontswap_store() calls zs_malloc() through
      zs_zpool_malloc(), it gets -1 back, whereas other failure paths return
      -EINVAL, -ENODEV or some other meaningful errno.
      
      This commit changes zs_malloc() to return an ERR_PTR-encoded value on
      failure.  It does not simply make zs_zpool_malloc() return -ENOMEM,
      because zs_malloc() has two failure modes:
      
      - the requested size is not acceptable: return -EINVAL
      - memory allocation fails: return -ENOMEM
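
      The resulting calling convention, sketched (the wrapper shows how
      zs_zpool_malloc() can propagate the encoded error; treat it as an
      illustration rather than the exact diff):

        #include <linux/err.h>

        /* zs_malloc() now returns an ERR_PTR()-encoded value, cast to the
         * unsigned long handle type, instead of 0 on failure. */
        static int zs_zpool_malloc(void *pool, size_t size, gfp_t gfp,
                                   unsigned long *handle)
        {
                *handle = zs_malloc(pool, size, gfp);

                if (IS_ERR_VALUE(*handle))
                        return PTR_ERR((void *)*handle);

                return 0;
        }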
      
      Link: https://lkml.kernel.org/r/20220714080757.12161-1-teawater@gmail.com
      
      Signed-off-by: Hui Zhu <teawater@antgroup.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  24. 04 Jul, 2022 1 commit
      mm: shrinkers: provide shrinkers with names · e33c267a
      Roman Gushchin authored
      Currently shrinkers are anonymous objects.  For debugging purposes they
      can be identified by count/scan function names, but it's not always
      useful: e.g. for a superblock's shrinker it's nice to have at least an
      idea of which superblock the shrinker belongs to.
      
      This commit adds names to shrinkers.  The register_shrinker() and
      prealloc_shrinker() functions are extended to take a format string and
      arguments used to construct the name.
      
      In some cases it's not possible to determine a good name at the time when
      a shrinker is allocated.  For such cases shrinker_debugfs_rename() is
      provided.
      
      The expected format is:
          <subsystem>-<shrinker_type>[:<instance>]-<id>
      For some shrinkers an instance can be encoded as (MAJOR:MINOR) pair.
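
      For example, zsmalloc's pool shrinker can be registered with a
      printf-style name via the extended signature (a sketch; the call site is
      inferred from the mm-zspool:%s entry in the listing below):

        #include <linux/shrinker.h>

        static int zs_register_shrinker(struct zs_pool *pool)
        {
                /* Same count/scan callbacks as before; registration now also
                 * names the shrinker, which shows up under
                 * /sys/kernel/debug/shrinker/ as <name>-<id>. */
                return register_shrinker(&pool->shrinker, "mm-zspool:%s",
                                         pool->name);
        }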
      
      After this change the shrinker debugfs directory looks like:
        $ cd /sys/kernel/debug/shrinker/
        $ ls
          dquota-cache-16     sb-devpts-28     sb-proc-47       sb-tmpfs-42
          mm-shadow-18        sb-devtmpfs-5    sb-proc-48       sb-tmpfs-43
          mm-zspool:zram0-34  sb-hugetlbfs-17  sb-pstore-31     sb-tmpfs-44
          rcu-kfree-0         sb-hugetlbfs-33  sb-rootfs-2      sb-tmpfs-49
          sb-aio-20           sb-iomem-12      sb-securityfs-6  sb-tracefs-13
          sb-anon_inodefs-15  sb-mqueue-21     sb-selinuxfs-22  sb-xfs:vda1-36
          sb-bdev-3           sb-nsfs-4        sb-sockfs-8      sb-zsmalloc-19
          sb-bpf-32           sb-pipefs-14     sb-sysfs-26      thp-deferred_split-10
          sb-btrfs:vda2-24    sb-proc-25       sb-tmpfs-1       thp-zero-9
          sb-cgroup2-30       sb-proc-39       sb-tmpfs-27      xfs-buf:vda1-37
          sb-configfs-23      sb-proc-41       sb-tmpfs-29      xfs-inodegc:vda1-38
          sb-dax-11           sb-proc-45       sb-tmpfs-35
          sb-debugfs-7        sb-proc-46       sb-tmpfs-40
      
      [roman.gushchin@linux.dev: fix build warnings]
        Link: https://lkml.kernel.org/r/Yr+ZTnLb9lJk6fJO@castle
      
      Reported-by: kernel test robot <lkp@intel.com>
      Link: https://lkml.kernel.org/r/20220601032227.4076670-4-roman.gushchin@linux.dev
      
      Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Cc: Dave Chinner <dchinner@redhat.com>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: Kent Overstreet <kent.overstreet@gmail.com>
      Cc: Muchun Song <songmuchun@bytedance.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>