1. 06 Nov, 2021 14 commits
    • Vlastimil Babka's avatar
      mm, slub: change percpu partial accounting from objects to pages · b47291ef
      Vlastimil Babka authored
      With CONFIG_SLUB_CPU_PARTIAL enabled, SLUB keeps a percpu list of
      partial slabs that can be promoted to cpu slab when the previous one is
      depleted, without accessing the shared partial list.  A slab can be
      added to this list by 1) refill of an empty list from get_partial_node()
      - once we really have to access the shared partial list, we acquire
      multiple slabs to amortize the cost of locking, and 2) first free to a
      previously full slab - instead of putting the slab on a shared partial
      list, we can more cheaply freeze it and put it on the per-cpu list.
      
      To control how large a percpu partial list can grow for a kmem cache,
      set_cpu_partial() calculates a target number of free objects on each
      cpu's percpu partial list, and this can be also set by the sysfs file
      cpu_partial.
      
      However, the tracking of actual number of objects is imprecise, in order
      to limit overhead from cpu X freeing an objects to a slab on percpu
      partial list of cpu Y.  Basically, the percpu partial slabs form a
      single linked list, and when we add a new slab to the list with current
      head "oldpage", we set in the struct page of the slab we're adding:
      
          page->pages = oldpage->pages + 1; // this is precise
          page->pobjects = oldpage->pobjects + (page->objects - page->inuse);
          page->next = oldpage;
      
      Thus the real number of free objects in the slab (objects - inuse) is
      only determined at the moment of adding the slab to the percpu partial
      list, and further freeing doesn't update the pobjects counter nor
      propagate it to the current list head.  As Jann reports [1], this can
      easily lead to large inaccuracies, where the target number of objects
      (up to 30 by default) can translate to the same number of (empty) slab
      pages on the list.  In case 2) above, we put a slab with 1 free object
      on the list, thus only increase page->pobjects by 1, even if there are
      subsequent frees on the same slab.  Jann has noticed this in practice
      and so did we [2] when investigating significant increase of kmemcg
      usage after switching from SLAB to SLUB.
      
      While this is no longer a problem in kmemcg context thanks to the
      accounting rewrite in 5.9, the memory waste is still not ideal and it's
      questionable whether it makes sense to perform free object count based
      control when object counts can easily become so much inaccurate.  So
      this patch converts the accounting to be based on number of pages only
      (which is precise) and removes the page->pobjects field completely.
      This is also ultimately simpler.
      
      To retain the existing set_cpu_partial() heuristic, first calculate the
      target number of objects as previously, but then convert it to target
      number of pages by assuming the pages will be half-filled on average.
      This assumption might obviously also be inaccurate in practice, but
      cannot degrade to actual number of pages being equal to the target
      number of objects.
      
      We could also skip the intermediate step with target number of objects
      and rewrite the heuristic in terms of pages.  However we still have the
      sysfs file cpu_partial which uses number of objects and could break
      existing users if it suddenly becomes number of pages, so this patch
      doesn't do that.
      
      In practice, after this patch the heuristics limit the size of percpu
      partial list up to 2 pages.  In case of a reported regression (which
      would mean some workload has benefited from the previous imprecise
      object based counting), we can tune the heuristics to get a better
      compromise within the new scheme, while still avoid the unexpectedly
      long percpu partial lists.
      
      [1] https://lore.kernel.org/linux-mm/CAG48ez2Qx5K1Cab-m8BdSibp6wLTip6ro4=-umR7BLsEgjEYzA@mail.gmail.com/
      [2] https://lore.kernel.org/all/2f0f46e8-2535-410a-1859-e9cfa4e57c18@suse.cz/
      
      ==========
      Evaluation
      ==========
      
      Mel was kind enough to run v1 through mmtests machinery for netperf
      (localhost) and hackbench and, for most significant results see below.
      So there are some apparent regressions, especially with hackbench, which
      I think ultimately boils down to having shorter percpu partial lists on
      average and some benchmarks benefiting from longer ones.  Monitoring
      slab usage also indicated less memory usage by slab.  Based on that, the
      following patch will bump the defaults to allow longer percpu partial
      lists than after this patch.
      
      However the goal is certainly not such that we would limit the percpu
      partial lists to 30 pages just because previously a specific alloc/free
      pattern could lead to the limit of 30 objects translate to a limit to 30
      pages - that would make little sense.  This is a correctness patch, and
      if a workload benefits from larger lists, the sysfs tuning knobs are
      still there to allow that.
      
      Netperf
      
        2-socket Intel(R) Xeon(R) Gold 5218R CPU @ 2.10GHz (20 cores, 40 threads per socket), 384GB RAM
        TCP-RR:
          hmean before 127045.79 after 121092.94 (-4.69%, worse)
          stddev before  2634.37 after   1254.08
        UDP-RR:
          hmean before 166985.45 after 160668.94 ( -3.78%, worse)
          stddev before 4059.69 after 1943.63
      
        2-socket Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz (20 cores, 40 threads per socket), 512GB RAM
        TCP-RR:
          hmean before 84173.25 after 76914.72 ( -8.62%, worse)
        UDP-RR:
          hmean before 93571.12 after 96428.69 ( 3.05%, better)
          stddev before 23118.54 after 16828.14
      
        2-socket Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz (12 cores, 24 threads per socket), 64GB RAM
        TCP-RR:
          hmean before 49984.92 after 48922.27 ( -2.13%, worse)
          stddev before 6248.15 after 4740.51
        UDP-RR:
          hmean before 61854.31 after 68761.81 ( 11.17%, better)
          stddev before 4093.54 after 5898.91
      
        other machines - within 2%
      
      Hackbench
      
        (results before and after the patch, negative % means worse)
      
        2-socket AMD EPYC 7713 (64 cores, 128 threads per core), 256GB RAM
        hackbench-process-sockets
        Amean 	1 	0.5380	0.5583	( -3.78%)
        Amean 	4 	0.7510	0.8150	( -8.52%)
        Amean 	7 	0.7930	0.9533	( -20.22%)
        Amean 	12 	0.7853	1.1313	( -44.06%)
        Amean 	21 	1.1520	1.4993	( -30.15%)
        Amean 	30 	1.6223	1.9237	( -18.57%)
        Amean 	48 	2.6767	2.9903	( -11.72%)
        Amean 	79 	4.0257	5.1150	( -27.06%)
        Amean 	110	5.5193	7.4720	( -35.38%)
        Amean 	141	7.2207	9.9840	( -38.27%)
        Amean 	172	8.4770	12.1963	( -43.88%)
        Amean 	203	9.6473	14.3137	( -48.37%)
        Amean 	234	11.3960	18.7917	( -64.90%)
        Amean 	265	13.9627	22.4607	( -60.86%)
        Amean 	296	14.9163	26.0483	( -74.63%)
      
        hackbench-thread-sockets
        Amean 	1 	0.5597	0.5877	( -5.00%)
        Amean 	4 	0.7913	0.8960	( -13.23%)
        Amean 	7 	0.8190	1.0017	( -22.30%)
        Amean 	12 	0.9560	1.1727	( -22.66%)
        Amean 	21 	1.7587	1.5660	( 10.96%)
        Amean 	30 	2.4477	1.9807	( 19.08%)
        Amean 	48 	3.4573	3.0630	( 11.41%)
        Amean 	79 	4.7903	5.1733	( -8.00%)
        Amean 	110	6.1370	7.4220	( -20.94%)
        Amean 	141	7.5777	9.2617	( -22.22%)
        Amean 	172	9.2280	11.0907	( -20.18%)
        Amean 	203	10.2793	13.3470	( -29.84%)
        Amean 	234	11.2410	17.1070	( -52.18%)
        Amean 	265	12.5970	23.3323	( -85.22%)
        Amean 	296	17.1540	24.2857	( -41.57%)
      
        2-socket Intel(R) Xeon(R) Gold 5218R CPU @ 2.10GHz (20 cores, 40 threads
        per socket), 384GB RAM
        hackbench-process-sockets
        Amean 	1 	0.5760	0.4793	( 16.78%)
        Amean 	4 	0.9430	0.9707	( -2.93%)
        Amean 	7 	1.5517	1.8843	( -21.44%)
        Amean 	12 	2.4903	2.7267	( -9.49%)
        Amean 	21 	3.9560	4.2877	( -8.38%)
        Amean 	30 	5.4613	5.8343	( -6.83%)
        Amean 	48 	8.5337	9.2937	( -8.91%)
        Amean 	79 	14.0670	15.2630	( -8.50%)
        Amean 	110	19.2253	21.2467	( -10.51%)
        Amean 	141	23.7557	25.8550	( -8.84%)
        Amean 	172	28.4407	29.7603	( -4.64%)
        Amean 	203	33.3407	33.9927	( -1.96%)
        Amean 	234	38.3633	39.1150	( -1.96%)
        Amean 	265	43.4420	43.8470	( -0.93%)
        Amean 	296	48.3680	48.9300	( -1.16%)
      
        hackbench-thread-sockets
        Amean 	1 	0.6080	0.6493	( -6.80%)
        Amean 	4 	1.0000	1.0513	( -5.13%)
        Amean 	7 	1.6607	2.0260	( -22.00%)
        Amean 	12 	2.7637	2.9273	( -5.92%)
        Amean 	21 	5.0613	4.5153	( 10.79%)
        Amean 	30 	6.3340	6.1140	( 3.47%)
        Amean 	48 	9.0567	9.5577	( -5.53%)
        Amean 	79 	14.5657	15.7983	( -8.46%)
        Amean 	110	19.6213	21.6333	( -10.25%)
        Amean 	141	24.1563	26.2697	( -8.75%)
        Amean 	172	28.9687	30.2187	( -4.32%)
        Amean 	203	33.9763	34.6970	( -2.12%)
        Amean 	234	38.8647	39.3207	( -1.17%)
        Amean 	265	44.0813	44.1507	( -0.16%)
        Amean 	296	49.2040	49.4330	( -0.47%)
      
        2-socket Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz (20 cores, 40 threads
        per socket), 512GB RAM
        hackbench-process-sockets
        Amean 	1 	0.5027	0.5017	( 0.20%)
        Amean 	4 	1.1053	1.2033	( -8.87%)
        Amean 	7 	1.8760	2.1820	( -16.31%)
        Amean 	12 	2.9053	3.1810	( -9.49%)
        Amean 	21 	4.6777	4.9920	( -6.72%)
        Amean 	30 	6.5180	6.7827	( -4.06%)
        Amean 	48 	10.0710	10.5227	( -4.48%)
        Amean 	79 	16.4250	17.5053	( -6.58%)
        Amean 	110	22.6203	24.4617	( -8.14%)
        Amean 	141	28.0967	31.0363	( -10.46%)
        Amean 	172	34.4030	36.9233	( -7.33%)
        Amean 	203	40.5933	43.0850	( -6.14%)
        Amean 	234	46.6477	48.7220	( -4.45%)
        Amean 	265	53.0530	53.9597	( -1.71%)
        Amean 	296	59.2760	59.9213	( -1.09%)
      
        hackbench-thread-sockets
        Amean 	1 	0.5363	0.5330	( 0.62%)
        Amean 	4 	1.1647	1.2157	( -4.38%)
        Amean 	7 	1.9237	2.2833	( -18.70%)
        Amean 	12 	2.9943	3.3110	( -10.58%)
        Amean 	21 	4.9987	5.1880	( -3.79%)
        Amean 	30 	6.7583	7.0043	( -3.64%)
        Amean 	48 	10.4547	10.8353	( -3.64%)
        Amean 	79 	16.6707	17.6790	( -6.05%)
        Amean 	110	22.8207	24.4403	( -7.10%)
        Amean 	141	28.7090	31.0533	( -8.17%)
        Amean 	172	34.9387	36.8260	( -5.40%)
        Amean 	203	41.1567	43.0450	( -4.59%)
        Amean 	234	47.3790	48.5307	( -2.43%)
        Amean 	265	53.9543	54.6987	( -1.38%)
        Amean 	296	60.0820	60.2163	( -0.22%)
      
        1-socket Intel(R) Xeon(R) CPU E3-1240 v5 @ 3.50GHz (4 cores, 8 threads),
        32 GB RAM
        hackbench-process-sockets
        Amean 	1 	1.4760	1.5773	( -6.87%)
        Amean 	3 	3.9370	4.0910	( -3.91%)
        Amean 	5 	6.6797	6.9357	( -3.83%)
        Amean 	7 	9.3367	9.7150	( -4.05%)
        Amean 	12	15.7627	16.1400	( -2.39%)
        Amean 	18	23.5360	23.6890	( -0.65%)
        Amean 	24	31.0663	31.3137	( -0.80%)
        Amean 	30	38.7283	39.0037	( -0.71%)
        Amean 	32	41.3417	41.6097	( -0.65%)
      
        hackbench-thread-sockets
        Amean 	1 	1.5250	1.6043	( -5.20%)
        Amean 	3 	4.0897	4.2603	( -4.17%)
        Amean 	5 	6.7760	7.0933	( -4.68%)
        Amean 	7 	9.4817	9.9157	( -4.58%)
        Amean 	12	15.9610	16.3937	( -2.71%)
        Amean 	18	23.9543	24.3417	( -1.62%)
        Amean 	24	31.4400	31.7217	( -0.90%)
        Amean 	30	39.2457	39.5467	( -0.77%)
        Amean 	32	41.8267	42.1230	( -0.71%)
      
        2-socket Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz (12 cores, 24 threads
        per socket), 64GB RAM
        hackbench-process-sockets
        Amean 	1 	1.0347	1.0880	( -5.15%)
        Amean 	4 	1.7267	1.8527	( -7.30%)
        Amean 	7 	2.6707	2.8110	( -5.25%)
        Amean 	12 	4.1617	4.3383	( -4.25%)
        Amean 	21 	7.0070	7.2600	( -3.61%)
        Amean 	30 	9.9187	10.2397	( -3.24%)
        Amean 	48 	15.6710	16.3923	( -4.60%)
        Amean 	79 	24.7743	26.1247	( -5.45%)
        Amean 	110	34.3000	35.9307	( -4.75%)
        Amean 	141	44.2043	44.8010	( -1.35%)
        Amean 	172	54.2430	54.7260	( -0.89%)
        Amean 	192	60.6557	60.9777	( -0.53%)
      
        hackbench-thread-sockets
        Amean 	1 	1.0610	1.1353	( -7.01%)
        Amean 	4 	1.7543	1.9140	( -9.10%)
        Amean 	7 	2.7840	2.9573	( -6.23%)
        Amean 	12 	4.3813	4.4937	( -2.56%)
        Amean 	21 	7.3460	7.5350	( -2.57%)
        Amean 	30 	10.2313	10.5190	( -2.81%)
        Amean 	48 	15.9700	16.5940	( -3.91%)
        Amean 	79 	25.3973	26.6637	( -4.99%)
        Amean 	110	35.1087	36.4797	( -3.91%)
        Amean 	141	45.8220	46.3053	( -1.05%)
        Amean 	172	55.4917	55.7320	( -0.43%)
        Amean 	192	62.7490	62.5410	( 0.33%)
      
      Link: https://lkml.kernel.org/r/20211012134651.11258-1-vbabka@suse.czSigned-off-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Reported-by: default avatarJann Horn <jannh@google.com>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b47291ef
    • Kefeng Wang's avatar
      slub: add back check for free nonslab objects · d0fe47c6
      Kefeng Wang authored
      After commit f227f0fa ("slub: fix unreclaimable slab stat for bulk
      free"), the check for free nonslab page is replaced by VM_BUG_ON_PAGE,
      which only check with CONFIG_DEBUG_VM enabled, but this config may
      impact performance, so it only for debug.
      
      Commit 0937502a ("slub: Add check for kfree() of non slab objects.")
      add the ability, which should be needed in any configs to catch the
      invalid free, they even could be potential issue, eg, memory corruption,
      use after free and double free, so replace VM_BUG_ON_PAGE to
      WARN_ON_ONCE, add object address printing to help use to debug the
      issue.
      
      Link: https://lkml.kernel.org/r/20210930070214.61499-1-wangkefeng.wang@huawei.comSigned-off-by: default avatarKefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rienjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      d0fe47c6
    • Shi Lei's avatar
      mm/slab.c: remove useless lines in enable_cpucache() · ffc95a46
      Shi Lei authored
      These lines are useless, so remove them.
      
      Link: https://lkml.kernel.org/r/20210930034845.2539-1-shi_lei@massclouds.com
      Fixes: 10befea9 ("mm: memcg/slab: use a single set of kmem_caches for all allocations")
      Signed-off-by: default avatarShi Lei <shi_lei@massclouds.com>
      Reviewed-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Acked-by: default avatarDavid Rientjes <rientjes@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ffc95a46
    • Matthew Wilcox (Oracle)'s avatar
      mm: move kvmalloc-related functions to slab.h · 8587ca6f
      Matthew Wilcox (Oracle) authored
      Not all files in the kernel should include mm.h.  Migrating callers from
      kmalloc to kvmalloc is easier if the kvmalloc functions are in slab.h.
      
      [akpm@linux-foundation.org: move the new kvrealloc() also]
      [akpm@linux-foundation.org: drivers/hwmon/occ/p9_sbe.c needs slab.h]
      
      Link: https://lkml.kernel.org/r/20210622215757.3525604-1-willy@infradead.orgSigned-off-by: default avatarMatthew Wilcox (Oracle) <willy@infradead.org>
      Acked-by: default avatarPekka Enberg <penberg@kernel.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8587ca6f
    • Jia He's avatar
      d_path: fix Kernel doc validator complaining · d41b6035
      Jia He authored
      Kernel doc validator complains:
        Function parameter or member 'p' not described in 'prepend_name'
        Excess function parameter 'buffer' description in 'prepend_name'
      
      Link: https://lkml.kernel.org/r/20211011005614.26189-1-justin.he@arm.com
      Fixes: ad08ae58 ("d_path: introduce struct prepend_buffer")
      Signed-off-by: default avatarJia He <justin.he@arm.com>
      Reviewed-by: default avatarAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Acked-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      d41b6035
    • Arnd Bergmann's avatar
      fs/posix_acl.c: avoid -Wempty-body warning · d1cef29a
      Arnd Bergmann authored
      The fallthrough comment for an ignored cmpxchg() return value produces a
      harmless warning with 'make W=1':
      
        fs/posix_acl.c: In function 'get_acl':
        fs/posix_acl.c:127:36: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body]
          127 |                 /* fall through */ ;
              |                                    ^
      
      Simplify it as a step towards a clean W=1 build.  As all architectures
      define cmpxchg() as a statement expression these days, it is no longer
      necessary to evaluate its return code, and the if() can just be droped.
      
      Link: https://lkml.kernel.org/r/20210927102410.1863853-1-arnd@kernel.org
      Link: https://lore.kernel.org/all/20210322132103.qiun2rjilnlgztxe@wittgenstein/Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Reviewed-by: default avatarChristian Brauner <christian.brauner@ubuntu.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: James Morris <jamorris@linux.microsoft.com>
      Cc: Serge Hallyn <serge@hallyn.com>
      Cc: Miklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      d1cef29a
    • Jan Kara's avatar
      ocfs2: do not zero pages beyond i_size · c7c14a36
      Jan Kara authored
      ocfs2_zero_range_for_truncate() can try to zero pages beyond current
      inode size despite the fact that underlying blocks should be already
      zeroed out and writeback will skip writing such pages anyway.  Avoid the
      pointless work.
      
      Link: https://lkml.kernel.org/r/20211025151332.11301-2-jack@suse.czSigned-off-by: default avatarJan Kara <jack@suse.cz>
      Reviewed-by: default avatarJoseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Jun Piao <piaojun@huawei.com>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      c7c14a36
    • Jan Kara's avatar
      ocfs2: fix data corruption on truncate · 839b6386
      Jan Kara authored
      Patch series "ocfs2: Truncate data corruption fix".
      
      As further testing has shown, commit 5314454e ("ocfs2: fix data
      corruption after conversion from inline format") didn't fix all the data
      corruption issues the customer started observing after 6dbf7bb5
      ("fs: Don't invalidate page buffers in block_write_full_page()") This
      time I have tracked them down to two bugs in ocfs2 truncation code.
      
      One bug (truncating page cache before clearing tail cluster and setting
      i_size) could cause data corruption even before 6dbf7bb5, but before
      that commit it needed a race with page fault, after 6dbf7bb5 it
      started to be pretty deterministic.
      
      Another bug (zeroing pages beyond old i_size) used to be harmless
      inefficiency before commit 6dbf7bb5.  But after commit 6dbf7bb5
      in combination with the first bug it resulted in deterministic data
      corruption.
      
      Although fixing only the first problem is needed to stop data
      corruption, I've fixed both issues to make the code more robust.
      
      This patch (of 2):
      
      ocfs2_truncate_file() did unmap invalidate page cache pages before
      zeroing partial tail cluster and setting i_size.  Thus some pages could
      be left (and likely have left if the cluster zeroing happened) in the
      page cache beyond i_size after truncate finished letting user possibly
      see stale data once the file was extended again.  Also the tail cluster
      zeroing was not guaranteed to finish before truncate finished causing
      possible stale data exposure.  The problem started to be particularly
      easy to hit after commit 6dbf7bb5 "fs: Don't invalidate page buffers
      in block_write_full_page()" stopped invalidation of pages beyond i_size
      from page writeback path.
      
      Fix these problems by unmapping and invalidating pages in the page cache
      after the i_size is reduced and tail cluster is zeroed out.
      
      Link: https://lkml.kernel.org/r/20211025150008.29002-1-jack@suse.cz
      Link: https://lkml.kernel.org/r/20211025151332.11301-1-jack@suse.cz
      Fixes: ccd979bd ("[PATCH] OCFS2: The Second Oracle Cluster Filesystem")
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Reviewed-by: default avatarJoseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      839b6386
    • Colin Ian King's avatar
      ocfs2/dlm: remove redundant assignment of variable ret · 848be75d
      Colin Ian King authored
      The variable ret is being assigned a value that is never read, it is
      updated later on with a different value.  The assignment is redundant
      and can be removed.
      
      Addresses-Coverity: ("Unused value")
      Link: https://lkml.kernel.org/r/20211007233452.30815-1-colin.king@canonical.comSigned-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Reviewed-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Reviewed-by: default avatarJoseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      848be75d
    • Valentin Vidic's avatar
      ocfs2: cleanup journal init and shutdown · da5e7c87
      Valentin Vidic authored
      Allocate and free struct ocfs2_journal in ocfs2_journal_init and
      ocfs2_journal_shutdown.  Init and release of system inodes references
      the journal so reorder calls to make sure they work correctly.
      
      Link: https://lkml.kernel.org/r/20211009145006.3478-1-vvidic@valentin-vidic.from.hrSigned-off-by: default avatarValentin Vidic <vvidic@valentin-vidic.from.hr>
      Reviewed-by: default avatarJoseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      da5e7c87
    • Chenyuan Mi's avatar
      ocfs2: fix handle refcount leak in two exception handling paths · ae3fab5b
      Chenyuan Mi authored
      The reference counting issue happens in two exception handling paths of
      ocfs2_replay_truncate_records().  When executing these two exception
      handling paths, the function forgets to decrease the refcount of handle
      increased by ocfs2_start_trans(), causing a refcount leak.
      
      Fix this issue by using ocfs2_commit_trans() to decrease the refcount of
      handle in two handling paths.
      
      Link: https://lkml.kernel.org/r/20210908102055.10168-1-cymi20@fudan.edu.cnSigned-off-by: default avatarChenyuan Mi <cymi20@fudan.edu.cn>
      Signed-off-by: default avatarXiyu Yang <xiyuyang19@fudan.edu.cn>
      Signed-off-by: default avatarXin Tan <tanxin.ctf@gmail.com>
      Reviewed-by: default avatarJoseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Wengang Wang <wen.gang.wang@oracle.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ae3fab5b
    • weidonghui's avatar
      scripts/decodecode: fix faulting instruction no print when opps.file is DOS format · 75e2f715
      weidonghui authored
      If opps.file is in DOS format, faulting instruction cannot be printed:
      
        / # ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-
        / # ./scripts/decodecode < oops.file
        [ 0.734345] Code: d0002881 912f9c21 94067e68 d2800001 (b900003f)
        aarch64-linux-gnu-strip: '/tmp/tmp.5Y9eybnnSi.o': No such file
        aarch64-linux-gnu-objdump: '/tmp/tmp.5Y9eybnnSi.o': No such file
        All code
        ========
           0:   d0002881        adrp    x1, 0x512000
           4:   912f9c21        add     x1, x1, #0xbe7
           8:   94067e68        bl      0x19f9a8
           c:   d2800001        mov     x1, #0x0                        // #0
          10:   b900003f        str     wzr, [x1]
      
        Code starting with the faulting instruction
        ===========================================
      
      Background: The compilation environment is Ubuntu, and the test
      environment is Windows.  Most logs are generated in the Windows
      environment.  In this way, CR (carriage return) will inevitably appear,
      which will affect the use of decodecode in the Ubuntu environment.
      
      The repaired effect is as follows:
      
        / # ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-
        / # ./scripts/decodecode < oops.file
        [ 0.734345] Code: d0002881 912f9c21 94067e68 d2800001 (b900003f)
        All code
        ========
           0:   d0002881        adrp    x1, 0x512000
           4:   912f9c21        add     x1, x1, #0xbe7
           8:   94067e68        bl      0x19f9a8
           c:   d2800001        mov     x1, #0x0                        // #0
          10:*  b900003f        str     wzr, [x1]               <-- trapping instruction
      
        Code starting with the faulting instruction
        ===========================================
           0:   b900003f        str     wzr, [x1]
      
      Link: https://lkml.kernel.org/r/20211008064712.926-1-weidonghui@allwinnertech.comSigned-off-by: default avatarweidonghui <weidonghui@allwinnertech.com>
      Acked-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Marc Zyngier <maz@misterjones.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: Rabin Vincent <rabin@rab.in>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      75e2f715
    • Sven Eckelmann's avatar
      scripts/spelling.txt: fix "mistake" version of "synchronization" · 655edc52
      Sven Eckelmann authored
      If both "mistake" version and "correction" version are the same, a
      warning message is created by checkpatch which is impossible to fix.
      But it was noticed that Colan Ian King created a commit e6c0a088
      ("ALSA: aloop: Fix spelling mistake "synchronization" ->
      "synchronization"") which suggests that this spelling mistake was fixed
      by replacing the word "synchronization" with itself.  But the actual
      diff shows that the mistake in the code was "sychronization".  It is
      rather likely that the "mistake" in spelling.txt should have been the
      latter.
      
      Link: https://lkml.kernel.org/r/20210926065529.6880-1-sven@narfation.org
      Fixes: 2e74c9433ba8 ("scripts/spelling.txt: add more spellings to spelling.txt")
      Signed-off-by: default avatarSven Eckelmann <sven@narfation.org>
      Reviewed-by: default avatarColin Ian King <colin.king@canonical.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      655edc52
    • Colin Ian King's avatar
      scripts/spelling.txt: add more spellings to spelling.txt · baef1147
      Colin Ian King authored
      Some of the more common spelling mistakes and typos that I've found
      while fixing up spelling mistakes in the kernel in the past few months.
      
      Link: https://lkml.kernel.org/r/20210907072941.7033-1-colin.king@canonical.comSigned-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      baef1147
  2. 31 Oct, 2021 7 commits
  3. 30 Oct, 2021 6 commits
  4. 29 Oct, 2021 13 commits