Commit 2c6efe9c authored by Luis Chamberlain, committed by Andrew Morton

shmem: add support to ignore swap

In doing experimentations with shmem having the option to avoid swap
becomes a useful mechanism.  One of the *raves* about brd over shmem is
you can avoid swap, but that's not really a good reason to use brd if we
can instead use shmem.  Using brd has its own good reasons to exist, but
just because "tmpfs" doesn't let you do that is not a great reason to
avoid it if we can easily add support for it.

I don't add support for reconfiguring incompatible options, but if we
really wanted to we can add support for that.

To avoid swap we use mapping_set_unevictable() upon inode creation, and
put a WARN_ON_ONCE() stop-gap on writepages() for reclaim.

Link: https://lkml.kernel.org/r/20230309230545.2930737-7-mcgrof@kernel.org
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Acked-by: Christian Brauner <brauner@kernel.org>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Adam Manzanares <a.manzanares@samsung.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Pankaj Raghav <p.raghav@samsung.com>
Cc: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
parent d0f5a854
@@ -13,7 +13,8 @@ everything stored therein is lost.
 tmpfs puts everything into the kernel internal caches and grows and
 shrinks to accommodate the files it contains and is able to swap
-unneeded pages out to swap space, and supports THP.
+unneeded pages out to swap space, if swap was enabled for the tmpfs
+mount. tmpfs also supports THP.
 tmpfs extends ramfs with a few userspace configurable options listed and
 explained further below, some of which can be reconfigured dynamically on the
@@ -33,8 +34,8 @@ configured in size at initialization and you cannot dynamically resize them.
 Contrary to brd ramdisks, tmpfs has its own filesystem, it does not rely on the
 block layer at all.
-Since tmpfs lives completely in the page cache and on swap, all tmpfs
-pages will be shown as "Shmem" in /proc/meminfo and "Shared" in
+Since tmpfs lives completely in the page cache and optionally on swap,
+all tmpfs pages will be shown as "Shmem" in /proc/meminfo and "Shared" in
 free(1). Notice that these counters also include shared memory
 (shmem, see ipcs(1)). The most reliable way to get the count is
 using df(1) and du(1).
@@ -83,6 +84,8 @@ nr_inodes The maximum number of inodes for this instance. The default
            is half of the number of your physical RAM pages, or (on a
            machine with highmem) the number of lowmem RAM pages,
            whichever is the lower.
+noswap     Disables swap. Remounts must respect the original settings.
+           By default swap is enabled.
 =========  ============================================================
 These parameters accept a suffix k, m or g for kilo, mega and giga and
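As an illustration of how the new option might be used from userspace, the sketch below passes `noswap` in the data string of mount(2). It is a minimal example and not part of the patch: the helper name `mount_tmpfs_noswap` is hypothetical, and it assumes a Linux kernel that contains this commit and a caller with the privileges needed to mount.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mount.h>

/*
 * Hypothetical helper (not from the patch): mount a tmpfs instance at
 * 'target' with the given size and the "noswap" option.
 * Returns 0 on success, -1 with errno set on failure.
 */
int mount_tmpfs_noswap(const char *target, const char *size)
{
	char opts[64];
	int n;

	/* Build the option string, e.g. "size=64m,noswap". */
	n = snprintf(opts, sizeof(opts), "size=%s,noswap", size);
	if (n < 0 || (size_t)n >= sizeof(opts)) {
		/* The size string did not fit into the buffer. */
		errno = EINVAL;
		return -1;
	}

	/* Filesystem-specific options travel in the last argument. */
	return mount("tmpfs", target, "tmpfs", 0, opts);
}
```

A caller running as root could invoke `mount_tmpfs_noswap("/mnt/scratch", "64m")`, where `/mnt/scratch` is a placeholder for an existing directory. On a kernel without this commit the call is expected to fail, because `noswap` is not a recognized tmpfs parameter there.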
......
@@ -42,6 +42,8 @@ The unevictable list addresses the following classes of unevictable pages:
 * Those owned by ramfs.
+* Those owned by tmpfs with the noswap mount option.
 * Those mapped into SHM_LOCK'd shared memory regions.
 * Those mapped into VM_LOCKED [mlock()ed] VMAs.
......
@@ -45,6 +45,7 @@ struct shmem_sb_info {
 	kuid_t uid; /* Mount uid for root directory */
 	kgid_t gid; /* Mount gid for root directory */
 	bool full_inums; /* If i_ino should be uint or ino_t */
+	bool noswap; /* ignores VM reclaim / swap requests */
 	ino_t next_ino; /* The next per-sb inode number to use */
 	ino_t __percpu *ino_batch; /* The next per-cpu inode number to use */
 	struct mempolicy *mpol; /* default memory policy for mappings */
......
@@ -116,10 +116,12 @@ struct shmem_options {
 	bool full_inums;
 	int huge;
 	int seen;
+	bool noswap;
 #define SHMEM_SEEN_BLOCKS 1
 #define SHMEM_SEEN_INODES 2
 #define SHMEM_SEEN_HUGE 4
 #define SHMEM_SEEN_INUMS 8
+#define SHMEM_SEEN_NOSWAP 16
 };
 #ifdef CONFIG_TMPFS
@@ -1334,6 +1336,7 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
 	struct address_space *mapping = folio->mapping;
 	struct inode *inode = mapping->host;
 	struct shmem_inode_info *info = SHMEM_I(inode);
+	struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
 	swp_entry_t swap;
 	pgoff_t index;
@@ -1347,7 +1350,7 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
 	if (WARN_ON_ONCE(!wbc->for_reclaim))
 		goto redirty;
-	if (WARN_ON_ONCE(info->flags & VM_LOCKED))
+	if (WARN_ON_ONCE((info->flags & VM_LOCKED) || sbinfo->noswap))
 		goto redirty;
 	if (!total_swap_pages)
@@ -2372,6 +2375,8 @@ static struct inode *shmem_get_inode(struct mnt_idmap *idmap, struct super_block
 		shmem_set_inode_flags(inode, info->fsflags);
 		INIT_LIST_HEAD(&info->shrinklist);
 		INIT_LIST_HEAD(&info->swaplist);
+		if (sbinfo->noswap)
+			mapping_set_unevictable(inode->i_mapping);
 		simple_xattrs_init(&info->xattrs);
 		cache_no_acl(inode);
 		mapping_set_large_folios(inode->i_mapping);
@@ -3459,6 +3464,7 @@ enum shmem_param {
 	Opt_uid,
 	Opt_inode32,
 	Opt_inode64,
+	Opt_noswap,
 };
 static const struct constant_table shmem_param_enums_huge[] = {
@@ -3480,6 +3486,7 @@ const struct fs_parameter_spec shmem_fs_parameters[] = {
 	fsparam_u32 ("uid", Opt_uid),
 	fsparam_flag ("inode32", Opt_inode32),
 	fsparam_flag ("inode64", Opt_inode64),
+	fsparam_flag ("noswap", Opt_noswap),
 	{}
 };
@@ -3563,6 +3570,10 @@ static int shmem_parse_one(struct fs_context *fc, struct fs_parameter *param)
 		ctx->full_inums = true;
 		ctx->seen |= SHMEM_SEEN_INUMS;
 		break;
+	case Opt_noswap:
+		ctx->noswap = true;
+		ctx->seen |= SHMEM_SEEN_NOSWAP;
+		break;
 	}
 	return 0;
@@ -3661,6 +3672,14 @@ static int shmem_reconfigure(struct fs_context *fc)
 		err = "Current inum too high to switch to 32-bit inums";
 		goto out;
 	}
+	if ((ctx->seen & SHMEM_SEEN_NOSWAP) && ctx->noswap && !sbinfo->noswap) {
+		err = "Cannot disable swap on remount";
+		goto out;
+	}
+	if (!(ctx->seen & SHMEM_SEEN_NOSWAP) && !ctx->noswap && sbinfo->noswap) {
+		err = "Cannot enable swap on remount if it was disabled on first mount";
+		goto out;
+	}
 	if (ctx->seen & SHMEM_SEEN_HUGE)
 		sbinfo->huge = ctx->huge;
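The two checks added to shmem_reconfigure() amount to a small decision table: a remount succeeds only if it leaves the swap setting as it was at first mount. The following userspace sketch models just that table so the cases can be exercised in isolation. The names `SEEN_NOSWAP`, `struct remount_request` and `check_noswap_remount` are hypothetical stand-ins, and the model ignores locking and every other mount option.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for SHMEM_SEEN_NOSWAP from the patch. */
#define SEEN_NOSWAP 16

/* Stand-in for the two shmem_options fields the checks look at. */
struct remount_request {
	int seen;    /* bitmask of options present on the remount */
	bool noswap; /* true if "noswap" was parsed */
};

/*
 * Model of the checks in the hunk: returns NULL when the remount is
 * allowed, otherwise the error string the kernel code would report.
 * 'mounted_noswap' plays the role of sbinfo->noswap.
 */
const char *check_noswap_remount(const struct remount_request *req,
				 bool mounted_noswap)
{
	if ((req->seen & SEEN_NOSWAP) && req->noswap && !mounted_noswap)
		return "Cannot disable swap on remount";
	if (!(req->seen & SEEN_NOSWAP) && !req->noswap && mounted_noswap)
		return "Cannot enable swap on remount if it was disabled on first mount";
	return NULL;
}
```

Under this model, a remount that omits `noswap` on a mount created with `noswap` is rejected, which is why a remount of such a mount has to repeat the option.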
@@ -3681,6 +3700,10 @@ static int shmem_reconfigure(struct fs_context *fc)
 		sbinfo->mpol = ctx->mpol; /* transfers initial ref */
 		ctx->mpol = NULL;
 	}
+	if (ctx->noswap)
+		sbinfo->noswap = true;
 	raw_spin_unlock(&sbinfo->stat_lock);
 	mpol_put(mpol);
 	return 0;
@@ -3735,6 +3758,8 @@ static int shmem_show_options(struct seq_file *seq, struct dentry *root)
 		seq_printf(seq, ",huge=%s", shmem_format_huge(sbinfo->huge));
 #endif
 	shmem_show_mpol(seq, sbinfo->mpol);
+	if (sbinfo->noswap)
+		seq_printf(seq, ",noswap");
 	return 0;
 }
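Because shmem_show_options() now emits `,noswap`, the option should be visible in the comma-separated options field of the mount table (for example the fourth field of a /proc/mounts line). A userspace tool could test for it with a whole-token match, as in this sketch. The helper `has_mount_option` is hypothetical, and the caller is assumed to have already extracted the options field.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return true if the comma-separated option list
 * 'opts' contains 'name' as a complete token. A plain strstr() would
 * also match longer options that merely contain the name.
 */
bool has_mount_option(const char *opts, const char *name)
{
	size_t len = strlen(name);
	const char *p = opts;

	if (len == 0)
		return false;

	while (*p) {
		const char *end = strchr(p, ',');
		size_t tok = end ? (size_t)(end - p) : strlen(p);

		/* Match only when the token has exactly the same length. */
		if (tok == len && strncmp(p, name, len) == 0)
			return true;
		if (!end)
			break;
		p = end + 1;
	}
	return false;
}
```

For instance, `has_mount_option("rw,relatime,size=65536k,noswap", "noswap")` reports a match, while an option string without that token does not.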
@@ -3778,6 +3803,7 @@ static int shmem_fill_super(struct super_block *sb, struct fs_context *fc)
 			ctx->inodes = shmem_default_max_inodes();
 		if (!(ctx->seen & SHMEM_SEEN_INUMS))
 			ctx->full_inums = IS_ENABLED(CONFIG_TMPFS_INODE64);
+		sbinfo->noswap = ctx->noswap;
 	} else {
 		sb->s_flags |= SB_NOUSER;
 	}
......