Commit 6f01c935 authored by Linus Torvalds

Merge tag 'locks-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux

Pull file locking updates from Jeff Layton:
 "This starts with a couple of fixes for potential deadlocks in the
  fowner/fasync handling.

  The next patch removes the old mandatory locking code from the kernel
  altogether.

  The last patch cleans up rw_verify_area a bit more after the mandatory
  locking removal"

* tag 'locks-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
  fs: clean up after mandatory file locking support removal
  fs: remove mandatory file locking support
  fcntl: fix potential deadlock for &fasync_struct.fa_lock
  fcntl: fix potential deadlocks for &fown_struct.lock
parents 451819aa 2949e842
.. SPDX-License-Identifier: GPL-2.0
=====================================================
Mandatory File Locking For The Linux Operating System
=====================================================
Andy Walker <andy@lysaker.kvaerner.no>
15 April 1996
(Updated September 2007)
0. Why you should avoid mandatory locking
-----------------------------------------
The Linux implementation is prey to a number of difficult-to-fix race
conditions which in practice make it not dependable:

- The write system call checks for a mandatory lock only once
  at its start. It is therefore possible for a lock request to
  be granted after this check but before the data is modified.
  A process may then see file data change even while a mandatory
  lock was held.

- Similarly, an exclusive lock may be granted on a file after
  the kernel has decided to proceed with a read, but before the
  read has actually completed, and the reading process may see
  the file data in a state which should not have been visible
  to it.

- Similar races make the claimed mutual exclusion between lock
  and mmap similarly unreliable.

1. What is mandatory locking?
------------------------------
Mandatory locking is kernel enforced file locking, as opposed to the more usual
cooperative file locking used to guarantee sequential access to files among
processes. File locks are applied using the flock() and fcntl() system calls
(and the lockf() library routine which is a wrapper around fcntl().) It is
normally a process' responsibility to check for locks on a file it wishes to
update, before applying its own lock, updating the file and unlocking it again.
The most commonly used example of this (and in the case of sendmail, the most
troublesome) is access to a user's mailbox. The mail user agent and the mail
transfer agent must guard against updating the mailbox at the same time, and
prevent reading the mailbox while it is being updated.
In a perfect world all processes would use and honour a cooperative, or
"advisory" locking scheme. However, the world isn't perfect, and there's
a lot of poorly written code out there.
In trying to address this problem, the designers of System V UNIX came up
with a "mandatory" locking scheme, whereby the operating system kernel would
block attempts by a process to write to a file that another process holds a
"read" -or- "shared" lock on, and block attempts to both read and write to a
file that a process holds a "write" -or- "exclusive" lock on.
The System V mandatory locking scheme was intended to have as little impact as
possible on existing user code. The scheme is based on marking individual files
as candidates for mandatory locking, and using the existing fcntl()/lockf()
interface for applying locks just as if they were normal, advisory locks.
.. Note::

   1. In saying "file" in the paragraphs above I am actually not telling
      the whole truth. System V locking is based on fcntl(). The granularity of
      fcntl() is such that it allows the locking of byte ranges in files, in
      addition to entire files, so the mandatory locking rules also have byte
      level granularity.

   2. POSIX.1 does not specify any scheme for mandatory locking, despite
      borrowing the fcntl() locking scheme from System V. The mandatory locking
      scheme is defined by the System V Interface Definition (SVID) Version 3.

2. Marking a file for mandatory locking
---------------------------------------
A file is marked as a candidate for mandatory locking by setting the group-id
bit in its file mode but removing the group-execute bit. This is an otherwise
meaningless combination, and was chosen by the System V implementors so as not
to break existing user programs.
Note that the group-id bit is usually automatically cleared by the kernel when
a setgid file is written to. This is a security measure. The kernel has been
modified to recognize the special case of a mandatory lock candidate and to
refrain from clearing this bit. Similarly the kernel has been modified not
to run mandatory lock candidates with setgid privileges.
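
As an illustration of the mode-bit combination described above, the following userspace sketch sets the group-id bit and clears the group-execute bit. The helper names (mandlock_candidate_mode, is_mandlock_candidate, mark_mandlock_candidate) are hypothetical and exist only for this example. On kernels that include this merge the combination no longer has any locking effect.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: permission bits with setgid set and group-execute cleared. */
mode_t mandlock_candidate_mode(mode_t mode)
{
	return (mode | S_ISGID) & ~(mode_t)S_IXGRP;
}

/* Hypothetical helper: true if the mode carries the "setgid without group-execute" marking. */
int is_mandlock_candidate(mode_t mode)
{
	return (mode & (S_ISGID | S_IXGRP)) == S_ISGID;
}

/* Hypothetical helper: apply the marking to a file. Returns 0, or -1 with errno set. */
int mark_mandlock_candidate(const char *path)
{
	struct stat st;

	assert(path != NULL);
	if (stat(path, &st) != 0)
		return -1;
	/* Pass only the permission bits to chmod(), not the file type bits. */
	return chmod(path, mandlock_candidate_mode(st.st_mode) & 07777);
}
```

For example, a file with mode 0750 would end up with mode 02740 after mark_mandlock_candidate().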
3. Available implementations
----------------------------
I have considered the implementations of mandatory locking available with
SunOS 4.1.x, Solaris 2.x and HP-UX 9.x.
Generally I have tried to make the most sense out of the behaviour exhibited
by these three reference systems. There are many anomalies.
All the reference systems reject all calls to open() for a file on which
another process has outstanding mandatory locks. This is in direct
contravention of SVID 3, which states that only calls to open() with the
O_TRUNC flag set should be rejected. The Linux implementation follows the SVID
definition, which is the "Right Thing", since only calls with O_TRUNC can
modify the contents of the file.
HP-UX even disallows open() with O_TRUNC for a file with advisory locks, not
just mandatory locks. That would appear to contravene POSIX.1.
mmap() is another interesting case. All the operating systems mentioned
prevent mandatory locks from being applied to an mmap()'ed file, but HP-UX
also disallows advisory locks for such a file. SVID actually specifies the
paranoid HP-UX behaviour.
In my opinion only MAP_SHARED mappings should be immune from locking, and then
only from mandatory locks - that is what is currently implemented.
SunOS is so hopeless that it doesn't even honour the O_NONBLOCK flag for
mandatory locks, so reads and writes to locked files always block when they
should return EAGAIN.
I'm afraid that this is such an esoteric area that the semantics described
below are just as valid as any others, so long as the main points seem to
agree.
4. Semantics
------------

1. Mandatory locks can only be applied via the fcntl()/lockf() locking
   interface - in other words the System V/POSIX interface. BSD style
   locks using flock() never result in a mandatory lock.

2. If a process has locked a region of a file with a mandatory read lock, then
   other processes are permitted to read from that region. If any of these
   processes attempts to write to the region it will block until the lock is
   released, unless the process has opened the file with the O_NONBLOCK
   flag in which case the system call will return immediately with the error
   status EAGAIN.

3. If a process has locked a region of a file with a mandatory write lock, all
   attempts to read or write to that region block until the lock is released,
   unless a process has opened the file with the O_NONBLOCK flag in which case
   the system call will return immediately with the error status EAGAIN.

4. Calls to open() with O_TRUNC, or to creat(), on an existing file that has
   any mandatory locks owned by other processes will be rejected with the
   error status EAGAIN.

5. Attempts to apply a mandatory lock to a file that is memory mapped and
   shared (via mmap() with MAP_SHARED) will be rejected with the error status
   EAGAIN.

6. Attempts to create a shared memory map of a file (via mmap() with MAP_SHARED)
   that has any mandatory locks in effect will be rejected with the error status
   EAGAIN.

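
The fcntl() interface named in rule 1 can be exercised from userspace roughly as follows. This is a minimal sketch rather than code from the kernel tree: try_range_lock is a hypothetical helper name, and on a kernel without mandatory locking the resulting lock is purely advisory, so it only constrains processes that also ask for a lock.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: request a non-blocking POSIX byte-range lock.
 * type is F_RDLCK, F_WRLCK or F_UNLCK; a len of 0 means "to end of file".
 * Returns 0 on success, or -1 with errno set (EAGAIN or EACCES when
 * another process holds a conflicting lock).
 */
int try_range_lock(int fd, short type, off_t start, off_t len)
{
	struct flock fl = {0};

	assert(fd >= 0);
	fl.l_type = type;
	fl.l_whence = SEEK_SET;
	fl.l_start = start;
	fl.l_len = len;
	return fcntl(fd, F_SETLK, &fl);
}
```

F_SETLK is used here so that the call fails immediately on conflict; F_SETLKW would wait for the conflicting lock to be released instead.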
5. Which system calls are affected?
-----------------------------------
Those which modify a file's contents, not just the inode. That gives read(),
write(), readv(), writev(), open(), creat(), mmap(), truncate() and
ftruncate(). truncate() and ftruncate() are considered to be "write" actions
for the purposes of mandatory locking.
The affected region is usually defined as stretching from the current position
for the total number of bytes read or written. For the truncate calls it is
defined as the bytes of a file removed or added (we must also consider bytes
added, as a lock can specify just "the whole file", rather than a specific
range of bytes.)
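
The region arithmetic described above can be sketched in a few lines of userspace C. The type and function names (byte_range, io_affected_range, ranges_overlap) are assumptions made for this illustration and do not appear in the kernel; the sketch covers only the read/write case, where the region runs from the current position for the number of bytes transferred.

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive byte range [start, end]. */
struct byte_range {
	int64_t start;
	int64_t end;
};

/* Region touched by a read or write of count bytes at offset pos. */
struct byte_range io_affected_range(int64_t pos, int64_t count)
{
	struct byte_range r;

	assert(pos >= 0 && count > 0);
	r.start = pos;
	r.end = pos + count - 1;
	return r;
}

/* True if the two inclusive ranges share at least one byte. */
int ranges_overlap(struct byte_range a, struct byte_range b)
{
	return a.start <= b.end && b.start <= a.end;
}
```

A 50-byte write at offset 100 therefore touches bytes 100 through 149, and would conflict with a lock covering bytes 149 to 200 but not with one starting at byte 150.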
Note 3: I may have overlooked some system calls that need mandatory lock
checking in my eagerness to get this code out the door. Please let me know, or
better still fix the system calls yourself and submit a patch to me or Linus.
6. Warning!
-----------
Not even root can override a mandatory lock, so runaway processes can wreak
havoc if they lock crucial files. The way around it is to change the file
permissions (remove the setgid bit) before trying to read or write to it.
Of course, that might be a bit tricky if the system is hung :-(
7. The "mand" mount option
--------------------------
Mandatory locking is disabled on all filesystems by default, and must be
administratively enabled by mounting with "-o mand". That mount option
is only allowed if the mounting task has the CAP_SYS_ADMIN capability.
Since kernel v4.5, it is possible to disable mandatory locking
altogether by setting CONFIG_MANDATORY_FILE_LOCKING to "n". A kernel
with this disabled will reject attempts to mount filesystems with the
"mand" mount option with the error status EPERM.
@@ -121,10 +121,6 @@ static int v9fs_file_lock(struct file *filp, int cmd, struct file_lock *fl)
 	p9_debug(P9_DEBUG_VFS, "filp: %p lock: %p\n", filp, fl);
-	/* No mandatory locks */
-	if (__mandatory_lock(inode) && fl->fl_type != F_UNLCK)
-		return -ENOLCK;
 	if ((IS_SETLK(cmd) || IS_SETLKW(cmd)) && fl->fl_type != F_UNLCK) {
 		filemap_write_and_wait(inode->i_mapping);
 		invalidate_mapping_pages(&inode->i_data, 0, -1);
@@ -312,10 +308,6 @@ static int v9fs_file_lock_dotl(struct file *filp, int cmd, struct file_lock *fl)
 	p9_debug(P9_DEBUG_VFS, "filp: %p cmd:%d lock: %p name: %pD\n",
 		 filp, cmd, fl, filp);
-	/* No mandatory locks */
-	if (__mandatory_lock(inode) && fl->fl_type != F_UNLCK)
-		goto out_err;
 	if ((IS_SETLK(cmd) || IS_SETLKW(cmd)) && fl->fl_type != F_UNLCK) {
 		filemap_write_and_wait(inode->i_mapping);
 		invalidate_mapping_pages(&inode->i_data, 0, -1);
@@ -327,7 +319,6 @@ static int v9fs_file_lock_dotl(struct file *filp, int cmd, struct file_lock *fl)
 		ret = v9fs_file_getlock(filp, fl);
 	else
 		ret = -EINVAL;
-out_err:
 	return ret;
 }
@@ -348,10 +339,6 @@ static int v9fs_file_flock_dotl(struct file *filp, int cmd,
 	p9_debug(P9_DEBUG_VFS, "filp: %p cmd:%d lock: %p name: %pD\n",
 		 filp, cmd, fl, filp);
-	/* No mandatory locks */
-	if (__mandatory_lock(inode) && fl->fl_type != F_UNLCK)
-		goto out_err;
 	if (!(fl->fl_flags & FL_FLOCK))
 		goto out_err;
......
@@ -101,16 +101,6 @@ config FILE_LOCKING
 	  for filesystems like NFS and for the flock() system
 	  call. Disabling this option saves about 11k.
-config MANDATORY_FILE_LOCKING
-	bool "Enable Mandatory file locking"
-	depends on FILE_LOCKING
-	default y
-	help
-	  This option enables files appropriately marked files on appropriely
-	  mounted filesystems to support mandatory locking.
-	  To the best of my knowledge this is dead code that no one cares about.
 source "fs/crypto/Kconfig"
 source "fs/verity/Kconfig"
......
@@ -772,10 +772,6 @@ int afs_lock(struct file *file, int cmd, struct file_lock *fl)
 	       fl->fl_type, fl->fl_flags,
 	       (long long) fl->fl_start, (long long) fl->fl_end);
-	/* AFS doesn't support mandatory locks */
-	if (__mandatory_lock(&vnode->vfs_inode) && fl->fl_type != F_UNLCK)
-		return -ENOLCK;
 	if (IS_GETLK(cmd))
 		return afs_do_getlk(file, fl);
......
@@ -240,9 +240,6 @@ int ceph_lock(struct file *file, int cmd, struct file_lock *fl)
 	if (!(fl->fl_flags & FL_POSIX))
 		return -ENOLCK;
-	/* No mandatory locks */
-	if (__mandatory_lock(file->f_mapping->host) && fl->fl_type != F_UNLCK)
-		return -ENOLCK;
 	dout("ceph_lock, fl_owner: %p\n", fl->fl_owner);
......
@@ -150,7 +150,8 @@ void f_delown(struct file *filp)
 pid_t f_getown(struct file *filp)
 {
 	pid_t pid = 0;
-	read_lock(&filp->f_owner.lock);
+	read_lock_irq(&filp->f_owner.lock);
 	rcu_read_lock();
 	if (pid_task(filp->f_owner.pid, filp->f_owner.pid_type)) {
 		pid = pid_vnr(filp->f_owner.pid);
@@ -158,7 +159,7 @@ pid_t f_getown(struct file *filp)
 			pid = -pid;
 	}
 	rcu_read_unlock();
-	read_unlock(&filp->f_owner.lock);
+	read_unlock_irq(&filp->f_owner.lock);
 	return pid;
 }
@@ -208,7 +209,7 @@ static int f_getown_ex(struct file *filp, unsigned long arg)
 	struct f_owner_ex owner = {};
 	int ret = 0;
-	read_lock(&filp->f_owner.lock);
+	read_lock_irq(&filp->f_owner.lock);
 	rcu_read_lock();
 	if (pid_task(filp->f_owner.pid, filp->f_owner.pid_type))
 		owner.pid = pid_vnr(filp->f_owner.pid);
@@ -231,7 +232,7 @@ static int f_getown_ex(struct file *filp, unsigned long arg)
 		ret = -EINVAL;
 		break;
 	}
-	read_unlock(&filp->f_owner.lock);
+	read_unlock_irq(&filp->f_owner.lock);
 	if (!ret) {
 		ret = copy_to_user(owner_p, &owner, sizeof(owner));
@@ -249,10 +250,10 @@ static int f_getowner_uids(struct file *filp, unsigned long arg)
 	uid_t src[2];
 	int err;
-	read_lock(&filp->f_owner.lock);
+	read_lock_irq(&filp->f_owner.lock);
 	src[0] = from_kuid(user_ns, filp->f_owner.uid);
 	src[1] = from_kuid(user_ns, filp->f_owner.euid);
-	read_unlock(&filp->f_owner.lock);
+	read_unlock_irq(&filp->f_owner.lock);
 	err = put_user(src[0], &dst[0]);
 	err |= put_user(src[1], &dst[1]);
@@ -1003,13 +1004,14 @@ static void kill_fasync_rcu(struct fasync_struct *fa, int sig, int band)
 {
 	while (fa) {
 		struct fown_struct *fown;
+		unsigned long flags;
 		if (fa->magic != FASYNC_MAGIC) {
 			printk(KERN_ERR "kill_fasync: bad magic number in "
 			       "fasync_struct!\n");
 			return;
 		}
-		read_lock(&fa->fa_lock);
+		read_lock_irqsave(&fa->fa_lock, flags);
 		if (fa->fa_file) {
 			fown = &fa->fa_file->f_owner;
 			/* Don't send SIGURG to processes which have not set a
@@ -1018,7 +1020,7 @@ static void kill_fasync_rcu(struct fasync_struct *fa, int sig, int band)
 			if (!(sig == SIGURG && fown->signum == 0))
 				send_sigio(fown, fa->fa_fd, band);
 		}
-		read_unlock(&fa->fa_lock);
+		read_unlock_irqrestore(&fa->fa_lock, flags);
 		fa = rcu_dereference(fa->fa_next);
 	}
 }
......
@@ -1237,9 +1237,6 @@ static int gfs2_lock(struct file *file, int cmd, struct file_lock *fl)
 	if (!(fl->fl_flags & FL_POSIX))
 		return -ENOLCK;
-	if (__mandatory_lock(&ip->i_inode) && fl->fl_type != F_UNLCK)
-		return -ENOLCK;
 	if (cmd == F_CANCELLK) {
 		/* Hack: */
 		cmd = F_SETLK;
......
@@ -1397,103 +1397,6 @@ static int posix_lock_inode_wait(struct inode *inode, struct file_lock *fl)
 	return error;
 }
-#ifdef CONFIG_MANDATORY_FILE_LOCKING
-/**
- * locks_mandatory_locked - Check for an active lock
- * @file: the file to check
- *
- * Searches the inode's list of locks to find any POSIX locks which conflict.
- * This function is called from locks_verify_locked() only.
- */
-int locks_mandatory_locked(struct file *file)
-{
-	int ret;
-	struct inode *inode = locks_inode(file);
-	struct file_lock_context *ctx;
-	struct file_lock *fl;
-	ctx = smp_load_acquire(&inode->i_flctx);
-	if (!ctx || list_empty_careful(&ctx->flc_posix))
-		return 0;
-	/*
-	 * Search the lock list for this inode for any POSIX locks.
-	 */
-	spin_lock(&ctx->flc_lock);
-	ret = 0;
-	list_for_each_entry(fl, &ctx->flc_posix, fl_list) {
-		if (fl->fl_owner != current->files &&
-		    fl->fl_owner != file) {
-			ret = -EAGAIN;
-			break;
-		}
-	}
-	spin_unlock(&ctx->flc_lock);
-	return ret;
-}
-/**
- * locks_mandatory_area - Check for a conflicting lock
- * @inode: the file to check
- * @filp: how the file was opened (if it was)
- * @start: first byte in the file to check
- * @end: lastbyte in the file to check
- * @type: %F_WRLCK for a write lock, else %F_RDLCK
- *
- * Searches the inode's list of locks to find any POSIX locks which conflict.
- */
-int locks_mandatory_area(struct inode *inode, struct file *filp, loff_t start,
-			 loff_t end, unsigned char type)
-{
-	struct file_lock fl;
-	int error;
-	bool sleep = false;
-	locks_init_lock(&fl);
-	fl.fl_pid = current->tgid;
-	fl.fl_file = filp;
-	fl.fl_flags = FL_POSIX | FL_ACCESS;
-	if (filp && !(filp->f_flags & O_NONBLOCK))
-		sleep = true;
-	fl.fl_type = type;
-	fl.fl_start = start;
-	fl.fl_end = end;
-	for (;;) {
-		if (filp) {
-			fl.fl_owner = filp;
-			fl.fl_flags &= ~FL_SLEEP;
-			error = posix_lock_inode(inode, &fl, NULL);
-			if (!error)
-				break;
-		}
-		if (sleep)
-			fl.fl_flags |= FL_SLEEP;
-		fl.fl_owner = current->files;
-		error = posix_lock_inode(inode, &fl, NULL);
-		if (error != FILE_LOCK_DEFERRED)
-			break;
-		error = wait_event_interruptible(fl.fl_wait,
-					list_empty(&fl.fl_blocked_member));
-		if (!error) {
-			/*
-			 * If we've been sleeping someone might have
-			 * changed the permissions behind our back.
-			 */
-			if (__mandatory_lock(inode))
-				continue;
-		}
-		break;
-	}
-	locks_delete_block(&fl);
-	return error;
-}
-EXPORT_SYMBOL(locks_mandatory_area);
-#endif /* CONFIG_MANDATORY_FILE_LOCKING */
 static void lease_clear_pending(struct file_lock *fl, int arg)
 {
 	switch (arg) {
@@ -2486,14 +2389,6 @@ int fcntl_setlk(unsigned int fd, struct file *filp, unsigned int cmd,
 	if (file_lock == NULL)
 		return -ENOLCK;
-	/* Don't allow mandatory locks on files that may be memory mapped
-	 * and shared.
-	 */
-	if (mandatory_lock(inode) && mapping_writably_mapped(filp->f_mapping)) {
-		error = -EAGAIN;
-		goto out;
-	}
 	error = flock_to_posix_lock(filp, file_lock, flock);
 	if (error)
 		goto out;
@@ -2611,21 +2506,12 @@ int fcntl_setlk64(unsigned int fd, struct file *filp, unsigned int cmd,
 		struct flock64 *flock)
 {
 	struct file_lock *file_lock = locks_alloc_lock();
-	struct inode *inode = locks_inode(filp);
 	struct file *f;
 	int error;
 	if (file_lock == NULL)
 		return -ENOLCK;
-	/* Don't allow mandatory locks on files that may be memory mapped
-	 * and shared.
-	 */
-	if (mandatory_lock(inode) && mapping_writably_mapped(filp->f_mapping)) {
-		error = -EAGAIN;
-		goto out;
-	}
 	error = flock64_to_posix_lock(filp, file_lock, flock);
 	if (error)
 		goto out;
@@ -2857,8 +2743,7 @@ static void lock_get_status(struct seq_file *f, struct file_lock *fl,
 			seq_puts(f, "POSIX ");
 		seq_printf(f, " %s ",
-			     (inode == NULL) ? "*NOINODE*" :
-			     mandatory_lock(inode) ? "MANDATORY" : "ADVISORY ");
+			     (inode == NULL) ? "*NOINODE*" : "ADVISORY ");
 	} else if (IS_FLOCK(fl)) {
 		if (fl->fl_type & LOCK_MAND) {
 			seq_puts(f, "FLOCK MSNFS ");
......
@@ -3023,9 +3023,7 @@ static int handle_truncate(struct user_namespace *mnt_userns, struct file *filp)
 	/*
 	 * Refuse to truncate files with mandatory locks held on them.
 	 */
-	error = locks_verify_locked(filp);
-	if (!error)
-		error = security_path_truncate(path);
+	error = security_path_truncate(path);
 	if (!error) {
 		error = do_truncate(mnt_userns, path->dentry, 0,
 				    ATTR_MTIME|ATTR_CTIME|ATTR_OPEN,
......
@@ -1715,22 +1715,14 @@ static inline bool may_mount(void)
 	return ns_capable(current->nsproxy->mnt_ns->user_ns, CAP_SYS_ADMIN);
 }
-#ifdef CONFIG_MANDATORY_FILE_LOCKING
-static bool may_mandlock(void)
+static void warn_mandlock(void)
 {
-	pr_warn_once("======================================================\n"
-		     "WARNING: the mand mount option is being deprecated and\n"
-		     " will be removed in v5.15!\n"
-		     "======================================================\n");
-	return capable(CAP_SYS_ADMIN);
+	pr_warn_once("=======================================================\n"
+		     "WARNING: The mand mount option has been deprecated and\n"
+		     " and is ignored by this kernel. Remove the mand\n"
+		     " option from the mount to silence this warning.\n"
+		     "=======================================================\n");
 }
-#else
-static inline bool may_mandlock(void)
-{
-	pr_warn("VFS: \"mand\" mount option not supported");
-	return false;
-}
-#endif
 static int can_umount(const struct path *path, int flags)
 {
@@ -3197,8 +3189,8 @@ int path_mount(const char *dev_name, struct path *path,
 		return ret;
 	if (!may_mount())
 		return -EPERM;
-	if ((flags & SB_MANDLOCK) && !may_mandlock())
-		return -EPERM;
+	if (flags & SB_MANDLOCK)
+		warn_mandlock();
 	/* Default to relatime unless overriden */
 	if (!(flags & MS_NOATIME))
@@ -3581,9 +3573,8 @@ SYSCALL_DEFINE3(fsmount, int, fs_fd, unsigned int, flags,
 	if (fc->phase != FS_CONTEXT_AWAITING_MOUNT)
 		goto err_unlock;
-	ret = -EPERM;
-	if ((fc->sb_flags & SB_MANDLOCK) && !may_mandlock())
-		goto err_unlock;
+	if (fc->sb_flags & SB_MANDLOCK)
+		warn_mandlock();
 	newmount.mnt = vfs_create_mount(fc);
 	if (IS_ERR(newmount.mnt)) {
......
@@ -806,10 +806,6 @@ int nfs_lock(struct file *filp, int cmd, struct file_lock *fl)
 	nfs_inc_stats(inode, NFSIOS_VFSLOCK);
-	/* No mandatory locks over NFS */
-	if (__mandatory_lock(inode) && fl->fl_type != F_UNLCK)
-		goto out_err;
 	if (NFS_SERVER(inode)->flags & NFS_MOUNT_LOCAL_FCNTL)
 		is_local = 1;
......
@@ -5735,16 +5735,6 @@ check_special_stateids(struct net *net, svc_fh *current_fh, stateid_t *stateid,
 					   NFS4_SHARE_DENY_READ);
 }
-/*
- * Allow READ/WRITE during grace period on recovered state only for files
- * that are not able to provide mandatory locking.
- */
-static inline int
-grace_disallows_io(struct net *net, struct inode *inode)
-{
-	return opens_in_grace(net) && mandatory_lock(inode);
-}
 static __be32 check_stateid_generation(stateid_t *in, stateid_t *ref, bool has_session)
 {
 	/*
@@ -6026,7 +6016,6 @@ nfs4_preprocess_stateid_op(struct svc_rqst *rqstp,
 		stateid_t *stateid, int flags, struct nfsd_file **nfp,
 		struct nfs4_stid **cstid)
 {
-	struct inode *ino = d_inode(fhp->fh_dentry);
 	struct net *net = SVC_NET(rqstp);
 	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
 	struct nfs4_stid *s = NULL;
@@ -6035,9 +6024,6 @@ nfs4_preprocess_stateid_op(struct svc_rqst *rqstp,
 	if (nfp)
 		*nfp = NULL;
-	if (grace_disallows_io(net, ino))
-		return nfserr_grace;
 	if (ZERO_STATEID(stateid) || ONE_STATEID(stateid)) {
 		status = check_special_stateids(net, fhp, stateid, flags);
 		goto done;
......
@@ -333,7 +333,6 @@ nfsd_get_write_access(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		struct iattr *iap)
 {
 	struct inode *inode = d_inode(fhp->fh_dentry);
-	int host_err;
 	if (iap->ia_size < inode->i_size) {
 		__be32 err;
@@ -343,20 +342,7 @@ nfsd_get_write_access(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		if (err)
 			return err;
 	}
-	host_err = get_write_access(inode);
-	if (host_err)
-		goto out_nfserrno;
-	host_err = locks_verify_truncate(inode, NULL, iap->ia_size);
-	if (host_err)
-		goto out_put_write_access;
-	return 0;
-out_put_write_access:
-	put_write_access(inode);
-out_nfserrno:
-	return nfserrno(host_err);
+	return nfserrno(get_write_access(inode));
 }
 /*
@@ -750,13 +736,6 @@ __nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type,
 	err = nfserr_perm;
 	if (IS_APPEND(inode) && (may_flags & NFSD_MAY_WRITE))
 		goto out;
-	/*
-	 * We must ignore files (but only files) which might have mandatory
-	 * locks on them because there is no way to know if the accesser has
-	 * the lock.
-	 */
-	if (S_ISREG((inode)->i_mode) && mandatory_lock(inode))
-		goto out;
 	if (!inode->i_fop)
 		goto out;
......
@@ -101,8 +101,6 @@ int ocfs2_flock(struct file *file, int cmd, struct file_lock *fl)
 	if (!(fl->fl_flags & FL_FLOCK))
 		return -ENOLCK;
-	if (__mandatory_lock(inode))
-		return -ENOLCK;
 	if ((osb->s_mount_opt & OCFS2_MOUNT_LOCALFLOCKS) ||
 	    ocfs2_mount_local(osb))
@@ -121,8 +119,6 @@ int ocfs2_lock(struct file *file, int cmd, struct file_lock *fl)
 	if (!(fl->fl_flags & FL_POSIX))
 		return -ENOLCK;
-	if (__mandatory_lock(inode) && fl->fl_type != F_UNLCK)
-		return -ENOLCK;
 	return ocfs2_plock(osb->cconn, OCFS2_I(inode)->ip_blkno, file, cmd, fl);
 }
@@ -105,9 +105,7 @@ long vfs_truncate(const struct path *path, loff_t length)
 	if (error)
 		goto put_write_and_out;
-	error = locks_verify_truncate(inode, NULL, length);
-	if (!error)
-		error = security_path_truncate(path);
+	error = security_path_truncate(path);
 	if (!error)
 		error = do_truncate(mnt_userns, path->dentry, length, 0, NULL);
@@ -189,9 +187,7 @@ long do_sys_ftruncate(unsigned int fd, loff_t length, int small)
 	if (IS_APPEND(file_inode(f.file)))
 		goto out_putf;
 	sb_start_write(inode->i_sb);
-	error = locks_verify_truncate(inode, f.file, length);
-	if (!error)
-		error = security_path_truncate(&f.file->f_path);
+	error = security_path_truncate(&f.file->f_path);
 	if (!error)
 		error = do_truncate(file_mnt_user_ns(f.file), dentry, length,
 				    ATTR_MTIME | ATTR_CTIME, f.file);
......
@@ -365,12 +365,8 @@ SYSCALL_DEFINE5(llseek, unsigned int, fd, unsigned long, offset_high,
 int rw_verify_area(int read_write, struct file *file, const loff_t *ppos, size_t count)
 {
-	struct inode *inode;
-	int retval = -EINVAL;
-	inode = file_inode(file);
 	if (unlikely((ssize_t) count < 0))
-		return retval;
+		return -EINVAL;
 	/*
 	 * ranged mandatory locking does not apply to streams - it makes sense
@@ -381,19 +377,12 @@ int rw_verify_area(int read_write, struct file *file, const loff_t *ppos, size_t
 		if (unlikely(pos < 0)) {
 			if (!unsigned_offsets(file))
-				return retval;
+				return -EINVAL;
 			if (count >= -pos) /* both values are in 0..LLONG_MAX */
 				return -EOVERFLOW;
 		} else if (unlikely((loff_t) (pos + count) < 0)) {
 			if (!unsigned_offsets(file))
-				return retval;
+				return -EINVAL;
-		}
-		if (unlikely(inode->i_flctx && mandatory_lock(inode))) {
-			retval = locks_mandatory_area(inode, file, pos, pos + count - 1,
-					read_write == READ ? F_RDLCK : F_WRLCK);
-			if (retval < 0)
-				return retval;
 		}
 	}
......
@@ -99,24 +99,12 @@ static int generic_remap_checks(struct file *file_in, loff_t pos_in,
 static int remap_verify_area(struct file *file, loff_t pos, loff_t len,
 			     bool write)
 {
-	struct inode *inode = file_inode(file);
 	if (unlikely(pos < 0 || len < 0))
 		return -EINVAL;
 	if (unlikely((loff_t) (pos + len) < 0))
 		return -EINVAL;
-	if (unlikely(inode->i_flctx && mandatory_lock(inode))) {
-		loff_t end = len ? pos + len - 1 : OFFSET_MAX;
-		int retval;
-		retval = locks_mandatory_area(inode, file, pos, end,
-				write ? F_WRLCK : F_RDLCK);
-		if (retval < 0)
-			return retval;
-	}
 	return security_file_permission(file, write ? MAY_WRITE : MAY_READ);
 }
......
@@ -2612,90 +2612,6 @@ extern struct kobject *fs_kobj;
 #define MAX_RW_COUNT (INT_MAX & PAGE_MASK)
-#ifdef CONFIG_MANDATORY_FILE_LOCKING
-extern int locks_mandatory_locked(struct file *);
-extern int locks_mandatory_area(struct inode *, struct file *, loff_t, loff_t, unsigned char);
-/*
- * Candidates for mandatory locking have the setgid bit set
- * but no group execute bit - an otherwise meaningless combination.
- */
-static inline int __mandatory_lock(struct inode *ino)
-{
-	return (ino->i_mode & (S_ISGID | S_IXGRP)) == S_ISGID;
-}
-/*
- * ... and these candidates should be on SB_MANDLOCK mounted fs,
- * otherwise these will be advisory locks
- */
-static inline int mandatory_lock(struct inode *ino)
-{
-	return IS_MANDLOCK(ino) && __mandatory_lock(ino);
-}
-static inline int locks_verify_locked(struct file *file)
-{
-	if (mandatory_lock(locks_inode(file)))
-		return locks_mandatory_locked(file);
-	return 0;
-}
-static inline int locks_verify_truncate(struct inode *inode,
-				    struct file *f,
-				    loff_t size)
-{
-	if (!inode->i_flctx || !mandatory_lock(inode))
-		return 0;
-	if (size < inode->i_size) {
-		return locks_mandatory_area(inode, f, size, inode->i_size - 1,
-				F_WRLCK);
-	} else {
-		return locks_mandatory_area(inode, f, inode->i_size, size - 1,
-				F_WRLCK);
-	}
-}
-#else /* !CONFIG_MANDATORY_FILE_LOCKING */
-static inline int locks_mandatory_locked(struct file *file)
-{
-	return 0;
-}
-static inline int locks_mandatory_area(struct inode *inode, struct file *filp,
-				       loff_t start, loff_t end, unsigned char type)
-{
-	return 0;
-}
-static inline int __mandatory_lock(struct inode *inode)
-{
-	return 0;
-}
-static inline int mandatory_lock(struct inode *inode)
-{
-	return 0;
-}
-static inline int locks_verify_locked(struct file *file)
-{
-	return 0;
-}
-static inline int locks_verify_truncate(struct inode *inode, struct file *filp,
-					size_t size)
-{
-	return 0;
-}
-#endif /* CONFIG_MANDATORY_FILE_LOCKING */
 #ifdef CONFIG_FILE_LOCKING
 static inline int break_lease(struct inode *inode, unsigned int mode)
 {
......
@@ -1517,12 +1517,6 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 			if (IS_APPEND(inode) && (file->f_mode & FMODE_WRITE))
 				return -EACCES;
-			/*
-			 * Make sure there are no mandatory locks on the file.
-			 */
-			if (locks_verify_locked(file))
-				return -EAGAIN;
 			vm_flags |= VM_SHARED | VM_MAYSHARE;
 			if (!(file->f_mode & FMODE_WRITE))
 				vm_flags &= ~(VM_MAYWRITE | VM_SHARED);
......
@@ -826,9 +826,6 @@ static int validate_mmap_request(struct file *file,
 			    (file->f_mode & FMODE_WRITE))
 				return -EACCES;
-			if (locks_verify_locked(file))
-				return -EAGAIN;
 			if (!(capabilities & NOMMU_MAP_DIRECT))
 				return -ENODEV;
......