Commit 0f8821da authored by Sebastian Andrzej Siewior, committed by Christian Brauner

fs/namespace: Boost the mount_lock.lock owner instead of spinning on PREEMPT_RT.

The MNT_WRITE_HOLD flag is used to hold back any new writers while the
mount point is about to be made read-only. __mnt_want_write() then loops
with disabled preemption until this flag disappears. Callers of
mnt_hold_writers() (which sets the flag) hold the spinlock_t side of
mount_lock (a seqlock_t), which disables preemption on !PREEMPT_RT and
ensures the flag setter is not scheduled away, so the spinning side only
spins for a short time.
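
The setter side referenced here is mnt_hold_writers() and its
counterpart mnt_unhold_writers(), both of which run under
lock_mount_hash(), i.e. with mount_lock held. A condensed sketch of
these helpers, assuming the current fs/namespace.c definitions (comments
abridged):

static inline void mnt_hold_writers(struct mount *mnt)
{
	mnt->mnt.mnt_flags |= MNT_WRITE_HOLD;
	/* Make MNT_WRITE_HOLD visible before the write counters are read. */
	smp_mb();
}

static inline void mnt_unhold_writers(struct mount *mnt)
{
	/* MNT_READONLY must become visible before MNT_WRITE_HOLD is cleared. */
	smp_wmb();
	mnt->mnt.mnt_flags &= ~MNT_WRITE_HOLD;
}

Because the flag is only ever set and cleared while mount_lock is held,
anyone who manages to acquire mount_lock::lock is guaranteed to observe
MNT_WRITE_HOLD cleared afterwards; the fix below relies on exactly that.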

On PREEMPT_RT the spinlock_t does not disable preemption, so it is
possible that the task setting MNT_WRITE_HOLD is preempted by a task
with higher priority, which then spins indefinitely waiting for
MNT_WRITE_HOLD to be cleared.

Instead of spinning, acquire mount_lock::lock, which is held by the
setter of MNT_WRITE_HOLD. This PI-boosts the lock owner and blocks until
the lock is dropped, which in turn means that MNT_WRITE_HOLD has been
cleared again.

Link: https://lore.kernel.org/r/20211025152218.opvcqfku2lhqvp4o@linutronix.de
Link: https://lore.kernel.org/r/20211125120711.dgbsienyrsxfzpoi@linutronix.de
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
parent 13605725
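
For reference, the lock_mount_hash()/unlock_mount_hash() pair used in
the hunk below takes and releases the write side of mount_lock, and with
it mount_lock.lock; condensed from the helpers in fs/mount.h:

static inline void lock_mount_hash(void)
{
	write_seqlock(&mount_lock);
}

static inline void unlock_mount_hash(void)
{
	write_sequnlock(&mount_lock);
}

On PREEMPT_RT the spinlock_t embedded in the seqlock_t is an
rt_mutex-based sleeping lock, so acquiring and immediately releasing it
both PI-boosts the current owner and blocks until the owner has dropped
the lock.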
fs/namespace.c
@@ -343,8 +343,24 @@ int __mnt_want_write(struct vfsmount *m)
 	 * incremented count after it has set MNT_WRITE_HOLD.
 	 */
 	smp_mb();
-	while (READ_ONCE(mnt->mnt.mnt_flags) & MNT_WRITE_HOLD)
-		cpu_relax();
+	might_lock(&mount_lock.lock);
+	while (READ_ONCE(mnt->mnt.mnt_flags) & MNT_WRITE_HOLD) {
+		if (!IS_ENABLED(CONFIG_PREEMPT_RT)) {
+			cpu_relax();
+		} else {
+			/*
+			 * This prevents priority inversion if the task
+			 * setting MNT_WRITE_HOLD got preempted on a remote
+			 * CPU, and it prevents live lock if the task setting
+			 * MNT_WRITE_HOLD has a lower priority and is bound to
+			 * the same CPU as the task that is spinning here.
+			 */
+			preempt_enable();
+			lock_mount_hash();
+			unlock_mount_hash();
+			preempt_disable();
+		}
+	}
 	/*
 	 * After the slowpath clears MNT_WRITE_HOLD, mnt_is_readonly will
 	 * be set to match its requirements. So we must not load that until