Commit cefdc26e authored by Martin Brandenburg's avatar Martin Brandenburg Committed by Linus Torvalds

orangefs: move features validation to fix filesystem hang

Without this fix (and another to the userspace component itself
described later), the kernel will be unable to process any OrangeFS
requests after the userspace component is restarted (due to a crash or
at the administrator's behest).

The bug here is that inside orangefs_remount, the orangefs_request_mutex
is locked.  When the userspace component restarts while the filesystem
is mounted, it sends a ORANGEFS_DEV_REMOUNT_ALL ioctl to the device,
which causes the kernel to send it a few requests aimed at synchronizing
the state between the two.  While this is happening the
orangefs_request_mutex is locked to prevent any other requests going
through.

This is only half of the bugfix.  The other half is in the userspace
component which outright ignores(!) requests made before it considers
the filesystem remounted, which is after the ioctl returns.  Of course
the ioctl doesn't return until after the userspace component responds to
the request it ignores.  The userspace component has been changed to
allow ORANGEFS_VFS_OP_FEATURES regardless of the mount status.

Mike Marshall says:
 "I've tested this patch against the fixed userspace part. This patch is
  real important, I hope it can make it into 4.11...

  Here's what happens when the userspace daemon is restarted, without
  the patch:

    =============================================
    [ INFO: possible recursive locking detected ]
    [   4.10.0-00007-ge98bdb30 #1 Not tainted    ]
    ---------------------------------------------
    pvfs2-client-co/29032 is trying to acquire lock:
     (orangefs_request_mutex){+.+.+.}, at: service_operation+0x3c7/0x7b0 [orangefs]
                  but task is already holding lock:
     (orangefs_request_mutex){+.+.+.}, at: dispatch_ioctl_command+0x1bf/0x330 [orangefs]

    CPU: 0 PID: 29032 Comm: pvfs2-client-co Not tainted 4.10.0-00007-ge98bdb30 #1
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014
    Call Trace:
     __lock_acquire+0x7eb/0x1290
     lock_acquire+0xe8/0x1d0
     mutex_lock_killable_nested+0x6f/0x6e0
     service_operation+0x3c7/0x7b0 [orangefs]
     orangefs_remount+0xea/0x150 [orangefs]
     dispatch_ioctl_command+0x227/0x330 [orangefs]
     orangefs_devreq_ioctl+0x29/0x70 [orangefs]
     do_vfs_ioctl+0xa3/0x6e0
     SyS_ioctl+0x79/0x90"
Signed-off-by: default avatarMartin Brandenburg <martin@omnibond.com>
Acked-by: default avatarMike Marshall <hubcap@omnibond.com>
Cc: stable@vger.kernel.org
Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
parent c2eb7bea
......@@ -263,8 +263,13 @@ int orangefs_remount(struct orangefs_sb_info_s *orangefs_sb)
if (!new_op)
return -ENOMEM;
new_op->upcall.req.features.features = 0;
ret = service_operation(new_op, "orangefs_features", 0);
orangefs_features = new_op->downcall.resp.features.features;
ret = service_operation(new_op, "orangefs_features",
ORANGEFS_OP_PRIORITY | ORANGEFS_OP_NO_MUTEX);
if (!ret)
orangefs_features =
new_op->downcall.resp.features.features;
else
orangefs_features = 0;
op_release(new_op);
} else {
orangefs_features = 0;
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment