    io_uring: fix recursive completion locking on overflow flush · 7271ef3a
    Jens Axboe authored
    syzbot reports a scenario where we recurse on the completion lock
    when flushing an overflow:
    
    1 lock held by syz-executor287/6816:
     #0: ffff888093cdb4d8 (&ctx->completion_lock){....}-{2:2}, at: io_cqring_overflow_flush+0xc6/0xab0 fs/io_uring.c:1333
    
    stack backtrace:
    CPU: 1 PID: 6816 Comm: syz-executor287 Not tainted 5.8.0-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
     __dump_stack lib/dump_stack.c:77 [inline]
     dump_stack+0x1f0/0x31e lib/dump_stack.c:118
     print_deadlock_bug kernel/locking/lockdep.c:2391 [inline]
     check_deadlock kernel/locking/lockdep.c:2432 [inline]
     validate_chain+0x69a4/0x88a0 kernel/locking/lockdep.c:3202
     __lock_acquire+0x1161/0x2ab0 kernel/locking/lockdep.c:4426
     lock_acquire+0x160/0x730 kernel/locking/lockdep.c:5005
     __raw_spin_lock_irq include/linux/spinlock_api_smp.h:128 [inline]
     _raw_spin_lock_irq+0x67/0x80 kernel/locking/spinlock.c:167
     spin_lock_irq include/linux/spinlock.h:379 [inline]
     io_queue_linked_timeout fs/io_uring.c:5928 [inline]
     __io_queue_async_work fs/io_uring.c:1192 [inline]
     __io_queue_deferred+0x36a/0x790 fs/io_uring.c:1237
     io_cqring_overflow_flush+0x774/0xab0 fs/io_uring.c:1359
     io_ring_ctx_wait_and_kill+0x2a1/0x570 fs/io_uring.c:7808
     io_uring_release+0x59/0x70 fs/io_uring.c:7829
     __fput+0x34f/0x7b0 fs/file_table.c:281
     task_work_run+0x137/0x1c0 kernel/task_work.c:135
     exit_task_work include/linux/task_work.h:25 [inline]
     do_exit+0x5f3/0x1f20 kernel/exit.c:806
     do_group_exit+0x161/0x2d0 kernel/exit.c:903
     __do_sys_exit_group+0x13/0x20 kernel/exit.c:914
     __se_sys_exit_group+0x10/0x10 kernel/exit.c:912
     __x64_sys_exit_group+0x37/0x40 kernel/exit.c:912
     do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
     entry_SYSCALL_64_after_hwframe+0x44/0xa9
    
    Fix this by passing back the link from __io_queue_async_work(), and
    then let the caller handle the queueing of the link. Take care to also
    punt the submission reference put to the caller, as we're holding the
    completion lock for the __io_queue_deferred() case. Hence we need to
    mark the io_kiocb appropriately for that case.
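
    The shape of the fix can be illustrated outside the kernel. The
    following is a minimal userspace sketch, not the io_uring code: it
    uses a POSIX error-checking mutex in place of the completion lock, and
    every name in it (struct ctx, struct req, queue_async_work(),
    queue_linked_timeout(), flush_recursive(), flush_fixed(), demo()) is a
    hypothetical stand-in. It assumes the caller handles the returned link
    once the lock has been dropped, which is only one possible way for a
    caller to handle it.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the ring context and a request. */
struct ctx {
	pthread_mutex_t completion_lock;
	int timeouts_armed;
};

struct req {
	struct req *link;	/* linked timeout, or NULL */
};

/* Use an error-checking mutex so a recursive lock reports EDEADLK
 * instead of hanging; this plays the role lockdep plays in the report. */
static int ctx_init(struct ctx *c)
{
	pthread_mutexattr_t attr;
	int ret = pthread_mutexattr_init(&attr);

	if (ret)
		return ret;
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	ret = pthread_mutex_init(&c->completion_lock, &attr);
	pthread_mutexattr_destroy(&attr);
	c->timeouts_armed = 0;
	return ret;
}

/* Takes the completion lock itself, so it must not be called with the
 * lock already held. */
static int queue_linked_timeout(struct ctx *c, struct req *link)
{
	int ret = pthread_mutex_lock(&c->completion_lock);

	(void)link;
	if (ret)
		return ret;	/* EDEADLK when the caller holds the lock */
	c->timeouts_armed++;
	pthread_mutex_unlock(&c->completion_lock);
	return 0;
}

/* Queue the request and hand the link back instead of arming it here. */
static struct req *queue_async_work(struct ctx *c, struct req *r)
{
	(void)c;
	return r->link;
}

/* Problematic shape: the link is armed while the lock is still held. */
static int flush_recursive(struct ctx *c, struct req *r)
{
	struct req *link;
	int ret = 0;

	pthread_mutex_lock(&c->completion_lock);
	link = queue_async_work(c, r);
	if (link)
		ret = queue_linked_timeout(c, link);
	pthread_mutex_unlock(&c->completion_lock);
	return ret;
}

/* Fixed shape: the caller arms the returned link after unlocking. */
static int flush_fixed(struct ctx *c, struct req *r)
{
	struct req *link;

	pthread_mutex_lock(&c->completion_lock);
	link = queue_async_work(c, r);
	pthread_mutex_unlock(&c->completion_lock);
	return link ? queue_linked_timeout(c, link) : 0;
}

/* Returns 0 when both shapes behave as described above. */
int demo(void)
{
	struct ctx c;
	struct req timeout = { .link = NULL };
	struct req r = { .link = &timeout };

	if (ctx_init(&c))
		return -1;
	if (flush_recursive(&c, &r) != EDEADLK)
		return -1;
	if (flush_fixed(&c, &r) != 0)
		return -1;
	return c.timeouts_armed == 1 ? 0 : -1;
}
```

    Build with -pthread. In this sketch flush_recursive() fails with
    EDEADLK because the helper tries to retake a lock its caller already
    holds, while flush_fixed() succeeds because the helper only runs once
    the lock has been released.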
    
    Reported-by: syzbot+996f91b6ec3812c48042@syzkaller.appspotmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>