• Jason Gunthorpe's avatar
    iommufd: IOMMUFD_DESTROY should not increase the refcount · 99f98a7c
    Jason Gunthorpe authored
    syzkaller found a race where IOMMUFD_DESTROY increments the refcount:
    
           obj = iommufd_get_object(ucmd->ictx, cmd->id, IOMMUFD_OBJ_ANY);
           if (IS_ERR(obj))
                   return PTR_ERR(obj);
           iommufd_ref_to_users(obj);
           /* See iommufd_ref_to_users() */
           if (!iommufd_object_destroy_user(ucmd->ictx, obj))
    
    As part of the sequence to join the two existing primitives together.
    
    Allowing the refcount the be elevated without holding the destroy_rwsem
    violates the assumption that all temporary refcount elevations are
    protected by destroy_rwsem. Racing IOMMUFD_DESTROY with
    iommufd_object_destroy_user() will cause spurious failures:
    
      WARNING: CPU: 0 PID: 3076 at drivers/iommu/iommufd/device.c:477 iommufd_access_destroy+0x18/0x20 drivers/iommu/iommufd/device.c:478
      Modules linked in:
      CPU: 0 PID: 3076 Comm: syz-executor.0 Not tainted 6.3.0-rc1-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/03/2023
      RIP: 0010:iommufd_access_destroy+0x18/0x20 drivers/iommu/iommufd/device.c:477
      Code: e8 3d 4e 00 00 84 c0 74 01 c3 0f 0b c3 0f 1f 44 00 00 f3 0f 1e fa 48 89 fe 48 8b bf a8 00 00 00 e8 1d 4e 00 00 84 c0 74 01 c3 <0f> 0b c3 0f 1f 44 00 00 41 57 41 56 41 55 4c 8d ae d0 00 00 00 41
      RSP: 0018:ffffc90003067e08 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffff888109ea0300 RCX: 0000000000000000
      RDX: 0000000000000001 RSI: 0000000000000000 RDI: 00000000ffffffff
      RBP: 0000000000000004 R08: 0000000000000000 R09: ffff88810bbb3500
      R10: ffff88810bbb3e48 R11: 0000000000000000 R12: ffffc90003067e88
      R13: ffffc90003067ea8 R14: ffff888101249800 R15: 00000000fffffffe
      FS:  00007ff7254fe6c0(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000555557262da8 CR3: 000000010a6fd000 CR4: 0000000000350ef0
      Call Trace:
       <TASK>
       iommufd_test_create_access drivers/iommu/iommufd/selftest.c:596 [inline]
       iommufd_test+0x71c/0xcf0 drivers/iommu/iommufd/selftest.c:813
       iommufd_fops_ioctl+0x10f/0x1b0 drivers/iommu/iommufd/main.c:337
       vfs_ioctl fs/ioctl.c:51 [inline]
       __do_sys_ioctl fs/ioctl.c:870 [inline]
       __se_sys_ioctl fs/ioctl.c:856 [inline]
       __x64_sys_ioctl+0x84/0xc0 fs/ioctl.c:856
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x38/0x80 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
    
    The solution is to not increment the refcount on the IOMMUFD_DESTROY path
    at all. Instead use the xa_lock to serialize everything. The refcount
    check == 1 and xa_erase can be done under a single critical region. This
    avoids the need for any refcount incrementing.
    
    It has the downside that if userspace races destroy with other operations
    it will get an EBUSY instead of waiting, but this is kind of racing is
    already dangerous.
    
    Fixes: 2ff4bed7 ("iommufd: File descriptor, context, kconfig and makefiles")
    Link: https://lore.kernel.org/r/2-v1-85aacb2af554+bc-iommufd_syz3_jgg@nvidia.comReviewed-by: default avatarKevin Tian <kevin.tian@intel.com>
    Reported-by: syzbot+7574ebfe589049630608@syzkaller.appspotmail.com
    Signed-off-by: default avatarJason Gunthorpe <jgg@nvidia.com>
    99f98a7c
iommufd_private.h 11.3 KB