1. 16 Mar, 2022 1 commit
  2. 15 Mar, 2022 1 commit
  3. 14 Mar, 2022 1 commit
  4. 09 Mar, 2022 1 commit
  5. 28 Feb, 2022 1 commit
    • blktrace: fix use after free for struct blk_trace · 30939293
      Yu Kuai authored
      When tracing the whole disk, 'dropped' and 'msg' will be created
      under 'q->debugfs_dir' and 'bt->dir' is NULL, thus blk_trace_free()
      won't remove those files. What's worse, the following UAF can be
      triggered because of accessing stale 'dropped' and 'msg':
      
      ==================================================================
      BUG: KASAN: use-after-free in blk_dropped_read+0x89/0x100
      Read of size 4 at addr ffff88816912f3d8 by task blktrace/1188
      
      CPU: 27 PID: 1188 Comm: blktrace Not tainted 5.17.0-rc4-next-20220217+ #469
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-4
      Call Trace:
       <TASK>
       dump_stack_lvl+0x34/0x44
       print_address_description.constprop.0.cold+0xab/0x381
       ? blk_dropped_read+0x89/0x100
       ? blk_dropped_read+0x89/0x100
       kasan_report.cold+0x83/0xdf
       ? blk_dropped_read+0x89/0x100
       kasan_check_range+0x140/0x1b0
       blk_dropped_read+0x89/0x100
       ? blk_create_buf_file_callback+0x20/0x20
       ? kmem_cache_free+0xa1/0x500
       ? do_sys_openat2+0x258/0x460
       full_proxy_read+0x8f/0xc0
       vfs_read+0xc6/0x260
       ksys_read+0xb9/0x150
       ? vfs_write+0x3d0/0x3d0
       ? fpregs_assert_state_consistent+0x55/0x60
       ? exit_to_user_mode_prepare+0x39/0x1e0
       do_syscall_64+0x35/0x80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      RIP: 0033:0x7fbc080d92fd
      Code: ce 20 00 00 75 10 b8 00 00 00 00 0f 05 48 3d 01 f0 ff ff 73 31 c3 48 83 1
      RSP: 002b:00007fbb95ff9cb0 EFLAGS: 00000293 ORIG_RAX: 0000000000000000
      RAX: ffffffffffffffda RBX: 00007fbb95ff9dc0 RCX: 00007fbc080d92fd
      RDX: 0000000000000100 RSI: 00007fbb95ff9cc0 RDI: 0000000000000045
      RBP: 0000000000000045 R08: 0000000000406299 R09: 00000000fffffffd
      R10: 000000000153afa0 R11: 0000000000000293 R12: 00007fbb780008c0
      R13: 00007fbb78000938 R14: 0000000000608b30 R15: 00007fbb780029c8
       </TASK>
      
      Allocated by task 1050:
       kasan_save_stack+0x1e/0x40
       __kasan_kmalloc+0x81/0xa0
       do_blk_trace_setup+0xcb/0x410
       __blk_trace_setup+0xac/0x130
       blk_trace_ioctl+0xe9/0x1c0
       blkdev_ioctl+0xf1/0x390
       __x64_sys_ioctl+0xa5/0xe0
       do_syscall_64+0x35/0x80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Freed by task 1050:
       kasan_save_stack+0x1e/0x40
       kasan_set_track+0x21/0x30
       kasan_set_free_info+0x20/0x30
       __kasan_slab_free+0x103/0x180
       kfree+0x9a/0x4c0
       __blk_trace_remove+0x53/0x70
       blk_trace_ioctl+0x199/0x1c0
       blkdev_common_ioctl+0x5e9/0xb30
       blkdev_ioctl+0x1a5/0x390
       __x64_sys_ioctl+0xa5/0xe0
       do_syscall_64+0x35/0x80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      The buggy address belongs to the object at ffff88816912f380
       which belongs to the cache kmalloc-96 of size 96
      The buggy address is located 88 bytes inside of
       96-byte region [ffff88816912f380, ffff88816912f3e0)
      The buggy address belongs to the page:
      page:000000009a1b4e7c refcount:1 mapcount:0 mapping:0000000000000000 index:0x0f
      flags: 0x17ffffc0000200(slab|node=0|zone=2|lastcpupid=0x1fffff)
      raw: 0017ffffc0000200 ffffea00044f1100 dead000000000002 ffff88810004c780
      raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff88816912f280: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
       ffff88816912f300: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
      >ffff88816912f380: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
                                                          ^
       ffff88816912f400: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
       ffff88816912f480: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
      ==================================================================
      
      Fixes: c0ea5760 ("blktrace: remove debugfs file dentries from struct blk_trace")
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Link: https://lore.kernel.org/r/20220228034354.4047385-1-yukuai3@huawei.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  6. 24 Feb, 2022 1 commit
  7. 23 Feb, 2022 3 commits
  8. 22 Feb, 2022 1 commit
  9. 17 Feb, 2022 3 commits
  10. 11 Feb, 2022 3 commits
  11. 10 Feb, 2022 1 commit
  12. 09 Feb, 2022 2 commits
  13. 04 Feb, 2022 1 commit
  14. 03 Feb, 2022 3 commits
  15. 02 Feb, 2022 5 commits
    • md: fix NULL pointer deref with nowait but no mddev->queue · 0f9650bd
      Song Liu authored
      Leon reported NULL pointer deref with nowait support:
      
      [   15.123761] device-mapper: raid: Loading target version 1.15.1
      [   15.124185] device-mapper: raid: Ignoring chunk size parameter for RAID 1
      [   15.124192] device-mapper: raid: Choosing default region size of 4MiB
      [   15.129524] BUG: kernel NULL pointer dereference, address: 0000000000000060
      [   15.129530] #PF: supervisor write access in kernel mode
      [   15.129533] #PF: error_code(0x0002) - not-present page
      [   15.129535] PGD 0 P4D 0
      [   15.129538] Oops: 0002 [#1] PREEMPT SMP NOPTI
      [   15.129541] CPU: 5 PID: 494 Comm: ldmtool Not tainted 5.17.0-rc2-1-mainline #1 9fe89d43dfcb215d2731e6f8851740520778615e
      [   15.129546] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE/X570 AORUS ELITE, BIOS F36e 10/14/2021
      [   15.129549] RIP: 0010:blk_queue_flag_set+0x7/0x20
      [   15.129555] Code: 00 00 00 0f 1f 44 00 00 48 8b 35 e4 e0 04 02 48 8d 57 28 bf 40 01 \
             00 00 e9 16 c1 be ff 66 0f 1f 44 00 00 0f 1f 44 00 00 89 ff <f0> 48 0f ab 7e 60 \
             31 f6 89 f7 c3 66 66 2e 0f 1f 84 00 00 00 00 00
      [   15.129559] RSP: 0018:ffff966b81987a88 EFLAGS: 00010202
      [   15.129562] RAX: ffff8b11c363a0d0 RBX: ffff8b11e294b070 RCX: 0000000000000000
      [   15.129564] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000000001d
      [   15.129566] RBP: ffff8b11e294b058 R08: 0000000000000000 R09: 0000000000000000
      [   15.129568] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8b11e294b070
      [   15.129570] R13: 0000000000000000 R14: ffff8b11e294b000 R15: 0000000000000001
      [   15.129572] FS:  00007fa96e826780(0000) GS:ffff8b18deb40000(0000) knlGS:0000000000000000
      [   15.129575] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   15.129577] CR2: 0000000000000060 CR3: 000000010b8ce000 CR4: 00000000003506e0
      [   15.129580] Call Trace:
      [   15.129582]  <TASK>
      [   15.129584]  md_run+0x67c/0xc70 [md_mod 1e470c1b6bcf1114198109f42682f5a2740e9531]
      [   15.129597]  raid_ctr+0x134a/0x28ea [dm_raid 6a645dd7519e72834bd7e98c23497eeade14cd63]
      [   15.129604]  ? dm_split_args+0x63/0x150 [dm_mod 0d7b0bc3414340a79c4553bae5ca97294b78336e]
      [   15.129615]  dm_table_add_target+0x188/0x380 [dm_mod 0d7b0bc3414340a79c4553bae5ca97294b78336e]
      [   15.129625]  table_load+0x13b/0x370 [dm_mod 0d7b0bc3414340a79c4553bae5ca97294b78336e]
      [   15.129635]  ? dev_suspend+0x2d0/0x2d0 [dm_mod 0d7b0bc3414340a79c4553bae5ca97294b78336e]
      [   15.129644]  ctl_ioctl+0x1bd/0x460 [dm_mod 0d7b0bc3414340a79c4553bae5ca97294b78336e]
      [   15.129655]  dm_ctl_ioctl+0xa/0x20 [dm_mod 0d7b0bc3414340a79c4553bae5ca97294b78336e]
      [   15.129663]  __x64_sys_ioctl+0x8e/0xd0
      [   15.129667]  do_syscall_64+0x5c/0x90
      [   15.129672]  ? syscall_exit_to_user_mode+0x23/0x50
      [   15.129675]  ? do_syscall_64+0x69/0x90
      [   15.129677]  ? do_syscall_64+0x69/0x90
      [   15.129679]  ? syscall_exit_to_user_mode+0x23/0x50
      [   15.129682]  ? do_syscall_64+0x69/0x90
      [   15.129684]  ? do_syscall_64+0x69/0x90
      [   15.129686]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [   15.129689] RIP: 0033:0x7fa96ecd559b
      [   15.129692] Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c \
          c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff \
          ff 73 01 c3 48 8b 0d a5 a8 0c 00 f7 d8 64 89 01 48
      [   15.129696] RSP: 002b:00007ffcaf85c258 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
      [   15.129699] RAX: ffffffffffffffda RBX: 00007fa96f1b48f0 RCX: 00007fa96ecd559b
      [   15.129701] RDX: 00007fa97017e610 RSI: 00000000c138fd09 RDI: 0000000000000003
      [   15.129702] RBP: 00007fa96ebab583 R08: 00007fa97017c9e0 R09: 00007ffcaf85bf27
      [   15.129704] R10: 0000000000000001 R11: 0000000000000206 R12: 00007fa97017e610
      [   15.129706] R13: 00007fa97017e640 R14: 00007fa97017e6c0 R15: 00007fa97017e530
      [   15.129709]  </TASK>
      
      This is caused by a missing mddev->queue check when setting
      QUEUE_FLAG_NOWAIT. Fix this by moving the QUEUE_FLAG_NOWAIT logic
      under the mddev->queue check.
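
The fix described above amounts to guarding the flag update with the existing queue check. The following is a minimal userspace sketch of that ordering, not the kernel code: `struct queue`, `struct mddev_model`, `queue_flag_set`, `md_run_model`, and the bit value are hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a request queue; only the flags word matters. */
struct queue {
    unsigned long flags;
};

/* Hypothetical stand-in for struct mddev. */
struct mddev_model {
    struct queue *queue;      /* NULL when md does not own a request queue */
    bool all_members_nowait;  /* every member device supports nowait */
};

#define FLAG_NOWAIT_BIT 0     /* illustrative bit position, not the kernel's */

/*
 * Sets a flag bit. The helper has no NULL handling of its own, so the
 * caller must guarantee q is valid; the assert documents that precondition.
 */
static void queue_flag_set(unsigned int bit, struct queue *q)
{
    assert(q != NULL);
    q->flags |= 1UL << bit;
}

/*
 * Fixed ordering: everything that touches the queue, including the
 * NOWAIT flag, sits under the queue check. Returns true if the flag
 * was set.
 */
static bool md_run_model(struct mddev_model *mddev)
{
    if (mddev->queue) {
        /* ... other queue setup would go here ... */
        if (mddev->all_members_nowait) {
            queue_flag_set(FLAG_NOWAIT_BIT, mddev->queue);
            return true;
        }
    }
    return false;
}
```

With a model whose `queue` is NULL, `md_run_model()` returns false without ever calling the flag helper, which is the behaviour the fix aims for.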
      
      Fixes: f51d46d0 ("md: add support for REQ_NOWAIT")
      Reported-by: Leon Möller <jkhsjdhjs@totally.rip>
      Tested-by: Leon Möller <jkhsjdhjs@totally.rip>
      Cc: Vishal Verma <vverma@digitalocean.com>
      Signed-off-by: Song Liu <song@kernel.org>
    • block: fix DIO handling regressions in blkdev_read_iter() · 3e1f941d
      Ilya Dryomov authored
      Commit ceaa7625 ("block: move direct_IO into our own read_iter
      handler") introduced several regressions for bdev DIO:
      
      1. read spanning EOF always returns 0 instead of the number of bytes
         read.  This is because "count" is assigned early and isn't updated
         when the iterator is truncated:
      
           $ lsblk -o name,size /dev/vdb
           NAME SIZE
           vdb    1G
           $ xfs_io -d -c 'pread -b 4M 1021M 4M' /dev/vdb
           read 0/4194304 bytes at offset 1070596096
           0.000000 bytes, 0 ops; 0.0007 sec (0.000000 bytes/sec and 0.0000 ops/sec)
      
           instead of
      
           $ xfs_io -d -c 'pread -b 4M 1021M 4M' /dev/vdb
           read 3145728/4194304 bytes at offset 1070596096
           3 MiB, 1 ops; 0.0007 sec (3.865 GiB/sec and 1319.2612 ops/sec)
      
      2. truncated iterator isn't reexpanded
      3. iterator isn't reverted on blkdev_direct_IO() error
      4. zero size read no longer skips atime update
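
The four points above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct iter_model`, `dio_model`, and `read_iter_model` are hypothetical names, the iterator is reduced to a byte count, and the placement of the atime update is illustrative rather than taken from the kernel source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical iterator model: just the number of bytes still requested. */
struct iter_model {
    size_t count;
};

/*
 * Hypothetical direct I/O. On success it consumes the whole iterator and
 * returns the byte count. On failure it has already consumed half of the
 * iterator before returning -5 (an EIO-like error).
 */
static long dio_model(struct iter_model *it, int fail)
{
    long done;

    if (fail) {
        it->count -= it->count / 2;
        return -5;
    }
    done = (long)it->count;
    it->count = 0;
    return done;
}

static long read_iter_model(struct iter_model *it, size_t pos,
                            size_t dev_size, int fail, int *atime_updated)
{
    size_t shorted = 0;
    size_t avail, count;
    long ret = 0;

    *atime_updated = 0;
    if (pos >= dev_size)
        return 0;

    avail = dev_size - pos;
    if (it->count > avail) {          /* request spans EOF: truncate */
        shorted = it->count - avail;
        it->count = avail;
    }

    count = it->count;                /* point 1: taken after truncation */
    if (count == 0)
        goto reexpand;                /* point 4: zero-size read, no atime */

    *atime_updated = 1;
    ret = dio_model(it, fail);
    assert(ret < 0 || (size_t)ret <= count);
    if (ret >= 0)
        count -= (size_t)ret;
    it->count = count;                /* point 3: revert to unread bytes */

reexpand:
    it->count += shorted;             /* point 2: undo the truncation */
    return ret;
}
```

For a 1 GiB device, a 4 MiB request at offset 1021 MiB returns 3145728 in this model, matching the expected xfs_io output quoted above, and leaves the 1 MiB that lay past EOF in the iterator.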
      
      Fixes: ceaa7625 ("block: move direct_IO into our own read_iter handler")
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/r/20220201100420.25875-1-idryomov@gmail.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • nvme-rdma: fix possible use-after-free in transport error_recovery work · b6bb1722
      Sagi Grimberg authored
      nvme_rdma_submit_async_event_work checks the ctrl and queue state
      before preparing the AER command and scheduling io_work, but that
      check alone is not reliable. To fully prevent the race, the error
      recovery work must flush async_event_work after setting the ctrl
      state to RESETTING and before continuing to destroy the admin queue,
      so that there is no race between .submit_async_event and the error
      recovery handler itself changing the ctrl state.
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
    • nvme-tcp: fix possible use-after-free in transport error_recovery work · ff9fc7eb
      Sagi Grimberg authored
      nvme_tcp_submit_async_event_work checks the ctrl and queue state
      before preparing the AER command and scheduling io_work, but that
      check alone is not reliable. To fully prevent the race, the error
      recovery work must flush async_event_work after setting the ctrl
      state to RESETTING and before continuing to destroy the admin queue,
      so that there is no race between .submit_async_event and the error
      recovery handler itself changing the ctrl state.
      Tested-by: Chris Leech <cleech@redhat.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
    • nvme: fix a possible use-after-free in controller reset during load · 0fa0f99f
      Sagi Grimberg authored
      Unlike .queue_rq, in .submit_async_event drivers may not check the ctrl
      readiness for AER submission. This may lead to a use-after-free
      condition that was observed with nvme-tcp.
      
      The race condition may happen in the following scenario:
      1. driver executes its reset_ctrl_work
      2. -> nvme_stop_ctrl - flushes ctrl async_event_work
      3. ctrl sends AEN which is received by the host, which in turn
         schedules AEN handling
      4. teardown admin queue (which releases the queue socket)
      5. AEN processed, submits another AER, calling the driver to submit
      6. driver attempts to send the cmd
      ==> use-after-free
      
      In order to fix that, add ctrl state check to validate the ctrl
      is actually able to accept the AER submission.
      
      This addresses the above race in controller resets because the driver
      during teardown should:
      1. change ctrl state to RESETTING
      2. flush async_event_work (as well as other async work elements)
      
      So after steps 1 and 2, any other AER command will find the
      ctrl state to be RESETTING and bail out without submitting the AER.
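
The interaction between the state check and the teardown ordering can be sketched as a single-threaded userspace model. All names here (`struct ctrl_model`, `async_event_work`, `flush_async_event_work`, `teardown`) are hypothetical and the "work" is run synchronously; the sketch only shows why a late AER bails out, not how the kernel workqueue behaves.

```c
#include <assert.h>
#include <stdbool.h>

enum ctrl_state { CTRL_LIVE, CTRL_RESETTING };

/* Hypothetical controller model. */
struct ctrl_model {
    enum ctrl_state state;
    bool admin_queue_alive;   /* false once the admin queue is torn down */
    bool aer_work_pending;    /* an AEN arrived and handling is scheduled */
    int aers_submitted;
    int use_after_free;       /* submissions that hit a torn-down queue */
};

/* Driver-side submission; touching a dead queue models the bug. */
static void submit_async_event(struct ctrl_model *c)
{
    if (!c->admin_queue_alive) {
        c->use_after_free++;
        return;
    }
    c->aers_submitted++;
}

/* AEN handling with the added state check. */
static void async_event_work(struct ctrl_model *c)
{
    c->aer_work_pending = false;
    if (c->state != CTRL_LIVE)
        return;               /* bail out: ctrl cannot accept an AER */
    submit_async_event(c);
}

/* Runs any pending work to completion, like a synchronous flush. */
static void flush_async_event_work(struct ctrl_model *c)
{
    if (c->aer_work_pending)
        async_event_work(c);
}

/* Teardown in the order the message describes: state first, then flush. */
static void teardown(struct ctrl_model *c)
{
    c->state = CTRL_RESETTING;          /* step 1 */
    flush_async_event_work(c);          /* step 2 */
    assert(!c->aer_work_pending);       /* nothing can still be in flight */
    c->admin_queue_alive = false;       /* now safe to destroy the queue */
}
```

In this model an AEN that arrives after `teardown()` still schedules `async_event_work()`, but the handler sees RESETTING and returns before reaching the destroyed queue.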
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
  16. 28 Jan, 2022 3 commits
  17. 27 Jan, 2022 4 commits
  18. 26 Jan, 2022 1 commit
  19. 23 Jan, 2022 4 commits