1. 18 Oct, 2021 1 commit
  2. 17 Oct, 2021 1 commit
  3. 12 Aug, 2021 1 commit
  4. 01 Jun, 2021 1 commit
  5. 07 May, 2021 1 commit
  6. 21 Apr, 2021 1 commit
    • brd: expose number of allocated pages in debugfs · f4be591f
      Calvin Owens authored
      
      While the maximum size of each ramdisk is defined either as a module
      parameter, or compile time default, it's impossible to know how many pages
      have currently been allocated by each ram%d device, since they're
      allocated when used and never freed.
      
      This patch creates a new directory at this location:
      
      /sys/kernel/debug/ramdisk_pages/
      
      which will contain a file named "ram%d" for each instantiated ramdisk on
      the system. The file is read-only, and read() will output the number of
      pages currently held by that ramdisk.
      
      We lose track of how much memory a ramdisk is using, because pages,
      once used, are simply recycled but never freed.
      
      If we fill the ramdisk with a file that exceeds its size, hit ENOSPC,
      and delete the file as mitigation, df shows a decrease in used and an
      increase in available blocks. But since we have touched all pages, the
      memory footprint of the ramdisk does not reflect the used/available
      block counts:
      
      ...
      [root@localhost ~]# mkfs.ext2 /dev/ram15
      mke2fs 1.45.6 (20-Mar-2020)
      Creating filesystem with 4096 1k blocks and 1024 inodes
      [root@localhost ~]# mount /dev/ram15 /mnt/ram15/
      
      [root@localhost ~]# cat /sys/kernel/debug/ramdisk_pages/ram15
      58
      [root@kerneltest008.06.prn3 ~]# df /dev/ram15
      Filesystem     1K-blocks  Used Available Use% Mounted on
      /dev/ram15          3963    31      3728   1% /mnt/ram15
      [root@kerneltest008.06.prn3 ~]# dd if=/dev/urandom of=/mnt/ram15/test2 bs=1M count=5
      dd: error writing '/mnt/ram15/test2': No space left on device
      4+0 records in
      3+0 records out
      4005888 bytes (4.0 MB, 3.8 MiB) copied, 0.0446614 s, 89.7 MB/s
      [root@kerneltest008.06.prn3 ~]# df /mnt/ram15/
      Filesystem     1K-blocks  Used Available Use% Mounted on
      /dev/ram15          3963  3960         0 100% /mnt/ram15
      [root@kerneltest008.06.prn3 ~]# cat /sys/kernel/debug/ramdisk_pages/ram15
      1024
      [root@kerneltest008.06.prn3 ~]# rm /mnt/ram15/test2
      rm: remove regular file '/mnt/ram15/test2'? y
      [root@kerneltest008.06.prn3 /var]# df /dev/ram15
      Filesystem     1K-blocks  Used Available Use% Mounted on
      /dev/ram15          3963    31      3728   1% /mnt/ram15
      
      # Actual memory footprint
      [root@kerneltest008.06.prn3 /var]# cat /sys/kernel/debug/ramdisk_pages/ram15
      1024
      ...
      
      This debugfs counter always reveals the accurate number of pages
      permanently allocated to the ramdisk.
      
      Signed-off-by: Calvin Owens <calvinowens@fb.com>
      [cleaned up the !CONFIG_DEBUG_FS case and API changes for HEAD]
      Signed-off-by: Kyle McMartin <jkkm@fb.com>
      [rebased]
      Signed-off-by: Saravanan D <saravanand@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      f4be591f
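The numbers in the transcript above can be checked with a small userspace sketch. It is not the brd implementation: `struct ramdisk_model`, `ramdisk_touch()`, `ramdisk_delete_file()` and the 4 KiB page size are assumptions made for illustration. The model allocates a page the first time it is touched and never frees it, so the counter can only grow.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u  /* assumed page size */
#define MODEL_MAX_PAGES 1024u  /* 4096 1 KiB blocks at 4 KiB per page */

/* Hypothetical model of one ramdisk: which pages exist, and how many. */
struct ramdisk_model {
    bool present[MODEL_MAX_PAGES];
    unsigned long nr_pages;
};

/* Touch a byte range; allocate every page in it that is not yet present. */
static void ramdisk_touch(struct ramdisk_model *rd, size_t off, size_t len)
{
    if (len == 0)
        return;

    size_t first = off / MODEL_PAGE_SIZE;
    size_t last = (off + len - 1) / MODEL_PAGE_SIZE;

    for (size_t i = first; i <= last && i < MODEL_MAX_PAGES; i++) {
        if (!rd->present[i]) {
            rd->present[i] = true;
            rd->nr_pages++;
        }
    }
}

/* Deleting a file frees filesystem blocks but returns no pages. */
static void ramdisk_delete_file(struct ramdisk_model *rd)
{
    (void)rd;  /* intentionally a no-op in this model */
}
```

Touching the whole 4 MiB device brings the model's counter to 1024, and deleting the file leaves it there, which matches the two `cat` outputs of 1024 before and after the `rm`.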
  7. 25 Jan, 2021 2 commits
  8. 16 Nov, 2020 1 commit
  9. 24 Sep, 2020 1 commit
  10. 01 Jul, 2020 1 commit
  11. 27 Mar, 2020 1 commit
    • block: simplify queue allocation · 3d745ea5
      Christoph Hellwig authored
      
      Current make_request based drivers use either blk_alloc_queue_node or
      blk_alloc_queue to allocate a queue, and then set up the make_request_fn
      function pointer and a few parameters using the blk_queue_make_request
      helper.  Simplify this by passing the make_request pointer to
      blk_alloc_queue, and while at it merge the _node variant into the main
      helper by always passing a node_id, and remove the superfluous gfp_mask
      parameter.  A lower-level __blk_alloc_queue is kept for the blk-mq case.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      3d745ea5
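The shape of the simplified interface can be sketched in userspace. Everything below is a hypothetical model: `struct model_queue`, `model_alloc_queue()` and `model_noop_request()` are invented names and not the kernel's types. The sketch only shows the idea from the message: the allocator takes the request callback and the node id directly, with no separate setup helper and no gfp_mask argument.

```c
#include <stdlib.h>

struct model_queue;

/* Hypothetical stand-in for a make_request callback. */
typedef int (*model_make_request_fn)(struct model_queue *q, void *bio);

/* Hypothetical stand-in for a request queue. */
struct model_queue {
    model_make_request_fn make_request;
    int node_id;
};

/* One call allocates the queue and records the callback and node id. */
static struct model_queue *model_alloc_queue(model_make_request_fn fn,
                                             int node_id)
{
    struct model_queue *q = calloc(1, sizeof(*q));

    if (!q)
        return NULL;
    q->make_request = fn;
    q->node_id = node_id;
    return q;
}

/* Trivial callback used only to exercise the allocator. */
static int model_noop_request(struct model_queue *q, void *bio)
{
    (void)q;
    (void)bio;
    return 0;
}
```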
  12. 04 Feb, 2020 1 commit
    • brd: check and limit max_part par · c8ab4225
      Zhiqiang Liu authored
      
      In brd_init(), rd_nr brd_devices are first allocated and added to
      the brd_devices list; the list is then traversed and each brd_device
      is registered by calling add_disk(). When a brd_device is allocated,
      disk->first_minor is set to i * max_part. If rd_nr * max_part is
      larger than MINORMASK, two different brd_devices may end up with the
      same devt, and only one of them can be added successfully.
      A later rmmod of brd.ko then causes an oops in brd_exit().
      
      To reproduce:
        # modprobe brd rd_nr=3 rd_size=102400 max_part=1048576
        # rmmod brd
      after which the oops appears.
      
      Oops log:
      [  726.613722] Call trace:
      [  726.614175]  kernfs_find_ns+0x24/0x130
      [  726.614852]  kernfs_find_and_get_ns+0x44/0x68
      [  726.615749]  sysfs_remove_group+0x38/0xb0
      [  726.616520]  blk_trace_remove_sysfs+0x1c/0x28
      [  726.617320]  blk_unregister_queue+0x98/0x100
      [  726.618105]  del_gendisk+0x144/0x2b8
      [  726.618759]  brd_exit+0x68/0x560 [brd]
      [  726.619501]  __arm64_sys_delete_module+0x19c/0x2a0
      [  726.620384]  el0_svc_common+0x78/0x130
      [  726.621057]  el0_svc_handler+0x38/0x78
      [  726.621738]  el0_svc+0x8/0xc
      [  726.622259] Code: aa0203f6 aa0103f7 aa1e03e0 d503201f (7940e260)
      
      Here, we add the brd_check_and_reset_par() function to check and limit
      the max_part parameter.
      
      --
      V5->V6:
       - remove useless code
      
      V4->V5:(suggested by Ming Lei)
       - make sure max_part is not larger than DISK_MAX_PARTS
      
      V3->V4:(suggested by Ming Lei)
       - remove useless change
       - add one limit of max_part
      
      V2->V3: (suggested by Ming Lei)
       - clear .minors when running out of consecutive minor space in brd_alloc
       - remove limit of rd_nr
      
      V1->V2:
       - add more checks in brd_check_par_valid as suggested by Ming Lei.
      Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
      Reviewed-by: Bob Liu <bob.liu@oracle.com>
      Reviewed-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      c8ab4225
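The devt collision described above can be reproduced with plain arithmetic. In this sketch the `MINORBITS`, `MINORMASK` and `MKDEV` definitions mirror the kernel's `include/linux/kdev_t.h`, and 1 is the ramdisk major number. `brd_devt()` and `clamp_max_part()` are hypothetical helpers written for this note; in particular `clamp_max_part()` is not the kernel's `brd_check_and_reset_par()`, only a minimal illustration of limiting the parameter.

```c
#include <stdint.h>

#define MINORBITS 20
#define MINORMASK ((1U << MINORBITS) - 1)
#define MKDEV(ma, mi) (((ma) << MINORBITS) | (mi))
#define RAMDISK_MAJOR 1u

/* devt of the i-th ramdisk when first_minor is set to i * max_part. */
static uint32_t brd_devt(uint32_t i, uint32_t max_part)
{
    return MKDEV(RAMDISK_MAJOR, i * max_part);
}

/* Hypothetical clamp: treat 0 as 1 and cap max_part at `limit`. */
static uint32_t clamp_max_part(uint32_t max_part, uint32_t limit)
{
    if (max_part == 0)
        return 1;
    return max_part > limit ? limit : max_part;
}
```

With max_part=1048576 (that is, 1 << MINORBITS), the first minor of the second device spills into the major bits, so `brd_devt(0, 1048576)` and `brd_devt(1, 1048576)` are equal. With a small value such as 256 the two devices get distinct numbers.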
  13. 04 Dec, 2019 2 commits
    • brd: warn on un-aligned buffer · f1acbf21
      Ming Lei authored
      
      The queue DMA alignment limit requires users of the block layer
      (fs, target, ...) to pass aligned buffers.
      
      So far brd doesn't support unaligned buffers, even though it would be
      easy to support them.
      
      However, brd is often used for debugging, and there are other
      drivers that can't support unaligned buffers either.
      
      So add a warning so that brd users know what to fix.
      Reported-by: Stephen Rust <srust@blockbridge.com>
      Cc: Stephen Rust <srust@blockbridge.com>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      f1acbf21
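The message does not spell out the alignment condition, so the following is only a sketch under the assumption that "aligned" means both the buffer offset and the length are multiples of the 512-byte sector size. `buffer_is_sector_aligned()` is a hypothetical helper name.

```c
#include <stdbool.h>
#include <stddef.h>

#define SECTOR_SHIFT 9
#define SECTOR_SIZE (1u << SECTOR_SHIFT)

/*
 * True when both the offset into the page and the length are multiples
 * of SECTOR_SIZE. OR-ing the two values lets one mask test cover both.
 */
static bool buffer_is_sector_aligned(size_t offset, size_t len)
{
    return ((offset | len) & (SECTOR_SIZE - 1)) == 0;
}
```

A driver following this approach would emit its warning whenever the check returns false.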
    • brd: remove max_hw_sectors queue limit · 36582a5a
      Ming Lei authored
      
      We now depend on blk_queue_split() to respect most queue limits
      (the only exception may be DMA alignment). However,
      blk_queue_split() isn't used for brd, so this limit hasn't been
      respected since v4.3.
      
      Also, the max_hw_sectors limit doesn't play a big role for brd; it was
      added for an unknown reason when brd was added to the tree.
      
      So remove it.
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      36582a5a
  14. 21 May, 2019 1 commit
  15. 09 May, 2019 1 commit
  16. 22 Apr, 2019 1 commit
  17. 02 Nov, 2018 1 commit
    • block: brd: associate with queue until adding disk · 153fcd5f
      Ming Lei authored
      
      brd_free() may be called in the failure path on a brd instance whose
      disk hasn't been added yet, so the gendisk release handler may free the
      associated request_queue early and cause the following use-after-free [1].
      
      This patch fixes the issue by associating the gendisk with the
      request_queue just before adding the disk.
      
      [1] KASAN: use-after-free Read in del_timer_sync
      Non-volatile memory driver v1.3
      Linux agpgart interface v0.103
      [drm] Initialized vgem 1.0.0 20120112 for virtual device on minor 0
      usbcore: registered new interface driver udl
      ==================================================================
      BUG: KASAN: use-after-free in __lock_acquire+0x36d9/0x4c20 kernel/locking/lockdep.c:3218
      Read of size 8 at addr ffff8801d1b6b540 by task swapper/0/1
      
      CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0+ #88
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
        __dump_stack lib/dump_stack.c:77 [inline]
        dump_stack+0x244/0x39d lib/dump_stack.c:113
        print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
        kasan_report_error mm/kasan/report.c:354 [inline]
        kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
        __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
        __lock_acquire+0x36d9/0x4c20 kernel/locking/lockdep.c:3218
        lock_acquire+0x1ed/0x520 kernel/locking/lockdep.c:3844
        del_timer_sync+0xb7/0x270 kernel/time/timer.c:1283
        blk_cleanup_queue+0x413/0x710 block/blk-core.c:809
        brd_free+0x5d/0x71 drivers/block/brd.c:422
        brd_init+0x2eb/0x393 drivers/block/brd.c:518
        do_one_initcall+0x145/0x957 init/main.c:890
        do_initcall_level init/main.c:958 [inline]
        do_initcalls init/main.c:966 [inline]
        do_basic_setup init/main.c:984 [inline]
        kernel_init_freeable+0x5c6/0x6b9 init/main.c:1148
        kernel_init+0x11/0x1ae init/main.c:1068
        ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:350
      
      Reported-by: syzbot+3701447012fe951dabb2@syzkaller.appspotmail.com
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      153fcd5f
  18. 18 Jul, 2018 1 commit
    • block: make bdev_ops->rw_page() take a REQ_OP instead of bool · 3f289dcb
      Tejun Heo authored
      
      c11f0c0b ("block/mm: make bdev_ops->rw_page() take a bool for
      read/write") replaced @op with boolean @is_write, which limited the
      amount of information going into ->rw_page() and more importantly
      page_endio(), which removed the need to expose block internals to mm.
      
      Unfortunately, we want to track discards separately and @is_write
      isn't enough information.  This patch updates bdev_ops->rw_page() to
      take REQ_OP instead but leaves page_endio() to take bool @is_write.
      This allows the block part of operations to have enough information
      while not leaking it to mm.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Mike Christie <mchristi@redhat.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      3f289dcb
  19. 24 May, 2018 1 commit
  20. 09 May, 2018 1 commit
  21. 17 Mar, 2018 1 commit
    • block: Move SECTOR_SIZE and SECTOR_SHIFT definitions into <linux/blkdev.h> · 233bde21
      Bart Van Assche authored
      
      It happens often while I'm preparing a patch for a block driver that
      I'm wondering: is a definition of SECTOR_SIZE and/or SECTOR_SHIFT
      available for this driver? Do I have to introduce definitions of these
      constants before I can use these constants? To avoid this confusion,
      move the existing definitions of SECTOR_SIZE and SECTOR_SHIFT into the
      <linux/blkdev.h> header file such that these become available for all
      block drivers. Make the SECTOR_SIZE definition in the uapi msdos_fs.h
      header file conditional to avoid that including that header file after
      <linux/blkdev.h> causes the compiler to complain about a SECTOR_SIZE
      redefinition.
      
      Note: the SECTOR_SIZE / SECTOR_SHIFT / SECTOR_BITS definitions have
      not been removed from uapi header files nor from NAND drivers in
      which these constants are used for another purpose than converting
      block layer offsets and sizes into a number of sectors.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: M...
      233bde21
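As a small illustration of what these constants are used for, the sketch below converts between byte counts and 512-byte sectors. The constant values match the kernel's definitions (a shift of 9, so 512 bytes per sector). The `#ifndef` guards show the conditional-definition pattern the message mentions for avoiding redefinition complaints, and `bytes_to_sectors()` / `sectors_to_bytes()` are hypothetical helper names.

```c
#include <stdint.h>

/* Guarded so that a header which already defines these does not clash. */
#ifndef SECTOR_SHIFT
#define SECTOR_SHIFT 9
#endif
#ifndef SECTOR_SIZE
#define SECTOR_SIZE (1 << SECTOR_SHIFT)
#endif

/* Convert a byte count to whole sectors (rounds down). */
static uint64_t bytes_to_sectors(uint64_t bytes)
{
    return bytes >> SECTOR_SHIFT;
}

/* Convert a sector count to bytes. */
static uint64_t sectors_to_bytes(uint64_t sectors)
{
    return sectors << SECTOR_SHIFT;
}
```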
  22. 26 Feb, 2018 1 commit
  23. 16 Nov, 2017 1 commit
  24. 15 Nov, 2017 1 commit
    • brd: remove dax support · 7a862fbb
      Dan Williams authored
      
      DAX support in brd is awkward because its backing page frames are
      distinct from the ones provided by pmem, dcssblk, or axonram. We need
      pfn_t_devmap() entries to fully support DAX, and the limited DAX support
      for pfn_t_special() page frames is not interesting for brd when pmem is
      already a superset of brd.  Lastly, brd is the only dax capable driver
      that may sleep in its ->direct_access() implementation. So it causes a
      global burden with no net gain of kernel functionality.
      
      For all these reasons, remove DAX support.
      
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      7a862fbb
  25. 11 Nov, 2017 1 commit
  26. 25 Sep, 2017 1 commit
  27. 07 Sep, 2017 1 commit
  28. 23 Aug, 2017 1 commit
    • block: replace bi_bdev with a gendisk pointer and partitions index · 74d46992
      Christoph Hellwig authored
      
      This way we don't need a block_device structure to submit I/O.  The
      block_device has different life time rules from the gendisk and
      request_queue and is usually only available when the block device node
      is open.  Other callers need to explicitly create one (e.g. the lightnvm
      passthrough code, or the new nvme multipathing code).
      
      For the actual I/O path all that we need is the gendisk, which exists
      once per block device.  But given that the block layer also does
      partition remapping we additionally need a partition index, which is
      used for said remapping in generic_make_request.
      
      Note that all the block drivers generally want request_queue or
      sometimes the gendisk, so this removes a layer of indirection all
      over the stack.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      74d46992
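The idea of carrying a disk pointer plus a partition index can be sketched in userspace. All names here (`struct model_part`, `struct model_disk`, `struct model_bio`, `model_remap()`) are hypothetical and the four-entry partition table is an arbitrary choice; this is a model of the remapping step, not the block layer's code. Index 0 is assumed to stand for the whole disk.

```c
#include <stdint.h>

#define MODEL_NR_PARTS 4

/* Hypothetical partition: start and length in sectors. */
struct model_part {
    uint64_t start_sect;
    uint64_t nr_sects;
};

/* Hypothetical disk; part[0] describes the whole device. */
struct model_disk {
    struct model_part part[MODEL_NR_PARTS];
};

/* Hypothetical I/O descriptor: disk pointer plus partition index. */
struct model_bio {
    struct model_disk *disk;
    uint8_t partno;
    uint64_t sector;
};

/*
 * Turn a partition-relative sector into a whole-disk sector.
 * Returns 0 on success, -1 if the index or sector is out of range.
 */
static int model_remap(struct model_bio *bio)
{
    if (bio->partno >= MODEL_NR_PARTS)
        return -1;

    const struct model_part *p = &bio->disk->part[bio->partno];

    if (bio->sector >= p->nr_sects)
        return -1;

    bio->sector += p->start_sect;
    bio->partno = 0;  /* now addressed relative to the whole disk */
    return 0;
}
```

Nothing in the remap needs an open block device node; the disk pointer and the index are enough, which is the point the message makes about the I/O path.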
  29. 10 Jul, 2017 1 commit
  30. 27 Jun, 2017 2 commits
  31. 03 May, 2017 1 commit
  32. 25 Apr, 2017 1 commit
  33. 19 Apr, 2017 1 commit
  34. 08 Apr, 2017 1 commit
  35. 24 Dec, 2016 1 commit
  36. 25 Oct, 2016 2 commits