1. 14 Jun, 2024 1 commit
    • loop: Disable fallocate() zero and discard if not supported · 5f75e081
      Cyril Hrubis authored
      If fallocate() is implemented but zero and discard operations are not
      supported by the filesystem the backing file is on, we continue to fill
      dmesg with errors from blk_mq_end_request(), since each time we call
      fallocate() on the loop device the EOPNOTSUPP error from lo_fallocate()
      ends up propagated into the block layer. In the end the syscall succeeds,
      since blkdev_issue_zeroout() falls back to writing zeroes, which
      makes the errors even more misleading and confusing.
      
      How to reproduce:
      
      1. make sure /tmp is mounted as tmpfs
      2. dd if=/dev/zero of=/tmp/disk.img bs=1M count=100
      3. losetup /dev/loop0 /tmp/disk.img
      4. mkfs.ext2 /dev/loop0
      5. dmesg |tail
      
      [710690.898214] operation not supported error, dev loop0, sector 204672 op 0x9:(WRITE_ZEROES) flags 0x8000800 phys_seg 0 prio class 0
      [710690.898279] operation not supported error, dev loop0, sector 522 op 0x9:(WRITE_ZEROES) flags 0x8000800 phys_seg 0 prio class 0
      [710690.898603] operation not supported error, dev loop0, sector 16906 op 0x9:(WRITE_ZEROES) flags 0x8000800 phys_seg 0 prio class 0
      [710690.898917] operation not supported error, dev loop0, sector 32774 op 0x9:(WRITE_ZEROES) flags 0x8000800 phys_seg 0 prio class 0
      [710690.899218] operation not supported error, dev loop0, sector 49674 op 0x9:(WRITE_ZEROES) flags 0x8000800 phys_seg 0 prio class 0
      [710690.899484] operation not supported error, dev loop0, sector 65542 op 0x9:(WRITE_ZEROES) flags 0x8000800 phys_seg 0 prio class 0
      [710690.899743] operation not supported error, dev loop0, sector 82442 op 0x9:(WRITE_ZEROES) flags 0x8000800 phys_seg 0 prio class 0
      [710690.900015] operation not supported error, dev loop0, sector 98310 op 0x9:(WRITE_ZEROES) flags 0x8000800 phys_seg 0 prio class 0
      [710690.900276] operation not supported error, dev loop0, sector 115210 op 0x9:(WRITE_ZEROES) flags 0x8000800 phys_seg 0 prio class 0
      [710690.900546] operation not supported error, dev loop0, sector 131078 op 0x9:(WRITE_ZEROES) flags 0x8000800 phys_seg 0 prio class 0
      
      This patch changes lo_fallocate() to clear the flags for zero and
      discard operations if we get EOPNOTSUPP from the backing file's fallocate
      callback. That way we at least stop spewing errors after the first
      unsuccessful try.
      
      CC: Jan Kara <jack@suse.cz>
      Signed-off-by: Cyril Hrubis <chrubis@suse.cz>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20240613163817.22640-1-chrubis@suse.cz
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
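The behaviour described in the loop commit above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct toy_loop`, the `TOY_*` mode bits and both functions are hypothetical names invented for illustration, not the kernel's data structures or the actual patch. It shows the idea of disabling an operation after the first EOPNOTSUPP so that later requests no longer produce errors.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the fallocate() mode bits used by the loop driver. */
enum { TOY_PUNCH_HOLE = 1 << 0, TOY_ZERO_RANGE = 1 << 1 };

/* Toy model of a loop device; none of these names exist in the kernel. */
struct toy_loop {
	bool backing_can_zero;          /* backing filesystem supports zero range */
	bool backing_can_discard;       /* backing filesystem supports hole punching */
	unsigned int max_write_zeroes;  /* advertised limit, 0 = operation disabled */
	unsigned int max_discard;       /* advertised limit, 0 = operation disabled */
	unsigned int errors_logged;     /* models error lines written to dmesg */
};

/* Models the backing file's fallocate callback. */
static int toy_backing_fallocate(const struct toy_loop *lo, int mode)
{
	if ((mode & TOY_ZERO_RANGE) && !lo->backing_can_zero)
		return -EOPNOTSUPP;
	if ((mode & TOY_PUNCH_HOLE) && !lo->backing_can_discard)
		return -EOPNOTSUPP;
	return 0;
}

/* Models lo_fallocate(): on EOPNOTSUPP, stop advertising the failed operation. */
static int toy_lo_fallocate(struct toy_loop *lo, int mode)
{
	int ret;

	/* A disabled operation never reaches the driver, so nothing is logged. */
	if ((mode & TOY_ZERO_RANGE) && lo->max_write_zeroes == 0)
		return -EOPNOTSUPP;
	if ((mode & TOY_PUNCH_HOLE) && lo->max_discard == 0)
		return -EOPNOTSUPP;

	ret = toy_backing_fallocate(lo, mode);
	if (ret == -EOPNOTSUPP) {
		lo->errors_logged++;  /* the single error from the first attempt */
		if (mode & TOY_ZERO_RANGE)
			lo->max_write_zeroes = 0;
		if (mode & TOY_PUNCH_HOLE)
			lo->max_discard = 0;
	}
	return ret;
}
```

In this model, repeated zero-range requests against a backing file that cannot zero ranges increment `errors_logged` only once, and the discard limit is left untouched until a discard itself fails.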
  2. 13 Jun, 2024 2 commits
  3. 12 Jun, 2024 8 commits
  4. 05 Jun, 2024 2 commits
  5. 31 May, 2024 3 commits
  6. 30 May, 2024 4 commits
    • block: Fix zone write plugging handling of devices with a runt zone · 29459c3e
      Damien Le Moal authored
      A zoned device may have a last sequential write required zone that is
      smaller than other zones. However, all tests to check if a zone write
      plug write offset exceeds the zone capacity use the same capacity
      value stored in the gendisk zone_capacity field. This is incorrect for a
      zoned device with a last runt (smaller) zone.
      
      Add the new field last_zone_capacity to struct gendisk to store the
      capacity of the last zone of the device. blk_revalidate_seq_zone() and
      blk_revalidate_conv_zone() are both modified to get this value when
      disk_zone_is_last() returns true. Similarly to zone_capacity, the value
      is first stored using the last_zone_capacity field of struct
      blk_revalidate_zone_args. Once zone revalidation of all zones is done,
      this is used to set the gendisk last_zone_capacity field.
      
      The checks to determine if a zone is full or if a sector offset in a
      zone exceeds the zone capacity in disk_should_remove_zone_wplug(),
      disk_zone_wplug_abort_unaligned(), blk_zone_write_plug_init_request(),
      and blk_zone_wplug_prepare_bio() are modified to use the new helper
      functions disk_zone_is_full() and disk_zone_wplug_is_full().
      disk_zone_is_full() uses the zone index to determine if the zone being
      tested is the last one of the disk and uses either the disk
      zone_capacity or last_zone_capacity accordingly.
      
      Fixes: dd291d77 ("block: Introduce zone write plugging")
      Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Reviewed-by: Niklas Cassel <cassel@kernel.org>
      Link: https://lore.kernel.org/r/20240530054035.491497-4-dlemoal@kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
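The zone-full test described in the commit above can be sketched as follows. This is a user-space illustration, not the kernel source: `struct toy_disk` and `toy_zone_is_full()` are hypothetical names, and only the three fields mentioned in the commit message are modelled.

```c
#include <stdbool.h>

/* Toy stand-in for the gendisk fields named in the commit message. */
struct toy_disk {
	unsigned int nr_zones;
	unsigned int zone_capacity;       /* capacity of every zone but the last */
	unsigned int last_zone_capacity;  /* capacity of the (possibly runt) last zone */
};

/*
 * Pick the capacity to compare against from the zone index: the last zone
 * uses last_zone_capacity, every other zone uses zone_capacity.
 */
static bool toy_zone_is_full(const struct toy_disk *disk, unsigned int zno,
			     unsigned int offset_in_zone)
{
	if (zno < disk->nr_zones - 1)
		return offset_in_zone >= disk->zone_capacity;
	return offset_in_zone >= disk->last_zone_capacity;
}
```

With four zones, a regular capacity of 1024 sectors and a last-zone capacity of 512, an offset of 512 is "full" only in zone 3. Comparing against the single shared capacity would wrongly report the runt zone as still writable.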
    • block: Fix validation of zoned device with a runt zone · cd639993
      Damien Le Moal authored
      Commit ecfe43b1 ("block: Remember zone capacity when revalidating
      zones") introduced checks to ensure that the capacity of the zones of
      a zoned device is constant for all zones. However, this check ignores
      the possibility that a zoned device has a smaller last zone with a size
      not equal to the capacity of other zones. Such a device corresponds in
      practice to an SMR drive with a smaller last zone and all zones with a
      capacity equal to the zone size, leading to the last zone capacity being
      different than the capacity of other zones.
      
      Correctly handle such a device by fixing the check for the constant zone
      capacity in blk_revalidate_seq_zone() using the new helper function
      disk_zone_is_last(). This helper function is also used in
      blk_revalidate_zone_cb() when checking the zone size.
      
      Fixes: ecfe43b1 ("block: Remember zone capacity when revalidating zones")
      Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Reviewed-by: Niklas Cassel <cassel@kernel.org>
      Link: https://lore.kernel.org/r/20240530054035.491497-3-dlemoal@kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
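A rough model of the validation rule from the commit above, assuming a device made only of sequential zones: every zone except the last must share the same size and capacity, while the last zone may be smaller but never larger. `struct toy_zone`, `toy_zone_is_last()` and `toy_validate_zones()` are hypothetical names for illustration, and the real revalidation code handles more cases than this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy zone descriptor, all values in sectors. */
struct toy_zone {
	unsigned long long start;
	unsigned long long len;
	unsigned long long capacity;
};

/* A zone is the last one if it reaches the end of the disk. */
static bool toy_zone_is_last(unsigned long long disk_sectors,
			     const struct toy_zone *z)
{
	return z->start + z->len >= disk_sectors;
}

/* Returns true if the zone layout is acceptable under the rule above. */
static bool toy_validate_zones(const struct toy_zone *zones, size_t n,
			       unsigned long long disk_sectors)
{
	unsigned long long zone_sectors, zone_capacity;
	size_t i;

	if (n == 0)
		return false;

	/* The first zone defines the reference size and capacity. */
	zone_sectors = zones[0].len;
	zone_capacity = zones[0].capacity;

	for (i = 0; i < n; i++) {
		const struct toy_zone *z = &zones[i];

		if (!toy_zone_is_last(disk_sectors, z)) {
			/* Non-last zones must all look the same. */
			if (z->len != zone_sectors || z->capacity != zone_capacity)
				return false;
		} else if (z->len > zone_sectors) {
			/* A runt last zone may be smaller, never larger. */
			return false;
		}
	}
	return true;
}
```

A layout of two 1024-sector zones followed by a 512-sector last zone passes, whereas a differing capacity in a middle zone is still rejected.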
    • null_blk: Do not allow runt zone with zone capacity smaller then zone size · b1643168
      Damien Le Moal authored
      A zoned device with a smaller last zone together with a zone capacity
      smaller than the zone size does not make any sense as that does not
      correspond to any possible setup for a real device:
      1) For ZNS and zoned UFS devices, all zones are always the same size.
      2) For SMR HDDs, all zones always have the same capacity.
      In other words, if we have a smaller last runt zone, then this zone
      capacity should always be equal to the zone size.
      
      Add a check in null_init_zoned_dev() to reject a configuration that has
      both a smaller last zone and a zone capacity smaller than the zone size.

      Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Niklas Cassel <cassel@kernel.org>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Link: https://lore.kernel.org/r/20240530054035.491497-2-dlemoal@kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
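The configuration rule from the null_blk commit above can be expressed as a small check. This is a sketch with a hypothetical function name and parameters (all sizes in the same unit). It assumes a runt last zone exists exactly when the device size is not a multiple of the zone size, and it is not the driver's actual code.

```c
#include <errno.h>

/*
 * Reject a configuration that combines a runt last zone with a zone
 * capacity smaller than the zone size. Returns 0 if the configuration
 * is acceptable, -EINVAL otherwise.
 */
static int toy_check_zone_config(unsigned long long dev_size,
				 unsigned long long zone_size,
				 unsigned long long zone_capacity)
{
	/* Basic sanity: a zone cannot hold more than its own size. */
	if (zone_size == 0 || zone_capacity > zone_size)
		return -EINVAL;

	/* Smaller capacity and a runt last zone are mutually exclusive. */
	if (zone_capacity != zone_size && dev_size % zone_size != 0)
		return -EINVAL;

	return 0;
}
```

Either property on its own is accepted. Only the combination, which matches neither the ZNS/UFS case nor the SMR case listed in the commit message, is refused.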
    • Merge tag 'nvme-6.10-2024-05-29' of git://git.infradead.org/nvme into block-6.10 · 1521dc24
      Jens Axboe authored
      Pull NVMe fixes from Keith:
      
      "nvme fixes for Linux 6.10
      
       - Removing unused fields (Kanchan)
       - Large folio offsets support (Kundan)
       - Multipath NUMA node initialization fix (Nilay)
       - Multipath IO stats accounting fixes (Keith)
       - Circular lockdep fix (Keith)
       - Target race condition fix (Sagi)
       - Target memory leak fix (Sagi)"
      
      * tag 'nvme-6.10-2024-05-29' of git://git.infradead.org/nvme:
        nvmet: fix a possible leak when destroy a ctrl during qp establishment
        nvme: use srcu for iterating namespace list
        nvme: adjust multiples of NVME_CTRL_PAGE_SIZE in offset
        nvme: remove sgs and sws
        nvmet: fix ns enable/disable possible hang
        nvme-multipath: fix io accounting on failover
        nvme: fix multipath batched completion accounting
        nvme-multipath: find NUMA path only for online numa-node
  7. 28 May, 2024 9 commits
  8. 27 May, 2024 6 commits
  9. 26 May, 2024 5 commits