1. 16 Jun, 2024 3 commits
    • dm: Improve zone resource limits handling · 73a74af0
      Damien Le Moal authored
      The generic stacking of limits implemented in the block layer cannot
      correctly handle stacking of zone resource limits (max open zones and
      max active zones) because these limits are for an entire device but the
      stacking may be for a portion of that device (e.g. a dm-linear target
      that does not cover an entire block device). As a result, when DM
      devices are created on top of zoned block devices, the DM device never
      has any zone resource limits advertised, which is only correct if all
      underlying target devices also have no zone resource limits.
      If at least one target device has resource limits, the user may see
      either performance issues (if the max open zone limit of the device is
      exceeded) or write I/O errors if the max active zone limit of one of
      the underlying target devices is exceeded.
      
      While it is very difficult to correctly and reliably stack zone resource
      limits in general, cases where targets are not sharing zone resources of
      the same device can be dealt with relatively easily. Such a situation
      arises when a target maps all sequential zones of a zoned block device:
      for such mapping, other targets mapping other parts of the same zoned
      block device can only contain conventional zones and thus will not
      require any zone resource to correctly handle write operations.
      
      For a mapped device constructed with such targets, which includes mapped
      devices constructed with targets mapping entire zoned block devices, the
      zone resource limits can be reliably determined using the non-zero
      minimum of the zone resource limits of all targets.
      
      For mapped devices that include targets partially mapping the set of
      sequential write required zones of zoned block devices, instead of
      advertising no zone resource limits, it is also better to set the mapped
      device limits to the non-zero minimum of the limits of all targets. In
      this case the limits for a target depend on the number of sequential
      zones being mapped: if this number of zones is larger than the limits,
      then the limits of the device apply and can be used. If on the other
      hand the target maps a number of zones smaller than the limits, then no
      limit is needed and we can assume that the target has no limits (limits
      set to 0).
      
      This commit improves zone resource limits handling as described above
      by modifying dm_set_zones_restrictions() to iterate the targets of a
      mapped device to evaluate the max open and max active zone limits. This
      relies on an internal "stacking" of the limits of the target devices
      combined with a direct counting of the number of sequential zones
      mapped by the targets.
      1) For a target mapping an entire zoned block device, the limits for the
         target are set to the limits of the device.
      2) For a target partially mapping a zoned block device, the number of
         mapped sequential zones is used to determine the limits: if the
         target maps more sequential write required zones than the device
         limits, then the limits of the device are used as-is. If the number
         of mapped sequential zones is lower than the limits, then we assume
         that the target has no limits (limits set to 0).
      As this evaluation is done for each target, the zone resource limits
      for the mapped device are evaluated as the non-zero minimum of the
      limits of all the targets.
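
      The two cases above and the non-zero minimum can be sketched in plain
      user-space C. This is only an illustration, not the code of this commit:
      struct target_info, struct zone_limits, partial_limit() and
      stack_zone_limits() are hypothetical names, and treating a target that
      maps exactly as many sequential zones as the limit value as having no
      limit is an assumption, since the text only covers the "more" and
      "lower" cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one target; not a kernel structure. */
struct target_info {
	unsigned int dev_max_open;     /* device max open zones, 0 = no limit */
	unsigned int dev_max_active;   /* device max active zones, 0 = no limit */
	unsigned int mapped_seq_zones; /* sequential zones mapped by the target */
	bool whole_device;             /* target maps the entire zoned device */
};

struct zone_limits {
	unsigned int max_open;
	unsigned int max_active;
};

/* Minimum of two values where 0 means "no limit" and is ignored. */
static unsigned int min_not_zero(unsigned int a, unsigned int b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return a < b ? a : b;
}

/*
 * Case 2: partial mapping. The device limit applies only if the target maps
 * more sequential zones than the limit. Assumption: equality means no limit.
 */
static unsigned int partial_limit(unsigned int dev_limit, unsigned int mapped)
{
	return mapped > dev_limit ? dev_limit : 0;
}

/* Evaluate each target, then combine with the non-zero minimum. */
static struct zone_limits stack_zone_limits(const struct target_info *t,
					    size_t n)
{
	struct zone_limits lim = { 0, 0 };

	assert(t != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		unsigned int open, active;

		if (t[i].whole_device) {
			/* Case 1: use the device limits as-is. */
			open = t[i].dev_max_open;
			active = t[i].dev_max_active;
		} else {
			open = partial_limit(t[i].dev_max_open,
					     t[i].mapped_seq_zones);
			active = partial_limit(t[i].dev_max_active,
					       t[i].mapped_seq_zones);
		}
		lim.max_open = min_not_zero(lim.max_open, open);
		lim.max_active = min_not_zero(lim.max_active, active);
	}
	return lim;
}
```

      In this sketch, a table made only of targets evaluated to "no limits"
      yields 0 for both values, which matches the case where no zone resource
      limits are advertised.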
      
      For configurations resulting in unreliable limits, i.e. a table
      containing a target partially mapping a zoned device, a warning message
      is issued.
      
      The counting of mapped sequential zones for the target is done using the
      new function dm_device_count_zones() which performs a report zones on
      the entire block device with the callback dm_device_count_zones_cb().
      This count of mapped sequential zones is also used to determine if the
      mapped device contains only conventional zones. This allows simplifying
      dm_set_zones_restrictions() to not do a report zones just for this.
      For mapped devices mapping only conventional zones, as before, the
      mapped device is changed to a regular device by setting its zoned limit
      to false and clearing all its zone related limits.
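
      The counting approach described above can be illustrated with a small
      user-space model of a zone report driven by a callback. This is a
      sketch under stated assumptions, not the implementation of
      dm_device_count_zones(): struct zone, struct count_ctx, report_zones()
      and count_zones_cb() are hypothetical stand-ins, and a zone is
      considered mapped when its start sector falls inside the target range.

```c
#include <assert.h>
#include <stddef.h>

enum zone_type { ZONE_CONVENTIONAL, ZONE_SEQ_WRITE_REQUIRED };

/* Hypothetical zone descriptor (start and length in sectors). */
struct zone {
	unsigned long long start;
	unsigned long long len;
	enum zone_type type;
};

/* Hypothetical counting context passed to the callback. */
struct count_ctx {
	unsigned long long target_start; /* first device sector mapped */
	unsigned long long target_len;   /* number of sectors mapped */
	unsigned int total_seq_zones;    /* sequential zones on the device */
	unsigned int target_seq_zones;   /* sequential zones in the target */
};

typedef int (*zone_cb)(const struct zone *z, unsigned int idx, void *data);

/* Count sequential zones of the device and those inside the target range. */
static int count_zones_cb(const struct zone *z, unsigned int idx, void *data)
{
	struct count_ctx *ctx = data;

	(void)idx;
	if (z->type == ZONE_CONVENTIONAL)
		return 0;
	ctx->total_seq_zones++;
	if (z->start >= ctx->target_start &&
	    z->start < ctx->target_start + ctx->target_len)
		ctx->target_seq_zones++;
	return 0;
}

/* Stand-in for a zone report over the entire device. */
static int report_zones(const struct zone *zones, size_t nr, zone_cb cb,
			void *data)
{
	assert(zones != NULL || nr == 0);
	for (size_t i = 0; i < nr; i++) {
		int ret = cb(&zones[i], (unsigned int)i, data);

		if (ret)
			return ret;
	}
	return 0;
}
```

      With such counts, a target for which target_seq_zones is 0 maps only
      conventional zones, so no separate zone report is needed to detect
      that case.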
      Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
      Reviewed-by: Niklas Cassel <cassel@kernel.org>
      Link: https://lore.kernel.org/r/20240611023639.89277-4-dlemoal@kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • dm: Call dm_revalidate_zones() after setting the queue limits · 7f91ccd8
      Damien Le Moal authored
      dm_revalidate_zones() is called from dm_set_zones_restrictions() when the
      mapped device queue limits are not yet set. However,
      dm_revalidate_zones() calls blk_revalidate_disk_zones() and this
      function consults and modifies the mapped device queue limits. Thus,
      currently, blk_revalidate_disk_zones() operates on limits that are not
      yet initialized.
      
      Fix this by moving the call to dm_revalidate_zones() out of
      dm_set_zones_restrictions() and into dm_table_set_restrictions() after
      executing queue_limits_set().
      
      To further clean up dm_set_zones_restrictions(), the message about the
      type of zone append (native or emulated) is also moved inside
      dm_revalidate_zones().
      
      Fixes: 1c0e7202 ("dm: use queue_limits_set")
      Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
      Reviewed-by: Niklas Cassel <cassel@kernel.org>
      Link: https://lore.kernel.org/r/20240611023639.89277-3-dlemoal@kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: Improve checks on zone resource limits · e21d12c7
      Damien Le Moal authored
      Make sure that the zone resource limits of a zoned block device are
      correct by checking that:
      (a) If the device has a max active zones limit, the max open zones
          limit does not exceed the max active zones limit.
      (b) If the device has zone resource limits, the limit values are lower
          than the number of sequential zones of the device. If they are
          not, assume that the zoned device has no limits by setting the
          limits to 0.
      
      For (a), a check is added to blk_validate_zoned_limits() and an error
      is returned if the max open zones limit exceeds the value of the max
      active zones limit (if there is one).
      
      For (b), given that we need the number of sequential zones of the zoned
      device, this check is added to disk_update_zone_resources(). This is
      safe to do as that function is executed with the disk queue frozen and
      the check executed after queue_limits_start_update() which takes the
      queue limits lock. Of note is that the early return in this function
      for zoned devices that do not use zone write plugging (e.g. DM devices
      using native zone append) is moved to after the new check and adjustment
      of the zone resource limits so that the check applies to any zoned
      device.
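
      Checks (a) and (b) can be sketched as two small user-space helpers.
      This is an illustration only, not the code added to
      blk_validate_zoned_limits() or disk_update_zone_resources(): the helper
      names are hypothetical, returning -EINVAL is an assumption (the text
      only says an error is returned), and applying check (b) to each limit
      independently is also an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Check (a): with a max active zones limit set, the max open zones limit
 * must not exceed it. A value of 0 means "no limit".
 */
static int validate_zone_resource_limits(unsigned int max_open,
					 unsigned int max_active)
{
	if (max_active && max_open > max_active)
		return -EINVAL; /* assumed error code */
	return 0;
}

/*
 * Check (b): a limit that is not lower than the number of sequential zones
 * can never be reached, so treat it as "no limit" by resetting it to 0.
 */
static void adjust_zone_resource_limits(unsigned int *max_open,
					unsigned int *max_active,
					unsigned int nr_seq_zones)
{
	assert(max_open != NULL && max_active != NULL);
	if (*max_open >= nr_seq_zones)
		*max_open = 0;
	if (*max_active >= nr_seq_zones)
		*max_active = 0;
}
```

      For example, under these assumptions a device reporting 128 max open
      zones and 256 max active zones but only 200 sequential zones would keep
      its max open zones limit and have its max active zones limit cleared.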
      Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Reviewed-by: Niklas Cassel <cassel@kernel.org>
      Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
      Link: https://lore.kernel.org/r/20240611023639.89277-2-dlemoal@kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  2. 15 Jun, 2024 1 commit
  3. 14 Jun, 2024 30 commits
  4. 12 Jun, 2024 6 commits