1. 23 Jan, 2019 3 commits
  2. 09 Jan, 2019 1 commit
  3. 13 Dec, 2018 1 commit
    • scsi: sd: use mempool for discard special page · 61cce6f6
      Jens Axboe authored
      
      When boxes are run near (or to) OOM, we have a problem with the discard
      page allocation in sd. If we fail allocating the special page, we return
      busy, and it'll get retried. But since ordering is honored for dispatch
      requests, we can keep retrying this same IO and failing. Behind that IO
      could be requests that want to free memory, but they never get the
      chance. This means you get repeated spews of traces like this:
      
      [1201401.625972] Call Trace:
      [1201401.631748]  dump_stack+0x4d/0x65
      [1201401.639445]  warn_alloc+0xec/0x190
      [1201401.647335]  __alloc_pages_slowpath+0xe84/0xf30
      [1201401.657722]  ? get_page_from_freelist+0x11b/0xb10
      [1201401.668475]  ? __alloc_pages_slowpath+0x2e/0xf30
      [1201401.679054]  __alloc_pages_nodemask+0x1f9/0x210
      [1201401.689424]  alloc_pages_current+0x8c/0x110
      [1201401.699025]  sd_setup_write_same16_cmnd+0x51/0x150
      [1201401.709987]  sd_init_command+0x49c/0xb70
      [1201401.719029]  scsi_setup_cmnd+0x9c/0x160
      [1201401.727877]  scsi_queue_rq+0x4d9/0x610
      [1201401.736535]  blk_mq_dispatch_rq_list+0x19a/0x360
      [1201401.747113]  blk_mq_sched_dispatch_requests+0xff/0x190
      [1201401.758844]  __blk_mq_run_hw_queue+0x95/0xa0
      [1201401.768653]  blk_mq_run_work_fn+0x2c/0x30
      [1201401.777886]  process_one_work+0x14b/0x400
      [1201401.787119]  worker_thread+0x4b/0x470
      [1201401.795586]  kthread+0x110/0x150
      [1201401.803089]  ? rescuer_thread+0x320/0x320
      [1201401.812322]  ? kthread_park+0x90/0x90
      [1201401.820787]  ? do_syscall_64+0x53/0x150
      [1201401.829635]  ret_from_fork+0x29/0x40
      
      Ensure that the discard page allocation has a mempool backing, so we
      know we can make progress.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
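The forward-progress argument in this commit can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct page_pool`, its functions, and the `simulate_oom` hook are hypothetical names invented for illustration, and the code does not use or reproduce the kernel mempool API or the actual change in sd.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model of a reserve-backed pool; NOT the kernel mempool API. */
struct page_pool {
	void *reserve;      /* one preallocated element held back for emergencies */
	size_t elem_size;   /* size of every element handed out by the pool */
	bool simulate_oom;  /* hypothetical test hook: make the normal allocator fail */
};

/* Preallocate the reserve element up front, while memory is still available. */
static int page_pool_init(struct page_pool *pool, size_t elem_size)
{
	pool->elem_size = elem_size;
	pool->simulate_oom = false;
	pool->reserve = malloc(elem_size);
	return pool->reserve ? 0 : -1;
}

/* Try the normal allocator first; fall back to the reserve if it fails. */
static void *page_pool_alloc(struct page_pool *pool)
{
	void *p = pool->simulate_oom ? NULL : malloc(pool->elem_size);

	if (!p) {
		p = pool->reserve;
		pool->reserve = NULL;
	}
	return p; /* NULL only while the reserve element is already in use */
}

/* Refill the reserve before giving memory back to the normal allocator. */
static void page_pool_free(struct page_pool *pool, void *p)
{
	assert(p != NULL);
	if (!pool->reserve)
		pool->reserve = p;
	else
		free(p);
}

/* Release the reserve element when the pool is torn down. */
static void page_pool_destroy(struct page_pool *pool)
{
	free(pool->reserve);
	pool->reserve = NULL;
}
```

In this model, at least one request can always obtain its page even when the normal allocator keeps failing, and the page returns to the reserve when that request completes, so the next request can use it in turn.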
  4. 10 Nov, 2018 1 commit
  5. 25 Oct, 2018 2 commits
    • block: Introduce blk_revalidate_disk_zones() · bf505456
      Damien Le Moal authored
      
      Drivers exposing zoned block devices have to initialize and maintain
      correctness (i.e. revalidate) of the device zone bitmaps attached to
      the device request queue (seq_zones_bitmap and seq_zones_wlock).
      
      To simplify coding this, introduce a generic helper function
      blk_revalidate_disk_zones() suitable for most (and likely all) cases.
      This new function always updates the seq_zones_bitmap and seq_zones_wlock
      bitmaps as well as the queue nr_zones field when called for a disk
      using a request based queue. For a disk using a BIO based queue, only
      the number of zones is updated since these queues do not have
      schedulers and so do not need the zone bitmaps.
      
      With this change, the zone bitmap initialization code in sd_zbc.c can be
      replaced with a call to this function in sd_zbc_read_zones(), which is
      called from the disk revalidate block operation method.
      
      A call to blk_revalidate_disk_zones() is also added to the null_blk
      driver for devices created with the zoned mode enabled.
      
      Finally, to ensure that zoned devices created with dm-linear or
      dm-flakey expose the correct number of zones through sysfs, a call to
      blk_revalidate_disk_zones() is added to dm_table_set_restrictions().
      
      The zone bitmaps allocated and initialized with
      blk_revalidate_disk_zones() are freed automatically from
      __blk_release_queue() using the block internal function
      blk_queue_free_zone_bitmaps().

      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
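To make the idea of a per-zone bitmap concrete, here is a minimal userspace sketch of building a bitmap with one bit per zone, set when the zone is sequential. The names `enum zone_type`, `build_seq_zones_bitmap()` and `zone_is_seq()` are hypothetical and chosen for illustration. This is not the implementation of blk_revalidate_disk_zones(), and the assumption that a set bit means "sequential zone" is mine rather than something stated in the commit message.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical zone classification used only by this sketch. */
enum zone_type { ZONE_CONVENTIONAL, ZONE_SEQUENTIAL };

/* Allocate a zeroed bitmap and set one bit for each sequential zone. */
static unsigned long *build_seq_zones_bitmap(const enum zone_type *zones,
					     unsigned int nr_zones)
{
	size_t words = (nr_zones + BITS_PER_WORD - 1) / BITS_PER_WORD;
	unsigned long *bitmap = calloc(words ? words : 1, sizeof(*bitmap));
	unsigned int i;

	if (!bitmap)
		return NULL;

	for (i = 0; i < nr_zones; i++) {
		if (zones[i] == ZONE_SEQUENTIAL)
			bitmap[i / BITS_PER_WORD] |= 1UL << (i % BITS_PER_WORD);
	}
	return bitmap;
}

/* Test whether zone number zno is marked sequential in the bitmap. */
static bool zone_is_seq(const unsigned long *bitmap, unsigned int zno)
{
	assert(bitmap != NULL);
	return (bitmap[zno / BITS_PER_WORD] >> (zno % BITS_PER_WORD)) & 1UL;
}
```

A revalidation pass in this model would rebuild the bitmap from a fresh zone report and swap it in, which is the kind of bookkeeping the commit moves out of individual drivers and into a shared helper.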
    • block: add a report_zones method · e76239a3
      Christoph Hellwig authored
      
      Dispatching a report zones command through the request queue is a major
      pain due to the command reply payload rewriting necessary. Given that
      blkdev_report_zones() is executing everything synchronously, implement
      report zones as a block device file operation instead, allowing major
      simplification of the code in many places.
      
      Since sd, null-blk, dm-linear and dm-flakey are the only block device
      drivers that support exposing zoned block devices, these drivers are
      modified to provide the device side implementation of the
      report_zones() block device file operation.
      
      For device mappers, a new report_zones() target type operation is
      defined so that the upper block layer calls blkdev_report_zones() can
      be propagated down to the underlying devices of the dm targets.
      Implementation for this new operation is added to the dm-linear and
      dm-flakey targets.

      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      [Damien]
      * Changed method block_device argument to gendisk
      * Various bug fixes and improvements
      * Added support for null_blk, dm-linear and dm-flakey.
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  6. 28 Sep, 2018 1 commit
  7. 26 Sep, 2018 1 commit
    • block: Move power management code into a new source file · bca6b067
      Bart Van Assche authored
      
      Move the code for runtime power management from blk-core.c into the
      new source file blk-pm.c. Move the corresponding declarations from
      <linux/blkdev.h> into <linux/blk-pm.h>. For CONFIG_PM=n, leave out
      the declarations of the functions that are not used in that mode.
      This patch not only reduces the number of #ifdefs in the block layer
      core code but also reduces the size of header file <linux/blkdev.h>
      and hence should help to reduce the build time of the Linux kernel
      if CONFIG_PM is not defined.

      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Reviewed-by: Ming Lei <ming.lei@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Jianchao Wang <jianchao.w.wang@oracle.com>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Johannes Thumshirn <jthumshirn@suse.de>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  8. 21 Sep, 2018 1 commit
  9. 17 Sep, 2018 1 commit
  10. 22 Aug, 2018 1 commit
  11. 30 Jul, 2018 1 commit
  12. 26 Jun, 2018 2 commits
  13. 19 Apr, 2018 1 commit
  14. 20 Mar, 2018 1 commit
    • scsi: sd: Remember that READ CAPACITY(16) succeeded · 597d7400
      Martin K. Petersen authored
      
      The USB storage glue sets the try_rc_10_first flag in an attempt to
      avoid wedging poorly implemented legacy USB devices.
      
      If the device capacity is too large to be expressed in the provided
      response buffer field of READ CAPACITY(10), a well-behaved device will
      set the reported capacity to 0xFFFFFFFF. We will then attempt to issue a
      READ CAPACITY(16) to obtain the real capacity.
      
      Since this part of the discovery logic is not covered by the first_scan
      flag, a warning will be printed a couple of times per revalidate
      attempt if we upgrade from READ CAPACITY(10) to READ CAPACITY(16).
      
      Remember that we have successfully issued READ CAPACITY(16) so we can
      take the fast path on subsequent revalidate attempts.

      Reported-by: Menion <menion@gmail.com>
      Reviewed-by: Laurence Oberman <loberman@redhat.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  15. 08 Mar, 2018 1 commit
  16. 07 Mar, 2018 1 commit
  17. 09 Jan, 2018 1 commit
  18. 12 Dec, 2017 1 commit
    • scsi: core: Fix a scsi_show_rq() NULL pointer dereference · 14e3062f
      Bart Van Assche authored
      Avoid that scsi_show_rq() triggers a NULL pointer dereference if called
      after sd_uninit_command(). Swap the NULL pointer assignment and the
      mempool_free() call in sd_uninit_command() to make it less likely that
      scsi_show_rq() triggers a use-after-free. Note: even with these changes
      scsi_show_rq() can trigger a use-after-free but that's a lesser evil
      than e.g. suppressing debug information for T10 PI Type 2 commands
      completely. This patch fixes the following oops:
      
      BUG: unable to handle kernel NULL pointer dereference at (null)
      IP: scsi_format_opcode_name+0x1a/0x1c0
      CPU: 1 PID: 1881 Comm: cat Not tainted 4.14.0-rc2.blk_mq_io_hang+ #516
      Call Trace:
       __scsi_format_command+0x27/0xc0
       scsi_show_rq+0x5c/0xc0
       __blk_mq_debugfs_rq_show+0x116/0x130
       blk_mq_debugfs_rq_show+0xe/0x10
       seq_read+0xfe/0x3b0
       full_proxy_read+0x54/0x90
       __vfs_read+0x37/0x160
       vfs_read+0x96/0x130
       SyS_read+0x55/0xc0
       entry_SYSCALL_64_fastpath+0x1a/0xa5
      
      [mkp: added Type 2]
      
      Fixes: 0eebd005 ("scsi: Implement blk_mq_ops.show_rq()")
      Reported-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
      Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Ming Lei <ming.lei@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Johannes Thumshirn <jthumshirn@suse.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
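The ordering change described in this commit (clear the pointer first, free afterwards, and have the debug reader tolerate NULL) can be sketched in userspace as follows. `struct request_model`, `model_uninit_command()` and `model_show_opcode()` are hypothetical names for illustration, not the kernel functions. As the commit message itself notes, this ordering narrows the window but does not eliminate a use-after-free under real concurrency, and this single-threaded model does not attempt to show that race.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical request carrying a separately allocated command buffer. */
struct request_model {
	unsigned char *cmnd; /* buffer that a debug reader may inspect */
};

/* Detach the pointer before freeing, so a later reader sees NULL
 * instead of a dangling pointer. */
static void model_uninit_command(struct request_model *rq)
{
	unsigned char *cmnd;

	assert(rq != NULL);
	cmnd = rq->cmnd;
	rq->cmnd = NULL;
	free(cmnd);
}

/* Debug reader: return the opcode byte, or -1 if no command is attached. */
static int model_show_opcode(const struct request_model *rq)
{
	const unsigned char *cmnd = rq->cmnd; /* read the pointer once */

	return cmnd ? cmnd[0] : -1;
}
```

The reader copies the pointer into a local before testing it, so the NULL check and the dereference act on the same value.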
  19. 05 Dec, 2017 1 commit
  20. 17 Oct, 2017 2 commits
  21. 03 Oct, 2017 2 commits
  22. 25 Sep, 2017 1 commit
  23. 15 Sep, 2017 1 commit
  24. 25 Aug, 2017 4 commits
  25. 17 Aug, 2017 1 commit
  26. 29 Jun, 2017 1 commit
  27. 26 Jun, 2017 1 commit
  28. 13 Jun, 2017 1 commit
    • scsi: Protect SCSI device state changes with a mutex · 0db6ca8a
      Bart Van Assche authored
      
      Serializing SCSI device state changes prevents two state changes from
      occurring concurrently, e.g. the state changes in scsi_target_block() and
      __scsi_remove_device(). This serialization is essential to make patch
      "Make __scsi_remove_device go straight from BLOCKED to DEL" work
      reliably.
      
      Enable this mechanism for all scsi_target_*block() callers but not for
      the scsi_internal_device_unblock() calls from the mpt3sas driver because
      that driver can call scsi_internal_device_unblock() from atomic context.

      Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  29. 17 May, 2017 1 commit
  30. 12 May, 2017 2 commits