- 03 Jan, 2024 7 commits
-
-
Christoph Hellwig authored
ctrl->max_discard_sectors stores a value that is potentially based on the DMRSL field in Identify Controller, which is in units of LBAs and thus dependent on the Format of a namespace. Fix this by moving the calculation of max_discard_sectors entirely into nvme_config_discard and replacing the ctrl->max_discard_sectors value with a local variable so that the calculation is always namespace-specific. Fixes: 1a86924e ("nvme: fix interpretation of DMRSL") Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Keith Busch <kbusch@kernel.org>
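The unit problem described in this commit can be illustrated with a small userspace sketch. This is not the kernel's nvme_config_discard(); the helper name `dmrsl_to_sectors` and the clamping to `UINT32_MAX` are assumptions made for illustration. It only shows why one DMRSL value (in LBAs) maps to different 512-byte sector counts depending on the namespace's LBA size.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT 9	/* block layer sectors are 512 bytes */

/*
 * Hypothetical helper: convert a DMRSL value (in LBAs) into 512-byte
 * sectors for a namespace whose LBA size is (1 << lba_shift) bytes.
 * A result that does not fit in 32 bits is clamped to UINT32_MAX.
 */
static uint32_t dmrsl_to_sectors(uint32_t dmrsl, unsigned int lba_shift)
{
	assert(lba_shift >= SECTOR_SHIFT && lba_shift < 32);

	uint64_t sectors = (uint64_t)dmrsl << (lba_shift - SECTOR_SHIFT);

	return sectors > UINT32_MAX ? UINT32_MAX : (uint32_t)sectors;
}
```

With a DMRSL of 1024, a namespace formatted with 512-byte LBAs gets 1024 sectors while one with 4096-byte LBAs gets 8192, which is why a single controller-wide cached value cannot be right for both.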
-
Christoph Hellwig authored
Don't just skip the discard sectors and segments but also the granularity if a value was already set before. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by:
Keith Busch <kbusch@kernel.org>
-
Christoph Hellwig authored
Expand the comment a bit to explain what is going on. Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Keith Busch <kbusch@kernel.org>
-
Christoph Hellwig authored
No, a __le32 cast doesn't magically byteswap on big-endian systems. Fixes: 70525e5d ("nvmet-tcp: peek icreq before starting TLS") Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Sagi Grimberg <sagi@grimberg.me> Signed-off-by:
Keith Busch <kbusch@kernel.org>
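As a userspace illustration of the point above: a cast only changes the type, while an explicit conversion (the kernel's `le32_to_cpu()`) is what produces the right value on every host. The function names `le32_decode` and `load_native` below are hypothetical stand-ins, not kernel APIs.

```c
#include <stdint.h>
#include <string.h>

/*
 * Stand-in for an explicit little-endian conversion: assemble the value
 * from its wire bytes, independent of the host byte order.
 */
static uint32_t le32_decode(const uint8_t b[4])
{
	return (uint32_t)b[0] | (uint32_t)b[1] << 8 |
	       (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;
}

/*
 * What a bare cast amounts to: the bytes are reinterpreted in host order.
 * On a big-endian host this yields a byte-swapped value for
 * little-endian wire data.
 */
static uint32_t load_native(const uint8_t b[4])
{
	uint32_t v;

	memcpy(&v, b, sizeof(v));
	return v;
}
```

Only `le32_decode()` returns the same result on little-endian and big-endian machines; `load_native()` happens to agree with it on little-endian hosts, which is how such bugs go unnoticed.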
-
Christoph Hellwig authored
Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Sagi Grimberg <sagi@grimberg.me> Signed-off-by:
Keith Busch <kbusch@kernel.org>
-
Max Gurtovoy authored
There are no users of the NVMF_AUTH_HASH_LEN macro. Reviewed-by:
Israel Rukshin <israelr@nvidia.com> Reviewed-by:
Sagi Grimberg <sagi@grimberg.me> Signed-off-by:
Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by:
Keith Busch <kbusch@kernel.org>
-
Guixin Liu authored
There is no requirement to call nvme_tcp_free_queue() for queue deallocation if the pskid is null or the queue allocation fails, as the NVME_TCP_Q_ALLOCATED flag would not be set in such scenarios. Signed-off-by:
Guixin Liu <kanie@linux.alibaba.com> Reviewed-by:
Sagi Grimberg <sagi@grimberg.me> Signed-off-by:
Keith Busch <kbusch@kernel.org>
-
- 02 Jan, 2024 3 commits
-
-
Maurizio Lombardi authored
Simplify the nvmet_tcp_handle_h2c_data_pdu() function by removing boilerplate code. Signed-off-by:
Maurizio Lombardi <mlombard@redhat.com> Reviewed-by:
Sagi Grimberg <sagi@grimberg.me> Signed-off-by:
Keith Busch <kbusch@kernel.org>
-
Maurizio Lombardi authored
In nvmet_tcp_handle_h2c_data_pdu(), if the host sends a data_offset different from rbytes_done, the driver ends up calling nvmet_req_complete() passing a status error. The problem is that at this point cmd->req is not yet initialized, so the kernel will crash after dereferencing a NULL pointer. Fix the bug by replacing the call to nvmet_req_complete() with nvmet_tcp_fatal_error(). Fixes: 872d26a3 ("nvmet-tcp: add NVMe over TCP target driver") Reviewed-by:
Keith Busch <kbsuch@kernel.org> Reviewed-by:
Sagi Grimberg <sagi@grimberg.me> Signed-off-by:
Maurizio Lombardi <mlombard@redhat.com> Signed-off-by:
Keith Busch <kbusch@kernel.org>
-
Maurizio Lombardi authored
If the host sends an H2CData command with an invalid DATAL, the kernel may crash in nvmet_tcp_build_pdu_iovec(). Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 lr : nvmet_tcp_io_work+0x6ac/0x718 [nvmet_tcp] Call trace: process_one_work+0x174/0x3c8 worker_thread+0x2d0/0x3e8 kthread+0x104/0x110 Fix the bug by raising a fatal error if DATAL isn't coherent with the packet size. Also, the PDU length should never exceed the MAXH2CDATA parameter which has been communicated to the host in nvmet_tcp_handle_icreq(). Fixes: 872d26a3 ("nvmet-tcp: add NVMe over TCP target driver") Signed-off-by:
Maurizio Lombardi <mlombard@redhat.com> Reviewed-by:
Sagi Grimberg <sagi@grimberg.me> Signed-off-by:
Keith Busch <kbusch@kernel.org>
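The two checks named in this commit message (DATAL must agree with the packet size, and the data length must not exceed MAXH2CDATA) might be sketched as follows. The struct layout and the function name are hypothetical simplifications, not the nvmet-tcp code; in particular, `hdr_len` here is assumed to already include any digest bytes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of an H2CData PDU header. */
struct h2c_hdr {
	uint32_t plen;		/* total PDU length, header included */
	uint32_t hdr_len;	/* header length (assumed to include digests) */
	uint32_t data_length;	/* DATAL: payload bytes claimed by the host */
};

/* Return true only if DATAL is coherent with the PDU size and the limit. */
static bool h2c_datal_ok(const struct h2c_hdr *h, uint32_t maxh2cdata)
{
	if (h->plen < h->hdr_len)
		return false;	/* PDU shorter than its own header */
	if (h->data_length != h->plen - h->hdr_len)
		return false;	/* DATAL disagrees with the packet size */
	return h->data_length <= maxh2cdata;
}
```

A caller would treat a `false` result as a fatal protocol error instead of continuing to build I/O vectors from the untrusted length.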
-
- 29 Dec, 2023 9 commits
-
-
Christoph Hellwig authored
The discard granularity now defaults to a single sector, so don't set that value explicitly. Signed-off-by:
Christoph Hellwig <hch@lst.de> Acked-by:
Richard Weinberger <richard@nod.at> Link: https://lore.kernel.org/r/20231228075545.362768-10-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
The discard granularity now defaults to a single sector, so don't set that value explicitly. Signed-off-by:
Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231228075545.362768-9-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
The discard granularity now defaults to a single sector, so don't set that value explicitly. Signed-off-by:
Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231228075545.362768-8-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
The discard granularity now defaults to a single sector, so don't set that value explicitly. Signed-off-by:
Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231228075545.362768-7-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
The discard granularity now defaults to a single sector, so don't set that value explicitly. Also don't bother clearing it as a discard granularity without discard_sectors doesn't mean anything. Signed-off-by:
Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231228075545.362768-6-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
The discard granularity now defaults to a single sector, so don't set that value explicitly. Signed-off-by:
Christoph Hellwig <hch@lst.de> Acked-by:
Richard Weinberger <richard@nod.at> Link: https://lore.kernel.org/r/20231228075545.362768-5-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
Currently the discard granularity defaults to 0 and must be initialized by any driver that wants to support discard. Default to the sector size instead, which is the smallest possible value, and a very useful default. Signed-off-by:
Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231228075545.362768-4-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
Just like all block I/O, discards are in units of sectors. Thus setting a smaller than sector size discard limit in case of > 512 byte sectors in bcache doesn't make sense. Always set the discard granularity to 512 bytes instead. Signed-off-by:
Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231228075545.362768-3-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
A zero discard_granularity is not treated the same as a single-block one, and not having any segments after taking alignment is perfectly fine and does not need a warning. Signed-off-by:
Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231228075545.362768-2-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- 27 Dec, 2023 5 commits
-
-
Christoph Hellwig authored
Give BLK_DEF_MAX_SECTORS a _CAP postfix and document what it is used for. Signed-off-by:
Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231227092305.279567-5-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
BLK_DEF_MAX_SECTORS despite the confusing name is the default cap for the max_sectors limits. Don't use it to initialize max_hw_sectors, which is a hardware / driver capability. Signed-off-by:
Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231227092305.279567-4-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
BLK_DEF_MAX_SECTORS despite the confusing name is the default cap for the max_sectors limits. Don't use it to initialize max_hw_sectors, which is a hardware / driver capability. Signed-off-by:
Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231227092305.279567-3-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
null_blk has some rather odd capping of the max_hw_sectors value to BLK_DEF_MAX_SECTORS, which doesn't make sense - max_hw_sectors is the hardware limit, and BLK_DEF_MAX_SECTORS despite the confusing name is the default cap for the max_sectors field used for normal file system I/O. Remove all the capping, and simply leave it to the block layer or user to take up or not all of that for file system I/O. Fixes: ea17fd35 ("null_blk: Allow controlling max_hw_sectors limit") Signed-off-by:
Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231227092305.279567-2-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
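The distinction between the hardware limit and the soft cap drawn in the commits above can be modelled roughly as below. This is a simplified assumption about how the two limits relate, not the block layer's code: the function name, the `user_max_sectors` parameter, and passing the default cap as an argument are all illustrative.

```c
/*
 * Hypothetical model of the two limits. max_hw_sectors is what the
 * hardware/driver can do and is never reduced here; only the returned
 * soft limit (the max_sectors used for file system I/O) is capped.
 * user_max_sectors == 0 means "no user override".
 */
static unsigned int soft_max_sectors(unsigned int max_hw_sectors,
				     unsigned int user_max_sectors,
				     unsigned int default_cap)
{
	unsigned int limit = user_max_sectors ? user_max_sectors : default_cap;

	return limit < max_hw_sectors ? limit : max_hw_sectors;
}
```

In this model a driver reports its real capability in `max_hw_sectors`, and a user who wants larger file system I/O raises the soft limit up to that capability instead of the driver pre-capping it.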
-
Christoph Hellwig authored
loop_set_status doesn't change anything relevant to the discard and write_zeroes setting, so don't bother calling loop_config_discard. Signed-off-by:
Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231227082020.249427-1-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- 26 Dec, 2023 2 commits
-
-
Christoph Hellwig authored
Use the queue-wide write back cache tracking instead of duplicating the value in struct rq_wb. Signed-off-by:
Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231226090747.204969-1-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
submit_bio_noacct allows completely invalid operations, or operations that are not supported in the bio path. Extend the existing switch statement to reject all invalid types. Move the code point for REQ_OP_ZONE_APPEND so that it's not right in the middle of the zone management operations and the switch statement can follow the numerical order of the operations. Signed-off-by:
Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231221070538.1112446-1-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
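The idea of a switch that lists every accepted operation and rejects everything else by default might look like the sketch below. The enum, its names, and its ordering are hypothetical and do not reproduce the kernel's REQ_OP_* values or the exact rules in submit_bio_noacct.

```c
#include <stdbool.h>

/* Hypothetical operation codes; names and order are illustrative only. */
enum op {
	OP_READ,
	OP_WRITE,
	OP_FLUSH,
	OP_DISCARD,
	OP_WRITE_ZEROES,
	OP_ZONE_APPEND,
	OP_ZONE_RESET,
	OP_DRV_IN,	/* passthrough-style op, not valid on this path */
};

/* Accept only explicitly listed operations; anything else is rejected. */
static bool op_allowed(int op, bool zoned)
{
	switch (op) {
	case OP_READ:
	case OP_WRITE:
	case OP_FLUSH:
	case OP_DISCARD:
	case OP_WRITE_ZEROES:
		return true;
	case OP_ZONE_APPEND:
	case OP_ZONE_RESET:
		return zoned;	/* zone ops need a zoned device */
	default:
		return false;	/* unknown or unsupported on this path */
	}
}
```

Putting the rejection in the `default` arm means a newly added operation code is refused until someone deliberately lists it.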
-
- 22 Dec, 2023 2 commits
-
-
Randy Dunlap authored
Fix all kernel-doc warnings in drbd_actlog.c: drbd_actlog.c:963: warning: No description found for return value of 'drbd_rs_begin_io' drbd_actlog.c:1015: warning: Function parameter or member 'peer_device' not described in 'drbd_try_rs_begin_io' drbd_actlog.c:1015: warning: Excess function parameter 'device' description in 'drbd_try_rs_begin_io' drbd_actlog.c:1015: warning: No description found for return value of 'drbd_try_rs_begin_io' drbd_actlog.c:1197: warning: No description found for return value of 'drbd_rs_del_all' Fix one spelling error (s/ore/or/). Signed-off-by:
Randy Dunlap <rdunlap@infradead.org> Cc: Philipp Reisner <philipp.reisner@linbit.com> Cc: Lars Ellenberg <lars.ellenberg@linbit.com> Cc: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Cc: <drbd-dev@lists.linbit.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: <linux-block@vger.kernel.org> Link: https://lore.kernel.org/r/20231222061909.8791-1-rdunlap@infradead.org Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Kundan Kumar authored
Commit 41fa7222 ("blk-mq: do not include passthrough requests in I/O accounting") disables I/O accounting for passthrough requests. Since tools like 'iostat' do not show anything useful for passthrough I/O, it's wasteful to do start/end time-stamping. So do away with that. Avoiding the time-stamping improves the I/O performance by ~7%. Signed-off-by:
Kundan Kumar <kundan.kumar@samsung.com> Signed-off-by:
Kanchan Joshi <joshi.k@samsung.com> Link: https://lore.kernel.org/r/20231222101707.6921-1-kundan.kumar@samsung.com Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- 21 Dec, 2023 4 commits
-
-
Jens Axboe authored
Pull NVMe updates from Keith: "nvme updates for Linux 6.8 - nvme fabrics spec updates (Guixin, Max) - nvme target updates (Guixin, Evan) - nvme attribute refactoring (Daniel) - nvme-fc numa fix (Keith)" * tag 'nvme-6.8-2023-12-21' of git://git.infradead.org/nvme: nvme-fc: set numa_node after nvme_init_ctrl nvme-fabrics: don't check discovery ioccsz/iorcsz nvmet: configfs: use ctrl->instance to track passthru subsystems nvme: repack struct nvme_ns_head nvme: add csi, ms and nuse to sysfs nvme: rename ns attribute group nvme: refactor ns info setup function nvme: refactor ns info helpers nvme: move ns id info to struct nvme_ns_head nvmet: remove cntlid_min and cntlid_max check in nvmet_alloc_ctrl nvmet: allow identical cntlid_min and cntlid_max settings nvme-fabrics: check ioccsz and iorcsz nvme: introduce nvme_check_ctrl_fabric_info helper
-
Keith Busch authored
nvme_init_ctrl() resets numa_node to NUMA_NO_NODE, so be sure to set the desired value after that function call so it won't be overwritten. Reviewed-by:
Sagi Grimberg <sagi@grimberg.me> Reviewed-by:
Jens Axboe <axboe@kernel.dk> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by:
Keith Busch <kbusch@kernel.org>
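The ordering issue in this commit can be reproduced in miniature. Everything below is a hypothetical stand-in (`struct ctrl`, `init_ctrl`, `NO_NODE`); it only models an init routine that resets a field as a side effect, as the commit message says nvme_init_ctrl() does with numa_node.

```c
#define NO_NODE (-1)	/* stand-in for NUMA_NO_NODE */

struct ctrl {
	int numa_node;
};

/* Stand-in for an init routine that resets numa_node as a side effect. */
static void init_ctrl(struct ctrl *c)
{
	c->numa_node = NO_NODE;
}

/* Buggy order: the desired node is overwritten by the init call. */
static void setup_wrong_order(struct ctrl *c, int node)
{
	c->numa_node = node;
	init_ctrl(c);
}

/* Fixed order: assign the desired node after initialization. */
static void setup_fixed_order(struct ctrl *c, int node)
{
	init_ctrl(c);
	c->numa_node = node;
}
```

After `setup_wrong_order()` the controller is left with no node assigned, whereas `setup_fixed_order()` keeps the requested one.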
-
Max Gurtovoy authored
IOCCSZ and IORCSZ are reserved for discovery controllers. Avoid checking their values during identify controller phase. Fixes: 2fcd3ab3 ("nvme-fabrics: check ioccsz and iorcsz") Reported-by:
Daniel Wagner <dwagner@suse.de> Tested-by:
Daniel Wagner <dwagner@suse.de> Signed-off-by:
Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by:
Keith Busch <kbusch@kernel.org>
-
Jens Axboe authored
A previous commit split disk_set_zoned(..., bool) into not taking an argument for whether to set or clear, and instead added disk_clear_zoned() as the counterpart. However, that commit neglected to export the new symbol, causing failures for modular drivers that used it. Reported-by:
Stephen Rothwell <sfr@canb.auug.org.au> Fixes: d73e93b4 ("block: simplify disk_set_zoned") Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- 20 Dec, 2023 5 commits
-
-
Christoph Hellwig authored
disk_clear_zoned only needs to be called when a device reported zone managed mode first and we clear it. Add a check so that disk_clear_zoned isn't called on devices that were never zoned. This avoids a fairly expensive queue freezing when revalidating conventional devices. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Damien Le Moal <dlemoal@kernel.org> Reviewed-by:
Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20231217165359.604246-6-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
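The guard described in this commit, clearing zoned state only when the device was previously zoned, might be sketched like this. The `struct disk`, the `freezes` counter (standing in for the expensive queue freeze), and both function names are hypothetical and simplified.

```c
#include <stdbool.h>

struct disk {
	bool zoned;		/* device currently treated as zoned */
	unsigned int freezes;	/* counts expensive clear operations */
};

/* Stand-in for the costly clear path (queue freeze in the real code). */
static void clear_zoned(struct disk *d)
{
	d->freezes++;
	d->zoned = false;
}

/* Revalidate: only pay for the clear if the disk was zoned before. */
static void revalidate(struct disk *d, bool reports_zoned)
{
	if (reports_zoned) {
		d->zoned = true;
		return;
	}
	if (d->zoned)
		clear_zoned(d);
}
```

Revalidating a conventional disk any number of times then costs no freezes at all; the clear path runs only on an actual zoned-to-conventional transition.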
-
Christoph Hellwig authored
Only use disk_set_zoned to actually enable zoned device support. For clearing it, call disk_clear_zoned, which is renamed from disk_clear_zone_settings and now directly clears the zoned flag as well. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Damien Le Moal <dlemoal@kernel.org> Reviewed-by:
Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20231217165359.604246-5-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
When zones were first added to the SCSI and ATA specs, two different models were supported (in addition to the drive managed one that is invisible to the host): - host managed, where for non-conventional zones there is a strict requirement to write at the write pointer, or else an error is returned - host aware, where a write pointer is maintained if writes always happen at it, otherwise it is left in an under-defined state and the sequential write preferred zones behave like conventional zones (probably very badly performing ones, though) Not surprisingly this lukewarm model didn't prove to be very useful and was finally removed from the ZBC and SBC specs (NVMe never implemented it). Due to the easily disappearing write pointer host software could never rely on the write pointer to actually be useful for say recovery. Fortunately only a few HDD prototypes shipped using this model which never made it to mass production. Drop the support before it is too late. Note that any such host aware prototype HDD can still be used with Linux as we'll now treat it as a conventional HDD. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20231217165359.604246-4-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
virtblk_revalidate_zones is called unconditionally from virtblk_config_changed_work from the virtio config_changed callback. virtblk_revalidate_zones is a bit odd in that it re-clears the zoned state for host aware or non-zoned devices, which isn't needed unless the zoned mode changed - but a zone mode change to a host managed model isn't handled at all, and virtio_blk also doesn't handle any config change other than a capacity change (and even if it did, the upper layers above virtio_blk wouldn't handle it very well). But even the useful case of a size change that would add or remove zones isn't handled properly as blk_revalidate_disk_zones expects the device capacity to cover all zones, but the capacity is only updated after virtblk_revalidate_zones. As this code appears to be entirely untested and is getting in the way, remove it for now, but it can be readded in a fixed version with proper test coverage if needed. Fixes: 95bfec41 ("virtio-blk: add support for zoned block devices") Fixes: f1ba4e67 ("virtio-blk: fix to match virtio spec") Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Damien Le Moal <dlemoal@kernel.org> Reviewed-by:
Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by:
Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20231217165359.604246-3-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Christoph Hellwig authored
Move reading and checking the zoned model from virtblk_probe_zoned_device into the caller, leaving only the code to perform the actual setup for host managed zoned devices in virtblk_probe_zoned_device. This allows sharing the model reading and checking between builds with and without CONFIG_BLK_DEV_ZONED, and improves it for the !CONFIG_BLK_DEV_ZONED case. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Damien Le Moal <dlemoal@kernel.org> Reviewed-by:
Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by:
Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20231217165359.604246-2-hch@lst.de Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- 19 Dec, 2023 3 commits
-
-
Jens Axboe authored
Merge tag 'md-next-20231219' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-6.8/block Pull MD updates from Song: "1. Remove deprecated flavors, by Song Liu; 2. raid1 read error check support, by Li Nan; 3. Better handle events off-by-1 case, by Alex Lyakas." * tag 'md-next-20231219' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md: md: Remove deprecated CONFIG_MD_FAULTY md: Remove deprecated CONFIG_MD_MULTIPATH md: Remove deprecated CONFIG_MD_LINEAR md/raid1: support read error check md: factor out a helper exceed_read_errors() to check read_errors md: Whenassemble the array, consult the superblock of the freshest device md/raid1: remove unnecessary null checking
-
Song Liu authored
md-faulty has been marked as deprecated for 2.5 years. Remove it. Cc: Christoph Hellwig <hch@lst.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Neil Brown <neilb@suse.de> Cc: Guoqing Jiang <guoqing.jiang@linux.dev> Cc: Mateusz Grzonka <mateusz.grzonka@intel.com> Cc: Jes Sorensen <jes@trained-monkey.org> Signed-off-by:
Song Liu <song@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20231214222107.2016042-4-song@kernel.org
-
Song Liu authored
md-multipath has been marked as deprecated for 2.5 years. Remove it. Cc: Christoph Hellwig <hch@lst.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Neil Brown <neilb@suse.de> Cc: Guoqing Jiang <guoqing.jiang@linux.dev> Cc: Mateusz Grzonka <mateusz.grzonka@intel.com> Cc: Jes Sorensen <jes@trained-monkey.org> Signed-off-by:
Song Liu <song@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20231214222107.2016042-3-song@kernel.org
-