- 04 Nov, 2019 16 commits
-
-
Logan Gunthorpe authored
It's not apprporiate for the transports to set the data_len field of the request which is only used by the core. In this case, just use a variable on the stack to store the length of the sgl for comparison. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Logan Gunthorpe authored
None of the other transports check data_len which is verified in core code. The function should instead check that the sgl length is non-zero. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Damien Le Moal authored
Introduce the new helper function nvme_lba_to_sect() to convert a device logical block number to a 512B sector number. Use this new helper in obvious places, cleaning up the code. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Damien Le Moal authored
Rename nvme_block_nr() to nvme_sect_to_lba() and use SECTOR_SHIFT instead of its hard coded value 9. Also add a comment to decribe this helper. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Revanth Rajashekar authored
Update enumerations and structures in include/linux/nvme.h to resync with the nvmecli. All the updates are mentioned in the ratified NVMe 1.4 spec https://nvmexpress.org/wp-content/uploads/NVM-Express-1_4-2019.06.10-Ratified.pdfReviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Revanth Rajashekar <revanth.rajashekar@intel.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Max Gurtovoy authored
nvme_cleanup_cmd should be called for each call to nvme_setup_cmd (symmetrical functions). Move the call for nvme_cleanup_cmd to the common core layer and call it during nvme_complete_rq for the good flow. For error flow, each transport will call nvme_cleanup_cmd independently. Also take care of a special case of path failure, where we call nvme_complete_rq without doing nvme_setup_cmd. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Max Gurtovoy authored
Fix the status code of canceled requests initiated by the host according to TP4028 (Status Code 0x371): "Command Aborted By host: The command was aborted as a result of host action (e.g., the host disconnected the Fabric connection)." Also in a multipath environment, unless otherwise specified, errors of this type (path related) should be retried using a different path, if one is available. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Israel Rukshin authored
The calls to nvmet_req_alloc_sgl and rdma_rw_ctx_init should usually succeed, so add this simple optimization to the fast path. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Israel Rukshin authored
The call to sgl_alloc shouldn't fail so add this simple optimization to the fast path. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Israel Rukshin authored
This commit doesn't change any logic. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Israel Rukshin authored
This function improves code readability and reduces code duplication. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
James Smart authored
Code today only clears the association_id if a Disconnect LS is transmit. Remove ambiguity and unconditionally clear the association_id if the association has been terminated. Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
James Smart authored
Change wording on a couple of messages to clarify what happened. Signed-off-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
James Smart authored
Set the new category field in the FC-NVME CMND_IU based on queue number. Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
James Smart authored
Sync sources with revised structure and field names to correspond with FC-NVME-2 header sync-up. Tested interoperability with success: - prior initiator with new target - prior target with new initiator - new on new Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
James Smart authored
Sync the header to FC-NVME-2 r1.06 (T11-2019-00210-v001). Includes some minor mods where pre-release field names changed by the time the spec was released. Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 01 Nov, 2019 1 commit
-
-
Darrick J. Wong authored
Currently, if the loop device receives a WRITE_ZEROES request, it asks the underlying filesystem to punch out the range. This behavior is correct if unmapping is allowed. However, a NOUNMAP request means that the caller doesn't want us to free the storage backing the range, so punching out the range is incorrect behavior. To satisfy a NOUNMAP | WRITE_ZEROES request, loop should ask the underlying filesystem to FALLOC_FL_ZERO_RANGE, which is (according to the fallocate documentation) required to ensure that the entire range is backed by real storage, which suffices for our purposes. Fixes: 19372e27 ("loop: implement REQ_OP_WRITE_ZEROES") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 25 Oct, 2019 1 commit
-
-
Geert Uytterhoeven authored
Fix misspelling of "configuration". Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 24 Oct, 2019 5 commits
-
-
Jens Axboe authored
Merge branch 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-5.5/drivers Pull MD changes from Song. * 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md: md: no longer compare spare disk superblock events in super_load md: improve handling of bio with REQ_PREFLUSH in md_flush_request() md/bitmap: avoid race window between md_bitmap_resize and bitmap_file_clear_bit md/raid0: Fix an error message in raid0_make_request()
-
Yufen Yu authored
We have a test case as follow: mdadm -CR /dev/md1 -l 1 -n 4 /dev/sd[a-d] \ --assume-clean --bitmap=internal mdadm -S /dev/md1 mdadm -A /dev/md1 /dev/sd[b-c] --run --force mdadm --zero /dev/sda mdadm /dev/md1 -a /dev/sda echo offline > /sys/block/sdc/device/state echo offline > /sys/block/sdb/device/state sleep 5 mdadm -S /dev/md1 echo running > /sys/block/sdb/device/state echo running > /sys/block/sdc/device/state mdadm -A /dev/md1 /dev/sd[a-c] --run --force When we readd /dev/sda to the array, it started to do recovery. After offline the other two disks in md1, the recovery have been interrupted and superblock update info cannot be written to the offline disks. While the spare disk (/dev/sda) can continue to update superblock info. After stopping the array and assemble it, we found the array run fail, with the follow kernel message: [ 172.986064] md: kicking non-fresh sdb from array! [ 173.004210] md: kicking non-fresh sdc from array! [ 173.022383] md/raid1:md1: active with 0 out of 4 mirrors [ 173.022406] md1: failed to create bitmap (-5) [ 173.023466] md: md1 stopped. Since both sdb and sdc have the value of 'sb->events' smaller than that in sda, they have been kicked from the array. However, the only remained disk sda is in 'spare' state before stop and it cannot be added to conf->mirrors[] array. In the end, raid array assemble and run fail. In fact, we can use the older disk sdb or sdc to assemble the array. That means we should not choose the 'spare' disk as the fresh disk in analyze_sbs(). To fix the problem, we do not compare superblock events when it is a spare disk, as same as validate_super. Signed-off-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Song Liu <songliubraving@fb.com>
-
David Jeffery authored
If pers->make_request fails in md_flush_request(), the bio is lost. To fix this, pass back a bool to indicate if the original make_request call should continue to handle the I/O and instead of assuming the flush logic will push it to completion. Convert md_flush_request to return a bool and no longer calls the raid driver's make_request function. If the return is true, then the md flush logic has or will complete the bio and the md make_request call is done. If false, then the md make_request function needs to keep processing like it is a normal bio. Let the original call to md_handle_request handle any need to retry sending the bio to the raid driver's make_request function should it be needed. Also mark md_flush_request and the make_request function pointer as __must_check to issue warnings should these critical return values be ignored. Fixes: 2bc13b83 ("md: batch flush requests.") Cc: stable@vger.kernel.org # # v4.19+ Cc: NeilBrown <neilb@suse.com> Signed-off-by: David Jeffery <djeffery@redhat.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com>
-
Guoqing Jiang authored
We need to move "spin_lock_irq(&bitmap->counts.lock)" before unmap previous storage, otherwise panic like belows could happen as follows. [ 902.353802] sdl: detected capacity change from 1077936128 to 3221225472 [ 902.616948] general protection fault: 0000 [#1] SMP [snip] [ 902.618588] CPU: 12 PID: 33698 Comm: md0_raid1 Tainted: G O 4.14.144-1-pserver #4.14.144-1.1~deb10 [ 902.618870] Hardware name: Supermicro SBA-7142G-T4/BHQGE, BIOS 3.00 10/24/2012 [ 902.619120] task: ffff9ae1860fc600 task.stack: ffffb52e4c704000 [ 902.619301] RIP: 0010:bitmap_file_clear_bit+0x90/0xd0 [md_mod] [ 902.619464] RSP: 0018:ffffb52e4c707d28 EFLAGS: 00010087 [ 902.619626] RAX: ffe8008b0d061000 RBX: ffff9ad078c87300 RCX: 0000000000000000 [ 902.619792] RDX: ffff9ad986341868 RSI: 0000000000000803 RDI: ffff9ad078c87300 [ 902.619986] RBP: ffff9ad0ed7a8000 R08: 0000000000000000 R09: 0000000000000000 [ 902.620154] R10: ffffb52e4c707ec0 R11: ffff9ad987d1ed44 R12: ffff9ad0ed7a8360 [ 902.620320] R13: 0000000000000003 R14: 0000000000060000 R15: 0000000000000800 [ 902.620487] FS: 0000000000000000(0000) GS:ffff9ad987d00000(0000) knlGS:0000000000000000 [ 902.620738] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 902.620901] CR2: 000055ff12aecec0 CR3: 0000001005207000 CR4: 00000000000406e0 [ 902.621068] Call Trace: [ 902.621256] bitmap_daemon_work+0x2dd/0x360 [md_mod] [ 902.621429] ? find_pers+0x70/0x70 [md_mod] [ 902.621597] md_check_recovery+0x51/0x540 [md_mod] [ 902.621762] raid1d+0x5c/0xeb0 [raid1] [ 902.621939] ? try_to_del_timer_sync+0x4d/0x80 [ 902.622102] ? del_timer_sync+0x35/0x40 [ 902.622265] ? schedule_timeout+0x177/0x360 [ 902.622453] ? call_timer_fn+0x130/0x130 [ 902.622623] ? find_pers+0x70/0x70 [md_mod] [ 902.622794] ? md_thread+0x94/0x150 [md_mod] [ 902.622959] md_thread+0x94/0x150 [md_mod] [ 902.623121] ? wait_woken+0x80/0x80 [ 902.623280] kthread+0x119/0x130 [ 902.623437] ? kthread_create_on_node+0x60/0x60 [ 902.623600] ret_from_fork+0x22/0x40 [ 902.624225] RIP: bitmap_file_clear_bit+0x90/0xd0 [md_mod] RSP: ffffb52e4c707d28 Because mdadm was running on another cpu to do resize, so bitmap_resize was called to replace bitmap as below shows. PID: 38801 TASK: ffff9ad074a90e00 CPU: 0 COMMAND: "mdadm" [exception RIP: queued_spin_lock_slowpath+56] [snip] -- <NMI exception stack> -- #5 [ffffb52e60f17c58] queued_spin_lock_slowpath at ffffffff9c0b27b8 #6 [ffffb52e60f17c58] bitmap_resize at ffffffffc0399877 [md_mod] #7 [ffffb52e60f17d30] raid1_resize at ffffffffc0285bf9 [raid1] #8 [ffffb52e60f17d50] update_size at ffffffffc038a31a [md_mod] #9 [ffffb52e60f17d70] md_ioctl at ffffffffc0395ca4 [md_mod] And the procedure to keep resize bitmap safe is allocate new storage space, then quiesce, copy bits, replace bitmap, and re-start. However the daemon (bitmap_daemon_work) could happen even the array is quiesced, which means when bitmap_file_clear_bit is triggered by raid1d, then it thinks it should be fine to access store->filemap since counts->lock is held, but resize could change the storage without the protection of the lock. Cc: Jack Wang <jinpu.wang@cloud.ionos.com> Cc: NeilBrown <neilb@suse.com> Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
-
Dan Carpenter authored
The first argument to WARN() is supposed to be a condition. The original code will just print the mdname() instead of the full warning message. Fixes: c84a1372 ("md/raid0: avoid RAID0 data corruption due to layout confusion.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Song Liu <songliubraving@fb.com>
-
- 18 Oct, 2019 1 commit
-
-
Ajay Joshi authored
A zoned block device maintains a write pointer within a zone, and reads beyond the write pointer are undefined. Fill data buffer returned above the write pointer with 0xFF. Signed-off-by: Ajay Joshi <ajay.joshi@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Matias Bjørling <matias.bjorling@wdc.com> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 07 Oct, 2019 2 commits
-
-
Bart Van Assche authored
This patch makes it possible to test blk_mq_update_nr_hw_queues() from inside a VM. Cc: Christoph Hellwig <hch@infradead.org> Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Bart Van Assche authored
Introduce a local variable to make the code easier to read. This patch does not change any functionality but makes the next patch in this series easier to read. Cc: Christoph Hellwig <hch@infradead.org> Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 06 Oct, 2019 4 commits
-
-
Linus Torvalds authored
-
Linus Torvalds authored
In commit 4ed28639 ("fs, elf: drop MAP_FIXED usage from elf_map") we changed elf to use MAP_FIXED_NOREPLACE instead of MAP_FIXED for the executable mappings. Then, people reported that it broke some binaries that had overlapping segments from the same file, and commit ad55eac7 ("elf: enforce MAP_FIXED on overlaying elf segments") re-instated MAP_FIXED for some overlaying elf segment cases. But only some - despite the summary line of that commit, it only did it when it also does a temporary brk vma for one obvious overlapping case. Now Russell King reports another overlapping case with old 32-bit x86 binaries, which doesn't trigger that limited case. End result: we had better just drop MAP_FIXED_NOREPLACE entirely, and go back to MAP_FIXED. Yes, it's a sign of old binaries generated with old tool-chains, but we do pride ourselves on not breaking existing setups. This still leaves MAP_FIXED_NOREPLACE in place for the load_elf_interp() and the old load_elf_library() use-cases, because nobody has reported breakage for those. Yet. Note that in all the cases seen so far, the overlapping elf sections seem to be just re-mapping of the same executable with different section attributes. We could possibly introduce a new MAP_FIXED_NOFILECHANGE flag or similar, which acts like NOREPLACE, but allows just remapping the same executable file using different protection flags. It's not clear that would make a huge difference to anything, but if people really hate that "elf remaps over previous maps" behavior, maybe at least a more limited form of remapping would alleviate some concerns. Alternatively, we should take a look at our elf_map() logic to see if we end up not mapping things properly the first time. In the meantime, this is the minimal "don't do that then" patch while people hopefully think about it more. Reported-by: Russell King <linux@armlinux.org.uk> Fixes: 4ed28639 ("fs, elf: drop MAP_FIXED usage from elf_map") Fixes: ad55eac7 ("elf: enforce MAP_FIXED on overlaying elf segments") Cc: Michal Hocko <mhocko@suse.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.infradead.org/users/hch/dma-mappingLinus Torvalds authored
Pull dma-mapping regression fix from Christoph Hellwig: "Revert an incorret hunk from a patch that caused problems on various arm boards (Andrey Smirnov)" * tag 'dma-mapping-5.4-1' of git://git.infradead.org/users/hch/dma-mapping: dma-mapping: fix false positive warnings in dma_common_free_remap()
-
git://git.kernel.org/pub/scm/linux/kernel/git/soc/socLinus Torvalds authored
Pull ARM SoC fixes from Olof Johansson: "A few fixes this time around: - Fixup of some clock specifications for DRA7 (device-tree fix) - Removal of some dead/legacy CPU OPP/PM code for OMAP that throws warnings at boot - A few more minor fixups for OMAPs, most around display - Enable STM32 QSPI as =y since their rootfs sometimes comes from there - Switch CONFIG_REMOTEPROC to =y since it went from tristate to bool - Fix of thermal zone definition for ux500 (5.4 regression)" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: ARM: multi_v7_defconfig: Fix SPI_STM32_QSPI support ARM: dts: ux500: Fix up the CPU thermal zone arm64/ARM: configs: Change CONFIG_REMOTEPROC from m to y ARM: dts: am4372: Set memory bandwidth limit for DISPC ARM: OMAP2+: Fix warnings with broken omap2_set_init_voltage() ARM: OMAP2+: Add missing LCDC midlemode for am335x ARM: OMAP2+: Fix missing reset done flag for am3 and am43 ARM: dts: Fix gpio0 flags for am335x-icev2 ARM: omap2plus_defconfig: Enable more droid4 devices as loadable modules ARM: omap2plus_defconfig: Enable DRM_TI_TFP410 DTS: ARM: gta04: introduce legacy spi-cs-high to make display work again ARM: dts: Fix wrong clocks for dra7 mcasp clk: ti: dra7: Fix mcasp8 clock bits
-
- 05 Oct, 2019 10 commits
-
-
Linus Torvalds authored
Merge tag 'kbuild-fixes-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - remove unneeded ar-option and KBUILD_ARFLAGS - remove long-deprecated SUBDIRS - fix modpost to suppress false-positive warnings for UML builds - fix namespace.pl to handle relative paths to ${objtree}, ${srctree} - make setlocalversion work for /bin/sh - make header archive reproducible - fix some Makefiles and documents * tag 'kbuild-fixes-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kheaders: make headers archive reproducible kbuild: update compile-test header list for v5.4-rc2 kbuild: two minor updates for Documentation/kbuild/modules.rst scripts/setlocalversion: clear local variable to make it work for sh namespace: fix namespace.pl script to support relative paths video/logo: do not generate unneeded logo C files video/logo: remove unneeded *.o pattern from clean-files integrity: remove pointless subdir-$(CONFIG_...) integrity: remove unneeded, broken attempt to add -fshort-wchar modpost: fix static EXPORT_SYMBOL warnings for UML build kbuild: correct formatting of header in kbuild module docs kbuild: remove SUBDIRS support kbuild: remove ar-option and KBUILD_ARFLAGS
-
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds authored
Pull SCSI fixes from James Bottomley: "Twelve patches mostly small but obvious fixes or cosmetic but small updates" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: qla2xxx: Fix Nport ID display value scsi: qla2xxx: Fix N2N link up fail scsi: qla2xxx: Fix N2N link reset scsi: qla2xxx: Optimize NPIV tear down process scsi: qla2xxx: Fix stale mem access on driver unload scsi: qla2xxx: Fix unbound sleep in fcport delete path. scsi: qla2xxx: Silence fwdump template message scsi: hisi_sas: Make three functions static scsi: megaraid: disable device when probe failed after enabled device scsi: storvsc: setup 1:1 mapping between hardware queue and CPU queue scsi: qedf: Remove always false 'tmp_prio < 0' statement scsi: ufs: skip shutdown if hba is not powered scsi: bnx2fc: Handle scope bits when array returns BUSY or TSF
-
Linus Torvalds authored
This makes getdents() and getdents64() do sanity checking on the pathname that it gives to user space. And to mitigate the performance impact of that, it first cleans up the way it does the user copying, so that the code avoids doing the SMAP/PAN updates between each part of the dirent structure write. I really wanted to do this during the merge window, but didn't have time. The conversion of filldir to unsafe_put_user() is something I've had around for years now in a private branch, but the extra pathname checking finally made me clean it up to the point where it is mergable. It's worth noting that the filename validity checking really should be a bit smarter: it would be much better to delay the error reporting until the end of the readdir, so that non-corrupted filenames are still returned. But that involves bigger changes, so let's see if anybody actually hits the corrupt directory entry case before worrying about it further. * branch 'readdir': Make filldir[64]() verify the directory entry filename is valid Convert filldir[64]() from __put_user() to unsafe_put_user()
-
Linus Torvalds authored
This has been discussed several times, and now filesystem people are talking about doing it individually at the filesystem layer, so head that off at the pass and just do it in getdents{64}(). This is partially based on a patch by Jann Horn, but checks for NUL bytes as well, and somewhat simplified. There's also commentary about how it might be better if invalid names due to filesystem corruption don't cause an immediate failure, but only an error at the end of the readdir(), so that people can still see the filenames that are ok. There's also been discussion about just how much POSIX strictly speaking requires this since it's about filesystem corruption. It's really more "protect user space from bad behavior" as pointed out by Jann. But since Eric Biederman looked up the POSIX wording, here it is for context: "From readdir: The readdir() function shall return a pointer to a structure representing the directory entry at the current position in the directory stream specified by the argument dirp, and position the directory stream at the next entry. It shall return a null pointer upon reaching the end of the directory stream. The structure dirent defined in the <dirent.h> header describes a directory entry. From definitions: 3.129 Directory Entry (or Link) An object that associates a filename with a file. Several directory entries can associate names with the same file. ... 3.169 Filename A name consisting of 1 to {NAME_MAX} bytes used to name a file. The characters composing the name may be selected from the set of all character values excluding the slash character and the null byte. The filenames dot and dot-dot have special meaning. A filename is sometimes referred to as a 'pathname component'." Note that I didn't bother adding the checks to any legacy interfaces that nobody uses. Also note that if this ends up being noticeable as a performance regression, we can fix that to do a much more optimized model that checks for both NUL and '/' at the same time one word at a time. We haven't really tended to optimize 'memchr()', and it only checks for one pattern at a time anyway, and we really _should_ check for NUL too (but see the comment about "soft errors" in the code about why it currently only checks for '/') See the CONFIG_DCACHE_WORD_ACCESS case of hash_name() for how the name lookup code looks for pathname terminating characters in parallel. Link: https://lore.kernel.org/lkml/20190118161440.220134-2-jannh@google.com/ Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jann Horn <jannh@google.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
We really should avoid the "__{get,put}_user()" functions entirely, because they can easily be mis-used and the original intent of being used for simple direct user accesses no longer holds in a post-SMAP/PAN world. Manually optimizing away the user access range check makes no sense any more, when the range check is generally much cheaper than the "enable user accesses" code that the __{get,put}_user() functions still need. So instead of __put_user(), use the unsafe_put_user() interface with user_access_{begin,end}() that really does generate better code these days, and which is generally a nicer interface. Under some loads, the multiple user writes that filldir() does are actually quite noticeable. This also makes the dirent name copy use unsafe_put_user() with a couple of macros. We do not want to make function calls with SMAP/PAN disabled, and the code this generates is quite good when the architecture uses "asm goto" for unsafe_put_user() like x86 does. Note that this doesn't bother with the legacy cases. Nobody should use them anyway, so performance doesn't really matter there. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds authored
Pull networking fixes from David Miller: 1) Fix ieeeu02154 atusb driver use-after-free, from Johan Hovold. 2) Need to validate TCA_CBQ_WRROPT netlink attributes, from Eric Dumazet. 3) txq null deref in mac80211, from Miaoqing Pan. 4) ionic driver needs to select NET_DEVLINK, from Arnd Bergmann. 5) Need to disable bh during nft_connlimit GC, from Pablo Neira Ayuso. 6) Avoid division by zero in taprio scheduler, from Vladimir Oltean. 7) Various xgmac fixes in stmmac driver from Jose Abreu. 8) Avoid 64-bit division in mlx5 leading to link errors on 32-bit from Michal Kubecek. 9) Fix bad VLAN check in rtl8366 DSA driver, from Linus Walleij. 10) Fix sleep while atomic in sja1105, from Vladimir Oltean. 11) Suspend/resume deadlock in stmmac, from Thierry Reding. 12) Various UDP GSO fixes from Josh Hunt. 13) Fix slab out of bounds access in tcp_zerocopy_receive(), from Eric Dumazet. 14) Fix OOPS in __ipv6_ifa_notify(), from David Ahern. 15) Memory leak in NFC's llcp_sock_bind, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits) selftests/net: add nettest to .gitignore net: qlogic: Fix memory leak in ql_alloc_large_buffers nfc: fix memory leak in llcp_sock_bind() sch_dsmark: fix potential NULL deref in dsmark_init() net: phy: at803x: use operating parameters from PHY-specific status net: phy: extract pause mode net: phy: extract link partner advertisement reading net: phy: fix write to mii-ctrl1000 register ipv6: Handle missing host route in __ipv6_ifa_notify net: phy: allow for reset line to be tied to a sleepy GPIO controller net: ipv4: avoid mixed n_redirects and rate_tokens usage r8152: Set macpassthru in reset_resume callback cxgb4:Fix out-of-bounds MSI-X info array access Revert "ipv6: Handle race in addrconf_dad_work" net: make sock_prot_memory_pressure() return "const char *" rxrpc: Fix rxrpc_recvmsg tracepoint qmi_wwan: add support for Cinterion CLS8 devices tcp: fix slab-out-of-bounds in tcp_zerocopy_receive() lib: textsearch: fix escapes in example code udp: only do GSO if # of segs > 1 ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linuxLinus Torvalds authored
Pull s390 fixes from Vasily Gorbik: - defconfig updates - Fix build errors with CC_OPTIMIZE_FOR_SIZE due to usage of "i" constraint for function arguments. Two kvm changes acked-by Christian Borntraeger. - Fix -Wunused-but-set-variable warnings in mm code. - Avoid a constant misuse in qdio. - Handle a case when cpumf is temporarily unavailable. * tag 's390-5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: KVM: s390: mark __insn32_query() as __always_inline KVM: s390: fix __insn32_query() inline assembly s390: update defconfigs s390/pci: mark function(s) __always_inline s390/mm: mark function(s) __always_inline s390/jump_label: mark function(s) __always_inline s390/cpu_mf: mark function(s) __always_inline s390/atomic,bitops: mark function(s) __always_inline s390/mm: fix -Wunused-but-set-variable warnings s390: mark __cpacf_query() as __always_inline s390/qdio: clarify size of the QIB parm area s390/cpumf: Fix indentation in sampling device driver s390/cpumsf: Check for CPU Measurement sampling s390/cpumf: Use consistant debug print format
-
Heiko Carstens authored
__insn32_query() will not compile if the compiler decides to not inline it, since it contains an inline assembly with an "i" constraint with variable contents. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
Heiko Carstens authored
The inline assembly constraints of __insn32_query() tell the compiler that only the first byte of "query" is being written to. Intended was probably that 32 bytes are written to. Fix and simplify the code and just use a "memory" clobber. Fixes: d6681397 ("KVM: s390: provide query function for instructions returning 32 byte") Cc: stable@vger.kernel.org # v5.2+ Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
Andrey Smirnov authored
Commit 5cf45379 ("dma-mapping: introduce a dma_common_find_pages helper") changed invalid input check in dma_common_free_remap() from: if (!area || !area->flags != VM_DMA_COHERENT) to if (!area || !area->flags != VM_DMA_COHERENT || !area->pages) which seem to produce false positives for memory obtained via dma_common_contiguous_remap() This triggers the following warning message when doing "reboot" on ZII VF610 Dev Board Rev B: WARNING: CPU: 0 PID: 1 at kernel/dma/remap.c:112 dma_common_free_remap+0x88/0x8c trying to free invalid coherent area: 9ef82980 Modules linked in: CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted 5.3.0-rc6-next-20190820 #119 Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree) Backtrace: [<8010d1ec>] (dump_backtrace) from [<8010d588>] (show_stack+0x20/0x24) r7:8015ed78 r6:00000009 r5:00000000 r4:9f4d9b14 [<8010d568>] (show_stack) from [<8077e3f0>] (dump_stack+0x24/0x28) [<8077e3cc>] (dump_stack) from [<801197a0>] (__warn.part.3+0xcc/0xe4) [<801196d4>] (__warn.part.3) from [<80119830>] (warn_slowpath_fmt+0x78/0x94) r6:00000070 r5:808e540c r4:81c03048 [<801197bc>] (warn_slowpath_fmt) from [<8015ed78>] (dma_common_free_remap+0x88/0x8c) r3:9ef82980 r2:808e53e0 r7:00001000 r6:a0b1e000 r5:a0b1e000 r4:00001000 [<8015ecf0>] (dma_common_free_remap) from [<8010fa9c>] (remap_allocator_free+0x60/0x68) r5:81c03048 r4:9f4d9b78 [<8010fa3c>] (remap_allocator_free) from [<801100d0>] (__arm_dma_free.constprop.3+0xf8/0x148) r5:81c03048 r4:9ef82900 [<8010ffd8>] (__arm_dma_free.constprop.3) from [<80110144>] (arm_dma_free+0x24/0x2c) r5:9f563410 r4:80110120 [<80110120>] (arm_dma_free) from [<8015d80c>] (dma_free_attrs+0xa0/0xdc) [<8015d76c>] (dma_free_attrs) from [<8020f3e4>] (dma_pool_destroy+0xc0/0x154) r8:9efa8860 r7:808f02f0 r6:808f02d0 r5:9ef82880 r4:9ef82780 [<8020f324>] (dma_pool_destroy) from [<805525d0>] (ehci_mem_cleanup+0x6c/0x150) r7:9f563410 r6:9efa8810 r5:00000000 r4:9efd0148 [<80552564>] (ehci_mem_cleanup) from [<80558e0c>] (ehci_stop+0xac/0xc0) r5:9efd0148 r4:9efd0000 [<80558d60>] (ehci_stop) from [<8053c4bc>] (usb_remove_hcd+0xf4/0x1b0) r7:9f563410 r6:9efd0074 r5:81c03048 r4:9efd0000 [<8053c3c8>] (usb_remove_hcd) from [<8056361c>] (host_stop+0x48/0xb8) r7:9f563410 r6:9efd0000 r5:9f5f4040 r4:9f5f5040 [<805635d4>] (host_stop) from [<80563d0c>] (ci_hdrc_host_destroy+0x34/0x38) r7:9f563410 r6:9f5f5040 r5:9efa8800 r4:9f5f4040 [<80563cd8>] (ci_hdrc_host_destroy) from [<8055ef18>] (ci_hdrc_remove+0x50/0x10c) [<8055eec8>] (ci_hdrc_remove) from [<804a2ed8>] (platform_drv_remove+0x34/0x4c) r7:9f563410 r6:81c4f99c r5:9efa8810 r4:9efa8810 [<804a2ea4>] (platform_drv_remove) from [<804a18a8>] (device_release_driver_internal+0xec/0x19c) r5:00000000 r4:9efa8810 [<804a17bc>] (device_release_driver_internal) from [<804a1978>] (device_release_driver+0x20/0x24) r7:9f563410 r6:81c41ed0 r5:9efa8810 r4:9f4a1dac [<804a1958>] (device_release_driver) from [<804a01b8>] (bus_remove_device+0xdc/0x108) [<804a00dc>] (bus_remove_device) from [<8049c204>] (device_del+0x150/0x36c) r7:9f563410 r6:81c03048 r5:9efa8854 r4:9efa8810 [<8049c0b4>] (device_del) from [<804a3368>] (platform_device_del.part.2+0x20/0x84) r10:9f563414 r9:809177e0 r8:81cb07dc r7:81c78320 r6:9f563454 r5:9efa8800 r4:9efa8800 [<804a3348>] (platform_device_del.part.2) from [<804a3420>] (platform_device_unregister+0x28/0x34) r5:9f563400 r4:9efa8800 [<804a33f8>] (platform_device_unregister) from [<8055dce0>] (ci_hdrc_remove_device+0x1c/0x30) r5:9f563400 r4:00000001 [<8055dcc4>] (ci_hdrc_remove_device) from [<805652ac>] (ci_hdrc_imx_remove+0x38/0x118) r7:81c78320 r6:9f563454 r5:9f563410 r4:9f541010 [<8056538c>] (ci_hdrc_imx_shutdown) from [<804a2970>] (platform_drv_shutdown+0x2c/0x30) [<804a2944>] (platform_drv_shutdown) from [<8049e4fc>] (device_shutdown+0x158/0x1f0) [<8049e3a4>] (device_shutdown) from [<8013ac80>] (kernel_restart_prepare+0x44/0x48) r10:00000058 r9:9f4d8000 r8:fee1dead r7:379ce700 r6:81c0b280 r5:81c03048 r4:00000000 [<8013ac3c>] (kernel_restart_prepare) from [<8013ad14>] (kernel_restart+0x1c/0x60) [<8013acf8>] (kernel_restart) from [<8013af84>] (__do_sys_reboot+0xe0/0x1d8) r5:81c03048 r4:00000000 [<8013aea4>] (__do_sys_reboot) from [<8013b0ec>] (sys_reboot+0x18/0x1c) r8:80101204 r7:00000058 r6:00000000 r5:00000000 r4:00000000 [<8013b0d4>] (sys_reboot) from [<80101000>] (ret_fast_syscall+0x0/0x54) Exception stack(0x9f4d9fa8 to 0x9f4d9ff0) 9fa0: 00000000 00000000 fee1dead 28121969 01234567 379ce700 9fc0: 00000000 00000000 00000000 00000058 00000000 00000000 00000000 00016d04 9fe0: 00028e0c 7ec87c64 000135ec 76c1f410 Restore original invalid input check in dma_common_free_remap() to avoid this problem. Fixes: 5cf45379 ("dma-mapping: introduce a dma_common_find_pages helper") Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com> [hch: just revert the offending hunk instead of creating a new helper] Signed-off-by: Christoph Hellwig <hch@lst.de>
-