1. 12 Mar, 2020 1 commit
    • s390/dasd: fix data corruption for thin provisioned devices · 5e6bdd37
      Stefan Haberland authored
      Devices are formatted in multiples of tracks.
      For an Extent Space Efficient (ESE) volume we get errors when accessing
      unformatted tracks. In this case the driver either formats the track on
      the fly for write requests or returns zero data for read requests.
      
      In case a request spans multiple tracks, the indication of an unformatted
      track presented for the first track is incorrectly applied to all tracks
      covered by the request. As a result, tracks containing data will be handled
      as empty, resulting in zero data being returned on read, or overwriting
      existing data with zero on write.
      
      Fix by determining the track that gets the NRF (No Record Found) error.
      For write requests, format only the track that is known to be unformatted.
      For read requests, all tracks before it have returned valid data and
      should not be touched.
      All tracks after the unformatted track might be formatted or not. Those are
      returned to the block layer to build a new request.
      
      When using alias devices there is a chance that multiple write requests
      trigger a format of the same track which might lead to data loss. Ensure
      that a track is formatted only once by maintaining a list of currently
      processed tracks.
      
      Fixes: 5e2b17e7 ("s390/dasd: Add dynamic formatting support for ESE volumes")
      Cc: stable@vger.kernel.org # 5.3+
      Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
      Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com>
      Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
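The "format a track only once" idea from the s390/dasd message can be illustrated with a small userspace sketch. This is not the driver's code: the type `format_tracker`, the functions `claim_track()` and `release_track()`, and the fixed capacity are all hypothetical, and the sketch ignores the locking a real driver would need when alias devices submit requests concurrently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_INFLIGHT 16 /* hypothetical capacity, chosen only for this sketch */

struct format_tracker {
    unsigned long tracks[MAX_INFLIGHT]; /* tracks currently being formatted */
    size_t count;
};

/* Returns true if the caller claimed the track and should format it. */
bool claim_track(struct format_tracker *ft, unsigned long track)
{
    assert(ft != NULL);
    for (size_t i = 0; i < ft->count; i++)
        if (ft->tracks[i] == track)
            return false; /* another request is already formatting it */
    if (ft->count == MAX_INFLIGHT)
        return false; /* table full: the caller would have to retry later */
    ft->tracks[ft->count++] = track;
    return true;
}

/* Drops the claim once formatting of the track has completed. */
void release_track(struct format_tracker *ft, unsigned long track)
{
    assert(ft != NULL);
    for (size_t i = 0; i < ft->count; i++) {
        if (ft->tracks[i] == track) {
            /* Fill the hole with the last entry; order does not matter. */
            ft->tracks[i] = ft->tracks[--ft->count];
            return;
        }
    }
}
```

A second write request that hits the same unformatted track would see `claim_track()` return false and could be requeued instead of formatting the track again.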
  2. 10 Mar, 2020 1 commit
    • blk-iocost: fix incorrect vtime comparison in iocg_is_idle() · dcd6589b
      Tejun Heo authored
      vtimes may wrap and time_before/after64() should be used to determine
      whether a given vtime is before or after another. iocg_is_idle() was
      incorrectly using plain "<" comparison to determine whether done_vtime
      is before vtime. Here, the only thing we're interested in is whether
      done_vtime matches vtime which indicates that there's nothing in
      flight. Let's test for inequality instead.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Fixes: 7caa4715 ("blkcg: implement blk-iocost")
      Cc: stable@vger.kernel.org # v5.4+
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
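The wrap-around problem described in the blk-iocost message can be reproduced in userspace. The sketch below is an illustration only: `vtime_before()` is modelled on the signed-difference idiom behind the kernel's `time_before64()`, and `nothing_in_flight()` is a hypothetical stand-in for the inequality test the commit switches to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Wrap-safe "a is before b": the unsigned difference is reinterpreted as
 * signed, so the result stays correct while the two counters are less than
 * 2^63 apart. The conversion of large values to int64_t relies on the usual
 * two's-complement behaviour of common compilers.
 */
bool vtime_before(uint64_t a, uint64_t b)
{
    return (int64_t)(a - b) < 0;
}

/* After the fix, only equality matters: equal counters mean nothing is in flight. */
bool nothing_in_flight(uint64_t vtime, uint64_t done_vtime)
{
    return vtime == done_vtime;
}
```

With a counter at `UINT64_MAX - 5` and a later one that has wrapped to `4`, a plain `<` reports the later value as smaller, while `vtime_before()` still orders the two correctly.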
  3. 06 Mar, 2020 1 commit
    • block, bfq: fix overwrite of bfq_group pointer in bfq_find_set_group() · 14afc593
      Carlo Nonato authored
      The bfq_find_set_group() function takes as input a blkcg (which represents
      a cgroup) and retrieves the corresponding bfq_group, then it updates the
      bfq internal group hierarchy (see comments inside the function for why
      this is needed) and finally it returns the bfq_group.
      In the hierarchy update cycle, the pointer holding the correct bfq_group
      that has to be returned is mistakenly used to traverse the hierarchy
      bottom to top, meaning that in each iteration it gets overwritten with the
      parent of the current group. Since the update cycle stops at root's
      children (depth = 2), the overwrite becomes a problem only if the blkcg
      describes a cgroup at a hierarchy level deeper than that (depth > 2). In
      this case the root's child that happens to be also an ancestor of the
      correct bfq_group is returned. The main consequence is that processes
      contained in a cgroup at depth greater than 2 are wrongly placed in the
      group described above by BFQ.
      
      This commit fixes this problem by using a different bfq_group pointer in
      the update cycle in order to avoid the overwrite of the variable holding
      the original group reference.
      Reported-by: Kwon Je Oh <kwonje.oh2@gmail.com>
      Signed-off-by: Carlo Nonato <carlo.nonato95@gmail.com>
      Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
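The pointer overwrite described in the bfq message follows a common pattern, shown here as a userspace sketch. The `struct group` type and both functions are hypothetical simplifications, not the BFQ code; the loop merely mimics an update cycle that stops at the root's children.

```c
#include <assert.h>
#include <stddef.h>

struct group {
    struct group *parent; /* NULL for the root group */
    int visited;          /* set when the update cycle touches the group */
};

/* Buggy shape: the pointer to be returned doubles as the loop cursor. */
struct group *walk_buggy(struct group *grp)
{
    while (grp->parent && grp->parent->parent) {
        grp->visited = 1;
        grp = grp->parent; /* overwrites the group we meant to return */
    }
    grp->visited = 1;
    return grp; /* a child of the root, even for deeper groups */
}

/* Fixed shape: a separate cursor walks up, the result is left untouched. */
struct group *walk_fixed(struct group *grp)
{
    struct group *cur = grp;

    while (cur->parent && cur->parent->parent) {
        cur->visited = 1;
        cur = cur->parent;
    }
    cur->visited = 1;
    return grp;
}
```

For a group directly below the root both variants return the same pointer, which matches the message's point that the bug only shows at depth greater than 2.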
  4. 05 Mar, 2020 1 commit
    • blktrace: fix dereference after null check · 153031a3
      Cengiz Can authored
      There was a recent change in blktrace.c that added RCU protection to
      `q->blk_trace` in order to fix a use-after-free issue during access.

      However, the change missed an edge case that can lead to dereferencing
      the `bt` pointer even when it is NULL. The Coverity static analyzer
      marked this as a FORWARD_NULL issue with CID 1460458:
      
      ```
      /kernel/trace/blktrace.c: 1904 in sysfs_blk_trace_attr_store()
      1898            ret = 0;
      1899            if (bt == NULL)
      1900                    ret = blk_trace_setup_queue(q, bdev);
      1901
      1902            if (ret == 0) {
      1903                    if (attr == &dev_attr_act_mask)
      >>>     CID 1460458:  Null pointer dereferences  (FORWARD_NULL)
      >>>     Dereferencing null pointer "bt".
      1904                            bt->act_mask = value;
      1905                    else if (attr == &dev_attr_pid)
      1906                            bt->pid = value;
      1907                    else if (attr == &dev_attr_start_lba)
      1908                            bt->start_lba = value;
      1909                    else if (attr == &dev_attr_end_lba)
      ```
      
      Added a reassignment with RCU annotation to fix the issue.
      
      Fixes: c780e86d ("blktrace: Protect q->blk_trace with RCU")
      Cc: stable@vger.kernel.org
      Reviewed-by: Ming Lei <ming.lei@redhat.com>
      Reviewed-by: Bob Liu <bob.liu@oracle.com>
      Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Cengiz Can <cengiz@kernel.wtf>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
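The shape of the blktrace fix, reloading the local pointer after the setup call has installed a new object, can be shown with a userspace analogue. All names here (`struct trace`, `struct queue`, `setup_queue()`, `store_act_mask()`) are hypothetical, and the plain pointer read stands in for the RCU-annotated reassignment the commit message mentions.

```c
#include <assert.h>
#include <stdlib.h>

struct trace { int act_mask; };
struct queue { struct trace *trace; };

/* Stand-in for the setup call: allocates a trace and installs it in the queue. */
int setup_queue(struct queue *q)
{
    q->trace = calloc(1, sizeof(*q->trace));
    return q->trace ? 0 : -1;
}

/* Stores 'value', creating the trace on first use. Returns 0 on success. */
int store_act_mask(struct queue *q, int value)
{
    struct trace *bt = q->trace;
    int ret = 0;

    if (bt == NULL) {
        ret = setup_queue(q);
        bt = q->trace; /* reload: without this, bt would still be NULL below */
    }
    if (ret == 0)
        bt->act_mask = value;
    return ret;
}
```

Calling `store_act_mask()` on a queue with no trace installed exercises exactly the path that the Coverity report flags.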
  5. 03 Mar, 2020 1 commit
  6. 02 Mar, 2020 1 commit
  7. 28 Feb, 2020 1 commit
  8. 27 Feb, 2020 1 commit
  9. 26 Feb, 2020 1 commit
  10. 25 Feb, 2020 3 commits
    • null_blk: remove unused fields in 'nullb_cmd' · 93d7c318
      Dongli Zhang authored
      'list', 'll_list' and 'csd' are no longer used.
      
      The 'list' has not been used since it was introduced by commit f2298c04
      ("null_blk: multi queue aware block test driver").
      
      The 'll_list' is no longer used since commit 3c395a96 ("null_blk: set a
      separate timer for each command").
      
      The 'csd' is no longer used since commit ce2c350b ("null_blk: use
      blk_complete_request and blk_mq_complete_request").
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • blktrace: Protect q->blk_trace with RCU · c780e86d
      Jan Kara authored
      KASAN is reporting that __blk_add_trace() has a use-after-free issue
      when accessing q->blk_trace. Indeed the switching of block tracing (and
      thus eventual freeing of q->blk_trace) is completely unsynchronized with
      the currently running tracing and thus it can happen that the blk_trace
      structure is being freed just while __blk_add_trace() works on it.
      Protect accesses to q->blk_trace by RCU during tracing and make sure we
      wait for the end of the RCU grace period when shutting down tracing.
      Luckily that is a rare enough event that we can afford that. Note that
      postponing the freeing of blk_trace to an RCU callback is better avoided,
      as it could have unexpected user-visible side effects: debugfs files
      would still exist for a short while after block tracing has been shut
      down.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=205711
      CC: stable@vger.kernel.org
      Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Reviewed-by: Ming Lei <ming.lei@redhat.com>
      Tested-by: Ming Lei <ming.lei@redhat.com>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Reported-by: Tristan Madani <tristmd@gmail.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • blk-mq: insert passthrough request into hctx->dispatch directly · 01e99aec
      Ming Lei authored
      For some reason, a device may be in a state in which it cannot handle
      FS requests, so STS_RESOURCE is always returned and the FS request
      is added to hctx->dispatch. However, a passthrough request may
      be required at that time to fix the problem. If the passthrough
      request is added to the scheduler queue, blk-mq has no chance
      to dispatch it, given that we prioritize requests in hctx->dispatch.
      Then the FS IO request may never complete, and an IO hang results.

      So passthrough requests have to be added to hctx->dispatch directly
      to fix the IO hang.

      Fix this issue by inserting passthrough requests into hctx->dispatch
      directly, together with adding FS requests to the tail of
      hctx->dispatch in blk_mq_dispatch_rq_list(). Actually we add FS requests
      to the tail of hctx->dispatch by default, see blk_mq_request_bypass_insert().

      This makes it consistent with the original legacy IO request
      path, in which passthrough requests are always added to q->queue_head.
      
      Cc: Dongli Zhang <dongli.zhang@oracle.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Ewan D. Milne <emilne@redhat.com>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  11. 24 Feb, 2020 1 commit
  12. 23 Feb, 2020 7 commits
    • Merge tag 'for-5.6-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · d2eee258
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "These are fixes that were found during testing with help of error
        injection, plus some other stable material.
      
        There's a fixup to patch added to rc1 causing locking in wrong context
        warnings, tests found one more deadlock scenario. The patches are
        tagged for stable, two of them now in the queue but we'd like all
        three released at the same time.
      
        I'm not happy about fixes to fixes in such a fast succession during
        rcs, but I hope we found all the fallouts of commit 28553fa9
        ('Btrfs: fix race between shrinking truncate and fiemap')"
      
      * tag 'for-5.6-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        Btrfs: fix deadlock during fast fsync when logging prealloc extents beyond eof
        Btrfs: fix btrfs_wait_ordered_range() so that it waits for all ordered extents
        btrfs: fix bytes_may_use underflow in prealloc error condtition
        btrfs: handle logged extent failure properly
        btrfs: do not check delayed items are empty for single transaction cleanup
        btrfs: reset fs_root to NULL on error in open_ctree
        btrfs: destroy qgroup extent records on transaction abort
    • Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · a3163ca0
      Linus Torvalds authored
      Pull ext4 fixes from Ted Ts'o:
       "More miscellaneous ext4 bug fixes (all stable fodder)"
      
      * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: fix mount failure with quota configured as module
        jbd2: fix ocfs2 corrupt when clearing block group bits
        ext4: fix race between writepages and enabling EXT4_EXTENTS_FL
        ext4: rename s_journal_flag_rwsem to s_writepages_rwsem
        ext4: fix potential race between s_flex_groups online resizing and access
        ext4: fix potential race between s_group_info online resizing and access
        ext4: fix potential race between online resizing and write operations
        ext4: add cond_resched() to __ext4_find_entry()
        ext4: fix a data race in EXT4_I(inode)->i_disksize
    • Merge tag 'csky-for-linus-5.6-rc3' of git://github.com/c-sky/csky-linux · c6188dff
      Linus Torvalds authored
      Pull csky updates from Guo Ren:
       "Sorry, I missed the 5.6-rc1 merge window, but most of this pull
        request is fixes and the rest falls between fixes and features. The
        only outside modification is the MAINTAINERS file update with our
        mailing list.
      
         - cache flush implementation fixes
      
         - ftrace modify panic fix
      
         - CONFIG_SMP boot problem fix
      
         - fix pt_regs saving for atomic.S
      
         - fix fixaddr_init without highmem.
      
         - fix stack protector support
      
         - fix fake Tightly-Coupled Memory code compile and use
      
         - fix some typos and coding convention"
      
      * tag 'csky-for-linus-5.6-rc3' of git://github.com/c-sky/csky-linux: (23 commits)
        csky: Replace <linux/clk-provider.h> by <linux/of_clk.h>
        csky: Implement copy_thread_tls
        csky: Add PCI support
        csky: Minimize defconfig to support buildroot config.fragment
        csky: Add setup_initrd check code
        csky: Cleanup old Kconfig options
        arch/csky: fix some Kconfig typos
        csky: Fixup compile warning for three unimplemented syscalls
        csky: Remove unused cache implementation
        csky: Fixup ftrace modify panic
        csky: Add flush_icache_mm to defer flush icache all
        csky: Optimize abiv2 copy_to_user_page with VM_EXEC
        csky: Enable defer flush_dcache_page for abiv2 cpus (807/810/860)
        csky: Remove unnecessary flush_icache_* implementation
        csky: Support icache flush without specific instructions
        csky/Kconfig: Add Kconfig.platforms to support some drivers
        csky/smp: Fixup boot failed when CONFIG_SMP
        csky: Set regs->usp to kernel sp, when the exception is from kernel
        csky/mm: Fixup export invalid_pte_table symbol
        csky: Separate fixaddr_init from highmem
        ...
    • csky: Replace <linux/clk-provider.h> by <linux/of_clk.h> · 99db590b
      Geert Uytterhoeven authored
      The C-Sky platform code is not a clock provider, and just needs to call
      of_clk_init().
      
      Hence it can include <linux/of_clk.h> instead of <linux/clk-provider.h>.
      Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
    • Merge tag 'ras-urgent-2020-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · dca132a6
      Linus Torvalds authored
      Pull RAS fixes from Thomas Gleixner:
       "Two fixes for the AMD MCE driver:
      
         - Populate the per CPU MCA bank descriptor pointer only after it has
           been completely set up to prevent a use-after-free in case one
           of the subsequent initialization steps fails
      
         - Implement a proper release function for the sysfs entries of MCA
           threshold controls instead of freeing the memory right in the CPU
           teardown code, which leads to another use-after-free when the
           associated sysfs file is opened and accessed"
      
      * tag 'ras-urgent-2020-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mce/amd: Fix kobject lifetime
        x86/mce/amd: Publish the bank pointer only after setup has succeeded
    • Merge tag 'irq-urgent-2020-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · f3cc2494
      Linus Torvalds authored
      Pull irq fixes from Thomas Gleixner:
       "Two fixes for the irq core code which are follow ups to the recent MSI
        fixes:
      
         - The WARN_ON which was put into the MSI setaffinity callback for
           paranoia reasons actually triggered via a callchain which escaped
           when all the possible ways to reach that code were analyzed.
      
           The proc/irq/$N/*affinity interfaces have a quirk which came in
           when ALPHA moved to the generic interface: In case that the written
           affinity mask does not contain any online CPU it calls into ALPHAs
           magic auto affinity setting code.
      
           A few years later this mechanism was also made available to x86 for
           no good reasons and in a way which circumvents all sanity checks
           for interrupts which cannot have their affinity set from process
           context on X86 due to the way the X86 interrupt delivery works.
      
           It would be possible to make this work properly, but there is no
           point in doing so. If the interrupt is not yet started then the
           affinity setting has no effect and if it is started already then it
           is already assigned to an online CPU so there is no point to
           randomly move it to some other CPU. Just return EINVAL as the code
           has done before that change forever.
      
         - The new MSI quirk bit in the irq domain flags turned out to be
           already occupied, which escaped the author and the reviewers
           because the already in use bits were 0,6,2,3,4,5 listed in that
           order.
      
           That bit 6 was simply overlooked because the ordering was straight
           forward linear otherwise. So the new bit ended up being a
           duplicate.
      
           Fix it up by switching the oddball 6 to the obvious 1"
      
      * tag 'irq-urgent-2020-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        genirq/irqdomain: Make sure all irq domain flags are distinct
        genirq/proc: Reject invalid affinity masks (again)
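
The duplicate-bit mistake described in the second irq fix is easy to check mechanically. The helper below is a hypothetical userspace sketch (`first_reused_bit()` is not a kernel function): it takes a list of bit positions and reports the first one that is used twice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the first bit position that appears twice, or -1 if all are distinct. */
int first_reused_bit(const unsigned *bits, size_t n)
{
    uint64_t seen = 0;

    for (size_t i = 0; i < n; i++) {
        assert(bits[i] < 64); /* the sketch tracks positions in one 64-bit word */
        uint64_t mask = UINT64_C(1) << bits[i];
        if (seen & mask)
            return (int)bits[i];
        seen |= mask;
    }
    return -1;
}
```

Fed the order quoted in the pull message, 0, 6, 2, 3, 4, 5, followed by a new flag on bit 6, the helper reports 6; after switching the oddball 6 to 1 the same list is distinct.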
    • Merge tag 'x86-urgent-2020-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · fca10378
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "Two fixes for x86:
      
         - Remove the __force_order definition from the kaslr boot code as it is
           already defined in the page table code which makes GCC 10 builds
           fail because it changed the default to -fno-common.
      
         - Address the AMD erratum 1054 concerning the IRPERF capability and
           enable the Instructions Retired fixed counter on machines which are
           not affected by the erratum"
      
      * tag 'x86-urgent-2020-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/cpu/amd: Enable the fixed Instructions Retired counter IRPERF
        x86/boot/compressed: Don't declare __force_order in kaslr_64.c
  13. 22 Feb, 2020 15 commits
    • Merge tag 'zonefs-5.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs · 0a115e5f
      Linus Torvalds authored
      Pull zonefs fix from Damien Le Moal:
       "A single patch fixing typos in the documentation file"
      
      * tag 'zonefs-5.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
        zonefs: fix documentation typos etc.
    • Merge tag 'io_uring-5.6-2020-02-22' of git://git.kernel.dk/linux-block · b88025ea
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "Here's a small collection of fixes that were queued up:
      
         - Remove unnecessary NULL check (Dan)
      
         - Missing io_req_cancelled() call in fallocate (Pavel)
      
         - Put the cleanup check for aux data in the right spot (Pavel)
      
         - Two fixes for SQPOLL (Stefano, Xiaoguang)"
      
      * tag 'io_uring-5.6-2020-02-22' of git://git.kernel.dk/linux-block:
        io_uring: fix __io_iopoll_check deadlock in io_sq_thread
        io_uring: prevent sq_thread from spinning when it should stop
        io_uring: fix use-after-free by io_cleanup_req()
        io_uring: remove unnecessary NULL checks
        io_uring: add missing io_req_cancelled()
    • Merge tag 'block-5.6-2020-02-22' of git://git.kernel.dk/linux-block · f6c69b7f
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "Just a set of NVMe fixes via Keith"
      
      * tag 'block-5.6-2020-02-22' of git://git.kernel.dk/linux-block:
        nvme-multipath: Fix memory leak with ana_log_buf
        nvme: Fix uninitialized-variable warning
        nvme-pci: Use single IRQ vector for old Apple models
        nvme/pci: Add sleep quirk for Samsung and Toshiba drives
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · b98b809c
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Four non-core fixes.
      
        Two are reverts of target fixes which turned out to have unwanted side
        effects, one is a revert of an RDMA fix with the same problem and the
        final one fixes an incorrect warning about memory allocation failures
        in megaraid_sas (the driver actually reduces the allocation size until
        it succeeds)"
      Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: Revert "target: iscsi: Wait for all commands to finish before freeing a session"
        scsi: Revert "RDMA/isert: Fix a recently introduced regression related to logout"
        scsi: megaraid_sas: silence a warning
        scsi: Revert "target/core: Inline transport_lun_remove_cmd()"
    • Merge tag 'hwmon-for-v5.6-rc3' of... · 5b442b1a
      Linus Torvalds authored
      Merge tag 'hwmon-for-v5.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
      
      Pull hwmon fixes from Guenter Roeck:
      
       - Fix crash in w83627ehf driver seen with W83627DHG-P
      
       - Fix lockdep splat in acpi_power_meter driver
      
       - Fix xdpe12284 documentation Sphinx warnings
      
      * tag 'hwmon-for-v5.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: (w83627ehf) Fix crash seen with W83627DHG-P
        hwmon: (acpi_power_meter) Fix lockdep splat
        Documentation/hwmon: fix xdpe12284 Sphinx warnings
    • Merge tag 'devicetree-fixes-for-5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · fea63021
      Linus Torvalds authored
      Pull devicetree fixes from Rob Herring:
       "A handful of fixes in DT bindings for MDIO bus, Allwinner CSI, OMAP
        HSMMC, and Tegra124 EMC"
      
      * tag 'devicetree-fixes-for-5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
        dt-bindings: media: csi: Fix clocks description
        dt-bindings: media: csi: Add interconnects properties
        dt-bindings: net: mdio: remove compatible string from example
        dt-bindings: memory-controller: Update example for Tegra124 EMC
        dt-bindings: mmc: omap-hsmmc: Fix SDIO interrupt
    • Merge tag 's390-5.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 591dd4c1
      Linus Torvalds authored
      Pull s390 fixes from Vasily Gorbik:
      
       - Remove the ieee_emulation_warnings sysctl, which is dead code.
      
       - Avoid triggering rebuild of the kernel during make install.
      
       - Enable protected virtualization guest support in default configs.
      
       - Fix cio_ignore seq_file .next function to increase position index.
         And use kobj_to_dev instead of container_of in cio code.
      
       - Fix storage block address lists to contain absolute addresses in qdio
         code.
      
       - A few clang warning and spelling fixes.
      
      * tag 's390-5.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/qdio: fill SBALEs with absolute addresses
        s390/qdio: fill SL with absolute addresses
        s390: remove obsolete ieee_emulation_warnings
        s390: make 'install' not depend on vmlinux
        s390/kaslr: Fix casts in get_random
        s390/mm: Explicitly compare PAGE_DEFAULT_KEY against zero in storage_key_init_range
        s390/pkey/zcrypt: spelling s/crytp/crypt/
        s390/cio: use kobj_to_dev() API
        s390/defconfig: enable CONFIG_PROTECTED_VIRTUALIZATION_GUEST
        s390/cio: cio_ignore_proc_seq_next should increase position index
    • io_uring: fix __io_iopoll_check deadlock in io_sq_thread · c7849be9
      Xiaoguang Wang authored
      Since commit a3a0e43f ("io_uring: don't enter poll loop if we have
      CQEs pending"), if we already have events pending, we won't enter the
      poll loop. If SETUP_IOPOLL and SETUP_SQPOLL are both enabled, the app
      has been terminated without reaping pending events that are already in
      the cq ring, and there are still some reqs in poll_list, io_sq_thread
      will enter __io_iopoll_check(), find the pending events, and return, so
      this loop never has a chance to exit.

      I have seen this issue in fio stress tests. To fix it, let
      io_sq_thread call io_iopoll_getevents() with argument 'min' being zero,
      and remove __io_iopoll_check().
      
      Fixes: a3a0e43f ("io_uring: don't enter poll loop if we have CQEs pending")
      Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • ext4: fix mount failure with quota configured as module · 9db176bc
      Jan Kara authored
      When CONFIG_QFMT_V2 is configured as a module, the test in
      ext4_feature_set_ok() fails and so mount of filesystems with quota or
      project features fails. Fix the test to use IS_ENABLED macro which
      works properly even for modules.
      
      Link: https://lore.kernel.org/r/20200221100835.9332-1-jack@suse.cz
      Fixes: d65d87a0 ("ext4: improve explanation of a mount failure caused by a misconfigured kernel")
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
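Why a check for the built-in symbol misses a modular option, as in the ext4 mount failure, can be demonstrated outside the kernel. The sketch below re-creates the preprocessor trick behind `IS_ENABLED()` under `DEMO_`-prefixed names. The option `CONFIG_DEMO_QFMT` and both helper functions are hypothetical, and the assumption that an option built as a module defines only the `_MODULE` variant of its symbol reflects how Kconfig output is normally generated.

```c
#include <assert.h>

/* Simulated Kconfig output for an option set to "m": only _MODULE is defined. */
#define CONFIG_DEMO_QFMT_MODULE 1

/* Userspace re-creation of the macro trick; all DEMO_ names are hypothetical. */
#define DEMO_ARG_PLACEHOLDER_1 0,
#define DEMO_TAKE_SECOND(ignored, val, ...) val
#define DEMO_IS_DEFINED(x) DEMO_IS_DEFINED2(x)
#define DEMO_IS_DEFINED2(val) DEMO_IS_DEFINED3(DEMO_ARG_PLACEHOLDER_##val)
#define DEMO_IS_DEFINED3(arg1_or_junk) DEMO_TAKE_SECOND(arg1_or_junk 1, 0, 0)

#define DEMO_IS_BUILTIN(option) DEMO_IS_DEFINED(option)
#define DEMO_IS_MODULE(option)  DEMO_IS_DEFINED(option##_MODULE)
#define DEMO_IS_ENABLED(option) (DEMO_IS_BUILTIN(option) || DEMO_IS_MODULE(option))

/* What a plain #ifdef on the built-in symbol sees for a modular option. */
int quota_seen_by_ifdef(void)
{
#ifdef CONFIG_DEMO_QFMT
    return 1;
#else
    return 0;
#endif
}

/* What an IS_ENABLED-style test sees: built-in or module both count. */
int quota_seen_by_is_enabled(void)
{
    return DEMO_IS_ENABLED(CONFIG_DEMO_QFMT);
}
```

With only the `_MODULE` symbol defined, the `#ifdef` variant reports the option as absent while the `DEMO_IS_ENABLED()` variant reports it as present.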
    • jbd2: fix ocfs2 corrupt when clearing block group bits · 8eedabfd
      wangyan authored
      I found a NULL pointer dereference in ocfs2_block_group_clear_bits().
      The running environment:
      	kernel version: 4.19
      	A cluster with two nodes, 5 LUNs mounted on the two nodes, running
      	file operations such as dd/fallocate/truncate/rm on every LUN while
      	the storage network is disconnected.

      The fallocate operation on dm-23-45 caused a NULL pointer dereference.

      The NULL pointer dereference information is as follows:
      	[577992.878282] JBD2: Error -5 detected when updating journal superblock for dm-23-45.
      	[577992.878290] Aborting journal on device dm-23-45.
      	...
      	[577992.890778] JBD2: Error -5 detected when updating journal superblock for dm-24-46.
      	[577992.890908] __journal_remove_journal_head: freeing b_committed_data
      	[577992.890916] (fallocate,88392,52):ocfs2_extend_trans:474 ERROR: status = -30
      	[577992.890918] __journal_remove_journal_head: freeing b_committed_data
      	[577992.890920] (fallocate,88392,52):ocfs2_rotate_tree_right:2500 ERROR: status = -30
      	[577992.890922] __journal_remove_journal_head: freeing b_committed_data
      	[577992.890924] (fallocate,88392,52):ocfs2_do_insert_extent:4382 ERROR: status = -30
      	[577992.890928] (fallocate,88392,52):ocfs2_insert_extent:4842 ERROR: status = -30
      	[577992.890928] __journal_remove_journal_head: freeing b_committed_data
      	[577992.890930] (fallocate,88392,52):ocfs2_add_clusters_in_btree:4947 ERROR: status = -30
      	[577992.890933] __journal_remove_journal_head: freeing b_committed_data
      	[577992.890939] __journal_remove_journal_head: freeing b_committed_data
      	[577992.890949] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
      	[577992.890950] Mem abort info:
      	[577992.890951]   ESR = 0x96000004
      	[577992.890952]   Exception class = DABT (current EL), IL = 32 bits
      	[577992.890952]   SET = 0, FnV = 0
      	[577992.890953]   EA = 0, S1PTW = 0
      	[577992.890954] Data abort info:
      	[577992.890955]   ISV = 0, ISS = 0x00000004
      	[577992.890956]   CM = 0, WnR = 0
      	[577992.890958] user pgtable: 4k pages, 48-bit VAs, pgdp = 00000000f8da07a9
      	[577992.890960] [0000000000000020] pgd=0000000000000000
      	[577992.890964] Internal error: Oops: 96000004 [#1] SMP
      	[577992.890965] Process fallocate (pid: 88392, stack limit = 0x00000000013db2fd)
      	[577992.890968] CPU: 52 PID: 88392 Comm: fallocate Kdump: loaded Tainted: G        W  OE     4.19.36 #1
      	[577992.890969] Hardware name: Huawei TaiShan 2280 V2/BC82AMDD, BIOS 0.98 08/25/2019
      	[577992.890971] pstate: 60400009 (nZCv daif +PAN -UAO)
      	[577992.891054] pc : _ocfs2_free_suballoc_bits+0x63c/0x968 [ocfs2]
      	[577992.891082] lr : _ocfs2_free_suballoc_bits+0x618/0x968 [ocfs2]
      	[577992.891084] sp : ffff0000c8e2b810
      	[577992.891085] x29: ffff0000c8e2b820 x28: 0000000000000000
      	[577992.891087] x27: 00000000000006f3 x26: ffffa07957b02e70
      	[577992.891089] x25: ffff807c59d50000 x24: 00000000000006f2
      	[577992.891091] x23: 0000000000000001 x22: ffff807bd39abc30
      	[577992.891093] x21: ffff0000811d9000 x20: ffffa07535d6a000
      	[577992.891097] x19: ffff000001681638 x18: ffffffffffffffff
      	[577992.891098] x17: 0000000000000000 x16: ffff000080a03df0
      	[577992.891100] x15: ffff0000811d9708 x14: 203d207375746174
      	[577992.891101] x13: 73203a524f525245 x12: 20373439343a6565
      	[577992.891103] x11: 0000000000000038 x10: 0101010101010101
      	[577992.891106] x9 : ffffa07c68a85d70 x8 : 7f7f7f7f7f7f7f7f
      	[577992.891109] x7 : 0000000000000000 x6 : 0000000000000080
      	[577992.891110] x5 : 0000000000000000 x4 : 0000000000000002
      	[577992.891112] x3 : ffff000001713390 x2 : 2ff90f88b1c22f00
      	[577992.891114] x1 : ffff807bd39abc30 x0 : 0000000000000000
      	[577992.891116] Call trace:
      	[577992.891139]  _ocfs2_free_suballoc_bits+0x63c/0x968 [ocfs2]
      	[577992.891162]  _ocfs2_free_clusters+0x100/0x290 [ocfs2]
      	[577992.891185]  ocfs2_free_clusters+0x50/0x68 [ocfs2]
      	[577992.891206]  ocfs2_add_clusters_in_btree+0x198/0x5e0 [ocfs2]
      	[577992.891227]  ocfs2_add_inode_data+0x94/0xc8 [ocfs2]
      	[577992.891248]  ocfs2_extend_allocation+0x1bc/0x7a8 [ocfs2]
      	[577992.891269]  ocfs2_allocate_extents+0x14c/0x338 [ocfs2]
      	[577992.891290]  __ocfs2_change_file_space+0x3f8/0x610 [ocfs2]
      	[577992.891309]  ocfs2_fallocate+0xe4/0x128 [ocfs2]
      	[577992.891316]  vfs_fallocate+0x11c/0x250
      	[577992.891317]  ksys_fallocate+0x54/0x88
      	[577992.891319]  __arm64_sys_fallocate+0x28/0x38
      	[577992.891323]  el0_svc_common+0x78/0x130
      	[577992.891325]  el0_svc_handler+0x38/0x78
      	[577992.891327]  el0_svc+0x8/0xc
      
      My analysis process is as follows:
      ocfs2_fallocate
        __ocfs2_change_file_space
          ocfs2_allocate_extents
            ocfs2_extend_allocation
              ocfs2_add_inode_data
                ocfs2_add_clusters_in_btree
                  ocfs2_insert_extent
                    ocfs2_do_insert_extent
                      ocfs2_rotate_tree_right
                        ocfs2_extend_rotate_transaction
                          ocfs2_extend_trans
                            jbd2_journal_restart
                              jbd2__journal_restart
                                /* handle->h_transaction is NULL,
                                 * is_handle_aborted(handle) is true
                                 */
                                handle->h_transaction = NULL;
                                start_this_handle
                                  return -EROFS;
                  ocfs2_free_clusters
                    _ocfs2_free_clusters
                      _ocfs2_free_suballoc_bits
                        ocfs2_block_group_clear_bits
                          ocfs2_journal_access_gd
                            __ocfs2_journal_access
                              jbd2_journal_get_undo_access
                                /* I think jbd2_write_access_granted() will
                                 * return true, because do_get_write_access()
                                 * will return -EROFS.
                                 */
                                if (jbd2_write_access_granted(...)) return 0;
                                do_get_write_access
                                  /* handle->h_transaction is NULL, it will
                                   * return -EROFS here, so do_get_write_access()
                                   * was not called.
                                   */
                                  if (is_handle_aborted(handle)) return -EROFS;
                          /* bh2jh(group_bh) is NULL, caused NULL
                             pointer dereference */
                          undo_bg = (struct ocfs2_group_desc *)
                                      bh2jh(group_bh)->b_committed_data;
      
      If handle->h_transaction == NULL, then jbd2_write_access_granted()
      does not really guarantee that the journal_head will stay around,
      let alone its b_committed_data. The journal_head behind
      bh2jh(group_bh) can be removed after ocfs2_journal_access_gd()
      returns and before bh2jh(group_bh)->b_committed_data is
      dereferenced. So we should move the is_handle_aborted() check from
      do_get_write_access() into jbd2_journal_get_undo_access() and
      jbd2_journal_get_write_access(), before the call to
      jbd2_write_access_granted().
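
The reordering described above can be sketched in user-space C. This is only an illustrative model, not the kernel patch: the structures, the `write_access_granted()` helper and the `get_undo_access()` function are hypothetical stand-ins, and the slow path is reduced to a single assignment. It assumes the fast path treats a buffer that belongs to no transaction as "granted" when the handle's transaction pointer is also NULL, which is the situation the analysis describes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the jbd2 structures. */
struct transaction { int t_tid; };

struct handle {
    struct transaction *h_transaction; /* NULL once the handle is aborted */
    bool h_aborted;
};

struct buffer {
    struct transaction *owner;         /* transaction that owns the buffer */
};

static bool is_handle_aborted(const struct handle *h)
{
    return h->h_aborted || h->h_transaction == NULL;
}

/* Fast path: the buffer already belongs to the handle's transaction.
 * Note that NULL == NULL also counts as a match in this model. */
static bool write_access_granted(const struct handle *h,
                                 const struct buffer *b)
{
    return b->owner == h->h_transaction;
}

/* Reject aborted handles before the fast path can report success. */
int get_undo_access(struct handle *h, struct buffer *b)
{
    if (is_handle_aborted(h))
        return -EROFS;

    assert(h->h_transaction != NULL);
    if (write_access_granted(h, b))
        return 0;

    /* Slow path, reduced to attaching the buffer to the transaction. */
    b->owner = h->h_transaction;
    return 0;
}
```

With the check in this position, an aborted handle paired with an unowned buffer yields -EROFS, so the caller never goes on to dereference journal state that may already be gone.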
      
      Link: https://lore.kernel.org/r/f72a623f-b3f1-381a-d91d-d22a1c83a336@huawei.com
      Signed-off-by: Yan Wang <wangyan122@huawei.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Jun Piao <piaojun@huawei.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: stable@kernel.org
      8eedabfd
    • ext4: fix race between writepages and enabling EXT4_EXTENTS_FL · cb85f4d2
      Eric Biggers authored
      If EXT4_EXTENTS_FL is set on an inode while ext4_writepages() is running
      on it, the following warning in ext4_add_complete_io() can be hit:
      
      WARNING: CPU: 1 PID: 0 at fs/ext4/page-io.c:234 ext4_put_io_end_defer+0xf0/0x120
      
      Here's a minimal reproducer (not 100% reliable) (root isn't required):
      
              while true; do
                      sync
              done &
              while true; do
                      rm -f file
                      touch file
                      chattr -e file
                      echo X >> file
                      chattr +e file
              done
      
      The problem is that in ext4_writepages(), ext4_should_dioread_nolock()
      (which only returns true on extent-based files) is checked once to set
      the number of reserved journal credits, and also again later to select
      the flags for ext4_map_blocks() and copy the reserved journal handle to
      ext4_io_end::handle.  But if EXT4_EXTENTS_FL is being concurrently set,
      the first check can see dioread_nolock disabled while the later one can
      see it enabled, causing the reserved handle to unexpectedly be NULL.
      
      Since changing EXT4_EXTENTS_FL is uncommon, and there may be other races
      related to doing so as well, fix this by synchronizing changing
      EXT4_EXTENTS_FL with ext4_writepages() via the existing
      s_writepages_rwsem (previously called s_journal_flag_rwsem).
      
      This was originally reported by syzbot without a reproducer at
      https://syzkaller.appspot.com/bug?extid=2202a584a00fffd19fbf,
      but now that dioread_nolock is the default I also started seeing this
      when running syzkaller locally.
      
      Link: https://lore.kernel.org/r/20200219183047.47417-3-ebiggers@kernel.org
      Reported-by: syzbot+2202a584a00fffd19fbf@syzkaller.appspotmail.com
      Fixes: 6b523df4 ("ext4: use transaction reservation for extent conversion in ext4_end_io")
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: stable@kernel.org
      cb85f4d2
    • ext4: rename s_journal_flag_rwsem to s_writepages_rwsem · bbd55937
      Eric Biggers authored
      In preparation for making s_journal_flag_rwsem synchronize
      ext4_writepages() with changes to both the EXTENTS and JOURNAL_DATA
      flags (rather than just JOURNAL_DATA as it does currently), rename it to
      s_writepages_rwsem.
      
      Link: https://lore.kernel.org/r/20200219183047.47417-2-ebiggers@kernel.org
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: stable@kernel.org
      bbd55937
    • ext4: fix potential race between s_flex_groups online resizing and access · 7c990728
      Suraj Jitindar Singh authored
      During an online resize an array of s_flex_groups structures gets replaced
      so it can get enlarged. If there is a concurrent access to the array and
      this memory has been reused then this can lead to an invalid memory access.
      
      The s_flex_groups array has been converted into an array of pointers
      rather than an array of structures. This ensures that the information
      contained in the structures cannot get out of sync during a resize due
      to an accessor updating the value in the old structure after it has
      been copied but before the array pointer is updated. Since the
      structures themselves are no longer copied, only the pointers to them,
      this case is mitigated.
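
The effect of switching to an array of pointers can be illustrated with a small user-space model. This is a sketch only: `struct group_table` and `table_grow()` are hypothetical names, and the model frees the old pointer array immediately, whereas real kernel code has to defer that until concurrent readers are done with it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct flex_group { long free_clusters; };

/* Hypothetical table: an array of pointers to per-group structures. */
struct group_table {
    struct flex_group **groups;
    size_t count;
};

/* Grow the table. Only the pointers are copied, so every existing
 * structure keeps its address. Returns 0 on success, -1 on failure. */
int table_grow(struct group_table *t, size_t new_count)
{
    struct flex_group **n;
    size_t i;

    assert(t != NULL);
    if (new_count <= t->count)
        return -1;

    n = calloc(new_count, sizeof(*n));
    if (!n)
        return -1;
    if (t->count)
        memcpy(n, t->groups, t->count * sizeof(*n));

    for (i = t->count; i < new_count; i++) {
        n[i] = calloc(1, sizeof(**n));
        if (!n[i]) {
            /* Undo only the structures allocated in this call. */
            while (i-- > t->count)
                free(n[i]);
            free(n);
            return -1;
        }
    }

    free(t->groups); /* simplification: no deferred free in this model */
    t->groups = n;
    t->count = new_count;
    return 0;
}
```

An accessor that fetched a structure pointer before the resize and updates it afterwards still modifies the one live structure, so the update is visible through the new array as well.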
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=206443
      Link: https://lore.kernel.org/r/20200221053458.730016-4-tytso@mit.edu
      Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      7c990728
    • Merge tag 'for-linus-5.6-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 54dedb5b
      Linus Torvalds authored
      Pull xen fixes from Juergen Gross:
       "Two small fixes for Xen:
      
         - a fix to avoid warnings with new gcc
      
         - a fix for incorrectly disabled interrupts when calling
           _cond_resched()"
      
      * tag 'for-linus-5.6-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen: Enable interrupts when calling _cond_resched()
        x86/xen: Distribute switch variables for initialization
      54dedb5b
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 63f01d85
      Linus Torvalds authored
      Pull arm64 fixes from Will Deacon:
       "It's all straightforward apart from the changes to mmap()/mremap() in
        relation to their handling of address arguments from userspace with
        non-zero tag bits in the upper byte.
      
        The change to brk() is necessary to fix a nasty user-visible
        regression in malloc(), but we tightened up mmap() and mremap() at the
        same time because they also allow the user to create virtual aliases
        by accident. It's much less likely than brk() to matter in practice,
        but enforcing the principle of "don't permit the creation of mappings
        using tagged addresses" leads to a straightforward ABI without having
        to worry about the "but what if a crazy program did foo?" aspect of
        things.
      
        Summary:
      
         - Fix regression in malloc() caused by ignored address tags in brk()
      
         - Add missing brackets around argument to untagged_addr() macro
      
         - Fix clang build when using binutils assembler
      
         - Fix silly typo in virtual memory map documentation"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        mm: Avoid creating virtual address aliases in brk()/mmap()/mremap()
        docs: arm64: fix trivial spelling enought to enough in memory.rst
        arm64: memory: Add missing brackets to untagged_addr() macro
        arm64: lse: Fix LSE atomics with LLVM
      63f01d85
  14. 21 Feb, 2020 5 commits
    • Merge tag 'powerpc-5.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 28659362
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "Some more powerpc fixes for 5.6. This is two weeks worth as I was out
        sick last week:
      
         - Three fixes for the recently added VMAP_STACK on 32-bit.
      
         - Three fixes related to hugepages on 8xx (32-bit).
      
         - A fix for a bug in our transactional memory handling that could
           lead to a kernel crash if we saw a page fault during signal
           delivery.
      
         - A fix for a deadlock in our PCI EEH (Enhanced Error Handling) code.
      
         - A couple of other minor fixes.
      
        Thanks to: Christophe Leroy, Erhard F, Frederic Barrat, Gustavo Luiz
        Duarte, Larry Finger, Leonardo Bras, Oliver O'Halloran, Sam Bobroff"
      
      * tag 'powerpc-5.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/entry: Fix an #if which should be an #ifdef in entry_32.S
        powerpc/xmon: Fix whitespace handling in getstring()
        powerpc/6xx: Fix power_save_ppc32_restore() with CONFIG_VMAP_STACK
        powerpc/chrp: Fix enter_rtas() with CONFIG_VMAP_STACK
        powerpc/32s: Fix DSI and ISI exceptions for CONFIG_VMAP_STACK
        powerpc/tm: Fix clearing MSR[TS] in current when reclaiming on signal delivery
        powerpc/8xx: Fix clearing of bits 20-23 in ITLB miss
        powerpc/hugetlb: Fix 8M hugepages on 8xx
        powerpc/hugetlb: Fix 512k hugepages on 8xx with 16k page size
        powerpc/eeh: Fix deadlock handling dead PHB
      28659362
    • Merge tag 'linux-watchdog-5.6-rc3' of git://www.linux-watchdog.org/linux-watchdog · 0c0ddd6a
      Linus Torvalds authored
      Pull watchdog fixes from Wim Van Sebroeck:
      
       - mtk_wdt needs RESET_CONTROLLER to build
      
       - da9062 driver fixes:
           - fix power management ops
           - do not ping the hw during stop()
           - add dependency on I2C
      
      * tag 'linux-watchdog-5.6-rc3' of git://www.linux-watchdog.org/linux-watchdog:
        watchdog: da9062: Add dependency on I2C
        watchdog: da9062: fix power management ops
        watchdog: da9062: do not ping the hw during stop()
        watchdog: fix mtk_wdt.c RESET_CONTROLLER build error
      0c0ddd6a
    • Merge tag 'char-misc-5.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · bb65619e
      Linus Torvalds authored
      Pull char/misc driver fixes from Greg KH:
       "Here are some small char/misc driver fixes for 5.6-rc3.
      
        Also included in here are some updates for some documentation files
        that I seem to be maintaining these days.
      
        The driver fixes are:
         - small fixes for the habanalabs driver
         - fsi driver bugfix
      
        All of these have been in linux-next for a while with no reported
        issues"
      
      * tag 'char-misc-5.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        Documentation/process: Swap out the ambassador for Canonical
        habanalabs: patched cb equals user cb in device memset
        habanalabs: do not halt CoreSight during hard reset
        habanalabs: halt the engines before hard-reset
        MAINTAINERS: remove unnecessary ':' characters
        fsi: aspeed: add unspecified HAS_IOMEM dependency
        COPYING: state that all contributions really are covered by this file
        Documentation/process: Change Microsoft contact for embargoed hardware issues
        embargoed-hardware-issues: drop Amazon contact as the email address now bounces
        Documentation/process: Add Arm contact for embargoed HW issues
      bb65619e
    • Merge tag 'staging-5.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · e5553ac7
      Linus Torvalds authored
      Pull staging driver fixes from Greg KH:
       "Here are some small staging driver fixes for 5.6-rc3, along with the
        removal of an unused/unneeded driver as well.
      
        The android vsoc driver is not needed anymore by anyone, so it was
        removed.
      
        The other driver fixes are:
         - ashmem bugfixes
         - greybus audio driver bugfix
         - wireless driver bugfixes and tiny cleanups to error paths
      
        All of these have been in linux-next for a while now with no reported
        issues"
      
      * tag 'staging-5.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        staging: rtl8723bs: Remove unneeded goto statements
        staging: rtl8188eu: Remove some unneeded goto statements
        staging: rtl8723bs: Fix potential overuse of kernel memory
        staging: rtl8188eu: Fix potential overuse of kernel memory
        staging: rtl8723bs: Fix potential security hole
        staging: rtl8188eu: Fix potential security hole
        staging: greybus: use after free in gb_audio_manager_remove_all()
        staging: android: Delete the 'vsoc' driver
        staging: rtl8723bs: fix copy of overlapping memory
        staging: android: ashmem: Disallow ashmem memory from being remapped
        staging: vt6656: fix sign of rx_dbm to bb_pre_ed_rssi.
      e5553ac7
    • Merge tag 'tty-5.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · ef11f1b7
      Linus Torvalds authored
      Pull tty/serial driver fixes from Greg KH:
       "Here are a number of small tty and serial driver fixes for 5.6-rc3
        that resolve a bunch of reported issues.
      
        They are:
         - vt selection and ioctl fixes
         - serdev bugfix
         - atmel serial driver fixes
         - qcom serial driver fixes
         - other minor serial driver fixes
      
        All of these have been in linux-next for a while with no reported
        issues"
      
      * tag 'tty-5.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        vt: selection, close sel_buffer race
        vt: selection, handle pending signals in paste_selection
        serial: cpm_uart: call cpm_muram_init before registering console
        tty: serial: qcom_geni_serial: Fix RX cancel command failure
        serial: 8250: Check UPF_IRQ_SHARED in advance
        tty: serial: imx: setup the correct sg entry for tx dma
        vt: vt_ioctl: fix race in VT_RESIZEX
        vt: fix scrollback flushing on background consoles
        tty: serial: tegra: Handle RX transfer in PIO mode if DMA wasn't started
        tty/serial: atmel: manage shutdown in case of RS485 or ISO7816 mode
        serdev: ttyport: restore client ops on deregistration
        serial: ar933x_uart: set UART_CS_{RX,TX}_READY_ORIDE
      ef11f1b7