1. 01 Sep, 2020 6 commits
    • Merge branch 'block-5.9' into for-5.10/block · a98278ec
      Jens Axboe authored
      * block-5.9:
        blk-stat: make q->stats->lock irqsafe
        blk-iocost: ioc_pd_free() shouldn't assume irq disabled
        block: fix locking in bdev_del_partition
        block: release disk reference in hd_struct_free_work
        block: ensure bdi->io_pages is always initialized
        nvme-pci: cancel nvme device request before disabling
        nvme: only use power of two io boundaries
        nvme: fix controller instance leak
        nvmet-fc: Fix a missed _irqsave version of spin_lock in 'nvmet_fc_fod_op_done()'
        nvme: Fix NULL dereference for pci nvme controllers
        nvme-rdma: fix reset hang if controller died in the middle of a reset
        nvme-rdma: fix timeout handler
        nvme-rdma: serialize controller teardown sequences
        nvme-tcp: fix reset hang if controller died in the middle of a reset
        nvme-tcp: fix timeout handler
        nvme-tcp: serialize controller teardown sequences
        nvme: have nvme_wait_freeze_timeout return if it timed out
        nvme-fabrics: don't check state NVME_CTRL_NEW for request acceptance
        nvmet-tcp: Fix NULL dereference when a connect data comes in h2cdata pdu
    • blk-stat: make q->stats->lock irqsafe · e11d80a8
      Tejun Heo authored
      blk-iocost calls blk_stat_enable_accounting() while holding an irqsafe lock
      which triggers a lockdep splat because q->stats->lock isn't irqsafe. Let's
      make it irqsafe.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Fixes: cd006509 ("blk-iocost: account for IO size when testing latencies")
      Cc: stable@vger.kernel.org # v5.8+
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • blk-iocost: ioc_pd_free() shouldn't assume irq disabled · 5aeac7c4
      Tejun Heo authored
      ioc_pd_free() grabs irq-safe ioc->lock without ensuring that irq is disabled
      when it can be called with irq disabled or enabled. This has a small chance
      of causing A-A deadlocks and triggers lockdep splats. Use irqsave operations
      instead.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Fixes: 7caa4715 ("blkcg: implement blk-iocost")
      Cc: stable@vger.kernel.org # v5.4+
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: fix locking in bdev_del_partition · 08fc1ab6
      Christoph Hellwig authored
      We need to hold the whole device bd_mutex to protect against
      other threads concurrently deleting our partition before we get
      to it, and thus causing a use after free.
      
      Fixes: cddae808 ("block: pass a hd_struct to delete_partition")
      Reported-by: syzbot+6448f3c229bc52b82f69@syzkaller.appspotmail.com
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: release disk reference in hd_struct_free_work · cafe01ef
      Ming Lei authored
      Commit e8c7d14a ("block: revert back to synchronous request_queue removal")
      stopped releasing the request queue from wq context because that commit
      assumed that every blk_put_queue() is called in a context that is allowed
      to sleep. However, this assumption isn't true because we release the disk's
      reference in the partition's percpu_ref ->release(), which isn't allowed
      to sleep, because ->release() is run via call_rcu().
      
      Fix this issue by moving the disk reference put into hd_struct_free_work().
      
      Fixes: e8c7d14a ("block: revert back to synchronous request_queue removal")
      Reported-by: Ilya Dryomov <idryomov@gmail.com>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Tested-by: Ilya Dryomov <idryomov@gmail.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Luis Chamberlain <mcgrof@kernel.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: ensure bdi->io_pages is always initialized · de1b0ee4
      Jens Axboe authored
      If a driver leaves the limit settings as the defaults, then we don't
      initialize bdi->io_pages. This means that file systems may need to
      work around bdi->io_pages == 0, which is somewhat messy.
      
      Initialize the default value just like we do for ->ra_pages.
      
      Cc: stable@vger.kernel.org
      Fixes: 9491ae4a ("mm: don't cap request size based on read-ahead setting")
      Reported-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  2. 30 Aug, 2020 12 commits
    • Linux 5.9-rc3 · f75aef39
      Linus Torvalds authored
    • Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · e43327c7
      Linus Torvalds authored
      Pull crypto fixes from Herbert Xu:
      
       - fix regression in af_alg that affects iwd
      
       - restore polling delay in qat
      
       - fix double free in ingenic on error path
      
       - fix potential build failure in sa2ul due to missing Kconfig dependency
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: af_alg - Work around empty control messages without MSG_MORE
        crypto: sa2ul - add Kconfig selects to fix build error
        crypto: ingenic - Drop kfree for memory allocated with devm_kzalloc
        crypto: qat - add delay before polling mailbox
    • Merge tag 'x86-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · dcc5c6f0
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "Three interrupt related fixes for X86:
      
         - Move disabling of the local APIC after invoking fixup_irqs() to
           ensure that interrupts which are incoming are noted in the IRR and
           not ignored.
      
         - Unbreak affinity setting.
      
           The rework of the entry code reused the regular exception entry
           code for device interrupts. The vector number is pushed into the
           errorcode slot on the stack which is then lifted into an argument
           and set to -1 because that's regs->orig_ax which is used in quite
           some places to check whether the entry came from a syscall.
      
           But it was overlooked that orig_ax is used in the affinity cleanup
           code to validate whether the interrupt has arrived on the new
           target. It turned out that this vector check is pointless because
           interrupts are never moved from one vector to another on the same
           CPU. That check is a historical leftover from the time where x86
           supported multi-CPU affinities, but no longer needed with the now
           strict single CPU affinity. Famous last words ...
      
         - Add a missing check for an empty cpumask into the matrix allocator.
      
           The affinity change added a warning to catch the case where an
           interrupt is moved on the same CPU to a different vector. This
           triggers because a condition with an empty cpumask returns an
           assignment from the allocator as the allocator uses for_each_cpu()
           without checking the cpumask for being empty. The historical
           inconsistent for_each_cpu() behaviour of ignoring the cpumask and
           unconditionally claiming that CPU0 is in the mask struck again.
           Sigh.
      
        plus a new entry into the MAINTAINER file for the HPE/UV platform"
      
      * tag 'x86-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        genirq/matrix: Deal with the sillyness of for_each_cpu() on UP
        x86/irq: Unbreak interrupt affinity setting
        x86/hotplug: Silence APIC only after all interrupts are migrated
        MAINTAINERS: Add entry for HPE Superdome Flex (UV) maintainers
    • Merge tag 'irq-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d2283cdc
      Linus Torvalds authored
      Pull irq fixes from Thomas Gleixner:
       "A set of fixes for interrupt chip drivers:
      
         - Revert the platform driver conversion of interrupt chip drivers as
           it turned out to create more problems than it solves.
      
         - Fix a trivial typo in the new module helpers which made probing
           reliably fail.
      
         - Small fixes in the STM32 and MIPS Ingenic drivers
      
         - The TI firmware rework which had badly managed dependencies and had
           to wait post rc1"
      
      * tag 'irq-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/ingenic: Leave parent IRQ unmasked on suspend
        irqchip/stm32-exti: Avoid losing interrupts due to clearing pending bits by mistake
        irqchip: Revert modular support for drivers using IRQCHIP_PLATFORM_DRIVER helperse
        irqchip: Fix probing deferal when using IRQCHIP_PLATFORM_DRIVER helpers
        arm64: dts: k3-am65: Update the RM resource types
        arm64: dts: k3-am65: ti-sci-inta/intr: Update to latest bindings
        arm64: dts: k3-j721e: ti-sci-inta/intr: Update to latest bindings
        irqchip/ti-sci-inta: Add support for INTA directly connecting to GIC
        irqchip/ti-sci-inta: Do not store TISCI device id in platform device id field
        dt-bindings: irqchip: Convert ti, sci-inta bindings to yaml
        dt-bindings: irqchip: ti, sci-inta: Update docs to support different parent.
        irqchip/ti-sci-intr: Add support for INTR being a parent to INTR
        dt-bindings: irqchip: Convert ti, sci-intr bindings to yaml
        dt-bindings: irqchip: ti, sci-intr: Update bindings to drop the usage of gic as parent
        firmware: ti_sci: Add support for getting resource with subtype
        firmware: ti_sci: Drop unused structure ti_sci_rm_type_map
        firmware: ti_sci: Drop the device id to resource type translation
    • Merge tag 'sched-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 0063a82d
      Linus Torvalds authored
      Pull scheduler fix from Thomas Gleixner:
       "A single fix for the scheduler:
      
         - Make is_idle_task() __always_inline to prevent the compiler from
           putting it out of line into the wrong section because it's used
           inside noinstr sections"
      
      * tag 'sched-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched: Use __always_inline on is_idle_task()
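The inlining point in the scheduler fix above can be illustrated in userspace. This is a minimal sketch, assuming a GCC or Clang compiler: `toy_always_inline`, `TOY_PF_IDLE`, `struct toy_task`, and `toy_is_idle_task()` are hypothetical stand-ins, not the kernel's definitions, and the sketch does not model noinstr sections themselves.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace spelling of a forced-inline qualifier (GCC/Clang). */
#define toy_always_inline inline __attribute__((__always_inline__))

/* Hypothetical flag bit and task type; not the kernel's definitions. */
#define TOY_PF_IDLE 0x2u

struct toy_task {
	unsigned int flags;
};

/*
 * Plain "inline" is only a hint, so the compiler may still emit an
 * out-of-line copy. The attribute asks for the body to be expanded
 * in every caller instead.
 */
static toy_always_inline bool toy_is_idle_task(const struct toy_task *p)
{
	return (p->flags & TOY_PF_IDLE) != 0;
}

/* Usage: the helper is expanded inside this caller. */
void demo_is_idle_task(void)
{
	struct toy_task idle = { TOY_PF_IDLE };
	struct toy_task busy = { 0 };

	assert(toy_is_idle_task(&idle));
	assert(!toy_is_idle_task(&busy));
}
```

Because the body is expanded in the caller, no separate copy of the helper is needed in another section of the binary, which is the property the fix relies on.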
    • Merge tag 'locking-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b69bea8a
      Linus Torvalds authored
      Pull locking fixes from Thomas Gleixner:
       "A set of fixes for lockdep, tracing and RCU:
      
         - Prevent recursion by using raw_cpu_* operations
      
         - Fixup the interrupt state in the cpu idle code to be consistent
      
         - Push rcu_idle_enter/exit() invocations deeper into the idle path so
           that the lock operations are inside the RCU watching sections
      
         - Move trace_cpu_idle() into generic code so it's called before RCU
           goes idle.
      
         - Handle raw_local_irq* vs. local_irq* operations correctly
      
         - Move the tracepoints out from under the lockdep recursion handling
           which turned out to be fragile and inconsistent"
      
      * tag 'locking-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        lockdep,trace: Expose tracepoints
        lockdep: Only trace IRQ edges
        mips: Implement arch_irqs_disabled()
        arm64: Implement arch_irqs_disabled()
        nds32: Implement arch_irqs_disabled()
        locking/lockdep: Cleanup
        x86/entry: Remove unused THUNKs
        cpuidle: Move trace_cpu_idle() into generic code
        cpuidle: Make CPUIDLE_FLAG_TLB_FLUSHED generic
        sched,idle,rcu: Push rcu_idle deeper into the idle path
        cpuidle: Fixup IRQ state
        lockdep: Use raw_cpu_*() for per-cpu variables
    • Merge tag '5.9-rc2-smb-fix' of git://git.samba.org/sfrench/cifs-2.6 · 3edd8db2
      Linus Torvalds authored
      Pull cifs fix from Steve French:
       "DFS fix for referral problem when using SMB1"
      
      * tag '5.9-rc2-smb-fix' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: fix check of tcon dfs in smb1
    • Merge tag 'powerpc-5.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 8bb5021c
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
      
       - Revert our removal of PROT_SAO, at least one user expressed an
         interest in using it on Power9. Instead don't allow it to be used in
         guests unless enabled explicitly at compile time.
      
       - A fix for a crash introduced by a recent change to FP handling.
      
       - Revert a change to our idle code that left Power10 with no idle
         support.
      
       - One minor fix for the new scv system call path to set PPR.
      
       - Fix a crash in our "generic" PMU if branch stack events were enabled.
      
       - A fix for the IMC PMU, to correctly identify host kernel samples.
      
       - The ADB_PMU powermac code was found to be incompatible with
         VMAP_STACK, so make them incompatible in Kconfig until the code can
         be fixed.
      
       - A build fix in drivers/video/fbdev/controlfb.c, and a documentation
         fix.
      
      Thanks to Alexey Kardashevskiy, Athira Rajeev, Christophe Leroy,
      Giuseppe Sacco, Madhavan Srinivasan, Milton Miller, Nicholas Piggin,
      Pratik Rajesh Sampat, Randy Dunlap, Shawn Anastasio, Vaidyanathan
      Srinivasan.
      
      * tag 'powerpc-5.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/32s: Disable VMAP stack which CONFIG_ADB_PMU
        Revert "powerpc/powernv/idle: Replace CPU feature check with PVR check"
        powerpc/perf: Fix reading of MSR[HV/PR] bits in trace-imc
        powerpc/perf: Fix crashes with generic_compat_pmu & BHRB
        powerpc/64s: Fix crash in load_fp_state() due to fpexc_mode
        powerpc/64s: scv entry should set PPR
        Documentation/powerpc: fix malformed table in syscall64-abi
        video: fbdev: controlfb: Fix build for COMPILE_TEST=y && PPC_PMAC=n
        selftests/powerpc: Update PROT_SAO test to skip ISA 3.1
        powerpc/64s: Disallow PROT_SAO in LPARs by default
        Revert "powerpc/64s: Remove PROT_SAO support"
    • Merge tag 'usb-5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 6f0306d1
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Let's try this again...  Here are some USB fixes for 5.9-rc3.
      
        This differs from the previous pull request for this release in that
        the usb gadget patch now does not break some systems, and actually
        does what it was intended to do. Many thanks to Marek Szyprowski for
        quickly noticing and testing the patch from Andy Shevchenko to resolve
        this issue.
      
        Additionally, some more new USB quirks have been added to get some new
        devices to work properly based on user reports.
      
        Other than that, the patches are all here, and they contain:
      
         - usb gadget driver fixes
      
         - xhci driver fixes
      
         - typec fixes
      
         - new quirks and ids
      
         - fixes for USB patches that went into 5.9-rc1.
      
        All of these have been tested in linux-next with no reported issues"
      
      * tag 'usb-5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (33 commits)
        usb: storage: Add unusual_uas entry for Sony PSZ drives
        USB: Ignore UAS for JMicron JMS567 ATA/ATAPI Bridge
        usb: host: ohci-exynos: Fix error handling in exynos_ohci_probe()
        USB: gadget: u_f: Unbreak offset calculation in VLAs
        USB: quirks: Ignore duplicate endpoint on Sound Devices MixPre-D
        usb: typec: tcpm: Fix Fix source hard reset response for TDA 2.3.1.1 and TDA 2.3.1.2 failures
        USB: PHY: JZ4770: Fix static checker warning.
        USB: gadget: f_ncm: add bounds checks to ncm_unwrap_ntb()
        USB: gadget: u_f: add overflow checks to VLA macros
        xhci: Always restore EP_SOFT_CLEAR_TOGGLE even if ep reset failed
        xhci: Do warm-reset when both CAS and XDEV_RESUME are set
        usb: host: xhci: fix ep context print mismatch in debugfs
        usb: uas: Add quirk for PNY Pro Elite
        tools: usb: move to tools buildsystem
        USB: Fix device driver race
        USB: Also match device drivers using the ->match vfunc
        usb: host: xhci-tegra: fix tegra_xusb_get_phy()
        usb: host: xhci-tegra: otg usb2/usb3 port init
        usb: hcd: Fix use after free in usb_hcd_pci_remove()
        usb: typec: ucsi: Hold con->lock for the entire duration of ucsi_register_port()
        ...
    • Merge tag 'edac_urgent_for_v5.9_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras · 42df60fc
      Linus Torvalds authored
      Pull EDAC fix from Borislav Petkov:
       "A fix to properly clear ghes_edac driver state on driver remove so
        that a subsequent load can probe the system properly (Shiju Jose)"
      
      * tag 'edac_urgent_for_v5.9_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
        EDAC/ghes: Fix NULL pointer dereference in ghes_edac_register()
    • Merge tag 'dma-mapping-5.9-2' of git://git.infradead.org/users/hch/dma-mapping · c4011283
      Linus Torvalds authored
      Pull dma-mapping fix from Christoph Hellwig:
       "Fix a possibly uninitialized variable (Dan Carpenter)"
      
      * tag 'dma-mapping-5.9-2' of git://git.infradead.org/users/hch/dma-mapping:
        dma-pool: Fix an uninitialized variable bug in atomic_pool_expand()
    • genirq/matrix: Deal with the sillyness of for_each_cpu() on UP · 784a0830
      Thomas Gleixner authored
      Most of the CPU mask operations behave the same way, but for_each_cpu() and
      its variants ignore the cpumask argument and claim that CPU0 is always in
      the mask. This is historical, inconsistent and annoying behaviour.
      
      The matrix allocator uses for_each_cpu() and can be called on UP with an
      empty cpumask. The calling code does not expect that this succeeds but
      until commit e027ffff ("x86/irq: Unbreak interrupt affinity setting")
      this went unnoticed. That commit added a WARN_ON() to catch cases which
      move an interrupt from one vector to another on the same CPU. The warning
      triggers on UP.
      
      Add a check for the cpumask being empty to prevent this.
      
      Fixes: 2f75d9e1 ("genirq: Implement bitmap matrix allocator")
      Reported-by: kernel test robot <rong.a.chen@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
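The for_each_cpu() behaviour described in this commit message can be reproduced in userspace. This is a minimal sketch, assuming a one-word stand-in for the CPU mask: `struct toy_cpumask`, `up_for_each_cpu()`, `toy_cpumask_empty()`, `pick_cpu_unchecked()`, and `pick_cpu_checked()` are hypothetical names, and the macro only mimics a uniprocessor iterator that ignores its mask argument. It is not the matrix allocator code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a CPU mask: one bit per CPU (hypothetical type). */
struct toy_cpumask {
	unsigned long bits;
};

/*
 * Mimics the UP flavour of for_each_cpu() described above: the mask is
 * evaluated but ignored, so the body always runs once with cpu == 0.
 */
#define up_for_each_cpu(cpu, mask) \
	for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)(mask))

bool toy_cpumask_empty(const struct toy_cpumask *m)
{
	return m->bits == 0;
}

/* Without a guard, an empty mask still "finds" CPU 0. */
int pick_cpu_unchecked(const struct toy_cpumask *m)
{
	int cpu, best = -1;

	up_for_each_cpu(cpu, m)
		best = cpu;
	return best;
}

/* With the guard, an empty mask yields no CPU (-1). */
int pick_cpu_checked(const struct toy_cpumask *m)
{
	if (toy_cpumask_empty(m))
		return -1;
	return pick_cpu_unchecked(m);
}

/* Usage: compare both variants on an empty mask and on a mask with CPU 0. */
void demo_empty_mask(void)
{
	struct toy_cpumask empty = { 0 };
	struct toy_cpumask cpu0 = { 1UL };

	assert(pick_cpu_unchecked(&empty) == 0);  /* bogus success */
	assert(pick_cpu_checked(&empty) == -1);   /* rejected up front */
	assert(pick_cpu_checked(&cpu0) == 0);
}
```

The explicit emptiness check in front of the loop is the same shape of fix the commit describes: the iterator cannot be trusted to reject an empty mask, so the caller does it.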
  3. 29 Aug, 2020 7 commits
    • Merge tag 'fallthrough-fixes-5.9-rc3' of... · 1127b219
      Linus Torvalds authored
      Merge tag 'fallthrough-fixes-5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux
      
      Pull fallthrough fixes from Gustavo A. R. Silva:
       "Fix some minor issues introduced by the recent treewide fallthrough
        conversions:
      
         - Fix indentation issue
      
         - Fix erroneous fallthrough annotation
      
         - Remove unnecessary fallthrough annotation
      
         - Fix code comment changed by fallthrough conversion"
      
      * tag 'fallthrough-fixes-5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
        arm64/cpuinfo: Remove unnecessary fallthrough annotation
        media: dib0700: Fix identation issue in dib8096_set_param_override()
        afs: Remove erroneous fallthough annotation
        iio: dpot-dac: fix code comment in dpot_dac_read_raw()
    • fsldma: fix very broken 32-bit ppc ioread64 functionality · 0a4c56c8
      Linus Torvalds authored
      Commit ef91bb19 ("kernel.h: Silence sparse warning in
      lower_32_bits") caused new warnings to show in the fsldma driver, but
      that commit was not to blame: it only exposed some very incorrect code
      that tried to take the low 32 bits of an address.
      
      That made no sense for multiple reasons, the most notable one being that
      that code was intentionally limited to only 32-bit ppc builds, so "only
      low 32 bits of an address" was completely nonsensical.  There were no
      high bits to mask off to begin with.
      
      But even more importantly from a correctness standpoint, turning the
      address into an integer then caused the subsequent address arithmetic to
      be completely wrong too, and the "+1" actually incremented the address
      by one, rather than by four.
      
      Which again was incorrect, since the code was reading two 32-bit values
      and trying to make a 64-bit end result of it all.  Surprisingly, the
      iowrite64() did not suffer from the same odd and incorrect model.
      
      This code has never worked, but it's questionable whether anybody cared:
      of the two users that actually read the 64-bit value (by way of some C
      preprocessor hackery and eventually the 'get_cdar()' inline function),
      one of them explicitly ignored the value, and the other one might just
      happen to work despite the incorrect value being read.
      
      This patch at least makes it not fail the build any more, and makes the
      logic superficially sane.  Whether it makes any difference to the code
      _working_ or not shall remain a mystery.
      Compile-tested-by: Guenter Roeck <linux@roeck-us.net>
      Reviewed-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
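The "+1" problem described in this commit message comes down to integer versus pointer arithmetic. This is a minimal userspace sketch, assuming ordinary memory instead of device registers and a platform where converting a pointer to `uintptr_t` yields a byte address. The names `integer_step()`, `pointer_step()`, and `read64_from_words()` are hypothetical and this is not the fsldma driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Distance in bytes covered by "+1" once the address has become an integer. */
size_t integer_step(const uint32_t *base)
{
	uintptr_t addr = (uintptr_t)base;

	return (size_t)((addr + 1) - addr);
}

/* Distance in bytes covered by "+1" on a pointer to a 32-bit value. */
size_t pointer_step(const uint32_t *base)
{
	const uint32_t *next = base + 1;

	return (size_t)((const unsigned char *)next -
			(const unsigned char *)base);
}

/* Build a 64-bit result from two adjacent 32-bit words, low word first. */
uint64_t read64_from_words(const uint32_t *base)
{
	uint32_t lo = *base;
	uint32_t hi = *(base + 1);

	return ((uint64_t)hi << 32) | lo;
}

/* Usage: the integer form moves one byte, the pointer form one word. */
void demo_read64(void)
{
	uint32_t words[2] = { 0x11111111u, 0x22222222u };

	assert(integer_step(words) == 1);
	assert(pointer_step(words) == sizeof(uint32_t));
	assert(read64_from_words(words) == 0x2222222211111111ull);
}
```

Keeping the address as a pointer to a 32-bit type lets the compiler scale the increment, so the second read lands on the next word rather than on the next byte.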
    • Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · e77aee13
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "A core fix for ACPI matching and two driver bugfixes"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: iproc: Fix shifting 31 bits
        i2c: rcar: in slave mode, clear NACK earlier
        i2c: acpi: Remove dead code, i.e. i2c_acpi_match_device()
        i2c: core: Don't fail PRP0001 enumeration when no ID table exist
    • Merge tag 's390-5.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 1b46b921
      Linus Torvalds authored
      Pull s390 fixes from Vasily Gorbik:
      
       - Disable preemption trace in percpu macros since the lockdep code
         itself uses percpu variables now and it causes recursions.
      
       - Fix kernel space 4-level paging broken by recent vmem rework.
      
      * tag 's390-5.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/vmem: fix vmem_add_range for 4-level paging
        s390: don't trace preemption in percpu macros
    • Merge tag 'for-linus-5.9-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · c8b5563a
      Linus Torvalds authored
      Pull xen fixes from Juergen Gross:
       "Two fixes for Xen: one needed for ongoing work to support virtio with
        Xen, and one for a corner case in IRQ handling with Xen"
      
      * tag 'for-linus-5.9-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        arm/xen: Add misuse warning to virt_to_gfn
        xen/xenbus: Fix granting of vmalloc'd memory
        XEN uses irqdesc::irq_data_common::handler_data to store a per interrupt XEN data pointer which contains XEN specific information.
    • Merge tag 'hwmon-for-v5.9-rc3' of... · e4cad138
      Linus Torvalds authored
      Merge tag 'hwmon-for-v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
      
      Pull hwmon fixes from Guenter Roeck:
      
       - Fix temperature scale in gsc-hwmon driver
      
       - Fix divide by 0 error in nct7904 driver
      
       - Drop non-existing attribute from pmbus/isl68137 driver
      
       - Fix status check in applesmc driver
      
      * tag 'hwmon-for-v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: (gsc-hwmon) Scale temperature to millidegrees
        hwmon: (applesmc) check status earlier.
        hwmon: (nct7904) Correct divide by 0
        hwmon: (pmbus/isl68137) remove READ_TEMPERATURE_1 telemetry for RAA228228
    • Merge branch 'nvme-5.9-rc' of git://git.infradead.org/nvme into block-5.9 · 5d220bcd
      Jens Axboe authored
      Pull NVMe fixes from Sagi:
      
      "- instance leak and io boundary fixes from Keith
       - fc locking fix from Christophe
       - various tcp/rdma reset during traffic fixes from Me
       - pci use-after-free fix from Tong
       - tcp target null deref fix from Ziye"
      
      * 'nvme-5.9-rc' of git://git.infradead.org/nvme:
        nvme-pci: cancel nvme device request before disabling
        nvme: only use power of two io boundaries
        nvme: fix controller instance leak
        nvmet-fc: Fix a missed _irqsave version of spin_lock in 'nvmet_fc_fod_op_done()'
        nvme: Fix NULL dereference for pci nvme controllers
        nvme-rdma: fix reset hang if controller died in the middle of a reset
        nvme-rdma: fix timeout handler
        nvme-rdma: serialize controller teardown sequences
        nvme-tcp: fix reset hang if controller died in the middle of a reset
        nvme-tcp: fix timeout handler
        nvme-tcp: serialize controller teardown sequences
        nvme: have nvme_wait_freeze_timeout return if it timed out
        nvme-fabrics: don't check state NVME_CTRL_NEW for request acceptance
        nvmet-tcp: Fix NULL dereference when a connect data comes in h2cdata pdu
  4. 28 Aug, 2020 15 commits
    • nvme-pci: cancel nvme device request before disabling · 7ad92f65
      Tong Zhang authored
      This patch addresses an irq free warning and a NULL pointer dereference
      that occur when an nvme device hits a timeout error during initialization.
      The problem happens when nvme_timeout() is called while
      nvme_reset_work() is still executing. The patch fixes it by
      setting the flag of the problematic request to NVME_REQ_CANCELLED before
      calling nvme_dev_disable(), so that __nvme_submit_sync_cmd() returns
      an error code and nvme_submit_sync_cmd() fails gracefully.
      The following is the console output.
      
      [   62.472097] nvme nvme0: I/O 13 QID 0 timeout, disable controller
      [   62.488796] nvme nvme0: could not set timestamp (881)
      [   62.494888] ------------[ cut here ]------------
      [   62.495142] Trying to free already-free IRQ 11
      [   62.495366] WARNING: CPU: 0 PID: 7 at kernel/irq/manage.c:1751 free_irq+0x1f7/0x370
      [   62.495742] Modules linked in:
      [   62.495902] CPU: 0 PID: 7 Comm: kworker/u4:0 Not tainted 5.8.0+ #8
      [   62.496206] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-48-gd9c812dda519-p4
      [   62.496772] Workqueue: nvme-reset-wq nvme_reset_work
      [   62.497019] RIP: 0010:free_irq+0x1f7/0x370
      [   62.497223] Code: e8 ce 49 11 00 48 83 c4 08 4c 89 e0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 44 89 f6 48 c70
      [   62.498133] RSP: 0000:ffffa96800043d40 EFLAGS: 00010086
      [   62.498391] RAX: 0000000000000000 RBX: ffff9b87fc458400 RCX: 0000000000000000
      [   62.498741] RDX: 0000000000000001 RSI: 0000000000000096 RDI: ffffffff9693d72c
      [   62.499091] RBP: ffff9b87fd4c8f60 R08: ffffa96800043bfd R09: 0000000000000163
      [   62.499440] R10: ffffa96800043bf8 R11: ffffa96800043bfd R12: ffff9b87fd4c8e00
      [   62.499790] R13: ffff9b87fd4c8ea4 R14: 000000000000000b R15: ffff9b87fd76b000
      [   62.500140] FS:  0000000000000000(0000) GS:ffff9b87fdc00000(0000) knlGS:0000000000000000
      [   62.500534] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   62.500816] CR2: 0000000000000000 CR3: 000000003aa0a000 CR4: 00000000000006f0
      [   62.501165] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   62.501515] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   62.501864] Call Trace:
      [   62.501993]  pci_free_irq+0x13/0x20
      [   62.502167]  nvme_reset_work+0x5d0/0x12a0
      [   62.502369]  ? update_load_avg+0x59/0x580
      [   62.502569]  ? ttwu_queue_wakelist+0xa8/0xc0
      [   62.502780]  ? try_to_wake_up+0x1a2/0x450
      [   62.502979]  process_one_work+0x1d2/0x390
      [   62.503179]  worker_thread+0x45/0x3b0
      [   62.503361]  ? process_one_work+0x390/0x390
      [   62.503568]  kthread+0xf9/0x130
      [   62.503726]  ? kthread_park+0x80/0x80
      [   62.503911]  ret_from_fork+0x22/0x30
      [   62.504090] ---[ end trace de9ed4a70f8d71e2 ]---
      [  123.912275] nvme nvme0: I/O 12 QID 0 timeout, disable controller
      [  123.914670] nvme nvme0: 1/0/0 default/read/poll queues
      [  123.916310] BUG: kernel NULL pointer dereference, address: 0000000000000000
      [  123.917469] #PF: supervisor write access in kernel mode
      [  123.917725] #PF: error_code(0x0002) - not-present page
      [  123.917976] PGD 0 P4D 0
      [  123.918109] Oops: 0002 [#1] SMP PTI
      [  123.918283] CPU: 0 PID: 7 Comm: kworker/u4:0 Tainted: G        W         5.8.0+ #8
      [  123.918650] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-48-gd9c812dda519-p4
      [  123.919219] Workqueue: nvme-reset-wq nvme_reset_work
      [  123.919469] RIP: 0010:__blk_mq_alloc_map_and_request+0x21/0x80
      [  123.919757] Code: 66 0f 1f 84 00 00 00 00 00 41 55 41 54 55 48 63 ee 53 48 8b 47 68 89 ee 48 89 fb 8b4
      [  123.920657] RSP: 0000:ffffa96800043d40 EFLAGS: 00010286
      [  123.920912] RAX: ffff9b87fc4fee40 RBX: ffff9b87fc8cb008 RCX: 0000000000000000
      [  123.921258] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9b87fc618000
      [  123.921602] RBP: 0000000000000000 R08: ffff9b87fdc2c4a0 R09: ffff9b87fc616000
      [  123.921949] R10: 0000000000000000 R11: ffff9b87fffd1500 R12: 0000000000000000
      [  123.922295] R13: 0000000000000000 R14: ffff9b87fc8cb200 R15: ffff9b87fc8cb000
      [  123.922641] FS:  0000000000000000(0000) GS:ffff9b87fdc00000(0000) knlGS:0000000000000000
      [  123.923032] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  123.923312] CR2: 0000000000000000 CR3: 000000003aa0a000 CR4: 00000000000006f0
      [  123.923660] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  123.924007] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  123.924353] Call Trace:
      [  123.924479]  blk_mq_alloc_tag_set+0x137/0x2a0
      [  123.924694]  nvme_reset_work+0xed6/0x12a0
      [  123.924898]  process_one_work+0x1d2/0x390
      [  123.925099]  worker_thread+0x45/0x3b0
      [  123.925280]  ? process_one_work+0x390/0x390
      [  123.925486]  kthread+0xf9/0x130
      [  123.925642]  ? kthread_park+0x80/0x80
      [  123.925825]  ret_from_fork+0x22/0x30
      [  123.926004] Modules linked in:
      [  123.926158] CR2: 0000000000000000
      [  123.926322] ---[ end trace de9ed4a70f8d71e3 ]---
      [  123.926549] RIP: 0010:__blk_mq_alloc_map_and_request+0x21/0x80
      [  123.926832] Code: 66 0f 1f 84 00 00 00 00 00 41 55 41 54 55 48 63 ee 53 48 8b 47 68 89 ee 48 89 fb 8b4
      [  123.927734] RSP: 0000:ffffa96800043d40 EFLAGS: 00010286
      [  123.927989] RAX: ffff9b87fc4fee40 RBX: ffff9b87fc8cb008 RCX: 0000000000000000
      [  123.928336] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9b87fc618000
      [  123.928679] RBP: 0000000000000000 R08: ffff9b87fdc2c4a0 R09: ffff9b87fc616000
      [  123.929025] R10: 0000000000000000 R11: ffff9b87fffd1500 R12: 0000000000000000
      [  123.929370] R13: 0000000000000000 R14: ffff9b87fc8cb200 R15: ffff9b87fc8cb000
      [  123.929715] FS:  0000000000000000(0000) GS:ffff9b87fdc00000(0000) knlGS:0000000000000000
      [  123.930106] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  123.930384] CR2: 0000000000000000 CR3: 000000003aa0a000 CR4: 00000000000006f0
      [  123.930731] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  123.931077] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Co-developed-by: Keith Busch <kbusch@kernel.org>
      Signed-off-by: Tong Zhang <ztong0001@gmail.com>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      7ad92f65
    • Keith Busch's avatar
      nvme: only use power of two io boundaries · e83d776f
      Keith Busch authored
      The kernel requires a power of two for boundaries because that's the
      only way it can efficiently split commands that cross them. A
      controller, however, may report a non-power of two boundary.
      
      The driver had been rounding the controller's value to one the kernel
      can use, but splitting on the wrong boundary provides no benefit on the
      device side, and incurs additional submission overhead from non-optimal
      splits.
      
      Don't provide any boundary hint if the controller's value can't be used
      and log a warning when first scanning a disk's unreported IO boundary.
      Since the chunk sector logic has grown, move it to a separate function.
      
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Keith Busch <kbusch@kernel.org>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      e83d776f
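The rule described in the commit above can be sketched in user-space C. This is only an illustration, not the driver's code: `is_pow2` and `usable_io_boundary` are hypothetical names, and returning 0 to mean "no boundary hint" is an assumption made for the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when n is a nonzero power of two. */
static bool is_pow2(uint32_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical helper: return the boundary (in sectors) to use as a split
 * hint, or 0 (meaning "no hint") when the controller's reported value is
 * not a power of two. No rounding is attempted.
 */
static uint32_t usable_io_boundary(uint32_t reported_sectors)
{
    if (!is_pow2(reported_sectors))
        return 0;
    return reported_sectors;
}
```

With this sketch, a reported boundary of 256 sectors is passed through unchanged, while 192 sectors yields no hint at all instead of being rounded to a boundary the device does not actually have.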
    • Keith Busch's avatar
      nvme: fix controller instance leak · 192f6c29
      Keith Busch authored
      If the driver has to unbind from the controller for an early failure
      before the subsystem has been set up, there won't be a subsystem holding
      the controller's instance, so the controller needs to free its own
      instance in this case.
      
      Fixes: 733e4b69 ("nvme: Assign subsys instance from first ctrl")
      Signed-off-by: Keith Busch <kbusch@kernel.org>
      Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      192f6c29
    • Christophe JAILLET's avatar
      nvmet-fc: Fix a missed _irqsave version of spin_lock in 'nvmet_fc_fod_op_done()' · 70e37988
      Christophe JAILLET authored
      The way 'spin_lock()' and 'spin_lock_irqsave()' are used is not consistent
      in this function.
      
      Use 'spin_lock_irqsave()' also here, as there is no guarantee that
      interrupts are disabled at that point, according to surrounding code.
      
      Fixes: a97ec51b ("nvmet_fc: Rework target side abort handling")
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      70e37988
    • Sagi Grimberg's avatar
      nvme: Fix NULL dereference for pci nvme controllers · 7cd49f75
      Sagi Grimberg authored
      PCIe controllers do not have fabric opts, verify they exist before
      showing ctrl_loss_tmo or reconnect_delay attributes.
      
      Fixes: 764075fd ("nvme: expose reconnect_delay and ctrl_loss_tmo via sysfs")
      Reported-by: Tobias Markus <tobias@markus-regensburg.de>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      7cd49f75
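The check described in the commit above amounts to hiding fabrics-only attributes when the controller has no fabric options. A minimal user-space sketch follows; the struct layouts, the `attr_id` enum, and `attr_is_visible` are hypothetical stand-ins, not the kernel's sysfs API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the controller and its fabric options. */
struct fabric_opts {
    int reconnect_delay;
    int ctrl_loss_tmo;
};

struct ctrl {
    const struct fabric_opts *opts; /* NULL for a PCIe controller in this sketch */
};

enum attr_id { ATTR_MODEL, ATTR_RECONNECT_DELAY, ATTR_CTRL_LOSS_TMO };

/* Fabrics-only attributes are visible only when fabric options exist. */
static bool attr_is_visible(const struct ctrl *c, enum attr_id a)
{
    bool fabrics_only = (a == ATTR_RECONNECT_DELAY || a == ATTR_CTRL_LOSS_TMO);

    if (fabrics_only && c->opts == NULL)
        return false;
    return true;
}
```

Deciding visibility up front means the show routine for these attributes never runs against a NULL `opts` pointer.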
    • Sagi Grimberg's avatar
      nvme-rdma: fix reset hang if controller died in the middle of a reset · 2362acb6
      Sagi Grimberg authored
      If the controller becomes unresponsive in the middle of a reset, we
      will hang because we are waiting for the freeze to complete, but that
      cannot happen since we have commands that are inflight holding the
      q_usage_counter, and we can't blindly fail requests that time out.
      
      So give a timeout and if we cannot wait for queue freeze before
      unfreezing, fail and let the error handling decide how to
      proceed (either schedule a reconnect or remove the controller).
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      2362acb6
    • Sagi Grimberg's avatar
      nvme-rdma: fix timeout handler · 0475a8dc
      Sagi Grimberg authored
      When a request times out in a LIVE state, we simply trigger error
      recovery and let the error recovery handle the request cancellation,
      however when a request times out in a non LIVE state, we make sure to
      complete it immediately as it might block controller setup or teardown
      and prevent forward progress.
      
      However tearing down the entire set of I/O and admin queues causes
      freeze/unfreeze imbalance (q->mq_freeze_depth) and is really overkill
      for what we actually need, which is to just fence controller
      teardown that may be running, stop the queue, and cancel the request if
      it is not already completed.
      
      Now that we have the controller teardown_lock, we can safely serialize
      request cancellation. This addresses a hang caused by calling extra
      queue freeze on controller namespaces, causing unfreeze to not complete
      correctly.
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      0475a8dc
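The decision the timeout handler makes, as described in the commit above, can be reduced to a small sketch. The enum values and `on_request_timeout` are hypothetical names chosen for illustration; the real handler also fences teardown and stops the queue, which is not modelled here.

```c
/* Hypothetical controller states and handler outcomes for this sketch. */
enum ctrl_state { CTRL_LIVE, CTRL_RESETTING, CTRL_CONNECTING, CTRL_DELETING };
enum timeout_action { START_ERROR_RECOVERY, COMPLETE_REQUEST_NOW };

/*
 * In a LIVE state, error recovery is triggered and takes care of
 * cancelling the request. In any other state the request is completed
 * immediately, because it may be blocking controller setup or teardown.
 */
static enum timeout_action on_request_timeout(enum ctrl_state state)
{
    if (state == CTRL_LIVE)
        return START_ERROR_RECOVERY;
    return COMPLETE_REQUEST_NOW;
}
```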
    • Sagi Grimberg's avatar
      nvme-rdma: serialize controller teardown sequences · 5110f402
      Sagi Grimberg authored
      In the timeout handler we may need to complete a request because the
      request that timed out may be an I/O that is a part of a serial sequence
      of controller teardown or initialization. In order to complete the
      request, we need to fence any other context that may compete with us
      and complete the request that is timing out.
      
      In this case, we could have a potential double completion in case
      a hard-irq or a different competing context triggered error recovery
      and is running inflight request cancellation concurrently with the
      timeout handler.
      
      Protect using a ctrl teardown_lock to serialize contexts that may
      complete a cancelled request due to error recovery or a reset.
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      5110f402
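The double-completion hazard described in the commit above can be illustrated with a user-space analogue. This sketch uses a POSIX mutex in place of the kernel lock; the `request` and `ctrl` structs and `cancel_request` are hypothetical, and the `completions` counter exists only to make the effect observable.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical request: 'completions' counts how often it was completed. */
struct request {
    bool completed;
    int completions;
};

/* Hypothetical controller holding the lock that serializes teardown. */
struct ctrl {
    pthread_mutex_t teardown_lock;
};

/*
 * Complete the request unless another context already did. Both the
 * timeout handler and error recovery would call this, so the check and
 * the completion happen under the same lock. Returns true if this call
 * performed the completion.
 */
static bool cancel_request(struct ctrl *c, struct request *rq)
{
    bool did_complete = false;

    pthread_mutex_lock(&c->teardown_lock);
    if (!rq->completed) {
        rq->completed = true;
        rq->completions++;
        did_complete = true;
    }
    pthread_mutex_unlock(&c->teardown_lock);
    return did_complete;
}
```

Whichever context takes the lock first completes the request; the second caller sees `completed` already set and does nothing.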
    • Sagi Grimberg's avatar
      nvme-tcp: fix reset hang if controller died in the middle of a reset · e5c01f4f
      Sagi Grimberg authored
      If the controller becomes unresponsive in the middle of a reset, we will
      hang because we are waiting for the freeze to complete, but that cannot
      happen since we have commands that are inflight holding the
      q_usage_counter, and we can't blindly fail requests that time out.
      
      So give a timeout and if we cannot wait for queue freeze before
      unfreezing, fail and let the error handling decide how to proceed
      (either schedule a reconnect or remove the controller).
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      e5c01f4f
    • Sagi Grimberg's avatar
      nvme-tcp: fix timeout handler · 236187c4
      Sagi Grimberg authored
      When a request times out in a LIVE state, we simply trigger error
      recovery and let the error recovery handle the request cancellation,
      however when a request times out in a non LIVE state, we make sure to
      complete it immediately as it might block controller setup or teardown
      and prevent forward progress.
      
      However tearing down the entire set of I/O and admin queues causes
      freeze/unfreeze imbalance (q->mq_freeze_depth) and is really overkill
      for what we actually need, which is to just fence controller
      teardown that may be running, stop the queue, and cancel the request if
      it is not already completed.
      
      Now that we have the controller teardown_lock, we can safely serialize
      request cancellation. This addresses a hang caused by calling extra
      queue freeze on controller namespaces, causing unfreeze to not complete
      correctly.
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      236187c4
    • Sagi Grimberg's avatar
      nvme-tcp: serialize controller teardown sequences · d4d61470
      Sagi Grimberg authored
      In the timeout handler we may need to complete a request because the
      request that timed out may be an I/O that is a part of a serial sequence
      of controller teardown or initialization. In order to complete the
      request, we need to fence any other context that may compete with us
      and complete the request that is timing out.
      
      In this case, we could have a potential double completion in case
      a hard-irq or a different competing context triggered error recovery
      and is running inflight request cancellation concurrently with the
      timeout handler.
      
      Protect using a ctrl teardown_lock to serialize contexts that may
      complete a cancelled request due to error recovery or a reset.
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      d4d61470
    • Sagi Grimberg's avatar
      nvme: have nvme_wait_freeze_timeout return if it timed out · 7cf0d7c0
      Sagi Grimberg authored
      Users can detect if the wait has completed or not and take appropriate
      actions based on this information (e.g. whether to continue
      initialization or rather fail and schedule another initialization
      attempt).
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      7cf0d7c0
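The value of returning the wait result, as the commit above describes, shows up on the caller's side. The following is a simulated sketch, not the kernel function: `struct queue`, its `drain_per_tick` field, and `wait_freeze_timeout` are hypothetical, and time is modelled as a tick budget. The return convention (0 on timeout, a positive remaining budget otherwise) is an assumption modelled on common kernel wait helpers.

```c
/* Hypothetical queue: 'inflight' requests drain at 'drain_per_tick'. */
struct queue {
    int inflight;
    int drain_per_tick; /* 0 models an unresponsive controller */
};

/*
 * Wait for the queue to drain within a tick budget.
 * Returns 0 if the budget ran out with requests still in flight,
 * otherwise the remaining budget (at least 1).
 */
static long wait_freeze_timeout(struct queue *q, long timeout_ticks)
{
    while (q->inflight > 0 && timeout_ticks > 0) {
        q->inflight -= q->drain_per_tick;
        if (q->inflight < 0)
            q->inflight = 0;
        timeout_ticks--;
    }
    if (q->inflight > 0)
        return 0;
    return timeout_ticks > 0 ? timeout_ticks : 1;
}
```

A caller can then branch on the result: a nonzero value means the freeze completed and initialization may continue, while 0 means the caller should fail and leave the next step to error handling.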
    • Sagi Grimberg's avatar
      nvme-fabrics: don't check state NVME_CTRL_NEW for request acceptance · d7144f5c
      Sagi Grimberg authored
      NVME_CTRL_NEW should never see any I/O, because in order to start
      initialization it has to transition to NVME_CTRL_CONNECTING and from
      there it will never return to this state.
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      d7144f5c
    • Ziye Yang's avatar
      nvmet-tcp: Fix NULL dereference when a connect data comes in h2cdata pdu · a6ce7d7b
      Ziye Yang authored
      When handling commands without in-capsule data, we assign the ttag
      assuming we already have the queue commands array allocated (based
      on the queue size information in the connect data payload). However
      if the connect itself did not send the connect data in-capsule we
      have yet to allocate the queue commands, and we will assign a bogus
      ttag and suffer a NULL dereference when we receive the corresponding
      h2cdata pdu.
      
      Fix this by checking if we already allocated commands before
      dereferencing them when handling h2cdata; if we didn't, it's for sure a
      connect and we should use the preallocated connect command.
      Signed-off-by: Ziye Yang <ziye.yang@intel.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      a6ce7d7b
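The lookup fix described in the commit above can be sketched as follows. The `cmd` and `queue` structs and `cmd_for_h2cdata` are hypothetical simplifications of the target's data structures, and the bounds check on `ttag` is an extra safeguard added for this sketch, not something the commit message mentions.

```c
#include <stddef.h>

/* Hypothetical command and queue types for this sketch. */
struct cmd {
    int id;
};

struct queue {
    struct cmd *cmds;   /* NULL until the connect data has been processed */
    size_t nr_cmds;
    struct cmd connect; /* preallocated connect command */
};

/*
 * Pick the command an incoming h2cdata PDU refers to. If the commands
 * array has not been allocated yet, the PDU can only belong to the
 * connect, so the preallocated connect command is used instead of
 * indexing a NULL array.
 */
static struct cmd *cmd_for_h2cdata(struct queue *q, size_t ttag)
{
    if (q->cmds == NULL)
        return &q->connect;
    if (ttag >= q->nr_cmds) /* sketch-only safeguard against a bogus ttag */
        return NULL;
    return &q->cmds[ttag];
}
```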
    • Linus Torvalds's avatar
      Merge tag 'block-5.9-2020-08-28' of git://git.kernel.dk/linux-block · 4d41ead6
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - nbd timeout fix (Hou)
      
       - device size fix for loop LOOP_CONFIGURE (Martijn)
      
       - MD pull from Song with raid5 stripe size fix (Yufen)
      
      * tag 'block-5.9-2020-08-28' of git://git.kernel.dk/linux-block:
        md/raid5: make sure stripe_size as power of two
        loop: Set correct device size when using LOOP_CONFIGURE
        nbd: restore default timeout when setting it to zero
      4d41ead6