1. 14 Mar, 2020 3 commits
    • Suravee Suthikulpanit's avatar
      iommu/amd: Fix IOMMU AVIC not properly update the is_run bit in IRTE · 730ad0ed
      Suravee Suthikulpanit authored
      Commit b9c6ff94 ("iommu/amd: Re-factor guest virtual APIC
      (de-)activation code") accidentally left out the ir_data pointer when
      calling modity_irte_ga(), which causes the function amd_iommu_update_ga()
      to return prematurely due to struct amd_ir_data.ref is NULL and
      the "is_run" bit of IRTE does not get updated properly.
      
      This results in bad I/O performance since IOMMU AVIC always generate GA Log
      entry and notify IOMMU driver and KVM when it receives interrupt from the
      PCI pass-through device instead of directly inject interrupt to the vCPU.
      
      Fixes by passing ir_data when calling modify_irte_ga() as done previously.
      
      Fixes: b9c6ff94 ("iommu/amd: Re-factor guest virtual APIC (de-)activation code")
      Signed-off-by: default avatarSuravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      730ad0ed
    • Daniel Drake's avatar
      iommu/vt-d: Ignore devices with out-of-spec domain number · da72a379
      Daniel Drake authored
      VMD subdevices are created with a PCI domain ID of 0x10000 or
      higher.
      
      These subdevices are also handled like all other PCI devices by
      dmar_pci_bus_notifier().
      
      However, when dmar_alloc_pci_notify_info() take records of such devices,
      it will truncate the domain ID to a u16 value (in info->seg).
      The device at (e.g.) 10000:00:02.0 is then treated by the DMAR code as if
      it is 0000:00:02.0.
      
      In the unlucky event that a real device also exists at 0000:00:02.0 and
      also has a device-specific entry in the DMAR table,
      dmar_insert_dev_scope() will crash on:
         BUG_ON(i >= devices_cnt);
      
      That's basically a sanity check that only one PCI device matches a
      single DMAR entry; in this case we seem to have two matching devices.
      
      Fix this by ignoring devices that have a domain number higher than
      what can be looked up in the DMAR table.
      
      This problem was carefully diagnosed by Jian-Hong Pan.
      Signed-off-by: default avatarLu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: default avatarDaniel Drake <drake@endlessm.com>
      Fixes: 59ce0515 ("iommu/vt-d: Update DRHD/RMRR/ATSR device scope caches when PCI hotplug happens")
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      da72a379
    • Zhenzhong Duan's avatar
      iommu/vt-d: Fix the wrong printing in RHSA parsing · b0bb0c22
      Zhenzhong Duan authored
      When base address in RHSA structure doesn't match base address in
      each DRHD structure, the base address in last DRHD is printed out.
      
      This doesn't make sense when there are multiple DRHD units, fix it
      by printing the buggy RHSA's base address.
      Signed-off-by: default avatarLu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: default avatarZhenzhong Duan <zhenzhong.duan@gmail.com>
      Fixes: fd0c8894 ("intel-iommu: Set a more specific taint flag for invalid BIOS DMAR tables")
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      b0bb0c22
  2. 13 Mar, 2020 4 commits
  3. 10 Mar, 2020 2 commits
    • Qian Cai's avatar
      iommu/vt-d: Silence RCU-list debugging warnings · f5152416
      Qian Cai authored
      Similar to the commit 02d715b4 ("iommu/vt-d: Fix RCU list debugging
      warnings"), there are several other places that call
      list_for_each_entry_rcu() outside of an RCU read side critical section
      but with dmar_global_lock held. Silence those false positives as well.
      
       drivers/iommu/intel-iommu.c:4288 RCU-list traversed in non-reader section!!
       1 lock held by swapper/0/1:
        #0: ffffffff935892c8 (dmar_global_lock){+.+.}, at: intel_iommu_init+0x1ad/0xb97
      
       drivers/iommu/dmar.c:366 RCU-list traversed in non-reader section!!
       1 lock held by swapper/0/1:
        #0: ffffffff935892c8 (dmar_global_lock){+.+.}, at: intel_iommu_init+0x125/0xb97
      
       drivers/iommu/intel-iommu.c:5057 RCU-list traversed in non-reader section!!
       1 lock held by swapper/0/1:
        #0: ffffffffa71892c8 (dmar_global_lock){++++}, at: intel_iommu_init+0x61a/0xb13
      Signed-off-by: default avatarQian Cai <cai@lca.pw>
      Acked-by: default avatarLu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      f5152416
    • Qian Cai's avatar
      iommu/vt-d: Fix RCU-list bugs in intel_iommu_init() · 2d48ea0e
      Qian Cai authored
      There are several places traverse RCU-list without holding any lock in
      intel_iommu_init(). Fix them by acquiring dmar_global_lock.
      
       WARNING: suspicious RCU usage
       -----------------------------
       drivers/iommu/intel-iommu.c:5216 RCU-list traversed in non-reader section!!
      
       other info that might help us debug this:
      
       rcu_scheduler_active = 2, debug_locks = 1
       no locks held by swapper/0/1.
      
       Call Trace:
        dump_stack+0xa0/0xea
        lockdep_rcu_suspicious+0x102/0x10b
        intel_iommu_init+0x947/0xb13
        pci_iommu_init+0x26/0x62
        do_one_initcall+0xfe/0x500
        kernel_init_freeable+0x45a/0x4f8
        kernel_init+0x11/0x139
        ret_from_fork+0x3a/0x50
       DMAR: Intel(R) Virtualization Technology for Directed I/O
      
      Fixes: d8190dc6 ("iommu/vt-d: Enable DMA remapping after rmrr mapped")
      Signed-off-by: default avatarQian Cai <cai@lca.pw>
      Acked-by: default avatarLu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      2d48ea0e
  4. 04 Mar, 2020 1 commit
    • Marc Zyngier's avatar
      iommu/dma: Fix MSI reservation allocation · 65ac74f1
      Marc Zyngier authored
      The way cookie_init_hw_msi_region() allocates the iommu_dma_msi_page
      structures doesn't match the way iommu_put_dma_cookie() frees them.
      
      The former performs a single allocation of all the required structures,
      while the latter tries to free them one at a time. It doesn't quite
      work for the main use case (the GICv3 ITS where the range is 64kB)
      when the base granule size is 4kB.
      
      This leads to a nice slab corruption on teardown, which is easily
      observable by simply creating a VF on a SRIOV-capable device, and
      tearing it down immediately (no need to even make use of it).
      Fortunately, this only affects systems where the ITS isn't translated
      by the SMMU, which are both rare and non-standard.
      
      Fix it by allocating iommu_dma_msi_page structures one at a time.
      
      Fixes: 7c1b058c ("iommu/dma: Handle IOMMU API reserved regions")
      Signed-off-by: default avatarMarc Zyngier <maz@kernel.org>
      Reviewed-by: default avatarEric Auger <eric.auger@redhat.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: stable@vger.kernel.org
      Reviewed-by: default avatarRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      65ac74f1
  5. 02 Mar, 2020 3 commits
  6. 01 Mar, 2020 5 commits
  7. 29 Feb, 2020 4 commits
    • Dan Carpenter's avatar
      ext4: potential crash on allocation error in ext4_alloc_flex_bg_array() · 37b0b6b8
      Dan Carpenter authored
      If sbi->s_flex_groups_allocated is zero and the first allocation fails
      then this code will crash.  The problem is that "i--" will set "i" to
      -1 but when we compare "i >= sbi->s_flex_groups_allocated" then the -1
      is type promoted to unsigned and becomes UINT_MAX.  Since UINT_MAX
      is more than zero, the condition is true so we call kvfree(new_groups[-1]).
      The loop will carry on freeing invalid memory until it crashes.
      
      Fixes: 7c990728 ("ext4: fix potential race between s_flex_groups online resizing and access")
      Reviewed-by: default avatarSuraj Jitindar Singh <surajjs@amazon.com>
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Cc: stable@kernel.org
      Link: https://lore.kernel.org/r/20200228092142.7irbc44yaz3by7nb@kili.mountainSigned-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      37b0b6b8
    • Wolfram Sang's avatar
      macintosh: therm_windtunnel: fix regression when instantiating devices · 38b17afb
      Wolfram Sang authored
      Removing attach_adapter from this driver caused a regression for at
      least some machines. Those machines had the sensors described in their
      DT, too, so they didn't need manual creation of the sensor devices. The
      old code worked, though, because manual creation came first. Creation of
      DT devices then failed later and caused error logs, but the sensors
      worked nonetheless because of the manually created devices.
      
      When removing attach_adaper, manual creation now comes later and loses
      the race. The sensor devices were already registered via DT, yet with
      another binding, so the driver could not be bound to it.
      
      This fix refactors the code to remove the race and only manually creates
      devices if there are no DT nodes present. Also, the DT binding is updated
      to match both, the DT and manually created devices. Because we don't
      know which device creation will be used at runtime, the code to start
      the kthread is moved to do_probe() which will be called by both methods.
      
      Fixes: 3e7bed52 ("macintosh: therm_windtunnel: drop using attach_adapter")
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=201723Reported-by: default avatarErhard Furtner <erhard_f@mailbox.org>
      Tested-by: default avatarErhard Furtner <erhard_f@mailbox.org>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Signed-off-by: default avatarWolfram Sang <wsa@the-dreams.de>
      Cc: stable@kernel.org # v4.19+
      38b17afb
    • Qian Cai's avatar
      jbd2: fix data races at struct journal_head · 6c5d9112
      Qian Cai authored
      journal_head::b_transaction and journal_head::b_next_transaction could
      be accessed concurrently as noticed by KCSAN,
      
       LTP: starting fsync04
       /dev/zero: Can't open blockdev
       EXT4-fs (loop0): mounting ext3 file system using the ext4 subsystem
       EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
       ==================================================================
       BUG: KCSAN: data-race in __jbd2_journal_refile_buffer [jbd2] / jbd2_write_access_granted [jbd2]
      
       write to 0xffff99f9b1bd0e30 of 8 bytes by task 25721 on cpu 70:
        __jbd2_journal_refile_buffer+0xdd/0x210 [jbd2]
        __jbd2_journal_refile_buffer at fs/jbd2/transaction.c:2569
        jbd2_journal_commit_transaction+0x2d15/0x3f20 [jbd2]
        (inlined by) jbd2_journal_commit_transaction at fs/jbd2/commit.c:1034
        kjournald2+0x13b/0x450 [jbd2]
        kthread+0x1cd/0x1f0
        ret_from_fork+0x27/0x50
      
       read to 0xffff99f9b1bd0e30 of 8 bytes by task 25724 on cpu 68:
        jbd2_write_access_granted+0x1b2/0x250 [jbd2]
        jbd2_write_access_granted at fs/jbd2/transaction.c:1155
        jbd2_journal_get_write_access+0x2c/0x60 [jbd2]
        __ext4_journal_get_write_access+0x50/0x90 [ext4]
        ext4_mb_mark_diskspace_used+0x158/0x620 [ext4]
        ext4_mb_new_blocks+0x54f/0xca0 [ext4]
        ext4_ind_map_blocks+0xc79/0x1b40 [ext4]
        ext4_map_blocks+0x3b4/0x950 [ext4]
        _ext4_get_block+0xfc/0x270 [ext4]
        ext4_get_block+0x3b/0x50 [ext4]
        __block_write_begin_int+0x22e/0xae0
        __block_write_begin+0x39/0x50
        ext4_write_begin+0x388/0xb50 [ext4]
        generic_perform_write+0x15d/0x290
        ext4_buffered_write_iter+0x11f/0x210 [ext4]
        ext4_file_write_iter+0xce/0x9e0 [ext4]
        new_sync_write+0x29c/0x3b0
        __vfs_write+0x92/0xa0
        vfs_write+0x103/0x260
        ksys_write+0x9d/0x130
        __x64_sys_write+0x4c/0x60
        do_syscall_64+0x91/0xb05
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
       5 locks held by fsync04/25724:
        #0: ffff99f9911093f8 (sb_writers#13){.+.+}, at: vfs_write+0x21c/0x260
        #1: ffff99f9db4c0348 (&sb->s_type->i_mutex_key#15){+.+.}, at: ext4_buffered_write_iter+0x65/0x210 [ext4]
        #2: ffff99f5e7dfcf58 (jbd2_handle){++++}, at: start_this_handle+0x1c1/0x9d0 [jbd2]
        #3: ffff99f9db4c0168 (&ei->i_data_sem){++++}, at: ext4_map_blocks+0x176/0x950 [ext4]
        #4: ffffffff99086b40 (rcu_read_lock){....}, at: jbd2_write_access_granted+0x4e/0x250 [jbd2]
       irq event stamp: 1407125
       hardirqs last  enabled at (1407125): [<ffffffff980da9b7>] __find_get_block+0x107/0x790
       hardirqs last disabled at (1407124): [<ffffffff980da8f9>] __find_get_block+0x49/0x790
       softirqs last  enabled at (1405528): [<ffffffff98a0034c>] __do_softirq+0x34c/0x57c
       softirqs last disabled at (1405521): [<ffffffff97cc67a2>] irq_exit+0xa2/0xc0
      
       Reported by Kernel Concurrency Sanitizer on:
       CPU: 68 PID: 25724 Comm: fsync04 Tainted: G L 5.6.0-rc2-next-20200221+ #7
       Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
      
      The plain reads are outside of jh->b_state_lock critical section which result
      in data races. Fix them by adding pairs of READ|WRITE_ONCE().
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarQian Cai <cai@lca.pw>
      Link: https://lore.kernel.org/r/20200222043111.2227-1-cai@lca.pwSigned-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      6c5d9112
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 7557c1b3
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Four small fixes.
      
        Three are in drivers for fairly obvious bugs. The fourth is a set of
        regressions introduced by the compat_ioctl changes because some of the
        compat updates wrongly replaced .ioctl instead of .compat_ioctl"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: compat_ioctl: cdrom: Replace .ioctl with .compat_ioctl in four appropriate places
        scsi: zfcp: fix wrong data and display format of SFP+ temperature
        scsi: sd_sbc: Fix sd_zbc_report_zones()
        scsi: libfc: free response frame from GPN_ID
      7557c1b3
  8. 28 Feb, 2020 18 commits