1. 09 Sep, 2020 40 commits
    • mm/hugetlb: fix a race between hugetlb sysctl handlers · 221ea9a3
      Muchun Song authored
      commit 17743798 upstream.
      
      There is a race between one thread assigning `table->data` and another
      thread writing a value through the `table->data` pointer in
      __do_proc_doulongvec_minmax().
      
        CPU0:                                 CPU1:
                                              proc_sys_write
        hugetlb_sysctl_handler                  proc_sys_call_handler
        hugetlb_sysctl_handler_common             hugetlb_sysctl_handler
          table->data = &tmp;                       hugetlb_sysctl_handler_common
                                                      table->data = &tmp;
            proc_doulongvec_minmax
              do_proc_doulongvec_minmax           sysctl_head_finish
                __do_proc_doulongvec_minmax         unuse_table
                  i = table->data;
                  *i = val;  // corrupt CPU1's stack
      
      Fix this by duplicating the `table` and updating only the duplicate.
      Also introduce a helper, proc_hugetlb_doulongvec_minmax(), to
      simplify the code.
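
A minimal userspace sketch of the "duplicate the table" idea described above. It is not the kernel code: `struct ctl_table_sketch`, `store_ulong()` and `handler_with_dup_table()` are hypothetical stand-ins chosen for illustration, and the real helper's signature may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for the kernel's struct ctl_table. */
struct ctl_table_sketch {
    void *data;
    size_t maxlen;
};

/* Stand-in for proc_doulongvec_minmax(): writes val through table->data. */
static int store_ulong(const struct ctl_table_sketch *table, unsigned long val)
{
    assert(table != NULL && table->data != NULL);
    *(unsigned long *)table->data = val;
    return 0;
}

/*
 * Handler in the style of the fix: the shared table is never modified.
 * A private copy is pointed at this call's own stack variable instead,
 * so a concurrent caller cannot be redirected to our stack.
 */
static int handler_with_dup_table(const struct ctl_table_sketch *shared,
                                  unsigned long val, unsigned long *out)
{
    struct ctl_table_sketch dup = *shared; /* private duplicate */
    unsigned long tmp = 0;
    int ret;

    dup.data = &tmp;
    dup.maxlen = sizeof(tmp);
    ret = store_ulong(&dup, val);
    if (ret == 0)
        *out = tmp;
    return ret;
}
```

Because only `dup` is written, the shared table's `data` pointer stays untouched no matter how many callers run concurrently.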
      
      The following oops was seen:
      
          BUG: kernel NULL pointer dereference, address: 0000000000000000
          #PF: supervisor instruction fetch in kernel mode
          #PF: error_code(0x0010) - not-present page
          Code: Bad RIP value.
          ...
          Call Trace:
           ? set_max_huge_pages+0x3da/0x4f0
           ? alloc_pool_huge_page+0x150/0x150
           ? proc_doulongvec_minmax+0x46/0x60
           ? hugetlb_sysctl_handler_common+0x1c7/0x200
           ? nr_hugepages_store+0x20/0x20
           ? copy_fd_bitmaps+0x170/0x170
           ? hugetlb_sysctl_handler+0x1e/0x20
           ? proc_sys_call_handler+0x2f1/0x300
           ? unregister_sysctl_table+0xb0/0xb0
           ? __fd_install+0x78/0x100
           ? proc_sys_write+0x14/0x20
           ? __vfs_write+0x4d/0x90
           ? vfs_write+0xef/0x240
           ? ksys_write+0xc0/0x160
           ? __ia32_sys_read+0x50/0x50
           ? __close_fd+0x129/0x150
           ? __x64_sys_write+0x43/0x50
           ? do_syscall_64+0x6c/0x200
           ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: e5ff2159 ("hugetlb: multiple hstates for multiple page sizes")
      Signed-off-by: Muchun Song <songmuchun@bytedance.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Link: http://lkml.kernel.org/r/20200828031146.43035-1-songmuchun@bytedance.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • checkpatch: fix the usage of capture group ( ... ) · c5927539
      Mrinal Pandey authored
      commit 13e45417 upstream.
      
      The usage of "capture group (...)" in the immediate condition after `&&`
      results in `$1` being uninitialized.  This issues a warning "Use of
      uninitialized value $1 in regexp compilation at ./scripts/checkpatch.pl
      line 2638".
      
      I noticed this bug while running checkpatch on the set of commits from
      v5.7 to v5.8-rc1 of the kernel on the commits with a diff content in
      their commit message.
      
      This bug was introduced in the script by commit e518e9a5
      ("checkpatch: emit an error when there's a diff in a changelog").  It
      has been in the script since then.
      
      The author intended to store the match made by capture group in variable
      `$1`.  This should have contained the name of the file as `[\w/]+`
      matched.  However, this couldn't be accomplished due to usage of capture
      group and `$1` in the same regular expression.
      
      Fix this by placing the capture group in the condition before `&&`.
      Thus, `$1` can be initialized to the text that capture group matches
      thereby setting it to the desired and required value.
      
      Fixes: e518e9a5 ("checkpatch: emit an error when there's a diff in a changelog")
      Signed-off-by: Mrinal Pandey <mrinalmni@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Tested-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
      Reviewed-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
      Cc: Joe Perches <joe@perches.com>
      Link: https://lkml.kernel.org/r/20200714032352.f476hanaj2dlmiot@mrinalpandey
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • vfio/pci: Fix SR-IOV VF handling with MMIO blocking · 81fb3459
      Alex Williamson authored
      commit ebfa440c upstream.
      
      SR-IOV VFs do not implement the memory enable bit of the command
      register, therefore this bit is not set in config space after
      pci_enable_device().  This leads to an unintended difference
      between PF and VF in hand-off state to the user.  We can correct
      this by setting the initial value of the memory enable bit in our
      virtualized config space.  However, there's really no need to
      ever fault a user on a VF, as this would only indicate an
      error in the user's management of the enable bit, versus a PF
      where the same access could trigger hardware faults.
      
      Fixes: abafbc55 ("vfio-pci: Invalidate mmaps and block MMIO access on disabled memory")
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • KVM: arm64: Set HCR_EL2.PTW to prevent AT taking synchronous exception · dcaf364f
      James Morse authored
      commit 71a7f8cb upstream.
      
      AT instructions do a translation table walk and return the result, or
      the fault in PAR_EL1. KVM uses these to find the IPA when the value is
      not provided by the CPU in HPFAR_EL1.
      
      If a translation table walk causes an external abort it is taken as an
      exception, even if it was due to an AT instruction. (DDI0487F.a's D5.2.11
      "Synchronous faults generated by address translation instructions")
      
      While we previously made KVM resilient to exceptions taken due to AT
      instructions, the device access causes mismatched attributes, and may
      occur speculatively. Prevent this, by forbidding a walk through memory
      described as device at stage2. Now such AT instructions will report a
      stage2 fault.
      
      Such a fault will cause KVM to restart the guest. If the AT instructions
      always walk the page tables, but guest execution uses the translation cached
      in the TLB, the guest can't make forward progress until the TLB entry is
      evicted. This isn't a problem, as since commit 5dcd0fdb ("KVM: arm64:
      Defer guest entry when an asynchronous exception is pending"), KVM will
      return to the host to process IRQs allowing the rest of the system to keep
      running.
      
      Cc: stable@vger.kernel.org # v4.19
      Signed-off-by: James Morse <james.morse@arm.com>
      Reviewed-by: Marc Zyngier <maz@kernel.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Andre Przywara <andre.przywara@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • KVM: arm64: Survive synchronous exceptions caused by AT instructions · 4df1ff5f
      James Morse authored
      commit 88a84ccc upstream.
      
      KVM doesn't expect any synchronous exceptions when executing; any such
      exception leads to a panic(). AT instructions access the guest page
      tables, and can cause a synchronous external abort to be taken.
      
      The arm-arm is unclear on what should happen if the guest has configured
      the hardware update of the access-flag, and a memory type in TCR_EL1 that
      does not support atomic operations. B2.2.6 "Possible implementation
      restrictions on using atomic instructions" from DDI0487F.a lists
      synchronous external abort as a possible behaviour of atomic instructions
      that target memory that isn't writeback cacheable, but the page table
      walker may behave differently.
      
      Make KVM robust to synchronous exceptions caused by AT instructions.
      Add a get_user() style helper for AT instructions that returns -EFAULT
      if an exception was generated.
      
      While KVM's version of the exception table mixes synchronous and
      asynchronous exceptions, only one of these can occur at each location.
      
      Re-enter the guest when the AT instructions take an exception on the
      assumption the guest will take the same exception. This isn't guaranteed
      to make forward progress, as the AT instructions may always walk the page
      tables, but guest execution may use the translation cached in the TLB.
      
      This isn't a problem, as since commit 5dcd0fdb ("KVM: arm64: Defer guest
      entry when an asynchronous exception is pending"), KVM will return to the
      host to process IRQs allowing the rest of the system to keep running.
      
      Cc: stable@vger.kernel.org # v4.19
      Signed-off-by: James Morse <james.morse@arm.com>
      Reviewed-by: Marc Zyngier <maz@kernel.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Andre Przywara <andre.przywara@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • KVM: arm64: Defer guest entry when an asynchronous exception is pending · 204f3831
      James Morse authored
      commit 5dcd0fdb upstream.
      
      SError that occur during world-switch's entry to the guest will be
      accounted to the guest, as the exception is masked until we enter the
      guest... but we want to attribute the SError as precisely as possible.
      
      Reading DISR_EL1 before guest entry requires free registers, and using
      ESB+DISR_EL1 to consume and read back the ESR would leave KVM holding
      a host SError... We would rather leave the SError pending and let the
      host take it once we exit world-switch. To do this, we need to defer
      guest-entry if an SError is pending.
      
      Read the ISR to see if SError (or an IRQ) is pending. If so fake an
      exit. Place this check between __guest_enter()'s save of the host
      registers, and restore of the guest's. SError that occur between
      here and the eret into the guest must have affected the guest's
      registers, which we can naturally attribute to the guest.
      
      The dsb is needed to ensure any previous writes have been done before
      we read ISR_EL1. On systems without the v8.2 RAS extensions this
      doesn't give us anything as we can't contain errors, and the ESR bits
      to describe the severity are all implementation-defined. Replace
      this with a nop for these systems.
      
      Cc: stable@vger.kernel.org # v4.19
      Signed-off-by: James Morse <james.morse@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Andre Przywara <andre.przywara@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • KVM: arm64: Add kvm_extable for vaxorcism code · 3290c6ff
      James Morse authored
      commit e9ee186b upstream.
      
      KVM has a one instruction window where it will allow an SError exception
      to be consumed by the hypervisor without treating it as a hypervisor bug.
      This is used to consume asynchronous external aborts that were caused by
      the guest.
      
      As we are about to add another location that survives unexpected exceptions,
      generalise this code to make it behave like the host's extable.
      
      KVM's version has to be mapped to EL2 to be accessible on nVHE systems.
      
      The SError vaxorcism code is a one instruction window, so has two entries
      in the extable. Because the KVM code is copied for VHE and nVHE, we end up
      with four entries, half of which correspond with code that isn't mapped.
      Signed-off-by: James Morse <james.morse@arm.com>
      Reviewed-by: Marc Zyngier <maz@kernel.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Andre Przywara <andre.przywara@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mm: slub: fix conversion of freelist_corrupted() · af2cf2c5
      Eugeniu Rosca authored
      commit dc07a728 upstream.
      
      Commit 52f23478 ("mm/slub.c: fix corrupted freechain in
      deactivate_slab()") suffered an update when picked up from LKML [1].
      
      Specifically, relocating 'freelist = NULL' into 'freelist_corrupted()'
      created a no-op statement.  Fix it by sticking to the behavior intended
      in the original patch [1].  In addition, make freelist_corrupted()
      immune to passing NULL instead of &freelist.
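
The no-op described above is a general C pitfall: assigning to a pointer parameter only changes the callee's local copy. The sketch below illustrates it in userspace. The function names are hypothetical and the `corrupted` flag stands in for the real corruption check, so this is not the slub code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * No-op variant: 'freelist' is passed by value, so setting it to NULL
 * here changes only this function's local copy.
 */
static bool reset_by_value(void *freelist, bool corrupted)
{
    if (!corrupted)
        return false;
    freelist = NULL; /* has no effect on the caller's variable */
    (void)freelist;
    return true;
}

/*
 * Variant in the spirit of the fix: the caller passes &freelist, and a
 * NULL argument is tolerated instead of being dereferenced.
 */
static bool reset_by_pointer(void **freelist, bool corrupted)
{
    if (!corrupted)
        return false;
    if (freelist != NULL)
        *freelist = NULL; /* visible to the caller */
    return true;
}
```

With `reset_by_value()` the caller keeps walking the corrupted list; with `reset_by_pointer()` the caller's freelist really is cleared.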
      
      The issue has been spotted via static analysis and code review.
      
      [1] https://lore.kernel.org/linux-mm/20200331031450.12182-1-dongli.zhang@oracle.com/
      
      Fixes: 52f23478 ("mm/slub.c: fix corrupted freechain in deactivate_slab()")
      Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dongli Zhang <dongli.zhang@oracle.com>
      Cc: Joe Jin <joe.jin@oracle.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20200824130643.10291-1-erosca@de.adit-jv.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dm thin metadata: Avoid returning cmd->bm wild pointer on error · 2c00ee62
      Ye Bin authored
      commit 219403d7 upstream.
      
      A caller of __create_persistent_data_objects() may use the PTR_ERR
      value left in cmd->bm as a pointer, which can lead to unpredictable
      behavior.
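
A hedged userspace sketch of the hazard: an error value encoded as a pointer must not be left in a field that later code treats as a valid pointer. The helpers `err_ptr()`, `ptr_err()` and `is_err()` imitate the kernel's ERR_PTR family, and `struct metadata_sketch` and `create_block_manager()` are hypothetical names, not the dm code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095

/* Userspace imitations of the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(). */
static void *err_ptr(long err) { return (void *)(intptr_t)err; }
static long ptr_err(const void *p) { return (long)(intptr_t)p; }
static int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

/* Hypothetical metadata object holding a block manager pointer. */
struct metadata_sketch {
    void *bm;
};

/* Hypothetical constructor: returns an encoded error when asked to fail. */
static void *create_block_manager(int fail)
{
    static int dummy_bm;
    return fail ? err_ptr(-ENOMEM) : (void *)&dummy_bm;
}

/* On failure, extract the error code and clear the field before returning. */
static int create_objects(struct metadata_sketch *cmd, int fail)
{
    assert(cmd != NULL);
    cmd->bm = create_block_manager(fail);
    if (is_err(cmd->bm)) {
        int r = (int)ptr_err(cmd->bm);
        cmd->bm = NULL; /* never leave an encoded errno behind */
        return r;
    }
    return 0;
}
```

Clearing the field means a later cleanup path that checks `cmd->bm` sees NULL rather than an address-shaped error code.
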
      Signed-off-by: Ye Bin <yebin10@huawei.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dm cache metadata: Avoid returning cmd->bm wild pointer on error · 67f03c3d
      Ye Bin authored
      commit d16ff19e upstream.
      
      A caller of __create_persistent_data_objects() may use the PTR_ERR
      value left in cmd->bm as a pointer, which can lead to unpredictable
      behavior.
      Signed-off-by: Ye Bin <yebin10@huawei.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dm writecache: handle DAX to partitions on persistent memory correctly · 154096e9
      Mikulas Patocka authored
      commit f9e040ef upstream.
      
      The function dax_direct_access() doesn't take partitions into account;
      it always maps pages from the beginning of the device. Therefore,
      persistent_memory_claim() must get the partition offset using
      get_start_sect() and add it to the page offsets passed to
      dax_direct_access().
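
The offset arithmetic can be sketched as follows. This is an illustration under stated assumptions: 512-byte sectors, 4 KiB pages, and a rejection of partitions that do not start on a page boundary. The function name and constants are hypothetical, not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SHIFT_SKETCH 9  /* assumed 512-byte sectors */
#define PAGE_SHIFT_SKETCH   12 /* assumed 4 KiB pages */

/*
 * Convert a partition's start sector into a page offset that can be
 * added to the page offsets used for direct access.
 * Returns 0 on success, -EINVAL if the start is not page aligned.
 */
static int partition_page_offset(uint64_t start_sect, uint64_t *pgoff)
{
    const unsigned int shift = PAGE_SHIFT_SKETCH - SECTOR_SHIFT_SKETCH;
    const uint64_t sectors_per_page = (uint64_t)1 << shift;

    assert(pgoff != NULL);
    if (start_sect & (sectors_per_page - 1))
        return -EINVAL;
    *pgoff = start_sect >> shift;
    return 0;
}
```

For example, a partition starting at sector 2048 begins 256 pages into the device under these assumptions.
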
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Fixes: 48debafe ("dm: add writecache target")
      Cc: stable@vger.kernel.org # 4.18+
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • libata: implement ATA_HORKAGE_MAX_TRIM_128M and apply to Sandisks · a8bb7740
      Tejun Heo authored
      commit 3b545563 upstream.
      
      All three generations of Sandisk SSDs lock up hard intermittently.
      Experiments showed that disabling NCQ lowered the failure rate significantly
      and the kernel has been disabling NCQ for some models of SD7's and 8's,
      which is obviously undesirable.
      
      Karthik worked with Sandisk to root cause the hard lockups to trim commands
      larger than 128M. This patch implements ATA_HORKAGE_MAX_TRIM_128M which
      limits max trim size to 128M and applies it to all three generations of
      Sandisk SSDs.
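
To make the 128M limit concrete, here is a small sketch of how a large trim would be divided once such a cap applies. It assumes 512-byte sectors. `MAX_TRIM_SECTORS_SKETCH` and `trim_chunks_needed()` are hypothetical names for illustration, not libata symbols.

```c
#include <assert.h>
#include <stdint.h>

/* 128 MiB expressed in assumed 512-byte sectors: 128 << (20 - 9) = 262144. */
#define MAX_TRIM_SECTORS_SKETCH ((uint64_t)128 << (20 - 9))

/* Number of trim commands needed so that none exceeds the 128 MiB cap. */
static uint64_t trim_chunks_needed(uint64_t nr_sectors)
{
    return (nr_sectors + MAX_TRIM_SECTORS_SKETCH - 1) / MAX_TRIM_SECTORS_SKETCH;
}
```

Under these assumptions a 1 GiB trim (2097152 sectors) is issued as eight commands rather than one.
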
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Karthik Shivaram <karthikgs@fb.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • block: allow for_each_bvec to support zero len bvec · b48bcb66
      Ming Lei authored
      commit 7e249690 upstream.
      
      Block layer usually doesn't support or allow zero-length bvec. Since
      commit 1bdc76ae ("iov_iter: use bvec iterator to implement
      iterate_bvec()"), iterate_bvec() switches to bvec iterator. However,
      Al mentioned that 'Zero-length segments are not disallowed' in iov_iter.
      
      Fix for_each_bvec() so that it can move on after seeing a
      zero-length bvec.
      
      Fixes: 1bdc76ae ("iov_iter: use bvec iterator to implement iterate_bvec()")
      Reported-by: syzbot <syzbot+61acc40a49a3e46e25ea@syzkaller.appspotmail.com>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Tested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: <stable@vger.kernel.org>
      Link: https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg2262077.html
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • affs: fix basic permission bits to actually work · b0a689f8
      Max Staudt authored
      commit d3a84a8d upstream.
      
      The basic permission bits (protection bits in AmigaOS) have been broken
      in Linux' AFFS - it would only set bits, but never delete them.
      Also, contrary to the documentation, the Archived bit was not handled.
      
      Let's fix this for good, and set the bits such that Linux and classic
      AmigaOS can coexist in the most peaceful manner.
      
      Also, update the documentation to represent the current state of things.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Cc: stable@vger.kernel.org
      Signed-off-by: Max Staudt <max@enpas.org>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • media: rc: uevent sysfs file races with rc_unregister_device() · fd1c0e39
      Sean Young authored
      commit 4f0835d6 upstream.
      
      Only report uevent file contents if device still registered, else we
      might read freed memory.
      
      Reported-by: syzbot+ceef16277388d6f24898@syzkaller.appspotmail.com
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: <stable@vger.kernel.org> # 4.16+
      Signed-off-by: Sean Young <sean@mess.org>
      Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • media: rc: do not access device via sysfs after rc_unregister_device() · 814f9501
      Sean Young authored
      commit a2e2d73f upstream.
      
      Device drivers do not expect change_protocol or wakeup
      re-programming to be accessed after rc_unregister_device(). This can
      cause the device driver to access deallocated resources.
      
      Cc: <stable@vger.kernel.org> # 4.16+
      Signed-off-by: Sean Young <sean@mess.org>
      Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ALSA: hda - Fix silent audio output and corrupted input on MSI X570-A PRO · c0a7b7fe
      Dan Crawford authored
      commit 15cbff3f upstream.
      
      Following Christian Lachner's patch for Gigabyte X570-based motherboards,
      also patch the MSI X570-A PRO motherboard; the ALC1220 codec requires the
      same workaround for Clevo laptops to enforce the DAC/mixer connection
      path. Set up a quirk entry for that.
      
      I suspect most if not all X570 motherboards will require similar patches.
      
      [ The entries reordered in the SSID order -- tiwai ]
      
      Related buglink: https://bugzilla.kernel.org/show_bug.cgi?id=205275
      Signed-off-by: Dan Crawford <dnlcrwfrd@gmail.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20200829024946.5691-1-dnlcrwfrd@gmail.com
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ALSA: firewire-digi00x: exclude Avid Adrenaline from detection · 3319b83f
      Takashi Sakamoto authored
      commit acd46a6b upstream.
      
      It has been reported that the ALSA firewire-digi00x driver binds to
      Avid Adrenaline. However, as far as has been investigated, the design of
      this model bears little resemblance to that of the Digi 00x family. It's
      better to exclude the model from the modalias of the ALSA
      firewire-digi00x driver.
      
      This commit changes device entries so that the model is excluded.
      
      $ python3 crpp < ~/git/am-config-rom/misc/avid-adrenaline.img
                     ROM header and bus information block
                     -----------------------------------------------------------------
      400  04203a9c  bus_info_length 4, crc_length 32, crc 15004
      404  31333934  bus_name "1394"
      408  e064a002  irmc 1, cmc 1, isc 1, bmc 0, cyc_clk_acc 100, max_rec 10 (2048)
      40c  00a07e01  company_id 00a07e     |
      410  00085257  device_id 0100085257  | EUI-64 00a07e0100085257
      
                     root directory
                     -----------------------------------------------------------------
      414  0005d08c  directory_length 5, crc 53388
      418  0300a07e  vendor
      41c  8100000c  --> descriptor leaf at 44c
      420  0c008380  node capabilities
      424  8d000002  --> eui-64 leaf at 42c
      428  d1000004  --> unit directory at 438
      
                     eui-64 leaf at 42c
                     -----------------------------------------------------------------
      42c  0002410f  leaf_length 2, crc 16655
      430  00a07e01  company_id 00a07e     |
      434  00085257  device_id 0100085257  | EUI-64 00a07e0100085257
      
                     unit directory at 438
                     -----------------------------------------------------------------
      438  0004d6c9  directory_length 4, crc 54985
      43c  1200a02d  specifier id: 1394 TA
      440  13014001  version: Vender Unique and AV/C
      444  17000001  model
      448  81000009  --> descriptor leaf at 46c
      
                     descriptor leaf at 44c
                     -----------------------------------------------------------------
      44c  00077205  leaf_length 7, crc 29189
      450  00000000  textual descriptor
      454  00000000  minimal ASCII
      458  41766964  "Avid"
      45c  20546563  " Tec"
      460  686e6f6c  "hnol"
      464  6f677900  "ogy"
      468  00000000
      
                     descriptor leaf at 46c
                     -----------------------------------------------------------------
      46c  000599a5  leaf_length 5, crc 39333
      470  00000000  textual descriptor
      474  00000000  minimal ASCII
      478  41647265  "Adre"
      47c  6e616c69  "nali"
      480  6e650000  "ne"
      Reported-by: Simon Wood <simon@mungewell.org>
      Fixes: 9edf723f ("ALSA: firewire-digi00x: add skeleton for Digi 002/003 family")
      Cc: <stable@vger.kernel.org> # 4.4+
      Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
      Link: https://lore.kernel.org/r/20200823075545.56305-1-o-takashi@sakamocchi.jp
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ALSA: hda/hdmi: always check pin power status in i915 pin fixup · 374843e4
      Kai Vehmanen authored
      commit 858e0ad9 upstream.
      
      When system is suspended with active audio playback to HDMI/DP, two
      alternative sequences can happen at resume:
        a) monitor is detected first and ALSA prepare follows normal
           stream setup sequence, or
        b) ALSA prepare is called first, but monitor is not yet detected,
           so PCM is restarted without a pin.
      
      In case of (b), on i915 systems, haswell_verify_D0() is not called at
      resume and the pin power state may be incorrect. Result is lack of audio
      after resume with no error reported back to user-space.
      
      Fix the problem by always verifying converter and pin state in the
      i915_pin_cvt_fixup().
      
      BugLink: https://github.com/thesofproject/linux/issues/2388
      Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20200826170306.701566-1-kai.vehmanen@linux.intel.com
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ALSA: pcm: oss: Remove superfluous WARN_ON() for mulaw sanity check · 569e1b62
      Takashi Iwai authored
      commit 949a1ebe upstream.
      
      The PCM OSS mulaw plugin checks whether the format of its counterpart
      is a linear format.  The check uses snd_BUG_ON(), which emits
      WARN_ON() when the debug config is set, and this confuses syzkaller
      into treating it as a serious issue.  Let's drop snd_BUG_ON() to
      avoid that.
      
      While we're at it, correct the error code to a more suitable, EINVAL.
      
      Reported-by: syzbot+23b22dc2e0b81cbfcc95@syzkaller.appspotmail.com
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20200901131802.18157-1-tiwai@suse.de
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ALSA: ca0106: fix error code handling · a69e790b
      Tong Zhang authored
      commit ee0761d1 upstream.
      
      snd_ca0106_spi_write() returns 1 on error, and
      snd_ca0106_pcm_power_dac() returns that value directly, but the
      caller expects a negative error code.
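
The mismatch can be sketched as below. Both functions are hypothetical stand-ins for the two driver functions named above, and the choice of `-ENXIO` as the translated code is an assumption made for this example.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for the SPI write: returns 0 on success and 1 on error. */
static int spi_write_sketch(int simulate_failure)
{
    return simulate_failure ? 1 : 0;
}

/*
 * Stand-in for the power-DAC function: instead of passing the positive
 * value through, translate any failure into a negative errno
 * (-ENXIO is an assumed choice for this sketch).
 */
static int power_dac_sketch(int simulate_failure)
{
    int ret = spi_write_sketch(simulate_failure);

    assert(ret == 0 || ret == 1); /* contract of the stand-in above */
    if (ret != 0)
        return -ENXIO;
    return 0;
}
```

A caller that tests `ret < 0` now sees the failure, whereas a raw `1` would have been treated as success.
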
      Signed-off-by: Tong Zhang <ztong0001@gmail.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20200824224541.1260307-1-ztong0001@gmail.com
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • usb: qmi_wwan: add D-Link DWM-222 A2 device ID · 669f2229
      Rogan Dawes authored
      [ Upstream commit 7d605309 ]
      Signed-off-by: Rogan Dawes <rogan@dawes.za.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: usb: qmi_wwan: add Telit 0x1050 composition · 3d7de9fe
      Daniele Palmas authored
      [ Upstream commit e0ae2c57 ]
      
      This patch adds support for Telit FN980 0x1050 composition
      
      0x1050: tty, adb, rmnet, tty, tty, tty, tty
      Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
      Acked-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • btrfs: fix potential deadlock in the search ioctl · 7c9bf5c3
      Josef Bacik authored
      [ Upstream commit a48b73ec ]
      
      With the conversion of the tree locks to rwsem I got the following
      lockdep splat:
      
        ======================================================
        WARNING: possible circular locking dependency detected
        5.8.0-rc7-00165-g04ec4da5f45f-dirty #922 Not tainted
        ------------------------------------------------------
        compsize/11122 is trying to acquire lock:
        ffff889fabca8768 (&mm->mmap_lock#2){++++}-{3:3}, at: __might_fault+0x3e/0x90
      
        but task is already holding lock:
        ffff889fe720fe40 (btrfs-fs-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x39/0x180
      
        which lock already depends on the new lock.
      
        the existing dependency chain (in reverse order) is:
      
        -> #2 (btrfs-fs-00){++++}-{3:3}:
      	 down_write_nested+0x3b/0x70
      	 __btrfs_tree_lock+0x24/0x120
      	 btrfs_search_slot+0x756/0x990
      	 btrfs_lookup_inode+0x3a/0xb4
      	 __btrfs_update_delayed_inode+0x93/0x270
      	 btrfs_async_run_delayed_root+0x168/0x230
      	 btrfs_work_helper+0xd4/0x570
      	 process_one_work+0x2ad/0x5f0
      	 worker_thread+0x3a/0x3d0
      	 kthread+0x133/0x150
      	 ret_from_fork+0x1f/0x30
      
        -> #1 (&delayed_node->mutex){+.+.}-{3:3}:
      	 __mutex_lock+0x9f/0x930
      	 btrfs_delayed_update_inode+0x50/0x440
      	 btrfs_update_inode+0x8a/0xf0
      	 btrfs_dirty_inode+0x5b/0xd0
      	 touch_atime+0xa1/0xd0
      	 btrfs_file_mmap+0x3f/0x60
      	 mmap_region+0x3a4/0x640
      	 do_mmap+0x376/0x580
      	 vm_mmap_pgoff+0xd5/0x120
      	 ksys_mmap_pgoff+0x193/0x230
      	 do_syscall_64+0x50/0x90
      	 entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
        -> #0 (&mm->mmap_lock#2){++++}-{3:3}:
      	 __lock_acquire+0x1272/0x2310
      	 lock_acquire+0x9e/0x360
      	 __might_fault+0x68/0x90
      	 _copy_to_user+0x1e/0x80
      	 copy_to_sk.isra.32+0x121/0x300
      	 search_ioctl+0x106/0x200
      	 btrfs_ioctl_tree_search_v2+0x7b/0xf0
      	 btrfs_ioctl+0x106f/0x30a0
      	 ksys_ioctl+0x83/0xc0
      	 __x64_sys_ioctl+0x16/0x20
      	 do_syscall_64+0x50/0x90
      	 entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
        other info that might help us debug this:
      
        Chain exists of:
          &mm->mmap_lock#2 --> &delayed_node->mutex --> btrfs-fs-00
      
         Possible unsafe locking scenario:
      
      	 CPU0                    CPU1
      	 ----                    ----
          lock(btrfs-fs-00);
      				 lock(&delayed_node->mutex);
      				 lock(btrfs-fs-00);
          lock(&mm->mmap_lock#2);
      
         *** DEADLOCK ***
      
        1 lock held by compsize/11122:
         #0: ffff889fe720fe40 (btrfs-fs-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x39/0x180
      
        stack backtrace:
        CPU: 17 PID: 11122 Comm: compsize Kdump: loaded Not tainted 5.8.0-rc7-00165-g04ec4da5f45f-dirty #922
        Hardware name: Quanta Tioga Pass Single Side 01-0030993006/Tioga Pass Single Side, BIOS F08_3A18 12/20/2018
        Call Trace:
         dump_stack+0x78/0xa0
         check_noncircular+0x165/0x180
         __lock_acquire+0x1272/0x2310
         lock_acquire+0x9e/0x360
         ? __might_fault+0x3e/0x90
         ? find_held_lock+0x72/0x90
         __might_fault+0x68/0x90
         ? __might_fault+0x3e/0x90
         _copy_to_user+0x1e/0x80
         copy_to_sk.isra.32+0x121/0x300
         ? btrfs_search_forward+0x2a6/0x360
         search_ioctl+0x106/0x200
         btrfs_ioctl_tree_search_v2+0x7b/0xf0
         btrfs_ioctl+0x106f/0x30a0
         ? __do_sys_newfstat+0x5a/0x70
         ? ksys_ioctl+0x83/0xc0
         ksys_ioctl+0x83/0xc0
         __x64_sys_ioctl+0x16/0x20
         do_syscall_64+0x50/0x90
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      The problem is we're doing a copy_to_user() while holding tree locks,
      which can deadlock if we have to do a page fault for the copy_to_user().
      This exists even without my locking changes, so it needs to be fixed.
      Rework the search ioctl to do the pre-fault and then
      copy_to_user_nofault for the copying.
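
The pre-fault-then-copy pattern described above can be sketched in plain C. This is only a userspace model, not the btrfs code: `struct fake_user_buf`, `prefault()`, `copy_nofault()`, `search_and_copy()` and the `tree_locked` flag are all hypothetical stand-ins for the user buffer, the page pre-fault, the non-faulting copy and the tree lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a user buffer whose page may not be resident. */
struct fake_user_buf {
    char data[64];
    bool resident;          /* false: touching it would need a page fault */
};

/* Models "a tree lock is held"; the real lock would be a rwsem. */
static bool tree_locked;

/* Fault the page in. This may sleep, so no tree lock may be held here. */
void prefault(struct fake_user_buf *ub)
{
    assert(!tree_locked);
    ub->resident = true;
}

/* Copy that never faults: 0 on success, -1 if a fault would be needed. */
int copy_nofault(struct fake_user_buf *ub, const char *src, size_t len)
{
    if (!ub->resident)
        return -1;
    memcpy(ub->data, src, len);
    return 0;
}

/* Pre-fault outside the lock, copy under it, retry if the page went away. */
int search_and_copy(struct fake_user_buf *ub, const char *item, size_t len)
{
    if (len > sizeof(ub->data))
        return -1;

    for (;;) {
        int ret;

        prefault(ub);               /* lock not held: faulting is safe */

        tree_locked = true;         /* "take" the tree lock */
        ret = copy_nofault(ub, item, len);
        tree_locked = false;        /* "drop" the tree lock */

        if (ret == 0)
            return 0;
        /* The page was not resident after all: pre-fault and try again. */
    }
}
```

The point of the shape is that the only step that can sleep on a page fault runs with no lock held, so the mmap_lock is never acquired underneath a tree lock.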
      
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      7c9bf5c3
    • Daniel Borkmann's avatar
      uaccess: Add non-pagefault user-space write function · cfb4721f
      Daniel Borkmann authored
      [ Upstream commit 1d1585ca ]
      
      Commit 3d708182 ("uaccess: Add non-pagefault user-space read functions")
      did not add a probe write function, so factor out a probe_write_common()
      helper with most of the logic of probe_kernel_write() except setting
      KERNEL_DS, and add a new probe_user_write() helper so it can be used from
      the BPF side.
      
      Again, on some archs, the user address space and kernel address space can
      co-exist and be overlapping, so in such case, setting KERNEL_DS would mean
      that the given address is treated as being in kernel address space.
      
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Link: https://lore.kernel.org/bpf/9df2542e68141bfa3addde631441ee45503856a8.1572649915.git.daniel@iogearbox.net
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      cfb4721f
    • Masami Hiramatsu's avatar
      uaccess: Add non-pagefault user-space read functions · 61135a9c
      Masami Hiramatsu authored
      [ Upstream commit 3d708182 ]
      
      Add probe_user_read(), strncpy_from_unsafe_user() and
      strnlen_unsafe_user(), which allow the caller to access user-space
      memory in IRQ context.
      
      The current probe_kernel_read() and strncpy_from_unsafe() cannot be
      used on user-space memory, because they set KERNEL_DS while accessing
      data. On some architectures the user and kernel address spaces can
      co-exist, but on others they cannot. In that case, setting KERNEL_DS
      means the given address is treated as a kernel-space address.
      Also, strnlen_user() is only available from user context, since
      it can sleep if page faults are enabled.
      
      To access user-space memory without page faults, we need
      these new functions, which set USER_DS while accessing
      the data.
      
      Link: http://lkml.kernel.org/r/155789869802.26965.4940338412595759063.stgit@devnote2
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      61135a9c
    • Josef Bacik's avatar
      btrfs: set the lockdep class for log tree extent buffers · 4689380b
      Josef Bacik authored
      [ Upstream commit d3beaa25 ]
      
      These are special extent buffers that get rewound in order to look up
      the state of the tree at a specific point in time.  As such they do not
      go through the normal initialization paths that set their lockdep class,
      so handle them appropriately when they are created and before they are
      locked.
      
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      4689380b
    • Nikolay Borisov's avatar
      btrfs: Remove extraneous extent_buffer_get from tree_mod_log_rewind · 88814d0b
      Nikolay Borisov authored
      [ Upstream commit 24cee18a ]
      
      When a rewound buffer is created it already has a ref count of 1 and the
      dummy flag set. Then another ref is taken bumping the count to 2.
      Finally when this buffer is released from btrfs_release_path the extra
      reference is decremented by the special handling code in
      free_extent_buffer.
      
      However, this special code is in fact redundant, since a ref count of 1
      is still correct: the buffer is only accessed via the btrfs_path struct.
      This paves the way for removing the special handling in
      free_extent_buffer.
      
      Signed-off-by: Nikolay Borisov <nborisov@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      88814d0b
    • Nikolay Borisov's avatar
      btrfs: Remove redundant extent_buffer_get in get_old_root · 2ca6e25f
      Nikolay Borisov authored
      [ Upstream commit 6c122e2a ]
      
      get_old_root is used only by btrfs_search_old_slot to initialise the
      path structure. The old root is always a cloned buffer (either via alloc
      dummy or via btrfs_clone_extent_buffer) and its reference count is 2: 1
      from allocation, 1 from extent_buffer_get call in get_old_root.
      
      This latter explicit ref count acquire operation is in fact unnecessary
      since the semantic is such that the newly allocated buffer is handed
      over to the btrfs_path for lifetime management. Considering this just
      remove the extra extent_buffer_get in get_old_root.
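
The hand-over semantics described above can be illustrated with a small reference-counting model. This is a sketch under assumptions, not btrfs code: `struct buf`, `struct path`, `buf_alloc()`, `buf_put()`, `path_set()`, `path_release()` and the `live_bufs` counter are hypothetical names chosen for the example.

```c
#include <stdlib.h>

/* Hypothetical buffer with a plain (non-atomic) reference count. */
struct buf {
    int refs;
};

/* Number of buffers currently allocated, to make leaks visible. */
int live_bufs;

/* A freshly allocated buffer starts with one reference, owned by the caller. */
struct buf *buf_alloc(void)
{
    struct buf *b = malloc(sizeof(*b));

    if (!b)
        return NULL;
    b->refs = 1;
    live_bufs++;
    return b;
}

/* Drop one reference and free the buffer when the last one goes away. */
void buf_put(struct buf *b)
{
    if (--b->refs == 0) {
        live_bufs--;
        free(b);
    }
}

/* Hypothetical path that manages the lifetime of the buffer it points to. */
struct path {
    struct buf *node;
};

/* Hand the caller's single reference over to the path: no extra get. */
void path_set(struct path *p, struct buf *b)
{
    p->node = b;
}

/* Releasing the path drops the reference it was handed. */
void path_release(struct path *p)
{
    if (p->node) {
        buf_put(p->node);
        p->node = NULL;
    }
}
```

With the allocation reference handed to the path, one put on release is enough to free the buffer; taking a second reference in between would need a matching extra put somewhere else.
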
      
      Signed-off-by: Nikolay Borisov <nborisov@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      2ca6e25f
    • Alex Williamson's avatar
      vfio-pci: Invalidate mmaps and block MMIO access on disabled memory · da7aea6e
      Alex Williamson authored
      commit abafbc55 upstream.
      
      Accessing the disabled memory space of a PCI device would typically
      result in a master abort response on conventional PCI, or an
      unsupported request on PCI express.  The user would generally see
      these as a -1 response for the read return data and the write would be
      silently discarded, possibly with an uncorrected, non-fatal AER error
      triggered on the host.  Some systems however take it upon themselves
      to bring down the entire system when they see something that might
      indicate a loss of data, such as this discarded write to a disabled
      memory space.
      
      To avoid this, we want to try to block the user from accessing memory
      spaces while they're disabled.  We start with a semaphore around the
      memory enable bit, where writers modify the memory enable state and
      must be serialized, while readers make use of the memory region and
      can access in parallel.  Writers include both direct manipulation via
      the command register, as well as any reset path where the internal
      mechanics of the reset may both explicitly and implicitly disable
      memory access, and manipulation of the MSI-X configuration, where the
      MSI-X vector table resides in MMIO space of the device.  Readers
      include the read and write file ops to access the vfio device fd
      offsets as well as memory mapped access.  In the latter case, we make
      use of our new vma list support to zap, or invalidate, those memory
      mappings in order to force them to be faulted back in on access.
      
      Our semaphore usage will stall user access to MMIO spaces across
      internal operations like reset, but the user might experience new
      behavior when trying to access the MMIO space while disabled via the
      PCI command register.  Access via read or write while disabled will
      return -EIO and access via memory maps will result in a SIGBUS.  This
      is expected to be compatible with known use cases and potentially
      provides better error handling capabilities than present in the
      hardware, while avoiding the more readily accessible and severe
      platform error responses that might otherwise occur.
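
The new read/write behavior can be modeled as a check of the memory enable bit before touching the region. The following is a minimal sketch with a simulated device, not the vfio-pci implementation: `struct fake_pci_dev` and `fake_bar_read()` are hypothetical, and the locking is only indicated in a comment.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Memory space enable bit of the PCI command register (PCI_COMMAND_MEMORY). */
#define CMD_MEMORY_ENABLE 0x2

/* Hypothetical simulated device: a command register and a tiny MMIO region. */
struct fake_pci_dev {
    uint16_t command;
    uint32_t bar[4];
};

/*
 * Read one 32-bit word from the simulated region.
 * Returns 0 on success, -EINVAL for a bad index, and -EIO while memory
 * access is disabled. A real driver would hold the semaphore for reading
 * around both the check and the access, so that a writer cannot clear the
 * enable bit in between.
 */
int fake_bar_read(const struct fake_pci_dev *dev, size_t idx, uint32_t *out)
{
    if (idx >= sizeof(dev->bar) / sizeof(dev->bar[0]))
        return -EINVAL;
    if (!(dev->command & CMD_MEMORY_ENABLE))
        return -EIO;

    *out = dev->bar[idx];
    return 0;
}
```

Returning an error instead of issuing the access is what keeps the discarded write or aborted read from ever reaching the hardware.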
      
      Fixes: CVE-2020-12888
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      [Ajay: Regenerated the patch for v4.19]
      Signed-off-by: Ajay Kaher <akaher@vmware.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      da7aea6e
    • Alex Williamson's avatar
      vfio-pci: Fault mmaps to enable vma tracking · 6c7f2f24
      Alex Williamson authored
      commit 11c4cd07 upstream.
      
      Rather than calling remap_pfn_range() when a region is mmap'd, set up
      a vm_ops handler to support dynamic faulting of the range on access.
      This allows us to manage a list of vmas actively mapping the area that
      we can later use to invalidate those mappings.  The open callback
      invalidates the vma range so that all tracking is inserted in the
      fault handler and removed in the close handler.
      
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      [Ajay: Regenerated the patch for v4.19]
      Signed-off-by: Ajay Kaher <akaher@vmware.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      6c7f2f24
    • Alex Williamson's avatar
      vfio/type1: Support faulting PFNMAP vmas · 6df21076
      Alex Williamson authored
      commit 41311242 upstream.
      
      With conversion to follow_pfn(), DMA mapping a PFNMAP range depends on
      the range being faulted into the vma.  Add support to manually provide
      that, in the same way as done on KVM with hva_to_pfn_remapped().
      
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      [Ajay: Regenerated the patch for v4.19]
      Signed-off-by: Ajay Kaher <akaher@vmware.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      6df21076
    • Josef Bacik's avatar
      btrfs: drop path before adding new uuid tree entry · 43eadb9e
      Josef Bacik authored
      commit 9771a5cf upstream.
      
      With the conversion of the tree locks to rwsem I got the following
      lockdep splat:
      
        ======================================================
        WARNING: possible circular locking dependency detected
        5.8.0-rc7-00167-g0d7ba0c5b375-dirty #925 Not tainted
        ------------------------------------------------------
        btrfs-uuid/7955 is trying to acquire lock:
        ffff88bfbafec0f8 (btrfs-root-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x39/0x180
      
        but task is already holding lock:
        ffff88bfbafef2a8 (btrfs-uuid-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x39/0x180
      
        which lock already depends on the new lock.
      
        the existing dependency chain (in reverse order) is:
      
        -> #1 (btrfs-uuid-00){++++}-{3:3}:
      	 down_read_nested+0x3e/0x140
      	 __btrfs_tree_read_lock+0x39/0x180
      	 __btrfs_read_lock_root_node+0x3a/0x50
      	 btrfs_search_slot+0x4bd/0x990
      	 btrfs_uuid_tree_add+0x89/0x2d0
      	 btrfs_uuid_scan_kthread+0x330/0x390
      	 kthread+0x133/0x150
      	 ret_from_fork+0x1f/0x30
      
        -> #0 (btrfs-root-00){++++}-{3:3}:
      	 __lock_acquire+0x1272/0x2310
      	 lock_acquire+0x9e/0x360
      	 down_read_nested+0x3e/0x140
      	 __btrfs_tree_read_lock+0x39/0x180
      	 __btrfs_read_lock_root_node+0x3a/0x50
      	 btrfs_search_slot+0x4bd/0x990
      	 btrfs_find_root+0x45/0x1b0
      	 btrfs_read_tree_root+0x61/0x100
      	 btrfs_get_root_ref.part.50+0x143/0x630
      	 btrfs_uuid_tree_iterate+0x207/0x314
      	 btrfs_uuid_rescan_kthread+0x12/0x50
      	 kthread+0x133/0x150
      	 ret_from_fork+0x1f/0x30
      
        other info that might help us debug this:
      
         Possible unsafe locking scenario:
      
      	 CPU0                    CPU1
      	 ----                    ----
          lock(btrfs-uuid-00);
      				 lock(btrfs-root-00);
      				 lock(btrfs-uuid-00);
          lock(btrfs-root-00);
      
         *** DEADLOCK ***
      
        1 lock held by btrfs-uuid/7955:
         #0: ffff88bfbafef2a8 (btrfs-uuid-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x39/0x180
      
        stack backtrace:
        CPU: 73 PID: 7955 Comm: btrfs-uuid Kdump: loaded Not tainted 5.8.0-rc7-00167-g0d7ba0c5b375-dirty #925
        Hardware name: Quanta Tioga Pass Single Side 01-0030993006/Tioga Pass Single Side, BIOS F08_3A18 12/20/2018
        Call Trace:
         dump_stack+0x78/0xa0
         check_noncircular+0x165/0x180
         __lock_acquire+0x1272/0x2310
         lock_acquire+0x9e/0x360
         ? __btrfs_tree_read_lock+0x39/0x180
         ? btrfs_root_node+0x1c/0x1d0
         down_read_nested+0x3e/0x140
         ? __btrfs_tree_read_lock+0x39/0x180
         __btrfs_tree_read_lock+0x39/0x180
         __btrfs_read_lock_root_node+0x3a/0x50
         btrfs_search_slot+0x4bd/0x990
         btrfs_find_root+0x45/0x1b0
         btrfs_read_tree_root+0x61/0x100
         btrfs_get_root_ref.part.50+0x143/0x630
         btrfs_uuid_tree_iterate+0x207/0x314
         ? btree_readpage+0x20/0x20
         btrfs_uuid_rescan_kthread+0x12/0x50
         kthread+0x133/0x150
         ? kthread_create_on_node+0x60/0x60
         ret_from_fork+0x1f/0x30
      
      This problem exists because we have two different rescan threads,
      btrfs_uuid_scan_kthread which creates the uuid tree, and
      btrfs_uuid_tree_iterate that goes through and updates or deletes any out
      of date roots.  The problem is that the two do things in opposite orders.
      btrfs_uuid_scan_kthread() reads the tree_root, and then inserts entries
      into the uuid_root.  btrfs_uuid_tree_iterate() scans the uuid_root, but
      then does a btrfs_get_fs_root() which can read from the tree_root.
      
      It's actually easy enough to not be holding the path in
      btrfs_uuid_scan_kthread() when we add a uuid entry, as we already drop
      it further down and re-start the search when we loop.  So simply move
      the path release before we add our entry to the uuid tree.
      
      This also fixes a problem where we're holding a path open after we do
      btrfs_end_transaction(), which has its own problems.
      
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      43eadb9e
    • Mikulas Patocka's avatar
      xfs: don't update mtime on COW faults · 884fee76
      Mikulas Patocka authored
      commit b17164e2 upstream.
      
      When running in a dax mode, if the user maps a page with MAP_PRIVATE and
      PROT_WRITE, the xfs filesystem would incorrectly update ctime and mtime
      when the user hits a COW fault.
      
      This breaks building of the Linux kernel.  How to reproduce:
      
       1. extract the Linux kernel tree on dax-mounted xfs filesystem
       2. run make clean
       3. run make -j12
       4. run make -j12
      
      At step 4, make would incorrectly rebuild the whole kernel (although it
      was already built in step 3).
      
      The reason for the breakage is that almost all object files depend on
      objtool.  When we run objtool, it takes a COW page fault on its .data
      section, and these faults incorrectly update the timestamp of the
      objtool binary.  The updated timestamp causes make to rebuild the whole
      tree.
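
The rule the fix enforces can be written as a small predicate: only a write fault on a shared mapping modifies the file, so only that case should touch its timestamps. The sketch below is illustrative only; the flag values and the function name `fault_updates_file_time()` are hypothetical and are not the kernel's definitions.

```c
/* Hypothetical flag values for this sketch, not the kernel's. */
#define FAULT_IS_WRITE  0x1u    /* the faulting access was a write */
#define MAPPING_SHARED  0x2u    /* the mapping is MAP_SHARED */

/*
 * Return 1 if this fault should update the file's ctime/mtime, else 0.
 * A write fault on a MAP_PRIVATE mapping is a COW fault: the write lands
 * in a private copy of the page and the file itself is unchanged.
 */
int fault_updates_file_time(unsigned int fault_flags,
                            unsigned int mapping_flags)
{
    return (fault_flags & FAULT_IS_WRITE) &&
           (mapping_flags & MAPPING_SHARED);
}
```

Under this rule, objtool's COW faults on its own .data section leave the binary's timestamp alone, so the second make run has nothing to rebuild.
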
      
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      884fee76
    • Mikulas Patocka's avatar
      ext2: don't update mtime on COW faults · da0d5ccf
      Mikulas Patocka authored
      commit 1ef6ea0e upstream.
      
      When running in a dax mode, if the user maps a page with MAP_PRIVATE and
      PROT_WRITE, the ext2 filesystem would incorrectly update ctime and mtime
      when the user hits a COW fault.
      
      This breaks building of the Linux kernel.  How to reproduce:
      
       1. extract the Linux kernel tree on dax-mounted ext2 filesystem
       2. run make clean
       3. run make -j12
       4. run make -j12
      
      At step 4, make would incorrectly rebuild the whole kernel (although it
      was already built in step 3).
      
      The reason for the breakage is that almost all object files depend on
      objtool.  When we run objtool, it takes a COW page fault on its .data
      section, and these faults incorrectly update the timestamp of the
      objtool binary.  The updated timestamp causes make to rebuild the whole
      tree.
      
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      da0d5ccf
    • Jason Gunthorpe's avatar
      include/linux/log2.h: add missing () around n in roundup_pow_of_two() · 95968e5c
      Jason Gunthorpe authored
      [ Upstream commit 428fc0af ]
      
      Otherwise gcc generates warnings if the expression is complicated.
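
Why the parentheses matter can be shown with a reduced example. The macros below are hypothetical and much simpler than roundup_pow_of_two(); they only reproduce the pattern of comparing an unparenthesized macro argument.

```c
/* Hypothetical macros for illustration, not the kernel's definitions. */
#define IS_ONE_BAD(n)   (n == 1)    /* argument not parenthesized */
#define IS_ONE_GOOD(n)  ((n) == 1)  /* argument parenthesized */

/*
 * With a conditional expression as the argument, the bad macro expands to
 * (use_a ? a : b == 1), which parses as use_a ? a : (b == 1).
 */
int pick_is_one_bad(int use_a, unsigned long a, unsigned long b)
{
    return (int)IS_ONE_BAD(use_a ? a : b);
}

/* The good macro expands to ((use_a ? a : b) == 1), as intended. */
int pick_is_one_good(int use_a, unsigned long a, unsigned long b)
{
    return (int)IS_ONE_GOOD(use_a ? a : b);
}
```

For `use_a = 1`, `a = 4`, `b = 8`, the bad variant returns 4 (the value of `a`), while the good variant returns 0, because 4 is not equal to 1.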
      
      Fixes: 312a0c17 ("[PATCH] LOG2: Alter roundup_pow_of_two() so that it can use a ilog2() on a constant")
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Link: https://lkml.kernel.org/r/0-v1-8a2697e3c003+41165-log_brackets_jgg@nvidia.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      95968e5c
    • Tony Lindgren's avatar
      thermal: ti-soc-thermal: Fix bogus thermal shutdowns for omap4430 · b0a3e332
      Tony Lindgren authored
      [ Upstream commit 30d24fab ]
      
      We can sometimes get bogus thermal shutdowns on omap4430 at least with
      droid4 running idle with a battery charger connected:
      
      thermal thermal_zone0: critical temperature reached (143 C), shutting down
      
      Dumping out the register values shows we can occasionally get a 0x7f value
      that is outside the TRM listed values in the ADC conversion table. And then
      we get a normal value when reading again after that. Reading the register
      multiple times does not seem to help avoid the bogus values, as they stay
      until the next sample is ready.
      
      Looking at the TRM chapter "18.4.10.2.3 ADC Codes Versus Temperature", we
      should have values from 13 to 107 listed with a total of 95 values. But
      looking at the omap4430_adc_to_temp array, the values are off, and the
      end values are missing. And it seems that the 4430 ADC table is similar
      to omap3630 rather than omap4460.
      
      Let's fix the issue by using values based on the omap3630 table and just
      ignoring invalid values. Compared to the 4430 TRM, the omap3630 table has
      the missing values added while the TRM table only shows every second
      value.
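
Ignoring invalid values amounts to a range check on the raw ADC code before it is used. The sketch below assumes the valid range of 13 to 107 quoted from the TRM above; `struct fake_sensor` and `sensor_update()` are hypothetical names, and no conversion table is shown because its values are not given here.

```c
/* Valid ADC code range quoted above from the TRM: 13 to 107, 95 values. */
#define ADC_CODE_MIN 13
#define ADC_CODE_MAX 107

/* Hypothetical sensor state: remembers the last code that was in range. */
struct fake_sensor {
    int last_good_code;
};

/*
 * Accept a raw ADC code only if it falls inside the valid range.
 * Returns 0 when the code was stored, -1 when it was ignored.
 */
int sensor_update(struct fake_sensor *s, int raw_code)
{
    if (raw_code < ADC_CODE_MIN || raw_code > ADC_CODE_MAX)
        return -1;      /* e.g. the bogus 0x7f reading */

    s->last_good_code = raw_code;
    return 0;
}
```

A reading of 0x7f (127) is above the maximum of 107, so it is dropped and the previous in-range code stays in effect until the next sample.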
      
      Note that sometimes the ADC register values within the valid table can
      also be way off for about 1 out of 10 values. But those seem to read
      about 25 C too low rather than too high, so they do not cause a bogus
      thermal shutdown.
      
      Fixes: 1a31270e ("staging: omap-thermal: add OMAP4 data structures")
      Cc: Merlijn Wajer <merlijn@wizzup.org>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Sebastian Reichel <sebastian.reichel@collabora.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Link: https://lore.kernel.org/r/20200706183338.25622-1-tony@atomide.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      b0a3e332
    • Lu Baolu's avatar
      iommu/vt-d: Serialize IOMMU GCMD register modifications · 519837cc
      Lu Baolu authored
      [ Upstream commit 6e4e9ec6 ]
      
      The VT-d spec requires (10.4.4 Global Command Register, GCMD_REG General
      Description) that:
      
      If multiple control fields in this register need to be modified, software
      must serialize the modifications through multiple writes to this register.
      
      However, in irq_remapping.c, modifications of IRE and CFI are done in one
      write. We need to do two separate writes with STS checking after each.
      This patch also checks the status register before writing the command
      register, to avoid unnecessary register writes.
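
The serialized sequence can be sketched against a simulated register pair. This is a model under assumptions, not the driver code: `struct fake_iommu`, `fake_write_gcmd()`, `wait_status()` and `enable_remap_block_compat()` are hypothetical, the bit positions are illustrative, and the simulated status register simply mirrors the last command write.

```c
#include <stdint.h>

/* Illustrative bit positions for the two control fields. */
#define GCMD_IRE (1u << 25)     /* interrupt remapping enable */
#define GCMD_CFI (1u << 23)     /* compatibility format interrupts */

/* Hypothetical simulated IOMMU. */
struct fake_iommu {
    uint32_t gcmd;      /* software shadow of the command register */
    uint32_t gsts;      /* simulated status register */
    unsigned int writes;/* number of command register writes issued */
};

/* Model: the status register reflects the command immediately. */
void fake_write_gcmd(struct fake_iommu *iommu, uint32_t val)
{
    iommu->gsts = val;
    iommu->writes++;
}

/* Poll the status register until the masked bits have the wanted value. */
void wait_status(const struct fake_iommu *iommu, uint32_t mask, uint32_t want)
{
    while ((iommu->gsts & mask) != want)
        ;
}

/* Change one field per write, checking status first and after each write. */
void enable_remap_block_compat(struct fake_iommu *iommu)
{
    uint32_t sts = iommu->gsts;

    if (!(sts & GCMD_IRE)) {            /* skip the write if already set */
        iommu->gcmd |= GCMD_IRE;
        fake_write_gcmd(iommu, iommu->gcmd);
        wait_status(iommu, GCMD_IRE, GCMD_IRE);
    }

    if (sts & GCMD_CFI) {               /* skip the write if already clear */
        iommu->gcmd &= ~GCMD_CFI;
        fake_write_gcmd(iommu, iommu->gcmd);
        wait_status(iommu, GCMD_CFI, 0);
    }
}
```

Starting from a state with only CFI set, this issues exactly two writes, one per field; starting from a state that already has IRE set and CFI clear, it issues none.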
      
      Fixes: af8d102f ("x86/intel/irq_remapping: Clean up x2apic opt-out security warning mess")
      Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
      Reviewed-by: Kevin Tian <kevin.tian@intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Cc: Kevin Tian <kevin.tian@intel.com>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Link: https://lore.kernel.org/r/20200828000615.8281-1-baolu.lu@linux.intel.com
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      519837cc
    • Huang Ying's avatar
      x86, fakenuma: Fix invalid starting node ID · f10d77cd
      Huang Ying authored
      [ Upstream commit ccae0f36 ]
      
      Commit:
      
        cc9aec03 ("x86/numa_emulation: Introduce uniform split capability")
      
      uses "-1" as the starting node ID, which produces the following strange
      kernel log when "numa=fake=32G" is added to the kernel command line:
      
          Faking node -1 at [mem 0x0000000000000000-0x0000000893ffffff] (35136MB)
          Faking node 0 at [mem 0x0000001840000000-0x000000203fffffff] (32768MB)
          Faking node 1 at [mem 0x0000000894000000-0x000000183fffffff] (64192MB)
          Faking node 2 at [mem 0x0000002040000000-0x000000283fffffff] (32768MB)
          Faking node 3 at [mem 0x0000002840000000-0x000000303fffffff] (32768MB)
      
      And finally the kernel crashes:
      
          BUG: Bad page state in process swapper  pfn:00011
          page:(____ptrval____) refcount:0 mapcount:1 mapping:(____ptrval____) index:0x55cd7e44b270 pfn:0x11
          failed to read mapping contents, not a valid kernel address?
          flags: 0x5(locked|uptodate)
          raw: 0000000000000005 000055cd7e44af30 000055cd7e44af50 0000000100000006
          raw: 000055cd7e44b270 000055cd7e44b290 0000000000000000 000055cd7e44b510
          page dumped because: page still charged to cgroup
          page->mem_cgroup:000055cd7e44b510
          Modules linked in:
          CPU: 0 PID: 0 Comm: swapper Not tainted 5.9.0-rc2 #1
          Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019
          Call Trace:
           dump_stack+0x57/0x80
           bad_page.cold+0x63/0x94
           __free_pages_ok+0x33f/0x360
           memblock_free_all+0x127/0x195
           mem_init+0x23/0x1f5
           start_kernel+0x219/0x4f5
           secondary_startup_64+0xb6/0xc0
      
      Fix this bug by using 0 as the starting node ID.  This restores the
      original behavior before cc9aec03.
      
      [ mingo: Massaged the changelog. ]
      
      Fixes: cc9aec03 ("x86/numa_emulation: Introduce uniform split capability")
      Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Link: https://lore.kernel.org/r/20200904061047.612950-1-ying.huang@intel.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      f10d77cd
    • Michael Chan's avatar
      tg3: Fix soft lockup when tg3_reset_task() fails. · aea7be64
      Michael Chan authored
      [ Upstream commit 55669934 ]
      
      If tg3_reset_task() fails, the device is left in an inconsistent
      state with IFF_RUNNING still set but NAPI state not enabled.  A
      subsequent operation, such as ifdown or an AER error, can cause it to
      soft lock up when it tries to disable NAPI state.
      
      Fix it by bringing down the device to !IFF_RUNNING state when
      tg3_reset_task() fails.  tg3_reset_task() running from workqueue
      will now call tg3_close() when the reset fails.  We need to
      modify tg3_reset_task_cancel() slightly to avoid tg3_close()
      calling cancel_work_sync() to cancel tg3_reset_task().  Otherwise
      cancel_work_sync() will wait forever for tg3_reset_task() to
      finish.
      
      Reported-by: David Christensen <drc@linux.vnet.ibm.com>
      Reported-by: Baptiste Covolato <baptiste@arista.com>
      Fixes: db219973 ("tg3: Schedule at most one tg3_reset_task run")
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      aea7be64