1. 19 Oct, 2019 2 commits
    • fs/proc/page.c: don't access uninitialized memmaps in fs/proc/page.c · aad5f69b
      David Hildenbrand authored
      There are three places where we access uninitialized memmaps, namely:
      - /proc/kpagecount
      - /proc/kpageflags
      - /proc/kpagecgroup
      
      We have initialized memmaps either when the section is online or when the
      page was initialized to ZONE_DEVICE.  Uninitialized memmaps contain
      garbage and in the worst case trigger kernel BUGs, especially with
      CONFIG_PAGE_POISONING.
      
      For example, not onlining a DIMM during boot and calling /proc/kpagecount
      with CONFIG_PAGE_POISONING:
      
        :/# cat /proc/kpagecount > tmp.test
        BUG: unable to handle page fault for address: fffffffffffffffe
        #PF: supervisor read access in kernel mode
        #PF: error_code(0x0000) - not-present page
        PGD 114616067 P4D 114616067 PUD 114618067 PMD 0
        Oops: 0000 [#1] SMP NOPTI
        CPU: 0 PID: 469 Comm: cat Not tainted 5.4.0-rc1-next-20191004+ #11
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4
        RIP: 0010:kpagecount_read+0xce/0x1e0
        Code: e8 09 83 e0 3f 48 0f a3 02 73 2d 4c 89 e7 48 c1 e7 06 48 03 3d ab 51 01 01 74 1d 48 8b 57 08 480
        RSP: 0018:ffffa14e409b7e78 EFLAGS: 00010202
        RAX: fffffffffffffffe RBX: 0000000000020000 RCX: 0000000000000000
        RDX: 0000000000000001 RSI: 00007f76b5595000 RDI: fffff35645000000
        RBP: 00007f76b5595000 R08: 0000000000000001 R09: 0000000000000000
        R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000
        R13: 0000000000020000 R14: 00007f76b5595000 R15: ffffa14e409b7f08
        FS:  00007f76b577d580(0000) GS:ffff8f41bd400000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: fffffffffffffffe CR3: 0000000078960000 CR4: 00000000000006f0
        Call Trace:
         proc_reg_read+0x3c/0x60
         vfs_read+0xc5/0x180
         ksys_read+0x68/0xe0
         do_syscall_64+0x5c/0xa0
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      For now, let's drop support for ZONE_DEVICE from the three pseudo files
      in order to fix this.  To distinguish offline memory (with garbage
      memmap) from ZONE_DEVICE memory with properly initialized memmaps, we
      would have to check get_dev_pagemap() and pfn_zone_device_reserved()
      right now.  The usage of both (especially, special casing devmem) is
      frowned upon and needs to be reworked.
      
      The fundamental issue we have is:
      
      	if (pfn_to_online_page(pfn)) {
      		/* memmap initialized */
      	} else if (pfn_valid(pfn)) {
      		/*
      		 * ???
      		 * a) offline memory. memmap garbage.
      		 * b) devmem: memmap initialized to ZONE_DEVICE.
      		 * c) devmem: reserved for driver. memmap garbage.
      		 * (d) devmem: memmap currently initializing - garbage)
      		 */
      	}
      
      We'll leave the pfn_zone_device_reserved() check in stable_page_flags()
      in place as that function is also used by the memory-failure code.  As
      a result, we no longer dump information about pages that are offline
      and therefore not in use.
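
      A minimal userspace sketch of the resulting policy, not the kernel
      code itself: every model_* name, the enum and the struct layout are
      hypothetical stand-ins, and reporting 0 for skipped entries is an
      assumption of the model.  Only a PFN whose section is online gets its
      memmap read; offline and ZONE_DEVICE PFNs are skipped alike:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the states a PFN can be in (not kernel types). */
enum model_pfn_state {
	PFN_INVALID,		/* no memmap at all */
	PFN_ONLINE,		/* section online: memmap initialized */
	PFN_OFFLINE,		/* present but never onlined: memmap garbage */
	PFN_ZONE_DEVICE,	/* devmem: memmap initialized to ZONE_DEVICE */
};

struct model_page {
	unsigned long count;
};

struct model_pfn {
	enum model_pfn_state state;
	struct model_page page;
};

/* Stand-in for pfn_valid(): true for anything that has a memmap. */
bool model_pfn_valid(const struct model_pfn *p)
{
	assert(p != NULL);
	return p->state != PFN_INVALID;
}

/* Stand-in for pfn_to_online_page(): NULL unless the section is online. */
const struct model_page *model_pfn_to_online_page(const struct model_pfn *p)
{
	assert(p != NULL);
	return p->state == PFN_ONLINE ? &p->page : NULL;
}

/* One /proc/kpagecount-style entry: never touches a non-online memmap. */
unsigned long model_kpagecount_entry(const struct model_pfn *p)
{
	const struct model_page *page = model_pfn_to_online_page(p);

	return page ? page->count : 0;
}
```

      In this model an offline PFN still passes the pfn_valid()-style
      check, which is why that check alone cannot guard the access.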
      
      Link: http://lkml.kernel.org/r/20191009142435.3975-2-david@redhat.com
      Fixes: f1dd2cd1 ("mm, memory_hotplug: do not associate hotadded memory to zones until online")	[visible after d0dc12e8]
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Reported-by: Qian Cai <cai@lca.pw>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Toshiki Fukasawa <t-fukasawa@vx.jp.nec.com>
      Cc: Pankaj gupta <pagupta@redhat.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Anthony Yznaga <anthony.yznaga@oracle.com>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
      Cc: <stable@vger.kernel.org>	[4.13+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • drivers/base/memory.c: don't access uninitialized memmaps in soft_offline_page_store() · 641fe2e9
      David Hildenbrand authored
      Uninitialized memmaps contain garbage and in the worst case trigger kernel
      BUGs, especially with CONFIG_PAGE_POISONING.  They should not get touched.
      
      Right now, when trying to soft-offline a PFN that resides on a memory
      block that was never onlined, one gets a misleading error with
      CONFIG_PAGE_POISONING:
      
        :/# echo 5637144576 > /sys/devices/system/memory/soft_offline_page
        [   23.097167] soft offline: 0x150000 page already poisoned
      
      But the actual result depends on the garbage in the memmap.
      
      soft_offline_page() can only work with online pages; it returns -EIO in
      case of ZONE_DEVICE.  Make sure to only forward pages that are online
      (iow, managed by the buddy) and, therefore, have an initialized memmap.
      
      Add a check against pfn_to_online_page() and similarly return -EIO.
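
      As a rough illustration only (a userspace model, not the sysfs
      handler): the sketch below assumes 4 KiB pages, and the model_* names
      and the single "online range" standing in for pfn_to_online_page()
      are hypothetical.  It converts the written physical address to a PFN
      and refuses anything that is not online:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for "which PFNs are online": [start_pfn, end_pfn). */
struct model_online_range {
	unsigned long start_pfn;
	unsigned long end_pfn;
};

bool model_pfn_online(const struct model_online_range *r, unsigned long pfn)
{
	return pfn >= r->start_pfn && pfn < r->end_pfn;
}

/* Returns 0 if the page would be forwarded, a negative errno otherwise. */
int model_soft_offline_store(const struct model_online_range *r,
			     const char *buf)
{
	unsigned long long addr;
	unsigned long pfn;
	char *end;

	assert(r != NULL && buf != NULL);

	errno = 0;
	addr = strtoull(buf, &end, 0);
	if (errno || end == buf)
		return -EINVAL;

	/* Physical address to page frame number. */
	pfn = (unsigned long)(addr >> MODEL_PAGE_SHIFT);

	/* Only online pages have an initialized memmap. */
	if (!model_pfn_online(r, pfn))
		return -EIO;

	return 0;
}
```

      With 4 KiB pages, the address 5637144576 from the example above maps
      to PFN 0x150000, so a model in which only PFNs below 0x140000 are
      online rejects it with -EIO.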
      
      Link: http://lkml.kernel.org/r/20191010141200.8985-1-david@redhat.com
      Fixes: f1dd2cd1 ("mm, memory_hotplug: do not associate hotadded memory to zones until online")	[visible after d0dc12e8]
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: <stable@vger.kernel.org>	[4.13+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 18 Oct, 2019 15 commits
    • filldir[64]: remove WARN_ON_ONCE() for bad directory entries · b9959c7a
      Linus Torvalds authored
      This was always meant to be a temporary thing, just for testing and to
      see if it actually ever triggered.
      
      The only thing that reported it was syzbot doing disk image fuzzing, and
      then that warning is expected.  So let's just remove it before -rc4,
      because the extra sanity testing should probably go to -stable, but we
      don't want the warning to do so.
      
      Reported-by: syzbot+3031f712c7ad5dd4d926@syzkaller.appspotmail.com
      Fixes: 8a23eb80 ("Make filldir[64]() verify the directory entry filename is valid")
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'ceph-for-5.4-rc4' of git://github.com/ceph/ceph-client · 6b95cf9b
      Linus Torvalds authored
      Pull ceph fixes from Ilya Dryomov:
       "A future-proofing decoding fix from Jeff intended for stable and a
        patch for a mostly benign race from Dongsheng"
      
      * tag 'ceph-for-5.4-rc4' of git://github.com/ceph/ceph-client:
        rbd: cancel lock_dwork if the wait is interrupted
        ceph: just skip unrecognized info in ceph_reply_info_extra
    • Merge tag 'for-5.4/dm-fixes' of... · fb8527e5
      Linus Torvalds authored
      Merge tag 'for-5.4/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
      
      Pull device mapper fixes from Mike Snitzer:
      
       - Fix DM snapshot deadlock that can occur due to COW throttling
         preventing locks from being released.
      
       - Fix DM cache's GFP_NOWAIT allocation failure error paths by switching
         to GFP_NOIO.
      
       - Make __hash_find() static in the DM clone target.
      
      * tag 'for-5.4/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm cache: fix bugs when a GFP_NOWAIT allocation fails
        dm snapshot: rework COW throttling to fix deadlock
        dm snapshot: introduce account_start_copy() and account_end_copy()
        dm clone: Make __hash_find static
    • Merge tag 'iommu-fixes-v5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 90105ae1
      Linus Torvalds authored
      Pull iommu fixes from Joerg Roedel:
      
       - Fixes for page-table issues on Mali GPUs
      
       - Missing free in an error path for ARM-SMMU
      
       - PASID decoding in the AMD IOMMU Event log code
      
       - Another update for the locking fixes in the AMD IOMMU driver
      
       - Reduce the calls to platform_get_irq() in the IPMMU-VMSA and Rockchip
         IOMMUs to get rid of the warning message added to this function
         recently
      
      * tag 'iommu-fixes-v5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu/amd: Check PM_LEVEL_SIZE() condition in locked section
        iommu/amd: Fix incorrect PASID decoding from event log
        iommu/ipmmu-vmsa: Only call platform_get_irq() when interrupt is mandatory
        iommu/rockchip: Don't use platform_get_irq to implicitly count irqs
        iommu/io-pgtable-arm: Support all Mali configurations
        iommu/io-pgtable-arm: Correct Mali attributes
        iommu/arm-smmu: Free context bitmap in the err path of arm_smmu_init_domain_context
    • Merge tag 'copy-struct-from-user-v5.4-rc4' of... · 8eb4b3b0
      Linus Torvalds authored
      Merge tag 'copy-struct-from-user-v5.4-rc4' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux
      
      Pull usercopy test fixlets from Christian Brauner:
       "This contains two improvements for the copy_struct_from_user() tests:
      
         - a coding style change to get rid of the ugly "if ((ret |= test()))"
           pointed out when pulling the original patchset.
      
         - avoid soft lockups when running the usercopy tests on machines
           with large page sizes by scanning only a 1024 byte region"
      
      * tag 'copy-struct-from-user-v5.4-rc4' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux:
        usercopy: Avoid soft lockups in test_check_nonzero_user()
        lib: test_user_copy: style cleanup
    • Merge tag 'mmc-v5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · 7571438a
      Linus Torvalds authored
      Pull MMC fixes from Ulf Hansson:
       "MMC host:
         - sdhci-iproc: Prevent some spurious interrupts
         - renesas_sdhi/sh_mmcif: Avoid false warnings about IRQs not found
      
        MEMSTICK host:
         - jmb38x_ms: Fix an error handling path at ->probe()"
      
      * tag 'mmc-v5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
        memstick: jmb38x_ms: Fix an error handling path in 'jmb38x_ms_probe()'
        mmc: sdhci-iproc: fix spurious interrupts on Multiblock reads with bcm2711
        mmc: sh_mmcif: Use platform_get_irq_optional() for optional interrupt
        mmc: renesas_sdhi: Do not use platform_get_irq() to count interrupts
    • Merge tag 'sound-5.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 5f93393a
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "Just a few small fixes for the usual suspect, HD- and USB-audio:
        enablement of runtime PM for Nvidia due to the recent PCI changes, a
        fix for potential hangs with recent HD-audio platforms, and the rest
        device-specific quirks"
      
      * tag 'sound-5.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda - Force runtime PM on Nvidia HDMI codecs
        ALSA: hda/realtek - Enable headset mic on Asus MJ401TA
        ALSA: usb-audio: Disable quirks for BOSS Katana amplifiers
        ALSA: hdac: clear link output stream mapping
        ALSA: hda/realtek: Reduce the Headphone static noise on XPS 9350/9360
    • Merge tag 'acpi-5.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · adca4ce3
      Linus Torvalds authored
      Pull ACPI fixes from Rafael Wysocki:
       "Fix possible use-after-free in the ACPI CPPC support code (John Garry)
        and prevent the ACPI HMAT parsing code from using possibly incorrect
        data coming from the platform firmware (Daniel Black)"
      
      * tag 'acpi-5.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI: CPPC: Set pcc_data[pcc_ss_id] to NULL in acpi_cppc_processor_exit()
        ACPI: HMAT: ACPI_HMAT_MEMORY_PD_VALID is deprecated since ACPI-6.3
    • Merge tag 'pm-5.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · e59b76ff
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
       "These include a fix for a recent regression in the ACPI CPU
        performance scaling code, a PCI device power management fix,
        a system shutdown fix related to cpufreq, a removal of an ACPI
        suspend-to-idle blacklist entry and a build warning fix.

        Specifics:
      
         - Fix possible NULL pointer dereference in the ACPI processor scaling
           initialization code introduced by a recent cpufreq update (Rafael
           Wysocki).
      
         - Fix possible deadlock due to suspending cpufreq too late during
           system shutdown (Rafael Wysocki).
      
         - Make the PCI device system resume code path be more consistent with
           its PM-runtime counterpart to fix an issue with missing delay on
           transitions from D3cold to D0 during system resume from
           suspend-to-idle on some systems (Rafael Wysocki).
      
         - Drop Dell XPS13 9360 from the LPS0 Idle _DSM blacklist to make it
           use suspend-to-idle by default (Mario Limonciello).
      
         - Fix build warning in the core system suspend support code (Ben
           Dooks)"
      
      * tag 'pm-5.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI: processor: Avoid NULL pointer dereferences at init time
        PCI: PM: Fix pci_power_up()
        PM: sleep: include <linux/pm_runtime.h> for pm_wq
        cpufreq: Avoid cpufreq_suspend() deadlock on system shutdown
        ACPI: PM: Drop Dell XPS13 9360 from LPS0 Idle _DSM blacklist
    • Merge tag 'mkp-scsi-postmerge' of git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi · c3419fd6
      Linus Torvalds authored
      Pull scsi fixes from Martin Petersen:
       "These two commits were in a separate postmerge branch due to a
        dependency on changes merged for 5.4 in the block tree.
      
        They fix two issues in the intersection of the request cleanup changes
        from block (b7e9e1fb) and the request batching changes
        (8930a6c2) that were made to SCSI during the 5.4 cycle"
      
      * tag 'mkp-scsi-postmerge' of git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi:
        scsi: core: fix dh and multipathing for SCSI hosts without request batching
        scsi: core: fix missing .cleanup_rq for SCSI hosts without request batching
    • iommu/amd: Check PM_LEVEL_SIZE() condition in locked section · 46ac18c3
      Joerg Roedel authored
      The increase_address_space() function has to check the PM_LEVEL_SIZE()
      condition again under the domain->lock to avoid a false trigger of the
      WARN_ON_ONCE() and to avoid that the address space is increased more
      often than necessary.
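
      The pattern can be sketched in userspace as follows.  This is an
      illustrative model with a pthread mutex, not the driver code: the
      model_* names, the six-level limit and the 12 + 9 * level size
      formula are assumptions made for the sketch.  The point is only that
      the size check is repeated after the lock is taken, so a caller that
      lost the race does not grow the address space a second time:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_LEVEL 6	/* assumption: at most six page-table levels */

struct model_domain {
	pthread_mutex_t lock;
	int level;		/* current page-table depth */
};

/* Hypothetical analogue of PM_LEVEL_SIZE(): highest address at `level`. */
unsigned long long model_level_size(int level)
{
	return level < MODEL_MAX_LEVEL ?
		(1ULL << (12 + 9 * level)) - 1 : ~0ULL;
}

/* Returns true only if this call actually added a level. */
bool model_increase_address_space(struct model_domain *d,
				  unsigned long long address)
{
	bool grown = false;

	assert(d != NULL);

	pthread_mutex_lock(&d->lock);
	/*
	 * Re-check under the lock: a concurrent caller may already have
	 * grown the address space far enough to cover `address`.
	 */
	if (address > model_level_size(d->level) &&
	    d->level < MODEL_MAX_LEVEL) {
		d->level++;
		grown = true;
	}
	pthread_mutex_unlock(&d->lock);

	return grown;
}
```

      A second call with the same address finds the condition false under
      the lock and leaves the level unchanged.
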
      Reported-by: Qian Cai <cai@lca.pw>
      Fixes: 754265bc ("iommu/amd: Fix race in increase_address_space()")
      Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
    • Merge branch 'acpi-tables' · ffba17bb
      Rafael J. Wysocki authored
      * acpi-tables:
        ACPI: HMAT: ACPI_HMAT_MEMORY_PD_VALID is deprecated since ACPI-6.3
    • ACPI: CPPC: Set pcc_data[pcc_ss_id] to NULL in acpi_cppc_processor_exit() · 56a0b978
      John Garry authored
      When enabling KASAN and DEBUG_TEST_DRIVER_REMOVE, I find this KASAN
      warning:
      
      [   20.872057] BUG: KASAN: use-after-free in pcc_data_alloc+0x40/0xb8
      [   20.878226] Read of size 4 at addr ffff00236cdeb684 by task swapper/0/1
      [   20.884826]
      [   20.886309] CPU: 19 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc1-00009-ge7f7df3db5bf-dirty #289
      [   20.894994] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.16.01 03/15/2019
      [   20.903505] Call trace:
      [   20.905942]  dump_backtrace+0x0/0x200
      [   20.909593]  show_stack+0x14/0x20
      [   20.912899]  dump_stack+0xd4/0x130
      [   20.916291]  print_address_description.isra.9+0x6c/0x3b8
      [   20.921592]  __kasan_report+0x12c/0x23c
      [   20.925417]  kasan_report+0xc/0x18
      [   20.928808]  __asan_load4+0x94/0xb8
      [   20.932286]  pcc_data_alloc+0x40/0xb8
      [   20.935938]  acpi_cppc_processor_probe+0x4e8/0xb08
      [   20.940717]  __acpi_processor_start+0x48/0xb0
      [   20.945062]  acpi_processor_start+0x40/0x60
      [   20.949235]  really_probe+0x118/0x548
      [   20.952887]  driver_probe_device+0x7c/0x148
      [   20.957059]  device_driver_attach+0x94/0xa0
      [   20.961231]  __driver_attach+0xa4/0x110
      [   20.965055]  bus_for_each_dev+0xe8/0x158
      [   20.968966]  driver_attach+0x30/0x40
      [   20.972531]  bus_add_driver+0x234/0x2f0
      [   20.976356]  driver_register+0xbc/0x1d0
      [   20.980182]  acpi_processor_driver_init+0x40/0xe4
      [   20.984875]  do_one_initcall+0xb4/0x254
      [   20.988700]  kernel_init_freeable+0x24c/0x2f8
      [   20.993047]  kernel_init+0x10/0x118
      [   20.996524]  ret_from_fork+0x10/0x18
      [   21.000087]
      [   21.001567] Allocated by task 1:
      [   21.004785]  save_stack+0x28/0xc8
      [   21.008089]  __kasan_kmalloc.isra.9+0xbc/0xd8
      [   21.012435]  kasan_kmalloc+0xc/0x18
      [   21.015913]  pcc_data_alloc+0x94/0xb8
      [   21.019564]  acpi_cppc_processor_probe+0x4e8/0xb08
      [   21.024343]  __acpi_processor_start+0x48/0xb0
      [   21.028689]  acpi_processor_start+0x40/0x60
      [   21.032860]  really_probe+0x118/0x548
      [   21.036512]  driver_probe_device+0x7c/0x148
      [   21.040684]  device_driver_attach+0x94/0xa0
      [   21.044855]  __driver_attach+0xa4/0x110
      [   21.048680]  bus_for_each_dev+0xe8/0x158
      [   21.052591]  driver_attach+0x30/0x40
      [   21.056155]  bus_add_driver+0x234/0x2f0
      [   21.059980]  driver_register+0xbc/0x1d0
      [   21.063805]  acpi_processor_driver_init+0x40/0xe4
      [   21.068497]  do_one_initcall+0xb4/0x254
      [   21.072322]  kernel_init_freeable+0x24c/0x2f8
      [   21.076667]  kernel_init+0x10/0x118
      [   21.080144]  ret_from_fork+0x10/0x18
      [   21.083707]
      [   21.085186] Freed by task 1:
      [   21.088056]  save_stack+0x28/0xc8
      [   21.091360]  __kasan_slab_free+0x118/0x180
      [   21.095445]  kasan_slab_free+0x10/0x18
      [   21.099183]  kfree+0x80/0x268
      [   21.102139]  acpi_cppc_processor_exit+0x1a8/0x1b8
      [   21.106832]  acpi_processor_stop+0x70/0x80
      [   21.110917]  really_probe+0x174/0x548
      [   21.114568]  driver_probe_device+0x7c/0x148
      [   21.118740]  device_driver_attach+0x94/0xa0
      [   21.122912]  __driver_attach+0xa4/0x110
      [   21.126736]  bus_for_each_dev+0xe8/0x158
      [   21.130648]  driver_attach+0x30/0x40
      [   21.134212]  bus_add_driver+0x234/0x2f0
      [   21.0x10/0x18
      [   21.161764]
      [   21.163244] The buggy address belongs to the object at ffff00236cdeb600
      [   21.163244]  which belongs to the cache kmalloc-256 of size 256
      [   21.175750] The buggy address is located 132 bytes inside of
      [   21.175750]  256-byte region [ffff00236cdeb600, ffff00236cdeb700)
      [   21.187473] The buggy address belongs to the page:
      [   21.192254] page:fffffe008d937a00 refcount:1 mapcount:0 mapping:ffff002370c0fa00 index:0x0 compound_mapcount: 0
      [   21.202331] flags: 0x1ffff00000010200(slab|head)
      [   21.206940] raw: 1ffff00000010200 dead000000000100 dead000000000122 ffff002370c0fa00
      [   21.214671] raw: 0000000000000000 00000000802a002a 00000001ffffffff 0000000000000000
      [   21.222400] page dumped because: kasan: bad access detected
      [   21.227959]
      [   21.229438] Memory state around the buggy address:
      [   21.234218]  ffff00236cdeb580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [   21.241427]  ffff00236cdeb600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [   21.248637] >ffff00236cdeb680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [   21.255845]                    ^
      [   21.259062]  ffff00236cdeb700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [   21.266272]  ffff00236cdeb780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [   21.273480] ==================================================================
      
      It seems that global pcc_data[pcc_ss_id] can be freed in
      acpi_cppc_processor_exit(), but we may later reference this value, so
      NULLify it when freed.
      
      Also remove the useless setting of data "pcc_channel_acquired", which
      we're about to free.
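
      A small userspace sketch of why the pointer has to be cleared.  The
      model_* names, the table size and the refcount field are hypothetical
      and only loosely mirror the driver; the idea is that the alloc path
      decides between "reuse" and "allocate" by testing the global slot, so
      a stale pointer left behind after the free would be dereferenced:

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_MAX_SUBSPACES 4	/* hypothetical table size */

struct model_pcc_data {
	int refcount;
};

/* Global per-subspace table, loosely modelled on pcc_data[]. */
struct model_pcc_data *model_table[MODEL_MAX_SUBSPACES];

/* Take a reference on slot `id`, allocating it on first use. */
int model_pcc_data_alloc(int id)
{
	assert(id >= 0 && id < MODEL_MAX_SUBSPACES);

	if (model_table[id]) {
		/* This read is the one that hits freed memory if the
		 * slot was not cleared when it was released. */
		model_table[id]->refcount++;
		return 0;
	}

	model_table[id] = calloc(1, sizeof(*model_table[id]));
	if (!model_table[id])
		return -1;
	model_table[id]->refcount = 1;
	return 0;
}

/* Drop a reference; free and clear the slot when it reaches zero. */
void model_pcc_data_put(int id)
{
	assert(id >= 0 && id < MODEL_MAX_SUBSPACES);

	if (!model_table[id])
		return;

	if (--model_table[id]->refcount == 0) {
		free(model_table[id]);
		model_table[id] = NULL;	/* the step the fix adds */
	}
}
```

      After the last put, a later alloc on the same slot sees NULL and
      allocates a fresh object instead of touching the freed one.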
      
      Fixes: 85b1407b ("ACPI / CPPC: Make CPPC ACPI driver aware of PCC subspace IDs")
      Signed-off-by: John Garry <john.garry@huawei.com>
      Cc: 4.15+ <stable@vger.kernel.org> # 4.15+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • Merge branches 'pm-cpufreq' and 'pm-sleep' · b23eb5c7
      Rafael J. Wysocki authored
      * pm-cpufreq:
        ACPI: processor: Avoid NULL pointer dereferences at init time
        cpufreq: Avoid cpufreq_suspend() deadlock on system shutdown
      
      * pm-sleep:
        PM: sleep: include <linux/pm_runtime.h> for pm_wq
        ACPI: PM: Drop Dell XPS13 9360 from LPS0 Idle _DSM blacklist
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 0e2adab6
      Linus Torvalds authored
      Pull arm64 fixes from Will Deacon:
       "The main thing here is a long-awaited workaround for a CPU erratum on
        ThunderX2 which we have developed in conjunction with engineers from
        Cavium/Marvell.
      
        At the moment, the workaround is unconditionally enabled for affected
        CPUs at runtime but we may add a command-line option to disable it in
        future if performance numbers show up indicating a significant cost
        for real workloads.
      
        Summary:
      
         - Work around Cavium/Marvell ThunderX2 erratum #219
      
         - Fix regression in mlock() ABI caused by sign-extension of TTBR1 addresses
      
         - More fixes to the spurious kernel fault detection logic
      
         - Fix pathological preemption race when enabling some CPU features at boot
      
         - Drop broken kcore macros in favour of generic implementations
      
         - Fix userspace view of ID_AA64ZFR0_EL1 when SVE is disabled
      
         - Avoid NULL dereference on allocation failure during hibernation"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: tags: Preserve tags for addresses translated via TTBR1
        arm64: mm: fix inverted PAR_EL1.F check
        arm64: sysreg: fix incorrect definition of SYS_PAR_EL1_F
        arm64: entry.S: Do not preempt from IRQ before all cpufeatures are enabled
        arm64: hibernate: check pgd table allocation
        arm64: cpufeature: Treat ID_AA64ZFR0_EL1 as RAZ when SVE is not enabled
        arm64: Fix kcore macros after 52-bit virtual addressing fallout
        arm64: Allow CAVIUM_TX2_ERRATUM_219 to be selected
        arm64: Avoid Cavium TX2 erratum 219 when switching TTBR
        arm64: Enable workaround for Cavium TX2 erratum 219 when running SMT
        arm64: KVM: Trap VM ops when ARM64_WORKAROUND_CAVIUM_TX2_219_TVM is set
  3. 17 Oct, 2019 15 commits
  4. 16 Oct, 2019 8 commits
    • drm/i915: Fixup preempt-to-busy vs resubmission of a virtual request · 0a544a2a
      Chris Wilson authored
      As preempt-to-busy leaves the request on the HW as the resubmission is
      processed, that request may complete in the background and even cause a
      second virtual request to enter queue. This second virtual request
      breaks our "single request in the virtual pipeline" assumptions.
      Furthermore, as the virtual request may be completed and retired, we
      lose the reference the virtual engine assumes is held. Normally, just
      removing the request from the scheduler queue removes it from the
      engine, but the virtual engine keeps track of its singleton request via
      its ve->request. This pointer needs protecting with a reference.
      
      v2: Drop unnecessary motion of rq->engine = owner
      
      Fixes: 22b7a426 ("drm/i915/execlists: Preempt-to-busy")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190923152844.8914-1-chris@chris-wilson.co.uk
      (cherry picked from commit b647c7df)
      Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
    • drm/i915/userptr: Never allow userptr into the mappable GGTT · 4f2a572e
      Chris Wilson authored
      Daniel Vetter uncovered a nasty cycle in using the mmu-notifiers to
      invalidate userptr objects which also happen to be pulled into GGTT
      mmaps. That is when we unbind the userptr object (on mmu invalidation),
      we revoke all CPU mmaps, which may then recurse into mmu invalidation.
      
      We looked for ways of breaking the cycle, but the revocation on
      invalidation is required and cannot be avoided. The only solution we
      could see was to not allow such GGTT bindings of userptr objects in the
      first place. In practice, no one really wants to use a GGTT mmapping of
      a CPU pointer...
      
      Just before Daniel's explosive lockdep patches land in v5.4-rc1, we got
      a genuine blip from CI:
      
      <4>[  246.793958] ======================================================
      <4>[  246.793972] WARNING: possible circular locking dependency detected
      <4>[  246.793989] 5.3.0-gbd6c56f50d15-drmtip_372+ #1 Tainted: G     U
      <4>[  246.794003] ------------------------------------------------------
      <4>[  246.794017] kswapd0/145 is trying to acquire lock:
      <4>[  246.794030] 000000003f565be6 (&dev->struct_mutex/1){+.+.}, at: userptr_mn_invalidate_range_start+0x18f/0x220 [i915]
      <4>[  246.794250]
                        but task is already holding lock:
      <4>[  246.794263] 000000001799cef9 (&anon_vma->rwsem){++++}, at: page_lock_anon_vma_read+0xe6/0x2a0
      <4>[  246.794291]
                        which lock already depends on the new lock.
      
      <4>[  246.794307]
                        the existing dependency chain (in reverse order) is:
      <4>[  246.794322]
                        -> #3 (&anon_vma->rwsem){++++}:
      <4>[  246.794344]        down_write+0x33/0x70
      <4>[  246.794357]        __vma_adjust+0x3d9/0x7b0
      <4>[  246.794370]        __split_vma+0x16a/0x180
      <4>[  246.794385]        mprotect_fixup+0x2a5/0x320
      <4>[  246.794399]        do_mprotect_pkey+0x208/0x2e0
      <4>[  246.794413]        __x64_sys_mprotect+0x16/0x20
      <4>[  246.794429]        do_syscall_64+0x55/0x1c0
      <4>[  246.794443]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      <4>[  246.794456]
                        -> #2 (&mapping->i_mmap_rwsem){++++}:
      <4>[  246.794478]        down_write+0x33/0x70
      <4>[  246.794493]        unmap_mapping_pages+0x48/0x130
      <4>[  246.794519]        i915_vma_revoke_mmap+0x81/0x1b0 [i915]
      <4>[  246.794519]        i915_vma_unbind+0x11d/0x4a0 [i915]
      <4>[  246.794519]        i915_vma_destroy+0x31/0x300 [i915]
      <4>[  246.794519]        __i915_gem_free_objects+0xb8/0x4b0 [i915]
      <4>[  246.794519]        drm_file_free.part.0+0x1e6/0x290
      <4>[  246.794519]        drm_release+0xa6/0xe0
      <4>[  246.794519]        __fput+0xc2/0x250
      <4>[  246.794519]        task_work_run+0x82/0xb0
      <4>[  246.794519]        do_exit+0x35b/0xdb0
      <4>[  246.794519]        do_group_exit+0x34/0xb0
      <4>[  246.794519]        __x64_sys_exit_group+0xf/0x10
      <4>[  246.794519]        do_syscall_64+0x55/0x1c0
      <4>[  246.794519]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      <4>[  246.794519]
                        -> #1 (&vm->mutex){+.+.}:
      <4>[  246.794519]        i915_gem_shrinker_taints_mutex+0x6d/0xe0 [i915]
      <4>[  246.794519]        i915_address_space_init+0x9f/0x160 [i915]
      <4>[  246.794519]        i915_ggtt_init_hw+0x55/0x170 [i915]
      <4>[  246.794519]        i915_driver_probe+0xc9f/0x1620 [i915]
      <4>[  246.794519]        i915_pci_probe+0x43/0x1b0 [i915]
      <4>[  246.794519]        pci_device_probe+0x9e/0x120
      <4>[  246.794519]        really_probe+0xea/0x3d0
      <4>[  246.794519]        driver_probe_device+0x10b/0x120
      <4>[  246.794519]        device_driver_attach+0x4a/0x50
      <4>[  246.794519]        __driver_attach+0x97/0x130
      <4>[  246.794519]        bus_for_each_dev+0x74/0xc0
      <4>[  246.794519]        bus_add_driver+0x13f/0x210
      <4>[  246.794519]        driver_register+0x56/0xe0
      <4>[  246.794519]        do_one_initcall+0x58/0x300
      <4>[  246.794519]        do_init_module+0x56/0x1f6
      <4>[  246.794519]        load_module+0x25bd/0x2a40
      <4>[  246.794519]        __se_sys_finit_module+0xd3/0xf0
      <4>[  246.794519]        do_syscall_64+0x55/0x1c0
      <4>[  246.794519]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      <4>[  246.794519]
                        -> #0 (&dev->struct_mutex/1){+.+.}:
      <4>[  246.794519]        __lock_acquire+0x15d8/0x1e90
      <4>[  246.794519]        lock_acquire+0xa6/0x1c0
      <4>[  246.794519]        __mutex_lock+0x9d/0x9b0
      <4>[  246.794519]        userptr_mn_invalidate_range_start+0x18f/0x220 [i915]
      <4>[  246.794519]        __mmu_notifier_invalidate_range_start+0x85/0x110
      <4>[  246.794519]        try_to_unmap_one+0x76b/0x860
      <4>[  246.794519]        rmap_walk_anon+0x104/0x280
      <4>[  246.794519]        try_to_unmap+0xc0/0xf0
      <4>[  246.794519]        shrink_page_list+0x561/0xc10
      <4>[  246.794519]        shrink_inactive_list+0x220/0x440
      <4>[  246.794519]        shrink_node_memcg+0x36e/0x740
      <4>[  246.794519]        shrink_node+0xcb/0x490
      <4>[  246.794519]        balance_pgdat+0x241/0x580
      <4>[  246.794519]        kswapd+0x16c/0x530
      <4>[  246.794519]        kthread+0x119/0x130
      <4>[  246.794519]        ret_from_fork+0x24/0x50
      <4>[  246.794519]
                        other info that might help us debug this:
      
      <4>[  246.794519] Chain exists of:
                          &dev->struct_mutex/1 --> &mapping->i_mmap_rwsem --> &anon_vma->rwsem
      
      <4>[  246.794519]  Possible unsafe locking scenario:
      
      <4>[  246.794519]        CPU0                    CPU1
      <4>[  246.794519]        ----                    ----
      <4>[  246.794519]   lock(&anon_vma->rwsem);
      <4>[  246.794519]                                lock(&mapping->i_mmap_rwsem);
      <4>[  246.794519]                                lock(&anon_vma->rwsem);
      <4>[  246.794519]   lock(&dev->struct_mutex/1);
      <4>[  246.794519]
                         *** DEADLOCK ***
      
      v2: Say no to mmap_ioctl
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111744
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111870
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: stable@vger.kernel.org
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190928082546.3473-1-chris@chris-wilson.co.uk
      (cherry picked from commit a4311745)
      Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
      4f2a572e
    • Ville Syrjälä's avatar
      drm/i915: Favor last VBT child device with conflicting AUX ch/DDC pin · 0336ab58
      Ville Syrjälä authored
      The first come first served approach to handling the VBT
      child device AUX ch conflicts has backfired. We have machines
      in the wild where the VBT specifies both port A eDP and
      port E DP (in that order) with port E being the real one.
      
      So let's try to flip the preference around and let the last
      child device win once again.
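The change in preference can be illustrated with a minimal sketch. This is not the i915 code: `struct child_dev` and both helper functions are hypothetical names invented here, and the real driver works on parsed VBT data rather than a flat array. The sketch only shows how the owner of a contested AUX channel differs between the two policies.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a VBT child device entry. */
struct child_dev {
    char port;   /* e.g. 'A' or 'E' */
    int aux_ch;  /* AUX channel the VBT claims for this port */
};

/* First come, first served: the earliest claimant keeps the AUX channel. */
static char aux_owner_first_wins(const struct child_dev *devs, size_t n,
                                 int aux_ch)
{
    for (size_t i = 0; i < n; i++)
        if (devs[i].aux_ch == aux_ch)
            return devs[i].port;
    return '\0'; /* nobody claims this channel */
}

/* Flipped preference: each later claimant overrides the earlier one. */
static char aux_owner_last_wins(const struct child_dev *devs, size_t n,
                                int aux_ch)
{
    char owner = '\0';

    for (size_t i = 0; i < n; i++)
        if (devs[i].aux_ch == aux_ch)
            owner = devs[i].port;
    return owner;
}
```

With a VBT that lists port A eDP and then port E DP on the same AUX channel, the first policy picks port A and the second picks port E, which is the real port on the affected machines.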
      
      Cc: stable@vger.kernel.org
      Cc: Jani Nikula <jani.nikula@intel.com>
      Tested-by: Masami Ichikawa <masami256@gmail.com>
      Tested-by: Torsten <freedesktop201910@liggy.de>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111966
      Fixes: 36a0f920 ("drm/i915/bios: make child device order the priority order")
      Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20191011202030.8829-1-ville.syrjala@linux.intel.com
      Acked-by: Jani Nikula <jani.nikula@intel.com>
      (cherry picked from commit 41e35ffb)
      Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
      0336ab58
    • Chris Wilson's avatar
      drm/i915/execlists: Refactor -EIO markup of hung requests · 128260a4
      Chris Wilson authored
      Pull setting -EIO on the hung requests into its own utility function.
      Having allowed ourselves to short-circuit submission of completed
      requests, we can now do the mark_eio() prior to submission and avoid
      some redundant operations.

      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190923110056.15176-4-chris@chris-wilson.co.uk
      (cherry picked from commit 0d7cf7bc)
      Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
      128260a4
    • Will Deacon's avatar
      arm64: tags: Preserve tags for addresses translated via TTBR1 · 597399d0
      Will Deacon authored
      Sign-extending TTBR1 addresses when converting to an untagged address
      breaks the documented POSIX semantics for mlock() in some obscure error
      cases where we end up returning -EINVAL instead of -ENOMEM as a direct
      result of rewriting the upper address bits.
      
      Rework the untagged_addr() macro to preserve the upper address bits for
      TTBR1 addresses and only clear the tag bits for user addresses. This
      matches the behaviour of the 'clear_address_tag' assembly macro, so
      rename that and align the implementations at the same time so that they
      use the same instruction sequences for the tag manipulation.
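The difference between the two untagging strategies can be sketched in user-space C. This is an illustration, not the kernel macro: it assumes the tag lives in bits 63:56 and that bit 55 selects between a user (TTBR0) and a kernel (TTBR1) address, and the function names are made up for this example.

```c
#include <stdint.h>

/*
 * Sign-extend a 64-bit value from bit `index`. Relies on arithmetic right
 * shift of signed values, which GCC and Clang provide.
 */
static inline int64_t sign_extend64(uint64_t value, int index)
{
    int shift = 63 - index;

    return (int64_t)(value << shift) >> shift;
}

/* Plain sign extension from bit 55: rewrites bits 63:56 for every address. */
static inline uint64_t untag_sign_extend(uint64_t addr)
{
    return (uint64_t)sign_extend64(addr, 55);
}

/*
 * Preserving variant: AND the address with its sign-extended form.
 * If bit 55 is 0 (user address) the mask has bits 63:56 clear, so the tag
 * is removed. If bit 55 is 1 (TTBR1 address) the mask has bits 63:56 set,
 * so the upper bits are left exactly as they were.
 */
static inline uint64_t untag_preserve_ttbr1(uint64_t addr)
{
    return addr & (uint64_t)sign_extend64(addr, 55);
}
```

For a tagged user pointer both variants return the same untagged address. For an address with bit 55 set, only the second variant returns the caller's original upper bits, so later range checks see the value that was actually passed in.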
      
      Link: https://lore.kernel.org/stable/20191014162651.GF19200@arrakis.emea.arm.com/
      Reported-by: Jan Stancek <jstancek@redhat.com>
      Tested-by: Jan Stancek <jstancek@redhat.com>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Tested-by: Catalin Marinas <catalin.marinas@arm.com>
      Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      597399d0
    • Mark Rutland's avatar
      arm64: mm: fix inverted PAR_EL1.F check · 38137335
      Mark Rutland authored
      When detecting a spurious EL1 translation fault, we have the CPU retry
      the translation using an AT S1E1R instruction, and inspect PAR_EL1 to
      determine if the fault was spurious.
      
      When PAR_EL1.F == 0, the AT instruction successfully translated the
      address without a fault, which implies the original fault was spurious.
      However, in this case we return false and treat the original fault as if
      it was not spurious.
      
      Invert the return value so that we treat such a case as spurious.
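The intended check can be sketched as follows. This is a simplified illustration rather than the kernel function: the names are hypothetical, only the PAR_EL1.F test is modelled, and any further inspection of the fault status that the real code performs when F == 1 is left out. F is bit 0 of PAR_EL1.

```c
#include <stdbool.h>
#include <stdint.h>

/* PAR_EL1.F: set when the AT instruction's translation aborted. */
#define PAR_EL1_F (UINT64_C(1) << 0)

/*
 * Hypothetical helper: true when the retried translation succeeded
 * (F == 0), which implies the original translation fault was spurious.
 */
static bool retried_translation_succeeded(uint64_t par)
{
    return (par & PAR_EL1_F) == 0;
}
```

The inverted version of this test returned false for exactly the F == 0 case, so faults that had already resolved themselves were handled as genuine.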
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Fixes: 42f91093 ("arm64: mm: Ignore spurious translation faults taken from the kernel")
      Tested-by: James Morse <james.morse@arm.com>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      38137335
    • Yang Yingliang's avatar
      arm64: sysreg: fix incorrect definition of SYS_PAR_EL1_F · 29a0f5ad
      Yang Yingliang authored
      The 'F' field of the PAR_EL1 register lives in bit 0, not bit 1.
      Fix the broken definition in 'sysreg.h'.
      
      Fixes: e8620cff ("arm64: sysreg: Add some field definitions for PAR_EL1")
      Reviewed-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      29a0f5ad
    • Julien Thierry's avatar
      arm64: entry.S: Do not preempt from IRQ before all cpufeatures are enabled · 19c95f26
      Julien Thierry authored
      Preempting from IRQ-return means that the task has its PSTATE saved
      on the stack, which will get restored when the task is resumed and does
      the actual IRQ return.
      
      However, enabling some CPU features requires modifying the PSTATE. This
      means that, if a task was scheduled out during an IRQ-return before all
      CPU features are enabled, the task might restore a PSTATE that does not
      include the feature enablement changes once scheduled back in.
      
      * Task 1:
      
      PAN == 0 ---|                          |---------------
                  |                          |<- return from IRQ, PSTATE.PAN = 0
                  | <- IRQ                   |
                  +--------+ <- preempt()  +--
                                           ^
                                           |
                                           reschedule Task 1, PSTATE.PAN == 1
      * Init:
              --------------------+------------------------
                                  ^
                                  |
                                  enable_cpu_features
                                  set PSTATE.PAN on all CPUs
      
      Worse than this, since PSTATE is untouched when task switching is done,
      a task missing the new bits in PSTATE might affect another task, if both
      do direct calls to schedule() (outside of IRQ/exception contexts).
      
      Fix this by preventing preemption on IRQ-return until features are
      enabled on all CPUs.
      
      This way the only PSTATE values that are saved on the stack are from
      synchronous exceptions. These are expected to be fatal this early. The
      exception is BRK for WARN_ON(), but as this uses do_debug_exception(),
      which keeps IRQs masked, it shouldn't call schedule().
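The gating described above can be sketched in plain C. This is a model, not the arm64 entry code: the flag, `finalize_cpufeatures()` and `may_preempt_on_irq_return()` are hypothetical names, and a plain boolean stands in for whatever mechanism the kernel uses to record that all CPU features are enabled.

```c
#include <stdbool.h>

/* Hypothetical flag: set once every CPU has enabled its features. */
static bool cpufeatures_finalized;

/* Called by the init path after enabling features on all CPUs. */
static void finalize_cpufeatures(void)
{
    cpufeatures_finalized = true;
}

/*
 * Decide whether the IRQ-return path may preempt the current task.
 * Until features are finalized the answer is always no, so no task can be
 * scheduled out with a stale PSTATE saved on its stack.
 */
static bool may_preempt_on_irq_return(int preempt_count, bool need_resched)
{
    if (!cpufeatures_finalized)
        return false;
    return preempt_count == 0 && need_resched;
}
```

Once the flag is set the usual conditions apply again, so the gate costs nothing beyond a single extra check on the IRQ-return path.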
      Signed-off-by: Julien Thierry <julien.thierry@arm.com>
      [james: Replaced a really cool hack, with an even simpler static key in C.
       expanded commit message with Julien's cover-letter ascii art]
      Signed-off-by: James Morse <james.morse@arm.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      19c95f26