1. 22 Jun, 2019 36 commits
    • Douglas Anderson's avatar
      clk: rockchip: Turn on "aclk_dmac1" for suspend on rk3288 · 72a7c442
      Douglas Anderson authored
      [ Upstream commit 57a20248 ]
      
      Experimentally it can be seen that going into deep sleep (specifically
      setting PMU_CLR_DMA and PMU_CLR_BUS in RK3288_PMU_PWRMODE_CON1)
      appears to fail unless "aclk_dmac1" is on.  The failure is that the
      system never signals that it made it into suspend on the GLOBAL_PWROFF
      pin and it just hangs.
      
      NOTE that it's confirmed that it's the actual suspend that fails, not
      one of the earlier calls to read/write registers.  Specifically if you
      comment out the "PMU_GLOBAL_INT_DISABLE" setting in
      rk3288_slp_mode_set() and then comment out the "cpu_do_idle()" call in
      rockchip_lpmode_enter() then you can exercise the whole suspend path
      without any crashing.
      
      This is currently not a problem with suspend upstream because there is
      no current way to exercise the deep suspend code.  However, anyone
      trying to make it work will run into this issue.
      
      This was not a problem on shipping rk3288-based Chromebooks because
      those devices all ran on an old kernel based on 3.14.  On that kernel
      "aclk_dmac1" appears to be left on all the time.
      
      There are several ways to skin this problem.
      
      A) We could add "aclk_dmac1" to the list of critical clocks and that
      apperas to work, but presumably that wastes power.
      
      B) We could keep a list of "struct clk" objects to enable at suspend
      time in clk-rk3288.c and use the standard clock APIs.
      
      C) We could make the rk3288-pmu driver keep a list of clocks to enable
      at suspend time.  Presumably this would require a dts and bindings
      change.
      
      D) We could just whack the clock on in the existing syscore suspend
      function where we whack a bunch of other clocks.  This is particularly
      easy because we know for sure that the clock's only parent
      ("aclk_cpu") is a critical clock so we don't need to do anything more
      than ungate it.
      
      In this case I have chosen D) because it seemed like the least work,
      but any of the other options would presumably also work fine.
      Signed-off-by: default avatarDouglas Anderson <dianders@chromium.org>
      Reviewed-by: default avatarElaine Zhang <zhangqing@rock-chips.com>
      Signed-off-by: default avatarHeiko Stuebner <heiko@sntech.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      72a7c442
    • Nathan Chancellor's avatar
      soc: mediatek: pwrap: Zero initialize rdata in pwrap_init_cipher · 5a1de21c
      Nathan Chancellor authored
      [ Upstream commit 89e28da8 ]
      
      When building with -Wsometimes-uninitialized, Clang warns:
      
      drivers/soc/mediatek/mtk-pmic-wrap.c:1358:6: error: variable 'rdata' is
      used uninitialized whenever '||' condition is true
      [-Werror,-Wsometimes-uninitialized]
      
      If pwrap_write returns non-zero, pwrap_read will not be called to
      initialize rdata, meaning that we will use some random uninitialized
      stack value in our print statement. Zero initialize rdata in case this
      happens.
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/401Signed-off-by: default avatarNathan Chancellor <natechancellor@gmail.com>
      Reviewed-by: default avatarNick Desaulniers <ndesaulniers@google.com>
      Reviewed-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarMatthias Brugger <matthias.bgg@gmail.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      5a1de21c
    • Enrico Granata's avatar
      platform/chrome: cros_ec_proto: check for NULL transfer function · 9823dc87
      Enrico Granata authored
      [ Upstream commit 94d4e7af ]
      
      As new transfer mechanisms are added to the EC codebase, they may
      not support v2 of the EC protocol.
      
      If the v3 initial handshake transfer fails, the kernel will try
      and call cmd_xfer as a fallback. If v2 is not supported, cmd_xfer
      will be NULL, and the code will end up causing a kernel panic.
      
      Add a check for NULL before calling the transfer function, along
      with a helpful comment explaining how one might end up in this
      situation.
      Signed-off-by: default avatarEnrico Granata <egranata@chromium.org>
      Reviewed-by: default avatarJett Rink <jettrink@chromium.org>
      Signed-off-by: default avatarEnric Balletbo i Serra <enric.balletbo@collabora.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      9823dc87
    • Wenwen Wang's avatar
      x86/PCI: Fix PCI IRQ routing table memory leak · f460e08e
      Wenwen Wang authored
      [ Upstream commit ea094d53 ]
      
      In pcibios_irq_init(), the PCI IRQ routing table 'pirq_table' is first
      found through pirq_find_routing_table().  If the table is not found and
      CONFIG_PCI_BIOS is defined, the table is then allocated in
      pcibios_get_irq_routing_table() using kmalloc().  Later, if the I/O APIC is
      used, this table is actually not used.  In that case, the allocated table
      is not freed, which is a memory leak.
      
      Free the allocated table if it is not used.
      Signed-off-by: default avatarWenwen Wang <wang6495@umn.edu>
      [bhelgaas: added Ingo's reviewed-by, since the only change since v1 was to
      use the irq_routing_table local variable name he suggested]
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: default avatarIngo Molnar <mingo@kernel.org>
      Acked-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      f460e08e
    • J. Bruce Fields's avatar
      nfsd: allow fh_want_write to be called twice · 101e808f
      J. Bruce Fields authored
      [ Upstream commit 0b8f6262 ]
      
      A fuzzer recently triggered lockdep warnings about potential sb_writers
      deadlocks caused by fh_want_write().
      
      Looks like we aren't careful to pair each fh_want_write() with an
      fh_drop_write().
      
      It's not normally a problem since fh_put() will call fh_drop_write() for
      us.  And was OK for NFSv3 where we'd do one operation that might call
      fh_want_write(), and then put the filehandle.
      
      But an NFSv4 protocol fuzzer can do weird things like call unlink twice
      in a compound, and then we get into trouble.
      
      I'm a little worried about this approach of just leaving everything to
      fh_put().  But I think there are probably a lot of
      fh_want_write()/fh_drop_write() imbalances so for now I think we need it
      to be more forgiving.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      101e808f
    • Kirill Smelkov's avatar
      fuse: retrieve: cap requested size to negotiated max_write · 4edf907d
      Kirill Smelkov authored
      [ Upstream commit 7640682e ]
      
      FUSE filesystem server and kernel client negotiate during initialization
      phase, what should be the maximum write size the client will ever issue.
      Correspondingly the filesystem server then queues sys_read calls to read
      requests with buffer capacity large enough to carry request header + that
      max_write bytes. A filesystem server is free to set its max_write in
      anywhere in the range between [1*page, fc->max_pages*page]. In particular
      go-fuse[2] sets max_write by default as 64K, wheres default fc->max_pages
      corresponds to 128K. Libfuse also allows users to configure max_write, but
      by default presets it to possible maximum.
      
      If max_write is < fc->max_pages*page, and in NOTIFY_RETRIEVE handler we
      allow to retrieve more than max_write bytes, corresponding prepared
      NOTIFY_REPLY will be thrown away by fuse_dev_do_read, because the
      filesystem server, in full correspondence with server/client contract, will
      be only queuing sys_read with ~max_write buffer capacity, and
      fuse_dev_do_read throws away requests that cannot fit into server request
      buffer. In turn the filesystem server could get stuck waiting indefinitely
      for NOTIFY_REPLY since NOTIFY_RETRIEVE handler returned OK which is
      understood by clients as that NOTIFY_REPLY was queued and will be sent
      back.
      
      Cap requested size to negotiate max_write to avoid the problem.  This
      aligns with the way NOTIFY_RETRIEVE handler works, which already
      unconditionally caps requested retrieve size to fuse_conn->max_pages.  This
      way it should not hurt NOTIFY_RETRIEVE semantic if we return less data than
      was originally requested.
      
      Please see [1] for context where the problem of stuck filesystem was hit
      for real, how the situation was traced and for more involving patch that
      did not make it into the tree.
      
      [1] https://marc.info/?l=linux-fsdevel&m=155057023600853&w=2
      [2] https://github.com/hanwen/go-fuseSigned-off-by: Kirill Smelkov's avatarKirill Smelkov <kirr@nexedi.com>
      Cc: Han-Wen Nienhuys <hanwen@google.com>
      Cc: Jakob Unterwurzacher <jakobunt@gmail.com>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      4edf907d
    • Jorge Ramirez-Ortiz's avatar
      nvmem: core: fix read buffer in place · faa4dc52
      Jorge Ramirez-Ortiz authored
      [ Upstream commit 2fe518fe ]
      
      When the bit_offset in the cell is zero, the pointer to the msb will
      not be properly initialized (ie, will still be pointing to the first
      byte in the buffer).
      
      This being the case, if there are bits to clear in the msb, those will
      be left untouched while the mask will incorrectly clear bit positions
      on the first byte.
      
      This commit also makes sure that any byte unused in the cell is
      cleared.
      Signed-off-by: default avatarJorge Ramirez-Ortiz <jorge.ramirez-ortiz@linaro.org>
      Signed-off-by: default avatarSrinivas Kandagatla <srinivas.kandagatla@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      faa4dc52
    • Takashi Iwai's avatar
      ALSA: hda - Register irq handler after the chip initialization · 5844c4b2
      Takashi Iwai authored
      [ Upstream commit f495222e ]
      
      Currently the IRQ handler in HD-audio controller driver is registered
      before the chip initialization.  That is, we have some window opened
      between the azx_acquire_irq() call and the CORB/RIRB setup.  If an
      interrupt is triggered in this small window, the IRQ handler may
      access to the uninitialized RIRB buffer, which leads to a NULL
      dereference Oops.
      
      This is usually no big problem since most of Intel chips do register
      the IRQ via MSI, and we've already fixed the order of the IRQ
      enablement and the CORB/RIRB setup in the former commit b61749a8
      ("sound: enable interrupt after dma buffer initialization"), hence the
      IRQ won't be triggered in that room.  However, some platforms use a
      shared IRQ, and this may allow the IRQ trigger by another source.
      
      Another possibility is the kdump environment: a stale interrupt might
      be present in there, the IRQ handler can be falsely triggered as well.
      
      For covering this small race, let's move the azx_acquire_irq() call
      after hda_intel_init_chip() call.  Although this is a bit radical
      change, it can cover more widely than checking the CORB/RIRB setup
      locally in the callee side.
      Reported-by: default avatarLiwei Song <liwei.song@windriver.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      5844c4b2
    • Lu Baolu's avatar
      iommu/vt-d: Set intel_iommu_gfx_mapped correctly · ea091c84
      Lu Baolu authored
      [ Upstream commit cf1ec453 ]
      
      The intel_iommu_gfx_mapped flag is exported by the Intel
      IOMMU driver to indicate whether an IOMMU is used for the
      graphic device. In a virtualized IOMMU environment (e.g.
      QEMU), an include-all IOMMU is used for graphic device.
      This flag is found to be clear even the IOMMU is used.
      
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Cc: Kevin Tian <kevin.tian@intel.com>
      Reported-by: default avatarZhenyu Wang <zhenyuw@linux.intel.com>
      Fixes: c0771df8 ("intel-iommu: Export a flag indicating that the IOMMU is used for iGFX.")
      Suggested-by: default avatarKevin Tian <kevin.tian@intel.com>
      Signed-off-by: default avatarLu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      ea091c84
    • Vladimir Zapolskiy's avatar
      watchdog: fix compile time error of pretimeout governors · 5604d895
      Vladimir Zapolskiy authored
      [ Upstream commit a223770b ]
      
      CONFIG_WATCHDOG_PRETIMEOUT_GOV build symbol adds watchdog_pretimeout.o
      object to watchdog.o, the latter is compiled only if CONFIG_WATCHDOG_CORE
      is selected, so it rightfully makes sense to add it as a dependency.
      
      The change fixes the next compilation errors, if CONFIG_WATCHDOG_CORE=n
      and CONFIG_WATCHDOG_PRETIMEOUT_GOV=y are selected:
      
        drivers/watchdog/pretimeout_noop.o: In function `watchdog_gov_noop_register':
        drivers/watchdog/pretimeout_noop.c:35: undefined reference to `watchdog_register_governor'
        drivers/watchdog/pretimeout_noop.o: In function `watchdog_gov_noop_unregister':
        drivers/watchdog/pretimeout_noop.c:40: undefined reference to `watchdog_unregister_governor'
      
        drivers/watchdog/pretimeout_panic.o: In function `watchdog_gov_panic_register':
        drivers/watchdog/pretimeout_panic.c:35: undefined reference to `watchdog_register_governor'
        drivers/watchdog/pretimeout_panic.o: In function `watchdog_gov_panic_unregister':
        drivers/watchdog/pretimeout_panic.c:40: undefined reference to `watchdog_unregister_governor'
      Reported-by: default avatarKuo, Hsuan-Chi <hckuo2@illinois.edu>
      Fixes: ff84136c ("watchdog: add watchdog pretimeout governor framework")
      Signed-off-by: default avatarVladimir Zapolskiy <vz@mleia.com>
      Reviewed-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarWim Van Sebroeck <wim@linux-watchdog.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      5604d895
    • Georg Hofmann's avatar
      watchdog: imx2_wdt: Fix set_timeout for big timeout values · 236048e5
      Georg Hofmann authored
      [ Upstream commit b07e228e ]
      
      The documentated behavior is: if max_hw_heartbeat_ms is implemented, the
      minimum of the set_timeout argument and max_hw_heartbeat_ms should be used.
      This patch implements this behavior.
      Previously only the first 7bits were used and the input argument was
      returned.
      Signed-off-by: default avatarGeorg Hofmann <georg@hofmannsweb.com>
      Reviewed-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarWim Van Sebroeck <wim@linux-watchdog.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      236048e5
    • Maciej Żenczykowski's avatar
      uml: fix a boot splat wrt use of cpu_all_mask · 4aa215d0
      Maciej Żenczykowski authored
      [ Upstream commit 689a5860 ]
      
      Memory: 509108K/542612K available (3835K kernel code, 919K rwdata, 1028K rodata, 129K init, 211K bss, 33504K reserved, 0K cma-reserved)
      NR_IRQS: 15
      clocksource: timer: mask: 0xffffffffffffffff max_cycles: 0x1cd42e205, max_idle_ns: 881590404426 ns
      ------------[ cut here ]------------
      WARNING: CPU: 0 PID: 0 at kernel/time/clockevents.c:458 clockevents_register_device+0x72/0x140
      posix-timer cpumask == cpu_all_mask, using cpu_possible_mask instead
      Modules linked in:
      CPU: 0 PID: 0 Comm: swapper Not tainted 5.1.0-rc4-00048-ged79cc87 #4
      Stack:
       604ebda0 603c5370 604ebe20 6046fd17
       00000000 6006fcbb 604ebdb0 603c53b5
       604ebe10 6003bfc4 604ebdd0 9000001ca
      Call Trace:
       [<6006fcbb>] ? printk+0x0/0x94
       [<60083160>] ? clockevents_register_device+0x72/0x140
       [<6001f16e>] show_stack+0x13b/0x155
       [<603c5370>] ? dump_stack_print_info+0xe2/0xeb
       [<6006fcbb>] ? printk+0x0/0x94
       [<603c53b5>] dump_stack+0x2a/0x2c
       [<6003bfc4>] __warn+0x10e/0x13e
       [<60070320>] ? vprintk_func+0xc8/0xcf
       [<60030fd6>] ? block_signals+0x0/0x16
       [<6006fcbb>] ? printk+0x0/0x94
       [<6003c08b>] warn_slowpath_fmt+0x97/0x99
       [<600311a1>] ? set_signals+0x0/0x3f
       [<6003bff4>] ? warn_slowpath_fmt+0x0/0x99
       [<600842cb>] ? tick_oneshot_mode_active+0x44/0x4f
       [<60030fd6>] ? block_signals+0x0/0x16
       [<6006fcbb>] ? printk+0x0/0x94
       [<6007d2d5>] ? __clocksource_select+0x20/0x1b1
       [<60030fd6>] ? block_signals+0x0/0x16
       [<6006fcbb>] ? printk+0x0/0x94
       [<60083160>] clockevents_register_device+0x72/0x140
       [<60031192>] ? get_signals+0x0/0xf
       [<60030fd6>] ? block_signals+0x0/0x16
       [<6006fcbb>] ? printk+0x0/0x94
       [<60002eec>] um_timer_setup+0xc8/0xca
       [<60001b59>] start_kernel+0x47f/0x57e
       [<600035bc>] start_kernel_proc+0x49/0x4d
       [<6006c483>] ? kmsg_dump_register+0x82/0x8a
       [<6001de62>] new_thread_handler+0x81/0xb2
       [<60003571>] ? kmsg_dumper_stdout_init+0x1a/0x1c
       [<60020c75>] uml_finishsetup+0x54/0x59
      
      random: get_random_bytes called from init_oops_id+0x27/0x34 with crng_init=0
      ---[ end trace 00173d0117a88acb ]---
      Calibrating delay loop... 6941.90 BogoMIPS (lpj=34709504)
      Signed-off-by: default avatarMaciej Żenczykowski <maze@google.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
      Cc: linux-um@lists.infradead.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarRichard Weinberger <richard@nod.at>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      4aa215d0
    • YueHaibing's avatar
      configfs: fix possible use-after-free in configfs_register_group · a4681101
      YueHaibing authored
      [ Upstream commit 35399f87 ]
      
      In configfs_register_group(), if create_default_group() failed, we
      forget to unlink the group. It will left a invalid item in the parent list,
      which may trigger the use-after-free issue seen below:
      
      BUG: KASAN: use-after-free in __list_add_valid+0xd4/0xe0 lib/list_debug.c:26
      Read of size 8 at addr ffff8881ef61ae20 by task syz-executor.0/5996
      
      CPU: 1 PID: 5996 Comm: syz-executor.0 Tainted: G         C        5.0.0+ #5
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0xa9/0x10e lib/dump_stack.c:113
       print_address_description+0x65/0x270 mm/kasan/report.c:187
       kasan_report+0x149/0x18d mm/kasan/report.c:317
       __list_add_valid+0xd4/0xe0 lib/list_debug.c:26
       __list_add include/linux/list.h:60 [inline]
       list_add_tail include/linux/list.h:93 [inline]
       link_obj+0xb0/0x190 fs/configfs/dir.c:759
       link_group+0x1c/0x130 fs/configfs/dir.c:784
       configfs_register_group+0x56/0x1e0 fs/configfs/dir.c:1751
       configfs_register_default_group+0x72/0xc0 fs/configfs/dir.c:1834
       ? 0xffffffffc1be0000
       iio_sw_trigger_init+0x23/0x1000 [industrialio_sw_trigger]
       do_one_initcall+0xbc/0x47d init/main.c:887
       do_init_module+0x1b5/0x547 kernel/module.c:3456
       load_module+0x6405/0x8c10 kernel/module.c:3804
       __do_sys_finit_module+0x162/0x190 kernel/module.c:3898
       do_syscall_64+0x9f/0x450 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x462e99
      Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007f494ecbcc58 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
      RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000462e99
      RDX: 0000000000000000 RSI: 0000000020000180 RDI: 0000000000000003
      RBP: 00007f494ecbcc70 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007f494ecbd6bc
      R13: 00000000004bcefa R14: 00000000006f6fb0 R15: 0000000000000004
      
      Allocated by task 5987:
       set_track mm/kasan/common.c:87 [inline]
       __kasan_kmalloc.constprop.3+0xa0/0xd0 mm/kasan/common.c:497
       kmalloc include/linux/slab.h:545 [inline]
       kzalloc include/linux/slab.h:740 [inline]
       configfs_register_default_group+0x4c/0xc0 fs/configfs/dir.c:1829
       0xffffffffc1bd0023
       do_one_initcall+0xbc/0x47d init/main.c:887
       do_init_module+0x1b5/0x547 kernel/module.c:3456
       load_module+0x6405/0x8c10 kernel/module.c:3804
       __do_sys_finit_module+0x162/0x190 kernel/module.c:3898
       do_syscall_64+0x9f/0x450 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Freed by task 5987:
       set_track mm/kasan/common.c:87 [inline]
       __kasan_slab_free+0x130/0x180 mm/kasan/common.c:459
       slab_free_hook mm/slub.c:1429 [inline]
       slab_free_freelist_hook mm/slub.c:1456 [inline]
       slab_free mm/slub.c:3003 [inline]
       kfree+0xe1/0x270 mm/slub.c:3955
       configfs_register_default_group+0x9a/0xc0 fs/configfs/dir.c:1836
       0xffffffffc1bd0023
       do_one_initcall+0xbc/0x47d init/main.c:887
       do_init_module+0x1b5/0x547 kernel/module.c:3456
       load_module+0x6405/0x8c10 kernel/module.c:3804
       __do_sys_finit_module+0x162/0x190 kernel/module.c:3898
       do_syscall_64+0x9f/0x450 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The buggy address belongs to the object at ffff8881ef61ae00
       which belongs to the cache kmalloc-192 of size 192
      The buggy address is located 32 bytes inside of
       192-byte region [ffff8881ef61ae00, ffff8881ef61aec0)
      The buggy address belongs to the page:
      page:ffffea0007bd8680 count:1 mapcount:0 mapping:ffff8881f6c03000 index:0xffff8881ef61a700
      flags: 0x2fffc0000000200(slab)
      raw: 02fffc0000000200 ffffea0007ca4740 0000000500000005 ffff8881f6c03000
      raw: ffff8881ef61a700 000000008010000c 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff8881ef61ad00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ffff8881ef61ad80: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
      >ffff8881ef61ae00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                     ^
       ffff8881ef61ae80: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
       ffff8881ef61af00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      
      Fixes: 5cf6a51e ("configfs: allow dynamic group creation")
      Reported-by: default avatarHulk Robot <hulkci@huawei.com>
      Signed-off-by: default avatarYueHaibing <yuehaibing@huawei.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      a4681101
    • Chao Yu's avatar
      f2fs: fix to do sanity check on valid block count of segment · dff15a2d
      Chao Yu authored
      [ Upstream commit e95bcdb2 ]
      
      As Jungyeon reported in bugzilla:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=203233
      
      - Overview
      When mounting the attached crafted image and running program, following errors are reported.
      Additionally, it hangs on sync after running program.
      
      The image is intentionally fuzzed from a normal f2fs image for testing.
      Compile options for F2FS are as follows.
      CONFIG_F2FS_FS=y
      CONFIG_F2FS_STAT_FS=y
      CONFIG_F2FS_FS_XATTR=y
      CONFIG_F2FS_FS_POSIX_ACL=y
      CONFIG_F2FS_CHECK_FS=y
      
      - Reproduces
      cc poc_13.c
      mkdir test
      mount -t f2fs tmp.img test
      cp a.out test
      cd test
      sudo ./a.out
      sync
      
      - Kernel messages
       F2FS-fs (sdb): Bitmap was wrongly set, blk:4608
       kernel BUG at fs/f2fs/segment.c:2102!
       RIP: 0010:update_sit_entry+0x394/0x410
       Call Trace:
        f2fs_allocate_data_block+0x16f/0x660
        do_write_page+0x62/0x170
        f2fs_do_write_node_page+0x33/0xa0
        __write_node_page+0x270/0x4e0
        f2fs_sync_node_pages+0x5df/0x670
        f2fs_write_checkpoint+0x372/0x1400
        f2fs_sync_fs+0xa3/0x130
        f2fs_do_sync_file+0x1a6/0x810
        do_fsync+0x33/0x60
        __x64_sys_fsync+0xb/0x10
        do_syscall_64+0x43/0xf0
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      sit.vblocks and sum valid block count in sit.valid_map may be
      inconsistent, segment w/ zero vblocks will be treated as free
      segment, while allocating in free segment, we may allocate a
      free block, if its bitmap is valid previously, it can cause
      kernel crash due to bitmap verification failure.
      
      Anyway, to avoid further serious metadata inconsistence and
      corruption, it is necessary and worth to detect SIT
      inconsistence. So let's enable check_block_count() to verify
      vblocks and valid_map all the time rather than do it only
      CONFIG_F2FS_CHECK_FS is enabled.
      Signed-off-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      dff15a2d
    • Chao Yu's avatar
      f2fs: fix to clear dirty inode in error path of f2fs_iget() · 2b653167
      Chao Yu authored
      [ Upstream commit 546d22f0 ]
      
      As Jungyeon reported in bugzilla:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=203217
      
      - Overview
      When mounting the attached crafted image and running program, I got this error.
      Additionally, it hangs on sync after running the program.
      
      The image is intentionally fuzzed from a normal f2fs image for testing and I enabled option CONFIG_F2FS_CHECK_FS on.
      
      - Reproduces
      cc poc_test_05.c
      mkdir test
      mount -t f2fs tmp.img test
      sudo ./a.out
      sync
      
      - Messages
       kernel BUG at fs/f2fs/inode.c:707!
       RIP: 0010:f2fs_evict_inode+0x33f/0x3a0
       Call Trace:
        evict+0xba/0x180
        f2fs_iget+0x598/0xdf0
        f2fs_lookup+0x136/0x320
        __lookup_slow+0x92/0x140
        lookup_slow+0x30/0x50
        walk_component+0x1c1/0x350
        path_lookupat+0x62/0x200
        filename_lookup+0xb3/0x1a0
        do_readlinkat+0x56/0x110
        __x64_sys_readlink+0x16/0x20
        do_syscall_64+0x43/0xf0
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      During inode loading, __recover_inline_status() can recovery inode status
      and set inode dirty, once we failed in following process, it will fail
      the check in f2fs_evict_inode, result in trigger BUG_ON().
      
      Let's clear dirty inode in error path of f2fs_iget() to avoid panic.
      Signed-off-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      2b653167
    • Chao Yu's avatar
      f2fs: fix to avoid panic in do_recover_data() · 3cdbcbef
      Chao Yu authored
      [ Upstream commit 22d61e28 ]
      
      As Jungyeon reported in bugzilla:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=203227
      
      - Overview
      When mounting the attached crafted image, following errors are reported.
      Additionally, it hangs on sync after trying to mount it.
      
      The image is intentionally fuzzed from a normal f2fs image for testing.
      Compile options for F2FS are as follows.
      CONFIG_F2FS_FS=y
      CONFIG_F2FS_STAT_FS=y
      CONFIG_F2FS_FS_XATTR=y
      CONFIG_F2FS_FS_POSIX_ACL=y
      CONFIG_F2FS_CHECK_FS=y
      
      - Reproduces
      mkdir test
      mount -t f2fs tmp.img test
      sync
      
      - Messages
       kernel BUG at fs/f2fs/recovery.c:549!
       RIP: 0010:recover_data+0x167a/0x1780
       Call Trace:
        f2fs_recover_fsync_data+0x613/0x710
        f2fs_fill_super+0x1043/0x1aa0
        mount_bdev+0x16d/0x1a0
        mount_fs+0x4a/0x170
        vfs_kern_mount+0x5d/0x100
        do_mount+0x200/0xcf0
        ksys_mount+0x79/0xc0
        __x64_sys_mount+0x1c/0x20
        do_syscall_64+0x43/0xf0
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      During recovery, if ofs_of_node is inconsistent in between recovered
      node page and original checkpointed node page, let's just fail recovery
      instead of making kernel panic.
      Signed-off-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      3cdbcbef
    • Miroslav Lichvar's avatar
      ntp: Allow TAI-UTC offset to be set to zero · 5ab0886e
      Miroslav Lichvar authored
      [ Upstream commit fdc6bae9 ]
      
      The ADJ_TAI adjtimex mode sets the TAI-UTC offset of the system clock.
      It is typically set by NTP/PTP implementations and it is automatically
      updated by the kernel on leap seconds. The initial value is zero (which
      applications may interpret as unknown), but this value cannot be set by
      adjtimex. This limitation seems to go back to the original "nanokernel"
      implementation by David Mills.
      
      Change the ADJ_TAI check to accept zero as a valid TAI-UTC offset in
      order to allow setting it back to the initial value.
      
      Fixes: 153b5d05 ("ntp: support for TAI")
      Suggested-by: default avatarOndrej Mosnacek <omosnace@redhat.com>
      Signed-off-by: default avatarMiroslav Lichvar <mlichvar@redhat.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Link: https://lkml.kernel.org/r/20190417084833.7401-1-mlichvar@redhat.comSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      5ab0886e
    • Martin Blumenstingl's avatar
      pwm: meson: Use the spin-lock only to protect register modifications · d7541cb8
      Martin Blumenstingl authored
      [ Upstream commit f173747f ]
      
      Holding the spin-lock for all of the code in meson_pwm_apply() can
      result in a "BUG: scheduling while atomic". This can happen because
      clk_get_rate() (which is called from meson_pwm_calc()) may sleep.
      Only hold the spin-lock when modifying registers to solve this.
      
      The reason why we need a spin-lock in the driver is because the
      REG_MISC_AB register is shared between the two channels provided by one
      PWM controller. The only functions where REG_MISC_AB is modified are
      meson_pwm_enable() and meson_pwm_disable() so the register reads/writes
      in there need to be protected by the spin-lock.
      
      The original code also used the spin-lock to protect the values in
      struct meson_pwm_channel. This could be necessary if two consumers can
      use the same PWM channel. However, PWM core doesn't allow this so we
      don't need to protect the values in struct meson_pwm_channel with a
      lock.
      
      Fixes: 211ed630 ("pwm: Add support for Meson PWM Controller")
      Signed-off-by: default avatarMartin Blumenstingl <martin.blumenstingl@googlemail.com>
      Reviewed-by: default avatarUwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Reviewed-by: default avatarNeil Armstrong <narmstrong@baylibre.com>
      Signed-off-by: default avatarThierry Reding <thierry.reding@gmail.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      d7541cb8
    • Josh Poimboeuf's avatar
      objtool: Don't use ignore flag for fake jumps · 4ba76bf2
      Josh Poimboeuf authored
      [ Upstream commit e6da9567 ]
      
      The ignore flag is set on fake jumps in order to keep
      add_jump_destinations() from setting their jump_dest, since it already
      got set when the fake jump was created.
      
      But using the ignore flag is a bit of a hack.  It's normally used to
      skip validation of an instruction, which doesn't really make sense for
      fake jumps.
      
      Also, after the next patch, using the ignore flag for fake jumps can
      trigger a false "why am I validating an ignored function?" warning.
      
      Instead just add an explicit check in add_jump_destinations() to skip
      fake jumps.
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/71abc072ff48b2feccc197723a9c52859476c068.1557766718.git.jpoimboe@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      4ba76bf2
    • Matt Redfearn's avatar
      drm/bridge: adv7511: Fix low refresh rate selection · ad067e4f
      Matt Redfearn authored
      [ Upstream commit 67793bd3 ]
      
      The driver currently sets register 0xfb (Low Refresh Rate) based on the
      value of mode->vrefresh. Firstly, this field is specified to be in Hz,
      but the magic numbers used by the code are Hz * 1000. This essentially
      leads to the low refresh rate always being set to 0x01, since the
      vrefresh value will always be less than 24000. Fix the magic numbers to
      be in Hz.
      Secondly, according to the comment in drm_modes.h, the field is not
      supposed to be used in a functional way anyway. Instead, use the helper
      function drm_mode_vrefresh().
      
      Fixes: 9c8af882 ("drm: Add adv7511 encoder driver")
      Reviewed-by: default avatarLaurent Pinchart <laurent.pinchart@ideasonboard.com>
      Signed-off-by: default avatarMatt Redfearn <matt.redfearn@thinci.com>
      Signed-off-by: default avatarSean Paul <seanpaul@chromium.org>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190424132210.26338-1-matt.redfearn@thinci.comSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      ad067e4f
    • Stephane Eranian's avatar
      perf/x86/intel: Allow PEBS multi-entry in watermark mode · 35dd88b1
      Stephane Eranian authored
      [ Upstream commit c7a28657 ]
      
      This patch fixes a restriction/bug introduced by:
      
         583feb08 ("perf/x86/intel: Fix handling of wakeup_events for multi-entry PEBS")
      
      The original patch prevented using multi-entry PEBS when wakeup_events != 0.
      However given that wakeup_events is part of a union with wakeup_watermark, it
      means that in watermark mode, PEBS multi-entry is also disabled which is not the
      intent. This patch fixes this by checking is watermark mode is enabled.
      Signed-off-by: default avatarStephane Eranian <eranian@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: jolsa@redhat.com
      Cc: kan.liang@intel.com
      Cc: vincent.weaver@maine.edu
      Fixes: 583feb08 ("perf/x86/intel: Fix handling of wakeup_events for multi-entry PEBS")
      Link: http://lkml.kernel.org/r/20190514003400.224340-1-eranian@google.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      35dd88b1
    • Tony Lindgren's avatar
      mfd: twl6040: Fix device init errors for ACCCTL register · 59e1b23b
      Tony Lindgren authored
      [ Upstream commit 48171d0e ]
      
      I noticed that we can get a -EREMOTEIO errors on at least omap4 duovero:
      
      twl6040 0-004b: Failed to write 2d = 19: -121
      
      And then any following register access will produce errors.
      
      There 2d offset above is register ACCCTL that gets written on twl6040
      powerup. With error checking added to the related regcache_sync() call,
      the -EREMOTEIO error is reproducable on twl6040 powerup at least
      duovero.
      
      To fix the error, we need to wait until twl6040 is accessible after the
      powerup. Based on tests on omap4 duovero, we need to wait over 8ms after
      powerup before register write will complete without failures. Let's also
      make sure we warn about possible errors too.
      
      Note that we have twl6040_patch[] reg_sequence with the ACCCTL register
      configuration and regcache_sync() will write the new value to ACCCTL.
      Signed-off-by: default avatarTony Lindgren <tony@atomide.com>
      Acked-by: default avatarPeter Ujfalusi <peter.ujfalusi@ti.com>
      Signed-off-by: default avatarLee Jones <lee.jones@linaro.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      59e1b23b
    • Binbin Wu's avatar
      mfd: intel-lpss: Set the device in reset state when init · 381a9685
      Binbin Wu authored
      [ Upstream commit dad06532 ]
      
      In virtualized setup, when system reboots due to warm
      reset interrupt storm is seen.
      
      Call Trace:
      <IRQ>
      dump_stack+0x70/0xa5
      __report_bad_irq+0x2e/0xc0
      note_interrupt+0x248/0x290
      ? add_interrupt_randomness+0x30/0x220
      handle_irq_event_percpu+0x54/0x80
      handle_irq_event+0x39/0x60
      handle_fasteoi_irq+0x91/0x150
      handle_irq+0x108/0x180
      do_IRQ+0x52/0xf0
      common_interrupt+0xf/0xf
      </IRQ>
      RIP: 0033:0x76fc2cfabc1d
      Code: 24 28 bf 03 00 00 00 31 c0 48 8d 35 63 77 0e 00 48 8d 15 2e
      94 0e 00 4c 89 f9 49 89 d9 4c 89 d3 e8 b8 e2 01 00 48 8b 54 24 18
      <48> 89 ef 48 89 de 4c 89 e1 e8 d5 97 01 00 84 c0 74 2d 48 8b 04
      24
      RSP: 002b:00007ffd247c1fc0 EFLAGS: 00000293 ORIG_RAX: ffffffffffffffda
      RAX: 0000000000000000 RBX: 00007ffd247c1ff0 RCX: 000000000003d3ce
      RDX: 0000000000000000 RSI: 00007ffd247c1ff0 RDI: 000076fc2cbb6010
      RBP: 000076fc2cded010 R08: 00007ffd247c2210 R09: 00007ffd247c22a0
      R10: 000076fc29465470 R11: 0000000000000000 R12: 00007ffd247c1fc0
      R13: 000076fc2ce8e470 R14: 000076fc27ec9960 R15: 0000000000000414
      handlers:
      [<000000000d3fa913>] idma64_irq
      Disabling IRQ #27
      
      To avoid interrupt storm, set the device in reset state
      before bringing out the device from reset state.
      
      Changelog v2:
      - correct the subject line by adding "mfd: "
      Signed-off-by: default avatarBinbin Wu <binbin.wu@intel.com>
      Acked-by: default avatarMika Westerberg <mika.westerberg@linux.intel.com>
      Reviewed-by: default avatarAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: default avatarLee Jones <lee.jones@linaro.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      381a9685
    • Daniel Gomez's avatar
      mfd: tps65912-spi: Add missing of table registration · ef0bdc8d
      Daniel Gomez authored
      [ Upstream commit 9e364e87 ]
      
      MODULE_DEVICE_TABLE(of, <of_match_table> should be called to complete DT
      OF mathing mechanism and register it.
      
      Before this patch:
      modinfo drivers/mfd/tps65912-spi.ko | grep alias
      alias:          spi:tps65912
      
      After this patch:
      modinfo drivers/mfd/tps65912-spi.ko | grep alias
      alias:          of:N*T*Cti,tps65912C*
      alias:          of:N*T*Cti,tps65912
      alias:          spi:tps65912
      Reported-by: default avatarJavier Martinez Canillas <javier@dowhile0.org>
      Signed-off-by: default avatarDaniel Gomez <dagmcr@gmail.com>
      Signed-off-by: default avatarLee Jones <lee.jones@linaro.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      ef0bdc8d
    • Amit Kucheria's avatar
      drivers: thermal: tsens: Don't print error message on -EPROBE_DEFER · bad5f0b7
      Amit Kucheria authored
      [ Upstream commit fc7d18cf ]
      
      We print a calibration failure message on -EPROBE_DEFER from
      nvmem/qfprom as follows:
      [    3.003090] qcom-tsens 4a9000.thermal-sensor: version: 1.4
      [    3.005376] qcom-tsens 4a9000.thermal-sensor: tsens calibration failed
      [    3.113248] qcom-tsens 4a9000.thermal-sensor: version: 1.4
      
      This confuses people when, in fact, calibration succeeds later when
      nvmem/qfprom device is available. Don't print this message on a
      -EPROBE_DEFER.
      Signed-off-by: default avatarAmit Kucheria <amit.kucheria@linaro.org>
      Signed-off-by: default avatarEduardo Valentin <edubezval@gmail.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      bad5f0b7
    • Cyrill Gorcunov's avatar
      kernel/sys.c: prctl: fix false positive in validate_prctl_map() · e74cb9e0
      Cyrill Gorcunov authored
      [ Upstream commit a9e73998 ]
      
      While validating new map we require the @start_data to be strictly less
      than @end_data, which is fine for regular applications (this is why this
      nit didn't trigger for that long).  These members are set from executable
      loaders such as elf handers, still it is pretty valid to have a loadable
      data section with zero size in file, in such case the start_data is equal
      to end_data once kernel loader finishes.
      
      As a result when we're trying to restore such programs the procedure fails
      and the kernel returns -EINVAL.  From the image dump of a program:
      
       | "mm_start_code": "0x400000",
       | "mm_end_code": "0x8f5fb4",
       | "mm_start_data": "0xf1bfb0",
       | "mm_end_data": "0xf1bfb0",
      
      Thus we need to change validate_prctl_map from strictly less to less or
      equal operator use.
      
      Link: http://lkml.kernel.org/r/20190408143554.GY1421@uranus.lan
      Fixes: f606b77f ("prctl: PR_SET_MM -- introduce PR_SET_MM_MAP operation")
      Signed-off-by: default avatarCyrill Gorcunov <gorcunov@gmail.com>
      Cc: Andrey Vagin <avagin@gmail.com>
      Cc: Dmitry Safonov <0x7f454c46@gmail.com>
      Cc: Pavel Emelyanov <xemul@virtuozzo.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      e74cb9e0
    • Qian Cai's avatar
      mm/slab.c: fix an infinite loop in leaks_show() · 5dd33451
      Qian Cai authored
      [ Upstream commit 745e1014 ]
      
      "cat /proc/slab_allocators" could hang forever on SMP machines with
      kmemleak or object debugging enabled due to other CPUs running do_drain()
      will keep making kmemleak_object or debug_objects_cache dirty and unable
      to escape the first loop in leaks_show(),
      
      do {
      	set_store_user_clean(cachep);
      	drain_cpu_caches(cachep);
      	...
      
      } while (!is_store_user_clean(cachep));
      
      For example,
      
      do_drain
        slabs_destroy
          slab_destroy
            kmem_cache_free
              __cache_free
                ___cache_free
                  kmemleak_free_recursive
                    delete_object_full
                      __delete_object
                        put_object
                          free_object_rcu
                            kmem_cache_free
                              cache_free_debugcheck --> dirty kmemleak_object
      
      One approach is to check cachep->name and skip both kmemleak_object and
      debug_objects_cache in leaks_show().  The other is to set store_user_clean
      after drain_cpu_caches() which leaves a small window between
      drain_cpu_caches() and set_store_user_clean() where per-CPU caches could
      be dirty again lead to slightly wrong information has been stored but
      could also speed up things significantly which sounds like a good
      compromise.  For example,
      
       # cat /proc/slab_allocators
       0m42.778s # 1st approach
       0m0.737s  # 2nd approach
      
      [akpm@linux-foundation.org: tweak comment]
      Link: http://lkml.kernel.org/r/20190411032635.10325-1-cai@lca.pw
      Fixes: d31676df ("mm/slab: alternative implementation for DEBUG_SLAB_LEAK")
      Signed-off-by: default avatarQian Cai <cai@lca.pw>
      Reviewed-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      5dd33451
    • Yue Hu's avatar
      mm/cma_debug.c: fix the break condition in cma_maxchunk_get() · aab90fdb
      Yue Hu authored
      [ Upstream commit f0fd5050 ]
      
      If not find zero bit in find_next_zero_bit(), it will return the size
      parameter passed in, so the start bit should be compared with bitmap_maxno
      rather than cma->count.  Although getting maxchunk is working fine due to
      zero value of order_per_bit currently, the operation will be stuck if
      order_per_bit is set as non-zero.
      
      Link: http://lkml.kernel.org/r/20190319092734.276-1-zbestahu@gmail.comSigned-off-by: default avatarYue Hu <huyue2@yulong.com>
      Reviewed-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Dmitry Safonov <d.safonov@partner.samsung.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      aab90fdb
    • Yue Hu's avatar
      mm/cma.c: fix crash on CMA allocation if bitmap allocation fails · 8f759452
      Yue Hu authored
      [ Upstream commit 1df3a339 ]
      
      f022d8cb ("mm: cma: Don't crash on allocation if CMA area can't be
      activated") fixes the crash issue when activation fails via setting
      cma->count as 0, same logic exists if bitmap allocation fails.
      
      Link: http://lkml.kernel.org/r/20190325081309.6004-1-zbestahu@gmail.comSigned-off-by: default avatarYue Hu <huyue2@yulong.com>
      Reviewed-by: default avatarAnshuman Khandual <anshuman.khandual@arm.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      8f759452
    • Linxu Fang's avatar
      mem-hotplug: fix node spanned pages when we have a node with only ZONE_MOVABLE · 121100e4
      Linxu Fang authored
      [ Upstream commit 299c83dc ]
      
      342332e6 ("mm/page_alloc.c: introduce kernelcore=mirror option") and
      later patches rewrote the calculation of node spanned pages.
      
      e506b996 ("mem-hotplug: fix node spanned pages when we have a movable
      node"), but the current code still has problems,
      
      When we have a node with only zone_movable and the node id is not zero,
      the size of node spanned pages is double added.
      
      That's because we have an empty normal zone, and zone_start_pfn or
      zone_end_pfn is not between arch_zone_lowest_possible_pfn and
      arch_zone_highest_possible_pfn, so we need to use clamp to constrain the
      range just like the commit <96e907d1> (bootmem: Reimplement
      __absent_pages_in_range() using for_each_mem_pfn_range()).
      
      e.g.
      Zone ranges:
        DMA      [mem 0x0000000000001000-0x0000000000ffffff]
        DMA32    [mem 0x0000000001000000-0x00000000ffffffff]
        Normal   [mem 0x0000000100000000-0x000000023fffffff]
      Movable zone start for each node
        Node 0: 0x0000000100000000
        Node 1: 0x0000000140000000
      Early memory node ranges
        node   0: [mem 0x0000000000001000-0x000000000009efff]
        node   0: [mem 0x0000000000100000-0x00000000bffdffff]
        node   0: [mem 0x0000000100000000-0x000000013fffffff]
        node   1: [mem 0x0000000140000000-0x000000023fffffff]
      
      node 0 DMA	spanned:0xfff   present:0xf9e   absent:0x61
      node 0 DMA32	spanned:0xff000 present:0xbefe0	absent:0x40020
      node 0 Normal	spanned:0	present:0	absent:0
      node 0 Movable	spanned:0x40000 present:0x40000 absent:0
      On node 0 totalpages(node_present_pages): 1048446
      node_spanned_pages:1310719
      node 1 DMA	spanned:0	    present:0		absent:0
      node 1 DMA32	spanned:0	    present:0		absent:0
      node 1 Normal	spanned:0x100000    present:0x100000	absent:0
      node 1 Movable	spanned:0x100000    present:0x100000	absent:0
      On node 1 totalpages(node_present_pages): 2097152
      node_spanned_pages:2097152
      Memory: 6967796K/12582392K available (16388K kernel code, 3686K rwdata,
      4468K rodata, 2160K init, 10444K bss, 5614596K reserved, 0K
      cma-reserved)
      
      It shows that the current memory of node 1 is double added.
      After this patch, the problem is fixed.
      
      node 0 DMA	spanned:0xfff   present:0xf9e   absent:0x61
      node 0 DMA32	spanned:0xff000 present:0xbefe0	absent:0x40020
      node 0 Normal	spanned:0	present:0	absent:0
      node 0 Movable	spanned:0x40000 present:0x40000 absent:0
      On node 0 totalpages(node_present_pages): 1048446
      node_spanned_pages:1310719
      node 1 DMA	spanned:0	    present:0		absent:0
      node 1 DMA32	spanned:0	    present:0		absent:0
      node 1 Normal	spanned:0	    present:0		absent:0
      node 1 Movable	spanned:0x100000    present:0x100000	absent:0
      On node 1 totalpages(node_present_pages): 1048576
      node_spanned_pages:1048576
      memory: 6967796K/8388088K available (16388K kernel code, 3686K rwdata,
      4468K rodata, 2160K init, 10444K bss, 1420292K reserved, 0K
      cma-reserved)
      
      Link: http://lkml.kernel.org/r/1554178276-10372-1-git-send-email-fanglinxu@huawei.comSigned-off-by: default avatarLinxu Fang <fanglinxu@huawei.com>
      Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Cc: Xishi Qiu <qiuxishi@huawei.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      121100e4
    • Mike Kravetz's avatar
      hugetlbfs: on restore reserve error path retain subpool reservation · 4c2e0411
      Mike Kravetz authored
      [ Upstream commit 0919e1b6 ]
      
      When a huge page is allocated, PagePrivate() is set if the allocation
      consumed a reservation.  When freeing a huge page, PagePrivate is checked.
      If set, it indicates the reservation should be restored.  PagePrivate
      being set at free huge page time mostly happens on error paths.
      
      When huge page reservations are created, a check is made to determine if
      the mapping is associated with an explicitly mounted filesystem.  If so,
      pages are also reserved within the filesystem.  The default action when
      freeing a huge page is to decrement the usage count in any associated
      explicitly mounted filesystem.  However, if the reservation is to be
      restored the reservation/use count within the filesystem should not be
      decrementd.  Otherwise, a subsequent page allocation and free for the same
      mapping location will cause the file filesystem usage to go 'negative'.
      
      Filesystem                         Size  Used Avail Use% Mounted on
      nodev                              4.0G -4.0M  4.1G    - /opt/hugepool
      
      To fix, when freeing a huge page do not adjust filesystem usage if
      PagePrivate() is set to indicate the reservation should be restored.
      
      I did not cc stable as the problem has been around since reserves were
      added to hugetlbfs and nobody has noticed.
      
      Link: http://lkml.kernel.org/r/20190328234704.27083-2-mike.kravetz@oracle.comSigned-off-by: default avatarMike Kravetz <mike.kravetz@oracle.com>
      Reviewed-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      4c2e0411
    • Arnd Bergmann's avatar
      ARM: prevent tracing IPI_CPU_BACKTRACE · 98e17eda
      Arnd Bergmann authored
      [ Upstream commit be167862 ]
      
      Patch series "compiler: allow all arches to enable
      CONFIG_OPTIMIZE_INLINING", v3.
      
      This patch (of 11):
      
      When function tracing for IPIs is enabled, we get a warning for an
      overflow of the ipi_types array with the IPI_CPU_BACKTRACE type as
      triggered by raise_nmi():
      
        arch/arm/kernel/smp.c: In function 'raise_nmi':
        arch/arm/kernel/smp.c:489:2: error: array subscript is above array bounds [-Werror=array-bounds]
          trace_ipi_raise(target, ipi_types[ipinr]);
      
      This is a correct warning as we actually overflow the array here.
      
      This patch raise_nmi() to call __smp_cross_call() instead of
      smp_cross_call(), to avoid calling into ftrace.  For clarification, I'm
      also adding a two new code comments describing how this one is special.
      
      The warning appears to have shown up after commit e7273ff4 ("ARM:
      8488/1: Make IPI_CPU_BACKTRACE a "non-secure" SGI"), which changed the
      number assignment from '15' to '8', but as far as I can tell has existed
      since the IPI tracepoints were first introduced.  If we decide to
      backport this patch to stable kernels, we probably need to backport
      e7273ff4 as well.
      
      [yamada.masahiro@socionext.com: rebase on v5.1-rc1]
      Link: http://lkml.kernel.org/r/20190423034959.13525-2-yamada.masahiro@socionext.com
      Fixes: e7273ff4 ("ARM: 8488/1: Make IPI_CPU_BACKTRACE a "non-secure" SGI")
      Fixes: 365ec7b1 ("ARM: add IPI tracepoints") # v3.17
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarMasahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Mathieu Malaterre <malat@debian.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Stefan Agner <stefan@agner.ch>
      Cc: Boris Brezillon <bbrezillon@kernel.org>
      Cc: Miquel Raynal <miquel.raynal@bootlin.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Brian Norris <computersforpeace@gmail.com>
      Cc: Marek Vasut <marek.vasut@gmail.com>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      98e17eda
    • Li Rongqing's avatar
      ipc: prevent lockup on alloc_msg and free_msg · 7b5598c8
      Li Rongqing authored
      [ Upstream commit d6a2946a ]
      
      msgctl10 of ltp triggers the following lockup When CONFIG_KASAN is
      enabled on large memory SMP systems, the pages initialization can take a
      long time, if msgctl10 requests a huge block memory, and it will block
      rcu scheduler, so release cpu actively.
      
      After adding schedule() in free_msg, free_msg can not be called when
      holding spinlock, so adding msg to a tmp list, and free it out of
      spinlock
      
        rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
        rcu:     Tasks blocked on level-1 rcu_node (CPUs 16-31): P32505
        rcu:     Tasks blocked on level-1 rcu_node (CPUs 48-63): P34978
        rcu:     (detected by 11, t=35024 jiffies, g=44237529, q=16542267)
        msgctl10        R  running task    21608 32505   2794 0x00000082
        Call Trace:
         preempt_schedule_irq+0x4c/0xb0
         retint_kernel+0x1b/0x2d
        RIP: 0010:__is_insn_slot_addr+0xfb/0x250
        Code: 82 1d 00 48 8b 9b 90 00 00 00 4c 89 f7 49 c1 ee 03 e8 59 83 1d 00 48 b8 00 00 00 00 00 fc ff df 4c 39 eb 48 89 9d 58 ff ff ff <41> c6 04 06 f8 74 66 4c 8d 75 98 4c 89 f1 48 c1 e9 03 48 01 c8 48
        RSP: 0018:ffff88bce041f758 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
        RAX: dffffc0000000000 RBX: ffffffff8471bc50 RCX: ffffffff828a2a57
        RDX: dffffc0000000000 RSI: dffffc0000000000 RDI: ffff88bce041f780
        RBP: ffff88bce041f828 R08: ffffed15f3f4c5b3 R09: ffffed15f3f4c5b3
        R10: 0000000000000001 R11: ffffed15f3f4c5b2 R12: 000000318aee9b73
        R13: ffffffff8471bc50 R14: 1ffff1179c083ef0 R15: 1ffff1179c083eec
         kernel_text_address+0xc1/0x100
         __kernel_text_address+0xe/0x30
         unwind_get_return_address+0x2f/0x50
         __save_stack_trace+0x92/0x100
         create_object+0x380/0x650
         __kmalloc+0x14c/0x2b0
         load_msg+0x38/0x1a0
         do_msgsnd+0x19e/0xcf0
         do_syscall_64+0x117/0x400
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
        rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
        rcu:     Tasks blocked on level-1 rcu_node (CPUs 0-15): P32170
        rcu:     (detected by 14, t=35016 jiffies, g=44237525, q=12423063)
        msgctl10        R  running task    21608 32170  32155 0x00000082
        Call Trace:
         preempt_schedule_irq+0x4c/0xb0
         retint_kernel+0x1b/0x2d
        RIP: 0010:lock_acquire+0x4d/0x340
        Code: 48 81 ec c0 00 00 00 45 89 c6 4d 89 cf 48 8d 6c 24 20 48 89 3c 24 48 8d bb e4 0c 00 00 89 74 24 0c 48 c7 44 24 20 b3 8a b5 41 <48> c1 ed 03 48 c7 44 24 28 b4 25 18 84 48 c7 44 24 30 d0 54 7a 82
        RSP: 0018:ffff88af83417738 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
        RAX: dffffc0000000000 RBX: ffff88bd335f3080 RCX: 0000000000000002
        RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88bd335f3d64
        RBP: ffff88af83417758 R08: 0000000000000000 R09: 0000000000000000
        R10: 0000000000000001 R11: ffffed13f3f745b2 R12: 0000000000000000
        R13: 0000000000000002 R14: 0000000000000000 R15: 0000000000000000
         is_bpf_text_address+0x32/0xe0
         kernel_text_address+0xec/0x100
         __kernel_text_address+0xe/0x30
         unwind_get_return_address+0x2f/0x50
         __save_stack_trace+0x92/0x100
         save_stack+0x32/0xb0
         __kasan_slab_free+0x130/0x180
         kfree+0xfa/0x2d0
         free_msg+0x24/0x50
         do_msgrcv+0x508/0xe60
         do_syscall_64+0x117/0x400
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Davidlohr said:
       "So after releasing the lock, the msg rbtree/list is empty and new
        calls will not see those in the newly populated tmp_msg list, and
        therefore they cannot access the delayed msg freeing pointers, which
        is good. Also the fact that the node_cache is now freed before the
        actual messages seems to be harmless as this is wanted for
        msg_insert() avoiding GFP_ATOMIC allocations, and after releasing the
        info->lock the thing is freed anyway so it should not change things"
      
      Link: http://lkml.kernel.org/r/1552029161-4957-1-git-send-email-lirongqing@baidu.comSigned-off-by: default avatarLi RongQing <lirongqing@baidu.com>
      Signed-off-by: default avatarZhang Yu <zhangyu31@baidu.com>
      Reviewed-by: default avatarDavidlohr Bueso <dbueso@suse.de>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      7b5598c8
    • Christian Brauner's avatar
      sysctl: return -EINVAL if val violates minmax · 726f69d3
      Christian Brauner authored
      [ Upstream commit e260ad01 ]
      
      Currently when userspace gives us a values that overflow e.g.  file-max
      and other callers of __do_proc_doulongvec_minmax() we simply ignore the
      new value and leave the current value untouched.
      
      This can be problematic as it gives the illusion that the limit has
      indeed be bumped when in fact it failed.  This commit makes sure to
      return EINVAL when an overflow is detected.  Please note that this is a
      userspace facing change.
      
      Link: http://lkml.kernel.org/r/20190210203943.8227-4-christian@brauner.ioSigned-off-by: default avatarChristian Brauner <christian@brauner.io>
      Acked-by: default avatarLuis Chamberlain <mcgrof@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Joe Lawrence <joe.lawrence@redhat.com>
      Cc: Waiman Long <longman@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      726f69d3
    • Hou Tao's avatar
      fs/fat/file.c: issue flush after the writeback of FAT · cc2df631
      Hou Tao authored
      [ Upstream commit bd8309de ]
      
      fsync() needs to make sure the data & meta-data of file are persistent
      after the return of fsync(), even when a power-failure occurs later.  In
      the case of fat-fs, the FAT belongs to the meta-data of file, so we need
      to issue a flush after the writeback of FAT instead before.
      
      Also bail out early when any stage of fsync fails.
      
      Link: http://lkml.kernel.org/r/20190409030158.136316-1-houtao1@huawei.comSigned-off-by: default avatarHou Tao <houtao1@huawei.com>
      Acked-by: default avatarOGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      cc2df631
    • Kangjie Lu's avatar
      rapidio: fix a NULL pointer dereference when create_workqueue() fails · 506e0ced
      Kangjie Lu authored
      [ Upstream commit 23015b22 ]
      
      In case create_workqueue fails, the fix releases resources and returns
      -ENOMEM to avoid NULL pointer dereference.
      Signed-off-by: default avatarKangjie Lu <kjlu@umn.edu>
      Acked-by: default avatarAlexandre Bounine <alex.bou9@gmail.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      506e0ced
  2. 17 Jun, 2019 4 commits