1. 05 Apr, 2019 40 commits
    • dmaengine: imx-dma: fix warning comparison of distinct pointer types · d66f368b
      Anders Roxell authored
      [ Upstream commit 9227ab56 ]
      
      The warning was introduced by commit 930507c1 ("arm64: add basic
      Kconfig symbols for i.MX8"), since the driver got enabled for arm64.
      The warning hasn't been seen before since size_t was 'unsigned int'
      when built on arm32.
      
      ../drivers/dma/imx-dma.c: In function ‘imxdma_sg_next’:
      ../include/linux/kernel.h:846:29: warning: comparison of distinct pointer types lacks a cast
         (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
                                   ^~
      ../include/linux/kernel.h:860:4: note: in expansion of macro ‘__typecheck’
         (__typecheck(x, y) && __no_side_effects(x, y))
          ^~~~~~~~~~~
      ../include/linux/kernel.h:870:24: note: in expansion of macro ‘__safe_cmp’
        __builtin_choose_expr(__safe_cmp(x, y), \
                              ^~~~~~~~~~
      ../include/linux/kernel.h:879:19: note: in expansion of macro ‘__careful_cmp’
       #define min(x, y) __careful_cmp(x, y, <)
                         ^~~~~~~~~~~~~
      ../drivers/dma/imx-dma.c:288:8: note: in expansion of macro ‘min’
        now = min(d->len, sg_dma_len(sg));
              ^~~
      
      Rework so that we use min_t() with size_t, which returns the minimum
      of two values using the specified type.
      Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
      Acked-by: Olof Johansson <olof@lixom.net>
      Reviewed-by: Fabio Estevam <festevam@gmail.com>
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
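The min_t() rework described in the imx-dma commit above can be sketched in userspace as follows. This is a simplified stand-in for the kernel macro, not its actual definition; it relies on the GCC/Clang statement-expression extension, and `chunk_len` is a hypothetical helper that mirrors the `now = min(d->len, sg_dma_len(sg))` line from the warning.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified stand-in for the kernel's min_t(): both operands are cast to
 * the named type before comparing, so operands of different types no
 * longer trip the pointer-type check that plain min() performs.
 */
#define min_t(type, x, y) ({			\
	type min1_ = (type)(x);			\
	type min2_ = (type)(y);			\
	min1_ < min2_ ? min1_ : min2_; })

/* Hypothetical helper: size_t length vs. unsigned int segment length. */
size_t chunk_len(size_t remaining, unsigned int sg_len)
{
	return min_t(size_t, remaining, sg_len);
}
```

On arm32 both operands happened to have the same underlying type, which is presumably why the mismatch only showed up once the driver was built for arm64.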
    • cpu/hotplug: Mute hotplug lockdep during init · 2ae0dd16
      Valentin Schneider authored
      [ Upstream commit ce48c457 ]
      
      Since we've had:
      
        commit cb538267 ("jump_label/lockdep: Assert we hold the hotplug lock for _cpuslocked() operations")
      
      we've been getting some lockdep warnings during init, such as on HiKey960:
      
      [    0.820495] WARNING: CPU: 4 PID: 0 at kernel/cpu.c:316 lockdep_assert_cpus_held+0x3c/0x48
      [    0.820498] Modules linked in:
      [    0.820509] CPU: 4 PID: 0 Comm: swapper/4 Tainted: G S                4.20.0-rc5-00051-g4cae42a #34
      [    0.820511] Hardware name: HiKey960 (DT)
      [    0.820516] pstate: 600001c5 (nZCv dAIF -PAN -UAO)
      [    0.820520] pc : lockdep_assert_cpus_held+0x3c/0x48
      [    0.820523] lr : lockdep_assert_cpus_held+0x38/0x48
      [    0.820526] sp : ffff00000a9cbe50
      [    0.820528] x29: ffff00000a9cbe50 x28: 0000000000000000
      [    0.820533] x27: 00008000b69e5000 x26: ffff8000bff4cfe0
      [    0.820537] x25: ffff000008ba69e0 x24: 0000000000000001
      [    0.820541] x23: ffff000008fce000 x22: ffff000008ba70c8
      [    0.820545] x21: 0000000000000001 x20: 0000000000000003
      [    0.820548] x19: ffff00000a35d628 x18: ffffffffffffffff
      [    0.820552] x17: 0000000000000000 x16: 0000000000000000
      [    0.820556] x15: ffff00000958f848 x14: 455f3052464d4d34
      [    0.820559] x13: 00000000769dde98 x12: ffff8000bf3f65a8
      [    0.820564] x11: 0000000000000000 x10: ffff00000958f848
      [    0.820567] x9 : ffff000009592000 x8 : ffff00000958f848
      [    0.820571] x7 : ffff00000818ffa0 x6 : 0000000000000000
      [    0.820574] x5 : 0000000000000000 x4 : 0000000000000001
      [    0.820578] x3 : 0000000000000000 x2 : 0000000000000001
      [    0.820582] x1 : 00000000ffffffff x0 : 0000000000000000
      [    0.820587] Call trace:
      [    0.820591]  lockdep_assert_cpus_held+0x3c/0x48
      [    0.820598]  static_key_enable_cpuslocked+0x28/0xd0
      [    0.820606]  arch_timer_check_ool_workaround+0xe8/0x228
      [    0.820610]  arch_timer_starting_cpu+0xe4/0x2d8
      [    0.820615]  cpuhp_invoke_callback+0xe8/0xd08
      [    0.820619]  notify_cpu_starting+0x80/0xb8
      [    0.820625]  secondary_start_kernel+0x118/0x1d0
      
      We've also had a similar warning in sched_init_smp() for every
      asymmetric system that would enable the sched_asym_cpucapacity static
      key, although that was singled out in:
      
        commit 40fa3780 ("sched/core: Take the hotplug lock in sched_init_smp()")
      
      Those warnings are actually harmless, since we cannot have hotplug
      operations at the time they appear. Instead of starting to sprinkle
      useless hotplug lock operations in the init codepaths, mute the
      warnings until they start warning about real problems.
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: cai@gmx.us
      Cc: daniel.lezcano@linaro.org
      Cc: dietmar.eggemann@arm.com
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: longman@redhat.com
      Cc: marc.zyngier@arm.com
      Cc: mark.rutland@arm.com
      Link: https://lkml.kernel.org/r/1545243796-23224-2-git-send-email-valentin.schneider@arm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
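The cpu/hotplug commit message above does not show the change itself. A minimal userspace model of the idea, assuming the muting is keyed on how far boot has progressed, might look like this. The enum, its values, and the function name are all hypothetical and only illustrate the decision being made.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced model of the kernel's boot progress states. */
enum model_system_state { MODEL_BOOTING, MODEL_SCHEDULING, MODEL_RUNNING };

/*
 * Returns true if the "hotplug lock held" assertion would warn.
 * Before the system is fully running no hotplug operation can happen,
 * so an unheld lock is deliberately not reported.
 */
bool model_assert_cpus_held_warns(enum model_system_state state, bool lock_held)
{
	if (state < MODEL_RUNNING)
		return false;	/* muted during init */
	return !lock_held;
}
```

The point of the design is that init codepaths do not need to take the lock just to silence the check, while a missing lock after boot is still reported.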
    • hpet: Fix missing '=' character in the __setup() code of hpet_mmap_enable · 96fc367d
      Buland Singh authored
      [ Upstream commit 24d48a61 ]
      
      Commit 3d035f58 ("drivers/char/hpet.c: allow user controlled mmap for
      user processes") introduced a new kernel command line parameter,
      hpet_mmap, that is required to expose the memory map of the HPET
      registers to user-space. Unfortunately the 'hpet_mmap' parameter is
      broken and never takes effect due to a missing '=' character in the
      __setup() code of hpet_mmap_enable.
      
      Before this patch:
      
      dmesg output with the kernel command line parameter hpet_mmap=1
      
      [    0.204152] HPET mmap disabled
      
      dmesg output with the kernel command line parameter hpet_mmap=0
      
      [    0.204192] HPET mmap disabled
      
      After this patch:
      
      dmesg output with the kernel command line parameter hpet_mmap=1
      
      [    0.203945] HPET mmap enabled
      
      dmesg output with the kernel command line parameter hpet_mmap=0
      
      [    0.204652] HPET mmap disabled
      
      Fixes: 3d035f58 ("drivers/char/hpet.c: allow user controlled mmap for user processes")
      Signed-off-by: Buland Singh <bsingh@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
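Why a missing '=' makes hpet_mmap=1 read as "disabled" can be illustrated with a small userspace model. It assumes the handler is given whatever text follows the registered string and parses an integer from it; `model_setup_value` is a hypothetical name and this is not the kernel's parsing code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical model: if `arg` starts with `prefix`, parse the text that
 * follows as an integer. Returns 0 when the prefix does not match or when
 * no digits are found at the start of the remainder.
 */
int model_setup_value(const char *prefix, const char *arg)
{
	size_t n = strlen(prefix);
	char *end;
	long val;

	if (strncmp(arg, prefix, n) != 0)
		return 0;		/* not our parameter */

	val = strtol(arg + n, &end, 10);
	if (end == arg + n)
		return 0;		/* e.g. remainder "=1": nothing parsed */
	return (int)val;
}
```

With the prefix "hpet_mmap" the remainder of "hpet_mmap=1" is "=1", which does not parse as a number, so the value stays 0. With the prefix "hpet_mmap=" the remainder is "1" and the option takes effect.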
    • HID: intel-ish: ipc: handle PIMR before ish_wakeup also clear PISR busy_clear bit · bce54df8
      Song Hongyan authored
      [ Upstream commit 2edefc05 ]
      
      The host driver should set up the interrupt mask register (PIMR) before
      waking up the ISH FW. Otherwise an FW interrupt can arrive while the host
      PIMR register is not yet set up, so move the interrupt mask setting
      before ish_wakeup.
      
      Clear the PISR busy_clear bit in ish_irq_handler. If it is not cleared,
      the host driver can receive a busy_clear interrupt before the busy_clear
      mask bit is ready. It then returns IRQ_NONE after
      check_generated_interrupt and the interrupt is never cleared, so the
      device stops sending further IRQs.
      
      Since PISR clear should not be called for the CHV device, we make this
      change. After the change, both the ISH2HOST interrupt and the busy_clear
      interrupt are treated as interrupts from the ISH, and the busy_clear
      interrupt returns IRQ_HANDLED from the IPC_IS_BUSY check.
      Signed-off-by: Song Hongyan <hongyan.song@intel.com>
      Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • soc/tegra: fuse: Fix illegal free of IO base address · dd4e3eaf
      Timo Alho authored
      [ Upstream commit 51294bf6 ]
      
      In cases where the device tree entries for the fuse and clock provider
      are in a different order, the fuse driver needs to defer probing. This
      leads to freeing an incorrect IO base address, as the fuse->base variable
      gets overwritten once during the first probe invocation. This leads to
      the following spew during boot:
      
      [    3.082285] Trying to vfree() nonexistent vm area (00000000cfe8fd94)
      [    3.082308] WARNING: CPU: 5 PID: 126 at /hdd/l4t/kernel/stable/mm/vmalloc.c:1511 __vunmap+0xcc/0xd8
      [    3.082318] Modules linked in:
      [    3.082330] CPU: 5 PID: 126 Comm: kworker/5:1 Tainted: G S                4.19.7-tegra-gce119d3 #1
      [    3.082340] Hardware name: quill (DT)
      [    3.082353] Workqueue: events deferred_probe_work_func
      [    3.082364] pstate: 40000005 (nZcv daif -PAN -UAO)
      [    3.082372] pc : __vunmap+0xcc/0xd8
      [    3.082379] lr : __vunmap+0xcc/0xd8
      [    3.082385] sp : ffff00000a1d3b60
      [    3.082391] x29: ffff00000a1d3b60 x28: 0000000000000000
      [    3.082402] x27: 0000000000000000 x26: ffff000008e8b610
      [    3.082413] x25: 0000000000000000 x24: 0000000000000009
      [    3.082423] x23: ffff000009221a90 x22: ffff000009f6d000
      [    3.082432] x21: 0000000000000000 x20: 0000000000000000
      [    3.082442] x19: ffff000009f6d000 x18: ffffffffffffffff
      [    3.082452] x17: 0000000000000000 x16: 0000000000000000
      [    3.082462] x15: ffff0000091396c8 x14: 0720072007200720
      [    3.082471] x13: 0720072007200720 x12: 0720072907340739
      [    3.082481] x11: 0764076607380765 x10: 0766076307300730
      [    3.082491] x9 : 0730073007300730 x8 : 0730073007280720
      [    3.082501] x7 : 0761076507720761 x6 : 0000000000000102
      [    3.082510] x5 : 0000000000000000 x4 : 0000000000000000
      [    3.082519] x3 : ffffffffffffffff x2 : ffff000009150ff8
      [    3.082528] x1 : 3d95b1429fff5200 x0 : 0000000000000000
      [    3.082538] Call trace:
      [    3.082545]  __vunmap+0xcc/0xd8
      [    3.082552]  vunmap+0x24/0x30
      [    3.082561]  __iounmap+0x2c/0x38
      [    3.082569]  tegra_fuse_probe+0xc8/0x118
      [    3.082577]  platform_drv_probe+0x50/0xa0
      [    3.082585]  really_probe+0x1b0/0x288
      [    3.082593]  driver_probe_device+0x58/0x100
      [    3.082601]  __device_attach_driver+0x98/0xf0
      [    3.082609]  bus_for_each_drv+0x64/0xc8
      [    3.082616]  __device_attach+0xd8/0x130
      [    3.082624]  device_initial_probe+0x10/0x18
      [    3.082631]  bus_probe_device+0x90/0x98
      [    3.082638]  deferred_probe_work_func+0x74/0xb0
      [    3.082649]  process_one_work+0x1e0/0x318
      [    3.082656]  worker_thread+0x228/0x450
      [    3.082664]  kthread+0x128/0x130
      [    3.082672]  ret_from_fork+0x10/0x18
      [    3.082678] ---[ end trace 0810fe6ba772c1c7 ]---
      
      Fix this by retaining the value of fuse->base until the driver has
      successfully probed.
      Signed-off-by: Timo Alho <talho@nvidia.com>
      Acked-by: Jon Hunter <jonathanh@nvidia.com>
      Signed-off-by: Thierry Reding <treding@nvidia.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
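The tegra fuse fix above, keeping the old fuse->base until probing has succeeded, can be sketched with a small userspace model. The struct, the `clk_ready` flag and the return values are hypothetical; the real driver works with I/O mappings and clock lookups rather than plain pointers.

```c
#include <assert.h>
#include <stddef.h>

struct model_fuse {
	void *base;	/* mapping currently considered valid */
};

/*
 * Hypothetical probe: switch to new_base only if the later step succeeds.
 * When probing has to be deferred (clk_ready == 0) the early mapping is
 * restored, so a retry still sees, and eventually releases, the right
 * address. Returns 0 on success and -1 on deferral.
 */
int model_probe(struct model_fuse *fuse, void *new_base, int clk_ready)
{
	void *early_base = fuse->base;	/* remember the early mapping */

	fuse->base = new_base;
	if (!clk_ready) {
		fuse->base = early_base;	/* undo before bailing out */
		return -1;
	}
	/* success: this is where the early mapping would be released */
	return 0;
}
```

Without the restore step, a deferred probe would leave `base` pointing at the new mapping, and the second attempt would treat that as the early mapping and try to release the wrong address.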
    • hwrng: virtio - Avoid repeated init of completion · 7f14d931
      David Tolnay authored
      [ Upstream commit aef027db ]
      
      The virtio-rng driver uses a completion called have_data to wait for a
      virtio read to be fulfilled by the hypervisor. The completion is reset
      before placing a buffer on the virtio queue and completed by the virtio
      callback once data has been written into the buffer.
      
      Prior to this commit, the driver called init_completion on this
      completion both during probe as well as when registering virtio buffers
      as part of a hwrng read operation. The second of these init_completion
      calls should instead be reinit_completion because the have_data
      completion has already been inited by probe. As described in
      Documentation/scheduler/completion.txt, "Calling init_completion() twice
      on the same completion object is most likely a bug".
      
      This bug was present in the initial implementation of virtio-rng in
      f7f510ec ("virtio: An entropy device, as suggested by hpa"). Back
      then the have_data completion was a single static completion rather than
      a member of one of potentially multiple virtrng_info structs as
      implemented later by 08e53fbd ("virtio-rng: support multiple
      virtio-rng devices"). The original driver incorrectly used
      init_completion rather than INIT_COMPLETION to reset have_data during
      read.
      
      Tested by running `head -c48 /dev/random | hexdump` within crosvm, the
      Chrome OS virtual machine monitor, and confirming that the virtio-rng
      driver successfully produces random bytes from the host.
      Signed-off-by: David Tolnay <dtolnay@gmail.com>
      Tested-by: David Tolnay <dtolnay@gmail.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
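The difference between initialising and re-initialising a completion, which the virtio-rng commit above relies on, can be modelled in userspace roughly as follows. The struct and function names are hypothetical, and `waiters` merely stands in for the kernel's wait queue.

```c
#include <assert.h>

/* Hypothetical model: `waiters` stands in for the kernel's wait queue. */
struct model_completion {
	unsigned int done;
	unsigned int waiters;
};

/* Full initialisation: resets the counter and the waiter bookkeeping. */
void model_init_completion(struct model_completion *c)
{
	c->done = 0;
	c->waiters = 0;
}

/* Re-arm for reuse: only the counter is reset, waiters are untouched. */
void model_reinit_completion(struct model_completion *c)
{
	c->done = 0;
}
```

In this model, calling the full initialiser on an object that is already in use discards its waiter state, which is why re-arming before each read is the safer operation.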
    • media: mt9m111: set initial frame size other than 0x0 · 0bf1f184
      Akinobu Mita authored
      [ Upstream commit 29856308 ]
      
      This driver sets initial frame width and height to 0x0, which is invalid.
      So set it to selection rectangle bounds instead.
      
      This is detected by v4l2-compliance.
      
      Cc: Enrico Scholz <enrico.scholz@sigma-chemnitz.de>
      Cc: Michael Grzeschik <m.grzeschik@pengutronix.de>
      Cc: Marco Felsch <m.felsch@pengutronix.de>
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • usb: dwc3: gadget: Fix OTG events when gadget driver isn't loaded · df4a6a26
      Roger Quadros authored
      [ Upstream commit 169e3b68 ]
      
      On v3.10a in dual-role mode, if port is in device mode
      and gadget driver isn't loaded, the OTG event interrupts don't
      come through.
      
      It seems that if the core is configured to be OTG2.0 only,
      then we can't leave the DCFG.DEVSPD at Super-speed (default)
      if we expect OTG to work properly. It must be set to High-speed.
      
      Fix this issue by configuring DCFG.DEVSPD to the supported
      maximum speed at gadget init. Device tree still needs to provide
      correct supported maximum speed for this to work.
      
      This issue wasn't present on v2.40a but is seen on v3.10a.
      It doesn't cause any side effects on v2.40a.
      Signed-off-by: Roger Quadros <rogerq@ti.com>
      Signed-off-by: Sekhar Nori <nsekhar@ti.com>
      Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • powerpc/pseries: Perform full re-add of CPU for topology update post-migration · 1b4283ff
      Nathan Fontenot authored
      [ Upstream commit 81b61324 ]
      
      On pseries systems, performing a partition migration can result in
      altering the nodes a CPU is assigned to on the destination system. For
      example, pre-migration on the source system CPUs are in nodes 1 and 3,
      post-migration on the destination system CPUs are in nodes 2 and 3.
      
      Handling the node change for a CPU can cause corruption in the slab
      cache if we hit a timing where a CPU's node is changed while cache_reap()
      is invoked. The corruption occurs because the slab cache code appears
      to rely on the CPU and slab cache pages being on the same node.
      
      The current dynamic updating of a CPU's node done in arch/powerpc/mm/numa.c
      does not prevent us from hitting this scenario.
      
      Changing the device tree property update notification handler that
      recognizes an affinity change for a CPU to do a full DLPAR remove and
      add of the CPU instead of dynamically changing its node resolves this
      issue.
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Michael W. Bringmann <mwb@linux.vnet.ibm.com>
      Tested-by: Michael W. Bringmann <mwb@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • tty: increase the default flip buffer limit to 2*640K · 405edea4
      Manfred Schlaegl authored
      [ Upstream commit 7ab57b76 ]
      
      We increase the default limit for buffer memory allocation by a factor of
      10 to 640K to prevent data loss when using fast serial interfaces.
      
      For example, when using RS485 without flow control at speeds of 1 Mbit/s
      and upwards we've run into problems such as applications being too slow
      to read out this buffer (on embedded devices based on imx53 or imx6).
      
      If you want to write transmitted data to a slow SD card and thus have
      realtime requirements, this limit can become a problem.
      
      That shouldn't be the case and 640K buffers fix such problems for us.
      
      This value is a maximum limit for allocation only. It has no effect
      on systems that currently run fine. When transmission is slow enough
      applications and hardware can keep up and increasing this limit
      doesn't change anything.
      
      It only _allows_ allocating more than 2*64K in cases where we currently
      fail to allocate memory despite having some available.
      Signed-off-by: Manfred Schlaegl <manfred.schlaegl@ginzinger.com>
      Signed-off-by: Martin Kepplinger <martin.kepplinger@ginzinger.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
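To get a feel for the numbers in the tty commit above, the time a buffer of a given size takes to fill can be estimated with a back-of-the-envelope helper. It assumes 10 bits on the wire per byte (8N1 framing) and that nothing drains the buffer in the meantime; neither assumption comes from the commit, and `fill_time_ms` is a hypothetical name.

```c
#include <assert.h>

/*
 * Milliseconds until `limit_bytes` of buffer space fill up at `baud`
 * bits per second when nothing is reading. `bits_per_char` is 10 for
 * 8N1 framing (an assumption; the commit does not state the framing).
 */
unsigned long long fill_time_ms(unsigned long long limit_bytes,
				unsigned long long baud,
				unsigned long long bits_per_char)
{
	return limit_bytes * bits_per_char * 1000ULL / baud;
}
```

Under these assumptions a 64K limit is exhausted after roughly 0.65 s of stalled reading at 1 Mbit/s, while a 640K limit gives roughly 6.5 s of slack.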
    • backlight: pwm_bl: Use gpiod_get_value_cansleep() to get initial state · f325ee43
      Chen-Yu Tsai authored
      [ Upstream commit cec2b188 ]
      
      gpiod_get_value() gives out a warning if access to the underlying gpiochip
      requires sleeping, which is common for I2C based chips:
      
          WARNING: CPU: 0 PID: 77 at drivers/gpio/gpiolib.c:2500 gpiod_get_value+0xd0/0x100
          Modules linked in:
          CPU: 0 PID: 77 Comm: kworker/0:2 Not tainted 4.14.0-rc3-00589-gf32897915d48-dirty #90
          Hardware name: Allwinner sun4i/sun5i Families
          Workqueue: events deferred_probe_work_func
          [<c010ec50>] (unwind_backtrace) from [<c010b784>] (show_stack+0x10/0x14)
          [<c010b784>] (show_stack) from [<c0797224>] (dump_stack+0x88/0x9c)
          [<c0797224>] (dump_stack) from [<c0125b08>] (__warn+0xe8/0x100)
          [<c0125b08>] (__warn) from [<c0125bd0>] (warn_slowpath_null+0x20/0x28)
          [<c0125bd0>] (warn_slowpath_null) from [<c037069c>] (gpiod_get_value+0xd0/0x100)
          [<c037069c>] (gpiod_get_value) from [<c03778d0>] (pwm_backlight_probe+0x238/0x508)
          [<c03778d0>] (pwm_backlight_probe) from [<c0411a2c>] (platform_drv_probe+0x50/0xac)
          [<c0411a2c>] (platform_drv_probe) from [<c0410224>] (driver_probe_device+0x238/0x2e8)
          [<c0410224>] (driver_probe_device) from [<c040e820>] (bus_for_each_drv+0x44/0x94)
          [<c040e820>] (bus_for_each_drv) from [<c040ff0c>] (__device_attach+0xb0/0x114)
          [<c040ff0c>] (__device_attach) from [<c040f4f8>] (bus_probe_device+0x84/0x8c)
          [<c040f4f8>] (bus_probe_device) from [<c040f944>] (deferred_probe_work_func+0x50/0x14c)
          [<c040f944>] (deferred_probe_work_func) from [<c013be84>] (process_one_work+0x1ec/0x414)
          [<c013be84>] (process_one_work) from [<c013ce5c>] (worker_thread+0x2b0/0x5a0)
          [<c013ce5c>] (worker_thread) from [<c0141908>] (kthread+0x14c/0x154)
          [<c0141908>] (kthread) from [<c0107ab0>] (ret_from_fork+0x14/0x24)
      
      This was missed in commit 0c9501f8 ("backlight: pwm_bl: Handle gpio
      that can sleep"). The code was then moved to a separate function in
      commit 7613c922 ("backlight: pwm_bl: Move the checks for initial power
      state to a separate function").
      
      The only usage of gpiod_get_value() is during the probe stage, which is
      safe to sleep in. Switch to gpiod_get_value_cansleep().
      
      Fixes: 0c9501f8 ("backlight: pwm_bl: Handle gpio that can sleep")
      Signed-off-by: Chen-Yu Tsai <wens@csie.org>
      Acked-by: Maxime Ripard <maxime.ripard@bootlin.com>
      Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
      Signed-off-by: Lee Jones <lee.jones@linaro.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • cgroup/pids: turn cgroup_subsys->free() into cgroup_subsys->release() to fix the accounting · f3b3b543
      Oleg Nesterov authored
      [ Upstream commit 51bee5ab ]
      
      The only user of cgroup_subsys->free() callback is pids_cgrp_subsys which
      needs pids_free() to uncharge the pid.
      
      However, ->free() is called from __put_task_struct()->cgroup_free() and this
      is too late. Even the trivial program which does
      
      	for (;;) {
      		int pid = fork();
      		assert(pid >= 0);
      		if (pid)
      			wait(NULL);
      		else
      			exit(0);
      	}
      
      can run out of limits because release_task()->call_rcu(delayed_put_task_struct)
      implies an RCU gp after the task/pid goes away and before the final put().
      
      Test-case:
      
      	mkdir -p /tmp/CG
      	mount -t cgroup2 none /tmp/CG
      	echo '+pids' > /tmp/CG/cgroup.subtree_control
      
      	mkdir /tmp/CG/PID
      	echo 2 > /tmp/CG/PID/pids.max
      
      	perl -e 'while ($p = fork) { wait; } $p // die "fork failed: $!\n"' &
      	echo $! > /tmp/CG/PID/cgroup.procs
      
      Without this patch the forking process fails soon after migration.
      
      Rename cgroup_subsys->free() to cgroup_subsys->release() and move the callsite
      into the new helper, cgroup_release(), called by release_task() which actually
      frees the pid(s).
      Reported-by: Herton R. Krzesinski <hkrzesin@redhat.com>
      Reported-by: Jan Stancek <jstancek@redhat.com>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • bpf: fix missing prototype warnings · 46b2c037
      Valdis Kletnieks authored
      [ Upstream commit 116bfa96 ]
      
      Compiling with W=1 generates warnings:
      
        CC      kernel/bpf/core.o
      kernel/bpf/core.c:721:12: warning: no previous prototype for ‘bpf_jit_alloc_exec_limit’ [-Wmissing-prototypes]
        721 | u64 __weak bpf_jit_alloc_exec_limit(void)
            |            ^~~~~~~~~~~~~~~~~~~~~~~~
      kernel/bpf/core.c:757:14: warning: no previous prototype for ‘bpf_jit_alloc_exec’ [-Wmissing-prototypes]
        757 | void *__weak bpf_jit_alloc_exec(unsigned long size)
            |              ^~~~~~~~~~~~~~~~~~
      kernel/bpf/core.c:762:13: warning: no previous prototype for ‘bpf_jit_free_exec’ [-Wmissing-prototypes]
        762 | void __weak bpf_jit_free_exec(void *addr)
            |             ^~~~~~~~~~~~~~~~~
      
      All three are weak functions that archs can override. Provide
      proper prototypes for when a new arch provides its own.
      Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
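The pattern behind the bpf warnings above, a weak default that an architecture may override, paired with a prototype so that -Wmissing-prototypes stays quiet, can be sketched as below. The function name and return value are hypothetical placeholders, and the attribute syntax assumes GCC or Clang.

```c
#include <assert.h>

/* Prototype, as it would appear in a shared header (name hypothetical). */
unsigned long model_jit_alloc_limit(void);

/*
 * Weak default definition. Another object file could supply a strong
 * definition of the same symbol, and the linker would prefer that one.
 */
__attribute__((weak)) unsigned long model_jit_alloc_limit(void)
{
	return 4096UL;	/* arbitrary placeholder value */
}
```

Declaring the prototype once in a header also means an overriding definition is checked against the same signature as the default.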
    • ARM: avoid Cortex-A9 livelock on tight dmb loops · e2cadf02
      Russell King authored
      [ Upstream commit 5388a5b8 ]
      
      machine_crash_nonpanic_core() does this:
      
      	while (1)
      		cpu_relax();
      
      because the kernel has crashed, and we have no known safe way to deal
      with the CPU.  So, we place the CPU into an infinite loop which we
      expect it to never exit - at least not until the system as a whole is
      reset by some method.
      
      In the absence of erratum 754327, this code assembles to:
      
      	b	.
      
      In other words, an infinite loop.  When erratum 754327 is enabled,
      this becomes:
      
      1:	dmb
      	b	1b
      
      It has been observed on some systems (e.g., OMAP4) that, if a
      crash is triggered, the system tries to kexec into the panic kernel,
      but fails after taking the secondary CPU down - placing it into one
      of these loops.  This causes the system to livelock, and the most
      noticeable effect is that the system stops after issuing:
      
      	Loading crashdump kernel...
      
      to the system console.
      
      The tested as working solution I came up with was to add wfe() to
      these infinite loops thusly:
      
      	while (1) {
      		cpu_relax();
      		wfe();
      	}
      
      which, without 754327, builds to:
      
      1:	wfe
      	b	1b
      
      or, with 754327 enabled:
      
      1:	dmb
      	wfe
      	b	1b
      
      Adding "wfe" does two things depending on the environment we're running
      under:
      - where we're running on bare metal, and the processor implements
        "wfe", it stops us spinning endlessly in a loop where we're never
        going to do any useful work.
      - if we're running in a VM, it allows the CPU to be given back to the
        hypervisor and rescheduled for other purposes (maybe a different VM)
        rather than wasting CPU cycles inside a crashed VM.
      
      However, in light of erratum 794072, Will Deacon wanted to see 10 nops
      as well - which is reasonable to cover the case where we have erratum
      754327 enabled _and_ we have a processor that doesn't implement the
      wfe hint.
      
      So, we now end up with:
      
      1:      wfe
              b       1b
      
      when erratum 754327 is disabled, or:
      
      1:      dmb
              nop
              nop
              nop
              nop
              nop
              nop
              nop
              nop
              nop
              nop
              wfe
              b       1b
      
      when erratum 754327 is enabled.  We also get the dmb + 10 nop
      sequence elsewhere in the kernel, in terminating loops.
      
      This is reasonable - it means we get the workaround for erratum
      794072 when erratum 754327 is enabled, but still relinquish the dead
      processor - either by placing it in a lower power mode when wfe is
      implemented as such or by returning it to the hypervisor, or in the
      case where wfe is a no-op, we use the workaround specified in erratum
      794072 to avoid the problem.
      
      These are two entirely orthogonal problems - the 10 nops address
      erratum 794072, and the wfe is an optimisation that makes the system
      more efficient when crashed either in terms of power consumption or
      by allowing the host/other VMs to make use of the CPU.
      
      I don't see any reason not to use kexec() inside a VM - it has the
      potential to provide automated recovery from a failure of the VM's
      kernel with the opportunity for saving a crashdump of the failure.
      A panic() with a reboot timeout won't do that, and reading the
      libvirt documentation, setting on_reboot to "preserve" won't either
      (the documentation states "The preserve action for an on_reboot event
      is treated as a destroy".)  Surely it has to be a good thing to
      avoid having CPUs spinning inside a VM that is doing no useful
      work.
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ARM: 8830/1: NOMMU: Toggle only bits in EXC_RETURN we are really care of · cbe71d95
      Vladimir Murzin authored
      [ Upstream commit 72cd4064 ]
      
      ARMv8M introduces support for the Security extension to the M class.
      Among other things, it affects exception handling, especially the
      encoding of EXC_RETURN.
      
      New bits have been added:
      
      Bit [6]	Secure or Non-secure stack
      Bit [5]	Default callee register stacking
      Bit [0]	Exception Secure
      
      which conflict with the hard-coded value of EXC_RETURN.
      
      In fact, we only care about a few bits:
      
      Bit [3]	 Mode (0 - Handler, 1 - Thread)
      Bit [2]	 Stack pointer selection (0 - Main, 1 - Process)
      
      We can toggle only those bits and leave the other bits as they were on
      exception entry.
      
      That is basically what the patch does: it saves EXC_RETURN when we
      transition from Thread to Handler mode (on the first svc), so that the
      saved value is later used instead of EXC_RET_THREADMODE_PROCESSSTACK.
      Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
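The approach in the EXC_RETURN commit above, changing only the mode and stack-selection bits of the saved value, can be sketched in C as follows. The macro and function names are hypothetical; the bit positions are the ones listed in the commit message, and the real change lives in assembly entry code rather than C.

```c
#include <assert.h>
#include <stdint.h>

#define EXC_RET_MODE_THREAD	(1u << 3)	/* bit [3]: 1 = Thread mode */
#define EXC_RET_SPSEL_PROCESS	(1u << 2)	/* bit [2]: 1 = Process stack */

/*
 * Hypothetical helper: take the EXC_RETURN value saved on exception entry
 * and request a return to Thread mode on the Process stack, leaving every
 * other bit (including the newly added security bits) as it was.
 */
uint32_t exc_return_thread_process(uint32_t saved)
{
	return saved | EXC_RET_MODE_THREAD | EXC_RET_SPSEL_PROCESS;
}
```

Because only two bits are set, whatever the hardware placed in bits [6], [5] and [0] on entry is carried through unchanged, which a hard-coded constant cannot do.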
    • mt7601u: bump supported EEPROM version · e7bde590
      Stanislaw Gruszka authored
      [ Upstream commit 3bd1505f ]
      
      As reported by Michael, EEPROM version 0d is supported and works with
      the driver.
      
      Dump of /sys/kernel/debug/ieee80211/phy1/mt7601u/eeprom_param
      with 0d EEPROM looks like this:
      
      RSSI offset: 0 0
      Reference temp: f9
      LNA gain: 8
      Reg channels: 1-14
      Per rate power:
      	 raw:05 bw20:05 bw40:05
      	 raw:05 bw20:05 bw40:05
      	 raw:03 bw20:03 bw40:03
      	 raw:03 bw20:03 bw40:03
      	 raw:04 bw20:04 bw40:04
      	 raw:00 bw20:00 bw40:00
      	 raw:00 bw20:00 bw40:00
      	 raw:00 bw20:00 bw40:00
      	 raw:02 bw20:02 bw40:02
      	 raw:00 bw20:00 bw40:00
      Per channel power:
      	 tx_power  ch1:09 ch2:09
      	 tx_power  ch3:0a ch4:0a
      	 tx_power  ch5:0a ch6:0a
      	 tx_power  ch7:0b ch8:0b
      	 tx_power  ch9:0b ch10:0b
      	 tx_power  ch11:0b ch12:0b
      	 tx_power  ch13:0b ch14:0b
      Reported-and-tested-by: Michael <ZeroBeat@gmx.de>
      Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
      Acked-by: Jakub Kicinski <kubakici@wp.pl>
      Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • Alexey Khoroshilov's avatar
      soc: qcom: gsbi: Fix error handling in gsbi_probe() · 8d7504c5
      Alexey Khoroshilov authored
      [ Upstream commit 8cd09a3d ]
      
      If of_platform_populate() fails in gsbi_probe(),
      gsbi->hclk is left enabled.
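      
      The general shape of such a fix can be sketched with mock objects.
      Everything below (the mock clock, mock_populate(), mock_probe()) is a
      hypothetical stand-in for illustration and not the driver's code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a clock that tracks how often it is enabled. */
struct mock_clk {
	int enable_count;
};

static void mock_clk_enable(struct mock_clk *clk)  { clk->enable_count++; }
static void mock_clk_disable(struct mock_clk *clk) { clk->enable_count--; }

/* Stand-in for the populate step; returns 0 on success, negative on error. */
static int mock_populate(bool fail)
{
	return fail ? -1 : 0;
}

/* Probe sketch: the clock must not stay enabled when a later step fails. */
static int mock_probe(struct mock_clk *hclk, bool populate_fails)
{
	int ret;

	assert(hclk != NULL);
	mock_clk_enable(hclk);

	ret = mock_populate(populate_fails);
	if (ret)
		mock_clk_disable(hclk);   /* undo the enable on the error path */

	return ret;
}
```

      After a failed mock_probe() the enable count is back at its previous
      value; after a successful one the clock stays enabled.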
      
      Found by Linux Driver Verification project (linuxtesting.org).
      Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
      Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: Andy Gross <andy.gross@linaro.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      8d7504c5
    • Ard Biesheuvel's avatar
      efi/arm/arm64: Allow SetVirtualAddressMap() to be omitted · eb262db3
      Ard Biesheuvel authored
      [ Upstream commit 4e46c2a9 ]
      
      The UEFI spec revision 2.7 errata A section 8.4 has the following to
      say about the virtual memory runtime services:
      
        "This section contains function definitions for the virtual memory
        support that may be optionally used by an operating system at runtime.
        If an operating system chooses to make EFI runtime service calls in a
        virtual addressing mode instead of the flat physical mode, then the
        operating system must use the services in this section to switch the
        EFI runtime services from flat physical addressing to virtual
        addressing."
      
      So it is pretty clear that calling SetVirtualAddressMap() is entirely
      optional, and so there is no point in doing so unless it achieves
      anything useful for us.
      
      This is not the case for 64-bit ARM. The identity mapping used by the
      firmware is arbitrarily converted into another permutation of userland
      addresses (i.e., bits [63:48] cleared), and the runtime code could easily
      deal with the original layout in exactly the same way as it deals with
      the converted layout. However, due to constraints related to page size
      differences if the OS is not running with 4k pages, and related to
      systems that may expose the individual sections of PE/COFF runtime
      modules as different memory regions, creating the virtual layout is a
      bit fiddly, and requires us to sort the memory map and reason about
      adjacent regions with identical memory types etc etc.
      
      So the obvious fix is to stop calling SetVirtualAddressMap() altogether
      on arm64 systems. However, to avoid surprises, which are notoriously
      hard to diagnose when it comes to OS<->firmware interactions, let's
      start by making it an opt-out feature, and implement support for the
      'efi=novamap' kernel command line parameter on ARM and arm64 systems.
      
      ( Note that 32-bit ARM generally does require SetVirtualAddressMap() to be
        used, given that the physical memory map and the kernel virtual address
        map are not guaranteed to be non-overlapping like on arm64. However,
        having support for efi=novamap,noruntime on 32-bit ARM, combined with
        the recently proposed support for earlycon=efifb, is likely to be useful
        to diagnose boot issues on such systems if they have no accessible serial
        port. )
      Tested-by: Jeffrey Hugo <jhugo@codeaurora.org>
      Tested-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Tested-by: Lee Jones <lee.jones@linaro.org>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Alexander Graf <agraf@suse.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
      Cc: Leif Lindholm <leif.lindholm@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Peter Jones <pjones@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/20190202094119.13230-8-ard.biesheuvel@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      eb262db3
    • Mathieu Malaterre's avatar
      ARM: dts: lpc32xx: Remove leading 0x and 0s from bindings notation · 80a4813b
      Mathieu Malaterre authored
      [ Upstream commit 3e3380d0 ]
      
      Improve the DTS files by removing all the leading "0x" and zeros to fix
      the following dtc warnings:
      
      Warning (unit_address_format): Node /XXX unit name should not have leading "0x"
      
      and
      
      Warning (unit_address_format): Node /XXX unit name should not have leading 0s
      
      Converted using the following command:
      
      find . -type f \( -iname *.dts -o -iname *.dtsi \) -exec sed -i -e "s/@\([0-9a-fA-FxX\.;:#]+\)\s*{/@\L\1 {/g" -e "s/@0x\(.*\) {/@\1 {/g" -e "s/@0+\(.*\) {/@\1 {/g" {} +
      
      For simplicity, a separate sed expression was used to fix each of the
      two warnings.
      
      To make the regex expression more robust a few other issues were resolved,
      namely setting unit-address to lower case, and adding a whitespace before
      the opening curly brace:
      
      https://elinux.org/Device_Tree_Linux#Linux_conventions
      
      As a side effect, this also resolves the warning:
      
      Warning (simple_bus_reg): Node /XXX@<UPPER> simple-bus unit address format error, expected "<lower>"
      
      This is a follow up to commit 4c9847b7 ("dt-bindings: Remove leading 0x from bindings notation")
      Reported-by: David Daney <ddaney@caviumnetworks.com>
      Suggested-by: Rob Herring <robh@kernel.org>
      Signed-off-by: Mathieu Malaterre <malat@debian.org>
      [vzapolskiy: fixed commit message to pass checkpatch.pl test]
      Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      80a4813b
    • Ard Biesheuvel's avatar
      efi/memattr: Don't bail on zero VA if it equals the region's PA · 7d84e045
      Ard Biesheuvel authored
      [ Upstream commit 5de0fef0 ]
      
      The EFI memory attributes code cross-references the EFI memory map with
      the more granular EFI memory attributes table to ensure that they are in
      sync before applying the strict permissions to the regions it describes.
      
      Since we always install virtual mappings for the EFI runtime regions to
      which these strict permissions apply, we currently perform a sanity check
      on the EFI memory descriptor, and ensure that the EFI_MEMORY_RUNTIME bit
      is set, and that the virtual address has been assigned.
      
      However, in cases where a runtime region exists at physical address 0x0,
      and the virtual mapping equals the physical mapping, e.g., when running
      in mixed mode on x86, we encounter a memory descriptor with the runtime
      attribute and virtual address 0x0, and incorrectly draw the conclusion
      that a runtime region exists for which no virtual mapping was installed,
      and give up altogether. The consequence of this is that firmware mappings
      retain their read-write-execute permissions, making the system more
      vulnerable to attacks.
      
      So let's only bail if the virtual address of 0x0 has been assigned to a
      physical region that does not reside at address 0x0.
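      
      A minimal sketch of the adjusted check, assuming a simplified
      descriptor. The struct, its field names and the function name are
      hypothetical; the boolean stands in for the EFI_MEMORY_RUNTIME
      attribute bit:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical stand-in for an EFI memory descriptor. */
struct mock_mem_desc {
	uint64_t phys_addr;
	uint64_t virt_addr;
	bool runtime;   /* stands in for the EFI_MEMORY_RUNTIME attribute */
};

/* True if a runtime region looks like it never got a virtual mapping. */
static bool lacks_virtual_mapping(const struct mock_mem_desc *md)
{
	assert(md != NULL);

	if (!md->runtime)
		return false;

	/* VA 0x0 is only suspicious if the region does not live at PA 0x0. */
	return md->virt_addr == 0 && md->phys_addr != 0;
}
```

      A runtime region at physical address 0x0 that is mapped 1:1 is no
      longer treated as unmapped, while a region at a non-zero physical
      address with virtual address 0x0 still is.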
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Alexander Graf <agraf@suse.de>
      Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
      Cc: Jeffrey Hugo <jhugo@codeaurora.org>
      Cc: Lee Jones <lee.jones@linaro.org>
      Cc: Leif Lindholm <leif.lindholm@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Peter Jones <pjones@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Fixes: 10f0d2f5 ("efi: Implement generic support for the Memory ...")
      Link: http://lkml.kernel.org/r/20190202094119.13230-4-ard.biesheuvel@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      7d84e045
    • Hidetoshi Seto's avatar
      sched/debug: Initialize sd_sysctl_cpus if !CONFIG_CPUMASK_OFFSTACK · 351ea69c
      Hidetoshi Seto authored
      [ Upstream commit 1ca4fa3a ]
      
      register_sched_domain_sysctl() copies the cpu_possible_mask into
      sd_sysctl_cpus, but only if sd_sysctl_cpus hasn't already been
      allocated (ie, CONFIG_CPUMASK_OFFSTACK is set).  However, when
      CONFIG_CPUMASK_OFFSTACK is not set, sd_sysctl_cpus is left
      uninitialized (all zeroes) and the kernel may fail to initialize
      sched_domain sysctl entries for all possible CPUs.
      
      This is visible to the user if the kernel is booted with maxcpus=n, or
      if ACPI tables have been modified to leave CPUs offline, and then
      checking for missing /proc/sys/kernel/sched_domain/cpu* entries.
      
      Fix this by separating the allocation and initialization, and adding a
      flag to initialize the possible CPU entries only while the system is booting.
      Tested-by: Syuuichirou Ishii <ishii.shuuichir@jp.fujitsu.com>
      Tested-by: Tarumizu, Kohei <tarumizu.kohei@jp.fujitsu.com>
      Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
      Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masayoshi Mizuma <msys.mizuma@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/20190129151245.5073-1-msys.mizuma@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      351ea69c
    • wen yang's avatar
      ASoC: fsl-asoc-card: fix object reference leaks in fsl_asoc_card_probe · bdd46d58
      wen yang authored
      [ Upstream commit 11907e9d ]
      
      of_find_device_by_node() takes a reference to the underlying device
      structure, so we should release that reference.
      Signed-off-by: Wen Yang <yellowriver2010@hotmil.com>
      Cc: Timur Tabi <timur@kernel.org>
      Cc: Nicolin Chen <nicoleotsuka@gmail.com>
      Cc: Xiubo Li <Xiubo.Lee@gmail.com>
      Cc: Fabio Estevam <festevam@gmail.com>
      Cc: Liam Girdwood <lgirdwood@gmail.com>
      Cc: Mark Brown <broonie@kernel.org>
      Cc: Jaroslav Kysela <perex@perex.cz>
      Cc: Takashi Iwai <tiwai@suse.com>
      Cc: alsa-devel@alsa-project.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      bdd46d58
    • Rajneesh Bhardwaj's avatar
      platform/x86: intel_pmc_core: Fix PCH IP sts reading · d9e09a1d
      Rajneesh Bhardwaj authored
      [ Upstream commit 0e68eeea ]
      
      A previous commit "platform/x86: intel_pmc_core: Make the driver PCH
      family agnostic <c977b98b>" provided
      better abstraction to this driver but has some fundamental issues.
      
      e.g. the following condition
      
      for (index = 0; index < pmcdev->map->ppfear_buckets &&
      	index < PPFEAR_MAX_NUM_ENTRIES; index++, iter++)
      
      is wrong because for CNL, PPFEAR_MAX_NUM_ENTRIES is hardcoded as 5 which
      is _wrong_ and even though ppfear_buckets is 8, the loop fails to read
      all eight registers needed for CNL PCH i.e. PPFEAR0 and PPFEAR1. This
      patch refactors the pfear show logic to correctly read PCH IP power
      gating status for Cannonlake and beyond.
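      
      To make the effect of the quoted loop condition concrete, the sketch
      below merely counts how many registers such a loop visits. The
      function name is hypothetical; the values 8 and 5 are the ones quoted
      above:

```c
#include <assert.h>

/* Count the iterations of a loop bounded by both limits, as quoted above. */
static unsigned int registers_visited(unsigned int ppfear_buckets,
				      unsigned int max_entries)
{
	unsigned int index, visited = 0;

	for (index = 0; index < ppfear_buckets && index < max_entries; index++)
		visited++;

	return visited;
}
```

      With ppfear_buckets = 8 and a hard-coded cap of 5, only five of the
      eight registers are visited.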
      
      Cc: "David E. Box" <david.e.box@intel.com>
      Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Fixes: c977b98b ("platform/x86: intel_pmc_core: Make the driver PCH family agnostic")
      Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@linux.intel.com>
      Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      d9e09a1d
    • Konstantin Khlebnikov's avatar
      e1000e: fix cyclic resets at link up with active tx · 9ae89542
      Konstantin Khlebnikov authored
      [ Upstream commit 0f9e980b ]
      
      I'm seeing a series of e1000e resets (sometimes endless) at system boot
      if something generates tx traffic at this time. In my case it is
      netconsole, which sends the message "e1000e 0000:02:00.0: Some CPU C-states
      have been disabled in order to enable jumbo frames" from e1000e itself.
      As a result, e1000_watchdog_task sees a used tx buffer while the carrier
      is off and starts the reset cycle again.
      
      [   17.794359] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
      [   17.794714] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
      [   22.936455] e1000e 0000:02:00.0 eth1: changing MTU from 1500 to 9000
      [   23.033336] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames
      [   26.102364] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
      [   27.174495] 8021q: 802.1Q VLAN Support v1.8
      [   27.174513] 8021q: adding VLAN 0 to HW filter on device eth1
      [   30.671724] cgroup: cgroup: disabling cgroup2 socket matching due to net_prio or net_cls activation
      [   30.898564] netpoll: netconsole: local port 6666
      [   30.898566] netpoll: netconsole: local IPv6 address 2a02:6b8:0:80b:beae:c5ff:fe28:23f8
      [   30.898567] netpoll: netconsole: interface 'eth1'
      [   30.898568] netpoll: netconsole: remote port 6666
      [   30.898568] netpoll: netconsole: remote IPv6 address 2a02:6b8:b000:605c:e61d:2dff:fe03:3790
      [   30.898569] netpoll: netconsole: remote ethernet address b0:a8:6e:f4:ff:c0
      [   30.917747] console [netcon0] enabled
      [   30.917749] netconsole: network logging started
      [   31.453353] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames
      [   34.185730] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames
      [   34.321840] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames
      [   34.465822] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames
      [   34.597423] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames
      [   34.745417] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames
      [   34.877356] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames
      [   35.005441] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames
      [   35.157376] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames
      [   35.289362] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames
      [   35.417441] e1000e 0000:02:00.0: Some CPU C-states have been disabled in order to enable jumbo frames
      [   37.790342] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
      
      This patch flushes tx buffers only once when carrier is off
      rather than at each watchdog iteration.
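      
      The idea of flushing only on the link-down transition can be sketched
      with a toy state machine. The struct, its fields and mock_watchdog()
      are hypothetical simplifications and not the driver's code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified adapter state. */
struct mock_adapter {
	bool link_up;          /* what the hardware reports */
	bool carrier_ok;       /* what the network stack has been told */
	unsigned int tx_used;  /* tx descriptors still in use */
	unsigned int resets;   /* resets requested so far */
};

/* One watchdog iteration. Returns true if it requested a reset. */
static bool mock_watchdog(struct mock_adapter *a)
{
	assert(a != NULL);

	if (a->link_up) {
		a->carrier_ok = true;
		return false;
	}

	if (!a->carrier_ok)
		return false;   /* carrier already off: do not flush again */

	/* Link just went down: flush used tx buffers once via a reset. */
	a->carrier_ok = false;
	if (a->tx_used == 0)
		return false;

	a->tx_used = 0;
	a->resets++;
	return true;
}
```

      Traffic queued while the carrier is already off (as netconsole does
      in the log above) then no longer triggers another reset on the next
      iteration.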
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      9ae89542
    • Guenter Roeck's avatar
      cdrom: Fix race condition in cdrom_sysctl_register · 7b3a8430
      Guenter Roeck authored
      [ Upstream commit f25191bb ]
      
      The following traceback is sometimes seen when booting an image in qemu:
      
      [   54.608293] cdrom: Uniform CD-ROM driver Revision: 3.20
      [   54.611085] Fusion MPT base driver 3.04.20
      [   54.611877] Copyright (c) 1999-2008 LSI Corporation
      [   54.616234] Fusion MPT SAS Host driver 3.04.20
      [   54.635139] sysctl duplicate entry: /dev/cdrom//info
      [   54.639578] CPU: 0 PID: 266 Comm: kworker/u4:5 Not tainted 5.0.0-rc5 #1
      [   54.639578] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
      [   54.641273] Workqueue: events_unbound async_run_entry_fn
      [   54.641273] Call Trace:
      [   54.641273]  dump_stack+0x67/0x90
      [   54.641273]  __register_sysctl_table+0x50b/0x570
      [   54.641273]  ? rcu_read_lock_sched_held+0x6f/0x80
      [   54.641273]  ? kmem_cache_alloc_trace+0x1c7/0x1f0
      [   54.646814]  __register_sysctl_paths+0x1c8/0x1f0
      [   54.646814]  cdrom_sysctl_register.part.7+0xc/0x5f
      [   54.646814]  register_cdrom.cold.24+0x2a/0x33
      [   54.646814]  sr_probe+0x4bd/0x580
      [   54.646814]  ? __driver_attach+0xd0/0xd0
      [   54.646814]  really_probe+0xd6/0x260
      [   54.646814]  ? __driver_attach+0xd0/0xd0
      [   54.646814]  driver_probe_device+0x4a/0xb0
      [   54.646814]  ? __driver_attach+0xd0/0xd0
      [   54.646814]  bus_for_each_drv+0x73/0xc0
      [   54.646814]  __device_attach+0xd6/0x130
      [   54.646814]  bus_probe_device+0x9a/0xb0
      [   54.646814]  device_add+0x40c/0x670
      [   54.646814]  ? __pm_runtime_resume+0x4f/0x80
      [   54.646814]  scsi_sysfs_add_sdev+0x81/0x290
      [   54.646814]  scsi_probe_and_add_lun+0x888/0xc00
      [   54.646814]  ? scsi_autopm_get_host+0x21/0x40
      [   54.646814]  __scsi_add_device+0x116/0x130
      [   54.646814]  ata_scsi_scan_host+0x93/0x1c0
      [   54.646814]  async_run_entry_fn+0x34/0x100
      [   54.646814]  process_one_work+0x237/0x5e0
      [   54.646814]  worker_thread+0x37/0x380
      [   54.646814]  ? rescuer_thread+0x360/0x360
      [   54.646814]  kthread+0x118/0x130
      [   54.646814]  ? kthread_create_on_node+0x60/0x60
      [   54.646814]  ret_from_fork+0x3a/0x50
      
      The only sensible explanation is that cdrom_sysctl_register() is called
      twice, once from the module init function and once from register_cdrom().
      cdrom_sysctl_register() is not mutex protected and may happily execute
      twice if the second call is made before the first call is complete.
      
      Use a static atomic to ensure that the function is executed exactly once.
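      
      A user-space sketch of the "static atomic" guard, using C11
      <stdatomic.h> as a stand-in for the kernel's atomic type. The function
      and counter names are hypothetical:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Counts how often the guarded body actually ran (for illustration). */
static int registrations;

/* Runs the registration body at most once, even with concurrent callers. */
static bool register_once(void)
{
	static atomic_int initialized;   /* zero-initialized */
	int expected = 0;

	/* Only the caller that flips 0 -> 1 proceeds; all others return. */
	if (!atomic_compare_exchange_strong(&initialized, &expected, 1))
		return false;

	registrations++;   /* stands in for the real registration work */
	return true;
}
```

      Note that a guard like this only guarantees that the body runs once;
      a second caller returns immediately and does not wait for the first
      one to finish.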
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      7b3a8430
    • Manfred Schlaegl's avatar
      fbdev: fbmem: fix memory access if logo is bigger than the screen · 11538754
      Manfred Schlaegl authored
      [ Upstream commit a5399db1 ]
      
      There is no clipping on the x or y axis for logos larger than the framebuffer
      size. Therefore, a logo bigger than the screen size leads to invalid memory access:
      
      [    1.254664] Backtrace:
      [    1.254728] [<c02714e0>] (cfb_imageblit) from [<c026184c>] (fb_show_logo+0x620/0x684)
      [    1.254763]  r10:00000003 r9:00027fd8 r8:c6a40000 r7:c6a36e50 r6:00000000 r5:c06b81e4
      [    1.254774]  r4:c6a3e800
      [    1.254810] [<c026122c>] (fb_show_logo) from [<c026c1e4>] (fbcon_switch+0x3fc/0x46c)
      [    1.254842]  r10:c6a3e824 r9:c6a3e800 r8:00000000 r7:c6a0c000 r6:c070b014 r5:c6a3e800
      [    1.254852]  r4:c6808c00
      [    1.254889] [<c026bde8>] (fbcon_switch) from [<c029c8f8>] (redraw_screen+0xf0/0x1e8)
      [    1.254918]  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:c070d5a0 r5:00000080
      [    1.254928]  r4:c6808c00
      [    1.254961] [<c029c808>] (redraw_screen) from [<c029d264>] (do_bind_con_driver+0x194/0x2e4)
      [    1.254991]  r9:00000000 r8:00000000 r7:00000014 r6:c070d5a0 r5:c070d5a0 r4:c070d5a0
      
      So prevent displaying a logo bigger than screen size and avoid invalid
      memory access.
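      
      A bounds check of this kind could look roughly like the sketch below.
      The function and parameter names are hypothetical and do not claim to
      match the fbmem code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if an image of width x height placed at (dx, dy) lies entirely
 * inside a screen of xres x yres pixels.
 */
static bool blit_in_bounds(uint32_t dx, uint32_t dy,
			   uint32_t width, uint32_t height,
			   uint32_t xres, uint32_t yres)
{
	/* Written without additions so the check itself cannot wrap around. */
	return dx <= xres && width <= xres - dx &&
	       dy <= yres && height <= yres - dy;
}
```

      A caller would skip drawing the logo whenever the check returns false.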
      Signed-off-by: Manfred Schlaegl <manfred.schlaegl@ginzinger.com>
      Signed-off-by: Martin Kepplinger <martin.kepplinger@ginzinger.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      11538754
    • Raju Rangoju's avatar
      iw_cxgb4: fix srqidx leak during connection abort · c0c9311f
      Raju Rangoju authored
      [ Upstream commit f368ff18 ]
      
      When an application aborts the connection by moving QP from RTS to ERROR,
      then iw_cxgb4's modify_rc_qp() RTS->ERROR logic sets the
      *srqidxp to 0 via t4_set_wq_in_error(&qhp->wq, 0), and aborts the
      connection by calling c4iw_ep_disconnect().
      
      c4iw_ep_disconnect() does the following:
       1. sends up a close_complete_upcall(ep, -ECONNRESET) to libcxgb4.
       2. sends abort request CPL to hw.
      
      But since close_complete_upcall() is sent before the ABORT_REQ is sent
      to hw, libcxgb4 would fail to release the srqidx if the connection holds
      one, because the srqidx is passed up to libcxgb4 only after the
      corresponding ABORT_RPL is processed by the kernel in abort_rpl().
      
      This patch handles the corner case by moving the call to
      close_complete_upcall() from c4iw_ep_disconnect() to abort_rpl(), so that
      libcxgb4 is notified about the -ECONNRESET only after abort_rpl(), and
      libcxgb4 can relinquish the srqidx properly.
      Signed-off-by: Raju Rangoju <rajur@chelsio.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      c0c9311f
    • Thomas Gleixner's avatar
      genirq: Avoid summation loops for /proc/stat · e4688147
      Thomas Gleixner authored
      [ Upstream commit 1136b072 ]
      
      Waiman reported that on large systems with a large number of interrupts the
      readout of /proc/stat takes a long time to sum up the interrupt
      statistics. In principle this is not a problem, but for unknown reasons
      some enterprise quality software reads /proc/stat with a high frequency.
      
      The readout is slow because interrupt statistics are accounted per cpu, so
      the /proc/stat logic has to sum up the per cpu stats for each interrupt.
      
      This can be largely avoided for interrupts which are not marked as
      'PER_CPU' interrupts by simply adding a per interrupt summation counter
      which is incremented along with the per interrupt per cpu counter.
      
      The PER_CPU interrupts need to avoid that and use only per cpu accounting
      because they share the interrupt number and the interrupt descriptor and
      concurrent updates would conflict or require unwanted synchronization.
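      
      The accounting scheme can be sketched in plain C as follows. This is
      a single-threaded illustration with hypothetical names (mock_irq_desc,
      MOCK_NR_CPUS and so on); it ignores the locking a real implementation
      needs:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_NR_CPUS 4

/* Hypothetical, simplified per-interrupt bookkeeping. */
struct mock_irq_desc {
	unsigned int per_cpu_count[MOCK_NR_CPUS];
	unsigned int tot_count;   /* kept only for non-PER_CPU interrupts */
	bool is_per_cpu;
};

/* Account one interrupt on the given CPU. */
static void mock_account_irq(struct mock_irq_desc *desc, size_t cpu)
{
	assert(cpu < MOCK_NR_CPUS);

	desc->per_cpu_count[cpu]++;
	if (!desc->is_per_cpu)
		desc->tot_count++;
}

/* Total for one interrupt, as a /proc/stat style reader would want it. */
static unsigned int mock_irq_total(const struct mock_irq_desc *desc)
{
	unsigned int sum = 0;
	size_t cpu;

	if (!desc->is_per_cpu)
		return desc->tot_count;   /* no loop over CPUs needed */

	for (cpu = 0; cpu < MOCK_NR_CPUS; cpu++)
		sum += desc->per_cpu_count[cpu];

	return sum;
}
```

      Only PER_CPU interrupts still pay for the summation loop; everything
      else is answered from the extra counter.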
      Reported-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Waiman Long <longman@redhat.com>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: linux-fsdevel@vger.kernel.org
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Daniel Colascione <dancol@google.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Link: https://lkml.kernel.org/r/20190208135020.925487496@linutronix.de
      
      8<-------------
      
      v2: Undo the unintentional layout change of struct irq_desc.
      
       include/linux/irqdesc.h |    1 +
       kernel/irq/chip.c       |   12 ++++++++++--
       kernel/irq/internals.h  |    8 +++++++-
       kernel/irq/irqdesc.c    |    7 ++++++-
       4 files changed, 24 insertions(+), 4 deletions(-)
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      e4688147
    • Coly Li's avatar
      bcache: improve sysfs_strtoul_clamp() · 7680e67c
      Coly Li authored
      [ Upstream commit 596b5a5d ]
      
      Currently sysfs_strtoul_clamp() is defined as,
      #define sysfs_strtoul_clamp(file, var, min, max)                   \
      do {                                                               \
              if (attr == &sysfs_ ## file)                               \
                      return strtoul_safe_clamp(buf, var, min, max)      \
                              ?: (ssize_t) size;                         \
      } while (0)
      
      The problem is, if the bit width of var is less than unsigned long, min and
      max may not protect var from integer overflow, because the overflow happens
      in strtoul_safe_clamp() before min and max are checked.
      
      To fix this overflow in sysfs_strtoul_clamp() and make min and max take
      effect, this patch adds an unsigned long variable and passes it to the
      macro strtoul_safe_clamp(), which converts the input to an unsigned long
      value in the range [min, max]. This value is then assigned to var. This
      way, if the bit width of var is less than unsigned long, integer overflow
      cannot happen before min and max are checked.
      
      Now sysfs_strtoul_clamp() can properly handle smaller data types like
      unsigned int; of course, min and max should be defined in the range of
      unsigned int too.
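      
      The difference between the two orderings can be shown with a small
      user-space sketch. It uses uint64_t as a stand-in for a 64-bit
      unsigned long and uint32_t for unsigned int; all function names are
      hypothetical:

```c
#include <assert.h>
#include <stdint.h>

static uint64_t clamp_u64(uint64_t v, uint64_t lo, uint64_t hi)
{
	assert(lo <= hi);
	return v < lo ? lo : (v > hi ? hi : v);
}

/* Old order: the wide value is narrowed first, so the clamp comes too late. */
static uint32_t narrow_then_clamp(uint64_t parsed, uint32_t lo, uint32_t hi)
{
	uint32_t var = (uint32_t)parsed;   /* truncation happens here */

	return (uint32_t)clamp_u64(var, lo, hi);
}

/* New order: clamp in the wide type, then assign to the narrow variable. */
static uint32_t clamp_then_narrow(uint64_t parsed, uint32_t lo, uint32_t hi)
{
	uint64_t v = clamp_u64(parsed, lo, hi);

	return (uint32_t)v;   /* safe: v already fits in [lo, hi] */
}
```

      For an input of 4294967296 and the range [0, UINT32_MAX], the first
      variant silently produces 0, while the second one saturates at
      UINT32_MAX.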
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      7680e67c
    • Coly Li's avatar
      bcache: fix input overflow to sequential_cutoff · 70e8b1e0
      Coly Li authored
      [ Upstream commit 8c27a395 ]
      
      People may set sequential_cutoff of a cached device via its sysfs file,
      but the current code does not check the input value for overflow. E.g. if
      4294967295 (UINT_MAX) is written to the sequential_cutoff file, its value
      is 4GB, but if 4294967296 (UINT_MAX + 1) is written, its value
      will be 0. This is unexpected behavior.
      
      This patch replaces d_strtoi_h() with sysfs_strtoul_clamp() to convert the
      input string to an unsigned integer value and limit its range to
      [0, UINT_MAX], which fixes the input overflow.
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      70e8b1e0
    • Coly Li's avatar
      bcache: fix input overflow to cache set sysfs file io_error_halflife · 77f895ed
      Coly Li authored
      [ Upstream commit a91fbda4 ]
      
      The cache set sysfs entry io_error_halflife is used to set c->error_decay.
      c->error_decay is of type unsigned int, and it is converted by
      strtoul_or_return(), therefore overflow of c->error_decay is possible
      for a large input value.
      
      This patch fixes the overflow by using strtoul_safe_clamp() to convert the
      input string to an unsigned long value in the range [0, UINT_MAX], then
      divides it by 88 and assigns the result to c->error_decay.
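      
      A user-space approximation of "parse, clamp to [0, UINT_MAX], then
      divide by 88" might look like this. It uses the standard strtoull()
      instead of the bcache helpers, and parse_halflife() is a hypothetical
      name:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Parse a decimal string, clamp it to [0, UINT_MAX] while it is still
 * held in a wide type, then scale it down. Returns 0 on success, -1 if
 * the string does not start with a number.
 */
static int parse_halflife(const char *buf, unsigned int *error_decay)
{
	char *end;
	unsigned long long v;

	assert(buf != NULL && error_decay != NULL);

	/* On overflow strtoull() returns ULLONG_MAX; the clamp handles it. */
	v = strtoull(buf, &end, 10);
	if (end == buf)
		return -1;

	if (v > UINT_MAX)
		v = UINT_MAX;

	*error_decay = (unsigned int)(v / 88);
	return 0;
}
```

      An input one past UINT_MAX then ends up as UINT_MAX / 88 instead of
      wrapping around to 0.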
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      77f895ed
    • Luc Van Oostenryck's avatar
      sched/topology: Fix percpu data types in struct sd_data & struct s_data · 43a81992
      Luc Van Oostenryck authored
      [ Upstream commit 99687cdb ]
      
      The percpu members of struct sd_data and s_data are declared as:
      
      	struct ... ** __percpu member;
      
      So their type is:
      
      	__percpu pointer to pointer to struct ...
      
      But looking at how they're used, their type should be:
      
      	pointer to __percpu pointer to struct ...
      
      and they should thus be declared as:
      
      	struct ... * __percpu *member;
      
      So fix the placement of '__percpu' in the definition of these
      structures.
      
      This addresses a bunch of Sparse's warnings like:
      
      	warning: incorrect type in initializer (different address spaces)
      	  expected void const [noderef] <asn:3> *__vpp_verify
      	  got struct sched_domain **
      Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/20190118144936.79158-1-luc.vanoostenryck@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      43a81992
    • John Stultz's avatar
      usb: f_fs: Avoid crash due to out-of-scope stack ptr access · 74c74604
      John Stultz authored
      [ Upstream commit 54f64d5c ]
      
      Since the 5.0 merge window opened, I've been seeing frequent
      crashes on suspend and reboot with the trace:
      
      [   36.911170] Unable to handle kernel paging request at virtual address ffffff801153d660
      [   36.912769] Unable to handle kernel paging request at virtual address ffffff800004b564
      ...
      [   36.950666] Call trace:
      [   36.950670]  queued_spin_lock_slowpath+0x1cc/0x2c8
      [   36.950681]  _raw_spin_lock_irqsave+0x64/0x78
      [   36.950692]  complete+0x28/0x70
      [   36.950703]  ffs_epfile_io_complete+0x3c/0x50
      [   36.950713]  usb_gadget_giveback_request+0x34/0x108
      [   36.950721]  dwc3_gadget_giveback+0x50/0x68
      [   36.950723]  dwc3_thread_interrupt+0x358/0x1488
      [   36.950731]  irq_thread_fn+0x30/0x88
      [   36.950734]  irq_thread+0x114/0x1b0
      [   36.950739]  kthread+0x104/0x130
      [   36.950747]  ret_from_fork+0x10/0x1c
      
      I isolated this down to ffs_epfile_io():
      https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/gadget/function/f_fs.c#n1065
      
      There, the completion "done" is set up on the stack:
        DECLARE_COMPLETION_ONSTACK(done);
      
      Then later we set up a request, queue it, and wait for it:
        if (unlikely(wait_for_completion_interruptible(&done))) {
          /*
          * To avoid race condition with ffs_epfile_io_complete,
          * dequeue the request first then check
          * status. usb_ep_dequeue API should guarantee no race
          * condition with req->complete callback.
          */
          usb_ep_dequeue(ep->ep, req);
          interrupted = ep->status < 0;
        }
      
      The problem is that we end up being interrupted, dequeue the
      request, and exit.

      But then the irq triggers and we try calling complete() on the
      context pointer, which now points to random stack space, and that
      results in the panic.
      
      Alan Stern pointed out there is a bug here, in that the snippet
      above "assumes that usb_ep_dequeue() waits until the request has
      been completed," and that:

          wait_for_completion(&done);

      is needed right after the usb_ep_dequeue().
      
      Thus this patch implements that change. With it I no longer see
      the crashes on suspend or reboot.
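
The ordering requirement can be illustrated with a small userspace sketch. This is not the kernel code: `struct completion`, `complete()` and `wait_for_completion()` are rebuilt here on top of pthreads as stand-ins, and `late_giveback()` and `interrupted_io()` are hypothetical names. It only shows why the waiter must not leave the stack frame that holds `done` until the callback has run.

```c
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's struct completion (assumption). */
struct completion {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool done;
};

static void init_completion(struct completion *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
    c->done = false;
}

static void complete(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    c->done = true;
    pthread_cond_signal(&c->cond);
    pthread_mutex_unlock(&c->lock);
}

static void wait_for_completion(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    while (!c->done)
        pthread_cond_wait(&c->cond, &c->lock);
    pthread_mutex_unlock(&c->lock);
}

/* Plays the role of the late completion callback (the "irq"). */
static void *late_giveback(void *context)
{
    complete(context);
    return NULL;
}

/* Models the interrupted path; returns true once the callback has run. */
static bool interrupted_io(void)
{
    struct completion done;      /* lives on this stack frame */
    pthread_t irq;
    bool finished;

    init_completion(&done);
    if (pthread_create(&irq, NULL, late_giveback, &done) != 0)
        return false;

    /*
     * Pretend the interruptible wait was interrupted and the request
     * was dequeued here. The callback may still be pending, so wait
     * for it before `done` goes out of scope.
     */
    wait_for_completion(&done);
    finished = done.done;

    pthread_join(irq, NULL);     /* userspace cleanup only */
    pthread_mutex_destroy(&done.lock);
    pthread_cond_destroy(&done.cond);
    return finished;
}
```

Without the `wait_for_completion()` call, `interrupted_io()` could return while `late_giveback()` still holds a pointer into its stack frame, which is the userspace analogue of the crash in the trace.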
      
      This issue seems to have been uncovered by behavioral changes in
      the dwc3 driver in commit fec9095b ("usb: dwc3: gadget:
      remove wait_end_transfer").
      
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Felipe Balbi <balbi@kernel.org>
      Cc: Zeng Tao <prime.zeng@hisilicon.com>
      Cc: Jack Pham <jackp@codeaurora.org>
      Cc: Thinh Nguyen <thinh.nguyen@synopsys.com>
      Cc: Chen Yu <chenyu56@huawei.com>
      Cc: Jerry Zhang <zhangjerry@google.com>
      Cc: Lars-Peter Clausen <lars@metafoo.de>
      Cc: Vincent Pelletier <plr.vincent@gmail.com>
      Cc: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Linux USB List <linux-usb@vger.kernel.org>
      Suggested-by: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      74c74604
    • Ranjani Sridharan's avatar
      ALSA: PCM: check if ops are defined before suspending PCM · 62ecc64c
      Ranjani Sridharan authored
      [ Upstream commit d9c0b2af ]
      
      BE DAI links only have internal PCMs and their substream ops may
      not be set. Suspending these PCMs will result in their
      ops->trigger() being invoked and cause a kernel oops.
      So skip suspending PCMs if their ops are NULL.
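
A minimal userspace sketch of the guard, assuming simplified types: `pcm_ops`, `substream`, `count_trigger()` and `suspend_all()` are hypothetical stand-ins and not the ALSA core's definitions. It only shows the NULL check that keeps `ops->trigger()` from being called on a substream without ops.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the ALSA types; not the kernel's definitions. */
struct pcm_ops {
    int (*trigger)(int cmd);
};

struct substream {
    const struct pcm_ops *ops;   /* may be NULL for internal (BE) PCMs */
};

static int trigger_calls;

/* Test trigger that only counts how often it was invoked. */
static int count_trigger(int cmd)
{
    (void)cmd;
    trigger_calls++;
    return 0;
}

/* Suspend every substream that has ops; return how many were suspended. */
static int suspend_all(struct substream *subs, size_t n)
{
    int suspended = 0;

    for (size_t i = 0; i < n; i++) {
        if (!subs[i].ops)        /* skip: ops->trigger() would dereference NULL */
            continue;
        subs[i].ops->trigger(1); /* 1 is an arbitrary "suspend" command here */
        suspended++;
    }
    return suspended;
}
```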
      
      [ NOTE: this change is required now to follow the recent PCM core
        change that gets rid of the snd_pcm_suspend() call.  Since DPCM BE takes
        the runtime carried from FE while keeping NULL ops, it can hit this
        bug.  See details at:
           https://github.com/thesofproject/linux/pull/582
        -- tiwai ]
      Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
      Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      62ecc64c
    • Nathan Chancellor's avatar
      ARM: 8833/1: Ensure that NEON code always compiles with Clang · 416b593a
      Nathan Chancellor authored
      [ Upstream commit de9c0d49 ]
      
      While building arm32 allyesconfig, I ran into the following errors:
      
        arch/arm/lib/xor-neon.c:17:2: error: You should compile this file with
        '-mfloat-abi=softfp -mfpu=neon'
      
        In file included from lib/raid6/neon1.c:27:
        /home/nathan/cbl/prebuilt/lib/clang/8.0.0/include/arm_neon.h:28:2:
        error: "NEON support not enabled"
      
      Building V=1 showed NEON_FLAGS getting passed along to Clang but
      __ARM_NEON__ was not getting defined. Ultimately, it boils down to Clang
      only defining __ARM_NEON__ when targeting armv7, rather than armv6k,
      which is the '-march' value for allyesconfig.
      
      From lib/Basic/Targets/ARM.cpp in the Clang source:
      
        // This only gets set when Neon instructions are actually available, unlike
        // the VFP define, hence the soft float and arch check. This is subtly
        // different from gcc, we follow the intent which was that it should be set
        // when Neon instructions are actually available.
        if ((FPU & NeonFPU) && !SoftFloat && ArchVersion >= 7) {
          Builder.defineMacro("__ARM_NEON", "1");
          Builder.defineMacro("__ARM_NEON__");
          // current AArch32 NEON implementations do not support double-precision
          // floating-point even when it is present in VFP.
          Builder.defineMacro("__ARM_NEON_FP",
                              "0x" + Twine::utohexstr(HW_FP & ~HW_FP_DP));
        }
      
      Ard Biesheuvel recommended explicitly adding '-march=armv7-a' at the
      beginning of the NEON_FLAGS definitions so that __ARM_NEON__ always gets
      defined by Clang. This doesn't functionally change anything because
      that code will only run where NEON is supported, which is implicitly
      armv7.
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/287
      Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
      Acked-by: Nicolas Pitre <nico@linaro.org>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
      Reviewed-by: Stefan Agner <stefan@agner.ch>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      416b593a
    • Chieh-Min Wang's avatar
      netfilter: conntrack: fix cloned unconfirmed skb->_nfct race in __nf_conntrack_confirm · 91a604c2
      Chieh-Min Wang authored
      [ Upstream commit 13f5251f ]
      
      For bridge (br_flood) or broadcast/multicast packets, the skb can be
      cloned with an unconfirmed conntrack, which breaks the rule that an
      unconfirmed skb->_nfct is never shared.  With nfqueue running on my
      system, the race can be easily reproduced with the following warning
      calltrace:
      
      [13257.707525] CPU: 0 PID: 12132 Comm: main Tainted: P        W       4.4.60 #7744
      [13257.707568] Hardware name: Qualcomm (Flattened Device Tree)
      [13257.714700] [<c021f6dc>] (unwind_backtrace) from [<c021bce8>] (show_stack+0x10/0x14)
      [13257.720253] [<c021bce8>] (show_stack) from [<c0449e10>] (dump_stack+0x94/0xa8)
      [13257.728240] [<c0449e10>] (dump_stack) from [<c022a7e0>] (warn_slowpath_common+0x94/0xb0)
      [13257.735268] [<c022a7e0>] (warn_slowpath_common) from [<c022a898>] (warn_slowpath_null+0x1c/0x24)
      [13257.743519] [<c022a898>] (warn_slowpath_null) from [<c06ee450>] (__nf_conntrack_confirm+0xa8/0x618)
      [13257.752284] [<c06ee450>] (__nf_conntrack_confirm) from [<c0772670>] (ipv4_confirm+0xb8/0xfc)
      [13257.761049] [<c0772670>] (ipv4_confirm) from [<c06e7a60>] (nf_iterate+0x48/0xa8)
      [13257.769725] [<c06e7a60>] (nf_iterate) from [<c06e7af0>] (nf_hook_slow+0x30/0xb0)
      [13257.777108] [<c06e7af0>] (nf_hook_slow) from [<c07f20b4>] (br_nf_post_routing+0x274/0x31c)
      [13257.784486] [<c07f20b4>] (br_nf_post_routing) from [<c06e7a60>] (nf_iterate+0x48/0xa8)
      [13257.792556] [<c06e7a60>] (nf_iterate) from [<c06e7af0>] (nf_hook_slow+0x30/0xb0)
      [13257.800458] [<c06e7af0>] (nf_hook_slow) from [<c07e5580>] (br_forward_finish+0x94/0xa4)
      [13257.808010] [<c07e5580>] (br_forward_finish) from [<c07f22ac>] (br_nf_forward_finish+0x150/0x1ac)
      [13257.815736] [<c07f22ac>] (br_nf_forward_finish) from [<c06e8df0>] (nf_reinject+0x108/0x170)
      [13257.824762] [<c06e8df0>] (nf_reinject) from [<c06ea854>] (nfqnl_recv_verdict+0x3d8/0x420)
      [13257.832924] [<c06ea854>] (nfqnl_recv_verdict) from [<c06e940c>] (nfnetlink_rcv_msg+0x158/0x248)
      [13257.841256] [<c06e940c>] (nfnetlink_rcv_msg) from [<c06e5564>] (netlink_rcv_skb+0x54/0xb0)
      [13257.849762] [<c06e5564>] (netlink_rcv_skb) from [<c06e4ec8>] (netlink_unicast+0x148/0x23c)
      [13257.858093] [<c06e4ec8>] (netlink_unicast) from [<c06e5364>] (netlink_sendmsg+0x2ec/0x368)
      [13257.866348] [<c06e5364>] (netlink_sendmsg) from [<c069fb8c>] (sock_sendmsg+0x34/0x44)
      [13257.874590] [<c069fb8c>] (sock_sendmsg) from [<c06a03dc>] (___sys_sendmsg+0x1ec/0x200)
      [13257.882489] [<c06a03dc>] (___sys_sendmsg) from [<c06a11c8>] (__sys_sendmsg+0x3c/0x64)
      [13257.890300] [<c06a11c8>] (__sys_sendmsg) from [<c0209b40>] (ret_fast_syscall+0x0/0x34)
      
      The original code just triggered the warning but did nothing else.
      That caused the shared conntrack to move to the dying list and the
      packet to be dropped (nf_ct_resolve_clash returns NF_DROP for a dying
      conntrack).
      
      - Reproduce steps:
      
      +----------------------------+
      |          br0(bridge)       |
      |                            |
      +-+---------+---------+------+
        | eth0|   | eth1|   | eth2|
        |     |   |     |   |     |
        +--+--+   +--+--+   +---+-+
           |         |          |
           |         |          |
        +--+-+     +-+--+    +--+-+
        | PC1|     | PC2|    | PC3|
        +----+     +----+    +----+
      
      iptables -A FORWARD -m mark --mark 0x1000000/0x1000000 -j NFQUEUE --queue-num 100 --queue-bypass
      
      P.S.: Our nfq userspace program sets a mark on packets whose
      connection has already been processed.
      
      PC1 sends broadcast packets simulated by hping3:
      
      hping3 --rand-source --udp 192.168.1.255 -i u100
      
      - The broadcast race flow is as follows:
      
      br_handle_frame
        BR_HOOK(NFPROTO_BRIDGE, NF_BR_PRE_ROUTING, br_handle_frame_finish)
        // skb->_nfct (unconfirmed conntrack) is constructed at PRE_ROUTING stage
        br_handle_frame_finish
          // check if this packet is broadcast
          br_flood_forward
            br_flood
              list_for_each_entry_rcu(p, &br->port_list, list) // iterate through each port
                maybe_deliver
                  deliver_clone
                    skb = skb_clone(skb)
                    __br_forward
                      BR_HOOK(NFPROTO_BRIDGE, NF_BR_FORWARD,...)
                      // queue in our nfq and received by our userspace program
                      // goto __nf_conntrack_confirm with process context on CPU 1
          br_pass_frame_up
            BR_HOOK(NFPROTO_BRIDGE, NF_BR_LOCAL_IN,...)
            // goto __nf_conntrack_confirm with softirq context on CPU 0
      
      Because conntrack confirm can happen at both the INPUT and
      POSTROUTING stages, with NFQUEUE running, skb->_nfct with the same
      unconfirmed conntrack could race on different cores.

      This patch fixes a repeating kernel splat; now it is only displayed
      once.
      Signed-off-by: Chieh-Min Wang <chiehminw@synology.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      91a604c2
    • Andrea Righi's avatar
      kprobes: Prohibit probing on bsearch() · e62824d1
      Andrea Righi authored
      [ Upstream commit 02106f88 ]
      
      Since the kprobe breakpoint handler uses bsearch(), probing on this
      routine can cause a recursive breakpoint problem.
      
      int3
       ->do_int3()
         ->ftrace_int3_handler()
           ->ftrace_location()
             ->ftrace_location_range()
               ->bsearch() -> int3
      
      Prohibit probing on bsearch().
      Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/154998813406.31052.8791425358974650922.stgit@devbox
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      e62824d1
    • Hans de Goede's avatar
      ACPI / video: Refactor and fix dmi_is_desktop() · 67dcd5d7
      Hans de Goede authored
      [ Upstream commit cecf3e3e ]
      
      This commit refactors the chassis-type detection introduced by
      commit 53fa1f6e ("ACPI / video: Only default only_lcd to true on
      Win8-ready _desktops_") (where desktop means anything without a builtin
      screen).
      
      The DMI chassis_type is an unsigned integer, so rather than doing a
      whole bunch of string-compares on it, convert it to an int and feed
      the result to a switch case.

      Note the switch case uses hex values because the spec uses hex
      values too. This changes the check for "Main Server Chassis" from
      checking for 11 decimal to 11 hexadecimal; this is a bug fix, as
      the original check for 11 decimal was wrong.
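
A userspace sketch of the approach, under stated assumptions: `chassis_is_desktop()` is a hypothetical name, `strtoul()` stands in for the kernel's string-to-integer helper, and the chassis-type string is assumed to be decimal. The case list follows the SMBIOS chassis codes as best recalled here; only "Main Server Chassis" is named in the text above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Returns true if the decimal chassis-type string names a chassis
 * without a built-in screen. Hypothetical helper, not the kernel's
 * dmi_is_desktop().
 */
static bool chassis_is_desktop(const char *chassis_type)
{
    char *end;
    unsigned long type;

    if (!chassis_type || *chassis_type == '\0')
        return false;

    errno = 0;
    type = strtoul(chassis_type, &end, 10);
    if (errno != 0 || *end != '\0')   /* reject overflow and trailing junk */
        return false;

    switch (type) {
    case 0x03: /* Desktop */
    case 0x04: /* Low Profile Desktop */
    case 0x05: /* Pizza Box */
    case 0x06: /* Mini Tower */
    case 0x07: /* Tower */
    case 0x10: /* Lunch Box */
    case 0x11: /* Main Server Chassis (17 decimal, not 11) */
        return true;
    default:
        return false;
    }
}
```

With this sketch the string "17" matches "Main Server Chassis", while "11" (0x0B) does not, which is the difference between the fixed check and the original one.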
      
      Fixes: 53fa1f6e ("ACPI / video: Only default only_lcd to true ...")
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      [ rjw: Drop redundant return statements ]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      67dcd5d7
    • Sara Sharon's avatar
      iwlwifi: pcie: fix emergency path · 0fbfca57
      Sara Sharon authored
      [ Upstream commit c6ac9f9f ]
      
      The allocator swaps the pending requests with 0 when it starts
      working. This means that relying on it in the RX path to decide
      whether to move to emergency is not always a good idea, since it
      may be zero while there are still a lot of unallocated RBs in the
      system. Change the allocator to decrement the pending requests in
      real time. It is more expensive since it accesses the atomic
      variable more times, but it gives the RX path a better idea
      of the system's status.
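
The difference between the two counting schemes can be sketched in userspace with C11 atomics. All names here (`pending_requests`, `claim_all()`, `serve_one()`, `rx_path_view()`) are hypothetical and do not come from the driver; the sketch only contrasts swapping the counter to zero up front with decrementing it as each request is served.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the allocator's pending-request counter. */
static atomic_int pending_requests;

/*
 * Old behaviour: claim every pending request at once. The counter reads 0
 * from that moment on, even though no buffer has been allocated yet.
 */
static int claim_all(void)
{
    return atomic_exchange(&pending_requests, 0);
}

/* New behaviour: decrement only after one request has actually been served. */
static void serve_one(void)
{
    /* ... allocate the buffers for one request here ... */
    atomic_fetch_sub(&pending_requests, 1);
}

/* What the RX path would observe when deciding about emergency mode. */
static int rx_path_view(void)
{
    return atomic_load(&pending_requests);
}
```

With eight requests queued, `claim_all()` leaves `rx_path_view()` at 0 immediately, while three calls to `serve_one()` leave it at 5, so the reader still sees the outstanding work. The cost is one atomic operation per request instead of one per batch.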
      Reported-by: Ilan Peer <ilan.peer@intel.com>
      Signed-off-by: Sara Sharon <sara.sharon@intel.com>
      Fixes: 868a1e86 ("iwlwifi: pcie: avoid empty free RB queue")
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      0fbfca57
    • Michal Kazior's avatar
      leds: lp55xx: fix null deref on firmware load failure · 0affcd54
      Michal Kazior authored
      [ Upstream commit 5ddb0869 ]
      
      I've stumbled upon a kernel crash and the logs
      pointed me towards the lp5562 driver:
      
      > <4>[306013.841294] lp5562 0-0030: Direct firmware load for lp5562 failed with error -2
      > <4>[306013.894990] lp5562 0-0030: Falling back to user helper
      > ...
      > <3>[306073.924886] lp5562 0-0030: firmware request failed
      > <1>[306073.939456] Unable to handle kernel NULL pointer dereference at virtual address 00000000
      > <4>[306074.251011] PC is at _raw_spin_lock+0x1c/0x58
      > <4>[306074.255539] LR is at release_firmware+0x6c/0x138
      > ...
      
      After taking a look I noticed release_firmware()
      could be called with either NULL or a dangling
      pointer.
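
One way to avoid both cases, sketched in userspace under stated assumptions: `fw_image`, `led_chip`, `release_fw()` and `fw_loaded()` are hypothetical stand-ins, and the actual driver change is not shown in this message. The callback bails out when the load failed, and clears the stored pointer after releasing the firmware.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the kernel's struct firmware or the driver's chip struct. */
struct fw_image {
    size_t size;
};

struct led_chip {
    struct fw_image *fw;
    int programs_run;
};

static void release_fw(struct fw_image *fw)
{
    free(fw);               /* free(NULL) is a no-op */
}

/*
 * Completion callback for an asynchronous firmware request.
 * Takes ownership of fw. Returns 0 on success, -1 if the load failed.
 */
static int fw_loaded(struct fw_image *fw, struct led_chip *chip)
{
    if (!fw)                /* request failed: nothing to use or release */
        return -1;

    chip->fw = fw;
    chip->programs_run++;   /* placeholder for running the LED program */

    release_fw(chip->fw);
    chip->fw = NULL;        /* do not leave a dangling pointer behind */
    return 0;
}
```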
      
      Fixes: 10c06d17 ("leds-lp55xx: support firmware interface")
      Signed-off-by: Michal Kazior <michal@plume.com>
      Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      0affcd54