1. 17 Feb, 2016 18 commits
    • Tejun Heo's avatar
      printk: do cond_resched() between lines while outputting to consoles · a623f87a
      Tejun Heo authored
      commit 8d91f8b1 upstream.
      
      @console_may_schedule tracks whether console_sem was acquired through
      lock or trylock.  If the former, we're inside a sleepable context and
      console_conditional_schedule() performs cond_resched().  This allows
      console drivers which use console_lock for synchronization to yield
      while performing time-consuming operations such as scrolling.
      
      However, the actual console outputting is performed while holding
      irq-safe logbuf_lock, so console_unlock() clears @console_may_schedule
      before starting outputting lines.  Also, only a few drivers call
      console_conditional_schedule() to begin with.  This means that when a
      lot of lines need to be output by console_unlock(), for example on a
      console registration, the task doing console_unlock() may not yield for
      a long time on a non-preemptible kernel.
      
      If this happens with a slow console devices, for example a serial
      console, the outputting task may occupy the cpu for a very long time.
      Long enough to trigger softlockup and/or RCU stall warnings, which in
      turn pile more messages, sometimes enough to trigger the next cycle of
      warnings incapacitating the system.
      
      Fix it by making console_unlock() insert cond_resched() between lines if
      @console_may_schedule.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-by: default avatarCalvin Owens <calvinowens@fb.com>
      Acked-by: default avatarJan Kara <jack@suse.com>
      Cc: Dave Jones <davej@codemonkey.org.uk>
      Cc: Kyle McMartin <kyle@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a623f87a
    • Steven Rostedt's avatar
      tracing/stacktrace: Show entire trace if passed in function not found · bf46aa7e
      Steven Rostedt authored
      commit 6ccd8371 upstream.
      
      When a max stack trace is discovered, the stack dump is saved. In order to
      not record the overhead of the stack tracer, the ip of the traced function
      is looked for within the dump. The trace is started from the location of
      that function. But if for some reason the ip is not found, the entire stack
      trace is then truncated. That's not very useful. Instead, print everything
      if the ip of the traced function is not found within the trace.
      
      This issue showed up on s390.
      
      Link: http://lkml.kernel.org/r/20160129102241.1b3c9c04@gandalf.local.home
      
      Fixes: 72ac426a ("tracing: Clean up stack tracing and fix fentry updates")
      Reported-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Tested-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bf46aa7e
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Fix stacktrace skip depth in trace_buffer_unlock_commit_regs() · cc6d9800
      Steven Rostedt (Red Hat) authored
      commit 7717c6be upstream.
      
      While cleaning the stacktrace code I unintentially changed the skip depth of
      trace_buffer_unlock_commit_regs() from 0 to 6. kprobes uses this function,
      and with skipping 6 call backs, it can easily produce no stack.
      
      Here's how I tested it:
      
       # echo 'p:ext4_sync_fs ext4_sync_fs ' > /sys/kernel/debug/tracing/kprobe_events
       # echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable
       # cat /sys/kernel/debug/trace
                  sync-2394  [005]   502.457060: ext4_sync_fs: (ffffffff81317650)
                  sync-2394  [005]   502.457063: kernel_stack:         <stack trace>
                  sync-2394  [005]   502.457086: ext4_sync_fs: (ffffffff81317650)
                  sync-2394  [005]   502.457087: kernel_stack:         <stack trace>
                  sync-2394  [005]   502.457091: ext4_sync_fs: (ffffffff81317650)
      
      After putting back the skip stack to zero, we have:
      
                  sync-2270  [000]   748.052693: ext4_sync_fs: (ffffffff81317650)
                  sync-2270  [000]   748.052695: kernel_stack:         <stack trace>
       => iterate_supers (ffffffff8126412e)
       => sys_sync (ffffffff8129c4b6)
       => entry_SYSCALL_64_fastpath (ffffffff8181f0b2)
                  sync-2270  [000]   748.053017: ext4_sync_fs: (ffffffff81317650)
                  sync-2270  [000]   748.053019: kernel_stack:         <stack trace>
       => iterate_supers (ffffffff8126412e)
       => sys_sync (ffffffff8129c4b6)
       => entry_SYSCALL_64_fastpath (ffffffff8181f0b2)
                  sync-2270  [000]   748.053381: ext4_sync_fs: (ffffffff81317650)
                  sync-2270  [000]   748.053383: kernel_stack:         <stack trace>
       => iterate_supers (ffffffff8126412e)
       => sys_sync (ffffffff8129c4b6)
       => entry_SYSCALL_64_fastpath (ffffffff8181f0b2)
      
      Fixes: 73dddbb5 "tracing: Only create stacktrace option when STACKTRACE is configured"
      Reported-by: default avatarBrendan Gregg <brendan.d.gregg@gmail.com>
      Tested-by: default avatarBrendan Gregg <brendan.d.gregg@gmail.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cc6d9800
    • Christoph Biedl's avatar
      PCI: Fix minimum allocation address overwrite · 695f6c87
      Christoph Biedl authored
      commit 3460baa6 upstream.
      
      Commit 36e097a8 ("PCI: Split out bridge window override of minimum
      allocation address") claimed to do no functional changes but unfortunately
      did: The "min" variable is altered.  At least the AVM A1 PCMCIA adapter was
      no longer detected, breaking ISDN operation.
      
      Use a local copy of "min" to restore the previous behaviour.
      
      [bhelgaas: avoid gcc "?:" extension for portability and readability]
      Fixes: 36e097a8 ("PCI: Split out bridge window override of minimum allocation address")
      Signed-off-by: default avatarChristoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      695f6c87
    • Grygorii Strashko's avatar
      PCI: host: Mark PCIe/PCI (MSI) IRQ cascade handlers as IRQF_NO_THREAD · ad53ae59
      Grygorii Strashko authored
      commit 8ff0ef99 upstream.
      
      On -RT and if kernel is booting with "threadirqs" cmd line parameter,
      PCIe/PCI (MSI) IRQ cascade handlers (like dra7xx_pcie_msi_irq_handler())
      will be forced threaded and, as result, will generate warnings like this:
      
        WARNING: CPU: 1 PID: 82 at kernel/irq/handle.c:150 handle_irq_event_percpu+0x14c/0x174()
        irq 460 handler irq_default_primary_handler+0x0/0x14 enabled interrupts
        Backtrace:
         (warn_slowpath_common) from (warn_slowpath_fmt+0x38/0x40)
         (warn_slowpath_fmt) from (handle_irq_event_percpu+0x14c/0x174)
         (handle_irq_event_percpu) from (handle_irq_event+0x84/0xb8)
         (handle_irq_event) from (handle_simple_irq+0x90/0x118)
         (handle_simple_irq) from (generic_handle_irq+0x30/0x44)
         (generic_handle_irq) from (dra7xx_pcie_msi_irq_handler+0x7c/0x8c)
         (dra7xx_pcie_msi_irq_handler) from (irq_forced_thread_fn+0x28/0x5c)
         (irq_forced_thread_fn) from (irq_thread+0x128/0x204)
      
      This happens because all of them invoke generic_handle_irq() from the
      requested handler.  generic_handle_irq() grabs raw_locks and thus needs to
      run in raw-IRQ context.
      
      This issue was originally reproduced on TI dra7-evem, but, as was
      identified during discussion [1], other hosts can also suffer from this
      issue.  Fix all them at once by marking PCIe/PCI (MSI) IRQ cascade handlers
      IRQF_NO_THREAD explicitly.
      
      [1] http://lkml.kernel.org/r/1448027966-21610-1-git-send-email-grygorii.strashko@ti.com
      
      [bhelgaas: add stable tag, fix typos]
      Signed-off-by: default avatarGrygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Acked-by: default avatarLucas Stach <l.stach@pengutronix.de>
      CC: Kishon Vijay Abraham I <kishon@ti.com>
      CC: Jingoo Han <jingoohan1@gmail.com>
      CC: Kukjin Kim <kgene@kernel.org>
      CC: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      CC: Richard Zhu <Richard.Zhu@freescale.com>
      CC: Thierry Reding <thierry.reding@gmail.com>
      CC: Stephen Warren <swarren@wwwdotorg.org>
      CC: Alexandre Courbot <gnurou@gmail.com>
      CC: Simon Horman <horms@verge.net.au>
      CC: Pratyush Anand <pratyush.anand@gmail.com>
      CC: Michal Simek <michal.simek@xilinx.com>
      CC: "Sören Brinkmann" <soren.brinkmann@xilinx.com>
      CC: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ad53ae59
    • Brian Norris's avatar
      mtd: nand: assign reasonable default name for NAND drivers · e18da11f
      Brian Norris authored
      commit f7a8e38f upstream.
      
      Commits such as commit 853f1c58 ("mtd: nand: omap2: show parent
      device structure in sysfs") attempt to rely on the core MTD code to set
      the MTD name based on the parent device. However, nand_base tries to set
      a different default name according to the flash name (e.g., extracted
      from the ONFI parameter page), which means NAND drivers will never make
      use of the MTD defaults. This is not the intention of commit
      853f1c58.
      
      This results in problems when trying to use the cmdline partition
      parser, since the MTD name is different than expected. Let's fix this by
      providing a default NAND name, where possible.
      
      Note that this is not really a great default name in the long run, since
      this means that if there are multiple MTDs attached to the same
      controller device, they will have the same name. But that is an existing
      issue and requires future work on a better controller vs. flash chip
      abstraction to fix properly.
      
      Fixes: 853f1c58 ("mtd: nand: omap2: show parent device structure in sysfs")
      Reported-by: default avatarHeiko Schocher <hs@denx.de>
      Signed-off-by: default avatarBrian Norris <computersforpeace@gmail.com>
      Reviewed-by: default avatarBoris Brezillon <boris.brezillon@free-electrons.com>
      Tested-by: default avatarHeiko Schocher <hs@denx.de>
      Cc: Heiko Schocher <hs@denx.de>
      Cc: Frans Klaver <fransklaver@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e18da11f
    • Uri Mashiach's avatar
      wlcore/wl12xx: spi: fix NULL pointer dereference (Oops) · c13c92ac
      Uri Mashiach authored
      commit e47301b0 upstream.
      
      Fix the below Oops when trying to modprobe wlcore_spi.
      The oops occurs because the wl1271_power_{off,on}()
      function doesn't check the power() function pointer.
      
      [   23.401447] Unable to handle kernel NULL pointer dereference at
      virtual address 00000000
      [   23.409954] pgd = c0004000
      [   23.412922] [00000000] *pgd=00000000
      [   23.416693] Internal error: Oops: 80000007 [#1] SMP ARM
      [   23.422168] Modules linked in: wl12xx wlcore mac80211 cfg80211
      musb_dsps musb_hdrc usbcore usb_common snd_soc_simple_card evdev joydev
      omap_rng wlcore_spi snd_soc_tlv320aic23_i2c rng_core snd_soc_tlv320aic23
      c_can_platform c_can can_dev snd_soc_davinci_mcasp snd_soc_edma
      snd_soc_omap omap_wdt musb_am335x cpufreq_dt thermal_sys hwmon
      [   23.453253] CPU: 0 PID: 36 Comm: kworker/0:2 Not tainted
      4.2.0-00002-g951efee-dirty #233
      [   23.461720] Hardware name: Generic AM33XX (Flattened Device Tree)
      [   23.468123] Workqueue: events request_firmware_work_func
      [   23.473690] task: de32efc0 ti: de4ee000 task.ti: de4ee000
      [   23.479341] PC is at 0x0
      [   23.482112] LR is at wl12xx_set_power_on+0x28/0x124 [wlcore]
      [   23.488074] pc : [<00000000>]    lr : [<bf2581f0>]    psr: 60000013
      [   23.488074] sp : de4efe50  ip : 00000002  fp : 00000000
      [   23.500162] r10: de7cdd00  r9 : dc848800  r8 : bf27af00
      [   23.505663] r7 : bf27a1a8  r6 : dcbd8a80  r5 : dce0e2e0  r4 :
      dce0d2e0
      [   23.512536] r3 : 00000000  r2 : 00000000  r1 : 00000001  r0 :
      dc848810
      [   23.519412] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM
      Segment kernel
      [   23.527109] Control: 10c5387d  Table: 9cb78019  DAC: 00000015
      [   23.533160] Process kworker/0:2 (pid: 36, stack limit = 0xde4ee218)
      [   23.539760] Stack: (0xde4efe50 to 0xde4f0000)
      
      [...]
      
      [   23.665030] [<bf2581f0>] (wl12xx_set_power_on [wlcore]) from
      [<bf25f7ac>] (wlcore_nvs_cb+0x118/0xa4c [wlcore])
      [   23.675604] [<bf25f7ac>] (wlcore_nvs_cb [wlcore]) from [<c04387ec>]
      (request_firmware_work_func+0x30/0x58)
      [   23.685784] [<c04387ec>] (request_firmware_work_func) from
      [<c0058e2c>] (process_one_work+0x1b4/0x4b4)
      [   23.695591] [<c0058e2c>] (process_one_work) from [<c0059168>]
      (worker_thread+0x3c/0x4a4)
      [   23.704124] [<c0059168>] (worker_thread) from [<c005ee68>]
      (kthread+0xd4/0xf0)
      [   23.711747] [<c005ee68>] (kthread) from [<c000f598>]
      (ret_from_fork+0x14/0x3c)
      [   23.719357] Code: bad PC value
      [   23.722760] ---[ end trace 981be8510db9b3a9 ]---
      
      Prevent oops by validationg power() pointer value before
      calling the function.
      Signed-off-by: default avatarUri Mashiach <uri.mashiach@compulab.co.il>
      Acked-by: default avatarIgor Grinberg <grinberg@compulab.co.il>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c13c92ac
    • Uri Mashiach's avatar
      wlcore/wl12xx: spi: fix oops on firmware load · 1463df0f
      Uri Mashiach authored
      commit 9b2761cb upstream.
      
      The maximum chunks used by the function is
      (SPI_AGGR_BUFFER_SIZE / WSPI_MAX_CHUNK_SIZE + 1).
      The original commands array had space for
      (SPI_AGGR_BUFFER_SIZE / WSPI_MAX_CHUNK_SIZE) commands.
      When the last chunk is used (len > 4 * WSPI_MAX_CHUNK_SIZE), the last
      command is stored outside the bounds of the commands array.
      
      Oops 5 (page fault) is generated during current wl1271 firmware load
      attempt:
      
      root@debian-armhf:~# ifconfig wlan0 up
      [  294.312399] Unable to handle kernel paging request at virtual address
      00203fc4
      [  294.320173] pgd = de528000
      [  294.323028] [00203fc4] *pgd=00000000
      [  294.326916] Internal error: Oops: 5 [#1] SMP ARM
      [  294.331789] Modules linked in: bnep rfcomm bluetooth ipv6 arc4 wl12xx
      wlcore mac80211 musb_dsps cfg80211 musb_hdrc usbcore usb_common
      wlcore_spi omap_rng rng_core musb_am335x omap_wdt cpufreq_dt thermal_sys
      hwmon
      [  294.351838] CPU: 0 PID: 1827 Comm: ifconfig Not tainted
      4.2.0-00002-g3e9ad27-dirty #78
      [  294.360154] Hardware name: Generic AM33XX (Flattened Device Tree)
      [  294.366557] task: dc9d6d40 ti: de550000 task.ti: de550000
      [  294.372236] PC is at __spi_validate+0xa8/0x2ac
      [  294.376902] LR is at __spi_sync+0x78/0x210
      [  294.381200] pc : [<c049c760>]    lr : [<c049ebe0>]    psr: 60000013
      [  294.381200] sp : de551998  ip : de5519d8  fp : 00200000
      [  294.393242] r10: de551c8c  r9 : de5519d8  r8 : de3a9000
      [  294.398730] r7 : de3a9258  r6 : de3a9400  r5 : de551a48  r4 :
      00203fbc
      [  294.405577] r3 : 00000000  r2 : 00000000  r1 : 00000000  r0 :
      de3a9000
      [  294.412420] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM
      Segment user
      [  294.419918] Control: 10c5387d  Table: 9e528019  DAC: 00000015
      [  294.425954] Process ifconfig (pid: 1827, stack limit = 0xde550218)
      [  294.432437] Stack: (0xde551998 to 0xde552000)
      
      ...
      
      [  294.883613] [<c049c760>] (__spi_validate) from [<c049ebe0>]
      (__spi_sync+0x78/0x210)
      [  294.891670] [<c049ebe0>] (__spi_sync) from [<bf036598>]
      (wl12xx_spi_raw_write+0xfc/0x148 [wlcore_spi])
      [  294.901661] [<bf036598>] (wl12xx_spi_raw_write [wlcore_spi]) from
      [<bf21c694>] (wlcore_boot_upload_firmware+0x1ec/0x458 [wlcore])
      [  294.914038] [<bf21c694>] (wlcore_boot_upload_firmware [wlcore]) from
      [<bf24532c>] (wl12xx_boot+0xc10/0xfac [wl12xx])
      [  294.925161] [<bf24532c>] (wl12xx_boot [wl12xx]) from [<bf20d5cc>]
      (wl1271_op_add_interface+0x5b0/0x910 [wlcore])
      [  294.936364] [<bf20d5cc>] (wl1271_op_add_interface [wlcore]) from
      [<bf15c4ac>] (ieee80211_do_open+0x44c/0xf7c [mac80211])
      [  294.947963] [<bf15c4ac>] (ieee80211_do_open [mac80211]) from
      [<c0537978>] (__dev_open+0xa8/0x110)
      [  294.957307] [<c0537978>] (__dev_open) from [<c0537bf8>]
      (__dev_change_flags+0x88/0x148)
      [  294.965713] [<c0537bf8>] (__dev_change_flags) from [<c0537cd0>]
      (dev_change_flags+0x18/0x48)
      [  294.974576] [<c0537cd0>] (dev_change_flags) from [<c05a55a0>]
      (devinet_ioctl+0x6b4/0x7d0)
      [  294.983191] [<c05a55a0>] (devinet_ioctl) from [<c0517040>]
      (sock_ioctl+0x1e4/0x2bc)
      [  294.991244] [<c0517040>] (sock_ioctl) from [<c017d378>]
      (do_vfs_ioctl+0x420/0x6b0)
      [  294.999208] [<c017d378>] (do_vfs_ioctl) from [<c017d674>]
      (SyS_ioctl+0x6c/0x7c)
      [  295.006880] [<c017d674>] (SyS_ioctl) from [<c000f4c0>]
      (ret_fast_syscall+0x0/0x54)
      [  295.014835] Code: e1550004 e2444034 0a00007d e5953018 (e5942008)
      [  295.021544] ---[ end trace 66ed188198f4e24e ]---
      Signed-off-by: default avatarUri Mashiach <uri.mashiach@compulab.co.il>
      Acked-by: default avatarIgor Grinberg <grinberg@compulab.co.il>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1463df0f
    • xuejiufei's avatar
      ocfs2/dlm: clear refmap bit of recovery lock while doing local recovery cleanup · e6fe2687
      xuejiufei authored
      commit c95a5180 upstream.
      
      When recovery master down, dlm_do_local_recovery_cleanup() only remove
      the $RECOVERY lock owned by dead node, but do not clear the refmap bit.
      Which will make umount thread falling in dead loop migrating $RECOVERY
      to the dead node.
      Signed-off-by: default avatarxuejiufei <xuejiufei@huawei.com>
      Reviewed-by: default avatarJoseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e6fe2687
    • xuejiufei's avatar
      ocfs2/dlm: ignore cleaning the migration mle that is inuse · 7889b844
      xuejiufei authored
      commit bef5502d upstream.
      
      We have found that migration source will trigger a BUG that the refcount
      of mle is already zero before put when the target is down during
      migration.  The situation is as follows:
      
      dlm_migrate_lockres
        dlm_add_migration_mle
        dlm_mark_lockres_migrating
        dlm_get_mle_inuse
        <<<<<< Now the refcount of the mle is 2.
        dlm_send_one_lockres and wait for the target to become the
        new master.
        <<<<<< o2hb detect the target down and clean the migration
        mle. Now the refcount is 1.
      
      dlm_migrate_lockres woken, and put the mle twice when found the target
      goes down which trigger the BUG with the following message:
      
        "ERROR: bad mle: ".
      Signed-off-by: default avatarJiufei Xue <xuejiufei@huawei.com>
      Reviewed-by: default avatarJoseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7889b844
    • Takashi Iwai's avatar
      ALSA: hda - Implement loopback control switch for Realtek and other codecs · 5feecbf0
      Takashi Iwai authored
      commit e7fdd527 upstream.
      
      Many codecs, typically found on Realtek codecs, have the analog
      loopback path merged to the secondary input of the middle of the
      output paths.  Currently, we don't offer the dynamic switching in such
      configuration but let each loopback path mute by itself.
      
      This should work well in theory, but in reality, we often see that
      such a dead loopback path causes some background noises even if all
      the elements get muted.  Such a problem has been fixed by adding the
      quirk accordingly to disable aamix, and it's the right fix, per se.
      The only problem is that it's not so trivial to achieve it; user needs
      to pass a hint string via patch module option or sysfs.
      
      This patch gives a bit improvement on the situation: it adds "Loopback
      Mixing" control element for such codecs like other codecs (e.g. IDT or
      VIA codecs) with the individual loopback paths.  User can turn on/off
      the loopback path simply via a mixer app.
      
      For keeping the compatibility, the loopback is still enabled on these
      codecs.  But user can try to turn it off if experiencing a suspicious
      background or click noise on the fly, then build a static fixup later
      once after the problem is addressed.
      
      Other than the addition of the loopback enable/disablement control,
      there should be no changes.
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5feecbf0
    • Ming Lei's avatar
      block: fix bio splitting on max sectors · bfc5caf7
      Ming Lei authored
      commit d0e5fbb0 upstream.
      
      After commit e36f6204(block: split bios to maxpossible length),
      bio can be splitted in the middle of a vector entry, then it
      is easy to split out one bio which size isn't aligned with block
      size, especially when the block size is bigger than 512.
      
      This patch fixes the issue by making the max io size aligned
      to logical block size.
      
      Fixes: e36f6204(block: split bios to maxpossible length)
      Reported-by: default avatarStefan Haberland <sth@linux.vnet.ibm.com>
      Cc: Keith Busch <keith.busch@intel.com>
      Suggested-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarMing Lei <tom.leiming@gmail.com>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bfc5caf7
    • Martin Wilck's avatar
      base/platform: Fix platform drivers with no probe callback · 9e5d4a6c
      Martin Wilck authored
      commit 25cad69f upstream.
      
      Since b8b2c7d8, platform_drv_probe() is called for all platform
      devices. If drv->probe is NULL, and dev_pm_domain_attach() fails,
      platform_drv_probe() will return the error code from dev_pm_domain_attach().
      
      This causes real_probe() to enter the "probe_failed" path and set
      dev->driver to NULL. Before b8b2c7d8, real_probe() would assume
      success if both dev->bus->probe and drv->probe were missing. As a result,
      a device and driver could be "bound" together just by matching their names;
      this doesn't work any more after b8b2c7d8.
      
      This may cause problems later for certain usage of platform_driver_register()
      and platform_device_register_simple(). I observed a panic while loading
      the tpm_tis driver with parameter "force=1" (i.e. registering tpm_tis as
      a platform driver), because tpm_tis_init's assumption that the device
      returned by platform_device_register_simple() was bound didn't hold any more
      (tpmm_chip_alloc() dereferences chip->pdev->driver, causing panic).
      
      This patch restores the previous (4.3.0 and earlier) behavior of
      platform_drv_probe() in the case when the associated platform driver has
      no "probe" function.
      
      Fixes: b8b2c7d8 ("base/platform: assert that dev_pm_domain callbacks are called unconditionally")
      Signed-off-by: default avatarMartin Wilck <Martin.Wilck@ts.fujitsu.com>
      Cc: Martin Fuzzey <mfuzzey@parkeon.com>
      Acked-by: default avatarJarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9e5d4a6c
    • Ioan-Adrian Ratiu's avatar
      HID: usbhid: fix recursive deadlock · 0fc93ba9
      Ioan-Adrian Ratiu authored
      commit e470127e upstream.
      
      The critical section protected by usbhid->lock in hid_ctrl() is too
      big and because of this it causes a recursive deadlock. "Too big" means
      the case statement and the call to hid_input_report() do not need to be
      protected by the spinlock (no URB operations are done inside them).
      
      The deadlock happens because in certain rare cases drivers try to grab
      the lock while handling the ctrl irq which grabs the lock before them
      as described above. For example newer wacom tablets like 056a:033c try
      to reschedule proximity reads from wacom_intuos_schedule_prox_event()
      calling hid_hw_request() -> usbhid_request() -> usbhid_submit_report()
      which tries to grab the usbhid lock already held by hid_ctrl().
      
      There are two ways to get out of this deadlock:
          1. Make the drivers work "around" the ctrl critical region, in the
          wacom case for ex. by delaying the scheduling of the proximity read
          request itself to a workqueue.
          2. Shrink the critical region so the usbhid lock protects only the
          instructions which modify usbhid state, calling hid_input_report()
          with the spinlock unlocked, allowing the device driver to grab the
          lock first, finish and then grab the lock afterwards in hid_ctrl().
      
      This patch implements the 2nd solution.
      Signed-off-by: default avatarIoan-Adrian Ratiu <adi@adirat.com>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      Signed-off-by: default avatarJason Gerecke <jason.gerecke@wacom.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0fc93ba9
    • Tariq Saeed's avatar
      ocfs2: NFS hangs in __ocfs2_cluster_lock due to race with ocfs2_unblock_lock · 00191545
      Tariq Saeed authored
      commit b1b1e15e upstream.
      
      NFS on a 2 node ocfs2 cluster each node exporting dir.  The lock causing
      the hang is the global bit map inode lock.  Node 1 is master, has the
      lock granted in PR mode; Node 2 is in the converting list (PR -> EX).
      There are no holders of the lock on the master node so it should
      downconvert to NL and grant EX to node 2 but that does not happen.
      BLOCKED + QUEUED in lock res are set and it is on osb blocked list.
      Threads are waiting in __ocfs2_cluster_lock on BLOCKED.  One thread
      wants EX, rest want PR.  So it is as though the downconvert thread needs
      to be kicked to complete the conv.
      
      The hang is caused by an EX req coming into __ocfs2_cluster_lock on the
      heels of a PR req after it sets BUSY (drops l_lock, releasing EX
      thread), forcing the incoming EX to wait on BUSY without doing anything.
      PR has called ocfs2_dlm_lock, which sets the node 1 lock from NL -> PR,
      queues ast.
      
      At this time, upconvert (PR ->EX) arrives from node 2, finds conflict
      with node 1 lock in PR, so the lock res is put on dlm thread's dirty
      listt.
      
      After ret from ocf2_dlm_lock, PR thread now waits behind EX on BUSY till
      awoken by ast.
      
      Now it is dlm_thread that serially runs dlm_shuffle_lists, ast, bast, in
      that order.  dlm_shuffle_lists ques a bast on behalf of node 2 (which
      will be run by dlm_thread right after the ast).  ast does its part, sets
      UPCONVERT_FINISHING, clears BUSY and wakes its waiters.  Next,
      dlm_thread runs bast.  It sets BLOCKED and kicks dc thread.  dc thread
      runs ocfs2_unblock_lock, but since UPCONVERT_FINISHING set, skips doing
      anything and reques.
      
      Inside of __ocfs2_cluster_lock, since EX has been waiting on BUSY ahead
      of PR, it wakes up first, finds BLOCKED set and skips doing anything but
      clearing UPCONVERT_FINISHING (which was actually "meant" for the PR
      thread), and this time waits on BLOCKED.  Next, the PR thread comes out
      of wait but since UPCONVERT_FINISHING is not set, it skips updating the
      l_ro_holders and goes straight to wait on BLOCKED.  So there, we have a
      hang! Threads in __ocfs2_cluster_lock wait on BLOCKED, lock res in osb
      blocked list.  Only when dc thread is awoken, it will run
      ocfs2_unblock_lock and things will unhang.
      
      One way to fix this is to wake the dc thread on the flag after clearing
      UPCONVERT_FINISHING
      
      Orabug: 20933419
      Signed-off-by: default avatarTariq Saeed <tariq.x.saeed@oracle.com>
      Signed-off-by: default avatarSantosh Shilimkar <santosh.shilimkar@oracle.com>
      Reviewed-by: default avatarWengang Wang <wen.gang.wang@oracle.com>
      Reviewed-by: default avatarMark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: default avatarJoseph Qi <joseph.qi@huawei.com>
      Cc: Eric Ren <zren@suse.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      00191545
    • Keith Busch's avatar
      block: split bios to max possible length · d2081cfe
      Keith Busch authored
      commit e36f6204 upstream.
      
      This splits bio in the middle of a vector to form the largest possible
      bio at the h/w's desired alignment, and guarantees the bio being split
      will have some data.
      
      The criteria for splitting is changed from the max sectors to the h/w's
      optimal sector alignment if it is provided. For h/w that advertise their
      block storage's underlying chunk size, it's a big performance win to not
      submit commands that cross them. If sector alignment is not provided,
      this patch uses the max sectors as before.
      
      This addresses the performance issue commit d3805611 attempted to
      fix, but was reverted due to splitting logic error.
      Signed-off-by: default avatarKeith Busch <keith.busch@intel.com>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Ming Lei <tom.leiming@gmail.com>
      Cc: Kent Overstreet <kent.overstreet@gmail.com>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d2081cfe
    • Trond Myklebust's avatar
      NFSv4.1/pnfs: Fixup an lo->plh_block_lgets imbalance in layoutreturn · 1f1c9c9b
      Trond Myklebust authored
      commit 1a093ceb upstream.
      
      Since commit 2d8ae84f, nothing is bumping lo->plh_block_lgets in the
      layoutreturn path, so it should not be touched in nfs4_layoutreturn_release
      either.
      
      Fixes: 2d8ae84f ("NFSv4.1/pnfs: Remove redundant lo->plh_block_lgets...")
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1f1c9c9b
    • LABBE Corentin's avatar
      crypto: sun4i-ss - add missing statesize · 6abc6562
      LABBE Corentin authored
      commit 4f9ea866 upstream.
      
      sun4i-ss implementaton of md5/sha1 is via ahash algorithms.
      Commit 8996eafd ("crypto: ahash - ensure statesize is non-zero")
      made impossible to load them without giving statesize. This patch
      specifiy statesize for sha1 and md5.
      
      Fixes: 6298e948 ("crypto: sunxi-ss - Add Allwinner Security System crypto accelerator")
      Tested-by: default avatarChen-Yu Tsai <wens@csie.org>
      Signed-off-by: default avatarLABBE Corentin <clabbe.montjoie@gmail.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6abc6562
  2. 31 Jan, 2016 22 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.4.1 · f1ab5eaf
      Greg Kroah-Hartman authored
      f1ab5eaf
    • Lorenzo Pieralisi's avatar
      arm64: kernel: fix architected PMU registers unconditional access · 9497f702
      Lorenzo Pieralisi authored
      commit f436b2ac upstream.
      
      The Performance Monitors extension is an optional feature of the
      AArch64 architecture, therefore, in order to access Performance
      Monitors registers safely, the kernel should detect the architected
      PMU unit presence through the ID_AA64DFR0_EL1 register PMUVer field
      before accessing them.
      
      This patch implements a guard by reading the ID_AA64DFR0_EL1 register
      PMUVer field to detect the architected PMU presence and prevent accessing
      PMU system registers if the Performance Monitors extension is not
      implemented in the core.
      
      Cc: Peter Maydell <peter.maydell@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Fixes: 60792ad3 ("arm64: kernel: enforce pmuserenr_el0 initialization and restore")
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reported-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Tested-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9497f702
    • Lorenzo Pieralisi's avatar
      arm64: kernel: enforce pmuserenr_el0 initialization and restore · f50c2907
      Lorenzo Pieralisi authored
      commit 60792ad3 upstream.
      
      The pmuserenr_el0 register value is architecturally UNKNOWN on reset.
      Current kernel code resets that register value iff the core pmu device is
      correctly probed in the kernel. On platforms with missing DT pmu nodes (or
      disabled perf events in the kernel), the pmu is not probed, therefore the
      pmuserenr_el0 register is not reset in the kernel, which means that its
      value retains the reset value that is architecturally UNKNOWN (system
      may run with eg pmuserenr_el0 == 0x1, which means that PMU counters access
      is available at EL0, which must be disallowed).
      
      This patch adds code that resets pmuserenr_el0 on cold boot and restores
      it on core resume from shutdown, so that the pmuserenr_el0 setup is
      always enforced in the kernel.
      
      Cc: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f50c2907
    • Will Deacon's avatar
      arm64: mm: ensure that the zero page is visible to the page table walker · 8182d4cf
      Will Deacon authored
      commit 32d63978 upstream.
      
      In paging_init, we allocate the zero page, memset it to zero and then
      point TTBR0 to it in order to avoid speculative fetches through the
      identity mapping.
      
      In order to guarantee that the freshly zeroed page is indeed visible to
      the page table walker, we need to execute a dsb instruction prior to
      writing the TTBR.
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8182d4cf
    • John Blackwood's avatar
      arm64: Clear out any singlestep state on a ptrace detach operation · c01afb92
      John Blackwood authored
      commit 5db4fd8c upstream.
      
      Make sure to clear out any ptrace singlestep state when a ptrace(2)
      PTRACE_DETACH call is made on arm64 systems.
      
      Otherwise, the previously ptraced task will die off with a SIGTRAP
      signal if the debugger just previously singlestepped the ptraced task.
      Signed-off-by: default avatarJohn Blackwood <john.blackwood@ccur.com>
      [will: added comment to justify why this is in the arch code]
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c01afb92
    • Ulrich Weigand's avatar
      powerpc/module: Handle R_PPC64_ENTRY relocations · e268f848
      Ulrich Weigand authored
      commit a61674bd upstream.
      
      GCC 6 will include changes to generated code with -mcmodel=large,
      which is used to build kernel modules on powerpc64le.  This was
      necessary because the large model is supposed to allow arbitrary
      sizes and locations of the code and data sections, but the ELFv2
      global entry point prolog still made the unconditional assumption
      that the TOC associated with any particular function can be found
      within 2 GB of the function entry point:
      
      func:
      	addis r2,r12,(.TOC.-func)@ha
      	addi  r2,r2,(.TOC.-func)@l
      	.localentry func, .-func
      
      To remove this assumption, GCC will now generate instead this global
      entry point prolog sequence when using -mcmodel=large:
      
      	.quad .TOC.-func
      func:
      	.reloc ., R_PPC64_ENTRY
      	ld    r2, -8(r12)
      	add   r2, r2, r12
      	.localentry func, .-func
      
      The new .reloc triggers an optimization in the linker that will
      replace this new prolog with the original code (see above) if the
      linker determines that the distance between .TOC. and func is in
      range after all.
      
      Since this new relocation is now present in module object files,
      the kernel module loader is required to handle them too.  This
      patch adds support for the new relocation and implements the
      same optimization done by the GNU linker.
      Signed-off-by: default avatarUlrich Weigand <ulrich.weigand@de.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e268f848
    • Ulrich Weigand's avatar
      scripts/recordmcount.pl: support data in text section on powerpc · 3e96fc5b
      Ulrich Weigand authored
      commit 2e50c4be upstream.
      
      If a text section starts out with a data blob before the first
      function start label, disassembly parsing doing in recordmcount.pl
      gets confused on powerpc, leading to creation of corrupted module
      objects.
      
      This was not a problem so far since the compiler would never create
      such text sections.  However, this has changed with a recent change
      in GCC 6 to support distances of > 2GB between a function and its
      assoicated TOC in the ELFv2 ABI, exposing this problem.
      
      There is already code in recordmcount.pl to handle such data blobs
      on the sparc64 platform.  This patch uses the same method to handle
      those on powerpc as well.
      Acked-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarUlrich Weigand <ulrich.weigand@de.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3e96fc5b
    • Boqun Feng's avatar
      powerpc: Make {cmp}xchg* and their atomic_ versions fully ordered · badc6884
      Boqun Feng authored
      commit 81d7a329 upstream.
      
      According to memory-barriers.txt, xchg*, cmpxchg* and their atomic_
      versions all need to be fully ordered, however they are now just
      RELEASE+ACQUIRE, which are not fully ordered.
      
      So also replace PPC_RELEASE_BARRIER and PPC_ACQUIRE_BARRIER with
      PPC_ATOMIC_ENTRY_BARRIER and PPC_ATOMIC_EXIT_BARRIER in
      __{cmp,}xchg_{u32,u64} respectively to guarantee fully ordered semantics
      of atomic{,64}_{cmp,}xchg() and {cmp,}xchg(), as a complement of commit
      b97021f8 ("powerpc: Fix atomic_xxx_return barrier semantics")
      
      This patch depends on patch "powerpc: Make value-returning atomics fully
      ordered" for PPC_ATOMIC_ENTRY_BARRIER definition.
      Signed-off-by: default avatarBoqun Feng <boqun.feng@gmail.com>
      Reviewed-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      badc6884
    • Boqun Feng's avatar
      powerpc: Make value-returning atomics fully ordered · 2ed6c101
      Boqun Feng authored
      commit 49e9cf3f upstream.
      
      According to memory-barriers.txt:
      
      > Any atomic operation that modifies some state in memory and returns
      > information about the state (old or new) implies an SMP-conditional
      > general memory barrier (smp_mb()) on each side of the actual
      > operation ...
      
      Which mean these operations should be fully ordered. However on PPC,
      PPC_ATOMIC_ENTRY_BARRIER is the barrier before the actual operation,
      which is currently "lwsync" if SMP=y. The leading "lwsync" can not
      guarantee fully ordered atomics, according to Paul Mckenney:
      
      https://lkml.org/lkml/2015/10/14/970
      
      To fix this, we define PPC_ATOMIC_ENTRY_BARRIER as "sync" to guarantee
      the fully-ordered semantics.
      
      This also makes futex atomics fully ordered, which can avoid possible
      memory ordering problems if userspace code relies on futex system call
      for fully ordered semantics.
      
      Fixes: b97021f8 ("powerpc: Fix atomic_xxx_return barrier semantics")
      Signed-off-by: default avatarBoqun Feng <boqun.feng@gmail.com>
      Reviewed-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2ed6c101
    • Michael Neuling's avatar
      powerpc/tm: Check for already reclaimed tasks · e924c60d
      Michael Neuling authored
      commit 7f821fc9 upstream.
      
      Currently we can hit a scenario where we'll tm_reclaim() twice.  This
      results in a TM bad thing exception because the second reclaim occurs
      when not in suspend mode.
      
      The scenario in which this can happen is the following.  We attempt to
      deliver a signal to userspace.  To do this we need obtain the stack
      pointer to write the signal context.  To get this stack pointer we
      must tm_reclaim() in case we need to use the checkpointed stack
      pointer (see get_tm_stackpointer()).  Normally we'd then return
      directly to userspace to deliver the signal without going through
      __switch_to().
      
      Unfortunatley, if at this point we get an error (such as a bad
      userspace stack pointer), we need to exit the process.  The exit will
      result in a __switch_to().  __switch_to() will attempt to save the
      process state which results in another tm_reclaim().  This
      tm_reclaim() now causes a TM Bad Thing exception as this state has
      already been saved and the processor is no longer in TM suspend mode.
      Whee!
      
      This patch checks the state of the MSR to ensure we are TM suspended
      before we attempt the tm_reclaim().  If we've already saved the state
      away, we should no longer be in TM suspend mode.  This has the
      additional advantage of checking for a potential TM Bad Thing
      exception.
      
      Found using syscall fuzzer.
      
      Fixes: fb09692e ("powerpc: Add reclaim and recheckpoint functions for context switching transactional memory processes")
      Signed-off-by: default avatarMichael Neuling <mikey@neuling.org>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e924c60d
    • Sven Eckelmann's avatar
      batman-adv: Drop immediate orig_node free function · d29da882
      Sven Eckelmann authored
      [ Upstream commit 42eff6a6 ]
      
      It is not allowed to free the memory of an object which is part of a list
      which is protected by rcu-read-side-critical sections without making sure
      that no other context is accessing the object anymore. This usually happens
      by removing the references to this object and then waiting until the rcu
      grace period is over and no one (allowedly) accesses it anymore.
      
      But the _now functions ignore this completely. They free the object
      directly even when a different context still tries to access it. This has
      to be avoided and thus these functions must be removed and all functions
      have to use batadv_orig_node_free_ref.
      
      Fixes: 72822225 ("batman-adv: Fix rcu_barrier() miss due to double call_rcu() in TT code")
      Signed-off-by: default avatarSven Eckelmann <sven@narfation.org>
      Signed-off-by: default avatarMarek Lindner <mareklindner@neomailbox.ch>
      Signed-off-by: default avatarAntonio Quartulli <a@unstable.cc>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d29da882
    • Sven Eckelmann's avatar
      batman-adv: Drop immediate batadv_hard_iface free function · 11ed6767
      Sven Eckelmann authored
      [ Upstream commit b4d922cf ]
      
      It is not allowed to free the memory of an object which is part of a list
      which is protected by rcu-read-side-critical sections without making sure
      that no other context is accessing the object anymore. This usually happens
      by removing the references to this object and then waiting until the rcu
      grace period is over and no one (allowedly) accesses it anymore.
      
      But the _now functions ignore this completely. They free the object
      directly even when a different context still tries to access it. This has
      to be avoided and thus these functions must be removed and all functions
      have to use batadv_hardif_free_ref.
      
      Fixes: 89652331 ("batman-adv: split tq information in neigh_node struct")
      Signed-off-by: default avatarSven Eckelmann <sven@narfation.org>
      Signed-off-by: default avatarMarek Lindner <mareklindner@neomailbox.ch>
      Signed-off-by: default avatarAntonio Quartulli <a@unstable.cc>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      11ed6767
    • Sven Eckelmann's avatar
      batman-adv: Drop immediate neigh_ifinfo free function · 568f95d0
      Sven Eckelmann authored
      [ Upstream commit ae3e1e36 ]
      
      It is not allowed to free the memory of an object which is part of a list
      which is protected by rcu-read-side-critical sections without making sure
      that no other context is accessing the object anymore. This usually happens
      by removing the references to this object and then waiting until the rcu
      grace period is over and no one (allowedly) accesses it anymore.
      
      But the _now functions ignore this completely. They free the object
      directly even when a different context still tries to access it. This has
      to be avoided and thus these functions must be removed and all functions
      have to use batadv_neigh_ifinfo_free_ref.
      
      Fixes: 89652331 ("batman-adv: split tq information in neigh_node struct")
      Signed-off-by: default avatarSven Eckelmann <sven@narfation.org>
      Signed-off-by: default avatarMarek Lindner <mareklindner@neomailbox.ch>
      Signed-off-by: default avatarAntonio Quartulli <a@unstable.cc>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      568f95d0
    • Sven Eckelmann's avatar
      batman-adv: Drop immediate batadv_neigh_node free function · 31459b9b
      Sven Eckelmann authored
      [ Upstream commit 2baa753c ]
      
      It is not allowed to free the memory of an object which is part of a list
      which is protected by rcu-read-side-critical sections without making sure
      that no other context is accessing the object anymore. This usually happens
      by removing the references to this object and then waiting until the rcu
      grace period is over and no one (allowedly) accesses it anymore.
      
      But the _now functions ignore this completely. They free the object
      directly even when a different context still tries to access it. This has
      to be avoided and thus these functions must be removed and all functions
      have to use batadv_neigh_node_free_ref.
      
      Fixes: 89652331 ("batman-adv: split tq information in neigh_node struct")
      Signed-off-by: default avatarSven Eckelmann <sven@narfation.org>
      Signed-off-by: default avatarMarek Lindner <mareklindner@neomailbox.ch>
      Signed-off-by: default avatarAntonio Quartulli <a@unstable.cc>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      31459b9b
    • Sven Eckelmann's avatar
      batman-adv: Drop immediate batadv_orig_ifinfo free function · 53ab5ef0
      Sven Eckelmann authored
      [ Upstream commit deed9660 ]
      
      It is not allowed to free the memory of an object which is part of a list
      which is protected by rcu-read-side-critical sections without making sure
      that no other context is accessing the object anymore. This usually happens
      by removing the references to this object and then waiting until the rcu
      grace period is over and no one (allowedly) accesses it anymore.
      
      But the _now functions ignore this completely. They free the object
      directly even when a different context still tries to access it. This has
      to be avoided and thus these functions must be removed and all functions
      have to use batadv_orig_ifinfo_free_ref.
      
      Fixes: 7351a482 ("batman-adv: split out router from orig_node")
      Signed-off-by: default avatarSven Eckelmann <sven@narfation.org>
      Signed-off-by: default avatarMarek Lindner <mareklindner@neomailbox.ch>
      Signed-off-by: default avatarAntonio Quartulli <a@unstable.cc>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      53ab5ef0
    • Sven Eckelmann's avatar
      batman-adv: Avoid recursive call_rcu for batadv_nc_node · d3a4a363
      Sven Eckelmann authored
      [ Upstream commit 44e8e7e9 ]
      
      The batadv_nc_node_free_ref function uses call_rcu to delay the free of the
      batadv_nc_node object until no (already started) rcu_read_lock is enabled
      anymore. This makes sure that no context is still trying to access the
      object which should be removed. But batadv_nc_node also contains a
      reference to orig_node which must be removed.
      
      The reference drop of orig_node was done in the call_rcu function
      batadv_nc_node_free_rcu but should actually be done in the
      batadv_nc_node_release function to avoid nested call_rcus. This is
      important because rcu_barrier (e.g. batadv_softif_free or batadv_exit) will
      not detect the inner call_rcu as relevant for its execution. Otherwise this
      barrier will most likely be inserted in the queue before the callback of
      the first call_rcu was executed. The caller of rcu_barrier will therefore
      continue to run before the inner call_rcu callback finished.
      
      Fixes: d56b1705 ("batman-adv: network coding - detect coding nodes and remove these after timeout")
      Signed-off-by: default avatarSven Eckelmann <sven@narfation.org>
      Signed-off-by: default avatarMarek Lindner <mareklindner@neomailbox.ch>
      Signed-off-by: default avatarAntonio Quartulli <a@unstable.cc>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d3a4a363
    • Sven Eckelmann's avatar
      batman-adv: Avoid recursive call_rcu for batadv_bla_claim · 44734833
      Sven Eckelmann authored
      [ Upstream commit 63b39927 ]
      
      The batadv_claim_free_ref function uses call_rcu to delay the free of the
      batadv_bla_claim object until no (already started) rcu_read_lock is enabled
      anymore. This makes sure that no context is still trying to access the
      object which should be removed. But batadv_bla_claim also contains a
      reference to backbone_gw which must be removed.
      
      The reference drop of backbone_gw was done in the call_rcu function
      batadv_claim_free_rcu but should actually be done in the
      batadv_claim_release function to avoid nested call_rcus. This is important
      because rcu_barrier (e.g. batadv_softif_free or batadv_exit) will not
      detect the inner call_rcu as relevant for its execution. Otherwise this
      barrier will most likely be inserted in the queue before the callback of
      the first call_rcu was executed. The caller of rcu_barrier will therefore
      continue to run before the inner call_rcu callback finished.
      
      Fixes: 23721387 ("batman-adv: add basic bridge loop avoidance code")
      Signed-off-by: default avatarSven Eckelmann <sven@narfation.org>
      Acked-by: default avatarSimon Wunderlich <sw@simonwunderlich.de>
      Signed-off-by: default avatarMarek Lindner <mareklindner@neomailbox.ch>
      Signed-off-by: default avatarAntonio Quartulli <a@unstable.cc>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      44734833
    • Ido Schimmel's avatar
      team: Replace rcu_read_lock with a mutex in team_vlan_rx_kill_vid · 442f7b48
      Ido Schimmel authored
      [ Upstream commit 60a6531b ]
      
      We can't be within an RCU read-side critical section when deleting
      VLANs, as underlying drivers might sleep during the hardware operation.
      Therefore, replace the RCU critical section with a mutex. This is
      consistent with team_vlan_rx_add_vid.
      
      Fixes: 3d249d4c ("net: introduce ethernet teaming device")
      Acked-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      442f7b48
    • Doron Tsur's avatar
      net/mlx5_core: Fix trimming down IRQ number · 9ef3ceb4
      Doron Tsur authored
      [ Upstream commit 0b6e26ce ]
      
      With several ConnectX-4 cards installed on a server, one may receive
      irqn > 255 from the kernel API, which we mistakenly trim to 8bit.
      
      This causes EQ creation failure with the following stack trace:
      [<ffffffff812a11f4>] dump_stack+0x48/0x64
      [<ffffffff810ace21>] __setup_irq+0x3a1/0x4f0
      [<ffffffff810ad7e0>] request_threaded_irq+0x120/0x180
      [<ffffffffa0923660>] ? mlx5_eq_int+0x450/0x450 [mlx5_core]
      [<ffffffffa0922f64>] mlx5_create_map_eq+0x1e4/0x2b0 [mlx5_core]
      [<ffffffffa091de01>] alloc_comp_eqs+0xb1/0x180 [mlx5_core]
      [<ffffffffa091ea99>] mlx5_dev_init+0x5e9/0x6e0 [mlx5_core]
      [<ffffffffa091ec29>] init_one+0x99/0x1c0 [mlx5_core]
      [<ffffffff812e2afc>] local_pci_probe+0x4c/0xa0
      
      Fixing it by changing of the irqn type from u8 to unsigned int to
      support values > 255
      
      Fixes: 61d0e73e ('net/mlx5_core: Use the the real irqn in eq->irqn')
      Reported-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDoron Tsur <doront@mellanox.com>
      Signed-off-by: default avatarMatan Barak <matanb@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9ef3ceb4
    • Nikolay Aleksandrov's avatar
      bridge: fix lockdep addr_list_lock false positive splat · db4dca1a
      Nikolay Aleksandrov authored
      [ Upstream commit c6894dec ]
      
      After promisc mode management was introduced a bridge device could do
      dev_set_promiscuity from its ndo_change_rx_flags() callback which in
      turn can be called after the bridge's addr_list_lock has been taken
      (e.g. by dev_uc_add). This causes a false positive lockdep splat because
      the port interfaces' addr_list_lock is taken when br_manage_promisc()
      runs after the bridge's addr list lock was already taken.
      To remove the false positive introduce a custom bridge addr_list_lock
      class and set it on bridge init.
      A simple way to reproduce this is with the following:
      $ brctl addbr br0
      $ ip l add l br0 br0.100 type vlan id 100
      $ ip l set br0 up
      $ ip l set br0.100 up
      $ echo 1 > /sys/class/net/br0/bridge/vlan_filtering
      $ brctl addif br0 eth0
      Splat:
      [   43.684325] =============================================
      [   43.684485] [ INFO: possible recursive locking detected ]
      [   43.684636] 4.4.0-rc8+ #54 Not tainted
      [   43.684755] ---------------------------------------------
      [   43.684906] brctl/1187 is trying to acquire lock:
      [   43.685047]  (_xmit_ETHER){+.....}, at: [<ffffffff8150169e>] dev_set_rx_mode+0x1e/0x40
      [   43.685460]  but task is already holding lock:
      [   43.685618]  (_xmit_ETHER){+.....}, at: [<ffffffff815072a7>] dev_uc_add+0x27/0x80
      [   43.686015]  other info that might help us debug this:
      [   43.686316]  Possible unsafe locking scenario:
      
      [   43.686743]        CPU0
      [   43.686967]        ----
      [   43.687197]   lock(_xmit_ETHER);
      [   43.687544]   lock(_xmit_ETHER);
      [   43.687886] *** DEADLOCK ***
      
      [   43.688438]  May be due to missing lock nesting notation
      
      [   43.688882] 2 locks held by brctl/1187:
      [   43.689134]  #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff81510317>] rtnl_lock+0x17/0x20
      [   43.689852]  #1:  (_xmit_ETHER){+.....}, at: [<ffffffff815072a7>] dev_uc_add+0x27/0x80
      [   43.690575] stack backtrace:
      [   43.690970] CPU: 0 PID: 1187 Comm: brctl Not tainted 4.4.0-rc8+ #54
      [   43.691270] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
      [   43.691770]  ffffffff826a25c0 ffff8800369fb8e0 ffffffff81360ceb ffffffff826a25c0
      [   43.692425]  ffff8800369fb9b8 ffffffff810d0466 ffff8800369fb968 ffffffff81537139
      [   43.693071]  ffff88003a08c880 0000000000000000 00000000ffffffff 0000000002080020
      [   43.693709] Call Trace:
      [   43.693931]  [<ffffffff81360ceb>] dump_stack+0x4b/0x70
      [   43.694199]  [<ffffffff810d0466>] __lock_acquire+0x1e46/0x1e90
      [   43.694483]  [<ffffffff81537139>] ? netlink_broadcast_filtered+0x139/0x3e0
      [   43.694789]  [<ffffffff8153b5da>] ? nlmsg_notify+0x5a/0xc0
      [   43.695064]  [<ffffffff810d10f5>] lock_acquire+0xe5/0x1f0
      [   43.695340]  [<ffffffff8150169e>] ? dev_set_rx_mode+0x1e/0x40
      [   43.695623]  [<ffffffff815edea5>] _raw_spin_lock_bh+0x45/0x80
      [   43.695901]  [<ffffffff8150169e>] ? dev_set_rx_mode+0x1e/0x40
      [   43.696180]  [<ffffffff8150169e>] dev_set_rx_mode+0x1e/0x40
      [   43.696460]  [<ffffffff8150189c>] dev_set_promiscuity+0x3c/0x50
      [   43.696750]  [<ffffffffa0586845>] br_port_set_promisc+0x25/0x50 [bridge]
      [   43.697052]  [<ffffffffa05869aa>] br_manage_promisc+0x8a/0xe0 [bridge]
      [   43.697348]  [<ffffffffa05826ee>] br_dev_change_rx_flags+0x1e/0x20 [bridge]
      [   43.697655]  [<ffffffff81501532>] __dev_set_promiscuity+0x132/0x1f0
      [   43.697943]  [<ffffffff81501672>] __dev_set_rx_mode+0x82/0x90
      [   43.698223]  [<ffffffff815072de>] dev_uc_add+0x5e/0x80
      [   43.698498]  [<ffffffffa05b3c62>] vlan_device_event+0x542/0x650 [8021q]
      [   43.698798]  [<ffffffff8109886d>] notifier_call_chain+0x5d/0x80
      [   43.699083]  [<ffffffff810988b6>] raw_notifier_call_chain+0x16/0x20
      [   43.699374]  [<ffffffff814f456e>] call_netdevice_notifiers_info+0x6e/0x80
      [   43.699678]  [<ffffffff814f4596>] call_netdevice_notifiers+0x16/0x20
      [   43.699973]  [<ffffffffa05872be>] br_add_if+0x47e/0x4c0 [bridge]
      [   43.700259]  [<ffffffffa058801e>] add_del_if+0x6e/0x80 [bridge]
      [   43.700548]  [<ffffffffa0588b5f>] br_dev_ioctl+0xaf/0xc0 [bridge]
      [   43.700836]  [<ffffffff8151a7ac>] dev_ifsioc+0x30c/0x3c0
      [   43.701106]  [<ffffffff8151aac9>] dev_ioctl+0xf9/0x6f0
      [   43.701379]  [<ffffffff81254345>] ? mntput_no_expire+0x5/0x450
      [   43.701665]  [<ffffffff812543ee>] ? mntput_no_expire+0xae/0x450
      [   43.701947]  [<ffffffff814d7b02>] sock_do_ioctl+0x42/0x50
      [   43.702219]  [<ffffffff814d8175>] sock_ioctl+0x1e5/0x290
      [   43.702500]  [<ffffffff81242d0b>] do_vfs_ioctl+0x2cb/0x5c0
      [   43.702771]  [<ffffffff81243079>] SyS_ioctl+0x79/0x90
      [   43.703033]  [<ffffffff815eebb6>] entry_SYSCALL_64_fastpath+0x16/0x7a
      
      CC: Vlad Yasevich <vyasevic@redhat.com>
      CC: Stephen Hemminger <stephen@networkplumber.org>
      CC: Bridge list <bridge@lists.linux-foundation.org>
      CC: Andy Gospodarek <gospo@cumulusnetworks.com>
      CC: Roopa Prabhu <roopa@cumulusnetworks.com>
      Fixes: 2796d0c6 ("bridge: Automatically manage port promiscuous mode.")
      Reported-by: default avatarAndy Gospodarek <gospo@cumulusnetworks.com>
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      db4dca1a
    • Eric Dumazet's avatar
      ipv6: update skb->csum when CE mark is propagated · a744010b
      Eric Dumazet authored
      [ Upstream commit 34ae6a1a ]
      
      When a tunnel decapsulates the outer header, it has to comply
      with RFC 6080 and eventually propagate CE mark into inner header.
      
      It turns out IP6_ECN_set_ce() does not correctly update skb->csum
      for CHECKSUM_COMPLETE packets, triggering infamous "hw csum failure"
      messages and stack traces.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a744010b
    • Rabin Vincent's avatar
      net: bpf: reject invalid shifts · 35987ff2
      Rabin Vincent authored
      [ Upstream commit 229394e8 ]
      
      On ARM64, a BUG() is triggered in the eBPF JIT if a filter with a
      constant shift that can't be encoded in the immediate field of the
      UBFM/SBFM instructions is passed to the JIT.  Since these shifts
      amounts, which are negative or >= regsize, are invalid, reject them in
      the eBPF verifier and the classic BPF filter checker, for all
      architectures.
      Signed-off-by: default avatarRabin Vincent <rabin@rab.in>
      Acked-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      35987ff2