1. 21 Nov, 2018 16 commits
    • locking/lockdep: Fix debug_locks off performance problem · ef42ef84
      Waiman Long authored
      [ Upstream commit 9506a742 ]
      
      It was found that when debug_locks was turned off because of a problem
      found by the lockdep code, the system performance could drop quite
      significantly when the lock_stat code was also configured into the
      kernel. For instance, parallel kernel build time on a 4-socket x86-64
      server nearly doubled.
      
      Further analysis into the cause of the slowdown traced back to the
      frequent call to debug_locks_off() from the __lock_acquired() function
      probably due to some inconsistent lockdep states with debug_locks
      off. The debug_locks_off() function did an unconditional atomic xchg
      to write a 0 value into debug_locks which had already been set to 0.
      This led to severe cacheline contention in the cacheline that held
      debug_locks.  As debug_locks is referenced in quite a few different
      places in the kernel, this greatly slowed down system performance.
      
      To prevent that thrashing of the debug_locks cacheline, lock_acquired()
      and lock_contended() now check the state of debug_locks before
      proceeding. The debug_locks_off() function is also modified to check
      debug_locks before calling __debug_locks_off().
      Signed-off-by: Waiman Long <longman@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: http://lkml.kernel.org/r/1539913518-15598-1-git-send-email-longman@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ef42ef84
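The check-before-write idea described in the lockdep commit above can be sketched in user space with C11 atomics. This is only an illustration, not the kernel code: `debug_locks_flag`, `flag_off_unconditional()` and `flag_off_checked()` are hypothetical names, and the kernel uses its own xchg primitives rather than `<stdatomic.h>`.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the kernel's debug_locks flag. */
static atomic_int debug_locks_flag = 1;

/*
 * Unconditional exchange: every call performs a write, so the cacheline
 * holding the flag is pulled exclusive even when the flag is already 0.
 */
static int flag_off_unconditional(void)
{
    return atomic_exchange(&debug_locks_flag, 0);
}

/*
 * Checked variant: a plain load comes first, and the exchange is only
 * attempted while the flag is still set. Once the flag is 0, callers
 * only read, so the cacheline can stay shared between CPUs.
 */
static int flag_off_checked(void)
{
    if (!atomic_load_explicit(&debug_locks_flag, memory_order_relaxed))
        return 0;
    return atomic_exchange(&debug_locks_flag, 0);
}
```

Both helpers return the previous value of the flag, so only the first caller that actually turns it off sees a non-zero result.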
    • selftests: ftrace: Add synthetic event syntax testcase · 049b96e8
      Masami Hiramatsu authored
      [ Upstream commit ba0e41ca ]
      
      Add a testcase to check the syntax and field types for
      synthetic_events interface.
      
      Link: http://lkml.kernel.org/r/153986838264.18251.16627517536956299922.stgit@devbox
      Acked-by: Shuah Khan <shuah@kernel.org>
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      049b96e8
    • net: qla3xxx: Remove overflowing shift statement · ac435f05
      Nathan Chancellor authored
      [ Upstream commit 8c3bf9b6 ]
      
      Clang currently warns:
      
      drivers/net/ethernet/qlogic/qla3xxx.c:384:24: warning: signed shift
      result (0xF00000000) requires 37 bits to represent, but 'int' only has
      32 bits [-Wshift-overflow]
                          ((ISP_NVRAM_MASK << 16) | qdev->eeprom_cmd_data));
                            ~~~~~~~~~~~~~~ ^  ~~
      1 warning generated.
      
      The warning is certainly accurate since ISP_NVRAM_MASK is defined as
      (0x000F << 16) which is then shifted by 16, resulting in 64424509440,
      well above UINT_MAX.
      
      Given that this is the only location in this driver where ISP_NVRAM_MASK
      is shifted again, it seems likely that ISP_NVRAM_MASK was originally
      defined without a shift and during the move of the shift to the
      definition, this statement wasn't properly removed (since ISP_NVRAM_MASK
      is used in the statement right above this). Only the maintainers can
      confirm this since this statement has been here since the driver was
      first added to the kernel.
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/127
      Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ac435f05
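The arithmetic behind the Clang warning in the qla3xxx commit above can be checked with a few lines of C. The macro below is an illustrative copy widened to 64 bits so the double shift is well defined; `NVRAM_MASK_SHIFTED`, `double_shifted()` and `fits_in_32_bits()` are names made up for this sketch and are not part of the driver.

```c
#include <stdint.h>

/* Illustrative copy of a mask that already carries a 16-bit shift. */
#define NVRAM_MASK_SHIFTED ((uint64_t)0x000F << 16)

/* Shifting the already-shifted mask again gives 0xF00000000. */
static uint64_t double_shifted(void)
{
    return NVRAM_MASK_SHIFTED << 16;
}

/* True when the value can be represented in a 32-bit register. */
static int fits_in_32_bits(uint64_t value)
{
    return value <= UINT32_MAX;
}
```

The single-shifted mask (0xF0000) fits in 32 bits, while the double-shifted value equals 64424509440 and does not, which matches the figure quoted in the commit message.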
    • x86/fpu: Remove second definition of fpu in __fpu__restore_sig() · 055dbfe1
      Sebastian Andrzej Siewior authored
      [ Upstream commit 6aa67676 ]
      
      Commit:
      
        c5bedc68 ("x86/fpu: Get rid of PF_USED_MATH usage, convert it to fpu->fpstate_active")
      
      introduced the 'fpu' variable at top of __restore_xstate_sig(),
      which now shadows the other definition:
      
        arch/x86/kernel/fpu/signal.c:318:28: warning: symbol 'fpu' shadows an earlier one
        arch/x86/kernel/fpu/signal.c:271:20: originally declared here
      
      Remove the shadowed definition of 'fpu', as the two definitions are the same.
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Reviewed-by: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: c5bedc68 ("x86/fpu: Get rid of PF_USED_MATH usage, convert it to fpu->fpstate_active")
      Link: http://lkml.kernel.org/r/20181016202525.29437-3-bigeasy@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      055dbfe1
    • sparc: Fix single-pcr perf event counter management. · 9190b06c
      David S. Miller authored
      [ Upstream commit cfdc3170 ]
      
      It is important to clear the hw->state value for non-stopped events
      when they are added into the PMU.  Otherwise when the event is
      scheduled out, we won't read the counter because HES_UPTODATE is still
      set.  This breaks 'perf stat' and similar use cases, causing all the
      events to show zero.
      
      This worked for multi-pcr because we make explicit sparc_pmu_start()
      calls in calculate_multiple_pcrs().  calculate_single_pcr() doesn't do
      this because the idea there is to accumulate all of the counter
      settings into the single pcr value.  So we have to add explicit
      hw->state handling there.
      
      Like x86, we use the PERF_HES_ARCH bit to track truly stopped events
      so that we don't accidentally start them on a reload.
      
      Related to all of this, sparc_pmu_start() is missing a userpage update
      so add it.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      9190b06c
    • x86/kconfig: Fall back to ticket spinlocks · 97b8ca65
      Daniel Wagner authored
      Sebastian writes:
      
      """
      We reproducibly observe cache line starvation on a Core2Duo E6850 (2
      cores), an i5-6400 SKL (4 cores) and on an NXP LS2044A ARM Cortex-A72 (4
      cores).
      
      The problem can be triggered with a v4.9-RT kernel by starting
      
          cyclictest -S -p98 -m  -i2000 -b 200
      
      and as "load"
      
          stress-ng --ptrace 4
      
      The reported maximal latency is usually less than 60us. If the problem
      triggers then values around 400us, 800us or even more are reported. The
      upper limit is the -i parameter.
      
      Reproduction with 4.9-RT is almost immediate on Core2Duo, ARM64 and SKL,
      but it took 7.5 hours to trigger on v4.14-RT on the Core2Duo.
      
      Instrumentation always shows the same picture:
      
      CPU0                                         CPU1
      => do_syscall_64                              => do_syscall_64
      => SyS_ptrace                                   => syscall_slow_exit_work
      => ptrace_check_attach                          => ptrace_do_notify / rt_read_unlock
      => wait_task_inactive                              rt_spin_lock_slowunlock()
         -> while task_running()                         __rt_mutex_unlock_common()
        /   check_task_state()                           mark_wakeup_next_waiter()
       |     raw_spin_lock_irq(&p->pi_lock);             raw_spin_lock(&current->pi_lock);
       |     .                                               .
       |     raw_spin_unlock_irq(&p->pi_lock);               .
        \  cpu_relax()                                       .
         -                                                   .
          *IRQ*                                          <lock acquired>
      
      In the error case we observe that the while() loop is repeated more than
      5000 times which indicates that the pi_lock can be acquired. CPU1 on the
      other side does not make progress waiting for the same lock with interrupts
      disabled.
      
      This continues until an IRQ hits CPU0. Once CPU0 starts processing the IRQ
      the other CPU is able to acquire pi_lock and the situation relaxes.
      """
      
      This matches the observation for v4.4-rt on a Core2Duo E6850:
      
      CPU 0:
      
      - no progress for a very long time (in rt_mutex_dequeue_pi):
      
      stress-n-1931    0d..11  5060.891219: function:             __try_to_take_rt_mutex
      stress-n-1931    0d..11  5060.891219: function:                rt_mutex_dequeue
      stress-n-1931    0d..21  5060.891220: function:                rt_mutex_enqueue_pi
      stress-n-1931    0....2  5060.891220: signal_generate:      sig=17 errno=0 code=262148 comm=stress-ng-ptrac pid=1928 grp=1 res=1
      stress-n-1931    0d..21  5060.894114: function:             rt_mutex_dequeue_pi
      stress-n-1931    0d.h11  5060.894115: local_timer_entry:    vector=239
      
      CPU 1:
      
      - IRQ at 5060.894114 on CPU 1 followed by the IRQ on CPU 0
      
      stress-n-1928    1....0  5060.891215: sys_enter:            NR 101 (18, 78b, 0, 0, 17, 788)
      stress-n-1928    1d..11  5060.891216: function:             __try_to_take_rt_mutex
      stress-n-1928    1d..21  5060.891216: function:                rt_mutex_enqueue_pi
      stress-n-1928    1d..21  5060.891217: function:             rt_mutex_dequeue_pi
      stress-n-1928    1....1  5060.891217: function:             rt_mutex_adjust_prio
      stress-n-1928    1d..11  5060.891218: function:                __rt_mutex_adjust_prio
      stress-n-1928    1d.h10  5060.894114: local_timer_entry:    vector=239
      
      Thomas writes:
      
      """
      This has nothing to do with RT. RT is merely exposing the
      problem in an observable way. The same issue happens with upstream, it's
      harder to trigger and it's harder to observe for obvious reasons.
      
      If you read through the discussions [see the links below] then you
      really see that there is an upstream issue with the x86 qrlock
      implementation and Peter has posted fixes which resolve it, both at
      the practical and the theoretical level.
      """
      
      Backporting all qspinlock related patches is very likely to introduce
      regressions on v4.4. Therefore, the recommended solution by Peter and
      Thomas is to drop back to ticket spinlocks for v4.4.
      
      Link: https://lkml.kernel.org/r/20180921120226.6xjgr4oiho22ex75@linutronix.de
      Link: https://lkml.kernel.org/r/20180926110117.405325143@infradead.org
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Daniel Wagner <daniel.wagner@siemens.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      97b8ca65
    • x86/corruption-check: Fix panic in memory_corruption_check() when boot option... · 52d8cdd9
      He Zhe authored
      x86/corruption-check: Fix panic in memory_corruption_check() when boot option without value is provided
      
      commit ccde460b upstream.
      
      The handlers for memory_corruption_check[{_period|_size}] do not check their
      input argument before passing it to kstrtoul() or simple_strtoull(). The
      argument is a NULL pointer if any of these kernel parameters is given on the
      command line without a value, which causes the following panic.
      
      PANIC: early exception 0xe3 IP 10:ffffffff73587c22 error 0 cr2 0x0
      [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.18-rc8+ #2
      [    0.000000] RIP: 0010:kstrtoull+0x2/0x10
      ...
      [    0.000000] Call Trace
      [    0.000000]  ? set_corruption_check+0x21/0x49
      [    0.000000]  ? do_early_param+0x4d/0x82
      [    0.000000]  ? parse_args+0x212/0x330
      [    0.000000]  ? rdinit_setup+0x26/0x26
      [    0.000000]  ? parse_early_options+0x20/0x23
      [    0.000000]  ? rdinit_setup+0x26/0x26
      [    0.000000]  ? parse_early_param+0x2d/0x39
      [    0.000000]  ? setup_arch+0x2f7/0xbf4
      [    0.000000]  ? start_kernel+0x5e/0x4c2
      [    0.000000]  ? load_ucode_bsp+0x113/0x12f
      [    0.000000]  ? secondary_startup_64+0xa5/0xb0
      
      This patch adds checks to prevent the panic.
      Signed-off-by: He Zhe <zhe.he@windriver.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: gregkh@linuxfoundation.org
      Cc: kstewart@linuxfoundation.org
      Cc: pombredanne@nexb.com
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/1534260823-87917-1-git-send-email-zhe.he@windriver.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      52d8cdd9
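The kind of guard the corruption-check commit above describes can be sketched in user space as follows. This is not the kernel patch: `parse_check_period()` is a hypothetical handler, and `strtoul()` stands in for the kernel's `kstrtoul()`.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical parameter handler. Returns 0 and stores the parsed value
 * on success, or -EINVAL when the value is missing or malformed.
 */
static int parse_check_period(const char *arg, unsigned long *out)
{
    char *end;
    unsigned long val;

    /* A parameter given without "=value" arrives as a NULL pointer. */
    if (!arg)
        return -EINVAL;

    errno = 0;
    val = strtoul(arg, &end, 10);
    if (errno || end == arg || *end != '\0')
        return -EINVAL;

    *out = val;
    return 0;
}
```

Rejecting the NULL argument up front keeps the string conversion from ever dereferencing it, which is the failure shown in the panic trace.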
    • ALSA: ca0106: Disable IZD on SB0570 DAC to fix audio pops · 818f57e7
      Alex Stanoev authored
      commit ac237c28 upstream.
      
      The Creative Audigy SE (SB0570) card currently exhibits an audible pop
      whenever playback is stopped or resumed, or during silent periods of an
      audio stream. Initialise the IZD bit to 0 to eliminate these pops.
      
      The Infinite Zero Detection (IZD) feature on the DAC causes the output
      to be shunted to Vcap after 2048 samples of silence. This discharges the
      AC coupling capacitor through the output and causes the aforementioned
      pop/click noise.
      
      The behaviour of the IZD bit is described on page 15 of the WM8768GEDS
      datasheet: "With IZD=1, applying MUTE for 1024 consecutive input samples
      will cause all outputs to be connected directly to VCAP. This also
      happens if 2048 consecutive zero input samples are applied to all 6
      channels, and IZD=0. It will be removed as soon as any channel receives
      a non-zero input". I believe the second sentence might be referring to
      IZD=1 instead of IZD=0 given the observed behaviour of the card.
      
      This change should make the DAC initialisation consistent with
      Creative's Windows driver, as this popping persists when initialising
      the card in Linux and soft rebooting into Windows, but is not present on
      a cold boot to Windows.
      Signed-off-by: Alex Stanoev <alex@astanoev.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      818f57e7
    • ALSA: hda - Add mic quirk for the Lenovo G50-30 (17aa:3905) · f0a658e5
      Jeremy Cline authored
      commit e7bb6ad5 upstream.
      
      The Lenovo G50-30, like other G50 models, has a Conexant codec that
      requires a quirk for its inverted stereo dmic.
      
      Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1249364
      Reported-by: Alexander Ploumistos <alex.ploumistos@gmail.com>
      Tested-by: Alexander Ploumistos <alex.ploumistos@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Jeremy Cline <jcline@redhat.com>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f0a658e5
    • parisc: Fix map_pages() to not overwrite existing pte entries · ae53e64e
      Helge Deller authored
      commit 3c229b3f upstream.
      
      Fix a long-existing small nasty bug in the map_pages() implementation which
      leads to overwriting already written pte entries with zero, *if* map_pages() is
      called a second time with an end address which isn't aligned on a pmd boundary.
      This happens for example if we want to remap only the text segment read/write
      in order to run alternative patching on the code. Exiting the loop when we
      reach the end address fixes this.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Helge Deller <deller@gmx.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ae53e64e
    • parisc: Fix address in HPMC IVA · 7d39307d
      John David Anglin authored
      commit 1138b671 upstream.
      
      Helge noticed that the address of the os_hpmc handler was not being
      correctly calculated in the hpmc macro.  As a result, PDCE_CHECK would
      fail to call os_hpmc:
      
      <Cpu2> e800009802e00000  0000000000000000  CC_ERR_CHECK_HPMC
      <Cpu2> 37000f7302e00000  8040004000000000  CC_ERR_CPU_CHECK_SUMMARY
      <Cpu2> f600105e02e00000  fffffff0f0c00000  CC_MC_HPMC_MONARCH_SELECTED
      <Cpu2> 140003b202e00000  000000000000000b  CC_ERR_HPMC_STATE_ENTRY
      <Cpu2> 5600100b02e00000  00000000000001a0  CC_MC_OS_HPMC_LEN_ERR
      <Cpu2> 5600106402e00000  fffffff0f0438e70  CC_MC_BR_TO_OS_HPMC_FAILED
      <Cpu2> e800009802e00000  0000000000000000  CC_ERR_CHECK_HPMC
      <Cpu2> 37000f7302e00000  8040004000000000  CC_ERR_CPU_CHECK_SUMMARY
      <Cpu2> 4000109f02e00000  0000000000000000  CC_MC_HPMC_INITIATED
      <Cpu2> 4000101902e00000  0000000000000000  CC_MC_MULTIPLE_HPMCS
      <Cpu2> 030010d502e00000  0000000000000000  CC_CPU_STOP
      
      The address problem can be seen by dumping the fault vector:
      
      0000000040159000 <fault_vector_20>:
          40159000:   63 6f 77 73     stb r15,-2447(dp)
          40159004:   20 63 61 6e     ldil L%b747000,r3
          40159008:   20 66 6c 79     ldil L%-1c3b3000,r3
              ...
          40159020:   08 00 02 40     nop
          40159024:   20 6e 60 02     ldil L%15d000,r3
          40159028:   34 63 00 00     ldo 0(r3),r3
          4015902c:   e8 60 c0 02     bv,n r0(r3)
          40159030:   08 00 02 40     nop
          40159034:   00 00 00 00     break 0,0
          40159038:   c0 00 70 00     bb,*< r0,sar,40159840 <fault_vector_20+0x840>
          4015903c:   00 00 00 00     break 0,0
      
      Location 40159038 should contain the physical address of os_hpmc:
      
      000000004015d000 <os_hpmc>:
          4015d000:   08 1a 02 43     copy r26,r3
          4015d004:   01 c0 08 a4     mfctl iva,r4
          4015d008:   48 85 00 68     ldw 34(r4),r5
      
      This patch moves the address setup into initialize_ivt to resolve the
      above problem.  I tested the change by dumping the HPMC entry after setup:
      
      0000000040209020:  8000240
      0000000040209024: 206a2004
      0000000040209028: 34630ac0
      000000004020902c: e860c002
      0000000040209030:  8000240
      0000000040209034: 1bdddce6
      0000000040209038:   15d000
      000000004020903c:      1a0
      Signed-off-by: John David Anglin <dave.anglin@bell.net>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Helge Deller <deller@gmx.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      7d39307d
    • ipmi: Fix timer race with module unload · eda6ef4a
      Jan Glauber authored
      commit 0711e8c1 upstream.
      
      Please note that below oops is from an older kernel, but the same
      race seems to be present in the upstream kernel too.
      
      ---8<---
      
      The following panic was encountered during removing the ipmi_ssif
      module:
      
      [ 526.352555] Unable to handle kernel paging request at virtual address ffff000006923090
      [ 526.360464] Mem abort info:
      [ 526.363257] ESR = 0x86000007
      [ 526.366304] Exception class = IABT (current EL), IL = 32 bits
      [ 526.372221] SET = 0, FnV = 0
      [ 526.375269] EA = 0, S1PTW = 0
      [ 526.378405] swapper pgtable: 4k pages, 48-bit VAs, pgd = 000000008ae60416
      [ 526.385185] [ffff000006923090] *pgd=000000bffcffe803, *pud=000000bffcffd803, *pmd=0000009f4731a003, *pte=0000000000000000
      [ 526.396141] Internal error: Oops: 86000007 [#1] SMP
      [ 526.401008] Modules linked in: nls_iso8859_1 ipmi_devintf joydev input_leds ipmi_msghandler shpchp sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear i2c_smbus hid_generic usbhid uas hid usb_storage ast aes_ce_blk i2c_algo_bit aes_ce_cipher qede ttm crc32_ce ptp crct10dif_ce drm_kms_helper ghash_ce syscopyarea sha2_ce sysfillrect sysimgblt pps_core fb_sys_fops sha256_arm64 sha1_ce mpt3sas qed drm raid_class ahci scsi_transport_sas libahci gpio_xlp i2c_xlp9xx aes_neon_bs aes_neon_blk crypto_simd cryptd aes_arm64 [last unloaded: ipmi_ssif]
      [ 526.468085] CPU: 125 PID: 0 Comm: swapper/125 Not tainted 4.15.0-35-generic #38~lp1775396+build.1
      [ 526.476942] Hardware name: To be filled by O.E.M. Saber/Saber, BIOS 0ACKL022 08/14/2018
      [ 526.484932] pstate: 00400009 (nzcv daif +PAN -UAO)
      [ 526.489713] pc : 0xffff000006923090
      [ 526.493198] lr : call_timer_fn+0x34/0x178
      [ 526.497194] sp : ffff000009b0bdd0
      [ 526.500496] x29: ffff000009b0bdd0 x28: 0000000000000082
      [ 526.505796] x27: 0000000000000002 x26: ffff000009515188
      [ 526.511096] x25: ffff000009515180 x24: ffff0000090f1018
      [ 526.516396] x23: ffff000009519660 x22: dead000000000200
      [ 526.521696] x21: ffff000006923090 x20: 0000000000000100
      [ 526.526995] x19: ffff809eeb466a40 x18: 0000000000000000
      [ 526.532295] x17: 000000000000000e x16: 0000000000000007
      [ 526.537594] x15: 0000000000000000 x14: 071c71c71c71c71c
      [ 526.542894] x13: 0000000000000000 x12: 0000000000000000
      [ 526.548193] x11: 0000000000000001 x10: ffff000009b0be88
      [ 526.553493] x9 : 0000000000000000 x8 : 0000000000000005
      [ 526.558793] x7 : ffff80befc1f8528 x6 : 0000000000000020
      [ 526.564092] x5 : 0000000000000040 x4 : 0000000020001b20
      [ 526.569392] x3 : 0000000000000000 x2 : ffff809eeb466a40
      [ 526.574692] x1 : ffff000006923090 x0 : ffff809eeb466a40
      [ 526.579992] Process swapper/125 (pid: 0, stack limit = 0x000000002eb50acc)
      [ 526.586854] Call trace:
      [ 526.589289] 0xffff000006923090
      [ 526.592419] expire_timers+0xc8/0x130
      [ 526.596070] run_timer_softirq+0xec/0x1b0
      [ 526.600070] __do_softirq+0x134/0x328
      [ 526.603726] irq_exit+0xc8/0xe0
      [ 526.606857] __handle_domain_irq+0x6c/0xc0
      [ 526.610941] gic_handle_irq+0x84/0x188
      [ 526.614679] el1_irq+0xe8/0x180
      [ 526.617822] cpuidle_enter_state+0xa0/0x328
      [ 526.621993] cpuidle_enter+0x34/0x48
      [ 526.625564] call_cpuidle+0x44/0x70
      [ 526.629040] do_idle+0x1b8/0x1f0
      [ 526.632256] cpu_startup_entry+0x2c/0x30
      [ 526.636174] secondary_start_kernel+0x11c/0x130
      [ 526.640694] Code: bad PC value
      [ 526.643800] ---[ end trace d020b0b8417c2498 ]---
      [ 526.648404] Kernel panic - not syncing: Fatal exception in interrupt
      [ 526.654778] SMP: stopping secondary CPUs
      [ 526.658734] Kernel Offset: disabled
      [ 526.662211] CPU features: 0x5800c38
      [ 526.665688] Memory Limit: none
      [ 526.668768] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
      
      Prevent mod_timer from arming a timer that was already removed by
      del_timer during module unload.
      Signed-off-by: Jan Glauber <jglauber@cavium.com>
      Cc: <stable@vger.kernel.org> # 3.19
      Signed-off-by: Corey Minyard <cminyard@mvista.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      eda6ef4a
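One common way to keep a timer from being re-armed after teardown, as the ipmi commit above requires, is a "stopping" flag that is set before the timer is deleted and checked before every re-arm. The sketch below is a single-threaded user-space model under that assumption; `fake_timer`, `fake_dev`, `dev_rearm()` and `dev_shutdown()` are hypothetical, and a real driver would also need the kernel's timer API and proper synchronization between the flag and the timer callback.

```c
#include <stdbool.h>

/* Hypothetical model of a timer: only tracks whether it is armed. */
struct fake_timer {
    bool armed;
    unsigned long expires;
};

/* Hypothetical device state with a teardown flag. */
struct fake_dev {
    struct fake_timer timer;
    bool stopping;
};

/* Re-arm the timer unless teardown has started. Returns true if armed. */
static bool dev_rearm(struct fake_dev *dev, unsigned long expires)
{
    if (dev->stopping)
        return false;
    dev->timer.armed = true;
    dev->timer.expires = expires;
    return true;
}

/* Teardown: set the flag first, then delete the timer. */
static void dev_shutdown(struct fake_dev *dev)
{
    dev->stopping = true;
    dev->timer.armed = false;
}
```

Setting the flag before deleting the timer matters: any late re-arm attempt is refused, so no callback can fire into module code that has already been unloaded.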
    • pcmcia: Implement CLKRUN protocol disabling for Ricoh bridges · 0497878b
      Maciej S. Szmigiero authored
      commit 95691e3e upstream.
      
      Currently, the "disable_clkrun" yenta_socket module parameter is only
      implemented for TI CardBus bridges.
      Also add an implementation for Ricoh bridges that have the necessary
      setting documented in publicly available datasheets.
      
      Tested on a RL5C476II with a Sunrich C-160 CardBus NIC that doesn't work
      correctly unless the CLKRUN protocol is disabled.
      
      Let's also make it clear in its description that the "disable_clkrun"
      module parameter only works on these two previously mentioned brands of
      CardBus bridges.
      Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
      Cc: stable@vger.kernel.org
      Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      0497878b
    • jffs2: free jffs2_sb_info through jffs2_kill_sb() · 85b89ccf
      Hou Tao authored
      commit 92e2921f upstream.
      
      When an invalid mount option is passed to jffs2, jffs2_parse_options()
      will fail and jffs2_sb_info will be freed, but then jffs2_sb_info will
      be used (use-after-free) and freed (double-free) in jffs2_kill_sb().
      
      Fix it by removing the buggy invocation of kfree() when getting invalid
      mount options.
      
      Fixes: 92abc475 ("jffs2: implement mount option parsing and compression overriding")
      Cc: stable@kernel.org
      Signed-off-by: Hou Tao <houtao1@huawei.com>
      Reviewed-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      85b89ccf
    • hwmon: (pmbus) Fix page count auto-detection. · dff53cf7
      Dmitry Bazhenov authored
      commit e7c6a556 upstream.
      
      Devices with compatible="pmbus" field have zero initial page count,
      and pmbus_clear_faults() being called before the page count auto-
      detection does not actually clear faults because it depends on the
      page count. Non-cleared faults in its turn may fail the subsequent
      page count auto-detection.
      
      This patch fixes this problem by calling pmbus_clear_fault_page()
      for currently set page and calling pmbus_clear_faults() after the
      page count was detected.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Dmitry Bazhenov <bazhenov.dn@gmail.com>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      dff53cf7
    • bcache: fix miss key refill->end in writeback · 9b359dd9
      Tang Junhui authored
      commit 2d6cb6ed upstream.
      
      refill->end records the last key of writeback. For example, the first
      time, keys (1,128K) to (1,1024K) are flushed to the backend device, but
      the end key (1,1024K) is not included, because of the code below:
      	if (bkey_cmp(k, refill->end) >= 0) {
      		ret = MAP_DONE;
      		goto out;
      	}
      The next time we refill the writeback keybuf, we search for keys
      starting from (1,1024K) and get a key bigger than it, so the key
      (1,1024K) is missed.
      This patch modifies the above code so that the end key is included in
      the writeback key buffer.
      Signed-off-by: Tang Junhui <tang.junhui.linux@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      9b359dd9
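The off-by-one in the bcache commit above comes down to whether the scan stops on `>= 0` or on `> 0` when comparing against the end key. The sketch below uses plain integers instead of bcache keys and assumes the fix amounts to making the end bound inclusive; `collect()` is a hypothetical helper, not bcache code.

```c
#include <stddef.h>

/*
 * Count how many keys from a sorted array are taken before the scan
 * stops at 'end'. With inclusive == 0 the scan stops when key >= end
 * (end key skipped); with inclusive != 0 it stops only when key > end.
 */
static size_t collect(const int *keys, size_t n, int end, int inclusive)
{
    size_t taken = 0;

    for (size_t i = 0; i < n; i++) {
        /* Three-way compare: negative, zero or positive. */
        int cmp = (keys[i] > end) - (keys[i] < end);

        if (inclusive ? cmp > 0 : cmp >= 0)
            break;
        taken++;
    }
    return taken;
}
```

With keys 128, 256, 512 and 1024 and an end of 1024, the exclusive scan takes three keys and drops the last one, while the inclusive scan takes all four.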
  2. 10 Nov, 2018 24 commits
    • Linux 4.4.163 · 7a426970
      Greg Kroah-Hartman authored
      7a426970
    • x86/time: Correct the attribute on jiffies' definition · 8474c9b8
      Nathan Chancellor authored
      commit 53c13ba8 upstream.
      
      Clang warns that the declaration of jiffies in include/linux/jiffies.h
      doesn't match the definition in arch/x86/kernel/time.c:
      
      arch/x86/kernel/time.c:29:42: warning: section does not match previous declaration [-Wsection]
      __visible volatile unsigned long jiffies __cacheline_aligned = INITIAL_JIFFIES;
                                               ^
      ./include/linux/cache.h:49:4: note: expanded from macro '__cacheline_aligned'
                       __section__(".data..cacheline_aligned")))
                       ^
      ./include/linux/jiffies.h:81:31: note: previous attribute is here
      extern unsigned long volatile __cacheline_aligned_in_smp __jiffy_arch_data jiffies;
                                    ^
      ./arch/x86/include/asm/cache.h:20:2: note: expanded from macro '__cacheline_aligned_in_smp'
              __page_aligned_data
              ^
      ./include/linux/linkage.h:39:29: note: expanded from macro '__page_aligned_data'
      #define __page_aligned_data     __section(.data..page_aligned) __aligned(PAGE_SIZE)
                                      ^
      ./include/linux/compiler_attributes.h:233:56: note: expanded from macro '__section'
      #define __section(S)                    __attribute__((__section__(#S)))
                                                             ^
      1 warning generated.
      
      The declaration was changed in commit 7c30f352 ("jiffies.h: declare
      jiffies and jiffies_64 with ____cacheline_aligned_in_smp") but wasn't
      updated here. Make them match so Clang no longer warns.
      
      Fixes: 7c30f352 ("jiffies.h: declare jiffies and jiffies_64 with ____cacheline_aligned_in_smp")
      Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181013005311.28617-1-natechancellor@gmail.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      8474c9b8
    • l2tp: hold tunnel socket when handling control frames in l2tp_ip and l2tp_ip6 · 80ab1e24
      Guillaume Nault authored
      commit 94d7ee0b upstream.
      
      The code following l2tp_tunnel_find() expects that a new reference is
      held on sk. Either sk_receive_skb() or the discard_put error path will
      drop a reference from the tunnel's socket.
      
      This issue exists in both l2tp_ip and l2tp_ip6.
      
      Fixes: a3c18422 ("l2tp: hold socket before dropping lock in l2tp_ip{, 6}_recv()")
      Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • cpuidle: Do not access cpuidle_devices when !CONFIG_CPU_IDLE · 8a1d3de1
      Catalin Marinas authored
      commit 9bd616e3 upstream.
      
      The cpuidle_devices per-CPU variable is only defined when CPU_IDLE is
      enabled. Commit c8cc7d4d ("sched/idle: Reorganize the idle loop")
      removed the #ifdef CONFIG_CPU_IDLE around cpuidle_idle_call() with the
      compiler optimising away __this_cpu_read(cpuidle_devices). However, with
      CONFIG_UBSAN && !CONFIG_CPU_IDLE, this optimisation no longer happens
      and the kernel fails to link since cpuidle_devices is not defined.
      
      This patch introduces an accessor function for the current CPU cpuidle
      device (returning NULL when !CONFIG_CPU_IDLE) and uses it in
      cpuidle_idle_call().
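
      The accessor pattern described above can be sketched in plain C. This
      is a userspace illustration only: CONFIG_CPU_IDLE is simply left
      undefined here, and struct cpuidle_device, this_cpu_device,
      get_cpu_device() and idle_call() are hypothetical stand-ins, not the
      kernel's definitions.

```c
#include <stddef.h>

struct cpuidle_device {
	int enabled;
};

#ifdef CONFIG_CPU_IDLE
/* Only defined when the feature is configured in. */
static struct cpuidle_device this_cpu_device = { .enabled = 1 };

static inline struct cpuidle_device *get_cpu_device(void)
{
	return &this_cpu_device;
}
#else
/* Stub: callers never reference the variable that does not exist. */
static inline struct cpuidle_device *get_cpu_device(void)
{
	return NULL;
}
#endif

/* Returns 1 when the cpuidle path can be used, 0 for the default idle path. */
static int idle_call(void)
{
	struct cpuidle_device *dev = get_cpu_device();

	if (!dev || !dev->enabled)
		return 0;
	return 1;
}
```

      Because the stub returns NULL, the link no longer depends on the
      compiler optimising the variable reference away.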
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: 4.5+ <stable@vger.kernel.org> # 4.5+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/percpu: Fix this_cpu_read() · 74ede0af
      Peter Zijlstra authored
      commit b59167ac upstream.
      
      Eric reported that a sequence count loop using this_cpu_read() got
      optimized out. This is wrong: this_cpu_read() must imply READ_ONCE()
      because the interface is IRQ-safe, so an interrupt can have
      changed the per-cpu value.
      
      Fixes: 7c3576d2 ("[PATCH] i386: Convert PDA into the percpu section")
      Reported-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Cc: hpa@zytor.com
      Cc: eric.dumazet@gmail.com
      Cc: bp@alien8.de
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181011104019.748208519@infradead.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sched/fair: Fix throttle_list starvation with low CFS quota · 4a2b54a7
      Phil Auld authored
      commit baa9be4f upstream.
      
      With a very low cpu.cfs_quota_us setting, such as the minimum of 1000,
      distribute_cfs_runtime may not empty the throttled_list before it runs
      out of runtime to distribute. In that case, due to the change from
      c06f04c7 to put throttled entries at the head of the list, later entries
      on the list will starve.  Essentially, the same X processes get pulled
      off the list, are given CPU time and then, when their runtime expires,
      are put back on the head of the list, where distribute_cfs_runtime gives
      runtime to the same set of processes and leaves the rest waiting.
      
      Fix the issue by setting a bit in struct cfs_bandwidth when
      distribute_cfs_runtime is running, so that the code in throttle_cfs_rq can
      decide to put the throttled entry on the tail or the head of the list.  The
      bit is set/cleared by the callers of distribute_cfs_runtime while they hold
      cfs_bandwidth->lock.
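
      A minimal sketch of the head-or-tail decision, using a small array in
      place of the kernel's list. struct bandwidth and throttle() are
      hypothetical names. The direction chosen (head while distribution is
      running, tail otherwise) is my reading of the upstream change and is
      not spelled out in the text above, so treat it as an assumption.

```c
#include <string.h>

#define MAX_ENTRIES 16

struct bandwidth {
	int distribute_running;   /* set by callers while runtime is handed out */
	int list[MAX_ENTRIES];    /* throttled entries, index 0 is the head */
	int len;
};

/* Queue a throttled entry; the caller is assumed to hold the lock. */
static void throttle(struct bandwidth *b, int id)
{
	if (b->len >= MAX_ENTRIES)
		return;

	if (b->distribute_running) {
		/* Head: an in-progress distribution walk will not see it. */
		memmove(&b->list[1], &b->list[0],
			(size_t)b->len * sizeof(b->list[0]));
		b->list[0] = id;
		b->len++;
	} else {
		/* Tail: entries that were already waiting are served first. */
		b->list[b->len++] = id;
	}
}
```

      With tail insertion as the default, entries that waited longest reach
      the front of the list instead of being pushed back repeatedly.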
      
      This is easy to reproduce with a handful of CPU consumers. I use 'crash' on
      the live system. In some cases you can simply look at the throttled list and
      see the later entries are not changing:
      
        crash> list cfs_rq.throttled_list -H 0xffff90b54f6ade40 -s cfs_rq.runtime_remaining | paste - - | awk '{print $1"  "$4}' | pr -t -n3
          1     ffff90b56cb2d200  -976050
          2     ffff90b56cb2cc00  -484925
          3     ffff90b56cb2bc00  -658814
          4     ffff90b56cb2ba00  -275365
          5     ffff90b166a45600  -135138
          6     ffff90b56cb2da00  -282505
          7     ffff90b56cb2e000  -148065
          8     ffff90b56cb2fa00  -872591
          9     ffff90b56cb2c000  -84687
         10     ffff90b56cb2f000  -87237
         11     ffff90b166a40a00  -164582
      
        crash> list cfs_rq.throttled_list -H 0xffff90b54f6ade40 -s cfs_rq.runtime_remaining | paste - - | awk '{print $1"  "$4}' | pr -t -n3
          1     ffff90b56cb2d200  -994147
          2     ffff90b56cb2cc00  -306051
          3     ffff90b56cb2bc00  -961321
          4     ffff90b56cb2ba00  -24490
          5     ffff90b166a45600  -135138
          6     ffff90b56cb2da00  -282505
          7     ffff90b56cb2e000  -148065
          8     ffff90b56cb2fa00  -872591
          9     ffff90b56cb2c000  -84687
         10     ffff90b56cb2f000  -87237
         11     ffff90b166a40a00  -164582
      
      Sometimes it is easier to see by finding a process getting starved and looking
      at the sched_info:
      
        crash> task ffff8eb765994500 sched_info
        PID: 7800   TASK: ffff8eb765994500  CPU: 16  COMMAND: "cputest"
          sched_info = {
            pcount = 8,
            run_delay = 697094208,
            last_arrival = 240260125039,
            last_queued = 240260327513
          },
        crash> task ffff8eb765994500 sched_info
        PID: 7800   TASK: ffff8eb765994500  CPU: 16  COMMAND: "cputest"
          sched_info = {
            pcount = 8,
            run_delay = 697094208,
            last_arrival = 240260125039,
            last_queued = 240260327513
          },
      Signed-off-by: Phil Auld <pauld@redhat.com>
      Reviewed-by: Ben Segall <bsegall@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Fixes: c06f04c7 ("sched: Fix potential near-infinite distribute_cfs_runtime() loop")
      Link: http://lkml.kernel.org/r/20181008143639.GA4019@pauld.bos.csb
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Input: elan_i2c - add ACPI ID for Lenovo IdeaPad 330-15IGM · c057f758
      Mikhail Nikiforov authored
      commit 13c1c5e4 upstream.
      
      Add ELAN061C to the ACPI table to support Elan touchpad found in Lenovo
      IdeaPad 330-15IGM.
      Signed-off-by: Mikhail Nikiforov <jackxviichaos@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • USB: fix the usbfs flag sanitization for control transfers · 506617d9
      Alan Stern authored
      commit 665c365a upstream.
      
      Commit 7a68d9fb ("USB: usbdevfs: sanitize flags more") checks the
      transfer flags for URBs submitted from userspace via usbfs.  However,
      the check for whether the USBDEVFS_URB_SHORT_NOT_OK flag should be
      allowed for a control transfer was added in the wrong place, before
      the code has properly determined the direction of the control
      transfer.  (Control transfers are special because for them, the
      direction is set by the bRequestType byte of the Setup packet rather
      than by the direction bit of the endpoint address.)
      
      This patch moves code which sets up the allow_short flag for control
      transfers down after is_in has been set to the correct value.
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Reported-and-tested-by: syzbot+24a30223a4b609bb802e@syzkaller.appspotmail.com
      Fixes: 7a68d9fb ("USB: usbdevfs: sanitize flags more")
      CC: Oliver Neukum <oneukum@suse.com>
      CC: <stable@vger.kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • usb: gadget: storage: Fix Spectre v1 vulnerability · 87f8db65
      Gustavo A. R. Silva authored
      commit 9ae24af3 upstream.
      
      num can be indirectly controlled by user-space, hence leading to
      a potential exploitation of the Spectre variant 1 vulnerability.
      
      This issue was detected with the help of Smatch:
      
      drivers/usb/gadget/function/f_mass_storage.c:3177 fsg_lun_make() warn:
      potential spectre issue 'fsg_opts->common->luns' [r] (local cap)
      
      Fix this by sanitizing num before using it to index
      fsg_opts->common->luns.
      
      Notice that given that speculation windows are large, the policy is
      to kill the speculation on the first load and not worry if it can be
      completed with a dependent load/store [1].
      
      [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
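
      The sanitizing step can be illustrated with a branch-free index mask.
      In the kernel this is done with array_index_nospec() from
      <linux/nospec.h>. The userspace helper below is only modelled on the
      generic mask idea, and index_mask() and lookup_lun() are hypothetical
      names. It relies on arithmetic right shift of a negative long, as gcc
      and clang provide, and a userspace compiler gives no guarantee against
      speculation, so this shows the arithmetic rather than a hardened
      implementation.

```c
/* All ones if index < size, zero otherwise; computed without a branch. */
static unsigned long index_mask(unsigned long index, unsigned long size)
{
	return ~(long)(index | (size - 1UL - index)) >> (sizeof(long) * 8 - 1);
}

/* Hypothetical lookup: returns table[num], or -1 when num is out of range. */
static int lookup_lun(const int *table, unsigned long nr, unsigned long num)
{
	if (num >= nr)
		return -1;

	/* Clamp num so a mispredicted bounds check cannot index past nr. */
	num &= index_mask(num, nr);
	return table[num];
}
```

      The mask is applied after the bounds check and before the first
      dependent load, matching the policy of killing speculation on the
      first load.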
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Acked-by: Felipe Balbi <felipe.balbi@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • cdc-acm: correct counting of UART states in serial state notification · 17275e09
      Tobias Herzog authored
      commit f976d0e5 upstream.
      
      The USB standard ("Universal Serial Bus Class Definitions for Communication
      Devices") distinguishes between "consistent signals" (DSR, DCD) and
      "irregular signals" (break, ring, parity error, framing error, overrun).
      The bits of "irregular signals" are set if the error/event occurred on
      the device side, and are immediately unset once the serial state
      notification has been sent.
      As other drivers of real serial ports do, only the occurrence of those
      events should be counted in serial_icounter_struct (but not 1->0
      transitions).
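
      A small sketch of the counting rule, assuming illustrative bit values
      and hypothetical names (SIG_*, struct port, serial_state()); it is not
      the driver's code. Consistent signals are counted on every change,
      irregular signals only when the notification carries the bit set.

```c
#define SIG_DCD     0x01u
#define SIG_DSR     0x02u
#define SIG_BRK     0x04u
#define SIG_OVERRUN 0x40u

struct counters {
	unsigned dcd, dsr, brk, overrun;
};

struct port {
	unsigned state;        /* last reported serial state */
	struct counters cnt;
};

static void serial_state(struct port *p, unsigned newstate)
{
	unsigned changed = p->state ^ newstate;

	p->state = newstate;

	/* Consistent signals: every transition counts. */
	if (changed & SIG_DCD)
		p->cnt.dcd++;
	if (changed & SIG_DSR)
		p->cnt.dsr++;

	/* Irregular signals: count the event itself, never the 1->0 edge. */
	if (newstate & SIG_BRK)
		p->cnt.brk++;
	if (newstate & SIG_OVERRUN)
		p->cnt.overrun++;
}
```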
      Signed-off-by: Tobias Herzog <t-herzog@gmx.de>
      Acked-by: Oliver Neukum <oneukum@suse.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • IB/ucm: Fix Spectre v1 vulnerability · 17eb02cc
      Gustavo A. R. Silva authored
      commit 0295e395 upstream.
      
      hdr.cmd can be indirectly controlled by user-space, hence leading to
      a potential exploitation of the Spectre variant 1 vulnerability.
      
      This issue was detected with the help of Smatch:
      
      drivers/infiniband/core/ucm.c:1127 ib_ucm_write() warn: potential
      spectre issue 'ucm_cmd_table' [r] (local cap)
      
      Fix this by sanitizing hdr.cmd before using it to index
      ucm_cmd_table.
      
      Notice that given that speculation windows are large, the policy is
      to kill the speculation on the first load and not worry if it can be
      completed with a dependent load/store [1].
      
      [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • RDMA/ucma: Fix Spectre v1 vulnerability · 6ede39a8
      Gustavo A. R. Silva authored
      commit a3671a4f upstream.
      
      hdr.cmd can be indirectly controlled by user-space, hence leading to
      a potential exploitation of the Spectre variant 1 vulnerability.
      
      This issue was detected with the help of Smatch:
      
      drivers/infiniband/core/ucma.c:1686 ucma_write() warn: potential
      spectre issue 'ucma_cmd_table' [r] (local cap)
      
      Fix this by sanitizing hdr.cmd before using it to index
      ucma_cmd_table.
      
      Notice that given that speculation windows are large, the policy is
      to kill the speculation on the first load and not worry if it can be
      completed with a dependent load/store [1].
      
      [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ptp: fix Spectre v1 vulnerability · 3700bfc3
      Gustavo A. R. Silva authored
      commit efa61c8c upstream.
      
      pin_index can be indirectly controlled by user-space, hence leading
      to a potential exploitation of the Spectre variant 1 vulnerability.
      
      This issue was detected with the help of Smatch:
      
      drivers/ptp/ptp_chardev.c:253 ptp_ioctl() warn: potential spectre issue
      'ops->pin_config' [r] (local cap)
      
      Fix this by sanitizing pin_index before using it to index
      ops->pin_config, and before passing it as an argument to
      function ptp_set_pinfunc(), in which it is used to index
      info->pin_config.
      
      Notice that given that speculation windows are large, the policy is
      to kill the speculation on the first load and not worry if it can be
      completed with a dependent load/store [1].
      
      [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • cachefiles: fix the race between cachefiles_bury_object() and rmdir(2) · d1ce094c
      Al Viro authored
      commit 169b8033 upstream.
      
      The victim might have been rmdir'ed just before the lock_rename();
      unlike the normal callers, we do not look the source up after the
      parents are locked - we know it beforehand and just recheck that it's
      still the child of what used to be its parent.  Unfortunately,
      the check is too weak - we don't spot a dead directory since its
      ->d_parent is unchanged, dentry is positive, etc.  So we sail all
      the way to ->rename(), with hosting filesystems _not_ expecting
      to be asked to rename an rmdir'ed subdirectory.
      
      The fix is easy, fortunately - the lock on parent is sufficient for
      making IS_DEADDIR() on child safe.
      
      Cc: stable@vger.kernel.org
      Fixes: 9ae326a6 (CacheFiles: A cache that backs onto a mounted filesystem)
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ahci: don't ignore result code of ahci_reset_controller() · f2adb1f6
      Ard Biesheuvel authored
      [ Upstream commit d312fefe ]
      
      ahci_pci_reset_controller() calls ahci_reset_controller(), which may
      fail, but ignores the result code and always returns success. This
      may result in failures like the one below:
      
        ahci 0000:02:00.0: version 3.0
        ahci 0000:02:00.0: enabling device (0000 -> 0003)
        ahci 0000:02:00.0: SSS flag set, parallel bus scan disabled
        ahci 0000:02:00.0: controller reset failed (0xffffffff)
        ahci 0000:02:00.0: failed to stop engine (-5)
          ... repeated many times ...
        ahci 0000:02:00.0: failed to stop engine (-5)
        Unable to handle kernel paging request at virtual address ffff0000093f9018
          ...
        PC is at ahci_stop_engine+0x5c/0xd8 [libahci]
        LR is at ahci_deinit_port.constprop.12+0x1c/0xc0 [libahci]
          ...
        [<ffff000000a17014>] ahci_stop_engine+0x5c/0xd8 [libahci]
        [<ffff000000a196b4>] ahci_deinit_port.constprop.12+0x1c/0xc0 [libahci]
        [<ffff000000a197d8>] ahci_init_controller+0x80/0x168 [libahci]
        [<ffff000000a260f8>] ahci_pci_init_controller+0x60/0x68 [ahci]
        [<ffff000000a26f94>] ahci_init_one+0x75c/0xd88 [ahci]
        [<ffff000008430324>] local_pci_probe+0x3c/0xb8
        [<ffff000008431728>] pci_device_probe+0x138/0x170
        [<ffff000008585e54>] driver_probe_device+0x2dc/0x458
        [<ffff0000085860e4>] __driver_attach+0x114/0x118
        [<ffff000008583ca8>] bus_for_each_dev+0x60/0xa0
        [<ffff000008585638>] driver_attach+0x20/0x28
        [<ffff0000085850b0>] bus_add_driver+0x1f0/0x2a8
        [<ffff000008586ae0>] driver_register+0x60/0xf8
        [<ffff00000842f9b4>] __pci_register_driver+0x3c/0x48
        [<ffff000000a3001c>] ahci_pci_driver_init+0x1c/0x1000 [ahci]
        [<ffff000008083918>] do_one_initcall+0x38/0x120
      
      where an obvious hardware level failure results in an unnecessary 15 second
      delay and a subsequent crash.
      
      So record the result code of ahci_reset_controller() and relay it, rather
      than ignoring it.
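
      The difference between dropping and relaying the result code can be
      sketched as follows. All three functions are hypothetical stand-ins
      written for this illustration; only the -EIO value is taken from the
      "(-5)" in the log above.

```c
#include <errno.h>

/* Hypothetical stand-in for a controller reset that can fail. */
static int reset_controller(int hw_ok)
{
	return hw_ok ? 0 : -EIO;
}

/* Buggy shape: the result is dropped and success is always reported. */
static int pci_reset_ignoring(int hw_ok)
{
	reset_controller(hw_ok);
	return 0;
}

/* Fixed shape: record the result and relay it to the caller. */
static int pci_reset_checked(int hw_ok)
{
	int rc = reset_controller(hw_ok);

	if (rc)
		return rc;

	/* Chip-specific post-reset work would go here. */
	return 0;
}
```

      With the checked variant a caller can stop at the first failure
      instead of continuing to initialise ports on dead hardware.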
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • crypto: shash - Fix a sleep-in-atomic bug in shash_setkey_unaligned · 333de2f4
      Jia-Ju Bai authored
      [ Upstream commit 9039f3ef ]
      
      The SCTP program may sleep under a spinlock, and the function call path is:
      sctp_generate_t3_rtx_event (acquire the spinlock)
        sctp_do_sm
          sctp_side_effects
            sctp_cmd_interpreter
              sctp_make_init_ack
                sctp_pack_cookie
                  crypto_shash_setkey
                    shash_setkey_unaligned
                      kmalloc(GFP_KERNEL)
      
      For the same reason, the orinoco driver may sleep in an interrupt handler,
      and the function call path is:
      orinoco_rx_isr_tasklet
        orinoco_rx
          orinoco_mic
            crypto_shash_setkey
              shash_setkey_unaligned
                kmalloc(GFP_KERNEL)
      
      To fix it, GFP_KERNEL is replaced with GFP_ATOMIC.
      This bug was found by my static analysis tool and my code review.
      Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • mremap: properly flush TLB before releasing the page · 2e3ae534
      Linus Torvalds authored
      Commit eb66ae03 upstream.
      
      This is a backport to stable 4.4.y.
      
      Jann Horn points out that our TLB flushing was subtly wrong for the
      mremap() case.  What makes mremap() special is that we don't follow the
      usual "add page to list of pages to be freed, then flush tlb, and then
      free pages".  No, mremap() obviously just _moves_ the page from one page
      table location to another.
      
      That matters, because mremap() thus doesn't directly control the
      lifetime of the moved page with a freelist: instead, the lifetime of the
      page is controlled by the page table locking, that serializes access to
      the entry.
      
      As a result, we need to flush the TLB not just before releasing the lock
      for the source location (to avoid any concurrent accesses to the entry),
      but also before we release the destination page table lock (to avoid the
      TLB being flushed after somebody else has already done something to that
      page).
      
      This also makes the whole "need_flush" logic unnecessary, since we now
      always end up flushing the TLB for every valid entry.
      Reported-and-tested-by: Jann Horn <jannh@google.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Tested-by: Ingo Molnar <mingo@kernel.org>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      [will: backport to 4.4 stable]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • rtnetlink: Disallow FDB configuration for non-Ethernet device · abd46fca
      Ido Schimmel authored
      [ Upstream commit da715775 ]
      
      When an FDB entry is configured, the address is validated to have the
      length of an Ethernet address, but the device for which the address is
      configured can be of any type.
      
      The above can result in the use of uninitialized memory when the address
      is later compared against existing addresses since 'dev->addr_len' is
      used and it may be greater than ETH_ALEN, as with ip6tnl devices.
      
      Fix this by making sure that FDB entries are only configured for
      Ethernet devices.
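
      A sketch of the added validation, using a hypothetical struct netdev
      and fdb_add_check() helper. TYPE_ETHER and TYPE_TUNNEL6 are
      illustrative constants standing in for the kernel's ARPHRD_* device
      types, and the 16-byte address length models an IPv6 tunnel device.

```c
#include <errno.h>

#define TYPE_ETHER    1     /* illustrative: an Ethernet device */
#define TYPE_TUNNEL6  769   /* illustrative: an IPv6 tunnel device */
#define ETH_ADDR_LEN  6

struct netdev {
	int type;
	unsigned addr_len;      /* may exceed ETH_ADDR_LEN on non-Ethernet */
};

/* Returns 0 if an FDB entry may be configured, -EINVAL otherwise. */
static int fdb_add_check(const struct netdev *dev, unsigned lladdr_len)
{
	if (lladdr_len != ETH_ADDR_LEN)
		return -EINVAL;

	/* Reject before any compare that would read dev->addr_len bytes. */
	if (dev->type != TYPE_ETHER)
		return -EINVAL;

	return 0;
}
```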
      
      BUG: KMSAN: uninit-value in memcmp+0x11d/0x180 lib/string.c:863
      CPU: 1 PID: 4318 Comm: syz-executor998 Not tainted 4.19.0-rc3+ #49
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
      Google 01/01/2011
      Call Trace:
        __dump_stack lib/dump_stack.c:77 [inline]
        dump_stack+0x14b/0x190 lib/dump_stack.c:113
        kmsan_report+0x183/0x2b0 mm/kmsan/kmsan.c:956
        __msan_warning+0x70/0xc0 mm/kmsan/kmsan_instr.c:645
        memcmp+0x11d/0x180 lib/string.c:863
        dev_uc_add_excl+0x165/0x7b0 net/core/dev_addr_lists.c:464
        ndo_dflt_fdb_add net/core/rtnetlink.c:3463 [inline]
        rtnl_fdb_add+0x1081/0x1270 net/core/rtnetlink.c:3558
        rtnetlink_rcv_msg+0xa0b/0x1530 net/core/rtnetlink.c:4715
        netlink_rcv_skb+0x36e/0x5f0 net/netlink/af_netlink.c:2454
        rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:4733
        netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
        netlink_unicast+0x1638/0x1720 net/netlink/af_netlink.c:1343
        netlink_sendmsg+0x1205/0x1290 net/netlink/af_netlink.c:1908
        sock_sendmsg_nosec net/socket.c:621 [inline]
        sock_sendmsg net/socket.c:631 [inline]
        ___sys_sendmsg+0xe70/0x1290 net/socket.c:2114
        __sys_sendmsg net/socket.c:2152 [inline]
        __do_sys_sendmsg net/socket.c:2161 [inline]
        __se_sys_sendmsg+0x2a3/0x3d0 net/socket.c:2159
        __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2159
        do_syscall_64+0xb8/0x100 arch/x86/entry/common.c:291
        entry_SYSCALL_64_after_hwframe+0x63/0xe7
      RIP: 0033:0x440ee9
      Code: e8 cc ab 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7
      48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
      ff 0f 83 bb 0a fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007fff6a93b518 EFLAGS: 00000213 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000440ee9
      RDX: 0000000000000000 RSI: 0000000020000240 RDI: 0000000000000003
      RBP: 0000000000000000 R08: 00000000004002c8 R09: 00000000004002c8
      R10: 00000000004002c8 R11: 0000000000000213 R12: 000000000000b4b0
      R13: 0000000000401ec0 R14: 0000000000000000 R15: 0000000000000000
      
      Uninit was created at:
        kmsan_save_stack_with_flags mm/kmsan/kmsan.c:256 [inline]
        kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:181
        kmsan_kmalloc+0x98/0x100 mm/kmsan/kmsan_hooks.c:91
        kmsan_slab_alloc+0x10/0x20 mm/kmsan/kmsan_hooks.c:100
        slab_post_alloc_hook mm/slab.h:446 [inline]
        slab_alloc_node mm/slub.c:2718 [inline]
        __kmalloc_node_track_caller+0x9e7/0x1160 mm/slub.c:4351
        __kmalloc_reserve net/core/skbuff.c:138 [inline]
        __alloc_skb+0x2f5/0x9e0 net/core/skbuff.c:206
        alloc_skb include/linux/skbuff.h:996 [inline]
        netlink_alloc_large_skb net/netlink/af_netlink.c:1189 [inline]
        netlink_sendmsg+0xb49/0x1290 net/netlink/af_netlink.c:1883
        sock_sendmsg_nosec net/socket.c:621 [inline]
        sock_sendmsg net/socket.c:631 [inline]
        ___sys_sendmsg+0xe70/0x1290 net/socket.c:2114
        __sys_sendmsg net/socket.c:2152 [inline]
        __do_sys_sendmsg net/socket.c:2161 [inline]
        __se_sys_sendmsg+0x2a3/0x3d0 net/socket.c:2159
        __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2159
        do_syscall_64+0xb8/0x100 arch/x86/entry/common.c:291
        entry_SYSCALL_64_after_hwframe+0x63/0xe7
      
      v2:
      * Make error message more specific (David)
      
      Fixes: 090096bf ("net: generic fdb support for drivers without ndo_fdb_<op>")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reported-and-tested-by: syzbot+3a288d5f5530b901310e@syzkaller.appspotmail.com
      Reported-and-tested-by: syzbot+d53ab4e92a1db04110ff@syzkaller.appspotmail.com
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • vhost: Fix Spectre V1 vulnerability · 628a149b
      Jason Wang authored
      [ Upstream commit ff002269 ]
      
      The idx in vhost_vring_ioctl() was controlled by userspace, hence a
      potential exploitation of the Spectre variant 1 vulnerability.

      Fix this by sanitizing idx before using it to index d->vqs.
      
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: drop skb on failure in ip_check_defrag() · 8de8589c
      Cong Wang authored
      [ Upstream commit 7de414a9 ]
      
      Most callers of pskb_trim_rcsum() simply drop the skb when
      it fails. However, ip_check_defrag() still continues to pass
      the skb up to the stack. This is suspicious.

      In ip_check_defrag(), after we learn the skb is an IP fragment,
      passing the skb to callers makes no sense, because callers expect
      fragments to be defrag'ed on success. So, dropping the skb when we
      can't defrag it is reasonable.

      Note, prior to commit 88078d98, this was not a big problem as the
      checksum would be fixed up anyway. After it, the checksum is not
      correct on failure.
      
      Found this during code review.
      
      Fixes: 88078d98 ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends")
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sctp: fix race on sctp_id2asoc · fee37f15
      Marcelo Ricardo Leitner authored
      [ Upstream commit b336deca ]
      
      syzbot reported a use-after-free involving sctp_id2asoc.  Dmitry Vyukov
      helped to root-cause it: the asoc is read after it has been
      freed:
      
              CPU 1                       CPU 2
      (working on socket 1)            (working on socket 2)
                                       sctp_association_destroy
      sctp_id2asoc
         spin lock
           grab the asoc from idr
         spin unlock
                                         spin lock
                                           remove asoc from idr
                                         spin unlock
                                         free(asoc)
         if asoc->base.sk != sk ... [*]
      
      This can only be hit if trying to fetch asocs from different sockets. As
      we have a single IDR for all asocs, in all SCTP sockets, their id is
      unique on the system. An application can try to send stuff on an id
      that matches on another socket, and the if in [*] will protect from such
      usage. But it didn't consider that, because that asoc may belong to
      another socket, it may be freed in parallel (read: under another socket
      lock).
      
      We fix it by moving the checks in [*] into the protected region. This
      fixes it because the asoc cannot be freed while the lock is held.
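
      The reordering can be sketched with a single-threaded model. The
      counter below only stands in for the spinlock, and struct sock,
      struct assoc, idr_slot and id2asoc() are hypothetical simplifications.
      The point is where the ownership check sits relative to the unlock,
      not real concurrency.

```c
#include <stddef.h>

struct sock {
	int id;
};

struct assoc {
	struct sock *sk;   /* owning socket */
};

static int lock_depth;             /* stand-in for the IDR spinlock */
static struct assoc *idr_slot;     /* single-entry stand-in for the IDR */

static void idr_lock(void)   { lock_depth++; }
static void idr_unlock(void) { lock_depth--; }

static struct assoc *id2asoc(struct sock *sk)
{
	struct assoc *asoc;

	idr_lock();
	asoc = idr_slot;
	/* Ownership check done while the asoc cannot be removed and freed. */
	if (asoc && asoc->sk != sk)
		asoc = NULL;
	idr_unlock();

	return asoc;
}
```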
      
      Reported-by: syzbot+c7dd55d7aec49d48e49a@syzkaller.appspotmail.com
      Acked-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • r8169: fix NAPI handling under high load · 104ed9e3
      Heiner Kallweit authored
      [ Upstream commit 6b839b6c ]
      
      rtl_rx() and rtl_tx() are called only if the respective bits are set
      in the interrupt status register. Under high load NAPI may not be
      able to process all data (work_done == budget) and it will schedule
      subsequent calls to the poll callback.
      rtl_ack_events() however resets the bits in the interrupt status
      register, therefore subsequent calls to rtl8169_poll() won't call
      rtl_rx() and rtl_tx() - chip interrupts are still disabled.
      
      Fix this by calling rtl_rx() and rtl_tx() independent of the bits
      set in the interrupt status register. Both functions will detect
      if there's nothing to do for them.
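
      A simplified model of the fixed poll path, with hypothetical names
      (struct nic, do_rx(), do_tx(), nic_poll()) and pending counters in
      place of hardware rings. It shows why the handlers must not be gated
      on status bits that the acknowledge step has already cleared.

```c
#define EVT_RX 0x1u
#define EVT_TX 0x2u

struct nic {
	unsigned status;    /* interrupt status bits */
	int rx_pending;     /* packets waiting in the RX ring */
	int tx_pending;     /* completed TX descriptors to clean */
};

/* Processes up to budget packets; does nothing if none are pending. */
static int do_rx(struct nic *n, int budget)
{
	int done = n->rx_pending < budget ? n->rx_pending : budget;

	n->rx_pending -= done;
	return done;
}

static void do_tx(struct nic *n)
{
	n->tx_pending = 0;
}

/* Returns work done; the caller polls again when it equals budget. */
static int nic_poll(struct nic *n, int budget)
{
	int work_done;

	n->status = 0;                  /* acknowledging clears the bits */

	/* Called unconditionally: a repeat poll sees no status bits. */
	work_done = do_rx(n, budget);
	do_tx(n);

	return work_done;
}
```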
      
      Fixes: da78dbff ("r8169: remove work from irq handler.")
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: stmmac: Fix stmmac_mdio_reset() when building stmmac as modules · d5df0bdc
      Niklas Cassel authored
      [ Upstream commit 30549aab ]
      
      When building stmmac, it is only possible to select CONFIG_DWMAC_GENERIC,
      or any of the glue drivers, when CONFIG_STMMAC_PLATFORM is set.
      The only exception is CONFIG_STMMAC_PCI.
      
      When calling of_mdiobus_register(), it will call our ->reset()
      callback, which is set to stmmac_mdio_reset().
      
      Most of the code in stmmac_mdio_reset() is protected by a
      "#if defined(CONFIG_STMMAC_PLATFORM)", which will evaluate
      to false when CONFIG_STMMAC_PLATFORM=m.
      
      Because of this, the phy reset gpio will only be pulled when
      stmmac is built in, but not when it is built as modules.
      
      Fix this by using "#if IS_ENABLED()" instead of "#if defined()".
      Signed-off-by: Niklas Cassel <niklas.cassel@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d5df0bdc
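The difference between "#if defined()" and "#if IS_ENABLED()" in the stmmac commit above can be reproduced outside the kernel. The following is a minimal sketch: the macros are re-created here under non-reserved names in the style of the kernel's include/linux/kconfig.h (the simple `||` form is used for brevity), and the two `reset_code_compiled_*()` functions are hypothetical helpers. It assumes, as Kconfig does for a tristate option set to "m", that only the `_MODULE` variant of the symbol is defined.

```c
/* What a build with CONFIG_STMMAC_PLATFORM=m sees: only the _MODULE symbol. */
#define CONFIG_STMMAC_PLATFORM_MODULE 1

/* Userspace re-creation of the kconfig.h detection trick (illustrative names). */
#define ARG_PLACEHOLDER_1 0,
#define TAKE_SECOND_ARG(ignored, val, ...) val
#define IS_DEFINED_3(arg1_or_junk) TAKE_SECOND_ARG(arg1_or_junk 1, 0)
#define IS_DEFINED_2(val) IS_DEFINED_3(ARG_PLACEHOLDER_##val)
#define IS_DEFINED(x) IS_DEFINED_2(x)

#define IS_BUILTIN(option) IS_DEFINED(option)
#define IS_MODULE(option) IS_DEFINED(option##_MODULE)
#define IS_ENABLED(option) (IS_BUILTIN(option) || IS_MODULE(option))

/* Returns 1 if code guarded by "#if defined()" would be compiled in. */
int reset_code_compiled_with_defined(void)
{
#if defined(CONFIG_STMMAC_PLATFORM)
    return 1;
#else
    return 0; /* taken for =m: CONFIG_STMMAC_PLATFORM itself is not defined */
#endif
}

/* Returns 1 if code guarded by "#if IS_ENABLED()" would be compiled in. */
int reset_code_compiled_with_is_enabled(void)
{
#if IS_ENABLED(CONFIG_STMMAC_PLATFORM)
    return 1; /* taken for =m: the _MODULE symbol is detected */
#else
    return 0;
#endif
}
```

With only `CONFIG_STMMAC_PLATFORM_MODULE` defined, the first function returns 0 and the second returns 1, which matches the behaviour the commit message describes for modular builds.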
    • net: socket: fix a missing-check bug · 98528072
      Wenwen Wang authored
      [ Upstream commit b6168562 ]
      
      In ethtool_ioctl(), the ioctl command 'ethcmd' is checked through a switch
      statement to see whether it is necessary to pre-process the ethtool
      structure, because, as mentioned in the comment, the structure
      ethtool_rxnfc is defined with padding. If yes, a user-space buffer 'rxnfc'
      is allocated through compat_alloc_user_space(). One thing to note here is
      that, if 'ethcmd' is ETHTOOL_GRXCLSRLALL, the size of the buffer 'rxnfc' is
      partially determined by 'rule_cnt', which is actually acquired from the
      user-space buffer 'compat_rxnfc', i.e., 'compat_rxnfc->rule_cnt', through
      get_user(). After 'rxnfc' is allocated, the data in the original user-space
      buffer 'compat_rxnfc' is then copied to 'rxnfc' through copy_in_user(),
      including the 'rule_cnt' field. However, after this copy, no check is
      re-enforced on 'rxnfc->rule_cnt'. So it is possible for a malicious user
      to race to change the value of 'compat_rxnfc->rule_cnt' between these two
      copies. In this way, the attacker can bypass the previous check on
      'rule_cnt' and inject malicious data. This can cause undefined behavior in
      the kernel and introduce a potential security risk.
      
      This patch avoids the above issue by copying the value acquired by
      get_user() to 'rxnfc->rule_cnt', if 'ethcmd' is ETHTOOL_GRXCLSRLALL.
      Signed-off-by: Wenwen Wang <wang6495@umn.edu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      98528072
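The double-fetch pattern described in the commit message above, and the shape of the fix, can be modelled in plain userspace C. This is a deterministic sketch rather than the kernel code: `struct fake_rxnfc`, `FAKE_MAX_RULES` and `rule_cnt_after_copy()` are hypothetical, the limit is an arbitrary illustrative value, and the concurrent user-space write is simulated by a plain assignment between the two reads.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the user-space structure; only rule_cnt matters. */
struct fake_rxnfc {
    uint32_t cmd;
    uint32_t rule_cnt;
};

#define FAKE_MAX_RULES 16u /* illustrative limit, not the kernel's real bound */

/*
 * Models: fetch rule_cnt, check it, then copy the whole structure again.
 * 'racing_value' is what a concurrent writer stores between the two reads.
 * Returns the rule_cnt that ends up in the copy, or -1 if the check fails.
 */
long rule_cnt_after_copy(uint32_t initial, uint32_t racing_value, int apply_fix)
{
    struct fake_rxnfc user_buf = { .cmd = 0, .rule_cnt = initial };
    struct fake_rxnfc copy;
    uint32_t rule_cnt;

    /* First fetch (get_user() in the commit message), followed by the check. */
    rule_cnt = user_buf.rule_cnt;
    if (rule_cnt > FAKE_MAX_RULES)
        return -1;

    /* Simulated race: the user buffer changes after it was checked. */
    user_buf.rule_cnt = racing_value;

    /* Second fetch (copy_in_user() in the commit message). */
    memcpy(&copy, &user_buf, sizeof(copy));

    /* Shape of the fix: overwrite the field with the already-checked value. */
    if (apply_fix)
        copy.rule_cnt = rule_cnt;

    return (long)copy.rule_cnt;
}
```

Without the fix, a checked value of 4 can be replaced by an unchecked 1000 in the copy; with the fix, the copy keeps the value that passed the check.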