1. 17 Dec, 2018 40 commits
    • Paolo Bonzini's avatar
      KVM: VMX: introduce alloc_loaded_vmcs · 8f54df97
      Paolo Bonzini authored
      commit f21f165e upstream.
      
      Group together the calls to alloc_vmcs and loaded_vmcs_init.  Soon we'll also
      allocate an MSR bitmap there.
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarDavid Woodhouse <dwmw@amazon.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      [bwh: Backported to 4.4:
       - No loaded_vmcs::shadow_vmcs field to initialise
       - Adjust context]
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8f54df97
    • Jim Mattson's avatar
      KVM: nVMX: Eliminate vmcs02 pool · 81cd4926
      Jim Mattson authored
      commit de3a0021 upstream.
      
      The potential performance advantages of a vmcs02 pool have never been
      realized. To simplify the code, eliminate the pool. Instead, a single
      vmcs02 is allocated per VCPU when the VCPU enters VMX operation.
      Signed-off-by: default avatarJim Mattson <jmattson@google.com>
      Signed-off-by: default avatarMark Kanda <mark.kanda@oracle.com>
      Reviewed-by: default avatarAmeya More <ameya.more@oracle.com>
      Reviewed-by: default avatarDavid Hildenbrand <david@redhat.com>
      Reviewed-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarDavid Woodhouse <dwmw@amazon.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      [bwh: Backported to 4.4:
       - No loaded_vmcs::shadow_vmcs field to initialise
       - Adjust context]
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      81cd4926
    • David Matlack's avatar
      KVM: nVMX: mark vmcs12 pages dirty on L2 exit · 1bec1a14
      David Matlack authored
      commit c9f04407 upstream.
      
      The host physical addresses of L1's Virtual APIC Page and Posted
      Interrupt descriptor are loaded into the VMCS02. The CPU may write
      to these pages via their host physical address while L2 is running,
      bypassing address-translation-based dirty tracking (e.g. EPT write
      protection). Mark them dirty on every exit from L2 to prevent them
      from getting out of sync with dirty tracking.
      
      Also mark the virtual APIC page and the posted interrupt descriptor
      dirty when KVM is virtualizing posted interrupt processing.
      Signed-off-by: default avatarDavid Matlack <dmatlack@google.com>
      Reviewed-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1bec1a14
    • Radim Krčmář's avatar
      KVM: nVMX: fix msr bitmaps to prevent L2 from accessing L0 x2APIC · e90c6ad2
      Radim Krčmář authored
      commit d048c098 upstream.
      
      msr bitmap can be used to avoid a VM exit (interception) on guest MSR
      accesses.  In some configurations of VMX controls, the guest can even
      directly access host's x2APIC MSRs.  See SDM 29.5 VIRTUALIZING MSR-BASED
      APIC ACCESSES.
      
      L2 could read all L0's x2APIC MSRs and write TPR, EOI, and SELF_IPI.
      To do so, L1 would first trick KVM to disable all possible interceptions
      by enabling APICv features and then would turn those features off;
      nested_vmx_merge_msr_bitmap() only disabled interceptions, so VMX would
      not intercept previously enabled MSRs even though they were not safe
      with the new configuration.
      
      Correctly re-enabling interceptions is not enough as a second bug would
      still allow L1+L2 to access host's MSRs: msr bitmap was shared for all
      VMCSs, so L1 could trigger a race to get the desired combination of msr
      bitmap and VMX controls.
      
      This fix allocates a msr bitmap for every L1 VCPU, allows only safe
      x2APIC MSRs from L1's msr bitmap, and disables msr bitmaps if they would
      have to intercept everything anyway.
      
      Fixes: 3af18d9c ("KVM: nVMX: Prepare for using hardware MSR bitmap")
      Reported-by: default avatarJim Mattson <jmattson@google.com>
      Suggested-by: default avatarWincy Van <fanwenyi0529@gmail.com>
      Reviewed-by: default avatarWanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      [bwh: Backported to 4.4:
       - handle_vmon() doesn't allocate a cached vmcs12
       - Adjust context]
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e90c6ad2
    • Takashi Sakamoto's avatar
      ALSA: pcm: remove SNDRV_PCM_IOCTL1_INFO internal command · 8420459f
      Takashi Sakamoto authored
      commit e11f0f90 upstream.
      
      Drivers can implement 'struct snd_pcm_ops.ioctl' to handle some requests
      from ALSA PCM core. These requests are internal purpose in kernel land.
      Usually common set of operations are used for it.
      
      SNDRV_PCM_IOCTL1_INFO is one of the requests. According to code comment,
      it has been obsoleted in the old days.
      
      We can see old releases in ftp.alsa-project.org. The command was firstly
      introduced in v0.5.0 release as SND_PCM_IOCTL1_INFO, to allow drivers to
      fill data of 'struct snd_pcm_channel_info' type. In v0.9.0 release,
      this was obsoleted by the other commands for ioctl(2) such as
      SNDRV_PCM_IOCTL_CHANNEL_INFO.
      
      This commit removes the long-abandoned command, bye.
      Signed-off-by: default avatarTakashi Sakamoto <o-takashi@sakamocchi.jp>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8420459f
    • Namhyung Kim's avatar
      pstore: Convert console write to use ->write_buf · 7f0324fb
      Namhyung Kim authored
      [ Upstream commit 70ad35db ]
      
      Maybe I'm missing something, but I don't know why it needs to copy the
      input buffer to psinfo->buf and then write.  Instead we can write the
      input buffer directly.  The only implementation that supports console
      message (i.e. ramoops) already does it for ftrace messages.
      
      For the upcoming virtio backend driver, it needs to protect psinfo->buf
      overwritten from console messages.  If it could use ->write_buf method
      instead of ->write, the problem will be solved easily.
      
      Cc: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      7f0324fb
    • Pan Bian's avatar
      ocfs2: fix potential use after free · 466570dc
      Pan Bian authored
      [ Upstream commit 164f7e58 ]
      
      ocfs2_get_dentry() calls iput(inode) to drop the reference count of
      inode, and if the reference count hits 0, inode is freed.  However, in
      this function, it then reads inode->i_generation, which may result in a
      use after free bug.  Move the put operation later.
      
      Link: http://lkml.kernel.org/r/1543109237-110227-1-git-send-email-bianpan2016@163.com
      Fixes: 781f200c("ocfs2: Remove masklog ML_EXPORT.")
      Signed-off-by: default avatarPan Bian <bianpan2016@163.com>
      Reviewed-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Joseph Qi <jiangqi903@gmail.com>
      Cc: Changwei Ge <ge.changwei@h3c.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      466570dc
    • Qian Cai's avatar
      debugobjects: avoid recursive calls with kmemleak · 5d8fe653
      Qian Cai authored
      [ Upstream commit 8de456cf ]
      
      CONFIG_DEBUG_OBJECTS_RCU_HEAD does not play well with kmemleak due to
      recursive calls.
      
      fill_pool
        kmemleak_ignore
          make_black_object
            put_object
              __call_rcu (kernel/rcu/tree.c)
                debug_rcu_head_queue
                  debug_object_activate
                    debug_object_init
                      fill_pool
                        kmemleak_ignore
                          make_black_object
                            ...
      
      So add SLAB_NOLEAKTRACE to kmem_cache_create() to not register newly
      allocated debug objects at all.
      
      Link: http://lkml.kernel.org/r/20181126165343.2339-1-cai@gmx.usSigned-off-by: default avatarQian Cai <cai@gmx.us>
      Suggested-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Acked-by: default avatarWaiman Long <longman@redhat.com>
      Acked-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yang Shi <yang.shi@linux.alibaba.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      5d8fe653
    • Pan Bian's avatar
      hfsplus: do not free node before using · 168a1537
      Pan Bian authored
      [ Upstream commit c7d7d620 ]
      
      hfs_bmap_free() frees node via hfs_bnode_put(node).  However it then
      reads node->this when dumping error message on an error path, which may
      result in a use-after-free bug.  This patch frees node only when it is
      never used.
      
      Link: http://lkml.kernel.org/r/1543053441-66942-1-git-send-email-bianpan2016@163.comSigned-off-by: default avatarPan Bian <bianpan2016@163.com>
      Reviewed-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Cc: Ernesto A. Fernandez <ernesto.mnd.fernandez@gmail.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Viacheslav Dubeyko <slava@dubeyko.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      168a1537
    • Pan Bian's avatar
      hfs: do not free node before using · 736ba5cb
      Pan Bian authored
      [ Upstream commit ce96a407 ]
      
      hfs_bmap_free() frees the node via hfs_bnode_put(node).  However, it
      then reads node->this when dumping error message on an error path, which
      may result in a use-after-free bug.  This patch frees the node only when
      it is never again used.
      
      Link: http://lkml.kernel.org/r/1542963889-128825-1-git-send-email-bianpan2016@163.com
      Fixes: a1185ffa2fc ("HFS rewrite")
      Signed-off-by: default avatarPan Bian <bianpan2016@163.com>
      Reviewed-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Cc: Joe Perches <joe@perches.com>
      Cc: Ernesto A. Fernandez <ernesto.mnd.fernandez@gmail.com>
      Cc: Viacheslav Dubeyko <slava@dubeyko.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      736ba5cb
    • Larry Chen's avatar
      ocfs2: fix deadlock caused by ocfs2_defrag_extent() · 9549f09b
      Larry Chen authored
      [ Upstream commit e21e5744 ]
      
      ocfs2_defrag_extent may fall into deadlock.
      
      ocfs2_ioctl_move_extents
          ocfs2_ioctl_move_extents
            ocfs2_move_extents
              ocfs2_defrag_extent
                ocfs2_lock_allocators_move_extents
      
                  ocfs2_reserve_clusters
                    inode_lock GLOBAL_BITMAP_SYSTEM_INODE
      
      	  __ocfs2_flush_truncate_log
                    inode_lock GLOBAL_BITMAP_SYSTEM_INODE
      
      As backtrace shows above, ocfs2_reserve_clusters() will call inode_lock
      against the global bitmap if local allocator has not sufficient cluters.
      Once global bitmap could meet the demand, ocfs2_reserve_cluster will
      return success with global bitmap locked.
      
      After ocfs2_reserve_cluster(), if truncate log is full,
      __ocfs2_flush_truncate_log() will definitely fall into deadlock because
      it needs to inode_lock global bitmap, which has already been locked.
      
      To fix this bug, we could remove from
      ocfs2_lock_allocators_move_extents() the code which intends to lock
      global allocator, and put the removed code after
      __ocfs2_flush_truncate_log().
      
      ocfs2_lock_allocators_move_extents() is referred by 2 places, one is
      here, the other does not need the data allocator context, which means
      this patch does not affect the caller so far.
      
      Link: http://lkml.kernel.org/r/20181101071422.14470-1-lchen@suse.comSigned-off-by: default avatarLarry Chen <lchen@suse.com>
      Reviewed-by: default avatarChangwei Ge <ge.changwei@h3c.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Joseph Qi <jiangqi903@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      9549f09b
    • Colin Ian King's avatar
      fscache, cachefiles: remove redundant variable 'cache' · 8f1ee755
      Colin Ian King authored
      [ Upstream commit 31ffa563 ]
      
      Variable 'cache' is being assigned but is never used hence it is
      redundant and can be removed.
      
      Cleans up clang warning:
      warning: variable 'cache' set but not used [-Wunused-but-set-variable]
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      8f1ee755
    • NeilBrown's avatar
      fscache: fix race between enablement and dropping of object · b1ef956a
      NeilBrown authored
      [ Upstream commit c5a94f43 ]
      
      It was observed that a process blocked indefintely in
      __fscache_read_or_alloc_page(), waiting for FSCACHE_COOKIE_LOOKING_UP
      to be cleared via fscache_wait_for_deferred_lookup().
      
      At this time, ->backing_objects was empty, which would normaly prevent
      __fscache_read_or_alloc_page() from getting to the point of waiting.
      This implies that ->backing_objects was cleared *after*
      __fscache_read_or_alloc_page was was entered.
      
      When an object is "killed" and then "dropped",
      FSCACHE_COOKIE_LOOKING_UP is cleared in fscache_lookup_failure(), then
      KILL_OBJECT and DROP_OBJECT are "called" and only in DROP_OBJECT is
      ->backing_objects cleared.  This leaves a window where
      something else can set FSCACHE_COOKIE_LOOKING_UP and
      __fscache_read_or_alloc_page() can start waiting, before
      ->backing_objects is cleared
      
      There is some uncertainty in this analysis, but it seems to be fit the
      observations.  Adding the wake in this patch will be handled correctly
      by __fscache_read_or_alloc_page(), as it checks if ->backing_objects
      is empty again, after waiting.
      
      Customer which reported the hang, also report that the hang cannot be
      reproduced with this fix.
      
      The backtrace for the blocked process looked like:
      
      PID: 29360  TASK: ffff881ff2ac0f80  CPU: 3   COMMAND: "zsh"
       #0 [ffff881ff43efbf8] schedule at ffffffff815e56f1
       #1 [ffff881ff43efc58] bit_wait at ffffffff815e64ed
       #2 [ffff881ff43efc68] __wait_on_bit at ffffffff815e61b8
       #3 [ffff881ff43efca0] out_of_line_wait_on_bit at ffffffff815e625e
       #4 [ffff881ff43efd08] fscache_wait_for_deferred_lookup at ffffffffa04f2e8f [fscache]
       #5 [ffff881ff43efd18] __fscache_read_or_alloc_page at ffffffffa04f2ffe [fscache]
       #6 [ffff881ff43efd58] __nfs_readpage_from_fscache at ffffffffa0679668 [nfs]
       #7 [ffff881ff43efd78] nfs_readpage at ffffffffa067092b [nfs]
       #8 [ffff881ff43efda0] generic_file_read_iter at ffffffff81187a73
       #9 [ffff881ff43efe50] nfs_file_read at ffffffffa066544b [nfs]
      #10 [ffff881ff43efe70] __vfs_read at ffffffff811fc756
      #11 [ffff881ff43efee8] vfs_read at ffffffff811fccfa
      #12 [ffff881ff43eff18] sys_read at ffffffff811fda62
      #13 [ffff881ff43eff50] entry_SYSCALL_64_fastpath at ffffffff815e986e
      Signed-off-by: default avatarNeilBrown <neilb@suse.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      b1ef956a
    • Srikanth Boddepalli's avatar
      xen: xlate_mmu: add missing header to fix 'W=1' warning · 26e08148
      Srikanth Boddepalli authored
      [ Upstream commit 72791ac8 ]
      
      Add a missing header otherwise compiler warns about missed prototype:
      
      drivers/xen/xlate_mmu.c:183:5: warning: no previous prototype for 'xen_xlate_unmap_gfn_range?' [-Wmissing-prototypes]
        int xen_xlate_unmap_gfn_range(struct vm_area_struct *vma,
            ^~~~~~~~~~~~~~~~~~~~~~~~~
      Signed-off-by: default avatarSrikanth Boddepalli <boddepalli.srikanth@gmail.com>
      Reviewed-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Reviewed-by: default avatarJoey Pabalinas <joeypabalinas@gmail.com>
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      26e08148
    • Y.C. Chen's avatar
      drm/ast: fixed reading monitor EDID not stable issue · 0f1b4c74
      Y.C. Chen authored
      [ Upstream commit 30062562 ]
      
      v1: over-sample data to increase the stability with some specific monitors
      v2: refine to avoid infinite loop
      v3: remove un-necessary "volatile" declaration
      
      [airlied: fix two checkpatch warnings]
      Signed-off-by: default avatarY.C. Chen <yc_chen@aspeedtech.com>
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/1542858988-1127-1-git-send-email-yc_chen@aspeedtech.comSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      0f1b4c74
    • Pan Bian's avatar
      net: hisilicon: remove unexpected free_netdev · 573f19f1
      Pan Bian authored
      [ Upstream commit c7589401 ]
      
      The net device ndev is freed via free_netdev when failing to register
      the device. The control flow then jumps to the error handling code
      block. ndev is used and freed again. Resulting in a use-after-free bug.
      Signed-off-by: default avatarPan Bian <bianpan2016@163.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      573f19f1
    • Josh Elsasser's avatar
      ixgbe: recognize 1000BaseLX SFP modules as 1Gbps · 553c8675
      Josh Elsasser authored
      [ Upstream commit a8bf879a ]
      
      Add the two 1000BaseLX enum values to the X550's check for 1Gbps modules,
      allowing the core driver code to establish a link over this SFP type.
      
      This is done by the out-of-tree driver but the fix wasn't in mainline.
      
      Fixes: e23f3336 ("ixgbe: Fix 1G and 10G link stability for X550EM_x SFP+”)
      Fixes: 6a14ee0c ("ixgbe: Add X550 support function pointers")
      Signed-off-by: default avatarJosh Elsasser <jelsasser@appneta.com>
      Tested-by: default avatarAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      553c8675
    • Lorenzo Bianconi's avatar
      net: thunderx: fix NULL pointer dereference in nic_remove · 6a1fb1bf
      Lorenzo Bianconi authored
      [ Upstream commit 24a6d2dd ]
      
      Fix a possible NULL pointer dereference in nic_remove routine
      removing the nicpf module if nic_probe fails.
      The issue can be triggered with the following reproducer:
      
      $rmmod nicvf
      $rmmod nicpf
      
      [  521.412008] Unable to handle kernel access to user memory outside uaccess routines at virtual address 0000000000000014
      [  521.422777] Mem abort info:
      [  521.425561]   ESR = 0x96000004
      [  521.428624]   Exception class = DABT (current EL), IL = 32 bits
      [  521.434535]   SET = 0, FnV = 0
      [  521.437579]   EA = 0, S1PTW = 0
      [  521.440730] Data abort info:
      [  521.443603]   ISV = 0, ISS = 0x00000004
      [  521.447431]   CM = 0, WnR = 0
      [  521.450417] user pgtable: 4k pages, 48-bit VAs, pgdp = 0000000072a3da42
      [  521.457022] [0000000000000014] pgd=0000000000000000
      [  521.461916] Internal error: Oops: 96000004 [#1] SMP
      [  521.511801] Hardware name: GIGABYTE H270-T70/MT70-HD0, BIOS T49 02/02/2018
      [  521.518664] pstate: 80400005 (Nzcv daif +PAN -UAO)
      [  521.523451] pc : nic_remove+0x24/0x88 [nicpf]
      [  521.527808] lr : pci_device_remove+0x48/0xd8
      [  521.532066] sp : ffff000013433cc0
      [  521.535370] x29: ffff000013433cc0 x28: ffff810f6ac50000
      [  521.540672] x27: 0000000000000000 x26: 0000000000000000
      [  521.545974] x25: 0000000056000000 x24: 0000000000000015
      [  521.551274] x23: ffff8007ff89a110 x22: ffff000001667070
      [  521.556576] x21: ffff8007ffb170b0 x20: ffff8007ffb17000
      [  521.561877] x19: 0000000000000000 x18: 0000000000000025
      [  521.567178] x17: 0000000000000000 x16: 000000000000010ffc33ff98 x8 : 0000000000000000
      [  521.593683] x7 : 0000000000000000 x6 : 0000000000000001
      [  521.598983] x5 : 0000000000000002 x4 : 0000000000000003
      [  521.604284] x3 : ffff8007ffb17184 x2 : ffff8007ffb17184
      [  521.609585] x1 : ffff000001662118 x0 : ffff000008557be0
      [  521.614887] Process rmmod (pid: 1897, stack limit = 0x00000000859535c3)
      [  521.621490] Call trace:
      [  521.623928]  nic_remove+0x24/0x88 [nicpf]
      [  521.627927]  pci_device_remove+0x48/0xd8
      [  521.631847]  device_release_driver_internal+0x1b0/0x248
      [  521.637062]  driver_detach+0x50/0xc0
      [  521.640628]  bus_remove_driver+0x60/0x100
      [  521.644627]  driver_unregister+0x34/0x60
      [  521.648538]  pci_unregister_driver+0x24/0xd8
      [  521.652798]  nic_cleanup_module+0x14/0x111c [nicpf]
      [  521.657672]  __arm64_sys_delete_module+0x150/0x218
      [  521.662460]  el0_svc_handler+0x94/0x110
      [  521.666287]  el0_svc+0x8/0xc
      [  521.669160] Code: aa1e03e0 9102c295 d503201f f9404eb3 (b9401660)
      
      Fixes: 4863dea3 ("net: Adding support for Cavium ThunderX network controller")
      Signed-off-by: default avatarLorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      6a1fb1bf
    • Yi Wang's avatar
      KVM: x86: fix empty-body warnings · afca87e8
      Yi Wang authored
      [ Upstream commit 354cb410 ]
      
      We get the following warnings about empty statements when building
      with 'W=1':
      
      arch/x86/kvm/lapic.c:632:53: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
      arch/x86/kvm/lapic.c:1907:42: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
      arch/x86/kvm/lapic.c:1936:65: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
      arch/x86/kvm/lapic.c:1975:44: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
      
      Rework the debug helper macro to get rid of these warnings.
      Signed-off-by: default avatarYi Wang <wang.yi59@zte.com.cn>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      afca87e8
    • Aaro Koskinen's avatar
      USB: omap_udc: fix USB gadget functionality on Palm Tungsten E · 026e6bd1
      Aaro Koskinen authored
      [ Upstream commit 2c2322fb ]
      
      On Palm TE nothing happens when you try to use gadget drivers and plug
      the USB cable. Fix by adding the board to the vbus sense quirk list.
      Signed-off-by: default avatarAaro Koskinen <aaro.koskinen@iki.fi>
      Signed-off-by: default avatarFelipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      026e6bd1
    • Aaro Koskinen's avatar
      USB: omap_udc: fix omap_udc_start() on 15xx machines · 09542b32
      Aaro Koskinen authored
      [ Upstream commit 6ca6695f ]
      
      On OMAP 15xx machines there are no transceivers, and omap_udc_start()
      always fails as it forgot to adjust the default return value.
      Signed-off-by: default avatarAaro Koskinen <aaro.koskinen@iki.fi>
      Signed-off-by: default avatarFelipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      09542b32
    • Aaro Koskinen's avatar
      USB: omap_udc: fix crashes on probe error and module removal · 39d59193
      Aaro Koskinen authored
      [ Upstream commit 99f70036 ]
      
      We currently crash if usb_add_gadget_udc_release() fails, since the
      udc->done is not initialized until in the remove function.
      Furthermore, on module removal the udc data is accessed although
      the release function is already triggered by usb_del_gadget_udc()
      early in the function.
      
      Fix by rewriting the release and remove functions, basically moving
      all the cleanup into the release function, and doing the completion
      only in the module removal case.
      
      The patch fixes omap_udc module probe with a failing gadged, and also
      allows the removal of omap_udc. Tested by running "modprobe omap_udc;
      modprobe -r omap_udc" in a loop.
      Signed-off-by: default avatarAaro Koskinen <aaro.koskinen@iki.fi>
      Signed-off-by: default avatarFelipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      39d59193
    • Aaro Koskinen's avatar
      USB: omap_udc: use devm_request_irq() · a20d7a30
      Aaro Koskinen authored
      [ Upstream commit 286afdde ]
      
      The current code fails to release the third irq on the error path
      (observed by reading the code), and we get also multiple WARNs with
      failing gadget drivers due to duplicate IRQ releases. Fix by using
      devm_request_irq().
      Signed-off-by: default avatarAaro Koskinen <aaro.koskinen@iki.fi>
      Signed-off-by: default avatarFelipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      a20d7a30
    • Martynas Pumputis's avatar
      bpf: fix check of allowed specifiers in bpf_trace_printk · ac86c99c
      Martynas Pumputis authored
      [ Upstream commit 1efb6ee3 ]
      
      A format string consisting of "%p" or "%s" followed by an invalid
      specifier (e.g. "%p%\n" or "%s%") could pass the check which
      would make format_decode (lib/vsprintf.c) to warn.
      
      Fixes: 9c959c86 ("tracing: Allow BPF programs to call bpf_trace_printk()")
      Reported-by: syzbot+1ec5c5ec949c4adaa0c4@syzkaller.appspotmail.com
      Signed-off-by: default avatarMartynas Pumputis <m@lambda.lt>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      ac86c99c
    • Pan Bian's avatar
      exportfs: do not read dentry after free · d4327131
      Pan Bian authored
      [ Upstream commit 2084ac6c ]
      
      The function dentry_connected calls dput(dentry) to drop the previously
      acquired reference to dentry. In this case, dentry can be released.
      After that, IS_ROOT(dentry) checks the condition
      (dentry == dentry->d_parent), which may result in a use-after-free bug.
      This patch directly compares dentry with its parent obtained before
      dropping the reference.
      
      Fixes: a056cc89("exportfs: stop retrying once we race with
      rename/remove")
      Signed-off-by: default avatarPan Bian <bianpan2016@163.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      d4327131
    • Peter Ujfalusi's avatar
      ASoC: omap-dmic: Add pm_qos handling to avoid overruns with CPU_IDLE · 476bea0f
      Peter Ujfalusi authored
      [ Upstream commit ffdcc363 ]
      
      We need to block sleep states which would require longer time to leave than
      the time the DMA must react to the DMA request in order to keep the FIFO
      serviced without overrun.
      Signed-off-by: default avatarPeter Ujfalusi <peter.ujfalusi@ti.com>
      Acked-by: default avatarJarkko Nikula <jarkko.nikula@bitmer.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      476bea0f
    • Peter Ujfalusi's avatar
      ASoC: omap-mcpdm: Add pm_qos handling to avoid under/overruns with CPU_IDLE · 3aa69785
      Peter Ujfalusi authored
      [ Upstream commit 373a500e ]
      
      We need to block sleep states which would require longer time to leave than
      the time the DMA must react to the DMA request in order to keep the FIFO
      serviced without under of overrun.
      Signed-off-by: default avatarPeter Ujfalusi <peter.ujfalusi@ti.com>
      Acked-by: default avatarJarkko Nikula <jarkko.nikula@bitmer.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      3aa69785
    • Robbie Ko's avatar
      Btrfs: send, fix infinite loop due to directory rename dependencies · 44b02439
      Robbie Ko authored
      [ Upstream commit a4390aee ]
      
      When doing an incremental send, due to the need of delaying directory move
      (rename) operations we can end up in infinite loop at
      apply_children_dir_moves().
      
      An example scenario that triggers this problem is described below, where
      directory names correspond to the numbers of their respective inodes.
      
      Parent snapshot:
      
       .
       |--- 261/
             |--- 271/
                   |--- 266/
                         |--- 259/
                         |--- 260/
                         |     |--- 267
                         |
                         |--- 264/
                         |     |--- 258/
                         |           |--- 257/
                         |
                         |--- 265/
                         |--- 268/
                         |--- 269/
                         |     |--- 262/
                         |
                         |--- 270/
                         |--- 272/
                         |     |--- 263/
                         |     |--- 275/
                         |
                         |--- 274/
                               |--- 273/
      
      Send snapshot:
      
       .
       |-- 275/
            |-- 274/
                 |-- 273/
                      |-- 262/
                           |-- 269/
                                |-- 258/
                                     |-- 271/
                                          |-- 268/
                                               |-- 267/
                                                    |-- 270/
                                                         |-- 259/
                                                         |    |-- 265/
                                                         |
                                                         |-- 272/
                                                              |-- 257/
                                                                   |-- 260/
                                                                   |-- 264/
                                                                        |-- 263/
                                                                             |-- 261/
                                                                                  |-- 266/
      
      When processing inode 257 we delay its move (rename) operation because its
      new parent in the send snapshot, inode 272, was not yet processed. Then
      when processing inode 272, we delay the move operation for that inode
      because inode 274 is its ancestor in the send snapshot. Finally we delay
      the move operation for inode 274 when processing it because inode 275 is
      its new parent in the send snapshot and was not yet moved.
      
      When finishing processing inode 275, we start to do the move operations
      that were previously delayed (at apply_children_dir_moves()), resulting in
      the following iterations:
      
      1) We issue the move operation for inode 274;
      
      2) Because inode 262 depended on the move operation of inode 274 (it was
         delayed because 274 is its ancestor in the send snapshot), we issue the
         move operation for inode 262;
      
      3) We issue the move operation for inode 272, because it was delayed by
         inode 274 too (ancestor of 272 in the send snapshot);
      
      4) We issue the move operation for inode 269 (it was delayed by 262);
      
      5) We issue the move operation for inode 257 (it was delayed by 272);
      
      6) We issue the move operation for inode 260 (it was delayed by 272);
      
      7) We issue the move operation for inode 258 (it was delayed by 269);
      
      8) We issue the move operation for inode 264 (it was delayed by 257);
      
      9) We issue the move operation for inode 271 (it was delayed by 258);
      
      10) We issue the move operation for inode 263 (it was delayed by 264);
      
      11) We issue the move operation for inode 268 (it was delayed by 271);
      
      12) We verify if we can issue the move operation for inode 270 (it was
          delayed by 271). We detect a path loop in the current state, because
          inode 267 needs to be moved first before we can issue the move
          operation for inode 270. So we delay again the move operation for
          inode 270, this time we will attempt to do it after inode 267 is
          moved;
      
      13) We issue the move operation for inode 261 (it was delayed by 263);
      
      14) We verify if we can issue the move operation for inode 266 (it was
          delayed by 263). We detect a path loop in the current state, because
          inode 270 needs to be moved first before we can issue the move
          operation for inode 266. So we delay again the move operation for
          inode 266, this time we will attempt to do it after inode 270 is
          moved (its move operation was delayed in step 12);
      
      15) We issue the move operation for inode 267 (it was delayed by 268);
      
      16) We verify if we can issue the move operation for inode 266 (it was
          delayed by 270). We detect a path loop in the current state, because
          inode 270 needs to be moved first before we can issue the move
          operation for inode 266. So we delay again the move operation for
          inode 266, this time we will attempt to do it after inode 270 is
          moved (its move operation was delayed in step 12). So here we added
          again the same delayed move operation that we added in step 14;
      
      17) We attempt again to see if we can issue the move operation for inode
          266, and as in step 16, we realize we can not due to a path loop in
          the current state due to a dependency on inode 270. Again we delay
          inode's 266 rename to happen after inode's 270 move operation, adding
          the same dependency to the empty stack that we did in steps 14 and 16.
          The next iteration will pick the same move dependency on the stack
          (the only entry) and realize again there is still a path loop and then
          again the same dependency to the stack, over and over, resulting in
          an infinite loop.
      
      So fix this by preventing adding the same move dependency entries to the
      stack by removing each pending move record from the red black tree of
      pending moves. This way the next call to get_pending_dir_moves() will
      not return anything for the current parent inode.
      
      A test case for fstests, with this reproducer, follows soon.
      Signed-off-by: default avatarRobbie Ko <robbieko@synology.com>
      Reviewed-by: default avatarFilipe Manana <fdmanana@suse.com>
      [Wrote changelog with example and more clear explanation]
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      44b02439
    • Huacai Chen's avatar
      hwmon: (w83795) temp4_type has writable permission · b9b45f49
      Huacai Chen authored
      [ Upstream commit 09aaf681 ]
      
      Both datasheet and comments of store_temp_mode() tell us that temp1~4_type
      is writable, so fix it.
      Signed-off-by: default avatarYao Wang <wangyao@lemote.com>
      Signed-off-by: default avatarHuacai Chen <chenhc@lemote.com>
      Fixes: 39deb699 (" hwmon: (w83795) Simplify temperature sensor type handling")
      Signed-off-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      b9b45f49
    • Tzung-Bi Shih's avatar
      ASoC: dapm: Recalculate audio map forcely when card instantiated · fd103a69
      Tzung-Bi Shih authored
      [ Upstream commit 882eab6c ]
      
      Audio map are possible in wrong state before card->instantiated has
      been set to true.  Imaging the following examples:
      
      time 1: at the beginning
      
        in:-1    in:-1    in:-1    in:-1
       out:-1   out:-1   out:-1   out:-1
       SIGGEN        A        B      Spk
      
      time 2: after someone called snd_soc_dapm_new_widgets()
      (e.g. create_fill_widget_route_map() in sound/soc/codecs/hdac_hdmi.c)
      
         in:1     in:0     in:0     in:0
        out:0    out:0    out:0    out:1
       SIGGEN        A        B      Spk
      
      time 3: routes added
      
         in:1     in:0     in:0     in:0
        out:0    out:0    out:0    out:1
       SIGGEN -----> A -----> B ---> Spk
      
      In the end, the path should be powered on but it did not.  At time 3,
      "in" of SIGGEN and "out" of Spk did not propagate to their neighbors
      because snd_soc_dapm_add_path() will not invalidate the paths if
      the card has not instantiated (i.e. card->instantiated is false).
      To correct the state of audio map, recalculate the whole map forcely.
      Signed-off-by: default avatarTzung-Bi Shih <tzungbi@google.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      fd103a69
    • Nicolin Chen's avatar
      hwmon: (ina2xx) Fix current value calculation · 15d8d724
      Nicolin Chen authored
      [ Upstream commit 38cd989e ]
      
      The current register (04h) has a sign bit at MSB. The comments
      for this calculation also mention that it's a signed register.
      
      However, the regval is unsigned type so result of calculation
      turns out to be an incorrect value when current is negative.
      
      This patch simply fixes this by adding a casting to s16.
      
      Fixes: 5d389b12 ("hwmon: (ina2xx) Make calibration register value fixed")
      Signed-off-by: default avatarNicolin Chen <nicoleotsuka@gmail.com>
      Signed-off-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      15d8d724
    • Thomas Richter's avatar
      s390/cpum_cf: Reject request for sampling in event initialization · 9ba7a303
      Thomas Richter authored
      [ Upstream commit 613a41b0 ]
      
      On s390 command perf top fails
      [root@s35lp76 perf] # ./perf top -F100000  --stdio
         Error:
         cycles: PMU Hardware doesn't support sampling/overflow-interrupts.
         	Try 'perf stat'
      [root@s35lp76 perf] #
      
      Using event -e rb0000 works as designed.  Event rb0000 is the event
      number of the sampling facility for basic sampling.
      
      During system start up the following PMUs are installed in the kernel's
      PMU list (from head to tail):
         cpum_cf --> s390 PMU counter facility device driver
         cpum_sf --> s390 PMU sampling facility device driver
         uprobe
         kprobe
         tracepoint
         task_clock
         cpu_clock
      
      Perf top executes following functions and calls perf_event_open(2) system
      call with different parameters many times:
      
      cmd_top
      --> __cmd_top
          --> perf_evlist__add_default
              --> __perf_evlist__add_default
                  --> perf_evlist__new_cycles (creates event type:0 (HW)
      			    		config 0 (CPU_CYCLES)
      	        --> perf_event_attr__set_max_precise_ip
      		    Uses perf_event_open(2) to detect correct
      		    precise_ip level. Fails 3 times on s390 which is ok.
      
      Then functions cmd_top
      --> __cmd_top
          --> perf_top__start_counters
              -->perf_evlist__config
      	   --> perf_can_comm_exec
                     --> perf_probe_api
      	           This functions test support for the following events:
      		   "cycles:u", "instructions:u", "cpu-clock:u" using
      		   --> perf_do_probe_api
      		       --> perf_event_open_cloexec
      		           Test the close on exec flag support with
      			   perf_event_open(2).
      	               perf_do_probe_api returns true if the event is
      		       supported.
      		       The function returns true because event cpu-clock is
      		       supported by the PMU cpu_clock.
      	               This is achieved by many calls to perf_event_open(2).
      
      Function perf_top__start_counters now calls perf_evsel__open() for every
      event, which is the default event cpu_cycles (config:0) and type HARDWARE
      (type:0) which a predfined frequence of 4000.
      
      Given the above order of the PMU list, the PMU cpum_cf gets called first
      and returns 0, which indicates support for this sampling. The event is
      fully allocated in the function perf_event_open (file kernel/event/core.c
      near line 10521 and the following check fails:
      
              event = perf_event_alloc(&attr, cpu, task, group_leader, NULL,
      		                 NULL, NULL, cgroup_fd);
      	if (IS_ERR(event)) {
      		err = PTR_ERR(event);
      		goto err_cred;
      	}
      
              if (is_sampling_event(event)) {
      		if (event->pmu->capabilities & PERF_PMU_CAP_NO_INTERRUPT) {
      			err = -EOPNOTSUPP;
      			goto err_alloc;
      		}
      	}
      
      The check for the interrupt capabilities fails and the system call
      perf_event_open() returns -EOPNOTSUPP (-95).
      
      Add a check to return -ENODEV when sampling is requested in PMU cpum_cf.
      This allows common kernel code in the perf_event_open() system call to
      test the next PMU in above list.
      
      Fixes: 97b1198f (" "s390, perf: Use common PMU interrupt disabled code")
      Signed-off-by: default avatarThomas Richter <tmricht@linux.ibm.com>
      Reviewed-by: default avatarHendrik Brueckner <brueckner@linux.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      9ba7a303
    • YueHaibing's avatar
      sysv: return 'err' instead of 0 in __sysv_write_inode · c65b63a0
      YueHaibing authored
      [ Upstream commit c4b7d1ba ]
      
      Fixes gcc '-Wunused-but-set-variable' warning:
      
      fs/sysv/inode.c: In function '__sysv_write_inode':
      fs/sysv/inode.c:239:6: warning:
       variable 'err' set but not used [-Wunused-but-set-variable]
      
      __sysv_write_inode should return 'err' instead of 0
      
      Fixes: 05459ca8 ("repair sysv_write_inode(), switch sysv to simple_fsync()")
      Signed-off-by: default avatarYueHaibing <yuehaibing@huawei.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      c65b63a0
    • Janusz Krzysztofik's avatar
      ARM: OMAP1: ams-delta: Fix possible use of uninitialized field · d7589b83
      Janusz Krzysztofik authored
      [ Upstream commit cec83ff1 ]
      
      While playing with initialization order of modem device, it has been
      discovered that under some circumstances (early console init, I
      believe) its .pm() callback may be called before the
      uart_port->private_data pointer is initialized from
      plat_serial8250_port->private_data, resulting in NULL pointer
      dereference.  Fix it by checking for uninitialized pointer before using
      it in modem_pm().
      
      Fixes: aabf3173 ("ARM: OMAP1: ams-delta: update the modem to use regulator API")
      Signed-off-by: default avatarJanusz Krzysztofik <jmkrzyszt@gmail.com>
      Signed-off-by: default avatarTony Lindgren <tony@atomide.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      d7589b83
    • Nathan Chancellor's avatar
      ARM: OMAP2+: prm44xx: Fix section annotation on omap44xx_prm_enable_io_wakeup · d2f642b0
      Nathan Chancellor authored
      [ Upstream commit eef3dc34 ]
      
      When building the kernel with Clang, the following section mismatch
      warning appears:
      
      WARNING: vmlinux.o(.text+0x38b3c): Section mismatch in reference from
      the function omap44xx_prm_late_init() to the function
      .init.text:omap44xx_prm_enable_io_wakeup()
      The function omap44xx_prm_late_init() references
      the function __init omap44xx_prm_enable_io_wakeup().
      This is often because omap44xx_prm_late_init lacks a __init
      annotation or the annotation of omap44xx_prm_enable_io_wakeup is wrong.
      
      Remove the __init annotation from omap44xx_prm_enable_io_wakeup so there
      is no more mismatch.
      Signed-off-by: default avatarNathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: default avatarTony Lindgren <tony@atomide.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      d2f642b0
    • Stefano Brivio's avatar
      neighbour: Avoid writing before skb->head in neigh_hh_output() · a56de6cd
      Stefano Brivio authored
      [ Upstream commit e6ac64d4 ]
      
      While skb_push() makes the kernel panic if the skb headroom is less than
      the unaligned hardware header size, it will proceed normally in case we
      copy more than that because of alignment, and we'll silently corrupt
      adjacent slabs.
      
      In the case fixed by the previous patch,
      "ipv6: Check available headroom in ip6_xmit() even without options", we
      end up in neigh_hh_output() with 14 bytes headroom, 14 bytes hardware
      header and write 16 bytes, starting 2 bytes before the allocated buffer.
      
      Always check we're not writing before skb->head and, if the headroom is
      not enough, warn and drop the packet.
      
      v2:
       - instead of panicking with BUG_ON(), WARN_ON_ONCE() and drop the packet
         (Eric Dumazet)
       - if we avoid the panic, though, we need to explicitly check the headroom
         before the memcpy(), otherwise we'll have corrupted slabs on a running
         kernel, after we warn
       - use __skb_push() instead of skb_push(), as the headroom check is
         already implemented here explicitly (Eric Dumazet)
      Signed-off-by: default avatarStefano Brivio <sbrivio@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a56de6cd
    • Nicolas Dichtel's avatar
      tun: forbid iface creation with rtnl ops · 2da7c7b2
      Nicolas Dichtel authored
      [ Upstream commit 35b827b6 ]
      
      It's not supported right now (the goal of the initial patch was to support
      'ip link del' only).
      
      Before the patch:
      $ ip link add foo type tun
      [  239.632660] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
      [snip]
      [  239.636410] RIP: 0010:register_netdevice+0x8e/0x3a0
      
      This panic occurs because dev->netdev_ops is not set by tun_setup(). But to
      have something usable, it will require more than just setting
      netdev_ops.
      
      Fixes: f019a7a5 ("tun: Implement ip link del tunXXX")
      CC: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: default avatarNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2da7c7b2
    • Yuchung Cheng's avatar
      tcp: fix NULL ref in tail loss probe · ed7748bc
      Yuchung Cheng authored
      [ Upstream commit b2b7af86 ]
      
      TCP loss probe timer may fire when the retranmission queue is empty but
      has a non-zero tp->packets_out counter. tcp_send_loss_probe will call
      tcp_rearm_rto which triggers NULL pointer reference by fetching the
      retranmission queue head in its sub-routines.
      
      Add a more detailed warning to help catch the root cause of the inflight
      accounting inconsistency.
      Reported-by: default avatarRafael Tinoco <rafael.tinoco@linaro.org>
      Signed-off-by: default avatarYuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarNeal Cardwell <ncardwell@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ed7748bc
    • Eric Dumazet's avatar
      rtnetlink: ndo_dflt_fdb_dump() only work for ARPHRD_ETHER devices · 266b50e7
      Eric Dumazet authored
      [ Upstream commit 68883893 ]
      
      kmsan was able to trigger a kernel-infoleak using a gre device [1]
      
      nlmsg_populate_fdb_fill() has a hard coded assumption
      that dev->addr_len is ETH_ALEN, as normally guaranteed
      for ARPHRD_ETHER devices.
      
      A similar issue was fixed recently in commit da715775
      ("rtnetlink: Disallow FDB configuration for non-Ethernet device")
      
      [1]
      BUG: KMSAN: kernel-infoleak in copyout lib/iov_iter.c:143 [inline]
      BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x4c0/0x2700 lib/iov_iter.c:576
      CPU: 0 PID: 6697 Comm: syz-executor310 Not tainted 4.20.0-rc3+ #95
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x32d/0x480 lib/dump_stack.c:113
       kmsan_report+0x12c/0x290 mm/kmsan/kmsan.c:683
       kmsan_internal_check_memory+0x32a/0xa50 mm/kmsan/kmsan.c:743
       kmsan_copy_to_user+0x78/0xd0 mm/kmsan/kmsan_hooks.c:634
       copyout lib/iov_iter.c:143 [inline]
       _copy_to_iter+0x4c0/0x2700 lib/iov_iter.c:576
       copy_to_iter include/linux/uio.h:143 [inline]
       skb_copy_datagram_iter+0x4e2/0x1070 net/core/datagram.c:431
       skb_copy_datagram_msg include/linux/skbuff.h:3316 [inline]
       netlink_recvmsg+0x6f9/0x19d0 net/netlink/af_netlink.c:1975
       sock_recvmsg_nosec net/socket.c:794 [inline]
       sock_recvmsg+0x1d1/0x230 net/socket.c:801
       ___sys_recvmsg+0x444/0xae0 net/socket.c:2278
       __sys_recvmsg net/socket.c:2327 [inline]
       __do_sys_recvmsg net/socket.c:2337 [inline]
       __se_sys_recvmsg+0x2fa/0x450 net/socket.c:2334
       __x64_sys_recvmsg+0x4a/0x70 net/socket.c:2334
       do_syscall_64+0xcf/0x110 arch/x86/entry/common.c:291
       entry_SYSCALL_64_after_hwframe+0x63/0xe7
      RIP: 0033:0x441119
      Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 db 0a fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007fffc7f008a8 EFLAGS: 00000207 ORIG_RAX: 000000000000002f
      RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000441119
      RDX: 0000000000000040 RSI: 00000000200005c0 RDI: 0000000000000003
      RBP: 00000000006cc018 R08: 0000000000000100 R09: 0000000000000100
      R10: 0000000000000100 R11: 0000000000000207 R12: 0000000000402080
      R13: 0000000000402110 R14: 0000000000000000 R15: 0000000000000000
      
      Uninit was stored to memory at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:246 [inline]
       kmsan_save_stack mm/kmsan/kmsan.c:261 [inline]
       kmsan_internal_chain_origin+0x13d/0x240 mm/kmsan/kmsan.c:469
       kmsan_memcpy_memmove_metadata+0x1a9/0xf70 mm/kmsan/kmsan.c:344
       kmsan_memcpy_metadata+0xb/0x10 mm/kmsan/kmsan.c:362
       __msan_memcpy+0x61/0x70 mm/kmsan/kmsan_instr.c:162
       __nla_put lib/nlattr.c:744 [inline]
       nla_put+0x20a/0x2d0 lib/nlattr.c:802
       nlmsg_populate_fdb_fill+0x444/0x810 net/core/rtnetlink.c:3466
       nlmsg_populate_fdb net/core/rtnetlink.c:3775 [inline]
       ndo_dflt_fdb_dump+0x73a/0x960 net/core/rtnetlink.c:3807
       rtnl_fdb_dump+0x1318/0x1cb0 net/core/rtnetlink.c:3979
       netlink_dump+0xc79/0x1c90 net/netlink/af_netlink.c:2244
       __netlink_dump_start+0x10c4/0x11d0 net/netlink/af_netlink.c:2352
       netlink_dump_start include/linux/netlink.h:216 [inline]
       rtnetlink_rcv_msg+0x141b/0x1540 net/core/rtnetlink.c:4910
       netlink_rcv_skb+0x394/0x640 net/netlink/af_netlink.c:2477
       rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:4965
       netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
       netlink_unicast+0x1699/0x1740 net/netlink/af_netlink.c:1336
       netlink_sendmsg+0x13c7/0x1440 net/netlink/af_netlink.c:1917
       sock_sendmsg_nosec net/socket.c:621 [inline]
       sock_sendmsg net/socket.c:631 [inline]
       ___sys_sendmsg+0xe3b/0x1240 net/socket.c:2116
       __sys_sendmsg net/socket.c:2154 [inline]
       __do_sys_sendmsg net/socket.c:2163 [inline]
       __se_sys_sendmsg+0x305/0x460 net/socket.c:2161
       __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
       do_syscall_64+0xcf/0x110 arch/x86/entry/common.c:291
       entry_SYSCALL_64_after_hwframe+0x63/0xe7
      
      Uninit was created at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:246 [inline]
       kmsan_internal_poison_shadow+0x6d/0x130 mm/kmsan/kmsan.c:170
       kmsan_kmalloc+0xa1/0x100 mm/kmsan/kmsan_hooks.c:186
       __kmalloc+0x14c/0x4d0 mm/slub.c:3825
       kmalloc include/linux/slab.h:551 [inline]
       __hw_addr_create_ex net/core/dev_addr_lists.c:34 [inline]
       __hw_addr_add_ex net/core/dev_addr_lists.c:80 [inline]
       __dev_mc_add+0x357/0x8a0 net/core/dev_addr_lists.c:670
       dev_mc_add+0x6d/0x80 net/core/dev_addr_lists.c:687
       ip_mc_filter_add net/ipv4/igmp.c:1128 [inline]
       igmp_group_added+0x4d4/0xb80 net/ipv4/igmp.c:1311
       __ip_mc_inc_group+0xea9/0xf70 net/ipv4/igmp.c:1444
       ip_mc_inc_group net/ipv4/igmp.c:1453 [inline]
       ip_mc_up+0x1c3/0x400 net/ipv4/igmp.c:1775
       inetdev_event+0x1d03/0x1d80 net/ipv4/devinet.c:1522
       notifier_call_chain kernel/notifier.c:93 [inline]
       __raw_notifier_call_chain kernel/notifier.c:394 [inline]
       raw_notifier_call_chain+0x13d/0x240 kernel/notifier.c:401
       __dev_notify_flags+0x3da/0x860 net/core/dev.c:1733
       dev_change_flags+0x1ac/0x230 net/core/dev.c:7569
       do_setlink+0x165f/0x5ea0 net/core/rtnetlink.c:2492
       rtnl_newlink+0x2ad7/0x35a0 net/core/rtnetlink.c:3111
       rtnetlink_rcv_msg+0x1148/0x1540 net/core/rtnetlink.c:4947
       netlink_rcv_skb+0x394/0x640 net/netlink/af_netlink.c:2477
       rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:4965
       netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
       netlink_unicast+0x1699/0x1740 net/netlink/af_netlink.c:1336
       netlink_sendmsg+0x13c7/0x1440 net/netlink/af_netlink.c:1917
       sock_sendmsg_nosec net/socket.c:621 [inline]
       sock_sendmsg net/socket.c:631 [inline]
       ___sys_sendmsg+0xe3b/0x1240 net/socket.c:2116
       __sys_sendmsg net/socket.c:2154 [inline]
       __do_sys_sendmsg net/socket.c:2163 [inline]
       __se_sys_sendmsg+0x305/0x460 net/socket.c:2161
       __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
       do_syscall_64+0xcf/0x110 arch/x86/entry/common.c:291
       entry_SYSCALL_64_after_hwframe+0x63/0xe7
      
      Bytes 36-37 of 105 are uninitialized
      Memory access of size 105 starts at ffff88819686c000
      Data copied to user address 0000000020000380
      
      Fixes: d83b0603 ("net: add fdb generic dump routine")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: Ido Schimmel <idosch@mellanox.com>
      Cc: David Ahern <dsahern@gmail.com>
      Reviewed-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      266b50e7
    • Christoph Paasch's avatar
      net: Prevent invalid access to skb->prev in __qdisc_drop_all · d7519c01
      Christoph Paasch authored
      [ Upstream commit 9410d386 ]
      
      __qdisc_drop_all() accesses skb->prev to get to the tail of the
      segment-list.
      
      With commit 68d2f84a ("net: gro: properly remove skb from list")
      the skb-list handling has been changed to set skb->next to NULL and set
      the list-poison on skb->prev.
      
      With that change, __qdisc_drop_all() will panic when it tries to
      dereference skb->prev.
      
      Since commit 992cba7e ("net: Add and use skb_list_del_init().")
      __list_del_entry is used, leaving skb->prev unchanged (thus,
      pointing to the list-head if it's the first skb of the list).
      This will make __qdisc_drop_all modify the next-pointer of the list-head
      and result in a panic later on:
      
      [   34.501053] general protection fault: 0000 [#1] SMP KASAN PTI
      [   34.501968] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.20.0-rc2.mptcp #108
      [   34.502887] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.5.1 01/01/2011
      [   34.504074] RIP: 0010:dev_gro_receive+0x343/0x1f90
      [   34.504751] Code: e0 48 c1 e8 03 42 80 3c 30 00 0f 85 4a 1c 00 00 4d 8b 24 24 4c 39 65 d0 0f 84 0a 04 00 00 49 8d 7c 24 38 48 89 f8 48 c1 e8 03 <42> 0f b6 04 30 84 c0 74 08 3c 04
      [   34.507060] RSP: 0018:ffff8883af507930 EFLAGS: 00010202
      [   34.507761] RAX: 0000000000000007 RBX: ffff8883970b2c80 RCX: 1ffff11072e165a6
      [   34.508640] RDX: 1ffff11075867008 RSI: ffff8883ac338040 RDI: 0000000000000038
      [   34.509493] RBP: ffff8883af5079d0 R08: ffff8883970b2d40 R09: 0000000000000062
      [   34.510346] R10: 0000000000000034 R11: 0000000000000000 R12: 0000000000000000
      [   34.511215] R13: 0000000000000000 R14: dffffc0000000000 R15: ffff8883ac338008
      [   34.512082] FS:  0000000000000000(0000) GS:ffff8883af500000(0000) knlGS:0000000000000000
      [   34.513036] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   34.513741] CR2: 000055ccc3e9d020 CR3: 00000003abf32000 CR4: 00000000000006e0
      [   34.514593] Call Trace:
      [   34.514893]  <IRQ>
      [   34.515157]  napi_gro_receive+0x93/0x150
      [   34.515632]  receive_buf+0x893/0x3700
      [   34.516094]  ? __netif_receive_skb+0x1f/0x1a0
      [   34.516629]  ? virtnet_probe+0x1b40/0x1b40
      [   34.517153]  ? __stable_node_chain+0x4d0/0x850
      [   34.517684]  ? kfree+0x9a/0x180
      [   34.518067]  ? __kasan_slab_free+0x171/0x190
      [   34.518582]  ? detach_buf+0x1df/0x650
      [   34.519061]  ? lapic_next_event+0x5a/0x90
      [   34.519539]  ? virtqueue_get_buf_ctx+0x280/0x7f0
      [   34.520093]  virtnet_poll+0x2df/0xd60
      [   34.520533]  ? receive_buf+0x3700/0x3700
      [   34.521027]  ? qdisc_watchdog_schedule_ns+0xd5/0x140
      [   34.521631]  ? htb_dequeue+0x1817/0x25f0
      [   34.522107]  ? sch_direct_xmit+0x142/0xf30
      [   34.522595]  ? virtqueue_napi_schedule+0x26/0x30
      [   34.523155]  net_rx_action+0x2f6/0xc50
      [   34.523601]  ? napi_complete_done+0x2f0/0x2f0
      [   34.524126]  ? kasan_check_read+0x11/0x20
      [   34.524608]  ? _raw_spin_lock+0x7d/0xd0
      [   34.525070]  ? _raw_spin_lock_bh+0xd0/0xd0
      [   34.525563]  ? kvm_guest_apic_eoi_write+0x6b/0x80
      [   34.526130]  ? apic_ack_irq+0x9e/0xe0
      [   34.526567]  __do_softirq+0x188/0x4b5
      [   34.527015]  irq_exit+0x151/0x180
      [   34.527417]  do_IRQ+0xdb/0x150
      [   34.527783]  common_interrupt+0xf/0xf
      [   34.528223]  </IRQ>
      
      This patch makes sure that skb->prev is set to NULL when entering
      netem_enqueue.
      
      Cc: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
      Cc: Tyler Hicks <tyhicks@canonical.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Fixes: 68d2f84a ("net: gro: properly remove skb from list")
      Suggested-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarChristoph Paasch <cpaasch@apple.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d7519c01